Communications In Mathematical Physics - Volume 290

Commun. Math. Phys. 290, 1–14 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0821-5 Communications in Mathe...

Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 290, 1–14 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0821-5


Global Well-Posedness Issues for the Inviscid Boussinesq System with Yudovich’s Type Data

Raphaël Danchin¹, Marius Paicu²

¹ Université Paris-Est, Laboratoire d’Analyse et de Mathématiques Appliquées, UMR 8050, 61 avenue du Général de Gaulle, 94010 Créteil Cedex, France. E-mail: [email protected]
² Université Paris 11, Laboratoire de Mathématiques, Bâtiment 425, 91405 Orsay Cedex, France. E-mail: [email protected]

Received: 20 June 2008 / Accepted: 19 February 2009
Published online: 9 May 2009 – © Springer-Verlag 2009

Abstract: The present paper is dedicated to the study of global existence for the inviscid two-dimensional Boussinesq system. We focus on finite energy data with bounded vorticity and we find that, under quite a natural additional assumption on the initial temperature, there exists a unique global solution. No smallness conditions are imposed on the data. The global existence issues for infinite energy initial velocity, and for the Bénard system, are also discussed.

Introduction

The incompressible Euler equations have been intensively studied from a mathematical viewpoint. The present paper aims at extending the celebrated result by Yudovich concerning the two-dimensional Euler system (see [17]) to the following two-dimensional Boussinesq system:

\[
(B_{\kappa,\nu})\qquad
\begin{cases}
\partial_t\theta + u\cdot\nabla\theta - \kappa\Delta\theta = 0,\\
\partial_t u + u\cdot\nabla u - \nu\Delta u + \nabla\Pi = \theta\,e_2 \quad\text{with } e_2 = (0,1),\\
\operatorname{div} u = 0.
\end{cases}
\]

The above system describes the evolution of the velocity field $u$ of a two-dimensional incompressible fluid moving under a vertical force whose magnitude $\theta$ is transported, with or without diffusion, by $u$. Above, the molecular diffusion and viscosity parameters $\kappa$ and $\nu$ are nonnegative, and $\Pi$ stands for the pressure in the fluid. For the sake of simplicity, we restrict our attention to the case where the space variable $x$ belongs to the whole plane $\mathbb{R}^2$ (our results extend with no difficulty to periodic boundary conditions, though). The Boussinesq system is relevant to a number of models coming from atmospheric or oceanographic turbulence where rotation and stratification play an important role (see e.g. [15]). The scalar function $\theta$ may for instance represent temperature variation in a gravity field, and $\theta e_2$ the buoyancy force.


From the mathematical point of view, if both $\kappa$ and $\nu$ are positive then standard energy methods yield global existence of smooth solutions for arbitrarily large data (see e.g. [5,12]). In contrast, in the case when $\kappa = \nu = 0$, the Boussinesq system exhibits vorticity intensification and the global well-posedness issue remains a challenging open problem (except if $\theta_0$ is a constant, of course) which may be formally compared to the similar problem for the three-dimensional axisymmetric Euler equations with swirl (see e.g. [10] for more explanations). As pointed out by H. K. Moffatt in [14], knowing whether having $\kappa > 0$ or $\nu > 0$ precludes the formation of finite time singularities is an important issue. In [9], we stated that in the case $\kappa = 0$ and $\nu > 0$ no such formation may be encountered for finite energy initial data. More precisely, we stated that for any $(\theta_0, u_0)$ in $L^2(\mathbb{R}^2)$ with $\operatorname{div} u_0 = 0$, System $(B_{0,\nu})$ has a unique global finite energy solution.

In the present paper, we aim at investigating the opposite case, namely $\kappa > 0$ and $\nu = 0$. The corresponding Boussinesq system thus reads

\[
(B_{\kappa,0})\qquad
\begin{cases}
\partial_t\theta + u\cdot\nabla\theta - \kappa\Delta\theta = 0,\\
\partial_t u + u\cdot\nabla u + \nabla\Pi = \theta\,e_2,\\
\operatorname{div} u = 0,
\end{cases}
\]

and may be seen as a coupling between the two-dimensional Euler equations and a transport-diffusion equation. In passing, let us point out that in the case $\theta \equiv 0$, System $(B_{\kappa,0})$ reduces to the Euler equation. It is well known that the standard Euler equation is globally well-posed in $H^s$ for any $s > 2$. A similar result has been stated for $(B_{\kappa,0})$ in the case $s \ge 3$ by D. Chae in [6], then extended to rough data by T. Hmidi and S. Keraani in [13]. There, global well-posedness is shown whenever the initial velocity $u_0$ belongs to $B^{1+2/p}_{p,1}$ and the initial temperature $\theta_0$ is in $L^r$ for some $(p, r)$ satisfying $2 < r \le p \le \infty$ (plus a technical condition if $p = r = \infty$). Let us emphasize that in the Besov spaces framework, the assumption on $u_0$ is somewhat optimal (since it is optimal for the standard Euler equations, see [16]).

Here we want to state global existence for less regular data satisfying Yudovich’s type conditions. Roughly, we want to consider data $(\theta_0, u_0)$ in $L^2$ such that the initial vorticity $\omega_0 := \partial_1 u_0^2 - \partial_2 u_0^1$ is bounded. Note however that, since we expect the corresponding solution to have bounded vorticity for all positive time, we have to introduce an additional assumption on $\theta_0$. Indeed, the vorticity equation reads

\[ \partial_t\omega + u\cdot\nabla\omega = \partial_1\theta. \]

Therefore, since no gain of smoothness may be expected from the above transport equation, having $\omega$ bounded requires that $\partial_1\theta \in L^1_{loc}(\mathbb{R}_+; L^\infty)$. Now, considering that $\theta$ satisfies the following heat equation:

\[ \partial_t\theta - \kappa\Delta\theta = f \quad\text{with } f := -u\cdot\nabla\theta, \]

the assumptions on $\theta_0$ should ensure that

\[ \nabla e^{\kappa t\Delta}\theta_0 \in L^1_{loc}(\mathbb{R}_+; L^\infty), \tag{1} \]

where $(e^{\lambda\Delta})_{\lambda>0}$ stands for the heat semi-group. It turns out that (1) is equivalent to having $\nabla\theta_0$ in the nonhomogeneous Besov space $B^{-2}_{\infty,1}$ (see e.g. [3]). This motivates the following statement, which is the main result of the paper:


Theorem 1. Let $\theta_0 \in L^2 \cap B^{-1}_{\infty,1}$ and $u_0 \in L^2$ with $\operatorname{div} u_0 = 0$. Assume in addition that the initial vorticity $\omega_0$ belongs to $L^r \cap L^\infty$ for some finite $r \ge 2$. System $(B_{\kappa,0})$ admits a unique global solution $(\theta, u)$ satisfying

\[
\theta \in C(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1}) \cap L^2_{loc}(\mathbb{R}_+; H^1) \cap L^1_{loc}(\mathbb{R}_+; B^{1}_{\infty,1}),
\quad
u \in C^{0,1}_{loc}(\mathbb{R}_+; L^2)
\quad\text{and}\quad
\omega \in L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty). \tag{2}
\]

Remark 1. As a by-product of our proof, we gather that if in addition $\theta_0 \in L^p$ (resp. $u_0 \in B^1_{\infty,1}$) for some $p \in [1, +\infty]$ then $\theta \in L^\infty(\mathbb{R}_+; L^p)$ (resp. $u \in C(\mathbb{R}_+; B^1_{\infty,1})$).

Remark 2. The $B^{-1}_{\infty,1}$ hypothesis over $\theta_0$ is quite mild compared to the $L^2$ hypothesis. Indeed, it may be shown that $L^2$ is continuously embedded in the Besov space $B^{-1}_{\infty,2}$, which is slightly larger than $B^{-1}_{\infty,1}$.
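The embedding mentioned in Remark 2 can be sketched with the dyadic blocks $\Delta_q$ of the Littlewood-Paley decomposition defined in Section 1. The computation below is only an illustration: it assumes the Bernstein inequality in dimension two and the almost orthogonality of the dyadic blocks in $L^2$, and the constants $C$, $C'$ are generic (they are not taken from the paper).

```latex
% Sketch (assumption: Bernstein inequality in dimension d = 2):
% L^2(R^2) is continuously embedded in B^{-1}_{\infty,2}.
\begin{align*}
\|\Delta_q u\|_{L^\infty}
  &\le C\,2^{q}\,\|\Delta_q u\|_{L^2}
  && \text{(Bernstein: } 2^{qd(\frac12-\frac1\infty)} = 2^{q} \text{ when } d = 2\text{)},\\
\|u\|_{B^{-1}_{\infty,2}}^2
  = \sum_{q\ge -1}\bigl(2^{-q}\|\Delta_q u\|_{L^\infty}\bigr)^2
  &\le C^2\sum_{q\ge -1}\|\Delta_q u\|_{L^2}^2
  \le C'\,\|u\|_{L^2}^2
  && \text{(Plancherel and bounded overlap of the Fourier supports)}.
\end{align*}
```

The $\ell^2$ summation is what makes the target space $B^{-1}_{\infty,2}$ rather than $B^{-1}_{\infty,1}$: the same argument gives no control of the $\ell^1$ sum.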

The paper unfolds as follows. In the first section, we prove Theorem 1. In the second section, motivated by the fact that having $u_0$ in $L^2$ and $\omega_0 \in L^1$ requires the vorticity to have zero average over $\mathbb{R}^2$, we consider initial velocities which are $L^2$ perturbations of infinite energy smooth stationary solutions of the incompressible Euler equations. Some extensions of Theorem 1 are discussed in the third section. A few technical inequalities have been postponed to the Appendix.

1. Proof of Theorem 1

Proving Theorem 1 requires the (nonhomogeneous) Littlewood-Paley decomposition. One can proceed as in [7]: first we consider a dyadic partition of unity:

\[ 1 = \chi(\xi) + \sum_{q\ge 0}\varphi(2^{-q}\xi), \]

for some nonnegative function $\chi \in C_c^\infty(B(0, \tfrac43))$ with value 1 over the ball $B(0, \tfrac34)$, and $\varphi(\xi) := \chi(\xi/2) - \chi(\xi)$. Next, we introduce the dyadic blocks $\Delta_q$ of our decomposition by setting

\[
\Delta_q u := 0 \ \text{ if } q \le -2,
\qquad
\Delta_{-1} u := \mathcal{F}^{-1}(\chi\,\mathcal{F}u)
\qquad\text{and}\qquad
\Delta_q u := \mathcal{F}^{-1}(\varphi(2^{-q}\cdot)\,\mathcal{F}u) \ \text{ if } q \ge 0.
\]

One may prove that for every tempered distribution $u$ the following Littlewood-Paley decomposition holds true:

\[ u = \sum_{q\ge -1}\Delta_q u. \]

For $s \in \mathbb{R}$, $p \in [1, \infty]$ and $r \in [1, \infty]$, one can now define the nonhomogeneous Besov space $B^s_{p,r} := B^s_{p,r}(\mathbb{R}^2)$ as the set of tempered distributions $u$ over $\mathbb{R}^2$ so that

\[ \|u\|_{B^s_{p,r}(\mathbb{R}^2)} := \Bigl\| 2^{qs}\,\|\Delta_q u\|_{L^p(\mathbb{R}^2)} \Bigr\|_{\ell^r(\mathbb{Z})} < \infty. \]

We shall also use several times the following well-known fact for incompressible fluid mechanics (see the proof in e.g. [7], Chap. 3):


Proposition 1. For any $p \in\, ]1, \infty[$ the operator $\omega \mapsto \nabla u$ is bounded in $L^p$. More precisely, there exists a constant $C$ such that

\[ \|\nabla u\|_{L^p} \le C\,\frac{p^2}{p-1}\,\|\omega\|_{L^p}. \]

One can now tackle the proof of Theorem 1. We shall proceed as follows.

1. We smooth out the data so as to get a sequence of global smooth solutions to $(B_{\kappa,0})$.
2. Energy estimates are proved.
3. We establish estimates in larger norms.
4. We state uniform estimates for the first order time derivatives.
5. We pass to the limit in the system by means of compactness arguments.
6. Uniqueness is proved.

First step. We smooth out the initial data $(\theta_0, u_0)$ (use e.g. a convolution process) and get a sequence of smooth initial data $(\theta_0^n, u_0^n)_{n\in\mathbb{N}}$ which is bounded in the space given in the statement of the theorem. In addition, those smooth data belong to all the Sobolev spaces $H^s$. Hence, applying Chae’s result [6] provides us with a sequence of smooth global solutions $(\theta^n, u^n)_{n\in\mathbb{N}}$ which belong to all the spaces $C(\mathbb{R}_+; H^s)$. From system $(B_{\kappa,0})$ and standard product laws in Sobolev spaces, we deduce that $(\theta^n, u^n)$ belongs to $C^1(\mathbb{R}_+; H^s)$ for all $s \in \mathbb{R}$, and thus also to $C^1(\mathbb{R}_+; L^p)$ for all $p \in [2, \infty]$. This will be more than enough to make the computations in the following two steps rigorous.

Second step. We want to state energy type estimates for $(\theta^n, u^n)$. Let us first take the $L^2(\mathbb{R}^2)$ inner product of $\theta^n$ with the equation satisfied by $\theta^n$. Performing a space integration by parts in the diffusion term and a time integration yields

\[ \|\theta^n(t)\|_{L^2}^2 + 2\kappa\int_0^t \|\nabla\theta^n\|_{L^2}^2\,d\tau = \|\theta_0^n\|_{L^2}^2 \quad\text{for all } t \in \mathbb{R}_+. \tag{3} \]

As for the velocity $u^n$, a similar argument gives

\[ \|u^n(t)\|_{L^2} \le \|u_0^n\|_{L^2} + \int_0^t \|\theta^n\|_{L^2}\,d\tau. \]

Hence, bounding $\|\theta^n\|_{L^2}$ according to (3), we get

\[ \|u^n(t)\|_{L^2} \le \|u_0^n\|_{L^2} + t\,\|\theta_0^n\|_{L^2}. \tag{4} \]
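For readers who want the intermediate computation behind (3) and (4), here is a short sketch. It assumes the smoothness and decay of $(\theta^n, u^n)$ provided by the first step, and writes $u^n_2$ for the second component of $u^n$ (a notational choice made here for illustration).

```latex
% Sketch of the energy identities, assuming smooth decaying solutions and div u^n = 0.
\begin{align*}
\frac12\frac{d}{dt}\|\theta^n\|_{L^2}^2
  &= -\int_{\mathbb{R}^2}(u^n\cdot\nabla\theta^n)\,\theta^n\,dx
     + \kappa\int_{\mathbb{R}^2}\Delta\theta^n\,\theta^n\,dx\\
  &= \frac12\int_{\mathbb{R}^2}(\operatorname{div}u^n)\,|\theta^n|^2\,dx
     - \kappa\|\nabla\theta^n\|_{L^2}^2
   = -\kappa\|\nabla\theta^n\|_{L^2}^2,\\[4pt]
\frac12\frac{d}{dt}\|u^n\|_{L^2}^2
  &= \int_{\mathbb{R}^2}\theta^n\,u^n_2\,dx
   \le \|\theta^n\|_{L^2}\,\|u^n\|_{L^2}
  \quad\Longrightarrow\quad
  \frac{d}{dt}\|u^n\|_{L^2} \le \|\theta^n\|_{L^2} \le \|\theta_0^n\|_{L^2}.
\end{align*}
```

Integrating the first line in time gives (3); integrating the last inequality gives (4). The convection and pressure terms drop out of the velocity identity for the same reason as in the first line, namely $\operatorname{div} u^n = 0$.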

Third step. This is the core of the proof of global existence. Here we want to get uniform estimates for the Besov norms of $\theta^n$ and for $\|\omega^n\|_{L^r\cap L^\infty}$. Let us first consider the vorticity. As explained in the Introduction, we have

\[ \partial_t\omega^n + u^n\cdot\nabla\omega^n = \partial_1\theta^n. \]

Therefore, for all $p \in [r, \infty]$,

\[ \|\omega^n(t)\|_{L^p} \le \|\omega_0^n\|_{L^p} + \int_0^t \|\partial_1\theta^n\|_{L^p}\,d\tau. \tag{5} \]

Hence, getting uniform bounds on $\|\omega^n\|_{L^r\cap L^\infty}$ requires uniform bounds for $\partial_1\theta^n$ in the space $L^1_{loc}(\mathbb{R}_+; L^r\cap L^\infty)$. Because Equality (3) supplies a bound in $L^2(\mathbb{R}_+; L^2)$ for $\partial_1\theta^n$, it is enough to get a suitable bound for $(\partial_1\theta^n)_{n\in\mathbb{N}}$ in $L^1_{loc}(\mathbb{R}_+; L^\infty)$. Given that the operator $\partial_1$ maps $B^1_{\infty,1}$ in $B^0_{\infty,1}$, and that $B^0_{\infty,1} \hookrightarrow L^\infty$, the problem reduces to proving uniform estimates for $\theta^n$ in $L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1})$.

For doing so, we rewrite the equation for $\theta^n$ as follows:

\[ \partial_t\theta^n - \kappa\Delta\theta^n = -u^n\cdot\nabla\theta^n, \tag{6} \]

and take advantage of the smoothing properties of the heat equation. More precisely, it is stated in the Appendix that for all $\alpha \in [1, \infty]$,

\[ \kappa^{\frac1\alpha}\,\|\theta^n\|_{L^\alpha_t(B^{-1+\frac2\alpha}_{\infty,1})} \le C(1+\kappa t)^{\frac1\alpha}\Bigl(\|\theta_0^n\|_{B^{-1}_{\infty,1}} + \int_0^t \|u^n\cdot\nabla\theta^n\|_{B^{-1}_{\infty,1}}\,d\tau\Bigr). \tag{7} \]

In order to bound the source term, one may use the following Bony’s decomposition:

\[ u^n\cdot\nabla\theta^n = \operatorname{div} R(u^n, \theta^n) + \sum_{j=1}^{2}\bigl(T_{\partial_j\theta^n}u^n_j + T_{u^n_j}\partial_j\theta^n\bigr). \tag{8} \]

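In the above formula, $T$ (resp. $R$) stands for the paraproduct (resp. remainder) operator defined by

\[ T_f g := \sum_{q\ge 1} S_{q-1}f\,\Delta_q g \qquad\Bigl(\text{resp. } R(f, g) := \sum_{q\ge -1}\Delta_q f\,\widetilde\Delta_q g\Bigr) \tag{9} \]

with $S_p := \sum_{p'\le p-1}\Delta_{p'}$ and $\widetilde\Delta_p := \Delta_{p-1} + \Delta_p + \Delta_{p+1}$, and we have used the fact that, owing to $\operatorname{div} u^n = 0$, we have

\[ \sum_{j=1}^{2} R(u^n_j, \partial_j\theta^n) = \operatorname{div} R(u^n, \theta^n). \]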
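The divergence identity used just above can be checked term by term. The following is a sketch only; it assumes that $\operatorname{div} R(u^n, \theta^n)$ denotes $\sum_j \partial_j R(u^n_j, \theta^n)$ and uses the fact that the dyadic blocks, being Fourier multipliers, commute with derivatives.

```latex
% Sketch: \sum_j R(u_j, \partial_j\theta) = div R(u, \theta) when div u = 0.
\begin{align*}
\operatorname{div} R(u^n,\theta^n)
  &= \sum_{j=1}^{2}\sum_{q\ge -1}\partial_j\bigl(\Delta_q u^n_j\,\widetilde\Delta_q\theta^n\bigr)\\
  &= \sum_{q\ge -1}\Delta_q(\operatorname{div}u^n)\,\widetilde\Delta_q\theta^n
     + \sum_{j=1}^{2}\sum_{q\ge -1}\Delta_q u^n_j\,\widetilde\Delta_q\,\partial_j\theta^n\\
  &= \sum_{j=1}^{2} R(u^n_j,\partial_j\theta^n)
  \qquad\text{since } \operatorname{div}u^n = 0.
\end{align*}
```

Writing the remainder in divergence form is what allows the derivative to be placed outside the product, so that only the regularity of $R(u^n, \theta^n)$ itself has to be estimated.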

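For the remainder term $R$, it is standard (see e.g. [3]) that

\[ \|R(u^n, \theta^n)\|_{B^1_{\infty,\infty}} \le C\,\|\theta^n\|_{B^0_{\infty,\infty}}\,\|u^n\|_{B^1_{\infty,\infty}}. \tag{10} \]

Now, because $\Delta u^n = \nabla^\perp\omega^n$ with $\nabla^\perp := (-\partial_2, \partial_1)$, one may decompose $u^n$ into

\[ u^n = \Delta_{-1}u^n - \sum_{q\ge 0}\nabla^\perp(-\Delta)^{-1}\Delta_q\omega^n. \]

Using Bernstein inequalities and the fact that the operator $\nabla^\perp(-\Delta)^{-1}$ is homogeneous of degree $-1$, we eventually get

\[ \|u^n\|_{B^1_{\infty,\infty}} \le C\bigl(\|u^n\|_{L^\infty} + \|\omega^n\|_{L^\infty}\bigr). \tag{11} \]

As the operator $\operatorname{div}$ maps $B^1_{\infty,\infty}$ in $B^0_{\infty,\infty}$, and as $B^0_{\infty,\infty} \hookrightarrow B^{-1}_{\infty,1}$ and $H^1 \hookrightarrow B^0_{\infty,\infty}$, we thus get from (10) and (11),

\[ \|\operatorname{div} R(u^n, \theta^n)\|_{B^{-1}_{\infty,1}} \le C\,\|\theta^n\|_{H^1}\bigl(\|u^n\|_{L^\infty} + \|\omega^n\|_{L^\infty}\bigr). \tag{12} \]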


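Next, making use of continuity properties for the paraproduct operator (see e.g. [3]), we discover that

\[ \|T_{\partial_j\theta^n}u^n_j\|_{B^{-1}_{\infty,1}} + \|T_{u^n_j}\partial_j\theta^n\|_{B^{-1}_{\infty,1}} \le C\,\|u^n_j\|_{L^\infty}\,\|\partial_j\theta^n\|_{B^{-1}_{\infty,1}} \quad\text{for } j = 1, 2. \]

Plugging this latter inequality and (12) in (8), we get

\[ \|u^n\cdot\nabla\theta^n\|_{B^{-1}_{\infty,1}} \le C\Bigl(\bigl(\|u^n\|_{L^\infty} + \|\omega^n\|_{L^\infty}\bigr)\|\theta^n\|_{H^1} + \|u^n\|_{L^\infty}\,\|\theta^n\|_{B^0_{\infty,1}}\Bigr). \tag{13} \]

In order to conclude, one may use the following two inequalities, the proof of which has been postponed to the Appendix:

\[ \|u^n\|_{L^\infty} \le C\,\|u^n\|_{L^2}^{\frac12}\,\|\omega^n\|_{L^\infty}^{\frac12}, \tag{14} \]

\[ \|\theta^n\|_{B^0_{\infty,1}} \le C\,\|\theta^n\|_{L^2}^{\frac12}\,\|\theta^n\|_{B^1_{\infty,1}}^{\frac12}. \tag{15} \]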

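Inserting (14) and (15) in (13) then using the Young inequality, we get for all $\varepsilon > 0$,

\[
\int_0^t \|u^n\cdot\nabla\theta^n\|_{B^{-1}_{\infty,1}}\,d\tau
\le C\Bigl(\int_0^t \|\theta^n\|_{H^1}\bigl(\|u^n\|_{L^2} + \|\omega^n\|_{L^\infty}\bigr)\,d\tau
+ \frac{1+\kappa t}{\varepsilon\kappa}\int_0^t \|u^n\|_{L^2}\,\|\omega^n\|_{L^\infty}\,\|\theta^n\|_{L^2}\,d\tau
+ \frac{\varepsilon\kappa}{1+\kappa t}\int_0^t \|\theta^n\|_{B^1_{\infty,1}}\,d\tau\Bigr).
\]

Taking $\varepsilon$ sufficiently small and coming back to (7), we end up with

\[
\Theta^n(t) \le C(1+\kappa t)\Bigl(\Theta^n_0 + \int_0^t \|\theta^n\|_{H^1}\,\|u^n\|_{L^2}\,d\tau
+ \int_0^t \bigl(\|\theta^n\|_{H^1} + (\kappa^{-1} + t)\,\|u^n\|_{L^2}\,\|\theta^n\|_{L^2}\bigr)\|\omega^n\|_{L^\infty}\,d\tau\Bigr),
\]

where

\[ \Theta^n(t) := \sup_{\alpha\in[1,\infty]}\kappa^{\frac1\alpha}\,\|\theta^n\|_{L^\alpha_t(B^{-1+\frac2\alpha}_{\infty,1})} \quad\text{and}\quad \Theta^n_0 := \|\theta_0^n\|_{B^{-1}_{\infty,1}}. \]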
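The use of the Young inequality in the last step can be made explicit as follows. This is a sketch under the notation of (13), (14) and (15); the auxiliary quantities $a$, $b$ and $\delta$ are introduced here for illustration only.

```latex
% Sketch: how (14), (15) and Young's inequality split the last term of (13).
% Auxiliary notation (not from the paper):
%   a := \|u^n\|_{L^2}\|\omega^n\|_{L^\infty}\|\theta^n\|_{L^2},
%   b := \|\theta^n\|_{B^1_{\infty,1}},
%   \delta := \varepsilon\kappa/(1+\kappa t).
\begin{align*}
\|u^n\|_{L^\infty}\,\|\theta^n\|_{B^0_{\infty,1}}
  &\le C^2\,a^{\frac12}\,b^{\frac12}
  && \text{by (14) and (15)},\\
a^{\frac12}\,b^{\frac12}
  = \Bigl(\frac{a}{\delta}\Bigr)^{\frac12}(\delta b)^{\frac12}
  &\le \frac{1}{2\delta}\,a + \frac{\delta}{2}\,b
   = \frac{1+\kappa t}{2\varepsilon\kappa}\,a + \frac{\varepsilon\kappa}{2(1+\kappa t)}\,b
  && \text{(Young)}.
\end{align*}
```

After multiplication by the factor $C(1+\kappa t)$ coming from (7) with $\alpha = 1$, the $b$ term is bounded by a multiple of $\varepsilon\kappa\|\theta^n\|_{L^1_t(B^1_{\infty,1})}$, which is why it can be absorbed in the left-hand side once $\varepsilon$ is small enough.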

On the one hand, the above inequality is rewritten:

\[ \Theta^n(t) \le f^n(t) + (1+\kappa t)^2\int_0^t g^n(\tau)\,\|\omega^n(\tau)\|_{L^\infty}\,d\tau \tag{16} \]

with

\[
\begin{cases}
f^n(t) = C(1+\kappa t)\Bigl(\Theta^n_0 + \displaystyle\int_0^t \|\theta^n\|_{H^1}\,\|u^n\|_{L^2}\,d\tau\Bigr),\\[6pt]
g^n(t) = C\bigl(\|\theta^n(t)\|_{H^1} + \kappa^{-1}\,\|u^n(t)\|_{L^2}\,\|\theta^n(t)\|_{L^2}\bigr).
\end{cases}
\]

On the other hand, according to (5) and as $B^0_{\infty,1} \hookrightarrow L^\infty$, we have

\[ \|\omega^n(t)\|_{L^\infty} \le \|\omega_0^n\|_{L^\infty} + C\kappa^{-1}\,\Theta^n(t). \tag{17} \]

Inserting the above inequality in (16) and making use of the Gronwall Lemma thus yields

\[ \Theta^n(t) \le \Bigl(f^n(t) + (1+\kappa t)^2\,\|\omega_0^n\|_{L^\infty}\int_0^t g^n(\tau)\,d\tau\Bigr)\,e^{C\kappa^{-1}(1+\kappa t)^2\int_0^t g^n(\tau)\,d\tau}. \tag{18} \]

Obviously, (3) and (4) imply that $(u^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2)$ and that $(\theta^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty(\mathbb{R}_+; L^2) \cap L^2_{loc}(\mathbb{R}_+; H^1)$. Therefore the right-hand side of (18) may be bounded independently of $n$. This provides a uniform bound for $\theta^n$ in the space $L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1}) \cap L^\infty_{loc}(\mathbb{R}_+; B^{-1}_{\infty,1})$. Next, coming back to (17) yields a bound for $(\omega^n)_{n\in\mathbb{N}}$ in $L^\infty_{loc}(\mathbb{R}_+; L^\infty)$.
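The passage from (16) and (17) to (18) is a standard Gronwall argument; a sketch follows. Here $\Theta^n(t)$ denotes the supremum quantity appearing on the left-hand side of (16), and the auxiliary functions $A^n$ and $B$ are introduced only for this illustration.

```latex
% Sketch of the Gronwall step; A^n and B are auxiliary notation (not from the paper).
\begin{align*}
\Theta^n(t)
  &\le \underbrace{f^n(t) + (1+\kappa t)^2\,\|\omega_0^n\|_{L^\infty}\int_0^t g^n\,d\tau}_{=:\,A^n(t)}
   \;+\; \underbrace{C\kappa^{-1}(1+\kappa t)^2}_{=:\,B(t)}\int_0^t g^n(\tau)\,\Theta^n(\tau)\,d\tau .
\intertext{Since $A^n$ and $B$ are nondecreasing, for fixed $T$ and all $t\in[0,T]$,}
\Theta^n(t)
  &\le A^n(T) + B(T)\int_0^t g^n(\tau)\,\Theta^n(\tau)\,d\tau
  \quad\Longrightarrow\quad
  \Theta^n(T) \le A^n(T)\,\exp\Bigl(B(T)\int_0^T g^n(\tau)\,d\tau\Bigr).
\end{align*}
```

Freezing the coefficients at the final time $T$ is what allows the classical Gronwall Lemma, stated for constant coefficients, to be applied even though $A^n$ and $B$ depend on time.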

Fourth step. In order to show that $(\theta^n, u^n)_{n\in\mathbb{N}}$ converges (up to extraction), boundedness information on $(\partial_t\theta^n, \partial_t u^n)$ is needed. As for the temperature, because

\[ \partial_t\theta^n = \kappa\Delta\theta^n - u^n\cdot\nabla\theta^n, \]

the previous steps imply that $(\partial_t\theta^n)_{n\in\mathbb{N}}$ is bounded in $L^2_{loc}(\mathbb{R}_+; H^{-1})$.

We claim that $(\partial_t u^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2)$. Indeed, applying the Leray projector $P$ over divergence free vector-fields to the velocity equation yields

\[ \partial_t u^n = P(\theta^n e_2 - u^n\cdot\nabla u^n). \]

Since $(\theta^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty(\mathbb{R}_+; L^2)$, so is $P(\theta^n e_2)$. Next, as $(\omega^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^r)$, so is $(\nabla u^n)_{n\in\mathbb{N}}$ according to Proposition 1. Finally, the previous results imply that the sequence $(u^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2 \cap L^\infty)$, thus in $L^\infty_{loc}(\mathbb{R}_+; L^s)$ with $s = 2r/(r-2)$. Thanks to the Hölder inequality, one can thus conclude that $(u^n\cdot\nabla u^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2)$.

Fifth step. Passing to the limit. According to the previous steps, we have

• $(\theta^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1}) \cap L^2_{loc}(\mathbb{R}_+; H^1) \cap L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1})$,
• $(\partial_t\theta^n)_{n\in\mathbb{N}}$ is bounded in $L^2_{loc}(\mathbb{R}_+; H^{-1})$,
• $(u^n)_{n\in\mathbb{N}}$ and $(\partial_t u^n)_{n\in\mathbb{N}}$ are bounded in $L^\infty_{loc}(\mathbb{R}_+; L^2)$,
• $(\omega^n)_{n\in\mathbb{N}}$ is bounded in $L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty)$.

Because $L^2$ is (locally) compactly embedded in $H^{-1}$, the classical Aubin-Lions argument (see e.g. [2]) ensures that, up to extraction, the sequence $(\theta^n, u^n)_{n\in\mathbb{N}}$ strongly converges in $L^\infty_{loc}(\mathbb{R}_+; H^{-1}_{loc})$ to some function $(\theta, u)$ so that

\[
\theta \in L^\infty_{loc}(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1}) \cap L^2_{loc}(\mathbb{R}_+; H^1) \cap L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1}),
\quad
u \in C^{0,1}_{loc}(\mathbb{R}_+; L^2)
\quad\text{and}\quad
\omega \in L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty).
\]

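Now, interpolating with the uniform bounds stated in the previous steps, it is easy to pass to the limit in $(B_{\kappa,0})$. Finally, from standard properties for the heat equation (see e.g. [8]) we get in addition $\theta \in C(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1})$. This completes the proof of existence.

Sixth step. In order to show the uniqueness part of our statement, we shall use the Yudovich argument [17] revisited by P. Gérard in [11]. Let $(\theta_1, u_1, \Pi_1)$ and $(\theta_2, u_2, \Pi_2)$ satisfy (2) and $(B_{\kappa,0})$ with the same data. Denote $\delta\theta := \theta_2 - \theta_1$, $\delta u := u_2 - u_1$ and $\delta\Pi := \Pi_2 - \Pi_1$. Because

\[ \partial_t\delta u + u_2\cdot\nabla\delta u + \nabla\delta\Pi = -\delta u\cdot\nabla u_1 + \delta\theta\,e_2, \]

a standard energy method combined with the Hölder inequality yields for all $p \in [2, \infty[$,

\[ \frac12\frac{d}{dt}\|\delta u\|_{L^2}^2 \le \|\nabla u_1\|_{L^p}\,\|\delta u\|_{L^{2p'}}^2 + \|\delta\theta\|_{L^2}\,\|\delta u\|_{L^2} \quad\text{with } p' := \frac{p}{p-1}. \]

This inequality is rewritten:

\[ \frac12\frac{d}{dt}\|\delta u\|_{L^2}^2 \le p\,\|\nabla u_1\|_{L}\,\|\delta u\|_{L^\infty}^{\frac2p}\,\|\delta u\|_{L^2}^{\frac2{p'}} + \|\delta\theta\|_{L^2}\,\|\delta u\|_{L^2} \tag{19} \]

with

\[ \|\nabla u_1\|_{L} := \sup_{r\le p<\infty}\frac{\|\nabla u_1\|_{L^p}}{p}. \]

Let us point out that, by virtue of Proposition 1, as $\omega_1 \in L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty)$ the term $\|\nabla u_1(t)\|_{L}$ is locally bounded. Of course, combining the fact that $u_i \in L^\infty_{loc}(\mathbb{R}_+; L^2)$ and $\omega_i \in L^\infty_{loc}(\mathbb{R}_+; L^\infty)$ for $i = 1, 2$ implies that $\delta u \in L^\infty_{loc}(\mathbb{R}_+; L^\infty)$.

Next, we notice that $\delta\theta$ satisfies

\[ \partial_t\delta\theta - \kappa\Delta\delta\theta = -u_2\cdot\nabla\delta\theta - \delta u\cdot\nabla\theta_1, \qquad \delta\theta|_{t=0} = 0. \]

Our regularity assumptions over the solutions ensure that the right-hand side belongs to $L^2_{loc}(\mathbb{R}_+; L^2)$. Hence, according to a standard maximal regularity result for the heat equation, we deduce that $\partial_t\delta\theta \in L^2_{loc}(\mathbb{R}_+; L^2)$, and an energy method yields

\[ \frac12\frac{d}{dt}\|\delta\theta\|_{L^2}^2 \le \|\nabla\theta_1\|_{L^\infty}\,\|\delta\theta\|_{L^2}\,\|\delta u\|_{L^2}. \tag{20} \]

Let $\varepsilon$ be a small parameter (bound to tend to 0). Denote

\[ X_\varepsilon(t) := \sqrt{\|\delta\theta(t)\|_{L^2}^2 + \|\delta u(t)\|_{L^2}^2 + \varepsilon^2}. \]

Putting inequalities (19) and (20) together gives

\[ \frac{d}{dt}X_\varepsilon \le p\,\|\nabla u_1\|_{L}\,\|\delta u\|_{L^\infty}^{\frac2p}\,X_\varepsilon^{1-\frac2p} + \frac12\bigl(1 + \|\nabla\theta_1\|_{L^\infty}\bigr)X_\varepsilon. \]

Let $\gamma(t) := \frac12(1 + \|\nabla\theta_1(t)\|_{L^\infty})$. The assumptions over $\theta_1$ ensure that the function $\gamma$ is in $L^1_{loc}(\mathbb{R}_+)$. Therefore, setting $Y_\varepsilon := e^{-\int_0^t\gamma(\tau)\,d\tau}X_\varepsilon$, the previous inequality is rewritten:

\[ \frac2p\,Y_\varepsilon^{\frac2p-1}\,\frac{d}{dt}Y_\varepsilon \le 2\,\|\nabla u_1\|_{L}\,\|\delta u\|_{L^\infty}^{\frac2p}\,e^{-\frac2p\int_0^t\gamma(\tau)\,d\tau}. \]

Performing a time integration yields

\[ Y_\varepsilon(t) \le \Bigl(\varepsilon^{\frac2p} + 2\int_0^t \|\nabla u_1\|_{L}\,\|\delta u\|_{L^\infty}^{\frac2p}\,d\tau\Bigr)^{\frac p2}. \]

Having $\varepsilon$ tend to 0, we end up with

\[ \|\delta\theta(t)\|_{L^2}^2 + \|\delta u(t)\|_{L^2}^2 \le e^{2\int_0^t\gamma(\tau)\,d\tau}\,\|\delta u\|_{L^\infty_t(L^\infty)}^2\Bigl(2\int_0^t \|\nabla u_1\|_{L}\,d\tau\Bigr)^{p} \quad\text{for all } t \in \mathbb{R}_+. \tag{21} \]

As explained above, the term $\|\nabla u_1(t)\|_{L}$ is locally bounded. Hence one may find a positive time $T$ so that $\int_0^T \|\nabla u_1\|_{L}\,d\tau < \frac12$. Letting $p$ tend to infinity in (21) thus entails that $(\delta\theta, \delta u) \equiv 0$ on $[0, T]$. Because $\delta\theta$ and $\delta u$ are continuous in time with values in $L^2$, it is now easy to conclude that $(\delta\theta, \delta u) \equiv 0$ on $\mathbb{R}_+$, by means of a standard connectivity argument.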
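Two elementary facts drive the uniqueness argument: the interpolation that turns the $L^{2p'}$ norm in the first energy inequality into the form (19), and the geometric decay in $p$ of the right-hand side of (21). Both are sketched below; the shorthand $\eta$ is introduced here for illustration and does not appear in the paper.

```latex
% Sketch: interpolation behind (19), and why p -> infinity forces the difference to vanish.
% Auxiliary notation (not from the paper): \eta := 2\int_0^T \|\nabla u_1\|_{L}\,d\tau < 1.
\begin{align*}
\int_{\mathbb{R}^2}|\delta u|^{2p'}\,dx
  &\le \|\delta u\|_{L^\infty}^{2p'-2}\,\|\delta u\|_{L^2}^{2}
  \quad\Longrightarrow\quad
  \|\delta u\|_{L^{2p'}}^{2}
  \le \|\delta u\|_{L^\infty}^{\frac2p}\,\|\delta u\|_{L^2}^{\frac2{p'}}
  \qquad\Bigl(\text{as } 2-\tfrac{2}{p'}=\tfrac2p\Bigr),\\
\|\nabla u_1\|_{L^p}
  &\le p\,\|\nabla u_1\|_{L}
  \qquad\text{for } p\ge r \text{ (definition of } \|\cdot\|_{L}\text{)},\\
\Bigl(2\int_0^t\|\nabla u_1\|_{L}\,d\tau\Bigr)^{p}
  &\le \eta^{p}\xrightarrow[p\to\infty]{}0
  \qquad\text{for all } t\in[0,T].
\end{align*}
```

The remaining factors on the right-hand side of (21) do not depend on $p$, so the left-hand side must vanish on $[0, T]$. The linear growth in $p$ allowed by $\|\cdot\|_{L}$ is exactly what the bound of Proposition 1 provides for bounded vorticity.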


2. A Global Result for Infinite Energy Initial Velocity

In dimension two, the assumption that $u_0$ is in $L^2$ is somewhat restrictive since it entails that the vorticity $\omega_0$ has zero average over $\mathbb{R}^2$. This in particular precludes our considering vortex-patch-like structures or, more generally, data with compactly supported nonnegative vorticity. The present section aims at generalizing our study to initial velocity fields with (possibly) infinite energy. The functional setting we shall introduce below is borrowed from Chemin’s in [7].

Let us first notice that whenever $g$ is a radial $C_c^\infty$ function supported away from the origin then the smooth vector field $\sigma$ defined by

\[ \sigma(x) = \frac{x^\perp}{|x|^2}\int_0^{|x|} r\,g(r)\,dr \tag{22} \]

is a stationary solution to the two-dimensional incompressible Euler equations, and has vorticity $\omega_\sigma : x \mapsto g(|x|)$. For $m \in \mathbb{R}$, we then define $E_m$ as the set of all divergence-free $L^2$ perturbations of a velocity field $\sigma$ satisfying (22) and

\[ \int_{\mathbb{R}^2} g(|x|)\,dx = m. \tag{23} \]
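The claims made about $\sigma$ can be verified directly. The sketch below assumes the convention $x^\perp = (-x_2, x_1)$, consistent with $\nabla^\perp = (-\partial_2, \partial_1)$, and introduces the auxiliary profile $h$ for illustration only.

```latex
% Sketch: the field (22) is divergence free, has vorticity g(|x|), and has infinite energy if m \neq 0.
% Auxiliary notation (not from the paper): \rho := |x|, h(\rho) := \rho^{-2}\int_0^\rho r g(r)\,dr,
% so that \sigma(x) = x^\perp h(|x|) with x^\perp = (-x_2, x_1).
\begin{align*}
h'(\rho) &= -\frac{2}{\rho}\,h(\rho) + \frac{g(\rho)}{\rho},\\
\operatorname{div}\sigma
  &= -x_2\,\partial_1\bigl(h(|x|)\bigr) + x_1\,\partial_2\bigl(h(|x|)\bigr)
   = \frac{h'(\rho)}{\rho}\,(-x_2x_1 + x_1x_2) = 0,\\
\omega_\sigma = \partial_1\sigma^2 - \partial_2\sigma^1
  &= \partial_1\bigl(x_1 h\bigr) + \partial_2\bigl(x_2 h\bigr)
   = 2h(\rho) + \rho\,h'(\rho) = g(\rho),\\
\sigma(x) &= \frac{m}{2\pi}\,\frac{x^\perp}{|x|^2}
  \qquad\text{for } |x| \text{ beyond the support of } g,
  \text{ since } \int_0^\infty r\,g(r)\,dr = \frac{m}{2\pi}.
\end{align*}
```

The last line shows that $|\sigma(x)|$ decays like $|m|/(2\pi|x|)$ at infinity, so $\sigma$ fails to be square integrable as soon as $m \neq 0$; this is the infinite energy situation the section is designed to handle.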

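Showing that the definition of $E_m$ depends only on $m$ is left to the reader (it is only a matter of using Fourier variables). The rest of this section is devoted to the proof of the following generalization of Theorem 1.

Theorem 2. Let $\theta_0 \in L^2 \cap B^{-1}_{\infty,1}$ and $u_0 \in E_m$ for some $m \in \mathbb{R}$. Assume in addition that the initial vorticity $\omega_0$ belongs to $L^r \cap L^\infty$ for some finite $r \ge 2$. Then System $(B_{\kappa,0})$ admits a unique global solution $(\theta, u)$ such that

\[
\theta \in C(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1}) \cap L^2_{loc}(\mathbb{R}_+; H^1) \cap L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1}),
\quad
u \in C^{0,1}_{loc}(\mathbb{R}_+; E_m)
\quad\text{and}\quad
\omega \in L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty). \tag{24}
\]

Proof. As it is very similar to that of Theorem 1, we just sketch the proof and point out what has to be changed. Throughout we fix a stationary vector-field $\sigma$ satisfying (22) and (23). Setting $u = v + \sigma$, System $(B_{\kappa,0})$ is rewritten:

\[
\begin{cases}
\partial_t\theta + (v + \sigma)\cdot\nabla\theta - \kappa\Delta\theta = 0,\\
\partial_t v + (v + \sigma)\cdot\nabla v + v\cdot\nabla\sigma + \nabla\Pi = \theta\,e_2,\\
\operatorname{div} v = 0.
\end{cases}
\tag{25}
\]

As $\operatorname{div}\sigma = \operatorname{div} v = 0$, the energy estimates for $\theta$ remain the same. As for the velocity field, having the new term $v\cdot\nabla\sigma$ in the equation implies that

\[ \|v(t)\|_{L^2} \le e^{t\|\nabla\sigma\|_{L^\infty}}\,\|v_0\|_{L^2} + \frac{e^{t\|\nabla\sigma\|_{L^\infty}} - 1}{\|\nabla\sigma\|_{L^\infty}}\,\|\theta_0\|_{L^2}. \tag{26} \]

Now, the vorticity $\omega_v$ associated to $v$ satisfies

\[ \partial_t\omega_v + (v + \sigma)\cdot\nabla\omega_v + v\cdot\nabla\omega_\sigma = \partial_1\theta. \]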


Hence for all $p \in [r, \infty]$,

\[ \|\omega_v(t)\|_{L^p} \le \|\omega_v(0)\|_{L^p} + \int_0^t \|\partial_1\theta\|_{L^p}\,d\tau + \int_0^t \|v\|_{L^p}\,\|\nabla\omega_\sigma\|_{L^\infty}\,d\tau. \]

Splitting $v$ into

\[ v = \Delta_{-1}v - \sum_{q\in\mathbb{N}}\nabla^\perp(-\Delta)^{-1}\Delta_q\omega_v \]

and using the Bernstein inequality, we readily get

\[ \|v\|_{L^p} \le C\bigl(\|v\|_{L^2} + \|\omega_v\|_{L^p}\bigr). \]

Therefore, as in the proof of Theorem 1, in order to bound $\omega_v$ in $L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty)$, it suffices to get a bound for $\partial_1\theta$ in $L^1_{loc}(\mathbb{R}_+; L^\infty)$. This may be achieved by bounding $\partial_1\theta$ in $L^1_{loc}(\mathbb{R}_+; B^0_{\infty,1})$, given that

\[ \partial_t\theta - \kappa\Delta\theta = -v\cdot\nabla\theta - \sigma\cdot\nabla\theta. \]

Arguing as in (7) reduces the problem to getting an appropriate bound for the new term $\sigma\cdot\nabla\theta$ in $L^1_{loc}(\mathbb{R}_+; B^{-1}_{\infty,1})$. For this purpose, one may use again Bony’s decomposition, the fact that $\operatorname{div}\sigma = 0$ and classical continuity properties for the paraproduct and remainder operators. One ends up for instance with:

\[ \|\sigma\cdot\nabla\theta\|_{B^{-1}_{\infty,1}} \le C\,\|\sigma\|_{B^{\varepsilon}_{\infty,\infty}}\,\|\theta\|_{L^\infty}. \]

Combining (14) and the Young inequality, it is now easy to get an inequality similar to (16), and thus a bound for $\theta$ in $L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1}) \cap L^\infty_{loc}(\mathbb{R}_+; B^{-1}_{\infty,1})$.

In order to prove the uniqueness, it is fundamental to notice that if $(\theta_1, u_1)$ and $(\theta_2, u_2)$ both solve $(B_{\kappa,0})$ with the same data, and satisfy (24) with the same $m$ (an assumption which is not restrictive since we know that $u_1$ and $u_2$ coincide initially) then one may write $u_1 = \sigma + v_1$ and $u_2 = \sigma + v_2$ for some stationary vector-field $\sigma$ satisfying (22), (23) and $v_1$, $v_2$ in $L^\infty_{loc}(\mathbb{R}_+; L^2)$. Taking advantage of Eq. $(25)_2$, it is obvious that $\partial_t v_1$ and $\partial_t v_2$ are in $L^\infty_{loc}(\mathbb{R}_+; L^2)$. Now, we notice that $(\delta v, \delta\theta) := (v_2 - v_1, \theta_2 - \theta_1)$ satisfy

\[
\begin{cases}
\partial_t\delta\theta + u_2\cdot\nabla\delta\theta - \kappa\Delta\delta\theta = -\delta v\cdot\nabla\theta_1,\\
\partial_t\delta v + u_2\cdot\nabla\delta v + \nabla\delta\Pi = -\delta v\cdot\nabla u_1 + \delta\theta\,e_2 - \delta v\cdot\nabla\sigma.
\end{cases}
\]

Up to the additional term $-\delta v\cdot\nabla\sigma$, which may be bounded as follows:

\[ \|\delta v\cdot\nabla\sigma\|_{L^2} \le \|\delta v\|_{L^2}\,\|\nabla\sigma\|_{L^\infty}, \]

the energy bounds for the above system are the same as in the case $\sigma = 0$. Hence, from an argument similar to those used in the previous section, it is easy to conclude the proof of uniqueness. The details are left to the reader.

3. Further Results and Concluding Remarks

In this concluding section, we list a few extensions which may be obtained by straightforward generalizations of our method.


3.1. Remarks concerning the Boussinesq system. Let us stress that the key to the proof of Theorems 1 and 2 is that, on the one hand, the solution does not develop singularities as long as

\[ \int_0^T \|\nabla\theta\|_{L^\infty}\,dt < \infty, \]

and that, on the other hand, under quite weak assumptions over the initial data, the above integral remains finite for all $T < \infty$.

In fact, a quick revisitation of our proof shows that if one assumes in addition that $\omega_0 \in C^\varepsilon$ and $\theta_0 \in C^{-1+\varepsilon}$ (with $C^{-1+\varepsilon} := B^{-1+\varepsilon}_{\infty,\infty}$) for some $\varepsilon \in\, ]0, 1[$ then both $\nabla\theta$ and $\nabla u$ are in $L^1_{loc}(\mathbb{R}_+; L^\infty(\mathbb{R}^2))$ so that the additional Hölder regularity is conserved during the evolution. We believe that, more generally, our study opens a way to investigate vortex patches structures (or striated regularity) for the Boussinesq system with $\kappa > 0$ and $\nu = 0$.

Let us also emphasize that if, in addition to the hypotheses of Theorem 1, we have $u_0 \in B^1_{\infty,1}$ then the corresponding solution $(\theta, u)$ also satisfies $u \in C(\mathbb{R}_+; B^1_{\infty,1})$.

Indeed, according to a result by M. Vishik in [16] concerning the transport equation, one can propagate the $B^0_{\infty,1}$ regularity over the vorticity $\omega$ provided $\partial_1\theta$ is in $L^1_{loc}(\mathbb{R}_+; B^0_{\infty,1})$, and there exists some universal constant $C$ such that

\[ \|\omega(t)\|_{B^0_{\infty,1}} \le C\Bigl(1 + \int_0^t \|\nabla u\|_{L^\infty}\,d\tau\Bigr)\Bigl(\|\omega_0\|_{B^0_{\infty,1}} + \int_0^t \|\partial_1\theta\|_{B^0_{\infty,1}}\,d\tau\Bigr). \tag{27} \]

Now, under the sole assumptions of Theorem 1, one may bound $\partial_1\theta$ in $L^1_{loc}(\mathbb{R}_+; B^0_{\infty,1})$ by means of the norms of the data. Because, owing to $B^0_{\infty,1} \hookrightarrow L^\infty$ and (14), one may write

\[ \|\nabla u\|_{L^\infty} \le C\bigl(\|u\|_{L^2} + \|\omega\|_{B^0_{\infty,1}}\bigr), \]

Inequality (27) combined with the Gronwall Lemma ensures the conservation of the additional $B^0_{\infty,1}$ regularity for the vorticity (and thus of the $B^1_{\infty,1}$ regularity for the velocity). This argument provides another proof of Hmidi and Keraani’s result in [13] under somewhat weaker assumptions over $\theta_0$ (there, having $\theta_0$ in (a subspace of) $L^\infty$ was needed).

3.2. The Bénard system. Our method may also be adapted with almost no change to the study of the following Bénard system:

\[
\begin{cases}
\partial_t\theta + u\cdot\nabla\theta - \kappa\Delta\theta = u_2,\\
\partial_t u + u\cdot\nabla u + \nabla p = \theta\,e_2,\\
(\theta, u)|_{t=0} = (\theta_0, u_0),
\end{cases}
\tag{28}
\]

which describes convective motions in a heated two-dimensional inviscid incompressible fluid under thermal effects (see e.g. [1], Chap. 6). We get

Theorem 3. For all data $(\theta_0, u_0)$ with $\theta_0 \in L^2 \cap B^{-1}_{\infty,1}$ and $u_0 \in L^2$ satisfying $\operatorname{div} u_0 = 0$ and $\omega_0 \in L^r \cap L^\infty$ for some $r \in [2, \infty[$, System (28) has a unique global solution $(\theta, u)$ such that

\[
\theta \in C(\mathbb{R}_+; L^2 \cap B^{-1}_{\infty,1}) \cap L^2_{loc}(\mathbb{R}_+; H^1) \cap L^1_{loc}(\mathbb{R}_+; B^1_{\infty,1}),
\quad
u \in C^{0,1}_{loc}(\mathbb{R}_+; L^2)
\quad\text{and}\quad
\omega \in L^\infty_{loc}(\mathbb{R}_+; L^r \cap L^\infty). \tag{29}
\]

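Proof. We just briefly indicate what has to be changed compared to the proof of Theorem 1. Owing to the new term $u_2$ in the equation for the temperature, the energy estimates read

\[ \frac12\frac{d}{dt}\|\theta\|_{L^2}^2 + \kappa\|\nabla\theta\|_{L^2}^2 = \int \theta\,u_2\,dx, \tag{30} \]

\[ \frac12\frac{d}{dt}\|u\|_{L^2}^2 = \int \theta\,u_2\,dx. \tag{31} \]

Adding up equalities (30) and (31) yields

\[ \frac12\frac{d}{dt}\|(\theta, u)(t)\|_{L^2}^2 + \kappa\|\nabla\theta\|_{L^2}^2 = 2\int \theta\,u_2\,dx \le \|(\theta, u)\|_{L^2}^2. \]

Thanks to the Gronwall inequality, we thus infer that

\[ \|(\theta, u)(t)\|_{L^2}^2 + 2\kappa\int_0^t \|\nabla\theta(\tau)\|_{L^2}^2\,d\tau \le \|(\theta_0, u_0)\|_{L^2}^2\,e^{2t}. \]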
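The Gronwall step for the Bénard system can be spelled out as follows. This is a sketch; the shorthand $E(t)$ for the total energy is introduced here for illustration and is not notation from the paper.

```latex
% Sketch of the Gronwall step; auxiliary notation: E(t) := \|(\theta,u)(t)\|_{L^2}^2.
\begin{align*}
2\int\theta\,u_2\,dx
  &\le 2\,\|\theta\|_{L^2}\|u\|_{L^2}
   \le \|\theta\|_{L^2}^2 + \|u\|_{L^2}^2 = E(t),\\
\frac{d}{dt}E + 2\kappa\|\nabla\theta\|_{L^2}^2 &\le 2E
  \quad\Longrightarrow\quad
  \frac{d}{dt}\bigl(e^{-2t}E\bigr) + 2\kappa\,e^{-2t}\|\nabla\theta\|_{L^2}^2 \le 0,\\
e^{-2t}E(t) + 2\kappa\,e^{-2t}\int_0^t\|\nabla\theta(\tau)\|_{L^2}^2\,d\tau
  &\le e^{-2t}E(t) + 2\kappa\int_0^t e^{-2\tau}\|\nabla\theta(\tau)\|_{L^2}^2\,d\tau
   \le E(0).
\end{align*}
```

Multiplying the last line by $e^{2t}$ gives the stated bound. In contrast with (3), the energy is no longer nonincreasing, but it grows at most exponentially, which is enough for the local-in-time bounds used in the rest of the proof.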

The rest of the proof of Theorem 3 follows the lines of that of Theorem 1, once it has been noticed that the computations leading to Inequality (7) (see the Appendix) also yield

\[ \kappa\,\Bigl\|\int_0^t e^{(t-s)\kappa\Delta}u_2(s)\,ds\Bigr\|_{L^1_T(B^1_{\infty,1})} \le C(1+\kappa T)\int_0^T \|u_2\|_{L^\infty}\,dt. \]

Note also that having the new (lower order) term $u_2$ in Eq. $(28)_1$ is harmless for proving uniqueness.

Appendix

Here we prove a few inequalities which have been used throughout the paper.

Proof of Inequality (7). Assume that $\theta$ satisfies

\[ \partial_t\theta - \kappa\Delta\theta = f, \qquad \theta|_{t=0} = \theta_0. \]

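Then applying the dyadic operator $\Delta_q$ to the above equality yields

\[ \partial_t\Delta_q\theta - \kappa\Delta\Delta_q\theta = \Delta_q f \quad\text{for all } q \ge -1. \]

From the maximum principle, we readily get

\[ \|\Delta_{-1}\theta(t)\|_{L^\infty} \le \|\Delta_{-1}\theta_0\|_{L^\infty} + \int_0^t \|\Delta_{-1}f(\tau)\|_{L^\infty}\,d\tau, \]

whence for all $\alpha \in [1, \infty]$ and $t > 0$,

\[ \|\Delta_{-1}\theta\|_{L^\alpha([0,t];L^\infty)} \le C\,t^{\frac1\alpha}\bigl(\|\Delta_{-1}\theta_0\|_{L^\infty} + \|\Delta_{-1}f\|_{L^1([0,t];L^\infty)}\bigr). \tag{32} \]

Next, for bounding the high frequency blocks $\Delta_q\theta$ with $q \ge 0$, one may write

\[ \Delta_q\theta(t) = e^{\kappa t\Delta}\Delta_q\theta_0 + \int_0^t e^{\kappa(t-\tau)\Delta}\Delta_q f(\tau)\,d\tau, \tag{33} \]

where $(e^{\lambda\Delta})_{\lambda>0}$ stands for the heat semi-group, and take advantage of the following inequality stated by J.-Y. Chemin in [8]: there exist two positive constants $c$ and $C$ such that

\[ \|e^{\lambda\Delta}\Delta_q g\|_{L^\infty} \le C\,e^{-c\lambda 2^{2q}}\,\|\Delta_q g\|_{L^\infty} \quad\text{for all } \lambda > 0 \text{ and } q \ge 0. \tag{34} \]

From (33) and (34), we get

\[ \|\Delta_q\theta(t)\|_{L^\infty} \le C\Bigl(e^{-c\kappa 2^{2q}t}\,\|\Delta_q\theta_0\|_{L^\infty} + \int_0^t e^{-c\kappa 2^{2q}(t-\tau)}\,\|\Delta_q f(\tau)\|_{L^\infty}\,d\tau\Bigr). \]

Therefore, for all $\alpha \in [1, \infty]$, $q \ge 0$ and $t > 0$,

\[ \kappa^{\frac1\alpha}\,2^{(\frac2\alpha-1)q}\,\|\Delta_q\theta\|_{L^\alpha([0,t];L^\infty)} \le C\,2^{-q}\bigl(\|\Delta_q\theta_0\|_{L^\infty} + \|\Delta_q f\|_{L^1([0,t];L^\infty)}\bigr). \]

Summing on $q \ge 0$ and using (32), it is now easy to complete the proof of Inequality (7).

Proof of Inequalities (14) and (15). For proving the first inequality, let us consider an $L^2$ divergence free vector-field $u$ with bounded vorticity $\omega$. As $u$ is in $L^2$, one may write

\[ u = \sum_{q\in\mathbb{Z}}\dot\Delta_q u \quad\text{with } \dot\Delta_q := \varphi(2^{-q}D). \]

Let $N$ be an integer parameter to be chosen hereafter. Given that $u = -\nabla^\perp(-\Delta)^{-1}\omega$ and using the Bernstein inequalities, we have

\[ \|u\|_{L^\infty} \le \sum_{q\le N}\|\dot\Delta_q u\|_{L^\infty} + \sum_{q>N}\|\dot\Delta_q u\|_{L^\infty} \le C\,2^N\,\|u\|_{L^2} + C\sum_{q>N}2^{-q}\,\|\dot\Delta_q\omega\|_{L^\infty}. \]

Therefore,

\[ \|u\|_{L^\infty} \le C\,2^N\,\|u\|_{L^2} + C\,2^{-N}\,\|\omega\|_{L^\infty}. \]

Taking $N$ so that $2^N\|u\|_{L^2} \approx 2^{-N}\|\omega\|_{L^\infty}$, we get the desired inequality. Proving Inequality (15) relies on a similar decomposition into low and high frequencies. The details are left to the reader.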
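The choice of $N$ in the proof of (14) can be made explicit, and the same splitting gives one possible route to (15). The sketch below is an illustration under stated assumptions: $a$ and $b$ are shorthand introduced here, both are assumed positive, and the argument for (15) is only one plausible instance of the "similar decomposition" mentioned in the text, using the nonhomogeneous blocks $\Delta_q$ and the Bernstein inequality in dimension two.

```latex
% Sketch: optimizing N. Auxiliary notation (not from the paper):
% a := \|u\|_{L^2} > 0, b := \|\omega\|_{L^\infty} > 0.
\begin{align*}
&\text{Choose } N\in\mathbb{Z} \text{ with } 2^{N}\le (b/a)^{\frac12} < 2^{N+1}:
\qquad
2^{N}a \le a^{\frac12}b^{\frac12},
\qquad
2^{-N}b < 2\,(a/b)^{\frac12}\,b = 2\,a^{\frac12}b^{\frac12},\\
&\text{hence}\quad
\|u\|_{L^\infty} \le C\bigl(2^{N}a + 2^{-N}b\bigr) \le 3C\,\|u\|_{L^2}^{\frac12}\|\omega\|_{L^\infty}^{\frac12}.
\intertext{A possible analogue for (15), assuming $\|\Delta_q\theta\|_{L^\infty}\le C2^{q}\|\Delta_q\theta\|_{L^2}$ and $\|\Delta_q\theta\|_{L^2}\le C\|\theta\|_{L^2}$:}
&\|\theta\|_{B^0_{\infty,1}}
 = \sum_{q\ge -1}\|\Delta_q\theta\|_{L^\infty}
 \le C\sum_{-1\le q\le N}2^{q}\,\|\theta\|_{L^2}
   + 2^{-N}\sum_{q>N}2^{q}\|\Delta_q\theta\|_{L^\infty}
 \le C\,2^{N}\|\theta\|_{L^2} + 2^{-N}\|\theta\|_{B^1_{\infty,1}} .
\end{align*}
```

Since $N$ must be an integer, the two terms can only be balanced up to a factor 2, which is harmless because it is absorbed in the constant. For (15), $N$ would additionally have to satisfy $N \ge -1$; when the balancing value is smaller, one can simply bound the whole sum by the high frequency part.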


References

1. Ambrosetti, A., Prodi, G.: A Primer of Nonlinear Analysis. Cambridge Studies in Advanced Mathematics 34, Cambridge: Cambridge Univ. Press, 1995
2. Aubin, J.-P.: Un théorème de compacité. Comptes Rendus de l’Académie des Sciences, Paris 256, 5042–5044 (1963)
3. Bahouri, H., Chemin, J.-Y., Danchin, R.: Fourier Analysis and Nonlinear Partial Differential Equations. Springer, to appear
4. Bony, J.-M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. Scie. de l’école Normale Sup. 14, 209–246 (1981)
5. Cannon, J.R., Dibenedetto, E.: The Initial Value Problem for the Boussinesq Equations with Data in L^p. Lecture Notes in Math. 771, Berlin-Heidelberg-New York: Springer, 1980, pp. 129–144
6. Chae, D.: Global regularity for the 2-D Boussinesq equations with partial viscous terms. Adv. Math. 203(2), 497–513 (2006)
7. Chemin, J.-Y.: Fluides parfaits incompressibles. Astérisque 230, 1995
8. Chemin, J.-Y.: Théorèmes d’unicité pour le système de Navier-Stokes tridimensionnel. J. d’Anal. Math. 77, 25–50 (1999)
9. Danchin, R., Paicu, M.: Le théorème de Leray et le théorème de Fujita-Kato pour le système de Boussinesq partiellement visqueux. Bull. Soc. Math. France 136(2), 261–309 (2008)
10. E, W., Shu, C.-W.: Small-scale structures in Boussinesq convection. Phys. Fluids 6(1), 49–58 (1994)
11. Gérard, P.: Résultats récents sur les fluides parfaits incompressibles bidimensionnels (d’après J.-Y. Chemin et J.-M. Delort). Séminaire Bourbaki, Vol. 1991/92, Astérisque 206, 411–444 (1992)
12. Guo, B.: Spectral method for solving two-dimensional Newton-Boussineq equation. Acta Math. Appl. Sinica 5, 27–50 (1989)
13. Hmidi, T., Keraani, S.: On the global well-posedness of the Boussinesq system with zero viscosity, to appear in Indiana University Mathematical Journal
14. Moffatt, H.K.: Some remarks on topological fluid mechanics. In: An Introduction to the Geometry and Topology of Fluid Flows. R. L. Ricca, ed., Dordrecht: Kluwer Academic Publishers, 2001, pp. 3–10
15. Pedlosky, J.: Geophysical Fluid Dynamics. New York: Springer Verlag, 1987
16. Vishik, M.: Hydrodynamics in Besov spaces. Arch. Rat. Mech. Anal. 145(3), 197–214 (1998)
17. Yudovich, V.: Non-stationary flows of an ideal incompressible fluid. Akademija Nauk SSSR. Žurnal Vyčislitel’noĭ Matematiki i Matematičeskoĭ Fiziki 3, 1032–1066 (1963)

Communicated by P. Constantin

Commun. Math. Phys. 290, 15–22 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0819-z


A Criterion for Uniqueness of Lagrangian Trajectories for Weak Solutions of the 3D Navier-Stokes Equations

James C. Robinson, Witold Sadowski

Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK. E-mail: [email protected]

Received: 7 July 2008 / Accepted: 17 February 2009
Published online: 19 May 2009 – © Springer-Verlag 2009

Abstract: Foias, Guillopé, & Temam showed in 1985 that for a given weak solution $u \in L^\infty(0, T; L^2) \cap L^2(0, T; H^1)$ of the three-dimensional Navier-Stokes equations on a domain $\Omega$, one can define a ‘trajectory mapping’ $\Phi : \Omega \times [0, T] \to \Omega$ that gives a consistent choice of trajectory through each initial condition $a \in \Omega$, $\xi_a(t) = \Phi(a, t)$, and that respects the volume-preserving property one would expect for smooth flows. The uniqueness of this mapping is guaranteed by the theory of renormalised solutions of non-smooth ODEs due to DiPerna & Lions. However, this is a distinct question from the uniqueness of individual particle trajectories. We show here that if one assumes a little more regularity for $u$ than is known to be the case, namely that $u \in L^{6/5}(0, T; L^\infty(\Omega))$, then the particle trajectories are unique and $C^1$ in time for almost every choice of initial condition in $\Omega$. This degree of regularity is more than can currently be guaranteed for weak solutions ($u \in L^1(0, T; L^\infty)$) but significantly less than that known to ensure that $u$ is regular ($u \in L^2(0, T; L^\infty)$). We rely heavily on partial regularity results due to Caffarelli, Kohn, & Nirenberg and Ladyzhenskaya & Seregin.

1. Introduction

In this paper we consider the flow of an incompressible fluid in a bounded three-dimensional domain $\Omega$ (with $C^2$ boundary) governed by the unforced Navier-Stokes equations

\[ u_t - \nu\Delta u + (u\cdot\nabla)u + \nabla p = 0, \tag{1} \]

\[ \nabla\cdot u = 0, \tag{2} \]

with Dirichlet boundary condition $u|_{\partial\Omega} = 0$ and initial condition

\[ u(x, 0) = u_0. \tag{3} \]

Permanent address: Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warszawa, Poland


The coefficient $\nu > 0$ is the kinematic viscosity of the fluid, $u$ denotes the velocity, and $p$ is the pressure. It is still an open question whether or not the system (1)–(3) can develop a singularity, even if $u_0$ is smooth. Nevertheless, the existence of a weak solution $u$ of the system (1)–(3) has been known for more than a half century due to the work of Leray [17] and Hopf [13]: recall that $u \in L^\infty(0,T;H) \cap L^2(0,T;H_0^1(\Omega))$ is a weak solution of (1)–(3) if it satisfies (1) in a distributional sense. ($H$ denotes the closure of divergence-free smooth functions with compact support in $\Omega$ with respect to the norm of $[L^2(\Omega)]^3$.) It is not known whether weak solutions have any physically desirable properties: there is currently no proof that they are unique, and while it has been proved that there exist weak solutions satisfying the energy inequality,

$$\|u(t)\|^2 + \int_0^t \|Du(s)\|^2\,ds \le \|u_0\|^2,$$

we do not know if this property is enjoyed by every weak solution. However, each weak solution is sufficiently regular to define the Lagrangian trajectories of ‘fluid particles’, but again uniqueness of these trajectories is an open problem. More precisely, for a given weak solution $u$ with $u_0 \in H_0^1 \cap H$, Foias, Guillopé, & Temam [11] showed that for every $a \in \Omega$ there exists a continuous function $\xi : [0,T] \to \bar\Omega$ satisfying

$$\xi(t) = a + \int_0^t u(\xi(s),s)\,ds, \tag{4}$$

i.e. the integral form of the ODE $\dot\xi = u(\xi(t),t)$ with $\xi(0) = a$. However, the solution of (4) may not be unique. Thus one cannot exclude a priori that a single weak solution $u$ may give rise to many completely different flows of fluid particles, and it is not obvious that one can choose a collection of solutions of (4) that fit together in a consistent way. However, Foias et al. show that given a weak solution $u$ there does exist at least one ‘solution map’ $\Phi : \Omega \times [0,T] \to \bar\Omega$ such that

(i) $\xi_a(\cdot) = \Phi(a,\cdot)$ satisfies (4),
(ii) $\xi_a(\cdot) \in W^{1,1}(0,T)$,
(iii) the mapping $a \mapsto \Phi(a,\cdot)$ belongs to $L^\infty(\Omega; C([0,T],\bar\Omega))$, and
(iv) $\Phi$ is volume-preserving, in the sense that for all continuous functions $g$ with compact support in $\Omega$,

$$\int_\Omega g(\Phi(x,t))\,dx = \int_\Omega g(x)\,dx. \tag{5}$$

Note that, extending (5) by continuity to hold for all Borel bounded functions on $\Omega$ and then setting $g$ to be the characteristic function of $B$, (iv) implies that

$$\mu[\Phi(\cdot,t)^{-1}(B)] = \mu(B), \tag{6}$$

i.e. $\Phi(\cdot,t)$ is volume-preserving in a more conventional sense. Subsequent work by DiPerna & Lions [9], based on a generalised notion of a solution of (4) in terms of a solution of the transport equation

$$\rho_t + (u\cdot\nabla)\rho = 0,$$

can be used (see Lions [19] for a treatment that makes this more explicit) to show not only the existence of such a mapping (and under the less restrictive requirement that $u \in L^1(0,T;W^{1,1})$) but also the fact that it defines a ‘generalised flow’ (so that $\Phi(\Phi(a,s),t) = \Phi(a,s+t)$ for a.e. $a \in \Omega$ and all $t,s \ge 0$) and that this flow is unique. However, the uniqueness of such a generalised flow is distinct from the question of the uniqueness of particle trajectories, since the above results guarantee only that there is one and only one selection from the a priori many possible solutions for each initial condition that fit together to give a measurable volume-preserving flow (this point is picked up explicitly in [9] and the more recent paper by Ambrosio [2] that extends the DiPerna-Lions theory to $u \in L^1(0,T;BV)$).

In this paper we show that if $u_0 \in H^{1/2} \cap H$, $u$ is a suitable weak solution (in the sense of Caffarelli, Kohn, & Nirenberg [3]), and if in addition $u \in L^{6/5}(0,T;L^\infty)$, then the particle trajectories are unique for almost every initial condition $a \in \Omega$. This provides a ‘conditional uniqueness result’, since the best known estimate for the $L^\infty$ norm of weak solutions is $u \in L^1(0,T;L^\infty)$, see$^{1}$ Foias, Guillopé, & Temam [10]. Even this estimate (which is not sufficient for our purposes) lies far outside the scope of standard Sobolev inequalities, which guarantee only that each Leray-Hopf weak solution satisfies $u \in L^r(0,T;L^s(\mathbb{R}^3))$ where

$$\frac{2}{r} + \frac{3}{s} \le \frac{3}{2}, \qquad 2 \le s \le 6. \tag{7}$$

Such a conditional result is in the spirit of much recent work that guarantees the smoothness (and hence uniqueness) of weak solutions under additional conditions on their regularity: if $u$ is a Leray-Hopf weak solution of the Navier-Stokes equations with $u \in L^r(0,T;L^s(\mathbb{R}^3))$ for some $r, s$ with

$$\frac{2}{r} + \frac{3}{s} \le 1, \qquad 3 \le s \le \infty, \tag{8}$$

then $u$ is regular (Serrin [23], Escauriaza, Seregin, & Šverák [12]). Note that this criterion requires that $u \in L^2(0,T;L^\infty)$ to guarantee regularity, significantly stronger than our assumption that $u \in L^{6/5}(0,T;L^\infty)$.

2. Main Theorem

Just as the argument of Foias et al. [11] relies on bounds on the set of singular times due to Scheffer ([22]; see also Robinson & Sadowski [20] for a simpler proof of a similar result), our argument here uses the celebrated space-time regularity result of Caffarelli, Kohn, & Nirenberg [3], and an extension due to Ladyzhenskaya & Seregin [16]. Caffarelli et al. showed that for a given initial condition $u_0 \in H$ there exists at least one suitable weak solution of (1)–(3) – by definition a weak solution for which the corresponding pressure $p$ is an element of $L^{5/4}((0,T)\times\Omega)$ and a local form of the energy inequality holds – and then for such solutions gave a bound on the size of the set $S_1$, the complement of the set of ‘regular points’

$$\{(x,t) : |u(x,t)| \text{ is essentially bounded in some neighbourhood of } (x,t)\}. \tag{9}$$

$^{1}$ In fact Foias et al. show that $Au \in L^{2/3}(0,T;L^2)$, where $A$ is the Stokes operator. The $L^1(0,T;L^\infty)$ estimate follows from this using Agmon’s inequality ($\|u\|_\infty \le \|Du\|^{1/2}\|Au\|^{1/2}$) in space and Hölder’s inequality in time. Unfortunately it does not seem possible to improve on $u \in L^1(0,T;L^\infty)$ by using Constantin’s result [6] that $Au \in L^p(0,T;L^p)$ for any $p < 4/3$.


They showed that $S_1$ has one-dimensional parabolic Hausdorff measure zero, $\mathcal{P}^1(S_1) = 0$: more concretely, this means that for any $n > 0$ there exists a countable family of cylinders $B_k \times I_k$, where $B_k$ is a ball of radius $r_k$ in $\mathbb{R}^3$ and $I_k$ is an interval of length $r_k^2$, that covers $S_1$, and is such that

$$\sum_k r_k < 1/n.$$

Ladyzhenskaya & Seregin [16] showed, under similar conditions, that $u(x,t)$ is in fact Hölder continuous in $x$ and $t$ on a set whose complement $S$ has $\mathcal{P}^1(S) = 0$. We refer to this set $S$ as ‘the singular set’. We now show that any suitable weak solution$^{2}$ with enough regularity has particle paths that avoid the singular set for almost every choice of initial condition.

Theorem 1. If $u$ is a suitable weak solution with $u \in L^{6/5}(0,T;L^\infty)$ then the set of initial conditions $a \in \Omega$ that give rise to trajectories intersecting the singular set $S$ is of Lebesgue measure zero.

Proof. For any fixed $n \in \mathbb{N}$ we cover the set of singular points $S$ with the countable family of cylinders $\{C_k\}_{k=1}^\infty$, where $C_k = B_k \times I_k$, $B_k$ denotes a ball of radius $r_k$ centred at some point $x_k \in \mathbb{R}^3$, and $I_k = [t_k, t_k + r_k^2]$. Since the one-dimensional parabolic Hausdorff measure of $S$ is zero we can choose the $\{C_k\}$ in such a way that

$$\sum_{k=1}^\infty r_k < \frac{1}{n}.$$

Next, we define

$$R_k = \int_{t_k}^{t_k + r_k^2} \|u(s)\|_\infty\,ds$$

and consider the balls $\hat B_k = B(x_k, r_k + R_k)$. We claim that if $\xi_a(t_k) \notin \hat B_k$ then $\xi_a(t) \notin B_k$ for $t \in [t_k, t_k + r_k^2]$, i.e. every trajectory that misses $\hat B_k$ at time $t_k$ misses the whole cylinder $C_k$, see Fig. 1. Indeed, since $\xi_a(\cdot) \in W^{1,1}$ it follows that $\xi_a$ is absolutely continuous and

$$\xi_a(t) = a + \int_0^t u(\xi_a(s),s)\,ds.$$

Thus for all $t_k \le t \le t_k + r_k^2$,

$$|\xi_a(t) - \xi_a(t_k)| = \left|\int_{t_k}^{t} u(\xi_a(s),s)\,ds\right| \le \int_{t_k}^{t} \|u(s)\|_\infty\,ds \le R_k.$$

$^{2}$ Note that while we treat the unforced case, this is largely for simplicity of presentation. Our results extend to any forcing term for which the partial regularity results of Caffarelli et al. [3] and Ladyzhenskaya & Seregin [16] hold, namely $f \in L^{5/2+\delta}((0,T)\times\Omega)$ for any $\delta > 0$. Recently, Kukavica [15] has extended the result of [3] to treat $f \in L^{5/3}((0,T)\times\Omega)$, which includes the standard case $f \in L^2(0,T;H)$ (see also [14]), but it is not clear if this regularity is sufficient for a result along the lines of [16].


Fig. 1. Trajectories that do not meet $\hat B_k = B(x_k, r_k + R_k)$ at time $t = t_k$ cannot enter the cylinder $C_k = B_k \times [t_k, t_k + r_k^2]$

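Therefore, since $\xi_a(t_k) \notin \hat B_k$ implies that $|\xi_a(t_k) - x_k| > r_k + R_k$,

$$|\xi_a(t) - x_k| \ge |\xi_a(t_k) - x_k| - |\xi_a(t_k) - \xi_a(t)| > r_k,$$

i.e. $(\xi_a(t),t) \notin C_k$ for every $t \in I_k$.

Now we show that the sum $\sum_{k=1}^\infty \mu(\hat B_k)$ tends to zero as $n$ tends to infinity. To this end, notice that

$$\sum_{k=1}^\infty \mu(\hat B_k) = \sum_{k=1}^\infty \frac{4\pi}{3}(r_k + R_k)^3 \le \frac{16\pi}{3}\sum_{k=1}^\infty (r_k^3 + R_k^3).$$

Since $\sum_k r_k < 1/n \le 1$, each $r_k$ is less than one, so $r_k^3 \le r_k$ and

$$\sum_{k=1}^\infty r_k^3 \le \sum_{k=1}^\infty r_k \le 1/n. \tag{10}$$

Moreover, by Hölder’s inequality,

$$R_k = \int_{t_k}^{t_k + r_k^2} \|u(s)\|_\infty\,ds \le \left(\int_{t_k}^{t_k + r_k^2} ds\right)^{1/6} \left(\int_{t_k}^{t_k + r_k^2} \|u(s)\|_\infty^{6/5}\,ds\right)^{5/6} \le r_k^{1/3}\,\|u\|_{L^{6/5}(0,T;L^\infty)}.$$

So

$$\sum_{k=1}^\infty R_k^3 \le \|u\|^3_{L^{6/5}(0,T;L^\infty)} \sum_{k=1}^\infty r_k \le \frac{1}{n}\,\|u\|^3_{L^{6/5}(0,T;L^\infty)}. \tag{11}$$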


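Thus (10) and (11) yield

$$\sum_{k=1}^\infty \mu(\hat B_k) \le \frac{1}{n}\,\frac{16\pi}{3}\left(1 + \|u\|^3_{L^{6/5}(0,T;L^\infty)}\right),$$

from which it follows that $\sum_{k=1}^\infty \mu(\hat B_k) \to 0$ as $n \to \infty$.

To finish the proof of the theorem define the set $D_n$ by

$$D_n = \{a \in \Omega : \xi_a(t_k) \in \hat B_k \text{ for some } k \in \mathbb{N}\}$$

and set

$$D^* = \{a \in \Omega : (\xi_a(t),t) \in S \text{ for some } t \in [0,T]\}.$$

The above argument shows that $\mu(D^*) \le \mu(D_n)$, while using the volume-preserving property of $\Phi$ in (6), and writing $\Phi_t = \Phi(\cdot,t)$, we obtain

$$\mu(D_n) = \mu\Big(\bigcup_{k=1}^\infty \Phi_{t_k}^{-1}(\hat B_k)\Big) \le \sum_{k=1}^\infty \mu\big(\Phi_{t_k}^{-1}(\hat B_k)\big) = \sum_{k=1}^\infty \mu(\hat B_k) \to 0$$

as $n \to \infty$. This implies that $\mu(D^*) = 0$ and the proof is complete.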
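The chain of estimates (10)–(11) can be illustrated numerically. The following is only a sketch under stated assumptions: the function $g(s) = |s - 1/2|^{-1/2}$ on $(0,1)$ is a hypothetical stand-in for $\|u(s)\|_\infty$ (it lies in $L^{6/5}$ but is unbounded), the radii $r_k = 2^{-k}/(2n)$ and the choice of time intervals centred at $s = 1/2$ are invented for illustration, and the names `integral_of_g` and `cover_estimate` are not from the paper.

```python
import math

# Closed-form L^{6/5}(0,1) norm of the toy profile g(s) = |s - 1/2|^(-1/2):
# int_0^1 |s - 1/2|^(-3/5) ds = 5 * 2^(-2/5).
G_NORM = (5.0 * 2.0 ** (-0.4)) ** (5.0 / 6.0)


def integral_of_g(a, b):
    """Exact integral of g over [1/2 + a, 1/2 + b]; a <= b are offsets from s = 1/2."""
    def antiderivative(offset):
        return math.copysign(2.0 * math.sqrt(abs(offset)), offset)
    return antiderivative(b) - antiderivative(a)


def cover_estimate(n, num_cylinders=12):
    """Return (sum of r_k, sum of mu(B_hat_k), right-hand bound) for a toy cover."""
    radii = [2.0 ** (-k) / (2.0 * n) for k in range(1, num_cylinders + 1)]
    total_volume = 0.0
    for r in radii:
        # Time interval of length r^2 centred on the singular time s = 1/2.
        big_r = integral_of_g(-r * r / 2.0, r * r / 2.0)
        # Hoelder step: R_k <= r_k^(1/3) * ||g||_{L^{6/5}}.
        assert big_r <= r ** (1.0 / 3.0) * G_NORM
        total_volume += 4.0 * math.pi / 3.0 * (r + big_r) ** 3
    bound = (16.0 * math.pi / 3.0) * (1.0 + G_NORM ** 3) / n
    return sum(radii), total_volume, bound
```

Calling `cover_estimate(n)` for increasing `n` shows the total volume of the enlarged balls staying below the bound, which itself decays like $1/n$.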

Our uniqueness result is a corollary of the above. Note that in order to achieve uniqueness ‘at $t = 0$’ we require a slightly smoother initial condition than simply $u_0 \in H$, since the result of Caffarelli et al. [3] only treats the singularities that occur for $t > 0$. For simplicity we choose $u_0 \in H \cap H^{1/2}(\Omega)$, since this ensures uniqueness while $u \in L^\infty(0,T;H^{1/2}) \cap L^2(0,T;H^{3/2}(\Omega))$ (see Chemin & Lerner [4] or Dashti & Robinson [7,8]).

Corollary 1. If $u$ is a suitable weak solution corresponding to $u_0 \in H \cap H^{1/2}(\Omega)$, then almost every initial condition $a \in \Omega$ gives rise to a unique particle trajectory, which is a $C^1$ function of time.

Proof. We show that any trajectory $\xi_a(\cdot)$ that misses the singular set $S$ is unique. First, note that since $u_0 \in H^{1/2}(\Omega)$, trajectories are unique on some finite time interval starting at $t = 0$ (see [4] or [7,8]), so we need only be concerned with non-uniqueness that arises for some $t > 0$. Now assume that for some $x \in \Omega$ and $t > 0$, $(x,t)$ lies on $\xi_a$ and that there are two trajectories through $(x,t)$. Since $S$ is compact there is a neighbourhood of $(x,t)$ that does not intersect $S$ and within which the essential supremum of $u$ is finite. It follows from results of Serrin [23] that $u$ is uniformly Lipschitz (in fact $C^\infty$) in the space-variable in some neighbourhood of $(x,t)$, and hence the solution of $\dot\xi = u(\xi,t)$ is unique at $(x,t)$, a contradiction. Since $u$ is also Hölder continuous in $(x,t)$ on the complement of $S$, it follows that $\xi$ is a $C^1$ function of time.

We note that, under the assumption that $u \in L^{6/5}(0,T;L^\infty)$, this corollary gives a proof of the (a.e.) uniqueness of the solution map $\Phi$ of Foias et al. [11] that is independent of the results of DiPerna & Lions [9], since the argument used in the proof of Theorem 1 only requires the existence of some volume-preserving map $\Phi$ that gives rise to solutions of $d\xi/dt = u(\xi,t)$.


3. Conclusion

We have shown that there is a natural intermediate regularity threshold between that known to hold for weak solutions ($u \in L^1(0,T;L^\infty)$) and that known to guarantee regularity ($u \in L^2(0,T;L^\infty)$). This level of regularity – $u \in L^{6/5}(0,T;L^\infty)$ – ensures the uniqueness of particle trajectories associated to any given suitable weak solution. We note that any improvement on the Caffarelli-Kohn-Nirenberg result, namely $\mathcal{P}^d(S) = 0$ for some $d < 1$, would lead to a corresponding lessening of the regularity required here, namely $u \in L^{6/(6-d)}(0,T;L^\infty)$. We also note that there are interesting results concerning the avoidance of sets of small box-counting dimension by volume-preserving flows (Aizenman [1], Cipriano & Cruzeiro [5]), which raises the interesting question of whether one can find a bound on the box-counting dimension of the singular set $S$. This approach, which enables one to improve on the results presented here, will be elaborated in a future paper (Robinson & Sadowski [21]).

Acknowledgements. JCR wishes to acknowledge support from the Leverhulme Trust, and from the EPSRC grant EP/G007470/1. He would also like to thank Prof. Ciprian Foias for some interesting discussions. WS was partially supported by Polish Government Grant 1 P03A 017 30.
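The exponent bookkeeping behind these thresholds can be checked with exact rational arithmetic. The sketch below rests on one reading of the proof of Theorem 1: Hölder’s inequality on an interval of length $r_k^2$ gives $R_k^3 \le r_k^{6(1-1/p)}\|u\|^3_{L^p(0,T;L^\infty)}$, so a cover with $\sum_k r_k^d$ small requires $6(1-1/p) \ge d$. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def holder_power(p):
    """Exponent e in R_k^3 <= r_k^e * ||u||^3 when u is in L^p(0,T; L^infty)."""
    p = Fraction(p)
    return 6 * (1 - 1 / p)


def time_exponent(d):
    """The value p = 6/(6 - d) quoted in the text, for 0 <= d < 6."""
    d = Fraction(d)
    return Fraction(6) / (6 - d)


def serrin_sum(r):
    """Left-hand side of (8) with s = infinity, i.e. 2/r."""
    return 2 / Fraction(r)
```

With $d = 1$ this recovers $p = 6/5$; in the scale of (8), $L^2(0,T;L^\infty)$ sits exactly on the Serrin line while $L^{6/5}(0,T;L^\infty)$ and $L^1(0,T;L^\infty)$ lie outside it.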

References

1. Aizenman, M.: A sufficient condition for the avoidance of sets by measure preserving flows in $\mathbb{R}^N$. Duke Math. J. 45, 809–812 (1978)
2. Ambrosio, L.: Transport equation and Cauchy problem for BV vector fields. Invent. Math. 158, 227–260 (2004)
3. Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
4. Chemin, J.Y., Lerner, N.: Flot de champs de vecteurs non lipschitziens et équations de Navier-Stokes. J. Diff. Eq. 121, 314–328 (1995)
5. Cipriano, F., Cruzeiro, A.B.: Flows associated with irregular $\mathbb{R}^d$-vector fields. J. Diff. Eq. 219, 183–201 (2005)
6. Constantin, P.: Navier-Stokes equations and area of interfaces. Commun. Math. Phys. 129, 241–266 (1990)
7. Dashti, M., Robinson, J.C.: A simple proof of uniqueness of the particle trajectories for solutions of the Navier-Stokes equations. Nonlinearity 22, 735–746 (2009)
8. Dashti, M., Robinson, J.C.: The uniqueness of Lagrangian trajectories in Navier-Stokes flows. In: Robinson, J.C., Rodrigo, J.L. (eds.), Partial Differential Equations and Fluid Mechanics, LMS Lecture Notes Series, Cambridge: Cambridge University Press, 2009
9. DiPerna, R.J., Lions, P.L.: Ordinary differential equations, transport theory and Sobolev spaces. Invent. Math. 98, 511–547 (1989)
10. Foias, C., Guillopé, C., Temam, R.: New a priori estimates for Navier-Stokes equations in dimension 3. Comm. Part. Diff. Eq. 6, 329–359 (1981)
11. Foias, C., Guillopé, C., Temam, R.: Lagrangian representation of a flow. J. Diff. Eq. 57, 440–449 (1985)
12. Escauriaza, L., Seregin, G., Šverák, V.: $L_{3,\infty}$-Solutions to the Navier-Stokes equations and backward uniqueness. Russ. Math. Surv. 58(2), 211–250 (2003)
13. Hopf, E.: Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen. Math. Nachr. 4, 213–231 (1951)
14. Kukavica, I.: On partial regularity for the Navier-Stokes equations. Disc. Cont. Dyn. Syst. 21, 717–728 (2008)
15. Kukavica, I.: Partial regularity results for solutions of the Navier-Stokes system. In: Robinson, J.C., Rodrigo, J.L. (eds.), Partial Differential Equations and Fluid Mechanics, LMS Lecture Notes Series, Cambridge: Cambridge University Press, 2009
16. Ladyzhenskaya, O., Seregin, G.: On partial regularity of suitable weak solutions to the three-dimensional Navier-Stokes equations. J. Math. Fluid Mech. 1, 356–387 (1999)
17. Leray, J.: Essai sur le mouvement d’un fluide visqueux emplissant l’espace. Acta Math. 63, 193–248 (1934)
18. Lin, F.: A new proof of the Caffarelli-Kohn-Nirenberg theorem. Comm. Pure Appl. Math. 51, 241–257 (1998)
19. Lions, P.L.: Sur les équations différentielles ordinaires et les équations de transport. C. R. Acad. Sci. Paris I 326, 833–838 (1998)
20. Robinson, J.C., Sadowski, W.: Decay of weak solutions and the singular set of the three-dimensional Navier-Stokes equations. Nonlinearity 20, 1185–1191 (2007)
21. Robinson, J.C., Sadowski, W.: Almost-everywhere uniqueness of Lagrangian trajectories for suitable weak solutions of the three-dimensional Navier-Stokes equations (2009, submitted)
22. Scheffer, V.: Turbulence and Hausdorff dimension. In: Turbulence and Navier-Stokes Equations, Orsay 1975, Springer LNM 565, Berlin: Springer-Verlag, 1976, pp. 174–183
23. Serrin, J.: On the interior regularity of weak solutions of the Navier-Stokes equations. Arch. Rat. Mech. Anal. 9, 187–191 (1962)

Communicated by P. Constantin

Commun. Math. Phys. 290, 23–82 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0820-6


Time-Asymptotic Behavior of Wave Propagation Around a Viscous Shock Profile

Tai-Ping Liu1,2, Yanni Zeng3

1 Institute of Mathematics, Academia Sinica, Taipei, Taiwan
2 Department of Mathematics, Stanford University, Stanford, CA 94305, USA
3 Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294, USA. E-mail: [email protected]

Received: 12 July 2008 / Accepted: 15 February 2009
Published online: 19 May 2009 – © Springer-Verlag 2009

Abstract: We study the nonlinear stability of shock waves for viscous conservation laws. Our approach is based on a new construction of a fundamental solution for a linearized system around a shock profile. We obtain, for the first time, the pointwise estimates of nonlinear wave interactions across a shock wave. Our results apply to all ranges of weak shock waves and small perturbations. In particular, our results reduce to the time-asymptotic behavior of constant state perturbation, uniformly as the strength of the shock wave tends to zero.

1. Introduction

In this paper we study the Cauchy problem of a system of viscous conservation laws

$$u_t + f(u)_x = u_{xx}, \qquad x \in \mathbb{R},\ t > 0,\ u \in \mathbb{R}^n, \tag{1.1}$$
$$u(x,0) = u_0(x). \tag{1.2}$$

We are interested in the nonlinear stability of viscous shock waves and the large time behavior of the solution. Our paper is a continuation of [Liu3] in this regard. Here we make precise the rich nonlinear interaction of waves and the shape of the time asymptotic solution, which leads to a rigorous proof of nonlinear stability of the viscous shock wave. Our approach is based on a new construction of the fundamental solution of the system linearized along a shock profile. The construction of the fundamental solution differs from that of [Liu3] in that, instead of considering scalar equations only, we interpolate the fundamental solutions, and the approach applies to the whole system as well. Such a construction allows us to obtain details of the intertwining of the decay in space and time, and the dependence of these on the shock strength $\varepsilon$ in a more straightforward manner. In particular, our stability analysis holds uniformly in the shock strength and reduces to that for the perturbation of the constant states in the limit $\varepsilon \to 0^+$, [LZ1].

The research of the first author was partially supported by NSC Grant 96-2628-M-001-011 and NSF Grant DMS-0709248. The research of the second author was partially supported by NSF Grant DMS-0207154 and UAB Advance Program, sponsored by NSF.

The system (1.1) is with artificial viscosity. The analysis in the present paper for such a system provides the basic understanding of the wave behavior. With the basic ideas of the present paper, particularly the new construction of the fundamental solutions and the detailed analysis of wave coupling, it becomes possible to study systems with physical viscosity, which contain additional nonlinear wave coupling. Relevant results are given in a forthcoming monograph, [LZ3], with applications to the Navier-Stokes equations for compressible fluids and the full system of magnetohydrodynamics, including the cases of multiple eigenvalues in the transversal fields.

A viscous shock wave of (1.1) is a traveling wave solution connecting two different end states:

$$u(x,t) = \phi(x - st), \qquad \phi(\mp\infty) = u_\mp. \tag{1.3}$$

Substituting (1.3) into (1.1) we have $-s\phi'(x) + f(\phi(x))' = \phi''(x)$. Integrating this identity over $-\infty < x < \infty$ and using $\phi(\mp\infty) = u_\mp$, we see that the wave speed $s$ and the end states $u_\mp$ satisfy the Rankine-Hugoniot condition

$$s(u_+ - u_-) = f(u_+) - f(u_-), \tag{1.4}$$

as for the hyperbolic conservation laws,

$$u_t + f(u)_x = 0. \tag{1.5}$$

We now assume that (1.5) is completely hyperbolic. That is, $f'(u)$ has $n$ real eigenvalues $\lambda_1(u) \le \lambda_2(u) \le \cdots \le \lambda_n(u)$, and a complete set of eigenvectors:

$$f'(u)r_i(u) = \lambda_i(u)r_i(u), \qquad l_i(u)f'(u) = \lambda_i(u)l_i(u), \qquad l_i(u)r_j(u) = \delta_{ij}, \qquad 1 \le i, j \le n. \tag{1.6}$$

We further assume that each characteristic field is either genuinely nonlinear, $\nabla\lambda_i(u)\cdot r_i(u) \ne 0$, or linearly degenerate, $\nabla\lambda_i(u)\cdot r_i(u) \equiv 0$, [Lax], for all $u$ under consideration. If for some $p$, $1 \le p \le n$, $\lambda_p(u)$ is simple, and the $p$th characteristic field is genuinely nonlinear, then a viscous $p$-shock wave exists, [Sm], provided

$$\varepsilon \equiv |u_+ - u_-| \ll 1. \tag{1.7}$$

Besides (1.4), the viscous $p$-shock wave $\phi(x - st)$ in (1.3) satisfies the Lax entropy condition, [Lax]:

$$\lambda_p(u_-) > s > \lambda_p(u_+). \tag{1.8}$$
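As a concrete scalar instance of (1.3), (1.4) and (1.8), one may take the Burgers flux $f(u) = u^2/2$, which is not singled out in the text and is used here only as an illustrative assumption. For this flux the profile $\phi(\xi) = s - a\tanh(a\xi/2)$ with $a > 0$ connects $u_- = s + a$ to $u_+ = s - a$. The sketch below checks the profile equation by finite differences; the function names and the step size `h` are hypothetical choices.

```python
import math


def flux(u):
    """Burgers flux f(u) = u^2 / 2 (an illustrative choice, n = 1)."""
    return 0.5 * u * u


def burgers_profile(x, s, a):
    """Travelling-wave profile phi with phi(-inf) = s + a and phi(+inf) = s - a."""
    return s - a * math.tanh(0.5 * a * x)


def profile_residual(x, s, a, h=1e-3):
    """Finite-difference value of -s*phi' + f(phi)' - phi'' at the point x."""
    def phi(z):
        return burgers_profile(z, s, a)
    d_phi = (phi(x + h) - phi(x - h)) / (2.0 * h)
    d_flux = (flux(phi(x + h)) - flux(phi(x - h))) / (2.0 * h)
    d2_phi = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / (h * h)
    return -s * d_phi + d_flux - d2_phi
```

Here the shock strength is $\varepsilon = 2a$ and the width of the transition layer scales like $1/a$, in line with the remark that weak shocks are wide.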


Assuming the Cauchy data $u_0$ in (1.2) is a small perturbation of the viscous shock wave, we write the solution $u$ to (1.1), (1.2) as

$$u(x,t) = \phi(x - st) + \bar u(x,t). \tag{1.9}$$

Our goal is to obtain a precise pointwise estimate on $\bar u$, with optimal decay rates in space, in time, and in the shock strength $\varepsilon$. The leading term of $\bar u$ was identified by Liu, [Liu1], through a time-asymptotic analysis. Under assumption (1.7) we are able to decompose the initial perturbation as

$$\int_{-\infty}^{\infty} \bar u(x,0)\,dx = \int_{-\infty}^{\infty} [u_0(x) - \phi(x)]\,dx = x_0(u_+ - u_-) + \sum_{i<p} c_i r_i(u_-) + \sum_{i>p} c_i r_i(u_+). \tag{1.10}$$

The coefficient $x_0$ defines the translation of the shock in the time-asymptotic solution, while the $c_i$, $i \ne p$, yield the diffusion waves in it, as discussed later. This is to replace (1.9) by

$$u(x,t) = \phi(x - st + x_0) + \sum_{i<p} \theta_i(x,t)r_i(u_-) + \sum_{i>p} \theta_i(x,t)r_i(u_+) + v(x,t), \tag{1.11}$$

where $\theta_i$, $i \ne p$, are the diffusion waves, and $v$ is the higher order term. To further simplify our notation, without loss of generality we may set the translation of the shock as zero and the shock as stationary. This can always be done by a change of independent variables. Therefore, we arrive at

$$u(x,t) = \phi(x) + \sum_{i \ne p} \theta_i(x,t)r_i^0 + v(x,t). \tag{1.12}$$

Here we have adopted the notations

$$u_i^0 = \begin{cases} u_- & \text{if } i < p, \\ u_+ & \text{if } i > p, \end{cases} \tag{1.13}$$

$\lambda_i^0 = \lambda_i(u_i^0)$, $r_i^0 = r_i(u_i^0)$, $l_i^0 = l_i(u_i^0)$, etc.

We now assume that there is no eigenvalue splitting along $\phi(x)$. The diffusion waves are defined as follows. If $\lambda_i$ is simple, $\theta_i$ is the self-similar solution to

$$\theta_{it} + \lambda_i^0\theta_{ix} + C_{ii}(\theta_i^2)_x = \theta_{ixx}, \tag{1.14}$$
$$\int_{-\infty}^{\infty} \theta_i(x,t)\,dx = c_i, \tag{1.15}$$

where

$$C_{ii} = \frac{1}{2}\,l_i^0 f''(u_i^0)(r_i^0, r_i^0), \tag{1.16}$$

and $c_i$ is given by (1.10). By differentiating (1.6) we have $l_i(u)f''(u)(r_i(u), r_i(u)) = \nabla\lambda_i(u)\cdot r_i(u)$. Therefore, if the $i$th field is genuinely nonlinear, we have $C_{ii} \ne 0$, and (1.14) is the Burgers equation. If the $i$th field is linearly degenerate, $C_{ii} = 0$ and (1.14) becomes the heat equation. In either case $\theta_i$ is found explicitly. The Burgers equation can be solved by the Hopf-Cole transformation, [Ho,Co], while the heat equation gives the heat kernel carrying the mass $c_i$.

If $\lambda_i$ is not simple, we use multiple-mode diffusion waves introduced by Chern, [Ch]. Precisely, let $\lambda_{i-1} < \lambda_i = \cdots = \lambda_{i+m-1} < \lambda_{i+m}$ and $i \ne p$. Then $\Theta_i = (\theta_i, \cdots, \theta_{i+m-1})^t$ is the self-similar solution to the $m \times m$ generalized Burgers equation

$$\Theta_{it} + \lambda_i^0\Theta_{ix} + \frac{1}{2}\,L_i^0 f''(u_i^0)(R_i^0\Theta_i, R_i^0\Theta_i)_x = \Theta_{ixx}, \tag{1.17}$$

satisfying

$$\int_{-\infty}^{\infty} \Theta_i(x,t)\,dx = (c_i, \ldots, c_{i+m-1})^t, \tag{1.18}$$

where

$$L_i^0 = \begin{pmatrix} l_i \\ \vdots \\ l_{i+m-1} \end{pmatrix}(u_i^0), \qquad R_i^0 = (r_i, \ldots, r_{i+m-1})(u_i^0), \tag{1.19}$$

and $c_j$, $i \le j \le i + m - 1$, are given in (1.10). Although $\Theta_i$ is not found explicitly, we have a precise estimate on it, [Ch]. It is similar to the Burgers wave. Chern’s estimate, together with the explicit expressions for the heat kernel and the Burgers wave, gives us the following lemma.

Lemma 1.1. Let $\theta_i$, $i \ne p$, be the diffusion waves given by (1.14) – (1.16) or (1.17) – (1.19). For $-\infty < x < \infty$, $t \ge 0$, we have

$$\begin{aligned}
\theta_i(x,t) &= O(1)|c_i|(t+1)^{-1/2}e^{-y_i^2}, \\
\theta_{ix}(x,t) &= O(1)|c_i|(t+1)^{-1}(|y_i| + 1)e^{-y_i^2}, \\
\theta_{it}(x,t) &= O(1)|c_i|(t+1)^{-1}(y_i^2 + |y_i| + 1)e^{-y_i^2}, \\
|\theta_{it} + \lambda_i^0\theta_{ix}|(x,t) + |\theta_{ixx}(x,t)| &= O(1)|c_i|(t+1)^{-3/2}(y_i^2 + |y_i| + 1)e^{-y_i^2}, \\
y_i &\equiv \frac{x - \lambda_i^0(t+1)}{\sqrt{4(t+1)}}.
\end{aligned} \tag{1.20}$$
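For a linearly degenerate field the first estimate in (1.20) can be seen directly. The sketch below assumes, as the form of the lemma suggests, that the heat-kernel diffusion wave is normalised with the time shift $t + 1$, i.e. $\theta(x,t) = c\,(4\pi(t+1))^{-1/2}\exp\{-(x - \lambda(t+1))^2/(4(t+1))\}$; the values of `c` and `lam` and all function names are hypothetical. In this case the $O(1)$ factor is exactly $1/\sqrt{4\pi}$, and the mass (1.15) is recovered by quadrature.

```python
import math


def heat_diffusion_wave(x, t, c, lam):
    """Heat kernel of mass c centred on x = lam * (t + 1), with time shift t + 1."""
    tau = t + 1.0
    return c / math.sqrt(4.0 * math.pi * tau) * math.exp(-(x - lam * tau) ** 2 / (4.0 * tau))


def y_variable(x, t, lam):
    """The similarity variable y_i of (1.20)."""
    tau = t + 1.0
    return (x - lam * tau) / math.sqrt(4.0 * tau)


def mass(t, c, lam, half_width=12.0, steps=4000):
    """Trapezoid approximation of the integral of the wave over the real line."""
    tau = t + 1.0
    sigma = math.sqrt(2.0 * tau)  # standard deviation of the Gaussian
    left = lam * tau - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    values = [heat_diffusion_wave(left + k * dx, t, c, lam) for k in range(steps + 1)]
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))
```

The genuinely nonlinear (Burgers) case is not covered by this sketch; it would require the Hopf-Cole formula mentioned in the text.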

The main result of this paper is for $v$ in (1.12). Since $u$, $\phi$ and $\theta_i$ all satisfy conservation laws, $v$ carries zero total mass at all time:

$$\int_{-\infty}^{\infty} v(x,t)\,dx = 0. \tag{1.21}$$


This leads to the faster decay of $v$ than $\theta_i$. To describe $v$ we introduce the following notations:

$$\begin{aligned}
\psi_i(x,t) &\equiv [(x - \lambda_i^0(t+1))^2 + t + 1]^{-1/2}, && i \ne p, \\
\bar\psi_i(x,t) &\equiv [|x - \lambda_i^0(t+1)|^3 + (t+1)^2]^{-1/3}, && i \ne p, \\
\chi_i(x,t) &\equiv \min\{\varepsilon^{-1/2}\psi_i^{3/2}(x,t),\ \varepsilon^{1/2}(t+1)^{-1/2}\psi_i^{1/2}(x,t)\}\,\mathrm{char}_i(x,t), && i \ne p, \\
\mathrm{char}_i(x,t) &\equiv \begin{cases} 1 & \text{if } 0 < x < \lambda_i^0(t+1) \text{ and } i > p, \\ 1 & \text{if } \lambda_i^0(t+1) < x < 0 \text{ and } i < p, \\ 0 & \text{otherwise}, \end{cases} \\
\psi_p(x,t) &\equiv [(|x| + \varepsilon(t+1))^2 + t + 1]^{-1/2},
\end{aligned} \tag{1.22}$$

where recall that $\varepsilon$ is the shock strength defined by (1.7). Since the viscous shock wave has been set stationary, the entropy condition (1.8) becomes

$$\lambda_p^- \equiv \lambda_p(u_-) > 0 > \lambda_p(u_+) \equiv \lambda_p^+. \tag{1.23}$$

Assuming that all the functions encountered are sufficiently smooth, we have

Theorem 1.2. Suppose that the initial perturbation $\bar u(x,0) \equiv u_0(x) - \phi(x)$ satisfies

$$|\bar u(x,0)| + |\bar u_x(x,0)| = O(1)(x^2 + 1)^{-3/4}, \qquad \bar u_{xx}(\cdot,0) \in L^\infty(\mathbb{R}). \tag{1.24}$$

Let

$$\sup_{x \in \mathbb{R}} \{(x^2 + 1)^{3/4}(|\bar u(x,0)| + |\bar u_x(x,0)|)\} + \|\bar u_{xx}(\cdot,0)\|_{L^\infty} = \delta_0. \tag{1.25}$$

Then there exist $\bar\delta_0 > 0$ and $\bar\varepsilon > 0$ such that if $\delta_0 < \bar\delta_0$ and $\varepsilon < \bar\varepsilon$, the Cauchy problem (1.1), (1.2) has a unique global solution $u(x,t)$, which tends to the viscous shock wave and diffusion waves, (1.12), (1.14) – (1.19), as follows: For $-\infty < x < \infty$, $t \ge 0$,

$$v(x,t) = \sum_{i=1}^{n} \bar v_i(x,t)r_i(\phi(x)), \tag{1.26}$$

$$\bar v_i(x,t) = O(1)\delta_0\Big[\psi_i^{3/2}(x,t) + \sum_{j \ne i,p} \bar\psi_j^{3/2}(x,t) + \chi_i(x,t) + \varepsilon^2 e^{-|\lambda_p||x|/\mu}\psi_p^{1/2}(x,t)\Big], \qquad i \ne p, \tag{1.27}$$

$$\bar v_p(x,t) = O(1)\delta_0\Big[\psi_p^{3/2}(x,t) + \sum_{j \ne p} \bar\psi_j^{3/2}(x,t) + \varepsilon e^{-|\lambda_p||x|/\mu}\psi_p^{1/2}(x,t)\Big], \tag{1.28}$$

where $|\lambda_p| = \min\{\lambda_p^-, -\lambda_p^+\}$, while $\mu > 1$ is an arbitrarily fixed constant.


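Corollary 1.3. Under the assumptions of Theorem 1.2, we have

$$\begin{aligned}
\|v(\cdot,t)\|_{L^2} &= O(1)\delta_0(t+1)^{-1/2}, \\
\|v(\cdot,t)\|_{L^1} &= O(1)\delta_0(t+1)^{-1/4}, \\
\|v(\cdot,t)\|_{L^\infty} &= O(1)\delta_0\big[(t+1)^{-3/4} + \varepsilon^{1/2}(t+1)^{-1/2}\big].
\end{aligned} \tag{1.29}$$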

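Proof. While (1.29) is straightforward from (1.26) – (1.28), we point out that appropriate expressions of $\chi_i$ are needed in different intervals for its integration. For instance, if $i > p$, by change of variables,

$$\begin{aligned}
\int_{-\infty}^{\infty} \chi_i(x,t)\,dx
&= O(1)\int_0^{\lambda_i^0(t+1-\sqrt{t+1})} \min\{\varepsilon^{-1/2}[\lambda_i^0(t+1) - x]^{-3/2},\ \varepsilon^{1/2}(t+1)^{-1/2}[\lambda_i^0(t+1) - x]^{-1/2}\}\,dx \\
&\quad + O(1)\int_{\lambda_i^0(t+1-\sqrt{t+1})}^{\lambda_i^0(t+1)} \varepsilon^{1/2}(t+1)^{-3/4}\,dx \\
&= O(1)\int_{\lambda_i^0\sqrt{t+1}}^{\varepsilon^{-1}\lambda_i^0\sqrt{t+1}} \varepsilon^{1/2}(t+1)^{-1/2}x^{-1/2}\,dx + O(1)\int_{\varepsilon^{-1}\lambda_i^0\sqrt{t+1}}^{\lambda_i^0(t+1)} \varepsilon^{-1/2}x^{-3/2}\,dx + O(1)\varepsilon^{1/2}(t+1)^{-1/4} \\
&= O(1)(t+1)^{-1/4}.
\end{aligned}$$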
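The $(t+1)^{-1/4}$ decay of the mass of $\chi_i$ can also be observed numerically. The sketch below assumes the form $\chi_i = \min\{\varepsilon^{-1/2}\psi_i^{3/2}, \varepsilon^{1/2}(t+1)^{-1/2}\psi_i^{1/2}\}\,\mathrm{char}_i$ for a family $i > p$ with $\lambda_i^0 > 0$; the value `lam = 1.0`, the midpoint rule and the function names are illustrative assumptions. Splitting the integral at distance $\varepsilon^{-1}\sqrt{t+1}$ from the characteristic and using $\psi_i \le |x - \lambda_i^0(t+1)|^{-1}$ gives the crude bound $(t+1)^{1/4}\int\chi_i\,dx \le 4$, which is what the check uses.

```python
import math


def psi(x, t, lam):
    """psi_i(x, t) = [(x - lam (t+1))^2 + t + 1]^(-1/2)."""
    tau = t + 1.0
    return ((x - lam * tau) ** 2 + tau) ** -0.5


def chi(x, t, lam, eps):
    """chi_i for a family i > p, assuming lam = lambda_i^0 > 0."""
    tau = t + 1.0
    if not 0.0 < x < lam * tau:
        return 0.0  # char_i vanishes outside (0, lam * (t + 1))
    p = psi(x, t, lam)
    return min(eps ** -0.5 * p ** 1.5, eps ** 0.5 * tau ** -0.5 * p ** 0.5)


def scaled_mass(t, lam, eps, steps=20000):
    """(t+1)^(1/4) times a midpoint-rule approximation of the integral of chi."""
    tau = t + 1.0
    dx = lam * tau / steps
    total = sum(chi((k + 0.5) * dx, t, lam, eps) for k in range(steps))
    return tau ** 0.25 * total * dx
```

Evaluating `scaled_mass` over several decades of `t` and a few values of `eps` gives values that stay bounded, consistent with a bound that is uniform in the shock strength.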

Theorem 1.2 gives us nonlinear stability of weak shocks, together with details of convergence to the time-asymptotic solution. In particular, it spells out explicitly the intertwining of shock strength and decay rates in space and in time.

The nonlinear stability of weak shocks is an important problem. In physics, weak shocks can be justified from more basic equations, such as the Boltzmann equation in kinetic theory through Chapman-Enskog or Hilbert expansions. The problem has been studied by Goodman, [Go], and Matsumura and Nishihara, [MN], for perturbations with zero total mass, and by Liu, [Liu1], for nonzero mass perturbations under a stringent initial condition. The intrinsic difficulty associated with generic perturbations then motivated Liu’s work on the pointwise approach in the absence of shock waves, [Liu2]. Such an idea was adopted by Szepessy and Xin, [SX], to study the nonlinear stability of weak shocks under generic perturbations. With partial construction of the fundamental solution, they obtain stability in the energy norms and with no convergence rate. There are attempts, e.g. [Zu], at deriving nonlinear stability from linear and spectral stability. One notes that in the one space dimensional case here, there is strong coupling of waves pertaining to different characteristic families, which makes the study of nonlinear stability interesting and involved.

The pointwise approach is initiated in [Liu3], from the construction of the fundamental solution to wave interaction, and to a priori estimates via Duhamel’s Principle. Such an approach gives details of convergence besides nonlinear stability. The same methodology is used in the present paper. Our paper, however, differs significantly from [Liu3] as follows.

As in the nonlinear stability of constant states, [LZ1], there are waves of algebraic types in the asymptotic ansatz. These are $\psi_i^{3/2}$, $\psi_p^{3/2}$ and $\bar\psi_j^{3/2}$ in (1.27) and (1.28), with $L^\infty$ rates $(t+1)^{-3/4}$ or faster. The perturbed shock wave, however, converges to its time-asymptotic location at the slower rate $(t+1)^{-1/2}$. This induces a slow-decaying term around the shock. In [Liu3] such a term is given as $(|x| + 1)^{-1}(|x| + t + 1)^{-1/2}$ in $\bar v_i$, $1 \le i \le n$. The rate in $x$ is algebraic and such an inaccuracy leads to gaps in the proof. In the present paper this term is identified as $\varepsilon e^{-|\lambda_p||x|/\mu}\psi_p^{1/2}(x,t)$ in $\bar v_p$ ($\varepsilon^2 e^{-|\lambda_p||x|/\mu}\psi_p^{1/2}(x,t)$ in $\bar v_i$, $i \ne p$), with $\gamma_1\varepsilon \le |\lambda_p| \le \gamma_2\varepsilon$ for some constants $\gamma_2 > \gamma_1 > 0$, see (2.22). Thus the slower decay is confined in the shock layer. In [Liu3] there are $\chi_p$ in $\bar v_p$ and $\chi_i$ in $\bar v_i$, $i \ne p$. Here these terms are also made precise (and improved) as $\psi_p^{3/2}$ in $\bar v_p$ and a new $\chi_i$ in $\bar v_i$, (1.22). The accurate asymptotic ansatz (1.27), (1.28) allows us a rigorous proof in the a priori estimates, Sect. 5.

A weak shock of strength $\varepsilon$ has large width in the order of $1/\varepsilon$. With such a scale the transmission and reflection of waves are subtly coupled as the interaction zone is wide. Moreover, the compression of the shock is weak, and so is its stabilization mechanism. The aforementioned slower decay is an example of such a difficulty. It is then very important to understand the intertwining of rates in $\varepsilon$ and in the decay of $x$ and $t$. In [Liu3] $\varepsilon$ is assumed small but fixed. That is, the smallness of the perturbation, $\delta_0$, depends on the smallness of $\varepsilon$. In contrast, these two are independent in our theorem. Theorem 1.2 is a convergence result uniform in $\varepsilon$: the $O(1)$ in (1.27) and (1.28) is bounded uniformly with respect to $\varepsilon$ besides $\delta_0$, $x$ and $t$.

The theorem is optimal in the sense that it recovers our earlier result on stability of constant states, [LZ1], as $\varepsilon \to 0$. To see this, let $u_+ = u_+(\varepsilon)$ be on the $p$-shock curve (Hugoniot curve) of the inviscid system (1.5) starting at $u_-$, $u_+(0) = u_-$. Let $\bar u_0(x) \in \mathbb{R}^n$ be a sufficiently smooth function satisfying

$$\begin{aligned}
&\sup_{x \in \mathbb{R}} \{(x^2 + 1)^{3/4}|\bar u_0(x)|\} + \|\bar u_0'\|_{L^\infty} + \|\bar u_0''\|_{L^\infty} = \delta_0 < \bar\delta_0, \\
&\int_{-\infty}^{\infty} \bar u_0(x)\,dx = \sum_{i \ne p} c_i r_i(u_-),
\end{aligned} \tag{1.30}$$

where $\bar\delta_0$ is given in Theorem 1.2. For a sequence $\varepsilon_k \to 0^+$ denote $u_+(\varepsilon_k)$ by $u_{+k}$. Let $\phi_k(x)$ be the viscous shock wave connecting the end states $u_-$ and $u_{+k}$. Let $u_k(x,t)$ be the solution to (1.1) with initial data

$$u_k(x,0) = \phi_k(x) + \bar u_0(x). \tag{1.31}$$

The decomposition (1.10) for the total mass of the perturbation becomes

$$\int_{-\infty}^{\infty} \bar u_0(x)\,dx = x_0^{(k)}(u_{+k} - u_-) + \sum_{i<p} c_i^{(k)} r_i(u_-) + \sum_{i>p} c_i^{(k)} r_i(u_{+k}). \tag{1.32}$$

Comparing (1.32) with (1.30), it is clear that

$$x_0^{(k)} = O(1), \qquad c_i^{(k)} = c_i + O(\varepsilon_k), \quad i \ne p.$$

Replacing $\{\varepsilon_k\}$ by a subsequence, we have

$$\lim_{k\to\infty} x_0^{(k)} = x_0^*, \qquad \lim_{k\to\infty} c_i^{(k)} = c_i, \quad i \ne p, \tag{1.33}$$


T.-P. Liu, Y. Zeng

where $x_0^*$ is a finite number. Denote the shock speed of $\phi_k$ as $s_k$. It is a classical result that

\[
\lim_{k\to\infty} s_k = \lambda_p^-.
\tag{1.34}
\]

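Under assumption (1.30) we can apply Theorem 1.2 to $u_k$. From (1.11), (1.14)–(1.19) and (1.26),

\[
\begin{aligned}
u_k(x,t) = {}& \phi_k\big(x - s_k t + x_0^{(k)}\big) + \sum_{i<p} \theta_i^{(k)}\big(x + x_0^{(k)}, t\big)\, r_i(u_-) \\
& + \sum_{i>p} \theta_i^{(k)}\big(x + x_0^{(k)}, t\big)\, r_i(u_{+k}) + \sum_{i=1}^{n} \bar v_i^{(k)}(x,t)\, r_i\big(\phi_k(x - s_k t + x_0^{(k)})\big).
\end{aligned}
\tag{1.35}
\]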

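Using (1.27), (1.28) and (1.22), as $k \to \infty$,

\[
\begin{aligned}
& \phi_k\big(x - s_k t + x_0^{(k)}\big) \to u_-, \\
& \theta_i^{(k)}\big(x + x_0^{(k)}, t\big) \to \theta_i^*(x + x_0^*, t), \quad i \neq p, \\
& \bar v_i^{(k)}(x,t) \to O(1)\delta_0 \Big\{ \psi_i^*(x + x_0^*, t)^{3/2} + \sum_{j \neq i, p} \bar\psi_j^*(x + x_0^*, t)^{3/2} \Big\}, \quad i \neq p,
\end{aligned}
\tag{1.36}
\]

\[
\bar v_p^{(k)}(x,t) \to O(1)\delta_0 \Big\{ \psi_p^*(x + x_0^*, t)^{3/2} + \sum_{j \neq p} \bar\psi_j^*(x + x_0^*, t)^{3/2} \Big\},
\tag{1.37}
\]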

where $\theta_i^*$, $\psi_i^*$, and $\bar\psi_j^*$ are $\theta_i$, $\psi_i$, and $\bar\psi_j$ with $u_i^0$ replaced by $u_-$. In particular,

\[
\psi_p^*(x,t) = \big[ (x - \lambda_p^-(t+1))^2 + t + 1 \big]^{-1/2}.
\]

That is, as $k \to \infty$, $u_k$ in (1.35) tends to

\[
u^*(x,t) = u_- + \sum_{i \neq p} \theta_i^*(x + x_0^*, t)\, r_i(u_-) + \sum_{i=1}^{n} v_i^*(x,t)\, r_i(u_-),
\tag{1.38}
\]

where

\[
v_i^*(x,t) = O(1)\delta_0 \Big\{ \psi_i^*(x,t)^{3/2} + \sum_{j \neq i, p} \bar\psi_j^*(x,t)^{3/2} \Big\}, \quad i \neq p,
\tag{1.39}
\]

\[
v_p^*(x,t) = O(1)\delta_0 \Big\{ \psi_p^*(x,t)^{3/2} + \sum_{j \neq p} \bar\psi_j^*(x,t)^{3/2} \Big\}.
\tag{1.40}
\]

Note that from (1.36) and (1.37) to (1.39) and (1.40) we have used the fact that $x_0^*$ is finite. This is due to the assumption that the total mass of the perturbation has zero projection on the $r_p(u_-)$ direction, (1.30). The same assumption also leads to the absence of $\bar\psi_p^*(x,t)^{3/2}$

Time-Asymptotic Behavior of Wave Propagation Around a Viscous Shock Profile


in (1.39). Equations (1.38) – (1.40) are exactly the result for perturbations of constant state u − subject to (1.30), obtained in [LZ1]. Our paper follows the same general pointwise approach as in [Liu3]. The fundamental solution along the shock wave, however, is constructed in a completely different way. The construction in [Liu3] does not have an obvious way to extend to the much more complicated systems with physical viscosity, where the linearized equations no longer decouple into n scalar equations. The new construction in the present paper is based on the interpolation of fundamental solutions for the end states, together with appropriately chosen partition functions and shifts of characteristic lines. The idea can be extended to systems with physical viscosity in a straightforward manner. The details of construction, however, are much more involved. The results on these systems, including the Navier-Stokes equations and the equations for magnetohydrodynamics, of fundamental solutions and nonlinear stability of shock waves, are written in a forthcoming research monograph, [LZ3]. Results on stability of constant states and large time behavior for partially dissipative systems can be found in [Ze1,Ze2,LZ1,SZ,LZ2,Ze3,Ze4]. The plan of this paper is the following. In Sect. 2 we construct an approximate fundamental solution of the linearized system and give an intuitive explanation. In Sect. 3 we derive estimates on the fundamental solution. In particular, we assess the truncation error. In Sect. 4 we give results on wave interaction. The estimates in Sects. 3 and 4 are then applied to the a priori analysis in Sect. 5, where we finally prove Theorem 1.2. In the rest of the paper we adopt the following notations: To be consistent with (1.23) we let λi∓ ≡ λi (u ∓ ), li∓ ≡ li (u ∓ ), ri∓ ≡ ri (u ∓ ), etc. 
Throughout this paper we use $C > 0$ to denote a sufficiently large constant, and any $O(1)$ is a function bounded uniformly with respect to $\delta_0$, $\varepsilon$, $x$, $t$, $y$, $\tau$ and any other independent variables.

2. Construction of Fundamental Solution

The faster decay of $v$ than the diffusion waves $\theta_i$ in Theorem 1.2 comes from the fact that $v$ carries zero total mass, (1.21). It is crucial to utilize such a fact in the proof of the theorem. For this we define

\[
w(x,t) = \int_{-\infty}^{x} v(y,t)\, dy = -\int_{x}^{\infty} v(y,t)\, dy.
\tag{2.1}
\]

Decompose $w(x,t)$ into eigenvector directions along the shock,

\[
w(x,t) = \sum_{i=1}^{n} w_i(x,t)\, r_i(\phi(x)),
\tag{2.2}
\]

and let

\[
v_i(x,t) = w_{ix}(x,t).
\tag{2.3}
\]

Equations (2.1)–(2.3) yield

\[
v(x,t) = w_x(x,t) = \sum_{i=1}^{n} v_i(x,t)\, r_i(\phi(x)) + \sum_{i=1}^{n} w_i(x,t)\, \frac{d}{dx} r_i(\phi(x)).
\tag{2.4}
\]

In the proof of Theorem 1.2, the a priori estimate starts first with wi and then proceeds to vi . We need wit and higher derivatives as well to complete the analysis. In this section we


derive the equation for $w_i$, and construct the fundamental solution for the linearization along the shock wave. To simplify our notation we write (1.12) as

\[
u(x,t) = \phi(x) + \theta(x,t) + v(x,t),
\tag{2.5}
\]

where

\[
\theta(x,t) = \sum_{i \neq p} r_i^0\, \theta_i(x,t),
\tag{2.6}
\]

and the $r_i^0$ and $\theta_i$ are understood, respectively, as the $n \times m$ matrix $R_i^0$ and the corresponding $m$-vector in (1.17) if $\lambda_i$ has multiplicity $m$; hence the summation in (2.6) is over all $i \neq p$ with distinct $\lambda_i$. Consistent with (1.16) we define

\[
C_{ij}(\theta_j, \theta_j) = \frac{1}{2}\, l_i(u_j^0)\, f''(u_j^0)\big(r_j^0 \theta_j,\, r_j^0 \theta_j\big),
\tag{2.7}
\]

where $l_i$ is the $m \times n$ matrix that consists of the $m$ left eigenvectors corresponding to $\lambda_i$. Thus $L_i^0$ in (1.19) is now $l_i^0$. With such notations, (1.14) and (1.17) are simplified to

\[
\theta_{it} + \lambda_i^0 \theta_{ix} + C_{ii}(\theta_i, \theta_i)_x = \theta_{ixx}, \qquad i \neq p.
\tag{2.8}
\]

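Similarly, $v_i$ and $w_i$ are now $m$-vectors if $\lambda_i$ is $m$-multiple, and (2.2) and (2.4) become

\[
w(x,t) = \sum_i r_i(\phi(x))\, w_i(x,t),
\tag{2.9}
\]

\[
v(x,t) = \sum_i r_i(\phi(x))\, v_i(x,t) + \sum_i \frac{d}{dx} r_i(\phi(x))\, w_i(x,t),
\tag{2.10}
\]

with summation over $i$ for distinct $\lambda_i$. From (2.5), (1.1), (2.6) and (2.8), we have

\[
v_t = -f(u)_x + u_{xx} + \sum_{i \neq p} r_i^0 \big[ \lambda_i^0 \theta_i + C_{ii}(\theta_i, \theta_i) \big]_x - \sum_{i \neq p} r_i^0 \theta_{ixx},
\]

which yields

\[
w_t = f(\phi) - f(u) - \phi_x + u_x + \sum_{i \neq p} r_i^0 \big[ \lambda_i^0 \theta_i + C_{ii}(\theta_i, \theta_i) \big] - \sum_{i \neq p} r_i^0 \theta_{ix},
\]

using (2.1) and the definition of $\phi$. With (2.9), for all $i$ we have

\[
w_{it} + \lambda_i(\phi)\, w_{ix} = w_{ixx} + g_i(x,t),
\tag{2.11}
\]

where

\[
\begin{aligned}
g_i(x,t) = {}& -\lambda_i(\phi)\, l_i(\phi) \sum_j \frac{d}{dx} r_j(\phi)\, w_j
+ l_i(\phi) \Big\{ 2 \sum_j \frac{d}{dx} r_j(\phi)\, v_j + \sum_j \frac{d^2}{dx^2} r_j(\phi)\, w_j \Big\} \\
& - l_i(\phi) \Big\{ f(u) - f(\phi) - f'(\phi) v - \sum_{j \neq p} r_j^0 \big[ \lambda_j^0 \theta_j + C_{jj}(\theta_j, \theta_j) \big] \Big\}
\end{aligned}
\tag{2.12}
\]

by (2.5), (2.6), (2.10) and (2.3).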


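We now construct the fundamental solution for (2.11) without the source $g_i(x,t)$. Notice that if $\lambda_i(\phi)$ is replaced by a constant, the fundamental solution is a heat kernel. For $\lambda_i$ depending on $\phi(x)$, the idea is to construct an approximate one that is an interpolation of the two heat kernels for the end states. For this we introduce two sets of partition functions. The first pair of partition functions is for the transversal fields. Recall that the shock strength of $\phi$ is $\varepsilon$, (1.7), hence the width of the shock layer is $1/\varepsilon$. We define partition functions $\rho_a^\mp$ as mollified Heaviside functions with the same width. Precisely,

\[
\rho_a^+(x) \equiv J(x;\varepsilon) *
\begin{cases}
0 & \text{if } x < 0, \\
1 & \text{if } x > 0,
\end{cases}
\tag{2.13}
\]

where

\[
J(x;\varepsilon) = \varepsilon
\begin{cases}
K \exp\Big( -\dfrac{1}{1 - (\varepsilon x)^2} \Big) & \text{if } |x| < \dfrac{1}{\varepsilon}, \\[2mm]
0 & \text{if } |x| \ge \dfrac{1}{\varepsilon}
\end{cases}
\]

is the mollifier. The constant $K$ is so chosen that

\[
\int_{-\infty}^{\infty} J(x;\varepsilon)\, dx = 1.
\]

Evaluating (2.13) we have

\[
\rho_a^+(x) =
\begin{cases}
0 & \text{if } x \le -\dfrac{1}{\varepsilon}, \\[2mm]
K \displaystyle\int_{-1}^{\varepsilon x} \exp\Big( -\dfrac{1}{1 - y^2} \Big)\, dy & \text{if } |x| < \dfrac{1}{\varepsilon}, \\[2mm]
1 & \text{if } x \ge \dfrac{1}{\varepsilon}.
\end{cases}
\tag{2.14}
\]

We define

\[
\rho_a^-(x) \equiv 1 - \rho_a^+(x) =
\begin{cases}
1 & \text{if } x \le -\dfrac{1}{\varepsilon}, \\[2mm]
K \displaystyle\int_{\varepsilon x}^{1} \exp\Big( -\dfrac{1}{1 - y^2} \Big)\, dy & \text{if } |x| < \dfrac{1}{\varepsilon}, \\[2mm]
0 & \text{if } x \ge \dfrac{1}{\varepsilon}.
\end{cases}
\tag{2.15}
\]

Lemma 2.1. For any integer $j \ge 1$ and any $\lambda = O(1)\varepsilon$,

\[
\frac{d^j}{dx^j} \rho_a^\mp(x) =
\begin{cases}
O(1)\varepsilon^j & \text{if } |x| < \dfrac{1}{\varepsilon}, \\[2mm]
0 & \text{if } |x| \ge \dfrac{1}{\varepsilon}
\end{cases}
\;= O(1)\varepsilon^j e^{-|\lambda||x|}.
\tag{2.16}
\]

Proof. Equation (2.16) follows from (2.14) and (2.15).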
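The partition function $\rho_a^+$ in (2.14) can be evaluated numerically. The following is a minimal sketch, assuming a composite Simpson rule for the integral; the helper names `bump`, `simpson`, `rho_a_plus` and `rho_a_minus` are illustrative and do not come from the paper.

```python
import math

def bump(y):
    """Unnormalized mollifier profile exp(-1/(1 - y^2)) on |y| < 1, zero elsewhere."""
    if abs(y) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - y * y))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

# K normalizes the mollifier: K * int_{-1}^{1} exp(-1/(1 - y^2)) dy = 1.
K = 1.0 / simpson(bump, -1.0, 1.0)

def rho_a_plus(x, eps):
    """rho_a^+(x) following the three cases of (2.14)."""
    if x <= -1.0 / eps:
        return 0.0
    if x >= 1.0 / eps:
        return 1.0
    return K * simpson(bump, -1.0, eps * x)

def rho_a_minus(x, eps):
    """rho_a^-(x) = 1 - rho_a^+(x), as in (2.15)."""
    return 1.0 - rho_a_plus(x, eps)

if __name__ == "__main__":
    eps = 0.1
    for x in (-12.0, -5.0, 0.0, 5.0, 12.0):
        print(x, rho_a_plus(x, eps), rho_a_minus(x, eps))
```

By the symmetry of the mollifier, $\rho_a^+(0)$ should come out close to $1/2$, and the transition takes place entirely inside $|x| < 1/\varepsilon$, which is the width of the shock layer.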


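The second pair of partition functions is for the compression field. They are the Burgers weights, as will be seen in (2.34). For a parameter $|\lambda| > 0$ define

\[
\rho_b^-(x; |\lambda|) \equiv \frac{1}{1 + e^{|\lambda| x}}, \qquad
\rho_b^+(x; |\lambda|) \equiv 1 - \rho_b^-(x; |\lambda|) = \frac{1}{1 + e^{-|\lambda| x}}.
\tag{2.17}
\]

Notice that as $x \to \pm\infty$, $\rho_b^\mp \to 0$ or $1$, and they also have the width $1/\varepsilon$ if $\gamma_1 \varepsilon < |\lambda| < \gamma_2 \varepsilon$ for some constants $\gamma_2 > \gamma_1 > 0$.

Lemma 2.2. Let $\sigma = \mp$ and $-\sigma = \pm$. Let $\mu > 1$ be any fixed constant. Then

\[
\rho_b^{\sigma}(x; |\lambda|) = \rho_b^{-\sigma}(x; |\lambda|)\, e^{\sigma |\lambda| x},
\tag{2.18}
\]

\[
\rho_b^{\pm}(x; |\lambda_p^\mp|)\, e^{-\lambda_p^\mp x/\mu} = O(1) \Big[ e^{-|\lambda_p^\mp||x|/\mu} + \rho_b^{\mp}(x; |\lambda_p^\mp|)\, e^{-|\lambda_p^\mp||x|(1 - 1/\mu)} \Big].
\tag{2.19}
\]

Proof. Equation (2.18) is straightforward from (2.17). Equation (2.19) is a consequence of (2.18) and the entropy condition (1.23).

Lemma 2.3. For any integer $j \ge 1$ we have

\[
\begin{aligned}
\frac{d^j}{dx^j} \rho_b^-(x; \varepsilon)
&= \rho_b^-(x; \varepsilon)\, \frac{\varepsilon^j q_{j-1}(e^{\varepsilon x})\, e^{\varepsilon x}}{(1 + e^{\varepsilon x})^j}
= \rho_b^+(x; \varepsilon)\, \frac{\varepsilon^j q_{j-1}(e^{-\varepsilon x})\, e^{-\varepsilon x}}{(1 + e^{-\varepsilon x})^j} \\
&= O(1)\varepsilon^j e^{-\varepsilon |x|} = O(1)\varepsilon^j \rho_b^\mp(x; \varepsilon) = O(1)\varepsilon^j,
\end{aligned}
\tag{2.20}
\]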
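A short numerical sketch of the Burgers weights in (2.17) may help fix the sign conventions. It assumes a `tanh` form of the logistic function to avoid overflow for large arguments; the function names are illustrative. It also evaluates the relation $\rho_b^+ = \rho_b^- e^{|\lambda| x}$ and the first derivative $\frac{d}{dx}\rho_b^- = -|\lambda|\, \rho_b^- \rho_b^+$, both of which follow directly from (2.17).

```python
import math

def rho_b_minus(x, lam):
    """rho_b^-(x; |lam|) = 1 / (1 + exp(|lam| x)), written with tanh for stability."""
    return 0.5 * (1.0 - math.tanh(0.5 * abs(lam) * x))

def rho_b_plus(x, lam):
    """rho_b^+(x; |lam|) = 1 / (1 + exp(-|lam| x))."""
    return 0.5 * (1.0 + math.tanh(0.5 * abs(lam) * x))

def d_rho_b_minus(x, lam):
    """Exact first derivative of rho_b^- in x: -|lam| * rho_b^- * rho_b^+."""
    return -abs(lam) * rho_b_minus(x, lam) * rho_b_plus(x, lam)

if __name__ == "__main__":
    lam, h = 0.3, 1e-4
    for x in (-8.0, -1.0, 0.0, 2.5, 8.0):
        # Relation between the two weights, and a finite-difference derivative.
        gap = rho_b_plus(x, lam) - rho_b_minus(x, lam) * math.exp(lam * x)
        fd = (rho_b_minus(x + h, lam) - rho_b_minus(x - h, lam)) / (2 * h)
        print(x, gap, fd - d_rho_b_minus(x, lam))
```

The derivative is of size $|\lambda|$ and decays like $e^{-|\lambda||x|}$, which is the $j = 1$ instance of the bounds used for these weights when $|\lambda|$ is comparable to $\varepsilon$.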

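where $q_{j-1}(y)$ denotes a universal polynomial in $y$ of degree not more than $j - 1$.

Proof. The first equality in (2.20) can be shown by induction, while the others are direct consequences of it.

Next we define the shifts of the initial point in different characteristic fields. This is to guarantee the continuity of characteristic curves when crossing the shock and to make interpolation feasible, as explained below. For each $i$ we define

\[
\begin{aligned}
x_i^- = x_i^-(x; \varepsilon) &=
\begin{cases}
\dfrac{\lambda_i^-}{\lambda_i^+} \Big( x - \dfrac{1}{\varepsilon} \Big) + \dfrac{1}{\varepsilon} & \text{if } i \ge p \text{ and } x > \dfrac{1}{\varepsilon}, \\[2mm]
x & \text{otherwise},
\end{cases} \\
x_i^+ = x_i^+(x; \varepsilon) &=
\begin{cases}
\dfrac{\lambda_i^+}{\lambda_i^-} \Big( x + \dfrac{1}{\varepsilon} \Big) - \dfrac{1}{\varepsilon} & \text{if } i \le p \text{ and } x < -\dfrac{1}{\varepsilon}, \\[2mm]
x & \text{otherwise}.
\end{cases}
\end{aligned}
\tag{2.21}
\]

Besides the entropy condition (1.23), it is a classical result that

\[
\gamma_1 \varepsilon \le |\lambda_p^\mp| \le \gamma_2 \varepsilon
\tag{2.22}
\]

for some constants $\gamma_2 > \gamma_1 > 0$, and

\[
\lambda_p^- + \lambda_p^+ = O(1)\varepsilon^2,
\tag{2.23}
\]

see [Lax]. Using (2.21)–(2.23) and (1.23), we have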


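Lemma 2.4. For each $i$ ($i = p$ or $i \neq p$),

\[
\big| x - x_i^\mp(x; \varepsilon) \big| = O(1)\varepsilon \Big| x \mp \frac{1}{\varepsilon} \Big| = O(1)\varepsilon \Big| x_i^\mp \mp \frac{1}{\varepsilon} \Big|,
\tag{2.24}
\]

\[
\frac{d}{dx} x_i^\mp(x; \varepsilon) = 1 + O(1)\varepsilon, \qquad x \neq \mp \frac{1}{\varepsilon},
\tag{2.25}
\]

\[
x_i^- =
\begin{cases}
\dfrac{\lambda_i^-}{\lambda_i^+}\, x + O(1) & \text{if } i \ge p \text{ and } x > \dfrac{1}{\varepsilon}, \\[2mm]
x & \text{otherwise},
\end{cases}
\tag{2.26a}
\]

\[
x_i^+ =
\begin{cases}
\dfrac{\lambda_i^+}{\lambda_i^-}\, x + O(1) & \text{if } i \le p \text{ and } x < -\dfrac{1}{\varepsilon}, \\[2mm]
x & \text{otherwise}.
\end{cases}
\tag{2.26b}
\]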

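We introduce the following notation for heat kernels:

\[
H(x, t; \lambda, \mu) \equiv \frac{1}{\sqrt{4\pi\mu t}} \exp\Big( -\frac{(x - \lambda t)^2}{4\mu t} \Big), \qquad
H(x, t; \lambda) \equiv H(x, t; \lambda, 1).
\tag{2.27}
\]

The approximate fundamental solution of (2.11) is defined as

\[
G_i(y, \tau; x, t) = \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_a^{\sigma}(x)\, H(x - y, t - \tau; \lambda_i^{\sigma}) + \rho_a^{-\sigma}(x)\, H(x_i^{\sigma} - y, t - \tau; \lambda_i^{\sigma}) \Big], \qquad i \neq p,
\tag{2.28}
\]

\[
G_p(y, \tau; x, t) = \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, H(x - y, t - \tau; \lambda_p^{\sigma}) + \rho_b^{-\sigma}(x; |\lambda_p^{\sigma}|)\, H(x_p^{\sigma} - y, t - \tau; -\lambda_p^{\sigma}) \Big].
\tag{2.29}
\]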

Notice that if $\lambda_i$, $i \neq p$, has multiplicity $m_i > 1$, then without the source term $g_i$ the $m_i$ equations in (2.11) are the same scalar equation repeated $m_i$ times. The fundamental solution of this scalar equation is then given by (2.28), and $G_i$ should be multiplied by an $m_i \times m_i$ identity. The same comment applies to the following lemma as well.

Lemma 2.5. Let $-\infty < x, y < \infty$ and $t \ge 0$. For all $i$ under consideration

\[
G_i(y, t; x, t) = \delta(x - y),
\tag{2.30}
\]

where $\delta$ is the Dirac $\delta$-function.

Proof. Consider the case $i = p$; the case $i \neq p$ is similar. For $|x| \le 1/\varepsilon$, from (2.29), (2.27), (2.21), (2.17) and (2.15), we have

\[
G_p(y, t; x, t) = \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x - y) + \rho_b^{-\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x - y) \Big] = \delta(x - y).
\]


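For $x < -1/\varepsilon$, by (2.21) $x_p^+ < -1/\varepsilon$. From (2.14), $\rho_a^+(x) = \rho_a^+(x_p^+) = 0$. Therefore, (2.29) yields

\[
\begin{aligned}
G_p(y, t; x, t) &= \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x - y) + \rho_b^{-\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x_p^{\sigma} - y) \Big] \\
&= \sum_{\sigma = -,+} \rho_a^{\sigma}(x)\, \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x - y) + \sum_{\sigma = -,+} \rho_a^{\sigma}(x_p^{\sigma})\, \rho_b^{-\sigma}(x; |\lambda_p^{\sigma}|)\, \delta(x_p^{\sigma} - y) \\
&= \rho_a^-(x)\, \rho_b^-(x, \lambda_p^-)\, \delta(x - y) + \rho_a^-(x)\, \rho_b^+(x, \lambda_p^-)\, \delta(x - y) = \delta(x - y),
\end{aligned}
\]

where we have used (1.23), (2.21), (2.17) and (2.15). The case with $x > 1/\varepsilon$ is treated similarly.

We now give an intuitive explanation of $G_i$ defined in (2.28) and (2.29). For the compression field $i = p$ we consider the simplest but important case, the Burgers equation

\[
u_t + \Big( \frac{u^2}{2} \Big)_x = u_{xx}.
\tag{2.31}
\]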

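From the Rankine-Hugoniot condition (1.4) and the entropy condition (1.8), the stationary shock for the inviscid equation is $(\varepsilon_0, -\varepsilon_0)$, $\varepsilon_0 > 0$, with shock strength $\varepsilon = 2\varepsilon_0$. It is straightforward to verify that the viscous shock connecting $\varepsilon_0$ and $-\varepsilon_0$ is

\[
\phi_B(x) = \varepsilon_0\, \frac{1 - e^{\varepsilon_0 x}}{1 + e^{\varepsilon_0 x}}.
\tag{2.32}
\]

Linearizing (2.31) around $\phi_B$, we want to find the fundamental solution $G_B$ for

\[
w_t + \phi_B(x)\, w_x = w_{xx}.
\tag{2.33}
\]

This is to solve the dual equation with the initial condition $G_B(y, t; x, t) = \delta(x - y)$. Direct calculation yields

\[
\begin{aligned}
G_B(y, \tau; x, t) &= \frac{1 + e^{\varepsilon_0 y}}{1 + e^{\varepsilon_0 x}}\, H(x - y, t - \tau; \varepsilon_0)
= \frac{1 + e^{-\varepsilon_0 y}}{1 + e^{-\varepsilon_0 x}}\, H(x - y, t - \tau; -\varepsilon_0) \\
&= \sum_{\sigma = -,+} \rho_b^{\sigma}(x; \varepsilon_0)\, H(x - y, t - \tau; -\sigma \varepsilon_0).
\end{aligned}
\tag{2.34}
\]

Here we have used the identity

\[
\exp\Big( -\frac{(x - y - \lambda(t - \tau))^2}{4\mu(t - \tau)} \Big) = \exp\Big( \frac{\lambda(x - y)}{\mu} - \frac{(x - y + \lambda(t - \tau))^2}{4\mu(t - \tau)} \Big),
\tag{2.35}
\]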


$\mu > 0$ and $\lambda$ being constants. If we apply (2.29) to (2.33), we have $\lambda_p^\mp = \pm\varepsilon_0$, and $x_p^\mp = x$ by (2.21). Thus the approximate fundamental solution is

\[
\begin{aligned}
G_p(y, \tau; x, t) &= \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_b^{\sigma}(x; \varepsilon_0)\, H(x - y, t - \tau; -\sigma\varepsilon_0) + \rho_b^{-\sigma}(x; \varepsilon_0)\, H(x - y, t - \tau; \sigma\varepsilon_0) \Big] \\
&= \sum_{\sigma = -,+} \big[ \rho_a^{\sigma}(y) + \rho_a^{-\sigma}(y) \big]\, \rho_b^{\sigma}(x; \varepsilon_0)\, H(x - y, t - \tau; -\sigma\varepsilon_0) \\
&= G_B(y, \tau; x, t),
\end{aligned}
\]

using (2.15) and (2.34). That is, the approximate fundamental solution recovers the exact one in the case of the Burgers equation. In the general case of (1.1), the nonlinearity represented by $\rho_b^\mp$ in (2.29) is the right one. This will become clear when we assess the truncation error in Theorem 3.14. For that we will need the following lemma.

Lemma 2.6. For $-\infty < x, y < \infty$ and $0 \le \tau \le t$,

\[
\sum_{\sigma = -,+} \rho_b^{\sigma}(x; |\lambda|)\, H(x - y, t - \tau; -\sigma|\lambda|) \Big[ |\lambda|\, \frac{1 - e^{|\lambda| y}}{1 + e^{|\lambda| y}} + \sigma |\lambda| \Big] = 0,
\tag{2.36}
\]

where $|\lambda| > 0$ is a parameter.

Proof. Using (2.18) and (2.35), (2.36) is obtained by direct calculation.
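The Burgers formulas above lend themselves to a quick numerical sanity check. The sketch below, with illustrative function names and arbitrarily chosen sample values, compares the first and third forms of (2.34), evaluates a central-difference residual of (2.33) for $G_B$ as a function of $(x, t)$, and evaluates the left-hand side of (2.36). It is a check at sample points under these assumptions, not a proof.

```python
import math

def H(x, t, lam, mu=1.0):
    """Heat kernel (2.27) with speed lam and diffusivity mu."""
    return math.exp(-(x - lam * t) ** 2 / (4.0 * mu * t)) / math.sqrt(4.0 * math.pi * mu * t)

def phi_B(x, e0):
    """Viscous Burgers shock (2.32), written as -e0 * tanh(e0 * x / 2)."""
    return -e0 * math.tanh(0.5 * e0 * x)

def G_B(y, tau, x, t, e0):
    """First form of (2.34)."""
    return (1.0 + math.exp(e0 * y)) / (1.0 + math.exp(e0 * x)) * H(x - y, t - tau, e0)

def G_B_sum(y, tau, x, t, e0):
    """Third form of (2.34): sum over sigma of rho_b^sigma(x; e0) H(x - y, t - tau; -sigma e0)."""
    rho_minus = 1.0 / (1.0 + math.exp(e0 * x))
    rho_plus = 1.0 / (1.0 + math.exp(-e0 * x))
    return rho_minus * H(x - y, t - tau, e0) + rho_plus * H(x - y, t - tau, -e0)

def residual_forward(y, tau, x, t, e0, h=1e-3):
    """Central-difference residual of w_t + phi_B(x) w_x - w_xx for w = G_B in (x, t)."""
    g = lambda xx, tt: G_B(y, tau, xx, tt, e0)
    w_t = (g(x, t + h) - g(x, t - h)) / (2 * h)
    w_x = (g(x + h, t) - g(x - h, t)) / (2 * h)
    w_xx = (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / (h * h)
    return w_t + phi_B(x, e0) * w_x - w_xx

def lemma_2_6_sum(y, tau, x, t, lam):
    """Left-hand side of (2.36) for a parameter lam = |lambda| > 0."""
    factor = lam * (1.0 - math.exp(lam * y)) / (1.0 + math.exp(lam * y))
    total = 0.0
    for sigma in (-1.0, 1.0):
        rho = 1.0 / (1.0 + math.exp(-sigma * lam * x))  # rho_b^sigma(x; lam)
        total += rho * H(x - y, t - tau, -sigma * lam) * (factor + sigma * lam)
    return total

if __name__ == "__main__":
    e0, y, tau, x, t = 0.5, 0.3, 0.0, -0.7, 1.5
    print(G_B(y, tau, x, t, e0) - G_B_sum(y, tau, x, t, e0))
    print(residual_forward(y, tau, x, t, e0))
    print(lemma_2_6_sum(y, tau, x, t, 0.2))
```

All three printed quantities should be small: the first and third up to rounding, the second up to the discretization error of the finite differences.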

Next, we consider the transversal fields $i \neq p$. If $y < 0$ and $x < 0$, $G_i$ given by (2.28) is roughly $H(x - y, t - \tau; \lambda_i^-)$. Meanwhile, we are solving the dual equation of (2.11) in the left quarter plane with the initial point $(x, t)$ in it. Therefore, $\lambda_i(\phi(y)) \approx \lambda_i^-$, and the solution indeed is about $H(x - y, t - \tau; \lambda_i^-)$. For the more complicated case $y < 0$ and $x > 0$, $G_i$ is about $H(x_i^- - y, t - \tau; \lambda_i^-)$. Here we still solve the dual equation in the left quarter plane. The initial point $(x, t)$, however, is in the right quarter plane. This means that initially ($\tau \approx t$) the solution is about $H(x - y, t - \tau; \lambda_i^+)$, restricted to the left quarter plane $y < 0$. As $\tau$ goes backward (becomes smaller), the center of the heat kernel may or may not enter the shock layer, depending on whether $i > p$ or $i < p$. If it does, the characteristic speed is $\lambda_i^+$ before the center enters the shock, gradually changes inside the layer, and becomes $\lambda_i^-$ as the center leaves the shock into the left quarter plane. If the center does not enter or has not entered the shock, the heat kernel decays exponentially when restricted to $y < 0$. Therefore, it makes no significant difference to replace the speed $\lambda_i^+$ by $\lambda_i^-$. On the other hand, if the center has entered and left the shock, the correct characteristic speed is $\lambda_i^-$. To unify our notation, we may use $\lambda_i^-$ for all possibilities. This leads to $H(x - y, t - \tau; \lambda_i^-)$, which differs from $G_i$ in the initial point. We make a brief comparison here with $G_p$ before we continue to discuss the shift of the initial point from $x$ to $x_i^-$. The partition functions $\rho_b^\mp$ in (2.29) capture the correct nonlinearity as illustrated with the Burgers equation. There is another important difference between (2.28) and (2.29): the second heat kernel in $G_p$ is along the $-\lambda_p^\sigma$, not the $\lambda_p^\sigma$, direction. In the case of $y < 0$ and $x > 0$, $G_p$ is about $H(x_p^- - y, t - \tau; -\lambda_p^-)$.
Because of the entropy condition (1.23), the exact fundamental solution in the compression field should be roughly a heat kernel starting at (x, t) and moving backward with


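[Fig. 2.1. Backward characteristic crossing the shock layer: axes $y$ and $\tau$; layer boundaries $y = -1/\varepsilon$ and $y = 1/\varepsilon$; points $x$ and $x_i^-$ on the line $\tau = t$; slopes $\lambda_i^+$ and $\lambda_i^-$.]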

speed λ+p < 0. Thus the center never enters the shock. Unlike the transversal field, we cannot replace λ+p by λ− p since that will result in the p-characteristic family traveling backward into the shock, a violation of the entropy condition. To unify our notation, however, we may replace λ+p by −λ− p . The difference between the two is a higher order term, (3.25) and (3.27). This leads to the definition of the second heat kernel in G p . The shift of the initial point in G i is based on two considerations. Suppose i > p and consider the i th characteristic curve, which is the center of the heat kernel. The curve starts at (x, t) and travels backward. In leading order the slope on the right-hand side of the shock is λi+ while on the left-hand side is λi− . Therefore, the curve can be approximately regarded as piecewise linear, two line segments with slopes λi± joining at a point in the shock. For instance, we let them join on the right boundary of the shock as shown in Fig. 2.1. If we trace the second line segment up to the initial line τ = t, they intersect at a point other than x. We denote that point as xi− , whose expression is given in (2.21). That is, our first consideration of the shift is to guarantee the continuity of characteristic curves when crossing the shock. In the above intuitive discussion we have basically regarded the shock as with zero width. For the viscous shock wave with strength ε, the width is O(1/ε), and it is crucial to look into the shock layer. Inside the layer, |y| ≤ 1/ε, y < 0, the summation in (2.28) no longer collapses into a single term, even when x > 0 is large. It is an interpolation of two heat kernels, one starting at (x, t) with speed λi+ , and the other starting at (xi− , t) with speed λi− . Since the speed difference is λi+ − λi− = O(ε) and the shock layer is O(1/ε), the overlapping of the two centers at some point inside the shock implies that the maximum difference of the two centers is O(1) within the layer. 
This makes the interpolation feasible. Otherwise, without the appropriate shift $x_i^-$, the difference of the two centers would be $O(1)\varepsilon x$, which could be large. In (2.29), a similar consideration of the interpolation between two heat kernels with speeds $\lambda_p^+$ and $-\lambda_p^-$, respectively, leads to the shift $x_p^-$.
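The geometric description of the shift can be sketched in a few lines for a transversal field $i > p$. The code below assumes hypothetical speeds `lam_minus` and `lam_plus` (both positive, differing by $O(\varepsilon)$) and illustrative function names. It evaluates the first formula in (2.21), traces the piecewise linear characteristic of Fig. 2.1 back to the line $\tau = t$, and measures the gap between the centers of the two heat kernels.

```python
def shifted_initial_point(x, eps, lam_minus, lam_plus):
    """x_i^- for a transversal field i > p, following the first formula in (2.21)."""
    if x > 1.0 / eps:
        return lam_minus / lam_plus * (x - 1.0 / eps) + 1.0 / eps
    return x

def trace_back_to_initial_line(x, eps, lam_minus, lam_plus):
    """Follow the characteristic backward from (x, t) with speed lam_plus until it
    reaches the right boundary y = 1/eps of the layer, then extend the second
    segment (speed lam_minus) back up to tau = t and return where it lands."""
    edge = 1.0 / eps
    if x <= edge:
        return x
    s_hit = (x - edge) / lam_plus   # elapsed time t - tau at which y = 1/eps is reached
    return edge + lam_minus * s_hit

def center_gap(x, s, eps, lam_minus, lam_plus):
    """Distance at elapsed time s = t - tau between the centers of
    H(x - y, s; lam_plus) and H(x_i^- - y, s; lam_minus)."""
    x_shift = shifted_initial_point(x, eps, lam_minus, lam_plus)
    return abs((x - lam_plus * s) - (x_shift - lam_minus * s))

if __name__ == "__main__":
    eps, lam_minus, lam_plus = 0.05, 1.0, 0.98   # hypothetical values
    x = 100.0
    print(shifted_initial_point(x, eps, lam_minus, lam_plus))
    print(trace_back_to_initial_line(x, eps, lam_minus, lam_plus))
    s_hit = (x - 1.0 / eps) / lam_plus
    print(center_gap(x, s_hit, eps, lam_minus, lam_plus))
    print(center_gap(x, s_hit + 2.0 / (eps * lam_minus), eps, lam_minus, lam_plus))
```

With these sample values the two centers coincide when they reach $y = 1/\varepsilon$, and they stay within $O(1)$ of each other while crossing a layer of width $2/\varepsilon$, which is the point of the shift.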


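The other combinations of $x$ and $y$ can be discussed in a similar way. In particular, the case $y > 0$ and $x < 0$ leads to the definition of $x_i^+$. The rigorous justification of (2.28) and (2.29) is by calculation of the truncation error, given in the next section.

3. Estimates on the Fundamental Solution

The purpose of this section is to obtain estimates on the approximate fundamental solution (2.28), (2.29). These include the estimates on the derivatives of (2.28) and (2.29) and their truncation errors. The partition functions used for the interpolation there will be used in an essential way in the following estimates.

Theorem 3.1. Let $j \ge 0$ be an integer. For $-\infty < x, y < \infty$, $x \neq \mp 1/\varepsilon$ and $0 \le \tau \le t$, we have

\[
\frac{\partial^j}{\partial x^j} G_i(y, \tau; x, t) = (-1)^j \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \Big[ \rho_a^{\sigma}(x)\, \frac{\partial^j}{\partial y^j} H(x - y, t - \tau; \lambda_i^{\sigma}) + \rho_a^{-\sigma}(x) \Big( \frac{d x_i^{\sigma}}{dx} \Big)^{j} \frac{\partial^j}{\partial y^j} H(x_i^{\sigma} - y, t - \tau; \lambda_i^{\sigma}) \Big], \quad i \neq p,
\tag{3.1}
\]

\[
\begin{aligned}
\frac{\partial^j}{\partial x^j} G_p(y, \tau; x, t) = \sum_{\sigma = -,+} \rho_a^{\sigma}(y) \sum_{j'=0}^{j} \binom{j}{j'} \Big\{ & \frac{d^{j-j'}}{dx^{j-j'}} \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, (-1)^{j'} \frac{\partial^{j'}}{\partial y^{j'}} H(x - y, t - \tau; \lambda_p^{\sigma}) \\
& + \frac{d^{j-j'}}{dx^{j-j'}} \Big[ \rho_b^{\sigma}(x; |\lambda_p^{\sigma}|)\, e^{\lambda_p^{\sigma}(x - x_p^{\sigma})} \Big] \Big( -\frac{d x_p^{\sigma}}{dx} \Big)^{j'} e^{\lambda_p^{\sigma} y}\, \frac{\partial^{j'}}{\partial y^{j'}} H(x_p^{\sigma} - y, t - \tau; \lambda_p^{\sigma}) \Big\}.
\end{aligned}
\tag{3.2}
\]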

Proof. Taking the $j$th derivative with respect to $x$ of (2.28), we obtain (3.1). Here we note that by (2.14) and (2.15), $\frac{d}{dx}\rho_a^\mp(x) \neq 0$ only when $|x| < 1/\varepsilon$, which implies $x_i^\mp = x$, (2.21). But then $\rho_a^{\sigma}(x) + \rho_a^{-\sigma}(x) = 1$ by (2.15), and all its derivatives are zero. Similarly, we take the $j$th derivative with respect to $x$ of (2.29). Using (2.18), (1.23) and (2.35), we obtain (3.2).

Remark. By (1.23), (2.21), (2.18) and (2.17),

\[
\rho_b^-(x; |\lambda_p^-|)\, e^{\lambda_p^-(x - x_p^-)} =
\begin{cases}
\rho_b^-(x; \lambda_p^-) & \text{if } x \le 1/\varepsilon, \\
\rho_b^+(x; \lambda_p^-)\, e^{-\lambda_p^- x_p^-} & \text{if } x > 1/\varepsilon
\end{cases}
=
\begin{cases}
\rho_b^-(x; \lambda_p^-) & \text{if } x \le 1/\varepsilon, \\
O(1)\, e^{-|\lambda_p^-||x_p^-|} & \text{if } x > 1/\varepsilon,
\end{cases}
\tag{3.3}
\]

which is basically of the form $\rho_b^-(x; \lambda_p^-)$. Similarly, $\rho_b^+(x; |\lambda_p^+|)\, e^{\lambda_p^+(x - x_p^+)}$ is about $\rho_b^+(x; -\lambda_p^+)$. Also, for $\sigma = -, +$, $\rho_a^{\sigma}(y)\, e^{\lambda_p^{\sigma} y} = O(1)\, e^{-|\lambda_p^{\sigma}||y|}$. Thus the leading term in (3.2) is the first summation.


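To assess the truncation error we need a sequence of lemmas. First we discuss properties of the viscous shock wave $\phi$. For this we make the following assumption.

Assumption 3.2. Let

\[
\phi'(x) = \sum_i r_i(\phi(x))\, a_i(x), \qquad \phi(x) - u_\mp = \sum_i r_i(\phi(x))\, b_{\mp,i}(x).
\tag{3.4}
\]

For all $i \neq p$ and all integers $j \ge 0$,

\[
\Big| \frac{d^j}{dx^j} a_i(x) \Big| = O(1)\varepsilon \Big| \frac{d^j}{dx^j} a_p(x) \Big|,
\tag{3.5a}
\]

\[
|b_{\mp,i}(x)| = O(1)\varepsilon\, |b_{\mp,p}(x)|.
\tag{3.5b}
\]

Remark. Assumption 3.2 says that $\phi(x) - u_\mp$ and its derivatives are along $r_p(\phi)$ in the leading term. This is a reasonable assumption, true for physical systems such as the Navier-Stokes equations. For (1.1) it has been verified in [SX] that (3.5a) is true for $j = 0$.

Theorem 3.3. Let $j \ge 0$ be an integer. Let $\mu > 1$ be a fixed constant. Under Assumption 3.2 and the normalization $\nabla\lambda_p(\phi)\, r_p(\phi) = 1$, we have, for $-\infty < x < \infty$,

\[
\frac{d^j}{dx^j} \lambda_p(\phi(x)) =
\begin{cases}
\dfrac{d^j}{dx^j} \Big[ \lambda_p^-\, \dfrac{1 - e^{\lambda_p^- x}}{1 + e^{\lambda_p^- x}} \Big] + O(1)\varepsilon^{j+2} e^{\lambda_p^- x/\mu} & \text{if } x \le 0, \\[3mm]
\dfrac{d^j}{dx^j} \Big[ \lambda_p^+\, \dfrac{1 - e^{\lambda_p^+ x}}{1 + e^{\lambda_p^+ x}} \Big] + O(1)\varepsilon^{j+2} e^{\lambda_p^+ x/\mu} & \text{if } x \ge 0,
\end{cases}
\tag{3.6}
\]

\[
\phi(x) - u_- = O(1)\varepsilon\, e^{\lambda_p^- x/\mu} \quad \text{if } x \le 0,
\tag{3.7a}
\]

\[
\phi(x) - u_+ = O(1)\varepsilon\, e^{\lambda_p^+ x/\mu} \quad \text{if } x \ge 0,
\tag{3.7b}
\]

\[
\frac{d^j}{dx^j} \phi(x) = O(1)\varepsilon^{j+1}
\begin{cases}
e^{\lambda_p^- x/\mu} & \text{if } x \le 0, \\
e^{\lambda_p^+ x/\mu} & \text{if } x \ge 0,
\end{cases}
\qquad j \ge 1.
\tag{3.8}
\]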

A more general form of Theorem 3.3 applies to systems with physical viscosity, and is proved in [LZ3]. Next we have a few lemmas related to cancellations.

Lemma 3.4. Let $j$ be any nonnegative integer and $K > 0$ be a constant. For $|y| \le K/\varepsilon$ and $0 \le \tau \le t - 1$, if $x \lessgtr \pm K/\varepsilon$ and $i \lessgtr p$, then for $\varepsilon$ sufficiently small,

\[
\frac{\partial^j}{\partial y^j} \Big[ H(x - y, t - \tau; \lambda_i^\mp) - H(x_i^\pm - y, t - \tau; \lambda_i^\pm) \Big]
= O(1) \Big[ \varepsilon + (t - \tau)^{-\frac{1}{2}} \Big] (t - \tau)^{-\frac{j}{2}}\, H(x - y, t - \tau; \lambda_i^\mp, \mu),
\tag{3.9}
\]


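where $\mu > 1$ is an arbitrarily fixed constant. Also for $|y| \le K/\varepsilon$, $0 \le \tau \le t - 1$, and $x \lessgtr \pm K/\varepsilon$,

\[
\frac{\partial^j}{\partial y^j} \Big[ H(x - y, t - \tau; \lambda_p^\mp) - H(x_p^\pm - y, t - \tau; -\lambda_p^\pm) \Big]
= O(1)(t - \tau)^{-\frac{j+1}{2}}\, H(x - y, t - \tau; \lambda_p^\mp, \mu)\, e^{-\varepsilon^2 (t - \tau)/C},
\tag{3.10}
\]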

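where $C > 0$ is some constant and $\mu$ is as above.

Proof. We prove (3.9) with the first set of superscripts. That is, we assume $x < K/\varepsilon$ and $i < p$. First we consider $x < -1/\varepsilon$. By (2.21) and (1.7), since $|y| \le K/\varepsilon$,

\[
\begin{aligned}
x - y - \lambda_i^-(t - \tau) - \big[ x_i^+ - y - \lambda_i^+(t - \tau) \big]
&= \Big( 1 - \frac{\lambda_i^+}{\lambda_i^-} \Big) \Big( x + \frac{1}{\varepsilon} \Big) - (\lambda_i^- - \lambda_i^+)(t - \tau) \\
&= \frac{\lambda_i^- - \lambda_i^+}{\lambda_i^-} \Big[ x + \frac{1}{\varepsilon} - \lambda_i^-(t - \tau) \Big]
= O(1)\varepsilon \big[ x - y - \lambda_i^-(t - \tau) \big] + O(1).
\end{aligned}
\tag{3.11}
\]

If $-1/\varepsilon \le x < K/\varepsilon$, $x_i^+ = x$ by (2.21). The left-hand side of (3.11) becomes $(\lambda_i^+ - \lambda_i^-)(t - \tau) = O(1)\varepsilon(t - \tau) = O(1)\varepsilon \big[ x - y - \lambda_i^-(t - \tau) \big] + O(1)$. Thus (3.11) is still true. This implies

\[
x_i^+ - y - \lambda_i^+(t - \tau) = [1 + O(1)\varepsilon] \big[ x - y - \lambda_i^-(t - \tau) \big] + O(1).
\tag{3.12}
\]

Denote the left-hand side of (3.9) with the first set of superscripts as $D^{(j)}$. If we have

\[
D^{(j)} = O(1) \Big[ \varepsilon + (t - \tau)^{-\frac{1}{2}} \Big] (t - \tau)^{-\frac{j}{2}} \Big[ H(x - y, t - \tau; \lambda_i^-, \mu) + H(x_i^+ - y, t - \tau; \lambda_i^+, \mu) \Big],
\tag{3.13}
\]

then using (3.12) and $t - \tau \ge 1$ we obtain (3.9). To prove (3.13), notice that by induction we have

\[
\frac{\partial^j}{\partial y^j} H(x - y, t - \tau; \lambda) = H(x - y, t - \tau; \lambda) \sum_{0 \le j' \le [j/2]} K_{jj'}\, \big( x - y - \lambda(t - \tau) \big)^{j - 2j'} (t - \tau)^{-j + j'},
\tag{3.14}
\]

where $K_{jj'}$ is a constant determined only by $j$ and $j'$. Applying (3.14) to $D^{(j)}$ gives us

\[
\begin{aligned}
D^{(j)} = {}& H(x - y, t - \tau; \lambda_i^-) \sum_{0 \le j' \le [j/2]} K_{jj'}\, \big( x - y - \lambda_i^-(t - \tau) \big)^{j - 2j'} (t - \tau)^{-j + j'} \\
& - H(x_i^+ - y, t - \tau; \lambda_i^+) \sum_{0 \le j' \le [j/2]} K_{jj'}\, \big( x_i^+ - y - \lambda_i^+(t - \tau) \big)^{j - 2j'} (t - \tau)^{-j + j'} \\
= {}& I + II,
\end{aligned}
\]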


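where

\[
\begin{aligned}
I = {}& \Big[ H(x - y, t - \tau; \lambda_i^-) - H(x_i^+ - y, t - \tau; \lambda_i^+) \Big] \sum_{0 \le j' \le [j/2]} K_{jj'}\, \big( x - y - \lambda_i^-(t - \tau) \big)^{j - 2j'} (t - \tau)^{-j + j'}, \\
II = {}& H(x_i^+ - y, t - \tau; \lambda_i^+) \sum_{0 \le j' \le [j/2]} K_{jj'} \Big[ \big( x - y - \lambda_i^-(t - \tau) \big)^{j - 2j'} - \big( x_i^+ - y - \lambda_i^+(t - \tau) \big)^{j - 2j'} \Big] (t - \tau)^{-j + j'}.
\end{aligned}
\tag{3.15}
\]

By (3.12),

\[
\begin{aligned}
I = {}& O(1) \Big[ H(x - y, t - \tau; \lambda_i^-) + H(x_i^+ - y, t - \tau; \lambda_i^+) \Big]
\Big| \frac{(x - y - \lambda_i^-(t - \tau))^2}{4(t - \tau)} - \frac{(x_i^+ - y - \lambda_i^+(t - \tau))^2}{4(t - \tau)} \Big| \\
& \times \sum_{0 \le j' \le [j/2]} \big| x - y - \lambda_i^-(t - \tau) \big|^{j - 2j'} (t - \tau)^{-j + j'} \\
= {}& O(1) \Big[ \varepsilon + (t - \tau)^{-\frac{1}{2}} \Big] (t - \tau)^{-\frac{j}{2}} \Big[ H(x - y, t - \tau; \lambda_i^-, \mu) + H(x_i^+ - y, t - \tau; \lambda_i^+, \mu) \Big].
\end{aligned}
\]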

$II$ in (3.15) can be treated similarly. Thus we have (3.13), hence (3.9). Equation (3.10) is proved in a similar way, using (2.22) and (2.23).

Lemma 3.5. Let $j$ be any nonnegative integer and $K > 0$ be a constant. Let $\varepsilon$ be sufficiently small. For $|y| \le K/\varepsilon$ and $0 \le \tau \le t - 1$, if $x < -K/\varepsilon$ and $i > p$, or $x > K/\varepsilon$ and $i < p$, then

\[
\begin{aligned}
\frac{\partial^j}{\partial y^j} H\big( x_i^\pm - y, t - \tau; \lambda_i^\pm \big)
&= \frac{\partial^j}{\partial y^j} H\big( x - y, t - \tau; \lambda_i^\pm \big) \\
&= O(1)\, e^{-(t - \tau)/C} \exp\Big( -\frac{(x - y - \lambda_i^\pm(t - \tau))^2}{4\mu(t - \tau)} \Big) \\
&= O(1)\, e^{-(t - \tau)/C} \exp\Big( -\frac{(x - y - \lambda_i^\mp(t - \tau))^2}{4\mu(t - \tau)} \Big),
\end{aligned}
\tag{3.16}
\]

where $\mu > 1$ is an arbitrarily fixed constant and $C > 0$ is some constant.

Proof. We prove (3.16) for $x < -K/\varepsilon$ and $i > p$. The other case is similar. From (2.21) we have $x_i^\pm = x$. This gives us the first equality. By (3.14) and by the assumption $t - \tau \ge 1$ we bound the left-hand side of (3.16) by

\[
O(1) \exp\Big( -\frac{(x - y - \lambda_i^\pm(t - \tau))^2}{4\mu^*(t - \tau)} \Big)
\]

with a constant $\mu^*$, $1 < \mu^* < \mu$. Using the assumption $|y| \le K/\varepsilon$ and $x < -K/\varepsilon$ we have $x - y < 0$. Also notice that $\lambda_i^\pm > 0$ since $i > p$. These give us the second equality in (3.16). The third one is also true since $\lambda_i^\mp - \lambda_i^\pm = O(1)\varepsilon$ and $\varepsilon$ is sufficiently small.


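Lemma 3.6. Let $\varepsilon$ be sufficiently small. Let $j_1 \ge \bar j \ge 0$ and $j_2 \ge 0$ be any integers, $\sigma = -, +$, and $\mu > 1$ be a constant. For $-\infty < x, y < \infty$, $x \neq \mp 1/\varepsilon$, and $0 \le \tau \le t - 1$, we have

\[
\begin{aligned}
& \rho_a^{\sigma}(y) \Big[ \Big( \frac{d x_p^{\sigma}}{dx} \Big)^{j_2} \frac{\partial^{j_1}}{\partial y^{j_1}} H(x_p^{\sigma} - y, t - \tau; -\lambda_p^{\sigma}) - \frac{\partial^{j_1}}{\partial y^{j_1}} H(x - y, t - \tau; -\lambda_p^{\sigma}) \Big] \\
& \quad = \rho_a^{\sigma}(y)\, \frac{\partial^{\bar j}}{\partial y^{\bar j}} \Big[ O(1)(t - \tau)^{-\frac{j_1 - \bar j + 1}{2}}\, H(x - y, t - \tau; -\lambda_p^{\sigma}, \mu)\, e^{-\varepsilon^2 (t - \tau)/C} \Big],
\end{aligned}
\tag{3.17}
\]

where $C > 0$ is some constant.

Proof. We consider $\sigma = -$. The case $\sigma = +$ is similar. By (2.21) and (2.15), the left-hand side of (3.17) is nonzero only when $x > 1/\varepsilon$ and $y < 1/\varepsilon$. This implies

\[
x - y > x - \frac{1}{\varepsilon} > 0.
\tag{3.18}
\]

Together with (2.24),

\[
|x - x_p^-| = O(1)\varepsilon\, |x - 1/\varepsilon| = O(1)\varepsilon\, |x - y|.
\tag{3.19}
\]

Applying (3.14) to the left-hand side of (3.17), and similar to the proof of Lemma 3.4, we write it as $I + II$, where

\[
\begin{aligned}
I = {}& \rho_a^{\sigma}(y) \Big\{ \Big( \frac{d x_p^-}{dx} \Big)^{j_2} \frac{\partial^{\bar j}}{\partial y^{\bar j}} \Big[ \Big( H(x_p^- - y, t - \tau; -\lambda_p^-) - H(x - y, t - \tau; -\lambda_p^-) \Big) \\
& \qquad \times \sum_{0 \le j' \le [(j_1 - \bar j)/2]} K_{j_1 - \bar j,\, j'}\, \big( x_p^- - y + \lambda_p^-(t - \tau) \big)^{j_1 - \bar j - 2j'} (t - \tau)^{-j_1 + \bar j + j'} \Big] \Big\}, \\
II = {}& \rho_a^{\sigma}(y) \Big\{ \Big( \frac{d x_p^-}{dx} \Big)^{j_2} \frac{\partial^{\bar j}}{\partial y^{\bar j}} \Big[ H(x - y, t - \tau; -\lambda_p^-) \\
& \qquad \times \sum_{0 \le j' \le [(j_1 - \bar j)/2]} K_{j_1 - \bar j,\, j'}\, \big( x_p^- - y + \lambda_p^-(t - \tau) \big)^{j_1 - \bar j - 2j'} (t - \tau)^{-j_1 + \bar j + j'} \Big]
- \frac{\partial^{j_1}}{\partial y^{j_1}} H(x - y, t - \tau; -\lambda_p^-) \Big\}.
\end{aligned}
\tag{3.20}
\]

By (2.25), (3.19), (3.18) and (1.23),

\[
\begin{aligned}
I = \rho_a^{\sigma}(y)\, \frac{\partial^{\bar j}}{\partial y^{\bar j}} \Big\{ & O(1) \Big[ H(x_p^- - y, t - \tau; -\lambda_p^-) + H(x - y, t - \tau; -\lambda_p^-) \Big] (t - \tau)^{-1} \varepsilon\, |x - y| \\
& \times \sum_{0 \le j' \le [(j_1 - \bar j)/2]} \Big( |x - y| + \lambda_p^-(t - \tau) \Big)^{j_1 - \bar j - 2j' + 1} (t - \tau)^{-j_1 + \bar j + j'} \Big\}.
\end{aligned}
\]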


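With the smallness of $\varepsilon$ and (2.22), we have

\[
\begin{aligned}
H(x_p^- - y, t - \tau; -\lambda_p^-) + H(x - y, t - \tau; -\lambda_p^-)
&= O(1)(t - \tau)^{-\frac{1}{2}} \exp\Big( -\frac{\big( (1 + O(1)\varepsilon)|x - y| + \lambda_p^-(t - \tau) \big)^2}{4(t - \tau)} \Big) \\
&= O(1)\, H(x - y, t - \tau; -\lambda_p^-, \mu)\, e^{-\varepsilon^2 (t - \tau)/C}
\end{aligned}
\]

for some constant $C > 0$. This gives

\[
I = \rho_a^{\sigma}(y)\, \frac{\partial^{\bar j}}{\partial y^{\bar j}} \Big[ O(1)(t - \tau)^{-\frac{j_1 - \bar j}{2}}\, \varepsilon\, H(x - y, t - \tau; -\lambda_p^-, \mu)\, e^{-\varepsilon^2 (t - \tau)/C} \Big],
\]

which can be written as the right-hand side of (3.17). The term $II$ in (3.20) is treated similarly.

We introduce the notation

\[
|\lambda_p| \equiv \min\{ \lambda_p^-, -\lambda_p^+ \} > 0.
\tag{3.21}
\]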

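Lemma 3.7. Let $j$ be any nonnegative integer, and $\mu > 1$ be a constant. For $-\infty < x < \infty$ we have

\[
\rho_b^\mp(x; |\lambda_p^\mp|) + \rho_b^\mp(x; |\lambda_p^\pm|) = O(1)\, \rho_b^\mp(x; |\lambda_p|),
\tag{3.22}
\]

\[
\frac{d^j}{dx^j} \Big[ \rho_b^\mp(x; |\lambda_p^\mp|) - \rho_b^\mp(x; |\lambda_p^\pm|) \Big] = O(1)\varepsilon^{j+1} e^{-|\lambda_p||x|/\mu}.
\tag{3.23}
\]

Proof. Equation (3.22) is straightforward from (2.17). Consider $j = 0$ in (3.23). By (2.17), (1.23), (2.23) and (2.22),

\[
\begin{aligned}
\big| \rho_b^-(x; \lambda_p^-) - \rho_b^-(x; |\lambda_p^+|) \big|
&= \rho_b^-(x; \lambda_p^-)\, \rho_b^-(x; -\lambda_p^+)\, \big| e^{\lambda_p^- x} - e^{-\lambda_p^+ x} \big| \\
&= O(1)\, \rho_b^-(x; \lambda_p^-)\, \rho_b^-(x; -\lambda_p^+) \big( e^{\lambda_p^- x} + e^{-\lambda_p^+ x} \big) \varepsilon^2 |x|
= O(1)\varepsilon\, e^{-|\lambda_p||x|/\mu}.
\end{aligned}
\]

For $j \ge 1$ we use (2.20).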
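The $j = 0$ case of (3.23) can be probed numerically. The sketch below assumes hypothetical end-state speeds $\lambda_p^- = \varepsilon$ and $\lambda_p^+ = -(\varepsilon + \varepsilon^2)$, chosen only so that (2.22) and (2.23) hold, and scans the ratio of the left-hand side to $\varepsilon e^{-|\lambda_p||x|/\mu}$ over a grid. The function names and the grid are illustrative.

```python
import math

def rho_b_minus(x, lam):
    """rho_b^-(x; lam) = 1 / (1 + exp(lam * x)) from (2.17); lam * x is kept moderate."""
    return 1.0 / (1.0 + math.exp(lam * x))

def ratio_3_23(x, eps, mu=2.0):
    """|rho_b^-(x; lam_p^-) - rho_b^-(x; |lam_p^+|)| divided by eps * exp(-|lam_p||x|/mu)."""
    lam_m = eps                  # hypothetical lambda_p^- > 0
    lam_p = -(eps + eps ** 2)    # hypothetical lambda_p^+ < 0, so lam_m + lam_p = -eps^2
    lam = min(lam_m, -lam_p)     # |lambda_p| as in (3.21)
    diff = abs(rho_b_minus(x, lam_m) - rho_b_minus(x, -lam_p))
    return diff / (eps * math.exp(-lam * abs(x) / mu))

def max_ratio(eps, n=3001, span=30.0):
    """Maximum of the ratio over a uniform grid of x in [-span/eps, span/eps]."""
    xs = [(-span + 2.0 * span * k / (n - 1)) / eps for k in range(n)]
    return max(ratio_3_23(x, eps) for x in xs)

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.01):
        print(eps, max_ratio(eps))
```

If the bound is uniform in $\varepsilon$, as the lemma states, the printed maxima should stay bounded as $\varepsilon$ decreases rather than grow.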

Lemma 3.8. Let $\mu^* > \mu > 0$ be constants. For $x \lessgtr \pm 1/\varepsilon$, $|y| \le 1/\varepsilon$ and $0 \le \tau \le t - 1$, we have, for $\varepsilon$ sufficiently small,

\[
H(x - y, t - \tau; \lambda_p^\mp, \mu) = O(1)\, H(x - y, t - \tau; \lambda_p^\mp, \mu^*)\, e^{-\varepsilon^2 (t - \tau)/C},
\tag{3.24}
\]

where $C > 0$ is some constant.

Proof. We prove (3.24) with the first set of signs. Since $x < 1/\varepsilon$ and $|y| \le 1/\varepsilon$, $x - y < 2/\varepsilon$. Thus

\[
\big( x - y - \lambda_p^-(t - \tau) \big)^2 \ge -2(x - y)\lambda_p^-(t - \tau) + (\lambda_p^-)^2 (t - \tau)^2 > -4\lambda_p^-(t - \tau)/\varepsilon + (\lambda_p^-)^2 (t - \tau)^2.
\]

Using (2.22) we have (3.24).


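Lemma 3.9. Let $j_1 \ge 1$ and $j_2 \ge 0$ be any integers and $\mu > 1$ be a constant. Let $\varepsilon$ be sufficiently small. For $x \lessgtr \pm 1/\varepsilon$, $|y| \le 1/\varepsilon$ and $0 \le \tau \le t - 1$, we have

\[
\begin{aligned}
& \rho_b^\mp(x; |\lambda_p^\mp|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; \lambda_p^\mp \big) - \rho_b^\mp(x; |\lambda_p^\pm|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x_p^\pm - y, t - \tau; -\lambda_p^\pm \big) \\
& \quad = O(1) \Big[ \rho_b^\mp(x; |\lambda_p|) + e^{-|\lambda_p||x|/\mu} \Big] (t - \tau)^{-\frac{j_2 + 1}{2}}\, H\big( x - y, t - \tau; \lambda_p^\mp, \mu \big)\, e^{-\varepsilon^2 (t - \tau)/C},
\end{aligned}
\tag{3.25}
\]

\[
\begin{aligned}
& \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\mp|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; \lambda_p^\mp \big) - \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\pm|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x_p^\pm - y, t - \tau; -\lambda_p^\pm \big) \\
& \quad = O(1)\, e^{-|\lambda_p||x|/\mu}\, \varepsilon^{j_1} (t - \tau)^{-\frac{j_2 + 1}{2}}\, H\big( x - y, t - \tau; \lambda_p^\mp, \mu \big)\, e^{-\varepsilon^2 (t - \tau)/C},
\end{aligned}
\tag{3.26}
\]

where $C > 0$ is some constant.

Proof. Equation (3.25) is obtained from (3.23), (3.24), (3.22) and (3.10). Equation (3.26) is obtained similarly, using (2.20) and (2.22) as well.

Lemma 3.10. Let $j_1$ and $j_2$ be any nonnegative integers, and $\mu > 1$ be a constant. For $x \gtrless \pm 1/\varepsilon$, $|y| \le 1/\varepsilon$ and $0 \le \tau \le t - 1$, we have, for $\varepsilon$ sufficiently small,

\[
\begin{aligned}
& \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\mp|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; \lambda_p^\mp \big) - \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\pm|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x_p^\pm - y, t - \tau; -\lambda_p^\pm \big) \\
& \quad = O(1) \Big[ \rho_b^\pm(x; |\lambda_p|) + e^{-|\lambda_p||x|/\mu} \Big] \varepsilon^{j_1 + 1} (t - \tau)^{-\frac{j_2}{2}}\, H\big( x - y, t - \tau; \lambda_p^\pm, \mu \big)\, e^{-[\varepsilon^2 (t - \tau) + \varepsilon |x|]/C},
\end{aligned}
\tag{3.27}
\]

\[
\begin{aligned}
& \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\mp|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; \lambda_p^\mp \big) + \frac{d^{j_1}}{dx^{j_1}} \rho_b^\mp(x; |\lambda_p^\pm|)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x_p^\pm - y, t - \tau; -\lambda_p^\pm \big) \\
& \quad = O(1)\, \rho_b^\pm(x; |\lambda_p|)\, \varepsilon^{j_1} (t - \tau)^{-\frac{j_2}{2}}\, H\big( x - y, t - \tau; \lambda_p^\pm, \mu \big)\, e^{-[\varepsilon^2 (t - \tau) + \varepsilon |x|]/C},
\end{aligned}
\tag{3.28}
\]

where $C > 0$ is some constant.

Proof. We prove (3.27) with $j_1 = 0$ and the first set of superscripts. Our assumption for this set is $x > 1/\varepsilon$, which implies $x_p^+ = x$ by (2.21). Using (2.35) and (2.18), the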


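left-hand side of (3.27) is

\[
\begin{aligned}
& \rho_b^-(x; \lambda_p^-)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; \lambda_p^- \big) - \rho_b^-(x; -\lambda_p^+)\, \frac{\partial^{j_2}}{\partial y^{j_2}} H\big( x - y, t - \tau; -\lambda_p^+ \big) \\
& \quad = \rho_b^+(x; \lambda_p^-)\, \frac{\partial^{j_2}}{\partial y^{j_2}} \Big[ e^{-\lambda_p^- y} H\big( x - y, t - \tau; -\lambda_p^- \big) \Big] - \rho_b^+(x; -\lambda_p^+)\, \frac{\partial^{j_2}}{\partial y^{j_2}} \Big[ e^{\lambda_p^+ y} H\big( x - y, t - \tau; \lambda_p^+ \big) \Big] \\
& \quad = \Big[ \rho_b^+(x; \lambda_p^-) - \rho_b^+(x; -\lambda_p^+) \Big] \frac{\partial^{j_2}}{\partial y^{j_2}} \Big[ e^{\lambda_p^+ y} H\big( x - y, t - \tau; \lambda_p^+ \big) \Big] \\
& \qquad - \rho_b^+(x; \lambda_p^-)\, \frac{\partial^{j_2}}{\partial y^{j_2}} \Big[ e^{\lambda_p^+ y} H\big( x - y, t - \tau; \lambda_p^+ \big) - e^{-\lambda_p^- y} H\big( x - y, t - \tau; -\lambda_p^- \big) \Big].
\end{aligned}
\tag{3.29}
\]

Note that $x > 1/\varepsilon$ and $|y| \le 1/\varepsilon$ imply $x - y > 0$. Applying (3.23), the first term on the right-hand side of (3.29) can be written as the right-hand side of (3.27). To treat the second term we need

\[
\frac{d^j}{dy^j} \Big( e^{\lambda_p^+ y} - e^{-\lambda_p^- y} \Big) = O(1)\varepsilon^{j+1}
\tag{3.30}
\]

for $j \ge 0$ and $|y| \le 1/\varepsilon$. Equation (3.30) is from (2.22) and (2.23). We also need

\[
\frac{\partial^j}{\partial y^j} \Big[ H\big( x - y, t - \tau; \lambda_p^+ \big) - H\big( x - y, t - \tau; -\lambda_p^- \big) \Big]
= O(1)\varepsilon (t - \tau)^{-\frac{j}{2}}\, H\big( x - y, t - \tau; \lambda_p^+, \mu \big)\, e^{-[\varepsilon^2 (t - \tau) + \varepsilon |x|]/C}
\tag{3.31}
\]

for $j \ge 0$, $x > 1/\varepsilon$ and $|y| \le 1/\varepsilon$. Equation (3.31) is true by (3.14), (2.22) and (2.23), together with $x - y > 0$. Using (3.22), (3.30) and (3.31), the second term on the right-hand side of (3.29) is settled. For the case $j_1 \ge 1$, we use (2.20) and note that

\[
\frac{(\lambda_p^-)^{j_1} q_{j_1 - 1}\big( e^{\lambda_p^- x} \big)\, e^{\lambda_p^- x}}{\big( 1 + e^{\lambda_p^- x} \big)^{j_1}}
- \frac{(-\lambda_p^+)^{j_1} q_{j_1 - 1}\big( e^{-\lambda_p^+ x} \big)\, e^{-\lambda_p^+ x}}{\big( 1 + e^{-\lambda_p^+ x} \big)^{j_1}}
= O(1)\varepsilon^{j_1 + 1} (1 + \varepsilon |x|),
\tag{3.32}
\]

where $q_{j_1 - 1}$ is given in (2.20). Modify (3.29) accordingly and use the result for $j_1 = 0$. This settles the case $j_1 \ge 1$. Equation (3.27) with the second set of superscripts is treated in a similar way. Equation (3.28) is easier to obtain than (3.27).

We now estimate the truncation error of the approximate fundamental solution (2.28), (2.29), which is defined as

\[
T_i(y, \tau; x, t) = \Big\{ \frac{\partial}{\partial \tau} G_i + \frac{\partial}{\partial y} \big[ \lambda_i(\phi(y))\, G_i \big] + \frac{\partial^2}{\partial y^2} G_i \Big\} (y, \tau; x, t)
\tag{3.33}
\]

for all $i$.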
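As a sanity check of the definition (3.33), one can evaluate the truncation error in the Burgers case, where the approximate fundamental solution coincides with the exact $G_B$ of (2.34) and $\lambda_p(\phi) = \phi_B$, so the error should vanish. The sketch below uses central differences in $(y, \tau)$; the function names and sample values are illustrative.

```python
import math

def H(x, t, lam):
    """Heat kernel (2.27) with mu = 1."""
    return math.exp(-(x - lam * t) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def phi_B(y, e0):
    """Viscous Burgers shock (2.32), written as -e0 * tanh(e0 * y / 2)."""
    return -e0 * math.tanh(0.5 * e0 * y)

def G_B(y, tau, x, t, e0):
    """Exact fundamental solution (2.34) for the linearized Burgers equation (2.33)."""
    return (1.0 + math.exp(e0 * y)) / (1.0 + math.exp(e0 * x)) * H(x - y, t - tau, e0)

def truncation_error(y, tau, x, t, e0, h=1e-3):
    """Central-difference approximation of (3.33) with G_i = G_B and lambda_i(phi) = phi_B.
    Requires tau + h < t so that the kernel stays well defined."""
    g = lambda yy, tt: G_B(yy, tt, x, t, e0)
    flux = lambda yy: phi_B(yy, e0) * g(yy, tau)
    d_tau = (g(y, tau + h) - g(y, tau - h)) / (2 * h)
    d_flux = (flux(y + h) - flux(y - h)) / (2 * h)
    d_yy = (g(y + h, tau) - 2 * g(y, tau) + g(y - h, tau)) / (h * h)
    return d_tau + d_flux + d_yy

if __name__ == "__main__":
    print(truncation_error(0.3, 0.0, -0.7, 1.5, 0.5))
    print(truncation_error(-2.0, 0.5, 1.0, 3.0, 0.5))
```

The printed values should be at the level of the finite-difference error. For a general system the truncation error does not vanish, and the theorems that follow quantify it.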


Theorem 3.11. Let j be any nonnegative integer and i = p. For −∞ < x, y < ∞, x = ∓1/ε, and 0 ≤ τ ≤ t, we have ∂j Ti (y, τ ; x, t) = ∂x j ( 'd d2 σ σ λi (φ(y))ρa (y) + 2 ρa (y) (−1) j dy dy σ =−,+ σ j j σ d xi ∂ ∂j σ σ −σ σ H xi − y, t − τ ; λi ρa (x) j H x − y, t − τ ; λi + ρa (x) ∂y dx ∂y j ( ' d σ σ σ + ρa (y) λi (φ(y)) − λi + 2 ρa (y) (−1) j dy σ =−,+ ∂ j+1 ρaσ (x) j+1 H x − y, t − τ ; λiσ ∂y σ j j+1 σ d xi ∂ −σ σ + ρa (x) . (3.34) H xi − y, t − τ ; λi dx ∂ y j+1 Proof. Substituting (2.28) into (3.33), we have for i = p, ( ' d Ti (y, τ ; x, t) = ρaσ (y) λi (φ(y)) − λiσ + 2 ρaσ (y) dy σ =−,+ # $ σ ∂ ∂ σ σ −σ σ ρa (x) H x − y, t − τ ; λi + ρa (x) H xi − y, t − τ ; λi ∂y ∂y ( 2 'd d σ σ λi (φ(y))ρa (y) + 2 ρa (y) + dy dy σ =−,+ σ σ −σ (3.35) ρa (x)H x − y, t −τ ; λi + ρa (x)H xiσ − y, t −τ ; λiσ . Take the j th derivative with respect to x on both sides. As in the proof of Theorem 3.1, we note that ddx ρa∓ (x) = 0 only when |x| < 1/ε, which implies xi∓ = x, and that d σ −σ d x [ρa (x) + ρa (x)] = 0. Theorem 3.12. Let j be any nonnegative integer, 0 ≤ j¯ ≤ j and 0 ≤ J ≤ j + 1. Let ε be sufficiently small and i = p. For −∞ < x, y < ∞, x = ∓1/ε, and 0 ≤ τ ≤ t − 1, we have $ # ∂j d σ d2 σ λ ρ T (y, τ ; x, t) = (φ(y)) (y) + ρ (y) i i ∂x j dy a dy 2 a σ =−,+ ( ¯ ' j¯ ∂j σ − 21 − j− σ (t − τ ) 2 H (x − y, t − τ ; λi , µ) ×ρa (x) ¯ O(1) ε + (t − τ ) ∂y j ∂ j+1 σ σ j (3.36) ρa (y) λi (φ(y)) − λi (−1) ρaσ (x) j+1 H x − y, t − τ ; λiσ + ∂y σ =−,+


T.-P. Liu, Y. Zeng

σ ∂ j+1 σ + H xi − y, t − τ ; λi ∂ y j+1 # d ∂j + λi (φ(y))ρaσ (y)(−1) j ρaσ (x) j H x − y, t − τ ; λiσ dy ∂y σ =−,+ σ j j σ d xi ∂ −σ σ + ρa (x) H xi − y, t − τ ; λi dx ∂y j d ρaσ (y)ρaσ (x) + dy σ =−,+ ρa−σ (x)

×

d xiσ dx

j

* ∂J ) − 21 − j−J2 +1 σ O(1) ε + (t − τ ) (t − τ ) H (x − y, t − τ ; λ , µ) , i ∂yJ

where µ > 1 is an arbitrarily fixed constant. Proof. We apply Lemmas 3.4 and 3.5 to (3.34) to achieve cancellation between terms involving derivatives of ρaσ (y). For instance, d σ ∂j λi (φ(y)) ρa (y)(−1) j ρaσ (x) j H x − y, t − τ ; λiσ dy ∂ y σ =−,+ σ j j d x ∂ i + ρa−σ (x) H xiσ − y, t − τ ; λiσ dx ∂y j ¯ ¯ ∂ j− j d σ ∂j j σ = λi (φ(y)) ρa (y)(−1) ρa (x) ¯ H x − y, t − τ ; λiσ ¯ j− j j dy ∂y ∂y σ =−,+ ⎤ ! "j ¯ −σ d xi−σ ∂ j− j −σ ⎦ − H xi − y, t − τ ; λi dx ∂ y j− j¯ by replacing σ by −σ in the second term on the left-hand side and using d σ − dy ρa (y).

d −σ dy ρa (y)

=

We then apply (2.25), (3.9) and (3.16) to the right-hand side. This gives us the corresponding term in (3.36). Theorem 3.13. Let j be any nonnegative integer. For −∞ < x, y < ∞, x = ∓1/ε, and 0 ≤ τ ≤ t, we have ∂j T p (y, τ ; x, t) = ∂x j ( j ' d σ d j− j σ σ σ ρa (y) λ p (φ(y)) − λ p + 2 ρa (y) ρ (x; |λσp |) j− j b dy d x σ =−,+

(3.37)

j =0

∂ j +1

%

&

H x − y, t − τ ; λσp ∂ y j +1 ( j ' d σ d j− j σ λσp (x−x σp ) σ σ σ ρa (y) λ p (φ(y)) + λ p +2 ρa (y) ρ (x; |λ |)e + p b dy d x j− j σ =−,+ ×(−1) j

j =0


& % d x σp j λσ y ∂ j +1 σ σ × − e p H x − y, t − τ ; λ p p dx ∂ y j +1 & % j σ ∂ + λ p j H x σp − y, t − τ ; λσp ∂y ( 'd d2 σ σ λ p (φ(y))ρa (y) + 2 ρa (y) + dy dy σ =−,+ j & % j d j− j σ σ j ∂ σ ρ (x; |λ |)(−1) H x − y, t − τ ; λ × p p d x j− j b ∂y j j =0

+

d xσ j σ ∂ j & % d j− j σ p λσp (x−x σp ) σ ρ − (x; |λ |)e eλ p y j H x σp − y, t −τ ; λσp . p b j− j dx dx ∂y

Proof. Substituting (2.29) into (3.33), we have T p (y, τ ; x, t) = ( % & ' d ∂ ρaσ (y) λ p (φ(y))−λσp + 2 ρaσ (y) ρbσ (x; |λσp |) H x − y, t −τ ; λσp dy ∂y σ =−,+ ' ( d ρaσ (y) λ p (φ(y)) + λσp + 2 ρaσ (y) ρb−σ (x; |λσp |) + dy σ =−,+ & % ∂ × H x σp − y, t − τ ; −λσp ∂y d2 d λ p (φ(y))ρaσ (y) + 2 ρaσ (y) + dy dy σ =−,+ & & % % σ σ σ × ρbσ (x; |λσp |)H x − y, t −τ ; λσp +ρ−σ . (3.38) b (x; |λ p |)H x p − y, t −τ ; −λ p

Taking the j th derivative with respect to x and using (2.18) and (2.35) as in the proof of (3.2), we obtain (3.34). Theorem 3.14. Let j be any nonnegative integer, 0 ≤ j¯ ≤ j and 0 ≤ J ≤ j + 1. Let µ > 1 be a constant. For −∞ < x, y < ∞, x = ∓1/ε, and 0 ≤ τ ≤ t − 1, we have, for ε sufficiently small, ∂j T p (y, τ ; x, t) = ∂x j

ρaσ (y)

j d j− j

ρ σ (x; |λσp |)(−1) j j− j b d x σ =−,+ j =0 σ % & & % 1 − eλ p y ∂ j +1 λσp y σ σ × 1+e H x − y, t − τ ; λ λ p (φ(y)) − λ p σ p 1 + eλ p y ∂ y j +1 ! σ " & & % ∂j 1 − eλ p y d % λσp y σ σ 1+e H x − y, t − τ ; λ + λ p (φ(y)) − λ p σ p dy ∂y j 1 + eλ p y

(3.39)


+

σ =−,+

× +

J −2 j =0

ρaσ (y) λ p (φ(y)) + λσp #

∂J ∂yJ d j− j

d x j− j

O(1)(t − τ )−

j+2−J 2

σ

ρb−σ (x; |λσp |)e−λ p x/µ % &$ σ eλ p y/µ H x − y, t − τ ; λσp , µ

ρbσ (x; |λσp |)

σ

× e−λ p x/µ

∂ j +1

j +1

% & σ 1 2 O(1)(t −τ )− 2 eλ p y/µ H x − y, t −τ ; λσp , µ e−ε (t−τ )/C

∂y d σ ρa (y) ρbσ (x; |λ p |) + e−|λ p ||x|/µ + dy σ =−,+ ⎧ ⎨ ∂J # % &$ − j+2−J σ,µ 2 O(1)(t − τ ) H x − y, t − τ ; λ × p ⎩∂yJ

J −2

+

j =0

+

σ =−,+

×

⎫

⎬ % & j +1 1 ∂ 2 O(1)(t − τ )− 2 H x − y, t − τ ; λσp , µ e−ε (t−τ )/C ε j− j ⎭ ∂ y j +1

d ρaσ (y) λ p (φ(y)) dy

¯ # ∂j

O(1)(t − τ )− ∂ y j¯

σ

ρb−σ (x; |λσp |)e−λ p x/µ j+1− j¯ λσ y/µ 2 e p H

%

&$ x − y, t − τ ; λσp , µ

¯

+

j−1 d j− j

j =0

d x j− j

ρbσ (x; |λσp |)

% & σ ∂j − 21 eλσp y/µ H x − y, t − τ ; λσ , µ e−ε2 (t−τ )/C × e−λ p x/µ O(1)(t − τ ) p j ∂y d σ d2 σ + λ p (φ(y)) ρa (y) + 2 ρa (y) ρbσ (x; |λ p |) + e−|λ p ||x|/µ dy dy σ =−,+ ⎧ ⎨ ∂ j¯ # % &$ j¯ − j+1− σ,µ 2 O(1)(t − τ ) H x − y, t − τ ; λ × p ⎩ ∂ y j¯

¯

+

j−1

ε

j− j

∂j

∂y j

j =0

O(1)(t −

1 τ )− 2

⎫ ⎬ & 2 H x − y, t − τ ; λσp , µ e−ε (t−τ )/C , ⎭ %

where |λ p | is defined by (3.21) and C > 0 is some constant. Proof. Applying (2.36) to (3.38), we have T p (y, τ ; x, t) = σ =−,+

ρaσ (y)

∂ ∂y

& % ρbσ (x; |λσp |)H x − y, t − τ ; λσp


+

ρb−σ (x; |λσp |)H

%

x − y, t

− τ ; −λσp

&

λσp y σ1−e λ p (φ(y)) − λ p λσp y


1+e ) ∂ λ p (φ(y)) + λσp ρaσ (y)ρb−σ (x; |λσp |) + ∂ y σ =−,+ * × H (x σp − y, t − τ ; −λσp ) − H (x − y, t − τ ; −λσp ) $ & % # d d2 λ p (φ(y)) ρaσ (y) + 2 ρaσ (y) ρbσ (x; |λσp |)H x − y, t − τ ; λσp + dy dy σ =−,+ & % + ρb−σ (x; |λσp |)H x σp − y, t − τ ; −λσp # & % d σ ∂ 2 ρa (y) ρbσ (x; |λσp |) H x − y, t − τ ; λσp + dy ∂y σ =−,+ % &$ ∂ −σ σ σ σ + ρb (x; |λ p |) H x p − y, t − τ ; −λ p . (3.40) ∂y

Take the j th derivative with respect to x. For the first summation we apply (2.18) and (2.35), and for the second one (3.17), (2.35), (2.17) and (2.20). Now we consider the third summation. Note that by (2.15) d j ρa− (y)/dy j = −d j ρa+ (y)/dy j for j ≥ 1. Therefore, replacing σ by −σ in the terms containing the second heat kernel, we write the third summation as $ & % # d d2 λ p (φ(y)) ρaσ (y) + 2 ρaσ (y) ρbσ (x; |λσp |)H x − y, t − τ ; λσp dy dy σ =−,+ & % −σ −σ − ρbσ (x; |λ−σ . p |)H x p − y, t − τ ; −λ p Applying (2.25), (2.17), (2.20), (3.22), (3.21), (2.14), (2.15) and (3.24) – (3.28), we obtain the last summation in (3.39). The fourth summation in (3.40) can be treated similarly, and gives the third one in (3.39). In the first summation of (3.39) the cancellation is achieved via (3.6). In this regard the following lemma is needed as well. Lemma 3.15. Let j ≥ 0 be any integer. For −∞ < x < ∞ we have λ− λ+p x px dj if j = 0 −1 − e +1−e 2 1 − λp , (3.41) λp = O(1)ε − j e−|λ p ||x|/µ λ+p x λ x if j ≥1 ε dx j p 1+e 1+e where µ > 1 is an arbitrarily fixed constant. Proof. By the definition of ρb∓ in (2.17) we write the left-hand side of (3.41) as * dj ) − − − − + + λ [2ρ (x; λ ) − 1] − λ [1 − 2ρ (x; −λ )] p p p p b b dx j j j − − − + d − + d − + 2ρ ρ = (λ− (x; λ ) − 1 − 2λ (x; λ ) − ρ (x; −λ ) . p + λp) p p p p b b b dx j dx j Applying (2.23) and (2.20) to the first term, and (2.22) and (3.23) to the second one, we obtain the right-hand side of (3.41).


4. Wave Interaction

In this section we assess wave interaction in the pointwise sense, both in space and in time. This kind of estimate is an important component of our analysis, and originally started in [Liu2]. First we discuss the dissipation of diffusion waves, such as heat kernels, Burgers waves, and waves of algebraic types. Introduce the following notations:
\[
\begin{aligned}
\theta^\alpha(x,t;\lambda,\mu) &\equiv (t+1)^{-\frac{\alpha}{2}}\exp\left(-\frac{(x-\lambda(t+1))^2}{4\mu(t+1)}\right),\\
\psi^\alpha(x,t;\lambda) &\equiv \left[(x-\lambda(t+1))^2 + t+1\right]^{-\frac{\alpha}{2}},\\
\bar\psi^\alpha(x,t;\lambda) &\equiv \left[|x-\lambda(t+1)|^3 + (t+1)^2\right]^{-\frac{\alpha}{3}},
\end{aligned}
\tag{4.1}
\]
where $\alpha > 0$, $\lambda$ and $\mu > 0$ are constants. It is easy to verify that
\[
\theta^\alpha(x,t;\lambda,\mu) + \bar\psi^\alpha(x,t;\lambda) = O(1)\psi^\alpha(x,t;\lambda).
\tag{4.2}
\]
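As a quick numerical sanity check of (4.2), the following Python sketch evaluates the ratio $(\theta^\alpha + \bar\psi^\alpha)/\psi^\alpha$ on a finite grid. The parameter values, the grid, and all function names are illustrative assumptions and are not taken from the paper. A bounded ratio on a grid is consistent with (4.2) but does not prove it.

```python
import math

def theta(x, t, lam, mu, alpha):
    """theta^alpha(x, t; lam, mu) as defined in (4.1)."""
    s = t + 1.0
    return s ** (-alpha / 2.0) * math.exp(-(x - lam * s) ** 2 / (4.0 * mu * s))

def psi(x, t, lam, alpha):
    """psi^alpha(x, t; lam) as defined in (4.1)."""
    s = t + 1.0
    return ((x - lam * s) ** 2 + s) ** (-alpha / 2.0)

def psi_bar(x, t, lam, alpha):
    """psi-bar^alpha(x, t; lam) as defined in (4.1)."""
    s = t + 1.0
    return (abs(x - lam * s) ** 3 + s ** 2) ** (-alpha / 3.0)

def max_ratio(alpha, lam, mu, times, offsets):
    """Largest (theta^alpha + psi_bar^alpha) / psi^alpha over the grid.

    Each offset d is a distance from the characteristic line,
    so the sample point is x = lam * (t + 1) + d.
    """
    worst = 0.0
    for t in times:
        for d in offsets:
            x = lam * (t + 1.0) + d
            r = (theta(x, t, lam, mu, alpha) + psi_bar(x, t, lam, alpha)) / psi(x, t, lam, alpha)
            worst = max(worst, r)
    return worst

# Illustrative grid (an assumption of this sketch).
times = [0.0, 1.0, 10.0, 100.0, 1.0e4]
offsets = [k * 0.25 for k in range(-4000, 4001)]
print(max_ratio(1.5, 2.0, 1.0, times, offsets))
```

The ratio stays of order one on this grid. The constant hidden in $O(1)$ depends on $\alpha$ and $\mu$, so other parameter choices give a different bound.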

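Comparing (4.1) with (2.27) and (1.22), we have
\[
\begin{aligned}
H(x,t+1;\lambda,\mu) &= \frac{1}{\sqrt{4\pi\mu}}\,\theta(x,t;\lambda,\mu),\\
\psi_i(x,t) &= \psi(x,t;\lambda_i^0), \quad i \ne p,\\
\bar\psi_i(x,t) &= \bar\psi(x,t;\lambda_i^0), \quad i \ne p.
\end{aligned}
\tag{4.3}
\]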

Since (1.24), (1.25) and (1.10) imply $c_i = O(1)\delta_0$, with the new notation in (4.1), the estimates for heat kernels, Burgers waves, and multiple-mode Burgers waves in (1.20) can be written as
\[
\begin{aligned}
\theta_i(x,t) &= O(1)\delta_0\,\theta(x,t;\lambda_i^0,1),\\
|\theta_{ix}(x,t)| + |\theta_{it}(x,t)| &= O(1)\delta_0\,\theta^2(x,t;\lambda_i^0,\mu),\\
|\theta_{it} + \lambda_i^0\theta_{ix}|(x,t) + |\theta_{ixx}(x,t)| &= O(1)\delta_0\,\theta^3(x,t;\lambda_i^0,\mu),
\end{aligned}
\tag{4.4}
\]

where µ > 1 is an arbitrarily fixed constant. In the proof of Theorem 1.2 we perform a priori estimates via Duhamel’s principle. This requires us to assess wave interaction in the following form: J α,β (x, t; t1 , t2 ; λ, λ , µ) ≡

t2

t1

∞

−∞

(t − τ )−

α−α −1 2

α

(t − τ + 1)− 2 β

3

H (x − y, t − τ ; λ, µ)(τ + 1)− 2 ψ 2 (y, τ ; λ ) dydτ, (4.5) where a ≥ α ≥ 0, α − α < 3, β ≥ 0, µ > 0 if λ = λ , and α ≥ 1, α ≥ 0, 0 ≤ α − α < 3, β ≥ 0, µ > 0 if λ = λ . For t1 = 0 and t2 = t, the pointwise estimates of J α,β are given as Lemmas 2.3 and 2.4 in [LZ2] (also see similar estimates in [LZ1] and [Liu3]). The next two lemmas are corollaries of those two lemmas when applied to the special cases that we will encounter later.


Lemma 4.1. Let µ > 0 and λ be constants. For −∞ < x < ∞ and t ≥ 0, we have 1

J 1,1 (x, t; 0, t; λ, λ, µ) = O(1)ψ 2 (x, t; λ), 9 3 J 1, 4 (x, t; 0, t; λ, λ, µ) = O(1) θ (x, t; λ, µ∗ ) + ψ 2 (x, t; λ) , 1 2

3 2

J 2,0 (x, t; 0, t; λ, λ, µ) = O(1)(t + 1) ψ (x, t; λ), J J

3, 23

2,1

3 2

(x, t; 0, t; λ, λ, µ) = O(1)ψ (x, t; λ), − 34

(x, t; 0, t; λ, λ, µ) = O(1)(t + 1) J

4,β

ψ (x, t; λ) log(t + 2),

(x, t; 0, t; λ, λ, µ) = O(1)(t + 1)

(4.7) (4.8) (4.9)

3 2

− β2

(4.6)

3 2

ψ (x, t; λ),

(4.10) (4.11)

where µ∗ > µ is an arbitrarily fixed constant, and in (4.11), 0 ≤ β ≤ 3/2 is a constant. Lemma 4.2. Let µ > 0 and λ = λ be constants. For −∞ < x < ∞ and t ≥ 0, we have 1

1

(4.12) J 1, 2 (x, t; 0, t; λ, λ , µ) = O(1)ψ 2 (x, t; λ), 5 3 J 1, 2 (x, t; 0, t; λ, λ , µ) = O(1) θ (x, t; λ, µ∗ ) + ψ 2 (x, t; λ) , (4.13) 1 3 3 J 2,0 (x, t; 0, t; λ, λ , µ) = O(1) (t + 1) 2 ψ 2 (x, t; λ) + (t + 1)(|x| + t + 1)− 2 , J

2, 23

(4.14) * ) 3 5 (x, t; 0, t; λ, λ , µ) = O(1) ψ (x, t; λ) + min ψ¯ 2 (x, t; λ ), (t + 1)− 4 ,

3 2

(4.15) J

J

3, 23

3 2

(x, t; 0, t; λ, λ , µ) = O(1)ψ (x, t; λ), (4.16) 3 1 3 (x, t; 0, t; λ, λ , µ) = O(1)(t + 1)− 2 ψ 2 (x, t; λ) + ψ 2 (x, t; λ ) , (4.17) 2,2

7

3

J 3, 4 (x, t; 0, t; λ, λ , µ) = O(1)(|x| + t + 1)− 2 , (4.18) 3 4, 2 −1 23 − 34 23 J (x, t; 0, t; λ, λ , µ) = O(1) (t + 1) ψ (x, t; λ) + (t + 1) ψ (x, t; λ ) , (4.19) where µ∗ > µ is an arbitrarily fixed constant. Lemmas 2.3 and 2.4 in [LZ2] can be modified to evaluate J with different [t1 , t2 ]. As a corollary we have Lemma 4.3. Let β > 0, µ > 0, λ and λ be constants (λ and λ not necessary distinct). For −∞ < x < ∞ and t ≥ 0 we have √ 1 3 J 2, 2 (x, t; 0, t; λ, λ , µ) = O(1)ψ 2 (x, t; λ), √ 5 3 J 1, 2 (x, t; t, t; λ, λ , µ) = O(1)ψ 2 (x, t; λ), t ∞ 1 (t − τ )− 2 H (x − y, t − τ ; λ, µ)(τ + 1)−β dydτ

(4.20) (4.21)

max{0,t−1} −∞

= O(1)(t + 1)−β .

(4.22)


When we apply Duhamel’s principle in the next section, the above lemmas are not sufficient to handle the leading terms in the nonlinear sources, and we need to take cancellation into account. This is done through Lemma 4.4, which is Lemma 3.4 in [LZ1], and Lemma 5.6 in the next section. Lemma 4.4. Let the constants 1 ≤ α < 3, µ > 0, µ > 0, λ = λ , and C¯ ≥ |λ − λ |. Let k ≥ 0 be an integer. If a function h(x, t) satisfies h(x, t) = O(1)θ α (x, t; λ , µ ), ∂k h(x, t) = O(1)θ α+k (x, t; λ , µ ), ∂xk (4.23) h t + λ h x − µh x x = O(1)θ α+2 (x, t; λ , µ ) + O(1)θ α+1 (x, t; λ , µ ) , x

∂k α+k+2 α+k+1 h = O(1)θ + λ h − µh (x, t; λ , µ ) + O(1)θ (x, t; λ , µ ) , t x x x x ∂xk then for −∞ < x < ∞ and t ≥ 0, t ∞ ∂ k+1 H (x − y, t − τ ; λ, µ) k+1 h(y, τ ) dydτ ∂y 0 −∞ α+1 − k2 ψ 2 (x, t; λ) + θ min(α,2) (x, t; λ , µ∗ ) = O(1)(t + 1) α

1

+|x − λ(t + 1)|− 2 |x − λ (t + 1)|− 2 ) * √ √ · char min(λ, λ )(t + 1) + C¯ t + 1 ≤ x ≤ max(λ, λ )(t + 1) − C¯ t + 1 , (4.24) where µ∗ > max(µ, µ ) is arbitrarily chosen, and char is the characteristic function. Next we discuss the dissipation of damping waves. Define t2 ∞ α−α −1 α α,β K (x, t; t1 , t2 ; ε, λ, µ) ≡ (t − τ )− 2 (t − τ + 1)− 2 t1

−∞

β

H (x − y, t − τ ; λ, µ)(τ + 1)− 2 e−ε|y|/µ dydτ, t ∞ α−α −1 α Lα,β (x, t; ε, µ) ≡ (t − τ )− 2 (t − τ + 1)− 2 −∞

0

β

H (x − y, t − τ ; −ε, µ)(τ + 1)− 2 e−ε|y|/µ dydτ, t ∞ α−α −1 α α,β M (x, t; ε, µ) ≡ (t − τ )− 2 (t − τ + 1)− 2 0

(4.26)

−∞

β

H (x − y, t − τ ; ε, µ)(τ + 1)− 2 e−ε|y|/µ dydτ. Introduce notation

(4.25)

⎧ ⎪ ⎨1 β (t) ≡ log(t + 1) ⎪ 2−β ⎩ (t + 1) 2

if β > 2 if β = 2 . if β < 2

(4.27)

(4.28)


For t ≥ 0 it is clear that

t

β

(τ + 1)− 2 dτ = O(1) β (t).


(4.29)

0
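The bound (4.29) can be checked directly from the closed form of the integral. The Python sketch below compares $\int_0^t (\tau+1)^{-\beta/2}\,d\tau$ with the three-case function of (4.28). The symbol naming that function did not survive in this copy, so the sketch calls it `growth_factor`, a hypothetical name. The sample values of $\beta$ and $t$ are likewise assumptions.

```python
import math

def growth_factor(beta, t):
    """Three-case function of (4.28); the name is hypothetical."""
    if beta > 2:
        return 1.0
    if beta == 2:
        return math.log(t + 1.0)
    return (t + 1.0) ** ((2.0 - beta) / 2.0)

def integral(beta, t):
    """Closed form of the integral of (tau + 1)^(-beta/2) over [0, t]."""
    if beta == 2:
        return math.log(t + 1.0)
    p = 1.0 - beta / 2.0
    return ((t + 1.0) ** p - 1.0) / p

# Ratio of the two sides of (4.29) for a few sample values.
for beta in (1.0, 2.0, 3.0):
    ratios = [integral(beta, t) / growth_factor(beta, t)
              for t in (0.5, 10.0, 1.0e3, 1.0e6)]
    print(beta, max(ratios))
```

For $\beta \ne 2$ the ratio is bounded by $2/|2-\beta|$, so the $O(1)$ constant depends on $\beta$ and grows as $\beta$ approaches 2.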

The next two lemmas deal with K, L and M. These are modifications and extensions of Lemmas 6.1 – 6.3 in [Liu3], and the proofs are parallel to those. Recall that our O(1) is uniformly bounded with respect to x, t and small ε > 0. Lemma 4.5. Let α ≥ 1, α ≥ 0, 0 ≤ α − α < 3, β ≥ 0, µ∗ > µ > 0, λ > 0 and C¯ > 2 be constants. Then there exists a constant C > 0 such that the following is true for t ≥ 0 : (i) For x ≤ 0,

1−α Kα,β (x, t; 0, t; ε, λ, µ) = O(1) (t + 1) 2 β (t)e−εt/C β + (t + 1)− 2 min{ α−1 (ε−1 ), α−1 (t)} e−ε|x|/µ .

(4.30)

√ (ii) For 0 < x < λ[(t + 1) − C¯ t + 1],

β 3−α Kα,β (x, t; 0, t; ε, λ, µ) = O(1) (|x| + 1) 2 |λt − x|− 2 (1 + ε|x|)−1 β

+ min{ α−1 (ε−1 ), α−1 (t)}(t + 1)− 2 e−ε|x|/C 1−α

+ β (|λt − x|)(t + 1) 2 e−ε|λt−x|/C $ % &−1 √ √ 2−α β ∗ + (t + 1) 2 1 + ε t + 1 ( t + 1)H (x, t; λ, µ ) .

(4.31)

√ (iii) For |x − λ(t + 1)| ≤ |λ|C¯ t + 1,

1−α Kα,β (x, t; 0, t; ε, λ, µ) = O(1) (t + 1) 2 β (t)e−εt/C √ β 2−α + (t + 1) 2 β ( t + 1)(1 + εt)−1 + (t + 1)− 2 α−1 (t)e−εt/C . (4.32)

√ (iv) For x > λ[(t + 1) + C¯ t + 1],

β 3−α Kα,β (x, t; 0, t; ε, λ, µ) = O(1) (t + 1) 2 |λt − x|− 2 (1 + εt)−1 e−ε|λt−x|/µ + , 1−α + (t + 1) 2 min β (t), β (|λt − x|) e−ε|λt−x|/C % - &−1 √ 2−α + (t + 1) 2 β ( t + 1) 1 + ε |x| H (x, t; λ, µ∗ ) β (4.33) + (t + 1)− 2 α−1 (t)e−ε|λt−x|/C−εt/C .

The estimates for λ < 0 are obtained by Kα,β (x, t; t1 , t2 ; ε, λ, µ) = Kα,β (−x, t; t1 , t2 ; ε, −λ, µ).

(4.34)

Lemma 4.6. Let α ≥ 1, α ≥ 0, 0 ≤ α − α < 3, β ≥ 0, µ∗ > µ > 0 and C¯ > 0 be constants. Then there exists a constant C > 0 such that the following is true for t ≥ 0:


(i) For x ≤ 0, β Mα,β (x, t; ε, µ) = O(1) (t + 1)− 2 min{ α−1 (ε−2 ), α−1 (t)} 1−α 2 ∗ + (t + 1) 2 β (t)e−ε t/C e−ε|x|/µ . (4.35) (ii) For 0 < x ≤ εt, Mα,β (x, t; ε, µ) √ √ 2−α = O(1) (t + 1) 2 min{ β (t), β ( t/ε)}(1 + ε t)−1 H (x, t; ε, µ∗ ) β

+(t + 1)− 2 min{ α−1 (ε−2 ), α−1 (t)}e−ε|x|/µ β

+ (t − x/ε + 1)− 2 x

1−α 2

ε

α−5 2

√ char{1/ε ≤ x ≤ ε(t + 1) − C¯ t + 1} . (4.36)

(iii) For εt < x ≤ 2εt, Mα,β (x, t; ε, µ) # ' √ ( 2−α t t , β , β (t) = O(1) (t + 1) 2 min β ε ε(|x − εt| + 1) $ √ −1 β ×(1 + ε t) H (x, t; ε, µ) + (t + 1)− 2 min{ α−1 (ε−2 ), α−1 (t)}e−ε|x|/C .

(4.37) (iv) For 2εt < x ≤ 3εt, Mα,β (x, t; ε, µ) ( # ' √ 2−α 3εt −x t , β (ε−2 ), β = O(1) (t + 1) 2 min β ε ε −1 ) * 1−α 3εt − x × 1+ √ H (x, t; ε, µ) + (t +1) 2 min β (ε−2 ), β (t) e−ε|x|/C t $ β ∗ + (t + 1)− 2 min{ α−1 (ε−2 ), α−1 (t)}e−ε|x|/µ . (4.38)

(v) For x > 3εt, ) * 1−α Mα,β (x, t; ε, µ) = O(1) t + 1) 2 min β (t), β (ε−2 ) e−ε|x−2εt|/µ β ∗ (4.39) + (t + 1)− 2 min{ α−1 (t), α−1 (ε−2 )}e−ε|x|/µ . The estimates on Lα,β can be obtained by Lα,β (x, t; ε, µ) = Mα,β (−x, t; ε, µ).

(4.40)

The next lemma is a direct consequence of Lemma 4.6, formulated for our convenience.


Lemma 4.7. Let µ > 0 and γ2 > γ1 > 0 be constants and λ be a parameter satisfying γ1 ε ≤ λ ≤ γ2 ε. Let µ∗ > µ be arbitrarily fixed and ψ p be defined in (1.22). For −∞ < x < ∞ and t ≥ 0, we have 3

1

ε 2 M1,1 (x, t; λ, µ) + ε 2 M2,1 (x, t; λ, µ) 1 ∗ ψ p2 (x, t)e−λ|x|/µ if x ≤ 0 = O(1) . 1 2 if x > 0 ψ (x, t; λ) For −∞ < x < ∞ and t ≥ ε−2 , we have M

3,1

1 2

(x, t; λ, µ) = O(1)ε log ε

−1

1

(4.41)

∗

ψ p2 (x, t)e−λ|x|/µ 1 ψ 2 (x, t; λ)

if x ≤ 0 , if x > 0

⎧ 1 ⎪ ⎨ψ p2 (x, t)e−λ|x|/µ∗ 1 # 1 $ 4,1 M (x, t; λ, µ) = O(1)ε 2 1 2 ⎪ 2 ψ (x, t) + εψ (x, t; λ) ⎩ p

if x ≤ 0 if x > 0

(4.42)

. (4.43)

For |x| ≤ λt/2 and t ≥ ε−5/2 , we have 1

3

εM1,3 (x, t; λ, µ) + M2,3 (x, t; λ, µ) = O(1)ε 2 ψ p2 (x, t),

(4.44)

3 2

3

M3,3 (x, t; λ, µ) = O(1)ε 2 log ε−1 ψ p (x, t), − 23

M4,3 (x, t; λ, µ) = O(1)(t + 1)

(4.45)

.

(4.46)

In the last case of Lemma 4.7, we note that $t \ge \varepsilon^{-5/2}$ implies $e^{-\varepsilon^2 t/C} = O(1)e^{-t^{1/5}/C}$. The next lemma is an application of Lemma 4.5.

Lemma 4.8. Let µ > 0, λ = 0 and C > 2 be constants. For −∞ < x < ∞ and t ≥ 0, we have 1

1

εK1,1 (x, t; 0, t; ε, λ, µ) + ε 2 K2,1 (x, t; 0, t; ε, λ, µ) = O(1)ψ 2 (x, t; λ). (4.47) For t ≥ 0, and x ≥ (1 − 1/C)λ(t + 1) if λ > 0, or x ≤ (1 − 1/C)λ(t + 1) if λ < 0, we have εK2,1 (x, t; 0, t; ε, λ, µ) 1 1 (t + 1)− 2 ψ 2 (x, t; λ) = O(1) 1 1 1 3 min{(t + 1)− 2 ψ 2 (x, t; λ), ε− 2 ψ 2 (x, t; λ)}

if if

x λ x λ

≤t +1 . (4.48) ≥t +1

3

ε 2 K3,1 (x, t; 0, t; ε, λ, µ) 1 1 3 ε 2 (t + 1)−1 ψ 2 (x, t; λ) + (t + 1)− 2 = O(1) 1 3 3 (t + 1)− 2 ψ 2 (x, t; λ) + (|x| + t + 1)− 2

if if

x λ x λ

≤t +1 ≥t +1

.

(4.49)

Lemma 4.9. For constants j = 0, 1, 0 < t1 < t2 , λ = 0 and µ > 0, we define K˜ j+1 (x, t; t1 , t2 ; ε, λ, µ) ≡ t2 ∞ 3 j (t − τ )− 2 H (x − y, t − τ ; λ, µ)ε2 e−|λ p ||y|/µ ψ p2 (y, τ ) dydτ. (4.50) t1

−∞


Assuming λ > 0, for −∞ < x < ∞ and t ≥ 0, we have K˜ 1 (x, t; 0, t; ε, λ, µ) ⎧ 3 1 ⎪ ⎪ εe−|λ p ||x|/µ min{ψ p2 (x, t), εψ p2 (x, t)} ⎪ ⎪ ⎨ 1 3 ∗ 4 2 = O(1) ε θ (x, t; 1λ, µ3 ) + ψ (x, t;3 λ) ⎪ ⎪ +min{ε− 2 ψ 2 (x, t; λ), εψ 4 (x, t; λ)} ⎪ ⎪ ⎩ char{0 < x < λ(t + 1)} ˜2

if x ≤ 0 ,

(4.51)

if x > 0

3 2

(4.52) K (x, t; 0, t; ε, λ, µ) = O(1)ψ (x, t; λ), √ 1 K˜ (x, t; t, t; ε, λ, µ) ⎧ 3 1 ⎪ 2 2 ⎪ ⎨εe−|λ p ||x|/µ min{ψ p (x, t), εψ p (x, t)} if x ≤ 0 = O(1) min{ε− 21 ψ 23 (x, t; λ), εψ 34 (x, t; λ), , (4.53) ⎪ ⎪ ⎩ ε2 (t + 1) 54 ψ 23 (x, t; λ)} if x > 0 √ 2 K˜ (x, t; t, t; ε, λ, µ) 3 1 1 1 3 = O(1) (|x| + t + 1)− 2 +min{1, ε− 2 ((|x|+1)− 2 +(t +1)− 2 )}ψ 2 (x, t; λ) , (4.54) µ∗

where > µ is an arbitrarily fixed constant. If λ < 0, we interchange the expressions for x ≶ 0 in (4.51) and (4.53), while (4.52) and (4.54) stay the same. Proof. To prove (4.51) we need the following estimate for x ≤ 0: t ∞ β α−α −1 α (t −τ )− 2 (t −τ +1)− 2 H (x − y, t −τ ; λ, µ)e−|λ p ||y|/µ (|y|+1)− 2 dydτ 0 −∞ β 3−α = O(1) (t + 1) 2 e−εt/C + min{ α−1 (ε−1 ), α−1 (t)} (|x| + 1)− 2 e−|λ p ||x|/µ , (4.55) α

α

where α ≥ 1, ≥ 0, 0 ≤ α − < 3 and β ≥ 0 are constants. Equation (4.55) is obtained by modifying the proof of Lemma 4.5. Note that * ) 3 3 1 3 3 ε2 ψ p2 (y, τ ) = O(1) min ε2 (|y| + 1)− 2 , ε 2 (τ + 1)− 2 , ε2 (τ + 1)− 4 . (4.56) If we use the first expression in (4.56) and apply (4.55) with α = 1, α = 0, β = 3, 3 then K˜ 1 (x, t; 0, t; ε, λ, µ) = O(1)εe−|λ p ||x|/µ (|x| + 1)− 2 for x ≤ 0. Similarly, if we 3 apply (4.30) and use the third expression in (4.56), we obtain O(1)εe−|λ p ||x|/µ (t + 1)− 4 . But if we use the second expression in (4.56) for the second term in (4.30) (only), 1 3 then we have O(1)ε− 2 e−|λ p ||x|/µ (t + 1)− 2 . The minimum of these three gives us 3 K˜ 1 (x, t; 0, t; ε, λ, µ) = O(1)εe−|λ p ||x|/µ ψ p2 (x, t) for x ≤ 0, which is O(1)ε2 e−|λ p ||x|/µ 1

ψ p2 (x, t) if |x| ≥ 1/ε or t + 1 ≥ ε−2 . Similar to (4.55), we also have t ∞ 3 H (x − y, t − τ ; λ, µ)e−|λ p ||y|/µ (y 2 + τ + 1)− 4 dydτ 0

−∞

1

= O(1)(x 2 + t + 1)− 4 e−|λ p ||x|/µ

(4.57)


1

for x ≤ 0. This gives O(1)ε2 e−|λ p ||x|/µ ψ p2 (x, t) for K˜ 1 (x, t; 0, t; ε, λ, µ) if x ≤ 0 and t + 1 ≤ ε−2 . Therefore, (4.51) is proved for x ≤ 0. The case x ≥ 0 is similar, using (4.31) – (4.33) with the appropriate expression from (4.56) for each term. The proof of (4.52) – (4.54) is similar, with the following remark: If the integration √ √ 1 β in τ is on [ t, t] and β > 2, β ( t + 1) in (4.31) – (4.33) is replaced by (t + 1) 2 − 4 . The following lemma is proved in a similar way. Lemma 4.10. Let µ > 0, λ = 0 and λ = 0 be constants (λ and λ not necessary distinct). For −∞ < x < ∞ and t ≥ 0, we have # $ √t ∞ 1 1 1 (t −τ )− 2 H (x − y, t −τ ; λ, µ) ψ 2 (y, τ ; λ )+ψ p2 (y, τ ) e−|λ p ||y|/µ dydτ 0 −∞ 1 e−|λ p ||x|/µ (|x| + t + 1)− 2 if xλ ≤ 0 = O(1) − 3 3 . (4.58) if xλ > 0 ε 2 ψ 2 (x, t; λ) Also, for λ being any constant, ∞ t 1 (t − τ )− 2 H (x − y, t − τ ; λ, µ)e−|λ p ||y|/µ dydτ = O(1)e−|λ p ||x|/µ . max{0,t−1} −∞

(4.59) The following lemma is proved in a similar way as Lemma 5.1 in the next section. Lemma 4.11. Let λ = 0 and µ > 0 be constants. For −∞ < x < ∞ and t > 1, we have # $ ∞ 1 √ √ √ 3 3 H (x − y, t − t; λ, µ) (|y| + t + 1)− 2 + ε 2 e−|λ p ||y|/µ ψ p2 (y, t) dy −∞

3

= O(1)ψ 2 (x, t; λ).

(4.60)

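Lemma 4.12. Let $\alpha > 0$, $\beta \ge 0$, $\mu > 0$, $\lambda > 0$ and $\lambda'$ be constants ($\lambda$ and $\lambda'$ not necessarily distinct). For $-\infty < x < \infty$ and $t \ge 0$, we have
\[
\int_0^\infty e^{-\lambda\eta}\,\psi^\alpha(\eta+x,t;\lambda')\,(|\eta+x|+1)^{-\beta}\,d\eta = O(1)\,\psi^\alpha(x,t;\lambda')\,(|x|+1)^{-\beta},
\tag{4.61}
\]
\[
\int_0^\infty e^{-\lambda\eta}\,\bar\psi^\alpha(\eta+x,t;\lambda')\,d\eta = O(1)\,\bar\psi^\alpha(x,t;\lambda'),
\tag{4.62}
\]
\[
\int_0^\infty e^{-\lambda\eta}\,(|\eta+x|+t+1)^{-\alpha}\,d\eta = O(1)\,(|x|+t+1)^{-\alpha},
\tag{4.63}
\]
\[
\int_0^\infty e^{-\lambda\eta}\,e^{-|\lambda_p||\eta+x|/\mu}\,\psi_p^\alpha(\eta+x,t)\,d\eta = O(1)\,e^{-|\lambda_p||x|/\mu}\,\psi_p^\alpha(x,t),
\tag{4.64}
\]
\[
\int_0^\infty e^{-\lambda\eta}\,\min\{\varepsilon^{-\frac12},\,\varepsilon^2(t+1)^{\frac54}\}\,\psi^{\frac32}(\eta+x,t;\lambda)\,\mathrm{char}\{\eta+x\ge0\}\,d\eta
= O(1)\,\varepsilon^2 e^{-|\lambda_p||x|}\,\psi_p^{\frac12}(x,t) \quad\text{if } x \le 0.
\tag{4.65}
\]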
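A simple numerical illustration of (4.63): the Python sketch below approximates the left-hand integral by a midpoint rule on a truncated interval and compares it with the right-hand side. The values $\lambda = 1$, $\alpha = 3/2$, the truncation point, the step count, and the sample points are assumptions made for this sketch only.

```python
import math

def lhs(x, t, lam, alpha, eta_max=40.0, n=20000):
    """Midpoint-rule value of the integral in (4.63), truncated at eta_max.

    The factor exp(-lam * eta) makes the discarded tail negligible
    for the parameter values used below.
    """
    h = eta_max / n
    total = 0.0
    for k in range(n):
        eta = (k + 0.5) * h
        total += math.exp(-lam * eta) * (abs(eta + x) + t + 1.0) ** (-alpha)
    return total * h

def rhs(x, t, alpha):
    """Right-hand side of (4.63) without the O(1) constant."""
    return (abs(x) + t + 1.0) ** (-alpha)

lam, alpha = 1.0, 1.5
ratios = [lhs(x, t, lam, alpha) / rhs(x, t, alpha)
          for x in (-50.0, -5.0, -1.0, 0.0, 1.0, 5.0, 50.0)
          for t in (0.0, 1.0, 10.0, 100.0)]
print(max(ratios))
```

Since $|x| \le |\eta+x| + \eta$ and $t+1 \ge 1$, the integrand ratio is at most $e^{-\lambda\eta}(1+\eta)^\alpha$, which explains why the computed ratios stay bounded uniformly in $x$ and $t$.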


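Proof. Equation (4.61) is obtained by considering $\eta \lessgtr \frac12|x - \lambda'(t+1)|$ and $\eta \lessgtr \frac12|x|$. Equations (4.62) and (4.63) are proved similarly. For (4.64) we consider $x < 0$ first. The left-hand side is
\[
\begin{aligned}
&\int_0^{-x} e^{-\lambda\eta + |\lambda_p|(\eta+x)/\mu}\,\psi_p^\alpha(\eta+x,t)\,d\eta + \int_{-x}^\infty e^{-\lambda\eta - |\lambda_p|(\eta+x)/\mu}\,\psi_p^\alpha(\eta+x,t)\,d\eta\\
&\quad= e^{|\lambda_p|x/\mu}\int_0^{-x} e^{-(\lambda-|\lambda_p|/\mu)\eta}\,\psi_p^\alpha(\eta+x,t)\,d\eta
+ e^{-|\lambda_p|x/\mu}\int_{-x}^\infty e^{-(\lambda+|\lambda_p|/\mu)\eta}\,\psi_p^\alpha(\eta+x,t)\,d\eta\\
&\quad= O(1)\left\{e^{-|\lambda_p||x|/\mu}\,\psi_p^\alpha(x,t) + e^{\lambda x}\left[(\varepsilon(t+1))^2 + t+1\right]^{-\frac{\alpha}{2}}\right\}
= O(1)\,e^{-|\lambda_p||x|/\mu}\,\psi_p^\alpha(x,t),
\end{aligned}
\]
where we have considered $\eta \lessgtr |x|/2$ for the first integral. The case $x > 0$ is easier. Similarly, by change of variables, for $x \le 0$ the left-hand side of (4.65) is
\[
e^{\lambda x}\int_0^\infty e^{-\lambda\eta}\,\psi^{\frac32}(\eta,t;\lambda)\,d\eta\;\min\{\varepsilon^{-\frac12},\,\varepsilon^2(t+1)^{\frac54}\}.
\]
Applying (4.61) and considering $t+1 \lessgtr \varepsilon^{-2}$, we write it as
\[
\begin{aligned}
O(1)\,e^{\lambda x}\,\psi^{\frac32}(0,t;\lambda)\,\min\{\varepsilon^{-\frac12},\,\varepsilon^2(t+1)^{\frac54}\}
&= O(1)\,e^{\lambda x}\min\{\varepsilon^{-\frac12}(t+1)^{-\frac32},\,\varepsilon^2(t+1)^{-\frac14}\}\\
&= O(1)\,e^{-|\lambda_p||x|}\,\varepsilon^2\,\psi_p^{\frac12}(x,t).
\end{aligned}
\]

The following lemmas handle the influence of partition functions on the characteristic lines, and can be verified by direct calculation.

Lemma 4.13. Let $\alpha > 0$ and $\mu^* > \mu > 0$ be constants and $i \ne p$. For $-\infty < x < \infty$ and $t \ge 0$, we have
\[
\sum_{\sigma=-,+}\left[\rho_a^\sigma(x)\,\psi^\alpha(x,t;\lambda_i^\sigma) + \rho_a^{-\sigma}(x)\,\psi^\alpha(x_i^\sigma,t;\lambda_i^\sigma)\right] = O(1)\,\psi_i^\alpha(x,t),
\tag{4.66}
\]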

ρaσ (x)θ (x, t; λiσ , µ) + ρa−σ (x)θ (xiσ , t; λiσ , µ) = O(1)θ (x, t; λi0 , µ∗ ),

σ =−,+

( 1 ψiα (x, t) = O(1)ψ α (x, t; λi∓ ). char x ≶ ± ε '

If we define the following quantities for i, j = p: λ− i if x > 1ε and i, j > p + − − νi j = νi j (x) ≡ λi , 1 otherwise λ+ 1 i − if x < − ε and i, j < p + + νi j = νi j (x) ≡ λi , 1 otherwise

(4.67) (4.68)

(4.69a)

(4.69b)


then we also have for i, j = p, −∞ < x, y < ∞ and t, τ ≥ 0, ( ' 1 ψ αj (y, τ ) = O(1)ψ α (y, τ ; νi∓j (x)λ0j ), char y ≶ ± ε ( ' 1 θ (y, τ ; λ0j , µ) = O(1)θ (y, τ ; νi∓j (x)λ0j , µ∗ ), char y ≶ ± ε ρa−σ (x)ψ α (xiσ , t; νiσj λ0j ) = O(1)ψ αj (x, t),

(4.70) (4.71) (4.72)

σ =−,+

ρa−σ (x)ψ¯ α (xiσ , t; νiσj λ0j ) = O(1)ψ¯ αj (x, t).

(4.73)

σ =−,+

Lemma 4.14. Let α > 0, |λ| ≥ γ ε for some constant γ > 0, µ∗ > µ > 1, and j ≥ 1 be constants. For −∞ < x < ∞, x = ∓1/ε, and t ≥ 0, we have σ σ ρbσ (x; |λ|)ψ α (x, t; λσp ) + ρbσ (x; |λσp |)eλ p (x−x p ) ψ α (x σp , t; λσp ) σ =−,+

(4.74) = O(1)ψ pα (x, t), σ σ ρbσ (x; |λ p |/µ)θ (x, t; λσp , µ∗ ) + ρbσ (x; |λσp |)eλ p (x−x p ) θ (x σp , t; λσp , µ)

σ =−,+

= O(1)θ (|x|, t; −|λ p |, µ∗ ),

(4.75)

ψ p (x ∓ p , t)

= O(1)ψ p (x, t), (4.76) σ σ σ ρbσ (x; |λσp |)eλ p (x−x p ) + e−|λ p ||x|/µ ψ α (x σp , t; λ0k ) = O(1)ψkα (x, t), k = p,

σ =−,

(4.77) λσp (x−x σp ) −|λσp ||x|/µ ¯ α σ σ σ 0 α ρb (x; |λ p |)e ψ (x p , t; λk ) = O(1)ψ¯ k (x, t), k = p, +e σ =−,

(4.78) % &j % &j ∓ ∓ ∓ ∓ ∓ dj ∓ λ p (x−x p ) ρ (x; |λ∓ = O(1) λ∓ e−|λ p ||x|/µ = O(1) λ∓ e−|λ p ||x p |/µ , p |)e p p dx j b (4.79) ' d j σ σ α σ d x j ρb (x; |λ p |) ψ (x, t; λ p )

σ =−,+

j ( d λσp (x−x σp ) α σ σ σ σ ψ (x , t; λ ) + j ρb (x; |λ p |)e p p dx

= O(1)ε j e−|λ p ||x|/µ ψ pα (x, t), (4.80) dj λσp (x−x σp ) σ σ σ σ j ∗ θ (x p , t; λ p , µ) = O(1)ε θ (|x|, t; −|λ p |, µ ). d x j ρb (x; |λ p |)e

σ =−,+

(4.81)


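5. Stability Analysis

In this section we prove Theorem 1.2. From (2.11), for each $i$ we have
\[
w_{i\tau} + \lambda_i(\phi(y))\,w_{iy} = w_{iyy} + g_i(y,\tau),
\tag{5.1}
\]
where $g_i$ is the coupling of different characteristic fields and is given by (2.12). Multiply (5.1) by $G_i(y,\tau;x,t)$ and integrate the equation over $(-\infty,\infty)\times[0,t]$. With (2.30) we obtain
\[
\begin{aligned}
w_i(x,t) ={}& \int_{-\infty}^{\infty} G_i(y,0;x,t)\,w_i(y,0)\,dy + \int_0^t\!\!\int_{-\infty}^{\infty} G_i(y,\tau;x,t)\,g_i(y,\tau)\,dy\,d\tau\\
&+ \int_0^t\!\!\int_{-\infty}^{\infty} T_i(y,\tau;x,t)\,w_i(y,\tau)\,dy\,d\tau
\end{aligned}
\tag{5.2}
\]
for all $i$, where $T_i$ is the truncation error for the approximate fundamental solution $G_i$ and is defined in (3.33). By the definition of $v_i$ in (2.3), $v_i(x,t) \equiv w_{ix}(x,t)$, for $0 \le j \le 2$, we have
\[
\begin{aligned}
\frac{\partial^j}{\partial x^j}v_i(x,t) ={}& \int_{-\infty}^{\infty} \frac{\partial^{j+1}}{\partial x^{j+1}}G_i(y,0;x,t)\,w_i(y,0)\,dy
+ \int_0^t\!\!\int_{-\infty}^{\infty} \frac{\partial^{j+1}}{\partial x^{j+1}}G_i(y,\tau;x,t)\,g_i(y,\tau)\,dy\,d\tau\\
&+ \int_0^t\!\!\int_{-\infty}^{\infty} \frac{\partial^{j+1}}{\partial x^{j+1}}T_i(y,\tau;x,t)\,w_i(y,\tau)\,dy\,d\tau.
\end{aligned}
\tag{5.3}
\]
To complete our a priori analysis for $v_i$ we need a new quantity:
\[
z_i(x,t) \equiv w_{it}(x,t).
\tag{5.4}
\]
From (5.1) $z_i$ satisfies
\[
z_{i\tau} + \lambda_i(\phi(y))\,z_{iy} = z_{iyy} + g_{i\tau}(y,\tau).
\tag{5.5}
\]
Thus similar to (5.2) and (5.3), for $j = 0, 1$,
\[
\begin{aligned}
\frac{\partial^j}{\partial x^j}z_i(x,t) ={}& \int_{-\infty}^{\infty} \frac{\partial^{j}}{\partial x^{j}}G_i(y,0;x,t)\,z_i(y,0)\,dy
+ \int_0^t\!\!\int_{-\infty}^{\infty} \frac{\partial^{j}}{\partial x^{j}}G_i(y,\tau;x,t)\,g_{i\tau}(y,\tau)\,dy\,d\tau\\
&+ \int_0^t\!\!\int_{-\infty}^{\infty} \frac{\partial^{j}}{\partial x^{j}}T_i(y,\tau;x,t)\,z_i(y,\tau)\,dy\,d\tau.
\end{aligned}
\tag{5.6}
\]

Together with (1.25), (2.5), (2.6), (4.4), (2.1), (2.9), (2.10), (3.8), (1.23), (2.22), (5.1) and (2.12), the hypothesis (1.24) implies
\[
\begin{aligned}
|w_i(x,0)| &= O(1)\delta_0\,(x^2+1)^{-\frac14},\\
|v_i(x,0)| + |v_{ix}(x,0)| + |z_i(x,0)| &= O(1)\delta_0\,(x^2+1)^{-\frac34},\\
|v_{ixx}(x,0)| + |z_{ix}(x,0)| &= O(1)\delta_0
\end{aligned}
\tag{5.7}
\]
for all $i$.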


With the notations in (1.22) we define the following asymptotic ansatz: 3

i (x, t) ≡ ψi2 (x, t) +

3

1

ψ¯ k2 (x, t) + ε2 e−|λ p ||x|/µ ψ p2 (x, t) + χi (x, t), i = p,

k=i, p 3 2

p (x, t) ≡ ψ p (x, t) +

3

1

ψ¯ k2 (x, t) + εe−|λ p ||x|/µ ψ p2 (x, t),

k= p 3 2

ˆ i (x, t) ≡ ψ (x, t) + i

3

ψ¯ k2 (x, t)

k=i, p

' ( 3 1 + min εe−|λ p ||x|/µ ψ p2 (x, t), ε2 e−|λ p ||x|/µ ψ p2 (x, t) + χi (x, t), i = p, 3

ˆ p (x, t) ≡ ψ p2 (x, t) +

3

ψ¯ k2 (x, t),

(5.8)

k= p

⎡ (1) i (x, t) ≡ (t + 1)− 2 ⎣ 1

⎤ 3

ψk2 (x, t) + χi (x, t)⎦

k= p 1

3

+ (|x| + t + 1)− 2 + ε2 e−|λ p ||x|/µ ψ p2 (x, t), i = p, 3 1 − 21 ψk2 (x, t) + ε2 e−|λ p ||x|/µ ψ p2 (x, t), (1) p (x, t) ≡ (t + 1) k (2)

i (x, t) ≡ (t + 1)

− 54

1

+ ε2 e−|λ p ||x|/µ ψ p2 (x, t), i = p,

5

1

5

−4 + ε 2 e−|λ p ||x|/µ ψ p2 (x, t), (2) p (x, t) ≡ (t + 1) + where |λ p | = min{λ− p , −λ p } and µ > 1 is an arbitrarily fixed constant. Set

# 1 ˆ i )(·, τ ) L ∞ M(t) ≡ sup max (wi /ψi2 )(·, τ ) L ∞ + (vi /i )(·, τ ) L ∞ + (z i / 0≤τ ≤t

i

$ 5 (1) (2) + (vi x /i )(·, τ ) L ∞ + (vi x x /i )(·, τ ) L ∞ + z i x (·, τ )(t + 1) 4 L ∞ . (5.9) This implies that for −∞ < x < ∞, t ≥ 0 and all i, 1

|wi (x, t)| ≤ M(t)ψi2 (x, t), |vi x (x, t)| ≤

(1) M(t)i (x, t),

ˆ i (x, t), |z i (x, t)| ≤ M(t)

|vi (x, t)| ≤ M(t)i (x, t), (2)

|vi x x (x, t)| ≤ M(t)i (x, t), |z i x (x, t)| ≤ M(t)(t + 1)

− 54

(5.10)

.

The goal of this section is to prove M(t) = O(1)δ0 for sufficiently small M(t) and ε. Once this is done, (5.10), (5.8), (2.10) and (3.8) give (1.26) – (1.28), hence Theorem 1.2 is proved.


First we express gi and its derivatives in terms of elementary waves. From (2.12), (2.5), (2.6) and (2.10), and by Taylor expansion, we have for all i, gi (y, τ ) = −

Cik (θk , θk ) +

k=i, p

Dikk (θk , vk ) +

k= p k

+li (φ) −λi (φ)

E ikk (vk , vk )

k,k

rk (φ) y wk + 2

k

rk (φ) y vk +

k

+O(1) |θ |3 + |v|3 + |φ − u 0k ||θk | + k= p

rk (φ) yy wk

k

|θk ||θk |

k, k

= p k = k ! " +|φ | |wk | |θ | + |φ | |wk | + |vk | , k

k

(5.11)

k

where Cik (θk , θk ) is defined in (2.7), and Dikk ( p, q) = −li (u 0k ) f (u 0k )(rk0 p, rk (u 0k )q), 1 E ikk ( p, q) = − li (φ) f (φ)(rk (φ) p, rk (φ)q). 2

(5.12)

Therefore, by (5.10), (5.8), (1.22), (3.8), (3.7) and (1.13), for all i we have gi (y, τ ) = −

Cik (θk , θk ) +

k=i, p

Dikk (θk , vk ) − λi (φ)li (φ)

k= p

⎡

+ O(1) δ0 + M(τ )2 + ε M(τ ) ⎣(τ + 1)− 4 1 2

3

k= p

+ ε2 e−|λ p ||y|/µ˜ (τ + 1)

− 21

rk (φ) y wk

k 3

ψk2 (y, τ ) + ψ p3 (y, τ ) ⎤

char{τ ≥ ε−2 }⎦ ,

(5.13)

where µ˜ is a constant satisfying 1 < µ˜ < µ, and we have used the fact that for τ + 1 ≤ ε−2 , εe−|λ p ||y|/C = O(1)ψ p (y, τ ).

(5.14)

Similarly, ⎤ ⎡ ∂ ⎣ Cik (θk , θk ) + Dikk (θk , vk )⎦ gi y (y, τ ) = − ∂y k=i, p

k= p k

(5.15)


3 1 5 +O(1) δ0 + M(τ )2 + ε 2 M(τ ) (τ + 1)− 4 ψk2 (y, τ ) k

# 1 $ 3 2 −|λ p ||y|/µ˜ 2 2 εψ p (y, τ ) + ψ p (y, τ ) , i = p, +O(1)M(τ )ε e ⎤ ⎡ ∂ ⎣ g py (y, τ ) = C pk (θk , θk ) + D pkk (θk , vk )⎦ − ∂y k= p k= p k 3 1 5 +O(1) δ0 + M(τ )2 + ε 2 M(τ ) (τ + 1)− 4 ψk2 (y, τ )

1 2

+O(1) ε M(τ ) + M(τ )

2

k 7 2

ε e

−|λ p ||y|/µ˜

1 2

ψ p (y, τ ),

(5.16)

⎤ ⎡ ∂ ⎣ Cik (θk , θk ) + Dikk (θk , vk )⎦ giτ (y, τ ) = − ∂τ k=i, p

k= p k

−λi (φ)li (φ)r p (φ) y z p

⎤ ⎡ ∂ ⎣ +2li (φ) rk (φ) y z ky + Dikk (vk , z k )⎦ ∂y k k= p k 3 1 5 +O(1) δ0 + M(τ )2 + ε 2 M(τ ) (τ + 1)− 4 ψk2 (y, τ )

k

3 5 +O(1) M(τ ) + εM(τ ) ε2 e−|λ p ||y|/µ˜ + ε 2 e−|λ p ||y|/µ ψ p2 (y, τ )

2

(5.17a)

⎡

=

∂ ⎣ 0 λk Cik (θk , θk ) + Dikk (θk , z k ) ∂y k=i, p k= p k ⎤ + Dikk (vk , z k )⎦ k= p k

−λi (φ)li (φ)r p (φ) y z p + 2li (φ)

rk (φ) y z ky

k

3 1 5 +O(1) δ0 + M(τ )2 + ε 2 M(τ ) (τ + 1)− 4 ψk2 (y, τ ) k

3 5 +O(1) M(τ ) + εM(τ ) ε2 e−|λ p ||y|/µ˜ + ε 2 e−|λ p ||y|/µ ψ p2 (y, τ )

2

(5.17b)

⎡

3 1 3 = O(1) δ0 + M(τ )2 + ε 2 M(τ ) ⎣(τ + 1)− 4 ψk2 (y, τ ) k=i, p 3 2

+(τ + 1)−1 ψi (y, τ )


$ 3 3 5 5 + (τ + 1)− 4 ψ p2 (y, τ ) + ε 2 e−|λ p ||y|/µ ψ p2 (y, τ ) # 3 $ 5 +O(1)M(τ )ε2 e−|λ p ||y|/µ˜ ψ p2 (y, τ ) + (τ + 1)− 4 ,

i = p, (5.17c)

⎡ ∂ ⎣ 0 g pτ (y, τ ) = λk C pk (θk , θk ) + D pkk (θk , z k ) ∂y k= p k= p k ⎤ + D pkk (vk , z k )⎦ + 2l p (φ) rk (φ) y z ky k= p k

7 4

k 5

+O(1) δ0 + M(τ )2 + ε M(τ ) (τ + 1)− 4

1 2

3

ψk2 (y, τ )

k

3

(τ + 1)− 2 , +O(1) M(τ ) + ε M(τ ) εe 5 gi yy (y, τ ) = O(1) δ0 + M(τ )2 + ε2 M(τ ) (τ + 1)− 4 2

−|λ p ||y|/µ

(5.18)

+O(1)M(τ )2 ε3 e−|λ p ||y|/µ˜ ψ p (y, τ ) 1

+O(1)M(τ )ε4 e−|λ p ||y|/µ˜ ψ p2 (y, τ ),

all i.

(5.19)

Here when deriving (5.17b) from (5.17a) we have used (2.8), (5.1) and (5.13) as well. Lemma 5.1. Under the assumptions of Theorem 1.2, for −∞ < x < ∞, x = ∓1/ε and t ≥ 0, we have ∞ j ∂ G (y, 0; x, t)wi (y, 0) dy j i ∂ −∞ x ⎧ 1 2 ⎪ ⎪ if j = 0 ⎨ψi (x, t) 3 j−1 = O(1)δ0 (t + 1)− 2 ψ 2 (x, t) if j = 1, 2 , i = p, (5.20) ⎪ i ⎪ ⎩ − 74 if j = 3 (t + 1) ∞ j ∂ G (y, 0; x, t)w p (y, 0) dy j p ∂ −∞ x ⎧ 1 ⎪ 2 ⎪ ψ ⎪ ⎪ # p (x, t) $ if j = 0 ⎪ 3 1 ⎪ j−1 ⎨ (t + 1)− 2 ψ p2 (x, t) + ε j e−|λ p ||x|/µ ψ p2 (x, t) if j = 1, 2 , (5.21) = O(1)δ0 ⎪ ⎪ # $ ⎪ 1 ⎪ 7 ⎪ ⎪ if j = 3 ⎩ (t + 1)− 4 + ε3 e−|λ p ||x|/µ ψ p2 (x, t)

∞

∂j G (y, 0; x, t)z i (y, 0) dy j i −∞ ∂ x ⎧# $ 3 ⎪ ⎨ θ (x, t; λ0 , µ) + ψ 2 (x, t) i i = O(1)δ0 1 ⎪ ⎩(t + 1)− 2 θ (x, t; λ0 , µ) + (t + 1)− 34 i

if j = 0 if j = 1

, i = p,

(5.22)


∞

∂j G (y, 0; x, t)z p (y, 0) dy j p −∞ ∂ x $ ⎧# 3 ⎪ ⎨ θ (|x|, t; −|λ p |, µ) + ψ p2 (x, t) = O(1)δ0 ⎪ ⎩ θ 2 (|x|, t; −|λ |, µ) + (t + 1)− 54 p

if j = 0

.

(5.23)

if j = 1

Here µ is taken as the same one as in (5.8), and θ is defined by (4.1). Proof. We prove (5.20) with j = 1, 2. From (3.1), the left-hand side is ∞ ∂j j σ (−1) ρa (x) ρaσ (y) j H (x − y, t; λiσ )wi (y, 0) dy ∂y −∞ σ =−,+ (5.24) σ j ∞ d xi ∂j j −σ σ σ σ + (−1) ρa (x) ρa (y) j H (xi − y, t; λi )wi (y, 0) dy. dx ∂y −∞ σ =−,+ We estimate the integral in the first term of (5.24), denoted as I, as follows. For |x −λiσ t| ≤ (t + 1)1/2 , from (2.27), (5.7) and (4.1), we have # j+1 1 t − 2 (y 2 + 1)− 4 dy I = O(1)δ0 √ |y|< t+1 $ − 2j σ − 14 + t H (x − y, t; λ , µ)(t + 1) dy √ i |y|≥ t+1

j

1

= O(1)δ0 (t + 1)− 2 − 4 = O(1)δ0 (t + 1)−

j−1 2

3

ψ 2 (x, t; λiσ ),

(5.25)

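where we have assumed $t \geq 1$, while the case $t < 1$ is trivial. Similarly, for $|x-\lambda_i^\sigma t| \geq (t+1)^{1/2}$, by (5.7), (2.14) and (2.15), we have

\[
\begin{aligned}
I &= O(1)\delta_0 \Bigl[ \int_{|y|\leq|x-\lambda_i^\sigma t|/2} t^{-\frac{j}{2}} H(x,t;\lambda_i^\sigma,4\mu)\,(y^2+1)^{-\frac14}\, dy
 + t^{-\frac{j-1}{2}} H(x,t;\lambda_i^\sigma,4\mu)\,|x-\lambda_i^\sigma t|^{-\frac12} \Bigr]\\
&\quad + O(1) \int_{|y|\geq|x-\lambda_i^\sigma t|/2} t^{-\frac{j-1}{2}} H(x-y,t;\lambda_i^\sigma,\mu)
 \times \Bigl[ \frac{d}{dy}\rho_a^\sigma(y)\, w_i(y,0) + \rho_a^\sigma(y)\, v_i(y,0) \Bigr] dy\\
&= O(1)\delta_0 (t+1)^{-\frac{j-1}{2}} |x-\lambda_i^\sigma t|^{-\frac32}
 = O(1)\delta_0 (t+1)^{-\frac{j-1}{2}}\, \psi^{3/2}(x,t;\lambda_i^\sigma).
\end{aligned} \tag{5.26}
\]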
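The exponent bookkeeping behind (5.25) can be spelled out as follows. This is only a sketch: it assumes $t \geq 1$, that $\int H\,dy = O(1)$, and, hypothetically, that $\psi^{\alpha}(x,t;\lambda) = [(x-\lambda(t+1))^2 + t+1]^{-\alpha/2}$. The definition (4.1) is not restated in this section, so that form and the constant $c$ are assumptions.

```latex
% Assumed (not restated in this section):
%   \psi^{\alpha}(x,t;\lambda) = [(x-\lambda(t+1))^2 + t + 1]^{-\alpha/2},  \int H\,dy = O(1),  t \ge 1.
\begin{align*}
\int_{|y|<\sqrt{t+1}} (y^2+1)^{-\frac14}\, dy
  &\le 2\int_0^{\sqrt{t+1}} y^{-\frac12}\, dy = 4\,(t+1)^{\frac14},\\
t^{-\frac{j+1}{2}}\,(t+1)^{\frac14}
  &= O(1)\,(t+1)^{-\frac{j}{2}-\frac14},\\
t^{-\frac{j}{2}}\,(t+1)^{-\frac14}\int H\,dy
  &= O(1)\,(t+1)^{-\frac{j}{2}-\frac14},\\
(t+1)^{-\frac{j-1}{2}}\,\psi^{3/2}(x,t;\lambda_i^\sigma)
  &\ge c\,(t+1)^{-\frac{j-1}{2}-\frac34} = c\,(t+1)^{-\frac{j}{2}-\frac14}
  \quad\text{for } |x-\lambda_i^\sigma t| \le (t+1)^{\frac12}.
\end{align*}
```

The last line is what allows the pure time decay $(t+1)^{-j/2-1/4}$ to be rewritten in terms of $\psi^{3/2}$ inside the parabolic region.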

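Equations (5.25) and (5.26) apply to the second integral of (5.24) as well, with $x$ replaced by $x_i^\sigma$. Thus by (2.25) and (4.66), (5.24) becomes

\[
\begin{aligned}
&\sum_{\sigma=-,+} O(1)\delta_0 (t+1)^{-\frac{j-1}{2}} \Bigl[ \rho_a^\sigma(x)\, \psi^{3/2}(x,t;\lambda_i^\sigma) + \rho_a^{-\sigma}(x)\, \psi^{3/2}(x_i^\sigma,t;\lambda_i^\sigma) \Bigr]\\
&\quad = O(1)\delta_0 (t+1)^{-\frac{j-1}{2}}\, \psi_i^{3/2}(x,t).
\end{aligned} \tag{5.27}
\]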


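This is the right-hand side of (5.20). The case $j = 0$ is proved similarly, without integration by parts in (5.26). The case $j = 3$ is easier.

To prove (5.21) we use (3.2). For $j = 1$, corresponding to (5.27), the left-hand side of (5.21) is

\[
\begin{aligned}
O(1)\delta_0 \Bigl\{ &\sum_{\sigma=-,+} \rho_b^\sigma(x;|\lambda_p^\sigma|)\, \psi^{3/2}(x,t;\lambda_p^\sigma)
 + \sum_{\sigma=-,+} \rho_b^\sigma(x;|\lambda_p^\sigma|)\, e^{\lambda_p^\sigma(x-x_p^\sigma)}\, \psi^{3/2}(x_p^\sigma,t;\lambda_p^\sigma)\\
&+ \sum_{\sigma=-,+} \frac{d}{dx}\rho_b^\sigma(x;|\lambda_p^\sigma|)\, \psi^{1/2}(x,t;\lambda_p^\sigma)
 + \sum_{\sigma=-,+} \frac{d}{dx}\Bigl[ \rho_b^\sigma(x;|\lambda_p^\sigma|)\, e^{\lambda_p^\sigma(x-x_p^\sigma)} \Bigr] \psi^{1/2}(x_p^\sigma,t;\lambda_p^\sigma) \Bigr\}.
\end{aligned} \tag{5.28}
\]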

With (4.74), (2.22) and (4.80), we simplify (5.28) as

\[
O(1)\delta_0 \Bigl[ \psi_p^{3/2}(x,t) + \varepsilon e^{-|\lambda_p||x|/\mu}\, \psi_p^{1/2}(x,t) \Bigr],
\]

which is the right-hand side of (5.21). The cases of $j = 0, 2, 3$ are treated similarly. In particular, for $j = 2$, the term $O(1)\delta_0\, \varepsilon e^{-|\lambda_p||x|/\mu}\, \psi_p^{3/2}(x,t)$ is absorbed into the right-hand side of (5.21), considering $t+1 \lessgtr \varepsilon^{-2}$. For $j = 3$, we also need (3.3).

Equations (5.22) and (5.23) are proved in a similar way as (5.20) and (5.21). Here we need (4.67) for (5.22), and (4.75), (3.22), (2.20) and (4.81) for (5.23).

Lemma 5.2. Under the assumptions of Theorem 1.2, for all $i$ and for $-\infty < x < \infty$ and $t \geq 0$, we have

\[
w_i(x,t) = O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac14} M(t) \Bigr] \psi_i^{1/2}(x,t), \tag{5.29}
\]

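where $\varepsilon^{1/4}$ can be replaced by $\varepsilon^{1/2}$ if $i = p$.

Proof. From (5.2), (5.20) and (5.21), we have

\[
w_i(x,t) = I_i + II_i, \tag{5.30}
\]

where

\[
\begin{aligned}
I_i &= \int_{-\infty}^{\infty} G_i(y,0;x,t)\, w_i(y,0)\, dy = O(1)\delta_0\, \psi_i^{1/2}(x,t),\\
II_i &= \int_0^t \int_{-\infty}^{\infty} \bigl[ G_i(y,\tau;x,t)\, g_i(y,\tau) + T_i(y,\tau;x,t)\, w_i(y,\tau) \bigr]\, dy\, d\tau.
\end{aligned} \tag{5.31}
\]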

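To estimate $II_i$ we first consider $i \neq p$. For the term $T_i w_i$ we divide the time interval $[0,t]$ into $[0,t-1] \cup [t-1,t]$, and apply (3.36) with $j = \bar{j} = 0$ and $J = 1$ on $[0,t-1]$, and (3.34) on $[t-1,t]$. By (3.1), (2.16), (2.22), (3.7), (1.23) and (3.8), we have

\[
\begin{aligned}
II_i ={}& \sum_{\sigma=-,+} \rho_a^\sigma(x) \int_0^t \int_{-\infty}^{\infty} \rho_a^\sigma(y)\, H(x-y,t-\tau;\lambda_i^\sigma)\, g_i(y,\tau)\, dy\, d\tau\\
&+ \sum_{\sigma=-,+} \rho_a^{-\sigma}(x) \int_0^t \int_{-\infty}^{\infty} \rho_a^\sigma(y)\, H(x_i^\sigma-y,t-\tau;\lambda_i^\sigma)\, g_i(y,\tau)\, dy\, d\tau\\
&+ \sum_{\sigma=-,+} \rho_a^\sigma(x) \int_0^t \int_{-\infty}^{\infty} O(1)\,\varepsilon e^{-|\lambda_p^\sigma||y|/\tilde\mu} \Bigl[ \varepsilon + (t-\tau)^{-\frac12} \Bigr]\\
&\qquad \times H(x-y,t-\tau;\lambda_i^\sigma,\tilde\mu)\, \bigl[ w_i(y,\tau) + v_i(y,\tau) \bigr]\, dy\, d\tau\\
&+ \sum_{\sigma=-,+} \rho_a^{-\sigma}(x) \int_0^t \int_{-\infty}^{\infty} O(1)\,\varepsilon e^{-|\lambda_p^\sigma||y|/\tilde\mu} \Bigl[ \varepsilon + (t-\tau)^{-\frac12} \Bigr]\\
&\qquad \times H(x_i^\sigma-y,t-\tau;\lambda_i^\sigma,\tilde\mu)\, w_i(y,\tau)\, dy\, d\tau,
\end{aligned} \tag{5.32}
\]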

where µ˜ is a constant satisfying 1 < µ˜ < µ. Substituting (5.13) for gi and using the notations (4.5) and (4.25), we write the first and the third terms of (5.32) as

σ =−,+

+J

⎡

O(1)ρaσ (x) δ0 + M(t)2 + ε M(t) ⎣ 1 4

1,1

1

J 1, 2 (x, t; 0, t; λiσ , λ0k , 1)

k=i, p 3 (x, t; 0, t; λiσ , λiσ , 1) + J 1, 2 (x, t; 0, t; λiσ , 0, 1)

⎤

+ εK1,1 (x, t; 0, t; |λ p |, λiσ , µ) ˜ + ε K2,1 (x, t; 0, t; |λ p |, λiσ , µ) ˜ ⎦. 1 2

(5.33)

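Here we have used (2.7), (4.4), (4.1), (4.2), (5.10), (5.8), (4.68) and (3.8). Applying (4.12), (4.6) and (4.47) to (5.33), together with (4.66), we write it as

\[
\begin{aligned}
&\sum_{\sigma=-,+} O(1)\rho_a^\sigma(x) \Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac14} M(t) \Bigr] \psi^{1/2}(x,t;\lambda_i^\sigma)\\
&\quad = O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac14} M(t) \Bigr] \psi_i^{1/2}(x,t).
\end{aligned} \tag{5.34}
\]

Similarly, the second and the fourth terms in (5.32) give us

\[
\begin{aligned}
&\sum_{\sigma=-,+} O(1)\rho_a^{-\sigma}(x) \Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac14} M(t) \Bigr] \psi^{1/2}(x_i^\sigma,t;\lambda_i^\sigma)\\
&\quad = O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac14} M(t) \Bigr] \psi_i^{1/2}(x,t).
\end{aligned} \tag{5.35}
\]

Equations (5.30)–(5.32), (5.34) and (5.35) give us (5.29).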


The case i = p is similar. From (3.2), (3.39), (3.22), (3.6), (3.41) and (3.38), we have

I Ip =

σ =−,+

+

σ =−,+

ρbσ (x; |λσp |)

t ∞ −∞

0

σ

σ

ρbσ (x; |λσp |)eλ p (x−x p )

ρaσ (y)H (x − y, t − τ ; λσp )g p (y, τ ) dydτ

t ∞ 0

−∞

σ

ρaσ (y)eλ p y H (x σp − y, t −τ ; λσp )g p (y, τ ) dydτ

t ∞ σ σ ρbσ (x; |λ p |)+ρb−σ (x; |λσp |)e−λ p x/µ˜ +e−|λ p ||x|/µ˜ O(1)εe−|λ p ||y|/µ˜ + 0

σ =−,+

−∞

) * 1 1 ×H (x − y, t − τ ; λσp , µ) ˜ ε ε + (t − τ )− 2 w p (y, τ ) + (t − τ )− 2 v p (y, τ ) dydτ t ∞ σ σ σ 1 + ρbσ (x; |λσp |)eλ p (x−x p ) O(1)εe−|λ p ||y|/µ˜ (t − τ )− 2 σ =−,+ ×H (x σp − y, t − τ ; λσp , µ) ˜

t−1 −∞

εw p (y, τ ) + v p (y, τ ) dydτ.

(5.36)

Note that in (5.13) the third term is the leading term of slow decaying, and in the case i = p, λ p (φ) gives an extra ε by (2.22). Therefore, the first and the third terms of (5.36) are

1 O(1) ρbσ (x; |λ p |) + e−|λ p ||x|/µ˜ δ0 + M(t)2 + ε 2 M(t)

σ =−,+

⎧ ⎨ 1 J 1, 2 (x, t; 0, t; λσp , λ0k , µ) ˜ + J 2,1 (x, t; 0, t; λσp , λ0k , µ) ˜ ⎩ k= p ⎫ ⎬ 1 ˜ + J 2, 2 (x, t; 0, t; λσp , λσp , µ) ˜ + J 1,1 (x, t; 0, t; λσp , λσp , µ) ⎭ 1 + O(1) ρb− (x; |λ p |) + e−|λ p ||x|/µ˜ δ0 + M(t)2 + ε 2 M(t) 2,1 − × ε2 M1,1 (x, t; λ− , µ) ˜ + εM (x, t; λ , µ) ˜ p p 1 + −|λ p ||x|/µ˜ δ0 + M(t)2 + ε 2 M(t) + O(1) ρb (x; |λ p |) + e × ε2 L1,1 (x, t; −λ+p , µ) ˜ + εL2,1 (x, t; −λ+p , µ) ˜ ,

(5.37)

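where $\mathcal{M}$ and $\mathcal{L}$ are defined in (4.27) and (4.26), and we have used (3.22) and (2.19). By (4.12), (4.14), (4.6), (4.8), (4.41), (2.17), (4.74) and (4.40), Eq. (5.37) is simplified as

\[
\begin{aligned}
&O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac12} M(t) \Bigr]
 \Bigl\{ \sum_{\sigma=-,+} \Bigl[ \rho_b^\sigma(x;|\lambda_p|) + e^{-|\lambda_p||x|/\tilde\mu} \Bigr] \psi^{1/2}(x,t;\lambda_p^\sigma)
 + \varepsilon^{\frac12} e^{-|\lambda_p||x|/\mu}\, \psi_p^{1/2}(x,t) \Bigr\}\\
&\quad = O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac12} M(t) \Bigr] \psi_p^{1/2}(x,t).
\end{aligned} \tag{5.38}
\]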


The second and the fourth terms in (5.36) are treated similarly, and give

1 2

O(1) δ0 + M(t) + ε M(t) 2

σ

σ

ρbσ (x; |λσp |)eλ p (x−x p ) ψ 2 (x σp , t; λσp ) 1

σ =−,+

+ρb− (x; λ− p )e

− − ε2 M1,1 (x − ˜ + εM2,1 (x − ˜ p , t; λ p , µ) p , t; λ p , µ)

− λ− p (x−x p )

+ + ˜ + εL2,1 (x +p , t; −λ+p , µ) ˜ + ρb+ (x; −λ+p )eλ p (x−x p ) ε2 L1,1 (x +p , t; −λ+p , µ) 1 1 = O(1) δ0 + M(t)2 + ε 2 M(t) ψ p2 (x, t),

(5.39)

using (3.3) and (4.76). These settle the case i = p. Lemma 5.3. Under the assumptions of Theorem 1.2, for −∞ < x < ∞, x = ∓1/ε and t ≥ 0, we have (x, t) 1 if j = 0 ∂j p 2 . (5.40) v p (x, t) = O(1) δ0 + M(t) + ε 2 M(t) ( j) j ∂x p (x, t) if j = 1, 2 Proof. From (5.3) we have for 0 ≤ j ≤ 2, ∂j v p (x, t) = I ( j) + I I ( j) + I I I ( j) , ∂x j

(5.41)

where I ( j) = I I ( j) =

−∞

t 0

I I I ( j) =

∞

0

∂ j+1 G p (y, 0; x, t)w p (y, 0) dy, ∂ x j+1 ∞

−∞ t ∞ −∞

∂ j+1 G p (y, τ ; x, t)g p (y, τ ) dydτ, ∂ x j+1

(5.42)

∂ j+1 T p (y, τ ; x, t)w p (y, τ ) dydτ. ∂ x j+1 ( j)

By (5.21) and (5.8), I ( j) is O(1)δ0 p for j = 0, and O(1)δ0 p for j = 1, 2. This settles I ( j) . We consider j = 0, 1. From (3.2) I I ( j) contains t

∞

∂ j+1 H (x − y, t − τ ; λσp )g p (y, τ ) dydτ ∂ y j+1 0 −∞ σ =−,+ t ∞ d x σp j+1 λσp (x−x σp ) σ σ σ − + ρa (y)ρb (x; |λ p |)e (5.43) dx 0 −∞ σ =−,+ ρaσ (y)ρbσ (x; |λσp |)(−1) j+1

σ

× eλ p y

∂ j+1 H (x σp − y, t − τ ; λσp )g p (y, τ ) dydτ. ∂ y j+1


We write the first term in (5.43) as ⎧ ⎨ t ∞ σ σ ρb (x; |λ p |) H (x − y, t − τ ; λσp ) ⎩ 0 −∞ σ =−,+

∂ j+1 ρaσ (y) −C pk (θk , θk ) + D pkk (θk , vk ) (y, τ ) dydτ j+1 ∂y k= p t−1 ∞ ∂ j+1 (−1) j+1 ρaσ (y) j+1 H (x − y, t − τ ; λσp ) + ∂y 0 −∞ ⎡ ⎤ × ⎣g p + C pk (θk , θk ) − D pkk (θk , vk ) ⎦ (y, τ ) dydτ ×

t

(5.44)

k= p ∞ ∂

H (x − y, t − τ ; λσp ) ⎫ ⎛ ⎞ ⎤ ⎡ ⎬ j ∂ × j ⎣ρaσ (y) ⎝g p + C pk (θk , θk ) − D pkk (θk , vk ) ⎠ (y, τ )⎦ dydτ . ⎭ ∂y

−

t−1 −∞

∂y

k= p

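Here for the first integral we apply Lemma 4.4, with

\[
h(y,\tau) = \rho_a^\sigma(y) \bigl[ -C_{pk}(\theta_k,\theta_k) + D_{pkk}(\theta_k,v_k) \bigr](y,\tau) \tag{5.45}
\]

for each $k \neq p$. By (2.7), (5.12), (5.10), (5.8), (2.8), (5.1) and (5.15), we can verify that (4.23) is satisfied with $\alpha = 2$. Therefore, from (4.24) and (4.74), the first integral in (5.44) gives

\[
\begin{aligned}
&\sum_{\sigma=-,+} \rho_b^\sigma(x;|\lambda_p^\sigma|)\, O(1)\bigl[ \delta_0^2 + \delta_0 M(t) \bigr] (t+1)^{-\frac{j}{2}}
 \Bigl[ \psi^{3/2}(x,t;\lambda_p^\sigma) + \sum_{k\neq p} \bar\psi_k^{3/2}(x,t) \Bigr]\\
&\quad = O(1)\bigl[ \delta_0^2 + M(t)^2 \bigr] (t+1)^{-\frac{j}{2}}
 \Bigl[ \psi_p^{3/2}(x,t) + \sum_{k\neq p} \bar\psi_k^{3/2}(x,t) \Bigr].
\end{aligned} \tag{5.46}
\]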

Meanwhile, substituting (5.13) and (5.16) for g p and g py , respectively, in the second and the third integrals in (5.44), we arrive at t ∞ j 1 σ σ ρb (x; |λ p |) O(1)(t − τ )− 2 (t − τ + 1)− 2 H (x − y, t − τ ; λσp , µ) ˜ σ =−,+

0

−∞

⎧ ⎡ ⎤ ⎨ 3 1 3 3 δ0 + M(τ )2 + ε 2 M(τ ) (τ + 1)− 4 ⎣ ψk2 (y, τ ) + ψ 2 (y, τ ; λσp )⎦ ⎩ k= p ⎫ ⎬ 1 + ε2 e−|λ p ||y|/µ˜ (τ + 1)− 2 char{τ ≥ ε−2 } dydτ ⎭ (x, t) if j = 0 1 p =O(1) δ0 + M(t)2 + ε 2 M(t) , (1) (x, t) if j = 1

(5.47)


with 1 < µ˜ < µ. Here we have used (2.22), (3.8), (5.14), (4.15), (4.17), (4.9), (4.10), (4.41), (4.42) and (5.8). With (5.46) and (5.47), Eq. (5.44) gives the right-hand side of (5.40). The second term of (5.43) gives the same result, with similar treatment as in Lemma 5.2, and by (4.74), (4.78), (4.77) and (3.3). From (5.42) and (3.2), the difference 1

1

between I I ( j) and (5.43) is O(1)[δ0 + M(t)2 + ε 2 M(t)]εe−|λ p ||x|/µ ψ p2 (x, t) for j = 0, as shown in Lemma 5.2 (see (5.36)), using (4.79). For j = 1, by induction it is 1

1

O(1)[δ0 + M(t)2 + ε 2 M(t)][ε2 ψ p2 (x, t) + ε p (x, t)]e−|λ p ||x|/µ . These can be written as the right-hand side of (5.40) as well. Hence I I ( j) is settled for j = 0, 1. The case j = 2 is simpler, obtained from (4.17), (4.19), (4.11), (4.43), (5.19) and (4.22). To treat I I I ( j) in (5.42) we apply (3.39) to [0, t − 1] with J = 2 and j¯ = 1, and (3.37) to [t − 1, t]. After integration by parts, we have t ∞ j 1 ρbσ (x; |λ p |) + e−|λ p ||x|/µ˜ O(1)(t − τ )− 2 (t − τ + 1)− 2

I I I ( j) =

0

σ =−,+

−∞

σ × H (x − y, t − τ ; λσp , µ) ˜ ε3 |w p | + ε2 |v p | + ε|v py | (y, τ )e−|λ p ||y|/µ˜ dydτ −|λσ ||x| t−1 ∞ σ + e p O(1)H (x−y, t−τ ; λσp , µ)ε ˜ 4+ j e−|λ p ||y|/µ˜ |w p (y, τ )| dydτ 0

σ =−,+

−∞

t ∞ σ σ σ 1 ρbσ (x; |λσp |)eλ p (x−x p ) + e−|λ p ||x|/µ˜ O(1)(t − τ )− 2 + σ =−,+

t−1 −∞

σ ×H (x σp − y, t −τ ; λσp , µ) ˜ ε3 |w p |+ε2 |v p |+ε|v py | (y, τ )e−|λ p ||y|/µ˜ dydτ

h(2) if j = 2 + , 0 otherwise

where h(2) =

σ =−,+

ρbσ (x; |λσp |)

t

∞

t−1 −∞

1 O(1)(t − τ )− 2 H (x − y, t − τ ; λσp , µ) ˜

σ σ σ + eλ p (x−x p ) H (x σp − y, t − τ ; λσp , µ) ˜ εe−|λ p ||y|/µ˜ |v pyy (y, τ )| dydτ.

Here for some terms in ∂ j+1 T p /∂ x j+1 we have considered (t − τ )−1/2 ≶ ε, and used (4.79). By (5.10), (5.8) and (5.14), I I I ( j) is O(1)ε1/2 M(t) multiplied by a combination of ε3+ j [M1,1 + L1,1 ], ε2 [M j+2,1 + L j+2,1 ] char{t ≥ ε−2 } and J j+2,3/2 , and in the case j = 2, an extra term given by (4.22) with β = 5/4. From (4.41) – (4.43), (4.40), (4.9) – (4.11), (4.15), (4.17) and (4.19), and similar to the treatment of I I ( j) above, we express I I I ( j) as the right-hand side of (5.40). Lemma 5.4. Under the assumptions of Theorem 1.2, for −∞ < x < ∞, x = ∓1/ε and t ≥ 0, we have ˆ p (x, t) if j = 0 1 ∂j 2 . z p (x, t) = O(1) δ0 + M(t) + ε 8 M(t) 5 j ∂x (t + 1)− 4 if j = 1

(5.48)


Proof. From (5.1) and (5.4), we have

\[
\frac{\partial^j}{\partial x^j} z_p(x,t) = \frac{\partial^j}{\partial x^j} \bigl[ -\lambda_p(\phi(x))\, v_p(x,t) + v_{px}(x,t) + g_p(x,t) \bigr]. \tag{5.49}
\]

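Together with (5.40), (2.22), (3.7), (5.8), (5.13), (5.10), (3.8) and (5.16), we have (5.48) if $|x| \geq |\lambda_p|(t+1)/4$ or $t+1 \leq \varepsilon^{-5/2}$. We now prove (5.48) for $|x| \leq |\lambda_p|(t+1)/4$ and $t+1 \geq \varepsilon^{-5/2}$. From (5.6) we have

\[
\frac{\partial^j}{\partial x^j} z_p(x,t) = I^{(j)} + II^{(j)} + III^{(j)}, \tag{5.50}
\]

where

\[
\begin{aligned}
I^{(j)} &= \int_{-\infty}^{\infty} \frac{\partial^j}{\partial x^j} G_p(y,0;x,t)\, z_p(y,0)\, dy,\\
II^{(j)} &= \int_0^t \int_{-\infty}^{\infty} \frac{\partial^j}{\partial x^j} G_p(y,\tau;x,t)\, g_{p\tau}(y,\tau)\, dy\, d\tau,\\
III^{(j)} &= \int_0^t \int_{-\infty}^{\infty} \frac{\partial^j}{\partial x^j} T_p(y,\tau;x,t)\, z_p(y,\tau)\, dy\, d\tau.
\end{aligned} \tag{5.51}
\]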

Note that $I^{(j)}$ is given by (5.23), and for $t+1 \geq \varepsilon^{-5/2}$,

\[
\theta(|x|,t;-|\lambda_p|,\mu) = O(1)\,\theta(|x|,t;-|\lambda_p|,2\mu)\, e^{-t^{1/5}/C}. \tag{5.52}
\]
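The factor $e^{-t^{1/5}/C}$ in (5.52) can be traced to the restriction $t+1 \geq \varepsilon^{-5/2}$ together with the size of $\lambda_p$. The following is a sketch under two assumptions that are not restated in this section: that $\theta$ in (4.1) has the Gaussian form $\theta(x,t;\lambda,\mu) = (t+1)^{-1/2}\exp\bigl(-(x-\lambda(t+1))^2/(\mu(t+1))\bigr)$, and that $|\lambda_p| \geq \varepsilon/C_0$ for some constant $C_0$ (a label introduced here; compare the remark that $\lambda_p(\phi)$ carries a factor $\varepsilon$ by (2.22)).

```latex
% Assumed: \theta(x,t;\lambda,\mu) = (t+1)^{-1/2} \exp(-(x-\lambda(t+1))^2 / (\mu(t+1))),
%          |\lambda_p| \ge \varepsilon / C_0   (C_0 is a hypothetical constant).
\begin{align*}
\theta(|x|,t;-|\lambda_p|,\mu)
 &= (t+1)^{-\frac12}\exp\Bigl(-\frac{(|x|+|\lambda_p|(t+1))^2}{\mu(t+1)}\Bigr)\\
 &= \theta(|x|,t;-|\lambda_p|,2\mu)\,\exp\Bigl(-\frac{(|x|+|\lambda_p|(t+1))^2}{2\mu(t+1)}\Bigr)\\
 &\le \theta(|x|,t;-|\lambda_p|,2\mu)\,\exp\Bigl(-\frac{\lambda_p^2\,(t+1)}{2\mu}\Bigr),\\[4pt]
\lambda_p^2\,(t+1)
 &\ge C_0^{-2}\,\varepsilon^2\,(t+1)
 \ \ge\ C_0^{-2}\,(t+1)^{-\frac45}\,(t+1)
 = C_0^{-2}\,(t+1)^{\frac15}.
\end{align*}
```

The second display uses $t+1 \geq \varepsilon^{-5/2}$ in the form $\varepsilon \geq (t+1)^{-2/5}$. Under these assumptions one may take $C = 2\mu C_0^2$.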

Hence $I^{(j)}$ is $O(1)\delta_0\, \psi_p^{3/2}(x,t)$ if $j = 0$ and $O(1)\delta_0 (t+1)^{-5/4}$ if $j = 1$. From (5.18) we have

II

( j)

=

t

∂j ∂ 0 λ G (y, τ ; x, t) C (θ , θ )+ D (θ , z ) (y, τ ) dydτ p pk k k pkk k k k j ∂y 0 −∞ ∂ x k= p ⎡ t ∞ j ∂ ∂ G p (y, τ ; x, t) ⎣ D pkk (θk , z k ) + j ∂y 0 −∞ ∂ x k= p k =k ⎤ + D pkk (vk , z k ) + 2l p (φ) rk (φ) y z k ⎦ (y, τ ) dydτ ∞

k= p k

t

+

k

1 ∂j 2 2 G (y, τ ; x, t)O(1) δ + M(τ ) + ε M(τ ) 0 j p 0 −∞ ∂ x 3 − 54 −|λ p ||y|/µ − 32 2 ψk (y, τ ) + εe (τ + 1) dydτ. × (τ + 1) ∞

(5.53)

k

The first integral in (5.53) is treated in the same way as the first integral in (5.44), using Lemma 4.4. This gives the right-hand side of (5.46). For the case j = 1 there are


extra terms, which can be either absorbed into the third integral in (5.53) or handled by induction, and give the right-hand side of (5.48). The second term in (5.53) is treated via integration by parts, on [0, t] for j = 0, and 1 on [0, t − 1] for j = 1. It is O(1)[δ0 + M(t)2 + ε 2 M(t)] multiplied by a combination of J 2+ j,3/2 , M2+ j,3 + L2+ j,3 and ε(M1+ j,3 + L1+ j,3 ), and in the case j = 1, an extra ˆ p . By (4.9), (4.10), (4.15), (4.17), (4.44), (4.45) and (4.40), these give term εe−|λ p ||x|/µ the right-hand side of (5.48). The third term in (5.53) is straightforward by (4.7), (4.9), (4.13), (4.15) and (4.44). Here we need to apply (5.52) to θ and θ 2 as well. The estimate for I I I ( j) is similar. Using (3.39) with j¯ = J = 0 and (3.37), I I I ( j) is O(1)M(t) multiplied by a combination of ε1/2 (M2+ j,3 +L2+ j,3 ), ε3/2 (M1+ j,3 +L1+ j,3 ) and ε−1/2 (M3+ j,3 + L3+ j,3 ), and in the case j = 1, ε5/2 (M1,3 + L1,3 ), and (4.22) multiplied by ε and with β = 5/4 as well. The first two terms are absorbed into I I ( j) . The third one is settled by (4.45) if j = 0, and by (4.46) if j = 1 since (t + 1)−1/4 = O(1)ε5/8 . For j = 1 the fourth term is given by (4.44). Lemma 5.5. Under the assumptions of Theorem 1.2, for i = p, t ≥ 0 and |x − λi0 (t + 1)| ≥ (t + 1)/C, we have ˆ i (x, t) if j = 0 1 ∂j 2 . (5.54) z i (x, t) = O(1) δ0 + M(t) + ε 8 M(t) 5 j ∂x (t + 1)− 4 if j = 1 Proof. We prove (5.54) with j = 0. The case j = 1 is simpler. Similar to (5.50) we have z i = I = I I = I I I,

(5.55)

where I , I I and I I I are the three integrals in (5.6). From (5.22) I is # $ 3 3 O(1)δ0 θ (x, t; λi0 , µ) + ψi2 (x, t) = O(1)δ0 ψi2 (x, t) since |x − λi0 (t + 1)| ≥ (t + 1)/C. From (3.1), (5.17b), (5.48) and (4.70), t ∞ ρaσ (x)H (x − y, t − τ ; λiσ ) + ρa−σ (x)H (xiσ − y, t − τ ; λiσ ) II = σ =−,+ 0

−∞

⎫ ⎧ ⎡ ⎤ ⎬ ⎨ ∂ × λ0k Cik (θk , θk ) + Dikk (θk , z k )⎦ (y, τ ) dydτ ρaσ (y) ⎣ ⎭ ∂y ⎩ k=i, p k= p O(1) δ0 + M(t)2 + σ =−,+

t ∞ 1 1 +ε 2 M(t) ρaσ (x) (t − τ )− 2 H (x − y, t − τ ; λiσ , µ) ˜ 0 −∞ ⎧ ⎡ ⎤ ⎨ 3 3 3 ψk2 (y, τ )⎦ × (t − τ )− 4 ⎣ψ 2 (y, τ ; λiσ ) + ⎩ k=i, p ⎫ ⎬ 3 5 + (t − τ )− 4 + ε2 e−|λ p ||y|/µ ψ p2 (y, τ ) dydτ ⎭


+

σ =−,+ 1 2

O(1) δ0 + M(t)2

ρa−σ (x)

t

∞

1

+ ε M(t) (t − τ )− 2 H (xiσ − y, t − τ ; λiσ , µ) ˜ 0 −∞ ⎧ ⎡ ⎤ ⎨ 3 3 3 σ 0 ⎦ ψ 2 (y, τ ; νik λk ) × (t − τ )− 4 ⎣ψ 2 (y, τ ; λiσ ) + ⎩ k=i, p ⎫ ⎬ 3 5 + (t − τ )− 4 + ε2 e−|λ p ||y|/µ ψ p2 (y, τ ) dydτ ⎭ t ∞ 1 2 8 + O(1) δ0 + M(t) + ε M(t) ˜ ρaσ (x)H (x − y, t − τ ; λiσ , µ) σ =−,+ + ρa−σ (x)H (xiσ

− τ ; λiσ , µ) ˜

0

−∞

− y, t ⎧ ⎤ ⎡ ⎨ 3 5 3 ψk2 (y, τ )⎦ × (t − τ )− 4 ⎣ψ 2 (y, τ ; λiσ ) + ⎩ k=i ⎫ ⎬ 3 5 + ε2 e−|λ p ||y|/µ˜ + ε 2 e−|λ p ||y|/µ ψ p2 (y, τ ) dydτ. ⎭

(5.56)

The second integral is a combination of J 2,3/2 for λ on all transversal field directions, J 2,5/2 for λ on the shock field direction, and K˜ 2 . By (4.9), (4.15), (4.16) and (4.52), this term is ⎡ ⎤ 3 3 1 ψ¯ k2 (x, t)⎦ . O(1) δ0 + M(t)2 + ε 2 M(t) ⎣ψi2 (x, t) + k=i, p

The third term in (5.56) is the counterpart of the second one with the shift xiσ . Note σ λ0 , using (4.70). that it is necessary to alter the direction of ψk , k = i, p, from λ0k to νik k Applying (4.66) and (4.73), we then write the result in the same form as for the second term. The first term in (5.56) is similar to the first term in (5.44), and gives the same result as above. The part with the shift xiσ is treated with the same technique as for the third term, using (4.71). The last term in (5.56) is a leading term. Here we use (4.7) and (4.13) for J 1,5/2 . The term θ is absorbed into ψ 3/2 in the restricted domain. To handle K˜ 1 with µ˜ > µ and ε1/2 K˜ 1 with µ we apply (4.51). It is the first one that gives us the ˆ i , (5.8). Note that µ˜ > µ is necessary to absorb the error induced by last two terms in xiσ . Also note that by interpolation, ' ( ' ( 3 3 3 1 1 1 1 1 min ε− 2 ψi2 (x, t), εψi4 (x, t) = O(1) min ε− 2 ψi2 (x, t), ε 2 (t + 1)− 2 ψi2 (x, t) in the restricted domain. Finally, to handle I I I , the contribution from the truncation error, we apply (3.36) on [0, t − 1], with j¯ = J = 0, and (3.34) on [t − 1, t]. Most of this part is absorbed into I I in (5.56), while the rest can be handled with some minor modifications.
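The interpolation step quoted near the end of the proof above can be unpacked as follows. This is a sketch under two assumptions that are not restated in this section: that the superscripts on $\psi_i$ act as ordinary powers of one base function, and that $\psi_i^{1/2}(x,t) \leq C(t+1)^{-1/2}$ in the restricted domain $|x-\lambda_i^0(t+1)| \geq (t+1)/C$, as would follow from a profile of the form $[(x-\lambda(t+1))^2+t+1]^{-1/4}$. The shorthand $a$, $b$ is introduced here for illustration.

```latex
% a, b: shorthand for the two entries of the minimum (not notation from the paper).
\begin{align*}
a &= \varepsilon^{-\frac12}\,\psi_i^{3/2}(x,t), \qquad b = \varepsilon\,\psi_i^{3/4}(x,t),\\
\min\{a,b\} &\le a^{\frac13}\, b^{\frac23}
  = \varepsilon^{-\frac16+\frac23}\,\psi_i^{\frac12+\frac12}(x,t)
  = \varepsilon^{\frac12}\,\psi_i^{1/2}(x,t)\,\psi_i^{1/2}(x,t)\\
 &\le C\,\varepsilon^{\frac12}\,(t+1)^{-\frac12}\,\psi_i^{1/2}(x,t).
\end{align*}
```

Combined with the trivial bound $\min\{a,b\} \leq a$, this gives $\min\{a,b\} = O(1)\min\{\varepsilon^{-1/2}\psi_i^{3/2},\ \varepsilon^{1/2}(t+1)^{-1/2}\psi_i^{1/2}\}$ in the restricted domain.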


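The next lemma plays a similar role as Lemma 4.4. It takes into account cancellation in wave interaction, needed for the estimate on $v_i$.

Lemma 5.6. Let $\lambda \neq 0$ be a constant and $h(x,t)$ be a smooth function. For $-\infty < x < \infty$ and $t \geq 1$, define

\[
\mathcal{I}(x,t) \equiv \int_{\sqrt{t}}^{t} \int_{-\infty}^{\infty} H(x-y,t-\tau;\lambda)\, \frac{\partial}{\partial y} h(y,\tau)\, dy\, d\tau. \tag{5.57}
\]

Then

\[
\mathcal{I}(x,t) =
\begin{cases}
\displaystyle \int_0^{\infty} e^{-\lambda\eta}\, q(\eta+x,t)\, d\eta & \text{if } \lambda > 0\\[6pt]
\displaystyle -\int_{-\infty}^{0} e^{-\lambda\eta}\, q(\eta+x,t)\, d\eta & \text{if } \lambda < 0
\end{cases}, \tag{5.58}
\]

where

\[
\begin{aligned}
q(x,t) ={}& h(x,t) - \int_{-\infty}^{\infty} H(x-y,t-\sqrt{t};\lambda)\, h(y,\sqrt{t})\, dy\\
&- \int_{\sqrt{t}}^{t} \int_{-\infty}^{\infty} H(x-y,t-\tau;\lambda)\, \frac{\partial}{\partial\tau} h(y,\tau)\, dy\, d\tau.
\end{aligned} \tag{5.59}
\]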

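Proof. Taking the derivative with respect to $x$ on both sides of (5.57) and integrating by parts, we have

\[
\begin{aligned}
\frac{\partial}{\partial x} \mathcal{I}(x,t)
&= \int_{\sqrt{t}}^{t} \int_{-\infty}^{\infty} \frac{\partial^2}{\partial y^2} H(x-y,t-\tau;\lambda)\, h(y,\tau)\, dy\, d\tau\\
&= -\int_{\sqrt{t}}^{t} \int_{-\infty}^{\infty} \Bigl[ \frac{\partial}{\partial\tau} H(x-y,t-\tau;\lambda) + \lambda \frac{\partial}{\partial y} H(x-y,t-\tau;\lambda) \Bigr] h(y,\tau)\, dy\, d\tau\\
&= -q(x,t) + \lambda\, \mathcal{I}(x,t).
\end{aligned}
\]

Thus $\frac{\partial}{\partial x}\bigl[ e^{-\lambda x} \mathcal{I}(x,t) \bigr] = -e^{-\lambda x} q(x,t)$. If $\lambda > 0$, we integrate both sides on $[x,\infty)$. This gives us

\[
\mathcal{I}(x,t) = e^{\lambda x} \int_x^{\infty} e^{-\lambda\eta}\, q(\eta,t)\, d\eta = \int_0^{\infty} e^{-\lambda\eta}\, q(\eta+x,t)\, d\eta.
\]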
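For reference, the identity $\partial_x \mathcal{I} = -q + \lambda \mathcal{I}$ used in the proof rests on one integration by parts in $\tau$ and one in $y$. The sketch below assumes that $H(x,t;\lambda)$ is the kernel of $H_t + \lambda H_x = H_{xx}$ with $H(\cdot,0;\lambda) = \delta$. That normalization is inferred from the displayed identity, not restated in this section, and the abbreviations $\partial_\tau$, $\partial_y$, $\iint$ are shorthand introduced here.

```latex
% Assumed: H_t + \lambda H_x = H_{xx},  H(\cdot,0;\lambda) = \delta.
% \iint abbreviates \int_{\sqrt t}^{t}\int_{-\infty}^{\infty} ... dy\,d\tau;  H = H(x-y,t-\tau;\lambda).
\begin{align*}
-\iint \partial_\tau H \; h
 &= -\Bigl[\int_{-\infty}^{\infty} H(x-y,t-\tau;\lambda)\,h(y,\tau)\,dy\Bigr]_{\tau=\sqrt t}^{\tau=t}
    + \iint H\,\partial_\tau h\\
 &= -h(x,t) + \int_{-\infty}^{\infty} H(x-y,t-\sqrt t;\lambda)\,h(y,\sqrt t)\,dy
    + \iint H\,\partial_\tau h
  \;=\; -q(x,t),\\[4pt]
-\lambda \iint \partial_y H \; h
 &= \lambda \iint H\,\partial_y h \;=\; \lambda\,\mathcal{I}(x,t).
\end{align*}
```

The boundary term at $\tau = t$ reduces to $h(x,t)$ because the kernel concentrates to a delta function as $t-\tau \to 0$.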

Similarly, if λ < 0, we integrate on [−∞, x] to obtain (5.58). Lemma 5.7. Under the assumptions of Theorem 1.2, for i = p, −∞ < x < ∞, x = ∓1/ε and t ≥ 0, we have (x, t) if j = 0 1 ∂j i 2 8 . (5.60) vi (x, t) = O(1) δ0 + M(t) + ε M(t) ( j) j ∂x i (x, t) if j = 1, 2 Proof. From (5.3), (5.20) and (5.8), we have ∂j vi (x, t) = I ( j) + I I ( j) , ∂x j

(5.61)


where I

( j)

=

I I ( j) =

∞

−∞

t 0

i (x, t) if j = 0 ∂ j+1 , G i (y, 0; x, t)wi (y, 0) dy = O(1)δ0 ( j) j+1 ∂x i (x, t) if j = 1, 2 ∞

−∞

∂ j+1 [G i (y, τ ; x, t)gi (y, τ ) + Ti (y, τ ; x, t)wi (y, τ )] dydτ. (5.62) ∂ x j+1

Since the case j = 2 is simpler, we consider j = 0, 1 for I I ( j) : ( j)

( j)

( j)

I I ( j) = I I1 + I I2 + I I3 ,

(5.63)

where ( j)

I I1

=

t 0

⎡

∞

−∞

× ⎣− ( j)

I I2

= 0

√

∂ j+1 G i (y, τ ; x, t) ∂ x j+1 Cik (θk , θk ) +

k=i, p

t

⎡

∞

k= p

'

−∞

∂ j+1 ∂ x j+1

× ⎣gi +

⎤ Dikk (θk , vk )⎦ (y, τ ) dydτ,

k

G i (y, τ ; x, t)

Cik (θk , θk ) −

⎤ Dikk (θk , vk )⎦ (y, τ )

k= p k

k=i, p

( ∂ j+1 T (y, τ ; x, t)w (y, τ ) dydτ, i i ∂ x j+1 t ∞ ' j+1 ∂ = √ G i (y, τ ; x, t) ∂ x j+1 t −∞ ⎡ ⎤ × ⎣gi + Cik (θk , θk ) − Dikk (θk , vk )⎦ (y, τ ) +

( j)

I I3

k= p k

k=i, p

+

∂ j+1 ∂ x j+1

(5.64)

(

Ti (y, τ ; x, t)wi (y, τ ) dydτ.

For $II_1^{(j)}$ we substitute (3.1) for $\frac{\partial^{j+1}}{\partial x^{j+1}} G_i(y,\tau;x,t)$ and perform integration by parts. It is similar to the first three terms in (5.56) (without K˜ 2 ), and gives

\[
O(1)\bigl[ \delta_0^2 + M(t)^2 \bigr]
\begin{cases}
\psi_i^{3/2}(x,t) + \sum_{k\neq i,p} \bar\psi_k^{3/2}(x,t) & \text{if } j = 0\\[4pt]
(t+1)^{-\frac12} \sum_{k\neq p} \psi_k^{3/2}(x,t) & \text{if } j = 1
\end{cases}, \tag{5.65}
\]

where we have used (4.72) as well.


( j) For I I2 we use (3.1), (5.13), (4.68), (3.8), (5.29), (5.14), and (3.36) with j¯ = J = 0. This gives us ( j) I I2

=

√

t

∞

O(1)(t − τ )−

−∞ σ =−,+ 0 −σ + ρa (x)H (xiσ −

j+1 2

σ ρa (x)H (x − y, t − τ ; λiσ , µ) ˜

y, t − τ ; λiσ , µ) ˜ ⎧ ⎤ ⎡ ⎨ 3 1 1 3 ψk2 (y, τ ) + ψ 2 (y, τ ; λiσ )⎦ × δ0 + M(τ )2 + ε 4 M(τ ) (τ + 1)− 2 ⎣ ⎩ k=i ⎫ ⎬ 1 + ε2 e−|λ p ||y|/µ˜ ψ p2 (y, τ ) char{(τ + 1) ≥ ε−2 } dydτ, ⎭

with 1 < µ˜ < µ. Equations (4.20) and (4.58) then give us # 3 $ 1 2 − 2j 2 −|λ p ||x|/µ − 21 2 4 ψi (x, t) + ε e . (|x| + t + 1) O(1) δ0 + M(t) + ε M(t) (t + 1) (5.66) ( j)

To estimate I I3 ( j)

I I3,1

we first consider the part of the nonlinear source. From (3.1) it is t ∞ = ρaσ (x)H (x − y, t − τ ; λiσ ) √ σ =−,+

+ where

−∞

t

ρa−σ (x)

d xiσ dx

j+1

⎡

H (xiσ

h(y, τ ) = ρaσ (y) ⎣gi +

− y, t

− τ ; λiσ )

Cik (θk , θk ) −

∂ j+1 h(y, τ ) dydτ, (5.67) ∂ y j+1 ⎤

Dikk (θk , vk )⎦ (y, τ ).

k= p k

k=i, p

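For definiteness let $i > p$, and apply Lemma 5.6 to (5.67). From (5.57)–(5.59) we have

\[
II_{3,1}^{(j)} = \sum_{\sigma=-,+} \Bigl[ \rho_a^\sigma(x) \int_0^{\infty} e^{-\lambda_i^\sigma \eta}\, q(\eta+x,t)\, d\eta
 + \rho_a^{-\sigma}(x) \Bigl(\frac{d x_i^\sigma}{dx}\Bigr)^{j+1} \int_0^{\infty} e^{-\lambda_i^\sigma \eta}\, q(\eta+x_i^\sigma,t)\, d\eta \Bigr], \tag{5.68}
\]

where by integration by parts,

\[
\begin{aligned}
q(x,t) ={}& \frac{\partial^j}{\partial x^j} h(x,t) + (-1)^{j+1} \int_{-\infty}^{\infty} \frac{\partial^j}{\partial y^j} H(x-y,t-\sqrt{t};\lambda_i^\sigma)\, h(y,\sqrt{t})\, dy\\
&+ (-1)^{j+1} \int_{\sqrt{t}}^{t} \int_{-\infty}^{\infty} \frac{\partial^j}{\partial y^j} H(x-y,t-\tau;\lambda_i^\sigma)\, \frac{\partial}{\partial\tau} h(y,\tau)\, dy\, d\tau.
\end{aligned} \tag{5.69}
\]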


The first term in (5.69) is given by (5.13), (5.15), (5.10), (5.8) and (5.29). For the second term we perform integration by parts and substitute in the result from the first one. Equation (4.60) gives an estimate for this term. Therefore, the first two terms in (5.69) are $ # 1 j 1 3 O(1) δ0 + M(t)2 + ε 4 M(t) (t + 1)− 2 ψ 2 (x, t; λiσ ) + ε2 e−|λ p ||x|/µ˜ ψ p2 (x, t) . (5.70) Applying (5.17a) and (5.48) to the third term in (5.69) and performing integration by parts, √ this term is similar to (5.56), except that the integration with respect to τ is now on [ t, t]. Therefore, we use (4.21), (4.10), (4.17), (4.18), (4.59), and in particular, (4.53) and (4.54) as well. This gives an expression for q(x, t) very much like the right-hand side of (5.71) below. Substituting the result into (5.68) and applying (4.61) – (4.65), we arrive at 1 ( j) (5.71) I I3,1 = O(1) δ0 + M(t)2 + ε 8 M(t) ⎧ 3 3 1 / ⎪ ⎪ ψi2 (x, t) + k=i, p ψ¯ k2 (x, t) + ε2 e−|λ p ||x|/µ ψ p2 (x, t) ⎪ ⎪ ⎪ 3 3 1 ⎪ ⎨ if j = 0, + min{ε− 2 ψi2 (x, t), εψi4 (x, t)} char{x ≥ 0} 3 1 1 / 3 − − 2 2 2 −|λ ||x|/µ ⎪ ⎪ (t + 1) 2 k=i, p ψk (x, t) + (|x| + t + 1) 2 + ε e p ψ p (x, t) ⎪ ⎪ % & 3 ⎪ 1 1 1 ⎪ ⎩ if j = 1. + min{1, ε− 2 (|x| + 1)− 2 + (t + 1)− 2 }ψi2 (x, t) ( j)

( j)

The contribution from the truncation error in I I3 , denoted as I I3,2 , is treated similarly: We substitute (3.34) into (5.64), and perform integration by parts. After rearranging, the two leading terms can be written as ⎤ ⎡ ! " t ∞ −σ j+1 d x i ⎣ H (x − y, t − τ ; λiσ )− ρaσ (x) √ H (xi−σ − y, t −τ ; λi−σ )⎦ d x −∞ t σ =−,+ # $ ∂ j+1 dρaσ (y) w λ (φ(y)) (y, τ ) dydτ. (5.72) i i ∂ y j+1 dy ( j)

Applying Lemma 5.6 to each term in I I3,2 , and for the part given as (5.72) we need to further use Lemmas 3.4 and 3.5 to achieve cancellation between the two heat kernels, see the proof of Theorem 3.12. To simplify the analysis we can apply the partial result ( j) ( j) for z i from Lemma 5.5 to I I3,2 . It is then absorbed into I I3,1 , and gives the same result as in (5.71). Combining (5.63), (5.65), (5.66) and (5.71), I I ( j) takes the form of (5.71). ( j) Finally, if we calculate I I3 without using Lemma 5.6, for x ≥ (1 − 1/C)λi0 (t + 1), we have 1 I I ( j) = O(1) δ0 + M(t)2 + ε 4 M(t) (5.73) ⎧ 3 3 / ⎪ ψi2 (x, t) + k=i, p ψ¯ k2 (x, t) ⎪ ⎪ ⎪ 1 ⎪ 1 1 ⎨ +ε 2 (t + 1)− 2 ψi2 (x, t) char{(1 − 1/C)λi0 (t + 1) ≤ x ≤ λi0 (t + 1)} if j = 0 , 3 1 / 3 ⎪ ⎪ (t + 1)− 2 k= p ψk2 (x, t) + (|x| + t + 1)− 2 ⎪ ⎪ ⎪ 1 ⎩ 1 +ε 2 (t + 1)−1 ψi2 (x, t) char{(1 − 1/C)λi0 (t + 1) ≤ x ≤ λi0 (t + 1)} if j = 1


using (4.48) and (4.49). Therefore, if x ≤ (1 − 1/C)λi0 (t + 1) we use (5.71) for I I ( j) , if x ≥ λi0 (t + 1) we use (5.73), and if (1 − 1/C)λi0 (t + 1) ≤ x ≤ λi0 (t + 1) we use the minimum of the two. This gives (5.60). Lemma 5.8. Under the assumptions of Theorem 1.2, for i = p, −∞ < x < ∞, x = ∓1/ε and t ≥ 0, we have ˆ i (x, t) if j = 0 1 ∂j 2 z i (x, t) = O(1) δ0 + M(t) + ε 8 M(t) . (5.74) 5 j ∂x (t + 1)− 4 if j = 1 Proof. From (5.1), (5.4) and the definition of vi , we have z i (x, t) = −λi (φ(x))vi (x, t) + vi x (x, t) + gi (x, t).

(5.75)

Applying (5.60), (5.13) and (5.15) to (5.75) and its derivative, we have (5.74) for $|x-\lambda_i^0(t+1)| \leq (t+1)/C$. Together with Lemma 5.5, (5.74) is proved.

Finally, (5.9), (5.29), (5.40), (5.60), (5.48) and (5.74) give us

\[
M(t) = O(1)\Bigl[ \delta_0 + M(t)^2 + \varepsilon^{\frac18} M(t) \Bigr],
\]

which implies $M(t) = O(1)\delta_0$ for sufficiently small $M(t)$ and $\varepsilon$.

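The closing implication $M(t) = O(1)\delta_0$ is a standard continuity (bootstrap) argument. A sketch, writing $C_1$ for the constant implied by the $O(1)$ (a label introduced here for illustration) and assuming that $M(t)$ is continuous and nondecreasing in $t$ with $M(0) \leq C_1\delta_0$, which is an assumption about the definition (5.9) that is not restated in this section:

```latex
% C_1: hypothetical name for the implied O(1) constant; M continuous, nondecreasing (assumed).
\begin{align*}
&\text{Suppose } C_1\,\varepsilon^{\frac18} \le \tfrac14 \ \text{ and } \ M(t) \le \tfrac{1}{4C_1}. \ \text{Then}\\
&\qquad M(t) \le C_1\delta_0 + \bigl(C_1 M(t)\bigr) M(t) + C_1\varepsilon^{\frac18} M(t)
  \le C_1\delta_0 + \tfrac14 M(t) + \tfrac14 M(t),\\
&\qquad\text{so } M(t) \le 2C_1\delta_0.\\
&\text{If } \delta_0 < \tfrac{1}{8C_1^2}, \text{ then } 2C_1\delta_0 < \tfrac{1}{4C_1}, \text{ so by continuity } M(t)
  \text{ never reaches } \tfrac{1}{4C_1},\\
&\qquad\text{and } M(t) \le 2C_1\delta_0 \text{ for all } t \ge 0.
\end{align*}
```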

Communicated by P. Constantin

Commun. Math. Phys. 290, 83–103 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0700-5


An Application of Mirror Extensions

Feng Xu

Department of Mathematics, University of California at Riverside, Riverside, CA 92521, USA. E-mail: [email protected]

Received: 31 July 2008 / Accepted: 28 August 2008
Published online: 9 December 2008 – © Springer-Verlag 2008

Abstract: In this paper we apply our previous results on mirror extensions to obtain realizations of three modular invariants constructed by A. N. Schellekens by holomorphic conformal nets with central charge equal to 24.

1. Introduction

Partition functions of chiral rational conformal field theories (RCFT) are modular invariant (cf. [47]). However, there are examples of “spurious” modular invariants which do not correspond to any RCFT (cf. [6,38 and 15]). It is therefore an interesting question to decide which modular invariants can be realized in RCFT. For many interesting modular invariants this question was raised, for example, in [37] and more recently in [11]. For results on related questions, see [5,6,22,23,35 and 25] for a partial list.

In this paper we examine the holomorphic modular invariants with central charge 24 constructed by A. N. Schellekens in [37]. Besides modular invariance, A. N. Schellekens showed that his modular invariants passed an impressive list of checks from tracial identities, which strongly suggested that his modular invariants can be realized in chiral RCFT. Some of Schellekens’s modular invariants were constructed using level-rank duality. In [41] we proved a general theorem on mirror extensions (cf. Th. 2.25) which included modular invariants from level-rank duality (cf. §2.6). It is therefore an interesting question to see if mirror extensions can provide chiral RCFT realizations of some of Schellekens’s modular invariants.

Our main result in this paper is to show that three of Schellekens’s modular invariants can be realized by holomorphic conformal nets (cf. Th. 3.3): these nets are constructed by simple current extensions (cf. §2.4) of mirror extensions. Our results strongly suggest that there should be Vertex Operator Algebras which realize these modular invariants. We expect our methods to apply to other modular invariants in the literature, especially when level-rank duality plays a role.

Supported in part by NSF.


This paper is organized as follows: after a preliminary section on nets, mirror extensions and simple current extensions, we examine three of Schellekens’s modular invariants in [37], and obtain realizations of these invariants as simple current extensions of three mirror extensions. We end with two conjectures about holomorphic conformal nets with central charge 24 which are motivated by [14 and 37], and we hope that these conjectures will stimulate further research.

2. Preliminaries

2.1. Preliminaries on sectors. Given an infinite factor $M$, the sectors of $M$ are given by

$\mathrm{Sect}(M) = \mathrm{End}(M)/\mathrm{Inn}(M)$,

namely $\mathrm{Sect}(M)$ is the quotient of the semigroup of the endomorphisms of $M$ modulo the equivalence relation: $\rho, \rho' \in \mathrm{End}(M)$, $\rho \sim \rho'$ iff there is a unitary $u \in M$ such that $\rho'(x) = u\rho(x)u^*$ for all $x \in M$.

$\mathrm{Sect}(M)$ is a $*$-semiring (there are an addition, a product and an involution $\rho \to \bar\rho$) equivalent to the Connes correspondences (bimodules) on $M$ up to unitary equivalence. If $\rho$ is an element of $\mathrm{End}(M)$ we shall denote by $[\rho]$ its class in $\mathrm{Sect}(M)$. We define $\mathrm{Hom}(\rho, \rho')$ between the objects $\rho, \rho' \in \mathrm{End}(M)$ by

$\mathrm{Hom}(\rho, \rho') \equiv \{a \in M : a\rho(x) = \rho'(x)a \ \ \forall x \in M\}$.

We use $\langle \lambda, \mu \rangle$ to denote the dimension of $\mathrm{Hom}(\lambda, \mu)$; it can be $\infty$, but it is finite if $\lambda, \mu$ have finite index. See [20] for the definition of index for the type $II_1$ case which initiated the subject and [31] for the definition of index in general. Also see §2.3 of [26] for expositions. $\langle \lambda, \mu \rangle$ depends only on $[\lambda]$ and $[\mu]$. Moreover, if $\nu$ has finite index, then $\langle \nu\lambda, \mu \rangle = \langle \lambda, \bar\nu\mu \rangle$ and $\langle \lambda\nu, \mu \rangle = \langle \lambda, \mu\bar\nu \rangle$, which follows from Frobenius duality. $\mu$ is a subsector of $\lambda$ if there is an isometry $v \in M$ such that $\mu(x) = v^*\lambda(x)v$, $\forall x \in M$. We will also use the following notation: if $\mu$ is a subsector of $\lambda$, we will write it as $\mu \prec \lambda$ or $\lambda \succ \mu$. A sector is said to be irreducible if it has only one subsector.

2.2. Local nets. By an interval of the circle we mean an open connected non-empty subset $I$ of $S^1$ such that the interior of its complement $I'$ is not empty. We denote by $\mathcal{I}$ the family of all intervals of $S^1$.

A net $\mathcal{A}$ of von Neumann algebras on $S^1$ is a map

$I \in \mathcal{I} \to \mathcal{A}(I) \subset B(\mathcal{H})$

from $\mathcal{I}$ to von Neumann algebras on a fixed separable Hilbert space $\mathcal{H}$ that satisfies:

A. Isotony. If $I_1 \subset I_2$ belong to $\mathcal{I}$, then $\mathcal{A}(I_1) \subset \mathcal{A}(I_2)$.

If $E \subset S^1$ is any region, we shall put $\mathcal{A}(E) \equiv \bigvee_{E \supset I \in \mathcal{I}} \mathcal{A}(I)$ with $\mathcal{A}(E) = \mathbb{C}$ if $E$ has empty interior (the symbol $\vee$ denotes the von Neumann algebra generated).

The net $\mathcal{A}$ is called local if it satisfies:

B. Locality. If $I_1, I_2 \in \mathcal{I}$ and $I_1 \cap I_2 = \emptyset$ then $[\mathcal{A}(I_1), \mathcal{A}(I_2)] = \{0\}$, where brackets denote the commutator.

Application of Mirror Extensions


The net $\mathcal{A}$ is called Möbius covariant if in addition it satisfies the following properties C, D, E, F:

C. Möbius covariance. There exists a non-trivial strongly continuous unitary representation $U$ of the Möbius group Möb (isomorphic to $PSU(1,1)$) on $\mathcal{H}$ such that
$$U(g)\mathcal{A}(I)U(g)^* = \mathcal{A}(gI), \quad g \in \text{Möb},\ I \in \mathcal{I}.$$

D. Positivity of the energy. The generator of the one-parameter rotation subgroup of $U$ (conformal Hamiltonian), denoted by $L_0$ in the following, is positive.

E. Existence of the vacuum. There exists a unit $U$-invariant vector $\Omega \in \mathcal{H}$ (vacuum vector), and $\Omega$ is cyclic for the von Neumann algebra $\bigvee_{I\in\mathcal{I}} \mathcal{A}(I)$.

By the Reeh-Schlieder theorem $\Omega$ is cyclic and separating for every fixed $\mathcal{A}(I)$. The modular objects associated with $(\mathcal{A}(I), \Omega)$ have a geometric meaning:
$$\Delta_I^{it} = U(\Lambda_I(2\pi t)), \qquad J_I = U(r_I).$$
Here $\Lambda_I$ is a canonical one-parameter subgroup of Möb and $U(r_I)$ is an antiunitary acting geometrically on $\mathcal{A}$ as a reflection $r_I$ on $S^1$. This implies Haag duality:
$$\mathcal{A}(I)' = \mathcal{A}(I'), \quad I \in \mathcal{I},$$

where $I'$ is the interior of $S^1 \setminus I$.

F. Irreducibility. $\bigvee_{I\in\mathcal{I}} \mathcal{A}(I) = B(\mathcal{H})$. Indeed $\mathcal{A}$ is irreducible iff $\Omega$ is the unique $U$-invariant vector (up to scalar multiples). Also $\mathcal{A}$ is irreducible iff the local von Neumann algebras $\mathcal{A}(I)$ are factors. In this case they are either $\mathbb{C}$ or type $\mathrm{III}_1$ factors with a separable predual in Connes' classification of type III factors.

By a conformal net (or diffeomorphism covariant net) $\mathcal{A}$ we shall mean a Möbius covariant net such that the following holds:

G. Conformal covariance. There exists a projective unitary representation $U$ of $\mathrm{Diff}(S^1)$ on $\mathcal{H}$ extending the unitary representation of Möb such that for all $I \in \mathcal{I}$ we have
$$U(\varphi)\mathcal{A}(I)U(\varphi)^* = \mathcal{A}(\varphi.I), \quad \varphi \in \mathrm{Diff}(S^1),$$
$$U(\varphi)xU(\varphi)^* = x, \quad x \in \mathcal{A}(I),\ \varphi \in \mathrm{Diff}(I'),$$
where $\mathrm{Diff}(S^1)$ denotes the group of smooth, positively oriented diffeomorphisms of $S^1$ and $\mathrm{Diff}(I)$ the subgroup of diffeomorphisms $\varphi$ such that $\varphi(z) = z$ for all $z \in I'$. Note that by Haag duality we have $U(\varphi) \in \mathcal{A}(I)$, $\forall \varphi \in \mathrm{Diff}(I)$. Hence the following definition makes sense:

Definition 2.1. If $\mathcal{A}$ is a conformal net, the Virasoro subnet of $\mathcal{A}$, denoted by $\mathrm{Vir}_{\mathcal{A}}$, is defined as follows: for each interval $I \in \mathcal{I}$, $\mathrm{Vir}_{\mathcal{A}}(I)$ is the von Neumann algebra generated by $U(\varphi) \in \mathcal{A}(I)$, $\forall \varphi \in \mathrm{Diff}(I)$.

A (DHR) representation $\pi$ of $\mathcal{A}$ on a Hilbert space $\mathcal{H}$ is a map $I \in \mathcal{I} \mapsto \pi_I$ that associates to each $I$ a normal representation of $\mathcal{A}(I)$ on $B(\mathcal{H})$ such that
$$\pi_{\tilde I}|_{\mathcal{A}(I)} = \pi_I, \quad I \subset \tilde I,\quad I, \tilde I \in \mathcal{I}.$$


F. Xu

$\pi$ is said to be Möbius (resp. diffeomorphism) covariant if there is a projective unitary representation $U_\pi$ of Möb (resp. $\mathrm{Diff}(S^1)$) on $\mathcal{H}$ such that
$$\pi_{gI}(U(g)xU(g)^*) = U_\pi(g)\pi_I(x)U_\pi(g)^*$$
for all $I \in \mathcal{I}$, $x \in \mathcal{A}(I)$ and $g \in$ Möb (resp. $g \in \mathrm{Diff}(S^1)$). By definition the irreducible conformal net is in fact an irreducible representation of itself and we will call this representation the vacuum representation.

Let $G$ be a simply connected compact Lie group. By Th. 3.2 of [13], the vacuum positive energy representation of the loop group $LG$ (cf. [32]) at level $k$ gives rise to an irreducible conformal net denoted by $\mathcal{A}_{G_k}$. By Th. 3.3 of [13], every irreducible positive energy representation of the loop group $LG$ at level $k$ gives rise to an irreducible covariant representation of $\mathcal{A}_{G_k}$.

Given an interval $I$ and a representation $\pi$ of $\mathcal{A}$, there is an endomorphism $\rho$ of $\mathcal{A}$ localized in $I$ equivalent to $\pi$; namely $\rho$ is a representation of $\mathcal{A}$ on the vacuum Hilbert space $\mathcal{H}$, unitarily equivalent to $\pi$, such that $\rho_{I'} = \mathrm{id}|_{\mathcal{A}(I')}$.

We now define the statistics. Given the endomorphism $\rho$ of $\mathcal{A}$ localized in $I \in \mathcal{I}$, choose an equivalent endomorphism $\rho_0$ localized in an interval $I_0 \in \mathcal{I}$ with $\bar I_0 \cap \bar I = \emptyset$ and let $u$ be a local intertwiner in $\mathrm{Hom}(\rho, \rho_0)$, namely $u \in \mathrm{Hom}(\rho_{\tilde I}, \rho_{0,\tilde I})$ with $I_0$ following clockwise $I$ inside $\tilde I$, which is an interval containing both $I$ and $I_0$. The statistics operator $\varepsilon(\rho, \rho) := u^*\rho(u) = u^*\rho_{\tilde I}(u)$ belongs to $\mathrm{Hom}(\rho_{\tilde I}^2, \rho_{\tilde I}^2)$. We will call $\varepsilon(\rho, \rho)$ the positive or right braiding and $\tilde\varepsilon(\rho, \rho) := \varepsilon(\rho, \rho)^*$ the negative or left braiding.

The statistics parameter $\lambda_\rho$ can be defined in general. In particular, assume $\rho$ to be localized in $I$ and $\rho_I \in \mathrm{End}(\mathcal{A}(I))$ to be irreducible with a conditional expectation $E : \mathcal{A}(I) \to \rho_I(\mathcal{A}(I))$, then $\lambda_\rho := E(\varepsilon)$ depends only on the sector of $\rho$. The statistical dimension $d_\rho$ and the univalence $\omega_\rho$ are then defined by
$$d_\rho = |\lambda_\rho|^{-1}, \qquad \omega_\rho = \frac{\lambda_\rho}{|\lambda_\rho|}.$$
The conformal spin-statistics theorem (cf. [17]) shows that
$$\omega_\rho = e^{i2\pi L_0(\rho)},$$
where $L_0(\rho)$ is the conformal Hamiltonian (the generator of the rotation subgroup) in the representation $\rho$. The right-hand side in the above equality is called the univalence of $\rho$.

Let $\{[\lambda], \lambda \in \mathcal{L}\}$ be a finite set of all equivalence classes of irreducible, covariant, finite-index representations of an irreducible local conformal net $\mathcal{A}$. We will denote the conjugate of $[\lambda]$ by $[\bar\lambda]$ and the identity sector (corresponding to the vacuum representation) by $[1]$ if no confusion arises, and let $N_{\lambda\mu}^{\nu} = \langle [\lambda][\mu], [\nu] \rangle$. Here $\langle \mu, \nu \rangle$ denotes the dimension of the space of intertwiners from $\mu$ to $\nu$ (denoted by $\mathrm{Hom}(\mu, \nu)$). We will denote by $\{T_e\}$ a basis of isometries in $\mathrm{Hom}(\nu, \lambda\mu)$. The univalence of $\lambda$ and the statistical dimension of $\lambda$ (cf. §2 of [17]) will be denoted by $\omega_\lambda$ and $d(\lambda)$ (or $d_\lambda$) respectively. The following equation is called a monodromy equation (cf. [33]):
$$\varepsilon(\mu, \lambda)\varepsilon(\lambda, \mu)T_e = \frac{\omega_\nu}{\omega_\lambda \omega_\mu} T_e, \qquad (1)$$
where $\varepsilon(\mu, \lambda)$ is the unitary braiding operator. We make the following definitions for convenience:


Definition 2.2. Let λ, µ be (not necessarily irreducible) representations of A. H (λ, µ) := ε(λ, µ)ε(µ, λ). We say that λ is local with µ if H (λ, µ) = 1. Definition 2.3. Let be a set of DHR representations of A. If is an abelian group with multiplication given by composition and dλ = 1, ωλ = 1, ∀λ ∈ , then is called a local system of automorphisms. The following Lemma will be useful to check if a set is a local system of automorphims. Lemma 2.4. (1) Assume that [µ] = 1≤i≤n [µi ] and λ, µi , i = 1, . . . , n are representations of A. Then H (λ, µ) = 1 if and only if H (λ, µi ) = 1 for all 1 ≤ i ≤ n; (2) If H (λ, µ) = 1 and H (λ, ν) = 1, then H (λ, µν) = 1 (3) If λ1 , ..., λn generate a finite abelian group under composition, ωλi = 1, 1 ≤ i ≤ n, and H (λi , λ j ) = 1, 1 ≤ i, j ≤ n, then is a local system of automorphisms. Proof. (1) and (2) follows from [34] or Lemma 3.8 of [5]. As for (3), we prove by induction on n. If n = 1, then ε(λ1 , λ1 ) = ωλ1 = 1 since ε(λ1 , λ1 ) is a scalar, and it 2 follows that ωλi = ε(λ1 , λ1 )i = 1, ∀i ≥ 1. 1 Assume that (3) has been proved for n −1. Let µ be in the abelian group generated by λ1 , ..., λn−1 . Since for any integer k H (µ, λkn ) = 1 by (2) and assumption, by repeatedly applying (2) and the monodromy equation, we have ωµλkn = ωµ ωλkn = 1 by induction hypotheses. It follows that (3) is proved. Next we recall some definitions from [24] . Recall that I denotes the set of intervals of S 1 . Let I1 , I2 ∈ I. We say that I1 , I2 are disjoint if I¯1 ∩ I¯2 = ∅, where I¯ is the closure of I in S 1 . When I1 , I2 are disjoint, I1 ∪ I2 is called a 1-disconnected interval in [42]. Denote by I2 the set of unions of disjoint 2 elements in I. Let A be an irreducible Möbius covariant net . For E = I1 ∪ I2 ∈ I2 , let I3 ∪ I4 be the interior of the complement of I1 ∪ I2 in S 1 where I3 , I4 are disjoint intervals. Let ˆ A(E) := A(I1 ) ∨ A(I2 ), A(E) := (A(I3 ) ∨ A(I4 )) . ˆ Note that A(E) ⊂ A(E). 
Recall that a net A is split if A(I1 ) ∨ A(I2 ) is naturally isomorphic to the tensor product of von Neumann algebras A(I1 ) ⊗ A(I2 ) for any disjoint intervals I1 , I2 ∈ I. A is strongly additive if A(I1 ) ∨ A(I2 ) = A(I ), where I1 ∪ I2 is obtained by removing an interior point from I . Definition 2.5. [24,30]. A Möbius covariant net A is said to be completely rational if A ˆ is split, strongly additive, and the index [A(E) : A(E)] is finite for some E ∈ I2 . The ˆ value of the index [A(E) : A(E)] (it is independent of E by Prop. 5 of [24]) is denoted by µA and is called the µ-index of A . Note that, by results in [30], every irreducible, split, local conformal net with finite µ-index is automatically strongly additive. Also note that if A is completely rational, then A has only finitely many irreducible covariant representations by [24]. Definition 2.6. A Möbius net A is called holomorphic if A is completely rational and µA = 1, i.e., A has only one irreducible representation which is the vacuum representation.


Let B be a Möbius (resp. conformal) net. B is called a Möbius (resp. conformal) extension of A if there is a map I ∈ I → A(I ) ⊂ B(I ) that associates to each interval I ∈ I a von Neumann subalgebra A(I ) of B(I ), which is isotonic A(I1 ) ⊂ A(I2 ),

I1 ⊂ I2 ,

and Möbius (resp. diffeomorphism) covariant with respect to the representation U , namely U (g)A(I )U (g)∗ = A(g.I ) for all g ∈ Möb (resp. g ∈ Diff(S 1 )) and I ∈ I. A will be called a Möbius (resp. conformal) subnet of B. Note that by Lemma 13 of [28] for each I ∈ I there exists a conditional expectation E I : B(I ) → A(I ) such that E preserves the vector state given by the vacuum of B. Definition 2.7. Let A be a Möbius covariant net. A Möbius covariant net B on a Hilbert space H is an extension of A if there is a DHR representation π of A on H such that π(A) ⊂ B is a Möbius subnet. The extension is irreducible if π(A(I )) ∩ B(I ) = C for some (and hence all) interval I , and is of finite index if π(A(I )) ⊂ B(I ) has finite index for some (and hence all) interval I . The index will be called the index of the inclusion π(A) ⊂ Band will be denoted by [B : A]. If π as representation of A decomposes as [π ] = λ m λ [λ] where m λ are non-negative integers and λ are irreducible DHR representations of A, we say that [π ] = λ m λ [λ] is the spectrum of the extension. For simplicity we will write π(A) ⊂ B simply as A ⊂ B. Lemma 2.8. If A is completely rational, and a Möbius covariant net B is an irreducible extension of A. Then A ⊂ B has finite index, B is completely rational and µA = µB [B : A]2 . Proof. A ⊂ B has finite index follows from Prop. 2.3 of [22], and the rest follows from Prop. 24 of [24]. Lemma 2.9. If A is a conformal net, and a Möbius covariant net B is an extension of A with index [B : A] < ∞. Then B is a conformal net. Proof. Denote by π the vacuum representation of B. Denote by G the universal cover of Möb. By definition g ∈ G → Uπ (g) is a representation of G which implements the Möbius covariance of π A. On the other hand by §2 of [2] there is a representation of g ∈ G → Vπ (g) which implements the Möbius covariance of π A, and Vπ (g) ∈ I ∈I π(Vir A (I )), where Vir A is defined in Definition 2.1. Since by assumption π A has finite index, by Prop. 
2.2 of [17] we have Uπ (g) = Vπ (g), ∀g ∈ G. Hence Vir A ⊂ B verifies the condition in Definition 3.1 of [9], and by Prop. 3.7 of [9] the lemma is proved. The following is Th. 4.9 of [29] (cf. §2.4 of [22]) which is also used in §4.2 of [22]:


Proposition 2.10. Let $\mathcal{A}$ be a Möbius covariant net, $\rho$ a DHR representation of $\mathcal{A}$ localized on a fixed $I_0$ with finite statistics, which contains id with multiplicity one, i.e., there is (unique up to a phase) isometry $w \in \mathrm{Hom}(\mathrm{id}, \rho)$. Then there is a Möbius covariant net $\mathcal{B}$ which is an irreducible extension of $\mathcal{A}$ if and only if there is an isometry $w_1 \in \mathrm{Hom}(\rho, \rho^2)$ which solves the following equations:
$$w_1^* w = w_1^* \rho(w) \in \mathbb{R}_+, \qquad (2)$$
$$w_1 w_1 = \rho(w_1) w_1, \qquad (3)$$
$$\varepsilon(\rho, \rho) w_1 = w_1. \qquad (4)$$

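Remark 2.11. Let $\mathcal{A} \subset \mathcal{B}$ be as in Prop. 2.10. If $U$ is a unitary on the vacuum representation space of $\mathcal{A}$ such that $\mathrm{Ad}_U \mathcal{A}(I) = \mathcal{A}(I)$, $\forall I$, then it is easy to check that $(\mathrm{Ad}_U \rho\, \mathrm{Ad}_{U^*}, \mathrm{Ad}_U(w_1), \mathrm{Ad}_U(w))$ verifies the equations in Prop. 2.10, and determines a Möbius covariant net $\mathrm{Ad}_U(\mathcal{B})$. The spectrum of $\mathcal{A} \subset \mathrm{Ad}_U(\mathcal{B})$ (cf. Definition 2.7) is $\mathrm{Ad}_U \rho\, \mathrm{Ad}_{U^*}$, which may be different from $\rho$, but $\mathrm{Ad}_U(\mathcal{B})$ is isomorphic to $\mathcal{B}$ by definition.

2.3. Induction. Let $\mathcal{B}$ be a Möbius covariant net and $\mathcal{A}$ a subnet. We assume that $\mathcal{A}$ is strongly additive and $\mathcal{A} \subset \mathcal{B}$ has finite index. Fix an interval $I_0 \in \mathcal{I}$ and a canonical endomorphism (cf. [29]) $\gamma$ associated with $\mathcal{A}(I_0) \subset \mathcal{B}(I_0)$. Then we can choose for each $I \in \mathcal{I}$ with $I \supset I_0$ a canonical endomorphism $\gamma_I$ of $\mathcal{B}(I)$ into $\mathcal{A}(I)$ in such a way that $\gamma_I|_{\mathcal{B}(I_0)} = \gamma_{I_0}$ and $\rho_{I_1}$ is the identity on $\mathcal{A}(I_1)$ if $I_1 \in \mathcal{I}$ is disjoint from $I_0$, where $\rho_I \equiv \gamma_I|_{\mathcal{A}(I)}$. Given a DHR endomorphism $\lambda$ of $\mathcal{A}$ localized in $I_0$, the inductions $\alpha_\lambda, \alpha_\lambda^-$ of $\lambda$ are the endomorphisms of $\mathcal{B}(I_0)$ given by
$$\alpha_\lambda \equiv \gamma^{-1} \cdot \mathrm{Ad}\,\varepsilon(\lambda, \rho) \cdot \lambda \cdot \gamma, \qquad \alpha_\lambda^- \equiv \gamma^{-1} \cdot \mathrm{Ad}\,\tilde\varepsilon(\lambda, \rho) \cdot \lambda \cdot \gamma,$$
where $\varepsilon$ (resp. $\tilde\varepsilon$) denotes the right braiding (resp. left braiding) (cf. Cor. 3.2 of [3]). In [45] a slightly different endomorphism was introduced and the relation between the two was given in §2.1 of [43]. Note that $\mathrm{Hom}(\alpha_\lambda, \alpha_\mu) =: \{x \in \mathcal{B}(I_0) \mid x\alpha_\lambda(y) = \alpha_\mu(y)x, \forall y \in \mathcal{B}(I_0)\}$ and $\mathrm{Hom}(\lambda, \mu) =: \{x \in \mathcal{A}(I_0) \mid x\lambda(y) = \mu(y)x, \forall y \in \mathcal{A}(I_0)\}$. The following is Lemma 3.6 of [5] and Lemma 3.5 of [3]:

Lemma 2.12. $\mathrm{Hom}(\alpha_\lambda, \alpha_\mu^-) = \{T \in \mathcal{B}(I_0) \mid \gamma(T) \in \mathrm{Hom}(\rho\lambda, \rho\mu),\ \varepsilon(\mu, \rho)\varepsilon(\rho, \mu)\gamma(T) = \gamma(T)\}$.

As a consequence of Lemma 2.12 we have the following Prop. 3.23 of [3] (also cf. the proof of Lemma 3.2 of [45]):

Lemma 2.13. $[\alpha_\lambda] = [\alpha_\lambda^-]$ iff $\varepsilon(\lambda, \rho)\varepsilon(\rho, \lambda) = 1$.

The following follows from Lemma 3.4 and Th. 3.3 of [45] (also cf. [3]):

Lemma 2.14. (1) $[\lambda] \to [\alpha_\lambda]$, $[\lambda] \to [\alpha_\lambda^-]$ are ring homomorphisms; (2) $\langle \alpha_\lambda, \alpha_\mu \rangle = \langle \lambda\rho, \mu \rangle$.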


2.4. Local simple current extensions. Proposition 2.15. (1) Assume that B is a Möbius extension of A of finite index with spectrum [π ] = λ∈exp m λ [λ]. Let := {λ|λ ∈ exp}. Assume that dλ = 1, ∀λ ∈ exp. Then is a local system of automorphisms; (2) If is a finite local system of automorphisms of A, then there is a Möbius extension B of A with spectrum [π ] = λ∈ [λ]. Proof. Ad (1): By assumption we have αλ 1, ∀λ ∈ . By Lemma 3.10 of [5] ωλ = 1. Since dλ = dαλ = 1, it follows that [αλ ] = [αλ− ] = [1]. Note that if λ ∈ iff [αλ ] = [1] and it follows that is an abelian group with multiplication given by composition. By Lemma 2.13 and Lemma 2.4 (1) is proved. (2) It follows from Prop. 5.5 of [34] (also. cf Th. 5.2 of [12]) that there is a Möbius extension B of A with spectrum [π ] = λ∈ [λ]. Remark 2.16. (1) We will use the notation B = A for the extension in Prop. 2.15. (2) One can extend the above theorem to a case when B is not local but verifies twisted locality. Such extensions have been used for example in [25].

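2.5. Mirror extensions. In this section we recall the mirror construction as given in §3 of [41]. Let $\mathcal{B}$ be a completely rational net and $\mathcal{A} \subset \mathcal{B}$ be a subnet which is also completely rational.

Definition 2.17. Define a subnet $\tilde{\mathcal{A}} \subset \mathcal{B}$ by $\tilde{\mathcal{A}}(I) := \mathcal{A}(I)' \cap \mathcal{B}(I)$, $\forall I \in \mathcal{I}$.

We note that since $\mathcal{A}$ is completely rational, it is strongly additive and so we have $\tilde{\mathcal{A}}(I) = (\vee_{J \in \mathcal{I}} \mathcal{A}(J))' \cap \mathcal{B}(I)$, $\forall I \in \mathcal{I}$. The following lemma then follows directly from the definition:

Lemma 2.18. The restriction of $\tilde{\mathcal{A}}$ on the Hilbert space $\overline{\vee_I \tilde{\mathcal{A}}(I)\Omega}$ is an irreducible Möbius covariant net.

The net $\tilde{\mathcal{A}}$ as in Lemma 2.18 will be called the coset of $\mathcal{A} \subset \mathcal{B}$. See [44] for a class of cosets from loop groups. The following definition generalizes the definition in §3 of [44]:

Definition 2.19. $\mathcal{A} \subset \mathcal{B}$ is called cofinite if the inclusion $\tilde{\mathcal{A}}(I) \vee \mathcal{A}(I) \subset \mathcal{B}(I)$ has finite index for some interval $I$.

The following is Prop. 3.4 of [41]:

Proposition 2.20. Let $\mathcal{B}$ be completely rational, and let $\mathcal{A} \subset \mathcal{B}$ be a Möbius subnet which is also completely rational. Then $\mathcal{A} \subset \mathcal{B}$ is cofinite if and only if $\tilde{\mathcal{A}}$ is completely rational.

Let $\mathcal{B}$ be completely rational, and let $\mathcal{A} \subset \mathcal{B}$ be a Möbius subnet which is also completely rational. Assume that $\mathcal{A} \subset \mathcal{B}$ is cofinite. We will use $\sigma_i, \sigma_j, \ldots$ (resp. $\lambda, \mu, \ldots$) to label irreducible DHR representations of $\mathcal{B}$ (resp. $\mathcal{A}$) localized on a fixed interval $I_0$. Since $\tilde{\mathcal{A}}$ is completely rational by Prop. 2.20, $\tilde{\mathcal{A}} \otimes \mathcal{A}$ is completely rational, and so every irreducible DHR representation $\sigma_i$ of $\mathcal{B}$, when restricting to $\tilde{\mathcal{A}} \otimes \mathcal{A}$, decomposes as a direct sum of representations of $\tilde{\mathcal{A}} \otimes \mathcal{A}$ of the form $(i, \lambda) \otimes \lambda$ by Lemma 27 of [24]. Here $(i, \lambda)$ is a DHR representation of $\tilde{\mathcal{A}}$ which may not be irreducible, and we use the tensor notation $(i, \lambda) \otimes \lambda$ to represent a DHR representation of $\tilde{\mathcal{A}} \otimes \mathcal{A}$ which is localized on $I_0$ and defined by
$$(i, \lambda) \otimes \lambda(x_1 \otimes x_2) = (i, \lambda)(x_1) \otimes \lambda(x_2), \quad \forall x_1 \otimes x_2 \in \tilde{\mathcal{A}}(I_0) \otimes \mathcal{A}(I_0).$$
We will also identify $\tilde{\mathcal{A}}$ and $\mathcal{A}$ as subnets of $\tilde{\mathcal{A}} \otimes \mathcal{A}$ in the natural way. We note that when no confusion arises, we will use 1 to denote the vacuum representation of a net.

Definition 2.21. A Möbius subnet $\mathcal{A} \subset \mathcal{B}$ is normal if $\tilde{\mathcal{A}}(I)' \cap \mathcal{B}(I) = \mathcal{A}(I)$ for some $I$.

The following is implied by Lemma 3.4 of [35] (also cf. p. 797 of [46]):

Lemma 2.22. Let $\mathcal{B}$ be completely rational, and let $\mathcal{A} \subset \mathcal{B}$ be a Möbius subnet which is also completely rational. Assume that $\mathcal{A} \subset \mathcal{B}$ is cofinite. Then the following conditions are equivalent: (1) $\mathcal{A} \subset \mathcal{B}$ is normal; (2) $(1, 1)$ is the vacuum representation of $\tilde{\mathcal{A}}$ and $(1, \lambda)$ contains $(1, 1)$ if and only if $\lambda = 1$.

The following is part of Proposition 3.7 of [41]:

Proposition 2.23. Let $\mathcal{B}$ be completely rational, and let $\mathcal{A} \subset \mathcal{B}$ be a Möbius subnet which is also completely rational. Assume that $\mathcal{A} \subset \mathcal{B}$ is cofinite and normal. Then:
(1) Let $\gamma$ be the restriction of the vacuum representation of $\mathcal{B}$ to $\tilde{\mathcal{A}} \otimes \mathcal{A}$. Then $[\gamma] = \sum_{\lambda \in \exp} [(1, \lambda) \otimes \lambda]$, where each $(1, \lambda)$ is irreducible;
(2) Let $\lambda \in \exp$ be as in (1), then $[\alpha_{(1,\lambda) \otimes 1}] = [\alpha_{1 \otimes \bar\lambda}]$, and $[\lambda] \to [\alpha_{1 \otimes \lambda}]$ is a ring isomorphism where the $\alpha$-induction is with respect to $\tilde{\mathcal{A}} \otimes \mathcal{A} \subset \mathcal{B}$ as in Subsect. 2.3; moreover the set exp is closed under fusion;
(3) Let $[\rho] = \sum_{\lambda \in \exp} m_\lambda [\lambda]$, where $m_\lambda = m_{\bar\lambda} \geq 0$, $\forall \lambda$, and $[(1, \rho)] = \sum_{\lambda \in \exp} m_\lambda [(1, \lambda)]$. Then there exists a unitary element $T_\rho \in \mathrm{Hom}(\alpha_{(1,\rho) \otimes 1}, \alpha_{1 \otimes \rho})$ such that
$$\varepsilon((1, \rho), (1, \rho))\, T_\rho^*\, \alpha_{1 \otimes \rho}(T_\rho^*) = T_\rho^*\, \alpha_{1 \otimes \rho}(T_\rho^*)\, \tilde\varepsilon(\rho, \rho);$$
(4) Let $\rho, (1, \rho)$ be as in (3). Then $\mathrm{Hom}(\rho^n, \rho^m) = \mathrm{Hom}(\alpha_{1 \otimes \rho^n}, \alpha_{1 \otimes \rho^m})$, $\mathrm{Hom}((1, \rho)^n, (1, \rho)^m) = \mathrm{Hom}(\alpha_{(1,\rho)^n \otimes 1}, \alpha_{(1,\rho)^m \otimes 1})$, $\forall n, m \in \mathbb{N}$.

Denote by $\Delta_0 := \{\lambda \mid [\lambda] = \sum_i [\lambda_i], \lambda_i \in \exp\}$. Assume $\mu_i \in \Delta_0$, $i = 1, \ldots, n$. For each $[\mu_i] = \sum_j m_{ij} [\lambda_j]$, choose DHR representations $M(\mu_i)$ of $\tilde{\mathcal{A}}$ such that $[M(\mu_i)] = \sum_j m_{ij} [(1, \lambda_j)]$. Let $T_i \in \mathrm{Hom}(\alpha_{M(\mu_i) \otimes 1}, \alpha_{1 \otimes \mu_i})$ be a unitary element (not necessarily unique up to phase when $\mu_i$ is not irreducible) as given in (3) of Prop. 2.23. Define
$$T_{i_1 i_2 \ldots i_k} := \alpha_{\mu_1 \cdots \mu_{k-1} \otimes 1}(T_{i_k}) \cdots \alpha_{\mu_1 \mu_2 \otimes 1}(T_{i_3})\, \alpha_{\mu_1 \otimes 1}(T_{i_2})\, T_{i_1} \in \mathrm{Hom}(\alpha_{M(\mu_1) \cdots M(\mu_k) \otimes 1}, \alpha_{1 \otimes \mu_1 \cdots \mu_k}).$$
For each $S \in \mathrm{Hom}(\mu_{i_1} \cdots \mu_{i_k}, \mu_{j_1} \cdots \mu_{j_m})$ we define $M(S) := T_{j_1 \ldots j_m}^*\, S\, T_{i_1 \ldots i_k}$.

Lemma 2.24. Assume that $S_1, T \in \mathrm{Hom}(\lambda, \mu)$, $S_2 \in \mathrm{Hom}(\nu, \lambda)$, where $\lambda, \mu$ are products of elements from $\{\mu_1, \ldots, \mu_n\}$. If $\nu = \mu_{i_1} \cdots \mu_{i_k}$ we denote $M(\mu_{i_1}) \cdots M(\mu_{i_k})$ by $M(\nu)$. Then:
$$M(S_1 S_2) = M(S_1) M(S_2), \quad M(\nu(T)) = M(\nu)(M(T)), \quad M(\varepsilon(\mu_i, \mu_j)) = \tilde\varepsilon(M(\mu_i), M(\mu_j)).$$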


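Proof. The first two identities follow directly from definitions. The third follows from (3) of Prop. 2.3.1 of [43], as (3) of Prop. 2.23.

The following is Th. 3.8 of [41]:

Theorem 2.25. Let $\mathcal{B}$ be completely rational, and let $\mathcal{A} \subset \mathcal{B}$ be a Möbius subnet which is also completely rational. Assume that $\mathcal{A} \subset \mathcal{B}$ is cofinite and normal, and let exp be as in (1) of Prop. 2.23. Assume that $\mathcal{A} \subset \mathcal{C}$ is an irreducible Möbius extension of $\mathcal{A}$ with spectrum $[\rho] = \sum_{\lambda \in \exp} m_\lambda [\lambda]$, $m_\lambda \geq 0$. Then there is an irreducible Möbius extension $\tilde{\mathcal{C}}$ of $\tilde{\mathcal{A}}$ with spectrum $[(1, \rho)] = \sum_{\lambda \in \exp} m_\lambda [(1, \lambda)]$. Moreover $\tilde{\mathcal{C}}$ is completely rational.

Remark 2.26. Due to (5) of Prop. 3.7 of [41], the extension $\tilde{\mathcal{A}} \subset \tilde{\mathcal{C}}$ as given in Th. 3.3 will be called the mirror or the conjugate of $\mathcal{A} \subset \mathcal{C}$.

By Lemma 2.8 and Th. 2.25 we have:

Corollary 2.27. Let $\tilde{\mathcal{C}}$ be the mirror extension as given in Th. 2.25. Then
$$\frac{\mu_{\tilde{\mathcal{C}}}}{\mu_{\tilde{\mathcal{A}}}} = \frac{\mu_{\mathcal{C}}}{\mu_{\mathcal{A}}}.$$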
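The identity in Corollary 2.27 can be unpacked in one line. The following is only a sketch under two assumptions that are not spelled out in the text above: that the index of an irreducible extension equals the statistical dimension of its spectrum, and that $d_{(1,\lambda)} = d_\lambda$ (which is suggested by $[\alpha_{(1,\lambda)\otimes 1}] = [\alpha_{1\otimes\bar\lambda}]$ in Prop. 2.23, since $\alpha$-induction preserves dimensions).

```latex
% Lemma 2.8 applied to both extensions:
\mu_{\mathcal{A}} = \mu_{\mathcal{C}}\,[\mathcal{C}:\mathcal{A}]^{2},
\qquad
\mu_{\tilde{\mathcal{A}}} = \mu_{\tilde{\mathcal{C}}}\,[\tilde{\mathcal{C}}:\tilde{\mathcal{A}}]^{2}.
% Assumed: index = dimension of the spectrum, and d_{(1,\lambda)} = d_\lambda:
[\tilde{\mathcal{C}}:\tilde{\mathcal{A}}]
  = \sum_{\lambda\in\exp} m_\lambda\, d_{(1,\lambda)}
  = \sum_{\lambda\in\exp} m_\lambda\, d_{\lambda}
  = [\mathcal{C}:\mathcal{A}].
% Dividing the two instances of Lemma 2.8:
\frac{\mu_{\tilde{\mathcal{C}}}}{\mu_{\tilde{\mathcal{A}}}}
  = \frac{1}{[\tilde{\mathcal{C}}:\tilde{\mathcal{A}}]^{2}}
  = \frac{1}{[\mathcal{C}:\mathcal{A}]^{2}}
  = \frac{\mu_{\mathcal{C}}}{\mu_{\mathcal{A}}}.
```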

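The mirror extension $\tilde{\mathcal{A}} \subset \tilde{\mathcal{C}}$ is constructed as follows: let $(\rho, w, w_1)$ be associated with the extension $\mathcal{A} \subset \mathcal{C}$ as given in Prop. 2.10. Then the extension $\tilde{\mathcal{A}} \subset \tilde{\mathcal{C}}$ is given by $(M(\rho), M(w), M(w_1))$, where the map $M$ is defined before Lemma 2.24. Let $\mu, \nu \in \Delta_0$. Consider now inductions with respect to $\mathcal{A} \subset \mathcal{C}$ and $\tilde{\mathcal{A}} \subset \tilde{\mathcal{C}}$.

Proposition 2.28. Assume that $\mu, \nu \in \Delta_0$, $M(\rho) = (1, \rho)$, $M(\mu) = (1, \mu)$, $M(\nu) = (1, \nu)$. Then
$$\langle \alpha_\mu, \alpha_\nu \rangle = \langle \alpha_{M(\mu)}, \alpha_{M(\nu)} \rangle, \qquad \langle \alpha_\mu, \alpha_\nu^- \rangle = \langle \alpha_{M(\mu)}, \alpha_{M(\nu)}^- \rangle.$$

Proof. By Lemma 2.14 we have
$$\langle \alpha_\mu, \alpha_\nu \rangle = \langle \rho\mu, \nu \rangle, \qquad \langle \alpha_{M(\mu)}, \alpha_{M(\nu)} \rangle = \langle M(\rho)M(\mu), M(\nu) \rangle.$$
By Prop. 2.23 we have proved the first equality. By Lemma 2.12 we have
$$\mathrm{Hom}(\alpha_\mu, \alpha_\nu^-) = \{T \in \mathcal{C}(I_0) \mid \gamma(T) \in \mathrm{Hom}(\rho\mu, \rho\nu),\ \varepsilon(\nu, \rho)\varepsilon(\rho, \nu)\gamma(T) = \gamma(T)\}.$$
By [29], $\gamma(\mathcal{C}(I_0)) = \{x \in \mathcal{A}(I_0) \mid x = w_1^* \rho(x) w_1\}$. It follows that $\langle \alpha_\mu^+, \alpha_\nu^- \rangle$ is equal to the dimension of the following vector space:
$$\{T \in \mathcal{A}(I_0) \mid T \in \mathrm{Hom}(\rho\mu, \rho\nu),\ \varepsilon(\nu, \rho)\varepsilon(\rho, \nu)T = T,\ T = w_1^* \rho(T) w_1\}.$$
Now apply the map $M$ to the above vector space; using Lemma 2.24 we have that $\langle \alpha_\mu^+, \alpha_\nu^- \rangle$ is equal to the dimension of the following vector space:
$$\{T \in \tilde{\mathcal{A}}(I_0) \mid T \in \mathrm{Hom}(M(\rho)M(\mu), M(\rho)M(\nu)),\ \tilde\varepsilon(M(\nu), M(\rho))\tilde\varepsilon(M(\rho), M(\nu))T = T,\ T = M(w_1)^* M(\rho)(T) M(w_1)\}.$$
Since $\tilde\varepsilon(M(\nu), M(\rho))\tilde\varepsilon(M(\rho), M(\nu)) = (\varepsilon(M(\nu), M(\rho))\varepsilon(M(\rho), M(\nu)))^*$, we conclude that $\langle \alpha_\mu^+, \alpha_\nu^- \rangle$ is equal to the dimension of the following vector space:
$$\{T \in \tilde{\mathcal{A}}(I_0) \mid T \in \mathrm{Hom}(M(\rho)M(\mu), M(\rho)M(\nu)),\ \varepsilon(M(\nu), M(\rho))\varepsilon(M(\rho), M(\nu))T = T,\ T = M(w_1)^* M(\rho)(T) M(w_1)\},$$
which is equal to $\langle \alpha_{M(\mu)}, \alpha_{M(\nu)}^- \rangle$ by Lemma 2.12.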


2.6. A series of normal extensions. Let $G = SU(n)$. We denote by $LG$ the group of smooth maps $f : S^1 \to G$ under pointwise multiplication. The diffeomorphism group of the circle $\mathrm{Diff}\,S^1$ is naturally a subgroup of $\mathrm{Aut}(LG)$ with the action given by reparametrization. In particular the group of rotations $\mathrm{Rot}\,S^1 \simeq U(1)$ acts on $LG$. We will be interested in the projective unitary representations $\pi : LG \to U(H)$ that are both irreducible and have positive energy. This means that $\pi$ should extend to $LG \rtimes \mathrm{Rot}\,S^1$ so that $H = \oplus_{n \geq 0} H(n)$, where the $H(n)$ are the eigenspaces for the action of $\mathrm{Rot}\,S^1$, i.e., $r_\theta \xi = \exp(in\theta)\xi$ for $\xi \in H(n)$, and $\dim H(n) < \infty$ with $H(0) \neq 0$. It follows from [32] that for fixed level $k$ which is a positive integer, there are only a finite number of such irreducible representations indexed by the finite set
$$P_{++}^k = \Big\{ \lambda \in P \ \Big|\ \lambda = \sum_{i=1,\ldots,n-1} \lambda_i \Lambda_i,\ \lambda_i \geq 0,\ \sum_{i=1,\ldots,n-1} \lambda_i \leq k \Big\},$$
where $P$ is the weight lattice of $SU(n)$ and $\Lambda_i$ are the fundamental weights. We will write $\lambda = (\lambda_1, \ldots, \lambda_{n-1})$, $\lambda_0 = k - \sum_{1 \leq i \leq n-1} \lambda_i$ and refer to $\lambda_0, \ldots, \lambda_{n-1}$ as components of $\lambda$. We will use $k\Lambda_0$ or simply 1 to denote the trivial representation of $SU(n)$.

For $\lambda, \mu, \nu \in P_{++}^k$, define
$$N_{\lambda\mu}^{\nu} = \sum_{\delta \in P_{++}^k} \frac{S_\lambda^{(\delta)} S_\mu^{(\delta)} S_\nu^{(\delta^*)}}{S_{\Lambda_0}^{(\delta)}},$$
where $S_\lambda^{(\delta)}$ is given by the Kac-Peterson formula:
$$S_\lambda^{(\delta)} = c \sum_{w \in S_n} \varepsilon_w \exp(i w(\delta) \cdot \lambda\, 2\pi/n),$$
where $\varepsilon_w = \det(w)$ and $c$ is a normalization constant fixed by the requirement that $S_\mu^{(\delta)}$ is an orthonormal system. It is shown in [21], p. 288, that $N_{\lambda\mu}^{\nu}$ are non-negative integers. Moreover, define $Gr(C_k)$ to be the ring whose basis are elements of $P_{++}^k$ with structure constants $N_{\lambda\mu}^{\nu}$. The natural involution $*$ on $P_{++}^k$ is defined by $\lambda \to \lambda^* =$ the conjugate of $\lambda$ as representation of $SU(n)$.

We shall also denote $S_{\Lambda_0}^{(\Lambda)}$ by $S_1^{(\Lambda)}$. Define $d_\lambda = S_1^{(\lambda)} / S_1^{(\Lambda_0)}$. We shall call $(S_\nu^{(\delta)})$ the S-matrix of $LSU(n)$ at level $k$.

We shall encounter the $\mathbb{Z}_n$ group of automorphisms of this set of weights, generated by
$$J : \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_{n-1}) \to J(\lambda) = (k - 1 - \lambda_1 - \cdots - \lambda_{n-1}, \lambda_1, \ldots, \lambda_{n-2}).$$
We will identify $J$ with $k\Lambda_1$ in the following. Define $\mathrm{col}(\lambda) = \sum_i (\lambda_i - 1) i$. The central element $\exp\frac{2\pi i}{n}$ of $SU(n)$ acts on the representation of $SU(n)$ labeled by $\lambda$ as $\exp(\frac{2\pi i\, \mathrm{col}(\lambda)}{n})$. The value of $\mathrm{col}(\lambda)$ modulo $n$ will be called the color of $\lambda$.

The irreducible positive energy representations of $LSU(n)$ at level $k$ give rise to an irreducible conformal net $\mathcal{A}_{SU(n)_k}$ (cf. [26]) and its covariant representations. $\mathcal{A}_{SU(n)_k}$ is completely rational (cf. [40 and 42]), and $\mu_{\mathcal{A}_{SU(n)_k}} = \frac{1}{(S_{11})^2}$ by [42]. We will use $\lambda = (\lambda_1, \ldots, \lambda_{n-1})$ to denote irreducible representations of $\mathcal{A}$ and also the corresponding endomorphism of $M = \mathcal{A}(I)$. All the sectors $[\lambda]$ with $\lambda$ irreducible generate the fusion ring of $\mathcal{A}$.
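For readers who want to experiment with the $\mathbb{Z}_n$ action and the color, here is a minimal Python sketch. It is not taken from the paper: it assumes unshifted components $(\lambda_0, \ldots, \lambda_{n-1})$ with $\sum_i \lambda_i = k$, models $J$ as the cyclic shift that sends $\Lambda_i$ to $\Lambda_{i+1 \bmod n}$ (so that $J(k\Lambda_0) = k\Lambda_1$), and takes the color to be $\sum_i i\lambda_i$ modulo $n$. The helper names `components`, `J` and `color` are hypothetical.

```python
def components(n, indices):
    """Return (lambda_0, ..., lambda_{n-1}) for a sum of fundamental weights.

    `indices` lists the subscripts i of the Lambda_i in the sum, with
    repetition, e.g. [0, 3] stands for Lambda_0 + Lambda_3.
    """
    comp = [0] * n
    for i in indices:
        comp[i % n] += 1
    return tuple(comp)


def J(comp):
    """Cyclic shift Lambda_i -> Lambda_{i+1 mod n} (assumed convention)."""
    return (comp[-1],) + tuple(comp[:-1])


def color(comp):
    """Color as sum_i i * lambda_i modulo n (assumed unshifted convention)."""
    n = len(comp)
    return sum(i * c for i, c in enumerate(comp)) % n


def J_power(comp, times):
    """Apply J repeatedly."""
    for _ in range(times):
        comp = J(comp)
    return comp
```

With these conventions `J_power(components(10, [0, 3]), 5)` equals `components(10, [5, 8])`, and for $SU(9)$ at level 3 the weight $\Lambda_4 + \Lambda_6 + \Lambda_8$ has color 0 while $3\Lambda_1$ has color 3.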


For $\lambda$ irreducible, the univalence $\omega_\lambda$ is given by an explicit formula (cf. 9.4 of [PS]). Let us first define $h_\lambda = \frac{c_2(\lambda)}{k+n}$, where $c_2(\lambda)$ is the value of the Casimir operator on the representation of $SU(n)$ labeled by the dominant weight $\lambda$. $h_\lambda$ is usually called the conformal dimension. Then we have $\omega_\lambda = \exp(2\pi i h_\lambda)$. The conformal dimension of $\lambda = (\lambda_1, \ldots, \lambda_{n-1})$ is given by
$$h_\lambda = \frac{1}{2n(k+n)} \sum_{1 \leq i \leq n-1} i(n-i)\lambda_i^2 + \frac{1}{n(k+n)} \sum_{1 \leq j < i \leq n-1} j(n-i)\lambda_j \lambda_i + \frac{1}{2(k+n)} \sum_{1 \leq j \leq n-1} j(n-j)\lambda_j. \qquad (5)$$
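Formula (5) is easy to evaluate exactly. The sketch below is an illustration rather than part of the paper: it assumes unshifted Dynkin labels $(\lambda_1, \ldots, \lambda_{n-1})$ and takes the cross term over pairs $j < i$, so that the diagonal contribution is counted only once by the first sum. The function names `conformal_dimension` and `dynkin_labels` are hypothetical.

```python
from fractions import Fraction


def conformal_dimension(lam, k):
    """Evaluate formula (5) for SU(n) at level k.

    lam = (lambda_1, ..., lambda_{n-1}) are unshifted Dynkin labels,
    so n = len(lam) + 1.  The result is an exact Fraction.
    """
    n = len(lam) + 1
    idx = range(1, n)                      # 1-based indices as in the text
    l = {i: lam[i - 1] for i in idx}
    diag = sum(i * (n - i) * l[i] ** 2 for i in idx)
    cross = sum(j * (n - i) * l[j] * l[i] for i in idx for j in idx if j < i)
    linear = sum(j * (n - j) * l[j] for j in idx)
    return (Fraction(diag, 2 * n * (k + n))
            + Fraction(cross, n * (k + n))
            + Fraction(linear, 2 * (k + n)))


def dynkin_labels(n, indices):
    """Labels (lambda_1..lambda_{n-1}) of a sum of fundamental weights.

    `indices` lists the subscripts with repetition; index 0 (Lambda_0)
    only fixes the level and does not enter formula (5).
    """
    lam = [0] * (n - 1)
    for i in indices:
        if i % n != 0:
            lam[i % n - 1] += 1
    return tuple(lam)
```

For example, `conformal_dimension(dynkin_labels(10, [0, 3]), 2)` gives `Fraction(77, 80)` for $\Lambda_0 + \Lambda_3$ of $SU(10)$ at level 2, and for $SU(2)$ at level $k$ the formula reduces to $\lambda_1(\lambda_1 + 2)/(4(k+2))$.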

Let $G \subset H$ be inclusions of compact simple Lie groups. $LG \subset LH$ is called a conformal inclusion if the level 1 projective positive energy representations of $LH$ decompose as a finite number of irreducible projective representations of $LG$. $LG \subset LH$ is called a maximal conformal inclusion if there is no proper subgroup $G'$ of $H$ containing $G$ such that $LG \subset LG'$ is also a conformal inclusion. A list of maximal conformal inclusions can be found in [18]. Let $H^0$ be the vacuum representation of $LH$, i.e., the representation of $LH$ associated with the trivial representation of $H$. Then $H^0$ decomposes as a direct sum of irreducible projective representations of $LG$ at level $K$. $K$ is called the Dynkin index of the conformal inclusion. We shall write the conformal inclusion as $G_K \subset H_1$. Note that it follows from the definition that $\mathcal{A}_{H_1}$ is an extension of $\mathcal{A}_{G_K}$.

We will be interested in the following conformal inclusion:
$$L(SU(m)_n \times SU(n)_m) \subset LSU(nm).$$
In the classification of conformal inclusions in [18] the above conformal inclusion corresponds to the Grassmannian $SU(m+n)/SU(n) \times SU(m) \times U(1)$. Let $\Lambda_0$ be the vacuum representation of $LSU(nm)$ on the Hilbert space $H^0$. The decomposition of $\Lambda_0$ under $L(SU(m) \times SU(n))$ is known, see, e.g. [1]. To describe such a decomposition, let us prepare some notation. We use $\dot S$ to denote the S-matrices of $SU(m)$, and $\ddot S$ to denote the S-matrices of $SU(n)$. The level $n$ (resp. $m$) weight of $LSU(m)$ (resp. $LSU(n)$) will be denoted by $\dot\lambda$ (resp. $\ddot\lambda$). We start by describing $\dot P_+^n$ (resp. $\ddot P_+^m$), i.e. the highest weights of level $n$ of $LSU(m)$ (resp. level $m$ of $LSU(n)$). $\dot P_+^n$ is the set of weights
$$\dot\lambda = \tilde k_0 \dot\Lambda_0 + \tilde k_1 \dot\Lambda_1 + \cdots + \tilde k_{m-1} \dot\Lambda_{m-1},$$
where $\tilde k_i$ are non-negative integers such that
$$\sum_{i=0}^{m-1} \tilde k_i = n,$$
and $\dot\Lambda_i = \dot\Lambda_0 + \dot\omega_i$, $1 \leq i \leq m-1$, where $\dot\omega_i$ are the fundamental weights of $SU(m)$. Instead of $\dot\lambda$ it will be more convenient to use
$$\dot\lambda + \dot\rho = \sum_{i=0}^{m-1} k_i \dot\Lambda_i$$
with $k_i = \tilde k_i + 1$ and $\sum_{i=0}^{m-1} k_i = m + n$. Due to the cyclic symmetry of the extended Dynkin diagram of $SU(m)$, the group $\mathbb{Z}_m$ acts on $\dot P_+^n$ by
$$\dot\Lambda_i \to \dot\Lambda_{(i + \dot\mu) \bmod m}, \quad \dot\mu \in \mathbb{Z}_m.$$
Let $\Omega_{m,n} = \dot P_+^n / \mathbb{Z}_m$. Then there is a natural bijection between $\Omega_{m,n}$ and $\Omega_{n,m}$ (see §2 of [1]). The idea is to draw a circle and divide it into $m + n$ arcs of equal length. To each partition $\sum_{0 \leq i \leq m-1} k_i = m + n$, there corresponds a "slicing of the pie" into $m$ successive parts with angles $2\pi k_i/(m+n)$, drawn with solid lines. We choose this slicing to be clockwise. The complementary slicing in broken lines (the lines which are not solid) defines a partition of $m + n$ into $n$ successive parts, $\sum_{0 \leq i \leq n-1} l_i = m + n$. We choose the latter slicing to be counterclockwise, and it is easy to see that such a slicing corresponds uniquely to an element of $\Omega_{n,m}$.

We parameterize the bijection by a map $\beta : \dot P_+^n \to \ddot P_+^m$ as follows. Set
$$r_j = \sum_{i=j}^{m} k_i, \quad 1 \leq j \leq m,$$
where $k_m \equiv k_0$. The sequence $(r_1, \ldots, r_m)$ is decreasing, $m + n = r_1 > r_2 > \cdots > r_m \geq 1$. Take the complementary sequence $(\bar r_1, \bar r_2, \ldots, \bar r_n)$ in $\{1, 2, \ldots, m+n\}$ with $\bar r_1 > \bar r_2 > \cdots > \bar r_n$. Put
$$s_j = m + n + \bar r_n - \bar r_{n-j+1}, \quad 1 \leq j \leq n.$$
Then $m + n = s_1 > s_2 > \cdots > s_n \geq 1$. The map $\beta$ is defined by $(r_1, \ldots, r_m) \to (s_1, \ldots, s_n)$. The following lemma summarizes what we will use:

Lemma 2.29. (1) Let $\dot Q$ be the root lattice of $SU(m)$, $\dot\Lambda_i$, $0 \leq i \leq m-1$, its fundamental weights and $\dot Q_i = (\dot Q + \dot\Lambda_i) \cap \dot P_+^n$. Let $\Lambda \in \mathbb{Z}_{mn}$ denote a level 1 highest weight of $SU(mn)$ and $\dot\lambda \in \dot Q_{\Lambda \bmod m}$. Then there exists a unique $\ddot\lambda \in \ddot P_+^m$ with $\ddot\lambda = \mu\beta(\dot\lambda)$ for some unique $\mu \in \mathbb{Z}_n$ such that $H_{\dot\lambda} \otimes H_{\ddot\lambda}$ appears once and only once in $H^\Lambda$. The map $\dot\lambda \to \ddot\lambda = \mu\beta(\dot\lambda)$ is one-to-one. Moreover, $H^\Lambda$, as representations of $L(SU(m) \times SU(n))$, is a direct sum of all such $H_{\dot\lambda} \otimes H_{\ddot\lambda}$;
(2) $\mu_{\mathcal{A}_{SU(n)_m}} = \frac{n}{m}\, \mu_{\mathcal{A}_{SU(m)_n}}$;
(3) The subnets $\mathcal{A}_{SU(n)_m} \subset \mathcal{A}_{SU(nm)_1}$ are normal and cofinite. The set exp as in (1) of Prop. 2.23 is the elements of $P_{++}^{n+m}$ which belong to the root lattice of $SU(n)$.

Proof. (1) is Th. 1 of [1]. (2) follows from Th. 4.1 of [42]. (3) is Lemma 4.1 of [41].

3. Schellekens's Modular Invariants and Their Realizations by Conformal Nets

In this section we examine three modular invariants constructed by A. N. Schellekens in [37] which are based on level-rank duality. These are entries 18, 27, and 40 in the table of [37]. Our goal in this section is to show that they can be realized by conformal nets as an application of mirror extensions in Sect. 2.5. For simplicity in this section we will use $G_k$ to denote the corresponding conformal net $\mathcal{A}_{G_k}$ when no confusion arises.
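The combinatorial map $\beta$ of Sect. 2.6 is simple enough to compute by hand, but a short script helps when checking the level-rank correspondences used below. The following Python sketch follows the definitions of $r_j$, $\bar r_j$ and $s_j$ literally. The function names are hypothetical, and `labels_from_partial_sums` is an added convenience that assumes the same convention $s_j = \sum_{i=j}^{n} l_i$ with $l_n \equiv l_0$ on the $SU(n)$ side.

```python
def beta(k):
    """Level-rank map on shifted labels.

    k = (k_0, ..., k_{m-1}) with every k_i >= 1 and sum(k) = m + n.
    Returns the pair (r, s) with r = (r_1..r_m) and s = (s_1..s_n).
    """
    m = len(k)
    total = sum(k)                      # m + n
    n = total - m
    if n < 1 or min(k) < 1:
        raise ValueError("need shifted labels k_i >= 1 with sum(k) > len(k)")
    ks = list(k[1:]) + [k[0]]           # k_1, ..., k_{m-1}, k_m := k_0
    r = [sum(ks[j:]) for j in range(m)]             # r_j = sum_{i=j}^{m} k_i
    rbar = sorted(set(range(1, total + 1)) - set(r), reverse=True)
    # s_j = m + n + rbar_n - rbar_{n-j+1}, with rbar stored 0-based
    s = [total + rbar[-1] - rbar[n - j] for j in range(1, n + 1)]
    return tuple(r), tuple(s)


def labels_from_partial_sums(s):
    """Recover (l_0, l_1, ..., l_{n-1}) from s_j = sum_{i=j}^{n} l_i, l_n = l_0."""
    n = len(s)
    inner = [s[j] - s[j + 1] for j in range(n - 1)]  # l_1, ..., l_{n-1}
    return (s[-1],) + tuple(inner)
```

For $m = n = 2$ and $k = (1, 3)$, that is $\dot\lambda = 2\dot\Lambda_1$, one finds $r = (4, 1)$, $\bar r = (3, 2)$ and $s = (4, 3)$, which corresponds to shifted labels $(l_0, l_1) = (3, 1)$, the vacuum weight on the other side. This is consistent with $2\dot\Lambda_1$ lying in the $\mathbb{Z}_2$ orbit of the vacuum.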


3.1. Three mirror extensions.

3.1.1. $\widetilde{SU(10)_2}$. $\widetilde{SU(10)_2}$ is the simplest nontrivial example of mirror extensions, obtained by applying Theorem 2.25 to $SU(2)_{10} \subset Spin(5)_1$ and $SU(2)_{10} \times SU(10)_2 \subset SU(20)_1$. By Cor. 2.27 and Lemma 2.29, $\mu_{\widetilde{SU(10)_2}} = 20$. Consider the induction for $SU(10)_2 \subset \widetilde{SU(10)_2}$. By Th. 5.7 of [7] the matrix $Z_{\lambda\mu} = \langle \alpha_\lambda, \alpha_\mu^- \rangle$ commutes with the $S, T$ matrices of $SU(10)_2$. Such matrices are classified in [16], and it follows that there are 15 irreducible representations of $\widetilde{SU(10)_2}$ given as follows:
$$\alpha_J^i,\ 0 \leq i \leq 9, \qquad \alpha_J^j \sigma,\ 0 \leq j \leq 4.$$
The fusion rules are determined by the following relations:
$$[\bar\sigma] = [\alpha_J^2 \sigma], \quad [\alpha_J^5 \sigma] = [\sigma], \quad [\sigma\bar\sigma] = [1] + [\alpha_J^5].$$
The restrictions of these representations to $SU(10)_2$ are given as follows:
$$\alpha_J^i = [J^i(2\Lambda_0)] + [J^i(\Lambda_3 + \Lambda_7)],\ 0 \leq i \leq 9; \qquad \alpha_J^j \sigma = [J^j(\Lambda_0 + \Lambda_3)] + [J^j(\Lambda_5 + \Lambda_8)],\ 0 \leq j \leq 4.$$
It follows that modulo integers the conformal dimensions are given as
$$h_{\alpha_J^i} = \frac{i(10 - i)}{10},\ 0 \leq i \leq 9, \quad h_\sigma = \frac{77}{80}, \quad h_{\alpha_J \sigma} = \frac{25}{16}, \quad h_{\alpha_J^2 \sigma} = \frac{157}{80}, \quad h_{\alpha_J^3 \sigma} = \frac{173}{80} = h_{\alpha_J^4 \sigma}.$$

The following simple lemma will be used later:

Lemma 3.1. $\mathcal{A}_{Spin(n)_1}$ is a completely rational net whose irreducible representations are in one to one correspondence with irreducible representations of $LSpin(n)_1$. When $n$ is odd there are three irreducible representations $1, \mu_0, \mu_1$ with index $1, 1, \sqrt{2}$ respectively and fusion rules $[\mu_1^2] = [1] + [\mu_0]$; when $n = 4k + 2$, $k \in \mathbb{N}$, the fusion rule is $\mathbb{Z}_4$; when $n = 4k$, $k \in \mathbb{N}$, the fusion rule is $\mathbb{Z}_2 \times \mathbb{Z}_2$.

Proof. By Th. 3.10 of [8] it is enough to prove that $\mu_{\mathcal{A}_{Spin(n)_1}} = 4$. When $n = 5$ this follows from the conformal inclusion $SU(2)_{10} \subset Spin(5)_1$ and Lemma 2.8. Consider the inclusion $SO(n) \times U(1) \subset SO(n+2)$. Note that the fundamental group of $SO(n)$ is $\mathbb{Z}_2$. It follows that loops with even winding numbers in $LU(1)$ can be lifted to $LSpin(n)$, and we have a conformal inclusion $LSpin(n-2)_1 \times LU(1)_4 \subset LSpin(n)_1$. Since $\mu_{\mathcal{A}_{U(1)_4}} = 4$ by §3 of [43], and the index of $\mathcal{A}_{Spin(n-2)_1} \times \mathcal{A}_{U(1)_4} \subset \mathcal{A}_{Spin(n)_1}$ is checked to be 2, by induction one can easily prove the lemma for all odd $n$. When $n$ is even we use the conformal inclusion $\mathcal{A}_{SU(n/2)_1} \times \mathcal{A}_{U(1)_{2n}} \subset \mathcal{A}_{Spin(n)_1}$ with index $n/2$. Note that $\mu_{\mathcal{A}_{SU(n/2)_1}} = n/2$, $\mu_{\mathcal{A}_{U(1)_{2n}}} = 2n$ by §3 of [43], and by Lemma 2.8 we have $\mu_{\mathcal{A}_{Spin(n)_1}} = 4$.
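The $\mu$-index bookkeeping in the proof above, and in the computation of $\mu_{\widetilde{SU(10)_2}} = 20$, is plain arithmetic with Lemma 2.8, Cor. 2.27 and Lemma 2.29 (2). The Python sketch below replays it with exact fractions. The function names are hypothetical; the sketch also assumes that the $\mu$-index of a tensor product of nets is the product of the $\mu$-indices, and the value 3 used for $(E_6)_1$ in the usage note is an assumption (three sectors of dimension one), not a statement from this section.

```python
from fractions import Fraction


def mu_of_extension(mu_subnet, index):
    """Lemma 2.8 solved for the larger net: mu_B = mu_A / [B:A]^2."""
    return Fraction(mu_subnet) / Fraction(index) ** 2


def mu_spin_odd_step(mu_spin_previous):
    """Spin(n-2)_1 x U(1)_4 in Spin(n)_1 with index 2 and mu_{U(1)_4} = 4."""
    return mu_of_extension(Fraction(mu_spin_previous) * 4, 2)


def mu_spin_even(n):
    """SU(n/2)_1 x U(1)_{2n} in Spin(n)_1 with index n/2."""
    return mu_of_extension(Fraction(n, 2) * 2 * n, Fraction(n, 2))


def mu_mirror(mu_C, n, m):
    """Cor. 2.27 combined with Lemma 2.29 (2).

    For A = SU(m)_n with coset SU(n)_m inside SU(nm)_1 and an extension C
    of A: mu_mirror = mu_C * mu_{SU(n)_m} / mu_{SU(m)_n} = mu_C * n / m.
    """
    return Fraction(mu_C) * Fraction(n, m)
```

For instance `mu_mirror(4, 10, 2)` returns 20, matching $\mu_{\widetilde{SU(10)_2}} = 20$ with $\mathcal{C} = Spin(5)_1$, and `mu_mirror(3, 9, 3)` returns 9 under the stated assumption on $(E_6)_1$.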


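3.1.2. $\widetilde{SU(9)_3}$. $\widetilde{SU(9)_3}$ is an extension of $SU(9)_3$ obtained by applying Th. 2.25 to $SU(3)_9 \subset (E_6)_1$ and $SU(3)_9 \times SU(9)_3 \subset SU(27)_1$. By Cor. 2.27 and Lemma 2.29, $\mu_{\widetilde{SU(9)_3}} = 9$. Recall the branching rules for $SU(3)_9 \subset (E_6)_1$ (we use $1_0$ to denote the vacuum representation of $(E_6)_1$ and $1_+, 1_-$ the other two irreducible representations of $(E_6)_1$):
$$[1_0] = \sum_{0 \leq i \leq 2} \big([\dot J^i(9\dot\Lambda_0)] + [\dot J^i(\dot\Lambda_0 + 4\dot\Lambda_1 + 4\dot\Lambda_2)]\big), \qquad [1_+] = [1_-] = \sum_{0 \leq i \leq 2} [\dot J^i(5\dot\Lambda_0 + 2\dot\Lambda_1 + 2\dot\Lambda_2)],$$
where $\dot J := 9\dot\Lambda_1$. Consider inductions with respect to $SU(9)_3 \subset \widetilde{SU(9)_3}$. By Th. 2.25 and Lemma 2.29 the vacuum of $\widetilde{SU(9)_3}$ restricts to the representation
$$\sum_{0 \leq i \leq 2} \big([J^{3i}(3\Lambda_0)] + [J^{3i}(\Lambda_3 + \Lambda_7 + \Lambda_8)]\big)$$
of $SU(9)_3$. Since $J$ is local with the above representation, by Lemma 2.13 $\alpha_J$ is a DHR representation of $\widetilde{SU(9)_3}$, and $[\alpha_J^3] = [1]$. One can determine the remaining irreducible representations of $\widetilde{SU(9)_3}$ by using [16] as in §3.1.1. Here we give a different approach which will be useful in §3.1.3. We note that $M(\dot J) = J^3$, $M(\dot J^i(5\dot\Lambda_0 + 2\dot\Lambda_1 + 2\dot\Lambda_2)) = J^{3i}(\Lambda_4 + \Lambda_6 + \Lambda_8)$, $i = 0, 1, 2$, by Lemma 2.29, where $M$ is defined as before Lemma 2.24. By Prop. 2.28 we have
$$\langle \alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}, \alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}^- \rangle = 2.$$
It follows that there are two irreducible DHR representations $\tau_1, \tau_2$ of $\widetilde{SU(9)_3}$ such that $\alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8} \succ [\tau_1] + [\tau_2]$, and $\tau_1, \tau_2$ are the only two irreducible subsectors of $\alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}$ which are DHR representations. We have for $i = 1, 2$
$$\langle \tau_i, \alpha_\mu^- \rangle \leq \langle \alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}, \alpha_\mu^- \rangle.$$
Note that if the color of $\mu$ is nonzero, then $\langle \alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}, \alpha_\mu^- \rangle = 0$ by Lemma 2.14 since $\Lambda_4 + \Lambda_6 + \Lambda_8$ has color 0. If $\mu$ has color 0, by Lemma 2.29 and Prop. 2.28 we have that $\langle \alpha_{\Lambda_4 + \Lambda_6 + \Lambda_8}, \alpha_\mu^- \rangle$ is nonzero only when $\mu = J^{3i}(\Lambda_4 + \Lambda_6 + \Lambda_8)$, $i = 0, 1, 2$. It follows that $\langle \tau_i, \alpha_\mu \rangle = 1$ when $\mu = J^{3i}(\Lambda_4 + \Lambda_6 + \Lambda_8)$, $i = 0, 1, 2$, and $\langle \tau_i, \alpha_\mu \rangle = 0$ when $\mu \neq J^{3i}(\Lambda_4 + \Lambda_6 + \Lambda_8)$, $i = 0, 1, 2$. Hence the restrictions of $\tau_i$ to $SU(9)_3$ are given as follows:
$$[\tau_i] = \sum_{0 \leq j \leq 2} [J^{3j}(\Lambda_4 + \Lambda_6 + \Lambda_8)].$$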


It follows that the index of τi , i = 1, 2 is one, and since [(α J τi ) ] = [J 3 j+1 (4 + 6 + 8 )], 0≤ j≤2

it follows that [α j τi ] = [τi ]. Hence the irreducible representations of SU (9)3 are given by 1, α J , α 2J , α iJ τk , 0 ≤ i ≤ 2, k = 1, 2. These representations generate an abelian group of order 9, it must be either Z3 × Z3 or Z9 . Note that by Lemma 2.14, k α J , τik ≤ α J , α = 0, ∀k ≥ 0 4 +6 +8

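since $J$ has color 3 while $\Lambda_4 + \Lambda_6 + \Lambda_8$ has color 0, it follows that these representations generate an abelian group $\mathbb{Z}_3 \times \mathbb{Z}_3$. Modulo integers the conformal dimensions of $\tau_k$, $\alpha_J$ are given by

$$h_{\alpha_J} = \frac{4}{3}, \quad h_{\tau_k} = \frac{7}{3}, \quad h_{\alpha_J^2} = \frac{7}{3}, \quad h_{\alpha_J \tau_k} = \frac{11}{3}, \quad h_{\alpha_J^2 \tau_k} = \frac{14}{3}, \quad k = 1, 2.$$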

3.1.3. SU (8)4 . From conformal inclusion Spin(6)8 ⊂ Spin(20)1 and Spin(6) SU (4) we obtain conformal inclusion SU (4)8 ⊂ Spin(20)1 . For simplicity we use (0), (5/4)1 , (5/4)2 , (1/2) to denote irreducible representations of Spin(20)1 with conformal dimensions 0, 5/4, 5/4, 1/2 respectively. By comparing conformal dimensions the branching rules for SU (4)8 ⊂ Spin(20)1 are given by: ([ J˙i ] + [ J˙i (40 + 1 + 22 + 3 )]), [(0) ] = 0≤i≤3

[(5/4)1 ] = [(5/4)2 ] = [(1/2) ] =

[ J˙i (30 + 1 + 22 + 33 )],

0≤i≤3

([ J˙i (60 + 22 )] + [ J˙i (30 + 32 + 23 )]).

0≤i≤3

Note that all representations appearing above have color 0. SU (8)4 is the extension of SU (8)4 by applying Th. 2.25 to SU (4)8 ⊂ Spin(20)1 and SU (4)8 × SU (8)4 ⊂ SU (32)1 . By Lemma 2.29 the spectrum of SU (8)4 ⊂ SU (8)4 is given by ([J 2i ] + [J 2i (0 + 4 + 25 + 7 )]). 0≤i≤3

By Lemma 3.1 and Lemma 2.8 µ = 8. SU (8)4 By using Prop. 2.28 similar as in §3.1.2 we obtain all irreducible representations of SU (8)4 as follows: 1, α J , (3/4)1 , (3/4)2 , (1/2), α J (3/4)1 , α J (3/4)2 , α J (1/2).


These representations restrict to SU (8)4 as follows: ([J 2i+1 ] + [J 2i (0 + 4 + 25 + 7 )]), [α J ] = 0≤i≤3

j (α J (3/4)k ) = [J 2i (0 + 3 + 26 + 7 )]),

j = 0, 1, k = 1, 2;

0≤i≤3

j (α J (1/2)) = ([J 2i+ j (20 + 3 + 5 )] + [J 2i+ j (25 + 27 )]),

j = 0, 1.

0≤i≤3

The conformal dimensions modulo integers are as follows: h α J = 7/4, h (3/4)k = 3/4, k = 1, 2, h (1/2) = 1/2, h α J (3/4)1 = h α J (3/4)2 = 5/2, h α J (1/2) = 9/4, which explain our notations. The irreducible representations of SU (8)4 generate an abelian group of order 8 under compositions, so the abelian group is Z2 × Z2 × Z2 , Z2 × Z4 j or Z8 . By Lemma 2.14 α J , (3/4)k = α J , (1/2) j = 0, k = 1, 2, ∀ j ≥ 0 since the restriction of α J to SU (8)4 has color 4 while the restriction of (3/4)k , (1/2) to SU (8)4 has color 0, it follows that Z8 is impossible. Note that the conjugate of (1/2) has conformal dimension 1/2, and it must be (1/2), so [(1/2)2 ] = [1]. To rule out the possibility of Z2 × Z4 , note that this can only happen when the order of (3/4)1 is 4, and we must have [(1/2)] = [(3/4)21 ], [(3/4)2 ] = [(3/4)31 ]. By monodromy equation we have ε((3/4)1 , (3/4)1 )2 = 1, ε((3/4)1 , (1/2))ε((1/2), (3/4)1 ) = −1. On the other hand by Lemma 4.4 of [34] we have ε((3/4)1 , (1/2))ε((1/2), (3/4)1 ) = ε((3/4)1 , (3/4)21 ))ε((3/4)21 , (3/4)1 ) = ε((3/4)1 , (3/4)1 )4 = 1, a contradiction. It follows that irreducible representations of SU (8)4 generate Z2 ×Z2 × Z2 under compositions, and we have [(3/4)1 ] = [(3/4)1 ], [(1/2)(3/4)1 ] = [(3/4)2 ]. 3.2. Further extensions by simple currents. 3.2.1. No. 40 of [37]. The modular invariant No. 40 in [37] suggests that we look for simple current extensions of SU (10)2 × SU (5)1 × SO(7)1 . For simplicity we use y i = i , 0 ≤ i ≤ 9 to denote the irreducible representation of SU (5)1 . Note that h y 2 = 3/5. We use (1/2), (7/16) to denote the irreducible representations of SO(7)1 with conformal dimensions 1/2, 7/16. Note that the index of (1/2), (7/16) are 1, 2 respectively. By §3.1.1 the conformal dimension of u = (α J , y 2 , (1/2)) is h α J + h y + 1/2 = 2. It follows that u i , 0 ≤ i ≤ 9 is a local system of automorphisms. By Prop. 2.15 there is a Möbius (10)2 × SU (5)1 × SO(7)1 . 
extension D = ( SU (10)2 × SU (5)1 × SO(7)1 ) Z10 of SU By Cor. 2.27 and Lemma 3.1 µD = 4. Consider now the inductions for SU (10)2 × SU (5)1 × SO(7)1 ⊂ D.


By using formulas for conformal dimensions in §3.1.1 one checks easily that H ((σ, y 3 , (7/16)), u) = H ((1, 1, (1/2)), u) = 1. By Lemma 2.13 we conclude that α(σ,y 3 ,(7/16)) , α(1,1,(1/2)) are DHR representations of D with index 2, 1 respectively. Note that by Lemma 2.14, (σ, y 3 , (7/16)), (σ, y 3 , (7/16))u i = 2, α(σ,y 3 ,(7/16)) , α(σ,y 3 ,(7/16)) = 0≤i≤9

where in the last step we have used [σ a 5J ] = [1]. It follows that [α(σ,y 3 ,(7/16)) ] = [δ1 ] + [δ2 ]. Since µD = 4, the list of irreducible representations are given by 1, α(1,1,(1/2)) , δ1 , δ2 . The conformal dimensions modulo integers are h δ1 = h δ2 = 1, h α(1,1,(1/2)) = 1/2. These representations generate an abelian group of order 4. To rule out Z4 , note that 2 [α(1,1,(1/2)) ] = [1]. Without losing generality we assume that δ1 has order 4. Then we must have [δ12 ] = [α(1,1,(1/2)) ], [δ13 ] = [δ2 ]. By the monodromy equation we have ε(δ1 , δ1 )2 = −1, ε(δ1 , δ2 )ε(δ2 , δ1 ) = 1. On the other hand by Lemma 4.4 of [34] we have ε(δ1 , δ2 )ε(δ2 , δ1 ) = ε(δ1 , δ1 )6 = −1, a contradiction. In particular we have [δ12 ] = [1]. Hence 1, δ1 is a local system of automorphisms, and by Prop. 2.15 we conclude that there is further extension D Z2 of D. By Lemma 2.8 we have µDZ2 = 1, i.e., D Z2 is holomorphic. The spectrum of SU (10)2 × SU (5)1 × Spin(7)1 ⊂ D Z2 is given by entry 40 in the table of [37]: ([(J i , y 2i , (1/2)i )]+[(J i (3 +7 ), y 2i , (1/2)i )]+[(J i (3 +6 ), y 2i+4 , (7/16))]). o≤i≤9

3.2.2. No. 27 of [37]. No. 27 in the table of [37] suggests that we look for simple current extensions of SU (9)3 × SU (3)1 × SU (3)1 . Label irreducible representations of SU (3)1 by their conformal dimensions as 1, (1/3)1 , (1/3)2 . Denote by x1 = (α J , (1/3)1 , (1/3)1 ), x2 = (τ1 , (1/3)1 , (1/3)2 . By using formulas for conformal dimensions in j §3.1.2 and Lemma 2.4 it is easy to check that the following set x1i x2 , 0 ≤ i, j ≤ 2 is a local system of automorphisms. Hence by Prop. 2.15 there is a Möbius extension D1 = ( SU (9)3 × SU (3)1 × SU (3)1 ) (Z3 × Z3 ) of SU (9)3 × SU (3)1 × SU (3)1 j i with spectrum 0≤i, j≤2 [x1 x2 ]. By Lemma 2.8 µD1 = 1, so D1 is holomorphic. The spectrum of SU (9)3 × SU (3)1 × SU (3)1 ⊂ D1 is given by (entry (27) of [37]): i+1 ([(J i , (1/3)i1 , (1/3)i1 )] + [(J i (4 + 6 + 8 ), (1/3)i−1 1 , (1/3)1 )] 0≤i≤9 i−1 i i i + [(J i (4 + 6 + 8 ), (1/3)i+1 1 , (1/3)1 )] + [(J (3 +7 +8 ), (1/3)1 , (1/3)1 )].

Remark 3.2. One can choose other local systems of automorphisms which generate $\mathbb{Z}_3 \times \mathbb{Z}_3$. For example, one such choice is the local system of automorphisms given by $x_1^i x_2^j$, $0 \le i, j \le 2$, with $x_1 = (\alpha_J, (1/3)_1, (1/3)_2)$, $x_2 = (\tau_1, (1/3)_1, (1/3)_1)$. However, by Remark 2.11 it is easy to check that the corresponding extension is simply $\mathrm{Ad}_U(D_1)$, which is isomorphic to $D_1$, where $\mathrm{Ad}_U$ implements the outer automorphism of the last factor of $SU(3)_1$. A similar statement holds for other choices of local systems of automorphisms which generate $\mathbb{Z}_3 \times \mathbb{Z}_3$.
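The integrality of conformal dimensions behind the local system of §3.2.2 can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses the conformal dimensions listed at the end of §3.1.2, takes h = 1/3 for both nontrivial representations of SU(3)_1, and assumes that τ_1^2 has the same conformal dimension modulo integers as τ_k, which the text does not state explicitly. The names `H_SU9`, `h_su3` and `h_total` are hypothetical. Integer conformal dimension is only a necessary condition; the locality itself is obtained in the text from Lemma 2.4.

```python
from fractions import Fraction as F

# Conformal dimensions (modulo integers) from the end of Section 3.1.2.
# Key (i, t): i = power of alpha_J, t = 1 if a tau factor is present.
# Assumption: tau_1^2 has the same dimension mod 1 as tau_k.
H_SU9 = {
    (0, 0): F(0), (1, 0): F(4, 3), (2, 0): F(7, 3),
    (0, 1): F(7, 3), (1, 1): F(11, 3), (2, 1): F(14, 3),
}


def h_su3(charge):
    """Conformal dimension of the SU(3)_1 representation with Z_3 charge."""
    return F(0) if charge % 3 == 0 else F(1, 3)


def h_total(i, j):
    """Total conformal dimension of x1^i x2^j, where
    x1 = (alpha_J, (1/3)_1, (1/3)_1) and x2 = (tau_1, (1/3)_1, (1/3)_2)."""
    first = H_SU9[(i % 3, 0 if j % 3 == 0 else 1)]
    # (1/3)_1 carries Z_3 charge 1 and (1/3)_2 carries charge 2.
    return first + h_su3(i + j) + h_su3(i + 2 * j)


table = {(i, j): h_total(i, j) for i in range(3) for j in range(3)}
for key, h in sorted(table.items()):
    print(key, h)
```

Under these assumptions every entry of `table` is an integer, which is consistent with the claim that the nine automorphisms form a local system.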


3.2.3. No. 18 of [37]. No. 18 in the table of [37] suggests that we look for simple current extensions of SU (8)4 × SU (2)1 × SU (2)1 × SU (2)1 . As before we label the non-vacuum representation (1/4) of SU (2)1 by its conformal dimension. Set z 1 = (α J , (1/4), 0, 0), z 2 = ((3/4)1 , 0, (1/4), 0), z 3 = ((3/4)2 , 0, 0, (1/4)). Then by the formulas for conformal dimensions and fusion rules in §3.1.3 one checks easily that H (z i , z j ) = 1, 1 ≤ i, j ≤ 3. Hence {z 1 , z 2 , z 3 } generate an abelian group Z2 × Z2 × Z2 which is a local system of automorphisms by Lemma 2.4. By Prop. 2.15 we conclude (8)4 × SU (2)1 × SU (2)1 × SU (2)1 )(Z2 × that there is a Möbius extension D2 := ( SU Z2 × Z2 ). By Lemma 2.8 we have µD2 = 1, i.e., D2 is holomorphic. The spectrum of SU (8)4 × SU (2)1 × SU (2)1 × SU (2)1 ⊂ D2 is given by (entry (18) of [37]):

([(J i , (1/4)i , 0, 0)] + [(J i (0 + 4 + 5 + 7 ), (1/4)i , 0, 0)]

0≤i≤7

+ [(J i (5 + 27 ), J1i , (1/4), (1/4))] + [(J i (20 + 3 + 5 ), (1/4)i , (1/4), (1/4))] + [(J i (0 + 3 + 6 + 7 ), (1/4)i , 0, (1/4))] + [(J i (0 + 3 + 6 + 7 ), (1/4)i , (1/4), 0)])

3.2.4. The main theorem. By Lemma 2.9, D Z2 , D1 , D2 as constructed in §3.2.1, §3.2.2 and §3.2.3 are in fact conformal nets since they contain conformal subnets with finite index, and in summary we have proved the following:

Theorem 3.3. There are holomorphic conformal nets (with central charge 24) which are conformal extensions of SU(10)_2 × SU(5)_1 × Spin(7)_1, SU(9)_3 × SU(3)_1 × SU(3)_1, SU(8)_4 × SU(2)_1 × SU(2)_1 × SU(2)_1 with spectrum given by the representations at the end of §3.2.1, §3.2.2 and §3.2.3 respectively.

3.3. Two conjectures. The holomorphic conformal net corresponding to V♮ of [14] was constructed in [23]. This net can also be constructed using the result of [10] as a simple current Z2 extension of a Z2 orbifold conformal net associated with the Leech lattice given in [10]. Our first conjecture is an analogue of the conjecture in [14] for V♮:

Conjecture 3.4. Up to isomorphism there exists a unique holomorphic conformal net with central charge 24 and no elements of weight one.

Our second conjecture is motivated by the results of [37]:

Conjecture 3.5. Up to isomorphism there exist finitely many holomorphic conformal nets with central charge 24.

Note that if one can obtain a theorem like the theorem in §2 of [37] in the setting of conformal nets, then modulo Conjecture 3.4, Conjecture 3.5 is reduced to showing that, up to equivalence, there are only finitely many conformal extensions of a given completely rational net, and this should be true in view of the results of [19]. However, new methods have to be developed to carry through this idea.


References 1. Altschüler, D., Bauer, M., Itzykson, C.: The branching rules of conformal embeddings. Commun. Math. Phys. 132, 349–364 (1990) 2. D’ Antoni, C., Fredenhagen, K., Koester, S.: Implementation of conformal covariance by diffeomorphism symmetry. Lett. Math. Phys. 67, 239–247 (2004) 3. Böckenhauer, J., Evans, D.: Modular Invariants, Graphs and α-Induction for Nets of Subfactors I. Commun. Math. Phys. 197, 361–386 (1998) 4. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. II. Commun. Math. Phys. 200, 57–103 (1999) 5. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. III. Commun. Math. Phys. 205, 183–228 (1999) 6. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Commun. Math. Phys. 213(2), 267–289 (2000) 7. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999) 8. Böckenhauer, J.: Superselection Sectors of SO(n) Wess-Zumino-Witten Models. http://arXiv.org/list/hepth/9607016, 1996 9. Carpi, S.: On the representation theory of Virasoro nets. Commun. Math. Phys. 244(2), 261–284 (2004) 10. Dong, C., Xu, F.: Conformal nets associated with lattices and their orbifolds. Adv. Math. 206(1), 279–306 (2006) 11. Witten, E.: Three-Dimensional Gravity Revisited. http://arXiv.org/abs/0706.3359v, [hepth], 2007 12. Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations II. Commun. Math. Phys. 15, 173–200 (1969) 13. Gabbiani, F., Fröhlich, J.: Operator algebras and Conformal field theory. Commun. Math. Phys. 155, 569–640 (1993) 14. Frenkel, I., Lepowsky, J., Meurman, A.: Vertex operator algebras and the Monster. Pure and Applied Mathematics, 134. Boston, MA: Academic Press, Inc. 1988 15. Fuchs, J., Schellekens, A.N., Schweigert, C.: Galois modular invariants of WZW models. 
Nuclear Phys. B 437(3), 667–694 (1995) 16. Gannon, T.: The level two and three modular invariants of SU(n). Lett. Math. Phys. 39(3), 289–298 (1997) 17. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996) 18. Goddard, P., Nahm, W., Olive, D.: Symmetric spaces, Sugawara’s energy momentum tensor in two dimensions and free fermions. Phys. Lett. B 160(1–3), 111–116 (1985) 19. Izumi, M., Kosaki, H.: On a subfactor analogue of the second cohomology. Dedicated to Professor Huzihiro Araki on the occasion of his 70th birthday. Rev. Math. Phys. 14(7–8), 733–757 (2002) 20. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983) 21. Kac, V.G.: Infinite Dimensional Lie Algebras. 3rd Edition. Cambridge: Cambridge University Press, 1990 22. Kawahigashi, Y., Longo, R.: Classification of local conformal nets. Case c < 1. Ann. Math. 160, 493–522 (2004) 23. Kawahigashi, Y., Longo, R.: Local conformal nets arising from framed vertex operator algebras. Adv. Math. 206(2), 729–751 (2006) 24. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001) 25. Kawahigashi, Y., Longo, R., Pennig, U., Rehren, K.-H.: The classification of non-local chiral CFT with c < 1. Commun. Math. Phys. 271(2), 375–385 (2007) 26. Kac, V., Longo, R., Xu, F.: Solitons in affine and permutation orbifolds. Commun. Math. Phys. 253, 723–764 (2005) 27. Kac, V.G., Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Advances in Math. 70, 156–234 (1988) 28. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237(1–2), 7–30 (2003) 29. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 30. Longo, R., Xu, F.: Topological sectors and a dichotomy in conformal field theory. Commun. Math. Phys. 251, 321–364 (2004) 31. 
Pimsner, M., Popa, S.: Entropy and index for subfactors. Ann. Scient. Ec. Norm. Sup. 19, 57–106 (1986) 32. Pressley, A., Segal, G.: Loop Groups. Oxford: Oxford University Press, 1986 33. Rehren, K.-H.: Braid group statistics and their superselection rules. “The Algebraic Theory of Superselection Sectors”. In: Kastler, D. (ed.) Singapore: World Scientific, 1990 34. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132(2), 461–483 (1990) 35. Rehren, K.-H.: Chiral observables and modular invariants. Commun. Math. Phys. 208, 689–712 (2000)


36. Rehren, K.-H.: Canonical tensor product subfactors. Commun. Math. Phys. 211, 395–408 (2000) 37. Schellekens, A.N.: Meromorphic c = 24 conformal field theories. Commun. Math. Phys. 153(1), 159–185 (1993) 38. Schellekens, A.N., Yankielowicz, S.: Field identification fixed points in the coset construction. Nuclear Phys. B 334(1), 67–102 (1990) 39. Turaev, V.G.: Quantum invariants of knots and 3-manifolds. Berlin-New York: Walter de Gruyter, 1994 40. Wassermann, A.: Operator algebras and Conformal field theories III. Invent. Math. 133, 467–538 (1998) 41. Xu, F.: Mirror extensions of local nets. Commun. Math. Phys. 270(3), 835–847 (2007) 42. Xu, F.: Jones-Wassermann subfactors for disconnected intervals. Commun. Contemp. Math. 2, 307–347 (2000) 43. Xu, F.: 3-manifold invariants from cosets. J. Knot Theory and Its Ram. 14(1), 21–90 (2005) 44. Xu, F.: Algebraic coset conformal field theories. Commun. Math. Phys. 211, 1–43 (2005) 45. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998) 46. Xu, F.: Algebraic coset conformal field theories II. Publ. RIMS, Kyoto Univ. 35, 795–824 (1999) 47. Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9(1), 237–302 (1996) Communicated by Y. Kawahigashi

Commun. Math. Phys. 290, 105–127 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0826-0

Communications in

Mathematical Physics

Green’s Function for the Hodge Laplacian on Some Classes of Riemannian and Lorentzian Symmetric Spaces Alberto Enciso1 , Niky Kamran2 1 Department Mathematik, ETH Zürich, 8092 Zürich, Switzerland.

E-mail: [email protected]

2 Department of Mathematics and Statistics, McGill University, Montréal,

Québec, Canada H3A 2K6. E-mail: [email protected] Received: 6 August 2008 / Accepted: 30 January 2009 Published online: 22 May 2009 – © Springer-Verlag 2009

Abstract: We compute the Green’s function for the wave equation on forms on the symmetric spaces M × Σ, where M is a simply connected n-dimensional Riemannian or Lorentzian manifold of constant curvature and Σ is a simply connected Riemannian surface of constant curvature. Our approach is based on a generalization to the case of differential forms of the method of spherical means and on the use of Riesz distributions on manifolds. The radial part of the Green’s function is governed by a fourth order analogue of the Heun equation.

1. Introduction

Our purpose in this paper is to compute the Green’s function for the Hodge Laplacian in some special classes of symmetric spaces of Riemannian or Lorentzian signature, more precisely those spaces which are products M × Σ, where M is a simply connected n-dimensional Riemannian or Lorentzian manifold of constant curvature and Σ is a simply connected Riemannian surface of constant curvature [19]. The spaces we consider are not locally conformally flat, unless one of the factors reduces to a point. They contain in particular the Robinson–Bertotti solutions of the Einstein–Maxwell equations, which are of Petrov type D, as well as the (simply connected) de Sitter and anti-de Sitter spaces, the latter spaces corresponding to cases in which Σ reduces to a point and M is of Lorentzian signature. As is well known, the computation of the Green’s function for the Hodge Laplacian in curved spacetimes is a problem of considerable interest in Physics, since differential forms appear as basic variables in various supersymmetric field theories, including supergravity. To our knowledge, the earliest (formal) results on this problem (for de Sitter and anti-de Sitter spaces) appear in papers by Allen and Jacobson [3] and Folacci [12], and have been used in a number of recent works, e.g. [1,6–8,14,16]. In this paper we shall rigorously extend their results to the case of product manifolds, which seems to


have received less attention in the literature and presents challenges of its own, as we shall see below. In addition to the physical interest of this problem, our approach serves to illustrate the power of the method of spherical means as applied to operators acting on sections of vector bundles, and to show that it can be adapted efficiently to the case of product manifolds.

The approach we take is based on a natural extension to the case of differential forms of the classical method of spherical means [15], which has been applied with great success to the computation of the Green’s function for the scalar wave equation in symmetric spaces [17]. The method of spherical means has the advantage of leading naturally to the use of Riesz distributions, which are fundamental analytical tools in the study of linear wave equations on manifolds. Thus we avoid the need to start with an explicit ansatz on the form of the action of the Hodge Laplacian on the Green’s function, as is often the case in the literature [3,13]. In particular, our approach does not rely on the classification of equivariant bitensor fields on maximally symmetric spaces. In the case of spaces of constant curvature, Riesz distributions provide a rigorous way of reducing the computation of the Green’s function to the study of an ordinary differential equation governing its radial part. We will see that such a reduction also occurs in the case of the product spaces considered in this paper, albeit in a significantly more complicated manner. Indeed, one of the main features of our result is that in the case in which neither Σ nor M is reduced to a point, the expression for the Green’s function of the Hodge Laplacian acting on p-forms (with 1 ≤ p ≤ n − 1) has its radial part governed by a fourth order ordinary differential equation with four regular singular points, that is a “higher order Heun equation”, which appears to be new.
This is in contrast with the case in which Σ reduces to a point, where the ordinary differential equations for the radial part of the Green’s function (which are coupled by the Levi-Civita connection terms) can nevertheless be decoupled by a suitably chosen differential substitution into second-order equations of hypergeometric type. In all cases, we obtain a closed form expression for the Green’s function as a bundle-valued distribution.

There are some natural extensions that one could pursue for the results that we have obtained in our paper. Of immediate interest would be the extension of the method of computation of the Green’s function for the Hodge Laplacian to warped products of spaces of constant curvature. One could also develop the formalism of our paper in the case of the Laplacian on spinors, along the lines of [4], either in the plain or warped product cases. A more challenging objective, which would perhaps be of less interest in Physics, would be to adapt to the case of differential forms of arbitrary degree the analysis carried out by Lu [18] for the Green’s function for the Laplacian on 1-forms over symmetric spaces which are not maximally symmetric, such as the complex hyperbolic spaces and some classical symmetric domains of higher rank.

Our paper is organized as follows. In Sect. 2, we briefly recall the definition and key properties of the spherical means operators on differential forms defined on Riemannian space forms, as well as their relation to the action of the Hodge Laplacian. Section 3 contains a concise account of the method of spherical means in the Lorentzian case, and introduces the representation of the fundamental solution of the wave equation in Lorentzian space forms in terms of Riesz distributions. In Sect.
4, we consider the case of Riemannian products, where we establish in Theorem 2 the expression of the Green’s function for the Hodge Laplacian in terms of the spherical means operators and the solutions of a fourth-order ordinary differential equation with four regular singular points, i.e., a “fourth-order Heun equation” which has not been studied in the literature


as far as we know. An important element in our proof is a spectral decomposition result (Proposition 1) for the second-order operator of singular Sturm-Liouville type corresponding to the radial part of the Hodge operator on surfaces, which we prove in Appendix A. In Sect. 5, using Riesz distributions, we find the expression of the Green’s function for the Hodge Laplacian in the case of Lorentzian products. This is done in Theorem 3, and also involves a fourth-order radial equation of Heun type which arises now from the use of Riesz potentials. Finally, in Sect. 6, we briefly show for the sake of completeness that in the case in which the surface reduces to a point, that is when the manifold is a simply connected Riemannian or Lorentzian space form, the Green’s function obtained through our approach is expressible in terms of the solutions of a second-order equation of hypergeometric type, as shown in [3,13].

2. Spherical Means on Differential Forms Defined on Riemannian Space Forms

In this section we shall briefly review some results on spherical means for differential forms defined on the simply connected Riemannian space form M of constant sectional curvature k. Proofs and further details can be found in Ref. [15]. We denote by $B_M(x, r)$ the ball centered at the point x ∈ M of radius r and set $S_M(x, r) := \partial B_M(x, r)$. It is well known that $S_M(x, r)$ is diffeomorphic to a sphere for any 0 < r < diam(M) and that these spheres foliate M\{x} (minus the antipodal point if M is positively curved). We use the notation dS for the induced area measure on $S_M(x, r)$, whose total area is

$$m(r) := |S_M(x, r)| = |S^{n-1}| \left( \frac{\sin \sqrt{k}\, r}{\sqrt{k}} \right)^{n-1}. \tag{1}$$
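As a quick sanity check of Eq. (1), the following sketch evaluates m(r) for positive, zero and negative curvature. It is an illustration only: the function names `unit_sphere_area` and `sphere_area` are hypothetical, and the case k = 0 is handled through the limit of sin(√k r)/√k as k → 0, which is our own convention for the code rather than something stated in the text.

```python
import cmath
import math


def unit_sphere_area(n):
    """Area |S^{n-1}| of the unit sphere in R^n."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)


def sphere_area(r, k, n):
    """m(r) from Eq. (1) for curvature k and dimension n."""
    if k == 0:
        radial = r  # limit of sin(sqrt(k) r) / sqrt(k) as k -> 0
    else:
        # sin(s r) / s is even in s, so the choice of square root
        # branch does not affect the value.
        s = cmath.sqrt(k)
        radial = (cmath.sin(s * r) / s).real
    return unit_sphere_area(n) * radial ** (n - 1)


print(sphere_area(1.0, 1.0, 3))   # round 3-sphere: 4*pi*sin(1)^2
print(sphere_area(1.0, 0.0, 3))   # Euclidean space: 4*pi
print(sphere_area(1.0, -1.0, 3))  # hyperbolic space: 4*pi*sinh(1)^2
```

For k < 0 the square root is purely imaginary and the sine turns into a hyperbolic sine, so a single formula covers all three geometries.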

Throughout this paper we shall use the determination of the square root with nonnegative imaginary part, which has a branch cut on the positive real axis and is holomorphic in $\mathbb{C} \setminus [0, +\infty)$. We also recall that the injectivity radius of M coincides with its diameter, which is $+\infty$ if $k \le 0$ and $\pi/\sqrt{k}$ if $k > 0$. The space of smooth p-forms (resp. of compact support) on M will be denoted by $\Omega^p(M)$ (resp. $\Omega^p_0(M)$). Let $\rho \in C^0(M \times M)$ be the distance function on M. For any degree p we define double differential p-forms $\tau_p$, $\hat\tau_p$ by setting

$$\tau_0(x, x') := 1, \qquad \tau_1(x, x') := -\frac{\sin \sqrt{k}\, r}{\sqrt{k}}\, d d' \rho(x, x'), \tag{2a}$$

$$\tau_p(x, x') := \frac{1}{p}\, \tau_{p-1}(x, x') \wedge\wedge\, \tau_1(x, x') \quad \text{for } 2 \le p \le n, \tag{2b}$$

and

$$\hat\tau_0(x, x') := 0, \qquad \hat\tau_1(x, x') := -d\rho(x, x')\, d'\rho(x, x'), \tag{2c}$$

$$\hat\tau_p(x, x') := \hat\tau_1(x, x') \wedge\wedge\, \hat\tau_{p-1}(x, x') \quad \text{for } 2 \le p \le n \tag{2d}$$

at the points $(x, x') \in M \times M$ where the distance function is smooth. In the above formulas accented and unaccented operators act on $x'$ and $x$, respectively. Globally, this defines bundle-valued distributions $\tau_p, \hat\tau_p \in \mathcal{D}'(M \times M, \Lambda^p(T^*M) \boxtimes \Lambda^p(T^*M))$, where $\Lambda^p(T^*M) \boxtimes \Lambda^p(T^*M)$ is the vector bundle over $M \times M$ whose fiber at $(x, x')$ is $\Lambda^p(T_x^*M) \otimes \Lambda^p(T_{x'}^*M)$.


Given a smooth section $\Psi$ of $\Lambda^p(T^*M) \boxtimes \Lambda^p(T^*M)$ and $\omega \in \Omega^p(M)$, we introduce the notation

$$\Psi(x, x') \cdot \omega(x) := *\big( \Psi(x, x') \wedge *\omega(x) \big), \qquad \Psi(x, x') \cdot \omega(x') := *'\big( \Psi(x, x') \wedge *'\omega(x') \big)$$

for the left and right inner products $(\Lambda^p(T^*M) \boxtimes \Lambda^p(T^*M)) \times \Omega^p(M) \to C^\infty(M) \otimes \Omega^p(M)$. Left and right products can be also defined for $\Psi \in \mathcal{D}'(M \times M, \Lambda^p(T^*M) \boxtimes \Lambda^p(T^*M))$ and $\omega \in \Omega^p(M)$. An important property of the double differential forms defined above is that if $x'$ does not belong to the cut locus of $x$, then the sum $\tau_p(x, x') + \hat\tau_p(x, x')$ provides the parallel transport operator for p-forms along the unique minimal geodesic connecting $x$ and $x'$ [15]. More precisely, consider $\omega_0 \in \Lambda^p(T_x^*M)$ and let $\gamma$ be the minimal geodesic with $\gamma(0) = x$ and $\gamma(1) = x'$. Then $[\tau_p(x, x') + \hat\tau_p(x, x')] \cdot \omega_0 = \tilde\omega(1)$, where the smooth map $\tilde\omega : [0, 1] \to \Lambda^p(T^*M)$ satisfies the differential equation for parallel transport along $\gamma$, which with a slight abuse of notation can be written as

$$\nabla_{\dot\gamma(t)}\, \tilde\omega(t) = 0, \qquad \tilde\omega(0) = \omega_0.$$

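In particular, $\tau_p(x, x) + \hat\tau_p(x, x)$ is the identity map in $\Lambda^p(T_x^*M)$.

Definition 1. Let 0 < r < diam(M). The (Riemannian) spherical means of a smooth p-form $\omega \in \Omega^p(M)$ on the sphere of radius r are defined as

$$\mathcal{M}_r\omega(x) := \frac{1}{m(r)} \int_{S_M(x, r)} \tau_p(x, x') \cdot \omega(x')\, dS(x'), \tag{3a}$$

$$\widehat{\mathcal{M}}_r\omega(x) := \frac{1}{m(r)} \int_{S_M(x, r)} \hat\tau_p(x, x') \cdot \omega(x')\, dS(x'). \tag{3b}$$

Example 1. If $\omega \in \Omega^0(M)$, then $\widehat{\mathcal{M}}_r\omega = 0$ and $\mathcal{M}_r\omega$ coincides with the usual spherical mean on constant curvature spaces [17], which in Euclidean space is customarily written as

$$\mathcal{M}_r\omega(x) = \frac{1}{|S^{n-1}|} \int_{S^{n-1}} \omega(x + r\theta)\, dS(\theta).$$

If $\omega \in \Omega^n(M)$, $\mathcal{M}_r\omega = 0$ and $*(\widehat{\mathcal{M}}_r\omega) = \mathcal{M}_r(*\omega)$.

We shall hereafter denote by $\Delta_M := d\, d^* + d^* d$ the (positive) Hodge Laplacian in M and consider the measure on (0, diam(M)) given by $d\mu_M(r) := m(r)\, dr$. From the point of view of their applications to the Hodge Laplacian, the key property of the spherical means is given by the relation [15, p. 47]

$$\Delta_M \int_0^{\operatorname{diam}(M)} \Big[ f(r)\, \mathcal{M}_r\omega + \hat f(r)\, \widehat{\mathcal{M}}_r\omega \Big]\, d\mu_M(r) = \int_0^{\operatorname{diam}(M)} \Big[ F(f, \hat f)(r)\, \mathcal{M}_r\omega + \hat F(f, \hat f)(r)\, \widehat{\mathcal{M}}_r\omega \Big]\, d\mu_M(r), \tag{4}$$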


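where $\omega \in \Omega^p(M)$ and $f, \hat f$ are arbitrary smooth functions for which the latter integrals converge. The functions $F(f, \hat f)$, $\hat F(f, \hat f)$ are given by

$$\begin{aligned}
F(f, \hat f)(r) &:= L_M f(r) + pk \Big[ \big( 2 \csc^2 \sqrt{k}\, r + n - p - 1 \big) f(r) && \text{(5a)} \\
&\qquad\qquad - 2 \cos \sqrt{k}\, r\, \csc^2 \sqrt{k}\, r\; \hat f(r) \Big], && \text{(5b)} \\
\hat F(f, \hat f)(r) &:= L_M \hat f(r) + (n - p) k \Big[ \big( 2 \csc^2 \sqrt{k}\, r + p - 1 \big) \hat f(r) && \text{(5c)} \\
&\qquad\qquad - 2 \cos \sqrt{k}\, r\, \csc^2 \sqrt{k}\, r\; f(r) \Big], && \text{(5d)}
\end{aligned}$$

$L_M$ being the radial Laplacian

$$-L_M := \frac{\partial^2}{\partial r^2} + (n - 1) \sqrt{k}\, \cot \sqrt{k}\, r\, \frac{\partial}{\partial r}. \tag{6}$$

This formula remains valid for $f \in L^1_{\mathrm{loc}}((0, \operatorname{diam}(M)), d\mu_M)$ if ω is compactly supported and the above derivatives and integrals are understood in the sense of distributions.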
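The radial Laplacian of Eq. (6) can be probed numerically. The sketch below approximates $-L_M f$ by central finite differences and applies it, in dimension n = 3, to the functions 1/r, cot r and coth r, which are classical radial harmonic functions away from the origin on the flat, spherical and hyperbolic space forms respectively. This is an illustration only: the helper names, the step size h and the treatment of k = 0 as a limit are assumptions of the example and are not taken from the text.

```python
import cmath
import math


def cot_coefficient(r, k):
    """sqrt(k) * cot(sqrt(k) r), with the k -> 0 limit 1/r."""
    if k == 0:
        return 1.0 / r
    s = cmath.sqrt(k)
    return (s / cmath.tan(s * r)).real


def minus_radial_laplacian(f, r, k, n, h=1e-4):
    """Approximate -L_M f(r) = f'' + (n-1) sqrt(k) cot(sqrt(k) r) f'
    by central finite differences with step h."""
    d1 = (f(r + h) - f(r - h)) / (2 * h)
    d2 = (f(r + h) - 2 * f(r) + f(r - h)) / h ** 2
    return d2 + (n - 1) * cot_coefficient(r, k) * d1


# Radial harmonic functions away from r = 0 in dimension n = 3.
flat = minus_radial_laplacian(lambda r: 1 / r, 1.0, 0.0, 3)
sphere = minus_radial_laplacian(lambda r: 1 / math.tan(r), 1.0, 1.0, 3)
hyperbolic = minus_radial_laplacian(lambda r: 1 / math.tanh(r), 1.0, -1.0, 3)
print(flat, sphere, hyperbolic)  # each is zero up to discretization error
```

The same routine could be combined with the zeroth-order terms of Eq. (5) to test candidate radial profiles for p-forms, although that coupling is not exercised here.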

3. The Method of Spherical Means in the Lorentzian Case and Riesz Potentials

In this section we shall review the extension of the method of spherical means to Lorentzian spaces of constant curvature and some useful properties of Riesz potentials. Again we refer to [15] for details. We shall denote by M the simply connected Lorentzian manifold of constant sectional curvature k and signature (−, +, · · · , +). The Minkowski (k = 0) and de Sitter spaces (k > 0) are globally hyperbolic, but the anti-de Sitter space (k < 0) is not (it is, however, strongly causal [19]). For this reason it is convenient to consider a domain D ⊂ M, which will be fixed during the ensuing discussion. When k ≥ 0, D can be taken to be the whole space M, whereas when k < 0 we shall assume that D is a geodesically normal domain of M [15], that is, a domain which is a normal neighborhood of each of its points. We introduce the notation $J_D^+(x)$ for the points in D causally connected with x, i.e., the points $x' \in D$ such that there exists a future directed, non-spacelike curve from $x$ to $x'$. Furthermore, we set

$$S_D^+(x, r) := \big\{ x' \in J_D^+(x) : \rho(x, x') = r \big\}$$

and call dS its induced area measure. Obviously

$$J_D^+(x) = \bigcup_{r \ge 0} S_D^+(x, r), \quad \text{with } S_D^+(x, r) \cap S_D^+(x, r') = \emptyset \text{ if } r \ne r'. \tag{7}$$

We also use the notation $\rho : D \times D \to [0, +\infty)$ for the Lorentzian distance function.

Definition 2. The Riesz potential at x with parameter α is the distribution $R^\alpha_{D,x} \in \mathcal{D}'(D)$ defined for $\operatorname{Re} \alpha \ge n$ by the integral

$$R^\alpha_{D,x}(\phi) := C_\alpha \int_{J_D^+(x)} \rho(x, x')^{\alpha - n}\, \phi(x')\, dx', \tag{8}$$


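where φ is an arbitrary smooth function compactly supported in D, $dx'$ stands for the Lorentzian volume element and we have set

$$C_\alpha := \frac{\pi^{1 - \frac{n}{2}}\, 2^{1 - \alpha}}{\Gamma\big(\frac{\alpha}{2}\big)\, \Gamma\big(\frac{\alpha - n}{2} + 1\big)}. \tag{9}$$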
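The normalizing constant of Eq. (9) is easy to tabulate for real α. The sketch below is only an illustration: the function names are hypothetical, and the reciprocal Gamma function is extended by zero at the poles of Γ, which is the natural reading of the formula but is an assumption of the code. For α < n the integral (8) is not directly defined, so the values there describe the normalizing factor alone.

```python
import math


def reciprocal_gamma(x):
    """1 / Gamma(x), extended by 0 at the poles x = 0, -1, -2, ..."""
    if x <= 0 and x == int(x):
        return 0.0
    return 1.0 / math.gamma(x)


def riesz_constant(alpha, n):
    """C_alpha from Eq. (9) for real alpha in dimension n."""
    return (math.pi ** (1 - n / 2) * 2 ** (1 - alpha)
            * reciprocal_gamma(alpha / 2)
            * reciprocal_gamma((alpha - n) / 2 + 1))


print(riesz_constant(2, 2))  # 0.5
print(riesz_constant(2, 4))  # 0.0
print(riesz_constant(4, 4))  # 1 / (8 * pi)
```

The value 1/2 for α = n = 2 matches the familiar fundamental solution of the wave equation in 1+1 dimensions, which equals 1/2 inside the forward light cone. The vanishing of the constant for α = 2, n = 4 reflects a pole of the second Gamma factor.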

For the sake of completeness, let us recall that the map $\alpha \mapsto R^\alpha_{D,x}$ is holomorphic in $\mathbb{C}$ provided that $\alpha \mapsto R^\alpha_{D,x}(\phi)$ is an entire function for any smooth function φ with compact support in D. A basic result in the theory of Riesz potentials is the following theorem [15,20].

Theorem 1. For any x ∈ D, the map $\alpha \mapsto R^\alpha_{D,x}$ can be holomorphically extended to the whole complex plane. Moreover, $\Delta_M R^\alpha_{D,x} = R^{\alpha - 2}_{D,x}$ and $R^0_{D,x} = \delta_x$.

Corollary 1. For any $\phi \in C_0^\infty(D)$, $u(x) := R^2_{D,x}(\phi)$ solves the hyperbolic equation $\Delta_M u = \phi$ in D.

We define the bundle-valued distributions $\tau_p$, $\hat\tau_p$ over D using the Lorentzian distance ρ as in Sect. 2. If $x \in D$ and $x' \in J_D^+(x)$ does not lie on the causal cut locus of $x$, $\tau_p(x, x') + \hat\tau_p(x, x')$ is again the parallel transport operator $\Lambda^p(T_x^*D) \to \Lambda^p(T_{x'}^*D)$ along the unique future-directed minimal geodesic from $x$ to $x'$. In particular, $\tau_p(x, x) + \hat\tau_p(x, x)$ is the identity map on $\Lambda^p(T_x^*M)$.

Definition 3. Let $\omega \in \Omega^p_0(D)$. Its (Lorentzian) spherical means of radius r are defined as

$$\mathcal{M}_r^D\omega(x) := \frac{(-1)^p}{m(r)} \int_{S_D^+(x, r)} \tau_p(x, x') \cdot \omega(x')\, dS(x'),$$

$$\widehat{\mathcal{M}}_r^D\omega(x) := \frac{(-1)^p}{m(r)} \int_{S_D^+(x, r)} \hat\tau_p(x, x') \cdot \omega(x')\, dS(x'),$$

where x ∈ D and m is given by Eq. (1). We shall henceforth drop the script D when there is no risk of confusion. It should be noticed that the latter operators do not exactly describe a spherical mean in the strict sense since in the Lorentzian case m(r) does not yield the area of $S^+(x, r)$, which can be infinite. However, the fundamental relation

$$\Delta_M \int_0^\infty \Big[ f(r)\, \mathcal{M}_r\omega + \hat f(r)\, \widehat{\mathcal{M}}_r\omega \Big]\, d\mu_M(r) = \int_0^\infty \Big[ F(f, \hat f)(r)\, \mathcal{M}_r\omega + \hat F(f, \hat f)(r)\, \widehat{\mathcal{M}}_r\omega \Big]\, d\mu_M(r) \tag{10}$$

also holds true in the Lorentzian case, with $F(f, \hat f)$ and $\hat F(f, \hat f)$ defined by Eq. (5).

Green’s Function for the Hodge Laplacian on Riemannian and Lorentzian Spaces


4. Riemannian Products

In this section we shall solve the Poisson equation

$$\Delta_{M \times \Sigma}\, \psi = \omega, \tag{11}$$

$\omega$ being a compactly supported differential form in the Riemannian product space $M \times \Sigma$. If $M \times \Sigma$ is compact, we assume that $\omega$ is orthogonal to the harmonic forms in $M \times \Sigma$, which is the necessary and sufficient solvability condition of the latter equation [21]. It obviously suffices to analyze Eq. (11) when $\omega \in \Lambda^p(M) \otimes \Lambda^q(\Sigma)$ for some integers $p$ and $q$. If we denote by $\Omega_M$ and $\Omega_\Sigma$ the Riemannian volume forms in $M$ and $\Sigma$, respectively, the above orthogonality conditions amount to imposing

$$\int \omega\, \Omega_M \wedge \Omega_\Sigma = 0 \ \text{ if } (p, q) = (0, 0), \qquad \int \omega \wedge \Omega_M = 0 \ \text{ if } (p, q) = (0, 2), \tag{12a}$$

$$\int \omega \wedge \Omega_\Sigma = 0 \ \text{ if } (p, q) = (n, 0), \qquad \int \omega = 0 \ \text{ if } (p, q) = (n, 2), \tag{12b}$$

when $M \times \Sigma$ is compact.

We begin by discussing these exceptional cases. The explicit solution of the scalar equation ($p = q = 0$) for functions $\omega$ in $M \times \Sigma$ (and more general symmetric spaces) is well known and can be found in [17]. Concerning the other exceptional cases, let us denote by $*_M$ and $*_\Sigma$ the Hodge star operators of $M$ and $\Sigma$ and observe that $*_\Sigma \omega$, $*_M \omega$ or $*_\Sigma(*_M \omega)$ are $C^\infty$ scalar functions on $M \times \Sigma$, respectively, when $(p, q) = (0, 2)$, $(n, 0)$ or $(n, 2)$. As $*_M$ and $*_\Sigma$ commute with the Hodge Laplacian of $M \times \Sigma$ and preserve the orthogonality relations (12), for $(p, q) \in \{(0, 2), (n, 0), (n, 2)\}$ Eq. (11) turns out to be equivalent to the scalar case by Hodge duality. Therefore we can henceforth assume that $(p, q) \notin \{(0, 0), (0, 2), (n, 0), (n, 2)\}$. For concreteness we shall also assume that $k$ and $\kappa$ are nonzero; the case where one of the curvatures is zero can be treated along the same lines (and is considerably easier).

We find it convenient to introduce the notation $S_s$ and $\widehat{S}_s$ for the spherical means of radius $s$ in the Riemannian surface $\Sigma$ and call

$$d\mu_\Sigma(s) := 2\pi\, \frac{\sin \sqrt{\kappa}\, s}{\sqrt{\kappa}}\, ds$$

the radial measure on $(0, \operatorname{diam}(\Sigma))$. It should be observed that the decomposition

$$\Lambda^*(M \times \Sigma) = \bigoplus_{p, q \geq 0} \Lambda^p(M) \otimes \Lambda^q(\Sigma)$$

defines a natural action of the Laplacians and spherical means operators of $M$ and $\Sigma$ on $\Lambda^*(M \times \Sigma)$. A simple but important observation is the following

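Lemma 1. Let $\beta \in \Lambda_0^q(\Sigma)$. Then

$$\Delta_\Sigma \int_0^{\operatorname{diam}(\Sigma)} w(s)\, (S_s + \widehat{S}_s)\beta\, d\mu_\Sigma(s) = \int_0^{\operatorname{diam}(\Sigma)} L_q w(s)\, (S_s + \widehat{S}_s)\beta\, d\mu_\Sigma(s),$$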


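where $w \in H^2_{\mathrm{loc}}((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma)$ and

$$L_0 = L_2 := -\frac{\partial^2}{\partial s^2} - \sqrt{\kappa}\, \cot(\sqrt{\kappa}\, s)\, \frac{\partial}{\partial s}, \tag{13a}$$

$$L_1 w(s) := L_0 w(s) + \frac{2\kappa\, w(s)}{1 + \cos \sqrt{\kappa}\, s}. \tag{13b}$$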

Proof. We saw in Example 1 that $\widehat{S}_s \omega = 0$ when $q = 0$. Hence in this case we easily obtain the desired formula from Eq. (4) by noticing that $L_0 w = F(w, w)$. When $q = 1$, this follows directly from Eq. (4) since $L_1 w = F(w, w) = \widehat{F}(w, w)$. The case $q = 2$ is essentially equivalent to $q = 0$ by Hodge duality and uses that $L_2 w = \widehat{F}(w, w)$.

The formal differential operators $L_q$ given by (13) define self-adjoint operators (which we still denote by $L_q$ for simplicity of notation) with domains given by the functions $u \in H^1((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma)$ such that $L_q u \in L^2((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma)$,

$$\lim_{s \downarrow 0} s\, u'(s) = 0, \quad \text{and} \quad \lim_{s \uparrow \operatorname{diam}(\Sigma)} (\operatorname{diam}(\Sigma) - s)\, u'(s) = 0 \ \text{ if } \kappa > 0 \text{ and } q = 0.$$

Next we state a useful proposition on the spectral decomposition of the operators $L_q$. The proof, which consists of an application of the Weyl–Kodaira theorem and some manipulations of hypergeometric functions, is presented in Appendix A.

Proposition 1. Let $c_{\kappa,q} := -\frac{1}{4}\kappa$ if $\kappa < 0$ and $c_{\kappa,q} := 2\kappa\, \delta_{q1}$ if $\kappa > 0$. Then for each $q = 0, 1, 2$ there exist a Borel measure $\rho_q$ on $[c_{\kappa,q}, \infty)$ and a $(\mu_\Sigma \times \rho_q)$-measurable function $w_q : (0, \operatorname{diam}(\Sigma)) \times [c_{\kappa,q}, \infty) \to \mathbb{R}$ such that:

(i) $w_q(\cdot, \lambda)$ is an analytic formal eigenfunction of $L_q$ with eigenvalue $\lambda$ for $\rho_q$-almost every $\lambda \in [c_{\kappa,q}, \infty)$.

(ii) The map

$$u \mapsto \int_0^{\operatorname{diam}(\Sigma)} u(s)\, w_q(s, \cdot)\, d\mu_\Sigma(s)$$

defines a unitary transformation $U_q : L^2((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma) \to L^2([c_{\kappa,q}, \infty), d\rho_q)$ with inverse given by

$$U_q^{-1} u := \int_{c_{\kappa,q}}^\infty w_q(\cdot, \lambda)\, u(\lambda)\, d\rho_q(\lambda).$$

(iii) If $g(L_q)$ is any bounded function of $L_q$, then

$$g(L_q) u = \int_{c_{\kappa,q}}^\infty g(\lambda)\, w_q(\cdot, \lambda)\, U_q u(\lambda)\, d\rho_q(\lambda) \tag{14}$$

in the sense of norm convergence.

Remark 1. Explicit formulas for all the functions involved are given in Lemmas 2 and 3. By definition, $\rho_q$ is supported on the spectrum of the self-adjoint operator $L_q$.


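We shall construct a solution to Poisson's equation (11) as

$$\psi = \int w_q(s, \lambda)\, (S_s + \widehat{S}_s) \Big[ f(r, \lambda)\, M_r + \hat f(r, \lambda)\, \widehat{M}_r \Big] \omega\; d\mu_\Sigma(s)\, d\mu_M(r)\, d\rho_q(\lambda), \tag{15}$$

where the integral ranges over $[c_{\kappa,q}, \infty) \times (0, \operatorname{diam}(M)) \times (0, \operatorname{diam}(\Sigma))$. By construction, $L_q w_q(s, \lambda) = \lambda\, w_q(s, \lambda)$ for $\rho_q$-almost every $\lambda$, so from Lemma 1 and the fact that $\Delta_{M \times \Sigma} \psi = \Delta_M \psi + \Delta_\Sigma \psi$ it follows that

$$\Delta_{M \times \Sigma}\, \psi = \int w_q\, (S + \widehat{S}) \Big\{ \big[ F(f, \hat f) + \lambda f \big] M + \big[ \widehat{F}(f, \hat f) + \lambda \hat f \big] \widehat{M} \Big\}\, \omega\; d\mu_\Sigma\, d\mu_M\, d\rho_q, \tag{16}$$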

with $F$ and $\widehat{F}$ given by (5).

Let us consider the system of coupled ODEs,

$$F(f, \hat f) + \lambda f = 0, \qquad \widehat{F}(f, \hat f) + \lambda \hat f = 0, \tag{17}$$

depending on the nonnegative parameter $\lambda$. Let us introduce the variable $z := \sin^2 \frac{\sqrt{k}\, r}{2}$, which takes values in $(0, 1)$ when $r \in (0, \operatorname{diam}(M))$ and $M$ is compact, and in $(-\infty, 0)$ when $M$ is noncompact. With an abuse of notation we shall temporarily write $f(z)$ or $\hat f(z)$ for the expression of the functions $f$ or $\hat f$ in this variable, that is, for $f\big(\frac{2}{\sqrt{k}} \arcsin \sqrt{\,\cdot\,}\big)$. This allows us to write Eqs. (17) as

$$z(1-z) f''(z) + \frac{n}{2}(1 - 2z) f'(z) = \Big[ \Lambda + p \Big( \frac{1}{2z(1-z)} + n - p - 1 \Big) \Big] f(z) - \frac{p(1-2z)}{2z(1-z)}\, \hat f(z), \tag{18a}$$

$$z(1-z) \hat f''(z) + \frac{n}{2}(1 - 2z) \hat f'(z) = \Big[ \Lambda + (n-p) \Big( \frac{1}{2z(1-z)} + p - 1 \Big) \Big] \hat f(z) - \frac{(n-p)(1-2z)}{2z(1-z)}\, f(z), \tag{18b}$$

with $\Lambda := \lambda / k$. We can solve (18a) for $\hat f$ and substitute in the second equation to arrive at a linear fourth order differential equation for $f$, namely

$$\frac{d^4 f}{dz^4} + \sum_{j=0}^{3} Q_j(z)\, \frac{d^j f}{dz^j} = 0 \tag{19}$$


with the rational functions Q j being given by n+4 n+4 4 + + , z−1 z 1 − 2z Q 2 (z) = 16 n 2 + 2( p + 3)n − 2 p 2 + 2( + 3) z 4 + 32 2 p 2 − n 2 − 2( p + 3)n −2( + 3) z 3 +8 3n 2 +5( p + 3)n + 3n−5 p 2 ++4( + 3)+6 z 2+8 p 2−n 2 −( p + 3)n−3n−−6 z + n 2 +6n + 8 16z 6 −48z 5 +52z 4−24z 3 +4z 2 , Q 1 (z) = 2 pn 2 + n 2 − 2 p 2 n + 4 pn + 2n + 2n − 4 p 2 + 4 + 4 −2 pn 2 − n 2 + 2 p 2 n −2 pn − 2n − 2n + 2 p 2 + 2 p − 2 z + 4 2 pn 2 + n 2 − 2 p 2 n+2 pn+2n 4z 5 − 10z 4 + 8z 3 − 2z 2 , + 2n − 2 p 2 − 2 p + 2 z 2 Q 0 (z) = p 4 − 2np 3 + n 2 p 2 − 2p 2 + p 2 − 2np + 2np − 2 p + 2 − 2 + 4 − p 4 + 2np 3 − n 2 p 2 + 2p 2 + p 2 − 2np − 2 z + 4 p 4 − 2np 3 + n 2 p 2 −2p 2 4z 6 − 12z 5 + 13z 4 − 6z 3 + z 2 . − p 2 + 2np + 2 z 2 Q 3 (z) =

The fourth order equation (19) plays a crucial role in the computation of the Green's function of the Hodge Laplacian. A simple computation shows that it has four regular singular points at $0$, $\frac{1}{2}$, $1$ and $\infty$, so it can be understood as a fourth order analogue of the Heun equation. For our purpose it is important to make the following observation.

Proposition 2. For all $\Lambda \geq 0$ there exists a unique real solution $\Phi_+(z, \Lambda)$ of Eq. (19) with the asymptotic behavior at $0$

$$\Phi_+(z, \Lambda) = \frac{4\, z^{1 - \frac{n}{2}}}{k(n-2)|S^{n-1}|} + O\big(z^{2 - \frac{n}{2}}\big) \quad \text{if } n \geq 3, \tag{20a}$$

$$\Phi_+(z, \Lambda) = -\frac{\log |z|}{4\pi} + O(1) \quad \text{if } n = 2, \tag{20b}$$

and the fastest possible decay as $z$ tends to $-\infty$. If $\Lambda > 0$, or $p \neq 0, n$ and $\Lambda \geq 0$, there is also a unique real solution $\Phi_-(z, \Lambda)$ of the latter equation such that $\Phi_-(\cdot, \Lambda)$ is analytic in the half-closed interval $(0, 1]$ and has the asymptotic behavior (20) at $0$.

Proof. Equation (19) is a fourth order Fuchsian equation with poles at $0$, $\frac{1}{2}$, $1$ and $\infty$. For the sake of concreteness we shall assume that $n \geq 3$. Then the characteristic exponents at $0$ and $1$ are $-\frac{n}{2}$, $1 - \frac{n}{2}$, $0$ and $1$, so that $\Phi_+$ must be given by

$$\Phi_+(z, \Lambda) = \frac{4}{k(n-2)|S^{n-1}|}\, \phi_{1 - \frac{n}{2}}(z) + c_1 \phi_0(z) + c_2 \phi_1(z),$$

where $\phi_\nu$ stands for the local solution of (19) which is analytic on $(-\infty, 0)$ and asymptotic to $z^\nu$ at $0$. The characteristic exponents at $\infty$ are $\frac{1}{2}\big[n + 1 \pm ((n + 1 - 2p)^2 - 4\Lambda)^{1/2}\big]$ and $\frac{1}{2}\big[n - 1 \pm ((n - 1 - 2p)^2 - 4\Lambda)^{1/2}\big]$, so that each local solution $\phi_\nu$ has well-defined


asymptotics at $\infty$ and the real constants $c_1$ and $c_2$ can be chosen so as to obtain the fastest possible decay.

The analysis on the interval $(0, 1)$ is similar. The characteristic exponents at $\frac{1}{2}$ are $0$, $1$, $3$ and $4$, so all the solutions of the equation are analytic at this point. The function $\Phi_-$ must therefore be given by

$$\Phi_-(z, \Lambda) = \frac{4}{k(n-2)|S^{n-1}|}\, \phi_{1 - \frac{n}{2}}(z) + c_1 \phi_0(z) + c_2 \phi_1(z), \tag{21}$$

where the local solutions $\phi_\nu$ are now analytic in $(0, 1)$ and asymptotic to $z^\nu$ at $0$. The constants $c_1$ and $c_2$ should now be chosen so as to ensure that $\Phi_-$ is also analytic at $1$, i.e., that it does not have any terms asymptotic to $(1-z)^{-\frac{n}{2}}$ or $(1-z)^{1-\frac{n}{2}}$. To prove that this is always possible when $p \neq 0, n$ or $\Lambda \neq 0$, we proceed as follows. Let us still denote by $M$ the round $n$-sphere of curvature $k$, and let $\lambda_p \geq 0$ denote the lowest eigenvalue of the Laplacian $\Delta_M$ acting on $p$-forms. It is obvious that $\lambda_p$ is nonzero for all $p \neq 0, n$. Since $k\Lambda + \Delta_M|_{\Lambda^p(M)} \geq k\Lambda + \lambda_p$, which is strictly positive whenever $p \neq 0, n$ or $\Lambda \neq 0$, it follows that in this case the equation

$$(k\Lambda + \Delta_M)\, u = \alpha \tag{22}$$

on $p$-forms has a unique solution for all $\alpha \in \Lambda^p(M)$, and that (the Friedrichs extension of) $k\Lambda + \Delta_M|_{\Lambda^p(M)}$ has a compact inverse given by an integral kernel. Moreover, its Green's function, i.e. the integral kernel of $(k\Lambda + \Delta_M|_{\Lambda^p(M)})^{-1}$, must be equivariant under isometries. However, it follows from our previous arguments that the existence of an equivariant Green's function with a pole at a given point $y \in M$ for Eq. (22) is tantamount to the existence of constants $c_1$, $c_2$ in (21) such that $\Phi_-$ is analytic in $(0, 1]$, thus completing the proof of the claim. When $n = 2$, the local solutions of Eq. (19) at zero behave as $z^{-1}$, $\log |z|$, $1$ and $z$, and the same reasoning applies mutatis mutandis.

Now we have all the ingredients to prove the main result of this section.

Theorem 2. Let us set $f(r, \lambda) := w_q(0, \lambda)\, \Phi_{\operatorname{sign} k}\big(\sin^2 \tfrac{\sqrt{k}\, r}{2}, \lambda/k\big)$ and define a function $\hat f(r, \lambda)$ by means of Eq. (18a), i.e., as

$$\hat f(r, \lambda) := \frac{L_M f(r, \lambda) + pk \big( 2 \csc^2 \sqrt{k}\, r + n - p - 1 \big) f(r, \lambda)}{2pk\, \cos \sqrt{k}\, r\; \csc^2 \sqrt{k}\, r}.$$

Then the function $\psi$ given by (15) solves Eq. (11) for $\omega \in \Lambda^p(M) \otimes \Lambda^q(\Sigma)$.

Proof. We start by noticing that, by Proposition 2 and the restriction on the possible values of $(p, q)$, the function $\Phi_{\operatorname{sign} k}\big(\sin^2 \tfrac{\sqrt{k}\, r}{2}, \lambda/k\big)$ is well defined for all values of $r$ and $\lambda$ in the integration range. It should be noticed that the definition of $\hat f(\cdot, \lambda)$ ensures that it has the same asymptotic behavior at $0$ as $f(\cdot, \lambda)$, namely (cf. (20))

$$\frac{w_q(0, \lambda)}{(n-2)|S^{n-1}|\, r^{n-2}} + O(r^{3-n}).$$


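An important observation is that the distribution defined by $F(f(\cdot, \lambda), \hat f(\cdot, \lambda))\, d\mu_M$ (which equals $\widehat{F}(f(\cdot, \lambda), \hat f(\cdot, \lambda))\, d\mu_M$) is in fact $w_q(0, \lambda)$ times the Dirac measure supported at $0$, since

$$\int_0^{\operatorname{diam}(M)} \Big\{ f(r, \lambda)\, L_M \varphi(r) + \Big[ pk \big( 2 \csc^2 \sqrt{k}\, r + n - p - 1 \big) f(r, \lambda) - 2pk\, \cos \sqrt{k}\, r\; \csc^2 \sqrt{k}\, r\; \hat f(r, \lambda) \Big] \varphi(r) \Big\}\, d\mu_M(r) = w_q(0, \lambda)\, \varphi(0) \tag{23}$$

for all $\varphi \in C_0^\infty([0, \operatorname{diam}(M)])$. This immediately stems from the fact that $F(f(\cdot, \lambda), \hat f(\cdot, \lambda))(r)$ is zero for all $r \neq 0$ by construction (cf. Proposition 2) and the asymptotic behavior of $f$ and $\hat f$ at zero. In particular, as $M_0 + \widehat{M}_0$ is the identity map it easily follows that

$$\int_0^{\operatorname{diam}(M)} \Big\{ \big[ F(f, \hat f) + \lambda f \big] M_r + \big[ \widehat{F}(f, \hat f) + \lambda \hat f \big] \widehat{M}_r \Big\}\, \omega\; d\mu_M(r) = \omega.$$

As a consequence of this, Eq. (16) reduces to

$$\Delta_{M \times \Sigma}\, \psi = \int w(0, \lambda)\, w(s, \lambda)\, \big( S_s + \widehat{S}_s \big) \omega\; d\mu_\Sigma(s)\, d\rho_q(\lambda). \tag{24}$$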

When the integral in $\lambda$ ranges over $[c_{\kappa,q}, \infty)$, it immediately follows from Proposition 1 and Lemma 4 on the pointwise convergence of the integral (14) that

$$\Delta_{M \times \Sigma}\, \psi = \big( S_0 + \widehat{S}_0 \big) \omega = \omega,$$

and the claim in the statement follows.

Remark 2. Being constructed using only the distance function and the metric on $M$ and $\Sigma$, the symmetric Green's function that we have computed is equivariant under the isometries of $M \times \Sigma$.

Remark 3. Some comments on the uniqueness of the solution to Eq. (11) are in order. When $M \times \Sigma$ is compact, it follows from standard Hodge theory that for the above values of $(p, q)$ we have constructed the only $L^2$ solution to the equation. The case when $M$ or $\Sigma$ is noncompact can be analyzed using that $\ker(\Delta_{M \times \Sigma}) = \ker(\Delta_M) \otimes \ker(\Delta_\Sigma)$ and that the $L^2$ kernel of the Hodge Laplacian of the hyperbolic $m$-space acting on $s$-forms is $\{0\}$ if $m \neq 2s$ and infinite dimensional otherwise [9,10]. Hence for the above values of $(p, q)$ Theorem 2 also yields the only $L^2$ solution to (11) when $k > 0$, $\kappa < 0$ and $(p, q) \notin \{(0, 1), (n, 1)\}$, when $k < 0$, $\kappa > 0$ and $(p, q) \notin \{(\frac{n}{2}, 0), (\frac{n}{2}, 2)\}$, and when $k < 0$, $\kappa < 0$ and $(p, q) \neq (\frac{n}{2}, 1)$.

5. Lorentzian Products

In this section we shall solve the equation

$$\Delta_{M \times \Sigma}\, \psi = \omega \tag{25}$$

for a compactly supported form $\omega \in \Lambda^p(D) \otimes \Lambda^q(\Sigma)$ by constructing an advanced Green's operator for $\Delta_{M \times \Sigma}$. We assume that $M$ and $\Sigma$ respectively have Lorentzian and Riemannian signature. As in Sect. 3, if $M$ is not globally hyperbolic ($k < 0$) we


restrict ourselves to a geodesically normal domain $D \times \Sigma$. If $M$ is globally hyperbolic ($k \geq 0$) we simply set $D := M$. The symbols $M_r$ and $\widehat{M}_r$ will stand for the Lorentzian spherical means in $D$, but other than that we will use the same notation as in the previous section. We recall that for any globally hyperbolic manifold $D \times \Sigma$ there exists a unique advanced Green's operator $\Lambda_0^*(D \times \Sigma) \to \Lambda^*(D \times \Sigma)$ for the Hodge Laplacian [5].

As discussed in the previous section, when $(p, q) \in \{(0, 0), (0, 2), (n, 0), (n, 2)\}$, Eq. (25) is equivalent by Hodge duality to the scalar wave equation, whose Green's function is well known [17]. Hence we shall assume that $(p, q)$ does not take any of the above values and that both $k$ and $\kappa$ are nonzero.

The results in the previous section and the definition of the Riesz potentials strongly suggest that we analyze the system of ordinary differential equations

$$F\big( f_\alpha(\cdot, \lambda), \hat f_\alpha(\cdot, \lambda) \big)(r) + \lambda f_\alpha(r, \lambda) = C_{\alpha-2}\, r^{\alpha - n - 2}, \tag{26a}$$

$$\widehat{F}\big( f_\alpha(\cdot, \lambda), \hat f_\alpha(\cdot, \lambda) \big)(r) + \lambda \hat f_\alpha(r, \lambda) = C_{\alpha-2}\, r^{\alpha - n - 2}, \tag{26b}$$

where $\lambda$ is positive, $C_\alpha$ is given by (9) and we assume for the moment that $\operatorname{Re} \alpha > n + 2$. We find it convenient to write this equation in the variable $z := \sin^2 \frac{\sqrt{k}\, r}{2}$, writing $f_\alpha(z)$ or $\hat f_\alpha(z)$ for the expression of the functions $f_\alpha(r, \lambda)$ or $\hat f_\alpha(r, \lambda)$ in terms of $z$ with some abuse of notation. Defining the function $h_\alpha(z) := -\frac{1}{k}\, C_{\alpha-2} \big( \frac{2}{\sqrt{k}} \arcsin \sqrt{z} \big)^{\alpha - n - 2}$, Eqs. (26) now read

$$z(1-z) f_\alpha''(z) + \frac{n}{2}(1 - 2z) f_\alpha'(z) - \Big[ \Lambda + p \Big( \frac{1}{2z(1-z)} + n - p - 1 \Big) \Big] f_\alpha(z) = h_\alpha(z) - \frac{p(1-2z)}{2z(1-z)}\, \hat f_\alpha(z), \tag{27a}$$

$$z(1-z) \hat f_\alpha''(z) + \frac{n}{2}(1 - 2z) \hat f_\alpha'(z) - \Big[ \Lambda + (n-p) \Big( \frac{1}{2z(1-z)} + p - 1 \Big) \Big] \hat f_\alpha(z) = h_\alpha(z) - \frac{(n-p)(1-2z)}{2z(1-z)}\, f_\alpha(z), \tag{27b}$$

where $\Lambda := \lambda / k$. We can combine Eqs. (27) to obtain a single fourth order equation for $f_\alpha$, namely

$$\frac{d^4 f_\alpha}{dz^4} + \sum_{j=0}^{3} Q_j(z)\, \frac{d^j f_\alpha}{dz^j} = H_\alpha(z), \tag{28}$$

where the rational functions $Q_j$ were defined in Sect. 4, $H_\alpha(z) := q_0(z) h_\alpha(z) + q_1(z) h_\alpha'(z) + q_2(z) h_\alpha''(z)$ and

$$q_1(z) = \frac{(z-1) z \big( 4nz^2 + 8z^2 - 4nz - 8z + n + 4 \big)}{p(2z-1)^2}, \qquad q_2(z) = \frac{2(z-1)^2 z^2}{p(2z-1)},$$

$$q_0(z) = \frac{2z \big[ (z-1) \Lambda (1-2z)^2 + p \big( -zp + p + n(z-1) + z \big)(1-2z)^2 - 2z + 2 \big]}{p(2z-1)^3}.$$
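The coefficients $q_0$, $q_1$, $q_2$ can be cross-checked against the elimination of $\hat f_\alpha$ from Eqs. (27). Solving (27a) gives $\hat f_\alpha = G\,(h_\alpha - \mathcal{A} f_\alpha)$ with $G(z) = 2z(1-z)/(p(1-2z))$, where $\mathcal{A}$ denotes the left-hand side operator of (27a); inserting this into (27b) and collecting the terms in $h_\alpha$, $h_\alpha'$, $h_\alpha''$ should reproduce the $q_j$ up to the overall sign convention. The following is a minimal sketch in exact rational arithmetic; the helper names (`G`, `dG`, `d2G`, `B`, `q_from_elimination`, `q_closed_form`, `coefficients_agree`) and the sample points are assumptions of this sketch, not part of the original text.

```python
from fractions import Fraction as Fr


def G(z, p):
    """Factor in f_hat = G * (h - A f), obtained by solving (27a) for f_hat."""
    return 2 * z * (1 - z) / (p * (1 - 2 * z))


def dG(z, p):
    """First derivative of G, computed by hand."""
    return 2 * (1 - 2 * z + 2 * z * z) / (p * (1 - 2 * z) ** 2)


def d2G(z, p):
    """Second derivative of G, computed by hand."""
    return Fr(4) / (p * (1 - 2 * z) ** 3)


def B(z, n, p, lam):
    """Zeroth-order coefficient multiplying f_hat in (27b)."""
    return lam + (n - p) * (1 / (2 * z * (1 - z)) + p - 1)


def q_from_elimination(z, n, p, lam):
    """Minus the coefficients of h, h', h'' in B_op(G h) - h, where B_op is the operator of (27b)."""
    q2 = -(z * (1 - z) * G(z, p))
    q1 = -(2 * z * (1 - z) * dG(z, p) + Fr(n, 2) * (1 - 2 * z) * G(z, p))
    q0 = -(z * (1 - z) * d2G(z, p) + Fr(n, 2) * (1 - 2 * z) * dG(z, p)
           - B(z, n, p, lam) * G(z, p) - 1)
    return q0, q1, q2


def q_closed_form(z, n, p, lam):
    """The rational functions q_0, q_1, q_2 as printed after Eq. (28)."""
    q2 = 2 * (z - 1) ** 2 * z ** 2 / (p * (2 * z - 1))
    q1 = ((z - 1) * z * (4 * n * z * z + 8 * z * z - 4 * n * z - 8 * z + n + 4)
          / (p * (2 * z - 1) ** 2))
    q0 = (2 * z * ((z - 1) * lam * (1 - 2 * z) ** 2
                   + p * (-z * p + p + n * (z - 1) + z) * (1 - 2 * z) ** 2
                   - 2 * z + 2)
          / (p * (2 * z - 1) ** 3))
    return q0, q1, q2


def coefficients_agree():
    """Compare both expressions exactly at a few rational sample points."""
    samples = [(Fr(1, 4), 4, 1, Fr(3)), (Fr(-2, 3), 5, 2, Fr(7, 5)), (Fr(5, 7), 3, 1, Fr(1, 2))]
    return all(q_from_elimination(*s) == q_closed_form(*s) for s in samples)


if __name__ == "__main__":
    print(coefficients_agree())
```

Agreement at several rational points with distinct $(n, p, \Lambda)$ is not a proof, but it is a strong indication that the printed $q_j$ are consistent with Eqs. (27).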

Using the results from the previous section it is not difficult to prove the following


Proposition 3. Under the above hypotheses, Eq. (28) has a unique solution $\Phi_{\alpha,+}(z, \Lambda)$ which is continuous in $[0, 1]$ if $k > 0$, and a unique solution $\Phi_{\alpha,-}(z, \Lambda)$ which is continuous in $(-\infty, 0]$ and has the fastest possible decay at infinity if $k < 0$. These solutions are real for real $\alpha$.

Proof. Let us assume that $n \geq 3$ and $k < 0$. We omit the discussion of the case $n = 2$, which goes along the same lines by replacing $z^{1 - \frac{n}{2}}$ by $\log |z|$ in the discussion. We saw in Proposition 2 that the homogeneous equation (19) has four regular singularities at $0$, $\frac{1}{2}$, $1$ and $\infty$. As $H_\alpha$ is continuous on $(-\infty, 0]$ for $\alpha > n + 2$, the method of the variation of constants yields a particular solution $\Psi_0$ of (28) which is analytic in $(-\infty, 0)$ with possible singularities at $0$ of order $z^{-\frac{n}{2}}$ and $z^{1 - \frac{n}{2}}$. The function in the statement of the proposition is thus given by

$$\Phi_{\alpha,-}(z, \Lambda) = \Psi_0(z) + c_1 \phi_{-\frac{n}{2}}(z) + c_2 \phi_{1 - \frac{n}{2}}(z) + c_3 \phi_0(z) + c_4 \phi_1(z) \tag{29}$$

in the notation of Proposition 2. The constants $c_1$ and $c_2$ are chosen so that

$$\lim_{z \uparrow 0} z^{\frac{n}{2} - 1}\, \Phi_{\alpha,-}(z, \Lambda) = 0,$$

while $c_3$ and $c_4$ are chosen so as to obtain the fastest possible decay at infinity.

Suppose now that $k > 0$. Let us observe that for $\alpha > n + 2$ the function $H_\alpha$ is continuous in $[0, \frac{1}{2}) \cup (\frac{1}{2}, 1]$, whereas it diverges as $(z - \frac{1}{2})^{-3}$ at $\frac{1}{2}$. However, as the characteristic exponents of the homogeneous equation at $\frac{1}{2}$ are $(0, 1, 3, 4)$, it is standard that the method of the variation of constants yields a particular solution $\Psi_0$ of Eq. (28) which is continuous (actually, analytic) at $\frac{1}{2}$. The desired solution is obtained, in the notation of Proposition 2, as

$$\Phi_{\alpha,+}(z, \Lambda) = \Psi_0(z) + c_1 \phi_{-\frac{n}{2}}(z) + c_2 \phi_{1 - \frac{n}{2}}(z) + c_3 \phi_0(z) + c_4 \phi_1(z).$$

Here the constants $c_j$ are chosen so that

$$\lim_{z \downarrow 0} z^{\frac{n}{2} - 1}\, \Phi_{\alpha,+}(z, \Lambda) = \lim_{z \uparrow 1} (1 - z)^{\frac{n}{2} - 1}\, \Phi_{\alpha,+}(z, \Lambda) = 0,$$

i.e., so as to remove the singularities of order $\frac{n}{2}$ and $\frac{n}{2} - 1$ at $0$ and at $1$. From the proof of Proposition 2 it stems that this determines the constants $c_j$.

The $\Lambda^p(T_x^* M)$-valued distribution defined for each $x \in D$ as

$$\mathcal{R}^\alpha_{x,D}(\phi) := C_\alpha \int r^{\alpha - n} \big( M_r + \widehat{M}_r \big) \phi(x)\, d\mu_M(r) = C_\alpha \int_{J_D^+(x)} \rho(x, x')^{\alpha - n} \big[ \tau(x, x') + \hat\tau(x, x') \big] \cdot \phi(x')\, dx', \qquad \phi \in \Lambda_0^p(M),$$

is easily seen to be holomorphic for $\operatorname{Re} \alpha > n$, with fixed $x$ and $\lambda$. By Theorem 1 the function $\alpha \mapsto \mathcal{R}^\alpha_{x,D}$ admits an entire extension, and as $\tau(x, x) + \hat\tau(x, x) = \mathrm{id}_{\Lambda^p(T_x^* M)}$ it follows from this theorem that

$$\mathcal{R}^0_{x,D}(\phi) = \phi(x). \tag{30}$$


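In what follows we shall set $f_\alpha(r, \lambda) := \Phi_{\alpha, \operatorname{sign} k}\big( \sin^2 \tfrac{\sqrt{k}\, r}{2}, \lambda/k \big)$, with $\Phi_{\alpha,\pm}$ as in Proposition 3, and define $\hat f_\alpha$ so that Eq. (26) holds true, that is, as

$$\hat f_\alpha(r, \lambda) := \frac{L_M f_\alpha(r, \lambda) + pk \big( 2 \csc^2 \sqrt{k}\, r + n - p - 1 \big) f_\alpha(r, \lambda) - C_{\alpha-2}\, r^{\alpha - n - 2}}{2pk\, \cos \sqrt{k}\, r\; \csc^2 \sqrt{k}\, r}.$$

It is natural to define another vector-valued distribution by

$$\mathcal{R}^{\alpha,\lambda}_{x,D}(\phi) := \int \Big[ f_\alpha(r, \lambda)\, M_r + \hat f_\alpha(r, \lambda)\, \widehat{M}_r \Big] \phi(x)\, d\mu_M(r), \qquad \phi \in \Lambda_0^p(M),$$

for $\operatorname{Re} \alpha > n - 2$. By Eqs. (10), (26) and (30) it follows that the two distributions that we have just defined are related by

$$(\Delta_M + \lambda)\, \mathcal{R}^{\alpha,\lambda}_{x,D} = \mathcal{R}^{\alpha-2}_{x,D}.$$

In fact, from the above equation and Proposition 3 it is standard [5,20] that the function $\alpha \mapsto \mathcal{R}^{\alpha,\lambda}_{x,D}$ can be holomorphically extended to the whole complex plane. Thus we are led to a useful generalization of Corollary 1 that we summarize in the following

Proposition 4. The distribution $\mathcal{R}^{\alpha,\lambda}_{x,D}$ is a holomorphic function of $\alpha \in \mathbb{C}$ for fixed $x \in D$, $\lambda \in \operatorname{spec}(L_q)$. Moreover, for any $\phi \in \Lambda_0^p(D)$ the differential form $\Psi(x) := \mathcal{R}^{2,\lambda}_{x,D}(\phi)$ solves the equation $(\Delta_M + \lambda) \Psi = \phi$ in $D$.

The action of $\mathcal{R}^{\alpha,\lambda}_{x,D}$ naturally defines a bundle-valued distribution on $M \times \Sigma$, which we do not distinguish notationally, which acts on the first factor of any $\omega \in \Lambda^p(D) \otimes \Lambda^q(\Sigma)$ to yield an element of $\Lambda^p(T_x^* M) \otimes \Lambda^q(\Sigma)$. Namely, if $\omega(x, y) = \omega_1(x) \otimes \omega_2(y)$ this action reads

$$\mathcal{R}^{\alpha,\lambda}_{x,D}(\omega)(y) := \mathcal{R}^{\alpha,\lambda}_{x,D}(\omega_1) \otimes \omega_2(y),$$

with $(x, y) \in D \times \Sigma$. This result allows us to express the solution of Eq. (25) as follows.

Theorem 3. The differential form

$$\psi(x, y) := \int w_q(0, \lambda)\, w_q(s, \lambda)\, \mathcal{R}^{2,\lambda}_{x,D}(\omega)(y)\; d\mu_M(r)\, d\mu_\Sigma(s)\, d\rho_q(\lambda),$$

where the integral ranges over $[c_{\kappa,q}, \infty) \times (0, \operatorname{diam}(\Sigma)) \times (0, \operatorname{diam}(M))$, solves Eq. (25) for any compactly supported $\omega \in \Lambda^p(D) \otimes \Lambda^q(\Sigma)$.

Proof. It immediately follows from the definition of $w_q$ and Proposition 4 that

$$\Delta_{M \times \Sigma}\, \psi(x, y) = \int w_q(0, \lambda)\, w_q(s, \lambda)\, (\lambda + \Delta_M)\, \mathcal{R}^{2,\lambda}_{x,D}(\omega)(y)\; d\mu_M(r)\, d\mu_\Sigma(s)\, d\rho_q(\lambda) = \int w_q(0, \lambda)\, w_q(s, \lambda)\, \omega(x, y)\; d\mu_\Sigma(s)\, d\rho_q(\lambda).$$

As in the proof of Theorem 2, now Proposition 1 and Lemma 4 show that

$$\Delta_{M \times \Sigma}\, \psi(x, y) = (S_0 + \widehat{S}_0)\, \omega(x, y) = \omega(x, y),$$

completing the proof of the statement.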


6. Constant Curvature Spaces

We saw in Eqs. (17) and (26) that the radial behavior of the Green's operator in the product manifolds $M \times \Sigma$ is controlled by a fourth order analogue of the Heun equation. On the other hand, it is well known [3,13] that the Hodge Green's function on $M$ (as well as the scalar Green's function of any simply connected rank 1 symmetric space [17]) is controlled by simple hypergeometric equations: this merely reflects that the geometry of $M \times \Sigma$ is considerably more involved than that of its individual factors. For the sake of completeness, in this short section we shall discuss the simplifications that give rise to hypergeometric functions when one considers the equation on $M$, that is, when the factor $\Sigma$ collapses to a point.

We shall therefore analyze the following system of coupled ODEs:

$$z(1-z) f''(z) + \frac{n}{2}(1 - 2z) f'(z) - \Big[ \Lambda + p \Big( \frac{1}{2z(1-z)} + n - p - 1 \Big) \Big] f(z) = h(z) - \frac{p(1-2z)}{2z(1-z)}\, \hat f(z), \tag{31a}$$

$$z(1-z) \hat f''(z) + \frac{n}{2}(1 - 2z) \hat f'(z) - \Big[ \Lambda + (n-p) \Big( \frac{1}{2z(1-z)} + p - 1 \Big) \Big] \hat f(z) = h(z) - \frac{(n-p)(1-2z)}{2z(1-z)}\, f(z). \tag{31b}$$

When $h(z) = 0$ this equation controls the radial part of the Green's operator for the Poisson equation on $M \times \Sigma$ studied in Sect. 4, whereas when $h$ is the function $h_\alpha$ defined in Sect. 5 we obtain the equation for the radial part of the retarded Green's function. It is natural to look for functions $a$, $\hat a$, $b$ and $\hat b$ (possibly depending on $p$ and $\Lambda$) such that the functions

$$g(z) := f'(z) + p\, a(z) f(z) - p\, b(z) \hat f(z), \tag{32a}$$

$$\hat g(z) := \hat f'(z) + (n-p)\, \hat a(z) \hat f(z) - (n-p)\, \hat b(z) f(z) \tag{32b}$$

satisfy a first-order system of ODEs (equivalent to (31)) of the form

$$z(1-z) g'(z) + (n-p) A(z) g(z) + p B(z) \hat g(z) = h(z), \tag{33a}$$

$$z(1-z) \hat g'(z) + p \hat A(z) \hat g(z) + (n-p) \hat B(z) g(z) = h(z). \tag{33b}$$

When this reduction can be performed, it is possible to express the functions $f$ and $\hat f$ in terms of the solutions to the decoupled second-order equations for $f$, $\hat f$ and $g$, $\hat g$ that one obtains from (32) and (33). As before, Hodge duality and the fact that the scalar case is well known allow us to assume that $p \neq 0, n$.


After some manipulations one finds that these equations amount to imposing the following conditions:

$$(z-1) z\, b(z) - B(z) = 0,$$
$$(z-1) z\, \hat b(z) - \hat B(z) = 0,$$
$$n \big( \tfrac{1}{2} - z \big) + (n-p)(z-1) z\, \hat a(z) - p \hat A(z) = 0,$$
$$n \big( \tfrac{1}{2} - z \big) + p (z-1) z\, a(z) + (p-n) A(z) = 0,$$
$$\frac{1}{2} \Big( \frac{1}{z-1} + \frac{1}{z} \Big) + (p-n) A(z) b(z) + (p-n) B(z) \hat a(z) + (z-1) z\, b'(z) = 0,$$
$$\frac{1}{2} \Big( \frac{1}{z-1} + \frac{1}{z} \Big) - p \hat A(z) \hat b(z) - p\, a(z) \hat B(z) + (z-1) z\, \hat b'(z) = 0,$$
$$\frac{\Lambda}{p} + 1 - n + p + \frac{1}{2(z-1)z} + (p-n)\, a(z) A(z) + (p-n) B(z) \hat b(z) + (z-1) z\, a'(z) = 0,$$
$$\frac{\Lambda}{n-p} + 1 - p + \frac{1}{2z(z-1)} - p\, \hat a(z) \hat A(z) - p\, b(z) \hat B(z) + (z-1) z\, \hat a'(z) = 0.$$

Thus we have eight equations with eight unknowns, but solving the above system is in general a formidable task. The crucial observation is that this system of ODEs simplifies considerably when $\Lambda = 0$. In fact, in this case we can set $a = \hat a$ and $b = \hat b$, which immediately leads to the solution

$$a(z) = \hat a(z) = \frac{1 - 2z}{2z(1-z)}, \qquad b(z) = \hat b(z) = -\frac{1}{2z(1-z)}.$$

Thus one derives that when $\Lambda = 0$ the equations satisfied by the new functions $g$ and $\hat g$ are

$$z(1-z) g'(z) + (n-p) \big( \tfrac{1}{2} - z \big) g(z) + \tfrac{1}{2} p\, \hat g(z) = h(z), \tag{34a}$$

$$z(1-z) \hat g'(z) + p \big( \tfrac{1}{2} - z \big) \hat g(z) + \tfrac{1}{2} (n-p)\, g(z) = h(z), \tag{34b}$$

and by isolating $\hat g$ in (34a) one readily finds that

$$z(1-z) g''(z) + \big( \tfrac{n}{2} + 1 \big)(1 - 2z) g'(z) - (n-p)(p+1)\, g(z) = H(z),$$

with $H(z) := h'(z) + \frac{p\, h(z)}{z - 1}$. This is a hypergeometric equation, which can be solved in terms of associated Legendre functions by the variation of constants method.

Now it suffices to notice that the equations controlling the radial behavior of the Green's operators on $M$ are obtained from those of $M \times \Sigma$ by collapsing $\Sigma$ to a point, which amounts to setting $\Lambda = 0$. Thus from this discussion and the previous sections we recover the (non-rigorous) result of Allen–Jacobson [3] and Folacci [13] that the Green's function for the Hodge Laplacian in a simply connected space of constant curvature can be expressed in terms of hypergeometric functions, as in the scalar case. Detailed albeit somewhat formal discussions can be found in the aforementioned references; details are omitted. It should be stressed that for arbitrary values of $\Lambda$ the radial equation (31a) does not seem to admit an analogous reduction.


A. The Radial Equation on Surfaces

In this Appendix we shall prove Proposition 1 and derive explicit formulas for the elements appearing in the statement. We shall always assume that $\kappa \neq 0$: indeed, when $\kappa = 0$ we have $L_0 = L_1 = L_2$ and the spectral decomposition of this operator claimed in the latter proposition reduces to the Hankel transform. Since $L_0 = L_2$, it is obviously sufficient to consider the cases $q = 0$ and $q = 1$.

We shall define a new variable $t := \sin^2 \frac{\sqrt{\kappa}\, s}{2}$, which ranges over $(0, 1)$ if $\kappa > 0$ and over $(-\infty, 0)$ if $\kappa < 0$. Thus we can identify the Hilbert space $L^2((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma)$ with $L^2((0, 1), \frac{4\pi}{|\kappa|}\, dt)$ (if $\kappa > 0$) or $L^2((-\infty, 0), \frac{4\pi}{|\kappa|}\, dt)$ (if $\kappa < 0$) via a unitary transformation $V_\kappa$, and write $L_q = \kappa\, V_\kappa^{-1} T_q V_\kappa$ with

$$T_0 := -t(1-t) \frac{\partial^2}{\partial t^2} - (1 - 2t) \frac{\partial}{\partial t}, \qquad T_1 := T_0 + \frac{1}{1-t}. \tag{35}$$
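The identity $L_q = \kappa\, V_\kappa^{-1} T_q V_\kappa$ can be verified by a direct change of variables. The following sketch is written for $\kappa > 0$ and uses the radial operators of Eq. (13) and the radial measure $d\mu_\Sigma(s) = 2\pi \kappa^{-1/2} \sin(\sqrt{\kappa}\, s)\, ds$ from Sect. 4.

```latex
% Derivatives of the new variable t = \sin^2(\sqrt\kappa\, s / 2):
\frac{dt}{ds} = \frac{\sqrt\kappa}{2}\,\sin\sqrt\kappa\,s, \qquad
\Big(\frac{dt}{ds}\Big)^{2} = \kappa\, t(1-t), \qquad
\frac{d^{2}t}{ds^{2}} = \frac{\kappa}{2}\cos\sqrt\kappa\,s = \frac{\kappa}{2}\,(1-2t).

% Chain rule applied to L_0:
-\partial_s^{2} - \sqrt\kappa\,\cot(\sqrt\kappa\,s)\,\partial_s
 = -\kappa\,t(1-t)\,\partial_t^{2} - \frac{\kappa}{2}(1-2t)\,\partial_t - \frac{\kappa}{2}(1-2t)\,\partial_t
 = \kappa\,T_0 .

% Potential term of L_1, using 1 + \cos\sqrt\kappa\,s = 2\cos^{2}(\sqrt\kappa\,s/2) = 2(1-t):
\frac{2\kappa}{1+\cos\sqrt\kappa\,s} = \frac{\kappa}{1-t}
\quad\Longrightarrow\quad L_1 = \kappa\,T_1 .

% Radial measure:
d\mu_\Sigma(s) = 2\pi\,\frac{\sin\sqrt\kappa\,s}{\sqrt\kappa}\,ds = \frac{4\pi}{\kappa}\,dt .
```

The case $\kappa < 0$ follows in the same way after replacing the trigonometric functions by their hyperbolic counterparts.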

It is not difficult to see that the singular differential operator $T_0$ (resp. $T_1$) is in the limit circle case at $0$ and $1$ (resp. at $0$), and in the limit point case at infinity (resp. at $1$ and at infinity). For simplicity of notation we shall still denote by $T_q$ the self-adjoint operators defined by the action of the differential operators (35) on the domains [11]

$$\operatorname{Dom}(T_0) := \Big\{ u \in H^1((0, 1)) : T_0 u \in L^2((0, 1)),\ \lim_{t \downarrow 0} t\, u'(t) = \lim_{t \uparrow 1} (1-t)\, u'(t) = 0 \Big\}, \tag{36a}$$

$$\operatorname{Dom}(T_1) := \Big\{ u \in H^1((0, 1)) : T_1 u \in L^2((0, 1)),\ \lim_{t \downarrow 0} t\, u'(t) = 0 \Big\}, \tag{36b}$$

when $\kappa > 0$ and

$$\operatorname{Dom}(T_0) := \Big\{ u \in H^1((-\infty, 0)) : T_0 u \in L^2((-\infty, 0)),\ \lim_{t \uparrow 0} t\, u'(t) = 0 \Big\}, \tag{36c}$$

$$\operatorname{Dom}(T_1) := \Big\{ u \in H^1((-\infty, 0)) : T_1 u \in L^2((-\infty, 0)),\ \lim_{t \uparrow 0} t\, u'(t) = 0 \Big\}, \tag{36d}$$

when $\kappa < 0$. The domains of the operators $L_q$ defined in Sect. 4 are simply $\operatorname{Dom}(L_q) = V_\kappa^{-1}(\operatorname{Dom}(T_q))$.

When $\Sigma$ is compact, the spectrum of $L_q$ is discrete and its spectral decomposition can be written as follows.

Lemma 2. Let us suppose that $\kappa > 0$. Then Proposition 1 holds with

$$d\rho_0(\lambda) = \Big( \sum_{j \geq 0} \delta\big( \lambda - \kappa j(j+1) \big) \Big)\, d\lambda, \qquad d\rho_1(\lambda) = \Big( \sum_{j \geq 0} \delta\big( \lambda - \kappa (j^2 + 3j + 2) \big) \Big)\, d\lambda,$$

$$w_0\big( s, \kappa j(j+1) \big) = \Big( \frac{\kappa (2j+1)}{4\pi} \Big)^{1/2} P_j(\cos \sqrt{\kappa}\, s),$$

$$w_1\big( s, \kappa (j^2 + 3j + 2) \big) = \Big( \frac{\kappa (2j+3)}{4\pi} \Big)^{1/2} \cos^2 \frac{\sqrt{\kappa}\, s}{2}\; P_j^{(0,2)}(\cos \sqrt{\kappa}\, s),$$

where $P_\nu$ and $P_\nu^{(a,b)}$ respectively denote the Legendre and Jacobi polynomials of degree $\nu$.
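The $q = 0$ part of Lemma 2 can be checked with exact arithmetic in the variable $t = \sin^2(\sqrt{\kappa}\, s/2)$, where $\cos \sqrt{\kappa}\, s = 1 - 2t$. The sketch below uses the standard expansion $P_j(1 - 2t) = \sum_k (-1)^k \binom{j}{k} \binom{j+k}{k} t^k$ and verifies that this polynomial solves $t(1-t) u'' + (1-2t) u' + j(j+1) u = 0$ and that $(2j+1) \int_0^1 P_j(1-2t)^2\, dt = 1$, which matches the normalization of $w_0$ with respect to $\frac{4\pi}{\kappa}\, dt$. The function names are illustrative and not part of the original text.

```python
from fractions import Fraction
from math import comb


def legendre_shifted(j):
    """Coefficients c_k of P_j(1 - 2t) = sum_k c_k t^k."""
    return [Fraction((-1) ** k * comb(j, k) * comb(j + k, k)) for k in range(j + 1)]


def ode_residual(c, lam):
    """Coefficients (in powers of t) of t(1-t)u'' + (1-2t)u' + lam*u for u = sum_k c_k t^k."""
    padded = c + [Fraction(0)]
    return [padded[m + 1] * (m + 1) ** 2 - padded[m] * (m * (m + 1) - lam)
            for m in range(len(c))]


def squared_norm(c):
    """Integral of u(t)^2 over (0, 1) with respect to dt."""
    return sum(a * b / (i + k + 1) for i, a in enumerate(c) for k, b in enumerate(c))


def lemma2_q0_holds(max_degree=8):
    """Check the eigenvalue equation and the normalization for j = 0, ..., max_degree."""
    for j in range(max_degree + 1):
        c = legendre_shifted(j)
        if any(r != 0 for r in ode_residual(c, j * (j + 1))):
            return False
        if (2 * j + 1) * squared_norm(c) != 1:
            return False
    return True


if __name__ == "__main__":
    print(lemma2_q0_holds())
```

Since all quantities are rational, the check is exact rather than approximate for each degree tested.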

Proof. It is standard that the eigenvalues of $T_q$ are nonnegative and that the normalized eigenfunctions of $T_q$ provide an orthonormal basis of $L^2((0, 1))$. The choice of boundary conditions (36) ensures that the eigenvalues of $T_0$ are the numbers $\Lambda \geq 0$ for which the classical hypergeometric equation

$$0 = (\Lambda - T_0) u = t(1-t) u'' + (1 - 2t) u' + \Lambda u$$

has a polynomial solution. The solution to the latter equation which is continuous at $0$ is proportional to the hypergeometric function $F\big( \frac{1}{2} - (\frac{1}{4} + \Lambda)^{1/2}, \frac{1}{2} + (\frac{1}{4} + \Lambda)^{1/2}, 1; t \big)$, and this function becomes a polynomial if and only if $(\frac{1}{4} + \Lambda)^{1/2} - \frac{1}{2} \in \mathbb{N}$. Hence $\operatorname{spec}(T_0) = \{ j(j+1) : j \in \mathbb{N} \}$. In this case the latter hypergeometric function reduces to a Legendre polynomial, and using the well known formulas for the norm of these polynomials one readily arrives at the expression $\sqrt{2j+1}\, P_j(1 - 2t)$ for the normalized eigenfunctions of $T_0$. By making use of the identification of $L^2((0, \operatorname{diam}(\Sigma)), d\mu_\Sigma)$ with $L^2((0, 1), \frac{4\pi}{\kappa}\, dt)$ via $V_\kappa$ we immediately arrive at the above expression for the spectral resolution of $L_0$.

The proof for $L_1$ is analogous. If we set $u(t) =: (1-t) v(t)$, the eigenvalue equation reads

$$0 = (\Lambda - T_1) u = (1-t) \big[ t(1-t) v'' + (1 - 4t) v' + (\Lambda - 2) v \big], \qquad u \in \operatorname{Dom}(T_1). \tag{37}$$

The eigenvalues of $T_1$ can be easily seen to coincide with the values of $\Lambda$ for which the latter equation admits a polynomial solution. As the solution $v$ of (37) regular at $0$ is proportional to $F\big( \frac{3}{2} - (\frac{1}{4} + \Lambda)^{1/2}, \frac{3}{2} + (\frac{1}{4} + \Lambda)^{1/2}, 1; t \big)$, this implies that $(\frac{1}{4} + \Lambda)^{1/2} - \frac{3}{2} \in \mathbb{N}$ and $\operatorname{spec}(T_1) = \{ j^2 + 3j + 2 : j \in \mathbb{N} \}$. Hence we immediately obtain the desired formula by writing the resulting eigenfunction in terms of a Jacobi polynomial, expressing the result in the variable $s$ and normalizing it to have unit $L^2$ norm.

When $\Sigma$ is noncompact, the spectrum of $L_q$ is absolutely continuous and its spectral resolution can be derived using the Weyl–Kodaira theorem.

Lemma 3. Let us suppose that $\kappa < 0$ and set $\alpha(\Lambda) :=$ Then Proposition 1 holds with

1 4

+

1 4

+ .

! " λ 1 1/2 κ tanh π − − dρ0 (λ) = dρ1 (λ) = dλ, 4π κ 4 √ √ −2α(λ/κ) −κs −κs 2 , F α(λ/κ), α(λ/κ), 1; − tanh w0 (s, λ) = cosh 2 2 √ √ −2α(λ/κ) −κs −κs 2 w1 (s, λ) = cosh . F α(λ/κ) + 1, α(λ/κ) − 1, 1; − tanh 2 2


Proof. It is a standard spectral-theoretic result that the spectrum of $T_0$, which is absolutely continuous, is given by $(-\infty, -\frac{1}{4}]$. Hence let us take a complex number $\Lambda$ with $\operatorname{Re} \Lambda \leq -\frac{1}{4}$ and nonzero imaginary part and consider the equation

$$0 = (\Lambda - T_0) u = t(1-t) u'' + (1 - 2t) u' + \Lambda u, \tag{38}$$

where t ∈ R− and T0 is to be interpreted as a formal differential operator. Then the solutions of (38) satisfying the boundary condition required in Dom(T0 ) at 0 are proportional to t −α() , F α(), α(), 1; w(t, ) := (1 − t) 1−t and those which are square integrable at infinity are proportional to 1 − −α() if Im < 0, F α(), α(), 2α(); w (t, ) := (1 − t) 1−t 1 + α()−1 if Im > 0. w (t, ) := (1 − t) F 1−α(), 1−α(), 2−2α(); 1−t For notational simplicity we shall henceforth write α instead of α(). It should be noticed that the determination of the square root ensures that these functions depend analytically on in the region Re < − 41 . It is well known [2] that one can write e.g. w + as a linear combination of w and w − as w + (t, ) = k1 () w(t, ) + k2 () w − (t, ), with k1 () :=

(α)2 ,

(2α − 1)

k2 () := −

(α)2 (1 − 2α) .

(1 − α)2 (2α − 1)

Let us use the notation (u, v) := t (1 − t) u (t)v(t) − u(t)v (t) W for the reduced Wronskian of two solutions u and v of Eq. (38), which is actually constant. The Weyl–Kodaira theorem [11] can be used to prove that the self-adjoint operator T0 admits a spectral decomposition analogous to that of Proposition 1, where w plays the role of the function wq and dρq must be replaced by the absolutely continuous Borel measure dρ() :=

k1 () d 2π i W (w(·, ), w + (·, ))

on (−∞, − 41 ]. By [2, 15.3.10] one has that the function w + (·, ) can be written as w + (t, ) = −

(2 − 2α) log(−t) + ϕ(t, ),

(1 − α)2

(w(·, ), w + (·, )) is constant, where ϕ(·, ) is of class C 1 in a neighborhood of 0. As W we immediately arrive at the formula W (w(·, ), w+ (·, )) =

(2 − 2α)

(1 − α)2


for the reduced Wronskian, which in turn yields the expression

dρ() = tanh π − −

1 4

d

(39)

for the measure ρ. If now use the identification of L 2 ((0, ∞), dµ ) with L 2 ((−∞, 0), 4π κ dt) we readily arrive at the formulas for the spectral decomposition of L 0 . The analysis of L 1 is similar. Setting u(t) =: (1−t)v(t) in the equation (−T1 )u = 0 as in Lemma 2 we find that the regular solution of this equation at 0 is w(t, ) := (1 − t)−α F α + 1, α − 1, 1,

t 1−t

up to a multiplicative constant, and that the solution which is square integrable at infinity is proportional to 1 if Im < 0, w (t, ) := (1 − t) F α + 1, α − 1, 2α; 1−t 1 if Im > 0. w+ (t, ) := (1 − t)α−1 F 2 − α, −α, 2 − 2α; 1−t −

−α

Since the reduced Wronskian of w and w+ is (w(·, ), w + (·, )) = W

(2 − 2α)

(2 − α) (−α)

and

(1 − 2α)

(α + 1) (α − 1) − w (t, ) = w(t, ) − w (t, ) ,

(2α − 1)

(−α) (2 − α) +

the same reasoning as above shows that the spectral measure dρ associated with T1 is also given by Eq. (39). As stated in Proposition 1, the integral representation (14) converges to g(L q )u in the norm topology, and by Egorov’s theorem this implies that the latter integral converges uniformly except on a subset of arbitrarily small measure. We find it convenient to conclude this section with another simple but useful observation concerning the pointwise convergence of previously defined spectral decompositions. Lemma 4. Let Uq , wq , cq,κ and ρq be defined as in Proposition 1, and let u ∈ C0∞ ([0, diam())). Then u(s) =

$$u(s) = \int_{c_{q,\kappa}}^{\infty} w_q(s,\lambda)\, U_q u(\lambda)\, d\rho_q(\lambda) \tag{40}$$

pointwise for all s ∈ [0, diam()), and the convergence is uniform.


Proof. For the sake of concreteness we restrict ourselves to the case κ > 0. As C0∞ ([0, 1)) ⊂ Dom(L qm ) = Vκ−1 (Dom(Tqm )) for any nonnegative integer m, it follows that the integral ∞ 2 (L qm u, u) = λm Uq u(λ) dρq (λ) < ∞ (41) cq,κ

must be convergent for all m and u ∈ C0∞ ([0, diam())). It is well known [2] that the Legendre and Jacobi polynomials satisfy (0,2) ( j + 2)( j + 1) P j (ξ ) 1, P j (ξ ) 2 for all ξ ∈ [−1, 1]. By the explicit expression for wq , this immediately implies that ∞ ∞ wq (s, λ) Uq u(λ) dρq (λ) (c1 λ + c2 )1/2 Uq u(λ) dρq (λ) < ∞, cq,κ cq,κ which converges by virtue of Eq. (41). Here c1 and c2 are positive constants. Hence the integral (40) converges uniformly if the support of u is contained in [0, K ] with some K < diam(). (The same argument also yields convergence in the C k strong topology.) When κ 0 the proof is analogous and will be omitted. Acknowledgements. A.E. is financially supported by a MICINN postdoctoral fellowship and thanks McGill University, Montréal, for hospitality and support. A.E.’s research is supported in part by the DGI and the Complutense University–CAM under grants no. FIS2008-00209 and GR69/06-910556. The research of N.K. is supported by NSERC grant RGPIN 105490-2004.

References

1. Anguelova, L., Langfelder, P.: Massive gravitino propagator in maximally symmetric spaces and fermions in dS/CFT. JHEP 0303:057, 2003
2. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. New York: Dover, 1970
3. Allen, B., Jacobson, T.: Vector two-point functions in maximally symmetric spaces. Commun. Math. Phys. 103, 669–692 (1986)
4. Allen, B., Lütken, C.A.: Spinor two-point functions in maximally symmetric spaces. Commun. Math. Phys. 106, 201–210 (1986)
5. Bär, C., Ginoux, N., Pfäffle, F.: Wave equations on Lorentzian manifolds and quantization. Freiburg: EMS, 2007
6. Basu, A., Uruchurtu, L.I.: Gravitino propagator in anti-de Sitter space. Class. Quant. Grav. 23, 6059–6075 (2006)
7. Camporesi, R.: The spinor heat kernel in maximally symmetrical spaces. Commun. Math. Phys. 148, 283–308 (1992)
8. D’Hoker, E., Freedman, D.Z., Mathur, S.D. et al.: Graviton and gauge boson propagators in AdS(d + 1). Nucl. Phys. B 562, 330–352 (1999)
9. Dodziuk, J.: L2 Harmonic forms on rotationally symmetric Riemannian manifolds. Proc. Amer. Math. Soc. 77, 395–400 (1979)
10. Donnelly, H.: The differential form spectrum of hyperbolic space. Manus. Math. 33, 365–385 (1980/81)
11. Dunford, N., Schwartz, J.T.: Linear Operators II. Spectral Theory. New York: Wiley, 1988
12. Folacci, A.: Quantum field theory of p-forms in curved space-time. J. Math. Phys. 32, 2813–2827 (1991)
13. Folacci, A.: Green functions of the de Rham Laplacian in maximally symmetric spaces. J. Math. Phys. 33, 2228–2231 (1992)
14. Freivogel, B., Sekino, Y., Susskind, L., Yeh, C.P.: A Holographic framework for eternal inflation. Phys. Rev. D 74, 086003 (2006)
15. Günther, P.: Huygens Principle and Hyperbolic Equations. Boston: Academic Press, 1988
16. Hawking, S.W., Hertog, T., Reall, H.S.: Brane new world. Phys. Rev. D 62, 043501 (2000)
17. Helgason, S.: Geometric Analysis on Symmetric Spaces. Providence RI: Amer. Math. Soc., 1994
18. Lu, Q.K.: The Various Kernels of Classical Domains and Classical Manifolds. International Symposium in Memory of Hua Loo Keng, Vol. II (Beijing, 1988), Berlin: Springer, 1991, pp. 199–211
19. O’Neill, B.: Semi-Riemannian Geometry. New York: Academic Press, 1983
20. Riesz, M.: L’intégrale de Riemann-Liouville et le problème de Cauchy. Acta Math. 81, 1–223 (1951)
21. Warner, F.W.: Foundations of Differentiable Manifolds and Lie Groups. New York: Springer, 1983

Communicated by G. W. Gibbons

Commun. Math. Phys. 290, 129–154 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0761-0


Asymptotics in ASEP with Step Initial Condition

Craig A. Tracy1, Harold Widom2

1 Department of Mathematics, University of California, Davis, CA 95616, USA. E-mail: [email protected]
2 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected]; [email protected]

Received: 9 August 2008 / Accepted: 25 November 2008
Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: In previous work the authors considered the asymmetric simple exclusion process on the integer lattice in the case of step initial condition, particles beginning at the positive integers. There it was shown that the probability distribution for the position of an individual particle is given by an integral whose integrand involves a Fredholm determinant. Here we use this formula to obtain three asymptotic results for the positions of these particles. In one an apparently new distribution function arises and in another the distribution function $F_2$ arises. The latter extends a result of Johansson on TASEP to ASEP, and hence proves KPZ universality for ASEP with step initial condition.

1. Introduction

In previous work [8] the authors considered the asymmetric simple exclusion process (ASEP) on the integer lattice $\mathbb{Z}$ in the case of step initial condition, particles beginning at the positive integers $\mathbb{Z}^+$. There it was shown that the probability distribution for the position of an individual particle is given by an integral whose integrand involves a Fredholm determinant. Here we use this formula to obtain three asymptotic results for the positions of these particles.

In ASEP a particle waits an exponential time, then moves to the right with probability $p$ if that site is unoccupied (or else stays put) or to the left with probability $q = 1 - p$ if that site is unoccupied (or else stays put). The formula in [8] gives the distribution function for $x_m(t)$, the position of the $m$th particle from the left at time $t$ when all $x_m(0) = m$.

Here we shall assume that $p < q$, so there is a drift to the left, and establish three results on the position of the $m$th particle when $t \to \infty$. The first gives the asymptotics of the probability $P(x_m(t) \le x)$ when $m$ and $x$ are fixed; the second, conjectured in [8], gives the limiting distribution for fixed $m$ when $x$ goes to infinity; and the third gives the limiting distribution when both $m$ and $x$ go to infinity. In the second result an apparently new distribution function arises and in the third the distribution function $F_2$ of random matrix theory [7] arises. (That $F_2$ should arise in ASEP has long been suspected. In the physics literature this is referred to as KPZ universality [5].)

Before giving the results we state the formula derived in [8], valid when $p$ and $q$ are nonzero. It is given in terms of the Fredholm determinant¹ of a kernel $K(\xi, \xi')$ on $C_R$, a circle with center zero and large radius $R$ described counterclockwise. It acts as an operator by

$$f(\xi) \to \int_{C_R} K(\xi, \xi')\, f(\xi')\, d\xi' \qquad (\xi \in C_R).^{2}$$
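The ASEP dynamics described in the introduction can be simulated directly. The following Python sketch is not taken from [8]: it truncates the step initial condition to finitely many particles (an assumption made only to keep the state finite, so it approximates the infinite system), and the names `simulate_asep`, `num_particles` and `seed` are hypothetical.

```python
import random


def simulate_asep(num_particles, p, t_max, seed=0):
    """Simulate ASEP started from sites 1..num_particles up to time t_max.

    Assumption: the infinite step initial condition is truncated to
    num_particles particles. Each particle carries a rate-1 exponential
    clock; when it rings the particle tries to jump right with
    probability p and left with probability q = 1 - p, and the jump is
    suppressed if the target site is occupied.
    """
    rng = random.Random(seed)
    positions = list(range(1, num_particles + 1))
    occupied = set(positions)
    t = 0.0
    while True:
        # The minimum of num_particles rate-1 clocks is exponential
        # with rate num_particles.
        t += rng.expovariate(num_particles)
        if t > t_max:
            break
        i = rng.randrange(num_particles)
        step = 1 if rng.random() < p else -1
        target = positions[i] + step
        if target not in occupied:
            occupied.remove(positions[i])
            occupied.add(target)
            positions[i] = target
    return positions


if __name__ == "__main__":
    # positions[m - 1] plays the role of x_m(t) in the truncated system
    print(simulate_asep(num_particles=30, p=0.25, t_max=10.0, seed=1))
```

Because jumps are to nearest neighbors and blocked by occupied sites, the list stays sorted, so index `m - 1` always tracks the `m`th particle from the left.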

We use slightly different notation here, which will simplify formulas later. We set $\gamma = q - p$, $\tau = p/q$. The kernel is

$$K(\xi, \xi') = q\, \frac{\xi'^{\,x}\, e^{\varepsilon(\xi')\, t/\gamma}}{p + q\, \xi \xi' - \xi}, \tag{1}$$

where $\varepsilon(\xi) = p\, \xi^{-1} + q\, \xi - 1$. The formula is

$$P(x_m(t/\gamma) \le x) = \int \frac{\det(I - \lambda K)}{\prod_{k=0}^{m-1} (1 - \lambda \tau^k)}\, \frac{d\lambda}{\lambda}, \tag{2}$$

where the integral is taken over a contour enclosing the singularities of the integrand at $\lambda = 0$ and $\lambda = \tau^{-k}$ $(k = 0, \ldots, m - 1)$. We mention here the special case, easily derived from this,

$$P(x_1(t/\gamma) > x) = \det(I - K). \tag{3}$$

The first formula is concrete. The sign ∼ in its statement indicates that the ratio of the two sides tends to one.

Theorem 1. Assume $0 < p < q$. For fixed $m$ and $x < m$, as $t \to \infty$,

$$P(x_m(t/\gamma) > x) \sim \prod_{k=1}^{\infty} (1 - \tau^k)\; \frac{t^{2m-x-2}\, e^{-t}}{(m-1)!\,(m-x-1)!}.$$

It is clear probabilistically that $P(x_m(t) > x) = 0$ for all $t$ when $x \ge m$: for a particle to be to the right of its initial position all particles to its right would have to move simultaneously to the right, which surely has probability zero. This will also be seen in the proof of the theorem. Although Theorem 1 required $p > 0$ the statement makes sense when $p = 0$, the TASEP where particles move only to the left. In this case the probability equals a probability in a unitary Laguerre random matrix ensemble [4]. The corresponding asymptotics can be derived there and found to be the same as our formula when $p = 0$.

¹ The Fredholm determinant of a kernel $K$ is the operator determinant $\det(I - \lambda K)$. Properties of these determinants, trace class operators, etc., may be found in [2].
² All contour integrals are to be given a factor $1/2\pi i$.
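To get a feel for the size of the quantity in Theorem 1, the right-hand side can be evaluated numerically. The sketch below is only an illustration: the function name `theorem1_rhs` and the truncation parameter `terms` for the infinite product are assumptions, and the function returns the leading asymptotic term, not the probability itself.

```python
import math


def theorem1_rhs(m, x, t, p, terms=200):
    """Leading term prod_{k>=1}(1 - tau^k) t^(2m-x-2) e^(-t) / ((m-1)! (m-x-1)!).

    tau = p / q with q = 1 - p. The infinite product is truncated after
    `terms` factors (an assumption; it converges geometrically).
    """
    if not 0.0 <= p < 0.5:
        raise ValueError("need 0 <= p < q = 1 - p")
    if x >= m:
        # For x >= m the probability itself is zero, as noted in the text.
        return 0.0
    tau = p / (1.0 - p)
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= 1.0 - tau ** k
    return (prod * t ** (2 * m - x - 2) * math.exp(-t)
            / (math.factorial(m - 1) * math.factorial(m - x - 1)))


if __name__ == "__main__":
    print(theorem1_rhs(m=2, x=0, t=10.0, p=0.25))
```

In the TASEP case $p = 0$ with $m = 1$, $x = 0$ the expression reduces to $e^{-t}$, the probability that a rate-one clock has not rung by time $t$, which is a quick consistency check.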

The second result was conjectured, and the beginning of a possible proof given, in [8]. Denote by $\hat K$ the operator on $L^2(\mathbb{R})$ with kernel³

$$\hat K(z, z') = \frac{q}{\sqrt{2\pi}}\, e^{-(p^2+q^2)(z^2+z'^2)/4 \,+\, pq\, z z'}.$$

Theorem 2. Assume $0 < p < q$. For fixed $m$ the limit

$$\lim_{t \to \infty} P\left( \frac{x_m(t/\gamma) + t}{\gamma^{1/2}\, t^{1/2}} \le s \right)$$

is equal to the integral in (2) with $K$ replaced by the operator $\hat K \chi_{(-s, \infty)}$.

From this and (3) we have the special case

$$\lim_{t \to \infty} P\left( \frac{x_1(t/\gamma) + t}{\gamma^{1/2}\, t^{1/2}} > -s \right) = \det\left(I - \hat K \chi_{(s, \infty)}\right).$$

This is an apparently new family of distribution functions, parametrized by $p$. When $p = 0$ the kernel has rank one and the determinant equals a standard normal distribution.

Finally, we state the result when $m$ and $x$ both go to infinity. We use the notations

$$\sigma = m/t, \qquad c_1 = -1 + 2\sqrt{\sigma}, \qquad c_2 = \sigma^{-1/6} (1 - \sqrt{\sigma})^{2/3}. \tag{4}$$

Theorem 3. When $0 \le p < q$ we have

$$\lim_{t \to \infty} P\left( \frac{x_m(t/\gamma) - c_1 t}{c_2\, t^{1/3}} \le s \right) = F_2(s)$$

uniformly for $\sigma$ in a compact subset of $(0, 1)$.⁴

The proofs of the theorems will involve asymptotic analysis of $K$. The main point is that the kernel has the same Fredholm determinant as the sum of two kernels; one has large norm but fixed spectrum and its resolvent can be computed exactly, and the other is better behaved. This representation is derived in the next section.

³ This is the symmetrization of the Mehler kernel.
⁴ Notice that here we allow $p = 0$. In this case we get the asymptotic formula derived by Johansson [4] for TASEP. For ASEP the strong law $t^{-1} x_m(t/\gamma) \to c_1$ a.s. was proved by Liggett [3]. For stationary ASEP Balázs and Seppäläinen [1] and Quastel and Valkó [6] proved that the variance of the current across a characteristic has order $t^{2/3}$ and the diffusivity has order $t^{1/3}$.

2. Preliminaries

We begin with two facts on stability of the Fredholm determinant. They concern smooth kernels acting on simple closed curves. Both use the fact that for a trace class operator $L$ the determinant $\det(I - \lambda L)$ is determined by the traces $\mathrm{tr}\, L^n$, $n \in \mathbb{Z}^+$. This is so because up to constants these are the coefficients in the expansion of the logarithm of the determinant around $\lambda = 0$.

Proposition 1. Suppose $s \to \Gamma_s$ is a deformation of closed curves and a kernel $L(\eta, \eta')$ is analytic in a neighborhood of $\Gamma_s \times \Gamma_s \subset \mathbb{C}^2$ for each $s$. Then the Fredholm determinant of $L$ acting on $\Gamma_s$ is independent of $s$.

Proof. The trace of $L^n$ on $\Gamma_s$ equals

$$\int_{\Gamma_s} \cdots \int_{\Gamma_s} L(\eta_1, \eta_2) \cdots L(\eta_{n-1}, \eta_n)\, L(\eta_n, \eta_1)\, d\eta_1 \cdots d\eta_n.$$

If $s'$ is sufficiently close to $s$ we may consecutively replace the contours $\Gamma_s$ for the $\eta_i$ by $\Gamma_{s'}$, obtaining the trace of $L^n$ on $\Gamma_{s'}$. So $\mathrm{tr}\, L^n$ is a locally constant function of $s$ and the usual argument shows that it is constant. Therefore so is the Fredholm determinant.

Proposition 2. Suppose $L_1(\eta, \eta')$ and $L_2(\eta, \eta')$ are two kernels acting on a simple closed contour $\Gamma$, that $L_1(\eta, \eta')$ extends analytically to $\eta$ inside $\Gamma$ or to $\eta'$ inside $\Gamma$, and that $L_2(\eta, \eta')$ extends analytically to $\eta$ inside $\Gamma$ and to $\eta'$ inside $\Gamma$. Then the Fredholm determinants of $L_1(\eta, \eta') + L_2(\eta, \eta')$ and $L_1(\eta, \eta')$ are equal.

Proof. Suppose $L_1(\eta, \eta')$ extends analytically to $\eta'$ inside $\Gamma$. The operator $L_1 L_2$ on $\Gamma$ has kernel

$$L_1 L_2(\eta, \eta') = \int_\Gamma L_1(\eta, \zeta)\, L_2(\zeta, \eta')\, d\zeta = 0,$$

since the integrand extends analytically to $\zeta$ inside $\Gamma$. The operator $L_2^2$ on $\Gamma$ has kernel

$$L_2^2(\eta, \eta') = \int_\Gamma L_2(\eta, \zeta)\, L_2(\zeta, \eta')\, d\zeta = 0$$

for the same reason. Therefore for $n > 1$, $(L_1 + L_2)^n = L_1^n + L_2 L_1^{n-1}$, and $\mathrm{tr}\, L_2 L_1^{n-1} = \mathrm{tr}\, L_1^{n-1} L_2 = 0$, so $\mathrm{tr}\, (L_1 + L_2)^n = \mathrm{tr}\, L_1^n$. When $n = 1$ we use

$$\mathrm{tr}\, L_2 = \int_\Gamma L_2(\eta, \eta)\, d\eta = 0,$$

since the integrand extends analytically inside $\Gamma$, which completes the proof.

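We introduce the notation

$$\varphi(\eta) = \left( \frac{1 - \tau \eta}{1 - \eta} \right)^{x} e^{\left[ \frac{1}{1-\eta} - \frac{1}{1-\tau\eta} \right] t}.$$

In $K(\xi, \xi')$ we make the substitutions

$$\xi = \frac{1 - \tau \eta}{1 - \eta}, \qquad \xi' = \frac{1 - \tau \eta'}{1 - \eta'},$$

and we obtain the kernel⁵

$$\frac{\varphi(\eta')}{\eta' - \tau \eta} = K_2(\eta, \eta')$$

acting on $\gamma$, a little circle about $\eta = 1$ described clockwise, which has the same Fredholm determinant. We denote this by $K_2$ because there is an equally important kernel

$$\frac{\varphi(\tau \eta)}{\eta' - \tau \eta} = K_1(\eta, \eta').$$

⁵ This is the kernel $(d\xi/d\eta)^{1/2}\, (d\xi'/d\eta')^{1/2}\, K(\xi(\eta), \xi'(\eta'))$.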
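The change of variables can be checked numerically at sample points. The sketch below assumes $p + q = 1$ and an integer $x$, and uses hypothetical function names; the product of square roots of Jacobians is written as $(1-\tau)/((1-\eta)(1-\eta'))$, which follows from $d\xi/d\eta = (1-\tau)/(1-\eta)^2$ with one fixed choice of branch.

```python
import cmath


def phi(eta, x, t, tau):
    """phi(eta) = ((1 - tau eta)/(1 - eta))^x exp([1/(1-eta) - 1/(1-tau eta)] t)."""
    return ((1 - tau * eta) / (1 - eta)) ** x * cmath.exp(
        (1 / (1 - eta) - 1 / (1 - tau * eta)) * t)


def kernel_K(xi, xi_p, x, t, p):
    """Kernel (1): q xi'^x exp(eps(xi') t / gamma) / (p + q xi xi' - xi)."""
    q = 1.0 - p
    gamma = q - p
    eps = p / xi_p + q * xi_p - 1
    return q * xi_p ** x * cmath.exp(eps * t / gamma) / (p + q * xi * xi_p - xi)


def transformed_kernel(eta, eta_p, x, t, p):
    """(dxi/deta)^(1/2) (dxi'/deta')^(1/2) K(xi(eta), xi'(eta'))."""
    tau = p / (1.0 - p)
    xi = (1 - tau * eta) / (1 - eta)
    xi_p = (1 - tau * eta_p) / (1 - eta_p)
    jac = (1 - tau) / ((1 - eta) * (1 - eta_p))
    return jac * kernel_K(xi, xi_p, x, t, p)


def kernel_K2(eta, eta_p, x, t, p):
    """K_2(eta, eta') = phi(eta') / (eta' - tau eta)."""
    tau = p / (1.0 - p)
    return phi(eta_p, x, t, tau) / (eta_p - tau * eta)


if __name__ == "__main__":
    eta = 1 + 0.1 * cmath.exp(0.3j)
    eta_p = 1 + 0.1 * cmath.exp(2.0j)
    print(transformed_kernel(eta, eta_p, 3, 0.5, 0.2))
    print(kernel_K2(eta, eta_p, 3, 0.5, 0.2))
```

The two printed values should agree up to rounding, reflecting the identity $\varepsilon(\xi)/\gamma = 1/(1-\eta) - 1/(1-\tau\eta)$ under the substitution.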

Proposition 3. Let $\Gamma$ be any closed curve going around $\eta = 1$ once counterclockwise with $\eta = \tau^{-1}$ on the outside. Then $K(\xi, \xi')$ acting on $C_R$ has the same Fredholm determinant as $K_1(\eta, \eta') - K_2(\eta, \eta')$ acting on $\Gamma$.

Proof. We must show that the determinant of $K_2$ acting on $\gamma$ equals the determinant of $K_1 - K_2$ acting on $\Gamma$. The kernel $K_1(\eta, \eta')$ extends analytically to $\eta$ inside $\gamma$ and to $\eta'$ inside $\gamma$ while $K_2(\eta, \eta')$ extends analytically to $\eta$ inside $\gamma$. Hence by Proposition 2 the determinant of $K_2$ acting on $\gamma$ equals the determinant of $K_2 - K_1$. Next we show that we may replace $\gamma$ by $-\Gamma$. (Recall that $\gamma$ is described clockwise and $\Gamma$ counterclockwise.) We apply Proposition 1 to the kernel

$$K_1(\eta, \eta') - K_2(\eta, \eta') = \frac{\varphi(\tau \eta) - \varphi(\eta')}{\eta' - \tau \eta},$$

with $\Gamma_0 = -\gamma$ and $\Gamma_1 = \Gamma$. Since the numerator vanishes when the denominator does, the only singularities of the kernel are at $\eta, \eta' = 1, \tau^{-1}$, neither of which is passed in a deformation $\Gamma_s$, $s \in [0, 1]$. Therefore the proposition applies and gives the result.

Proposition 4. Suppose the contour $\Gamma$ of Proposition 3 is star-shaped with respect to $\eta = 0$.⁶ Then the Fredholm determinant of $K_1$ acting on $\Gamma$ is equal to

$$\prod_{k=0}^{\infty} (1 - \lambda \tau^k).$$

⁶ This means that 0 is inside $\Gamma$ and each ray from 0 meets $\Gamma$ at exactly one point.

Proof. The function $\varphi(\tau \eta)$ is analytic except at $\tau^{-1}$ and $\tau^{-2}$, both of which are outside $\Gamma$, so the function is analytic on $s\Gamma$ when $0 < s \le 1$. The denominator $\eta' - \tau \eta$ is nonzero for $\eta, \eta' \in s\Gamma$ for all such $s$. (The assumption on $\Gamma$ was used twice.) Therefore by Proposition 1 the Fredholm determinant of $K_1$ on $\Gamma$ is the same as on $s\Gamma$. This in turn is the same as the Fredholm determinant of

$$s\, K_1(s\eta, s\eta') = \frac{\varphi(s \tau \eta)}{\eta' - \tau \eta} \tag{5}$$

on $\Gamma$. The operator is the one with kernel

$$K_0(\eta, \eta') = \frac{1}{\eta' - \tau \eta},$$

which is trace class since the kernel is smooth, left-multiplied by multiplication by $\varphi(s \tau \eta)$. The latter converges in operator norm to the identity as $s \to 0$ since $\varphi(s \tau \eta) \to 1$ uniformly on $\Gamma$, and so (5) converges in trace norm to $K_0$. Therefore the Fredholm determinant of $K_1$ equals the Fredholm determinant of $K_0$. The kernel of $K_0^2$ equals

$$K_0^2(\eta, \eta') = \int_\Gamma \frac{d\zeta}{(\zeta - \tau \eta)(\eta' - \tau \zeta)} = \frac{1}{\eta' - \tau^2 \eta}.$$

This is because $\tau \eta$ is inside $\Gamma$ and $\tau^{-1} \eta'$ outside when $\eta, \eta' \in \Gamma$, since $\Gamma$ is star-shaped. Generally, we find that

$$K_0^n(\eta, \eta') = \int_\Gamma \frac{d\zeta}{(\zeta - \tau^{n-1} \eta)(\eta' - \tau \zeta)} = \frac{1}{\eta' - \tau^n \eta},$$

so

$$\mathrm{tr}\, K_0^n = \frac{1}{1 - \tau^n}.$$

Thus for small $\lambda$,

$$\log \det(I - \lambda K_0) = -\sum_{n=1}^{\infty} \frac{\lambda^n}{n}\, \frac{1}{1 - \tau^n} = -\sum_{k=0}^{\infty} \sum_{n=1}^{\infty} \frac{\tau^{nk} \lambda^n}{n} = \sum_{k=0}^{\infty} \log(1 - \lambda \tau^k),$$

and the result follows.⁷
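The product formula for $\det(I - \lambda K_0)$ can be checked with a quadrature (Nyström) discretization of $K_0$ on a circle, which is a star-shaped contour. This is a numerical sketch rather than part of the proof: the node count, the radius and all function names are assumptions, and the factor $1/2\pi i$ from footnote 2 is absorbed into the quadrature weights.

```python
import cmath


def det_complex(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    det = 1.0 + 0.0j
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) == 0.0:
            return 0.0j
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


def nystrom_det_K0(lam, tau, n_nodes=40, radius=1.5):
    """det(I - lam K0) with K0(eta, eta') = 1/(eta' - tau eta) on |eta| = radius.

    On the circle, d(eta')/(2 pi i) becomes eta'/n_nodes per node under
    the trapezoid rule.
    """
    nodes = [radius * cmath.exp(2j * cmath.pi * k / n_nodes)
             for k in range(n_nodes)]
    mat = [[(1.0 if i == j else 0.0)
            - lam * (nodes[j] / n_nodes) / (nodes[j] - tau * nodes[i])
            for j in range(n_nodes)] for i in range(n_nodes)]
    return det_complex(mat)


def product_formula(lam, tau, terms=200):
    """Truncated product prod_{k>=0} (1 - lam tau^k)."""
    prod = 1.0
    for k in range(terms):
        prod *= 1.0 - lam * tau ** k
    return prod


if __name__ == "__main__":
    print(nystrom_det_K0(0.5, 0.3), product_formula(0.5, 0.3))
```

For this kernel the discretized matrix is circulant, so the agreement is much better than a generic quadrature estimate would suggest.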

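Denote by $R(\eta, \eta'; \lambda)$ the resolvent kernel of $K_1$, the kernel of $\lambda\, (I - \lambda K_1)^{-1} K_1$. This is analytic everywhere except for $\lambda = \tau^{-k}$, $k \ge 0$. We define $\varphi_n(\eta) = \varphi(\eta)\, \varphi(\tau \eta) \cdots \varphi(\tau^{n-1} \eta)$.

Proposition 5. Assume that $\Gamma$ is star-shaped with 1 inside and $\tau^{-1}$ outside. Then for sufficiently small $\lambda$,

$$R(\eta, \eta'; \lambda) = \sum_{n=1}^{\infty} \lambda^n\, \frac{\varphi_n(\tau \eta)}{\eta' - \tau^n \eta}.$$

Proof. If $0 < \tau_1, \tau_2 < 1$ and $\sigma_1, \sigma_2$ are analytic inside $\Gamma$ then

$$\int_\Gamma \frac{\sigma_1(\eta)}{\zeta - \tau_1 \eta}\, \frac{\sigma_2(\zeta)}{\eta' - \tau_2 \zeta}\, d\zeta = \frac{\sigma_1(\eta)\, \sigma_2(\tau_1 \eta)}{\eta' - \tau_1 \tau_2 \eta}.$$

This uses, again, the assumption that $\Gamma$ is star-shaped. From this we see by induction that $K_1^n$ has kernel

$$\frac{\varphi_n(\tau \eta)}{\eta' - \tau^n \eta}.$$

Here we used the fact that the $\varphi_n(\tau \eta)$ are analytic inside $\Gamma$, although $\varphi(\eta)$ isn't. We multiply by $\lambda^n$ and sum to get the resolvent.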
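The composition identity used in this proof is a residue computation at $\zeta = \tau_1 \eta$, and it can be confirmed by trapezoid quadrature on a circle. In the sketch below the test functions $\sigma_1$, $\sigma_2$, the radius and the function names are arbitrary illustrative choices.

```python
import cmath


def contour_integral(f, radius, n_nodes=400):
    """(1/(2 pi i)) times the integral of f over |zeta| = radius, counterclockwise."""
    total = 0.0j
    for k in range(n_nodes):
        zeta = radius * cmath.exp(2j * cmath.pi * k / n_nodes)
        # d(zeta)/(2 pi i) = zeta d(theta)/(2 pi)
        total += f(zeta) * zeta
    return total / n_nodes


def composition_lhs(eta, eta_p, tau1, tau2, sigma1, sigma2, radius):
    """Integral of sigma1(eta)/(zeta - tau1 eta) * sigma2(zeta)/(eta' - tau2 zeta)."""
    return contour_integral(
        lambda z: sigma1(eta) / (z - tau1 * eta) * sigma2(z) / (eta_p - tau2 * z),
        radius)


def composition_rhs(eta, eta_p, tau1, tau2, sigma1, sigma2):
    """sigma1(eta) sigma2(tau1 eta) / (eta' - tau1 tau2 eta)."""
    return sigma1(eta) * sigma2(tau1 * eta) / (eta_p - tau1 * tau2 * eta)


if __name__ == "__main__":
    r = 1.2
    eta, eta_p = r * cmath.exp(0.7j), r * cmath.exp(2.1j)
    s1 = lambda z: 1 + z
    s2 = cmath.exp
    print(composition_lhs(eta, eta_p, 0.5, 0.4, s1, s2, r))
    print(composition_rhs(eta, eta_p, 0.5, 0.4, s1, s2))
```

The pole at $\zeta = \tau_1 \eta$ lies inside the circle and the pole at $\zeta = \eta'/\tau_2$ lies outside, which is exactly the configuration guaranteed by the star-shaped assumption.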

For $\lambda$ not equal to any $\tau^{-k}$ the operator $I - \lambda K_1$ is invertible and we may factor it out from $I - \lambda K = I - \lambda K_1 + \lambda K_2$, and we obtain

$$\det(I - \lambda K) = \det(I - \lambda K_1)\, \det\left(I + \lambda K_2 (I - \lambda K_1)^{-1}\right) = \det(I - \lambda K_1)\, \det\left(I + \lambda K_2 (I + R)\right),$$

where $R$ denotes the operator with kernel $R(\eta, \eta'; \lambda)$. The first factor is given by Proposition 4 (we assume here that $\Gamma$ is as in the proposition), and so we may rewrite (2) as

$$P(x_m(t/\gamma) \le x) = \int \prod_{k=m}^{\infty} (1 - \lambda \tau^k) \cdot \det\left(I + \lambda K_2 (I + R)\right) \frac{d\lambda}{\lambda}. \tag{6}$$

This formula and the formula of Proposition 5 form the basis for our proofs.⁸

⁷ It is easy to see directly that the nonzero eigenvalues of $K_0$ are exactly the $\tau^k$. This does not give the formula for the Fredholm determinant since for that we would have to show that these eigenvalues have algebraic multiplicity one. The computation of traces avoids that issue.
⁸ It will become apparent later that it was important that we factored in the order we did. The operator $K_2 (I + R)$ behaves well while $(I + R) K_2$ does not.

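3. Proof of Theorem 1

We begin this section with a decomposition of the resolvent kernel that will be used in the proofs of the first two theorems. The first summand will contain the poles of the resolvent inside the contour of integration in (6) while the remainder will be analytic inside it. We assume as before that $\Gamma$ is as in Proposition 5. We have

$$\varphi_n(\eta) = \left( \frac{1 - \tau^n \eta}{1 - \eta} \right)^{x} e^{\left[ \frac{1}{1-\eta} - \frac{1}{1-\tau^n \eta} \right] t},$$

and we define

$$\varphi_\infty(\eta) = \lim_{n \to \infty} \varphi_n(\eta) = (1 - \eta)^{-x}\, e^{\frac{\eta}{1-\eta} t}$$

and

$$G(\eta, \eta', u) = \left( \frac{1 - u \eta}{1 - \eta} \right)^{x} e^{\left[ \frac{1}{1-\eta} - \frac{1}{1-u\eta} \right] t}\, (\eta' - \tau^{-1} u \eta)^{-1}.$$

In this formula we shall always take $u \in [0, \tau^2]$, so $G(\eta, \eta', u)$ will be smooth in $u$ and $\eta, \eta' \in \Gamma$. Define

$$R_1(\eta, \eta'; \lambda) = \frac{\varphi_\infty(\tau \eta)}{\varphi_\infty(\eta)} \sum_{k=0}^{m-1} \frac{G^{(k)}(\eta, \eta', 0)}{k!}\, \frac{\lambda \tau^{2k}}{1 - \lambda \tau^k},$$

$$R_2(\eta, \eta'; \lambda) = \frac{\varphi_\infty(\tau \eta)}{\varphi_\infty(\eta)} \sum_{n=1}^{\infty} \lambda^n\, \frac{\tau^{(n+1)(m-1)}}{(m-1)!} \int_0^{\tau^{n+1}} (1 - u/\tau^{n+1})^{m-1}\, G^{(m)}(\eta, \eta', u)\, du.$$

(Derivatives of $G$ are all with respect to $u$.) Clearly $R_1$ is analytic everywhere except for poles at $\lambda = 1, \tau^{-1}, \ldots, \tau^{-m+1}$ and $R_2$ is defined and analytic for $|\lambda| < \tau^{-m}$.

Lemma 1. $R(\eta, \eta'; \lambda) = R_1(\eta, \eta'; \lambda) + R_2(\eta, \eta'; \lambda)$ when $|\lambda| < \tau^{-m}$.

Proof. Observe that

$$\frac{\varphi_n(\tau \eta)}{\eta' - \tau^n \eta} = \frac{\varphi_\infty(\tau \eta)}{\varphi_\infty(\eta)}\, G(\eta, \eta', \tau^{n+1}).$$

By Taylor's theorem with integral remainder $G(\eta, \eta', \tau^{n+1})$ is equal to

$$\sum_{k=0}^{m-1} \frac{G^{(k)}(\eta, \eta', 0)}{k!}\, \tau^{(n+1)k} + \frac{\tau^{(n+1)(m-1)}}{(m-1)!} \int_0^{\tau^{n+1}} (1 - u/\tau^{n+1})^{m-1}\, G^{(m)}(\eta, \eta', u)\, du.$$

We multiply this by $\varphi_\infty(\tau \eta)/\varphi_\infty(\eta)$ times $\lambda^n$ and sum over $n$ to get $R(\eta, \eta'; \lambda)$. We obtain the statement of the lemma for $\lambda$ sufficiently small, and therefore by analyticity it holds throughout $|\lambda| < \tau^{-m}$.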
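The algebraic identities behind Lemma 1 can be checked numerically: the telescoping closed form of $\varphi_n$, its limit $\varphi_\infty$, and the relation between $\varphi_n(\tau\eta)/(\eta' - \tau^n \eta)$ and $G(\eta, \eta', \tau^{n+1})$. The sketch assumes an integer $x$; the function names and sample points are illustrative only.

```python
import cmath


def phi(eta, x, t, tau):
    return ((1 - tau * eta) / (1 - eta)) ** x * cmath.exp(
        (1 / (1 - eta) - 1 / (1 - tau * eta)) * t)


def phi_n_product(eta, n, x, t, tau):
    """phi(eta) phi(tau eta) ... phi(tau^(n-1) eta), multiplied out."""
    out = 1.0 + 0.0j
    for k in range(n):
        out *= phi(tau ** k * eta, x, t, tau)
    return out


def phi_n_closed(eta, n, x, t, tau):
    """Closed form obtained by telescoping the product."""
    u = tau ** n
    return ((1 - u * eta) / (1 - eta)) ** x * cmath.exp(
        (1 / (1 - eta) - 1 / (1 - u * eta)) * t)


def phi_inf(eta, x, t):
    return (1 - eta) ** (-x) * cmath.exp(eta / (1 - eta) * t)


def G(eta, eta_p, u, x, t, tau):
    return (((1 - u * eta) / (1 - eta)) ** x
            * cmath.exp((1 / (1 - eta) - 1 / (1 - u * eta)) * t)
            / (eta_p - u * eta / tau))


if __name__ == "__main__":
    tau, x, t, n = 0.5, 2, 0.7, 4
    eta, eta_p = 1.3 * cmath.exp(1.0j), 1.3 * cmath.exp(2.5j)
    lhs = phi_n_closed(tau * eta, n, x, t, tau) / (eta_p - tau ** n * eta)
    rhs = (phi_inf(tau * eta, x, t) / phi_inf(eta, x, t)
           * G(eta, eta_p, tau ** (n + 1), x, t, tau))
    print(lhs, rhs)
```

Setting $u = 0$ in `G` also reproduces $G(\eta, \eta', 0) = \varphi_\infty(\eta)/\eta'$, which is used in the estimates that follow.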

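Lemma 2. The operators $K_2 R_1$ and $K_2 R_2$ have kernels

$$K_2 R_1(\eta, \eta') = \sum_{k=0}^{m-1} \frac{1}{k!}\, \frac{\lambda \tau^{2k}}{1 - \lambda \tau^k} \int_\Gamma \frac{G^{(k)}(\zeta, \eta', 0)}{\zeta - \tau \eta}\, d\zeta, \tag{7}$$

$$K_2 R_2(\eta, \eta') = \sum_{n=1}^{\infty} \lambda^n\, \frac{\tau^{(n+1)(m-1)}}{(m-1)!} \int_0^{\tau^{n+1}} (1 - u/\tau^{n+1})^{m-1}\, du \int_\Gamma \frac{G^{(m)}(\zeta, \eta', u)}{\zeta - \tau \eta}\, d\zeta. \tag{8}$$

Proof. We have

$$\varphi(\zeta)\, \frac{\varphi_\infty(\tau \zeta)}{\varphi_\infty(\zeta)} = 1.$$

The formulas (7) and (8) follow from this and Lemma 1.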

When $x$ is fixed the steepest descent curve for $\varphi(\eta)$ is the circle with center zero and radius $1/\sqrt{\tau}$. In this section we take for $\Gamma$ the circle with center zero and any radius $r \in (1, \tau^{-1})$, described counterclockwise. This is one of the contours allowed. On $\Gamma$ the function $\varphi(\eta)$ is well-behaved (it is uniformly exponentially small as $t \to \infty$), but $\varphi(\tau \eta)$ is badly-behaved (it is exponentially large at $\eta = r$), which explains the importance of the correct order alluded to in the last footnote.

We begin by deriving trace norm estimates. In Lemma 2 the kernels $K_2 R_1$ and $K_2 R_2$ are given in terms of integrals of rank one operators, and we shall use the fact that the trace norms of these integrals are at most the integrals of the Hilbert-Schmidt norms of the integrands. We denote by $\|\cdot\|_1$ the trace norm and (this will be used later) by $\|\cdot\|_2$ the Hilbert-Schmidt norm. For the estimates involving $R_1$ in the following lemma we assume that $\lambda$ is bounded away from the poles $\tau^{-k}$.

Lemma 3. We have, for some $\delta > 0$,⁹

$$\|K_2\|_1 = O(e^{-\delta t}), \quad \|K_2 R_2\|_1 = O(e^{-\delta t}), \quad \|K_2 R_1\|_1 = O(e^{-(1/2+\delta)t}), \quad \|K_2 R_1 K_2 (I + R_2)\|_1 = O(e^{-(1+\delta)t}). \tag{9}$$

⁹ We shall always use $\delta$ to denote some positive number, different with each occurrence.

Proof. For our estimates we use the fact that if $v > 0$ then on $\Gamma$ the real part of $1/(1 - v\eta)$ achieves its maximum at $\eta = -r$ when $vr > 1$ and its minimum at $\eta = -r$ when $vr < 1$. In particular the real part of

$$\frac{1}{1 - \eta} - \frac{1}{1 - \tau \eta}$$

achieves its maximum at $\eta = -r$ and equals

$$\frac{1}{1 + r} - \frac{1}{1 + \tau r} < 0.$$

This gives, first, a uniform estimate $\varphi(\eta) = O(e^{-\delta t})$. The operator $K_2$ equals the operator with trace class kernel $1/(\eta' - \tau \eta)$ left-multiplied by the operator multiplication by $\varphi(\eta)$, which has operator norm $O(e^{-\delta t})$. This gives the first estimate, $\|K_2\|_1 = O(e^{-\delta t})$.

Next, $G^{(m)}(\eta, \eta', u)$ is $O(t^m)$ times the exponential of

$$\left[ \frac{1}{1 - \eta} - \frac{1}{1 - u\eta} \right] t,$$

and when $|u| \le \tau^2$, as it is in (8), the real part of this when $\eta \in \Gamma$ is at most

$$\left[ \frac{1}{1 + r} - \frac{1}{1 + \tau^2 r} \right] t,$$

and the expression in brackets is negative. Thus the integrand in the integral over $\zeta$ in (8) is $O(e^{-\delta t})$ uniformly in all variables. In particular its Hilbert-Schmidt norm with respect to $\eta, \eta'$ has the same estimate, so this integral has trace norm $O(e^{-\delta t})$ uniformly in $u$. It follows that $\|K_2 R_2\|_1 = O(e^{-\delta t})$ on compact subsets of $|\lambda| < \tau^{-m}$.

For $K_2 R_1$, we use

$$G'(\eta, \eta', u) = -\left[ \frac{\eta t}{(1 - u\eta)^2} + \frac{\eta x}{1 - u\eta} - \frac{\tau^{-1} \eta}{\eta' - \tau^{-1} u \eta} \right] G(\eta, \eta', u), \tag{10}$$

from which we see that each $G^{(k)}(\eta, \eta', 0)/G(\eta, \eta', 0)$ is a linear combination of products $t^i \eta^j (\eta')^{-\ell}$. Since $G(\eta, \eta', 0) = \varphi_\infty(\eta)/\eta'$, $G^{(k)}(\eta, \eta', 0)$ is a linear combination of $t^i \eta^j \varphi_\infty(\eta)\, (\eta')^{-\ell-1}$, and so by (7) $K_2 R_1(\eta, \eta')$ is a linear combination of integrals

$$(\eta')^{-\ell-1}\, t^i \int_\Gamma \frac{\varphi_\infty(\zeta)}{\zeta - \tau \eta}\, \zeta^j\, d\zeta. \tag{11}$$

The exponent in $\varphi_\infty(\zeta)$ is $t$ times $\zeta/(1 - \zeta)$. Its maximum real part on $\Gamma$, occurring at $\zeta = -r$, is $-r/(1 + r)$. Since $r/(1 + r) > 1/2$ this shows that the integrand is uniformly $O(e^{-(1/2+\delta)t})$, and so this is the bound for $\|K_2 R_1\|_1$, as long as $\lambda$ is bounded away from the poles.

Finally, $K_2 R_1 K_2$ and $K_2 R_1 K_2 R_2$. It follows from (11) that the kernel of $K_2 R_1 K_2$ is a linear combination of

$$t^i \int_\Gamma \int_\Gamma \frac{\varphi_\infty(\zeta)}{\zeta - \tau \eta}\, \zeta^j\, (\zeta')^{-\ell-1}\, \frac{\varphi(\eta')}{\eta' - \tau \zeta'}\, d\zeta\, d\zeta'.$$

We integrate first with respect to $\zeta'$ by expanding the contour. We cross the pole at $\zeta' = \tau^{-1} \eta'$, so we get a constant times

$$\varphi(\eta')\, (\eta')^{-\ell-1}\, t^i \int_\Gamma \frac{\varphi_\infty(\zeta)}{\zeta - \tau \eta}\, \zeta^j\, d\zeta.$$

Now we compute

$$\int_\Gamma \frac{\varphi_\infty(\zeta)}{\zeta - \tau \eta}\, \zeta^j\, d\zeta.$$

The integrand is analytic outside $\Gamma$ with a pole at infinity. The integral may be written

$$\sum_{k=0}^{\infty} (\tau \eta)^k \int_\Gamma (1 - \zeta)^{-x}\, e^{\frac{\zeta}{1-\zeta} t}\, \zeta^{j-k-1}\, d\zeta,$$

which we see equals $e^{-t}$ times a polynomial in $t$ and $\eta$. So the kernel of $K_2 R_1 K_2$ is $e^{-t}$ times a linear combination of products $t^i \eta^j \varphi(\eta')\, (\eta')^{-\ell-1}$. Since $\varphi(\eta') = O(e^{-\delta t})$ as we have already seen, we have $\|K_2 R_1 K_2\|_1 = O(e^{-(1+\delta)t})$.

If we use $\varphi(\eta)\, \varphi_\infty(\tau \eta) = \varphi_\infty(\eta)$ again we see that the kernel of $K_2 R_1 K_2 R_2$ is $e^{-t}$ times a linear combination of

$$t^i \eta^j \sum_{n=1}^{\infty} \lambda^n\, \frac{\tau^{(n+1)(m-1)}}{(m-1)!} \int_\Gamma \int_0^{\tau^{n+1}} \zeta^{-\ell-1}\, (1 - u/\tau^{n+1})^{m-1}\, G^{(m)}(\zeta, \eta', u)\, du\, d\zeta.$$

Using (10) again we see that $G^{(m)}(\zeta, \eta', u)$ is $O(t^m)$ times the exponential of

$$\left[ \frac{1}{1 - \zeta} - \frac{1}{1 - \zeta u} \right] t.$$

As before the maximum real part of the expression in brackets occurs at $-r$ and equals

$$\frac{1}{1 + r} - \frac{1}{1 + ru},$$

which has a negative upper bound for $u \le \tau^2$. Since we had the factor $e^{-t}$ we obtain the bound $\|K_2 R_1 K_2 R_2\|_1 = O(e^{-(1+\delta)t})$. This completes the proof of Lemma 3.
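The sign fact that drives these estimates, namely that on the circle $|\eta| = r$ with $1 < r < \tau^{-1}$ the real part of $1/(1-\eta) - 1/(1-\tau\eta)$ is largest at $\eta = -r$ and is negative there, can be confirmed by a direct scan of the circle. The grid size and function names below are arbitrary choices for illustration.

```python
import cmath


def exponent_real_part(eta, tau):
    """Real part of 1/(1 - eta) - 1/(1 - tau eta)."""
    return (1 / (1 - eta) - 1 / (1 - tau * eta)).real


def max_on_circle(r, tau, n_nodes=3600):
    """Largest value of the real part on |eta| = r and the angle where it occurs."""
    best_val, best_theta = None, None
    for k in range(n_nodes):
        theta = 2 * cmath.pi * k / n_nodes
        val = exponent_real_part(r * cmath.exp(1j * theta), tau)
        if best_val is None or val > best_val:
            best_val, best_theta = val, theta
    return best_val, best_theta


def claimed_maximum(r, tau):
    """Value 1/(1 + r) - 1/(1 + tau r) stated in the proof of Lemma 3."""
    return 1 / (1 + r) - 1 / (1 + tau * r)


if __name__ == "__main__":
    print(max_on_circle(1.5, 0.4), claimed_maximum(1.5, 0.4))
```

The scan should locate the maximum at angle $\pi$, that is at $\eta = -r$, with a strictly negative value.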

Proof of Theorem 1. In (6) the contour encloses all the singularities of the integrand. If we take the contour instead to have the singularity $\lambda = 0$ on the outside and the $\tau^{-k}$ with $k < m$ inside then we have

$$P(x_m(t/\gamma) > x) = -\int \prod_{k=m}^{\infty} (1 - \lambda \tau^k) \cdot \det\left(I + \lambda K_2 (I + R)\right) \frac{d\lambda}{\lambda}. \tag{12}$$

Now

$$I + \lambda K_2 (I + R) = I + \lambda K_2 (I + R_2) + \lambda K_2 R_1 = \left(I + \lambda K_2 R_1 (I + \lambda K_2 (1 + R_2))^{-1}\right) \left(I + \lambda K_2 (1 + R_2)\right).$$

(Note that $I + \lambda K_2 (1 + R_2)$ is invertible since $K_2 (1 + R_2)$ has small norm.) Therefore

$$\det(I + \lambda K_2 (I + R)) = \det(I + \lambda K_2 (1 + R_2))\, \det\left(I + \lambda K_2 R_1 (I + \lambda K_2 (1 + R_2))^{-1}\right).$$

The first factor on the right is analytic inside the contour, and equal to $1 + O(e^{-\delta t})$ by (9). As for the second factor, we have

$$I + \lambda K_2 R_1 (I + \lambda K_2 (1 + R_2))^{-1} = I + \lambda K_2 R_1 - \lambda^2 K_2 R_1 K_2 (1 + R_2)(I + \lambda K_2 (1 + R_2))^{-1} = I + \lambda K_2 R_1 + O(e^{-(1+\delta)t}),$$

by (9). Here the error estimate refers to the trace norm. Hence

$$\det\left(I + \lambda K_2 R_1 (I + \lambda K_2 (1 + R_2))^{-1}\right) = 1 + \lambda\, \mathrm{tr}\, K_2 R_1 + O(e^{-(1+\delta)t}),$$

since $\|K_2 R_1\|_1 = O(e^{-(1/2+\delta)t})$ by (9), so $\|(K_2 R_1)^2\|_1 = O(e^{-(1+\delta)t})$. Thus

$$\det\left(I + \lambda K_2 (I - \lambda K_1)^{-1}\right) = \det(I + \lambda K_2 (1 + R_2)) \left[ 1 + \lambda\, \mathrm{tr}\, K_2 R_1 + O(e^{-(1+\delta)t}) \right].$$

When we insert this into the integral in (12) we may ignore the summand 1 in the second factor since the first factor is analytic inside the contour. The integral involving $\mathrm{tr}\, K_2 R_1$ we can compute by residues. Its multiplier $\lambda$ is cancelled by the denominator in (12). So with error $O(e^{-(1+\delta)t})$ (12) equals

$$-\sum_{k=0}^{m-1} \prod_{j=m}^{\infty} (1 - \tau^{j-k}) \cdot \det\left(I + \lambda K_2 (1 + R_2(\tau^{-k}))\right) \cdot \left(\text{residue of } \mathrm{tr}\, K_2 R_1 \text{ at } \lambda = \tau^{-k}\right), \tag{13}$$

where $R_2(\tau^{-k})$ denotes the operator with kernel $R_2(\eta, \eta'; \tau^{-k})$. The determinants are $1 + O(e^{-\delta t})$, as we saw, and will not contribute to the asymptotics. The residue of $\mathrm{tr}\, K_2 R_1$ at $\tau^{-k}$ equals

$$-\frac{1}{k!} \int_\Gamma \int_\Gamma \frac{G^{(k)}(\zeta, \eta, 0)}{\zeta - \tau \eta}\, d\zeta\, d\eta. \tag{14}$$

From (10) we see, more precisely than earlier, that

$$G^{(k)}(\zeta, \eta, 0) = \varphi_\infty(\zeta)\, \zeta^k \sum_{i+j \le k} a_{ijk}\, t^i\, \eta^{-j-1}$$

for some coefficients $a_{ijk}$. Substituting this into (14) and integrating with respect to $\eta$ by expanding the contour outward gives

$$-\frac{1}{k!} \sum_{i+j \le k} a_{ijk}\, t^i \tau^j \int_\Gamma \varphi_\infty(\zeta)\, \zeta^{k-j-1}\, d\zeta = -\frac{1}{k!} \sum_{i+j \le k} a_{ijk}\, t^i \tau^j \int_\Gamma (1 - \zeta)^{-x}\, e^{\frac{\zeta}{1-\zeta} t}\, \zeta^{k-j-1}\, d\zeta.$$

The integral vanishes unless $x \le k - j$ and otherwise equals $e^{-t}$ times a polynomial in $t$ of degree $k - j - x$ with top coefficient

$$\frac{(-1)^{j-k}}{(k - j - x)!}.$$

We see from this that the highest power of $t$, which is $t^{2k-x}$, comes from the summand with $j = 0$, $i = k$. The coefficient $a_{k,0,k}$ equals $(-1)^k$. Thus (14) equals $e^{-t}$ times a polynomial of degree $2k - x$ in $t$ with top coefficient

$$-\frac{1}{k!\,(k - x)!}.$$

In particular the main contribution to the sum in (13) comes from the summand $k = m - 1$, and if we recall the minus sign in (13) we get the statement of Theorem 1.

Remark. As mentioned in the introduction, we can also show that $P(x_m(t) > x) = 0$ when $x \ge m$. We know from (11) that $K_2 R_1(\eta, \eta')$ is a linear combination of

$$(\eta')^{-j-1} \int_\Gamma \frac{\varphi_\infty(\zeta)}{\zeta - \tau \eta}\, \zeta^k\, d\zeta = (\eta')^{-j-1} \int_\Gamma \frac{1}{\zeta - \tau \eta}\, (1 - \zeta)^{-x}\, e^{\frac{\zeta}{1-\zeta} t}\, \zeta^k\, d\zeta,$$

with $j, k < m$. When $x \ge m$ we expand the contour and get zero since $k < m$ and $\tau \eta$ is inside $\Gamma$. Therefore $K_2 R_1 = 0$. Hence $K_2 (I - \lambda K_1)^{-1} = K_2 (1 + R_2) + K_2 R_1 = K_2 (1 + R_2)$, and so

$$\det\left(I + \lambda K_2 (I - \lambda K_1)^{-1}\right) = \det\left(I + \lambda K_2 (1 + R_2)\right),$$

which is analytic inside the contour of integration and therefore integrates to zero.

4. Proof of Theorem 2

We know from (2) that

$$P\left( \frac{x_m(t/\gamma) + t}{\gamma^{1/2}\, t^{1/2}} \le s \right) = \int \frac{\det(I - \lambda K)}{\prod_{k=0}^{m-1} (1 - \lambda \tau^k)}\, \frac{d\lambda}{\lambda},$$

where in the definition of $K$ we set

$$x = -t + \gamma^{1/2}\, s\, t^{1/2}. \tag{15}$$

Therefore the theorem would follow if

$$\lim_{t \to \infty} \det(I - \lambda K) = \det\left(I - \lambda \hat K \chi_{(-s, \infty)}\right) \tag{16}$$

uniformly on compact $\lambda$-sets. The Fredholm determinants are entire functions of $\lambda$, and the coefficients in their expansions about $\lambda = 0$ are universal polynomials in the traces of powers of the operators. It was shown in [8] that for $n \in \mathbb{Z}^+$,

$$\lim_{t \to \infty} \mathrm{tr}\, K^n = \mathrm{tr}\, \left(\hat K \chi_{(-s, \infty)}\right)^n,$$

and it was pointed out that (16) would follow if we knew that $\det(I - \lambda K)$ is uniformly bounded for large $t$ on compact $\lambda$-sets. This is what we shall show here. For any $m$ it suffices that the determinant is uniformly bounded on compact subsets of $|\lambda| < \tau^{-m}$, and since it is entire we may assume that the sets exclude the singularities at $\lambda = \tau^{-k}$. From the uniform boundedness of $\det(I - \lambda K_1)$ on compact $\lambda$-sets it follows that it suffices to prove the uniform boundedness of $\det(I + \lambda K_2 (I + R))$ on compact sets excluding the $\tau^{-k}$.

Here is how we decide what contour to take for $\Gamma$. The steepest descent curves for all the $\varphi_n$ including $\varphi_\infty$ are similar. They lie in the right half-plane, tangent to the imaginary axis at the saddle point $\eta = 0$, and have an inward-pointing cusp at $\eta = 1$, where the real part of the exponential tends to $-\infty$. We would like to take as the curve $\Gamma$ of Propositions 3 and 4 something like this. It need not have that cusp at $\eta = 1$, only that the $\varphi_n$ are exponentially small there, and if it passes through $\eta = 1$ vertically that will happen. That $\eta = 1$ is a singularity of $K_2$ does not change the conclusions of the propositions since we can take an appropriate limit of contours not passing through 1. So we may take $\Gamma$ to be the circle with diameter $[0, 1]$. But this is not star-shaped with respect to the origin, so Proposition 4 would not apply (even though Proposition 3 would). Therefore we expand it a little on the left, resulting in a contour that is star-shaped. We expand it so that instead of 0 it passes through $-t^{-1/2}$. This, finally, is the contour $\Gamma$ in this section: the circle symmetric about the real line and meeting it at $\eta = -t^{-1/2}$ and $\eta = 1$.

From the identity $\det(I + A) = \det_2(I + A)\, e^{\mathrm{tr}\, A}$ and the fact that the $\det_2$ is bounded on $\|\cdot\|_2$-bounded sets, we see that it suffices to prove that $\mathrm{tr}\,(K_2 (I + R)) = O(1)$, $\|K_2 (I + R)\|_2 = O(1)$. We shall prove more, namely

$$\mathrm{tr}\, K_2 = O(1), \qquad \|K_2\|_2 = O(1), \qquad \|K_2 R\|_1 = O(1). \tag{17}$$

We begin by obtaining a bound for integrals involving the various $\varphi_n(\eta)$. The coefficients of $t$ appearing in the exponentials of these functions are of the form

$$\frac{1}{1 - \eta} - \frac{1}{1 - v\eta} + \log \frac{1 - \eta}{1 - v\eta} \tag{18}$$

with $0 \le v \le \tau$. On the part of $\Gamma$ outside any fixed neighborhood of zero in $\mathbb{C}$ the real parts of these are uniformly bounded above by $-\delta$ for some $\delta > 0$ when $t$ is sufficiently large. In a sufficiently small fixed neighborhood of zero the real part is at most $O(t^{-1}) - \delta\, |\eta|^2$. It follows that $\varphi_n(\eta) = O(e^{-\delta |\eta|^2 t + O(t^{1/2} |\eta|)})$, where the $t^{1/2} |\eta|$ term comes from the $s\, t^{1/2}$ term in (15). From this it follows that for any $k \ge 0$,

$$\int_\Gamma |\varphi_n(\eta)|\, |\eta|^k\, |d\eta| = O(t^{-(k+1)/2}), \tag{19}$$

for the following reason. The integral over that part of $\Gamma$ outside any fixed neighborhood of zero is exponentially small. For the integral over a neighborhood of zero we have, if $y = \mathrm{Im}\, \eta$, $|\eta|^2 = O(t^{-1} + y^2)$, $|\eta|^2 \ge y^2$, $|d\eta| = O(dy)$, so the integral over that portion of $\Gamma$ is bounded by a constant times

$$\int_{-\infty}^{\infty} e^{-\delta y^2 t + O(|y|\, t^{1/2})}\, (t^{-1} + y^2)^{k/2}\, dy = O(t^{-(k+1)/2}).$$

If we change variables in (19) we get the equivalent estimate

$$\int_{t^{1/2} \Gamma} |\varphi_n(t^{-1/2} \eta)|\, |\eta|^k\, |d\eta| = O(1).$$

More generally, for all $j > 0$ we have

$$\int_{t^{1/2} \Gamma} |\varphi_n(t^{-1/2} \eta)|^j\, |\eta|^k\, |d\eta| = O(1), \tag{20}$$

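since $\varphi_n(\eta)$ is uniformly bounded. We shall now establish (17). First $K_2$, with kernel

$$\frac{\varphi(\eta')}{\eta' - \tau \eta}.$$

We use the fact that the kernel substitution

$$L(\eta, \eta') \text{ on } \Gamma \;\longrightarrow\; t^{-1/2}\, L(t^{-1/2} \eta,\, t^{-1/2} \eta') \text{ on } t^{1/2} \Gamma \tag{21}$$

preserves norms and traces. The circle $t^{1/2} \Gamma$ meets the real line at $\eta = -1$ and $\eta = t^{1/2}$. Making this substitution gives the kernel

$$\frac{\varphi(t^{-1/2} \eta')}{\eta' - \tau \eta}. \tag{22}$$

We have

$$|\mathrm{tr}\, K_2| \le \frac{1}{1 - \tau} \int_{t^{1/2} \Gamma} \left| \frac{\varphi(t^{-1/2} \eta)}{\eta} \right| |d\eta| = O(1),$$

by (20) and the fact that $t^{1/2} \Gamma$ is bounded away from zero. Next,

$$\|K_2\|_2^2 = \int_{t^{1/2} \Gamma} \int_{t^{1/2} \Gamma} \left| \frac{\varphi(t^{-1/2} \eta')}{\eta' - \tau \eta} \right|^2 |d\eta|\, |d\eta'|.$$

Now

$$\int_{t^{1/2} \Gamma} \frac{1}{|\eta' - \tau \eta|^2}\, |d\eta| = O(1)$$

uniformly for $\eta' \in t^{1/2} \Gamma$.¹⁰ Using this and (20) we see that $\|K_2\|_2 = O(1)$.

¹⁰ That's because if $\eta \in t^{1/2} \Gamma$ then the distance from $\tau \eta$ to $t^{1/2} \Gamma$ is at least some positive constant times $|\eta|$.

Next, $K_2 R$. When $x$ is given by (15) we find that

$$G'(\eta, \eta', u) = -\left[ \frac{u \eta^2 t}{(1 - u\eta)^2} + \frac{\gamma^{1/2}\, s\, \eta\, t^{1/2}}{1 - u\eta} - \frac{\tau^{-1} \eta}{\eta' - \tau^{-1} u \eta} \right] G(\eta, \eta', u).$$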

Asymptotics in ASEP with Step Initial Condition

143

From this and the fact that $u\eta$ is bounded away from 1 when $\eta \in \Gamma$ and $u \le \tau^2$ we find that each $G^{(k)}(\eta,\eta',u)/G(\eta,\eta',u)$ is bounded by a linear combination of products
\[ |\eta\,t^{1/2}|^{i}\,\left|\frac{\eta}{\eta'-u\eta/\tau}\right|^{j}. \]
Since $G(\eta,\eta',0) = \varphi_\infty(\eta)/\eta'$ it follows in particular that $G^{(k)}(\eta,\eta',0)$ is bounded by a constant times a linear combination of products
\[ |\eta\,t^{1/2}|^{i}\,|\eta|^{j}\,|\eta'|^{-j-1}\,|\varphi_\infty(\eta)|. \]
After the substitution (21) and the variable change $\zeta \to t^{-1/2}\zeta$ in each integral in (7) we get as bound a linear combination of
\[ \int_{t^{1/2}\Gamma}\left|\frac{\varphi_\infty(t^{-1/2}\zeta)}{\zeta-\tau\eta}\right|\,|\zeta|^{i+j}\,|\eta'|^{-j-1}\,|d\zeta|. \tag{23} \]
The Hilbert-Schmidt norm with respect to $\eta, \eta'$ of $|\zeta-\tau\eta|^{-1}\,|\eta'|^{-j-1}$ on $t^{1/2}\Gamma$ is uniformly bounded for $\zeta \in t^{1/2}\Gamma$ (as in the last footnote), and so the trace norm of the integral is bounded by
\[ \int_{t^{1/2}\Gamma}|\varphi_\infty(t^{-1/2}\zeta)|\,|\zeta|^{i+j}\,|d\zeta| = O(1), \]

by (20). That takes care of $K_2R_1$. For $K_2R_2$ it is enough to show that the last integral in (8) has bounded trace norm for $u \le \tau^2$, for then the trace norm of $K_2R_2$ would be at most a constant times $\sum_{n=1}^{\infty}|\tau^m\lambda|^n$, which is bounded on compact subsets of $|\lambda| < \tau^{-m}$. In the estimate for the last integral the analogue of (23) would be
\[ \int_{t^{1/2}\Gamma}\left|\frac{G_0(t^{-1/2}\zeta,u)}{\zeta-\tau\eta}\right|\,|\zeta|^{i+j}\,|\eta'-u\zeta/\tau|^{-j-1}\,|d\zeta|, \]
where
\[ G_0(\eta,u) = \left(\frac{1-u\eta}{1-\eta}\right)^{x} e^{\left[\frac{1}{1-\eta}-\frac{1}{1-u\eta}\right]t}. \]
(This is $G$ without its last factor.) Taking the Hilbert-Schmidt norm with respect to $\eta, \eta'$ under the integral sign shows (as in the last footnote again) that the trace norm of the integral is bounded by
\[ \int_{t^{1/2}\Gamma}|G_0(t^{-1/2}\zeta,u)|\,|\zeta|^{i+j}\,|d\zeta|. \]
In $G_0$ the factor of $t$ in the exponent is of the form (18) with $v = u$, and so this integral is $O(1)$ uniformly for $u \le \tau^2$. This completes the proof of (17) and so of Theorem 2.


5. Proof of Theorem 3

In formula (6) the integral is taken over a circle with center zero and radius larger than $\tau^{-m+1}$. We set
\[ \lambda = \tau^{-m}\mu, \tag{24} \]
and the formula becomes
\[ P(x_m(t/\gamma) \le x) = \int \prod_{k=0}^{\infty}(1-\mu\,\tau^k)\cdot\det\bigl(I+\tau^{-m}\mu\,K_2(I+R)\bigr)\,\frac{d\mu}{\mu}, \tag{25} \]
where $\mu$ runs over a circle of fixed radius larger than $\tau$ (but not equal to any $\tau^{-k}$ with $k \ge 0$). We shall show that when $c_1$ and $c_2$ are given by (4) and
\[ x = c_1\,t + c_2\,s\,t^{1/3} \tag{26} \]

the determinant in this integrand has the limit $F_2(s)$ uniformly in $\mu$ and $\sigma$, which will establish the theorem. The main lemma replaces the kernel $\tau^{-m}\mu\,K_2(I+R)$ by one which will allow us to do a steepest descent analysis. Now we do not decompose $R$ into a sum of two kernels, but use the entire infinite series in Proposition 5. We define
\[ f(\mu,z) = \sum_{k=-\infty}^{\infty}\frac{\tau^k}{1-\tau^k\mu}\,z^k. \]
This is analytic for $1 < |z| < \tau^{-1}$ and extends analytically to all $z \ne 0$ except for poles at the $\tau^k$, $k \in \mathbb{Z}$. We define a kernel $J(\eta,\eta')$ acting on a circle with center zero and radius $r \in (\tau,1)$ by
\[ J(\eta,\eta') = \int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(\eta')}\,\frac{\zeta^m}{(\eta')^{m+1}}\,\frac{f(\mu,\zeta/\eta')}{\zeta-\eta}\,d\zeta, \tag{27} \]
where the integral is taken over a circle with center zero and radius in the interval $(1, r/\tau)$.

Lemma 4. With $\lambda$ given by (24) we have $\det(I+\lambda K_2(I+R)) = \det(I+\mu J)$.

Proof. Our operators $K_1$ and $K_2$ may be taken to act on a circle with radius $r \in (1,\tau^{-1})$. From Proposition 5 and the identity
\[ \varphi_n(\zeta) = \frac{\varphi_\infty(\zeta)}{\varphi_\infty(\tau^n\zeta)}, \]
we obtain
\[ K_2R(\eta,\eta') = \sum_{n=1}^{\infty}\lambda^n\int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(\tau^{n+1}\zeta)}\,\frac{d\zeta}{(\zeta-\tau\eta)\,(\eta'-\tau^n\zeta)}. \]


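Here $|\zeta| = r$ but by analyticity we may take any radius such that $1 < |\zeta| < \tau^{-1}r$. This is equal to
\[ \sum_{n=1}^{\infty}\lambda^n\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)}\,\frac{1}{u-\tau^{n+1}}, \]
as long as on the circle of $u$-integration we have $\tau^2 < |u| < \tau r/|\zeta|$. We use
\[ \frac{1}{u-\tau^{n+1}} = \sum_{k=0}^{\infty}\frac{\tau^{(n+1)k}}{u^{k+1}} \]
and sum over $n$ first to get
\[ \sum_{k=0}^{\infty}\frac{\tau^{2k}\lambda}{1-\tau^k\lambda}\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)}\,\frac{1}{u^{k+1}}. \]
If we assume also that
\[ \tau < |u| < \tau r/|\zeta|,\ \text{which requires also that}\ 1 < |\zeta| < r, \tag{28} \]
we may rewrite this as
\[ \sum_{k=0}^{\infty}\frac{\tau^k}{1-\tau^k\lambda}\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)}\,\frac{1}{u^{k+1}} \;-\; \sum_{k=0}^{\infty}\tau^k\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)}\,\frac{1}{u^{k+1}}, \]
because both series converge. Summing the second series gives
\[ -\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)\,(u-\tau)}. \]
Since $\varphi_\infty(u\zeta)$ is analytic and nonzero inside the $u$-contour (since $|u\zeta| < \tau r < 1$) and $\tau$ is inside and $\tau\eta'/\zeta$ outside this equals
\[ -\int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(\tau\zeta)}\,\frac{d\zeta}{(\zeta-\tau\eta)\,(\eta'-\zeta)} = -\int\frac{\varphi(\zeta)}{(\zeta-\tau\eta)\,(\eta'-\zeta)}\,d\zeta. \]
If we expand the contour so that $|\zeta| > r$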


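then we pass the pole at $\zeta = \eta'$ and get
\[ -\frac{\varphi(\eta')}{\eta'-\tau\eta} - \int_{|\zeta|>r}\frac{\varphi(\zeta)}{(\zeta-\tau\eta)\,(\eta'-\zeta)}\,d\zeta. \]
The first summand is exactly $-K_2(\eta,\eta')$, so we have shown
\[ K_2(I+R)(\eta,\eta') = -\int_{|\zeta|>r}\frac{\varphi(\zeta)}{(\zeta-\tau\eta)\,(\eta'-\zeta)}\,d\zeta + \sum_{k=0}^{\infty}\frac{\tau^k}{1-\tau^k\lambda}\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,d\zeta\int\frac{du}{\varphi_\infty(u\zeta)\,(\eta'-u\zeta/\tau)}\,\frac{1}{u^{k+1}}. \tag{29} \]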

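If the index $k$ were negative then the $u$-integration would give zero since the integrand would be analytic inside the $u$-contour. Therefore the sum over $k$ can be taken from $-\infty$ to $\infty$. The integration domains in the double integral are given in (28). If we make the variable change $u \to u/\zeta$ in the integral the sum becomes
\[ J_0(\eta,\eta') = \sum_{k=-\infty}^{\infty}\frac{\tau^k}{1-\tau^k\lambda}\int\frac{\varphi_\infty(\zeta)}{\zeta-\tau\eta}\,\zeta^k\,d\zeta\int\frac{du}{\varphi_\infty(u)\,(\eta'-u/\tau)}\,\frac{1}{u^{k+1}}, \]
and the new conditions are
\[ 1 < |\zeta| < r,\qquad \tau|\zeta| < |u| < \tau r. \]
The first operator on the right side of (29) is analytic for $|\eta|, |\eta'| \le r$. The kernel $J_0(\eta,\eta')$ is analytic for $|\eta| \le r$. It follows by Proposition 2 that the Fredholm determinant of the sum of the two, i.e., of $K_2(I+R)$, equals the Fredholm determinant of $J_0$. Now we use (24). Substituting $k \to m+k$ in the first sum below we find that
\[ \sum_{k=-\infty}^{\infty}\frac{\tau^k}{1-\tau^k\lambda}\left(\frac{\zeta}{u}\right)^{k} = \tau^m\left(\frac{\zeta}{u}\right)^{m}\sum_{k=-\infty}^{\infty}\frac{\tau^k}{1-\tau^k\mu}\left(\frac{\zeta}{u}\right)^{k} = \tau^m\left(\frac{\zeta}{u}\right)^{m} f(\mu,\zeta/u). \]
Thus
\[ J_0(\eta,\eta') = \tau^m\int\!\!\int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(u)}\left(\frac{\zeta}{u}\right)^{m}\frac{f(\mu,\zeta/u)}{(\zeta-\tau\eta)\,(\eta'-u/\tau)}\,d\zeta\,\frac{du}{u}. \]
This has the same Fredholm determinant as
\[ \tau^{-1}J_0(\tau^{-1}\eta,\tau^{-1}\eta') = \tau^m\int\!\!\int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(u)}\left(\frac{\zeta}{u}\right)^{m}\frac{f(\mu,\zeta/u)}{(\zeta-\eta)\,(\eta'-u)}\,d\zeta\,\frac{du}{u}, \]
where now the operator acts on a circle with radius $r \in (\tau,1)$ and in the integral
\[ 1 < |\zeta| < r/\tau,\qquad \tau|\zeta| < |u| < r. \]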


We now do something similar to what we did before. If we move the $u$-integral outward, so that $r < |u| < 1$ on the new contour, we pass the pole at $u = \eta'$, which gives the contribution
\[ \tau^m\int\frac{\varphi_\infty(\zeta)}{\varphi_\infty(\eta')}\,\frac{\zeta^m}{(\eta')^{m+1}}\,\frac{f(\mu,\zeta/\eta')}{\zeta-\eta}\,d\zeta = \lambda^{-1}\mu\,J(\eta,\eta'). \]
(The function $f(\mu,\zeta/u)$ remains analytic in $u$ during the deformation.) The new double integral is a kernel analytic for $|\eta|, |\eta'| \le r$ and $J(\eta,\eta')$ is analytic for $|\eta| \le r$. Therefore by Proposition 2, $\det(I+\lambda K_2(I+R)) = \det(I+\mu J)$, which completes the proof.
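The index shift used in the proof amounts to the identity $f(\tau^{-m}\mu, w) = \tau^m w^m f(\mu, w)$ for $1 < |w| < \tau^{-1}$. The following Python sketch checks it numerically with a truncated series, and also compares $\mu f(\mu,z)$ for small $\tau$ with $1/(1-z) + \mu/(1-\mu)$, the limit quoted in the Remark that follows (written there with $z = \zeta/\eta'$). The function names, the truncation length and the sample parameters are hypothetical choices made for illustration only.

```python
def f_trunc(mu, z, tau, terms=200):
    """Truncated bilateral series sum_{k=-terms}^{terms} tau^k z^k / (1 - tau^k mu).

    For k = -j < 0 the summand is rewritten as z^(-j) / (tau^j - mu),
    which is the same number but never forms a large power of 1/tau.
    The series converges for 1 < |z| < 1/tau.
    """
    total = 0j
    for k in range(0, terms + 1):
        total += (tau * z) ** k / (1 - tau ** k * mu)
    for j in range(1, terms + 1):
        total += z ** (-j) / (tau ** j - mu)
    return total


def shift_identity_gap(mu, w, tau, m):
    """|f(tau^-m mu, w) - tau^m w^m f(mu, w)|, which should be ~0."""
    lam = tau ** (-m) * mu
    return abs(f_trunc(lam, w, tau) - tau ** m * w ** m * f_trunc(mu, w, tau))


def small_tau_gap(mu, z, tau):
    """|mu f(mu, z) - (1/(1-z) + mu/(1-mu))|, which should be O(tau)."""
    return abs(mu * f_trunc(mu, z, tau) - (1 / (1 - z) + mu / (1 - mu)))
```

For instance, `shift_identity_gap(0.7 + 0.2j, 1.3, 0.5, 3)` is at the level of rounding error, while `small_tau_gap(0.3 + 0.1j, 1.5, 1e-8)` is small because $\tau$ is.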

Remark. The lemma was proved under the assumption that $\tau > 0$. The only occurrence of $\tau$ in $\mu J(\eta,\eta')$ is in $\mu f(\mu,\zeta/\eta')$, and as $\tau \to 0$ this tends to
\[ \frac{\eta'}{\eta'-\zeta} + \frac{\mu}{1-\mu}. \]
Since the probabilities $P(x_m(t) \le x)$ are continuous in $p$ at $p = 0$,¹¹ the integral formula we derived for the probability holds for $p = 0$ as well, with this replacement for $\mu f(\mu,\zeta/\eta')$. The asymptotics that follow are actually simpler in this case.

¹¹ This follows, for example, from formula (2) of [8].

We now explain where the constants $c_1$ and $c_2$ come from. When we do a saddle point analysis of the integral in (27) the first step is to write $\varphi_\infty(\zeta)\,\zeta^m$ as the exponential of
\[ -x\log(1-\zeta) + \frac{t\,\zeta}{1-\zeta} + m\log\zeta, \]
and differentiate this to get the saddle point equation
\[ \frac{x}{1-\zeta} + \frac{t}{(1-\zeta)^2} + \frac{m}{\zeta} = 0, \]
or
\[ (m-x)\,\zeta^2 + (x+t-2m)\,\zeta + m = 0. \]
The transition of the asymptotics occurs when the two saddle points coincide, which is when
\[ (x+t-2m)^2 = 4\,m\,(m-x). \]
This gives
\[ m = \frac{(x+t)^2}{4t}. \]
Setting $m = \sigma t$ and $x = c_1 t$ gives
\[ \sigma = \frac{(c_1+1)^2}{4}, \]


or $c_1 = -1 \pm 2\sqrt{\sigma}$. Since $c_1$ should be increasing with $\sigma$ we take the positive square root in (4). The saddle point is at
\[ \xi = -\sqrt{\sigma}/(1-\sqrt{\sigma}). \]
We compute that if $x$ is given by (26) precisely and we set
\[ \varphi_\infty(\zeta)\,\zeta^m = \varphi_\infty(\xi)\,\xi^m\,e^{\psi(\zeta)}, \]
then in a neighborhood of $\zeta = \xi$,
\[ \psi(\zeta) = -c_3^3\,t\,(\zeta-\xi)^3/3 + c_3\,s\,t^{1/3}(\zeta-\xi) + O(t(\zeta-\xi)^4) + O(t^{1/3}(\zeta-\xi)^2), \tag{30} \]
where
\[ c_3 = \sigma^{-1/6}(1-\sigma^{1/2})^{5/3}. \]
(It is only with $c_2$ as given in (4) that the coefficients of $t$ and $t^{1/3}$ are related this way.) Carrying out the details, we define
\[ \psi_0(\zeta) = -c_1\log(1-\zeta) + \frac{\zeta}{1-\zeta} + \sigma\log\zeta,\qquad \psi_1(\zeta) = \psi_0(\zeta) - \psi_0(\xi). \]
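The constants above can be sanity-checked numerically: at $\zeta = \xi$ the first two derivatives of $\psi_0$ should vanish (a double saddle point), and the third derivative should equal $-2c_3^3$, which is what produces the cubic term in (30). The Python sketch below does this using derivatives of $\psi_0$ computed by hand from its definition; the function names are hypothetical, and the check is an illustration rather than part of the authors' argument.

```python
import math


def saddle_data(sigma):
    """Return (c1, xi, c3) for 0 < sigma < 1, using the formulas in the text."""
    root = math.sqrt(sigma)
    c1 = -1.0 + 2.0 * root
    xi = -root / (1.0 - root)
    c3 = sigma ** (-1.0 / 6.0) * (1.0 - root) ** (5.0 / 3.0)
    return c1, xi, c3


def psi0_derivatives(zeta, c1, sigma):
    """First three derivatives of
    psi0(zeta) = -c1*log(1 - zeta) + zeta/(1 - zeta) + sigma*log(zeta),
    differentiated by hand."""
    one_minus = 1.0 - zeta
    d1 = c1 / one_minus + 1.0 / one_minus ** 2 + sigma / zeta
    d2 = c1 / one_minus ** 2 + 2.0 / one_minus ** 3 - sigma / zeta ** 2
    d3 = 2.0 * c1 / one_minus ** 3 + 6.0 / one_minus ** 4 + 2.0 * sigma / zeta ** 3
    return d1, d2, d3


def saddle_check(sigma):
    """Return (psi0'(xi), psi0''(xi), psi0'''(xi) + 2*c3**3); all should be ~0."""
    c1, xi, c3 = saddle_data(sigma)
    d1, d2, d3 = psi0_derivatives(xi, c1, sigma)
    return d1, d2, d3 + 2.0 * c3 ** 3
```

For the case $\sigma = 1/4$ used in the figures, `saddle_data(0.25)` gives $c_1 = 0$, $\xi = -1$ and $c_3 = 2^{-4/3}$, and all three entries of `saddle_check(0.25)` vanish up to rounding.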

There are two steepest descent curves, an outer one $\Gamma_o$ and an inner one $\Gamma_i$. (See Fig. 1. All curves are for the case $\sigma = 1/4$.) Both pass through $\xi$ and have cusps at 1. The outer one emanates from $\xi$ in the directions $\pm 2\pi/3$ and has an inner-pointing cusp at $\zeta = 1$. On it, $\operatorname{Re}(\psi_1(\zeta))$ has its maximum of zero at $\zeta = \xi$ and tends to $-\infty$ at the cusp. The inner one emanates from $\xi$ in the directions $\pm\pi/3$ and has an outer-pointing cusp at $\eta = 1$. On it, $\operatorname{Re}(\psi_1(\eta))$ has its minimum of zero at $\eta = \xi$ and tends to $+\infty$ at the cusp.

We would like to deform the $\eta$-contour for $J$, which is a circle with radius $r < 1$, to $\Gamma_i$ and apply Proposition 1 to assure that the Fredholm determinant doesn’t change. The $\zeta$-contour started out as a circle with radius slightly bigger than one. We may deform the $\eta$-contour as described if we deform the $\zeta$-contour simultaneously, assuring that the $\zeta$-contour is always just outside the $\eta, \eta'$-contour, so that in particular we don’t pass a singularity of $f(\mu,\zeta/\eta')$. Next we want to expand the $\zeta$-contour outward to $\Gamma_o$, but in the process we might encounter a singularity of $f(\mu,\zeta/\eta')$, and this causes a problem. It will happen if a ray from zero meets $\Gamma_i$ at a point $\eta$ and $\Gamma_o$ at $\zeta$ and $\eta/\zeta \le \tau$. This will not happen if $\tau$ is close enough to zero but will happen if $\tau$ is close enough to one. But we do not have to use the steepest descent curves, and the next lemma says that we can always find curves passing through $\xi$ in the right directions that do the job. The main point is that during the simultaneous deformation of the $\zeta$- and $\eta$-contours no singularity of the integrand is passed. This means that the $\eta$-contour is strictly inside the $\zeta$-contour, 1 is between the two, and if a ray from zero meets the $\zeta$-contour at $\zeta$ and the $\eta$-contour at $\eta$, then the ratio $\eta/\zeta$ is strictly greater than $\tau$. Thus we will have to make this ratio as close to one as desired.

Lemma 5. There are disjoint closed curves $\Gamma_\eta$ and $\Gamma_\zeta$ with the following properties.
(i) The part of $\Gamma_\eta$ in a neighborhood $N_\eta$ of $\eta = \xi$ is a pair of rays from $\xi$ in the directions $\pm\pi/3$ and the part of $\Gamma_\zeta$ in a neighborhood $N_\zeta$ of $\zeta = \xi$ is a pair of rays from $\xi - t^{-1/3}$ in the directions $\pm 2\pi/3$.
(ii) For some $\delta > 0$ we have $\operatorname{Re}(\psi_1(\zeta)) < -\delta$ on $\Gamma_\zeta\setminus N_\zeta$ and $\operatorname{Re}(\psi_1(\eta)) > \delta$ on $\Gamma_\eta\setminus N_\eta$.

Fig. 1. Steepest descent curves $\Gamma_o$ and $\Gamma_i$ for $\psi_1$. The point $\xi$ is the location of the saddle point

(iii) The circular $\eta$- and $\zeta$-contours for $J$ can be simultaneously deformed to $\Gamma_\eta$ and $\Gamma_\zeta$, respectively, so that during the deformation the integrand in (27) remains analytic in all variables.

Proof.¹² From the local behavior of $\psi_1$ near $\xi$,
\[ \psi_1(\zeta) \sim -c_3^3\,(\zeta-\xi)^3/3, \tag{31} \]
and its global behavior we see that the set where $\operatorname{Re}(\psi_1) = 0$ consists of three closed curves meeting at $\xi$. (See Fig. 2.) One, which we call $C_i$ since it is the inside one, has the tangent directions $\pm\pi/6$ at $\xi$ and meets the real line at a point in $(0,1)$; another, which we call $C_m$ because it is the middle one, has the tangent directions $\pm\pi/2$ at $\xi$ and meets the real line at 1; the third, which we call $C_o$ since it is the outside one, has the tangent directions $\pm 5\pi/6$ at $\xi$ and meets the real line at a point in $(1,\infty)$. We have $\operatorname{Re}(\psi_1) < 0$ inside $C_i$, $\operatorname{Re}(\psi_1) > 0$ between $C_i$ and $C_m$, $\operatorname{Re}(\psi_1) < 0$ between $C_m$ and $C_o$, and $\operatorname{Re}(\psi_1) > 0$ outside $C_o$. (All these may be seen by taking appropriate points in the regions and using the fact that they are connected.) Our curves $\Gamma_\eta$ and $\Gamma_\zeta$ will be very close to $C_m$, the first inside it and the second outside it.

The set where $\operatorname{Re}(\psi_1) = \varepsilon$, with $\varepsilon$ small and positive, consists of two curves, one lying between $C_i$ and $C_m$ and tangent to $C_m$ at $\eta = 1$, and the other outside $C_o$. We are interested in the first, which we call $C^{(\varepsilon)}$. (See Fig. 3.) Except for a neighborhood of $\xi$, one part of $C^{(\varepsilon)}$ is very close to $C_m$ and inside it and the other very close to $C_i$ and outside it. These are joined near $\xi$ by smooth curves. The rays $\arg(\eta-\xi) = \pm\pi/3$ meet $C^{(\varepsilon)}$ at points $\eta_\varepsilon^+$ and $\eta_\varepsilon^-$ close to $\xi$. The curve $\Gamma_\eta$ is described as follows: it goes from $\xi$ in the direction $-\pi/3$ until $\eta_\varepsilon^-$, then it takes a right turn and goes counterclockwise around $C^{(\varepsilon)}$ (it will be very close to $C_m$ the while) until $\eta_\varepsilon^+$, and then it goes backwards along the ray with direction $\pi/3$ until returning to $\xi$. (Actually, we modify this by making a semi-circular indentation around $\eta = 1$ to the left.)

¹² The reader satisfied with an assumption that $\tau$ is small enough need not read what follows.

Fig. 2. Curves $C_o$, $C_m$ and $C_i$ defined by $\operatorname{Re}(\psi_1) = 0$

Fig. 3. Curves $C^{(\varepsilon)}$, $C_m$, $C_i$ and rays $\arg(\eta-\xi) = \pm\pi/3$ used in the construction of $\Gamma_\eta$

Fig. 4. Curves $C^{(-\varepsilon)}$, $C_m$, $C_o$ and rays $\arg(\zeta-\xi+t^{-1/3}) = \pm 2\pi/3$ used in the construction of $\Gamma_\zeta$

The curve $\Gamma_\zeta$ is obtained similarly. The set where $\operatorname{Re}(\psi_1) = -\varepsilon$ consists of two curves, one lying inside $C_i$ and the other between $C_m$ and $C_o$ and tangent to $C_m$ at $\zeta = 1$. We are interested in the second, which we call $C^{(-\varepsilon)}$. (See Fig. 4.) Except for a neighborhood of $\xi$, one part of $C^{(-\varepsilon)}$ is very close to $C_m$ and outside it and the other very close to $C_o$ and inside it. These are joined near $\xi$ by smooth curves. The rays $\arg(\zeta-\xi+t^{-1/3}) = \pm 2\pi/3$ meet the curves at points $\eta_{-\varepsilon}^+$ and $\eta_{-\varepsilon}^-$ near $\xi$. The curve $\Gamma_\zeta$ is described as follows: it goes from $\xi - t^{-1/3}$ in the direction $-2\pi/3$ until $\eta_{-\varepsilon}^-$, then it takes a left turn and goes counterclockwise around $C^{(-\varepsilon)}$ (it will be very close to $C_m$ the while) until $\eta_{-\varepsilon}^+$, and then it goes backwards along the ray with direction $2\pi/3$ until returning to $\xi - t^{-1/3}$. (We modify this by making a small semi-circular indentation around $\zeta = 1$ to the right.)

Let us see why the three stated conditions are satisfied. The first is obvious. The bounds in the second are clear on the curved parts of the contours, and are easy to see from (31) on the line segments near $\xi$. As for (iii), the $\zeta$- and $\eta$-contours start out as just outside and just inside the unit circle, respectively. We may simultaneously deform these contours to just outside and inside $C_m$, respectively, without passing any singularity of the integrand in (27), as long as the contours remain close enough to each other (and bounded away from zero). Then a further small deformation takes them to $\Gamma_\zeta$ and $\Gamma_\eta$.

Proof of Theorem 3. By part (iii) of the lemma and Proposition 1 the determinant is unchanged if $J$ acts on $\Gamma_\eta$ and the integral in (27) is over $\Gamma_\zeta$. The operator $\mu J$ is the product $AB$, where $A : L^2(\Gamma_\zeta) \to L^2(\Gamma_\eta)$ and $B : L^2(\Gamma_\eta) \to L^2(\Gamma_\zeta)$ have kernels
\[ A(\eta,\zeta) = \frac{e^{\psi(\zeta)}}{\zeta-\eta},\qquad B(\zeta,\eta) = \frac{\mu\,f(\mu,\zeta/\eta)}{\eta\,e^{\psi(\eta)}}. \]


Aside from the factors involving $\psi$ both kernels are uniformly $O(t^{1/3})$, due to the fact that the $\zeta$-contour was shifted to the left by $t^{-1/3}$ near $\xi$. It follows from this and (ii) that if we restrict the kernels to either $\zeta \in \Gamma_\zeta\setminus N_\zeta$ or $\eta \in \Gamma_\eta\setminus N_\eta$ the resulting product has exponentially small trace norm. So for the limit of the determinant we may replace the contours by their portions in $N_\zeta$ and $N_\eta$, which are rays. Using (30) we see that we may further restrict $\eta$ and $\zeta$ to $t^{-a}$-neighborhoods of $\xi$ as long as $a < 1/3$, because with either variable outside such a neighborhood the product has trace norm $O(e^{-\delta\,t^{1-3a}})$.

On these segments of rays we make the replacements
\[ \eta \to \xi + c_3^{-1}t^{-1/3}\eta,\qquad \eta' \to \xi + c_3^{-1}t^{-1/3}\eta',\qquad \zeta \to \xi + c_3^{-1}t^{-1/3}\zeta. \]
The new $\eta$-contour consists of the rays from 0 to $c_3\,t^{1/3-a}e^{\pm\pi i/3}$ while the new $\zeta$-contour consists of the rays from $-c_3$ to $-c_3 + c_3\,t^{1/3-a}e^{\pm 2\pi i/3}$. In the rescaled kernels the factor $1/(\zeta-\eta)$ in $A(\eta,\zeta)$ remains the same. Because near $z = 1$,
\[ f(\mu,z) = O\!\left(\frac{1}{|1-z|}\right)\quad\text{and}\quad f(\mu,z) = \frac{\mu^{-1}}{1-z} + O(1), \]
the factor $\mu f(\mu,\zeta/\eta)/\eta$ in $B(\zeta,\eta)$ becomes
\[ O\!\left(\frac{1}{|\eta-\zeta|}\right)\quad\text{and}\quad \frac{1}{\eta-\zeta} + O(t^{-1/3}) \tag{32} \]
after the rescaling. (The $\mu$ and $\eta$ appearing as they do is very nice.)

As for the factors $e^{\psi(\zeta)}$ and $e^{-\psi(\eta)}$ we see from (30) that for some $\delta > 0$ after scaling they are $O(e^{-\delta|\zeta|^3})$ and $O(e^{-\delta|\eta|^3})$, respectively, on their respective contours. Thus the rescaled kernels are bounded by constants times
\[ \frac{e^{-\delta|\zeta|^3}}{|\zeta-\eta|},\qquad \frac{e^{-\delta|\eta|^3}}{|\eta-\zeta|}, \]
respectively, which are Hilbert-Schmidt, i.e., $L^2$. (Notice that after the scaling $\zeta-\eta$ becomes bounded away from zero.) It follows that convergence in the Hilbert-Schmidt norm of the rescaled operators $A$ and $B$, and so trace norm convergence of their product, would be a consequence of pointwise convergence of their kernels. The error term in (32) goes to zero pointwise. If also $a > 1/4$, which we may assume, the error terms in (30) go to zero and we see that the kernels have pointwise limits
\[ \frac{e^{-\zeta^3/3+s\zeta}}{\zeta-\eta},\qquad \frac{e^{\eta^3/3-s\eta}}{\eta-\zeta}, \]
respectively. Therefore we have found for $\mu J$ the limiting rescaled kernel
\[ \int_{\Gamma_\zeta}\frac{e^{-\zeta^3/3+s\zeta+(\eta')^3/3-s\eta'}}{(\zeta-\eta)\,(\eta'-\zeta)}\,d\zeta. \tag{33} \]
The four rays constituting the rescaled contours $\Gamma_\zeta$ and $\Gamma_\eta$ in the limit go to infinity: the limiting $\Gamma_\zeta$ consists of the rays from $-c_3$ to $-c_3 + \infty\,e^{\pm 2\pi i/3}$ while the limiting $\Gamma_\eta$ consists of the rays from 0 to $\infty\,e^{\pm\pi i/3}$.


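For $\zeta \in \Gamma_\zeta$ and $\eta' \in \Gamma_\eta$ we have $\operatorname{Re}(\zeta-\eta') < 0$, so we may write
\[ \frac{e^{s(\zeta-\eta')}}{\eta'-\zeta} = \int_s^{\infty} e^{x(\zeta-\eta')}\,dx. \]
Hence (33) equals
\[ \int_s^{\infty}\int_{\Gamma_\zeta}\frac{e^{-\zeta^3/3+(\eta')^3/3+x(\zeta-\eta')}}{\zeta-\eta}\,d\zeta\,dx. \]
The operator may be written as a product $ABC$ where the factors have kernels
\[ A(\eta,\zeta) = \frac{e^{-\zeta^3/3}}{\zeta-\eta},\qquad B(\zeta,x) = e^{x\zeta},\qquad C(x,\eta) = e^{-x\eta+\eta^3/3}. \]
These are all Hilbert-Schmidt. The operator $CAB$, which has the same Fredholm determinant, acts on $L^2(s,\infty)$ and has kernel
\[ \int_{\Gamma_\zeta}\int_{\Gamma_\eta} C(x,\eta)\,A(\eta,\zeta)\,B(\zeta,y)\,d\eta\,d\zeta = \int_{\Gamma_\zeta}\int_{\Gamma_\eta}\frac{e^{-\zeta^3/3+\eta^3/3+y\zeta-x\eta}}{\zeta-\eta}\,d\eta\,d\zeta = -K_{\mathrm{Airy}}(x,y), \]
where
\[ K_{\mathrm{Airy}}(x,y) = \int_0^{\infty}\operatorname{Ai}(z+x)\,\operatorname{Ai}(z+y)\,dz.^{13} \]
Hence
\[ \det(I+\mu J) \to \det\bigl(I - K_{\mathrm{Airy}}\,\chi_{(s,\infty)}\bigr) = F_2(s). \]
The convergence is clearly uniform for $\mu$ on its fixed circle, and it is easy to see that it is uniform in the neighborhood of any fixed $\sigma$ and therefore for $\sigma$ in any compact subset of $(0,1)$. This completes the proof.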

¹³ The reason the double integral equals $-K_{\mathrm{Airy}}(x,y)$ is that applying the operator $\partial/\partial x + \partial/\partial y$ to the two kernels gives the same result, $\operatorname{Ai}(x)\operatorname{Ai}(y)$, so they differ by a function of $x - y$. Since both kernels go to zero as $x$ and $y$ go to $+\infty$ independently this function must be zero.

Acknowledgements. This work was supported by the National Science Foundation through grants DMS-0553379 (first author) and DMS-0552388 (second author).

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


References
1. Balázs, M., Seppäläinen, T.: Order of current variance and diffusivity in the asymmetric simple exclusion process. http://arxiv.org/abs/math/0608400v3[math.PR], 2008
2. Gohberg, I.C., Krein, M.G.: Introduction to the Theory of Linear Nonselfadjoint Operators. Transl. Amer. Math. Soc. 13, Providence, RI: Amer. Math. Soc., 1969
3. Liggett, T.M.: Interacting Particle Systems. [Reprint of the 1985 original.] Berlin: Springer-Verlag, 2005
4. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209, 437–476 (2000)
5. Kardar, M., Parisi, G., Zhang, Y.-C.: Dynamic scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
6. Quastel, J., Valkó, B.: $t^{1/3}$ superdiffusivity of finite-range asymmetric exclusion processes on $\mathbb{Z}$. Commun. Math. Phys. 273, 379–394 (2007)
7. Tracy, C.A., Widom, H.: Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
8. Tracy, C.A., Widom, H.: A Fredholm determinant representation in ASEP. J. Stat. Phys. 132, 291–300 (2008)

Communicated by H. Spohn

Commun. Math. Phys. 290, 155–218 (2009)
Digital Object Identifier (DOI) 10.1007/s00220-009-0812-6

Communications in Mathematical Physics

Power Law Inflation

Hans Ringström
Department of Mathematics, KTH, 100 44 Stockholm, Sweden. E-mail: [email protected]

Received: 11 August 2008 / Accepted: 6 November 2008
Published online: 19 April 2009 – © Springer-Verlag 2009

Abstract: The subject of this paper is Einstein’s equations coupled to a non-linear scalar field with an exponential potential. The problem we consider is that of proving future global non-linear stability of a class of spatially locally homogeneous solutions to the equations. There are solutions on $\mathbb{R}_+ \times \mathbb{R}^n$ with accelerated expansion of power law type. We prove a result stating that if we have initial data that are close enough to those of such a solution on a ball of a certain radius, say $B_{4R_0}(p)$, then all causal geodesics starting in $B_{R_0}(p)$ are complete to the future in the maximal globally hyperbolic development of the data we started with. In other words, we only make local assumptions in space and obtain global conclusions in time. We also obtain asymptotic expansions in the region over which we have control. As a consequence of this result and the fact that one can analyze the asymptotic behaviour in most of the spatially homogeneous cases, we obtain quite a general stability statement in the spatially locally homogeneous setting.

Contents
1. Introduction
2. Reformulation of the Equations on $\mathbb{T}^n$
3. Model Problem
4. Energy Estimates
5. Bootstrap Assumptions
6. Estimates for the Non-Linearity
7. Differential Inequalities
8. Global Existence
9. Causal Structure
10. Asymptotic Expansions
11. Proof of the Main Theorem
12. Stability of Locally Spatially Homogeneous Spacetimes


H. Ringström

1. Introduction

1.1. Background and motivation. The spacetimes currently used by physicists to model the universe are ones with accelerated expansion. However, such expansion can be achieved by many different mechanisms, and which one to choose is not completely clear. Some examples of candidates are a positive cosmological constant, quintessence and k-essence, cf. e.g. [16–18]. Due to this uncertainty, it seems reasonable to try to understand the behaviour of solutions under as general assumptions on the model as possible. One particular question of interest is that of future global non-linear stability, i.e., for the purposes of the present discussion, the following question: given initial data for the equations such that the corresponding maximal globally hyperbolic development (MGHD) is future causally geodesically complete, do small perturbations of the initial data also yield future causally geodesically complete MGHD’s? It is of course also of interest to analyze the asymptotics in the causally geodesically complete direction, but that the answer to the above question be yes is a minimum requirement for stating that the MGHD of the given initial data is future stable.

In [20], we built a framework for considering the question of future global non-linear stability for Einstein’s equations coupled to a non-linear scalar field. The actual case considered in [20] was that of a potential with a non-degenerate positive local minimum, Einstein’s vacuum equations with a positive cosmological constant being contained as a special case, and the resulting expansion being exponential. As a test of the framework of [20], and of the preconception that situations with accelerated expansion are stable, it is of interest to use it to prove stability in some other context. Here we study the behaviour in the case of an exponential potential. There are solutions of the corresponding equations on $\mathbb{R}_+ \times \mathbb{R}^n$ such that the metric is of the form
\[ -dt^2 + t^{2p}\,\delta_{ij}\,dx^i \otimes dx^j, \tag{1} \]

where p > 1 is a real number, δ is the Kronecker delta and t and x i are standard coordinates on R+ and Rn respectively. In other words, the expansion is of power law type, and in the limiting case, p = 1, it is not accelerated. One might thus expect the problem of proving stability to be harder in this setting, and, in fact, it is more difficult to analyze the behaviour of the solutions to the PDE’s that result in the end. To our knowledge, the first author to study an exponential potential was Halliwell, cf. [7], who considered the spatially homogeneous and isotropic case. Later, the spatially homogeneous but non-isotropic case was studied in [10]. The question of stability in the case of 3 + 1 dimensions has also been considered, see [9]. In [9], Heinzle and Rendall used the results of Michael Anderson on the stability of even dimensional de Sitter space, cf. [1], together with Kaluza Klein reduction techniques, in order to obtain stability of the metrics (1) and the corresponding scalar fields, for a discrete set of values of p converging to 1. It is of interest to note that the methods used in [1] avoid the problem of proving global existence of a system of PDE’s by an intelligent and geometric choice of equations, see also [5]. In other words, the arguments used to prove the stability results of [9] are essentially geometric in flavour. In the present paper, the focus is rather on the analysis aspect, and though the perspective taken is less geometric, the results are more robust; we get stability in n + 1 dimensions of the metrics (1) together with the corresponding scalar fields for any p > 1. We also formulate a result which makes local assumptions in space and yields global conclusions in time. From a conceptual point of view, this is the natural type of result to prove due to the extreme nature of the causal structure in the case of accelerated expansion. However, it is also very convenient in practice to have


such a statement; combining it with the results concerning the asymptotic behaviour in the spatially homogeneous setting, we get a non-linear stability result for quite general spatially locally homogeneous solutions to the equations under consideration.

1.2. Equations. The subject of this paper is Einstein’s equations, given by
\[ G_{\mu\nu} = T_{\mu\nu}, \tag{2} \]
where
\[ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\,S\,g_{\mu\nu}, \]
$R_{\mu\nu}$ are the components of the Ricci tensor of a Lorentz metric $g$ on an $n+1$-dimensional manifold $M$, and $S$ is the associated scalar curvature. In this paper, we shall be interested in stress energy tensors of the form
\[ T_{\mu\nu} = \nabla_\mu\phi\,\nabla_\nu\phi - \left[\frac{1}{2}\nabla^\gamma\phi\,\nabla_\gamma\phi + V(\phi)\right] g_{\mu\nu}, \tag{3} \]
where $\nabla$ is the Levi-Civita connection associated with the metric $g$, $\phi$ is a smooth function on $M$,
\[ V(\phi) = V_0\,e^{-\lambda\phi}, \tag{4} \]
and $V_0$ and $\lambda$ are positive constants. We shall refer to the matter model defined by (3) as the non-linear scalar field model, to $V$ as the potential and to $\phi$ as the scalar field. Note that in this situation, (2) is equivalent to
\[ R_{\mu\nu} = \nabla_\mu\phi\,\nabla_\nu\phi + \frac{2}{n-1}\,V(\phi)\,g_{\mu\nu}. \tag{5} \]
It should of course be coupled to a matter equation for $\phi$, which is given by
\[ \nabla^\mu\nabla_\mu\phi - V'(\phi) = 0. \tag{6} \]
Observe that this equation is a sufficient, but not necessary, condition for the stress energy tensor to be divergence free. We do, however, impose it. The system of equations of interest is thus (5)–(6).
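The equivalence of (2) and (5) is a trace computation: tracing (2) gives $S = -\frac{2}{n-1}\,g^{\mu\nu}T_{\mu\nu}$, hence $R_{\mu\nu} = T_{\mu\nu} - \frac{1}{n-1}\,(g^{\alpha\beta}T_{\alpha\beta})\,g_{\mu\nu}$. The following Python sketch checks this pointwise with random numbers. It assumes a frame in which $g = \mathrm{diag}(-1,1,\dots,1)$ at the point under consideration, and the function names are hypothetical; it is meant only as an illustration of the algebra.

```python
import random


def minkowski(n):
    """Components of g = diag(-1, 1, ..., 1) in dimension n + 1 (assumed frame)."""
    dim = n + 1
    return [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in range(dim)]
            for a in range(dim)]


def ricci_from_einstein(n, dphi, V):
    """Solve G = T, with T as in (3), for the Ricci components at one point."""
    g = minkowski(n)
    dim = n + 1
    # g is diagonal with entries +-1, so it equals its own inverse.
    grad_sq = sum(g[a][a] * dphi[a] ** 2 for a in range(dim))
    T = [[dphi[a] * dphi[b] - (0.5 * grad_sq + V) * g[a][b] for b in range(dim)]
         for a in range(dim)]
    trace_T = sum(g[a][a] * T[a][a] for a in range(dim))
    # Trace of G = R - S g / 2 gives S = -2 tr(T) / (n - 1), so R = T - tr(T) g / (n - 1).
    return [[T[a][b] - trace_T / (n - 1) * g[a][b] for b in range(dim)]
            for a in range(dim)]


def ricci_from_eq5(n, dphi, V):
    """Right hand side of (5) at the same point."""
    g = minkowski(n)
    dim = n + 1
    return [[dphi[a] * dphi[b] + 2.0 / (n - 1) * V * g[a][b] for b in range(dim)]
            for a in range(dim)]


def max_difference(n, seed=0):
    """Largest componentwise gap between the two expressions for random data."""
    rng = random.Random(seed)
    dphi = [rng.uniform(-1.0, 1.0) for _ in range(n + 1)]
    V = rng.uniform(0.1, 2.0)
    A = ricci_from_einstein(n, dphi, V)
    B = ricci_from_eq5(n, dphi, V)
    return max(abs(A[a][b] - B[a][b]) for a in range(n + 1) for b in range(n + 1))
```

Under these assumptions `max_difference(n)` is at the level of rounding error for every $n \ge 2$, whatever values are drawn for the gradient of $\phi$ and for $V$.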

1.3. Initial value problem. Concerning the system of equations under consideration, there is a natural initial value problem. The idea is to specify initial data that would correspond to the metric, second fundamental form, scalar field and normal derivative of the scalar field induced on a spacelike hypersurface in the Lorentz manifold one wishes to construct. However, in order for this to make sense, the initial data cannot be specified freely; they have to satisfy certain constraint equations that are implied by the Gauß and Codazzi equations, cf. [20] for more details.


Definition 1. Initial data for (5) and (6) consist of an $n$ dimensional manifold $\Sigma$, a Riemannian metric $h$, a symmetric covariant 2-tensor $k$ and two functions $\phi_a$ and $\phi_b$ on $\Sigma$, all assumed to be smooth and to satisfy
\[ r - k_{ij}k^{ij} + (\mathrm{tr}_h k)^2 = \phi_b^2 + D^i\phi_a\,D_i\phi_a + 2V(\phi_a), \tag{7} \]
\[ D^j k_{ji} - D_i(\mathrm{tr}_h k) = \phi_b\,D_i\phi_a, \tag{8} \]
where $D$ is the Levi-Civita connection of $h$, $r$ is the associated scalar curvature and indices are raised and lowered by $h$. Given initial data, the initial value problem is that of finding
– an $n+1$ dimensional manifold $M$ with a Lorentz metric $g$ and a $\phi \in C^\infty(M)$ such that (5) and (6) are satisfied, and
– an embedding $i : \Sigma \to M$ such that $i(\Sigma)$ is a Cauchy hypersurface in $(M,g)$, $i^*g = h$, $\phi \circ i = \phi_a$, and if $N$ is the future directed unit normal and $\kappa$ is the second fundamental form of $i(\Sigma)$, then $i^*\kappa = k$ and $(N\phi) \circ i = \phi_b$.
Such a triple $(M,g,\phi)$ is referred to as a globally hyperbolic development of the initial data, the existence of an embedding $i$ being tacit.

Remark 1. A Cauchy hypersurface is a set in a Lorentz manifold which is intersected exactly once by every inextendible timelike curve, see [14 or 20] for more details. In the above definition, and below, we assume all Lorentz manifolds to be time oriented. One can of course define the concept of initial data and development for a lower degree of regularity. We shall, however, restrict our attention to the smooth case in this paper. For results concerning the existence of initial data in the current setting, we refer the reader to [3 and 8].

Definition 2. Given initial data $(\Sigma,h,k,\phi_a,\phi_b)$ for (5) and (6), a maximal globally hyperbolic development of the data is a globally hyperbolic development $(M,g,\phi)$, with embedding $i : \Sigma \to M$, such that if $(M',g',\phi')$ is any other globally hyperbolic development of the same data, with embedding $i' : \Sigma \to M'$, then there is a map $\psi : M' \to M$ which is a diffeomorphism onto its image such that $\psi^*g = g'$, $\psi^*\phi = \phi'$ and $\psi \circ i' = i$.

Theorem 1. Given initial data for (5) and (6), there is a maximal globally hyperbolic development of the data which is unique up to isometry.

Remark 2. When we say that $(M,g,\phi)$ is unique up to isometry, we mean that if $(M',g',\phi')$ is another maximal globally hyperbolic development, then there is a diffeomorphism $\psi : M \to M'$ such that $\psi^*g' = g$, $\psi^*\phi' = \phi$ and $\psi \circ i = i'$, where $i$ and $i'$ are the embeddings of $\Sigma$ into $M$ and $M'$ respectively. The proof is as in [2]. This is an important result and will be of use to us in this paper. However, it does not yield any conclusions concerning e.g. causal geodesic completeness.

1.4. Background solution. The basic background solution we are interested in is (in Lemma 1 below, we shall prove that it is a solution)
\[ g_0 = -dt^2 + e^{2K}(t/t_0)^{2p}\,\delta_{ij}\,dx^i \otimes dx^j, \tag{9} \]
\[ \phi_0 = \frac{2}{\lambda}\ln t - \frac{1}{\lambda}\,c_0, \tag{10} \]
on $\mathbb{R}_+ \times \mathbb{T}^n$, where $\mathbb{R}_+ = (0,\infty)$, $t_0 > 0$, $K$ and $p > 1$ are constants and
\[ \lambda = \frac{2}{[(n-1)\,p]^{1/2}}, \tag{11} \]
\[ c_0 = \ln\left[\frac{(n-1)(np-1)\,p}{2V_0}\right]. \tag{12} \]

Note that given the dimension $n$, there is a one to one correspondence between $p$ and $\lambda$, and we shall prefer to specify $p$ rather than $\lambda$. The above constructions make sense for $p > 1/n$, but in order for us to get an accelerated expansion, we need to have $p > 1$. Consider the metric (9) on $\mathbb{R}_+ \times \mathbb{R}^n$. Let $h$ denote the Riemannian metric induced on $\{t_0\} \times \mathbb{R}^n$ by $g_0$ and let $\gamma : [0,T) \to \mathbb{R}_+ \times \mathbb{R}^n$ be a future directed causal curve with $\gamma(0) \in \{t_0\} \times \mathbb{R}^n$. Then, if $\bar\gamma$ is the projection of $\gamma$ to $\mathbb{R}^n$,
\[ l_h[\bar\gamma] := \int_0^T [h_{ij}\,\dot{\bar\gamma}^i\,\dot{\bar\gamma}^j]^{1/2}\,ds \le \frac{t_0}{p-1}, \]
where Latin indices run from 1 to $n$, a convention that will be used consistently in what follows, as well as the convention that Greek indices run from 0 to $n$. Furthermore, the indices used on $\mathbb{R}_+ \times \mathbb{T}^n$ and $\mathbb{R}_+ \times \mathbb{R}^n$ will be the ones associated with the standard frame $\partial_0 = \partial_t$ and $\partial_i$ unless otherwise specified. As a consequence, if we define
\[ \ell(t_0) := \frac{t_0}{p-1}, \tag{13} \]
then
\[ J^+[\{t_0\} \times B_{\ell(t_0)}(\xi)] \subseteq D^+[\{t_0\} \times B_{3\ell(t_0)}(\xi)], \tag{14} \]
where $J^+(A)$ is the causal future of a set $A$ and $D^+(A)$ is the future Cauchy development of a set $A$, cf. [14 or 20] for detailed definitions. This demonstrates that $\ell(t_0)$ is a fundamental length scale and, similarly to the case studied in [20], that if we want to control the behaviour of a solution to the linear wave equation (on $\mathbb{R}_+ \times \mathbb{T}^n$ with metric given by (9)) to the future of $\{t_0\} \times B_{\ell(t_0)}(\xi)$, then we only need to control the initial data on $\{t_0\} \times B_{3\ell(t_0)}(\xi)$. However, it also demonstrates that there is a difference between the case considered in the present paper and the case considered in [20]. In [20], the fundamental length scale was a constant, determined by the dimension and the minimum of the scalar field. In the present case, it depends on the starting time and tends to infinity with the starting time. As a consequence, the size of the ball over which it is necessary to have control in order to predict what happens along causal geodesics that start at the center tends to infinity with time. However, if we consider the above situation on $\mathbb{R}_+ \times \mathbb{T}^n$, then we see that the size of the torus grows even more rapidly if $p > 1$, so that the fraction of the volume of the torus that the ball constitutes tends to zero. Another problem that arises in the present setting is the fact that it is necessary to make a choice of $t_0$ given initial data $(\Sigma,\rho,\kappa,\phi_a,\phi_b)$. We shall here do so by using the relation (10), in which we shall replace $\phi_0$ by the mean value of $\phi_a$ in the ball of interest, cf. Theorem 2 (in particular (15)) for a more precise statement.
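The background solution and the length scale can be checked numerically. The Python sketch below evaluates the residuals of (6) and of the $00$ and $ij$ components of (5) for the metric (9) and the scalar field (10), and the bound on the projected length of a causal curve. It relies on the standard expressions $R_{00} = -n\,\ddot a/a$ and $R_{ij} = [a\ddot a + (n-1)\dot a^2]\,\delta_{ij}$ for a metric $-dt^2 + a(t)^2\delta_{ij}\,dx^i \otimes dx^j$, which are assumptions of this sketch rather than formulas quoted from the paper, and all function names are hypothetical.

```python
import math


def background_constants(n, p, V0):
    """lambda and c0 as in (11) and (12)."""
    lam = 2.0 / math.sqrt((n - 1) * p)
    c0 = math.log((n - 1) * (n * p - 1) * p / (2.0 * V0))
    return lam, c0


def residuals(t, n, p, V0):
    """Residuals of (6) and of the 00 and ij parts of (5) for a(t) ~ t^p.

    The ij residual is divided by a^2, so the constant e^K / t0^p drops out.
    """
    lam, c0 = background_constants(n, p, V0)
    phi = (2.0 / lam) * math.log(t) - c0 / lam
    dphi = 2.0 / (lam * t)
    ddphi = -2.0 / (lam * t * t)
    V = V0 * math.exp(-lam * phi)
    dV = -lam * V                      # V'(phi) for V = V0 * exp(-lam * phi)
    H = p / t                          # a'/a
    # (6): the wave operator on phi(t) is -(phi'' + n H phi').
    field = -(ddphi + n * H * dphi) - dV
    # 00 part of (5): R_00 = -n a''/a, g_00 = -1.
    r00 = -n * p * (p - 1) / t ** 2 - (dphi ** 2 - 2.0 * V / (n - 1))
    # ij part of (5) divided by a^2: a''/a + (n - 1)(a'/a)^2 = 2 V / (n - 1).
    rij = (p * (p - 1) + (n - 1) * p * p) / t ** 2 - 2.0 * V / (n - 1)
    return field, r00, rij


def projected_length_bound(t0, p, T):
    """Integral of (t0/t)^p from t0 to T.

    For a causal curve in the metric (9), e^K (t/t0)^p |dx/dt| <= 1, so this
    bounds the h-length of its projection between times t0 and T.
    """
    return t0 / (p - 1.0) * (1.0 - (t0 / T) ** (p - 1.0))
```

With these assumptions all three residuals vanish up to rounding for any $t > 0$, $n \ge 2$, $p > 1/n$ and $V_0 > 0$, and `projected_length_bound(t0, p, T)` increases to $t_0/(p-1)$ as $T \to \infty$ when $p > 1$.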


H. Ringström

1.5. Results. Before we state the main result, we need to introduce some terminology. Let $\Sigma$ be an n dimensional manifold. We shall be interested in coordinate systems $x$ on open subsets $U$ of $\Sigma$ such that $x : U \to B_1(0)$ is a diffeomorphism. If $s$ is a tensor field on $\Sigma$, we shall use the notation

$$\|s\|_{H^l(U)} = \left( \sum_{i_1,\dots,i_q=1}^{n} \sum_{j_1,\dots,j_r=1}^{n} \sum_{|\alpha| \leq l} \int_{x(U)} \left| \partial^\alpha s^{i_1 \cdots i_q}_{j_1 \cdots j_r} \circ x^{-1} \right|^2 dx^1 \cdots dx^n \right)^{1/2},$$

where the components of $s$ are computed with respect to $x$ and the derivatives are with respect to $x$. When we write $\|s\|_{H^l(U)}$, we shall take it to be understood that there are coordinates $x$ as above. Below, we shall use $\delta$ to denote the Kronecker delta with respect to the $x$ coordinates. In particular, we shall use the notation

$$\|g - a\delta\|_{H^l(U)} = \left( \sum_{i,j=1}^{n} \sum_{|\alpha| \leq l} \int_{x(U)} \left| \partial^\alpha (g_{ij} - a\delta_{ij}) \circ x^{-1} \right|^2 dx^1 \cdots dx^n \right)^{1/2}.$$

Theorem 2. Let $V$ be given by (4), where $V_0$ is a positive number and $\lambda$ is given by (11) in which $n \geq 3$ is an integer and $p > 1$. There is an $\epsilon > 0$, depending on $n$ and $p$, such that if
– $(\Sigma, \rho, \kappa, \phi_a, \phi_b)$ are initial data for (5) and (6), with $\dim \Sigma = n$,
– $x : U \to B_1(0)$ is a diffeomorphism, where $U \subseteq \Sigma$,
– the objects $\langle \phi_a \rangle$, $t_0$ and $K$ are defined by

$$\langle \phi_a \rangle := \frac{1}{\omega_n} \int_{B_1(0)} \phi_a \circ x^{-1}\, dx, \qquad t_0 := \exp\left[ \frac{1}{2} \left( \lambda \langle \phi_a \rangle + c_0 \right) \right], \qquad K := \ln[4\ell(t_0)], \tag{15}$$

where $\omega_n$ is the volume of the unit ball in $\mathbb{R}^n$ with respect to the ordinary Euclidean metric, $c_0$ is defined in (12) and $\ell(t_0)$ is defined in (13), and
– the inequality

$$\|e^{-2K}\rho - \delta\|_{H^{k_0+1}(U)} + \|e^{-2K} t_0 \kappa - p\delta\|_{H^{k_0}(U)} + \|\phi_a - \phi_0(t_0)\|_{H^{k_0+1}(U)} + \|t_0 \phi_b - t_0 (\partial_t \phi_0)(t_0)\|_{H^{k_0}(U)} \leq \epsilon \tag{16}$$

holds, where $k_0$ is the smallest integer satisfying $k_0 > n/2 + 1$, then the maximal globally hyperbolic development $(M, g, \phi)$ of $(\Sigma, \rho, \kappa, \phi_a, \phi_b)$ has the property that if $i : \Sigma \to M$ is the associated embedding, then all causal geodesics that start in $i\{x^{-1}[B_{1/4}(0)]\}$ are future complete. Furthermore, there is a $t_- \in (0, t_0)$ and a smooth map,

$$\Psi : (t_-, \infty) \times B_{5/8}(0) \to M, \tag{17}$$

which is a diffeomorphism onto its image, such that all causal curves that start in $i\{x^{-1}[B_{1/4}(0)]\}$ remain in the image of $\Psi$ to the future, and $g$ and $\phi$ have expansions (18)–(23) in the solid cylinder $[0, \infty) \times B_{5/8}(0)$ when pulled back by $\Psi$. Finally, $\Psi(0, p) = i \circ x^{-1}(p)$ for $p \in B_{5/8}(0)$. In the formulas below, Latin indices refer to the natural Euclidean coordinates on $B_{5/8}(0)$ and $t$ is the natural time coordinate on the

Power Law Inflation


solid cylinder. There is a positive constant $\alpha$, a Riemannian metric $\chi$ on $B_{5/8}(0)$ and constants $K_l$ such that if $\|\cdot\|_{C^l}$ denotes the $C^l$ norm on $B_{5/8}(0)$, we have, for $t \geq t_0$,

$$\|\phi(t, \cdot) - \phi_0(t)\|_{C^l} + \|(t\partial_t \phi)(t, \cdot) - t\partial_t \phi_0(t)\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{18}$$

$$\|(g_{00} + 1)(t, \cdot)\|_{C^l} + \|(t\partial_t g_{00})(t, \cdot)\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{19}$$

$$\left\| t^{-1} g_{0i}(t, \cdot) - \frac{1}{np - 2p + 1} \chi^{jm} \gamma_{jim} \right\|_{C^l} + \|[t\partial_t (t^{-1} g_{0i})](t, \cdot)\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{20}$$

$$\|(t/t_0)^{-2p} e^{-2K} g_{ij}(t, \cdot) - \chi_{ij}\|_{C^l} + \|(t/t_0)^{-2p} e^{-2K} t\partial_t g_{ij}(t, \cdot) - 2p\chi_{ij}\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{21}$$

$$\|(t/t_0)^{2p} e^{2K} g^{ij}(t, \cdot) - \chi^{ij}\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{22}$$

$$\|(t/t_0)^{-2p} e^{-2K} t k_{ij}(t, \cdot) - p\chi_{ij}\|_{C^l} \leq K_l (t/t_0)^{-\alpha}, \tag{23}$$

where $\gamma_{jim}$ are the Christoffel symbols associated with the metric $\chi$ and $k$ is the second fundamental form of the hypersurfaces $\{t\} \times B_{5/8}(0)$.

Remark 3. Remarks similar to those made in connection with the analogous theorem in [20] remain valid and need not be repeated here. Let us simply point out that $t_0$ is chosen so that $\phi_0(t_0) = \langle \phi_a \rangle$, a choice which is essentially necessary, and that $K$ is chosen so that the ball of radius 1 with respect to the x-coordinates roughly corresponds to a ball of radius $4\ell(t_0)$ with respect to $\rho$. The latter choice should be compared with (14); if we replace $3\ell(t_0)$ with $4\ell(t_0)$ on the right-hand side, the inclusion still holds, but with a margin, so that the corresponding statement can be expected to hold in the MGHD's corresponding to perturbed initial data. Due to (22) and (23), we have, for $t \geq t_0$,

$$\|t (g^{ij} k_{jl})(t, \cdot) - p\delta^i_l\|_{C^l} \leq K_l (t/t_0)^{-\alpha},$$

and in this sense, we have isotropization. The expansions are incomplete but with more work it should be possible to obtain more detailed information. In [9], more detailed asymptotic expansions were provided, though it should be pointed out that the foliation considered here differs from that considered in [9]. Note that as a consequence of Theorem 2 and Cauchy stability, cf. Theorem 7 of [20], we get future global non-linear stability of the solutions (9) and (10) on $\mathbb{R}_+ \times \mathbb{T}^n$ for $n \geq 3$, since we can apply Theorem 2 in a neighbourhood of every point at late enough times. The reason for this is that $[4\ell(t)]^{-2} e^{2K} (t/t_0)^{2p}$ tends to infinity, so that a ball in $\mathbb{T}^n$ of fixed positive radius $\epsilon > 0$ with respect to fixed coordinates will sooner or later contain a ball of radius $4\ell(t)$ with respect to the metric induced on $\{t\} \times \mathbb{T}^n$ by the metric $g_0$. The proof of the above theorem is to be found in Sect. 11.

Let us consider the 4-dimensional spatially homogeneous case. In other words, let us restrict our attention to 3-dimensional initial data with a transitive isometry group.
Due to the work of Kitada and Maeda, cf. [10], it is reasonable to hope that Theorem 2 will be applicable in a neighbourhood of every point on a late enough hypersurface of spatial homogeneity, with some exceptions. If Σ is S3 , S2 × R or quotients thereof, then it is not clear that the corresponding solution needs to expand; it might recollapse. The reason for this is that S3 and S2 × R admit homogeneous metrics with positive scalar curvature. To simplify the statement, we shall thus exclude this possibility. Furthermore, we are only interested in the case that the isometry group admits a cocompact subgroup.


Theorem 3. Let $V$ be given by (4), where $V_0$ is a positive number and $\lambda$ is given by (11) in which $n = 3$ and $p > 1$. Let $M$ be a connected and simply connected 3-dimensional manifold and let $(M, h, k, \phi_a, \phi_b)$ be initial data for (5) and (6). Assume, furthermore, that one of the following conditions is satisfied:
– $M$ is a unimodular Lie group different from SU(2) and the isometry group of the initial data contains the left translations.
– $M = \mathbb{H}^3$, where $\mathbb{H}^n$ is the n-dimensional hyperbolic space, and the initial data are invariant under the full isometry group of the standard metric on $\mathbb{H}^3$.
– $M = \mathbb{H}^2 \times \mathbb{R}$ and the initial data are invariant under the full isometry group of the standard metric on $\mathbb{H}^2 \times \mathbb{R}$.
Assume finally that $\mathrm{tr}_h k > 0$. Let $\Gamma$ be a cocompact subgroup of $M$ in the case that $M$ is a unimodular Lie group and a cocompact subgroup of the isometry group otherwise. Let $\Sigma$ be the compact quotient. Then $(\Sigma, h, k, \phi_a, \phi_b)$ are initial data. Make a choice of Sobolev norms $\|\cdot\|_{H^l}$ on tensorfields on $\Sigma$. Then there is an $\epsilon > 0$ such that if $(\Sigma, \rho, \kappa, \varphi_a, \varphi_b)$ are initial data for (5) and (6) satisfying

$$\|\rho - h\|_{H^4} + \|\kappa - k\|_{H^3} + \|\varphi_a - \phi_a\|_{H^4} + \|\varphi_b - \phi_b\|_{H^3} \leq \epsilon,$$

then the maximal globally hyperbolic development corresponding to $(\Sigma, \rho, \kappa, \varphi_a, \varphi_b)$ is future causally geodesically complete and there are expansions of the form given in the statement of Theorem 2 to the future.

Remark 4. If $M$ is a 3-dimensional unimodular Lie group it contains a cocompact subgroup $\Gamma$, cf. [15]. Concerning the definition of Sobolev norms on tensorfields on manifolds, we refer the reader to e.g. [20]. The statement that there are expansions to the future should be interpreted as saying that there is a Cauchy hypersurface Σ in the maximal globally hyperbolic development of $(\Sigma, \rho, \kappa, \varphi_a, \varphi_b)$ such that for every p ∈ Σ , there is a neighbourhood of p to which Theorem 2 applies.
In [20], we made several comments that are equally relevant in the present context, but for the sake of brevity, we do not wish to repeat them here. The proof of the above theorem is to be found in Sect. 12. 1.6. Outline. Let us start by discussing the proof of Theorem 2. Due to the nature of the causal structure, it is sufficient to study the future stability of the solutions given by (9)–(12) on R+ × Tn . The procedure leading to this reduction can briefly be described as follows. Given initial data and a diffeomorphism x : U → B1 (0) as described in the statement of Theorem 2, pull back the initial data to B1 (0) by x −1 . Using a cut-off function and a suitable choice of t0 and K , one can fit the initial data on B1 (0) to the initial data on Tn corresponding to a t = t0 slice of (9)–(12). The resulting data on Tn in general violate the constraints in an annular region. However, they are close to those of the t = t0 slice of (9)–(12), and it is possible to demonstrate stability in the class of constraint violating data for a suitable modification of Einstein’s equations, described below. Thus one obtains a solution to the modified equations which is global to the future. Furthermore, the Cauchy development of the part of B1 (0) unaffected by the cut-off function yields a patch of spacetime corresponding to the original initial data. For the purposes of the present discussion, we shall refer to this patch as the global patch. The statements concerning future completeness of causal geodesics starting in i{x −1 [B1/4 (0)]} and asymptotic expansions hold in the global patch. Constructing local


patches corresponding to the other points of the original initial manifold, one obtains a globally hyperbolic development of the original initial data which includes the global patch. By the abstract properties of the MGHD of the initial data, this globally hyperbolic development can be embedded into the maximal globally hyperbolic one, and the statement of the theorem follows. Due to the above observations, it is clear that the essential step of the argument is to prove future stability of the solutions defined by (9)–(12) in a situation where the constraints are violated. Such a result presupposes a hyperbolic formulation of the equations, which we provide in the beginning of Sect. 2. The formulation we use is based on gauge source functions, cf. [6], together with some additional modifications, cf. (24)–(25). The gauge source functions are chosen so that they coincide with the contracted Christoffel symbols of the background, the equality holding for upstairs indices, cf. (26). The main purpose of adding the modifications is that they make it possible to prove stability for data violating the constraints. However, the modifications, additionally, yield a partial decoupling at the linear level, which leads to a hierarchy we shall describe below, and they yield damping terms which are of crucial importance when proving stability. In the beginning of Sect. 2, we briefly discuss the hyperbolic formulation we shall use, the associated initial data and a division of the terms appearing into ones that have to be taken into account and ones that can, in the end, in practice be ignored. Readers interested in a more complete presentation are referred to [20]. After a discussion of the background solution, we then reformulate the equations. 
The first reformulation serves the purpose of expressing the equations in terms of quantities concerning which we have definite expectations; we subtract the background scalar field $\phi_0$ from the scalar field $\phi$ and consider $\psi = \phi - \phi_0$, $u = g_{00} + 1$, $u_i = g_{0i}$ and $h_{ij} = (t/t_0)^{-2p} g_{ij}$. We expect $\psi$ and $u$ to converge to zero and $h_{ij}$ to converge. Concerning $u_i$, it seems reasonable to expect that if we rescale it by a factor of $t^{-p}$ (the logic being that every downstairs spatial index corresponds to a factor $t^p$), then the resulting object remains small or converges to zero. Thus, it might seem natural to carry out such a rescaling. However, in the case of $u_i$, we shall do this rescaling at the level of the energies, cf. Sect. 5. The resulting equations, (45)–(48), have a certain structure; considering the linear terms, it is clear that the terms involving zeroth order derivatives have a factor in front of them of the form of a constant divided by $t^2$, and the terms involving first order derivatives have a factor in front of them of the form of a constant divided by $t$. Consequently, it seems natural to multiply the equations with $t^2$ and to change the time coordinate so that $t\partial_t = \partial_\tau$ for some new time coordinate $\tau$. This is the purpose of the second reformulation, which leads to Eqs. (61)–(64) with which we shall be working. Starting with (61)–(64), one can generate a model problem by dropping the terms given by $\hat{\Delta}_{\mu\nu}$ and $\hat{\Delta}_\psi$, and by replacing the wave operator $\hat{\Box}_g$ by the wave operator associated with the background. Considering (61)–(64) with these simplifications in mind, one sees that some of the equations partly decouple; the equations for $u$ and $\psi$, (61) and (64), do not involve the remaining unknowns, and the equation for $h_{ij}$, (63), does not involve $u_i$. In other words, there is a hierarchy in the model problem.
One can start by analyzing the model equations for u and ψ, then turn to the model equations for h i j , and finally consider the equation for u i . Even though this hierarchy does not persist in the non-linear case, some aspects of it remain and are of central importance in the proof of future global non-linear stability; given suitable bootstrap assumptions, the hierarchy does, for all practical purposes, persist. Given the structure of the hierarchy, it is natural to start by considering the model equations for u and ψ. Such an analysis is the subject of Sect. 3. It turns out that one can construct an energy which decays


exponentially. For this to hold, one does, however, need to require that n ≥ 3 and p > 1; for p = 1, there are constant, non-zero, solutions to the model equations. In Sect. 4, we write down the energies, not only for u and ψ, but also for h i j and u i , with which we shall be working in the non-linear setting, the construction in part being based on that of the model problem. We also derive the estimates for the time derivatives of the energies on which the bootstrap argument will be based. In Sect. 5, we specify the bootstrap assumptions. There are two levels of assumptions. The first level consists of assumptions ensuring that g remains a Lorentz metric, with quantitative bounds, cf. Subsect. 5.1. Thanks to this assumption, it is, among other things, possible to define the energies. The second level assumption consists of an upper bound for the energy, cf. Subsect. 5.4. The main tool for proving future global existence is the system of differential inequalities derived in Lemma 16 of Sect. 7. Corollaries 1 and 2 of Sect. 4 and Eqs. (61)–(64) constitute the starting point for the derivation. However, it is necessary to estimate the terms that are of higher order in the expressions that vanish on the background, cf. Lemma 11, to estimate the commutator terms that arise when applying spatial derivatives to the equations, cf. Lemma 13, and to estimate the remainder terms that appear in the estimates for the time derivatives of the energies in Corollaries 1 and 2, cf. Lemma 15. Section 6 is devoted to deriving the necessary estimates. All the estimates are of course based on the bootstrap assumptions, and deriving them requires an effort. However, applying general techniques developed in [20] leads to a significant reduction of the amount of work. Using these estimates, we then derive the system of differential inequalities in Sect. 7. The hierarchy mentioned above is apparent in this system. 
Disregarding the terms involving $\epsilon$ in (139)–(141) (the corresponding terms can be estimated using the bootstrap assumptions), it is clear that only $\hat{H}_{\mathrm{lp},k}$ appears on the right-hand side of the differential inequality for $\hat{H}_{\mathrm{lp},k}$, cf. (139), so that one can improve the bootstrap assumptions for this quantity first. Considering (141), the second and third terms on the right-hand side may appear hard to control. However, since it is possible to improve the bootstrap for $\hat{H}_{\mathrm{lp},k}$ to say that, not only is it small but it decays exponentially, the second and third terms on the right-hand side of (141) do not constitute a problem. Finally, turning to (140), the second term on the right-hand side can be controlled using the information already obtained concerning $\hat{H}_{\mathrm{lp},k}$ and $\hat{H}_{\mathrm{m},k}$. To conclude, it is of crucial importance to derive a system of differential inequalities; combining (139)–(141) into one differential inequality yields an estimate which does not appear to be very useful. In Sect. 8, we then prove future global existence of solutions corresponding to initial data on $\mathbb{T}^n$ close to those of a model solution. Note, however, that given initial data on $\mathbb{T}^n$, it is necessary to determine an initial time, since some of the unknowns, i.e. $\psi$ and $h_{ij}$, depend on it. We carry out a discussion concerning how to achieve this in the beginning of Sect. 8. After the proof of global existence, we derive some basic conclusions; in the case of hyperbolic PDE's, it is natural to make smallness assumptions for a finite degree of regularity and then to draw conclusions for any degree of regularity, and a first step in this direction is taken in Theorem 5, following the proof of future global existence. In Sect. 9, we then carry out a rough analysis of the causal structure.
This analysis yields information concerning the future Cauchy development of subsets of the initial data, which is of crucial importance when carrying out the arguments described at the beginning of the present subsection. Furthermore, we prove future causal geodesic completeness. In Sect. 10, we derive asymptotic expansions for the solution and in Sect. 11 we prove the main theorem along the lines described above. The spatially homogeneous solutions of interest were already analyzed in [10], but the perspective taken


here is somewhat different. Furthermore we need somewhat more detailed knowledge concerning the asymptotics, and consequently, we discuss the spatially homogeneous solutions in detail in Sect. 12. Note, however, that the results of [10] cover a much more general situation than we discuss in the present paper. At the end of Sect. 12 we then prove Theorem 3. Let us comment on the differences and the similarities between the situation studied in [20] and the one studied in the present paper. The main purpose of [20] was to build a framework for proving future global non-linear stability in the Einstein-non-linear scalar field setting. In particular, specific choices of gauge source functions and corrections to the equation were made that work equally well for the case studied in [20] as for the case studied here. Furthermore, in [20], we wrote down bootstrap assumptions as well as a partial division of the terms appearing in the equations, separating out the ones of higher order. Finally, and perhaps most importantly, we constructed an algorithm yielding estimates for the non-linear terms given that the bootstrap assumptions hold, the advantage of the algorithm being that in order to estimate a term in H k , it is enough to make simple computations such as counting the number of downstairs spatial indices minus the number of upstairs spatial indices. All of these constructions carry over, and will be very useful in the present situation. On the other hand, the actual PDE problems that result are quite different in the different cases. In the case of a potential with a positive non-degenerate minimum, the background scalar field is zero, but in the case of an exponential potential, the scalar field tends to infinity as t → ∞. As described above, it is thus, in the case of an exponential potential, necessary to subtract the background solution. 
The process of doing so introduces couplings between the equations for the scalar field and the different components of the metric, even on the linear level, and this makes the resulting equations harder to analyze. Above, we discussed the equations for $u$ and $\psi$ that result after having dropped the terms that are quadratic in the quantities that vanish on the background and after having changed the coefficients of the highest order derivatives to those corresponding to the background. In particular, we noted that these equations are coupled, and it turns out that finding an energy that decays exponentially does require an effort. If one were to consider the corresponding equations for $u$ and $\psi$ in the case studied in [20], one would see that the equations for $u$ and $\psi$ decouple, and that one easily obtains exponential decay for both of them separately. To sum up, there are several aspects concerning the general set up of the equations and the general methods for estimating the non-linearity that are common to the analysis carried out in [20] and the analysis carried out here. However, the actual PDE problems that one has to deal with in the end are quite different, the present one being the more difficult. Finally, let us note that in the outline of the proof of the theorem in [20] corresponding to Theorem 2 in the present paper, we motivated the choice of gauge source functions, the choice of corrections, and we made comparisons between our method and the methods used by Lindblad and Rodnianski to prove the stability of Minkowski space in [12] and [13] (simplifying the original proof by Christodoulou and Klainerman [4], though not obtaining as detailed asymptotics). As a consequence, we shall not do so here.

2. Reformulation of the Equations on $\mathbb{T}^n$

As we pointed out in the outline, the central problem in the proof of Theorem 2 is that of proving future global non-linear stability of the solutions (9)–(12) on $\mathbb{R}_+ \times \mathbb{T}^n$.
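In [20], we considered (5) and (6) in the context of perturbations around metrics of the form

$$-dt^2 + e^{2\Omega} \delta_{ij}\, dx^i \otimes dx^j$$

on $\mathbb{R}_+ \times \mathbb{T}^n$. Thus the problem we are interested in here fits exactly into the general framework developed in [20], provided we choose $\Omega = p \ln t + K - p \ln t_0$ (below, we shall, for various reasons, make a somewhat different choice). As in [20], we shall use the notation $\omega = \dot{\Omega}$, so that $\omega = p/t$. The choice of equations, the relevant estimates for the non-linearity etc. then follow from [20]. Consequently, we shall consider the equations

$$\hat{R}_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - \frac{2}{n-1} V(\phi) g_{\mu\nu} + M_{\mu\nu} = 0, \tag{24}$$

$$g^{\alpha\beta} \partial_\alpha \partial_\beta \phi - \Gamma^\mu \partial_\mu \phi - V'(\phi) + M_\phi = 0, \tag{25}$$

cf. (53) and (54) of [20], where all the indices are with respect to the standard vectorfields on $\mathbb{R}_+ \times \mathbb{T}^n$, i.e. $\partial_0 = \partial_t$, $\partial_i = \partial_{x^i}$ for $i = 1, \dots, n$, if $x^i$ are the standard "coordinates" on $\mathbb{T}^n$. Here

$$D_\mu = F_\mu - \Gamma_\mu, \qquad \hat{R}_{\mu\nu} = R_{\mu\nu} + \nabla_{(\mu} D_{\nu)}, \qquad \nabla_{(\mu} D_{\nu)} = \frac{1}{2} (\partial_\mu D_\nu + \partial_\nu D_\mu) - \Gamma^\alpha_{\mu\nu} D_\alpha,$$

and

$$F_\mu = n\omega g_{0\mu}, \tag{26}$$

$$M_{00} = 2\omega g^{0\mu} (\Gamma_\mu - F_\mu), \qquad M_{0i} = -2\omega (\Gamma_i - n\omega g_{0i}), \qquad M_{ij} = 0, \qquad M_\phi = g^{\mu\nu} (\Gamma_\mu - F_\mu) \partial_\nu \phi. \tag{27}$$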

Equations (24) and (25) imply a homogeneous wave equation for $D_\mu$, cf. (56) and (57) of [20]. If the initial data satisfy the constraints and one sets up the initial data for Eqs. (24) and (25) in the correct way, the initial data for $D_\mu$ vanish. This leads to the conclusion that $D_\mu = 0$ where the solution is defined. As a consequence, we obtain a solution to (5) and (6). For more details on this argument, the reader is referred to [20], cf., in part, Proposition 1.

2.1. Initial data. The initial data for (24) and (25) are not completely determined by initial data for (5) and (6). However, part of the corresponding freedom has to be used to ensure that $D_\mu = 0$ initially. In practice, we shall be interested in initial data that do not satisfy the constraint equations on the entire initial manifold. We shall thus assume that we are given $(\varrho, \varsigma, \Phi_a, \Phi_b)$ on $\mathbb{T}^n$, where $\varrho$ is a Riemannian metric, $\varsigma$ is a symmetric covariant 2-tensor and $\Phi_a$, $\Phi_b$ are smooth functions on $\mathbb{T}^n$. Furthermore, we shall assume that (7) and (8) are satisfied on $S \subseteq \mathbb{T}^n$ (with $(h, k, \phi_a, \phi_b)$ replaced by $(\varrho, \varsigma, \Phi_a, \Phi_b)$). Starting with these initial data, we construct initial data for (24) and (25) as in [20]:

$$g_{ij}(t_0, \cdot) = \varrho(\partial_i, \partial_j), \tag{28}$$

$$g_{00}(t_0, \cdot) = -1, \qquad g_{0i}(t_0, \cdot) = 0, \tag{29}$$

for $i, j = 1, \dots, n$, cf. (58) and (59) of [20]. Due to this choice, the future directed unit normal to the hypersurface $t = t_0$ is $\partial_t$. Note, furthermore, that this fixes $F_\mu(t_0, \cdot)$, cf. (26). Concerning the first time derivatives, we choose

$$\partial_0 g_{ij}(t_0, \cdot) = 2\varsigma(\partial_i, \partial_j), \tag{30}$$

$$\partial_0 g_{00}(t_0, \cdot) = -2F_0(t_0, \cdot) - 2\,\mathrm{tr}\,\varsigma, \tag{31}$$

$$\partial_0 g_{0l}(t_0, \cdot) = \left[ -F_l + \frac{1}{2} g^{ij} (2\partial_i g_{jl} - \partial_l g_{ij}) \right] (t_0, \cdot), \tag{32}$$


cf. (60), (62) and (63) of [20] respectively. Due to these choices, $D_\mu(t_0, \cdot) = 0$. Concerning $\phi$, we require

$$\phi(t_0, \cdot) = \Phi_a, \qquad (\partial_t \phi)(t_0, \cdot) = \Phi_b, \tag{33}$$

cf. (61) of [20], since $\partial_t$ is the future directed unit normal to $\{t_0\} \times \mathbb{T}^n$. With these initial data, we get a local existence and uniqueness result. Furthermore, we get a continuation criterion and the conclusion that (5) and (6) are satisfied in $D(\{t_0\} \times S)$, where $D$ signifies the Cauchy development (for a definition of Cauchy development, see [14 or 20]). For an exact statement, cf. Proposition 1 of [20].

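2.2. Equations. To conclude, we consider the equations

$$\hat{R}_{00} + 2\omega \Gamma^0 - 2n\omega^2 - \nabla_0 \phi \nabla_0 \phi - \frac{2}{n-1} V(\phi) g_{00} = 0, \tag{34}$$

$$\hat{R}_{0i} - 2\omega (\Gamma_i - n\omega g_{0i}) - \nabla_0 \phi \nabla_i \phi - \frac{2}{n-1} V(\phi) g_{0i} = 0, \tag{35}$$

$$\hat{R}_{ij} - \nabla_i \phi \nabla_j \phi - \frac{2}{n-1} V(\phi) g_{ij} = 0, \tag{36}$$

$$g^{\alpha\beta} \partial_\alpha \partial_\beta \phi - n\omega \partial_0 \phi - V'(\phi) = 0. \tag{37}$$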

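In order to analyze what terms are relevant and what terms are irrelevant in the expressions for $\hat{R}_{\mu\nu} + M_{\mu\nu}$, one can use the results of [20]. Combining Lemma 4, Lemma 6 and (88) of [20], we obtain

$$\hat{R}_{00} + 2\omega \Gamma^0 - 2n\omega^2 = -\frac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{00} + \frac{1}{2} (n+2) \omega \partial_0 g_{00} + n(\dot{\omega} + \omega^2) g_{00} + n\omega^2 (g_{00} + 1) + \Delta_{A,00} + \Delta_{C,00}, \tag{38}$$

$$\hat{R}_{0m} - 2\omega (\Gamma_m - n\omega g_{0m}) = -\frac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{0m} + \frac{1}{2} n\omega \partial_0 g_{0m} + \left[ 2(n-1)\omega^2 + \frac{1}{2} n\dot{\omega} \right] g_{0m} - \omega g^{ij} \Gamma_{imj} + \Delta_{A,0m} + \Delta_{C,0m}, \tag{39}$$

$$\hat{R}_{ij} = -\frac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{ij} + \frac{1}{2} n\omega \partial_0 g_{ij} + 2\omega g^{00} \partial_0 g_{ij} - 2\omega^2 g^{00} g_{ij} + \Delta_{A,ij}, \tag{40}$$

where the higher order terms $\Delta_{A,\mu\nu}$, $\Delta_{C,\mu\nu}$ are defined in (87), (92) and (93) of [20]. The point of these expressions is that $\Delta_{A,\mu\nu}$ and $\Delta_{C,\mu\nu}$ are sums of terms that are quadratic in factors that vanish for the background solution.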

2.3. Background solution, revisited. Before we proceed, let us prove that the basic solution around which we are perturbing actually is a solution.


Lemma 1. Let $n \geq 3$, $p > 1$, $V_0 > 0$ and define $\lambda$, $c_0$ and $V$ by (11), (12) and (4) respectively. Then the metric $g_0$, given by (9), and the function $\phi_0$, given by (10), on $\mathbb{R}_+ \times \mathbb{T}^n$ satisfy (5) and (6). In particular, $\phi_0$ satisfies the equation

$$\ddot{\phi}_0 + n\omega \dot{\phi}_0 + V'(\phi_0) = 0. \tag{41}$$

Proof. One can compute that for $g_0$ given in (9), we have $\Gamma^0 = n\omega$, where $\omega = p/t$, and $\Gamma^i = 0$. In other words, $F_\mu$ defined in (26) coincides with $\Gamma_\mu$, so that for $g_0$, $\hat{R}_{\mu\nu} = R_{\mu\nu}$ and the modifications $M_{\mu\nu}$ and $M_\phi$ vanish. Note also that $\Delta_{A,\mu\nu} = 0$ for the metric under consideration, and that $\Delta_{C,00} = \Delta_{C,0m} = 0$, cf. [20] (note that this is clear due to the idea behind the definition of these quantities). The 00 component of (5) is thus, due to (34) and (38), equivalent to

$$-n\dot{\omega} - n\omega^2 - \dot{\phi}_0^2 + \frac{2V(\phi_0)}{n-1} = 0. \tag{42}$$

Since $\phi_0$ only depends on $t$, the $0m$ equations are automatically satisfied and the $ij$ equations are equivalent to

$$\dot{\omega} + n\omega^2 - \frac{2V(\phi_0)}{n-1} = 0. \tag{43}$$

With $\phi_0$ as in (10), Eq. (43) is equivalent to

$$-p + np^2 - \frac{2V_0 e^{c_0}}{n-1} = 0,$$

which is equivalent to

$$e^{c_0} = \frac{(n-1)(np-1)p}{2V_0},$$

which holds due to (12). In particular,

$$\frac{2V(\phi_0)}{n-1} = \frac{p(np-1)}{t^2}. \tag{44}$$

Using this information, (42) is equivalent to

$$np - np^2 - \frac{4}{\lambda^2} + p(np-1) = 0.$$

In other words, (11) implies (42). Thus (5) is satisfied. To check that $\phi_0$ satisfies the last equation, which in the current situation is equivalent to (37), is simply a computation. Since (37) is equivalent to (6) for the metric under consideration, the lemma follows.
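The computation in the proof can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not the paper's definitions: formulas (4) and (10)–(12) are not reproduced here, so the code uses what the proof implies about them, namely $V(\phi) = V_0 e^{-\lambda\phi}$, $4/\lambda^2 = (n-1)p$ (the last displayed identity), $e^{c_0} = (n-1)(np-1)p/(2V_0)$, and $e^{-\lambda\phi_0} = e^{c_0}/t^2$. The function name `background_residuals` is hypothetical.

```python
import math


def background_residuals(n, p, V0, t):
    """Left-hand sides of (41), (42) and (43) for the assumed background.

    Assumptions (inferred from the proof, not quoted from (4), (10)-(12)):
      V(phi) = V0*exp(-lam*phi), 4/lam**2 = (n-1)*p,
      exp(c0) = (n-1)*(n*p-1)*p/(2*V0), exp(-lam*phi0) = exp(c0)/t**2.
    """
    lam = 2.0 / math.sqrt((n - 1) * p)
    c0 = math.log((n - 1) * (n * p - 1) * p / (2.0 * V0))
    phi0 = (2.0 * math.log(t) - c0) / lam
    dphi0 = 2.0 / (lam * t)            # first time derivative of phi0
    ddphi0 = -2.0 / (lam * t * t)      # second time derivative of phi0
    omega = p / t
    domega = -p / (t * t)
    V = V0 * math.exp(-lam * phi0)
    dV = -lam * V                      # V'(phi0) for the exponential potential
    r41 = ddphi0 + n * omega * dphi0 + dV
    r42 = -n * domega - n * omega**2 - dphi0**2 + 2.0 * V / (n - 1)
    r43 = domega + n * omega**2 - 2.0 * V / (n - 1)
    return r41, r42, r43


if __name__ == "__main__":
    for args in [(3, 2.0, 1.0, 5.0), (4, 1.5, 0.3, 2.0), (7, 3.0, 10.0, 0.7)]:
        print(args, background_residuals(*args))
```

All three residuals should vanish up to rounding error for any admissible choice of $n$, $p$, $V_0$ and $t$, which is what the lemma asserts for the background.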


2.4. Linear algebra. Before reformulating the equations, let us introduce some terminology concerning Lorentz matrices. Let $g$ be a real valued $(n+1) \times (n+1)$-matrix with components $g_{\mu\nu}$. We shall denote the matrix with components $g_{ij}$, $i, j = 1, \dots, n$ by $g_\flat$, denote the vector with components $g_{0i}$ by $v[g]$ and denote $g_{00} + 1$ by $u[g]$. If $g$ is symmetric and has one negative and $n$ positive eigenvalues, we shall say that $g$ is a Lorentz matrix. In case $g$ is an invertible $(n+1) \times (n+1)$ matrix, we shall let $g^{\mu\nu}$ denote the components of the inverse and we shall let $g^\sharp$ denote the matrix with components $g^{ij}$, $i, j = 1, \dots, n$. It is of interest to note the following, cf. Lemma 1 and 2 of [20].

Lemma 2. Let $h$ be a symmetric $(n+1) \times (n+1)$ real valued matrix. Assume that $u[h] < 1$ and that $h_\flat$ is positive definite. Then $h$ is a Lorentz matrix, $h^\sharp$ is positive definite and $u[h^{-1}] < 1$.

Remark 5. Below, we shall sometimes use the notation $h_\flat > 0$ to indicate that $h_\flat$ is positive definite.

Definition 3. A canonical Lorentz matrix is a symmetric $(n+1) \times (n+1)$-dimensional real valued matrix $g$ such that $u[g] < 1$ and $g_\flat > 0$. Let $\mathcal{C}_n$ denote the set of $(n+1) \times (n+1)$-dimensional canonical Lorentz matrices. Note that, due to Lemma 2, the inverse of an element of $\mathcal{C}_n$ is in $\mathcal{C}_n$.

2.5. First reformulation of the equations. Since the background scalar field $\phi_0$ tends to infinity as $t \to \infty$, it seems natural to reformulate the equations in terms of $\psi = \phi - \phi_0$. Furthermore, since the 00- and $ij$-components of the background metric are $-1$ and $e^{2K} (t/t_0)^{2p} \delta_{ij}$ respectively, it seems natural to consider $u = g_{00} + 1$ and $h_{ij} = (t/t_0)^{-2p} g_{ij}$. Isolating terms that involve, at worst (in terms of number of derivatives) first order derivatives of the unknowns and are quadratic in quantities that vanish on the background, we obtain the following reformulation.

Lemma 3. Let $V_0 > 0$, $p > 1$ and let $n \geq 3$ be an integer.
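Define $\lambda$ by (11), $V$ by (4) and let $\phi_0$ be given by the right-hand side of (10), where $c_0$ is given by (12). Finally, fix $0 < t_0 \in \mathbb{R}$ and let $U$ be an open subset of $\mathbb{R}_+ \times \mathbb{T}^n$. Then the following statements are equivalent:
– the functions $g$ and $\phi$, with values in $\mathcal{C}_n$ and $\mathbb{R}$ respectively, are $C^\infty$ and satisfy (34)–(37) on $U$,
– the functions $\psi = \phi - \phi_0$, $u = g_{00} + 1$, $u_i = g_{0i}$, $h_{ij} = (t/t_0)^{-2p} g_{ij}$ ($i, j = 1, \dots, n$) are $C^\infty$, where $u < 1$ and $h_{ij}$ are the components of a positive definite metric, and satisfy

$$-g^{\mu\nu} \partial_\mu \partial_\nu u + (n+2) \omega \partial_0 u + \frac{\beta_1}{t^2} u - \frac{8}{\lambda t} \partial_0 \psi - \frac{2\lambda p(np-1)}{t^2} \psi + \tilde{\Delta}_{00} = 0, \tag{45}$$

$$-g^{\mu\nu} \partial_\mu \partial_\nu u_i + n\omega \partial_0 u_i + \frac{\beta_2}{t^2} u_i - 2\omega g^{lm} \Gamma_{lim} - \frac{4}{\lambda t} \partial_i \psi + \tilde{\Delta}_{0i} = 0, \tag{46}$$

$$-g^{\mu\nu} \partial_\mu \partial_\nu h_{ij} + n\omega \partial_0 h_{ij} + \left[ -\frac{2p}{t^2} u + \frac{2\lambda p(np-1)}{t^2} \psi \right] h_{ij} + \tilde{\Delta}_{ij} = 0, \tag{47}$$

$$-g^{\mu\nu} \partial_\mu \partial_\nu \psi + n\omega \partial_0 \psi + \frac{2(np-1)}{t^2} \psi - \frac{2}{\lambda t^2} u + \tilde{\Delta}_\psi = 0, \tag{48}$$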


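on $U$, where $\omega = p/t$, $\beta_1 = 2p[n(p-1) + 1]$ and $\beta_2 = p(n-2)(2p-1)$. Furthermore, $\tilde{\Delta}_{00}$, $\tilde{\Delta}_{0i}$, $\tilde{\Delta}_{ij}$ and $\tilde{\Delta}_\psi$ are defined by (51), (52), (55) and (56) respectively.

Remark 6. Given $u$, $u_i$ and $h_{ij}$, one can construct $g_{\mu\nu}$ and thereby $g^{\mu\nu}$. Note that the equivalence presupposes that $t_0$ has been fixed. Recall that $\mathcal{C}_n$ was defined in Definition 3. It is of interest to note that (45)–(48) are independent of $V_0$; an expression of the form $V_0 e^{-\lambda\phi_0}$ appears in $\Delta_{E,\phi}$, cf. (49), but this expression is independent of $V_0$ due to (44). On the other hand, it is necessary to know $V_0$ in order to be able to reconstruct $\phi$ from $\psi$.

Proof. Note that

$$e^{-\lambda\phi} = e^{-\lambda\phi_0} (e^{-\lambda\psi} - 1 + \lambda\psi) + e^{-\lambda\phi_0} (1 - \lambda\psi).$$

Since the first term is quadratic in $\psi$, which vanishes on the background, we define

$$\Delta_{E,\phi} = V_0 e^{-\lambda\phi_0} (e^{-\lambda\psi} - 1 + \lambda\psi). \tag{49}$$

With this notation, we can write

$$\frac{2V(\phi)}{n-1} = \frac{p(np-1)}{t^2} (1 - \lambda\psi) + \frac{2\Delta_{E,\phi}}{n-1},$$

cf. (44). We thus have (in this proof, we shall use the notation $\dot{f} = \partial_t f$)

$$\dot{\phi}^2 + \frac{2V(\phi)}{n-1} g_{00} = \frac{4}{\lambda^2 t^2} - \frac{p(np-1)}{t^2} + \frac{4}{\lambda t} \dot{\psi} + \frac{p(np-1)}{t^2} (g_{00} + 1) + \frac{\lambda p(np-1)}{t^2} \psi + \Delta_{\phi,00},$$

where

$$\Delta_{\phi,00} = \dot{\psi}^2 - \frac{\lambda p(np-1)}{t^2} \psi (g_{00} + 1) + \frac{2\Delta_{E,\phi}}{n-1} g_{00}. \tag{50}$$

Before we reformulate (34), let us note that, due to (38),

$$\hat{R}_{00} + 2\omega \Gamma^0 - 2n\omega^2 = -\frac{1}{2} g^{\mu\nu} \partial_\mu \partial_\nu g_{00} + \frac{1}{2} (n+2) \omega \partial_0 g_{00} + n(\dot{\omega} + 2\omega^2)(g_{00} + 1) - n(\dot{\omega} + \omega^2) + \Delta_{A,00} + \Delta_{C,00}.$$

Since

$$-n(\dot{\omega} + \omega^2) - \frac{4}{\lambda^2 t^2} + \frac{p(np-1)}{t^2} = 0,$$

cf. (42) and (44), we get

$$\hat{R}_{00} + 2\omega \Gamma^0 - 2n\omega^2 - \dot{\phi}^2 - \frac{2V(\phi)}{n-1} g_{00} = -\frac{1}{2} g^{\mu\nu} \partial_\mu \partial_\nu g_{00} + \frac{1}{2} (n+2) \omega \partial_0 g_{00} + \left[ n(\dot{\omega} + 2\omega^2) - \frac{p(np-1)}{t^2} \right] (g_{00} + 1) - \frac{4}{\lambda t} \dot{\psi} - \frac{\lambda p(np-1)}{t^2} \psi + \frac{1}{2} \tilde{\Delta}_{00},$$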


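where

$$\tilde\Delta_{00} = 2\Delta_{A,00} + 2\Delta_{C,00} - 2\Delta_{\phi,00}. \tag{51}$$

Thus (34) is equivalent to (45). By similar arguments, using (39), (35) is equivalent to (46), where

$$\tilde\Delta_{0i} = 2\Delta_{A,0i} + 2\Delta_{C,0i} - 2\partial_t\psi\,\partial_i\psi + \frac{2p(np-1)}{t^2}\lambda\psi g_{0i} - \frac{4\Delta_{E,\phi}}{n-1}g_{0i}. \tag{52}$$

Using (40), Eq. (36) can be reformulated to

$$-g^{\mu\nu}\partial_\mu\partial_\nu g_{ij} + (n + 4g^{00})\omega\partial_0 g_{ij} - 2\left[2\omega^2 g^{00} + \frac{p(np-1)}{t^2} - \frac{\lambda p(np-1)}{t^2}\psi\right]g_{ij} + \hat\Delta_{ij} = 0,$$

where

$$\hat\Delta_{ij} = 2\Delta_{A,ij} - \frac{4\Delta_{E,\phi}}{n-1}g_{ij} - 2\partial_i\psi\,\partial_j\psi. \tag{53}$$

We wish to reformulate this equation in terms of $h_{ij} = (t/t_0)^{-2p}g_{ij}$. Note that

$$(t/t_0)^{-2p}\partial_0 g_{ij} = \partial_0 h_{ij} + 2\omega h_{ij}, \qquad (t/t_0)^{-2p}\partial_0^2 g_{ij} = \partial_0^2 h_{ij} + 4\omega\partial_0 h_{ij} + \frac{2p(2p-1)}{t^2}h_{ij}.$$

Using

$$g^{00} + 1 = -(g_{00}+1) + \frac{1}{g_{00}}[(g_{00}+1)^2 - g^{0i}g_{0i}], \tag{54}$$

we conclude that (36) is equivalent to (47) where

$$\tilde\Delta_{ij} = -4g^{0l}\omega\partial_l h_{ij} + \frac{2p}{t^2}\frac{1}{g_{00}}[(g_{00}+1)^2 - g^{0l}g_{0l}]h_{ij} + (t/t_0)^{-2p}\hat\Delta_{ij} \tag{55}$$

and $\hat\Delta_{ij}$ is given by (53). Finally, let us turn to (37). Note that

$$V'(\phi) = -\lambda V_0 e^{-\lambda\phi_0} + \lambda^2 V_0 e^{-\lambda\phi_0}\psi - \lambda\Delta_{E,\phi},$$

so that (37) is equivalent to

$$-g^{\mu\nu}\partial_\mu\partial_\nu\psi + n\omega\partial_0\psi + \frac{2(np-1)}{t^2}\psi - (g^{00}+1)\partial_0^2\phi_0 - \lambda\Delta_{E,\phi} = 0,$$

where we have used the fact that $\phi_0$ satisfies (41). Due to (54), (37) is equivalent to (48), where

$$\tilde\Delta_\psi = -\frac{1}{g_{00}}[(g_{00}+1)^2 - g^{0i}g_{0i}]\partial_0^2\phi_0 - \lambda\Delta_{E,\phi}. \tag{56}$$

The lemma follows.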
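The proof relies on a few cancellations among the constants. The following is a minimal sketch that checks them in exact arithmetic. It assumes $\lambda^2 = 4/((n-1)p)$, which is the value forced by the identity $-n(\dot\omega + \omega^2) - 4/(\lambda^2 t^2) + p(np-1)/t^2 = 0$ (the definition (11) is not reproduced here), and it uses $V_0 e^{-\lambda\phi_0} = p(np-1)(n-1)/(2t^2)$. The function name is illustrative only.

```python
from fractions import Fraction


def check_background_identities(n, p, t):
    """Return three quantities that should all vanish.

    Assumption (not quoted from (11)): lambda^2 = 4 / ((n - 1) p).
    """
    n, p, t = Fraction(n), Fraction(p), Fraction(t)
    lam_sq = 4 / ((n - 1) * p)
    omega = p / t              # omega = p/t
    omega_dot = -p / t**2      # time derivative of omega

    # -n(omega_dot + omega^2) - 4/(lambda^2 t^2) + p(np-1)/t^2 = 0
    background = (-n * (omega_dot + omega**2)
                  - 4 / (lam_sq * t**2)
                  + p * (n * p - 1) / t**2)

    # 2[n(omega_dot + 2 omega^2) - p(np-1)/t^2] equals beta_1/t^2 in (45)
    beta1 = 2 * p * (n * (p - 1) + 1)
    u_coeff = (2 * (n * (omega_dot + 2 * omega**2) - p * (n * p - 1) / t**2)
               - beta1 / t**2)

    # lambda^2 V_0 e^{-lambda phi_0} equals 2(np-1)/t^2 in (48)
    v0_exp = p * (n * p - 1) * (n - 1) / (2 * t**2)
    psi_coeff = lam_sq * v0_exp - 2 * (n * p - 1) / t**2

    return background, u_coeff, psi_coeff
```

Calling `check_background_identities(3, 2, 5)` returns three zeros; rational values of `p` can be passed as strings such as `"3/2"`.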


2.6. Second reformulation of the equations. Consider (45). All the terms on the left-hand side but the first and the last have a certain structure: terms involving $\partial_t u$ and $\partial_t\psi$ are multiplied by a factor in the form of a constant divided by $t$ (recall that $\omega = p/t$) and terms involving $u$ and $\psi$ are multiplied by a factor in the form of a constant divided by $t^2$. Similar comments can be made concerning the remaining Eqs. (46)–(48). Consequently, in order to minimize the number of time dependent coefficients, it seems natural to multiply the equations with $t^2$ and to change the time coordinate to $\tau$, where $\tau$ is such that $\partial_\tau = t\partial_t$.

Lemma 4. Let $V_0 > 0$, $p > 1$ and let $n \geq 3$ be an integer. Define $\lambda$ by (11), $V$ by (4) and let $\phi_0$ be given by the right-hand side of (10), where $c_0$ is given by (12). Fix $0 < t_0 \in \mathbb{R}$, let the time coordinate $\tau$ be defined by $\tau = \ln(t/t_0)$, $\tau_0$ be defined by $\tau_0 = \ln t_0$, and let $U$ be an open subset of $\mathbb{R}_+ \times \mathbb{T}^n$. Finally, let $\Upsilon : \mathbb{R}_+ \times \mathbb{T}^n \to \mathbb{R} \times \mathbb{T}^n$ be defined by $\Upsilon(t, x) = [\ln(t/t_0), x]$. Then the following statements are equivalent:

– the functions $g$ and $\phi$, with values in $C_n$ and $\mathbb{R}$ respectively, are $C^\infty$ and satisfy (34)–(37) on $U$,

– the functions $h_{ij}$, $u_i$, $u$ and $\psi$ ($i, j = 1, \dots, n$) defined by

$$h_{ij}(\tau, x) = e^{-2p\tau}g_{ij}(e^{\tau+\tau_0}, x), \tag{57}$$

$$u_i(\tau, x) = g_{0i}(e^{\tau+\tau_0}, x), \tag{58}$$

$$u(\tau, x) = g_{00}(e^{\tau+\tau_0}, x) + 1, \tag{59}$$

$$\psi(\tau, x) = \phi(e^{\tau+\tau_0}, x) - \phi_0(e^{\tau+\tau_0}) \tag{60}$$

are $C^\infty$, where $u < 1$ and $h_{ij}$ are the components of a positive definite metric, and satisfy

$$\hat\Box_g u + \alpha_1\partial_\tau u + \beta_1 u - \frac{8}{\lambda}\partial_\tau\psi - 2\lambda p(np-1)\psi + \Delta_{00} = 0, \tag{61}$$

$$\hat\Box_g u_i + \alpha_2\partial_\tau u_i + \beta_2 u_i - 2pe^{\tau+\tau_0}g^{lm}\Gamma_{lim} - \frac{4e^{\tau+\tau_0}}{\lambda}\partial_i\psi + \Delta_{0i} = 0, \tag{62}$$

$$\hat\Box_g h_{ij} + (np-1)\partial_\tau h_{ij} + [-2pu + 2\lambda p(np-1)\psi]h_{ij} + \Delta_{ij} = 0, \tag{63}$$

$$\hat\Box_g\psi + (np-1)\partial_\tau\psi + 2(np-1)\psi - \frac{2}{\lambda}u + \Delta_\psi = 0 \tag{64}$$

on $\Upsilon(U)$, where

$$\hat\Box_g = -g^{00}\partial_\tau^2 - 2e^{\tau+\tau_0}g^{0i}\partial_\tau\partial_i - e^{2(\tau+\tau_0)}g^{ij}\partial_i\partial_j, \tag{65}$$

$\alpha_1 = (n+2)p - 1$, $\beta_1 = 2p[n(p-1)+1]$, $\alpha_2 = np - 1$, $\beta_2 = p(n-2)(2p-1)$ and $\Delta_{00}$, $\Delta_{0i}$, $\Delta_{ij}$ and $\Delta_\psi$ are given by (66)–(69).

Remark 7. From time to time, we shall abuse notation by writing $g_{ij}(\tau, x)$ when $g_{ij}(e^{\tau+\tau_0}, x)$ would be the correct expression, etc. Note that the functions $h_{ij}$, etc. are different from the ones of the previous lemma, the difference amounting to a change of time coordinate.

Proof. Note that $t^2\partial_0^2 f = -\partial_\tau f + \partial_\tau^2 f$.
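The identity $t^2\partial_0^2 f = \partial_\tau^2 f - \partial_\tau f$ is the chain rule applied to $t = t_0 e^{\tau}$. As a rough numerical sanity check, the sketch below compares both sides with central finite differences; the test function, the step size and the function name are arbitrary choices made for illustration.

```python
import math


def check_time_change(t=2.0, t0=1.0, h=1e-4):
    """Compare t^2 f''(t) with (d_tau^2 - d_tau) f(t0 e^tau) by finite differences."""
    def f(s):
        # Arbitrary smooth test function (illustrative only).
        return math.sin(s) + s**3

    tau = math.log(t / t0)

    def F(x):
        # The same function expressed in the tau coordinate.
        return f(t0 * math.exp(x))

    d2f_dt2 = (f(t + h) - 2 * f(t) + f(t - h)) / h**2
    dF = (F(tau + h) - F(tau - h)) / (2 * h)
    d2F = (F(tau + h) - 2 * F(tau) + F(tau - h)) / h**2
    return t**2 * d2f_dt2, d2F - dF
```

The two returned numbers agree up to discretization and rounding error.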


The conclusions follow by straightforward computations, and we have

$$\Delta_{00} = (g^{00}+1)\partial_\tau u + e^{2(\tau+\tau_0)}\tilde\Delta_{00}, \tag{66}$$

$$\Delta_{0i} = (g^{00}+1)\partial_\tau u_i + e^{2(\tau+\tau_0)}\tilde\Delta_{0i}, \tag{67}$$

$$\Delta_{ij} = (g^{00}+1)\partial_\tau h_{ij} + e^{2(\tau+\tau_0)}\tilde\Delta_{ij}, \tag{68}$$

$$\Delta_\psi = (g^{00}+1)\partial_\tau\psi + e^{2(\tau+\tau_0)}\tilde\Delta_\psi, \tag{69}$$

where $\tilde\Delta_{00}$, $\tilde\Delta_{0i}$, $\tilde\Delta_{ij}$ and $\tilde\Delta_\psi$ are defined by (51), (52), (55) and (56) respectively. The lemma follows.

3. Model Problem

As was discussed in the Introduction, the system of Eqs. (61)–(64) has, in a certain sense, a hierarchical structure; dropping the $\Delta_{\mu\nu}$ and $\Delta_\psi$ terms and changing the coefficients of the highest order derivatives to those of the background, the equations for $u$ and $\psi$ involve neither $u_i$ nor $h_{ij}$ and the equations for $h_{ij}$ do not involve $u_i$. As has been mentioned, this structure will be of essential importance in the bootstrap argument used to prove future global existence. As a consequence of the structure of the hierarchy, a natural problem to consider is that of proving decay of solutions to the resulting model equations for $u$ and $\psi$, cf. (70) and (71) below. In order for the analysis to be of use in the non-linear setting, it is preferable to prove decay by constructing a decaying energy; arguments based on energies tend to be more robust. The purpose of the present section is to construct such an energy.

3.1. Model equations. If we consider (61) and (64), ignore the higher order terms and replace the metric with the background metric, i.e. if we assume $g^{00} = -1$, $g^{0i} = 0$ and $g^{ij} = (t/t_0)^{-2p}\delta^{ij}$ and assume, for the sake of simplicity, $t_0 = 1$, we obtain the equations

$$u_{\tau\tau} - e^{-2H\tau}\Delta u + \alpha_1 u_\tau + \beta_1 u + \gamma_1\psi_\tau + \delta_1\psi = 0, \tag{70}$$

$$\psi_{\tau\tau} - e^{-2H\tau}\Delta\psi + \beta_3 u + \gamma_3\psi_\tau + \delta_3\psi = 0. \tag{71}$$

Here $H = p - 1$ and

$$\alpha_1 = (n+2)p - 1, \quad \beta_1 = 2p[n(p-1)+1], \quad \gamma_1 = -\frac{8}{\lambda}, \tag{72}$$

$$\delta_1 = -2\lambda p(np-1), \quad \beta_3 = -\frac{2}{\lambda}, \quad \gamma_3 = np - 1, \quad \delta_3 = 2(np-1), \tag{73}$$

where $n \geq 3$, $p > 1$ and $\lambda$ is given by (11). Let us define

$$A = \begin{pmatrix}\beta_1 & \delta_1\\ \beta_3 & \delta_3\end{pmatrix}, \quad C = \begin{pmatrix}\alpha_1 & \gamma_1\\ 0 & \gamma_3\end{pmatrix}, \quad \mathbf{u} = \begin{pmatrix}u\\ \psi\end{pmatrix}. \tag{74}$$

Then (70) and (71) can be written

$$\mathbf{u}_{\tau\tau} - e^{-2H\tau}\Delta\mathbf{u} + C\mathbf{u}_\tau + A\mathbf{u} = 0. \tag{75}$$

Let $T$ be an invertible $2 \times 2$ matrix and apply $T^{-1}$ to (75). We obtain

$$\hat{\mathbf{u}}_{\tau\tau} - e^{-2H\tau}\Delta\hat{\mathbf{u}} + T^{-1}CT\hat{\mathbf{u}}_\tau + T^{-1}AT\hat{\mathbf{u}} = 0,$$

where $\hat{\mathbf{u}} := T^{-1}\mathbf{u}$.


3.2. Positive definiteness of the coefficient matrices. Let us try to find a matrix $T$ so that $T^{-1}AT$ is diagonal. The eigenvalues of $A$ are given by

$$\lambda_\pm := \frac{\beta_1 + \delta_3}{2} \pm \left[\frac{(\beta_1 + \delta_3)^2}{4} - \beta_1\delta_3 + \delta_1\beta_3\right]^{1/2}.$$

Note that

$$\frac{(\beta_1 + \delta_3)^2}{4} - \beta_1\delta_3 + \delta_1\beta_3 = \frac{1}{4}[(\beta_1 - \delta_3)^2 + 4\delta_1\beta_3] > 0,$$

since $\delta_1\beta_3 > 0$ for $n \geq 3$ and $p > 1$. The eigenvalues are thus real and different. Note also that

$$\beta_1\delta_3 - \delta_1\beta_3 = 4np(np-1)(p-1) > 0$$

for the range of $n$ and $p$ we are interested in. This computation shows that $\lambda_- = 0$ when $p = 1$. Since $\beta_1 + \delta_3 > 0$, we conclude that both eigenvalues are positive. Let

$$T := \begin{pmatrix}\lambda_- - \delta_3 & \lambda_+ - \delta_3\\ \beta_3 & \beta_3\end{pmatrix}. \tag{76}$$

Then $\det T > 0$ and

$$\hat A := T^{-1}AT = \begin{pmatrix}\lambda_- & 0\\ 0 & \lambda_+\end{pmatrix}, \tag{77}$$

where the first equality is a definition. Let $\hat C := T^{-1}CT$. The main question is then whether $\hat C + \hat C^t$ is positive definite or not.

Lemma 5. With definitions as above, $\hat C + \hat C^t$ is positive definite.

Proof. Define

$$R := \begin{pmatrix}\beta_3 & \delta_3 - \lambda_+\\ -\beta_3 & \lambda_- - \delta_3\end{pmatrix}.$$

Note that $T^{-1} = R/\det T$. In other words, $R$ coincides with $T^{-1}$ up to a positive factor. The question is then if $RCT$ plus its transpose is positive definite. Let us define $a$, $b$, $c$ and $d$ by

$$\begin{pmatrix}a & b\\ c & d\end{pmatrix} = RCT.$$

In order to prove that $RCT$ plus its transpose is positive definite, all we need to prove is that

$$a + d > 0, \qquad (a+d)^2 - (a-d)^2 - (b+c)^2 > 0. \tag{78}$$

One can compute that $a + d = -\beta_3(\lambda_+ - \lambda_-)(\alpha_1 + \gamma_3)$.


Since $\beta_3 < 0$, $\alpha_1 + \gamma_3 > 0$ and $\lambda_+ - \lambda_- > 0$, we conclude that $a + d > 0$. One can also compute that

$$b + c = -\beta_3(\lambda_+ - \lambda_-)(\gamma_3 - \alpha_1), \qquad a - d = \beta_3[(\alpha_1 - \gamma_3)(\beta_1 - \delta_3) + 2\beta_3\gamma_1].$$

Consequently,

$$\left(\frac{a+d}{\beta_3}\right)^2 - \left(\frac{a-d}{\beta_3}\right)^2 - \left(\frac{b+c}{\beta_3}\right)^2 = 4\alpha_1\gamma_3[(\beta_1 - \delta_3)^2 + 4\beta_3\delta_1] - (\alpha_1 - \gamma_3)^2(\beta_1 - \delta_3)^2 - 4(\alpha_1 - \gamma_3)(\beta_1 - \delta_3)\beta_3\gamma_1 - 4\beta_3^2\gamma_1^2.$$

After inserting the values for the different constants, we obtain

$$\left(\frac{a+d}{2\beta_3}\right)^2 - \left(\frac{a-d}{2\beta_3}\right)^2 - \left(\frac{b+c}{2\beta_3}\right)^2 = [(np-1)^2 + 2p(np-1)](\beta_1 - \delta_3)^2 + 16p(np-1)^2(np-1+2p) - p^2(\beta_1 - \delta_3)^2 - 8(n-1)p^2(\beta_1 - \delta_3) - 16(n-1)^2p^2.$$

One can see that the terms involving $(\beta_1 - \delta_3)^2$ add up to something non-negative. Consider the second term on the right-hand side. If we write the last factor in this term as $np - 1 + p + p$, take the term that arises from one of the $p$'s and add it to the last two terms, we obtain

$$16p^2[(np-1)^2 - (n-1)^2] - 8(n-1)p^2(\beta_1 - \delta_3) = 16np^2(p-1)[n(p+1) - 2] - 8(n-1)p^2(\beta_1 - \delta_3).$$

However, $\beta_1 - \delta_3 = 2np(p-1) + 2p - 2np + 2$, so that

$$-8(n-1)p^2(\beta_1 - \delta_3) = -16np^2(p-1)(n-1)p + 16(n-1)p^2(np - p - 1).$$

We conclude that

$$16p^2[(np-1)^2 - (n-1)^2] - 8(n-1)p^2(\beta_1 - \delta_3) = 16np^2(p-1)(n+p-2) + 16(n-1)p^2(np - p - 1).$$

Thus (78) holds and $\hat C + \hat C^t$ is positive definite.
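The statement of Lemma 5 is easy to probe numerically for particular $n$ and $p$. The sketch below builds $A$, $C$, $T$ and $R$ from (72)–(76), forms $\hat C = RCT/\det T$ and returns the eigenvalues $\lambda_\pm$ together with the trace and determinant of $\hat C + \hat C^t$. It assumes $\lambda = 2/\sqrt{(n-1)p}$ (the value consistent with the background identities; (11) is not reproduced here), and the function name is illustrative.

```python
import math


def lemma5_check(n, p):
    """Return (lambda_minus, lambda_plus, trace, det) for C_hat + C_hat^t.

    Assumption: lambda = 2 / sqrt((n - 1) p). All other constants are
    those of (72)-(73).
    """
    lam = 2.0 / math.sqrt((n - 1) * p)
    alpha1 = (n + 2) * p - 1
    beta1 = 2 * p * (n * (p - 1) + 1)
    gamma1 = -8 / lam
    delta1 = -2 * lam * p * (n * p - 1)
    beta3 = -2 / lam
    gamma3 = n * p - 1
    delta3 = 2 * (n * p - 1)

    # Eigenvalues of A = [[beta1, delta1], [beta3, delta3]].
    half_trace = (beta1 + delta3) / 2
    root = math.sqrt(half_trace**2 - beta1 * delta3 + delta1 * beta3)
    lam_minus, lam_plus = half_trace - root, half_trace + root

    # T from (76) and its adjugate R, so that T^{-1} = R / det T.
    T = [[lam_minus - delta3, lam_plus - delta3], [beta3, beta3]]
    R = [[beta3, delta3 - lam_plus], [-beta3, lam_minus - delta3]]
    C = [[alpha1, gamma1], [0.0, gamma3]]
    det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    C_hat = [[entry / det_T for entry in row]
             for row in matmul(R, matmul(C, T))]

    # S = C_hat + C_hat^t is positive definite iff its trace and determinant are positive.
    s11, s22 = 2 * C_hat[0][0], 2 * C_hat[1][1]
    s12 = C_hat[0][1] + C_hat[1][0]
    return lam_minus, lam_plus, s11 + s22, s11 * s22 - s12**2
```

For $n = 3$, $p = 2$ (so that $\lambda = 1$ under the stated assumption) one finds $\lambda_- = 6$, $\lambda_+ = 20$ and $\hat C + \hat C^t$ with trace $28$ and determinant $164$, in line with the lemma.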


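3.3. Model energy. Let us consider a solution to (70) and (71), where $\tau \in \mathbb{R}$ and $x \in \mathbb{T}^n$. Let us use the notation

$$\hat{\mathbf{u}} = \begin{pmatrix}\hat u\\ \hat\psi\end{pmatrix} = T^{-1}\mathbf{u},$$

where $T$ is given by (76). Then

$$\hat{\mathbf{u}}_{\tau\tau} - e^{-2H\tau}\Delta\hat{\mathbf{u}} + \hat C\hat{\mathbf{u}}_\tau + \hat A\hat{\mathbf{u}} = 0.$$

Note that $\hat A$ is given by (77) and that $\hat C + \hat C^t$ is positive definite. We shall denote the components of $\hat C$ by $\hat C_{ij}$. Let us define an energy

$$\mathcal{E} = \frac{1}{2}\int_{\mathbb{T}^n}[|\hat{\mathbf{u}}_\tau|^2 + e^{-2H\tau}(|\nabla\hat u|^2 + |\nabla\hat\psi|^2) + 2c\,\hat{\mathbf{u}}^t\hat{\mathbf{u}}_\tau + b_1\hat u^2 + b_2\hat\psi^2]\,dx,$$

where the constants $c$ and $b_i$ are to be determined. To start with, the only condition we impose is that $c^2 < b_i$ for $i = 1, 2$. Note that this implies that there is an $\eta > 0$, depending on $c$, $b_1$ and $b_2$, such that

$$\frac{1}{2}\int_{\mathbb{T}^n}[|\hat{\mathbf{u}}_\tau|^2 + e^{-2H\tau}(|\nabla\hat u|^2 + |\nabla\hat\psi|^2) + |\hat{\mathbf{u}}|^2]\,dx \leq \eta\mathcal{E}.$$

Let us compute

$$\frac{d\mathcal{E}}{d\tau} = \int_{\mathbb{T}^n}\Big[-\frac{1}{2}\hat{\mathbf{u}}_\tau^t(\hat C + \hat C^t)\hat{\mathbf{u}}_\tau + c|\hat{\mathbf{u}}_\tau|^2 - (H + c)e^{-2H\tau}(|\nabla\hat u|^2 + |\nabla\hat\psi|^2) - c\lambda_-\hat u^2 - c\lambda_+\hat\psi^2 + (b_1 - \lambda_- - c\hat C_{11})\hat u\hat u_\tau + (b_2 - \lambda_+ - c\hat C_{22})\hat\psi\hat\psi_\tau - c\hat C_{12}\hat u\hat\psi_\tau - c\hat C_{21}\hat\psi\hat u_\tau\Big]\,dx.$$

Let us choose

$$b_1 = \lambda_- + c\hat C_{11}, \qquad b_2 = \lambda_+ + c\hat C_{22}. \tag{79}$$

Since the $\lambda_\pm$ are positive, we obtain $c^2 < b_i$ by choosing $c$ small enough. Note that

$$|c\hat C_{12}\hat u\hat\psi_\tau| \leq c^{3/2}\hat u^2 + \frac{1}{4}c^{1/2}\hat C_{12}^2\hat\psi_\tau^2,$$

and similarly for $c\hat C_{21}\hat\psi\hat u_\tau$. Choosing $b_i$ as in (79), we obtain

$$\frac{d\mathcal{E}}{d\tau} \leq \int_{\mathbb{T}^n}\Big[-\frac{1}{2}\hat{\mathbf{u}}_\tau^t(\hat C + \hat C^t)\hat{\mathbf{u}}_\tau + c|\hat{\mathbf{u}}_\tau|^2 + \frac{1}{4}c^{1/2}(\hat C_{12}^2\hat\psi_\tau^2 + \hat C_{21}^2\hat u_\tau^2) - (H + c)e^{-2H\tau}(|\nabla\hat u|^2 + |\nabla\hat\psi|^2) - c(\lambda_- - c^{1/2})\hat u^2 - c(\lambda_+ - c^{1/2})\hat\psi^2\Big]\,dx.$$

Due to Lemma 5, $\hat C^t + \hat C$ is positive definite, so that by choosing $c$ small enough, there is a constant $a_1 > 0$, depending on $n$ and $p$, such that

$$\frac{d\mathcal{E}}{d\tau} \leq -a_1\int_{\mathbb{T}^n}[\hat u_\tau^2 + \hat\psi_\tau^2 + e^{-2H\tau}(|\nabla\hat u|^2 + |\nabla\hat\psi|^2) + \hat u^2 + \hat\psi^2]\,dx.$$

This of course implies the existence of a $\kappa > 0$, depending on $n$ and $p$, such that

$$\frac{d\mathcal{E}}{d\tau} \leq -2\kappa\mathcal{E}.$$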
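For spatially constant data the Laplacian terms in (70)–(71) drop out and the model system reduces to the ODE $\mathbf{u}_{\tau\tau} + C\mathbf{u}_\tau + A\mathbf{u} = 0$, so the decay can be observed directly. The sketch below integrates this ODE with a classical fourth-order Runge–Kutta scheme. It is only an illustration of the decay, not the energy argument itself; it assumes $\lambda = 2/\sqrt{(n-1)p}$, and the initial data, step count and function name are arbitrary choices.

```python
import math


def homogeneous_decay(n, p, tau_end=10.0, steps=10000):
    """Integrate u'' + C u' + A u = 0 (x-independent data) with RK4.

    Assumption: lambda = 2 / sqrt((n - 1) p). Returns |y(tau_end)| / |y(0)|
    for the state y = (u, psi, u_tau, psi_tau) with initial data (1, 1, 0, 0).
    """
    lam = 2.0 / math.sqrt((n - 1) * p)
    A = [[2 * p * (n * (p - 1) + 1), -2 * lam * p * (n * p - 1)],
         [-2 / lam, 2 * (n * p - 1)]]
    C = [[(n + 2) * p - 1, -8 / lam],
         [0.0, n * p - 1]]

    def rhs(y):
        u, psi, du, dpsi = y
        ddu = -(C[0][0] * du + C[0][1] * dpsi) - (A[0][0] * u + A[0][1] * psi)
        ddpsi = -(C[1][0] * du + C[1][1] * dpsi) - (A[1][0] * u + A[1][1] * psi)
        return [du, dpsi, ddu, ddpsi]

    def shifted(y, k, h):
        # y + h * k, componentwise
        return [yi + h * ki for yi, ki in zip(y, k)]

    y = [1.0, 1.0, 0.0, 0.0]
    norm0 = math.sqrt(sum(v * v for v in y))
    h = tau_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, h / 2))
        k3 = rhs(shifted(y, k2, h / 2))
        k4 = rhs(shifted(y, k3, h))
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return math.sqrt(sum(v * v for v in y)) / norm0
```

For $n = 3$, $p = 2$ (where $\lambda = 1$ under the stated assumption) the characteristic polynomial $\det(s^2 + sC + A) = s^4 + 14s^3 + 71s^2 + 154s + 120$ factors as $(s+2)(s+3)(s+4)(s+5)$, so every homogeneous mode decays at least like $e^{-2\tau}$ and the returned ratio is tiny.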


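4. Energy Estimates

Let us turn back to the actual equations. The purpose of the present section is to construct the energies on which the bootstrap argument will be based. Let us start by constructing the energy associated with (61) and (64). Note that we can write (61) and (64) as

$$\hat\Box_g\mathbf{u} + C\mathbf{u}_\tau + A\mathbf{u} + \boldsymbol{\Delta} = 0,$$

where $\hat\Box_g$ is defined in (65) and where $A$, $C$ and $\mathbf{u}$ are defined in (74),

$$\boldsymbol{\Delta} = \begin{pmatrix}\Delta_{00}\\ \Delta_\psi\end{pmatrix}.$$

Letting $T$ be defined by (76), $\hat{\mathbf{u}} = T^{-1}\mathbf{u}$, $\hat{\boldsymbol{\Delta}} = T^{-1}\boldsymbol{\Delta}$, $\hat A = T^{-1}AT$ and $\hat C = T^{-1}CT$, we obtain

$$\hat\Box_g\hat{\mathbf{u}} + \hat C\hat{\mathbf{u}}_\tau + \hat A\hat{\mathbf{u}} + \hat{\boldsymbol{\Delta}} = 0. \tag{80}$$

We shall also use the terminology

$$\begin{pmatrix}\hat u\\ \hat\psi\end{pmatrix} := \hat{\mathbf{u}}.$$

Lemma 6. Let $p > 1$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. Let $g : I \times \mathbb{T}^n \to C_n$, where $I$ is an interval, and denote the components of $g$ by $g_{\mu\nu}$. Consider a solution $\hat{\mathbf{u}}$ to the equation

$$\hat\Box_g\hat{\mathbf{u}} + \hat C\hat{\mathbf{u}}_\tau + \hat A\hat{\mathbf{u}} = F \tag{81}$$

on $I \times \mathbb{T}^n$, where $\hat\Box_g$ is defined in (65), $F$ is a given function and $\hat A$ and $\hat C$ are defined above. Given constants $c_{lp}$ and $b_i$, $i = 1, 2$, we define

$$\mathcal{E}[\hat{\mathbf{u}}] = \frac{1}{2}\int_{\mathbb{T}^n}\{-g^{00}\partial_\tau\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} + \hat g^{ij}\partial_i\hat{\mathbf{u}}^t\partial_j\hat{\mathbf{u}} - 2c_{lp}g^{00}\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} + b_1\hat u^2 + b_2\hat\psi^2\}\,dx \tag{82}$$

on $I$, where we use the notation $\hat g^{ij} = e^{2(\tau+\tau_0)}g^{ij}$. Below we shall also use the notation $\hat g^{0i} = e^{\tau+\tau_0}g^{0i}$ and $H = p - 1$. There are constants $\eta_{lp}, \zeta_{lp}, b_i, c_{lp} > 0$, depending on $n$ and $p$, such that if $\mathcal{E}$ is defined by (82) with this choice of $b_i$ and $c_{lp}$ and

$$|g^{00} + 1| \leq \eta_{lp}, \tag{83}$$

then

$$\mathcal{E} \geq \zeta_{lp}\int_{\mathbb{T}^n}\{\partial_\tau\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} + \hat g^{ij}\partial_i\hat{\mathbf{u}}^t\partial_j\hat{\mathbf{u}} + \hat{\mathbf{u}}^t\hat{\mathbf{u}}\}\,dx$$

and

$$\frac{d\mathcal{E}}{d\tau} \leq -2\eta_{lp}\mathcal{E} + \int_{\mathbb{T}^n}\{(\partial_\tau\hat{\mathbf{u}}^t + c_{lp}\hat{\mathbf{u}}^t)F + \Delta_{\mathcal{E}}[\hat{\mathbf{u}}]\}\,dx, \tag{84}$$

where $\Delta_{\mathcal{E}}[\hat{\mathbf{u}}]$ is given in (85).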


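Remark 8. Note that since $g$ is a map into $C_n$, Lemma 2 implies that $g^{ij}$ are the components of a positive definite matrix and that $g^{00} < 0$.

Proof. Let us compute

$$\frac{d\mathcal{E}}{d\tau} = \int_{\mathbb{T}^n}\Big\{-\frac{1}{2}\partial_\tau\hat{\mathbf{u}}^t(\hat C + \hat C^t)\partial_\tau\hat{\mathbf{u}} - \partial_\tau\hat{\mathbf{u}}^t\hat A\hat{\mathbf{u}} + \partial_\tau\hat{\mathbf{u}}^tF - (H + c_{lp})\hat g^{ij}\partial_i\hat{\mathbf{u}}^t\partial_j\hat{\mathbf{u}} + c_{lp}|\partial_\tau\hat{\mathbf{u}}|^2 - c_{lp}\hat{\mathbf{u}}^t\hat C\partial_\tau\hat{\mathbf{u}} - c_{lp}\hat{\mathbf{u}}^t\hat A\hat{\mathbf{u}} + c_{lp}\hat{\mathbf{u}}^tF + b_1\hat u\partial_\tau\hat u + b_2\hat\psi\partial_\tau\hat\psi + \Delta_{\mathcal{E}}[\hat{\mathbf{u}}]\Big\}\,dx,$$

where

$$\Delta_{\mathcal{E}}[\hat{\mathbf{u}}] = -c_{lp}(g^{00}+1)\partial_\tau\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} - 2c_{lp}\hat g^{0i}\partial_i\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} - 2c_{lp}(\partial_i\hat g^{0i})\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} - c_{lp}(\partial_j\hat g^{ij})\partial_i\hat{\mathbf{u}}^t\hat{\mathbf{u}} - \frac{1}{2}(\partial_\tau g^{00})\partial_\tau\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} + \left(\frac{1}{2}\partial_\tau\hat g^{ij} + H\hat g^{ij}\right)\partial_i\hat{\mathbf{u}}^t\partial_j\hat{\mathbf{u}} - (\partial_i\hat g^{0i})\partial_\tau\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}} - (\partial_j\hat g^{ij})\partial_\tau\hat{\mathbf{u}}^t\partial_i\hat{\mathbf{u}} - c_{lp}(\partial_\tau g^{00})\hat{\mathbf{u}}^t\partial_\tau\hat{\mathbf{u}}. \tag{85}$$

Choosing $c_{lp}$ and $b_i$ similarly to how we chose them in Subsect. 3.3, we get the desired conclusion, assuming $g^{00}$ to be close enough to $-1$.

Corollary 1. With assumptions as in Lemma 6, let $\mathcal{E}$ be defined by (82) with constants chosen as in the statement of Lemma 6. Let

$$\mathcal{E}_k = \sum_{|\alpha| \leq k}\mathcal{E}[\partial^\alpha\hat{\mathbf{u}}].$$

Then, assuming (83) holds,

$$\frac{d\mathcal{E}_k}{d\tau} \leq -2\eta_{lp}\mathcal{E}_k + \sum_{|\alpha| \leq k}\int_{\mathbb{T}^n}\{(\partial^\alpha\partial_\tau\hat{\mathbf{u}}^t + c_{lp}\partial^\alpha\hat{\mathbf{u}}^t)(\partial^\alpha F + [\hat\Box_g, \partial^\alpha]\hat{\mathbf{u}}) + \Delta_{\mathcal{E}}[\partial^\alpha\hat{\mathbf{u}}]\}\,dx.$$

Remark 9. When we write $\partial^\alpha$, we shall always take for granted that the Greek index used upstairs is a multiindex, $\alpha = (l_1, \dots, l_n)$, where the $l_i$ are non-negative integers so that $\partial^\alpha = \partial_1^{l_1}\cdots\partial_n^{l_n}$, where $\partial_i$ is the standard differential operator with respect to the $i$th "coordinate" on $\mathbb{T}^n$. Note in particular that $\partial^\alpha$ never contains any derivatives with respect to the time coordinate. Note also that in an expression $\partial_\alpha$, the Greek index downstairs means a number from $0$ to $n$.

Proof. Differentiating (81), we obtain

$$\hat\Box_g\partial^\alpha\hat{\mathbf{u}} + \hat C\partial_\tau\partial^\alpha\hat{\mathbf{u}} + \hat A\partial^\alpha\hat{\mathbf{u}} = \partial^\alpha F + [\hat\Box_g, \partial^\alpha]\hat{\mathbf{u}},$$

so that we only need to apply Lemma 6 in order to get the desired conclusion.

The energies we shall construct for $u_i$ and $h_{ij}$ will be based on the following lemma.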


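Lemma 7. Let $\tau_0$ be a real number and $n \geq 3$ be an integer. Let $g : I \times \mathbb{T}^n \to C_n$, where $I$ is an interval, and denote the components of $g$ by $g_{\mu\nu}$. Consider a solution to the equation

$$\hat\Box_g v + \alpha\partial_\tau v + \beta v = F, \tag{86}$$

on $I \times \mathbb{T}^n$, where $\hat\Box_g$ is defined in (65), $F$ is a given function and $\alpha > 0$ and $\beta \geq 0$ are constants. Then there are constants $\eta_c, \zeta > 0$ and $\gamma, \delta \geq 0$, depending on $\alpha$ and $\beta$, such that if

$$|g^{00} + 1| \leq \eta_c \tag{87}$$

and

$$\mathcal{E}_{\gamma,\delta}[v] = \frac{1}{2}\int_{\mathbb{T}^n}[-g^{00}(\partial_\tau v)^2 + \hat g^{ij}\partial_i v\partial_j v - 2\gamma g^{00}v\partial_\tau v + \delta v^2]\,dx,$$

then

$$\mathcal{E}_{\gamma,\delta} \geq \zeta\int_{\mathbb{T}^n}[(\partial_\tau v)^2 + \hat g^{ij}\partial_i v\partial_j v + \iota_\beta v^2]\,dx, \tag{88}$$

where $\iota_\beta = 0$ if $\beta = 0$ and $\iota_\beta = 1$ otherwise, and

$$\frac{d\mathcal{E}_{\gamma,\delta}}{d\tau} \leq -2\eta_c\mathcal{E}_{\gamma,\delta} + \int_{\mathbb{T}^n}\{(\partial_\tau v + \gamma v)F + \Delta_{\mathcal{E},\gamma,\delta}[v]\}\,dx,$$

where $\Delta_{\mathcal{E},\gamma,\delta}[v]$ is given by (89). If $\beta = 0$, then $\gamma = \delta = 0$.

Proof. If $\beta > 0$, choose $\gamma = \alpha/2$ and $\delta = \beta + \alpha^2/2$. Then $\gamma^2 < \delta$, and it is clear that there is a constant $\zeta > 0$ such that (88) holds, assuming $g^{00}$ is close enough to $-1$. If $\beta = 0$, we simply let $\gamma = \delta = 0$, and the existence of a $\zeta > 0$ such that (88) holds again follows from the assumption that $g^{00}$ is close enough to $-1$. Compute

$$\frac{d\mathcal{E}_{\gamma,\delta}}{d\tau} = \int_{\mathbb{T}^n}\{-(\alpha - \gamma)(\partial_\tau v)^2 + (\delta - \beta - \gamma\alpha)v\partial_\tau v - \beta\gamma v^2 - (H + \gamma)\hat g^{ij}\partial_i v\partial_j v + (\partial_\tau v + \gamma v)F + \Delta_{\mathcal{E},\gamma,\delta}[v]\}\,dx,$$

where

$$\Delta_{\mathcal{E},\gamma,\delta}[v] = -\gamma(\partial_i\hat g^{ij})v\partial_j v - 2\gamma(\partial_i\hat g^{0i})v\partial_\tau v - 2\gamma\hat g^{0i}\partial_i v\partial_\tau v - (\partial_i\hat g^{0i})(\partial_\tau v)^2 - (\partial_j\hat g^{ij})\partial_i v\partial_\tau v - \frac{1}{2}(\partial_\tau g^{00})(\partial_\tau v)^2 + \left(\frac{1}{2}\partial_\tau\hat g^{ij} + H\hat g^{ij}\right)\partial_i v\partial_j v - \gamma\,\partial_\tau g^{00}\,v\partial_\tau v - \gamma(g^{00}+1)(\partial_\tau v)^2. \tag{89}$$

Due to our choices, we have, assuming $\beta > 0$,

$$\frac{d\mathcal{E}_{\gamma,\delta}}{d\tau} = -\frac{1}{2}\int_{\mathbb{T}^n}[\alpha(\partial_\tau v)^2 + (\alpha + 2H)\hat g^{ij}\partial_i v\partial_j v + \alpha\beta v^2]\,dx + \int_{\mathbb{T}^n}\{(\partial_\tau v + \gamma v)F + \Delta_{\mathcal{E},\gamma,\delta}[v]\}\,dx.$$

Since the opposite inequality to (88) also holds, provided we replace $\zeta$ by $\zeta^{-1}$ for $\zeta$ small enough, we obtain the conclusion of the lemma for $\beta > 0$. The conclusion in the case $\beta = 0$ follows for similar reasons.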
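The effect of the choice $\gamma = \alpha/2$, $\delta = \beta + \alpha^2/2$ can be read off by substituting into the coefficients of the quadratic form in $d\mathcal{E}_{\gamma,\delta}/d\tau$. The following small sketch does this in exact arithmetic; the function name and dictionary keys are illustrative only.

```python
from fractions import Fraction


def lemma7_coefficients(alpha, beta, H):
    """Coefficients in dE/dtau for gamma = alpha/2, delta = beta + alpha^2/2."""
    alpha, beta, H = Fraction(alpha), Fraction(beta), Fraction(H)
    gamma = alpha / 2
    delta = beta + alpha**2 / 2
    return {
        "dv_sq": -(alpha - gamma),               # multiplies (d_tau v)^2
        "cross": delta - beta - gamma * alpha,   # multiplies v d_tau v
        "v_sq": -beta * gamma,                   # multiplies v^2
        "grad": -(H + gamma),                    # multiplies g^{ij} d_i v d_j v
        "gamma_sq_lt_delta": gamma**2 < delta,   # condition used for (88)
    }
```

The cross term vanishes and the remaining coefficients are $-\alpha/2$, $-\alpha\beta/2$ and $-(\alpha + 2H)/2$, which is the displayed formula for $\beta > 0$.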


Corollary 2. Under the assumptions of Lemma 7, let

$$\mathcal{E}_k = \sum_{|\alpha| \leq k}\mathcal{E}_{\gamma,\delta}[\partial^\alpha v].$$

Then, assuming (87) holds,

$$\frac{d\mathcal{E}_k}{d\tau} \leq -2\eta_c\mathcal{E}_k + \sum_{|\alpha| \leq k}\int_{\mathbb{T}^n}\{(\partial_\tau\partial^\alpha v + \gamma\partial^\alpha v)(\partial^\alpha F + [\hat\Box_g, \partial^\alpha]v) + \Delta_{\mathcal{E},\gamma,\delta}[\partial^\alpha v]\}\,dx.$$

Proof. Given that $v$ satisfies (86), $\partial^\alpha v$ satisfies

$$\hat\Box_g(\partial^\alpha v) + \alpha\partial_\tau(\partial^\alpha v) + \beta(\partial^\alpha v) = \partial^\alpha F + [\hat\Box_g, \partial^\alpha]v.$$

The statement follows from Lemma 7.

5. Bootstrap Assumptions

Before we write down the basic bootstrap assumptions, let us introduce some terminology. If $A$ is a symmetric positive definite $n \times n$ matrix with components $A_{ij}$ and $w \in \mathbb{R}^n$, we shall use the notation

$$|w|_A = \Big(\sum_{i,j=1}^n A_{ij}w^iw^j\Big)^{1/2}.$$

If $\mathrm{Id}$ is the identity matrix, we define $|w| := |w|_{\mathrm{Id}}$. We shall also use the notation introduced in Subsect. 2.4.

5.1. Primary bootstrap assumptions. The purpose of the primary bootstrap assumptions is to ensure that the metric remains Lorentzian, with quantitative bounds.

Definition 4. Let $p > 1$, $a > 0$, $c_1 > 1$, $\eta \in (0, 1)$, $K_0$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. We shall say that a function $g : I \times \mathbb{T}^n \to C_n$, where $I$ is an interval, satisfies the primary bootstrap assumptions on $I$ (the relevant constants being understood from the context) if

$$c_1^{-1}|w|^2 \leq e^{-2\Omega-2K}|w|^2_{g_\flat} \leq c_1|w|^2, \tag{90}$$

$$|u[g]| \leq \eta, \tag{91}$$

$$|v[g]|^2 \leq \eta c_1^{-1}e^{2\Omega-2r+2K}, \tag{92}$$

for all $w \in \mathbb{R}^n$ and all $(\tau, x) \in I \times \mathbb{T}^n$, where $\Omega = p\tau$, $r = a\tau$ and $K = \tau_0 + K_0$.

Remark 10. We shall specify $a$ and $\eta$ in (101) and (100) below. In the end we shall apply the above conditions to a situation in which $K_0$ only depends on $p$, so that factors of $e^{-K_0}$ and $e^{K_0}$ can be considered to be constants of which one need not keep track. In fact, the natural choice to make for $e^K$ is a numerical multiple of the basic length scale ($t_0$). Furthermore, the constants $\eta$ and $a$ we shall use only depend on $n$ and $p$, and $c_1$ will, in our applications, be a numerical constant. In other words, the only quantity that in practice needs to be specified (beyond $n$ and $p$) is $\tau_0$.


Lemma 7 of [20] gives the following conclusions of the bootstrap assumptions.

Lemma 8. Let $p > 1$, $a > 0$, $c_1 > 1$, $\eta \in (0, 1)$, $K_0$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. Assume that $g : I \times \mathbb{T}^n \to C_n$ satisfies the primary bootstrap assumptions on $I$, where $I$ is an interval. There is a numerical constant $\eta_0 \in (0, 1/4)$ such that if we assume $\eta \leq \eta_0$ in (91) and (92), then

$$|v[g^{-1}]| \leq 2c_1e^{-2\Omega-2K}|v[g]|, \tag{93}$$

$$|(v[g], v[g^{-1}])| \leq 2c_1e^{-2\Omega-2K}|v[g]|^2, \tag{94}$$

$$|u[g^{-1}]| \leq 4\eta, \tag{95}$$

$$\frac{2}{3c_1}|w|^2 \leq e^{2\Omega+2K}|w|^2_{g^\sharp} \leq \frac{3c_1}{2}|w|^2, \tag{96}$$

for all $w \in \mathbb{R}^n$ and $(\tau, x) \in I \times \mathbb{T}^n$. Here we use the notation $(\xi, \zeta)$ for the ordinary scalar product of $\xi, \zeta \in \mathbb{R}^n$.

Remark 11. The lemma holds irrespective of the value of $a$.

5.2. Energies. Let $p > 1$, $a > 0$, $c_1 > 1$, $\eta \in (0, \min\{\eta_0, \eta_{lp}/4\}]$, $K_0$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. Assume that $g : I \times \mathbb{T}^n \to C_n$ satisfies the primary bootstrap assumptions on $I$, where $I$ is an interval. Then (83) is satisfied due to (95). In order to define the energy associated with $u$ and $\psi$, let us note that (61) and (64) can be combined into (80). Using the notation introduced in connection with (80), let

$$H_{lp,k} = \sum_{|\alpha| \leq k}\mathcal{E}[\partial^\alpha\hat{\mathbf{u}}], \tag{97}$$

where $\mathcal{E}$ is defined in (82) with the constants that are obtained as a result of Lemma 6.

Consider (62). If we take all the terms on the left-hand side except for the first three to the right-hand side, we get an equation of the type discussed in Lemma 7 with $\alpha$ replaced by $\alpha_2$ and $\beta$ replaced by $\beta_2$. Since $\alpha_2, \beta_2 > 0$, Lemma 7 yields positive constants $\gamma_s$, $\delta_s$, $\eta_s$ and $\zeta_s$ such that the conclusions of that lemma hold, and we define

$$H_{s,k} = \sum_i\sum_{|\alpha| \leq k}\mathcal{E}_{\gamma_s,\delta_s}[\partial^\alpha u_i], \tag{98}$$

where $\mathcal{E}_{\gamma_s,\delta_s}$ is defined in Lemma 7.

Consider (63). Taking all but the first two terms on the left-hand side to the right-hand side, we obtain an equation of the type considered in Lemma 7 with $\alpha$ replaced by $np - 1 > 0$ and $\beta$ replaced by $0$. We thus get $\gamma_m = \delta_m = 0$ and $\eta_m, \zeta_m > 0$ such that the conclusions of Lemma 7 hold. We define the energy associated with $h_{ij}$ to be

$$H_{m,k} = \sum_{i,j}\sum_{|\alpha| \leq k}\left(\mathcal{E}_{\gamma_m,\delta_m}[\partial^\alpha h_{ij}] + \frac{1}{2}\int_{\mathbb{T}^n}e^{-2a\tau}a_\alpha(\partial^\alpha h_{ij})^2\,dx\right), \tag{99}$$

where $a > 0$ is given by (101) and $a_\alpha = 1$ for $|\alpha| > 0$, $a_\alpha = 0$ for $\alpha = 0$. From now on, we shall assume that $g$ satisfies the primary bootstrap assumption on an interval $I$, where $\eta$ is defined by

$$\eta := \min\{\eta_0, \eta_{lp}/4, \eta_s/4, \eta_m/4\}. \tag{100}$$


Note that as a consequence, the conclusions of Lemmas 6 and 7 hold for the energies of interest, cf. (95). Furthermore, we define

$$a := \frac{1}{4}\min\{p - 1, \eta_{lp}, \eta_s, \eta_m\}. \tag{101}$$

Note that $a$ and $\eta$ only depend on $n$ and $p$.

5.3. Basic estimates. Let us use the notation

$$\|f\|_{H^k} = \Big(\sum_{|\alpha| \leq k}\int_{\mathbb{T}^n}(\partial^\alpha f)^2\,dx\Big)^{1/2}$$

for the Sobolev norms (note that we shall use this notation even when $f$ depends on $t$, and then the derivatives will still only be with respect to the spatial coordinates). We wish to express the Sobolev norms of the quantities of interest in terms of the geometrically defined energies $H_{lp,k}$, etc. In the end it will turn out to be convenient to use the following energies instead:

$$\hat H_{lp,k} = e^{2a\tau}H_{lp,k}, \qquad \hat H_{s,k} = e^{-2p\tau+2a\tau-2K}H_{s,k}, \qquad \hat H_{m,k} = e^{2a\tau-4K}H_{m,k},$$

where $a > 0$ is given by (101). We shall also use the notation

$$\hat H_k = \hat H_{lp,k} + \hat H_{s,k} + \hat H_{m,k}. \tag{102}$$

Note that, using the notation of Sect. 7 in [20], $\hat H_{lp,k}$, $\hat H_{s,k}$ and $\hat H_{m,k}$ are equivalent to $\hat E_{lp,k}$, $\hat E_{s,k}$ and $\hat E_{m,k}$ respectively; in the formulas for $\hat E$, the quantity $r$ should be replaced by $a\tau$ and it is convenient to note that

$$\omega^{-1}\partial_t = p^{-1}\partial_\tau, \qquad \omega^{-1}g^{0i} = p^{-1}\hat g^{0i}, \qquad \omega^{-2}g^{ij} = p^{-2}\hat g^{ij}. \tag{103}$$

In particular, $\hat H_k$ is equivalent to $\hat E_k$. Furthermore, we have the following lemma.

Lemma 9. Let $p > 1$, $c_1 > 1$, $K_0$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. Let $\eta$ and $a$ be defined by (100) and (101) respectively and assume that $g : I \times \mathbb{T}^n \to C_n$ satisfies the primary bootstrap assumptions on an interval $I$. Then

$$e^{a\tau}[\|\psi\|_{H^k} + \|\partial_\tau\psi\|_{H^k} + e^{-H\tau-K_0}\|\partial_i\psi\|_{H^k}] \leq C\hat H_{lp,k}^{1/2}, \tag{104}$$

$$e^{a\tau}[\|u\|_{H^k} + \|\partial_\tau u\|_{H^k} + e^{-H\tau-K_0}\|\partial_i u\|_{H^k}] \leq C\hat H_{lp,k}^{1/2}, \tag{105}$$

$$e^{-p\tau+a\tau-K}[\|u_m\|_{H^k} + \|\partial_\tau u_m\|_{H^k} + e^{-H\tau-K_0}\|\partial_i u_m\|_{H^k}] \leq C\hat H_{s,k}^{1/2}, \tag{106}$$

$$e^{-2p\tau+a\tau-2K}[\|\partial_\tau g_{ij} - 2pg_{ij}\|_{H^k} + e^{-H\tau-K_0}\|\partial_l g_{ij}\|_{H^k}] \leq C\hat H_{m,k}^{1/2}, \tag{107}$$

$$e^{-2p\tau-2K}\|\partial^\alpha g_{ij}\|_2 \leq C\hat H_{m,k}^{1/2} \tag{108}$$

hold on $I$, where $K = \tau_0 + K_0$, the last estimate is valid for $0 < |\alpha| \leq k$ and the constants depend on $c_1$, $n$ and $p$.


Proof. The lemma follows from Lemma 8 of [20] given the above mentioned equivalence of the energies (though it is not difficult to prove the statement directly). Note, however, that this is based on observations such as (103) and $\omega^{-1}e^{-p\tau-K} = p^{-1}e^{-H\tau-K_0}$, and the fact that $1$ is as good a constant as $p^{-1}$.

We shall need estimates for the components of the inverse of the metric. Such estimates follow from the results of [20].

Lemma 10. Let $p > 1$, $c_1 > 1$, $K_0$ and $\tau_0$ be real numbers and $n \geq 3$ be an integer. Let $\eta$ and $a$ be defined by (100) and (101) respectively and assume that $g : I \times \mathbb{T}^n \to C_n$ satisfies the primary bootstrap assumptions on an interval $I$. Then, for $0 < |\alpha| \leq k$,

$$e^{a\tau}\|\partial^\alpha g^{00}\|_2 \leq C\hat H_k^{1/2}, \tag{109}$$

$$e^{2p\tau+2K}\|\partial^\alpha g^{lm}\|_2 \leq C\hat H_k^{1/2}, \tag{110}$$

$$e^{p\tau+a\tau+K}\|g^{0l}\|_{H^k} \leq C\hat H_k^{1/2} \tag{111}$$

hold on $I$, where $K = \tau_0 + K_0$, $\hat H_k$ is defined in (102) and the constants depend on $n$, $p$, $k$ and $c_1$.

Proof. See Lemma 9 of [20].

5.4. The main bootstrap assumption. Using the primary bootstrap assumptions, it is possible to define the energy $\hat H_k$ in terms of which the main bootstrap assumption is phrased.

Definition 5. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \leq 1$ and $\tau_0$ be real numbers and $n \geq 3$ and $k_0 > n/2 + 1$ be integers. We shall then say that $(g, \psi)$ satisfy the main bootstrap assumption on $I$ (the relevant constants being understood from the context), where $I$ is an interval, if

– $g : I \times \mathbb{T}^n \to C_n$ and $\psi : I \times \mathbb{T}^n \to \mathbb{R}$ are $C^\infty$,

– $g$ satisfies the primary bootstrap assumptions on $I$, where $\eta$ and $a$ are defined by (100) and (101) respectively,

– $g$ and $\psi$ satisfy

$$\hat H_{k_0}^{1/2}(\tau) \leq \epsilon \tag{112}$$

for all $\tau \in I$, where $K = K_0 + \tau_0$.

Remark 12. Note that these bootstrap assumptions correspond exactly to the bootstrap assumptions made in [20], given the specific form of $\Omega$ and $r$, cf. (105) of [20].


6. Estimates for the Non-Linearity

In the proof of future global existence of solutions, the main tool is the system of differential inequalities given in Sect. 7. The first step in the derivation of these inequalities has already been taken, cf. Corollaries 1 and 2. However, in order to obtain (139)–(141), it is necessary to estimate $\Delta_{\mu\nu}$, $\Delta_\psi$, $[\hat\Box_g, \partial^\alpha]\hat{\mathbf{u}}$, $\Delta_{\mathcal{E}}[\partial^\alpha\hat{\mathbf{u}}]$, etc. in $H^k$, cf. Corollaries 1 and 2. The present section is devoted to a derivation of such estimates. In Subsect. 9.1 of [20], we described an algorithm for estimating the higher order terms. The current context is only a special case of what was considered there. However, a few things should be kept in mind when making the comparison. First of all, in the estimates in [20], time derivatives were computed with respect to the original time $t$ and not with respect to $\tau$. Furthermore, $\Omega = p\tau$, $K = \tau_0 + K_0$, $\omega = p/t$ and $r = a\tau$. The relationship between $t$ and $\tau$ is of course given by $\tau = \ln t - \tau_0$. When using the algorithm described in [20], it is convenient to note that (103) holds. In particular, changing $\partial_t$ to $\partial_\tau$ corresponds to multiplication with $\omega^{-1}$ as far as estimates are concerned.

6.1. Estimates for the quadratic terms.

Lemma 11. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \leq 1$ and $\tau_0$ be real numbers and $n \geq 3$ and $k_0 > n/2 + 1$ be integers. Assume that $(g, \psi)$ satisfy the main bootstrap assumption on an interval $I$. Then

$$\|\Delta_{00}\|_{H^k} \leq C\epsilon e^{-2a\tau}\hat H_k^{1/2}, \tag{113}$$

$$\|\Delta_{0l}\|_{H^k} \leq C\epsilon e^{p\tau-2a\tau+K}\hat H_k^{1/2}, \tag{114}$$

$$\|\Delta_{ij}\|_{H^k} \leq C\epsilon e^{-2a\tau+2K}\hat H_k^{1/2}, \tag{115}$$

$$\|\Delta_\psi\|_{H^k} \leq C\epsilon e^{-2a\tau}\hat H_k^{1/2} \tag{116}$$

on $I$, where $\Delta_{00}$, $\Delta_{0i}$, $\Delta_{ij}$ and $\Delta_\psi$ are given by (66)–(69), $K = K_0 + \tau_0$, $\hat H_k$ is defined in (102) and the constants depend on $n$, $p$, $k$ and $c_1$.

Remark 13. The bootstrap assumptions only constitute control of $k_0 + 1$ derivatives, but the conclusions of the present lemma, as well as several lemmas to follow, hold for any non-negative integer $k$.

Proof. Consider

$$\Delta_{00} = (g^{00}+1)\partial_\tau u + e^{2(\tau+\tau_0)}\tilde\Delta_{00}.$$

To estimate the first term using the algorithm, cf. Subsect. 9.1 of [20], note that it can be rewritten

$$(g^{00}+1)p\omega^{-1}\partial_t u. \tag{117}$$

The expression

$$(g^{00}+1)\partial_t u, \tag{118}$$

is of the type dealt with by the algorithm, and, in the terminology of [20], we compute that $l = 2$, $l_h = 0$ and $l_\partial = 1$. Here $l$ gives the number of terms that are "small" (for a precise definition, see [20]), $l_h$ gives the number of downstairs spatial indices minus the


number of upstairs spatial indices, including spatial derivatives, and $l_\partial$ is the number of derivatives occurring. Due to the algorithm, the expression (118) can thus be estimated by

$$C\epsilon\,\omega^{l_\partial}e^{l_h(\Omega+K)-lr}\hat E_k^{1/2} = C\epsilon\,\omega e^{-2a\tau}\hat E_k^{1/2},$$

which yields the desired estimate for (117) in view of the fact that $\hat E_k$ and $\hat H_k$ are equivalent. What remains to be considered is thus

$$e^{2(\tau+\tau_0)}\tilde\Delta_{00} = 2e^{2(\tau+\tau_0)}\Delta_{A,00} + 2e^{2(\tau+\tau_0)}\Delta_{C,00} - 2e^{2(\tau+\tau_0)}\Delta_{\phi,00}, \tag{119}$$

cf. (51). Due to Lemma 12 of [20], we have the estimate

$$\|\Delta_{A,00}\|_{H^k} + \|\Delta_{C,00}\|_{H^k} \leq C\epsilon\,\omega^2e^{-2r}\hat E_k^{1/2}.$$

Noting that $\omega^{-2} = p^{-2}e^{2(\tau+\tau_0)}$, this estimate implies

$$e^{2(\tau+\tau_0)}\|\Delta_{A,00}\|_{H^k} + e^{2(\tau+\tau_0)}\|\Delta_{C,00}\|_{H^k} \leq C\epsilon e^{-2a\tau}\hat H_k^{1/2},$$

which yields the desired estimate for the first two terms on the right-hand side of (119). Let us turn to $e^{2(\tau+\tau_0)}\Delta_{\phi,00}$, where $\Delta_{\phi,00}$ is given by (50). An estimate for the first two terms in (50), after multiplication by $e^{2(\tau+\tau_0)}$, follows by estimating $\omega^{-2}\dot\psi^2$ and $u\psi$. These objects can be estimated by the algorithm; in both cases $l = 2$ and $l_h = 0$ and in the first case $l_\partial = 2$ whereas $l_\partial = 0$ in the last case. Finally, we need to estimate

$$e^{2(\tau+\tau_0)}g_{00}\Delta_{E,\phi}. \tag{120}$$

Note that $\Delta_{E,\phi}$ is given by (49) and that $V_0e^{-\lambda\phi_0} = p(np-1)(n-1)/(2t^2)$, cf. (44), so that estimating (120) is the same as estimating

$$g_{00}(e^{-\lambda\psi} - 1 + \lambda\psi) = R(\psi)g_{00}\psi^2$$

for some smooth function $R$, cf. the proof of Lemma 16 in [20]. This is an object which can be estimated by the algorithm; $l = 2$ and $l_h = l_\partial = 0$. The arguments to derive (114)–(116) are similar.

6.2. Estimates for the commutators. We shall need estimates for the $H^k$-norm of

$$\hat F_0 := \hat\Box_g u = -\alpha_1\partial_\tau u - \beta_1 u + \frac{8}{\lambda}\partial_\tau\psi + 2\lambda p(np-1)\psi - \Delta_{00}, \tag{121}$$

$$\hat F_i := \hat\Box_g u_i = -\alpha_2\partial_\tau u_i - \beta_2 u_i + 2pe^{\tau+\tau_0}g^{lm}\Gamma_{lim} + \frac{4e^{\tau+\tau_0}}{\lambda}\partial_i\psi - \Delta_{0i}, \tag{122}$$

$$\hat F_{ij} := \hat\Box_g h_{ij} = -(np-1)\partial_\tau h_{ij} - [-2pu + 2\lambda p(np-1)\psi]h_{ij} - \Delta_{ij}, \tag{123}$$

$$\hat F_\psi := \hat\Box_g\psi = -(np-1)\partial_\tau\psi - 2(np-1)\psi + \frac{2}{\lambda}u - \Delta_\psi, \tag{124}$$

where we have used (61)–(64).


Lemma 12. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \leq 1$ and $\tau_0$ be real numbers and $n \geq 3$ and $k_0 > n/2 + 1$ be integers. Assume that $(g, \psi)$ satisfy the main bootstrap assumption on an interval $I$. Assuming (61)–(64) are satisfied (where $h_{ij}$, $u_i$ and $u$ are defined in terms of $g$ according to (57)–(59)), we conclude that

$$\|\hat F_0\|_{H^k} \leq Ce^{-a\tau}\hat H_k^{1/2}, \tag{125}$$

$$\|\hat F_m\|_{H^k} \leq Ce^{p\tau-a\tau+K}\hat H_k^{1/2}, \tag{126}$$

$$\|\hat F_{ij}\|_{H^k} \leq Ce^{-a\tau+2K}\hat H_k^{1/2}, \tag{127}$$

$$\|\hat F_\psi\|_{H^k} \leq Ce^{-a\tau}\hat H_k^{1/2} \tag{128}$$

on $I$, where $\hat F_0$, …, $\hat F_\psi$ are defined in (121)–(124) respectively, $K = K_0 + \tau_0$, $\hat H_k$ is defined in (102) and the constants depend on $n$, $p$, $k$ and $c_1$.

Proof. Except for the terms

$$2pe^{\tau+\tau_0}g^{lm}\Gamma_{lim}, \qquad -[-2pu + 2\lambda p(np-1)\psi]h_{ij}, \tag{129}$$

the conclusions are immediate consequences of (113)–(116), (104)–(108), the definition of $\hat H_k$ and the fact that $\epsilon \leq 1$. In order to deal with the first expression appearing in (129), note that we can apply the algorithm, cf. Subsect. 9.1 of [20], with $l = 1$, $l_\partial = 1$ and $l_h = 1$ in order to obtain

$$\|2pe^{\tau+\tau_0}g^{lm}\Gamma_{lim}\|_{H^k} \leq Ce^{\tau+\tau_0}\omega e^{p\tau+K-a\tau}\hat H_k^{1/2} = Cpe^{p\tau+K-a\tau}\hat H_k^{1/2},$$

which is an estimate of the desired form. In order to deal with the second expression appearing in (129), we can also apply the algorithm with $l = 1$, $l_h = 2$ and $l_\partial = 0$, though in order for this to fit with the conventions of [20], we have to rewrite $h_{ij}$ as $e^{-2p\tau}g_{ij}$. We obtain

$$\|[-2pu + 2\lambda p(np-1)\psi]h_{ij}\|_{H^k} \leq Ce^{-2p\tau}e^{2p\tau+2K-a\tau}\hat H_k^{1/2},$$

and the lemma follows.

Lemma 13. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \leq 1$ and $\tau_0$ be real numbers and $n \geq 3$ and $k_0 > n/2 + 1$ be integers. Assume that $(g, \psi)$ satisfy the main bootstrap assumption on an interval $I$. Assuming (61)–(64) are satisfied (where $h_{ij}$, $u_i$ and $u$ are defined in terms of $g$ according to (57)–(59)), we conclude that, for $0 < |\alpha| \leq k$,

$$\|[\hat\Box_g, \partial^\alpha]u\|_2 \leq C\epsilon e^{-2a\tau}\hat H_k^{1/2}, \tag{130}$$

$$\|[\hat\Box_g, \partial^\alpha]u_m\|_2 \leq C\epsilon e^{p\tau-2a\tau+K}\hat H_k^{1/2}, \tag{131}$$

$$\|[\hat\Box_g, \partial^\alpha]h_{ij}\|_2 \leq C\epsilon e^{-2a\tau+2K}\hat H_k^{1/2}, \tag{132}$$

$$\|[\hat\Box_g, \partial^\alpha]\psi\|_2 \leq C\epsilon e^{-2a\tau}\hat H_k^{1/2} \tag{133}$$

on $I$. Here, $K = K_0 + \tau_0$, $\hat H_k$ is defined in (102) and the constants depend on $n$, $p$, $k$, $c_1$ and an upper bound on $e^{-K_0}$.

Remark 14. Note that $a \leq H$ due to (101).


Proof. The result follows from Lemma 13 in [20]. However, in order to be able to see that, we need to translate the terminology of [20] to the current setting. In [20], the notation $\hat\Box_g$ occurs, but this object does not coincide with the $\hat\Box_g$ used in the current paper. Let us denote the object $\hat\Box_g$ that occurs in [20] by $\hat\Box_g^{\mathrm{old}}$ to distinguish it from the object considered in the present paper. The relation between the two is then given by

$$\hat\Box_g = t^2\hat\Box_g^{\mathrm{old}} - g^{00}\partial_\tau.$$

This can be restated as follows:

$$\omega^{-2}\hat\Box_g^{\mathrm{old}} = p^{-2}(\hat\Box_g + g^{00}\partial_\tau).$$

Expressing the statement of Lemma 13 in [20] in terms of the current terminology, we conclude that if, for some smooth $v$ on $I \times \mathbb{T}^n$,

$$p^{-1}\|\partial_\tau v\|_{H^k} + p^{-1}e^{-H\tau-K_0}\|\partial_i v\|_{H^k} + p^{-2}\|\hat\Box_g v + g^{00}\partial_\tau v\|_{H^k} \leq Ce^{l_h(p\tau+K)-a\tau}\hat H_k^{1/2},$$

for some $k > n/2 + 1$, then, for $0 < |\alpha| \leq k$,

$$\|[\hat\Box_g + g^{00}\partial_\tau, \partial^\alpha]v\|_2 \leq C\epsilon e^{l_h(p\tau+K)-2a\tau}\hat H_k^{1/2},$$

where the constant depends on

$$\sup_{t \in I}\omega^{-1}e^{-\Omega-K+r}, \tag{134}$$

which is assumed to be finite. Note that $\omega^{-1}e^{-\Omega-K+r} = p^{-1}e^{-H\tau+a\tau-K_0}$. In order to be allowed to use Lemma 13 of [20], we thus need to have $0 < a \leq p - 1$, which is ensured by (101). Furthermore, the constant depends on $e^{-K_0}$. Let us reformulate the assumptions and the conclusions. Note that if we assume

$$\|\partial_\tau v\|_{H^k} \leq Ce^{l_h(p\tau+K)-a\tau}\hat H_k^{1/2}, \tag{135}$$

then

$$\|g^{00}\partial_\tau v\|_{H^k} \leq \|\partial_\tau v\|_{H^k} + \|(g^{00}+1)\partial_\tau v\|_{H^k} \leq C[(1 + \|1 + g^{00}\|_\infty)\|\partial_\tau v\|_{H^k} + \|\partial_\tau v\|_\infty\|1 + g^{00}\|_{H^k}] \leq Ce^{l_h(p\tau+K)-a\tau}\hat H_k^{1/2}$$

due to the bootstrap assumptions, the algorithm applied to $g^{00} + 1$, Sobolev embedding and the fact that $\epsilon \leq 1$. As a conclusion we might as well replace $\hat\Box_g v + g^{00}\partial_\tau v$ with $\hat\Box_g v$ in the assumptions. Concerning the conclusions, note that for $|\alpha| \leq k$,

$$\|[g^{00}\partial_\tau, \partial^\alpha]v\|_2 \leq C\sum_{i=1}^n(\|\partial_i g^{00}\|_\infty\|\partial_\tau v\|_{H^{k-1}} + \|\partial_i g^{00}\|_{H^{k-1}}\|\partial_\tau v\|_\infty) \leq C\epsilon e^{l_h(p\tau+K)-2a\tau}\hat H_k^{1/2},$$

where we have used (135), (109), the bootstrap assumptions and Sobolev embedding. Thus we might as well replace $\hat\Box_g v + g^{00}\partial_\tau v$ with $\hat\Box_g v$ in the conclusions. In order to obtain the desired conclusion, all we need to do is to combine the above result with the estimates (104)–(108) and (125)–(128), with one exception. In the case of $h_{ij}$, $l_h(\Omega + K)$ should be replaced by $2K$. The argument goes through all the same if we simply let $l_h = 0$ in that case and apply the result to $v = e^{-2K}h_{ij}$.


6.3. Estimates for the remainder terms in the energy estimates. In preparation for the final estimate, let us note that the following estimates hold.

Lemma 14. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \le 1$ and $\tau_0$ be real numbers and $n \ge 3$ and $k_0 > n/2+1$ be integers. Assume that $(g,\psi)$ satisfy the main bootstrap assumption on an interval $I$. Then, on $I$,

$$\Big\|\tfrac{1}{2}\partial_\tau\hat g^{ij} + H\hat g^{ij}\Big\|_\infty \le C\epsilon\, e^{-2H\tau-2K_0}e^{-a\tau}, \qquad \|\partial_\tau g^{00}\|_\infty \le C\epsilon\, e^{-a\tau}.$$

Proof. Note that $\hat g^{ij} = e^{2(\tau+\tau_0)}g^{ij} = t^2 g^{ij}$, so that (recall that $H = p-1$ and that $\omega = p/t$),

$$t\partial_t\hat g^{ij} = -2H\hat g^{ij} - t^3 g^{ik}(g^{jl}\partial_t g_{kl} - 2\omega\delta^j_k) - t^3 g^{i0}g^{j0}\partial_t g_{00} - t^3 g^{ik}g^{j0}\partial_t g_{0k} - t^3 g^{i0}g^{jk}\partial_t g_{0k}.$$

Moving $-2H\hat g^{ij}$ over to the left-hand side, the objects that remain on the right-hand side can be estimated using the algorithm; e.g.

$$\|g^{ik}(g^{jl}\partial_t g_{kl} - 2\omega\delta^j_k)\|_{H^k} \le C\omega\, e^{-2(p\tau+K)-a\tau}\hat H_k^{1/2},$$

since l = 1, lh = −2 and l∂ = 1 in this case. Since t∂t = ∂τ , we get the desired conclusion using the bootstrap assumptions and Sobolev embedding. The second estimate follows by a similar argument. Finally, we need the following estimates. Lemma 15. Let p > 1, c1 > 1, K 0 , 0 < ≤ 1 and τ0 be real numbers and n ≥ 3 and k0 > n/2 + 1 be integers. Assume that (g, ψ) satisfy the main bootstrap assumption on an interval I . Then ˆ 1 ≤ Ce−aτ Hlp,k , E [∂ α u] α

−aτ

α

−aτ

E,γs ,δs [∂ u m ]1 ≤ Ce E,γm ,δm [∂ h i j ]1 ≤ Ce

(136)

Hs,k ,

(137)

Hm,k

(138)

on I for |α| ≤ k, where E and E,γ ,δ are defined in (85) and (89) respectively. The constants depend on n, p, k, c1 and an upper bound for e−K 0 . Proof. Due to the algorithm, (112) and Sobolev embedding, g 00 + 1∞ ≤ Ce−aτ . Furthermore, using (110) and (111), we conclude that e2H τ +2K 0 ∂i gˆ lm ∞ + e H τ +K 0 +aτ ∂i gˆ 0m ∞ + e H τ +K 0 +aτ gˆ 0m ∞ ≤ C for all i, l, m, due to Sobolev embedding, the fact that k0 > n/2 + 1 and the fact that the bootstrap assumptions hold. Recall that gˆ 0i and gˆ i j were defined in Lemma 6. Due to these estimates, (90)–(92) and the estimates of Lemma 14, we conclude that ˆ 1 ≤ Ce−aτ E[u], ˆ E [u] where E was defined in (82) and the constant depends on an upper bound of e−K 0 . Note that in order to obtain this conclusion, we used the fact that a ≤ H , cf. (101). This proves (136). The other estimates follow in a similar fashion, keeping in mind that γm = δm = 0.


7. Differential Inequalities

Finally, we are in a position to derive the differential inequalities that will be the core of the proof of global existence.

Lemma 16. Let $p > 1$, $c_1 > 1$, $K_0$, $0 < \epsilon \le 1$ and $\tau_0$ be real numbers and $n \ge 3$ and $k_0 > n/2+1$ be integers. Assume that $(g,\psi)$ satisfy the main bootstrap assumption on an interval $I$. Assuming (61)–(64) are satisfied (where $h_{ij}$, $u_i$ and $u$ are defined in terms of $g$ according to (57)–(59)), we conclude that

$$\frac{d\hat H_{lp,k}}{d\tau} \le -2a\hat H_{lp,k} + C\epsilon\, e^{-a\tau}\hat H_{lp,k}^{1/2}\hat H_k^{1/2}, \qquad (139)$$

$$\frac{d\hat H_{s,k}}{d\tau} \le -2a\hat H_{s,k} + C\hat H_{s,k}^{1/2}\big(\hat H_{lp,k}^{1/2} + \hat H_{m,k}^{1/2}\big) + C\epsilon\, e^{-a\tau}\hat H_{s,k}^{1/2}\hat H_k^{1/2}, \qquad (140)$$

$$\frac{d\hat H_{m,k}}{d\tau} \le C\epsilon\, e^{-a\tau}\hat H_{m,k} + C\hat H_{lp,k_0}^{1/2}\hat H_{m,k} + C\hat H_{lp,k}^{1/2}\hat H_{m,k}^{1/2} + C\epsilon\, e^{-a\tau}\hat H_k^{1/2}\hat H_{m,k}^{1/2} \qquad (141)$$

on I , where the constants depend on n, p, k, c1 and an upper bound on e−K 0 . Proof. Recall that Hlp,k is defined by (97). Due to Corollary 1, where

−1 00 F = −T ψ and T is defined in (76), we obtain d Hlp,k 1/2 1/2 ≤ −2ηlp Hlp,k + Ce−2aτ Hlp,k Hˆ k + Ce−aτ Hlp,k , dτ where we have used (113), (116), (130), (133) and (136). Given the definition of Hˆ lp,k and (101), we conclude that (139) holds. Let us turn to Hˆ s,k . Consider (62). This is an equation for u i of the form considered in Corollary 2 if we let 4eτ +τ0 ∂i ψ − 0i . λ Due to Corollary 2, (114), (131) and (137), we have d Hs,k 1/2 ≤ −2ηs Hs,k + C Hs,k eτ +τ0 (glm Γlim H k + ∂i ψ H k ) dτ Fi = 2 peτ +τ0 glm Γlim +

+ Ce

pτ −2aτ +K

i 1/2 ˆ 1/2 Hs,k Hk

+ Ce−aτ Hs,k .

By (104), eτ +τ0 ∂i ψ H k ≤ Ce pτ −aτ +K Hˆ lp,k . 1/2

When estimating glm Γlim in H k , it is convenient to divide the terms that appear into two different categories. Due to (107) 1/2 glm ∂ α ∂ j grq 2 ≤ Ce pτ −aτ +K Hˆ m,k . eτ +τ0 |α|≤k


The second category consists of terms of the form eτ +τ0 ∂ α1 ∂ j glm ∂ α2 ∂i grq 2 ≤ Ceτ +τ0 [∂ j glm ∞ ∂i grq H k−1 + ∂ j glm H k−1 ∂i grq ∞ ] ≤ Ceτ +τ0 Hˆ k , 1/2

where |α1 | + |α2 | ≤ k − 1 and we have used (108), (110) and the fact that k0 > n/2 + 1. Due to these observations, the definition of Hˆ s,k and (101), we obtain the conclusion that (140) holds with a constant depending on an upper bound on e−K 0 . Finally, consider Hm,k defined by (99). Due to Lemma 7, we have, cf. Corollary 2,

d α −2aτ α 2 Eγm ,δm [∂ h i j ] + e aα (∂ h i j ) d x dτ Tn ˆ g , ∂ α ]h i j )d x (∂τ ∂ α h i j + γm ∂ α h i j )(∂ α Fi j + [ ≤ −2ηm Eγm ,δm [∂ α h i j ] + n T α + E,γm ,δm [∂ h i j ]d x − 2a e−2aτ aα (∂ α h i j )2 d x n n T T +2 e−2aτ aα ∂ α h i j ∂τ ∂ α h i j d x, Tn

where Fi j = −[−2 pu + 2λp(np − 1)ψ]h i j − i j . Due to (101), the fact that γm = δm = 0, (115), (132) and (138), we obtain d Hm,k 1/2 ≤ −2a Hm,k + Ce−aτ Hm,k + C [uh i j H k + ψh i j H k ]Hm,k dτ i, j

+ Ce

−2aτ +2K

1/2 1/2 Hˆ k Hm,k + Ce−aτ Hm,k .

When estimating uh i j in H k it is useful to divide the terms into two different categories. Let us first consider 1/2 h i j ∂ α u2 ≤ Ce2K Hlp,k . |α|≤k i, j

The second category consists of terms of the form 1/2 ∂ α1 ∂q h i j ∂ α2 u2 ≤ C[∂q h i j ∞ u H k + u∞ eaτ Hm,k ]. |α1 |+|α2 |≤k−1

Since we assume that k0 > n/2 + 1, the bootstrap assumptions imply that ∂q h i j ∞ ≤ Ce2K , cf. (108). Consequently, uh i j H k ≤ C[ Hˆ lp,k0 Hm,k + e2K Hlp,k ]. 1/2

1/2

1/2

We have a similar estimate for ψh i j H k and consequently we obtain (141).


8. Global Existence We are now in a position to prove that solutions corresponding to small initial data for (61)–(64) do not become unbounded in finite time. Before we do so, we do, however, need to relate initial data for (24)–(25) to initial data for (61)–(64). A complication arises due to the fact that the background solution we are subtracting has an explicit time dependence. Consequently, we need to determine the starting time based on the data we have. Let (, ς, Φa , Φb ) be given on Tn , where is a smooth Riemannian metric, ς is a smooth symmetric covariant 2-tensor and Φa , Φb are smooth functions. Since we wish Φa to be close to the background solution, we shall in the end demand that its spatial variation be small. A natural condition to determine the initial time, t0 , is thus 2 1

Φa = ln t0 − c0 , λ λ where · denotes the mean value over Tn , i.e. 1

Φa = Φa d x. (2π )n Tn As a consequence, we make the following definition. Definition 6. Let n ≥ 3 be an integer and let p > 1. Let V (φ) be given by (4), where V0 > 0 and λ is given by (11). Let (, ς, Φa , Φb ) be given on Tn , where is a smooth Riemannian metric, ς is a smooth symmetric covariant 2-tensor and Φa , Φb are smooth functions. Define the initial time associated with (, ς, Φa , Φb ) to be 1 t0 = exp (λ Φa + c0 ) , (142) 2 where c0 is defined in (12), and define the initial data for (61)–(64) associated with (, ς, Φa , Φb ) to be u(0, ·) = 0, (143) (∂τ u)(0, ·) = 2np − 2t0 trς, (144) u i (0, ·) = 0, (145) 1 ij (146) (∂τ u l )(0, ·) = t0 (2∂i jl − ∂l i j ), 2 h i j (0, ·) = i j , (147) (∂τ h i j )(0, ·) = 2t0 ςi j − 2 pi j , (148) ψ(0, ·) = Φa − Φa , (149) 2 (150) (∂τ ψ)(0, ·) = t0 Φb − , λ where all the indices are with respect to the standard frame {∂i } of the tangent space on Tn . Lemma 17. Let n ≥ 3 be an integer and let p > 1. Let V (φ) be given by (4), where V0 > 0 and λ is given by (11). Let (, ς, Φa , Φb ) be given on Tn , where is a smooth Riemannian metric, ς is a smooth symmetric covariant 2-tensor and Φa , Φb are smooth functions. Then (, ς, Φa , Φb ) determine initial data for (24)–(25) according to (28)– (33). Choosing t0 to be the initial time associated with (, ς, Φa , Φb ), the initial data (28)–(33) for (24)–(25) transform to the initial data (143)–(150) for (61)–(64) under the transformation (57)–(60).
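The choice of initial time in (142) can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional grid as a stand-in for the torus; the values of `lam` and `c0` are hypothetical placeholders for the constants fixed by (11) and (12), and the sampled function `phi_a` is invented purely for illustration.

```python
import math

def torus_mean(samples):
    """Approximate the mean value over the torus by averaging equally spaced samples."""
    return sum(samples) / len(samples)

def initial_time(phi_a_samples, lam, c0):
    """Initial time t0 = exp((lam * <Phi_a> + c0) / 2), following (142)."""
    return math.exp(0.5 * (lam * torus_mean(phi_a_samples) + c0))

# Hypothetical inputs: lam and c0 stand in for the constants of (11) and (12).
lam, c0 = 0.8, 0.3
N = 64
phi_a = [1.5 + 0.01 * math.cos(2.0 * math.pi * j / N) for j in range(N)]

t0 = initial_time(phi_a, lam, c0)
# The defining condition <Phi_a> = (2/lam) ln t0 - c0/lam should hold up to rounding.
residual = torus_mean(phi_a) - (2.0 / lam * math.log(t0) - c0 / lam)
print(t0, residual)
```

Only the mean value of the sampled function enters, so the small spatial oscillation does not affect `t0`.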


Proof. The lemma follows by straightforward computations. Note, however, that in the current setting Fl (t0 , ·) = 0 and F0 (t0 , ·) = nωg00 (t0 , ·) = −np/t0 . Furthermore, φ0 (t0 ) = Φa by definition. In what follows, we shall use the notation K = ln[4(t0 )],

(151)

where (t0 ) is defined in (13). Note that, using the convention K = τ0 + K 0 , where τ0 = ln t0 , we have K 0 = ln

4 . p−1

In other words, K 0 only depends on p, so that e K 0 and e−K 0 can be treated as constants of which we need not keep track. Theorem 4. Let n ≥ 3 be an integer and let p > 1. Let V (φ) be given by (4), where V0 > 0 and λ is given by (11). Let (, ς, Φa , Φb ) be given on Tn , where is a smooth Riemannian metric, ς is a smooth symmetric covariant 2-tensor and Φa , Φb are smooth functions. Define initial data for (61)–(64) according to (143)–(150) where τ0 = ln t0 and t0 is given by (142). Assume there is a constant c1 > 2 such that 2 2 c1 |v| ≤ e−2K h i j (0, x)v i v j ≤ |v|2 c1 2

(152)

for all $v \in \mathbb{R}^n$ and $x \in \mathbb{T}^n$, where $K$ is given by (151). Let $k_0 > n/2+1$ and $a$ be given by (101). There is an $\epsilon_0 > 0$ and a $c_b \in (0,1)$, where $\epsilon_0$ and $c_b$ should be small enough, depending on $n$, $k_0$, $p$ and $c_1$ such that if

$$\hat H_{k_0}^{1/2}(0) \le c_b\epsilon, \qquad (153)$$

for some $\epsilon \le \epsilon_0$, then the solution to (61)–(64) exists for all future times and (90)–(92) (where $\eta$ is given by (100)) and

$$\hat H_{k_0}^{1/2}(\tau) \le \epsilon \qquad (154)$$

are satisfied for all $\tau \ge 0$.

Remark 15. Note that $a$ does not appear in $\hat H_{k_0}(0)$.

Proof. Note that $p > 1$, $c_1 > 1$, $K_0 = K - \tau_0$, $\tau_0$, $n \ge 3$ and $k_0 > n/2+1$ have already been specified. Let $0 < \epsilon \le 1$ and let $A$ denote the set of $s \in [0,\infty)$ such that (in the conditions below, we abuse notation by consistently using $\tau$-time, cf. the remark following Lemma 4)

– $(g,\psi)$ satisfy the main bootstrap assumption on $I = [0,s)$.
– $(g,\psi)$ constitute a smooth solution to (61)–(64) on $I \times \mathbb{T}^n$ with initial data as specified in (143)–(150) (where $h_{ij}$, $u_i$ and $u$ are defined in terms of $g$ according to (57)–(59) and $\psi$ is related to $\phi$ according to (60)).


Note that if $s \in A$, then the conditions necessary for deriving the different inequalities above are satisfied on $[0,s)$. Note that (61)–(64) are equivalent to (24)–(25) and that initial data specified by (143)–(150) correspond to initial data defined by (28)–(33). Since Proposition 1 of [20] applies to Eqs. (24)–(25) with initial data given by (28)–(33), we obtain a unique smooth solution to (61)–(64) on some time interval $(T_{\min}, T_{\max})$. Assume $c_b \le 1/2$. Then (154) is satisfied with a margin for $\tau = 0$, so that it is satisfied on an open time interval containing $0$. Since (152) holds and since $u(0,\cdot) = 0$ and $u_i(0,\cdot) = 0$, we conclude that (90)–(92) are satisfied on an open interval containing $0$. In particular, there is a $T > 0$ such that $T \in A$.

Assume $0 < T < \infty$ is such that $T \in A$. Due to the bootstrap assumptions and the equations, we conclude that $u$, $u_i$, $h_{ij}$, and $\phi$ do not blow up in $C^2$. Furthermore, $g_{00}$ and the smallest eigenvalue of $\{h_{ij}\}$ stay bounded away from zero due to (90) and (91). Due to Proposition 1 of [20], we conclude that $T < T_{\max}$. As a consequence, we have a smooth solution beyond $T$, and the bootstrap assumptions (90)–(92) together with (154) hold on $[0,T]$. The above arguments lead to the conclusion that $A$ is closed (note that it is connected by definition). All that remains to be proved is that $A$ is open. This would yield the conclusion that $A = [0,\infty)$.

Let $T \in A$. That there exists a solution beyond $T$ is clear from the above. We need to prove that we can improve the bootstrap assumptions in $[0,T)$. Due to (154) and Sobolev embedding, we obtain, cf. (107), $e^{a\tau-2K}\|\partial_\tau h_{ij}\|_\infty \le C\epsilon$. Consequently,

$$\|e^{-2p\tau-2K}g_{ij}(\tau,\cdot) - e^{-2K}g_{ij}(0,\cdot)\|_\infty \le C\epsilon a^{-1} \qquad (155)$$

for all $\tau \in [0,T)$. By assuming $\epsilon$ to be small enough, we obtain an improvement of (90). By assuming $\epsilon$ to be small enough, we also obtain improvements of (91) and (92), due to the definition of the energies, (105), (106) and Sobolev embedding. Finally, we need to improve (154). Note that in $[0,T)$, the conditions of Lemma 16 are satisfied so that (139)–(141) hold in this interval. Note also that $e^{-K_0}$ only depends on $p$. Thus, in $[0,T)$, we have

$$\frac{d\hat H_{lp,k_0}}{d\tau} \le -2a\hat H_{lp,k_0} + C\epsilon^2 e^{-a\tau}\hat H_{lp,k_0}^{1/2}.$$

This inequality implies

$$\hat H_{lp,k_0}^{1/2}(\tau) \le e^{-a\tau}\hat H_{lp,k_0}^{1/2}(0) + \tfrac{1}{2}C\epsilon^2\tau e^{-a\tau}$$

for all $\tau \in [0,T)$. We obtain

$$\hat H_{lp,k_0}^{1/2}(\tau) \le C_{lp}(c_b\epsilon + \epsilon^2)e^{-a\tau/2}.$$
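The integration step above can be checked numerically. Writing $y = \hat H_{lp,k_0}^{1/2}$, the differential inequality becomes $y' \le -ay + \tfrac12 C\epsilon^2 e^{-a\tau}$ wherever $y > 0$, so the extremal case is the linear equation obtained by replacing the inequality with equality. The following is a minimal sketch under that assumption; the numerical values of `a`, `C`, `eps` and `cb` are hypothetical, and the choice `C_lp = max(1, C/(a e))` is one admissible constant (it uses $\tau e^{-a\tau/2} \le 2/(ae)$), not necessarily the one intended in the text.

```python
import math

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Hypothetical parameter values, chosen only for illustration.
a, C, eps, cb = 0.5, 3.0, 0.1, 0.5
y0 = cb * eps                          # initial value of y = H^{1/2}
C_lp = max(1.0, C / (a * math.e))      # one admissible choice of the constant

def rhs(t, y):
    # Extremal (equality) case of y' <= -a y + (C eps^2 / 2) e^{-a t}.
    return -a * y + 0.5 * C * eps**2 * math.exp(-a * t)

h, steps = 0.01, 4000
t, y = 0.0, y0
max_err = 0.0
below_envelope = True
for _ in range(steps):
    y = rk4_step(rhs, t, y, h)
    t += h
    closed_form = math.exp(-a * t) * (y0 + 0.5 * C * eps**2 * t)
    envelope = C_lp * (cb * eps + eps**2) * math.exp(-a * t / 2)
    max_err = max(max_err, abs(y - closed_form))
    below_envelope = below_envelope and (y <= envelope)

print(max_err, below_envelope)
```

The closed form reproduces the intermediate bound, and the envelope corresponds to the final decay rate $e^{-a\tau/2}$, where half of the exponential decay is spent absorbing the factor $\tau$.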

In order to get an estimate for $\hat H_{m,k_0}$, let us define

$$f = \exp\Big[\frac{C\epsilon}{a}\big(e^{-a\tau} - 1\big)\Big], \qquad (156)$$


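where $C$ is the first constant appearing on the right-hand side of (141) for $k = k_0$. Note that $\exp(-C\epsilon/a) \le f \le 1$ for all $\tau \in [0,T)$. Furthermore, since we can assume that $\hat H_{m,k_0}^{1/2} \le 1$, we can estimate $\hat H_{m,k_0} \le \hat H_{m,k_0}^{1/2}$. If we let $\tilde H_{m,k} = f\hat H_{m,k}$, and use (156), then (141) yields

$$\frac{d\tilde H_{m,k_0}}{d\tau} \le \big[Cc_b\epsilon\, e^{-a\tau/2} + C\epsilon^2 e^{-a\tau/2}\big]f^{1/2}\tilde H_{m,k_0}^{1/2} \le Cc_b\epsilon^2 e^{-a\tau/2} + C\epsilon^3 e^{-a\tau/2},$$

so that

$$\hat H_{m,k_0}(\tau) \le e^{C\epsilon/a}\hat H_{m,k_0}(0) + Ca^{-1}e^{C\epsilon/a}\big[\epsilon^3 + c_b\epsilon^2\big].$$

We obtain

$$\hat H_{m,k_0}(\tau) \le C_m(c_b\epsilon^2 + \epsilon^3), \qquad (157)$$

assuming $c_b \le 1$. Consider (140). We have

$$\frac{d\hat H_{s,k_0}}{d\tau} \le -2a\hat H_{s,k_0} + C_s\big(c_b^{1/2}\epsilon + \epsilon^{3/2}\big)\hat H_{s,k_0}^{1/2}.$$

We see that the right-hand side is negative if

$$2a\hat H_{s,k_0}^{1/2} > C_s\big(c_b^{1/2}\epsilon + \epsilon^{3/2}\big).$$

By assuming $c_b$ and $\epsilon$ to be small enough, depending only on $C_{lp}$, $C_m$ and $C_s$, we conclude that $\hat H_k^{1/2} \le \tfrac{1}{3}\epsilon$ holds in $[0,T)$. Consequently, $A$ is open and the theorem follows.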
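The threshold argument used for $\hat H_{s,k_0}$ can be illustrated as follows: if $dH/d\tau \le -2aH + bH^{1/2}$, the right-hand side is negative whenever $H^{1/2} > b/(2a)$, so $H^{1/2}$ can never exceed the larger of its initial value and $b/(2a)$. The sketch below integrates the extremal (equality) case; `a`, `b` and the initial values are hypothetical, with `b` standing in for the coefficient of $\hat H_{s,k_0}^{1/2}$.

```python
import math

def max_sqrt_energy(H0, a, b, h=0.01, steps=5000):
    """Integrate dH/dtau = -2 a H + b sqrt(H) with RK4 and return the largest sqrt(H) seen."""
    def rhs(H):
        return -2.0 * a * H + b * math.sqrt(max(H, 0.0))

    H = H0
    peak = math.sqrt(H0)
    for _ in range(steps):
        k1 = rhs(H)
        k2 = rhs(H + h * k1 / 2)
        k3 = rhs(H + h * k2 / 2)
        k4 = rhs(H + h * k3)
        H += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        peak = max(peak, math.sqrt(max(H, 0.0)))
    return peak

# Hypothetical values: b plays the role of the coefficient of H^{1/2}.
a, b = 0.5, 0.02
threshold = b / (2 * a)

peak_small = max_sqrt_energy(0.005**2, a, b)  # starts below the threshold
peak_large = max_sqrt_energy(0.05**2, a, b)   # starts above the threshold
print(threshold, peak_small, peak_large)
```

Data starting below the threshold approach it from below, while data starting above it decay monotonically, which is the mechanism that closes the bootstrap for this energy.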

Theorem 5. Consider a solution to (61)–(64) corresponding to smooth initial data satisfying the conditions of Theorem 4, with $k_0$ given by the smallest integer strictly larger than $n/2+1$. Then, for every $k$, there is a constant $C_k$ such that

$$\hat H_k^{1/2}(\tau) \le C_k \qquad (158)$$

for all $\tau \ge 0$.

Proof. Since the conditions required for deriving the differential inequalities are satisfied for the entire future, we have (139)–(141) for all $k$ and all $\tau \ge 0$. Let us define

$$\tilde H_{s,k} = e^{-a\tau/2}\hat H_{s,k}, \qquad \tilde H_{lp,k} = e^{a\tau/2}\hat H_{lp,k}.$$

Then

$$\frac{d\tilde H_{s,k}}{d\tau} \le -2a\tilde H_{s,k} + Ce^{-a\tau/4}\big(\hat H_{lp,k}^{1/2} + \hat H_{m,k}^{1/2}\big)\tilde H_{s,k}^{1/2} + C\epsilon\, e^{-5a\tau/4}\hat H_k^{1/2}\tilde H_{s,k}^{1/2},$$

$$\frac{d\tilde H_{lp,k}}{d\tau} \le -a\tilde H_{lp,k} + C\epsilon\, e^{-3a\tau/4}\tilde H_{lp,k}^{1/2}\hat H_k^{1/2}.$$


Due to these inequalities and (141), we obtain dHk 1/2 ≤ Ce−aτ/4 Hk + C Hˆ lp,k0 Hˆ m,k , dτ

(159)

where Hk = H˜ lp,k + H˜ s,k + Hˆ m,k . 1/2 Due to the fact that Hˆ m,k0 is bounded for all τ ≥ 0 and the fact that (159) holds, we conclude that

dHk0 1/2 ≤ Ce−aτ/4 Hk0 . dτ 1/2 Thus Hk0 is bounded. Consequently, Hˆ lp,k0 ≤ Ce−aτ/4 , which, in combination with (159), yields

dHk ≤ Ce−aτ/4 Hk . dτ Consequently, Hk is bounded for all k. This leads to the conclusion that Hˆ lp,k and Hˆ m,k are both bounded. If we insert this information into (140), we get d Hˆ s,k 1/2 ≤ −2a Hˆ s,k + Ce−aτ Hˆ s,k + C Hˆ s,k . dτ By assuming τ to be great enough, the second term on the right-hand side can be absorbed in the first. The inequality that results immediately implies that Hˆ s,k is bounded, since it implies that Hˆ s,k decays as soon as it exceeds a certain value. The theorem follows. 9. Causal Structure Recall the outline of the proof of Theorem 2 given in the beginning of Subsect. 1.6. In the course of the proof of this theorem, we are interested in the future Cauchy development of a subset of the initial data on Tn on which the constraint equations are satisfied. The purpose of Proposition 1 below is to yield quantitative control of this set, which we referred to as the global patch in the outline. In Proposition 2, we then prove future causal geodesic completeness. Proposition 1. Consider a Lorentz manifold of the type constructed in Theorem 4. Let γ be a future directed causal curve with domain [s0 , smax ) such that γ 0 (s0 ) = t0 , where t0 is as in Theorem 4. If the appearing in the assumptions of Theorem 4 is small enough (depending only on n, p and c1 ), then γ˙ 0 > 0 and the length of the spatial part of the curve with respect to the metric at t = t0 satisfies smax [gi j (t0 , γ )γ˙ i γ˙ j ]1/2 ds ≤ d()(t0 ), (160) s0

where d() is independent of γ , d() → 1 as → 0, (t0 ) is defined in (13) and γ = π ◦ γ , where π : [t0 , ∞) × Tn → Tn is given by π(t, x) = x. Finally, if γ is future inextendible, then γ 0 (s) → ∞ as s → smax .


Remark 16. The time orientation is assumed to be such that $\partial_t$ is future directed and $\dot\gamma^\mu$ is defined by the condition that $\dot\gamma^\mu\partial_\mu = \dot\gamma$, where $\partial_\mu$ is the standard frame for the tangent space of $\mathbb{R}_+ \times \mathbb{T}^n$. The statement $d(\epsilon) \to 1$ as $\epsilon \to 0$ can be improved to the statement: for any $\delta > 0$, there is an $\epsilon_1$ depending only on $n$, $p$, $c_1$ and $\delta$ such that if $\epsilon \le \epsilon_1$, then $|d(\epsilon) - 1| \le \delta$.

Proof. Due to causality, we have

$$g_{\mu\nu}\dot\gamma^\mu\dot\gamma^\nu \le 0. \qquad (161)$$

The condition that the curve be future directed is equivalent to

$$g_{00}\dot\gamma^0 + g_{0i}\dot\gamma^i < 0. \qquad (162)$$

Let us work out the consequences of this. Due to (92), we have |2g0i γ˙ 0 γ˙ i | ≤ η1/2 |γ˙ 0 |2 + η−1/2 |g0i γ˙ i |2 ≤ η1/2 |γ˙ 0 |2 + η1/2 c1−1 e2Ω+2K −2r δi j γ˙ i γ˙ j . Note that the t appearing in e.g. Ω is given by γ 0 (s). Note, furthermore, that, due to (105), (106) and (154), we can replace η in (91) and (92) by C, where C only depends on n, p and c1 . Since the last term can be bounded by η1/2 gi j γ˙ i γ˙ j , due to (90), we obtain gi j γ˙ i γ˙ j ≤ c(η)γ˙ 0 γ˙ 0 ,

(163)

where c(η) → 1 as η → 0+ and we have used (91) and (161). Due to (90), we conclude that δi j γ˙ i γ˙ j ≤ c1 c(η)e−2Ω−2K γ˙ 0 γ˙ 0 = c1 c(η)(t/t0 )−2 p e−2K γ˙ 0 γ˙ 0 .

(164)

Note that (155) can be rewritten (t/t0 )−2 p e−2K gi j (t, ·) − e−2K gi j (t0 , ·)∞ ≤ Ca −1 , where C only depends on n, p and c1 . Combining this observation with (164), we obtain |e−2K gi j (t0 , γ )γ˙ i γ˙ j − (t/t0 )−2 p e−2K gi j γ˙ i γ˙ j | ≤ Ca −1 c1 c(η)(t/t0 )−2 p e−2K γ˙ 0 γ˙ 0 . This observation, together with (163), yields e−2K gi j (t0 , γ )γ˙ i γ˙ j ≤ d 2 ()(t/t0 )−2 p e−2K γ˙ 0 γ˙ 0 ,

(165)

where d() → 1 as → 0+ (note that η → 0+ as → 0+). Consider (162). Note that |g0i γ˙ i | ≤ [e−2Ω−2K δ i j g0i g0 j ]1/2 [e2Ω+2K δi j γ˙ i γ˙ j ]1/2 ≤ ξ()|γ˙ 0 |, where ξ() → 0 as → 0+, due to (92) and (164). Assuming to be small enough (depending only on n, p and c1 ), we conclude that γ˙ 0 > 0, which yields the first conclusion of the proposition. Combining this observation with (165), we obtain (160). Finally, let γ be future inextendible and assume γ 0 does not tend to ∞. Since γ˙ 0 > 0, γ 0 has to converge to a finite number and thus, since we have (164), γ has to converge to a point on Tn . We have a contradiction.


Proposition 2. Consider a spacetime of the type constructed in Theorem 4. Assuming the $\epsilon$ appearing in the assumptions of Theorem 4 to be small enough (depending only on $n$, $p$ and $c_1$), this spacetime is future causally geodesically complete.

Proof. Let $\gamma$ be a future directed causal geodesic and assume $(s_{\min}, s_{\max})$ to be the maximal existence interval. In other words, $\gamma$ is a map from $(s_{\min}, s_{\max})$ into the spacetime satisfying $\gamma'' = 0$, and $(s_{\min}, s_{\max})$ is the maximal existence interval of solutions to the corresponding equation. We shall use the notation $t = \gamma^0(s)$. Due to the equation for a geodesic, we have

$$\ddot\gamma^0 + \Gamma^0_{\mu\nu}\dot\gamma^\mu\dot\gamma^\nu = 0. \qquad (166)$$

Due to (154) and the algorithm, cf. Subsect. 9.1 of [20],

$$|\Gamma^0_{00}| \le C\epsilon\omega e^{-a\tau}, \qquad |\Gamma^0_{0i}| \le C\epsilon\omega e^{p\tau+K-a\tau}, \qquad |\Gamma^0_{ij} - \omega g_{ij}| \le C\epsilon\omega e^{2p\tau+2K-a\tau}.$$

Consequently, $\Gamma^0_{ij}\dot\gamma^i\dot\gamma^j \ge 0$ for $t$ large enough (or $\epsilon$ small enough). Due to these estimates and (164), we conclude that

$$|\Gamma^0_{00}\dot\gamma^0\dot\gamma^0| + 2|\Gamma^0_{0i}\dot\gamma^0\dot\gamma^i| \le C\epsilon\omega e^{-a\tau}|\dot\gamma^0|^2,$$

where $C$ only depends on $n$, $p$ and $c_1$. Combining these pieces of information with (166), we obtain

$$\ddot\gamma^0 \le C\epsilon\omega e^{-a\tau}\dot\gamma^0\dot\gamma^0 = C\epsilon p\, t^{-1}\Big(\frac{t}{t_0}\Big)^{-a}\dot\gamma^0\dot\gamma^0$$

for $s \ge s_1$. Since $\dot\gamma^0 > 0$, assuming $\epsilon$ to be small enough (depending only on $n$, $p$ and $c_1$), we can divide by $\dot\gamma^0$ in this equation and integrate in order to obtain (recall that $t = \gamma^0(s)$)

$$\ln\frac{\dot\gamma^0(s)}{\dot\gamma^0(s_1)} \le C\epsilon p\int_{s_1}^{s} t^{-1}\Big(\frac{t}{t_0}\Big)^{-a}\dot\gamma^0\,ds = C\epsilon p\int_{\gamma^0(s_1)}^{\gamma^0(s)} t^{-1}\Big(\frac{t}{t_0}\Big)^{-a}dt \le C\epsilon p\,a^{-1},$$

if we assume $s_1$ to be large enough that $\gamma^0(s_1) \ge t_0$. Thus $\dot\gamma^0$ is bounded to the future. Consequently,

$$\gamma^0(s) - \gamma^0(s_0) = \int_{s_0}^{s}\dot\gamma^0(s)\,ds \le C|s - s_0|.$$

Since $\gamma^0(s) \to \infty$ as $s \to s_{\max}$, we conclude that $s_{\max} = \infty$.
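The key quantitative input in the completeness argument is that $\int t^{-1}(t/t_0)^{-a}\,dt$ is bounded by $a^{-1}$ on any interval starting at some $t_1 \ge t_0$. A minimal numerical sketch of this bound follows; the values of `t0`, `t1`, `a` and the upper limit `T` are hypothetical, and the midpoint rule in the variable $s = \ln t$ is merely one convenient quadrature choice.

```python
import math

def integral_log_quadrature(t1, T, t0, a, n=20000):
    """Midpoint rule for the integral of t^{-1} (t/t0)^{-a} over [t1, T], using s = ln t."""
    s1, s2 = math.log(t1), math.log(T)
    h = (s2 - s1) / n
    total = 0.0
    for j in range(n):
        s = s1 + (j + 0.5) * h
        # dt / t = ds, and (t / t0)^{-a} = exp(-a (s - ln t0)).
        total += math.exp(-a * (s - math.log(t0)))
    return total * h

def closed_form(t1, T, t0, a):
    """Exact value: (1/a) [ (t1/t0)^{-a} - (T/t0)^{-a} ]."""
    return ((t1 / t0) ** (-a) - (T / t0) ** (-a)) / a

# Hypothetical values with t1 >= t0.
t0, t1, a, T = 2.0, 3.0, 0.3, 1.0e6
numeric = integral_log_quadrature(t1, T, t0, a)
exact = closed_form(t1, T, t0, a)
print(numeric, exact, 1.0 / a)
```

Since the bound is uniform in the upper limit, the logarithm of $\dot\gamma^0$ stays bounded, which is what prevents the time coordinate from running off to infinity in finite parameter time.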

10. Asymptotic Expansions Let us derive conclusions concerning the asymptotic behaviour which are more detailed than (158).


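Proposition 3. Consider a spacetime of the type constructed in Theorem 4. Then, assuming $\epsilon$ to be small enough in this construction (depending on $n$, $p$, $c_1$ and $k_0$), there is a positive constant $\alpha > 0$, a smooth Riemannian metric $\rho$ on $\mathbb{T}^n$ and, for every $l \ge 0$, a constant $K_l$ (depending on $n$, $l$, $p$ and $c_1$) such that for all $t \ge t_0$,

$$\Big\|\phi(t,\cdot) - \frac{2}{\lambda}\ln t + \frac{1}{\lambda}c_0\Big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (167)$$

$$\Big\|(t\partial_t\phi)(t,\cdot) - \frac{2}{\lambda}\Big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (168)$$

$$\|(g_{00}+1)(t,\cdot)\|_{C^l} + \|(t\partial_t g_{00})(t,\cdot)\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (169)$$

$$\Big\|t^{-1}g_{0i}(t,\cdot) - \frac{1}{np-2p+1}\rho^{jm}\gamma_{jim}\Big\|_{C^l} + \big\|[t\partial_t(t^{-1}g_{0i})](t,\cdot)\big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (170)$$

$$\big\|(t/t_0)^{-2p}e^{-2K}g_{ij}(t,\cdot) - \rho_{ij}\big\|_{C^l} + \big\|(t/t_0)^{-2p}e^{-2K}t\partial_t g_{ij}(t,\cdot) - 2p\rho_{ij}\big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (171)$$

$$\big\|(t/t_0)^{2p}e^{2K}g^{ij}(t,\cdot) - \rho^{ij}\big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (172)$$

$$\big\|(t/t_0)^{-2p}e^{-2K}t\,k_{ij}(t,\cdot) - p\rho_{ij}\big\|_{C^l} \le K_l(t/t_0)^{-\alpha}, \qquad (173)$$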

where $\gamma_{jim}$ are the Christoffel symbols associated with the metric $\rho$ and $k_{ij}(t,\cdot)$ are the components of the second fundamental form induced on the hypersurface $\{t\} \times \mathbb{T}^n$ with respect to the standard vectorfields on $\mathbb{T}^n$. Here $\|\cdot\|_{C^l}$ denotes the $C^l$ norm on $\mathbb{T}^n$.

Proof. Let us begin by observing that (due to (158), (104), (105) and (107)) $u$, $\psi$, $u_\tau$, $\psi_\tau$ and $e^{-2K}\partial_\tau h_{ij}$ are decaying in any $C^l$ norm as $e^{-a\tau}$. As a consequence, (167)–(169) hold and there are smooth functions $\rho_{ij}$ on $\mathbb{T}^n$ such that for every $l \ge 0$, there is a constant $K_l$ such that

$$\|e^{-2K}h_{ij}(\tau,\cdot) - \rho_{ij}\|_{C^l} \le K_l e^{-a\tau}$$

for all $\tau \ge 0$. This leads to the conclusion that (171) holds. Furthermore, $e^{-2K}h_{ij}$ is bounded in any $C^l$ norm. Consider

$$e^{2K}\partial_\tau(e^{2p\tau}g^{ij}) = 2pe^{2p\tau+2K}g^{ij} - e^{2p\tau+2K}g^{i\mu}g^{j\nu}\partial_\tau g_{\mu\nu}.$$

Using the algorithm, one can conclude that the right-hand side is bounded by $e^{-a\tau}$ in any $C^l$ norm. In other words, there are smooth functions $\rho^{ij}$ on $\mathbb{T}^n$ such that

$$\|e^{2p\tau+2K}g^{ij}(\tau,\cdot) - \rho^{ij}\|_{C^l} \le K_l e^{-a\tau}.$$

From the above, we conclude that $\rho^{ij}\rho_{jk} = \delta^i_k$ (note that $g^{i0}g_{0m}$ converges to zero due to the algorithm), so that $\rho_{ij}$ are the components of a Riemannian metric on $\mathbb{T}^n$ and $\rho^{ij}$ are the components of the inverse of the matrix with components $\rho_{ij}$. Furthermore, (172) holds. If we let $\gamma_{lim}$ denote the Christoffel symbols of $\rho$, we obtain, in particular, that

$$\|(g^{lm}\Gamma_{lim})(\tau,\cdot) - \rho^{lm}\gamma_{lim}\|_{C^l} \le K_l e^{-a\tau}. \qquad (174)$$


We wish to improve our knowledge concerning u i . Consider (62). Note that a term of the form −2 peτ +τ0 glm Γlim appears in this equation. Since, by the above observations, glm Γlim converges to something which is not necessarily zero, it is clear that this object may tend to infinity. It therefore seems natural to rescale the equation and to introduce uˆ i = e−τ −τ0 u i . Using (62), we obtain ˆ g uˆ i + αˆ 2 ∂τ uˆ i + βˆ2 uˆ i 4 − 2 pglm Γlim − ∂i ψ + e−τ −τ0 0i − (g 00 + 1)(2∂τ uˆ i + uˆ i ) − 2gˆ 0 j ∂ j uˆ i = 0, (175) λ where αˆ 2 = α2 + 2 = np + 1, βˆ2 = β2 + α2 + 1 = p(n − 2)(2 p − 1) + np. Note that the first three terms on the left-hand side of (175) are such that Lemma 7 applies. In particular, there are strictly positive constants γˆs , δˆs , ζˆs and ηˆ s as specified in Lemma 7. Assuming to be small enough in the original construction of the development, we are allowed to use the conclusions of Lemma 7 as well as the conclusions of Corollary 2. In particular, we can define an energy as described in Corollary 2, El = Eγˆs ,δˆs [∂ α uˆ i ]. i

|α|≤l

Note that uˆ i (0, ·) = 0 and that (∂τ uˆ i )(0, ·) = e−τ0 (∂τ u i )(0, ·). Considering (146), it is clear that this object is small in H k0 . As a consequence, Ek0 (0) is also small. Due to Lemma 7, we have ˆζs [(∂ α ∂τ uˆ i )2 + gˆ lm ∂ α ∂l uˆ i ∂ α ∂m uˆ i + (∂ α uˆ i )2 ]d x ≤ El . i

n |α|≤l T

Due to Corollary 2, we have dEl 1/2 ˆ g , ∂ α ]uˆ i 2 + Ce−aτ El , ≤ −2ηˆ s El + CEl ∂ α Fˆi + [ dτ |α|≤l

(176)

i

where the constant depends on an upper bound on e−K 0 and we have argued similarly to the proof of Lemma 15 to deal with the term arising from E,γˆs ,δˆs [∂ α uˆ i ] (note that the only difference between proving the estimate needed for (176) and proving (137) is that the constants γ and δ are different, something which does not affect the arguments) and Fˆi is given by 4 Fˆi = 2 pglm Γlim + ∂i ψ − e−τ −τ0 0i + (g 00 + 1)(2∂τ uˆ i + uˆ i ) + 2gˆ 0 j ∂ j uˆ i . λ


Note that the first and the second terms in Fˆi are bounded in any C l norm (in fact, the second term is exponentially decaying in any C l norm). Since (g 00 + 1)(τ, ·)C m ≤ Cm e−aτ , gˆ 0 j (τ, ·)C m ≤ Cm e−H τ −K 0 −aτ for any m, we conclude that the fourth and fifth terms in Fˆi can be bounded by Ce−aτ El . 1/2

(177)

Thus 1/2 Fˆi (τ, ·) H l ≤ K l (1 + e−aτ El ) + e−τ −τ0 0i (τ, ·) H l

and 1/2 Fˆi (τ, ·) − 2 pglm Γlim H l ≤ K l e−aτ (1 + El ) + e−τ −τ0 0i (τ, ·) H l .

Consider (67). The first term appearing on the right-hand side of this expression can, after multiplication with e−τ −τ0 , be estimated by the expression appearing in (177), so that e−τ −τ0 0i (τ, ·) H l ≤ Ce−aτ El

1/2

˜ 0i (τ, ·) H l . + eτ +τ0

˜ 0i . The expression ˜ 0i is given in (52). The third term on the We thus focus on eτ +τ0 right-hand side is, after multiplication with t = eτ +τ0 , given by −2∂τ ψ∂i ψ, an object which decays exponentially. The fourth term, after multiplication by t, is given by 2 p(np − 1)λψ uˆ i , so that it can be estimated by Ce−aτ El . Since the fifth term can be written, after multiplication by t, 1/2

−

4t 2 E,φ uˆ i n−1

and t 2 E,φ is exponentially decaying with respect to any C m norm as e−2aτ , we get a similar estimate for it. Thus ˜ 0i (τ, ·) H l ≤ K l e−aτ (1 + E1/2 ) + 2t A,0i + C,0i H l , eτ +τ0 l where A,0i and C,0i are given in (87) and (93) of [20] respectively. Assume a term in t A,0i or tC,0i contains a factor g0i . If we extract t −1 g0i = uˆ i from this term, what remains is t 2 times an expression to which we can apply the algorithm with l ≥ 1, lh = 0, l∂ = 2 (sometimes l∂ may be less than 2, but this will then be compensated for by a corresponding number of factors of ω). By the algorithm, the factor multiplying uˆ i can thus be estimated by 1/2 t 2 K m ω2 e−aτ Eˆ m ≤ Ce−aτ

in H m , and the corresponding term can be estimated by Ce−aτ El . 1/2

(178)


Note that g 0i = −

1 ij g g0 j , g00

so that a term appearing in t A,0i or tC,0i which contains a factor of g 0i can also be estimated as in (178). Assume a term in t A,0i or tC,0i contains a factor ∂t g0i = ∂τ uˆ i + uˆ i . What remains of this term after extracting ∂t g0i is then t times something to which the algorithm can be applied with l ≥ 1, lh = 0 and l∂ = 1 (with the same caveat as before). Applying the algorithm, one sees that the original term can be estimated by (178). If a term contains a factor of the form ∂ j g0i , one can argue similarly to the above to conclude that it is bounded by Ce−H τ −K 0 −aτ ∂ j uˆ i H l , which in its turn is bounded by (178). If a term contains a factor of the form ∂i g00 , we can extract this term, and conclude, by the algorithm, that what remains is exponentially decaying so that the term we started with had to be exponentially decaying in any C m norm. The argument to deal with terms containing a factor of the form ∂i glm is similar. Since all the terms in A,0i and C,0i are such that each term falls into one of the categories described above, cf. (87) and (93) of [20], we obtain 2t A,0i + C,0i H l ≤ Ce−aτ (1 + El ). 1/2

To conclude, Fˆi (τ, ·) H l ≤ K l + K l e−aτ (1 + El ).

(179)

1/2 Fˆi (τ, ·) − 2 p(glm Γlim )(τ, ·) H l ≤ K l e−aτ (1 + El ).

(180)

1/2

Note also that

What remains to be estimated is ˆ g , ∂ α ]uˆ i 2 [ for |α| ≤ k. Estimate, using (158) and (111), ∂ α1 (∂ j gˆ 0m )∂ α2 ∂τ ∂m uˆ i 2 ≤ Ce−H τ −K 0 −aτ ∂τ uˆ i H l ≤ Ce−aτ El , 1/2

where |α1 | + |α2 | = |α| − 1 and the constant depends on an upper bound for e−K 0 . Similarly, we get an estimate ∂ α1 (∂ j gˆ lm )∂ α2 ∂l ∂m uˆ i 2 ≤ Ce−aτ El . 1/2

Consider ∂ α1 (∂ j g 00 )∂ α2 ∂τ2 uˆ i 2 ≤ Ce−aτ ∂τ2 uˆ i H l−1 . We have ∂τ2 uˆ i = −

1 0m lm ˆ ˆ 2 g ˆ ∂ ∂ u ˆ + g ˆ ∂ ∂ u ˆ − α ˆ ∂ u ˆ − β u ˆ + F τ m i l m i 2 τ i 2 i i . g 00


Due to the estimates given above, we obtain ∂ α1 (∂ j g 00 )∂ α2 ∂τ2 uˆ i 2 ≤ Ce−aτ (1 + El ). 1/2

Thus ˆ g , ∂ α ]uˆ i 2 ≤ Ce−aτ (1 + E ). [ l 1/2

(181)

Inserting (179) and (181) into (176), we get

$$\frac{dE_l}{d\tau} \le -2\hat\eta_s E_l + CE_l^{1/2} + Ce^{-a\tau}E_l,$$

which leads to the conclusion that $E_l$ is bounded for all $l$, since it implies that $E_l$ is decreasing after it has exceeded a certain value. Let us introduce

$$\tilde u_i(\tau,\cdot) = \hat u_i(\tau,\cdot) - \frac{2p}{\hat\beta_2}\rho^{lm}\gamma_{lim}.$$

(182)

2 p rq ρ γriq , F˜i = Fˆi − 2 pρ rq γriq + gˆ lm ∂l ∂m βˆ2

so that F˜i C m ≤ Cm e−aτ for all m due to (174) and (180). Note also that as a consequence of (182), the fact that uˆ i and ∂τ uˆ i are bounded in any C m norm and the fact that gˆ i j and gˆ 0i are exponentially decaying in any C m norm as e−H τ , we have gˆ lq ∂l ∂q u˜ i C m + gˆ 0l ∂l ∂τ u˜ i C m + (g 00 + 1)∂τ2 u˜ i C m ≤ Ce−aτ . Combining this observation with (182), we conclude that ∂τ2 u˜ i + αˆ 2 ∂τ u˜ i + βˆ2 u˜ i = F˜i , where F˜i satisfies the same kind of estimate as F˜i . By arguments similar to those used to prove Lemma 7, one can prove that u˜ i is exponentially decaying in every C m norm as well as ∂τ uˆ i = ∂τ u˜ i . We obtain (170). Finally, let us turn to the second fundamental form. Note that the future directed unit normal is given by N = −(−g 00 )−1/2 g 0µ ∂µ . Thus ki j = ∇∂i N , ∂ j = −∂i [(−g 00 )−1/2 g 0µ ]gµj − (−g 00 )−1/2 g 0µ Γi jµ .


With the exception of 1 (−g 00 )1/2 ∂t gi j , 2 all the terms appearing in ki j can be estimated using the algorithm with lh = 2, l∂ = 1 and l ≥ 1, i.e. by Cωe2 pτ +2K −aτ Hˆ l

1/2

,

so that for every l ≥ 0, there is a constant Cl such that

−α (t/t0 )−2 p e−2K tki j − 1 (−g 00 )1/2 t∂t gi j (t, ·) ≤ Cl t . l 2 t0 C Since g 00 + 1 is exponentially decaying and (171) holds, we conclude that (173) holds. 11. Proof of the Main Theorem Proof (Theorem 2). Consider Tn to be [−π, π ]n with the ends identified. Construction of a global (in time) patch. Let us start by constructing a patch of spacetime which is essentially the development of the piece of the data over which we have some control. Let f c ∈ C0∞ [B1 (0)] be such that f c ( p) = 1 for | p| ≤ 15/16 and 0 ≤ f c ≤ 1. In order to apply Theorem 4, we need to define a Riemannian metric on Tn , a symmetric covariant 2-tensor and two functions. We define them by i j = f c ρi j ◦ x −1 + (1 − f c )e2K δi j , p ςi j = f c κi j ◦ x −1 + (1 − f c )e2K δi j , t0 Φa = f c φa ◦ x −1 + (1 − f c ) φa − f c (φa ◦ x −1 − φa ) 1 − f c −1 (1 − f c ), 2 Φb = f c φb ◦ x −1 + (1 − f c ) , λt0

(183)

where t0 and K are given by (15) and where the indices on the right-hand side refer to the coordinates x assumed to exist in the statement of the theorem, δi j are the components of the Kronecker delta and the indices on the left-hand side refer to the standard coordinates on Tn . The choice (183) requires some motivation. The last term is there to ensure that

$$\langle\Phi_a\rangle = \langle\phi_a\rangle \tag{184}$$

while, at the same time, ensuring that $\Phi_a$ equals $\phi_a\circ x^{-1}$ in the set of interest. The reason it is of importance to have (184) is that it ensures that $t_0$ defined in Theorem 2 coincides with $t_0$ defined in Theorem 4. We can view $(\varrho, \varsigma, \Phi_a, \Phi_b)$ as initial data on $\mathbb T^n$. Given these data, we can define initial data for (61)–(64) by (143)–(150). Due to (16),

$$\|e^{-2K}\varrho - \delta\|_{H^{k_0+1}} = \|f_c\{e^{-2K}\rho\circ x^{-1} - \delta\}\|_{H^{k_0+1}} \le C\epsilon. \tag{185}$$


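Furthermore, due to (16),

$$\|e^{-2K}t_0\varsigma - p\delta\|_{H^{k_0}} = \|f_c\{t_0 e^{-2K}\kappa\circ x^{-1} - p\delta\}\|_{H^{k_0}} \le C\epsilon,$$

which implies

$$\|2e^{-2K}t_0\varsigma - 2pe^{-2K}\varrho\|_{H^{k_0}} \le C\epsilon. \tag{186}$$

Since the object inside the norm in (186) corresponds to $e^{-2K}\partial_\tau h_{ij}(0,\cdot)$, cf. (148), the estimates (185) and (186) imply that

$$\hat H^{1/2}_{m,k_0}(0) \le C\epsilon.$$

Note also that due to (185), we have (152) for some suitable $c_1 > 2$. Let us turn to $\hat H_{lp,k_0}(0)$. Since $u(0,\cdot) = 0$, we only need concern ourselves with $\partial_\tau u(0,\cdot)$ and the initial data for $\psi$. The initial data for $\partial_\tau u$ is given by (144). Note that (185) implies that

$$\|e^{2K}\varrho^{ij} - \delta^{ij}\|_{H^{k_0}} \le C\epsilon, \tag{187}$$

assuming $\epsilon$ to be small enough, where $\varrho^{ij}$ are the components of the inverse of the matrix with components $\varrho_{ij}$. Combining this observation with (186), we conclude that

$$\|t_0\varrho^{ij}\varsigma_{ij} - np\|_{H^{k_0}} \le C\epsilon.$$

Thus the part of $\hat H^{1/2}_{lp,k_0}$ coming from $u$ is bounded by $C\epsilon$, since the object appearing inside the norm is, up to a numerical factor, the right-hand side of (144). Turning to $\psi$,

$$\psi(0,\cdot) = \Phi_a - \langle\Phi_a\rangle = f_c(\phi_a\circ x^{-1} - \langle\phi_a\rangle) - \langle f_c(\phi_a\circ x^{-1} - \langle\phi_a\rangle)\rangle\,\langle 1-f_c\rangle^{-1}(1-f_c),$$

so that, due to (16) (recall that $\langle\phi_a\rangle = \langle\Phi_a\rangle = \phi_0(t_0)$),

$$\|\psi(0,\cdot)\|_{H^{k_0+1}} \le C\epsilon.$$

Consider

$$\partial_\tau\psi(0,\cdot) = t_0\Phi_b - \frac{2}{\lambda} = f_c\left(t_0\,\phi_b\circ x^{-1} - \frac{2}{\lambda}\right).$$

Due to (16), we have $\|\partial_\tau\psi(0,\cdot)\|_{H^{k_0}} \le C\epsilon$. The above estimates together imply

$$\hat H^{1/2}_{lp,k_0}(0) \le C\epsilon.$$

What remains to be considered is $\hat H_{s,k_0}$. Since $u_i(0,\cdot) = 0$, we need only estimate

$$e^{-K}\partial_\tau u_i(0,\cdot) = \frac{p-1}{4t_0}\, t_0\,\frac12\,\varrho^{lj}(2\partial_l\varrho_{ji} - \partial_i\varrho_{jl}),$$

cf. (146). Due to (185) and (187), we get

$$\|e^{-K}\partial_\tau u_i(0,\cdot)\|_{H^{k_0}} \le C\epsilon,$$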


so that

$$\hat H^{1/2}_{s,k_0}(0) \le C\epsilon.$$

To conclude,

$$\hat H^{1/2}_{k_0}(0) \le C\epsilon,$$

where the constant depends on $n$ and $p$. Note, furthermore, that $c_1$ is numerical in the current setting, that $K_0$ only depends on $p$ and that $k_0$ only depends on $n$. As a consequence, we get the conclusions of Theorem 4, assuming $\epsilon$ to be small enough depending only on $n$ and $p$. In particular, we get a solution, say $(\bar g, \Phi)$, on $(t_-,\infty)\times\mathbb T^n$. Note that we also get asymptotics as in the statement of Proposition 3.

Note that the variables used in (34)–(37) are related to the variables in (61)–(64) according to (57)–(60). Using these relations, we get solutions to the original Eqs. (34)–(37). Furthermore, on $B_{15/16}(0)$, the constraint equations are satisfied, and we have chosen the initial data in such a way that $D_\mu|_{t=t_0} = 0$ (cf. Lemma 17 and the comments made in Subsect. 2.1). Due to standard local existence and uniqueness results, cf. Proposition 1 of [20], we conclude that in $D[\{t_0\}\times B_{15/16}(0)]$, the solution $(\bar g, \Phi)$ satisfies (5) and (6). If $\epsilon$ is small enough, Proposition 1 implies that

$$(t_-,\infty)\times B_{5/8}(0) \subseteq D[\{t_0\}\times B_{29/32}(0)], \tag{188}$$

where we increase $t_-$ if necessary. The reason for this is that, first of all, (185) and Sobolev embedding yield (here $\bar g_{ij}(t_0,\cdot) = \varrho_{ij}$)

$$[4\ell(t_0)]^2|v|^2 \le d_1^2(\epsilon)\,\bar g_{ij}(t_0,\cdot)v^iv^j$$

for all $v\in\mathbb R^n$, where $d_1(\epsilon)\to 1$ as $\epsilon\to 0$. Due to (160), we then obtain

$$4\ell(t_0)\int_{s_0}^{s_{\max}}[\delta_{ij}\dot\gamma^i\dot\gamma^j]^{1/2}\,ds \le d(\epsilon)\,d_1(\epsilon)\,\ell(t_0).$$

For $\epsilon$ small enough, we thus get

$$\int_{s_0}^{s_{\max}}[\delta_{ij}\dot\gamma^i\dot\gamma^j]^{1/2}\,ds \le \frac{9}{32},$$

which implies (188). Note that due to Lemma 3 of [20], see also the proof corresponding to the present one in Sect. 16 of [20], the sets

$$U_{0,\mathrm{exc}} = D[\{t_0\}\times B_{15/16}(0)],\quad U_{1,\mathrm{exc}} = D[\{t_0\}\times B_{29/32}(0)],\quad U_{2,\mathrm{exc}} = D[\{t_0\}\times \bar B_{29/32}(0)],$$

are open, open and closed subsets of $(t_-,\infty)\times x(U)$ respectively. Consequently, $W_{i,\mathrm{exc}} = (\mathrm{Id}\times x^{-1})(U_{i,\mathrm{exc}})$ for $i = 0, 1, 2$ are also open, open and closed respectively.

Construction of a reference metric. In order to prove that the patches that we construct fit together to form a globally hyperbolic development, it is convenient to construct a reference metric. Let

$$\tilde g = (1 - f_c\circ x)(-dt^2 + \rho) + (f_c\circ x)(\mathrm{Id}\times x)^*\bar g.$$


Here $\rho$ is the Riemannian metric on $\Sigma$ given by the initial data. Note that $\partial_t$ is timelike with respect to $\bar g$ so that $\partial_t$ is timelike with respect to $\tilde g$. The hypersurfaces $\{s\}\times\Sigma$ are spacelike with respect to $-dt^2+\rho$ and with respect to $(\mathrm{Id}\times x)^*\bar g$ for $s\in(t_-,\infty)$ (where this metric is defined), so that they are spacelike with respect to $\tilde g$. As a consequence, $\tilde g$ is a Lorentz metric on $(t_-,\infty)\times\Sigma$, cf. Lemma 2.

End of the proof. The argument required to finish the proof is essentially identical to the end of the corresponding proof in [20] and need not be repeated here (at one stage $V'(0) = 0$ is used, but this can easily be circumvented by multiplying the corresponding term by a cut-off function).

12. Stability of Locally Spatially Homogeneous Spacetimes

Let us first consider the case in which the background initial data are given by $(G, g, k, \phi_a, \phi_b)$, where $G$ is a simply connected unimodular Lie group and the isometry group of the initial data contains the left translations in $G$. Many of the arguments are quite similar to the ones presented in [20], and we shall therefore sometimes only sketch them. One can define an orthonormal basis $\{e_i'\}$ (with respect to the metric $g$) of the Lie algebra such that the components of $k$ with respect to this basis, say $k_{ij}$, are diagonal and such that there is a diagonal matrix $\nu^{ij}$ with the property that $[e_j', e_k'] = \epsilon_{jkl}\nu^{li}e_i'$, where $\epsilon_{jkl}$ is antisymmetric in all of its indices and $\epsilon_{123} = 1$. The reader interested in the details is referred to Sect. 17 of [20] (the momentum constraint (8) corresponds to the same condition as in [20] since $D_i\phi_a = 0$). Define $n(0) = \nu$, $\theta(0) = \mathrm{tr}_g k$, $\sigma_{ij}(0) = k_{ij} - \theta(0)\delta_{ij}/3$, $\phi(0) = \phi_a$ and $\dot\phi(0) = \phi_b$. Define $n, \theta, \sigma, \phi$ to be the solution to

$$\dot\theta = -\frac32\sigma^2 + \frac12 R - \frac32\dot\phi^2, \tag{189}$$
$$\ddot\phi = -\theta\dot\phi - V'(\phi), \tag{190}$$
$$\dot\sigma_{lm} = -\theta\sigma_{lm} - s_{lm}, \tag{191}$$
$$\dot n_{ij} = 2\sigma^k{}_{(i}n_{j)k} - \frac13\theta n_{ij}, \tag{192}$$

where a parenthesis among indices denotes symmetrization and

$$s_{lm} = b_{lm} - \frac13(\mathrm{tr}\,b)\delta_{lm}, \tag{193}$$
$$b_{lm} = 2n_m{}^i n_{il} - (\mathrm{tr}\,n)n_{lm}, \tag{194}$$
$$R = -n_{ij}n^{ij} + \frac12[\mathrm{tr}\,n]^2, \tag{195}$$
$$\sigma^2 = \sigma_{ij}\sigma^{ij}, \tag{196}$$
$$\mathrm{tr}\,n = \delta^{ij}n_{ij}. \tag{197}$$

In these equations, indices are raised and lowered with $\delta_{ij}$. In other words, there is no difference between indices upstairs and downstairs. Let $(t_-, t_+)$ be the maximal existence interval. Note that (7) is equivalent to

$$\frac23\theta^2 - \sigma^2 + R = \dot\phi^2 + 2V(\phi), \tag{198}$$


so that this equation holds for $t = 0$. Due to (191)–(192), the off diagonal components of $n$ and $\sigma$, collected into one vector, say $v$, satisfy an equation of the form $\dot v = Cv$, so that $n$ and $\sigma$ remain diagonal in all of $(t_-, t_+)$. Collecting all the terms in (198) on one side and differentiating, using (189)–(197), one obtains zero as a result, so that (198) is satisfied for all $t\in(t_-,t_+)$. Finally, $\sigma$ remains trace free. Using the above information, we can construct a spacetime metric as in [20],

$$\bar g = -dt^2 + \sum_{i=1}^3 a_i^2(t)\,\xi^i\otimes\xi^i, \tag{199}$$

on $M = (t_-,t_+)\times G$, where the $\xi^i$ are the duals of the $e_i'$. Here $a_i(0) = 1$ and

$$\frac{\dot a_i}{a_i} = \sigma_i + \frac13\theta,$$

where $\sigma_i$ are the diagonal components of $\sigma_{ij}$. Define $e_i = a_i^{-1}e_i'$. Then $e_0 = \partial_t$ and $e_i$ constitute an orthonormal frame for $(M,\bar g)$. Similarly to Sect. 17 of [20], one can prove that

$$g(\nabla_{e_i}e_0, e_j) = \sigma_{ij} + \frac13\theta\delta_{ij}$$

and that if $\gamma^i_{jk}$ is defined by $[e_j, e_k] = \gamma^i_{jk}e_i$, then

$$\gamma^i_{jk} = \epsilon_{jkl}n^{il}.$$

We refer the interested reader to [20], Sect. 17, for a proof of these facts, cf. also the proof of Lemma 21.2 of [19]. Given this information, one can compute that the scalar curvature of the hypersurfaces $\{t\}\times G$ is given by (195). The Ricci curvature can be expressed in terms of the quantities $n_{ij}$ and $\theta_{ij}$. In fact, in the current setting, the 00 components and the $lm$ components of (5) read

$$-\dot\theta - \theta^{ij}\theta_{ij} = \dot\phi^2 - V(\phi), \tag{200}$$

$$\dot\theta_{lm} + \theta\theta_{lm} + 2n_m{}^i n_{il} - n_{ij}n^{ij}\delta_{lm} + \frac12(\mathrm{tr}\,n)^2\delta_{lm} - (\mathrm{tr}\,n)n_{lm} = V(\phi)\delta_{lm}. \tag{201}$$
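The algebra linking (200) and (201) to the evolution system is short enough to record. The following is a sketch, assuming the decomposition $\theta_{ij} = \sigma_{ij} + \theta\delta_{ij}/3$ with $\sigma_{ij}$ trace free (suggested by the identity for $g(\nabla_{e_i}e_0, e_j)$ above, but not stated explicitly for $\theta_{ij}$ in this section), so that $\theta^{ij}\theta_{ij} = \sigma^2 + \theta^2/3$.

```latex
% Assumption: \theta_{ij} = \sigma_{ij} + \tfrac13\theta\delta_{ij}, \sigma_{ij} trace free.
\begin{align*}
% (198) solved for R and inserted into (189):
R &= \dot\phi^2 + 2V - \tfrac23\theta^2 + \sigma^2, \\
\dot\theta &= -\tfrac32\sigma^2 + \tfrac12 R - \tfrac32\dot\phi^2
            = -\sigma^2 - \tfrac13\theta^2 - \dot\phi^2 + V, \\
% which is (200):
-\dot\theta - \theta^{ij}\theta_{ij} &= \dot\phi^2 - V. \\
% Trace of (201), using \delta^{lm}\delta_{lm} = 3 and the definition (195) of R:
\dot\theta + \theta^2 + R &= 3V. \\
% The same relation follows from (189) after eliminating \sigma^2 via (198):
\sigma^2 = \tfrac23\theta^2 + R - \dot\phi^2 - 2V
 \quad&\Longrightarrow\quad \dot\theta = -\theta^2 - R + 3V. \\
% Trace-free part of (201), with s_{lm} as in (193)-(194); this is (191):
\dot\sigma_{lm} + \theta\sigma_{lm} + s_{lm} &= 0.
\end{align*}
```

Both forms of $\dot\theta$ (with $R$ eliminated, and with $\sigma^2$ eliminated) are reused in the asymptotic analysis later in the section.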

In fact, in these equations, the left-hand side of the first equation is the 00 component of Ric and the left-hand side of the second equation represents the $lm$ components of Ric. The $0l$-components of the left and right-hand sides of (5) vanish identically due to the setup; the $0l$ equations correspond to the momentum constraint (8) and in the current setting the momentum constraint is equivalent to the matrices with components $n_{ij}$ and $\theta_{ij}$ commuting, which is an immediate consequence of the fact that both these matrices are diagonal.

Let us prove that $(M,\bar g,\phi)$ is a solution of (5) and (6). That (6) holds is an immediate consequence of (190) due to the current geometric setup. To prove that (5) is satisfied all we need to prove is that (200)–(201) are satisfied. However, (200) is a consequence of (189) and (198); one simply uses (198) to eliminate $R$ in (189). The equation (201) on the other hand can be divided into its trace part and its trace free part. Due to (191), we see that the equation corresponding to the trace free part of (201) holds. Furthermore, the equation corresponding to the trace part is a consequence of (189) and (198); one simply uses (198) to eliminate the expressions involving $\sigma_{ij}$ in (189). Thus (5) and (6) are satisfied. That all the hypersurfaces $\{t\}\times G$ are Cauchy hypersurfaces in $(M,\bar g)$ follows by an argument which is identical to the proof of Lemma 21.4 of [19]. Finally, the initial data induced on $\{0\}\times G$ correspond to the data we started with.

Analyzing the asymptotics. The asymptotics were already analyzed in [10], see also [11] for the situation with matter of Vlasov type, but since the analysis is, for our purposes, in some respects incomplete, we prefer to give a different analysis here.

Definition 7. We refer to initial data for (189)–(192) satisfying the constraint (198) as Bianchi class A initial data if $\sigma$ and $n$ are diagonal matrices. If all the diagonal elements of $n$ are non-zero and have the same sign, we shall say that the initial data are of Bianchi type IX.

We shall here be interested in the case that the potential is given by

$$V(\phi) = V_0e^{-\lambda\phi}, \tag{202}$$

where $V_0 > 0$ and $\lambda\in(0,\sqrt2)$ are constants. We shall furthermore restrict our attention to Bianchi class A initial data and exclude Bianchi type IX (Bianchi IX corresponds to the universal covering group of the Lie group under consideration being isomorphic to SU(2)), so that

$$R = -n_{ij}n^{ij} + \frac12(\mathrm{tr}\,n)^2 \le 0, \tag{203}$$

where we use $R(t)$ to denote the scalar curvature of the hypersurface $\{t\}\times G$, and $G$ is the unimodular Lie group under consideration. For convenience, we shall also drop the argument $t$ most of the time.

Lemma 18. Consider Bianchi class A initial data for (189)–(192) at $t = 0$ which is not of Bianchi type IX. If $\theta(0) > 0$ and the maximal existence interval of the corresponding solution to (189)–(192) is $(t_-,t_+)$, then $t_+ = \infty$.

Proof. Due to (189) and (203), we see that $\dot\theta\le0$. Due to (189) and (198), we see that $\dot\theta/\theta^2$ is bounded. Assuming $t_1\in(0,t_+)$ to be the first time such that $\theta(t_1) = 0$, we get, for $t_2\in(0,t_1)$,

$$\left|\frac{1}{\theta(0)} - \frac{1}{\theta(t_2)}\right| \le C|t_2|.$$
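The inequality just stated comes from integrating the derivative of $1/\theta$. The sketch below spells this out and, as an illustration, records one admissible value of the constant; it uses only (189), (198), $R\le0$ and $V>0$, and is valid on $[0,t_1)$, where $\theta>0$.

```latex
\begin{align*}
% (189) with R eliminated by means of (198), divided by \theta^2:
-\frac{\dot\theta}{\theta^2}
  &= \frac{\sigma^2 + \dot\phi^2}{\theta^2} + \Bigl(\frac13 - \frac{V}{\theta^2}\Bigr), \\
% (198) with R \le 0 gives \sigma^2 + \dot\phi^2 + 2V \le \tfrac23\theta^2, hence
0 \le -\frac{\dot\theta}{\theta^2} &\le \frac23 + \frac13 = 1, \\
% integrate d(1/\theta)/dt = -\dot\theta/\theta^2 from 0 to t_2 \in (0,t_1):
\left|\frac{1}{\theta(0)} - \frac{1}{\theta(t_2)}\right|
  &= \left|\int_0^{t_2}\frac{\dot\theta}{\theta^2}\,dt\right| \le C\,|t_2|,
  \qquad C = 1 \text{ being one admissible choice.}
\end{align*}
```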

As $t_2\to t_1-$, the left-hand side blows up whereas the right-hand side is bounded. As a consequence, $\theta(t) > 0$ for all $t\in(t_-,t_+)$. Due to (198), $\sigma_{ij}$ and $\dot\phi$ are bounded to the future, so that $\phi$ cannot blow up in finite time. Considering (192), keeping the fact that $\theta$ and $\sigma_{ij}$ are bounded in mind, we see that the $n_{ij}$ cannot blow up in finite time. Global existence follows.

It will be of interest to note that many of the conclusions hold using only (189), (190), (198) and the assumptions that $R\le0$ and that $\sigma^2\ge0$.

Lemma 19. Assume we have a solution to (189), (190), (198) on $(t_-,\infty)$ where $t_-<0$ and $R$ and $\sigma^2$ are functions satisfying $R\le0$ and $\sigma^2\ge0$ on this interval. If, furthermore, $\theta(0)>0$, then $0<\theta(t)\le\theta(0)$ for all $t\ge0$, there is a $T\ge0$ such that $\dot\phi(t)>0$ for all $t\ge T$, $\theta\notin L^1([0,\infty))$ and

$$\lim_{t\to\infty}\phi(t) = \infty.$$

Proof. The proof that $\theta$ has to remain positive is identical to the one presented in the proof of Lemma 18. Since $\dot\theta\le0$ due to (189) and the assumptions, the first conclusion follows. Note that $V'(\phi)<0$ so that if $\dot\phi\le0$, then, due to (190), $\ddot\phi>0$. Since $-V'(\phi)$ has a positive lower bound on sets of the form $(-\infty,\varphi_0)$ for $\varphi_0\in\mathbb R$, we conclude that $\dot\phi$ must, sooner or later, become positive and then, due to (190), it will stay positive. Assuming $\phi$ to be bounded from above, we conclude that it has to converge to a finite number. As a consequence, $-V'(\phi) = \lambda V(\phi)\ge c_{\min}>0$. As long as

$$\dot\phi < \frac{c_{\min}}{\theta(0)},$$

we get $\ddot\phi>0$, so that $\dot\phi$ will in the end have a positive uniform lower bound. We conclude that $\phi\to\infty$, a contradiction. Thus $\phi$ is not bounded from above. In fact, $\phi\to\infty$. Note that (198) and the assumptions imply that $\dot\phi$ is bounded. Since $\theta$ and $V'(\phi)$ are bounded, (190) thus implies that $\ddot\phi$ is bounded. Since, due to (189), $\dot\phi^2$ is integrable, we conclude that $\dot\phi$ converges to zero. Let $T$ be chosen so that $\dot\phi(t)>0$ for $t\ge T$ and let

$$q(t) = \int_0^t\theta(s)\,ds.$$

Then, due to (190),

$$\frac{d}{dt}\bigl(e^q\dot\phi\bigr) = e^q(\ddot\phi + \theta\dot\phi) = -V'(\phi)e^q > 0,$$

so that

$$(e^q\dot\phi)(t) \ge (e^q\dot\phi)(T) > 0$$

for all $t\ge T$. Since $\dot\phi$ converges to zero, we conclude that $\theta\notin L^1([0,\infty))$.
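The final implication of the proof can be unpacked as a short argument by contradiction. In the sketch below, $q_\infty$ and $c$ are names introduced only for illustration; they do not appear in the text.

```latex
% Suppose, for contradiction, that \theta \in L^1([0,\infty)), and set
% q_\infty := \int_0^\infty \theta(s)\,ds < \infty   (hypothetical name).
\begin{align*}
c &:= (e^{q}\dot\phi)(T) > 0, \\
\dot\phi(t) &\ge c\,e^{-q(t)} \ge c\,e^{-q_\infty} > 0 \qquad \text{for all } t \ge T,
\end{align*}
% where the second inequality uses that q is nondecreasing (\theta > 0).
% This contradicts \dot\phi(t) \to 0, hence \theta \notin L^1([0,\infty)).
```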

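Lemma 20. Assume we have a solution to (189), (190), (198) on $(t_-,\infty)$ where $t_-<0$ and $R$ and $\sigma^2$ are functions satisfying $R\le0$ and $\sigma^2\ge0$ on this interval. If, furthermore, $\theta(0)>0$, then

$$\lim_{t\to\infty}\frac{\sigma^2 - R}{\theta^2} = 0, \tag{204}$$
$$\lim_{t\to\infty}\frac{\dot\phi}{\theta} = \frac{\lambda}{3}, \tag{205}$$
$$\lim_{t\to\infty}\frac{V}{\theta^2} = \frac13 - \frac{\lambda^2}{18}. \tag{206}$$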
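The three limits are tied together by the constraint (198): once (204) and (205) are known, (206) follows by dividing (198) by $\theta^2$ and passing to the limit. A short check of the arithmetic, offered as a sketch:

```latex
\begin{align*}
\frac23 - \frac{\sigma^2 - R}{\theta^2}
  &= \frac{\dot\phi^2}{\theta^2} + \frac{2V}{\theta^2}
  && \text{((198) divided by } \theta^2\text{)}, \\
\frac23 &= \frac{\lambda^2}{9} + 2\lim_{t\to\infty}\frac{V}{\theta^2}
  && \text{(by (204) and (205))}, \\
\lim_{t\to\infty}\frac{V}{\theta^2} &= \frac13 - \frac{\lambda^2}{18}
  && \text{(this is (206))}.
\end{align*}
```

Since $\lambda^2<2$, the limiting value exceeds $1/3 - 1/9 = 2/9$, so it is strictly positive.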

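Proof. Let $T_0$ be such that $\dot\phi(t)>0$ for all $t\ge T_0$. In the present proof, we shall consistently assume that $t\ge T_0$. Using (189) and (198), we obtain (one simply uses (198) to eliminate the expression involving $R$)

$$\frac{d}{dt}\frac{V}{\theta^2} = \dot\phi\,\frac{V}{\theta^2}\left[-\lambda + 2\frac{\dot\phi}{\theta} + 2\frac{\theta}{\dot\phi}\left(\frac13 - \frac{V}{\theta^2} + \frac{\sigma^2}{\theta^2}\right)\right],$$

an equation which should be compared with (11) of [17], cf. also the proof of Theorem 4, pp. 1660–1661 of [17]. Since

$$2x + \frac{2\alpha}{x} \ge 4\sqrt{\alpha}$$

for all $\alpha\ge0$ and $x>0$, we obtain (note that $1/3 - V/\theta^2\ge0$ due to (198))

$$\frac{d}{dt}\frac{V}{\theta^2} \ge \dot\phi\,\frac{V}{\theta^2}\left[-\lambda + 4\left(\frac13 - \frac{V}{\theta^2} + \frac{\sigma^2}{\theta^2}\right)^{1/2}\right].$$

Say, for the sake of argument, that

$$\frac{V}{\theta^2} \le \frac13 - \frac{\lambda^2}{16} - \epsilon$$

for some $\epsilon>0$ and for all $t\ge T$. Then

$$\frac13 - \frac{V}{\theta^2} + \frac{\sigma^2}{\theta^2} \ge \frac{\lambda^2}{16} + \epsilon$$

for all $t\ge T$, so that there is a constant $C(\epsilon)>0$ such that

$$\frac{d}{dt}\frac{V}{\theta^2} \ge C(\epsilon)\,\dot\phi\,\frac{V}{\theta^2}$$

for all $t\ge T$. Integrating this differential inequality, we obtain

$$\frac{V}{\theta^2}(t) \ge \frac{V}{\theta^2}(T)\exp\{C(\epsilon)[\phi(t) - \phi(T)]\}.$$

Due to Lemma 19, $\phi\to\infty$, so that $V/\theta^2\to\infty$, which contradicts (198). Due to the above arguments, once $V/\theta^2$ has exceeded $1/3 - \lambda^2/16 - \epsilon$ it will not decay below that to the future. To conclude: for any $\epsilon>0$, there is a $T$ such that

$$\frac{V}{\theta^2} \ge \frac13 - \frac{\lambda^2}{16} - \epsilon \tag{207}$$

holds for $t\ge T$. Using (189), (190) and (198) (in the expression that appears, one simply uses (198) to eliminate $\sigma^2$), we obtain

$$\frac{d}{dt}\frac{\dot\phi}{\theta} = \theta\,\frac{V}{\theta^2}\left(\lambda - 3\frac{\dot\phi}{\theta}\right) + \theta\,\frac{\dot\phi}{\theta}\,\frac{R}{\theta^2}. \tag{208}$$

Since $R\le0$ by assumption, we conclude that if

$$\frac{\dot\phi}{\theta} \ge \frac{\lambda}{3} + \epsilon$$

for some $\epsilon>0$ and for all $t\ge T$, then

$$\frac{d}{dt}\frac{\dot\phi}{\theta} \le -3\epsilon\,\theta\,\frac{V}{\theta^2}$$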


for all $t\ge T$. Since $V/\theta^2$ has a uniform positive lower bound, due to (207) and the fact that $\lambda^2<2$, and since $\theta\notin L^1([0,\infty))$, due to Lemma 19, this implies that $\dot\phi/\theta\to-\infty$, contradicting (198). Since the time derivative of $\dot\phi/\theta$ is negative for $\dot\phi/\theta>\lambda/3$, there is, for every $\epsilon>0$, a $T$ such that

$$\frac{\dot\phi}{\theta} \le \frac{\lambda}{3} + \epsilon$$

for all $t\ge T$. Define

$$S = \left(\frac23\theta^2 - \dot\phi^2 - 2V\right)e^{\lambda\phi},$$

a quantity which should be compared with $\tilde S$ defined in (3.1) of [10]. Then, using (198) to eliminate $R$ from (189),

$$\frac{dS}{dt} = -\theta\left(\frac23 - \lambda\frac{\dot\phi}{\theta}\right)S - \frac43\theta\sigma^2e^{\lambda\phi}.$$

Since $\dot\phi/\theta\le\lambda/3+\epsilon$ and $\lambda^2<2$, we have

$$\frac23 - \lambda\frac{\dot\phi}{\theta} \ge \frac23 - \frac{\lambda^2}{3} - \lambda\epsilon = \eta > 0$$

for $\epsilon$ small enough. Thus

$$\frac{dS}{dt} \le -\eta\,\theta S \tag{209}$$

for $t\ge T$ so that $S\to0$ since $\theta\notin L^1([0,\infty))$. Note that $V/\theta^2$ is bounded from below and from above by positive constants. Thus the same is true of $\theta^2e^{\lambda\phi}$. Since $S$ converges to zero, we thus conclude that

$$\lim_{t\to\infty}\left(\frac23 - \frac{\dot\phi^2}{\theta^2} - \frac{2V}{\theta^2}\right) = 0.$$

Combining this with (198), we conclude that (204) holds. Combining this observation with (208), the fact that $V/\theta^2$ has a positive lower bound and the fact that $\theta\notin L^1([0,\infty))$, we conclude that (205) must hold. Combining (198), (204) and (205), we obtain (206).

Lemma 21. Assume we have a solution to (189), (190), (198) on $(t_-,\infty)$, where $t_-<0$ and $R$ and $\sigma^2$ are functions satisfying $R\le0$ and $\sigma^2\ge0$ on this interval. If, furthermore, $\theta(0)>0$, then there are constants $C$, $c_{a_i}$ and $\beta>0$ such that for $t\ge1$,

$$\left|\phi - \frac{2}{\lambda}\ln t + \frac{c_0}{\lambda}\right| \le Ct^{-\beta}, \tag{210}$$
$$\left|t\dot\phi - \frac{2}{\lambda}\right| \le Ct^{-\beta}, \tag{211}$$


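where $c_0$ is the constant defined in (12). Assuming, furthermore, that

$$\dot a_i = \left(\sigma_i + \frac13\theta\right)a_i, \tag{212}$$

where the $\sigma_i$ are functions such that

$$\sum_{i=1}^3\sigma_i^2 \le \sigma^2, \tag{213}$$

we have

$$\left|\ln\frac{a_i(t)}{a_i(0)} - \frac{2}{\lambda^2}\ln t - c_{a_i}\right| \le Ct^{-\beta}, \tag{214}$$
$$\left|\frac{t\dot a_i}{a_i} - \frac{2}{\lambda^2}\right| \le Ct^{-\beta}. \tag{215}$$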

Proof. Let us introduce a new time coordinate

$$\tau(t) = \int_0^t\theta(s)\,ds. \tag{216}$$

Note that $\tau\to\infty$ as $t\to\infty$ due to Lemma 19. Furthermore

$$\frac{d\tau}{dt} = \theta.$$

Due to (209), we conclude that $S$ converges to zero exponentially in $\tau$-time. In other words, there are constants $C$ and $\alpha>0$ such that

$$\left|\frac23 - \frac{\dot\phi^2}{\theta^2} - \frac{2V}{\theta^2}\right| \le Ce^{-\alpha\tau}$$

for $\tau\ge0$. Combining this fact with (208) and (198), we conclude that $\dot\phi/\theta$ converges to $\lambda/3$ exponentially in $\tau$-time. To see this, derive an equation for $e^{\alpha\tau}(\dot\phi/\theta - \lambda/3)$; for $\alpha>0$ small enough the resulting equation implies that this quantity has to converge to zero. As a consequence, $V/\theta^2$ converges to $1/3 - \lambda^2/18$ exponentially. Compute

$$\frac{d}{d\tau}\ln\theta = \frac{\dot\theta}{\theta^2} = -\frac32\frac{\sigma^2}{\theta^2} + \frac12\frac{R}{\theta^2} - \frac32\left(\frac{\dot\phi^2}{\theta^2} - \frac{\lambda^2}{9}\right) - \frac{\lambda^2}{6}. \tag{217}$$

By the above observations, we have

$$\left|\ln\frac{\theta(\tau)}{\theta(0)} + \frac{\lambda^2}{6}\tau - c_\theta\right| \le Ce^{-\alpha\tau}$$

for some suitably chosen $c_\theta$, where we have abused notation by writing $\theta(\tau)$ when we should in fact write $\tilde\theta(\tau)$, where $\tilde\theta$ is the function such that $\tilde\theta[\tau(t)] = \theta(t)$. Letting $r(\tau)$ be the expression inside the absolute value signs, we obtain

$$\theta(\tau) = \theta(0)\exp\left(-\frac{\lambda^2}{6}\tau + c_\theta + r(\tau)\right).$$


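Since $dt/d\tau = 1/\theta$, this leads to

$$t(\tau) = \frac{1}{\theta(0)}\int_0^\tau\exp\left(\frac{\lambda^2}{6}s - c_\theta - r(s)\right)ds.$$

Combining $\theta(0)$ and $c_\theta$ into one constant, say $c_1$, this leads to

$$t(\tau) = \int_0^\tau\exp\left(\frac{\lambda^2}{6}s + c_1 - r(s)\right)ds = \frac{6}{\lambda^2}\exp\left(\frac{\lambda^2}{6}\tau + c_1\right)\left[1 + O(e^{-\alpha\tau})\right]$$

for $\tau\ge0$, where $0<\alpha<\lambda^2/6$. As a consequence,

$$\tau = \frac{6}{\lambda^2}\ln t + c_2 + O(t^{-\beta})$$

for $t\ge1$ and some constants $\beta>0$ and $c_2$. Since $\phi_\tau = \dot\phi/\theta$ converges to $\lambda/3$ exponentially, we conclude that

$$\phi = \frac{2}{\lambda}\ln t + c_3 + O(t^{-\beta}) \tag{218}$$

for $t\ge1$ and some constant $c_3$. Note that, cf. (217),

$$\frac{d}{d\tau}(t\theta) = 1 + t\theta\,\frac{\dot\theta}{\theta^2} = 1 + t\theta\left(-\frac{\lambda^2}{6} + O(e^{-\alpha\tau})\right) = \left(t\theta - \frac{6}{\lambda^2}\right)\left(-\frac{\lambda^2}{6} + O(e^{-\alpha\tau})\right) + O(e^{-\alpha\tau}).$$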
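The exponential convergence of $t\theta$ follows from the linear structure of the last identity. A minimal sketch, writing $y = t\theta - 6/\lambda^2$ and $\mu = \lambda^2/6$ (both names, as well as $\varepsilon$, $\rho$, $E$ and $c$, are introduced here for illustration and are not the paper's notation), and assuming $0<\alpha<\mu$ as in the text:

```latex
\begin{align*}
% the identity for d(t\theta)/d\tau, rewritten for y = t\theta - 6/\lambda^2:
\frac{dy}{d\tau} &= -\bigl(\mu - \varepsilon(\tau)\bigr)\,y + \rho(\tau),
  \qquad |\varepsilon(\tau)| + |\rho(\tau)| \le Ce^{-\alpha\tau}, \\
% integrating factor; \varepsilon is integrable, so E is comparable to e^{\mu\tau}:
E(\tau) &:= \exp\Bigl(\mu\tau - \int_0^\tau\varepsilon(s)\,ds\Bigr),
  \qquad c^{-1}e^{\mu\tau} \le E(\tau) \le c\,e^{\mu\tau}, \\
% variation of constants, since d(Ey)/d\tau = E\rho:
y(\tau) &= E(\tau)^{-1}\Bigl[y(0) + \int_0^\tau E(s)\rho(s)\,ds\Bigr], \\
|y(\tau)| &\le c\,e^{-\mu\tau}|y(0)| + \frac{c^2C}{\mu - \alpha}\,e^{-\alpha\tau}
  = O(e^{-\alpha\tau}).
\end{align*}
```

Because $\tau = (6/\lambda^2)\ln t + c_2 + O(t^{-\beta})$, a bound of order $e^{-\alpha\tau}$ translates into a bound of order $t^{-6\alpha/\lambda^2}$, which is the polynomial rate used in the statement of the lemma.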

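Thus $t\theta$ converges to $6/\lambda^2$ exponentially so that (211) holds for $t\ge1$, since $\dot\phi/\theta$ converges to $\lambda/3$ exponentially. Due to (212) and (213), we have

$$\ln\frac{a_i(t)}{a_i(0)} = \int_0^t\left(\sigma_i + \frac13\theta\right)ds = \int_0^\tau\left(\frac{\sigma_i}{\theta} + \frac13\right)d\tau = \frac13\tau + c_{1,i} + O(e^{-\alpha\tau}) = \frac{2}{\lambda^2}\ln t + c_{2,i} + O(t^{-\beta}),$$

yielding (214), and (215) follows from

$$\frac{t\dot a_i}{a_i} = t\theta\,\frac{\sigma_i}{\theta} + \frac13t\theta = \frac{2}{\lambda^2} + O(t^{-\beta})$$

for $t\ge1$. What remains to be proved is (210). Consider (190). Let us introduce

$$\psi = \phi - \frac{2}{\lambda}\ln t + \frac{c_0}{\lambda}.$$

At this stage, we only know that $\psi$ converges to a constant, with an error of the form $O(t^{-\beta})$, cf. (218), and that (211) holds. Compute, using the fact that $p = 2/\lambda^2$ in the present situation,

$$\begin{aligned}
t^2\ddot\psi &= t^2\left[-\theta\dot\psi - \theta\frac{2}{\lambda t} + \lambda V_0t^{-2}\,\frac{2(3p-1)p}{2V_0}\,e^{-\lambda\psi} + \frac{2}{\lambda t^2}\right]\\
&= -t\theta\,t\dot\psi - \frac{2}{\lambda}(t\theta - 1) + \lambda(3p-1)p\,e^{-\lambda\psi}\\
&= -t\theta\,t\dot\psi - \frac{2}{\lambda}\left(t\theta - \frac{6}{\lambda^2}\right) - \frac{2}{\lambda}\left(\frac{6}{\lambda^2} - 1\right)(1 - e^{-\lambda\psi}).
\end{aligned}$$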


The first two terms on the right-hand side are $O(t^{-\beta})$ due to (211) and the fact that $t\theta$ converges to $6/\lambda^2$ with an error of the order of magnitude $t^{-\beta}$. If the constant $c_3$ appearing in (218) is $-c_0/\lambda$ we are done, so let us assume not. Then the above shows that

$$t^2\ddot\psi = \alpha_0 + O(t^{-\beta})$$

for some $\alpha_0\ne0$. Since $t\dot\psi = O(t^{-\beta})$, we conclude that

$$t\partial_t(t\dot\psi) = \alpha_0 + O(t^{-\beta}).$$

Integrating this equality from $T\ge1$, we get

$$t\dot\psi(t) = T\dot\psi(T) + \alpha_0\ln\frac{t}{T} + O(1).$$

Since everything in this equation is bounded except for $\ln(t/T)$, we get a contradiction, and the lemma follows.

Let us assume the initial data are specified on $\mathbb H^3$ and that they are invariant under the isometry group of the corresponding canonical metric. By arguments similar to those given in Sect. 17 of [20], the initial data for the metric and second fundamental form can be assumed to be of the form $g = \alpha^2g_{\mathbb H^3}$ and $k = \alpha\beta g_{\mathbb H^3}$ for positive constants $\alpha$ and $\beta$ and it is enough to consider metrics of the form

$$\bar g = -dt^2 + a^2(t)g_{\mathbb H^3} \tag{219}$$

on $I\times\mathbb H^3$ for some open interval $I$. Using the formulas (1)–(3), p. 211 of [14] to compute the Ricci tensor, one concludes that (5) and (6) in the current situation are equivalent to

$$\frac{\ddot a}{a} = -\frac13\dot\phi^2 + \frac13V(\phi),\qquad 6\left(\frac{\dot a}{a}\right)^2 - \frac{6}{a^2} = \dot\phi^2 + 2V(\phi),\qquad \ddot\phi + 3\frac{\dot a}{a}\dot\phi + V'(\phi) = 0. \tag{220}$$

The first and the last of these equations can be used as evolution equations given initial data. Collecting all the terms in the middle equation on the right-hand side and denoting the result $f$, one can compute, using the first and the last equation, that $\dot f$ is a multiple of $f$. Since $f(0) = 0$ (this is simply the Hamiltonian constraint), one obtains $f = 0$, where the solution exists. Letting $R = -6/a^2$ (this is simply the scalar curvature of the hypersurfaces $\{t\}\times\mathbb H^3$), $\theta = 3\dot a/a$ (this is simply the trace of the second fundamental form of the hypersurfaces $\{t\}\times\mathbb H^3$) and $\sigma^2 = 0$, one can compute, using (220), that (189), (190) and (198) hold in the present setting. By arguments similar to the proof of Lemma 18, one can prove that future global existence holds. Since, in our case, $\dot a = \theta a/3$, we are thus allowed to use the conclusions of Lemma 21 with $a_i = a$ and $\sigma_i = 0$.

Finally, consider the case that the initial data are specified on $\mathbb H^2\times\mathbb R$ and are invariant under the isometry group of the corresponding canonical metric. Then, by the same argument that was presented in Sect. 17 of [20], the initial data can be assumed to be of the form $g = a_0^2g_{\mathbb H^2} + b_0^2dz^2$, $k = a_1a_0g_{\mathbb H^2} + b_1b_0dz^2$, and it is enough to consider metrics of the form

$$\bar g = -dt^2 + a^2(t)g_{\mathbb H^2} + b^2(t)dz^2. \tag{221}$$
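For the isotropic case (219)–(220) treated above, the identification with (189), (190) and (198) can be checked directly. The sketch below uses only the definitions $\theta = 3\dot a/a$, $R = -6/a^2$ and $\sigma^2 = 0$ given in the text, together with the three equations of (220).

```latex
\begin{align*}
% constraint: the middle equation of (220) is (198) with \sigma^2 = 0
\tfrac23\theta^2 - \sigma^2 + R
  &= 6\Bigl(\frac{\dot a}{a}\Bigr)^2 - \frac{6}{a^2}
   = \dot\phi^2 + 2V(\phi), \\
% evolution of \theta: first equation of (220) minus half of the middle one
\dot\theta = 3\frac{\ddot a}{a} - 3\Bigl(\frac{\dot a}{a}\Bigr)^2
  &= \bigl(-\dot\phi^2 + V\bigr) - \Bigl(\tfrac12\dot\phi^2 + V + \frac{3}{a^2}\Bigr)
   = \tfrac12 R - \tfrac32\dot\phi^2, \\
% scalar field: last equation of (220) is (190)
\ddot\phi + \theta\dot\phi + V'(\phi)
  &= \ddot\phi + 3\frac{\dot a}{a}\dot\phi + V'(\phi) = 0.
\end{align*}
```

The second line is (189) with $\sigma^2 = 0$, which is why Lemmas 19 to 21 apply without change in this case.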


When computing the Ricci curvature of (221), it is convenient to note that the spacetime $(I\times\mathbb H^2\times\mathbb R,\bar g)$, where $I$ is an open interval, can be viewed as a warped product with warping function $a$ and $B = I\times\mathbb R$, $g_B = -dt^2 + b^2(t)dz^2$, $F = \mathbb H^2$, $g_F = g_{\mathbb H^2}$, using the terminology of [14], pp. 204-211. One can compute that (5) is equivalent to

$$\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2 + \frac{\dot a\dot b}{ab} - \frac{1}{a^2} = V(\phi),\qquad \frac{\ddot b}{b} + 2\frac{\dot a\dot b}{ab} = V(\phi), \tag{222}$$

$$\left(\frac{\dot a}{a}\right)^2 + 2\frac{\dot a\dot b}{ab} - \frac{1}{a^2} = \frac12\dot\phi^2 + V(\phi). \tag{223}$$

Equation (6) for the scalar field turns into

$$\ddot\phi + \theta\dot\phi + V'(\phi) = 0, \tag{224}$$

where $\theta = 2\dot a/a + \dot b/b$ is the trace of the second fundamental form of the hypersurfaces $\{t\}\times\mathbb H^2\times\mathbb R$ (note that we shall assume $\theta(0)>0$ in what follows). We evolve the initial data using the evolution eq. (222) and (224). If we collect all the terms in (223) on the left-hand side, denote the resulting function $f$, then (222) and (224) imply that $\dot f = -2\theta f$, so that $f$ vanishes, where the solution is defined, since $f(0) = 0$ due to the fact that the initial data satisfy the Hamiltonian constraint (7). As a consequence, the development satisfies (5) and (6). Let us introduce $\sigma = \sqrt2(\dot a/a - \dot b/b)/\sqrt3$ and $R = -2/a^2$ (this is the scalar curvature of the hypersurfaces $\{t\}\times\mathbb H^2\times\mathbb R$). Then (223) takes the form (198). Using (222) and (223), one can prove that (189) holds. Finally, note that (224) and (190) coincide in the current setting. In order to prove future global existence, one proceeds similarly to Lemma 18. Due to the above observations, we are allowed to use the conclusions of Lemma 21 with $a_1 = a_2 = a$, $a_3 = b$, $\sigma_1 = \sigma_2 = \sigma/\sqrt6$ and $\sigma_3 = -\sqrt2\sigma/\sqrt3$.

Proof (Theorem 3). Let us assume we have a metric of the form (199) on $I\times G$, where $G$ is a 3-dimensional Lie group, $I$ is an open interval containing $(t_0,\infty)$ for $t_0$ large enough and $\xi^i$ are the duals of a basis $\{e_i\}$ for the Lie algebra (the metrics (219) and (221) can be written in this form due to the fact that hyperbolic space can be considered as a Lie group with a left invariant metric, cf. Sect. 17 of [20]). Assume furthermore that

$$\lim_{t\to\infty}t^{-2/\lambda^2}a_i(t) = \alpha_i,\qquad \lim_{t\to\infty}\frac{t\dot a_i}{a_i} = \frac{2}{\lambda^2},$$

for some $\alpha_i>0$ and that (210) and (211) hold. Note that these assumptions hold in the cases of interest here, due to the arguments given at the beginning of the present section. Assume finally that there is a group of diffeomorphisms $\Gamma$ acting freely and properly discontinuously on $G$ such that $\{\mathrm{Id}\}\times\Gamma$ is a group of isometries of $\bar g$ and such that the quotient of $G$ under $\Gamma$ is compact (it is clear that the groups under consideration in the theorem are of this type in the unimodular case, due to our assumptions, and in the remaining cases due to the forms of the metrics in these cases, cf. (219) and (221)). Let $\Sigma$ denote the quotient and let $\pi: G\to\Sigma$ be the covering projection. Let us define a reference metric

$$h = \sum_{i=1}^3\alpha_i^2\,\xi^i\otimes\xi^i$$


on $G$. Note that since

$$\hat h = t^{-4/\lambda^2}\sum_{i=1}^3a_i^2(t)\,\xi^i\otimes\xi^i$$

converges to the metric $h$ as $t\to\infty$ and $\Gamma$ is a group of isometries of $\hat h$, $\Gamma$ is a group of isometries of $h$. Consequently, $h$ induces a metric on $\Sigma$. In what follows it will be useful to compare $\partial_{y^i}$ for some coordinates $y$ with the basis $e_i$. Unfortunately, we cannot assume that the $e_i$ are well defined on $\Sigma$, since the group $\Gamma$ may contain diffeomorphisms that do not map $e_i$ to itself. On the other hand, there is an $\epsilon_0>0$ such that if $\epsilon\le\epsilon_0$ and $q\in\Sigma$, then $B_\epsilon(q)$ (measured with respect to the metric $h$) is such that $\pi^{-1}[B_\epsilon(q)]$ consists of a disjoint collection of open sets such that $\pi$, restricted to any connected member of the disjoint union, is an isometry onto $B_\epsilon(q)$. One can use one of these isometries to push the basis $e_i$ (and thus $\xi^i$) forward to $B_\epsilon(q)$. However, the result will in general depend on the choice of connected member of $\pi^{-1}[B_\epsilon(q)]$; below we shall speak of a choice of $\xi^i$ on $B_\epsilon(q)$. In [20], we proved that there is an $\epsilon>0$ and a $K>0$ such that for every $q\in\Sigma$, there are normal coordinates $y^i$ on $B_\epsilon(q)$ with respect to the metric $h$, and a choice of $\xi^i$ such that if $\zeta^i_j = \xi^i(\partial_{y^j})$, then all the derivatives of $\zeta^i_j$ with respect to $y^l$ up to order $k_0+1$ are bounded by $K$ in the sup norm on $B_\epsilon(q)$ (cf. pp. 204–205 of [20]).

Let $\epsilon>0$ and $K>0$ be as above and $q\in\Sigma$. Let $y^i$ be normal coordinates on $B_\epsilon(q)$ with respect to the metric $h$, and make a choice of $\xi^i$ such that if $\zeta^i_j = \xi^i(\partial_{y^j})$, then all the derivatives of $\zeta^i_j$ with respect to $y^l$ up to order $k_0+1$ are bounded by $K$ in the sup norm on $B_\epsilon(q)$. The initial data induced on the hypersurface $\{t\}\times G$ are given by

$$g = \sum_{i=1}^3a_i^2(t)\,\xi^i\otimes\xi^i,\qquad k = \sum_{i=1}^3\dot a_i(t)a_i(t)\,\xi^i\otimes\xi^i,\qquad \phi(t),\qquad \dot\phi(t).$$

Let us introduce coordinates $x^i = [4\ell(t)]^{-1}t^{2/\lambda^2}y^i$. For $t$ large enough, the range of $x^i$ contains the ball of radius 1 (recall that $\lambda^2<2$). Note that

$$g_{ij} = g(\partial_{x^i},\partial_{x^j}) = [4\ell(t)]^2\sum_{l=1}^3t^{-4/\lambda^2}a_l^2(t)\,(\xi^l\otimes\xi^l)(\partial_{y^i},\partial_{y^j}).$$

Since $t^{-2/\lambda^2}a_i(t)\to\alpha_i$ as $t\to\infty$, $h(\partial_{y^i},\partial_{y^j}) = \delta_{ij}$ at $q$, the derivatives of $\xi^l(\partial_{y^i})$ with respect to $y^j$ are bounded by $K$ on $B_\epsilon(q)$ and the ball of radius 1 with respect to the $x^i$ coordinates corresponds to a ball of an arbitrarily small radius with respect to the $y^i$ coordinates for $t$ large enough, we conclude that for $t$ large enough (the bound being independent of $q$),

$$[4\ell(t)]^{-2}g_{ij} - \delta_{ij} \tag{225}$$

is arbitrarily small in the ball of radius 1 with respect to the $x^i$ coordinates. Since

$$\frac{\partial}{\partial x^i} = 4\ell(t)\,t^{-2/\lambda^2}\frac{\partial}{\partial y^i},$$


and $\xi^i(\partial_{y^j})$ is bounded in $C^{k_0+1}$, the spatial derivatives of the expression appearing in (225) with respect to $x^l$ are arbitrarily small for $t$ large enough (independent of $q$). Similarly,

$$k_{ij} = k(\partial_{x^i},\partial_{x^j}) = [4\ell(t)]^2\sum_{l=1}^3t^{-4/\lambda^2}\dot a_l(t)a_l(t)\,(\xi^l\otimes\xi^l)(\partial_{y^i},\partial_{y^j}).$$

Since, in addition to the above observations,

$$\lim_{t\to\infty}t^{-2/\lambda^2}\,t\dot a_i(t) = \frac{2}{\lambda^2}\alpha_i,$$

we conclude that (recall that $p = 2/\lambda^2$)

$$tp^{-1}[4\ell(t)]^{-2}k_{ij} - \delta_{ij} \tag{226}$$

is arbitrarily small in a ball of radius 1 with respect to the $x^i$-coordinates. Furthermore, the derivatives of the expression appearing in (226) with respect to $\partial_{x^l}$ are arbitrarily small.

There is one problem with the above argument of course; in Theorem 2, the time $t_0$ used is determined by the mean value of the scalar field. In fact, instead of $\ell(t)$, we should use $\ell[t_0(t)]$ in (225), where

$$t_0(t) = \exp\left[\frac12\bigl(\lambda\phi(t) + c_0\bigr)\right],$$

and similarly in (226). Due to (210), we have $t_0(t) = t[1 + O(t^{-\beta})]$. As a consequence

$$\frac{[4\ell(t)]^{-2}}{\{4\ell[t_0(t)]\}^{-2}} = 1 + O(t^{-\beta}),\qquad \frac{tp^{-1}[4\ell(t)]^{-2}}{t_0(t)p^{-1}\{4\ell[t_0(t)]\}^{-2}} = 1 + O(t^{-\beta}).$$
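The relation $t_0(t) = t[1 + O(t^{-\beta})]$ used above is a direct rewriting of (210). A brief sketch of the step, using only the formula for $t_0(t)$ and the bound (210):

```latex
\begin{align*}
% split off \ln t inside the exponent
t_0(t) &= \exp\Bigl[\tfrac12\bigl(\lambda\phi(t) + c_0\bigr)\Bigr]
        = t\,\exp\Bigl[\tfrac{\lambda}{2}\Bigl(\phi(t) - \tfrac{2}{\lambda}\ln t
          + \tfrac{c_0}{\lambda}\Bigr)\Bigr], \\
% insert the bound (210) and expand the exponential
\Bigl|\phi(t) - \tfrac{2}{\lambda}\ln t + \tfrac{c_0}{\lambda}\Bigr| \le Ct^{-\beta}
 \quad&\Longrightarrow\quad
 t_0(t) = t\,\exp\bigl[O(t^{-\beta})\bigr] = t\,\bigl[1 + O(t^{-\beta})\bigr].
\end{align*}
```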

In other words, whether we use $t$ or $t_0(t)$ does not make any difference as far as the conclusions are concerned. Note that, by definition, $\phi(t) - \phi_0[t_0(t)]$ is zero, and by (211) and the above observations,

$$t_0(t)\dot\phi(t) - t_0(t)\dot\phi_0[t_0(t)]$$

converges to zero. Since this object is spatially homogeneous, we are allowed to conclude that for $t$ large enough, (16) is satisfied with $\epsilon$ replaced by $\epsilon/2$, where the coordinates are of the form described above (regardless of the point $q$). Combining Theorem 2 of the present paper with Theorem 7 of [20], we get the desired stability statement.

Acknowledgement. Part of this work was carried out while the author was enjoying the hospitality of the Isaac Newton Institute for Mathematical Sciences and the Max Planck Institute for Gravitational Physics. The research was supported by the Swedish Research Council and the Göran Gustafsson Foundation. The author is a Royal Swedish Academy of Sciences Research Fellow supported by a grant from the Knut and Alice Wallenberg Foundation.


References

1. Anderson, M.T.: Existence and Stability of even-dimensional asymptotically de Sitter spaces. Ann. Henri Poincaré 6, 801–820 (2005)
2. Choquet-Bruhat, Y., Geroch, R.: Global aspects of the Cauchy problem in General Relativity. Commun. Math. Phys. 14, 329–335 (1969)
3. Choquet-Bruhat, Y., Isenberg, J., Pollack, D.: The constraint equations for the Einstein-scalar field system on compact manifolds. Class. Quant. Grav. 24, 809–828 (2007)
4. Christodoulou, D., Klainerman, S.: The Global Non-Linear Stability of the Minkowski Space. Princeton, NJ: Princeton University Press, 1993
5. Friedrich, H.: On the existence of n-geodesically complete or future complete solutions of Einstein's field equations with smooth asymptotic structure. Commun. Math. Phys. 107, 587–609 (1986)
6. Friedrich, H., Rendall, A.D.: The Cauchy problem for the Einstein equations. In: Einstein's Field Equations and their Physical Implications. Lecture Notes in Phys. 540, Berlin: Springer, 2000
7. Halliwell, J.J.: Scalar fields in cosmology with an exponential potential. Phys. Lett. B 185, 341–344 (1987)
8. Hebey, E., Pacard, F., Pollack, D.: A variational analysis of Einstein-scalar field Lichnerowicz equations on compact Riemannian manifolds. Commun. Math. Phys. 278(1), 117–132 (2008)
9. Heinzle, J.M., Rendall, A.D.: Power-law Inflation in Spacetimes without Symmetry. Commun. Math. Phys. 269, 1–15 (2007)
10. Kitada, Y., Maeda, K.: Cosmic no-hair theorem in homogeneous spacetimes: I. Bianchi Models. Class. Quant. Grav. 10, 703–734 (1993)
11. Lee, H.: The Einstein-Vlasov system with a scalar field. Ann. H. Poincaré 6, 697–723 (2005)
12. Lindblad, H., Rodnianski, I.: Global existence for the Einstein vacuum equations in wave coordinates. Commun. Math. Phys. 256, 43–110 (2005)
13. Lindblad, H., Rodnianski, I.: The global stability of Minkowski space-time in harmonic gauge. Ann. of Math., accepted, available at http://pjm.math.berkeiey.edu/editorial/uploads/annals/accepted/080517Rodnianski/080517-Rodnianski-v1.pdf
14. O'Neill, B.: Semi Riemannian Geometry. Orlando, FL: Academic Press, 1983
15. Raymond, F., Vasquez, T.: 3-manifolds whose universal coverings are Lie groups. Top. and Appl. 12, 161–179 (1981)
16. Rendall, A.D.: Accelerated cosmological expansion due to a scalar field whose potential has a positive lower bound. Class. Quant. Grav. 21, 2445–2454 (2004)
17. Rendall, A.D.: Intermediate inflation and the slow-roll approximation. Class. Quant. Grav. 22, 1655–1666 (2005)
18. Rendall, A.D.: Dynamics of k-essence. Class. Quant. Grav. 23, 1557–1570 (2006)
19. Ringström, H.: The Bianchi IX attractor. Ann. H. Poincaré 2, 405–500 (2001)
20. Ringström, H.: Future stability of the Einstein non-linear scalar field system. Invent. Math. 173, 123–208 (2008)

Communicated by G.W. Gibbons

Commun. Math. Phys. 290, 219–238 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0762-z


Robustness of Discrete Dynamics via Lyapunov Sequences

Luis Barreira, Claudia Valls

Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]; [email protected]

Received: 15 August 2008 / Accepted: 17 November 2008
Published online: 28 February 2009 – © Springer-Verlag 2009

Partially supported by FCT through CAMGSD, Lisbon.

Abstract: For a nonautonomous dynamics with discrete time defined by a sequence of matrices, we give a complete characterization of nonuniform exponential contractions and nonuniform exponential dichotomies in terms of Lyapunov sequences. We note that these include as very special cases uniform exponential contractions and uniform exponential dichotomies. Due to the central role played by these properties in a substantial part of the theory of dynamical systems, in particular in connection with the study of stable and unstable invariant manifolds, it is important to have available optimal characterizations that are more amenable to check whether a given dynamics has such a property. We also obtain inverse theorems that give explicitly Lyapunov sequences for a given contraction or dichotomy. As a nontrivial application, we establish the robustness under sufficiently small linear perturbations both of nonuniform exponential contractions and nonuniform exponential dichotomies. We emphasize that when compared to former work, our proof of the robustness property is much simpler.

1. Introduction

1.1. Main objectives. The purpose of this paper is twofold. For a nonautonomous dynamics with discrete time defined by a sequence of $p \times p$ matrices $(A_m)_{m\in\mathbb{Z}}$, that is, for the dynamics in $\mathbb{R}^p$ defined by

\[
x_{m+1} = A_m x_m, \quad m \in \mathbb{Z}, \tag{1}
\]

we want to:

1. characterize completely when it admits a nonuniform exponential contraction or a nonuniform exponential dichotomy in terms of Lyapunov sequences (all these notions are discussed below in the introduction);


2. and as a consequence, establish in a simple manner the stability under sufficiently small linear perturbations (usually called the robustness property) of nonuniform exponential contractions and nonuniform exponential dichotomies.

We first recall the notion of nonuniform exponential dichotomy. Given a sequence $(A_m)_{m\in\mathbb{Z}}$ of invertible $p \times p$ matrices we define

\[
A(m,n) =
\begin{cases}
A_{m-1} \cdots A_n & \text{if } m > n,\\
\mathrm{Id} & \text{if } m = n,\\
A_m^{-1} \cdots A_{n-1}^{-1} & \text{if } m < n.
\end{cases}
\]

We say that the sequence $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy if there exist projections $P_m$ for $m \in \mathbb{Z}$ such that

\[
P_m A(m,n) = A(m,n) P_n, \quad m, n \in \mathbb{Z}, \tag{2}
\]

and there exist constants $a < 0 < b$, $D > 0$ and $\varepsilon \ge 0$ such that

\[
\|A(m,n) P_n\| \le D e^{a(m-n)+\varepsilon|n|}, \quad m \ge n, \tag{3}
\]

and

\[
\|A(m,n)^{-1} Q_m\| \le D e^{-b(m-n)+\varepsilon|m|}, \quad m \ge n, \tag{4}
\]

where $Q_m = \mathrm{Id} - P_m$ is the complementary projection. We say that $(A_m)_{m\in\mathbb{Z}}$ admits a uniform exponential dichotomy if it admits a nonuniform exponential dichotomy with $\varepsilon = 0$. In the particular case of an autonomous dynamics defined by a constant sequence $A_m = A$, one can easily verify that if the sequence admits a nonuniform exponential dichotomy, then the dichotomy is uniform. Thus, the notion of nonuniform exponential dichotomy can be seen as an extension of the concept of hyperbolicity to nonautonomous systems.

A Lyapunov sequence is an appropriate generalization of a Lyapunov function for an arbitrary nonautonomous dynamics (here in the case of discrete time), in which a single function decreasing along the dynamics is replaced by a sequence of functions, now called a Lyapunov sequence, with a corresponding property for the nonautonomous dynamics in (1). More precisely, a sequence of functions $(H_m : \mathbb{R}^p \to \mathbb{R})_{m\in\mathbb{Z}}$ is called a Lyapunov sequence if

\[
H_{m+1}(A_m x) \le H_m(x), \quad m \in \mathbb{Z},\ x \in \mathbb{R}^p.
\]

In the particular case of an autonomous dynamics defined by a constant sequence $A_m = A$, a function $H : \mathbb{R}^p \to \mathbb{R}$ is called a Lyapunov function if $H(Ax) \le H(x)$ for $x \in \mathbb{R}^p$. Thus, this notion can be seen as a particular case of Lyapunov sequence. In this paper we consider quadratic Lyapunov sequences $(H_m)_{m\in\mathbb{Z}}$, in which case all functions $H_m$ are quadratic forms, that is, $H_m(x) = \langle S_m x, x\rangle$ for some $p \times p$ matrices $S_m$. It is sometimes easier to make explicit computations with these sequences and we take full advantage of it in the paper.

The importance of Lyapunov functions is well established, particularly in the study of the stability of trajectories both under linear and nonlinear perturbations. The notion of Lyapunov sequence plays a similar important role in the general case of nonautonomous


dynamics (here in the case of discrete time). According to Coppel in [10], the connection between Lyapunov functions and the classical (uniform) exponential dichotomies was first considered by Maĭzel' in 1954 in [17]. We refer to the book by Mitropolsky, Samoilenko and Kulik [20] for a detailed discussion in the case of continuous time. The use of Lyapunov functions in the study of the stability of trajectories, both in the finite and in the infinite-dimensional settings, goes back to the seminal work of Lyapunov in his 1892 thesis, republished most recently in [16]. Among the first accounts of the theory are the books by LaSalle and Lefschetz [15], Hahn [12], and Bhatia and Szegö [6].

1.2. Nonuniform hyperbolicity. The notion of (uniform) exponential dichotomy, introduced by Perron in 1930 in [23], plays a central role in a substantial part of the theory of dynamical systems. In particular, the existence of an exponential dichotomy for a linear dynamics implies the existence of stable and unstable invariant manifolds for any sufficiently small nonlinear perturbation. For example, for the geodesic flow of a Riemannian manifold of strictly negative curvature, the linear variational equation along any trajectory admits an exponential dichotomy. We note that the theory of exponential dichotomies and its applications are widely developed. We refer the reader to the books [5,7,13,14,26] for details and further references.

On the other hand, the existence of exponential dichotomies is a strong requirement and it is of interest to look for more general types of hyperbolic behavior, particularly in the case of nonautonomous systems. In this paper we consider the more general notion of nonuniform exponential dichotomy (see Sect. 3 for the definition). In comparison with the notion of uniform exponential dichotomy it is a much weaker requirement.
In particular, essentially any linear dynamics in $\mathbb{R}^p$ with nonzero Lyapunov exponents has a nonuniform exponential dichotomy (see [3] for details). Moreover, it follows from Oseledets' multiplicative ergodic theorem that from the point of view of ergodic theory the nonuniform behavior is also very common. To formulate a rigorous statement we recall that a measurable transformation $f : X \to X$ is said to preserve a measure $\mu$ in $X$ if $\mu(f^{-1}A) = \mu(A)$ for every measurable set $A \subset X$. Let also $M_p$ be the space of $p \times p$ matrices.

Theorem 1. Let $f : X \to X$ and $A : X \to M_p$ be measurable transformations. If $f$ preserves a finite measure $\mu$ in $X$ such that $\log^+ \|A\| \in L^1(X, \mu)$, and

\[
\limsup_{m\to+\infty} \frac{1}{m} \log \|A(f^m(q)) \cdots A(f(q)) A(q) x\| < 0
\]

for $\mu$-almost every $q \in X$ and every $x \in \mathbb{R}^p$, then for $\mu$-almost every $q \in X$ there exists $\lambda = \lambda(q) > 0$ and for each $\varepsilon > 0$ there exists $c = c(q, \varepsilon) > 0$ such that

\[
\|A(f^{m-1}(q)) \cdots A(f^n(q))\| \le c e^{-\lambda(m-n)+\varepsilon n}, \quad m \ge n. \tag{5}
\]

Theorem 1 is a consequence of Oseledets’ multiplicative ergodic theorem in [22] (see for example [1] for a detailed discussion, including the case when there are simultaneously negative and positive Lyapunov exponents, which corresponds to the existence of nonuniform exponential dichotomies). In particular, Theorem 1 shows that in the context of ergodic theory the nonuniformity given by the constant ε in (5) can be


made arbitrarily small for almost all trajectories, although not necessarily zero. Thus, it is important to study the asymptotic stability in the general nonuniform case. This observation is particularly relevant in the study of perturbations, since one needs the perturbation to decrease sufficiently fast so that it compensates for a possible nonzero nonuniformity. On the other hand, it follows from the work of Barreira and Schmeling in [2] that for some classes of measure-preserving transformations, the set of points $q \in X$ for which the nonuniformity cannot be made equal to zero has full topological entropy and full Hausdorff dimension.

1.3. Lyapunov sequences and robustness property. Due to the central role played by the notion of exponential dichotomy, particularly in stability theory and perturbation theory, it is important to understand how exponential dichotomies vary under perturbations. This is the so-called robustness problem. More precisely, we say that a nonuniform exponential dichotomy defined by a sequence $(A_m)_{m\in\mathbb{Z}}$ is robust if for any sequence $(B_m)_{m\in\mathbb{Z}}$ that is sufficiently small (in some precise sense), the sequence $(A_m + B_m)_{m\in\mathbb{Z}}$ still has a nonuniform exponential dichotomy.

Compared with former work, our proof of the robustness of nonuniform exponential contractions and nonuniform exponential dichotomies is much simpler. In particular, we recover in a very simple manner results in [4] in the particular case of bounded sequences $(A_m)_{m\in\mathbb{Z}}$. Our proof is based on the characterization of nonuniform exponential dichotomies, also established in this paper. In particular, unlike in other works, we do not need any fixed point problem or the notion of admissibility. Moreover, in contrast with several former works on the robustness of (uniform) exponential dichotomies, we do not assume that the sequence $(A_m)_{m\in\mathbb{Z}}$ is bounded (and we only need it to be tempered).
To the best of our knowledge, as far as the robustness property is concerned, all former related results in the literature (with the exception of our work [4]) consider only the case of uniform exponential dichotomies, with several different methods. We note that the study of robustness has a long history. In particular, it was discussed by Massera and Schäffer [18] (building on earlier work of Perron [23]; see also [19]), by Coppel [9], and in the case of Banach spaces by Daleckiĭ and Kreĭn [11], with different approaches and successive generalizations. For more recent work we refer to [8,21,24,25] and the references therein (since we are dealing with nonuniform exponential dichotomies we refrain from being more detailed).

2. Nonuniform Exponential Contractions

We formulate in this section our results in the particular case of nonuniform exponential contractions. These follow immediately from our results for nonuniform exponential dichotomies (see Sects. 3 and 4).

2.1. Basic notions. We say that a sequence $(A_m)_{m\in\mathbb{Z}}$ of $p \times p$ matrices admits a nonuniform exponential contraction if there exist constants $a < 0$, $D > 0$ and $\varepsilon \ge 0$ such that for every $m \ge n$,

\[
\|A(m,n)\| \le D e^{a(m-n)+\varepsilon|n|}.
\]

We also say that $(A_m)_{m\in\mathbb{Z}}$ admits a uniform exponential contraction if it admits a nonuniform exponential contraction with $\varepsilon = 0$. We give an example of a nonuniform exponential contraction.


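Example 1. Given $\omega < 0$ and $\varepsilon \ge 0$, we consider the constants

\[
A_m = e^{\omega+\varepsilon[(-1)^m m - 1/2]}, \quad m \in \mathbb{Z}. \tag{6}
\]

Clearly, for every $m \ge n$ we have

\[
A(m,n) = e^{(\omega-\varepsilon/2)(m-n)+\varepsilon\sum_{k=n}^{m-1}(-1)^k k}.
\]

We note that

\[
\sum_{k=1}^{l} (-1)^k k = (-1)^l \lfloor (l+1)/2 \rfloor \tag{7}
\]

for each $l \in \mathbb{N}$, where $\lfloor \cdot \rfloor$ denotes the integer part. Indeed, if $l$ is even then

\[
\sum_{k=1}^{l} (-1)^k k = -\sum_{j=1}^{l/2} (2j-1) + \sum_{j=1}^{l/2} 2j = \frac{l}{2} = (-1)^l \lfloor (l+1)/2 \rfloor,
\]

and if $l$ is odd then

\[
\sum_{k=1}^{l} (-1)^k k = \sum_{k=1}^{l-1} (-1)^k k - l = \frac{l-1}{2} - l = (-1)^l \lfloor (l+1)/2 \rfloor.
\]

In addition to (7), for each $l \in \mathbb{Z}^-$ we have

\[
\sum_{k=l}^{-1} (-1)^k k = \sum_{j=1}^{|l|} (-1)^{-j}(-j) = -\sum_{j=1}^{|l|} (-1)^j j = (-1)^{|l|+1} \lfloor (|l|+1)/2 \rfloor.
\]

We claim that for every $m, n \in \mathbb{Z}$ with $m \ge n$ we have

\[
\sum_{k=n}^{m-1} (-1)^k k \le \frac{|m|+|n|+2}{2}.
\]

This follows from (7) when $m, n \in \mathbb{N}$. If $m \in \mathbb{N}$ and $n \in \mathbb{Z}^-$, then

\[
\begin{aligned}
\sum_{k=n}^{m-1} (-1)^k k &= \sum_{k=n}^{-1} (-1)^k k + \sum_{k=1}^{m-1} (-1)^k k \\
&= (-1)^{|n|+1} \lfloor (|n|+1)/2 \rfloor + (-1)^{m-1} \lfloor m/2 \rfloor \le (m+|n|+1)/2.
\end{aligned}
\]

Finally, if $m, n \in \mathbb{Z}^-$ with $m \ge n$, then

\[
\begin{aligned}
\sum_{k=n}^{m-1} (-1)^k k &= \sum_{k=n}^{-1} (-1)^k k - \sum_{k=m}^{-1} (-1)^k k \\
&= (-1)^{|n|+1} \lfloor (|n|+1)/2 \rfloor + (-1)^{|m|} \lfloor (|m|+1)/2 \rfloor \le (|m|+|n|+2)/2.
\end{aligned}
\tag{8}
\]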


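By (8), for every $m \ge n$ we have

\[
\begin{aligned}
A(m,n) &= e^{(\omega-\varepsilon/2)(m-n)+\varepsilon\sum_{k=n}^{m-1}(-1)^k k} \\
&\le e^{(\omega-\varepsilon/2)(m-n)+\varepsilon(|m|+|n|+2)/2} \\
&\le e^{(\omega-\varepsilon/2)(m-n)+\varepsilon|m-n|/2+\varepsilon|n|+\varepsilon} \\
&= e^{\varepsilon} e^{\omega(m-n)+\varepsilon|n|}.
\end{aligned}
\tag{9}
\]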
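The identity (7), the bound (8) and the estimate (9) lend themselves to a quick numerical sanity check. The following is a minimal sketch, not part of the original argument: the function names and the parameter values ($\omega = -1$, $\varepsilon = 0.3$, a window of radius 40) are illustrative assumptions, and the check works with $\log A(m,n)$ to avoid overflow.

```python
def sign(k: int) -> int:
    """Return (-1)**k for any integer k (k % 2 is non-negative in Python)."""
    return 1 if k % 2 == 0 else -1


def alt_sum(n: int, m: int) -> int:
    """Sum of (-1)**k * k over k = n, ..., m - 1 (zero when m <= n)."""
    return sum(sign(k) * k for k in range(n, m))


def log_cocycle(m: int, n: int, omega: float, eps: float) -> float:
    """log A(m, n) for the scalar sequence (6), assuming m >= n."""
    return (omega - eps / 2) * (m - n) + eps * alt_sum(n, m)


def check_example_1(omega: float, eps: float, radius: int) -> bool:
    """Check (7), (8) and (9) on the window -radius <= n <= m <= radius."""
    tol = 1e-9  # (9) can hold with equality, so allow rounding slack
    for l in range(1, radius + 1):
        if alt_sum(1, l + 1) != sign(l) * ((l + 1) // 2):  # identity (7)
            return False
    for n in range(-radius, radius + 1):
        for m in range(n, radius + 1):
            if 2 * alt_sum(n, m) > abs(m) + abs(n) + 2:  # bound (8)
                return False
            # log of the right-hand side of (9): e^eps * e^{omega(m-n)+eps|n|}
            bound = eps + omega * (m - n) + eps * abs(n)
            if log_cocycle(m, n, omega, eps) > bound + tol:
                return False
    return True


if __name__ == "__main__":
    print(check_example_1(omega=-1.0, eps=0.3, radius=40))
    # One-step growth log A(n+1, n) at even n equals omega + eps*(n - 1/2),
    # which is unbounded in n: no uniform bound D*e^{a} can hold when eps > 0.
    print([round(log_cocycle(n + 1, n, -1.0, 0.3), 2) for n in (0, 10, 100)])
```

The second printed list illustrates why the contraction cannot be uniform for $\varepsilon > 0$: the one-step factor $A(n+1,n)$ grows without bound along even $n$.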

This shows that $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential contraction with $a = \omega$ and $D = e^{\varepsilon}$. Moreover, $(A_m)_{m\in\mathbb{Z}}$ does not admit a uniform exponential contraction when $\varepsilon > 0$.

2.2. Formulation of the results. The following statements provide a characterization of nonuniform exponential contractions in terms of what can be called Lyapunov sequences. Given a sequence $(S_m)_{m\in\mathbb{Z}}$ of $p \times p$ matrices we consider the functions $H_m$ in $\mathbb{R}^p$ defined by

\[
H_m(x) = \langle S_m x, x\rangle. \tag{10}
\]

We denote by $A^*$ the transpose of the matrix $A$, and given two $p \times p$ matrices $A$ and $B$, we write $A \le B$ if $\langle Ax, x\rangle \le \langle Bx, x\rangle$ for every $x \in \mathbb{R}^p$.

Theorem 2. If $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential contraction, then there exist a sequence $(S_m)_{m\in\mathbb{Z}}$ of symmetric positive-definite $p \times p$ matrices, and constants $d, \beta \ge 0$ and $\eta \in (0,1)$ such that for every $m \in \mathbb{Z}$:

1. $\|S_m\| \le d e^{\beta|m|}$ and $A_m^* S_{m+1} A_m - S_m \le -\mathrm{Id}/2$;
2. $H_{m+1}(A_m x) - H_m(x) \le -\eta |H_m(x)|$, $x \in \mathbb{R}^p$.

Theorem 3. Assume that there exist constants $c, \delta \ge 0$ such that

\[
\|A_m\| \le c e^{\delta|m|}, \quad m \in \mathbb{Z}. \tag{11}
\]

If there exist a sequence $(S_m)_{m\in\mathbb{Z}}$ of symmetric positive-definite $p \times p$ matrices, and constants $d, \beta \ge 0$ and $\eta \in (0,1)$ satisfying $(1+\eta)/(1-\eta) > e^{\beta}$ and Conditions 1 and 2 in Theorem 2, then $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential contraction.

The proofs of Theorems 2 and 3 are not given since the statements are particular cases respectively of Theorems 5 and 6 below. It follows from the proof of Theorem 5 (see (16)) that in Theorem 2 we can consider the $p \times p$ matrices

\[
S_m = \sum_{k \ge m} A(k,m)^* A(k,m) e^{-2(a+\varrho)(k-m)}, \tag{12}
\]

for any fixed positive constant $\varrho < -a$.
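For the scalar sequence (6) the series in (12) can be summed numerically, which gives a concrete feel for Conditions 1 and 2 of Theorem 2. The sketch below is only an illustration under stated assumptions: the fixed positive constant in the exponent of (12) is called `rho` in the code, the series is truncated after a hypothetical 200 terms, the values $\omega = -1$, $\varepsilon = 0.3$, `rho = 0.3` are arbitrary, and the constants $\beta = 2\varepsilon$, $\eta = 1 - e^{2(a+\mathrm{rho})}$ with $a = \omega$ and $D = e^{\varepsilon}$ are the ones suggested by Example 1 and a geometric-series estimate.

```python
import math


def sign(k: int) -> int:
    """Return (-1)**k for any integer k."""
    return 1 if k % 2 == 0 else -1


def a_seq(m: int, omega: float, eps: float) -> float:
    """The scalar A_m from (6)."""
    return math.exp(omega + eps * (sign(m) * m - 0.5))


def s_truncated(m: int, omega: float, eps: float, rho: float, terms: int = 200) -> float:
    """Partial sum of (12) with a = omega, over k = m, ..., m + terms - 1."""
    total, log_a = 0.0, 0.0  # log_a tracks log A(k, m), starting from A(m, m) = 1
    for k in range(m, m + terms):
        total += math.exp(2 * log_a - 2 * (omega + rho) * (k - m))
        log_a += omega + eps * (sign(k) * k - 0.5)  # multiply by A_k
    return total


def check_theorem_2(omega: float, eps: float, rho: float, radius: int = 8) -> bool:
    """Check Conditions 1 and 2 of Theorem 2 for |m| <= radius (scalar case)."""
    eta = 1 - math.exp(2 * (omega + rho))
    # Assumed constant d = D^2 / (1 - e^{-2 rho}) with D = e^eps, and beta = 2 eps
    d = math.exp(2 * eps) / (1 - math.exp(-2 * rho))
    for m in range(-radius, radius + 1):
        s_m = s_truncated(m, omega, eps, rho)
        s_next = s_truncated(m + 1, omega, eps, rho)
        drop = a_seq(m, omega, eps) ** 2 * s_next - s_m  # A_m^* S_{m+1} A_m - S_m
        if not 0 < s_m <= d * math.exp(2 * eps * abs(m)):
            return False
        if drop > -0.5:           # second part of Condition 1
            return False
        if drop > -eta * s_m:     # Condition 2 with H_m(x) = S_m x^2
            return False
    return True


if __name__ == "__main__":
    print(check_theorem_2(omega=-1.0, eps=0.3, rho=0.3))
```

In the scalar case $H_m(x) = S_m x^2$, so both conditions reduce to inequalities between the numbers $A_m^2 S_{m+1}$ and $S_m$, which is what `drop` measures.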


Example 2. Consider the sequence $(A_m)_{m\in\mathbb{Z}}$ in (6). By Example 1 it admits a nonuniform exponential contraction. In this case the sequence $S_m$ in (12) is given by

\[
S_m = \sum_{k \ge m} e^{-(\varepsilon+2\varrho)(k-m)+2\varepsilon\sum_{j=m}^{k-1}(-1)^j j},
\]

for any fixed positive constant $\varrho < -\omega$.

We also formulate a robustness result for nonuniform exponential contractions. It is a simple consequence of Theorem 8 below.

Theorem 4. Assume that there exist $c, \delta \ge 0$ satisfying (11), and that $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential contraction with $\varepsilon < -a$. If

\[
\|B_m\| \le \kappa e^{-(2\varepsilon+\delta)|m+1|}, \quad m \in \mathbb{Z}
\]

for some sufficiently small $\kappa, \varepsilon > 0$, then $(A_m + B_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential contraction.

3. Characterization of Nonuniform Exponential Dichotomies

We consider in this section the case of nonuniform exponential dichotomies. We first give an example of exponential dichotomy.

Example 3. Given $\omega < 0$ and $\varepsilon \ge 0$, we consider the matrices

\[
A_m = \begin{pmatrix} e^{\omega+\varepsilon[(-1)^m m - 1/2]} & 0 \\ 0 & e^{-\omega+\varepsilon[(-1)^{m+1} m - 1/2]} \end{pmatrix}, \quad m \in \mathbb{Z}. \tag{13}
\]

We also consider the projections $P_m$ and $Q_m$ given by $P_m(x,y) = (x,0)$ and $Q_m(x,y) = (0,y)$. Clearly, for every $m \ge n$ we have

\[
A(m,n) = \begin{pmatrix} e^{(\omega-\varepsilon/2)(m-n)+\varepsilon\sum_{k=n}^{m-1}(-1)^k k} & 0 \\ 0 & e^{-(\omega+\varepsilon/2)(m-n)+\varepsilon\sum_{k=n}^{m-1}(-1)^{k+1} k} \end{pmatrix},
\]

and (2) holds. By (9), for every $m \ge n$ we have

\[
\|A(m,n) P_n\| \le e^{\varepsilon} e^{\omega(m-n)+\varepsilon|n|}.
\]

Moreover, by (8), for every $m \ge n$ we have

\[
\begin{aligned}
\|A(m,n)^{-1} Q_m\| &= e^{(\omega+\varepsilon/2)(m-n)+\varepsilon\sum_{k=n}^{m-1}(-1)^k k} \\
&\le e^{(\omega+\varepsilon/2)(m-n)+\varepsilon(|m|+|n|+2)/2} \\
&\le e^{(\omega+\varepsilon/2)(m-n)+\varepsilon|m|+\varepsilon|n-m|/2+\varepsilon} \\
&= e^{\varepsilon} e^{(\omega+\varepsilon)(m-n)+\varepsilon|m|}.
\end{aligned}
\]

Therefore, $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy with $a = \omega$, $b = -\omega-\varepsilon$, and $D = e^{\varepsilon}$ provided that $\omega + \varepsilon < 0$.

Now we obtain a characterization of nonuniform exponential dichotomies.


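Theorem 5. If $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy, then there exist a sequence $(S_m)_{m\in\mathbb{Z}}$ of invertible symmetric $p \times p$ matrices, and constants $d, \beta \ge 0$ and $\eta \in (0,1)$ such that for every $m \in \mathbb{Z}$:

1.
\[
\|S_m\| \le d e^{\beta|m|} \quad\text{and}\quad A_m^* S_{m+1} A_m - S_m \le -\mathrm{Id}/2; \tag{14}
\]

2.
\[
H_{m+1}(A_m x) - H_m(x) \le -\eta |H_m(x)|, \quad x \in \mathbb{R}^p. \tag{15}
\]

Proof. For each $m \in \mathbb{Z}$, using the projections $P_m$ and $Q_m$ we define the subspaces $F_m^s = P_m(\mathbb{R}^p)$ and $F_m^u = Q_m(\mathbb{R}^p)$. We also consider the $p \times p$ matrices

\[
S_m = \sum_{k \ge m} (A(k,m) P_m)^* A(k,m) P_m e^{-2(a+\varrho)(k-m)} - \sum_{k < m} (A(k,m) Q_m)^* A(k,m) Q_m e^{2(b-\varrho)(m-k)}, \tag{16}
\]

for any fixed positive constant $\varrho < \min\{-a, b\}$. It follows from (3) and (4) that each series is well-defined. Clearly, $S_m$ is symmetric for each $m$. We consider the functions $H_m$ in (10). Since $H_m(x) > 0$ for $x \in F_m^s \setminus \{0\}$, and $H_m(x) < 0$ for $x \in F_m^u \setminus \{0\}$, it follows from the identity $F_m^s \oplus F_m^u = \mathbb{R}^p$ that $S_m$ is invertible for each $m$. Now we observe that

\[
\begin{aligned}
|H_m(x)| &\le \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} + \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)} \\
&\le D^2 e^{2\varepsilon|m|} \|x\|^2 \left( \sum_{k \ge m} e^{-2\varrho(k-m)} + \sum_{k < m} e^{-2\varrho(m-k)} \right) \\
&\le D^2 \frac{1+e^{-2\varrho}}{1-e^{-2\varrho}} e^{2\varepsilon|m|} \|x\|^2.
\end{aligned}
\tag{17}
\]

Since $S_m$ is symmetric we obtain

\[
\|S_m\| = \sup_{x \ne 0} \frac{|H_m(x)|}{\|x\|^2} \le D^2 \frac{1+e^{-2\varrho}}{1-e^{-2\varrho}} e^{2\varepsilon|m|}. \tag{18}
\]

This establishes the first inequality in Condition 1 (with $\beta = 2\varepsilon$).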


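On the other hand,

\[
\begin{aligned}
A_m^* S_{m+1} A_m &= \sum_{k \ge m+1} A_m^* (A(k,m+1) P_{m+1})^* A(k,m+1) P_{m+1} e^{-2(a+\varrho)(k-m-1)} A_m \\
&\quad - \sum_{k < m+1} A_m^* (A(k,m+1) Q_{m+1})^* A(k,m+1) Q_{m+1} e^{2(b-\varrho)(m+1-k)} A_m \\
&= e^{2(a+\varrho)} \sum_{k \ge m+1} (A(k,m) P_m)^* A(k,m) P_m e^{-2(a+\varrho)(k-m)} \\
&\quad - e^{2(b-\varrho)} \sum_{k < m+1} (A(k,m) Q_m)^* A(k,m) Q_m e^{2(b-\varrho)(m-k)}.
\end{aligned}
\]

Since

\[
\begin{aligned}
S_m &= P_m^* P_m + \sum_{k \ge m+1} (A(k,m) P_m)^* A(k,m) P_m e^{-2(a+\varrho)(k-m)} \\
&\quad + Q_m^* Q_m - \sum_{k < m+1} (A(k,m) Q_m)^* A(k,m) Q_m e^{2(b-\varrho)(m-k)},
\end{aligned}
\]

we obtain

\[
\begin{aligned}
A_m^* S_{m+1} A_m - S_m &= -P_m^* P_m - Q_m^* Q_m \\
&\quad - \left(1 - e^{2(a+\varrho)}\right) \sum_{k \ge m+1} (A(k,m) P_m)^* A(k,m) P_m e^{-2(a+\varrho)(k-m)} \\
&\quad + \left(1 - e^{2(b-\varrho)}\right) \sum_{k < m+1} (A(k,m) Q_m)^* A(k,m) Q_m e^{2(b-\varrho)(m-k)} \\
&\le -P_m^* P_m - Q_m^* Q_m,
\end{aligned}
\tag{19}
\]

using the fact that $a + \varrho < 0$ and $b - \varrho > 0$. Moreover, since

\[
\begin{aligned}
2\langle (P_m^* P_m + Q_m^* Q_m) x, x\rangle &= 2\|P_m x\|^2 + 2\|Q_m x\|^2 \\
&\ge \|P_m x\|^2 + \|Q_m x\|^2 + 2\|P_m x\| \cdot \|Q_m x\| \\
&= (\|P_m x\| + \|Q_m x\|)^2 \ge \|(P_m + Q_m) x\|^2 = \|x\|^2,
\end{aligned}
\]

we have $P_m^* P_m + Q_m^* Q_m \ge \mathrm{Id}/2$. It follows from (19) that

\[
A_m^* S_{m+1} A_m - S_m \le -P_m^* P_m - Q_m^* Q_m \le -\mathrm{Id}/2,
\]

thus establishing (14).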
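The operator inequality $P_m^* P_m + Q_m^* Q_m \ge \mathrm{Id}/2$ used in the last step holds for any pair of complementary projections, including oblique ones with large norm. As a hedged illustration (a sketch in $\mathbb{R}^2$ only, with hypothetical helper names and randomly sampled angles), one can compute the smallest eigenvalue of $P^* P + Q^* Q$ for the projection onto one line along another:

```python
import math
import random


def oblique_projection(theta_range: float, theta_kernel: float) -> list:
    """2x2 matrix of the projection onto span(u) along span(v)."""
    u = (math.cos(theta_range), math.sin(theta_range))
    v = (math.cos(theta_kernel), math.sin(theta_kernel))
    w = (-v[1], v[0])                      # normal to the kernel direction v
    scale = w[0] * u[0] + w[1] * u[1]      # sin(theta_range - theta_kernel), nonzero
    return [[u[i] * w[j] / scale for j in range(2)] for i in range(2)]


def gram(p: list) -> list:
    """Return P^T P for a 2x2 matrix P."""
    return [[sum(p[k][i] * p[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def min_eigenvalue_sym(mat: list) -> float:
    """Smallest eigenvalue of a symmetric 2x2 matrix (stable closed form)."""
    mean = (mat[0][0] + mat[1][1]) / 2
    off = (mat[0][1] + mat[1][0]) / 2
    return mean - math.hypot((mat[0][0] - mat[1][1]) / 2, off)


def smallest_eigenvalue(theta_range: float, theta_kernel: float) -> float:
    """Smallest eigenvalue of P^T P + Q^T Q with Q = Id - P."""
    p = oblique_projection(theta_range, theta_kernel)
    q = [[(1.0 if i == j else 0.0) - p[i][j] for j in range(2)] for i in range(2)]
    gp, gq = gram(p), gram(q)
    total = [[gp[i][j] + gq[i][j] for j in range(2)] for i in range(2)]
    return min_eigenvalue_sym(total)


def min_over_samples(samples: int, seed: int) -> float:
    """Minimum of the smallest eigenvalue over random pairs of distinct lines."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        theta_range = rng.uniform(0.0, math.pi)
        gap = rng.uniform(0.05, math.pi - 0.05)  # keep the two lines apart
        worst = min(worst, smallest_eigenvalue(theta_range, theta_range + gap))
    return worst


if __name__ == "__main__":
    print(min_over_samples(samples=1000, seed=0) >= 0.5)
```

For an orthogonal projection the sum is the identity, so the smallest eigenvalue is 1; as the two lines approach each other the value decreases towards, but stays above, 1/2.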


Finally, we observe that

\[
\begin{aligned}
H_{m+1}(A_m x) - H_m(x) &= \sum_{k \ge m+1} \|A(k,m+1) A_m P_m x\|^2 e^{-2(a+\varrho)(k-m-1)} - \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} \\
&\quad - \sum_{k < m+1} \|A(k,m+1) A_m Q_m x\|^2 e^{2(b-\varrho)(m+1-k)} + \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)} \\
&= -e^{2(a+\varrho)} \|P_m x\|^2 - \left(1 - e^{2(a+\varrho)}\right) \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} \\
&\quad - e^{2(b-\varrho)} \|Q_m x\|^2 - \left(e^{2(b-\varrho)} - 1\right) \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)}.
\end{aligned}
\]

Setting

\[
\eta = \min\left\{ 1 - e^{2(a+\varrho)},\ e^{2(b-\varrho)} - 1 \right\},
\]

we obtain

\[
H_{m+1}(A_m x) - H_m(x) \le -\eta \left( \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} + \sum_{k \le m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)} \right).
\]

Notice that $\eta \in (0,1)$. It follows from the first inequality in (17) that (15) holds. This concludes the proof of the theorem.

Theorem 6. Assume that there exist constants $c, \delta \ge 0$ satisfying (11). If there exist a sequence $(S_m)_{m\in\mathbb{Z}}$ of invertible $p \times p$ matrices, and constants $d, \beta \ge 0$ and $\eta \in (0,1)$ satisfying

\[
(1+\eta)/(1-\eta) > e^{\beta} \tag{20}
\]

and Conditions 1 and 2 in Theorem 5, then $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy.

Proof. We split the proof into several lemmas. We denote by $E_n^s$ the set of points $x$ such that either $x = 0$ or $H_m(A(m,n)x) > 0$ for every $m \in \mathbb{Z}$, and by $E_n^u$ the set of points $x$ such that either $x = 0$ or $H_m(A(m,n)x) < 0$ for every $m \in \mathbb{Z}$. Clearly, $A(m,n) E_n^s = E_m^s$ and $A(m,n) E_n^u = E_m^u$ for every $m, n \in \mathbb{Z}$.


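Lemma 1. There exists a constant $C > 0$ such that

\[
H_n(x) \ge \frac{1}{2} \|x\|^2, \quad x \in E_n^s, \tag{21}
\]

and

\[
|H_n(x)| \ge \frac{1}{C} e^{-2\delta|n|} \|x\|^2, \quad x \in E_n^u. \tag{22}
\]

Proof of the lemma. It follows from (14) that

\[
\begin{aligned}
H_{n+1}(A_n x) - H_n(x) &= \langle S_{n+1} A_n x, A_n x\rangle - \langle S_n x, x\rangle \\
&= \langle A_n^* S_{n+1} A_n x, x\rangle - \langle S_n x, x\rangle \\
&= \langle (A_n^* S_{n+1} A_n - S_n) x, x\rangle \le -\frac{1}{2} \|x\|^2.
\end{aligned}
\]

Let $x \in E_n^s$. We have $H_n(x) \ge 0$ and $H_{n+1}(A_n x) \ge 0$, and hence,

\[
|H_n(x)| \ge H_n(x) - H_{n+1}(A_n x) \ge \frac{1}{2} \|x\|^2.
\]

This establishes inequality (21). Similarly, if $x \in E_n^u$, then $H_n(x) \le 0$ and $H_{n-1}(A_{n-1}^{-1} x) \le 0$. Hence, by Condition 1 we obtain

\[
|H_n(x)| \ge |H_n(x)| - |H_{n-1}(A_{n-1}^{-1} x)| = H_{n-1}(A_{n-1}^{-1} x) - H_n(x) \ge \frac{1}{2} \|A_{n-1}^{-1} x\|^2 \ge \frac{\|x\|^2}{2\|A_{n-1}\|^2}.
\]

It follows from (11) that

\[
|H_n(x)| \ge \frac{1}{2c^2} e^{-2\delta|n|} \|x\|^2.
\]

This establishes inequality (22).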

Lemma 2. If $x \in E_n^s$, then

\[
H_m(A(m,n)x) \le (1-\eta)^{m-n} H_n(x), \quad m \ge n, \tag{23}
\]

and if $x \in E_n^u$, then

\[
|H_m(A(m,n)x)| \ge (1+\eta)^{m-n} |H_n(x)|, \quad m \ge n. \tag{24}
\]

Proof of the lemma. If $x \in E_n^s$, then $H_{n+1}(A_n x) - H_n(x) \le -\eta H_n(x)$, and hence, $H_{n+1}(A_n x) \le (1-\eta) H_n(x)$.


Notice that $A(m,n)x \in E_m^s$ for every $m \in \mathbb{Z}$. Therefore, for $x \in E_n^s \setminus \{0\}$ and $m \ge n$ we have

\[
H_m(A(m,n)x) = \left( \prod_{j=0}^{m-n-1} \frac{H_{m-j}(A(m-j,n)x)}{H_{m-j-1}(A(m-j-1,n)x)} \right) H_n(x) \le (1-\eta)^{m-n} H_n(x).
\]

This establishes (23). Now take $x \in E_n^u$. We have $H_{n+1}(A_n x) - H_n(x) \le -\eta |H_n(x)|$ and hence, $|H_{n+1}(A_n x)| \ge (1+\eta) |H_n(x)|$. Notice that $A(m,n)x \in E_m^u$ for every $m \in \mathbb{Z}$. Therefore, for $x \in E_n^u \setminus \{0\}$ and $m \ge n$ we have

\[
|H_m(A(m,n)x)| = \left( \prod_{j=0}^{m-n-1} \frac{|H_{m-j}(A(m-j,n)x)|}{|H_{m-j-1}(A(m-j-1,n)x)|} \right) |H_n(x)| \ge (1+\eta)^{m-n} |H_n(x)|.
\]

This establishes (24).

Lemma 3. There exists a constant $K > 0$ such that for every $m, n \in \mathbb{Z}$ with $m \ge n$ we have

\[
\|A(m,n)|E_n^s\|^2 \le K e^{\beta|n|} (1-\eta)^{m-n},
\]

and

\[
\|A(m,n)^{-1}|E_m^u\|^2 \le K e^{(2\delta+\beta)|n|} \left(e^{-\beta}(1+\eta)\right)^{n-m}.
\]

Proof of the lemma. By Lemmas 1 and 2, for each $x \in E_n^s$ and $m \ge n$ we have

\[
\|A(m,n)x\|^2 \le 2 H_m(A(m,n)x) \le 2 (1-\eta)^{m-n} H_n(x).
\]

By Condition 1, we conclude that

\[
\|A(m,n)x\|^2 \le 2 d e^{\beta|n|} (1-\eta)^{m-n} \|x\|^2.
\]

Similarly, for each $x \in E_n^u$ and $m \ge n$ we have

\[
\|A(m,n)x\|^2 \ge \frac{1}{d} e^{-\beta|m|} |H_m(A(m,n)x)| \ge \frac{1}{d} e^{-\beta|m|} (1+\eta)^{m-n} |H_n(x)|.
\]

By Lemma 1 we obtain

\[
\|A(m,n)x\|^2 \ge \frac{1}{Cd} e^{-\beta|m|-2\delta|n|} (1+\eta)^{m-n} \|x\|^2 \ge \frac{1}{Cd} e^{-(\beta+2\delta)|n|} \left(e^{-\beta}(1+\eta)\right)^{m-n} \|x\|^2.
\]

This completes the proof of the lemma.


Lemma 4. For each $m \in \mathbb{Z}$ the sets $E_m^s$ and $E_m^u$ are subspaces, and $E_m^s \oplus E_m^u = \mathbb{R}^p$.

Proof of the lemma. Since the matrices $S_m$ are invertible they have constant index, and there exist integers $r_u, r_s \in \mathbb{N}$ with $r_u + r_s = p$ such that the sets

\[
C_{n,m}^u = A(n,m) \left\{ x \in \mathbb{R}^p : H_m(x) \le 0 \right\}
\quad\text{and}\quad
C_{n,m}^s = A(n,m) \left\{ x \in \mathbb{R}^p : H_m(x) \ge 0 \right\}
\]

contain subspaces respectively of dimensions $r_u$ and $r_s$. Moreover, it follows from Condition 2 that $H_{m+1}(A_m x) \le H_m(x)$, and hence

\[
\cdots \supset C_{n,1}^u \supset C_{n,0}^u \supset C_{n,-1}^u \supset \cdots
\]

and

\[
\cdots \supset C_{n,-1}^s \supset C_{n,0}^s \supset C_{n,1}^s \supset \cdots.
\]

By the compactness of the closed unit ball in $\mathbb{R}^p$, the intersections

\[
F_n^u = \bigcap_{m \in \mathbb{Z}} C_{n,m}^u \quad\text{and}\quad F_n^s = \bigcap_{m \in \mathbb{Z}} C_{n,m}^s
\]

also contain subspaces respectively of dimensions $r_u$ and $r_s$. By the definitions we have $E_m^u \subset F_m^u$ and $E_m^s \subset F_m^s$. On the other hand, it follows readily from Condition 2 that $F_m^u \subset E_m^u$ and $F_m^s \subset E_m^s$. Therefore, $E_m^u = F_m^u$ and $E_m^s = F_m^s$ for each $m \in \mathbb{Z}$.

For each $m \in \mathbb{Z}$, let $D_m^u \subset E_m^u$ be any $r_u$-dimensional subspace, and let $D_m^s \subset E_m^s$ be any $r_s$-dimensional subspace. By construction, we have $D_m^u \oplus D_m^s = \mathbb{R}^p$. If there exists $x \in E_m^s \setminus D_m^s$, then one can write $x = y + z$ with $y \in D_m^s$ and $z \in D_m^u \setminus \{0\}$. On the other hand, it follows readily from Lemma 3 together with inequality (20) that we must have $z = 0$. But this is impossible, which shows that $E_m^s \setminus D_m^s = \emptyset$, and thus $E_m^s = D_m^s$. Similarly, we show that $E_m^u = D_m^u$.

We denote by $P_m : \mathbb{R}^p \to E_m^s$ and $Q_m : \mathbb{R}^p \to E_m^u$ the projections associated to the decomposition $E_m^s \oplus E_m^u = \mathbb{R}^p$.

Lemma 5. We have

\[
\|P_m\| = \|Q_m\| \le 2 C e^{2\delta|m|} \|S_m\|, \quad m \in \mathbb{Z}.
\]


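Proof of the lemma. Given $x \in \mathbb{R}^p$ we write it as $x = y + z$ with $y \in E_m^s$ and $z \in E_m^u$. Clearly, $H_m(y) \ge 0$ and $H_m(z) \le 0$. Take $a_m > 0$. We define

\[
H_m^s(y) = -H_m(y) + a_m \|y\|^2 = -\langle S_m P_m x, P_m x\rangle + a_m \|y\|^2.
\]

By Lemma 1, assuming without loss of generality that $C \ge 2$, we have

\[
H_m^s(y) \le -\frac{1}{C} e^{-2\delta|m|} \|y\|^2 + a_m \|y\|^2 = \left( a_m - \frac{1}{C} e^{-2\delta|m|} \right) \|y\|^2.
\]

Similarly, for each $m \in \mathbb{Z}$ we define

\[
H_m^u(z) = -H_m(z) - a_m \|z\|^2 = -\langle S_m Q_m x, Q_m x\rangle - a_m \|z\|^2.
\]

Again by Lemma 1 we have

\[
H_m^u(z) \ge \left( \frac{1}{C} e^{-2\delta|m|} - a_m \right) \|z\|^2.
\]

We conclude that if $a_m \le e^{-2\delta|m|}/C$, then

\[
-H_m(y) + a_m \|y\|^2 \le 0 \quad\text{and}\quad -H_m(z) - a_m \|z\|^2 \ge 0,
\]

that is, $-\langle S_m P_m x, P_m x\rangle + a_m \|P_m x\|^2 \le 0$, and $-\langle S_m Q_m x, Q_m x\rangle - a_m \|Q_m x\|^2 \ge 0$. Since $S_m$ is symmetric, subtracting the two inequalities we obtain

\[
\begin{aligned}
0 &\ge a_m \|P_m x\|^2 + a_m \|Q_m x\|^2 - \langle S_m P_m x, P_m x\rangle + \langle S_m Q_m x, Q_m x\rangle \\
&= a_m \|P_m x\|^2 + a_m \|Q_m x\|^2 + \langle S_m x, x\rangle - 2 \langle S_m P_m x, x\rangle.
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
a_m \left\| P_m x - \frac{1}{2a_m} S_m x \right\|^2 + a_m \left\| Q_m x + \frac{1}{2a_m} S_m x \right\|^2
&= a_m \|P_m x\|^2 + a_m \|Q_m x\|^2 + \frac{\|S_m x\|^2}{2a_m} + \langle S_m x, x\rangle - 2 \langle S_m P_m x, x\rangle \\
&\le \frac{\|S_m x\|^2}{2a_m},
\end{aligned}
\]

and

\[
\left\| P_m x - \frac{1}{2a_m} S_m x \right\|^2 + \left\| Q_m x + \frac{1}{2a_m} S_m x \right\|^2 \le \frac{\|S_m x\|^2}{2a_m^2}.
\]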


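This implies that

\[
\begin{aligned}
\|P_m x\| &= \left\| P_m x - \frac{1}{2a_m} S_m x + \frac{1}{2a_m} S_m x \right\| \\
&\le \left\| P_m x - \frac{1}{2a_m} S_m x \right\| + \frac{1}{2a_m} \|S_m x\| \\
&\le \frac{1}{\sqrt{2}\, a_m} \|S_m x\| + \frac{1}{2a_m} \|S_m x\| \le \frac{\sqrt{2}}{a_m} \|S_m x\|,
\end{aligned}
\]

and similarly

\[
\begin{aligned}
\|Q_m x\| &= \left\| Q_m x + \frac{1}{2a_m} S_m x - \frac{1}{2a_m} S_m x \right\| \\
&\le \left\| Q_m x + \frac{1}{2a_m} S_m x \right\| + \frac{1}{2a_m} \|S_m x\| \\
&\le \frac{1}{\sqrt{2}\, a_m} \|S_m x\| + \frac{1}{2a_m} \|S_m x\| \le \frac{\sqrt{2}}{a_m} \|S_m x\|.
\end{aligned}
\]

Taking $a_m = e^{-2\delta|m|}/C$ we obtain the desired statement.

Now we assume that Property 2 holds. We have

\[
\|A(m,n) P_n\| \le \|A(m,n)|E_n^s\| \cdot \|P_n\|,
\]

and

\[
\|A(m,n)^{-1} Q_m\| \le \|A(m,n)^{-1}|E_m^u\| \cdot \|Q_m\|.
\]

In view of Lemmas 3 and 5, we conclude that $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy.

Example 4. Consider the sequence of matrices $(A_m)_{m\in\mathbb{Z}}$ in (13). By Example 3 it admits a nonuniform exponential dichotomy. In this case the sequence $S_m$ in (16) is given by

\[
S_m = \begin{pmatrix} a_m & 0 \\ 0 & -b_m \end{pmatrix},
\]

where

\[
a_m = \sum_{k \ge m} e^{-(\varepsilon+2\varrho)(k-m)+2\varepsilon\sum_{j=m}^{k-1}(-1)^j j}
\]

and

\[
b_m = \sum_{k < m} e^{-(\varepsilon+2\varrho)(m-k)-2\varepsilon\sum_{j=k}^{m-1}(-1)^{j+1} j},
\]

for any fixed positive constant $\varrho < -\omega-\varepsilon$.

Now we consider the particular case of uniform exponential dichotomies. The following is a simple consequence of the proofs of Theorems 5 and 6.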


Theorem 7. Assume that the sequence $(A_m)_{m\in\mathbb{Z}}$ is bounded. Then the following properties are equivalent:

1. $(A_m)_{m\in\mathbb{Z}}$ admits a uniform exponential dichotomy;
2. there exist a bounded sequence $(S_m)_{m\in\mathbb{Z}}$ of invertible symmetric $p \times p$ matrices and $\eta \in (0,1)$ such that for every $m \in \mathbb{Z}$:
   (a) $A_m^* S_{m+1} A_m - S_m \le -\mathrm{Id}/2$;
   (b) $H_{m+1}(A_m x) - H_m(x) \le -\eta |H_m(x)|$, $x \in \mathbb{R}^p$.

4. Robustness of Nonuniform Exponential Dichotomies

The following statement is our robustness result for nonuniform exponential dichotomies.

Theorem 8. Assume that there exist $c, \delta \ge 0$ satisfying (11), and that $(A_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy with $\varepsilon < \min\{-a, b\}$. If

\[
\|B_m\| \le \kappa e^{-(2\varepsilon+\delta)|m+1|}, \quad m \in \mathbb{Z}
\]

for some sufficiently small $\kappa, \varepsilon > 0$, then $(A_m + B_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy.

Proof. In view of Theorem 6 it is sufficient to show that there exists a sequence $(S_m)_{m\in\mathbb{Z}}$ satisfying Conditions 1–2 for the sequence of matrices $(A_m + B_m)_{m\in\mathbb{Z}}$. We consider the sequence $(S_m)_{m\in\mathbb{Z}}$ in (16). Proceeding as in the proof of Theorem 5 we show that Condition 1 holds. Now we prove that for every $m \in \mathbb{Z}$ and $x \in \mathbb{R}^p$ we have

\[
(A_m + B_m)^* S_{m+1} (A_m + B_m) - S_m \le -\mathrm{Id}/4, \tag{25}
\]

and

\[
H_{m+1}((A_m + B_m)x) - H_m(x) \le -\eta |H_m(x)|, \tag{26}
\]

for some constant $\eta \in (0,1)$. By Theorem 5 we have

\[
\begin{aligned}
(A_m + B_m)^* S_{m+1} (A_m + B_m) - S_m &= A_m^* S_{m+1} A_m + A_m^* S_{m+1} B_m + B_m^* S_{m+1} A_m + B_m^* S_{m+1} B_m - S_m \\
&\le -\mathrm{Id}/2 + A_m^* S_{m+1} B_m + B_m^* S_{m+1} A_m + B_m^* S_{m+1} B_m.
\end{aligned}
\]

Using (18) we obtain

\[
\begin{aligned}
A_m^* S_{m+1} B_m + B_m^* S_{m+1} A_m + B_m^* S_{m+1} B_m
&\le \left( 2 \|S_{m+1}\| \cdot \|A_m\| \cdot \|B_m\| + \|S_{m+1}\| \cdot \|B_m\|^2 \right) \mathrm{Id} \\
&\le \kappa \frac{D^2 (1+e^{-2\varrho})}{1-e^{-2\varrho}} e^{2\varepsilon|m+1|} e^{-(2\varepsilon+\delta)|m+1|} \left( 2 c e^{\delta|m|} + \kappa e^{-2\varepsilon|m+1|} \right) \mathrm{Id} \\
&\le \kappa \frac{D^2 (1+e^{-2\varrho})}{1-e^{-2\varrho}} (2c + \kappa)\, \mathrm{Id}.
\end{aligned}
\]

This yields (25) provided that $\kappa$ is sufficiently small.


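Finally, to obtain (26) we note that

\[
\begin{aligned}
H_{m+1}((A_m + B_m)x) - H_m(x)
&= \sum_{k \ge m+1} \|P_k A(k,m+1) A_m x + P_k A(k,m+1) B_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad - \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} \\
&\quad - \sum_{k < m+1} \|Q_k A(k,m+1) A_m x + Q_k A(k,m+1) B_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\quad + \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)},
\end{aligned}
\]

and hence,

\[
\begin{aligned}
H_{m+1}((A_m + B_m)x) - H_m(x)
&\le \sum_{k \ge m+1} \|A(k,m+1) A_m P_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad + \sum_{k \ge m+1} \|A(k,m+1) P_{m+1} B_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad - \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} \\
&\quad - \sum_{k < m+1} \|A(k,m+1) A_m Q_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\quad + \sum_{k < m+1} \|A(k,m+1) Q_{m+1} B_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\quad + \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)} \\
&= -e^{2(a+\varrho)} \|P_m x\|^2 + \sum_{k \ge m+1} \|A(k,m+1) P_{m+1} B_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad - \left(1 - e^{2(a+\varrho)}\right) \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} \\
&\quad - e^{2(b-\varrho)} \|A_m Q_m x\|^2 + \sum_{k < m+1} \|A(k,m+1) Q_{m+1} B_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\quad - \left(e^{2(b-\varrho)} - 1\right) \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)}.
\end{aligned}
\tag{27}
\]

Now we observe that

\[
\begin{aligned}
&-e^{2(a+\varrho)} \|P_m x\|^2 + \sum_{k \ge m+1} \|A(k,m+1) P_{m+1} B_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad \le -e^{2(a+\varrho)} \|P_m x\|^2 + D\kappa \|x\|^2 \sum_{k \ge m+1} e^{2a(k-m-1)+2\varepsilon|m+1|} e^{-(4\varepsilon+2\delta)|m+1|} e^{-2(a+\varrho)(k-m-1)} \\
&\quad \le -e^{2(a+\varrho)} \|P_m x\|^2 + \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2.
\end{aligned}
\]

Furthermore,

\[
\begin{aligned}
&-e^{2(b-\varrho)} \|A_m Q_m x\|^2 + \sum_{k < m+1} \|A(k,m+1) Q_{m+1} B_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\quad \le -e^{2(b-\varrho)} \|A_m Q_m x\|^2 + D\kappa \|x\|^2 \sum_{k < m+1} e^{-2b(m+1-k)+2\varepsilon|m+1|} e^{-(4\varepsilon+2\delta)|m+1|} e^{2(b-\varrho)(m+1-k)} \\
&\quad \le -e^{2(b-\varrho)} \|A_m Q_m x\|^2 + \frac{D\kappa e^{-2\varepsilon|m+1|}}{1-e^{-2\varrho}} \|x\|^2 \\
&\quad \le -e^{2(b-\varrho)} \frac{1}{\|A(m+1,m)^{-1} Q_{m+1}\|^2} \|Q_m x\|^2 + \frac{D\kappa e^{-2\varepsilon|m+1|}}{1-e^{-2\varrho}} \|x\|^2 \\
&\quad \le -\frac{1}{D^2} e^{4b-2\varrho-2\varepsilon|m+1|} \|Q_m x\|^2 + \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2.
\end{aligned}
\]

Hence,

\[
\begin{aligned}
\alpha &:= -e^{2(a+\varrho)} \|P_m x\|^2 + \sum_{k \ge m+1} \|A(k,m+1) P_{m+1} B_m x\|^2 e^{-2(a+\varrho)(k-m-1)} \\
&\quad\ - e^{2(b-\varrho)} \|A_m Q_m x\|^2 + \sum_{k < m+1} \|A(k,m+1) Q_{m+1} B_m x\|^2 e^{2(b-\varrho)(m+1-k)} \\
&\le -e^{2(a+\varrho)} \|P_m x\|^2 + \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2 \\
&\quad\ - \frac{1}{D^2} e^{4b-2\varrho-2\varepsilon|m+1|} \|Q_m x\|^2 + \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2.
\end{aligned}
\]

Taking into account that

\[
\begin{aligned}
\|P_m x\|^2 + \|Q_m x\|^2 &\ge \frac{1}{2} \left( \|P_m x\|^2 + \|Q_m x\|^2 + 2 \|P_m x\| \cdot \|Q_m x\| \right) \\
&= \frac{1}{2} (\|P_m x\| + \|Q_m x\|)^2 \ge \frac{1}{2} \|(P_m + Q_m) x\|^2 = \frac{1}{2} \|x\|^2,
\end{aligned}
\]

we obtain

\[
\begin{aligned}
\alpha &\le 2 \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2 - \beta e^{-2\varepsilon|m+1|} \left( \|P_m x\|^2 + \|Q_m x\|^2 \right) \\
&\le 2 \frac{D\kappa}{1-e^{-2\varrho}} e^{-2\varepsilon|m+1|} \|x\|^2 - \frac{\beta}{2} e^{-2\varepsilon|m+1|} \|x\|^2,
\end{aligned}
\]

where $\beta$ is some positive constant. We note that $\alpha \le 0$ provided that $\kappa$ is sufficiently small. Setting

\[
\eta = \min\left\{ 1 - e^{2(a+\varrho)},\ e^{2(b-\varrho)} - 1 \right\},
\]

it follows from (27) that

\[
H_{m+1}((A_m + B_m)x) - H_m(x) \le -\eta \left( \sum_{k \ge m} \|A(k,m) P_m x\|^2 e^{-2(a+\varrho)(k-m)} + \sum_{k < m} \|A(k,m) Q_m x\|^2 e^{2(b-\varrho)(m-k)} \right).
\]

Finally, by the first inequality in (17) we obtain (26). Therefore, it follows from Theorem 6 (after replacing the sequence $(S_m)_{m\in\mathbb{Z}}$ by $(2S_m)_{m\in\mathbb{Z}}$) that the sequence of matrices $(A_m + B_m)_{m\in\mathbb{Z}}$ admits a nonuniform exponential dichotomy provided that $\varepsilon$ is sufficiently small.

In the particular case of uniform exponential dichotomies we obtain the following statement.

Theorem 9. Assume that $(A_m)_{m\in\mathbb{Z}}$ and $(B_m)_{m\in\mathbb{Z}}$ are bounded sequences. If $(A_m)_{m\in\mathbb{Z}}$ admits a uniform exponential dichotomy, and $\sup_{m\in\mathbb{Z}} \|B_m\|$ is sufficiently small, then $(A_m + B_m)_{m\in\mathbb{Z}}$ admits a uniform exponential dichotomy.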
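The uniform statement can be illustrated on a toy autonomous example. The sketch below is an assumption-laden illustration, not the authors' construction in general: it takes the constant sequence $A_m = \mathrm{diag}(1/2, 2)$ (so $a = -\log 2$, $b = \log 2$, $D = 1$, $\varepsilon = 0$), evaluates the series (16) in closed form with a hypothetical value 0.3 for the fixed positive constant (called `rho` in the code), and checks inequality (25) for random perturbations whose Frobenius norm is at most an arbitrarily chosen $\kappa = 0.02$.

```python
import math
import random

A = [[0.5, 0.0], [0.0, 2.0]]  # constant hyperbolic sequence A_m = A


def matmul(x: list, y: list) -> list:
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(x: list) -> list:
    return [[x[j][i] for j in range(2)] for i in range(2)]


def max_eigenvalue_sym(mat: list) -> float:
    """Largest eigenvalue of a symmetric 2x2 matrix (stable closed form)."""
    mean = (mat[0][0] + mat[1][1]) / 2
    off = (mat[0][1] + mat[1][0]) / 2
    return mean + math.hypot((mat[0][0] - mat[1][1]) / 2, off)


def lyapunov_matrix(rho: float) -> list:
    """Closed form of (16) for A = diag(1/2, 2): both series are geometric."""
    geo = 1.0 / (1.0 - math.exp(-2 * rho))
    return [[geo, 0.0], [0.0, -math.exp(-2 * rho) * geo]]


def lyapunov_defect(b: list, s: list) -> float:
    """Largest eigenvalue of (A + B)^T S (A + B) - S."""
    ab = [[A[i][j] + b[i][j] for j in range(2)] for i in range(2)]
    m = matmul(matmul(transpose(ab), s), ab)
    return max_eigenvalue_sym([[m[i][j] - s[i][j] for j in range(2)] for i in range(2)])


def worst_defect(kappa: float, trials: int, seed: int, rho: float = 0.3) -> float:
    """Worst case of the defect over random B with Frobenius norm <= kappa."""
    rng = random.Random(seed)
    s = lyapunov_matrix(rho)
    worst = -float("inf")
    for _ in range(trials):
        b = [[rng.uniform(-kappa / 2, kappa / 2) for _ in range(2)] for _ in range(2)]
        worst = max(worst, lyapunov_defect(b, s))
    return worst


if __name__ == "__main__":
    # Inequality (25) asks for the defect to be at most -1/4.
    print(worst_defect(kappa=0.02, trials=200, seed=0) <= -0.25)
```

Since $S_m$ does not depend on $m$ here, a single matrix inequality covers every index; with a time-dependent perturbation $B_m$ the same check applies to each sampled matrix separately.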

References 1. Barreira, L., Pesin, Ya.: Nonuniform Hyperbolicity. Encyclopedia of Mathematics and Its Applications 115, Cambridge University Press, 2007 2. Barreira, L., Schmeling, J.: Sets of “non-typical” points have full topological entropy and full Hausdorff dimension. Israel J. Math. 116, 29–70 (2000) 3. Barreira, L., Valls, C.: Stability theory and Lyapunov regularity. J. Diff. Eqs. 232, 675–701 (2007) 4. Barreira, L., Valls, C.: Robustness of nonuniform exponential dichotomies in Banach spaces. J. Diff. Eqs. 244, 2407–2447 (2008) 5. Barreira, L., Valls, C.: Stability of Nonautonomous Differential Equations. Lect. Notes. in Math. 1926, Berlin-Heidelberg: Springer Verlag, 2008 6. Bhatia, N., Szegö, G.: Stability Theory of Dynamical Systems. Grundlehren der mathematischen Wissenschaften 161, Berlin-Heidelberg-New York: Springer, 1970 7. Chicone, C., Latushkin, Yu.: Evolution Semigroups in Dynamical Systems and Differential Equations. Mathematical Surveys and Monographs 70, Providence, RI: Amer. Math. Soc., 1999 8. Chow, S.-N., Leiva, H.: Existence and roughness of the exponential dichotomy for skew-product semiflow in Banach spaces. J. Diff. Eqs. 120, 429–477 (1995) 9. Coppel, W.: Dichotomies and reducibility. J. Diff. Eqs. 3, 500–521 (1967) 10. Coppel, W.: Dichotomies in Stability Theory. Lect. Notes in Math. 629, Berlin-Heidelberg-New York: Springer, 1978 11. Dalec ki˘ı, Ju., Kre˘ın, M.: Stability of Solutions of Differential Equations in Banach Space. Translations of Mathematical Monographs 43, Providence, RI: Amer. Math. Soc., 1974 12. Hahn, W.: Stability of Motion. Grundlehren der mathematischen Wissenschaften 138, Berlin-HeidelbergNew York: Springer, 1967 13. Hale, J.: Asymptotic Behavior of Dissipative Systems. Mathematical Surveys and Monographs 25, Providence, RI: Amer. Math. Soc., 1988 14. Henry, D.: Geometric Theory of Semilinear Parabolic Equations. Lect. Notes in Math. 840, BerlinHeidelberg-New York: Springer, 1981 15. 
LaSalle, J., Lefschetz, S.: Stability by Liapunov’s Direct Method, with Applications. Mathematics in Science and Engineering 4, London-New York: Academic Press, 1961 16. Lyapunov, A.: The General Problem of the Stability of Motion. London: Taylor & Francis, 1992 17. Ma˘ızel’, A.: On stability of solutions of systems of differential equations. Ural. Politehn. Inst. Trudy 51, 20–50 (1954) 18. Massera, J., Schäffer, J.: Linear differential equations and functional analysis. I. Ann. of Math. (2) 67, 517–573 (1958)


19. Massera, J., Schäffer, J.: Linear Differential Equations and Function Spaces. Pure and Applied Mathematics 21, London-New York: Academic Press, 1966
20. Mitropolsky, Yu., Samoilenko, A., Kulik, V.: Dichotomies and Stability in Nonautonomous Linear Systems. Stability and Control: Theory, Methods and Applications 14, London: Taylor & Francis, 2003
21. Naulin, R., Pinto, M.: Admissible perturbations of exponential dichotomy roughness. Nonlinear Anal. 31, 559–571 (1998)
22. Oseledets, V.: A multiplicative ergodic theorem. Liapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–221 (1968)
23. Perron, O.: Die Stabilitätsfrage bei Differentialgleichungen. Math. Z. 32, 703–728 (1930)
24. Pliss, V., Sell, G.: Robustness of exponential dichotomies in infinite-dimensional dynamical systems. J. Dynam. Diff. Eqs. 11, 471–513 (1999)
25. Popescu, L.: Exponential dichotomy roughness on Banach spaces. J. Math. Anal. Appl. 314, 436–454 (2006)
26. Sell, G., You, Y.: Dynamics of Evolutionary Equations. Applied Mathematical Sciences 143, Berlin-Heidelberg-New York: Springer, 2002

Communicated by G. Gallavotti

Commun. Math. Phys. 290, 239–248 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0794-4


K-Causality Coincides with Stable Causality

E. Minguzzi

Dipartimento di Matematica Applicata, Università degli Studi di Firenze, Via S. Marta 3, I-50139 Firenze, Italy. E-mail: [email protected]

Received: 18 August 2008 / Accepted: 23 December 2008
Published online: 4 April 2009 – © Springer-Verlag 2009

Abstract: It is proven that $K$-causality coincides with stable causality, and that in a $K$-causal spacetime the relation $K^+$ coincides with Seifert’s relation. As a consequence the causal relation “the spacetime is strongly causal and the closure of the causal relation is transitive” stays between stable causality and causal continuity.

1. Introduction

The relation $K^+$ is defined as the smallest closed and transitive relation which contains the causal relation $J^+$. It was introduced by Sorkin and Woolgar in [16], who also defined a spacetime as $K$-causal if the relation $K^+$ is antisymmetric. The relation $K^+$ was originally conceived to recast global causal analysis in an order-theoretic framework, or as a tool for exploring spacetimes with $C^0$ metrics or varying topology [3,16].

In [7,8] I compared $K$-causality with the levels already present in the causal ladder of spacetimes [13,15], a well-known hierarchy of conformally invariant properties whose study started in a seminal work by Hawking and Sachs [5], and which in recent years has seen the introduction of new levels [7,11] and some improvements [2,12,13]. In this respect, right from the introduction of $K$-causality, R. Low [16, footnote p. 1990] suggested that this property coincides with stable causality [4]. Indeed, stable causality is equivalent to the antisymmetry of Seifert’s relation [14] $J_S^+ = \bigcap_{g' > g} J_{g'}^+$ (a fact rigorously proved in [8], see also [5]), where $J_S^+$ is a closed, transitive relation which contains $J^+$. These last properties imply, from the definition of $K^+$, that $K^+ \subset J_S^+$, and since the antisymmetric property is inherited by inclusion, stable causality implies $K$-causality. The open question was whether the equality $K^+ = J_S^+$ holds, because in this case $K$-causality and stable causality would be equivalent. Actually [8] there are causal examples for which $J_S^+ \neq K^+$, but nevertheless it could still be that $K$-causality coincides with stable causality, in particular if $K$-causality forces the equality $K^+ = J_S^+$. In


Seifert’s work [14] there is indeed an unproved claim$^1$ (Lemma 2) which is equivalent to such a statement, although it should be noted that $K$-causality was not yet defined at the time (see [8] for a discussion). Thus the problem of the equivalence between stable and $K$-causality has been around for almost four decades, though it has attracted attention only in the last twelve years.

As I shall prove below, $K$-causality and stable causality do indeed coincide and, thanks to the results of [8], this equivalence implies that in a $K$-causal spacetime the $K^+$ relation coincides with the Seifert relation. Given this result the logical structure of some other proofs simplifies considerably; I mention the proof that causal continuity implies stable causality and the proof that chronological spacetimes without lightlike lines are stably causal. It also suggests the definition of a new causal relation which stays between stable causality and causal continuity. This relation, here termed causal easiness, is: the spacetime is strongly causal and $\bar J^+$ is transitive.

The proof of the coincidence between stable causality and $K$-causality uses the concept of “compact stable causality” introduced in [9]. In short, a spacetime is compactly stably causal if for every compact set the light cones can be widened on the compact set while preserving causality. In [9] I proved that $K$-causality implies compact stable causality, and I gave examples which show that the two properties differ.

I refer the reader to [7,13] for most of the conventions used in this work. In particular, I denote with $(M, g)$ a $C^r$ spacetime (connected, time-oriented Lorentzian manifold), $r \in \{3, \dots, \infty\}$, of arbitrary dimension $n \ge 2$ and signature $(-, +, \dots, +)$. On $M \times M$ the usual product topology is defined. For convenience and generality I often use the causal relations on $M \times M$ in place of the more widespread point-based relations $I^+(x)$, $J^+(x)$, $E^+(x)$ (and past versions). All the causal curves that we shall consider are future directed. The subset symbol $\subset$ is reflexive, $X \subset X$. Several versions of the limit curve theorem will be repeatedly used, particularly those referring to sequences of $g_n$-causal curves, where the metrics $g_n$ in the sequence may differ. The reader is referred to [10] for a sufficiently strong formulation (see also Remark 3 of [9]). With $A^+$ I denote [1,7,17] the closure of the causal relation, $A^+ = \bar J^+$, and a spacetime on which $A^+$ is antisymmetric is called $A$-causal. For our purposes, it will be useful to recall the implications: $K$-causality ⇒ compact stable causality ⇒ $A$-causality ⇒ strong causality ⇒ non-total imprisonment ⇒ causality. Subsequences are denoted by changing the index, thus $x_k$ may denote a subsequence of $x_n$. The set $\Delta$ is the diagonal on $M \times M$.

2. $K$-Causality Coincides with Stable Causality

We need some preliminary lemmas. The first one basically states that if two points are $K$-related but not causally related then it is possible to find a new point, in a compact shell as close to infinity as one wishes, which stays in the “middle” of the original points.

Lemma 1. Let $(M, g)$ be a non-total imprisoning spacetime. If $(x, z) \in K^+ \setminus J^+$ then for every compact $C$ there is $w \in M \setminus C$ such that $(x, w) \in K^+$ and $(w, z) \in K^+$. In particular if $(x, z) \in K^+ \setminus J^+$ then for every open set $B$ with compact closure, with $x, z \in B$, there is $y \in \dot B$ such that $(x, y) \in K^+$ and $(y, z) \in K^+$.

$^1$ Seifert’s unproved claim has raised some confusion in recent literature. I warn the reader that in the preprint gr-qc/9912090v1, Dowker et al. claimed that stable causality implies $K^+ = J_S^+$. Actually, the proof relied on Seifert’s Lemma 2 so that after realizing the inconsistency of that lemma they correctly removed this statement from the published version [3]. Unfortunately, in [6] the authors attribute this result to Dowker et al., as they took this information from the preprint version.


Proof. Let $R^+ = \{(x, z) \in K^+$ such that $(x, z) \in J^+$ or for every compact set $C$ there is $w \in M \setminus C$ such that $(x, w) \in K^+$ and $(w, z) \in K^+\}$. We are going to prove that $R^+$ is closed and transitive, and since $J^+ \subset R^+ \subset K^+$ this fact will imply $R^+ = K^+$, from which the first statement will follow. The last statement is a trivial consequence of the first statement and [16, Lemma 14] [8, Lemma 5.3].

For the transitivity let $(x, y) \in R^+$ and $(y, z) \in R^+$. If both belong to $J^+$ then $(x, z) \in J^+ \subset R^+$. If the latter pair does not belong to $J^+$ then whatever the compact set $C$ there is $w \in M \setminus C$ such that $(y, w) \in K^+$ and $(w, z) \in K^+$; thus since $(x, y) \in K^+$, we have $(x, w) \in K^+$, and hence $(x, z) \in R^+$. If the former pair does not belong to $J^+$ the proof is analogous.

For the closure let $(x_n, z_n) \to (x, z)$ with $(x_n, z_n) \in R^+$. We have to prove that $(x, z) \in R^+$, thus we can assume $x \neq z$, since $\Delta \subset J^+ \subset R^+$. If there is a subsequence $(x_k, z_k) \in J^+$, let $\sigma_k$ be a sequence of causal curves connecting $x_k$ to $z_k$. By the limit curve theorem either there is a causal curve connecting $x$ to $z$, in which case $(x, z) \in J^+ \subset R^+$ and there is nothing left to prove, or there is a past inextendible limit causal curve $\sigma^z$ ending at $z$, such that for every point $w \in \sigma^z$, $(x, w) \in \bar J^+ \subset K^+$. Since $(M, g)$ is non-total imprisoning, $\sigma^z$ must escape every compact set, thus choosing a compact $C$, $w$ can be chosen in $M \setminus C$. Since clearly $(w, z) \in J^+ \subset K^+$, it follows that $(x, z) \in R^+$. Thus without loss of generality we can assume that none of the elements in the sequence $(x_n, z_n)$ belongs to $J^+$. Let $C$ be a compact set and let $B$ be an open set with compact closure such that $C \subset B$ and $x, z \in B$, so that we can assume (pass to a subsequence if necessary) $x_n, z_n \in B$. Since $(x_n, z_n) \in R^+$ and $\bar B$ is compact, there is $w'_n$ in $M \setminus \bar B$ such that $(x_n, w'_n) \in K^+$ and $(w'_n, z_n) \in K^+$. By a well known result [16, Lemma 14] [8, Lemma 5.3] it is possible to find $w_n \in \dot B \subset M \setminus C$ such that $(x_n, w_n) \in K^+$ and $(w_n, z_n) \in K^+$. Since $\dot B$ is compact there is a subsequence such that $(x_i, w_i) \to (x, w)$ and $(w_i, z_i) \to (w, z)$ with $w \in \dot B$. Since $K^+$ is closed we have in particular $(x, w) \in K^+$ and $(w, z) \in K^+$, from which $(x, z) \in R^+$ follows.

The next lemma clarifies that if it is possible to enlarge the light cones in an arbitrary compact set while preserving $K$-causality then the process can be continued all over the spacetime.

Lemma 2. The statement “if $(M, g)$ is $K$-causal then for every compact set $C$, there is a metric $g_C \ge g$, such that $g_C > g$ on $C$, with $(M, g_C)$ $K$-causal”, implies the apparently stronger statement “if $(M, g)$ is $K$-causal then it is also stably causal”.

Proof. Assume the first statement. Note that if $C \subset \mathrm{Int}\, C'$, and $C'$ is compact, by taking a point dependent convex combination of $g$ and $g_C$ it is possible to find $g'_C \ge g$, $g < g'_C \le g_C$ on $C$, $g'_C = g$ outside $C'$. Hence, since $g'_C \le g_C$, $(M, g'_C)$ is $K$-causal (recall Lemma 5.10 of [8]). Thus the statement “if $(M, g)$ is $K$-causal then for every compact $C$, there is a metric $g_C \ge g$, such that $g_C > g$ on $C$, with $(M, g_C)$ $K$-causal” implies “if $(M, g)$ is $K$-causal then for every pair of compacts $C$, $C'$, $C \subset \mathrm{Int}\, C'$, there is a metric $g_{CC'} \ge g$, such that $g_{CC'} > g$ on $C$, $g_{CC'} = g$ outside $C'$, with $(M, g_{CC'})$ $K$-causal”.

Assume that $(M, g)$ is $K$-causal. Take $p \in M$ and let $h$ be a complete Riemannian metric on $M$. Let $B_n(p)$ be closed balls centered at $p$ of $h$-radius $n$. By the Hopf-Rinow theorem they are compact. Let $g_2 \ge g$ be a metric such that $g_2 > g$ on $B_2(p)$, $g_2 = g$ outside $B_3(p)$ and $(M, g_2)$ is $K$-causal. Consider the compacts $C_3 = B_3(p) \setminus \mathrm{Int}\, B_2(p)$, and


$C'_3 = B_4(p) \setminus \mathrm{Int}\, B_1(p)$, let $g_3 \ge g_2$ be a metric such that $g_3 > g_2$ on $C_3$, $g_3 = g_2$ outside $C'_3$, and $(M, g_3)$ is $K$-causal. Continue in this way by defining $C_n = B_n(p) \setminus \mathrm{Int}\, B_{n-1}(p)$, and $C'_n = B_{n+1}(p) \setminus \mathrm{Int}\, B_{n-2}(p)$, and let $g_n \ge g_{n-1}$ be a metric such that $g_n > g_{n-1}$ on $C_n$, $g_n = g_{n-1}$ outside $C'_n$. By induction, given the assumed statement, $(M, g_n)$ is $K$-causal. Now, note that if $x \in B_n(p)$, then $g_k(x)$ is independent of $k$ for $k \ge n + 1$. Define $g'$ so that if $x \in B_n(p)$, $g'(x) = g_{n+1}(x)$. Clearly, for every $n$, $g' \ge g_n$ and $g' > g$. Suppose $(M, g')$ is not causal; then there is a closed $g'$-causal curve $\gamma$, which necessarily is contained in a compact $B_s(p)$. But $g'|_{B_s(p)} = g_{s+1}|_{B_s(p)}$, thus $\gamma$ is $g_{s+1}$-causal, in contradiction with the causality of $(M, g_{s+1})$. Thus $(M, g)$ is stably causal.

In order to prove that the metric can be enlarged over a compact set $C$ while preserving $K$-causality, we are going to enlarge it in a finite covering of $C$ made of open sets $A_x$ constructed as in the next lemma.

As a matter of notation, in the next lemma $J^+_{(\bar A_x, g')}$ denotes the set made of the diagonal of the compact $\bar A_x \times \bar A_x$ plus the pairs in $\bar A_x \times \bar A_x$ which can be joined by a continuous $g'$-causal curve of $(M, g')$ entirely contained in $\bar A_x$ (it is an abuse of notation since $(\bar A_x, g')$ is not a spacetime as $\bar A_x$ is compact).

Lemma 3. Let $(M, g)$ be a compactly stably causal spacetime. Let $C$ be a compact set and $B \supset C$ be an open set with compact closure. There is a metric $g_B \ge g$, $g_B > g$ on $B$, $g_B = g$ on $M \setminus B$, such that $(M, g_B)$ at every point $x \in C$ admits an open neighborhood $A_x$ with compact closure $\bar A_x \subset B$ such that $\bar A_x$ is $g_B$-causally convex. As a consequence, for every $g' \le g_B$, $\bar A_x$ is $g'$-causally convex, no future inextendible continuous $g'$-causal curve is future imprisoned in $\bar A_x$, and $J^+_{(\bar A_x, g')}$ is compact.

Proof. This proof is similar to that of [8, Lemma 3.10]. Since $(M, g)$ is compactly stably causal there is $g'_B$, $g'_B > g$ on $B$, $g'_B = g$ on $M \setminus B$, such that $(M, g'_B)$ is causal [9]. Let $g_B \ge g$ be a metric such that $g < g_B < g'_B$ on $B$, $g_B = g$ on $M \setminus B$. Let $x \in C$; it admits a nested family of $g_B$-globally hyperbolic neighborhoods $V_n$, $\bar V_{n+1} \subset V_n$, whose closures are all $g_B$-causally convex in $V_1$, the set $\{V_n\}$ giving a base for the topology at $x$ (see [13]). We can also assume that for all $n$, $\bar V_n \subset B$, and $V_1$ has compact closure. If none of the sets $\bar V_n$ is $g_B$-causally convex in $M$ there is a sequence of $g_B$-causal curves $\sigma_n$ of endpoints $x_n$, $z_n$, with $x_n \to x$, $z_n \to x$, not entirely contained in $V_1$ and hence in $\bar V_2$. Let $c_n \in \dot V_2$ be the first point at which $\sigma_n$ escapes $\bar V_2$, and let $d_n$ be the last point at which $\sigma_n$ reenters $\bar V_2$. Since $\dot V_2$ is compact there are $c, d \in \dot V_2$ and a subsequence $\sigma_k$ such that $c_k \to c$, $d_k \to d$, and since $V_1$ is globally hyperbolic the causal relation on $V_1 \times V_1$, $J^+_{(V_1, g_B)}$, is closed and hence $(x, c), (d, x) \in J^+_{(V_1, g_B)}$; thus $d \neq c$ as the spacetime $(V_1, g_B)$ is causal, and finally $(x, c), (d, x) \in J^+_{(M, g_B)}$. Taking into account that $(c_k, d_k) \in J^+_{(M, g_B)}$ it is $(c, d) \in \bar J^+_{(M, g_B)}$.

Let us widen the light cones from $g_B$ to $g'_B$. There is a $g'_B$-timelike curve connecting $d$ to $c$ passing through $x$, and since $(c, d) \in \bar J^+_{(M, g'_B)}$ and $I^+_{(M, g'_B)}$ is open there is a closed $g'_B$-timelike curve passing through $x$, a contradiction with the causality of $(M, g'_B)$. The contradiction proves that there is a choice of $n$ for which $\bar V_n$ is $g_B$-causally convex. Set $A_x = V_n$; then $\bar A_x$ is also clearly $g'$-causally convex for every $g' \le g_B$. Since $(V_1, g_B)$ is globally hyperbolic it is also non-total imprisoning, in particular no future inextendible continuous $g'$-causal curve is future imprisoned in the compact $\bar A_x$. The fact that $J^+_{(\bar A_x, g')}$ is compact follows from the compactness of $\bar A_x$; indeed by the limit curve theorem any sequence of continuous $g'$-causal curves in $\bar A_x$ with endpoints converging to a pair


$(y, z) \in \bar A_x \times \bar A_x$, $y \neq z$, necessarily admits a limit $g'$-causal curve connecting $y$ to $z$ contained in $\bar A_x$, as the alternative would imply the presence of a future inextendible continuous $g'$-causal curve future imprisoned in the compact $\bar A_x$ passing through $y$.

Recall that if $R^+$ is a generic relation, $(R^+)^0$ is by definition the diagonal of $M \times M$, while $(R^+)^i$ denotes the composition of the relation with itself $i$ times.

Lemma 4. Let $(M, g)$, $C$, $B$, $g_B$ and the sets $\{A_x\}$ be as in Lemma 3. Let $g'$ be a metric such that $g \le g' \le g_B$. Let $x \in C$; if $(M, g')$ is $K$-causal then there is $g''$, $g' \le g'' \le g_B$, such that $(M, g'')$ is $K$-causal and $g'' > g'$ on $A_x$.

Proof. By assumption $K^+_{(M, g')}$ is antisymmetric. Let $\tilde g$ be a metric such that $g' \le \tilde g \le g_B$, $\tilde g > g'$ on $A_x$, $\tilde g = g'$ on $M \setminus A_x$ (e.g. a point dependent convex combination of $g'$ and $g_B$). For every $i \ge 0$, $K^+_{(M, g')} \circ (J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')})^i \subset K^+_{(M, \tilde g)}$, as it is $J^+_{(\bar A_x, \tilde g)} \subset J^+_{(M, \tilde g)} \subset K^+_{(M, \tilde g)}$ and $K^+_{(M, g')} \subset K^+_{(M, \tilde g)}$ (note that $K^+_{(M, \tilde g)}$ is closed, transitive and contains $J^+_{(M, g')}$), thus

$$\bigcup_{i=0}^{+\infty} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')}\bigr)^i \subset K^+_{(M, \tilde g)}. \qquad (1)$$

Suppose we prove that $\tilde g$ is also such that there is $N > 0$ so that

$$\bigcup_{i=0}^{+\infty} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')}\bigr)^i = \bigcup_{i=0}^{N} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')}\bigr)^i. \qquad (2)$$

Each term $K^+_{(M, g')} \circ (J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')})^i$ is closed, a fact which follows easily from the observation that the composition of a closed, a compact and a closed relation is closed. Thus the right-hand side of Eq. (2) is closed as it is the union of a finite number of closed sets. Moreover, it is also transitive because it equals the left-hand side of Eq. (2), which is clearly transitive. Finally, $J^+_{(M, \tilde g)}$ is contained in it, a property which follows from the fact that since $\bar A_x$ is $g'$-causally convex ($g'$ and $\tilde g$ coincide outside $A_x$)

$$J^+_{(M, \tilde g)} = J^+_{(M, g')} \cup J^+_{(M, g')} \circ J^+_{(\bar A_x, \tilde g)} \circ J^+_{(M, g')}.$$

[The previous equation means that if $(x, z) \in J^+_{(M, \tilde g)}$ then the $\tilde g$-causal curve connecting $x$ to $z$ either passes outside $A_x$, in which case it is $g'$-causal and $(x, z) \in J^+_{(M, g')}$, or it intersects $\bar A_x$ on, by $g'$-causal convexity of $\bar A_x$, a single segment. In this last case since the points at which the curve enters and escapes $\bar A_x$ are $\tilde g$-causally related, it is $(x, z) \in J^+_{(M, g')} \circ J^+_{(\bar A_x, \tilde g)} \circ J^+_{(M, g')}$.]

Thus Eq. (2) implies

$$K^+_{(M, \tilde g)} \subset \bigcup_{i=0}^{N} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')}\bigr)^i,$$

and hence by Eq. (1),

$$K^+_{(M, \tilde g)} = \bigcup_{i=0}^{N} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, \tilde g)} \circ K^+_{(M, g')}\bigr)^i.$$


Consider a sequence of metrics $g_n$, $g_n \to g'$ pointwisely, which have the properties $g' \le g_n \le g_B$, $g_n > g'$ on $A_x$, $g_n = g'$ on $M \setminus A_x$ (for instance take $\bar g$, $g' \le \bar g \le g_B$, $\bar g > g'$ on $A_x$, $\bar g = g'$ on $M \setminus A_x$ and define $g_n = (1 - \frac{1}{n}) g' + \frac{1}{n} \bar g$). Assume that a subsequence $g_k$ exists such that for each value of $k$, Eq. (2) with $\tilde g = g_k$ does not hold no matter the value of $N(k)$. For every $k$, since the equation

$$\bigcup_{i=0}^{+\infty} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, g_k)} \circ K^+_{(M, g')}\bigr)^i = \bigcup_{i=0}^{N} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, g_k)} \circ K^+_{(M, g')}\bigr)^i, \qquad (3)$$

does not hold for any $N$, it is possible to find for each $k$ a chain $x_j^{(k)}$ such that $(x_{2i}^{(k)}, x_{2i+1}^{(k)}) \in J^+_{(\bar A_x, g_k)}$ and $(x_{2i+1}^{(k)}, x_{2i+2}^{(k)}) \in K^+_{(M, g')} \setminus J^+_{(\bar A_x, g_k)}$ with $0 \le i \le k$. Let $D$ be an open set with compact closure such that $\bar B \subset D$. Note that for $1 \le i \le k - 1$, since $J^+_{(\bar A_x, g')} \subset J^+_{(\bar A_x, g_k)}$,

$$(x_{2i+1}^{(k)}, x_{2i+2}^{(k)}) \in K^+_{(M, g')} \cap \{\bar A_x \times \bar A_x\} \setminus J^+_{(\bar A_x, g')}$$
$$= [K^+_{(M, g')} \cap (\bar A_x \times \bar A_x)] \setminus [J^+_{(M, g')} \cap (\bar A_x \times \bar A_x)]$$
$$= [K^+_{(M, g')} \setminus J^+_{(M, g')}] \cap (\bar A_x \times \bar A_x),$$

where the first equality follows from the $g'$-causal convexity of $\bar A_x$. Thus by Lemma 1, there is $w_i^{(k)} \in \dot D$ such that $(x_{2i+1}^{(k)}, w_i^{(k)}) \in K^+_{(M, g')}$ and $(w_i^{(k)}, x_{2i+2}^{(k)}) \in K^+_{(M, g')}$.

Now we consider, by starting from $i = 1$, the sequence $(x_{2i}^{(k)}, x_{2i+1}^{(k)})$ as dependent on $k$ and pass to a convergent subsequence $(x_{2i}^{(k_1)}, x_{2i+1}^{(k_1)})$; then we consider the sequence $(x_{2i+1}^{(k_1)}, w_i^{(k_1)})$ as dependent on $k_1$ and pass to a convergent subsequence $(x_{2i+1}^{(k_2)}, w_i^{(k_2)})$; then we consider the sequence $(w_i^{(k_2)}, x_{2i+2}^{(k_2)})$ as dependent on $k_2$ and pass to a convergent subsequence $(w_i^{(k_3)}, x_{2i+2}^{(k_3)})$; then we pass to the next value of $i$ and continue in this way, each time passing to a convergent subsequence.

Moreover, we use the fact that if an arbitrary sequence $(x_t^{(j)}, x_{t+1}^{(j)}) \in J^+_{(\bar A_x, g_j)}$ converges to $(x_t, x_{t+1})$, then $(x_t, x_{t+1}) \in J^+_{(\bar A_x, g')}$ because $g_j \to g'$ [recall that by the limit curve theorem, since no $g'$-causal curve is future imprisoned in $\bar A_x$, a sequence of connecting $g_j$-causal curves contained in $\bar A_x$ of endpoints $(x_t^{(j)}, x_{t+1}^{(j)})$ has a limit $g'$-causal curve contained in $\bar A_x$ of endpoints $(x_t, x_{t+1})$]. The limit pairs belong alternatively to $J^+_{(\bar A_x, g')}$, $K^+_{(M, g')} \cap (\bar A_x \times \dot D)$ and $K^+_{(M, g')} \cap (\dot D \times \bar A_x)$, and since $J^+_{(\bar A_x, g')} \subset K^+_{(M, g')}$ it is possible to find a sequence denoted $y_s$, $(y_s, y_{s+1}) \in K^+_{(M, g')}$, such that for $s$ even, $y_s \in \dot D$, while for odd $s$, $y_s \in \bar A_x$. From this sequence we are going to find two points $p \in \bar A_x$ and $q \in \dot D$, and hence $p \neq q$, such that $(p, q) \in K^+_{(M, g')}$ and $(q, p) \in K^+_{(M, g')}$, in contradiction with the $K$-causality of $(M, g')$. Consider the sequence $(y_{2j+1}, y_{2j+2}) \in \bar A_x \times \dot D$ and pass to a converging subsequence $(y_{2j_r+1}, y_{2j_r+2}) \to (p, q)$. Since $K^+_{(M, g')}$ is closed, $(p, q) \in K^+_{(M, g')}$. Since $j_{r+1} \ge j_r + 1$, $2j_{r+1} + 1 \ge 2j_r + 2$, thus $(y_{2j_r+2}, y_{2j_{r+1}+1}) \in K^+_{(M, g')}$ as this last relation is transitive. Passing to the limit $r \to +\infty$, $(q, p) \in K^+_{(M, g')}$.


The contradiction proves that for sufficiently large $n$ there is always $N(n)$ such that

$$\bigcup_{i=0}^{+\infty} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, g_n)} \circ K^+_{(M, g')}\bigr)^i = \bigcup_{i=0}^{N(n)} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, g_n)} \circ K^+_{(M, g')}\bigr)^i, \qquad (4)$$

thus for sufficiently large $n$ (in what follows we pass to a subsequence denoted in the same way so that it will hold for every $n$),

$$K^+_{(M, g_n)} = \bigcup_{i=0}^{N(n)} K^+_{(M, g')} \circ \bigl(J^+_{(\bar A_x, g_n)} \circ K^+_{(M, g')}\bigr)^i.$$

We would conclude the proof by proving that there is a choice of $n$ such that the corresponding $K^+_{(M, g_n)}$ is antisymmetric; indeed we would set $g'' = g_n$.

Here the argument is basically the same as the one that led to the construction of the points $p$ and $q$. Since $K^+_{(M, g')}$ and $J^+_{(\bar A_x, g_n)}$ are antisymmetric for every $n$, if $K^+_{(M, g_n)}$ were not antisymmetric for any value of $n$ then, for each $n$, we would find a closed chain of points so that the successive pairs belong to $J^+_{(\bar A_x, g_n)}$ and $K^+_{(M, g')} \setminus J^+_{(\bar A_x, g_n)}$. However, a pair belonging to $K^+_{(M, g')} \setminus J^+_{(\bar A_x, g_n)}$ belongs also to $K^+_{(M, g')} \setminus J^+_{(\bar A_x, g')}$, so that there is a point in $\dot D$ so as to split the pair in two, the middle point belonging to $\dot D$ and both pairs belonging to $K^+_{(M, g')}$. Then by passing to subsequences as done above (basically to get the limit $n \to +\infty$), we find a chain of $K^+_{(M, g')}$-related events alternatively belonging to $\bar A_x$ and $\dot D$. If the chain is finite and closed then it is easy to infer the contradiction that the spacetime $(M, g')$ is not $K$-causal. If it is infinite one gets again the same conclusion by using the argument used above in the construction of $p$ and $q$.
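The stabilization expressed by Eqs. (2) and (4) can be illustrated on a finite toy model. The following Python sketch is only an analogue under strong assumptions: relations are finite sets of pairs, so every relation is trivially closed and the topological part of the argument plays no role. The names `compose`, `transitive_closure`, `is_antisymmetric` and `stabilised_union`, as well as the sample relations `K` and `J`, are hypothetical and chosen here for illustration. The sketch accumulates the terms $K \circ (J \circ K)^i$ until a new term adds nothing, and compares the result with the smallest transitive relation containing both $K$ and $J$.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def transitive_closure(r):
    """Smallest transitive relation containing r (finite case)."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger


def is_antisymmetric(r):
    """True if (a, b) and (b, a) both in r force a == b."""
    return all(a == b for (a, b) in r if (b, a) in r)


def stabilised_union(k, j):
    """Accumulate K o (J o K)^i for i = 0, 1, ... until a term adds nothing new.

    Returns the union and the last index N that contributed new pairs.
    """
    jk = compose(j, k)
    term = set(k)    # the i = 0 term
    union = set(k)
    n = 0
    while True:
        term = compose(term, jk)   # next term K o (J o K)^(i+1)
        if term <= union:          # nothing new: later terms add nothing either
            return union, n
        union |= term
        n += 1


# Toy data (assumed for illustration): five points, K a reflexive transitive
# relation, J a relation supported on the "enlarged" subset {1, 2}.
points = range(5)
diagonal = {(p, p) for p in points}
K = transitive_closure(diagonal | {(0, 1), (2, 3)})
J = {(1, 1), (2, 2), (1, 2)}

union, N = stabilised_union(K, J)
print("stabilises at N =", N)
print("equals transitive closure of K and J:", union == transitive_closure(K | J))
print("antisymmetric:", is_antisymmetric(union))
```

In the finite case the loop always terminates, which is exactly what has to be proved by a compactness argument in Lemma 4; the sketch is meant to show the bookkeeping of the composed relations, not to replace that argument.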

Lemma 5. Let $C$ be a compact set. If $(M, g)$ is $K$-causal then there is a metric $g_C \ge g$, such that $g_C > g$ on $C$, with $(M, g_C)$ $K$-causal.

Proof. Since $(M, g)$ is $K$-causal it is compactly stably causal. Let $B$, $g_B$ and the sets $\{A_x\}$ be as in Lemma 3. Since $C$ is compact there is a finite covering $\{A_{x_i}\}$, thus one can start enlarging the metric in $A_{x_1}$ while keeping $K$-causality according to Lemma 4, and continue with successive enlargements so as to obtain a final metric $g_C$ as in the statement of this lemma.

Theorem 1. $K$-causality coincides with stable causality.

Proof. If $(M, g)$ is $K$-causal then it is stably causal; indeed this result follows as a corollary of Lemmas 2 and 5. The other direction is well known, see the discussion in the Introduction.

Theorem 2. If $(M, g)$ is $K$-causal (stably causal) then $K^+ = J_S^+$.

Proof. It is a consequence of Theorem 6.2 of [8].

3. Causal Easiness

The equivalence between $K$-causality and stable causality suggests defining a new conformally invariant property


Definition 1. A spacetime which is $A$-causal and such that $A^+ = K^+$ is said to be causally easy.

It is actually natural to define the property of causal easiness; indeed it appears in [9, Theorem 5], where it is proven that a spacetime which is chronological and has no lightlike line is causally easy. Notice that the condition $A^+ = K^+$ states that $\bar J^+$ is transitive.

Theorem 3 (Transverse conformal ladder). The compactness of the causal diamonds implies the closure of the causal relation, which implies reflectivity, which implies the transitivity of $\bar J^+$.

Proof. It is well known that the compactness of the causal diamonds $J^+(x) \cap J^-(z)$ for all $x, z \in M$ implies $\bar J^+ = J^+$, see for instance [13, Prop. 3.68 and 3.71]. Now recall [12] that the relation $D^+ = \{(x, y) : y \in \overline{I^+(x)} \text{ and } x \in \overline{I^-(y)}\}$ is reflexive and transitive. It holds $D^+ = A^+$ iff the spacetime is reflective [12]. Since $J^+ \subset D^+ \subset A^+$, $J^+ = A^+$ implies reflectivity. Finally, reflectivity (but future or past reflectivity would be sufficient) implies the transitivity of $A^+$; indeed $D^+ = A^+$, and since $D^+$ is transitive then $A^+$ is transitive.

A spacetime can be stably causal without being causally easy (the spacetime of Fig. 38 of [4] without the identification). A spacetime can also be causally easy without being causally continuous (1+1 Minkowski spacetime with a timelike geodesic segment removed). Causal easiness can indeed be placed between these two levels.

Theorem 4. Causal continuity implies causal easiness which implies $K$-causality (stable causality).

Proof. Recall [12] that a spacetime is causally continuous iff it is weakly distinguishing, that is $D^+$ is antisymmetric, and reflective, that is $D^+ = A^+$. But since $D^+$ is antisymmetric then $A^+$ is antisymmetric, that is the spacetime is $A$-causal. Moreover, by Theorem 3 reflectivity implies the transitivity of $A^+$ ($A^+ = K^+$), thus the spacetime is causally easy. Assume the spacetime is causally easy, that is $A^+$ is antisymmetric and $A^+ = K^+$; then $K^+$ is antisymmetric, that is, the spacetime is $K$-causal.

The definition of causal easiness can be improved by weakening the condition of $A$-causality to strong causality.
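The implications obtained in Theorems 1, 3 and 4 can be collected in a single display. The LaTeX block below is only a compact restatement in the notation used above; it adds no new result.

```latex
\begin{align*}
&\text{causal continuity} \;\Rightarrow\; \text{causal easiness}
  \;\Rightarrow\; \text{stable causality} \;\Leftrightarrow\; K\text{-causality},\\
&\text{compact causal diamonds} \;\Rightarrow\; \bar J^{+} = J^{+}
  \;\Rightarrow\; \text{reflectivity}
  \;\Rightarrow\; \bar J^{+} \text{ transitive } (A^{+} = K^{+}).
\end{align*}
```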

Fig. 1. A distinguishing non-strongly causal, and hence non-causally easy, spacetime for which $A^+$ is transitive. Here the non-removed boundary at the bottom is identified with that at the top; as a consequence the spacetime is non-orientable but this feature is not essential. The only points at which strong causality is violated are those on the lightlike geodesic $\gamma$, and their future $A^+(x)$ is given by the shadowed region. This spacetime example is interesting because it shows that if strong causality is violated at $x$ then there need not be a second event $z \neq x$ such that $x \in I^-(z) \cap I^+(z)$

[Figure 2: diagram of the causal ladder. Levels shown: global hyperbolicity, causal simplicity, causal continuity, causal easiness, stable causality ($K$-causality), strong causality, weak distinction, causality, chronology. Side conditions shown: compactness of the causal diamonds, closure of the causal relation, reflectivity, transitivity of $\bar J^+$, absence of lightlike lines, absence of future (or past) lightlike rays.]

Fig. 2. The causal ladder of spacetimes with the levels involved in the implications which climb the ladder. Here the double arrow $A \Rightarrow B$ means that $A$ implies $B$ and that there are spacetime examples which show that properties $A$ and $B$ differ

Proposition 1. A spacetime is causally easy iff it is strongly causal and $\bar J^+$ is transitive.

Proof. To the right it is immediate since $A$-causality implies strong causality. Assume that the spacetime is strongly causal and $\bar J^+$ is transitive, and assume that the spacetime is not $A$-causal; then there are events $x, z$, $x \neq z$, such that $(x, z) \in A^+$ and $(z, x) \in A^+$. Let $\sigma_n$ be a sequence of causal curves of endpoints $(x_n, z_n) \to (x, z)$. By the limit curve theorem there is a limit causal curve $\sigma^z$ ending at $z$ (past inextendible or such that it connects $x$ to $z$) and if $y \in \sigma^z \setminus \{z\}$ then $(x, y) \in \bar J^+$. Since $\bar J^+$ is transitive $(z, y) \in \bar J^+$, while clearly $(y, z) \in J^+$, thus by [7, Theorem 3.4] the spacetime is not strongly causal, a contradiction.

In the definition of causal easiness the condition of strong causality cannot be further weakened to distinction, see Fig. 1. Figure 2 summarizes the relationship between the various conformally invariant properties.

4. Conclusions

In this work the conjecture that $K$-causality and stable causality coincide has been proven. As a consequence, in a $K$-causal spacetime the $K^+$ relation and the Seifert relation coincide. This is a powerful result which, once proved, allows one to readily deduce several other results that otherwise would have to be obtained through more specific reasoning. Given this result it also becomes natural to introduce a new relation, which I called causal easiness, which stays between causal continuity and stable causality.

Acknowledgement. This work has been partially supported by GNFM of INDAM.

References
1. Akolia, G.M., Joshi, P., Vyas, U.: On almost causality. J. Math. Phys. 22, 1243–1247 (1981)
2. Bernal, A.N., Sánchez, M.: Globally hyperbolic spacetimes can be defined as ‘causal’ instead of ‘strongly causal’. Class. Quant. Grav. 24, 745–749 (2007)


3. Dowker, H.F., Garcia, R.S., Surya, S.: K-causality and degenerate spacetimes. Class. Quant. Grav. 17, 4377–4396 (2000)
4. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge University Press, Cambridge, 1973
5. Hawking, S.W., Sachs, R.K.: Causally continuous spacetimes. Commun. Math. Phys. 35, 287–296 (1974)
6. Janardhan, S., Saraykar, R.V.: K-causal structure of space-time in general relativity. Pramana-Journal of Physics 70, 587–601 (2008)
7. Minguzzi, E.: The causal ladder and the strength of K-causality. I. Class. Quant. Grav. 25, 015009 (2008)
8. Minguzzi, E.: The causal ladder and the strength of K-causality. II. Class. Quant. Grav. 25, 015010 (2008)
9. Minguzzi, E.: Chronological spacetimes without lightlike lines are stably causal. Preprint: available at http://arxiv.org/abs/0806.0153v1[gr-qc], 2008, to appear in Commun. Math. Phys. doi:10.1007/s00220-009-0784-6
10. Minguzzi, E.: Limit curve theorems in Lorentzian geometry. J. Math. Phys. 49, 092501 (2008)
11. Minguzzi, E.: Non-imprisonment conditions on spacetime. J. Math. Phys. 49, 062503 (2008)
12. Minguzzi, E.: Weak distinction and the optimal definition of causal continuity. Class. Quant. Grav. 25, 075015 (2008)
13. Minguzzi, E., Sánchez, M.: The causal hierarchy of spacetimes. In: Baum, H., Alekseevsky, D. (eds.), Recent developments in pseudo-Riemannian geometry, ESI Lect. Math. Phys., Zurich: Eur. Math. Soc. Publ. House, 2008, pp. 299–358
14. Seifert, H.: The causal boundary of space-times. Gen. Rel. Grav. 1, 247–259 (1971)
15. Senovilla, J.M.M.: Singularity theorems and their consequences. Gen. Rel. Grav. 30, 701–848 (1998)
16. Sorkin, R.D., Woolgar, E.: A causal order for spacetimes with $C^0$ Lorentzian metrics: proof of compactness of the space of causal curves. Class. Quant. Grav. 13, 1971–1993 (1996)
17. Woodhouse, N.M.J.: The differentiable and causal structures of space-time. J. Math. Phys. 14, 495–501 (1973)

Communicated by G.W. Gibbons

Commun. Math. Phys. 290, 249–290 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0727-7


Effective Superpotentials for Compact D5-Brane Calabi-Yau Geometries

Hans Jockers, Masoud Soroush

Department of Physics, Stanford University, Stanford, CA 94305-4060, USA. E-mail: [email protected]; [email protected]

Received: 25 August 2008 / Accepted: 23 September 2008
Published online: 13 January 2009 – © Springer-Verlag 2009

Abstract: For compact Calabi-Yau geometries with D5-branes we study N = 1 effective superpotentials depending on both open- and closed-string fields. We develop methods to derive the open/closed Picard-Fuchs differential equations, which control D5-brane deformations as well as complex structure deformations of the compact Calabi-Yau space. Their solutions encode the flat open/closed coordinates and the effective superpotential. For two explicit examples of compact D5-brane Calabi-Yau hypersurface geometries we apply our techniques and express the calculated superpotentials in terms of flat open/closed coordinates. By evaluating these superpotentials at their critical points we reproduce the domain wall tensions that have recently appeared in the literature. Finally we extract orbifold disk invariants from the superpotentials, which, up to overall numerical normalizations, correspond to orbifold disk Gromov-Witten invariants in the mirror geometry.

1. Introduction

Since D-branes were discovered in string theory as non-perturbative BPS objects [1], they have played an important role. Besides serving as crucial ingredients in phenomenological string model building they have increased our insight into non-perturbative physics in both string and field theory. Moreover, D-branes have also deepened our understanding of the web of string dualities. Such dualities often map theories in the quantum regime to dual descriptions in which semiclassical methods are applicable. One prominent example of this kind is given by mirror symmetry, which connects classical geometry to a notion of quantum geometry. In the context of D-branes mirror symmetry is further refined and leads towards the homological mirror symmetry conjecture [2,3].

Mirror symmetry in the closed string sector relates type IIA string theory compactified on a Calabi-Yau threefold, X, to type IIB string theory compactified on the mirror Calabi-Yau threefold, Y. Among other things the duality implies that the associated

four-dimensional low energy effective N = 2 supergravity theories are the same. In particular the target space manifolds of the scalar fields in the N = 2 vector multiplets are captured by the same holomorphic prepotential. On the type IIB side this holomorphic function is derived by analyzing the complex structure moduli space of the Calabi-Yau, Y , by means of classical geometry, whereas on the mirror type IIA side the prepotential arises from the quantum geometry of the Kähler moduli space of the Calabi-Yau, X . The holomorphic prepotential arises from the underlying N = 2 special geometry, which is a consequence of the N = 2 local supersymmetry of the compactified type II string theories. Another approach to this N = 2 special geometry structure appears by investigating the topological A- and B-model [4]. These topological string theories can be viewed as subsectors of the physical type IIA and type IIB string theories respectively [4–6]. Then mirror symmetry connects the A-model on the Calabi-Yau manifold, X , to the B-model on the Calabi-Yau manifold, Y . In this context mirror symmetry can be extended to the open-string sector by including topological branes. The homological mirror symmetry conjecture states that the category of topological B-branes in the B-model is equivalent to the category of topological A-branes in the mirror A-model [2,3,7–9]. Excellent reviews of these matters may be found in refs. [10,11]. Analogously to the closed-string sector one would like to take advantage of this extended version of mirror symmetry for explicit computational purposes in the open-string sector. As type II string theories with branes compactified on Calabi-Yau manifolds exhibit only N = 1 local supersymmetry their low-energy regime is given by four-dimensional N = 1 supergravity theories. Part of the defining data of these supergravity theories is the holomorphic superpotential, on which we focus in this work. 
Similarly to the prepotential in a purely closed-string setup, it turns out that the N = 1 superpotentials of string compactifications with branes and fluxes are given on the type IIB side by classical obstruction theory, whereas on the type IIA side they are generated non-perturbatively by open-string disk instantons [12,13]. Thus from the N = 1 superpotential of the topological B-model we get a handle on the non-perturbative superpotential of the topological A-model for the mirror configuration. As the superpotential corresponds in the topological A-model to the disk partition function [14,15], open mirror symmetry provides a powerful tool in enumerative geometry. In the large radius region of the topological A-model the disk partition function counts integer disk invariants [14], whereas, as investigated recently in refs. [16–18], in the vicinity of orbifold points the superpotential encodes rational orbifold disk invariants [19–21].

Moreover, the effective superpotential is a relevant quantity not only in enumerative geometry but also in string phenomenology. It constitutes an important ingredient in string model building because, for instance, it stabilizes moduli fields and/or triggers supersymmetry breaking. Most of the phenomenological N = 1 type II string models are either constructed from non-compact geometries, and hence gravity is decoupled, or they are obtained by semi-classical Kaluza-Klein reductions, for which the explicit quantum corrections are often not known. In order to capture some of these corrections in the context of compact Calabi-Yau scenarios it is desirable to get at least a handle on the quantum-corrected N = 1 effective superpotential.

In practice computing such effective superpotentials is rather hard. First of all, for a given brane configuration in a compact Calabi-Yau manifold on the B-model side the corresponding brane configuration on the mirror A-model side is only known in special situations. Second, even if the mirror configurations are known on both sides, one still needs to find the open-closed string mirror map, which provides the dictionary translating the classical computation in the topological B-model to the quantum computation in the topological A-model.

For non-compact Calabi-Yau manifolds the program sketched above has successfully been carried out in refs. [22–26]. In refs. [22,23] the boundary condition at infinity has been used, whereas in refs. [24,25] the concept of N = 1 special geometry has been introduced and applied. Although the connection between these two approaches is not obvious, both methods yield results that are in agreement.

For compact Calabi-Yau manifolds the analogous analysis seems to be more involved, and to our knowledge the endeavor sketched above, namely to compute quantum-corrected superpotentials depending on both open- and closed-string moduli, has not been carried out explicitly so far. Recently, however, in refs. [27,28] a major step in this direction has been achieved by computing quantum-corrected domain wall tensions on the quintic threefold using open-string mirror symmetry. Following this recipe similar computations have been carried out successfully for other one-parameter Calabi-Yau geometries in refs. [29,30].

Guided by the N = 1 special geometry techniques applied to non-compact Calabi-Yau geometries in refs. [24,25], we propose in this work an analogous method to derive Picard-Fuchs equations governing effective superpotentials for D5-brane configurations in compact Calabi-Yau manifolds. The resulting Picard-Fuchs partial differential equations depend on both open- and closed-string moduli, and their solutions encode, in addition to the effective superpotential, the open-/closed-string mirror map. Thus our approach is not only suitable to compute effective superpotentials but also to extract enumerative invariants in the topological A-model of the mirror geometry.

We apply these novel techniques to D5-branes in Calabi-Yau threefolds, which are related to the geometries discussed in the context of domain wall tensions in refs. [27,29,30]. For two examples we explicitly derive the quantum-corrected superpotential, which we express in the vicinity of the orbifold point by a uniquely distinguished set of open/closed flat coordinates. Then, similarly to the analysis performed for D-branes in local Calabi-Yau geometries [16–18], we extract (up to overall numerical normalizations) from the flat superpotential a tower of orbifold disk invariants for the mirror D-brane configuration in the compact mirror Calabi-Yau geometry. Finally, as a bonus and as a highly non-trivial consistency check, we reproduce for our two examples the domain wall tension computed in refs. [27,29,30] by evaluating the calculated effective superpotential at its critical points.

The outline of this paper is as follows. In Sect. 2 we review some relevant aspects of N = 1 special geometry along the lines of refs. [24,25]. Then in Sect. 3 we develop the tools to derive the open/closed Picard-Fuchs differential equations, and we argue that their solutions capture the necessary information to extract the effective superpotential in terms of flat coordinates. In Sect. 4 and Sect. 5 we apply our techniques in detail to two concrete examples. The first example is given by a family of D5-branes in the degree eight Calabi-Yau hypersurface of the weighted projective space, $\mathbb{WP}^4_{(1,1,1,1,4)}/(\mathbb{Z}_8)^2 \times \mathbb{Z}_2$, whereas the second example is a family of D5-branes in the mirror quintic Calabi-Yau threefold. For these examples we extract orbifold disk invariants for the associated mirror geometries and determine domain wall tensions in agreement with the results in the literature. Finally we present our conclusions in Sect. 6. In the two appendices we supply further computational details for the two discussed examples.

2. Effective Superpotentials and N=1 Special Geometry

It is well-known that type II string theory compactified on Calabi-Yau threefolds with background fluxes or space-time filling D-branes is described in the low-energy regime

by N = 1 effective supergravity theories. For such string compactifications we present in this work new techniques to compute the effective superpotential, which is part of the defining data of N = 1 supergravities and plays an important role in many phenomenological applications. To set the stage for the subsequent sections we first review some aspects of N = 1 special geometry, which are relevant for our computations.

2.1. Flux-induced and D5-brane superpotentials. Let us consider type IIB string theory compactified on the Calabi-Yau threefold, Y. Then in the presence of internal background fluxes an effective superpotential is induced. Here we mainly focus on the quantized three-form RR fluxes, $F^{(3)}$, which take values in the integer cohomology group, $H^3(Y, \mathbb{Z})$. Then the resulting superpotential reads [31,32]

$$W_{RR}(z) = \int_Y \Omega(z) \wedge F^{(3)}, \tag{2.1}$$

where $\Omega(z)$ is up to normalization the unique holomorphic three form of the Calabi-Yau threefold, Y, depending on the complex structure moduli parametrized by the coordinates, z.¹ The dependence on the complex structure moduli can be made more explicit by expressing the three-form superpotential in terms of the period vector, $\Pi^\alpha(z)$, of the Calabi-Yau manifold, which is obtained by integrating the holomorphic three form, $\Omega$, over a basis, $\Gamma^\alpha$, of the integer homology group, $H_3(Y, \mathbb{Z})$,

$$\Pi^\alpha(z) = \int_{\Gamma^\alpha} \Omega(z), \qquad \Gamma^\alpha \in H_3(Y, \mathbb{Z}). \tag{2.2}$$

¹ Strictly speaking a modulus parametrizes a flat direction in the scalar potential of the effective field theory. In this paper, however, a modulus refers to the complex scalar field of a neutral chiral multiplet, which may or may not be obstructed by the superpotential.

The periods, $\Pi^\alpha(z)$, of a Calabi-Yau manifold are governed by the underlying N = 2 special geometry, which gives rise to the holomorphic prepotential of the vector multiplets in the associated N = 2 supergravity theory. Here we also express the flux-induced superpotential, $W_{RR}$, in terms of these periods

$$W_{RR}(z) = N_\alpha \Pi^\alpha(z), \tag{2.3}$$

where the integers, $N_\alpha$, are the quanta of the three-form background fluxes, $F^{(3)}$.

A similar effective superpotential arises from space-time filling D-branes wrapping even-dimensional cycles in type IIB Calabi-Yau compactifications. For branes filling the whole compactification space the open-string partition function, which in our context yields the resulting effective superpotential, arises from the holomorphic Chern-Simons action [12]. As we focus in this work on a space-time filling D5-brane wrapping a two cycle, C, of the internal Calabi-Yau, Y, we need to consider the dimensional reduction of the holomorphic Chern-Simons action to two dimensions, which becomes [13,22,33]

$$W_{D5} = \int_C \Omega_{ijw}\, \zeta^i\, \bar\partial_w \zeta^j \, dw\, d\bar w. \tag{2.4}$$

Here $\zeta^i$ are sections of the normal bundle of the two cycle, C, embedded in its ambient Calabi-Yau space, Y, and they parametrize infinitesimal deformations of the D5-branes. The holomorphic Chern-Simons action also depends on the complex structure moduli

through its coupling to the holomorphic three form, $\Omega$, and therefore the superpotential, $W_{D5}$, depends on both the complex structure moduli and the D5-brane open-string moduli for the deformations of the embedding cycle, C. Analogously to the flux-induced superpotential the moduli dependence becomes explicit by writing the D5-brane superpotential in terms of semi-periods. The semi-period vector, $\hat\Pi^{\hat\alpha}$, is defined by

$$\hat\Pi^{\hat\alpha}(z, u) = \int_{\hat\Gamma^{\hat\alpha}(u)} \Omega(z), \tag{2.5}$$

where $\hat\Gamma^{\hat\alpha}(u)$ constitutes a basis of three chains that have non-trivial boundaries, $\partial\hat\Gamma^{\hat\alpha}(u)$, lying in the union of non-trivial two-cycles, S, of the Calabi-Yau manifold, Y. As this basis of open three chains depends on the open-string moduli, u, the semi-period vector, $\hat\Pi^{\hat\alpha}(z, u)$, becomes a function of both closed and open fields, and the moduli-dependent D5-brane superpotential (2.4) reads [22,33,34]

$$W_{D5}(z, u) = \hat N_{\hat\alpha}\, \hat\Pi^{\hat\alpha}(z, u). \tag{2.6}$$

Here the integers, $\hat N_{\hat\alpha}$, specify the topology of the internal two cycle, C, of the D5-brane worldvolume by specifying a linear combination of the two cycles in the set, S. Since both the flux-induced and the D5-brane superpotential arise from integrals of the holomorphic three form, $\Omega$, it is natural to consider the combined superpotential [24,25]

$$W(z, u) = W_{RR}(z) + W_{D5}(z, u) = N_a \Pi^a(z, u), \tag{2.7}$$

in terms of the relative period vector

$$\Pi^a(z, u) = \int_{\Gamma^a(u)} \Omega(z), \qquad \Gamma^a(u) \in H_3(Y, S, \mathbb{Z}). \tag{2.8}$$

Here $\Gamma^a$ denotes a basis of three chains in the relative integer homology group, $H_3(Y, S, \mathbb{Z})$. We should stress that the basis, $\Gamma^a$, captures closed three chains, $\Gamma^\alpha$, and open three chains, $\hat\Gamma^{\hat\alpha}$, with boundaries in the set of two cycles, S. Therefore the effective superpotential (2.7) does indeed get contributions from three-form RR fluxes and D5-branes, and the integers, $N_a$, now specify both the three-form flux quanta and the D5-brane topology. Note that also from a physics point of view the interplay of three-form fluxes and space-time filling D5-branes in the effective superpotential is not very surprising because D5-branes and three-form fluxes are often related by geometric transitions [35].

Although the effective superpotential (2.7) is a purely classical expression for type IIB string compactifications, it describes a highly intricate sum of non-perturbative instantons for the mirror type IIA string compactification [36]. In order to extract these non-trivial instanton contributions it is necessary to analyze the structure of these relative periods. Analogously to the N = 2 special geometry, which relates the holomorphic N = 2 prepotential to periods of Calabi-Yau manifolds, the relative periods entering the N = 1 holomorphic superpotential (2.7) are governed by the underlying N = 1 special geometry, which has been introduced in refs. [24,25,34].

However, before discussing the properties of relative periods a few general comments are in order. First of all, since we are interested in effective superpotentials arising from compact Calabi-Yau geometries, the background three-form fluxes and the space-time filling D5-branes introduce RR tadpoles rendering the physical string theory inconsistent. These tadpoles arise from worldsheets at the one-loop level and can be

cancelled by introducing appropriate orientifold planes. But these orientifold planes do not alter our computations because the effective superpotential (2.7) appears at string tree level. Moreover, as a BPS protected quantity the superpotential is also not further modified by flux- or brane-induced backreactions to the geometry.

Finally we remark that due to the $SL(2, \mathbb{Z})$ symmetry of type IIB string theory the effective superpotential (2.7) can easily be extended to describe NS three-form fluxes and NS5-branes [24,25]. This is achieved by replacing the RR sector charges, $N_a$, by the complexified charge quanta, $N_a + \tau N_a^{NS}$. Here $\tau$ denotes the complex dilaton and the integers, $N_a^{NS}$, capture the NS sector charges. However, in the following we set the NS charges again to zero and restrict ourselves to the RR sector.

2.2. Relative periods. As relative periods are crucial for the effective superpotentials of interest we now study their structure in some more detail. These relative periods are adequately described in terms of relative homology and relative cohomology [24,25]. Therefore we give a brief mathematical interlude on these matters. For more details see, for instance, ref. [37].

For the submanifold, S, embedded by the map, $i : S \to Y$, in the ambient Calabi-Yau manifold, Y, the space of relative forms, $\Omega^*(Y, S)$, is the subspace of forms, $\Omega^*(Y)$, defined as the kernel of the pullback, $i^* : \Omega^*(Y) \to \Omega^*(S)$. In other words the relative forms fit in the exact sequence

$$0 \longrightarrow \Omega^*(Y, S) \hookrightarrow \Omega^*(Y) \xrightarrow{\;i^*\;} \Omega^*(S) \longrightarrow 0. \tag{2.9}$$

Then the relative cohomology groups, $H^*(Y, S)$, arise from the space of closed modulo exact relative forms with respect to the de Rham differential, d. Since the de Rham differential commutes with the maps in the short exact sequence (2.9), we deduce a long exact sequence on the level of cohomology in the usual manner. In particular the long exact sequence implies for the three-form cohomology group, $H^3(Y, S)$,

$$H^3(Y, S) \cong \ker\left( H^3(Y) \to H^3(S) \right) \oplus \operatorname{coker}\left( H^2(Y) \to H^2(S) \right). \tag{2.10}$$

For Calabi-Yau threefolds the first summand equals $H^3(Y)$ on dimensional grounds. The second contribution is the variable cohomology group, $H^2_{\rm var}(S)$, with respect to the embedding space, Y, i.e. these are cohomology elements of the submanifold not induced from the ambient space. Consequently the decomposition (2.10) allows us to represent a relative three form, $\underline{\Theta}$, as a pair of a closed three form, $\Theta$, and a closed two form, $\xi$,

$$\underline{\Theta} = (\Theta, \xi) \in H^3(Y, S), \tag{2.11}$$

such that the relative form, $\underline{\Theta}$, obeys the equivalence relation

$$\underline{\Theta} \sim \underline{\Theta} + \left( d\alpha,\; i^*\alpha - d\beta \right). \tag{2.12}$$

Here $\alpha$ is a two form on the Calabi-Yau, Y, whereas $\beta$ is a one form on the subspace, S.

Let us turn to the homology group, $H_3(Y, S, \mathbb{Z})$, of relative three cycles. As mentioned previously a relative three cycle, $\Gamma$, is a three chain whose boundary, $\partial\Gamma$, lies in the submanifold, S. The duality pairing between relative three-form cohomology elements and relative three cycles is given by

$$\int_\Gamma \underline{\Theta} \equiv \int_\Gamma \Theta - \int_{\partial\Gamma} \xi. \tag{2.13}$$

This topological pairing is compatible with the equivalence relation (2.12), and hence it is well-defined. After this brief introduction to relative (co-)homology we can write the relative periods (2.8) in terms of the relative three form, $\underline{\Omega}(z, u)$, integrated over a relative homology basis, $\Gamma^a$,

$$\Pi^a(z, u) = \int_{\Gamma^a} \underline{\Omega}(z, u), \qquad \Gamma^a \in H_3(Y, S, \mathbb{Z}). \tag{2.14}$$
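The compatibility of the pairing (2.13) with the equivalence relation (2.12) can be checked in one line with Stokes' theorem. The following is only a sketch: it assumes that $\Gamma$ denotes a relative three cycle with $\partial\Gamma \subset S$, and that the trivial relative form is written as the pair $(d\alpha,\, i^*\alpha - d\beta)$; both labels are notational choices made for this illustration.

```latex
\begin{align*}
\int_\Gamma \bigl( d\alpha,\; i^*\alpha - d\beta \bigr)
  &= \int_\Gamma d\alpha \;-\; \int_{\partial\Gamma} \bigl( i^*\alpha - d\beta \bigr) \\
  &= \int_{\partial\Gamma} i^*\alpha \;-\; \int_{\partial\Gamma} i^*\alpha
     \;+\; \int_{\partial\Gamma} d\beta \\
  &= \int_{\partial(\partial\Gamma)} \beta \;=\; 0 .
\end{align*}
```

The first term uses that $\partial\Gamma$ lies in $S$, so that integrating $\alpha$ over $\partial\Gamma$ only sees its pullback $i^*\alpha$; the last step uses that a boundary has no boundary.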

At a given reference point in the open-/closed-string moduli space the relative three form, $\underline{\Omega}$, can be viewed as the pair, $(\Omega, 0)$. However, as we move in the moduli space the relative three form changes and generically acquires also non-zero two-form contributions. The moduli dependence of the relative periods, $\Pi^a(z, u)$, is now entirely captured by the variation of the relative form, $\underline{\Omega}(z, u)$.

2.3. Variation of mixed Hodge structure. The suitable formalism to get a handle on the moduli dependence of the relative periods (2.14) is the variation of mixed Hodge structure [24,25]. It is a generalization of the variation of Hodge structure used to compute the complex structure moduli dependence of the closed-string period vector. For the mathematical definition of a mixed Hodge structure we refer the reader to refs. [38,39].

In our context, relative three forms realize a mixed Hodge structure as follows [24,25]: The $\mathbb{Z}$-module of a mixed Hodge structure is given by the relative integer cohomology group, $H^3(Y, S, \mathbb{Z})$. The second ingredient is a finite decreasing filtration, $F^p$, of the complexified group, $H^3(Y, S, \mathbb{C}) \equiv \mathbb{C} \otimes_{\mathbb{Z}} H^3(Y, S, \mathbb{Z})$. This filtration becomes

$$\begin{aligned}
F^3 &= H^{3,0}(Y, S), \\
F^2 &= H^{3,0}(Y, S) \oplus H^{2,1}(Y, S), \\
F^1 &= H^{3,0}(Y, S) \oplus H^{2,1}(Y, S) \oplus H^{1,2}(Y, S), \\
F^0 &= H^{3,0}(Y, S) \oplus H^{2,1}(Y, S) \oplus H^{1,2}(Y, S) \oplus H^{0,3}(Y, S).
\end{aligned} \tag{2.15}$$

In terms of the decomposition (2.10) the groups, $H^{p,q}(Y, S)$, split into the cohomology groups, $H^{p,q}(Y)$ and $H^{p,q-1}_{\rm var}(S)$. Thus in particular the holomorphic three form, $\Omega$, spans the filtration, $F^3$. Finally a mixed Hodge structure has a finite increasing weight filtration, $W_p$, on the rational relative cohomology group, $H^3(Y, S, \mathbb{Q}) \equiv \mathbb{Q} \otimes_{\mathbb{Z}} H^3(Y, S, \mathbb{Z})$. The weight filtration is again induced from the decomposition (2.10), and it reads

$$W_3 \cong H^3(Y, \mathbb{Q}), \qquad W_4 \cong H^3(Y, \mathbb{Q}) \oplus H^2_{\rm var}(S, \mathbb{Q}) \cong H^3(Y, S, \mathbb{Q}). \tag{2.16}$$

Note that the finite decreasing filtration, $\tilde F^p \equiv W_3 \cap F^p$, gives rise to the Hodge structure,

$$\tilde F^p = \bigoplus_{k=0}^{3-p} H^{3-k,k}(Y), \qquad p = 0, 1, 2, 3, \tag{2.17}$$

associated to the closed-string complex structure moduli space.

In order to analyze the moduli-dependent relative period vector (2.14), we discuss the behavior of relative three forms under infinitesimal deformations. It is well-known that an infinitesimal closed-string complex structure deformation, $\partial_z$, changes the Hodge

type of a (p, q)-form. On the other hand an infinitesimal open-string deformation, $\partial_u$, does not modify the closed-string periods because it gives rise to an infinitesimal deformation of the normal bundle of the submanifold, S, which only affects the two-form sector, $H^2_{\rm var}(Y)$. Thus, as has been shown rigorously in ref. [25], the infinitesimal deformations, $\partial_z$ and $\partial_u$, viewed as tangent vectors in the open-/closed-string moduli space, schematically act on the defined mixed Hodge structure as:

$$\begin{array}{ccccccc}
F^3 \cap W_3 & \xrightarrow{\;\partial_z\;} & F^2 \cap W_3 & \xrightarrow{\;\partial_z\;} & F^1 \cap W_3 & \xrightarrow{\;\partial_z\;} & F^0 \cap W_3 \\
 & \searrow{\scriptstyle\partial_u} & & \searrow{\scriptstyle\partial_u} & & \searrow{\scriptstyle\partial_u} & \\
 & & F^2 \cap W_4 & \xrightarrow{\;\partial_z,\partial_u\;} & F^1 \cap W_4 & \xrightarrow{\;\partial_z,\partial_u\;} & F^0 \cap W_4
\end{array} \tag{2.18}$$

Note that the two-form sector, $H^2_{\rm var}(Y) \cong W_4/W_3$, constitutes a sub-system,

$$F^2 \cap (W_4/W_3) \xrightarrow{\;\partial_z,\partial_u\;} F^1 \cap (W_4/W_3) \xrightarrow{\;\partial_z,\partial_u\;} F^0 \cap (W_4/W_3), \tag{2.19}$$

which is closed with respect to the variations, $\partial_z$ and $\partial_u$, and which will play an important role in deriving and solving the Picard-Fuchs differential equations of the relative periods (2.14).

The variation of mixed Hodge structure exhibits the N = 1 special geometry of the open-/closed-string moduli space. As has been pointed out in refs. [24,25], we should view the emerging structures as a distinguished feature of N = 1 supergravity theories arising from N = 1 string compactifications and not as a property of a generic N = 1 supergravity theory.

3. Picard-Fuchs Equations for D5-Branes in Compact Calabi-Yau Geometries

In this section we develop the machinery to compute effective superpotentials of D5-branes in compact Calabi-Yau threefolds. We focus on D5-brane geometries whose moduli spaces are describable by studying certain divisors of the embedding Calabi-Yau spaces. Furthermore these Calabi-Yau threefolds are realized as hypersurfaces in four-dimensional complex (weighted) projective spaces.

The main idea is to express the relative three forms associated to the D5-brane geometry in terms of residue integrals. Since both open- and closed-string moduli enter in the definition of these residue integrals, we get a direct handle on the moduli dependence of relative three forms. Hence we are able to explicitly analyze the corresponding variation of mixed Hodge structure, which then allows us to derive the Picard-Fuchs equations of the relative period vector governing the effective superpotential.

3.1. Residue integrals for three forms in Calabi-Yau threefolds. Before we construct the mixed Hodge filtration for relative forms we first recall how to describe three forms of a Calabi-Yau threefold by means of residue integrals [40–42]. In the following the Calabi-Yau hypersurface, Y, is given as the zero locus, $P \equiv 0$, of a (quasi-)homogeneous polynomial, P, in the complex (weighted) projective space $\mathbb{WP}^4_{(a_1,a_2,a_3,a_4,a_5)}$. In order for the hypersurface, Y, to be Calabi-Yau the defining polynomial, P, must be (quasi-)homogeneous of degree $d = a_1 + \cdots + a_5$.
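As a quick illustration of the degree condition, consider the two ambient spaces that appear in the examples announced in the introduction: the weighted projective space with weights $(1,1,1,1,4)$ and the ordinary projective space underlying the quintic. The weight assignment for the quintic, $(1,1,1,1,1)$, is the standard one and is stated here as an assumption, since the text above does not spell it out.

```latex
\begin{align*}
(a_1,\dots,a_5) = (1,1,1,1,4): &\quad d = 1+1+1+1+4 = 8, \\
(a_1,\dots,a_5) = (1,1,1,1,1): &\quad d = 1+1+1+1+1 = 5 .
\end{align*}
```

These values agree with the "degree eight" hypersurface and the quintic mentioned in the outline of the paper.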

By integrating over a tubular neighborhood, $\gamma$, of the zero locus of the polynomial, P, the residue integral,

$$\Theta_k = \int_\gamma \frac{p(x)}{P^{k+1}}\, \Delta, \tag{3.1}$$

yields a three form, $\Theta_k$, on the Calabi-Yau threefold, Y. Here p(x) is a (quasi-)homogeneous polynomial of degree $d\,k$ of the projective coordinates $[x_1 : x_2 : x_3 : x_4 : x_5]$, whereas the differential, $\Delta$, is given by [40]

$$\Delta = \sum_{m=1}^{5} (-1)^m a_m x_m \, dx_1 \wedge \ldots \widehat{dx_m} \ldots \wedge dx_5. \tag{3.2}$$

In this sum the differentials, $\widehat{dx_m}$, are omitted as indicated by the hat, and the residue integral (3.1) is well-defined as it is invariant under quasi-homogeneous rescaling. Note that by acting with the de Rham differential, d, it is easy to see that the three form, $\Theta_k$, is closed.

Next we turn our attention to exact forms. First we observe that we can represent two forms with residue integrals

$$\alpha = \int_\gamma \sum_{m<n} (-1)^{m+n}\, \frac{a_n x_n q_m(x) - a_m x_m q_n(x)}{P^k}\, dx_1 \wedge \ldots \widehat{dx_m} \ldots \widehat{dx_n} \ldots \wedge dx_5, \tag{3.3}$$

whose exterior derivative yields the exact three forms

$$d\alpha = \int_\gamma \sum_m \left( k\, \frac{q_m\, \partial_m P}{P^{k+1}} - \frac{\partial_m q_m}{P^k} \right) \Delta. \tag{3.4}$$

Thus we have assembled all the ingredients to represent the de Rham three-form cohomology elements. As shown in ref. [40] and as suggested by the structure of the exact forms (3.4), each non-trivial element of the polynomial ring, $\mathbb{C}[x]/(\partial_n P)$, of degree $d\,k$ corresponds to a distinct non-trivial element, $\Theta_k$, in the cohomology group, $H^3(Y, \mathbb{C})$.² A refined analysis reveals that a three form, $\Theta_k$, of grade k, arising from a polynomial of degree $d\,k$, lies in the Hodge filtration module, $\tilde F^{3-k}$, defined in Eq. (2.17) [40]. In particular the unique holomorphic three form, $\Omega$, of the Calabi-Yau hypersurface, Y, which spans the filtration, $\tilde F^3$, is given by

$$\Omega = \int_\gamma \frac{1}{P}\, \Delta. \tag{3.5}$$

This expression allows us to investigate the complex structure dependence of the holomorphic three form, $\Omega$, by considering a family of hypersurface polynomials, P(z), parametrized by the moduli, z. Moreover by taking k-th order derivatives with respect to the parameters, z, we realize infinitesimal deformations of order k and obtain three forms at grade k. Thus we are able to explicitly study the variation of Hodge structure,

$$\tilde F^3 \xrightarrow{\;\partial_z\;} \tilde F^2 \xrightarrow{\;\partial_z\;} \tilde F^1 \xrightarrow{\;\partial_z\;} \tilde F^0, \tag{3.6}$$

of the closed-string complex structure moduli space.

² For weighted projective spaces the residue integrals (3.1) do not always span the whole cohomology group, $H^3(Y, \mathbb{C})$. With residue integrals we can only describe those three-form cohomology elements that correspond to toric divisors in the mirror geometry.
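The first arrow in (3.6) can be made explicit by differentiating the residue representation (3.5). The sketch below assumes a one-parameter family $P(z)$ and that the tubular neighborhood $\gamma$ can be held fixed under a small change of $z$; both are simplifying assumptions made for this illustration.

```latex
\begin{equation*}
\partial_z \Omega(z)
  \;=\; \partial_z \int_\gamma \frac{1}{P(z)}\,\Delta
  \;=\; \int_\gamma \frac{-\,\partial_z P(z)}{P(z)^{2}}\,\Delta .
\end{equation*}
```

The result has the form (3.1) with $k = 1$ and $p(x) = -\partial_z P$, a polynomial of degree $d$, so that $\partial_z\Omega$ is a three form at grade one, in line with the step from $\tilde F^3$ to $\tilde F^2$.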

So far we have argued that a three form, $\Theta_k$, lies in the filtration module, $\tilde F^{3-k}$. If, however, the associated polynomial, p(x), of degree $d\,k$, is trivial in the polynomial ring, $\mathbb{C}[x]/(\partial_n P)$, i.e. $p(x) \equiv \sum_m q_m(x)\, \partial_m P(x)$ for some polynomials, $q_m(x)$, then according to Eq. (3.4) we can add an appropriate exact form, $d\alpha$, such that the three form is reduced to a three form at grade k − 1. Hence we have shown that the three form, $\Theta_k$, is even an element of the filtration module, $\tilde F^{4-k}$. Recursively repeating this process we eventually either arrive at some non-trivial three-form cohomology element at lower grade or the final defining polynomial becomes zero. In the latter situation we have established that the original three form, $\Theta_k$, is exact and thus trivial in cohomology. This reduction method is sometimes called the Griffiths-Dwork algorithm. Later we will use a generalization of this algorithm to derive the Picard-Fuchs equations for the relative periods.
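A single reduction step can be illustrated on a simple case. The following sketch assumes the Fermat quintic, $P = \sum_m x_m^5$ with all weights equal to one, which is chosen here purely for illustration and is not one of the families studied in the text; the symbols $\Delta$ and $\Omega$ denote the differential (3.2) and the holomorphic three form (3.5).

```latex
\begin{align*}
P &= \sum_{m=1}^{5} x_m^5, \qquad a_m = 1, \qquad d = 5, \\
p(x) &= x_1^5 = q_1\,\partial_1 P
  \quad\text{with}\quad q_1 = \tfrac{1}{5}\,x_1, \quad q_{m \neq 1} = 0, \\
d\alpha &= \int_\gamma \left( \frac{x_1^5}{P^{2}} - \frac{1}{5}\,\frac{1}{P} \right) \Delta
  \qquad \text{(Eq.~(3.4) with } k = 1 \text{)}, \\
\int_\gamma \frac{x_1^5}{P^{2}}\,\Delta &= \frac{1}{5}\,\Omega + d\alpha .
\end{align*}
```

The grade-one form built from $x_1^5$ is therefore cohomologous to a multiple of the grade-zero form $\Omega$. As a consistency check, summing the analogous relations for all five monomials $x_m^5$ reproduces $\int_\gamma (P/P^2)\,\Delta = \Omega$ up to exact terms.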

3.2. Residue integrals for relative three forms in Calabi-Yau threefolds. In order to apply the concepts of N = 1 special geometry to D5-branes in compact Calabi-Yau threefolds the next task is to establish residue integral representations for relative three forms. These integrals are derived by exploiting the relative three-form decomposition (2.10) into a two-form/three-form pair. After having thoroughly explored the three-form piece in Sect. 3.1 we now focus on the variable two-form cohomology, $H^2_{\rm var}(S)$, which captures the open-string moduli dependence of the embedded D5-brane.

Since we want to study D5-brane effective superpotentials the two cycle wrapped by the D5-brane is generically not holomorphic. In fact it is only holomorphic if the moduli coincide with a critical point of the superpotential. To avoid the complication of dealing with non-holomorphic submanifolds we employ the arguments of refs. [24,25] and replace the submanifold, S, by a holomorphic hypersurface, V, of the Calabi-Yau manifold, Y, such that this four-dimensional space embeds the wrapped cycles. One might be worried that the replacement introduces additional structure not related to the D5-brane geometry. However, we will see that for the examples discussed in this work, this substitution process does not give rise to fake additional moduli for the computed relative periods. Therefore we assume in the following that it is possible to use the simpler cohomology group, $H^3(Y, V)$, instead of its complicated ancestor, $H^3(Y, S)$. In particular the two-form part of the relative three forms is now captured by the variable cohomology, $H^2_{\rm var}(V)$.

In practice the holomorphic four-dimensional subspace, V, is associated to a divisor of the Calabi-Yau space, Y, i.e. the manifold, V, arises as the zero locus, $Q \equiv 0$, of the (quasi-)homogeneous polynomial, Q, of degree f. Hence we can view the subspace, V, as the complete intersection of the Calabi-Yau polynomial, P, and the D5-brane polynomial, Q, in the (weighted) projective space, $\mathbb{WP}^4_{(a_1,a_2,a_3,a_4,a_5)}$. Therefore the residue integral,

$$\xi_{k+\ell} = \int_{\hat\gamma} \frac{p(x)}{P^{k+1} Q^{\ell}}\, \Delta, \tag{3.7}$$

represents a closed two form, $\xi_{k+\ell}$, of the manifold, V [43]. Here the differential, $\Delta$, is given by Eq. (3.2), and we integrate over the tubular neighborhood, $\hat\gamma$, of the intersection, $\{P \equiv 0\} \cap \{Q \equiv 0\}$. The polynomial, p(x), must have degree $k\,d + \ell f$, so as to render the residue integral invariant under (quasi-)homogeneous rescaling as required for consistency reasons. The resulting form, $\xi_{k+\ell}$, can only be non-zero for $k \geq 0$ and $\ell > 0$.
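The degree requirement on $p(x)$ follows from a short scaling argument. The sketch below assumes that the integrand of (3.7) has the form $p(x)\,\Delta / (P^{k+1} Q^{\ell})$, where the exponent $\ell$ of $Q$ is a label adopted for this illustration, and uses that $\Delta$ in (3.2) scales with the sum of the weights, $d = a_1 + \cdots + a_5$.

```latex
\begin{gather*}
x_m \to \lambda^{a_m} x_m: \qquad
\Delta \to \lambda^{d}\,\Delta, \qquad
P^{k+1} \to \lambda^{(k+1)d}\,P^{k+1}, \qquad
Q^{\ell} \to \lambda^{\ell f}\,Q^{\ell}, \\
\frac{p(x)}{P^{k+1} Q^{\ell}}\,\Delta \ \text{ invariant}
\quad\Longleftrightarrow\quad
\deg p = (k+1)\,d + \ell f - d = k\,d + \ell f .
\end{gather*}
```

Setting $\ell = 0$ recovers the degree $d\,k$ quoted for the three forms (3.1).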

Analogously to Eq. (3.7) the (quasi-)homogeneous polynomials, $q_m(x)$, of degree $(k-1)\,d + \ell f + a_m$ give rise to the residue integral,

$$\beta = \int_{\hat\gamma} \sum_{m<n} (-1)^{m+n}\, \frac{a_n x_n q_m(x) - a_m x_m q_n(x)}{P^k Q^{\ell}}\, dx_1 \wedge \ldots \widehat{dx_m} \ldots \widehat{dx_n} \ldots \wedge dx_5, \tag{3.8}$$

whose exterior derivative yields the exact two forms

$$d\beta = \int_{\hat\gamma} \sum_m \left( k\, \frac{q_m(x)\, \partial_m P}{P^{k+1} Q^{\ell}} + \ell\, \frac{q_m(x)\, \partial_m Q}{P^{k} Q^{\ell+1}} - \frac{\partial_m q_m(x)}{P^{k} Q^{\ell}} \right) \Delta. \tag{3.9}$$

The exact and the closed two forms enable us to study the cohomology resulting from the residue integrals. As shown in ref. [44] the residue integrals (3.7) capture non-trivial cohomology elements in the variable cohomology, $H^2_{\rm var}(V)$, which is precisely the cohomology group relevant in the decomposition (2.10) of the relative cohomology group, $H^3(Y, V)$.³ Moreover, the grades of the two forms (3.7) are compatible with the Hodge filtration of the variable two-form cohomology, i.e. the two form, $\xi_k$, lies in the filtration module, $F^{3-k} \cap (W_4/W_3)$, of the mixed Hodge structure introduced in Sect. 2.3. Altogether the two-form residue integrals provide a suitable tool to investigate the variation of Hodge structure (2.19) of the variable cohomology, $H^2_{\rm var}(V)$.

So far we have separately discussed the three- and two-form part in the decomposition (2.16) of relative cohomology elements. However, in order to capture the variation of mixed Hodge structure (2.18) we need to take adequately into account the interplay of these two components.

First we must incorporate the equivalence relation (2.12). Therefore we are required to derive for two forms the pullback, $i^*$, induced from the embedding, $i : V \to Y$, on the level of residue integrals. This is achieved by writing the pullback of the two form (3.7) in terms of an additional residue with respect to the zero locus of the divisor, V. A few steps of algebra reveal for the pullback two form (3.7),⁴

$$i^*\alpha = -\int_{\hat\gamma} \sum_m \frac{q_m(x)\, \partial_m Q}{P^k Q}\, \Delta. \tag{3.10}$$

Second we observe that if we introduce open-string moduli, u, by considering a family of divisors, Q(u), this moduli dependence never enters the residue integral representation of the three forms (3.1). However, looking at the variation of mixed Hodge structure (2.18) for relative three forms we notice that the moduli, u, must enter the three-form component of relative cohomology elements. This becomes apparent by taking a derivative, $\partial_u$, of a pure three-form piece appearing in the upper row of the variational diagram (2.18). It yields a two-form contribution in the lower row of the diagram. We readily implement this moduli dependence by representing a closed relative three form, $\underline{\Theta}$, which arises from a pure closed three form, $\Theta$, as

$$\underline{\Theta} = (\Theta, 0) = \int \frac{p(x)\, \log Q}{P^k}\, \Delta. \tag{3.11}$$

³ This is a consequence of the Hard Lefschetz theorem.
⁴ We have dropped an unimportant factor of 2πi.


H. Jockers, M. Soroush

This definition is now in agreement with the variation of mixed Hodge structure (2.18). Analogously we enhance the two forms (3.7) to relative two forms, α = (α, 0), (an xn qm (x)−am xm qn (x)) log Q (−1)m+n d x1 ∧ . . . d xm . . . d xn . . . ∧ d x5 , α= k P m
qm (x)∂m P log Q ∂m qm (x) log Q qm (x)∂m Q k dα =

. − − P k+1 Pk Pk Q m

(3.13)

Note that due to the introduced log Q-term the relative three form, dα, is indeed a relative exact form because it also contains the pullback term (3.10), which is required by the relative cohomology equivalence relation (2.12). Although the definitions of relative forms (3.11) and (3.12) yield the correct equivalence relation (2.12) and agrees with the variation of mixed Hodge structure (2.18), the introduction of the log Q-term may seem a little ad hoc. There is, however, another reason, which suggests the appearance of the log Q-term. The mixed Hodge structure of relative forms, H 3 (Y, V ), can also be defined as the filtration arising from the hypercohomology of the complexes, ∗Y (log Q), i.e.the spectral sequence of the hypercohomology p,q p degenerates at the term, E 1 = H q (Y, Y (log Q)), and abuts to the relative cohomology group, H p+q (Y, V ) [45,39]. The forms, ∗Y (log Q) ≡ ∗ Y (log Q), are locally generated by the one forms, 1 (Y ), and the logarithmic differential, d Q/Q ≡ d(log Q). Therefore in the decomposition (2.10) the log Q-term comes about naturally for the threeform component, corresponding to p = 0 and q = 3, as it generates the logarithmic differential, d(log Q), of the two-form components.

3.3. Open/closed Picard-Fuchs differential equations and flat coordinates. Having developed the techniques to render relative three forms as residue integrals we are now ready to make the connection to the advertised Picard-Fuchs equations. Their solutions are the relative three-form periods (2.14), which in turn determine flat coordinates of the open-/closed-string moduli space. Let us first discuss the system of linear Gauss-Manin differential equations, which controls the mixed Hodge filtration of the relative cohomology, H 3 (Y, V ), fibered over the open-/closed-string moduli space. We introduce a basis vector, π (z, u), of the relative three-form cohomology elements compatible with the mixed Hodge filtrations (2.15) and (2.16). In practice such a basis is constructed from the unique relative three form, (z, u), which spans the filtration module, F 3 , (z, u) = (, 0) =

log Q(u)

. P(z)

(3.14)

Recall that the dependence on the bulk moduli, z, arises in the polynomial, P(z), whereas the open-string moduli, u, appear in the family of divisors, Q(u). Due to Griffiths transversality we now generate a basis vector, π (z, u), by taking consecutive derivatives of the relative three form, (z, u). Furthermore subsets of this basis span the various filtration modules according to the variational diagram (2.18).

Effective Superpotentials for Compact D5-Brane Calabi-Yau Geometries


The Gauss-Manin system is the system of linear differential equations, which expresses infinitesimal variations of the basis vector, $\pi(z, u)$, with respect to the moduli in terms of a linear combination of the basis elements, $\pi(z, u)$. Hence it reads:

$$ 0 \simeq \left(\partial_{z_k} - M_k(z, u)\right)\pi(z, u) \equiv \nabla_{z_k}\,\pi(z, u), \qquad 0 \simeq \left(\partial_{u_{\hat k}} - M_{\hat k}(z, u)\right)\pi(z, u) \equiv \nabla_{u_{\hat k}}\,\pi(z, u). \tag{3.15} $$

Here '$\simeq$' indicates equality on the level of cohomology classes, i.e. equality modulo exact relative forms (2.12). This reflects the fact that in varying the basis vectors, $\pi$, we also modify the representatives of the relative cohomology classes. The indices, $k$ and $\hat k$, label the closed- and open-string moduli respectively. In the next sections we discuss in detail how to employ the residue integral techniques so as to compute the matrices, $M_k$ and $M_{\hat k}$, explicitly. The linear Gauss-Manin system (3.15) also gives rise to the Gauss-Manin connection, $\nabla_{z_k}$ and $\nabla_{u_{\hat k}}$, for the relative three-form mixed Hodge bundle over the open-/closed-string moduli space. The Gauss-Manin connection is actually flat [24,25]

$$ [\nabla_{z_k}, \nabla_{z_l}] = [\nabla_{u_{\hat k}}, \nabla_{u_{\hat l}}] = [\nabla_{z_k}, \nabla_{u_{\hat k}}] = 0. \tag{3.16} $$

Analogously to the flat Gauss-Manin connection in the context of N = 2 special geometry [6,46,47], the flatness of the connection is not a coincidence but a non-trivial and necessary property of the N = 1 special geometry governing the open/closed chiral ring [24,25]. Since the whole relative basis vector, π (z, u), is induced from derivatives of the relative three form, (z, u), we can readily construct from the linear Gauss-Manin system the differential Picard-Fuchs equations for the relative three form, (z, u), L A (z, u) 0.

(3.17)

According to the mixed Hodge variation (2.18) the linear Gauss-Manin system translates into Picard-Fuchs operators, L A , that are partial differential operators up to fourth order. As before the differential equations (3.17) hold on the level of relative cohomology classes. As a consequence the Picard-Fuchs operators also annihilate the relative periods (2.14) L A a (z, u) = 0. (3.18) We should stress that this set of Picard-Fuchs equations is only integrable due to the flatness of the Gauss-Manin connection (3.16). The solutions of the open/closed Picard-Fuchs equations (3.18) are the relative periods governing the effective superpotential (2.7). In particular, as argued in Sect. 2.1, this superpotential comprises both the pure closed-string periods for the RR three form fluxes and the true relative periods for the brane induced superpotential terms. Thus the bulk periods, α (z), are also annihilated by the open-/closed-string Picard-Fuchs operators, L A . As a consequence they exhibit the following general structure: bdry

L A = Lbulk A (z, ∂z ) + L A (z, u, ∂z , ∂u ).

(3.19)

The operators, Lbulk A , are the Picard-Fuchs operators of the closed-string bulk theory, bdry whereas the operators, L A , communicate the connection to the open-string boundary sector. Moreover the latter operators must annihilate the closed-string periods, α (z), to ensure that they are indeed solutions to the open-/closed-string Picard-Fuchs equations.


A complete set of relative periods, a , which solves the integrable Picard-Fuchs system (3.18), provides for a tool to compute flat coordinates of the open-/closed-string moduli space. By choosing a symplectic basis for the homology group, H3 (Y, Z), of three cycles, the closed string periods split into A- and B-periods. Then the flat coordinates, t, of the bulk sector are defined by tk (z) =

k (z) . 0 (z)

(3.20)

Here the periods, 0 (z) and k (z), constitute the A-periods with respect to the chosen symplectic basis of three cycles. Analogously the open flat coordinates, tˆ, arise from an appropriate subset of semi-periods and are defined by tˆkˆ (z, u) =

ˆ

k (z, u) . 0 (z)

(3.21)

As a consequence of N = 1 special geometry the relative period vector, expressed in terms of flat coordinates, t and tˆ, becomes [24,25] (t, tˆ) = a

1 , tk , ∂tk F(t) , 2 F(t) −

tk ∂tk F(t) ; tˆkˆ , W (t, tˆ) , ∗ .

(3.22)

k

The first two entries are the closed-string A-periods encoding the flat coordinates (3.20). The next two entries correspond to the symplectic dual closed-string B-periods, which, as a consequence of the underlying N = 2 special geometry, are induced from the closed-string holomorphic N = 2 prepotential, F(t) [48]. The other periods arise from open-string semi-periods governed by N = 1 special geometry [24,25]. Here the first entry gives rise to the open-string flat coordinates (3.21), whereas the second entry yields the holomorphic D5-brane superpotential components, W (t, tˆ), associated to the various two cycles labelled by index . In general these superpotential components are not integrable to a generating function. This reflects the fact that N = 1 special geometry is not as constraining as its N = 2 relative. The remaining semi-periods, which do not allow for an interpretation as flat coordinates or superpotentials, are denoted by ‘ ∗ ’. These semi-periods do not appear in refs. [24,25] because of the non-compactness of the considered local Calabi-Yau geometries. However, it would be interesting to find a physics interpretation for these semi-periods in the context of compact geometries. Computing flat coordinates, t and tˆ, involves a choice of basis, a , of the homology group, H3 (Y, V, Z). However, at special points in the open-/closed-string moduli space the choice of basis is further restricted such that we can derive an open-/closed-string mirror map. Most prominent is the mirror map at the large complex structure point as it allows to compute instanton corrections to the classical mirror geometry in the topological A-model. Recently, however, it has been demonstrated in local Calabi-Yau geometries that under certain circumstances a mirror map can also be computed at orbifold points in the moduli space [16–18,49], which then allows to determine equivariant invariants in the topological A-model [19–21]. 
For our examples discussed in the next sections we explicitly extract also the latter type of enumerative disk invariants in the context of D5-branes in compact Calabi-Yau manifolds.


4. D5-Branes in the Degree Eight Hypersurface in $\mathbb{WP}^4_{(1,1,1,1,4)}/(\mathbb{Z}_8)^2 \times \mathbb{Z}_2$

In this section we apply the developed tools to our first example. We consider the Calabi-Yau threefold, which arises as the mirror of the family of degree eight hypersurfaces in the weighted projective space, $\mathbb{WP}^4_{(1,1,1,1,4)}$. The analyzed D-brane geometry in this Calabi-Yau space allows us to derive open/closed Picard-Fuchs equations. Their solutions yield the effective superpotential in flat coordinates. From this superpotential we extract a certain domain wall tension, which, for this example, is calculated by different means in refs. [29,30] in agreement with our results.

4.1. The Calabi-Yau and D-brane geometry. The first task is to specify the bulk Calabi-Yau geometry. The degree eight hypersurfaces in the weighted projective space, $\mathbb{WP}^4_{(1,1,1,1,4)}$, give rise to a family of Calabi-Yau threefolds with one Kähler modulus and 149 complex structure moduli [50]. Here we are mainly interested in its mirror family of Calabi-Yau threefolds, $Y$, depending on one complex structure modulus and 149 Kähler moduli. Applying the standard Greene-Plesser construction [51], we realize the mirror (in its singular limit) as the degree eight hypersurface

$$ P(\psi) = x_1^8 + x_2^8 + x_3^8 + x_4^8 + x_5^2 - 8\,\psi\, x_1 x_2 x_3 x_4 x_5, \tag{4.1} $$

in the $(\mathbb{Z}_8)^2 \times \mathbb{Z}_2$-orbifold of the weighted projective space, $\mathbb{WP}^4_{(1,1,1,1,4)}$. The parameter, $\psi$, is the algebraic complex structure modulus of the Calabi-Yau threefold, $Y$. The orbifold group is generated by$^5$

$$ g_1 = (1, 0, 0, 7, 0), \qquad g_2 = (0, 1, 0, 7, 0), \qquad g_3 = (0, 0, 1, 7, 0), \tag{4.2} $$

acting on the weighted projective coordinates, e.g. for the generator, $g_1$, we get

$$ g_1 : [\, x_1 : x_2 : x_3 : x_4 : x_5 \,] \mapsto [\, \eta\, x_1 : x_2 : x_3 : \eta^7 x_4 : x_5 \,], \qquad \eta \equiv e^{2\pi i/8}. \tag{4.3} $$

Resolving the orbifold singularities by standard toric techniques one obtains a smooth family of Calabi-Yau threefolds depending on 149 Kähler moduli. The next step is to introduce the divisor, $V$, which determines the D5-brane content in our example. The divisor, $V$, is defined by the polynomial

$$ Q(\phi) = x_5 - \phi\, x_1 x_2 x_3 x_4. \tag{4.4} $$

Here $\phi$ is the algebraic open-string modulus. Note that this constitutes the most general polynomial of degree four invariant with respect to the $(\mathbb{Z}_8)^2 \times \mathbb{Z}_2$-orbifold group. The family of branes defined by the divisor, $Q$, is directly related to the D5-branes in the Calabi-Yau, $Y$, discussed in refs. [29,30]. There the relevant D5-branes wrap the holomorphic two cycles, $C_\pm$, which are respectively given in the ambient weighted projective space by$^6$

$$ \begin{aligned} C_+ &= \{\, x_5 = 0,\ x_1 + \mu\, x_2 = 0,\ x_3 + \nu\, x_4 = 0,\ \mu^8 = \nu^8 = -1 \,\} \subset Y, \\ C_- &= \{\, x_5 - 8\,\psi\, x_1 x_2 x_3 x_4 = 0,\ x_1 + \mu\, x_2 = 0,\ x_3 + \nu\, x_4 = 0,\ \mu^8 = \nu^8 = -1 \,\} \subset Y. \end{aligned} \tag{4.5} $$

We make the important observation that the curves, $C_\pm$, lie in the divisors given in terms of the polynomial, $Q(\phi)$, evaluated at the two critical points,

$$ \phi_+ = 0, \qquad \phi_- = 8\,\psi. \tag{4.6} $$

$^5$ Naively these generators give rise to the group, $(\mathbb{Z}_8)^3$. However, a $\mathbb{Z}_4$-subgroup acts trivially due to quasi-homogeneous identifications of the projective coordinates [50].
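The statement that the curves $C_\pm$ of Eq. (4.5) lie both in the hypersurface $P(\psi) = 0$ and in the divisors $Q(\phi_\pm) = 0$ can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original analysis; the helper names (`random_curve_point`, `max_residual`) and the sampling scheme are illustrative assumptions.

```python
import cmath
import random


def P(x, psi):
    """Hypersurface polynomial P(psi) of Eq. (4.1)."""
    x1, x2, x3, x4, x5 = x
    return x1**8 + x2**8 + x3**8 + x4**8 + x5**2 - 8 * psi * x1 * x2 * x3 * x4 * x5


def Q(x, phi):
    """Divisor polynomial Q(phi) of Eq. (4.4)."""
    x1, x2, x3, x4, x5 = x
    return x5 - phi * x1 * x2 * x3 * x4


def random_curve_point(sign, psi, rng):
    """Random point on C+ (sign=+1) or C- (sign=-1), following Eq. (4.5)."""
    # mu and nu are eighth roots of -1, i.e. exp(i*pi*(2j+1)/8) for j = 0..7.
    mu = cmath.exp(1j * cmath.pi * (2 * rng.randrange(8) + 1) / 8)
    nu = cmath.exp(1j * cmath.pi * (2 * rng.randrange(8) + 1) / 8)
    x2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    x4 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    x1, x3 = -mu * x2, -nu * x4
    x5 = 0j if sign > 0 else 8 * psi * x1 * x2 * x3 * x4
    return (x1, x2, x3, x4, x5)


def max_residual(psi, samples=200, seed=0):
    """Largest |P| and |Q(phi_crit)| found on sampled points of C+ and C-."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # phi_+ = 0 belongs to C+, phi_- = 8*psi belongs to C-, see Eq. (4.6).
        for sign, phi_crit in ((+1, 0), (-1, 8 * psi)):
            x = random_curve_point(sign, psi, rng)
            worst = max(worst, abs(P(x, psi)), abs(Q(x, phi_crit)))
    return worst


if __name__ == "__main__":
    # The residual should sit at floating-point rounding level.
    print(max_residual(0.3 + 0.2j))
```

The check works because $\mu^8 = \nu^8 = -1$ makes $x_1^8 + x_2^8$ and $x_3^8 + x_4^8$ vanish separately, while the remaining terms $x_5^2 - 8\psi\, x_1 x_2 x_3 x_4 x_5$ vanish for $x_5 = 0$ and for $x_5 = 8\psi\, x_1 x_2 x_3 x_4$.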

Therefore we claim that the family of divisors, $V$, gives rise to the relative period integrals, which at the critical points describe the D5-branes, $C_\pm$. This claim is supported by the fact that the spectrum of the two D5-branes, $C_\pm$, consists of one (obstructed) open-string modulus. Moreover, this modulus enters a cubic superpotential, and the two brane configurations, $C_\pm$, emerge at the two critical points of this superpotential [29]. As we go along and compute the effective superpotential in flat coordinates we further substantiate this picture and make this correspondence more precise. Before we conclude this section we establish that the open/closed moduli space parametrized by the algebraic moduli, $\psi$ and $\phi$, exhibits a $\mathbb{Z}_8 \times \mathbb{Z}_2$-symmetry. The generator of the $\mathbb{Z}_8$-group acts on the algebraic moduli as

$$ \begin{pmatrix} \psi \\ \phi \end{pmatrix} \mapsto \eta \begin{pmatrix} \psi \\ \phi \end{pmatrix}, \qquad \eta \equiv e^{2\pi i/8}. \tag{4.7} $$

This is indeed a symmetry as its action on the polynomials, $P(\psi)$ and $Q(\phi)$, is readily compensated by the projective coordinate transformation, $x_1 \mapsto \eta^{-1} x_1$. The $\mathbb{Z}_2$-symmetry is generated by

$$ \begin{pmatrix} \psi \\ \phi \end{pmatrix} \mapsto \begin{pmatrix} \psi \\ 8\psi - \phi \end{pmatrix}, \tag{4.8} $$

and its action on the polynomials, $P(\psi)$ and $Q(\phi)$, is balanced by the projective coordinate transformation, $[x_1 : x_2 : x_3 : x_4 : x_5] \mapsto [-x_1 : x_2 : x_3 : x_4 : x_5 - 8\,\psi\, x_1 x_2 x_3 x_4]$. Note that the $\mathbb{Z}_2$ symmetry exchanges the two critical points, $\phi_\pm$, and hence maps a D5-brane wrapped on the holomorphic two cycle, $C_+$, to a D5-brane wrapped on the holomorphic two cycle, $C_-$, and vice versa. Thus altogether the genuine open/closed moduli space is really a $\mathbb{Z}_8 \times \mathbb{Z}_2$-orbifold of the covering space parametrized by the algebraic variables, $\psi$ and $\phi$. The corresponding orbifold singularities emerge at the fixed point loci, $\psi = 0$ and $\phi = 4\psi$, of the $\mathbb{Z}_8$- and $\mathbb{Z}_2$-group action (4.7) and (4.8).

$^6$ In ref. [29] the Calabi-Yau threefold, $Y$, is given by the degree eight hypersurface polynomial, $\tilde P(\tilde\psi) = x_1^8 + x_2^8 + x_3^8 + x_4^8 + x_5^2 - 4\tilde\psi\, x_1^2 x_2^2 x_3^2 x_4^2$, and the holomorphic two cycles, $C_\pm$, are defined as $C_\pm = \{ x_5 \pm 2\sqrt{\tilde\psi}\, x_1 x_2 x_3 x_4 = 0,\ x_1 + \mu\, x_2 = 0,\ x_3 + \nu\, x_4 = 0,\ \mu^8 = \nu^8 = -1 \}$. These definitions translate to our conventions if we identify the algebraic closed string moduli as $\tilde\psi \equiv 4\psi^2$ and change the weighted projective coordinates according to $x_5 \mapsto x_5 - 4\psi\, x_1 x_2 x_3 x_4$.
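The two symmetry statements can be tested at random points of the ambient space. The sketch below is an illustration only: it takes the $\mathbb{Z}_8$ generator to act as $(\psi, \phi) \mapsto \eta\,(\psi, \phi)$ and the $\mathbb{Z}_2$ generator as $(\psi, \phi) \mapsto (\psi, 8\psi - \phi)$, and it assumes that the shift of $x_5$ in the compensating coordinate change is built from the original (unflipped) $x_1$. The function names are hypothetical.

```python
import cmath
import random

ETA = cmath.exp(2j * cmath.pi / 8)  # primitive eighth root of unity


def P(x, psi):
    """Hypersurface polynomial P(psi) of Eq. (4.1)."""
    x1, x2, x3, x4, x5 = x
    return x1**8 + x2**8 + x3**8 + x4**8 + x5**2 - 8 * psi * x1 * x2 * x3 * x4 * x5


def Q(x, phi):
    """Divisor polynomial Q(phi) of Eq. (4.4)."""
    x1, x2, x3, x4, x5 = x
    return x5 - phi * x1 * x2 * x3 * x4


def z8_defect(x, psi, phi):
    """Change of P and Q under (psi, phi) -> eta*(psi, phi) with x1 -> x1/eta."""
    x1, x2, x3, x4, x5 = x
    y = (x1 / ETA, x2, x3, x4, x5)
    return max(abs(P(y, ETA * psi) - P(x, psi)), abs(Q(y, ETA * phi) - Q(x, phi)))


def z2_defect(x, psi, phi):
    """Change of P and Q under phi -> 8*psi - phi with the compensating map."""
    x1, x2, x3, x4, x5 = x
    # Assumption: the shift of x5 uses the original x1, before the sign flip.
    y = (-x1, x2, x3, x4, x5 - 8 * psi * x1 * x2 * x3 * x4)
    return max(abs(P(y, psi) - P(x, psi)), abs(Q(y, 8 * psi - phi) - Q(x, phi)))


def max_defect(samples=200, seed=1):
    """Largest symmetry violation over random coordinates and moduli."""
    rng = random.Random(seed)
    rnd = lambda: complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    worst = 0.0
    for _ in range(samples):
        x = tuple(rnd() for _ in range(5))
        psi, phi = rnd(), rnd()
        worst = max(worst, z8_defect(x, psi, phi), z2_defect(x, psi, phi))
    return worst


if __name__ == "__main__":
    print(max_defect())  # expected to be at rounding-error level
```

Under the $\mathbb{Z}_2$ map the fixed locus is $\phi = 8\psi - \phi$, i.e. $\phi = 4\psi$, and the critical points $\phi_+ = 0$ and $\phi_- = 8\psi$ are exchanged, in line with the discussion above.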


4.2. The linear Gauss-Manin system of differential equations. In this section we explicitly construct the linear Gauss-Manin system of differential equations for the D5-brane geometry alluded to in the previous section. By taking derivatives of the relative holomorphic three form,

$$ \underline{\Omega}(\psi, \phi) = \int \frac{\log Q(\phi)}{P(\psi)}\,\Delta, \tag{4.9} $$

defined in terms of the polynomials (4.1) and (4.4), we generate due to Griffiths transversality and according to the variational diagram (2.18) a basis for the relative cohomology group, $H^3(Y, V)$. A convenient basis turns out to be

$$ \pi \equiv \left(\pi_{3,3}, \pi_{2,3}, \pi_{1,3}, \pi_{0,3}, \pi_{2,4}, \pi_{1,4}, \pi_{0,4}\right) = \left(\underline{\Omega},\ \partial_\psi\underline{\Omega},\ \partial_\psi^2\underline{\Omega},\ \partial_\psi^3\underline{\Omega},\ \partial_\phi\underline{\Omega},\ \partial_\psi\partial_\phi\underline{\Omega},\ \partial_\psi^2\partial_\phi\underline{\Omega}\right). \tag{4.10} $$

All these basis elements are represented by relative three-form residue integrals

$$ \begin{aligned} \pi_{3-k,3} &= k!\, 8^k \int \frac{(x_1 x_2 x_3 x_4 x_5)^k \log Q(\phi)}{P(\psi)^{k+1}}\,\Delta, & k &= 0, 1, 2, 3, \\ \pi_{2-k,4} &= -k!\, 8^k \int \frac{(x_1 x_2 x_3 x_4)^{k+1}\, x_5^k}{P(\psi)^{1+k}\, Q(\phi)}\,\Delta, & k &= 0, 1, 2. \end{aligned} \tag{4.11} $$

Note that the structure of the chosen basis is in accord with the mixed Hodge filtration defined in Sect. 2.3. In particular the bases for the decreasing filtration modules, $F^p$, for $p = 3, 2, 1, 0$ are given by $\{\pi_{3,3}, \ldots, \pi_{p,3}, \pi_{2,4}, \ldots, \pi_{p,4}\}$, whereas the increasing weight filtration modules, $W_3$ and $W_4$, are spanned by the basis vectors, $(\pi_{\ell,3})_{\ell=3,\ldots,0}$ and $\pi$, respectively. The next task is to determine the linear Gauss-Manin system of differential equations (3.15) in terms of the above defined basis. Therefore we expand the vectors, $\partial_\psi \pi$ and $\partial_\phi \pi$, into the defined basis elements, $\pi_{k,l}$. This procedure is trivial for some of the differentiated basis elements, namely directly from the definition of the basis vector (4.10) we can read off the following relations:

$$ \begin{aligned} \partial_\psi \pi_{k,4} &= \pi_{k-1,4} && \text{for } k = 2, 1, \\ \partial_\psi \pi_{k,3} &= \pi_{k-1,3} && \text{for } k = 3, 2, 1, \\ \partial_\phi \pi_{k,3} &= \pi_{k-1,4} && \text{for } k = 3, 2, 1. \end{aligned} \tag{4.12} $$

However, in order to find the expansions for the derivatives of the remaining basis elements we are required to employ the whole machinery of relative three-form residue integrals developed in Sect. 3.2. That is to say we need to add in a systematic manner appropriate exact relative three forms (3.9) and (3.13) so as to establish the correct expansions. The performed reduction procedure constitutes a generalization of the Griffiths-Dwork algorithm in the context of relative form residue integrals. While the technical details of these long and tedious calculations are relegated to Appendix A, we simply collect the results here. For the remaining derivatives acting on the basis elements, $\pi_{k,4}$, we find

$$ \begin{aligned} \partial_\phi \pi_{2,4} &\simeq \frac{4\psi - \phi}{4\phi}\, \pi_{1,4}, \\ \partial_\phi \pi_{1,4} &\simeq \frac{4\psi - \phi}{4\phi}\, \pi_{0,4} + \frac{1}{\phi}\, \pi_{1,4}, \\ \partial_\phi \pi_{0,4} &\simeq \frac{R_1}{D_2}\, \pi_{2,4} + \frac{R_2}{D_2}\, \pi_{1,4} + \frac{R_3}{\phi\, D_2}\, \pi_{0,4}. \end{aligned} \tag{4.13} $$


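As before '$\simeq$' indicates that these equations hold on the level of cohomology classes. The polynomials, $R_1$, $R_2$, and $R_3$, are given by

$$ \begin{aligned} R_1 &= -128\, \phi^3 (\phi - 8\psi)(\phi - 4\psi), \\ R_2 &= 112\, \phi^3 (\phi - 8\psi)^2 (\phi - 4\psi), \\ R_3 &= -2 \left( 5\phi^8 - 136\,\psi\phi^7 + 1344\,\psi^2\phi^6 - 5632\,\psi^3\phi^5 + 8192\,\psi^4\phi^4 + 256 \right), \end{aligned} \tag{4.14} $$

and we further introduce the two discriminants

$$ D_1 = 1 - (2\psi)^8, \qquad D_2 = \prod_{\ell=1}^{4} \left( \phi(\phi - 8\psi) - 4\, e^{\frac{\pi i \ell}{2}} \right). \tag{4.15} $$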

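As the basis elements (4.10) arise from the variation of the relative three form, $\underline{\Omega}$, it implies $\partial_\psi \pi_{0,4} = \partial_\phi \pi_{0,3}$, and we find

$$ \partial_\psi \pi_{0,4} = \partial_\phi \pi_{0,3} \simeq \frac{S_1}{D_2}\, \pi_{2,4} + \frac{S_2}{D_2}\, \pi_{1,4} + \frac{S_3}{D_2}\, \pi_{0,4}. \tag{4.16} $$

Here the polynomials, $S_1$, $S_2$, and $S_3$, are given by

$$ S_1 = 512\, \phi^4 (\phi - 8\psi), \qquad S_2 = -448\, \phi^4 (\phi - 8\psi)^2, \qquad S_3 = 48\, \phi^4 (\phi - 8\psi)^3. \tag{4.17} $$

We observe that the expansions of the derivatives acting on the elements, $\pi_{k,4}$, do not involve the basis elements, $\pi_{k,3}$. This manifests in this example the existence of the variational sub-system (2.19) within the variation of mixed Hodge structure (2.18). Finally it remains to expand the element, $\partial_\psi \pi_{0,3}$, for which we obtain

$$ \begin{aligned} \partial_\psi \pi_{0,3} \simeq{}& \frac{256\,\psi^4}{D_1}\, \pi_{3,3} + \frac{15\left(1 + (2\psi)^8\right)}{\psi^3 D_1}\, \pi_{2,3} - \frac{5\left(3 - 1280\,\psi^8\right)}{\psi^2 D_1}\, \pi_{1,3} + \frac{2\left(3 + 1280\,\psi^8\right)}{\psi\, D_1}\, \pi_{0,3} \\ &+ \frac{T_1}{\psi^3 D_1 D_2}\, \pi_{2,4} + \frac{T_2}{\phi\,\psi^2 D_1 D_2}\, \pi_{1,4} + \frac{T_3}{\phi^2 \psi\, D_1 D_2}\, \pi_{0,4}, \end{aligned} \tag{4.18} $$

in terms of the polynomials$^7$

$$ \begin{aligned} T_1 &= \frac{4}{\phi - 4\psi} \left[ \left( 256\,\phi\psi^8 + 448\,\phi^2\psi^7 - 60\,\psi + 15\,\phi \right) D_2 - \psi^3 \phi\, S_1 D_1 \right], \\ T_2 &= \frac{4}{\phi - 4\psi} \left[ \left( 1792\,\phi^2\psi^8 + 1792\,\phi^3\psi^7 - 112\,\phi^4\psi^6 + 48\,\psi^2 + 48\,\phi\psi - 15\,\phi^2 \right) D_2 - \psi^2 \phi^2 S_2 D_1 \right], \\ T_3 &= \frac{4}{\phi - 4\psi} \left[ 2 \left( 768\,\phi^3\psi^8 + 352\,\phi^4\psi^7 - 48\,\phi^5\psi^6 + 2\,\phi^6\psi^5 - 32\,\psi^3 - 16\,\phi\psi^2 - 6\,\phi^2\psi + 3\,\phi^3 \right) D_2 - \psi\, \phi^3 S_3 D_1 \right]. \end{aligned} \tag{4.19} $$

$^7$ Note that the polynomials, $T_1$ to $T_3$, are actually finite for $\phi \to 4\psi$.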
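Two internal consistency checks on the polynomials (4.14), (4.17) and (4.19) can be run in exact rational arithmetic. The sketch below is not taken from the paper. It assumes that differentiating the first relation of (4.13) twice with respect to $\psi$ gives $\partial_\phi \pi_{0,4} \simeq c\,\partial_\psi \pi_{0,4} + \tfrac{2}{\phi}\,\pi_{0,4}$ with $c = (4\psi - \phi)/(4\phi)$, so that $R_1 = c\,S_1$, $R_2 = c\,S_2$ and $R_3 = \phi\, c\, S_3 + 2 D_2$. It further uses the product (4.15) in the expanded form $D_2 = (\phi(\phi - 8\psi))^4 - 256$, and it reads the numerators of $T_1$, $T_2$, $T_3$ as the square brackets multiplying $4/(\phi - 4\psi)$.

```python
from fractions import Fraction as Fr


def D1(psi):
    """Bulk discriminant D1 = 1 - (2*psi)^8, Eq. (4.15)."""
    return 1 - (2 * psi) ** 8


def D2(psi, phi):
    """Brane discriminant, expanded product of Eq. (4.15): u^4 - 4^4."""
    return (phi * (phi - 8 * psi)) ** 4 - 256


def S(psi, phi):
    """Polynomials S1, S2, S3 of Eq. (4.17)."""
    w = phi - 8 * psi
    return (512 * phi**4 * w, -448 * phi**4 * w**2, 48 * phi**4 * w**3)


def R(psi, phi):
    """Polynomials R1, R2, R3 of Eq. (4.14)."""
    w, v = phi - 8 * psi, phi - 4 * psi
    r3 = -2 * (5 * phi**8 - 136 * psi * phi**7 + 1344 * psi**2 * phi**6
               - 5632 * psi**3 * phi**5 + 8192 * psi**4 * phi**4 + 256)
    return (-128 * phi**3 * w * v, 112 * phi**3 * w**2 * v, r3)


def R_from_S(psi, phi):
    """R1, R2, R3 rebuilt from S1, S2, S3 under the assumption stated above."""
    c = (4 * psi - phi) / (4 * phi)
    s1, s2, s3 = S(psi, phi)
    return (c * s1, c * s2, phi * c * s3 + 2 * D2(psi, phi))


def T_numerators(psi, phi):
    """Square brackets of Eq. (4.19), i.e. T_i * (phi - 4*psi) / 4."""
    s1, s2, s3 = S(psi, phi)
    d1, d2 = D1(psi), D2(psi, phi)
    n1 = (256 * phi * psi**8 + 448 * phi**2 * psi**7 - 60 * psi + 15 * phi) * d2 \
        - psi**3 * phi * s1 * d1
    n2 = (1792 * phi**2 * psi**8 + 1792 * phi**3 * psi**7 - 112 * phi**4 * psi**6
          + 48 * psi**2 + 48 * phi * psi - 15 * phi**2) * d2 - psi**2 * phi**2 * s2 * d1
    n3 = 2 * (768 * phi**3 * psi**8 + 352 * phi**4 * psi**7 - 48 * phi**5 * psi**6
              + 2 * phi**6 * psi**5 - 32 * psi**3 - 16 * phi * psi**2
              - 6 * phi**2 * psi + 3 * phi**3) * d2 - psi * phi**3 * s3 * d1
    return (n1, n2, n3)


if __name__ == "__main__":
    psi, phi = Fr(1, 3), Fr(5, 2)
    print(R(psi, phi) == R_from_S(psi, phi))
    print(T_numerators(psi, 4 * psi))
```

If the numerators vanish on the locus $\phi = 4\psi$, the prefactor $4/(\phi - 4\psi)$ does not produce a pole there, which is the content of the footnote on the finiteness of $T_1$ to $T_3$.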


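All the relations (4.12) to (4.18) are now summarized in the connection matrices, $M_\phi$ and $M_\psi$, of the linear Gauss-Manin system (3.15). They read

$$ M_\psi = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ \frac{256\,\psi^4}{D_1} & \frac{3840\,\psi^8 + 15}{\psi^3 D_1} & -\frac{15 - 6400\,\psi^8}{\psi^2 D_1} & \frac{2560\,\psi^8 + 6}{\psi\, D_1} & \frac{T_1}{\psi^3 D_1 D_2} & \frac{T_2}{\phi\,\psi^2 D_1 D_2} & \frac{T_3}{\phi^2 \psi\, D_1 D_2} \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & \frac{S_1}{D_2} & \frac{S_2}{D_2} & \frac{S_3}{D_2} \end{pmatrix}, \tag{4.20} $$

and

$$ M_\phi = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & \frac{S_1}{D_2} & \frac{S_2}{D_2} & \frac{S_3}{D_2} \\ 0 & 0 & 0 & 0 & 0 & \frac{\psi}{\phi} - \frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{\phi} & \frac{\psi}{\phi} - \frac{1}{4} \\ 0 & 0 & 0 & 0 & \frac{R_1}{D_2} & \frac{R_2}{D_2} & \frac{R_3}{\phi\, D_2} \end{pmatrix}. \tag{4.21} $$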

The discriminant, $D_1$, corresponds to the (mirror) conifold locus appearing already in the pure bulk geometry. At the zero locus of the discriminant, $D_2$, in the moduli space, the intersection of the two polynomials, $P$ and $Q$, fails to be transversal at some points in the embedding space, $\mathbb{WP}^4_{(1,1,1,1,4)}$, and as a consequence at these points the D5-brane divisor, $V$, becomes singular. Let us also remark that the matrices (4.21) reveal a block diagonal structure. As indicated before this demonstrates for this example that the variation of mixed Hodge structure (2.18) contains the sub-system (2.19). Let us now focus on the Gauss-Manin connection, $\nabla_\phi \equiv \partial_\phi - M_\phi$ and $\nabla_\psi \equiv \partial_\psi - M_\psi$, of the open-/closed-string moduli space. As discussed in detail in Sect. 3.3 the underlying N = 1 special geometry requires the Gauss-Manin connection to be flat. For our example, which involves only two moduli, $\psi$ and $\phi$, the conditions (3.16) reduce to a single equation

$$ \partial_\phi M_\psi - \partial_\psi M_\phi + [M_\psi, M_\phi] = 0, \tag{4.22} $$

which is indeed satisfied for the constructed connection matrices (4.20) and (4.21). We should stress again that the flatness of this connection is a non-trivial condition imposed on the linear Gauss-Manin system. It ensures that the associated system of Picard-Fuchs equations, which we examine in the next sections, is integrable. Furthermore the fulfilled integrability condition (4.22) serves also as a non-trivial check on our method of realizing the variation of mixed Hodge structure in terms of relative three-form residue integrals.

4.3. Relative periods in the vicinity of the orbifold point. First we want to solve the linear Gauss-Manin system parametrized by the algebraic moduli, $\psi$ and $\phi$, in the vicinity of the fixed point locus of the symmetry transformations (4.7) and (4.8), i.e. in the vicinity of the $\mathbb{Z}_8 \times \mathbb{Z}_2$ orbifold singularity of the open/closed string moduli space located at $\psi = 0$ and $\phi = 4\psi = 0$.
From the linear Gauss-Manin system characterized by the two 7 × 7 matrices (4.21) and (4.20) we extract the following Picard-Fuchs operators (3.17):


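$$ \mathcal{L}_1 = \tilde{\mathcal{L}}_1\, \partial_\phi, \qquad \mathcal{L}_2 = \tilde{\mathcal{L}}_2\, \partial_\phi, \qquad \mathcal{L}_3 = \mathcal{L}^{\rm bulk} + \mathcal{L}^{\rm bdry}. \tag{4.23} $$

The differential operators, $\tilde{\mathcal{L}}_1$ and $\tilde{\mathcal{L}}_2$, read

$$ \begin{aligned} \tilde{\mathcal{L}}_1 &= (4\psi - \phi)\, \theta_\psi - 4\psi\, \theta_\phi, \\ \tilde{\mathcal{L}}_2 &= \partial_\psi^3 - \frac{48\, \phi^4 (\phi - 8\psi)^3}{D_2}\, \partial_\psi^2 + \frac{448\, \phi^4 (\phi - 8\psi)^2}{D_2}\, \partial_\psi - \frac{512\, \phi^4 (\phi - 8\psi)}{D_2}, \end{aligned} \tag{4.24} $$

where the leading term of $\tilde{\mathcal{L}}_2$ is written as $\partial_\psi^3$, the form implied by the relation (4.16) for $\partial_\psi \pi_{0,4}$, and the operators, $\mathcal{L}^{\rm bulk}$ and $\mathcal{L}^{\rm bdry}$, are

$$ \begin{aligned} \mathcal{L}^{\rm bulk} &= \theta_\psi (\theta_\psi - 2)(\theta_\psi - 4)(\theta_\psi - 6) - (2\psi)^8 (\theta_\psi + 1)^4, \\ \mathcal{L}^{\rm bdry} &= -\frac{\psi^3 T_3}{\phi^2 D_2}\, \partial_\psi^2 \partial_\phi - \frac{\psi^2 T_2}{\phi\, D_2}\, \partial_\psi \partial_\phi - \frac{\psi\, T_1}{D_2}\, \partial_\phi \end{aligned} \tag{4.25} $$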

in terms of the logarithmic derivatives, $\theta_\psi \equiv \psi\partial_\psi$ and $\theta_\phi \equiv \phi\partial_\phi$. Note that the linear Gauss-Manin system is equivalent to two independent partial differential operators of up to degree four. Thus one of the operators, $\mathcal{L}_1$ or $\mathcal{L}_2$, is redundant. However, in order to find solutions to the partial differential equations associated to these rather complicated operators we take advantage of the variational sub-system (2.19) governed by the two Picard-Fuchs operators, $\tilde{\mathcal{L}}_1$ and $\tilde{\mathcal{L}}_2$.

Before we investigate the sub-system (2.19) we observe that the Picard-Fuchs operators (4.23) exhibit indeed the structure advocated in Eq. (3.19). Namely, as discussed in Sect. 3.3 the bulk periods, $\Pi^\alpha(\psi)$, determined by the hypergeometric differential equation, $\mathcal{L}^{\rm bulk}\, \Pi^\alpha(\psi) = 0$, form a subset of solutions to the open/closed Picard-Fuchs equations. Hence these periods are given in terms of hypergeometric functions [52],

$$ \begin{aligned} \underline{\Pi}^0(\psi) \equiv \Pi^0(\psi) &= {}_4F_3\!\left( \tfrac18, \tfrac18, \tfrac18, \tfrac18;\ \tfrac14, \tfrac24, \tfrac34;\ (2\psi)^8 \right), \\ \underline{\Pi}^1(\psi) \equiv \Pi^1(\psi) &= (2\psi)^2\ {}_4F_3\!\left( \tfrac38, \tfrac38, \tfrac38, \tfrac38;\ \tfrac24, \tfrac34, \tfrac54;\ (2\psi)^8 \right), \\ \underline{\Pi}^2(\psi) \equiv \Pi^2(\psi) &= \frac{(2\psi)^4}{2!}\ {}_4F_3\!\left( \tfrac58, \tfrac58, \tfrac58, \tfrac58;\ \tfrac34, \tfrac54, \tfrac64;\ (2\psi)^8 \right), \\ \underline{\Pi}^3(\psi) \equiv \Pi^3(\psi) &= -\frac{(2\psi)^6}{3!}\ {}_4F_3\!\left( \tfrac78, \tfrac78, \tfrac78, \tfrac78;\ \tfrac54, \tfrac64, \tfrac74;\ (2\psi)^8 \right), \end{aligned} \tag{4.26} $$

which enjoy for $|\psi| < \tfrac12$

the convergent expansion ∞ 4 k + 2α+1 cα α 8 (4ψ)2(4n+α) , α = 0, 1, 2, 3, (ψ) = 2α+1 4 (4k + 2α + 1) 2α α! 2 8 k=0 (4.27) where the constants, cα , are all one except for c3 = −1. The next task is to determine the solutions of the sub-system described by the two 1 and L 2 . We proceed in two steps. First we notice that a general differential operators, L 1 , is given by solution to the differential operator, L χ (ψ, φ) ≡ χ (u) with u ≡ φ(φ − 8ψ).

(4.28)
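That any function of the single variable $u = \phi(\phi - 8\psi)$ is annihilated by the operator $(4\psi - \phi)\,\theta_\psi - 4\psi\,\theta_\phi$ follows from the chain rule, since $\partial_\psi u = -8\phi$ and $\partial_\phi u = 2\phi - 8\psi$. A small numerical illustration, using central finite differences and an arbitrary test function in place of $\chi(u)$ (both are assumptions of this sketch, as are the function names), is:

```python
import math


def u_of(psi, phi):
    """Combined variable u = phi*(phi - 8*psi) of Eq. (4.28)."""
    return phi * (phi - 8 * psi)


def apply_L1_tilde(f, psi, phi, h=1e-5):
    """Apply (4*psi - phi)*theta_psi - 4*psi*theta_phi to f via central differences."""
    d_psi = (f(psi + h, phi) - f(psi - h, phi)) / (2 * h)
    d_phi = (f(psi, phi + h) - f(psi, phi - h)) / (2 * h)
    # theta_psi = psi * d/dpsi and theta_phi = phi * d/dphi
    return (4 * psi - phi) * psi * d_psi - 4 * psi * phi * d_phi


def depends_on_u_only(psi, phi):
    """Arbitrary smooth test function of u alone (a stand-in for chi(u))."""
    u = u_of(psi, phi)
    return math.sin(u) + u**3


def generic(psi, phi):
    """Control function that is not of the form chi(u)."""
    return psi * phi


if __name__ == "__main__":
    print(apply_L1_tilde(depends_on_u_only, 0.3, 0.7))  # close to zero
    print(apply_L1_tilde(generic, 0.3, 0.7))            # equals -psi*phi^2, not zero
```

The control function shows that the vanishing is specific to the ansatz: for $f = \psi\phi$ the operator returns $-\psi\phi^2$.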

Second we act with the differential operator, $\tilde{\mathcal{L}}_2$, on the ansatz (4.28) and obtain an ordinary differential equation for the function, $\chi(u)$,

$$ \left[ \theta_u (\theta_u - 1)(\theta_u - 2) - \left( \frac{u}{4} \right)^4 (\theta_u + 1)^3 \right] \chi(u) = 0. \tag{4.29} $$

This is another hypergeometric differential equation with the three linearly independent solutions 1 1 1 2 3 u 4 4 χ (u) = 3 F2 , , , ; , ; 4 4 4 4 4 4 u 2 2 2 3 5 u 4 5 (4.30) , , , ; , ; χ (u) = 3 F2 4 4 4 4 4 4 4 u 2 3 3 3 5 6 u 4 . , , ; , ; χ 6 (u) = 3 F2 4 4 4 4 4 4 4 The power series of these hypergeometric functions read ∞ 4 k + 1+ 1 4+ 4 χ (u) = u 4k+ , = 0, 1, 2, (4.31) 4 (4k + + 1) 4 1+ 4

k=0

with radius of convergence |u| < 4. From the solutions (4.30) of the sub-system we construct three additional relative periods, αˆ (ψ, φ), which according to the structure of the operators, L1 and L2 , in Eq. (4.23) are integrals of the relation ∂φ αˆ (ψ, φ) = χ αˆ (φ(φ − 8ψ)) .

(4.32)
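The hypergeometric form of the solutions to Eq. (4.29) can be cross-checked term by term. Inserting $\chi = \sum_k c_k\, u^{4k+\ell}$ with $\ell = 0, 1, 2$ and $\theta_u = u\,\partial_u$ (an assumption about the notation) gives the recursion $(m+4)(m+3)(m+2)\, c_{k+1} = 4^{-4} (m+1)^3\, c_k$ with $m = 4k + \ell$. The sketch below compares this recursion, in exact rational arithmetic, with the Taylor coefficients of $u^\ell\, {}_3F_2\big(\tfrac{\ell+1}{4}, \tfrac{\ell+1}{4}, \tfrac{\ell+1}{4}; \ldots; (u/4)^4\big)$. Both series are normalized to $c_0 = 1$, so the comparison is insensitive to the overall prefactor of each solution.

```python
from fractions import Fraction as Fr


def ode_coefficients(ell, n_terms):
    """c_k of chi = sum_k c_k u^(4k+ell) solving Eq. (4.29), with c_0 = 1."""
    coeffs = [Fr(1)]
    for k in range(n_terms - 1):
        m = 4 * k + ell  # exponent of the current term
        # theta(theta-1)(theta-2) acting on u^(m+4) must cancel
        # (u/4)^4 (theta+1)^3 acting on u^m.
        coeffs.append(coeffs[-1] * Fr((m + 1) ** 3, 4**4) / ((m + 4) * (m + 3) * (m + 2)))
    return coeffs


def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1)."""
    out = Fr(1)
    for j in range(k):
        out *= a + j
    return out


def hypergeometric_coefficients(ell, n_terms):
    """Coefficients of u^(4k+ell) in u^ell * 3F2(a,a,a; b1,b2; (u/4)^4)."""
    a = Fr(ell + 1, 4)
    # Lower parameters: (ell+2)/4, (ell+3)/4, (ell+4)/4 without the entry equal to 1,
    # which is absorbed by the k! of the hypergeometric series.
    lower = [Fr(ell + j, 4) for j in (2, 3, 4) if ell + j != 4]
    out = []
    for k in range(n_terms):
        c = pochhammer(a, k) ** 3
        for b in lower:
            c /= pochhammer(b, k)
        c /= pochhammer(Fr(1), k)  # k!
        out.append(c / Fr(4) ** (4 * k))
    return out


if __name__ == "__main__":
    for ell in (0, 1, 2):
        print(ell, ode_coefficients(ell, 6) == hypergeometric_coefficients(ell, 6))
```

For $\ell = 0, 1, 2$ the lower parameters produced this way are $(\tfrac24, \tfrac34)$, $(\tfrac34, \tfrac54)$ and $(\tfrac54, \tfrac64)$, matching the three solutions listed in Eq. (4.30).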

The integrals of this equation allow for integration constants, C αˆ (ψ), which have to be chosen such that the integrals are annihilated by the third Picard-Fuchs operator, L3 . Instead of solving yet another differential equation for C αˆ (ψ), we use the Z2 symmetry (4.8) so as to determine the integration constants. Due to the symmetry the relative periods split into symmetric and anti-symmetric solutions with respect to the Z2 action (4.8). The functions (4.28) are symmetric and, therefore, up to a symmetric bulk period (4.26), the integrals must be anti-symmetric in order to be solutions to the whole Picard-Fuchs system, i.e.the relative periods, αˆ (ψ, φ), vanish at the Z2 -fixed point locus, (ψ, φ) = (ψ, 4ψ). Hence altogether we arrive at the relative periods φ αˆ (ψ, φ) = χ αˆ (ζ (ζ − 8ψ)) dζ, (4.33) 4ψ

which are indeed annihilated by the Picard-Fuchs operator, L3 . They give rise to the power series 4 (ψ, φ) =

+∞ 4k k=0 n=0

5 (ψ, φ) =

+∞ 4k+1 k=0 n=0 +∞ 4k+2

(1/4 + k)4 (−1)n (φ − 4ψ)2n+1 (4ψ)8k−2n , (1/4)4 (4k − n)! n! (2n + 1) (1/2 + k)4 (−1)n+1 (φ − 4ψ)2n+1 (4ψ)8k+2−2n , π 2 (4k + 1 − n)! n! (2n + 1)

2 (3/4 + k)4 (−1)n (φ −4ψ)2n+1 (4ψ)8k+4−2n , (3/4)4 (4k + 2 − n)! n! (2n+1) k=0 n=0 (4.34) convergent for |φ(φ − 8ψ)| < 4. 6 (ψ, φ) =


In summary the relative periods, a (ψ, φ), a = 0, . . . , 6, of Eqs. (4.26) and (4.34) constitute a complete set of solutions to the open/closed Picard-Fuchs equations in the vicinity of the Z8 × Z2 orbifold point. 4.4. Effective superpotential and domain wall tension. The next step is to investigate the three-form relative periods computed in the previous section. The reason for analyzing the periods in the vicinity of the Z8 × Z2 orbifold point is twofold. First of all we work in a regime containing the two critical points (4.6) simultaneously. Therefore we can directly extract the domain wall tension between two D5-branes wrapping the cycles, C± , and compare to the results obtained in refs. [29,30]. Second we are able to extract equivariant orbifold invariants along the lines of refs. [16,18]. First we want to determine the flat coordinate, t, and the holomorphic prepotential, F, of the closed-string sector, that is to say we first focus on the first four entries of the relative period vector (3.22). In order to define a symplectic basis for the homology group, H3 (Y, Z), which singles out an unambiguous choice of flat closed-string periods, we impose the following two criteria. We demand that the flat periods are not singular at the orbifold locus, ψ = 0, and we require that the flat periods form one-dimensional irreducible representations with respect to the Z8 -monodromy group action (4.7), i.e.the transformation, ψ → η ψ, induces on flat periods a phase rotation, ηm , for some integer m. The physics motivation for the former requirement reflects the fact that, although there is a Z8 -orbifold singularity in the complex structure moduli space at ψ = 0, the Calabi-Yau, Y , itself is smooth. Therefore the flat periods should also be regular at the orbifold locus. The second criterion comes about as follows. 
The closed-string conformal field theory of the Calabi-Yau, $Y$, at the orbifold point is captured by a Landau-Ginzburg $\mathbb{Z}_8$-orbifold theory [53], whose chiral multiplets fall into $\mathbb{Z}_8$ representations [54]. Since the holomorphic prepotential together with the flat coordinates encode the chiral ring of this conformal field theory [47,48,55], it is natural to also arrange the flat periods and hence the resulting chiral ring structure constants into $\mathbb{Z}_8$ representations. For our example the discussed two requirements pin down uniquely (up to two numerical constants) the closed-string flat periods (3.22), and for the calculated periods (4.26) we derive up to an overall numerical constant the closed-string flat coordinate

$$ t(\psi) = \frac{\Pi^1(\psi)}{\Pi^0(\psi)}, \tag{4.35} $$

which yields the closed-string mirror map at the $\mathbb{Z}_8$-orbifold locus. The first few terms in the expansion explicitly read$^8$

$$ \psi^2(t) = \frac{1}{4}\, t - \frac{19}{1920}\, t^5 - \frac{541}{516\,096}\, t^9 - \frac{2\,177\,327}{7\,084\,965\,888}\, t^{13} + \cdots. \tag{4.36} $$

$^8$ Note that the flat coordinate is a function of $\psi^2$. Thus it is convenient to also state the mirror map as a function, $\psi^2(t)$.

From the remaining periods we deduce the holomorphic prepotential, $F$, which by applying the stated criteria is only ambiguous up to a second numerical factor, and it becomes in terms of the flat coordinate, $t$,

$$ F(t) = \frac{1}{2} \left. \frac{\Pi^3(\psi)}{\Pi^0(\psi)} \right|_{\psi = \psi(t)} + \frac{t}{2} \left. \frac{\Pi^2(\psi)}{\Pi^0(\psi)} \right|_{\psi = \psi(t)}. \tag{4.37} $$
Effective Superpotentials for Compact D5-Brane Calabi-Yau Geometries

271

The explicit expansion yields 77 7 161 071 11 20 606 066 649 15 626 507 087 510 997 19 1 3 t + t + t + t + t +· · · . 3! 7! · 8 11! · 16 15! · 256 19! · 256 (4.38) Note that, up to the two undetermined overall numerical scales, this expansion contains the genus zero orbifold invariants of the compact hypersurface, Y . It would be interesting to directly establish these invariants by a localization computation in the topological A-model. Such a computation fixes also the mentioned normalization ambiguities. Next we turn to the open-string sector. Analogously to the closed-string sector we demand that the open-string semi-periods come also in one-dimensional representations of the Z8 -orbifold group action (4.7). Since this condition is already fulfilled for the semi-periods, α , stated in Eq. (4.34), the remaining task is to identify semi-periods for the flat coordinates and the superpotential respectively. In order to yield a good coordinate system in the vicinity of the orbifold locus the open-string flat coordinate, tˆ, must not vanish identically along the Z8 -orbifold locus, ψ = 0. This latter requirement, however, is only fulfilled by the semi-period, 4 (ψ, φ). Therefore, up to a numerical factor, the open-string flat coordinate has to be F(t) =

tˆ(ψ, φ) =

4 (ψ, φ) . 0 (ψ)

(4.39)

From the flat coordinate, tˆ, we compute recursively the expansion of the open-mirror map, whose first few terms are given by 1 2 5 1 1 5 4 1 9 t tˆ + t 3 tˆ3 − t tˆ + t tˆ7 − tˆ 128 72 320 2688 55 296 72 799 8 21 337 6 5 6269 7 3 t tˆ + t tˆ − t tˆ − 10 321 920 967 680 5 529 600 (4.40) 2509 5 7 10 001 4 9 30 707 3 11 t tˆ − t tˆ + t tˆ + 1 720 320 27 525 120 510 935 040 10 291 59 1 t 2 tˆ13 + t tˆ15 + − tˆ17 + · · · . 1 610 219 520 148 635 648 339 738 624

φ(t, tˆ) = 4 ψ(t) + tˆ −

Here we have also inserted the closed-string mirror map (4.36) so as to eliminate the algebraic variable, ψ. Finally we need to identify the relative period encoding the superpotential, W . Looking again at the extracted prepotential, F, we notice that it transforms under the Z8 transformation, ψ → η ψ, as F → η6 F. On the other hand we know that the closed-string moduli space is equipped with a line bundle, L, whose first Chern class equals the Kähler form on the moduli space. Furthermore the holomorphic prepotential is a section of the line bundle, L2 [48,6], whereas the holomorphic superpotential, which arises from semi-periods and thus encodes the orbifold disk amplitudes, constitutes a section of the line bundle, L [6]. Therefore we expect the D5-brane superpotential to transform as W → η3 W with respect to the mentioned Z8 transformation.9 Hence up to an undetermined overall numerical normalization the D-brane effective superpotential in flat 9 This is in contrast to the local geometries discussed in refs. [16,18]. There the superpotential is required to be invariant with respect to the monodromy group.


H. Jockers, M. Soroush

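coordinates is given in terms of the semi-period, $\Pi^5$, by

$$W(t,\hat t\,) = \left.\frac{\Pi^5(\psi,\phi)}{\Pi^0(\psi)}\right|_{\psi=\psi(t),\ \phi=\phi(t,\hat t\,)} . \tag{4.41}$$

It enjoys the expansion in flat coordinates

$$W(t,\hat t\,) = -4\,t\,\hat t + \frac{1}{3}\,\hat t^{\,3} - \frac{5}{24}\,t^5\,\hat t + \frac{73}{576}\,t^4\,\hat t^{\,3} - \frac{29}{720}\,t^3\,\hat t^{\,5} + \frac{7}{960}\,t^2\,\hat t^{\,7} - \frac{23}{32\,256}\,t\,\hat t^{\,9} + \frac{89}{3\,041\,280}\,\hat t^{\,11} + \cdots . \tag{4.42}$$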

Let us now analyze and discuss our calculated results in detail. First of all we observe that the leading two terms of the effective superpotential (4.42) precisely agree with the effective cubic superpotential calculated in ref. [29]. The latter superpotential is derived by analyzing obstructions to deformations of matrix factorizations, which model the considered D5-brane geometry. However, by construction the obtained deformation superpotential is not given in terms of flat coordinates, and hence the sub-leading terms of the flat superpotential, W (t, tˆ ), are not visible. The critical points of the effective superpotential (4.41) with respect to the open-string flat coordinate, tˆ, are determined by 0 = ∂tˆ W (t, tˆ) =

χ 5 (φ(ψ − 8φ)) ∂φ(t, tˆ ) , ∂ tˆ 0 (ψ)

(4.43)

where we used Eq. (4.32). Thus due to Eq. (4.30) the critical points of the flat superpotential with respect to the flat coordinate, $\hat t$, are located at $\phi_+(t,\hat t\,) = 0$ and $\phi_-(t,\hat t\,) = 8\,\psi(t)$. Hence the computed effective superpotential, $W(t,\hat t\,)$, reproduces the correct critical loci (4.6), at which the D5-brane becomes supersymmetric and wraps one of the holomorphic two cycles, $C_\pm$. We should emphasize that the agreement with the cubic deformation superpotential and the replication of the critical loci, $\phi_\pm$, are non-trivial confirmations of our computational methods.

With the critical loci, $\phi_\pm$, of the effective superpotential, $W(t,\hat t\,)$, at hand it is straightforward to compute the domain wall tension between the supersymmetric D5-brane wrapping the two cycle, $C_+$, and the supersymmetric D5-brane wrapping the two cycle, $C_-$. Since the only dependence of the effective superpotential, $W(t,\hat t\,)$, on the open-string modulus is encoded in the relative period, $\Pi^5$, the relevant information about the domain wall tension is captured in the domain wall period, $\tau(\psi)$, which is the difference of the relative period, $\Pi^5$, evaluated at the critical points, $\phi_\pm$. Note that, in order to extract the domain wall period, it is necessary that the critical points, $\phi_\pm$, are in the radius of convergence of the stated relative period, $\Pi^5$. As the two critical points, $\phi_\pm$, approach each other at the $\mathbb{Z}_8$-orbifold point, its vicinity is suitable for this calculation. Starting from the expansion (4.34) we arrive after a few steps of algebra at the domain wall period

$$\tau(\psi) = \Pi^5(\psi,\phi_+) - \Pi^5(\psi,\phi_-) = \sum_{k=0}^{+\infty} \frac{2\,\Gamma(1/2+k)^4}{\pi^{3/2}\,\Gamma(5/2+4k)}\,(4\psi)^{8k+3} . \tag{4.44}$$

Effective Superpotentials for Compact D5-Brane Calabi-Yau Geometries


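Alternatively, using the gamma function identities, $\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin(\pi z)}$ and $\Gamma(z)\,\Gamma(z+\tfrac{1}{2}) = \sqrt{\pi}\,2^{1-2z}\,\Gamma(2z)$, the domain wall period, $\tau(\psi)$, can also be written as

$$\tau(\psi) = \frac{4\pi^2}{8\psi} \sum_{k=1}^{\infty} \frac{\Gamma(-8k+5)}{\Gamma(-4k+3)\,\Gamma(-k+3/2)^4} \left(\frac{1}{(8\psi)^8}\right)^{-k+\frac{1}{2}} , \tag{4.45}$$

which is a solution of the inhomogeneous Picard-Fuchs equation

$$\mathcal{L}_{\rm bulk}\,\tau(\psi) = \frac{3}{8\psi}\,\sqrt{(8\psi)^8} . \tag{4.46}$$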
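The equality of the two series (4.44) and (4.45) can be checked numerically. The following Python sketch compares them term by term, shifting the summation index by one. It assumes that the ratio $\Gamma(-8k+5)/\Gamma(-4k+3)$, in which both arguments sit at poles of the gamma function, is to be read as the limit $\Gamma(4k-2)/(2\,\Gamma(8k-4))$; this regularization and the function names are assumptions of the sketch.

```python
import math

def term_444(k, psi):
    """k-th term (k >= 0) of the series (4.44)."""
    return (2 * math.gamma(k + 0.5) ** 4
            / (math.pi ** 1.5 * math.gamma(4 * k + 2.5))
            * (4 * psi) ** (8 * k + 3))

def term_445(k, psi):
    """k-th term (k >= 1) of the series (4.45), prefactor 4 pi^2 / (8 psi) included."""
    # Assumed regularization of Gamma(5 - 8k) / Gamma(3 - 4k) at the poles.
    pole_ratio = math.gamma(4 * k - 2) / (2 * math.gamma(8 * k - 4))
    # (1 / (8 psi)^8)^(-k + 1/2) equals (8 psi)^(8k - 4).
    return (4 * math.pi ** 2 / (8 * psi)
            * pole_ratio / math.gamma(1.5 - k) ** 4
            * (8 * psi) ** (8 * k - 4))

def max_relative_mismatch(psi=0.05, terms=4):
    """Largest relative deviation between matching terms of (4.44) and (4.45)."""
    return max(abs(term_444(k, psi) / term_445(k + 1, psi) - 1)
               for k in range(terms))

print(max_relative_mismatch())
```

The leading terms already illustrate the match: both series start with $\tfrac{8}{3}(4\psi)^3 = \tfrac{1}{3}(8\psi)^3$.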

Let us pause to compare this result with the literature. In ref. [29] (cf. also ref. [30]) the domain wall tension period is computed in the vicinity of the large complex structure point of the closed string modulus, $\psi$. This result is then analytically continued to the $\mathbb{Z}_8$-orbifold point, where the domain wall period splits into a contribution arising from closed string periods and into a contribution, $\tau(\psi)$, intrinsic to the domain wall tension at the orbifold locus. The latter part, $\tau(\psi)$, however, agrees precisely with the domain wall tension period (4.45).10 The agreement can also be seen by comparing the inhomogeneous Picard-Fuchs equations. Namely by rewriting the inhomogeneous Picard-Fuchs equation (4.46) in terms of the large complex structure coordinate, $z = (8\psi)^{-8}$, we find the inhomogeneous Picard-Fuchs equations for domain wall tensions in the large complex structure regime stated in refs. [30].11

We should stress the significance of this result. From the variation of mixed Hodge structure of relative three-form periods we have obtained an effective superpotential in flat coordinates. This superpotential encodes disk instanton corrected domain wall tensions of the mirror geometry, which are computed by different means in refs. [27–30]. Thus extracting the quantum corrected domain wall period is a highly non-trivial consistency check on the flat effective superpotential, $W(t,\hat t\,)$, and in particular on its sub-leading terms.

Since we have managed to extract a uniquely distinguished set of flat relative periods in the vicinity of the $\mathbb{Z}_8\times\mathbb{Z}_2$-orbifold singularity we are also able to extract orbifold invariants, namely the expansions of the prepotential (4.37) and the effective superpotential (4.42) in terms of flat coordinates yield closed- and open-string orbifold invariants respectively. In particular the flat effective superpotential (4.42) encodes the orbifold disk amplitudes, $N^{(0,1)}_{k,n}$, which are defined by [16–18]

$$W(t,\hat t\,) = \sum_{k,n} \frac{1}{k!}\,N^{(0,1)}_{k,n}\,t^n\,\hat t^{\,k} . \tag{4.47}$$
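As a worked illustration of the definition (4.47), the following Python sketch converts the leading coefficients of the expansion (4.42) into disk invariants, rescaled such that $N^{(0,1)}_{1,1} = 1$. The coefficient dictionary is transcribed from (4.42); the names `W_COEFFS` and `disk_invariants` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction
from math import factorial

# Coefficients of t^n * that^k in the expansion (4.42), keyed by (n, k).
W_COEFFS = {
    (1, 1): Fraction(-4),
    (0, 3): Fraction(1, 3),
    (5, 1): Fraction(-5, 24),
    (4, 3): Fraction(73, 576),
    (3, 5): Fraction(-29, 720),
    (2, 7): Fraction(7, 960),
    (1, 9): Fraction(-23, 32256),
}

def disk_invariants(coeffs):
    """Return N[(k, n)] = k! * c(n, k) / c(1, 1), i.e. (4.47) rescaled to N[(1, 1)] = 1."""
    norm = coeffs[(1, 1)]
    return {(k, n): factorial(k) * c / norm for (n, k), c in coeffs.items()}

INVARIANTS = disk_invariants(W_COEFFS)
for (k, n), value in sorted(INVARIANTS.items()):
    print(f"N_({k},{n}) = {value}")
```

The resulting values can be compared entry by entry with the invariants listed in Table 1.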

We have collected some of these orbifold disk invariants in Table 1. Note that for the invariants in the table the effective superpotential, $W(t,\hat t\,)$, is rescaled such that the disk amplitude, $N^{(0,1)}_{1,1}$, is normalized to one, and furthermore the listed invariants are only defined up to the undetermined overall numerical normalization of the flat coordinates, $t$ and $\hat t$. As explained before the normalization of the closed flat coordinate, $t$, is established by explicitly extracting a closed-string genus zero orbifold Gromov-Witten invariant in

10 Compared to ref. [29] the expression has an additional factor of $\psi^{-1}$. This factor is traced back to the fact that the normalizing period, $\Pi^0(\psi)$, also differs by the same factor, $\psi^{-1}$.
11 Compared to ref. [30] we again need to take into account an additional factor of $\psi^{-1}$ arising from the normalization of the period, $\Pi^0(\psi)$.

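Table 1. The table lists some orbifold disk invariants, $N^{(0,1)}_{k,n}$, for the analyzed D5-brane geometry in the degree eight Calabi-Yau hypersurface in $\mathbb{WP}^4_{(1,1,1,1,4)}/(\mathbb{Z}_8)^2\times\mathbb{Z}_2$. These invariants are normalized such that $N^{(0,1)}_{1,1} = 1$, and, furthermore, are ambiguous up to the numerical normalizations of the open- and closed-string flat coordinates, $t$ and $\hat t$

$$\begin{array}{c|ccccccccc}
N^{(0,1)}_{k,n} & k=1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline
n=0 & 0 & 0 & -\frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1035}{16} \\
2 & 0 & 0 & 0 & 0 & 0 & 0 & -\frac{147}{16} & 0 & 0 \\
3 & 0 & 0 & 0 & 0 & \frac{29}{24} & 0 & 0 & 0 & 0 \\
4 & 0 & 0 & -\frac{73}{384} & 0 & 0 & 0 & 0 & 0 & 0 \\
5 & \frac{5}{96} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{315\,647}{512} \\
6 & 0 & 0 & 0 & 0 & 0 & 0 & -\frac{1\,308\,259}{46\,080} & 0 & 0 \\
7 & 0 & 0 & 0 & 0 & \frac{91}{60} & 0 & 0 & 0 & 0 \\
8 & 0 & 0 & -\frac{94\,379}{860\,160} & 0 & 0 & 0 & 0 & 0 & 0
\end{array}$$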

the mirror topological A-model, whereas the normalization of the open flat coordinate, tˆ, and the normalization of the superpotential, W , is obtained by matching the stated orbifold disk invariants with the open orbifold Gromov-Witten invariants in the topological A-model of the mirror configuration. Thus it would be interesting to pin down the normalization ambiguities and to check our results by performing an appropriate localization computation directly in the mirror topological A-model. 4.5. Large complex structure vicinity. An obvious task is to analyze the open/closed Picard-Fuchs equations in the vicinity of the large complex structure point. The effective superpotential in this regime potentially encodes disk instantons of the mirror D6-brane configuration. Instanton generated superpotentials appear in the mirror type IIA theory for chiral multiplets of the open-string sector that are massless and give rise to flat directions at the large volume point [12,13]. For the example at hand we do not expect any open/closed disk instantons associated to the (obstructed) open-string modulus, φ, because it interpolates between the two critical points, φ± , separated by a domain wall. The domain wall tension, however, remains finite at the large complex structure point [29,30], and therefore the obstructed modulus, φ, does not give rise to a flat direction, which would indicate the appearance of disk instantons. Nevertheless let us briefly discuss the singularity structure of the connection matrices (4.21) and (4.20) of the linear Gauss-Manin system in the vicinity of the large complex structure point. First we observe that the large complex structure locus, ψ = ∞, intersects the discriminant locus, D2 = 0, at the point, (ψ, φ) = (∞, ∞), in the open/closed moduli space. By analyzing the connection matrices in the vicinity of this intersection point we observe that the locus, φ = ∞, is singular for any value of ψ. 
Thus actually three boundary divisors of the moduli space meet at the point, (ψ, φ) = (∞, ∞). This intricate singularity structure becomes also apparent by looking at the degeneration of the D5-brane divisor, V , itself. The defining polynomial of the Calabi-Yau degenerates at the large complex structure point to the monomial, P(∞) = x1 x2 x3 x4 x5 . On the other hand in the limit, |φ| → +∞, the D5-brane divisor turns into Q(∞) = x1 x2 x3 x4 . Hence the intersection locus of the two polynomials, P and Q, which is complex two-dimensional at a generic point in the moduli space, obtains at the point, (ψ, φ) = (∞, ∞) complex three-dimensional components, which are given by the three-dimensional weighted projective spaces, WP3(1,1,1,4) , embedded into the ambient space, WP4(1,1,1,1,4) .


A detailed discussion of the large complex structure regime presumably requires resolving the intersection point of the three boundary divisors in the open/closed moduli space along the lines of refs. [56,57]. This analysis, however, is beyond the scope of this work, but we hope to come back to this issue elsewhere.

5. D5-branes in the Mirror Quintic Threefold

Now we investigate a certain class of D5-branes in the mirror quintic threefold in the projective space, $\mathbb{CP}^4$. Analogously to the example in the previous section we derive the open/closed Picard-Fuchs equations, and we compute the effective superpotential in flat coordinates. From the effective superpotential we also extract a domain wall tension, which agrees with the result obtained in refs. [27,28]. In the analysis we proceed analogously to the previous example. Therefore we mainly emphasize the differences in this section, defer the calculations to Appendix B, and refer to Sect. 4 for further explanation.

5.1. The Calabi-Yau and D-brane geometry. In this section the bulk geometry of interest is given by the mirror of the quintic threefolds, $Y$, in the projective space, $\mathbb{CP}^4$. The family of quintics depends on 101 complex structure moduli and one Kähler modulus, whence the family of mirror quintics, $Y$, has one complex structure modulus and 101 Kähler moduli, as explained in detail in ref. [58]. It is defined by the homogeneous degree five polynomial,

$$P(\psi) = x_1^5 + x_2^5 + x_3^5 + x_4^5 + x_5^5 - 5\,\psi\,x_1 x_2 x_3 x_4 x_5 , \tag{5.1}$$

in the $(\mathbb{Z}_5)^3$ orbifold of the projective space, $\mathbb{CP}^4$ [51,58], and the algebraic modulus, $\psi$, parametrizes the one-dimensional complex structure moduli space. Furthermore the $(\mathbb{Z}_5)^3$ orbifold is generated by

$$g_1 = (1,0,0,0,4), \quad g_2 = (0,1,0,0,4), \quad g_3 = (0,0,1,0,4), \quad g_4 = (0,0,0,1,4), \tag{5.2}$$

where, for instance, the generator, $g_1$, acts on the homogeneous projective coordinates as12

$$g_1 : [\,x_1 : x_2 : x_3 : x_4 : x_5\,] \mapsto [\,\rho\,x_1 : x_2 : x_3 : x_4 : \rho^4 x_5\,], \quad \rho \equiv e^{2\pi i/5} . \tag{5.3}$$

Resolving the resulting orbifold singularities gives rise to the smooth family of mirror quintics depending on 101 Kähler moduli. The divisor, $V$, for the D5-brane geometry is specified by the homogeneous degree-four polynomial, $Q$. As there are only two monomials of degree four with definite $(\mathbb{Z}_5)^3$ charges we arrive at

$$Q(\phi) = x_5^4 - \phi\,x_1 x_2 x_3 x_4 , \tag{5.4}$$

where the parameter, $\phi$, is the algebraic open-string modulus. The geometry specified by the homogeneous polynomials, $P(\psi)$ and $Q(\phi)$, is invariant with respect to the discrete $\mathbb{Z}_5$ symmetry, which acts on the open/closed algebraic moduli as

$$\begin{pmatrix} \psi \\ \phi \end{pmatrix} \mapsto \rho \begin{pmatrix} \psi \\ \phi \end{pmatrix}, \quad \rho \equiv e^{2\pi i/5} . \tag{5.5}$$

12 Since the transformation, $g_1 g_2 g_3 g_4$, induces a homogeneous rescaling of the projective coordinates there are really only three independent generators (5.2).
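The charge bookkeeping behind (5.1), (5.2) and (5.4) can be made explicit with a short enumeration. The Python sketch below computes the $\mathbb{Z}_5$ charges of monomials under the generators (5.2), confirms that every monomial of $P(\psi)$ is neutral, and lists the degree-four monomials that carry the same charges as $x_1x_2x_3x_4$. Reading the phrase "definite charges" as "same charges as $x_1x_2x_3x_4$" is an assumption of this sketch, and the helper names are hypothetical.

```python
from itertools import product

# Generators g_1, ..., g_4 of Eq. (5.2); each entry is the Z_5 weight of x_i.
GENERATORS = [(1, 0, 0, 0, 4), (0, 1, 0, 0, 4), (0, 0, 1, 0, 4), (0, 0, 0, 1, 4)]

def charges(exponents):
    """Z_5 charges of the monomial with the given exponent vector under g_1..g_4."""
    return tuple(sum(w * a for w, a in zip(g, exponents)) % 5 for g in GENERATORS)

def monomials(degree, variables=5):
    """All exponent vectors of the given total degree."""
    return [e for e in product(range(degree + 1), repeat=variables)
            if sum(e) == degree]

# Monomials of P(psi) in Eq. (5.1): x_i^5 and x_1 x_2 x_3 x_4 x_5.
P_MONOMIALS = [(5, 0, 0, 0, 0), (0, 5, 0, 0, 0), (0, 0, 5, 0, 0),
               (0, 0, 0, 5, 0), (0, 0, 0, 0, 5), (1, 1, 1, 1, 1)]
P_IS_NEUTRAL = all(charges(m) == (0, 0, 0, 0) for m in P_MONOMIALS)

# Degree-four monomials with the same charges as x_1 x_2 x_3 x_4.
TARGET = charges((1, 1, 1, 1, 0))
PARTNERS = sorted(m for m in monomials(4) if charges(m) == TARGET)

print(P_IS_NEUTRAL, TARGET, PARTNERS)
```

Under these assumptions the enumeration returns exactly the two monomials $x_5^4$ and $x_1x_2x_3x_4$ that enter $Q(\phi)$.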


The projective coordinate transformation, $x_1 \to \rho^{-1} x_1$, compensates the generator (5.5) and hence establishes this $\mathbb{Z}_5$ symmetry on the level of the algebraic variables, $\psi$ and $\phi$. As a consequence the open/closed string moduli space arises really as the $\mathbb{Z}_5$ orbifold of the covering space parametrized by the variables, $\psi$ and $\phi$. The next step is to relate the divisor, $Q$, to the two supersymmetric D5-branes appearing in refs. [27,28]. The geometric embeddings, $C_\pm$, into the Calabi-Yau threefold, $Y$, are specified by [28]

$$C_\pm = \left\{\, x_1 + x_2 = 0 ,\ x_3 + x_4 = 0 ,\ x_5^2 \pm \sqrt{5\psi}\,x_1 x_3 = 0 \,\right\} \subset Y , \tag{5.6}$$

together with their images under the $(\mathbb{Z}_5)^3$-orbifold group. By inserting the first two conditions, $x_1 + x_2 = 0$ and $x_3 + x_4 = 0$, into the divisor, $Q(\phi)$, we observe that the holomorphic two cycles, $C_\pm$, are simultaneously contained in the divisor, $Q(\phi)$, at the critical point,

$$\phi_0 = 5\,\psi . \tag{5.7}$$
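The containment statement can be verified at sample points. The following Python sketch picks a point on each curve $C_\pm$ of (5.6), for arbitrarily chosen complex values of $x_1$, $x_3$ and $\psi$, and evaluates both the quintic polynomial (5.1) and the divisor polynomial (5.4) at $\phi_0 = 5\psi$. The sample values and function names are assumptions made for illustration.

```python
import cmath

def P(x, psi):
    """Quintic polynomial of Eq. (5.1)."""
    x1, x2, x3, x4, x5 = x
    return x1**5 + x2**5 + x3**5 + x4**5 + x5**5 - 5 * psi * x1 * x2 * x3 * x4 * x5

def Q(x, phi):
    """Divisor polynomial of Eq. (5.4)."""
    x1, x2, x3, x4, x5 = x
    return x5**4 - phi * x1 * x2 * x3 * x4

def point_on_curve(x1, x3, psi, sign):
    """A point on C_sign: x2 = -x1, x4 = -x3, x5^2 = -sign * sqrt(5 psi) * x1 * x3."""
    x5 = cmath.sqrt(-sign * cmath.sqrt(5 * psi) * x1 * x3)
    return (x1, -x1, x3, -x3, x5)

def residuals(psi=0.3 + 0.2j, x1=0.7 - 0.4j, x3=-0.2 + 1.1j):
    """|P| and |Q(phi_0)| on sample points of C_+ and C_-."""
    out = []
    for sign in (+1, -1):
        x = point_on_curve(x1, x3, psi, sign)
        out.append((abs(P(x, psi)), abs(Q(x, 5 * psi))))
    return out

print(residuals())
```

Both residuals vanish up to rounding errors because $x_5^4 = 5\psi\,x_1^2 x_3^2$ on either curve, which is the same cancellation that makes $Q(\phi_0)$ insensitive to the sign choice.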

As a consequence we expect that the divisor, $V$, specified by the polynomial, $Q(\phi)$, describes a configuration of D5-branes, which at the critical locus, $\phi_0$, wraps both holomorphic two cycles, $C_\pm$. The fact that the polynomial, $Q(\phi_0)$, does not discriminate between the cycles, $C_+$ and $C_-$, gives us less control over the open-string moduli dependence of the described D5-brane configuration compared to the examples studied in Sect. 4. In particular by moving away from the critical locus the open-string modulus, $\phi$, interlocks the deformations of both D5-brane cycles. Despite these subtleties we are still able to extract the correct domain wall tension by evaluating the derived effective superpotential at the critical points, $(\phi^{1/2})_\pm = \pm\sqrt{5\psi}$.

5.2. Solutions to the open/closed Picard-Fuchs equations. In this section we derive the open/closed Picard-Fuchs system of partial differential equations and solve it in the vicinity of the $\mathbb{Z}_5$-orbifold locus, i.e. in the vicinity of $\psi = 0$ and $\phi = 0$. We perform our analysis along the lines of Sect. 4. Starting from the relative holomorphic three form, $\underline{\Omega}(\psi,\phi)$, we construct by Griffiths transversality a basis for the relative cohomology group, $H^3(Y,V)$. This basis is given by

$$\pi \equiv \left( \pi_{3,3},\ \pi_{2,3},\ \pi_{1,3},\ \pi_{0,3},\ \pi_{2,4},\ \pi_{1,4},\ \pi_{0,4} \right) = \left( \underline{\Omega},\ \partial_\psi \underline{\Omega},\ \partial_\psi^2 \underline{\Omega},\ \partial_\psi^3 \underline{\Omega},\ \partial_\phi \underline{\Omega},\ \partial_\psi \partial_\phi \underline{\Omega},\ \partial_\psi^2 \partial_\phi \underline{\Omega} \right) , \tag{5.8}$$

where the indices are again labelled in accord with the variation of mixed Hodge structure depicted in the diagram (2.18). We explicitly represent the basis elements (5.8) in terms of relative three-form residue integrals, which allow us to calculate the associated linear Gauss-Manin system

$$\left( \partial_\psi - M_\psi \right) \pi \simeq 0 , \quad \left( \partial_\phi - M_\phi \right) \pi \simeq 0 . \tag{5.9}$$

The derivation of the connection matrices, $M_\psi$ and $M_\phi$, is deferred to Appendix B. The result of this tedious analysis yields

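$$M_\psi = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
\frac{\psi}{D_1} & \frac{15\psi^2}{D_1} & \frac{25\psi^3}{D_1} & \frac{10\psi^4}{D_1} & \frac{-\phi\,T_1}{16 D_1 D_2} & \frac{-\phi\,T_2}{16 D_1 D_2} & \frac{-\phi\,T_3}{16 D_1 D_2} \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & \frac{125\,\phi(\phi-5\psi)}{D_2} & \frac{-175\,\phi(\phi-5\psi)^2}{D_2} & \frac{30\,\phi(\phi-5\psi)^3}{D_2}
\end{pmatrix} , \tag{5.10}$$

and

$$M_\phi = \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & \frac{125\,\phi(\phi-5\psi)}{D_2} & \frac{-175\,\phi(\phi-5\psi)^2}{D_2} & \frac{30\,\phi(\phi-5\psi)^3}{D_2} \\
0 & 0 & 0 & 0 & -\frac{3}{4\phi} & -\frac{\phi-\psi}{4\phi} & 0 \\
0 & 0 & 0 & 0 & 0 & -\frac{1}{2\phi} & -\frac{\phi-\psi}{4\phi} \\
0 & 0 & 0 & 0 & -\frac{125(\phi-\psi)(\phi-5\psi)}{4 D_2} & \frac{175(\phi-\psi)(\phi-5\psi)^2}{4 D_2} & -\frac{1}{4\phi} - \frac{15(\phi-\psi)(\phi-5\psi)^3}{2 D_2}
\end{pmatrix} , \tag{5.11}$$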

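in terms of the polynomials

$$\begin{aligned}
T_1 &= \phi\left( 8000 - \phi(\phi-5\psi)\,\psi\left( 61\phi^2 - 790\,\psi\phi + 2825\,\psi^2 \right) \right) - 16384\,\psi , \\
T_2 &= 57375\,\phi^2\psi^5 - 34000\,\phi^3\psi^4 + 7190\,\phi^4\psi^3 - 8\left( 79\phi^5 + 14336 \right)\psi^2 + \phi\left( 19\phi^5 + 95936 \right)\psi - 11200\,\phi^2 , \\
T_3 &= 22625\,\phi^2\psi^6 - 16325\,\phi^3\psi^5 + 4490\,\phi^4\psi^4 - 2\left( 293\phi^5 + 49152 \right)\psi^3 + \phi\left( 37\phi^5 + 112768 \right)\psi^2 - \phi^2\left( \phi^5 + 26624 \right)\psi + 1920\,\phi^3 ,
\end{aligned} \tag{5.12}$$

and the discriminants

$$D_1 = 1 - \psi^5 , \quad D_2 = \phi(\phi-5\psi)^4 - 256 . \tag{5.13}$$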

The discriminant locus, $D_1 = 0$, corresponds to the familiar conifold point of the bulk geometry, whereas the locus, $D_2 = 0$, describes again a singularity in the open sector. Namely the intersection locus of the two polynomials, $P$ and $Q$, fails to be transversal at some points in the ambient projective space, $\mathbb{CP}^4$. It is straightforward to check that the Gauss-Manin connection, $\nabla_\phi \equiv \partial_\phi - M_\phi$ and $\nabla_\psi \equiv \partial_\psi - M_\psi$, is integrable, i.e. the Gauss-Manin connection is flat, $[\nabla_\phi, \nabla_\psi] \equiv 0$. Integrability ensures that the associated open/closed Picard-Fuchs system of differential equations (3.18) for the relative periods is solvable. From the Gauss-Manin system (5.9) we extract three partial differential Picard-Fuchs operators of the form

$$\mathcal{L}_1 = \tilde{\mathcal{L}}_1\,\partial_\phi , \quad \mathcal{L}_2 = \tilde{\mathcal{L}}_2\,\partial_\phi , \quad \mathcal{L}_3 = \mathcal{L}_{\rm bulk} + \mathcal{L}_{\rm bdry} . \tag{5.14}$$


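Here the operators, $\tilde{\mathcal{L}}_1$ and $\tilde{\mathcal{L}}_2$, read

$$\begin{aligned}
\tilde{\mathcal{L}}_1 &= (\psi-\phi)\,\theta_\psi - 4\psi\,\theta_\phi - 3\,\psi , \\
\tilde{\mathcal{L}}_2 &= \partial_\psi^2\partial_\phi + \left( \frac{1}{4\phi} + \frac{15}{2D_2}(\phi-\psi)(\phi-5\psi)^3 \right)\partial_\psi^2 - \frac{175}{4D_2}(\phi-\psi)(\phi-5\psi)^2\,\partial_\psi + \frac{125}{4D_2}(\phi-\psi)(\phi-5\psi) ,
\end{aligned} \tag{5.15}$$

and the operators, $\mathcal{L}_{\rm bulk}$ and $\mathcal{L}_{\rm bdry}$, are

$$\begin{aligned}
\mathcal{L}_{\rm bulk} &= \theta_\psi(\theta_\psi-1)(\theta_\psi-2)(\theta_\psi-3) - \psi^5(\theta_\psi+1)^4 , \\
\mathcal{L}_{\rm bdry} &= \frac{\psi^4\phi}{16 D_2}\left( T_3\,\partial_\psi^2\partial_\phi + T_2\,\partial_\psi\partial_\phi + T_1\,\partial_\phi \right) .
\end{aligned} \tag{5.16}$$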

The solutions to the bulk Picard-Fuchs operator, Lbulk , in the vicinity of the orbifold point, ψ = 0, are determined by the hypergeometric functions 1 1 1 1 2 3 4 5 0 (ψ) = 4 F3 , , , ; , , ;ψ , 5 5 5 5 5 5 5 2 2 2 2 3 4 6 5 1 , , , ; , , ;ψ , (ψ) = ψ 4 F3 5 5 5 5 5 5 5 (5.17) 3 3 3 3 4 6 7 5 ψ2 2 (ψ) = , , , , ; , , ; ψ F 4 3 2! 5 5 5 5 5 5 5 3 4 4 4 4 6 7 8 5 ψ 3 (ψ) = − , , , , ; , , ; ψ F 4 3 3! 5 5 5 5 5 5 5 or in terms of the power series 5 ∞ k + α+1 cα 5 (ψ) = (5ψ)5k+α , α = 0, 1, 2, 3, 5 (5k + α + 1) α! 5α α+1 k=0 5 α

(5.18)

with radius of convergence $|\psi| < 1$. As in the previous example all the constants, $c_\alpha$, are one except for $c_3 = -1$. In order to calculate the relative periods resulting from semi-periods we proceed analogously to Sect. 4. That is to say we first solve the variational sub-system (2.19) governed by the two differential operators, $\tilde{\mathcal{L}}_1$ and $\tilde{\mathcal{L}}_2$. The operator, $\tilde{\mathcal{L}}_1$, constrains a solution of the sub-system to have the form

$$\chi(\psi,\phi) \equiv \phi^{-\frac{1}{4}}\,(5\psi-\phi)^2\,\lambda(u) \quad \text{with} \quad u \equiv \phi(5\psi-\phi)^4 , \tag{5.19}$$

whereas the second operator, $\tilde{\mathcal{L}}_2$, applied to this ansatz yields for the function, $\lambda(u)$, the ordinary differential equation

$$\left[ \theta_u\left( \theta_u + \tfrac{1}{2} \right)\left( \theta_u + \tfrac{1}{4} \right) - \frac{u}{256}\left( \theta_u + \tfrac{3}{4} \right)^3 \right] \lambda(u) = 0 . \tag{5.20}$$
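That the ansatz (5.19) is annihilated by the first-order operator $\tilde{\mathcal{L}}_1 = (\psi-\phi)\,\theta_\psi - 4\psi\,\theta_\phi - 3\psi$ for an arbitrary function $\lambda$ can be illustrated numerically. The Python sketch below approximates the derivatives by central differences; it assumes the logarithmic derivatives $\theta_\psi = \psi\,\partial_\psi$ and $\theta_\phi = \phi\,\partial_\phi$, and the sample point, step size and test functions are arbitrary choices made for this example.

```python
import math

def chi(psi, phi, lam):
    """Ansatz of Eq. (5.19) for a given function lam(u)."""
    u = phi * (5 * psi - phi) ** 4
    return phi ** (-0.25) * (5 * psi - phi) ** 2 * lam(u)

def apply_L1(f, psi, phi, h=1e-5):
    """(psi - phi) psi d/dpsi f - 4 psi phi d/dphi f - 3 psi f, by central differences."""
    d_psi = (f(psi + h, phi) - f(psi - h, phi)) / (2 * h)
    d_phi = (f(psi, phi + h) - f(psi, phi - h)) / (2 * h)
    return (psi - phi) * psi * d_psi - 4 * psi * phi * d_phi - 3 * psi * f(psi, phi)

def residual(lam, psi=0.4, phi=0.3):
    """Value of L1 applied to chi at a sample point; should vanish up to discretization error."""
    return apply_L1(lambda a, b: chi(a, b, lam), psi, phi)

TEST_FUNCTIONS = [math.sin, math.exp, lambda u: 1.0 + u * u]
RESIDUALS = [abs(residual(lam)) for lam in TEST_FUNCTIONS]
print(RESIDUALS)
```

The residuals are many orders of magnitude smaller than the individual terms of the operator, which reflects that $u = \phi(5\psi-\phi)^4$ is constant along the characteristics of $\tilde{\mathcal{L}}_1$ and that the prefactor $\phi^{-1/4}(5\psi-\phi)^2$ absorbs the inhomogeneous term $-3\psi$.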


A complete set of solutions to this differential equation of hypergeometric type reads 1 1 1 2 3 u 4 −1/2 , , ; , ; λ (u) = u , 3 F2 4 4 4 4 4 256 1 1 1 3 5 u (5.21) , , ; , ; , λ5 (u) = u −1/4 3 F2 2 2 2 4 4 256 3 3 3 5 6 u , , ; , ; . λ6 (u) = 3 F2 4 4 4 4 4 256 The power series of these solutions are convergent for |u| < 256 and are given by ∞ 4 k + 1+ −2 1 4+ 4 u k+ 4 , λ (u) = 4 = 0, 1, 2. (5.22) (4k + + 1) 1+ 4

k=0
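The hypergeometric structure of the solutions follows from a two-term recursion. For an operator of the form $p(\theta) - x^s q(\theta)$ acting on $\sum_j a_j x^{\sigma + js}$ the coefficients obey $a_{j+1}\,p(\sigma + (j+1)s) = a_j\,q(\sigma + js)$, provided $p(\sigma) = 0$. The Python sketch below applies this to the ordinary differential equation (5.20) and to the bulk operator $\mathcal{L}_{\rm bulk}$ of (5.16), using exact rational arithmetic. The series are normalized to leading coefficient one, which is an assumption of the sketch, and the function name `frobenius` is a hypothetical helper.

```python
from fractions import Fraction

def frobenius(p, q, start, step, terms):
    """Coefficients a_j of sum_j a_j x^(start + j*step) annihilated by p(theta) - x^step q(theta).

    Requires p(start) == 0 so that the lowest power drops out.
    The leading coefficient a_0 is set to one.
    """
    assert p(start) == 0
    coeffs = [Fraction(1)]
    s = Fraction(start)
    for _ in range(terms - 1):
        coeffs.append(coeffs[-1] * q(s) / p(s + step))
        s += step
    return coeffs

# Eq. (5.20): theta (theta + 1/2)(theta + 1/4) - (u/256)(theta + 3/4)^3
p_lam = lambda s: s * (s + Fraction(1, 2)) * (s + Fraction(1, 4))
q_lam = lambda s: (s + Fraction(3, 4)) ** 3 / 256

# L_bulk of Eq. (5.16): theta (theta-1)(theta-2)(theta-3) - psi^5 (theta + 1)^4
p_bulk = lambda n: n * (n - 1) * (n - 2) * (n - 3)
q_bulk = lambda n: (n + 1) ** 4

# Leading exponents -1/2, -1/4, 0 for lambda and 0, 1, 2, 3 for the bulk periods.
LAMBDA_SERIES = {s: frobenius(p_lam, q_lam, s, 1, 4)
                 for s in (Fraction(-1, 2), Fraction(-1, 4), Fraction(0))}
BULK_SERIES = {a: frobenius(p_bulk, q_bulk, a, 5, 4) for a in range(4)}

print(LAMBDA_SERIES[Fraction(0)])
print(BULK_SERIES[0])
```

The first non-trivial coefficients reproduce the hypergeometric ratios, for example $(3/4)^3 / \left( \tfrac{5}{4}\cdot\tfrac{6}{4}\cdot 256 \right) = 9/10240$ for the solution of (5.20) with leading exponent zero, and $(1/5)^4 / \left( \tfrac{2}{5}\cdot\tfrac{3}{5}\cdot\tfrac{4}{5} \right) = 1/120$ for the bulk solution with leading exponent zero.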

Due to the integrability of the open/closed Picard-Fuchs system (5.14) we are able to derive the three relative periods, $\Pi^{\hat\alpha}$, by integrating the relations, $\partial_\phi \Pi^{\hat\alpha} = \chi^{\hat\alpha}(\psi,\phi) = \phi^{-\frac{1}{4}}(5\psi-\phi)^2\,\lambda^{\hat\alpha}\!\left( \phi(5\psi-\phi)^4 \right)$. In this integration process we must not forget to take into account the possibility of non-trivial integration constants, $C^{\hat\alpha}(\psi)$. A detailed analysis, however, reveals that the periods

$$\Pi^{\hat\alpha}(\psi,\phi) = \int_0^\phi \zeta^{-\frac{1}{4}}\,(5\psi-\zeta)^2\,\lambda^{\hat\alpha}\!\left( \zeta(5\psi-\zeta)^4 \right) d\zeta , \tag{5.23}$$

indeed furnish the three linearly independent solutions to the whole open/closed Picard-Fuchs system (5.14). The power series of these solutions in the vicinity of the orbifold locus become

$$\begin{aligned}
\Pi^4(\psi,\phi) &= \frac{4}{\Gamma(1/4)^4} \sum_{k=0}^{\infty} \sum_{n=k}^{5k} \frac{(-1)^{n-k}\,\Gamma(k+1/4)^4}{(4n+1)\,(5k-n)!\,(n-k)!}\ \phi^{\,n+1/4}\,(5\psi)^{5k-n} , \\
\Pi^5(\psi,\phi) &= \frac{2}{\pi^2} \sum_{k=0}^{\infty} \sum_{n=k}^{5k+1} \frac{(-1)^{n-k+1}\,\Gamma(k+1/2)^4}{(2n+1)\,(5k-n+1)!\,(n-k)!}\ \phi^{\,n+1/2}\,(5\psi)^{5k-n+1} , \\
\Pi^6(\psi,\phi) &= \frac{8}{\Gamma(3/4)^4} \sum_{k=0}^{\infty} \sum_{n=k}^{5k+2} \frac{(-1)^{n-k}\,\Gamma(k+3/4)^4}{(4n+3)\,(5k-n+2)!\,(n-k)!}\ \phi^{\,n+3/4}\,(5\psi)^{5k-n+2} .
\end{aligned} \tag{5.24}$$

These expansions are convergent for $|\phi(5\psi-\phi)^4| < 256$.

5.3. Effective superpotential and domain wall tension. With the relative three-form periods (5.17) and (5.24) at hand we are now ready to extract distinguished flat open and closed coordinates so as to formulate the flat effective superpotential. The analysis is performed in the vicinity of the $\mathbb{Z}_5$-orbifold point and therefore parallels our investigations in Sect. 4. Hence for additional explanations we again refer the reader to the previous section. So as to determine the flat coordinates, the prepotential and the effective superpotential, we apply the same criteria discussed thoroughly in the context of the previous example. Namely we require that the flat periods must not be singular at the orbifold


point, $\psi = \phi = 0$, and furthermore the flat periods should furnish one-dimensional representations with respect to the $\mathbb{Z}_5$-monodromy group at the orbifold locus. These requirements yield, up to numerical constants, the open/closed flat coordinates

$$t(\psi) = \frac{\Pi^1(\psi)}{\Pi^0(\psi)} , \quad \hat t(\psi,\phi) = \frac{\Pi^4(\psi,\phi)}{\Pi^0(\psi)} , \tag{5.25}$$

the prepotential, $F(t)$, and the effective superpotential, $W(t,\hat t\,)$,

$$F(t) = \frac{1}{2} \left. \frac{\Pi^3(\psi)}{\Pi^0(\psi)} \right|_{\psi=\psi(t)} + \frac{t}{2} \left. \frac{\Pi^2(\psi)}{\Pi^0(\psi)} \right|_{\psi=\psi(t)} , \quad W(t,\hat t\,) = \left. \frac{\Pi^5(\psi,\phi)}{\Pi^0(\psi)} \right|_{\psi=\psi(t),\ \phi=\phi(t,\hat t\,)} , \tag{5.26}$$

in terms of the open/closed mirror maps. The first few terms in the expansion of the closed string mirror map read

$$\psi(t) = t - \frac{13}{360}\,t^6 - \frac{31\,991}{9\,979\,200}\,t^{11} - \frac{294\,146\,129}{326\,918\,592\,000}\,t^{16} + \cdots . \tag{5.27}$$
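The closed-string mirror map and the prepotential can be reproduced from the bulk periods with exact power series arithmetic. The Python sketch below builds the periods $\Pi^\alpha(\psi)$, $\alpha = 0,\dots,3$, from the recursion implied by $\mathcal{L}_{\rm bulk}$, forms $t = \Pi^1/\Pi^0$, inverts it, and evaluates the combination for $F(t)$ in (5.26). It assumes that the periods are normalized with leading term $c_\alpha\,\psi^\alpha/\alpha!$, as in the hypergeometric form (5.17), with $c_3 = -1$; all helper names and the truncation order are choices made for this example.

```python
from fractions import Fraction
from math import factorial

ORDER = 13  # series are truncated after the power ORDER

def zero():
    return [Fraction(0)] * (ORDER + 1)

def mul(a, b):
    """Truncated product of two power series."""
    out = zero()
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(ORDER + 1 - i):
            out[i + j] += ai * b[j]
    return out

def inverse(a):
    """Multiplicative inverse of a series with a[0] != 0."""
    out = zero()
    out[0] = 1 / a[0]
    for n in range(1, ORDER + 1):
        out[n] = -sum(a[k] * out[n - k] for k in range(1, n + 1)) / a[0]
    return out

def compose(outer, inner):
    """outer(inner(x)) for a series inner without constant term (Horner scheme)."""
    out = zero()
    for coeff in reversed(outer):
        out = mul(out, inner)
        out[0] += coeff
    return out

def period(alpha):
    """Pi^alpha(psi) from the L_bulk recursion, leading term c_alpha psi^alpha / alpha!."""
    series = zero()
    n = alpha
    coeff = Fraction(-1 if alpha == 3 else 1, factorial(alpha))
    while n <= ORDER:
        series[n] = coeff
        coeff *= Fraction((n + 1) ** 4, (n + 2) * (n + 3) * (n + 4) * (n + 5))
        n += 5
    return series

def revert(t_of_psi):
    """psi(t) for t(psi) = psi + g(psi), by iterating psi -> t - g(psi)."""
    g = list(t_of_psi)
    g[1] -= 1
    identity = zero()
    identity[1] = Fraction(1)
    psi = identity
    for _ in range(ORDER):
        correction = compose(g, psi)
        psi = [identity[i] - correction[i] for i in range(ORDER + 1)]
    return psi

P0, P1, P2, P3 = (period(alpha) for alpha in range(4))
INV0 = inverse(P0)
PSI_OF_T = revert(mul(P1, INV0))             # closed-string mirror map, cf. (5.27)
R2 = compose(mul(P2, INV0), PSI_OF_T)        # Pi^2 / Pi^0 at psi = psi(t)
R3 = compose(mul(P3, INV0), PSI_OF_T)        # Pi^3 / Pi^0 at psi = psi(t)
T = zero()
T[1] = Fraction(1)
T_R2 = mul(T, R2)
F = [(R3[i] + T_R2[i]) / 2 for i in range(ORDER + 1)]   # prepotential, cf. (5.26)

print({n: c for n, c in enumerate(PSI_OF_T) if c})
print({n: c for n, c in enumerate(F) if c})
```

With this normalization the first terms of the mirror map come out as $t - \tfrac{13}{360}t^6 - \tfrac{31\,991}{9\,979\,200}t^{11}$, and the prepotential starts with $\tfrac{1}{6}t^3 + \tfrac{1}{1008}t^8$. Raising `ORDER` produces further terms at the cost of longer rational numbers.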

For the leading terms of the open mirror map we arrive at 1

1 125 3 9 25 125 4 5 1 5 tˆ + t tˆ − 21 t tˆ + 27 t tˆ − 36 t 2 tˆ13 4 480 2 ·3 2 · 27 2 · 13 5 1 t tˆ17 − 53 + 43 tˆ21 + · · · . 2 · 51 2 · 63

φ 4 (t, tˆ) =

(5.28)

These expansions are obtained by inverting the flat coordinates (5.25). Finally the holomorphic prepotential, $F(t)$, expressed in flat coordinates enjoys the expansion

$$F(t) = \frac{1}{6}\,t^3 + \frac{1}{1008}\,t^8 + \frac{1195}{10\,378\,368}\,t^{13} + \frac{6\,904\,357}{266\,765\,571\,072}\,t^{18} + \frac{43\,753\,160\,719}{5\,523\,935\,200\,616\,448}\,t^{23} + \ldots , \tag{5.29}$$

whereas the effective superpotential, $W(t,\hat t\,)$, yields

$$\begin{aligned}
W(t,\hat t\,) = {} & -\frac{5}{8}\,t\,\hat t^{\,2} + \frac{1}{6144}\,\hat t^{\,6} + \frac{5}{288}\,t^6\,\hat t^{\,2} - \frac{333}{2\,097\,152}\,t^5\,\hat t^{\,6} + \frac{5375}{14\,495\,514\,624}\,t^4\,\hat t^{\,10} \\
& - \frac{70\,625}{168\,843\,754\,340\,352}\,t^3\,\hat t^{\,14} + \frac{26\,875}{104\,972\,574\,127\,030\,272}\,t^2\,\hat t^{\,18} \\
& - \frac{8725}{106\,113\,814\,420\,103\,626\,752}\,t\,\hat t^{\,22} + \frac{103}{9\,442\,427\,122\,730\,076\,733\,440}\,\hat t^{\,26} + \cdots .
\end{aligned} \tag{5.30}$$

First we observe that only the square, $\hat t^{\,2}$, of the open flat coordinate, $\hat t$, enters the flat superpotential. This indicates that we really capture the product of deformations associated to the two individual branes, which at the critical locus wrap the D5-brane cycles, $C_\pm$, in Eq. (5.6) respectively. In order to compare with a single D5-brane component we introduce the superpotential, $\widetilde W$, which is given by

$$\widetilde W(t,\tilde t\,) = W(t,\hat t^{\,2}) . \tag{5.31}$$

The leading terms of the superpotential, $\widetilde W$, coincide with the deformation superpotential computed in ref. [59]. Furthermore the modified superpotential yields the expected critical points, $\phi_\pm^{1/2}(t,\tilde t\,) = \pm\sqrt{5\psi(t)}$, with respect to the open-string coordinate, $\tilde t$.


(0,1)

Table 2. The table lists some orbifold disk invariants, Nk,n , for the analyzed D5-brane geometry in the (t, t˜). These invariants are normalized such that mirror quintic, which are extracted from the superpotential, W (0,1) N1,1 = 1, and are ambiguous up to the normalizations of the open- and closed-string coordinates, t and t˜ (0,1)

n=0 1 2

k=1 0 1 0

2 0 0 0

3 1 − 640 0 0

4 0 0 0

3

0

0

0

0

0

0

0

0

4

0

0

0

0

− 5375 13

0

0

0

0

5

0

0

0

0

0

0

0

0

6

1 − 36

0

999 217 ·5

0

0

0

0

0

0

0

7

0

0

0

0

0

0

0

0

8

0

0

0

0

0

0

7 ·55 819 − 5 40

55 ·7 011 679 242 ·1989

0

0

Nk,n

5 0 0 0

6 0 0 0

2 ·9

7 0 0 0

8 0 0 0

70 625 229 ·39

2 ·273

9 0 0 564 375 234 ·221

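Analogously to the previous example we extract the domain wall tension period, $\tau(\psi)$, in the vicinity of the orbifold locus by evaluating the superpotential period, $\Pi^5$, at the critical points

$$\tau(\psi) = \Pi^5(\psi,\phi)\Big|_{\phi^{1/2} = +\sqrt{5\psi}} - \Pi^5(\psi,\phi)\Big|_{\phi^{1/2} = -\sqrt{5\psi}} = -\frac{2}{5\psi\,\pi^2} \sum_{k=0}^{\infty} \frac{\Gamma(k+1/2)^5}{\Gamma(5k+5/2)} \left( (5\psi)^5 \right)^{k+1/2} . \tag{5.32}$$

Using the identity, $\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin(\pi z)}$, we arrive at the expression

$$\tau(\psi) = -\frac{2\pi^2}{5\psi} \sum_{k=0}^{\infty} \frac{\Gamma(-5k-3/2)}{\Gamma(-k+1/2)^5} \left( \frac{1}{(5\psi)^5} \right)^{-k-1/2} , \tag{5.33}$$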

which is a solution of the inhomogeneous Picard-Fuchs equation Lbulk τ (ψ) = −

3π 4 10ψ

(5ψ)5 .

(5.34)

Thus we find again agreement with the domain wall period computed in ref. [27], which can be seen by either comparing the inhomogeneous Picard-Fuchs equations or by directly matching the domain wall periods at the orbifold point. Finally we have collected the orbifold disk invariants of the superpotential, $\widetilde W(t,\tilde t\,)$, in Table 2. As before these invariants are defined up to numerical normalizations of the flat coordinates and the effective superpotential. Since these invariants are extracted from the superpotential, $\widetilde W(t,\tilde t\,)$, in terms of the open-string variable, $\tilde t$, it is tempting to identify them with the obstruction disk invariants associated to a single D5-brane component. However, in order to substantiate this claim a better understanding of the relationship of the D5-brane divisor, $V$, to its individual D5-brane components is necessary. We should also remark that the open/closed Picard-Fuchs equations in the vicinity of the large complex structure point exhibit similar features as the open/closed Picard-Fuchs equations of the previous example as discussed in Sect. 4.5.


6. Conclusions In this paper we have provided new techniques to calculate effective superpotentials for the N = 1 low energy effective theory of type IIB Calabi-Yau compactifications with D5-branes and fluxes. These superpotentials depend on both open- and closed-string chiral fields associated to D5-brane deformations and complex structure deformations of the internal Calabi-Yau manifold respectively. For supersymmetric configurations, i.e. at the critical points of the superpotentials, these neutral chiral fields become massless and correspond to obstructed moduli fields. Thus geometrically the effective superpotentials capture the obstructions in the open-/closed-string moduli space. Analogously to D5-branes in local Calabi-Yau spaces discussed in refs. [24,25], we have expressed the superpotentials in terms of relative periods of D5-brane boundary divisors in compact Calabi-Yau threefolds. We have demonstrated that in the context of compact Calabi-Yau spaces these relative periods are attainable from a particular type of residue integrals. As for non-compact geometries these relative periods are governed by the underlying N = 1 special geometry [34]. This structure allowed us to parametrize the open/closed moduli space in terms of flat coordinates, which we also used to express the effective superpotential. The effective superpotentials in flat coordinates are interesting for several reasons. First of all we have analyzed the effective superpotential in the framework of the topological B-model, and therefore the flat superpotential becomes the disk partition function of the topological A-model of the mirror D6-brane configuration in the mirror Calabi-Yau geometry [14,15]. Thus at special points in the open/closed moduli space, namely at points where a distinguished set of flat coordinates can be determined, the flat superpotential encodes enumerative disk invariants of the A-model quantum geometry. 
So far most computational techniques, which are used to extract disk invariants, are mainly limited to D-branes in local Calabi-Yau configurations, whereas our approach is in particular suitable for D-branes in compact geometries. On the other hand our methods are potentially useful in the context of string phenomenology. The interplay of the bulk geometry with D-branes and background fluxes is a crucial ingredient in constructing phenomenologically viable models. Therefore by providing a handle on the effective superpotential beyond the qualitative level we possibly get new insights into the vacuum structure of type II string compactifications with branes and background fluxes. Moreover the ability to reliably compute non-perturbative D-brane superpotentials might also shed light on aspects of dynamical supersymmetry breaking. In this work we have also applied our methods to two examples explicitly. Our first example constituted a certain D5-brane configuration embedded in the mirror of the degree-eight hypersurface in the weighted projective space, $\mathbb{WP}^4_{(1,1,1,1,4)}$. This setup provides for one closed-string complex structure modulus and one open-string brane modulus, and it is directly related to the geometries discussed in refs. [29,30]. We derived the associated open/closed Picard-Fuchs partial differential equations and showed their integrability. Then we solved this system of differential equations in the vicinity of the orbifold point of the open/closed string moduli space. From the solutions we extracted a uniquely distinguished set of flat open-/closed-string coordinates together with their effective superpotential. As for local Calabi-Yau geometries in refs. [16–18], we determined (up to overall normalizations) from this superpotential a tower of disk orbifold invariants of the corresponding mirror geometry in the topological A-model. 
The resulting superpotential reproduces the correct critical locus in agreement with the leading order behavior of the superpotential computed by matrix-factorization


techniques in ref. [29]. With our methods, however, we were able to obtain the subleading corrections encoded in the flat coordinates. Finally by evaluating the superpotential at its two critical points we have calculated in the vicinity of the orbifold point the domain wall tension between the two supersymmetric D-brane configurations, which remarkably matches the results calculated by different means in refs. [29,30]. Namely, there the domain wall tension emerges as a solution of an inhomogeneous Picard-Fuchs equation. The inhomogeneous term comes from a three-chain integral, which needs to be computed analytically, whereas in our approach the domain wall tension is determined purely algebraically. As our second example we have examined a particular family of D5-branes in the mirror quintic threefold. This setup depends on one complex structure and one D5-brane modulus and is very similar to the first example. We again computed the effective superpotential in flat coordinates in the vicinity of the orbifold point, we extracted the corresponding orbifold disk invariants for the corresponding mirror configuration, and we determined the domain wall tension between two distinct supersymmetric D5-brane configurations in agreement with the results of refs. [27,28]. There are many open questions, which we have not addressed in this work. First of all the presented derivations of the open/closed Picard-Fuchs equations are rather tedious. However, the (quasi-)homogeneity of the defining polynomials of the Calabi-Yau space and of the D-brane divisor suggests that toric techniques might help to obtain the open/closed Picard-Fuchs equations more economically. For the two presented examples the open-string moduli are obstructed in the vicinity of the large complex structure locus. 
However, for configurations with open-string moduli that become unobstructed at the large complex structure point, the effective superpotential encodes large volume disk instantons of the topological A-model mirror geometry [12–14]. We expect that our methods are also applicable to such situations and therefore allow us to determine these integer invariants for D5-branes in suitable compact Calabi-Yau geometries. From the effective superpotential, which we have computed in the B-model, we have extracted orbifold disk invariants. It would be interesting to directly extract these equivariant invariants on the mirror A-model side by adequate localization techniques. Then the comparison with the topological A-model would also fix the overall normalization ambiguity of our B-model computation. Acknowledgements. We would like to thank Vincent Bouchard, Bogdan Florea, Shamit Kachru, Johanna Knapp, Wolfgang Lerche, Emanuel Scheidegger, and especially Peter Mayr for fruitful discussions and helpful correspondences. M. S. is grateful to the Simons Workshop in Mathematics and Physics 2008 for its stimulating atmosphere, where part of this work was completed. This work is supported by the Stanford Institute for Theoretical Physics and by the NSF Grant 0244728.

Appendix A. The Degree-Eight Hypersurface Example in WP4(1,1,1,1,4) /(Z8 )2 × Z2 In this appendix we give some further details that lead to the expressions for the connection matrices (4.20) and (4.21) for our first example. The basic idea is to extend the Griffiths-Dwork algorithm to the residue integrals of relative three forms. For ease of notation let us first introduce the abbreviations, u ≡ x1 x2 x3 x4 x5 and v = x1 x2 x3 x4 . Then together with Eqs. (4.1) and (4.4) we arrive at the simple relation v=

∂5 P Q − . 4ψ − φ 2(4ψ − φ)

(A.1)


H. Jockers, M. Soroush

In order to approach the problem systematically we need to find relations among all derivatives of the relative holomorphic three form (4.9) in the filtration (2.18). Let us first focus on the equations governing the sub-system (2.19). If we consider polynomial, Q, as another constraint besides the defining polynomial equation, P, then this defines a complete intersection manifold, and as a result the system of equations for the sub-system has to close by itself. Using the relation (A.1) and integrating by parts, which corresponds to adding an appropriate exact form (3.9), we find the following relations: 4φ ∂ ψ ∂φ ∂ 2 , 4ψ − φ φ 4φ 16ψ ∂φ3 + ∂ 2 , ∂ψ ∂φ2 (A.2) 4ψ − φ (4ψ − φ)2 φ 16φ 2 16φ 2 ∂ψ2 ∂φ ∂φ3 + ∂ 2 . 2 (4ψ − φ) (4ψ − φ)3 φ If we differentiate the relative three form, , once more, then according to the variational diagram (2.18), these derivatives cannot be independent anymore, but instead must be expressible in terms of the lower derivatives up to exact forms. Therefore, using again Eqs. (A.1) and (3.9), we find the following three relation among fourth-order derivatives: 4φ 32ψ 32ψ ∂3 + ∂ 2 , ∂4 + 4ψ − φ φ (4ψ − φ)2 φ (4ψ − φ)3 φ 4φ 16ψ 16φ 16(4ψ + φ) 2 ∂ψ2 ∂φ2 ∂ψ ∂φ2 − ∂3 − ∂ , ∂ψ ∂φ3 + 4ψ − φ (4ψ −φ)2 (4ψ −φ)2 φ (4ψ −φ)3 φ 4φ 32 φ 128 φ ∂ψ3 ∂φ ∂ψ ∂φ2 + ∂ 2 . (A.3) ∂ψ2 ∂φ2 − 2 4ψ − φ (4ψ − φ) (4ψ − φ)3 φ ∂ψ ∂φ3

However, in order to eventually close the sub-system we need one more non-trivial relation involving the fourth-order derivatives of the relative form. To get this last equation we first observe the following algebraic equation holds:

$$\left(1 - (2\psi)^8\right) u^3 v = I_1 + I_2 + I_3 + I_4 + I_5 + I_6 + I_7 + I_8, \tag{A.4}$$

where polynomials, $I_j$, are given by

$$\begin{aligned}
I_1 &= \tfrac{1}{2}\, v^4 x_5^2\, \partial_5 P, & I_2 &= 2\psi\, v^5 x_5\, \partial_5 P,\\
I_3 &= 8\psi^2 v^6\, \partial_5 P, & I_4 &= 8\psi^3 (x_1 x_2 x_3)^7\, \partial_4 P,\\
I_5 &= 8\psi^4 (x_1 x_2)^8 x_3 x_5\, \partial_3 P, & I_6 &= 8\psi^5 x_1^9 x_2^2 x_3 x_4 x_5^2\, \partial_2 P,\\
I_7 &= 8\psi^6 x_1^3 (x_2 x_3 x_4)^2 x_5^3\, \partial_1 P, & I_8 &= 32\psi^7 u^3\, \partial_5 P.
\end{aligned} \tag{A.5}$$
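The decomposition in Eqs. (A.4) and (A.5) is a polynomial identity, so it can be spot-checked with exact rational arithmetic. The sketch below does this under one assumption that is not stated in this appendix: it takes the defining polynomial to be P = x1^8 + x2^8 + x3^8 + x4^8 + x5^2 − 8ψ x1x2x3x4x5, which is the form of Eq. (4.1) that makes the identity close. The function name `a4_sides` is purely illustrative.

```python
from fractions import Fraction
from random import Random


def a4_sides(x, psi):
    """Return (lhs, rhs) of Eq. (A.4) evaluated at x = (x1, ..., x5)."""
    x1, x2, x3, x4, x5 = x
    v = x1 * x2 * x3 * x4
    u = v * x5
    # Partial derivatives of the ASSUMED polynomial
    #   P = x1^8 + x2^8 + x3^8 + x4^8 + x5^2 - 8*psi*u.
    d1 = 8 * x1**7 - 8 * psi * x2 * x3 * x4 * x5
    d2 = 8 * x2**7 - 8 * psi * x1 * x3 * x4 * x5
    d3 = 8 * x3**7 - 8 * psi * x1 * x2 * x4 * x5
    d4 = 8 * x4**7 - 8 * psi * x1 * x2 * x3 * x5
    d5 = 2 * x5 - 8 * psi * v
    # The eight polynomials I_1, ..., I_8 of Eq. (A.5).
    terms = [
        Fraction(1, 2) * v**4 * x5**2 * d5,
        2 * psi * v**5 * x5 * d5,
        8 * psi**2 * v**6 * d5,
        8 * psi**3 * (x1 * x2 * x3) ** 7 * d4,
        8 * psi**4 * (x1 * x2) ** 8 * x3 * x5 * d3,
        8 * psi**5 * x1**9 * x2**2 * x3 * x4 * x5**2 * d2,
        8 * psi**6 * x1**3 * (x2 * x3 * x4) ** 2 * x5**3 * d1,
        32 * psi**7 * u**3 * d5,
    ]
    lhs = (1 - (2 * psi) ** 8) * u**3 * v
    return lhs, sum(terms)


# Exact comparison at a few random rational points.
rng = Random(0)
for _ in range(5):
    point = [Fraction(rng.randint(-9, 9), rng.randint(1, 9)) for _ in range(5)]
    psi = Fraction(rng.randint(-9, 9), rng.randint(1, 9))
    lhs, rhs = a4_sides(point, psi)
    assert lhs == rhs
```

Agreement at random rational points is evidence for the identity rather than a proof; the sum telescopes term by term, which is why each I_j pairs with a different partial derivative of P.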

If we now use Eq. (3.9) repeatedly, after a long and tedious computation we obtain 1 − (2ψ)8 ∂ψ3 ∂φ −128φ(φ − 8ψ)ψ 3 ∂φ 192ψ 2 48ψ 8 3 3 2 4 ∂φ2 −16 − + + 19φ ψ − 228φ ψ (4ψ − φ)4 (4ψ − φ)3 (4ψ − φ)2 192ψ 32 4 2 3 3 ∂φ3 −16ψ − + 9φ ψ − 144φ ψ (4ψ − φ)3 (4ψ − φ)2

Effective Superpotentials for Compact D5-Brane Calabi-Yau Geometries


2ψ 1 2 5 ∂ψ ∂φ2 − 60φ − ψ (4ψ − φ)2 (4ψ − φ) 64 5 4 2 ∂φ4 + 128ψ 6 (5φ + 4ψ)∂ψ2 ∂φ − 16ψ 2 + φ ψ − 20φ ψ (4ψ − φ)2 1 − 6φ 3 ψ 4 ∂ψ ∂φ3 − 4 64φψ 6 (ψ − φ) + 1 ∂ψ2 ∂φ2 . (A.6) −64ψ 4ψ − φ

+ 1600φψ 5 ∂ψ ∂φ − 32

Taking Eqs. (A.3) and (A.6), we can now express all fourth (and all higher order) derivatives of the holomorphic relative three form in terms of the lower order derivatives. In particular, if we choose the basis (4.10) and express all derivatives in terms of this basis, then we exactly arrive at Eq. (4.13). We can then proceed and compute in terms of the chosen basis (4.10) the connection matrix, $M_\phi$, given in Eq. (4.21).

So far, we have closed the system of linear differential equations with respect to the open-string modulus, $\phi$. We also need to close the system with respect to the closed-string modulus, $\psi$. As in Eqs. (A.3) and (A.6) this is achieved by expressing the fourth $\psi$-derivative, $\partial_\psi^4$, of the relative three form in terms of lower order derivatives. Note that this is the only additional three form we need to consider, as all the other fourth order derivatives appear in Eqs. (A.3) and (A.6). If we multiply Eq. (A.4) by $x_5$, then we find

$$u^4 = \frac{1}{1 - (2\psi)^8} \sum_{j=1}^{8} x_5\, I_j. \tag{A.7}$$

Similarly to treatment of the variational sub-system, if we apply the relations (3.9) and (3.13) repeatedly in order to reduce the right-hand side of Eq (A.7) to lower degree, we find 15 5 1 − (2ψ)8 ∂ψ4 256 ψ 4 + 3 (1 + 256 ψ 8 )∂ψ − 2 (3 − 1280 ψ 8 )∂ψ2 ψ ψ 2 1 8 3 6 + (3 + 1280 ψ )∂ψ + 3 256 φ(2φ − ψ)ψ + 60 ∂φ + 1216 φ 3 ψ 3 ∂φ2 ψ ψ 4 φ(64φ(8φ − 7ψ)ψ 6 − 15) − 12ψ ∂ψ ∂φ + 576 φ 4 ψ 3 ∂φ3 + 2 ψ φ 8ψ 6 3 3 4 2 5 + 1664 φ ψ ∂ψ ∂φ + 8 + + − 192 φψ (ψ − φ) ∂ψ2 ∂φ φ2 φ ψ + 64 φ 5 ψ 3 ∂φ4 + 256 φ 4 ψ 4 ∂ψ ∂φ3 + 384 φ 3 ψ 5 ∂ψ2 ∂φ2 4 − 2 φ 2 + 4 φ ψ + 16 ψ 2 + 64 φ 3 ψ 6 (ψ − φ) ∂ψ3 ∂φ . φ

(A.8)

Note that if we set the open-string modulus, φ, and its derivatives, ∂φ, to zero, then we exactly recover the relevant equation for closed-string variation (3.6) of the holomorphic three form. Again, if we choose the basis (4.10) and rewrite everything in terms of this basis, then we arrive at the relation (4.18). Furthermore with Eq. (A.8) we can also compute the connection matrix, Mψ, presented in Eq. (4.20).


Appendix B. The Mirror Quintic Example

In this appendix we provide some additional computational details for the extended Griffiths-Dwork algorithm applied to our second example, i.e. D5-branes in the mirror of the quintic threefold. Let us first introduce the relevant residue integrals for the basis elements (5.8). They are given by

uk log Q(φ) , k = 0, 1, 2, 3, P(ψ)1+k ul v

, l = 0, 1, 2, = −l! 5l P(ψ)1+l Q(φ)

π3−k,3 = k! 5k π2−l,4

(B.1)

in terms of the polynomials, P(ψ) and Q(φ), defined in Eqs. (5.1) and (5.4). Analogously to the previous example the monomials, x1 x2 x3 x4 x5 and x1 x2 x3 x4 , are abbreviated by u and v respectively. In order to obtain the Gauss-Manin connection, we need to find the derivatives of the above basis elements with respect to both closed-string and open-string moduli. As in the other example, some of the derivatives are trivial and, in particular, the relations (4.12) are obviously also valid here. To find the other relations, we first realize that ∂5 P Q v= − , (B.2) ψ −φ 5(ψ − φ) and together with Eq. (3.9), we arrive at 4φ 3 ∂2 + ∂φ , ψ −φ φ ψ −φ 4φ 7ψ − 3φ 2 3 ∂φ3 + ∂φ + ∂φ , ∂ψ ∂φ2 2 ψ −φ (ψ − φ) (ψ − φ)2 16φ 2 4φ(9ψ − 5φ) 2 6(ψ + φ) ∂ψ2 ∂φ ∂φ3 + ∂φ + ∂φ . 2 3 (ψ − φ) (ψ − φ) (ψ − φ)3 ∂ψ ∂φ

(B.3)

With these relations we express on the level of cohomology all two forms in terms of the chosen basis elements {π2,4 , π1,4 , π0,4 }. That is to say, if we now differentiate any two form cohomology element one more time with respect to either one of the two moduli the result is entirely expressible in terms of the mentioned basis elements. Similarly to the previous example, with Eqs. (B.2) and (3.9) we obtain the following relations: 4φ 11 2 ∂φ4 + ∂φ3 + ∂ψ ∂φ2 , ψ −φ ψ −φ ψ −φ 4φ 1 6 ∂ψ ∂φ3 + ∂ 2 ∂φ + ∂ψ ∂φ2 , ∂ψ2 ∂φ2 ψ −φ ψ −φ ψ ψ −φ 4φ 1 ∂2 ∂2 + ∂ 2 ∂φ . ∂ψ3 ∂φ ψ −φ ψ φ ψ −φ ψ ∂ψ ∂φ3

(B.4)

To close the system of linear differential equations with respect to the open-string modulus, φ, we need one more equation. The last non-trivial equation is obtained by observing that

$$(1 - \psi^5)\, u^3 v = I_1 + I_2 + I_3 + I_4 + I_5, \tag{B.5}$$

where the polynomials, $I_j$, are given by

$$\begin{aligned}
I_1 &= \tfrac{1}{5}\, (x_2 x_3 x_4)^4 x_5^3\, \partial_1 P,\\
I_2 &= \tfrac{1}{5}\, \psi\, x_2 (x_3 x_4)^5 x_5^4\, \partial_2 P,\\
I_3 &= \tfrac{1}{5}\, \psi^2 x_1 x_2 x_3^2 x_4^6 x_5^5\, \partial_3 P,\\
I_4 &= \tfrac{1}{5}\, \psi^3 (x_1 x_2 x_3)^2 x_4^3 x_5^6\, \partial_4 P,\\
I_5 &= \tfrac{1}{5}\, \psi^4 u^3\, \partial_5 P.
\end{aligned} \tag{B.6}$$
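As for the first example, the algebraic relation in Eqs. (B.5) and (B.6) can be spot-checked numerically. The following sketch assumes that the polynomial of Eq. (5.1) has the standard mirror quintic form P = x1^5 + ... + x5^5 − 5ψ x1x2x3x4x5; this form is not written out in the appendix and is an assumption here. The name `b5_sides` is illustrative only.

```python
from fractions import Fraction
from random import Random


def b5_sides(x, psi):
    """Return (lhs, rhs) of Eq. (B.5) evaluated at x = (x1, ..., x5)."""
    x1, x2, x3, x4, x5 = x
    v = x1 * x2 * x3 * x4
    u = v * x5
    # Partial derivatives of the ASSUMED polynomial
    #   P = x1^5 + x2^5 + x3^5 + x4^5 + x5^5 - 5*psi*u.
    d1 = 5 * x1**4 - 5 * psi * x2 * x3 * x4 * x5
    d2 = 5 * x2**4 - 5 * psi * x1 * x3 * x4 * x5
    d3 = 5 * x3**4 - 5 * psi * x1 * x2 * x4 * x5
    d4 = 5 * x4**4 - 5 * psi * x1 * x2 * x3 * x5
    d5 = 5 * x5**4 - 5 * psi * v
    fifth = Fraction(1, 5)
    # The five polynomials I_1, ..., I_5 of Eq. (B.6).
    terms = [
        fifth * (x2 * x3 * x4) ** 4 * x5**3 * d1,
        fifth * psi * x2 * (x3 * x4) ** 5 * x5**4 * d2,
        fifth * psi**2 * x1 * x2 * x3**2 * x4**6 * x5**5 * d3,
        fifth * psi**3 * (x1 * x2 * x3) ** 2 * x4**3 * x5**6 * d4,
        fifth * psi**4 * u**3 * d5,
    ]
    lhs = (1 - psi**5) * u**3 * v
    return lhs, sum(terms)


# Exact comparison at a few random rational points.
rng = Random(1)
for _ in range(5):
    point = [Fraction(rng.randint(-9, 9), rng.randint(1, 9)) for _ in range(5)]
    psi = Fraction(rng.randint(-9, 9), rng.randint(1, 9))
    lhs, rhs = b5_sides(point, psi)
    assert lhs == rhs
```

The structure mirrors the degree-eight case: each I_j removes the monomial left over by I_{j-1}, and the last term returns a multiple of u^3 v, which produces the prefactor 1 − ψ^5.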

Thus with Eq. (B.5) we find for the derivative, ∂ψ3 ∂φ , 19 2 9 φ (φ − 5ψ)∂φ2 − φ 3 (φ − 5ψ)∂φ3 4 4 19 9 1 2 2 2 − ψφ(φ −5ψ)∂ψ ∂φ − ψφ (φ −5ψ)∂ψ ∂φ − ψ 9φ(φ −5ψ)+4ψ 2 ∂ψ2 ∂φ 4 2 4 1 4 3 1 − φ (φ −5ψ)∂φ4 − ψφ 3 (φ −5ψ)∂ψ ∂φ3 − ψ 2 φ 3φ(φ −5ψ)+16ψ 2 ∂ψ2 ∂φ2 4 4 4 1 3 − ψ φ(φ − ψ)∂ψ3 ∂φ . 4 (B.7) We should also need to close the linear system with respect to the closed-string modulus, ψ. To determine this last relevant equation, we first note that the following algebraic relation holds: 1 1 3 1 2 5 1 u x 1 ∂1 P − u x 1 x 2 ∂2 P − u (x1 x2 )5 x3 ∂3 P 1 − 5 u4 = − 5ψ 5ψ 5ψ ψ 1 1 4 v ∂5 P. (x1 x2 x3 )5 x4 ∂4 P − − (B.8) 5ψ 5ψ (1 − ψ 5 ) ∂ψ3 ∂φ −2φ (φ − 5ψ)∂φ −

We now use Eqs. (3.9) and (3.13) to express all the residue integrals resulting from the right-hand side of Eq. (B.8) in terms of derivatives of the holomorphic relative three form, . After a long computation we obtain ∂ψ4

ψ5 ψ +15ψ 2 ∂ψ +25ψ 3 ∂ψ2 +10ψ 4 ∂ψ3 + 15ψφ ∂φ + 25ψφ 2 ∂φ2 1−ψ 5

+ 10ψφ 3 ∂φ3 + 50φψ 2 ∂ψ ∂φ + 30φ 2 ψ 2 ∂ψ ∂φ2 + 30φψ 3 ∂ψ2 ∂φ

+ φ 4 ψ∂φ4 + 4φ 3 ψ 2 ∂ψ ∂φ3 + 6φ 2 ψ 3 ∂ψ2 ∂φ2 + 4(φψ 4 − 1)∂ψ3 ∂φ . (B.9) Now we have a complete system of linear differential equations at hand, which allows us to determine the needed linear combinations of differentiated basis elements with respect to both open- and closed-string moduli. The non-trivial derivatives of the two-form basis elements are given by


ψ −φ 3 π1,4 − π2,4 , 4φ 4φ ψ −φ 1 ∂φ π1,4 π0,4 − π1,4 , 4φ 2φ 125(φ − ψ)(φ − 5ψ) 175(φ − ψ)(φ − 5ψ)2 ∂φ π0,4 − π2,4 + π1,4 4D2 4D2 15(φ − ψ)(φ − 5ψ)3 1 π0,4 , + − 4φ 2D2

∂φ π2,4

∂ψ π0,4

125 φ(φ − 5ψ) 175φ(φ − 5ψ)2 30 φ(φ − 5ψ)3 π2,4 − π1,4 + π0,4 . D2 D2 D2 (B.10)

There are also two non-trivial derivative relations in the three-form sector, namely the derivatives of the basis element, π0,3 , with respect to both the open- and closed-string moduli. After going through another long calculation, we find the relations ∂φ π0,3

125 φ(φ − 5ψ) 175φ(φ − 5ψ)2 30 φ(φ − 5ψ)3 π2,4 − π1,4 + π0,4 , D2 D2 D2 (B.11)

and ∂ψ π0,3

ψ 15ψ 2 25ψ 3 10ψ 4 π3,3 + π2,3 + π1,3 + π0,3 D1 D1 D1 D1 φ T1 φ T2 φ T3 − π2,4 − π1,4 − π0,4 . 16D1 D2 16D1 D2 16D1 D2

(B.12)

Here the polynomials T1 , T2 , T3 , and the discriminants, D1 and D2 , are respectively defined in Eqs. (5.12) and (5.13). With Eqs. (B.10), (B.11), and (B.12), we can now easily extract the connection matrices, Mφ and Mψ , which were presented in (5.10) and (5.11). References 1. Polchinski, J.: Dirichlet-Branes and Ramond-Ramond Charges. Phys. Rev. Lett. 75, 4724 (1995) 2. Kontsevich, M.: Homological Algebra of Mirror Symmetry. http://arXiv.org/abs/alg-geom/9411018v1, 1994 3. Hori, K., Katz, S., Klemm, A., Pandharipande, R., Thomas, R., Vafa, C., Vakil, R., Zaslow, E.: Mirror Symmetry. Providence RI: Amer. Math. Soc., 2003 4. Witten, E.: Mirror manifolds and topological field theory. http://arXiv.org/abs/hep-th/9112056v1, 1991 5. Antoniadis, I., Gava, E., Narain, K.S., Taylor, T.R.: Topological amplitudes in string theory. Nucl. Phys. B 413, 162 (1994) 6. Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311 (1994) 7. Douglas, M.R.: D-branes, categories and N = 1 supersymmetry. J. Math. Phys. 42, 2818 (2001) 8. Lazaroiu, C.I.: Generalized complexes and string field theory. JHEP 0106, 052 (2001) 9. Aspinwall, P.S., Lawrence, A.E.: Derived categories and zero-brane stability. JHEP 0108, 004 (2001) 10. Lazaroiu, C.I.: D-brane categories. Int. J. Mod. Phys. A 18, 5299 (2003) 11. Aspinwall, P.S.: D-branes on Calabi-Yau manifolds. http://arXiv.org/abs/hep-th/0403166v1, 2004 12. Witten, E.: Chern-Simons Gauge Theory as a String Theory. Prog. Math. 133, 637 (1995) 13. Kachru, S., Katz, S.H., Lawrence, A.E., McGreevy, J.: Open string instantons and superpotentials. Phys. Rev. D 62, 026001 (2000) 14. Ooguri, H., Vafa, C.: Knot invariants and topological strings. Nucl. Phys. B 577, 419 (2000)


15. Labastida, J.M.F., Mariño, M.: Polynomial invariants for torus knots and topological strings. Commun. Math. Phys. 217, 423 (2001) 16. Bouchard, V., Klemm, A., Mariño, M., Pasquetti, S.: Remodeling the B-model. http://arXiv.org/abs/0709. 1453v1[hep-th], 2007 17. Brini, A., Tanzini, A.: Exact results for topological strings on resolved Y(p,q) singularities. http://arXiv. org/abs/0804.2598v3[hep-th], 2008 18. Bouchard, V., Klemm, A., Mariño, M., Pasquetti, S.: Topological open strings on orbifolds. http://arXiv. org/abs/0807.0597v1[hep-th], 2008 19. Coates, T., Corti, A., Iritani, H., Tseng, H.-H.: Computing Genus-Zero Twisted Gromov-Witten Invariants. http://arXiv.org/abs/math/0702234v3[math.AG], 2007 20. Bayer, A., Cadman, C.: Quantum cohomology of [Cn /µr ]. http://arXiv.org/abs/0705.2160v1[math.AG], 2007 21. Bouchard, V., Cavalieri, R.: On the mathematics and physics of high genus invariants of C3 /Z3 . http:// arXiv.org/abs/0709.1453v1[math.AG], 2007 22. Aganagic, M., Vafa, C.: Mirror symmetry, D-branes and counting holomorphic discs. http://arXiv.org/ abs/hep-th/0012041v1, 2000 23. Aganagic, M., Klemm, A., Vafa, C.: Disk instantons, mirror symmetry and the duality web. Z. Naturforsch. A 57, 1 (2002) 24. Lerche, W., Mayr, P., Warner, N.: Holomorphic N = 1 special geometry of open-closed type II strings. http://arXiv.org/abs/hep-th/0207259v2, 2002 25. Lerche, W., Mayr, P., Warner, N.: N = 1 special geometry, mixed Hodge variations and toric geometry. http://arXiv.org/abs/hep-th/0208039v1, 2002 26. Aganagic, M., Klemm, A., Mariño, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005) 27. Walcher, J.: Opening mirror symmetry on the quintic. Commun. Math. Phys. 276, 671 (2007) 28. Morrison, D.R., Walcher, J.: D-branes and Normal Functions. http://arXiv.org/abs/0709.4028v1[hep-th], 2007 29. Knapp, J., Scheidegger, E.: Towards Open String Mirror Symmetry for One-Parameter Calabi-Yau Hypersurfaces. 
http://arXiv.org/abs/0805.1013v2[hep-th], 2008 30. Krefl, D., Walcher, J.: Real Mirror Symmetry for One-parameter Hypersurfaces. JHEP 0809, 031 (2008) 31. Taylor, T.R., Vafa, C.: RR flux on Calabi-Yau and partial supersymmetry breaking. Phys. Lett. B 474, 130 (2000) 32. Gukov, S., Vafa, C., Witten, E.: CFT’s from Calabi-Yau four-folds. Nucl. Phys. B 584, 69 (2000) [Erratum-ibid. B 608, 477 (2001)] 33. Witten, E.: Branes and the dynamics of QCD. Nucl. Phys. B 507, 658 (1997) 34. Mayr, P.: N = 1 mirror symmetry and open/closed string duality. Adv. Theor. Math. Phys. 5, 213 (2002) 35. Cachazo, F., Intriligator, K.A., Vafa, C.: A large N duality via a geometric transition. Nucl. Phys. B 603, 3 (2001) 36. Kachru, S., Katz, S.H., Lawrence, A.E., McGreevy, J.: Mirror symmetry for open strings. Phys. Rev. D 62, 126005 (2000) 37. Karoubi, M., Leruste, C.: Algebraic topology via differential geometry. Cambridge: Cambridge University Press, 1987 38. Deligne, P.: Théorie de Hodge II. Publ. Math. I.H.E.S. 40, 5 (1971) 39. Voisin, C.: Hodge Theory and Complex Algebraic Geometry II. Cambridge: Cambridge University Press, 2003 40. Griffiths, P.: On the periods of certain rational integrals: I. Ann. Math. 90, 460 (1969) 41. Candelas, P.: Yukawa couplings between (2,1) forms. Nucl. Phys. B 298, 458 (1988) 42. Lerche, W., Smit, D.J., Warner, N.P.: Differential equations for periods and flat coordinates in two-dimensional topological matter theories. Nucl. Phys. B 372, 87 (1992) 43. Libgober, A., Teitelbaum, J.: Lines on Calabi-Yau complete intersections, mirror symmetry, and Picard Fuchs equations. Int. Math. Res. Not. 1993, 13 (1993) 44. Griffiths, P.: A theorem concerning the differential equations satisfied by normal functions associated to algebraic cycles. Am. J. Math. 101, 94 (1979) 45. Deligne, P.: Théorie de Hodge III. Publ. Math. I.H.E.S. 55, 5 (1974) 46. Cecotti, S., Vafa, C.: Topological antitopological fusion. Nucl. Phys. B 367, 359 (1991) 47. 
Lerche, W., Vafa, C., Warner, N.P.: Chiral Rings in N = 2 Superconformal Theories. Nucl. Phys. B 324, 427 (1989) 48. Strominger, A.: Special Geometry. Commun. Math. Phys. 133, 163 (1990) 49. Aganagic, M., Bouchard, V., Klemm, A.: Topological Strings and (Almost) Modular Forms. Commun. Math. Phys. 277, 771 (2008) 50. Klemm, A., Theisen, S.: Considerations of one modulus Calabi-Yau compactifications: Picard-Fuchs equations, Kahler potentials and mirror maps. Nucl. Phys. B 389, 153 (1993)

51. Greene, B.R., Plesser, M.R.: Duality in Calabi-Yau moduli space. Nucl. Phys. B 338, 15 (1990) 52. Rainville, E.: Special Functions. New York: The Macmillan Company, 1960 53. Witten, E.: Phases of N = 2 theories in two dimensions. Nucl. Phys. B 403, 159 (1993) 54. Intriligator, K.A., Vafa, C.: Landau-Ginzburg Orbifolds. Nucl. Phys. B 339, 95 (1990) 55. Dijkgraaf, R., Verlinde, H.L., Verlinde, E.P.: Topological Strings in D < 1. Nucl. Phys. B 352, 59 (1991) 56. Candelas, P., De La Ossa, X., Font, A., Katz, S.H., Morrison, D.R.: Mirror symmetry for two parameter models. I. Nucl. Phys. B 416, 481 (1994) 57. Candelas, P., Font, A., Katz, S.H., Morrison, D.R.: Mirror symmetry for two parameter models. 2. Nucl. Phys. B 429, 626 (1994) 58. Candelas, P., De La Ossa, X.C., Green, P.S., Parkes, L.: A pair of Calabi-Yau manifolds as an exactly soluble superconformal theory. Nucl. Phys. B 359, 21 (1991) 59. Hori, K., Walcher, J.: F-term equations near Gepner points. JHEP 0501, 008 (2005)

Communicated by A. Kapustin

Commun. Math. Phys. 290, 291–319 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0829-x

Communications in

Mathematical Physics

The Structure of Renormalization Hopf Algebras for Gauge Theories I: Representing Feynman Graphs on BV-Algebras Walter D. van Suijlekom Institute for Mathematics, Astrophysics and Particle Physics, Faculty of Science, Radboud University Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands. E-mail: [email protected] Received: 28 August 2008 / Accepted: 30 January 2009 Published online: 23 May 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: We study the structure of renormalization Hopf algebras of gauge theories. We identify certain Hopf subalgebras in them, whose character groups are semidirect products of invertible formal power series with formal diffeomorphisms. This can be understood physically as wave function renormalization and renormalization of the coupling constants, respectively. After taking into account the Slavnov–Taylor identities for the couplings as generators of a Hopf ideal, we find Hopf subalgebras in the corresponding quotient as well. In the second part of the paper, we explain the origin of these Hopf ideals by considering a coaction of the renormalization Hopf algebras on the Batalin–Vilkovisky (BV) algebras generated by the fields and coupling constants. The so-called classical master equation satisfied by the action in the BV-algebra implies the existence of the above Hopf ideals in the renormalization Hopf algebra. Finally, we exemplify our construction by applying it to Yang–Mills gauge theory.

1. Introduction

The mathematical formulation of quantum gauge theories forms one of the great challenges in mathematical physics. Recently, the perturbative structure of quantum Yang–Mills gauge theories has been more and more understood. On the one hand, many rigorous results can be obtained [3,4] using cohomological arguments within the context of the BRST-formalism [8–10,43]. On the other hand, renormalization of perturbative quantum field theories has been carefully structured using Hopf algebras [17,18,31]. The presence of a gauge symmetry induces a rich additional structure on these Hopf algebras, as has been explored in [2,32,36] and in the author’s own work [40,42]. All of this work is based on the algebraic transparency of BPHZ-renormalization, with the Hopf algebra reflecting the recursive nature of this procedure. Nevertheless, there are two objections to this approach to perturbative quantum field theories.
Firstly, it is defined in momentum space and one is thus restricted to quantum field theories on flat spacetime and, secondly, it is defined as a graph-by-graph procedure


and not in terms of the – more physical – full Green’s functions. In this paper, we will address the second point and try to elucidate the Hopf algebraic structure on the level of Green’s functions in gauge theories. The first point has been addressed in the series of papers [16,19,29] and references therein, in the context of algebraic quantum field theories. The case of Yang–Mills theories was considered by Hollands in [28]. Interesting to note is that the Epstein–Glaser renormalization involved in this approach has an underlying Hopf algebraic structure as well [37] (see also [11] for rooted trees instead of Feynman graphs). The mainstream physics literature has taken a slightly different road to the perturbative approach to quantum gauge theories by putting functional integration techniques at its heart. Although the formal path integral manipulations provide a powerful technique, it is hard to overestimate the importance of a transparent algebraic description of perturbative quantum gauge theories. In this paper, we adopt the philosophy to put formal (perturbative) expansions in Feynman graphs as a starting point and try to (rigorously) derive results from that, avoiding functional techniques completely. For instance, the process of (BPHZ)-renormalization is captured by means of the Connes–Kreimer Hopf algebra. As we will explain in Sect. 2 below, this extends naturally to gauge theories. The Slavnov–Taylor identities for the couplings – the reminiscences of the gauge symmetry – are shown to be compatible with renormalization by establishing them as generators of a Hopf ideal in this Hopf algebra (Sect. 3). In the corresponding quotient Hopf algebra, we find certain Hopf subalgebras and we will show that their character group is a subgroup of the semidirect product of (invertible) formal power series with a formal diffeomorphism group. 
It should be noted that the existence of Hopf subalgebras has already been studied in the context of Dyson–Schwinger equations in [12,32] and in [21] for planar rooted trees. Also, our work reflects the semi-direct product considered in [24] for a scalar field theory, and [14,15] where the above groups of formal series and diffeomorphisms (and their noncommutative analogues) appear in the study of renormalization of quantum electrodynamics. Although the existence of the Hopf ideals can be established rigorously by a combinatorial proof, it is crucial to have a more conceptual understanding of their origin. This is what we do in the second part of the paper. Namely, we connect the renormalization Hopf algebras of gauge theories to the BV-algebras generated by the fields and coupling constants by making the latter a comodule BV-algebra over the Hopf algebras (Sect. 4). The induced action of the character group can be understood as wave function renormalization (invertible formal power series in k variables) and coupling constant renormalization (formal diffeomorphisms of Ck ). The origin of the Hopf ideals is clarified by identifying an ideal in the BV-algebra generated by the so-called master equation satisfied by the action. This ideal induces a Hopf ideal in the renormalization Hopf algebra which in the case of simple gauge theories coincides with the ideal generated by the Slavnov–Taylor identities for the couplings. On the level of the character group, the k parameters reduce to the subgroup in one parameter. This reflects the presence of a ‘fundamental coupling constant’ in such theories, in terms of which all other coupling constants can be expressed. We conclude this section by discussing the renormalization group and the beta-function in this context. In Sect. 5 we exemplify our construction by working out explicitly the case of Yang–Mills gauge theory, with a simple Lie group. 
The study of the effective action of gauge theories will be postponed to a second paper. In particular, we will study the Zinn–Justin equation from the Hopf-algebraic


point of view as well as its relation with the (classical) master equation (cf. Eq. (8) below) satisfied by the action.

2. Preliminaries on Hopf Algebras

Let us briefly recall the role played by Hopf algebras as coordinate rings on (affine) groups, while referring to [44] for a complete treatment. We will consider commutative algebras $H$ (over $\mathbb{C}$) for which the set of characters $G := \mathrm{Hom}_{\mathbb{C}}(H, \mathbb{C})$ is actually a group. The group structure on $G$ induces the structure of a Hopf algebra on $H$, that is, a counit $\eta : H \to \mathbb{C}$, a coproduct $\Delta : H \to H \otimes H$ and an antipode $S : H \to H$. These are all algebra maps and are supposed to satisfy certain compatibility conditions which we do not list here. Representations of $G$ correspond one-to-one to corepresentations of $H$. In fact, if $V$ is a $G$-module, then it is also a comodule over $H$, that is, there exists a map (called coaction) $\rho : V \to V \otimes H$ such that $gv = (1 \otimes g)\rho(v)$. If $V$ has additional structure, it is natural to require the coaction to respect this structure.

We will further restrict our study to connected graded Hopf algebras for which there is a grading $H = \bigoplus_{n \in \mathbb{N}} H^n$ that is respected by the product and the coproduct:

$$H^k H^l \subset H^{k+l}; \qquad \Delta(H^n) \subset \bigoplus_{k=0}^{n} H^k \otimes H^{n-k},$$

and such that $H^0 = \mathbb{C}1$. Dually, graded Hopf algebras correspond to (pro)-unipotent groups. We illustrate the above definition somewhat elaborately by discussing in the next two subsections two examples that are relevant in what follows.

2.1. Hopf algebra of Feynman graphs. We suppose that we have defined a (renormalizable) perturbative quantum field theory and specified the possible interactions between different types of fields. These fields are collected in a set $\Phi = \{\phi_1, \ldots, \phi_N\}$ whereas the different types of interactions – represented by vertices – constitute a set $R_V$. In the Lagrangian formalism, it is natural to associate to each vertex a local monomial in the fields (present in the Lagrangian); we will denote this map by $\iota : R_V \to \mathrm{Loc}(\Phi)$, where $\mathrm{Loc}(\Phi)$ is defined as the algebra of local polynomials in the fields $\phi_j$ (see Definition 22 below). Propagators, on the other hand, are indicated by edges and form a set $R_E$. Again, one assigns a monomial to each edge via $\iota : R_E \to \mathrm{Loc}(\Phi)$ but now $\iota(e)$ is of order 2 in the fields, involving precisely the field (and its conjugate in the case of fermions) that is propagating. We will assume that there are BRST-source terms present in the theory, which means that for each field $\phi_i$ there is a corresponding source field $K_{\phi_i}$ in $\Phi$. In other words, the set of fields is of the form $\Phi = \{\phi_1, \ldots, \phi_N, K_{\phi_1}, \ldots, K_{\phi_N}\}$. This even-dimensionality is a manifestation of the structure on the fields of a Gerstenhaber algebra which we will explore later.

294

W. D. van Suijlekom

Example 1. Quantum electrodynamics describes the interaction of charged particles such as electrons with photons, with corresponding fields ψ and A. Their propagation is usually indicated by a straight and a wiggly line (for the electron and photon, respectively). There is only the interaction of an electron emitting a photon: this is indicated by a vertex of valence three; the mass term for the electron is indicated by a vertex of valence two. The dynamical and interactive character of the theory can be summarized by the following sets,1

The corresponding monomials in Loc([Φ]) are

with e and m the electric charge and mass of the electron, respectively. Example 2. Quantum chromodynamics describes the strong interaction between quarks and gluons, described by the fields ψ and A, respectively (see Sect. 5 below for more details). These are indicated by straight and wiggly lines. In addition, associated to the non-abelian gauge symmetry (with symmetry group SU (3)) there is the so-called ghost field ω, indicated by dotted lines, as well as the BRST-sources K ψ , K A and K ω . Between the fields there are four interactions, three BRST-source terms, and a mass term for the quark. This leads to the following sets of vertices and edges:

with the dashed lines representing the BRST-source terms, and

Note that the dashed edges do not appear in $R_E$, i.e. the source terms do not propagate and in the following they will not appear as internal edges of a Feynman graph.

Remark 3. Although these examples motivate our construction, we stress that for what follows it is not necessary to specify the fields in the set $\Phi$ nor the vertices and edges in $R = R_V \cup R_E$ explicitly. The relevant structure is encoded by the map $\iota : R \to \mathrm{Loc}(\Phi)$. We note, however, that we make the following natural working assumptions:
1. Whenever a fermionic field, say $\psi$, interacts at a vertex $v \in R_V$ which does not involve a BRST-source, then $\iota(v)$ involves both $\psi$ and $\bar\psi$.
2. There is only one vertex for every BRST-source.
3. There are no valence two vertices involving two different fields (thus, still allowing mass terms).
Physically, the last condition means that we require order two polynomials other than mass terms in the Lagrangian not to be radiatively corrected.

1 We specify the type of fields that are involved in the interaction by drawing a small neighborhood around the vertex instead of merely a dot.

A Feynman graph is a graph built from the types of vertices present in $R_V$ and the types of edges present in $R_E$. Naturally, we demand edges to be connected to vertices in a compatible way, respecting the type of vertex and edge. As opposed to the usual definition in graph theory, Feynman graphs have no external vertices. However, they do have external lines which come from vertices in $\Gamma$ for which some of the attached lines remain vacant (i.e. no edge in $R_E$ attached). Implicit in the construction is the fact that source terms only arise as external lines since they are not in $R_E$, justifying the name 'source term'.

If a Feynman graph $\Gamma$ has two external lines, both corresponding to the same field, we would like to distinguish between propagators and mass terms. In more mathematical terms, since we have vertices of valence two, we would like to indicate whether a graph with two external lines corresponds to such a vertex, or to an edge. A graph $\Gamma$ with two external lines is dressed by a bullet when it corresponds to a vertex, i.e. we write $\Gamma_\bullet$. The above correspondence between Feynman graphs and vertices/edges is given by the residue $\mathrm{res}(\Gamma)$. It is defined as the vertex or edge the graph corresponds to after collapsing all its internal points. For example, we have:

For the definition of the Hopf algebra of Feynman graphs, we restrict to one-particle irreducible (1PI) Feynman graphs. These are graphs that are not trees and cannot be disconnected by cutting a single internal edge.

Definition 4 (Connes–Kreimer [17]). The Hopf algebra of Feynman graphs is the free commutative algebra $H$ over $\mathbb{C}$ generated by all 1PI Feynman graphs with residue in $R = R_V \cup R_E$, with counit $\epsilon(\Gamma) = 0$ unless $\Gamma = \emptyset$, in which case $\epsilon(\emptyset) = 1$, coproduct,

$$\Delta(\Gamma) = \Gamma \otimes 1 + 1 \otimes \Gamma + \sum_{\gamma \subsetneq \Gamma} \gamma \otimes \Gamma/\gamma,$$

where the sum is over disjoint unions of 1PI subgraphs with residue in $R$. The quotient $\Gamma/\gamma$ is defined to be the graph $\Gamma$ with the connected components of the subgraph contracted to the corresponding vertex/edge. If a connected component $\gamma'$ of $\gamma$ has two external lines, then there are possibly two contributions corresponding to the valence two vertex and the edge; the sum involves the two terms $\gamma_\bullet \otimes \Gamma/(\gamma \to \bullet)$ and $\gamma \otimes \Gamma/\gamma$. The antipode is given recursively by,

$$S(\Gamma) = -\Gamma - \sum_{\gamma \subsetneq \Gamma} S(\gamma)\, \Gamma/\gamma.$$

Two examples of this coproduct, taken from QED, are:

The above Hopf algebra is an example of a connected graded Hopf algebra: it is graded by the loop number $L(\Gamma)$ of a graph $\Gamma$. Indeed, one checks that the coproduct (and obviously also the product) respects the grading by loop number, and $H^0$ consists of complex multiples of the empty graph, which is the unit in $H$, so that $H^0 = \mathbb{C}1$. We denote by $q_l$ the projection in $H$ onto $H^l$.

In addition, there is another grading on this Hopf algebra. It is given by the number of vertices and already appeared in [17]. However, since we consider vertices and edges of different types (wiggly, dotted, straight, et cetera), we extend it to a multigrading as follows. As in [40], we denote by $m_{\Gamma,r}$ the number of vertices/internal edges of type $r$ appearing in $\Gamma$, for $r \in R$. Moreover, let $n_{\gamma,r}$ be the number of connected components of $\gamma$ with residue $r$. For each $v \in R_V$ we define a degree $d_v$ by setting
\[ d_v(\Gamma) = m_{\Gamma,v} - n_{\Gamma,v}. \]
The multidegree indexed by $R_V$ is compatible with the Hopf algebra structure, as follows easily from the relation
\[ m_{\Gamma/\gamma,v} = m_{\Gamma,v} - m_{\gamma,v} + n_{\gamma,v}, \]
and the fact that $m_{\Gamma\Gamma',v} = m_{\Gamma,v} + m_{\Gamma',v}$ and $n_{\Gamma\Gamma',v} = n_{\Gamma,v} + n_{\Gamma',v}$. This gives a decomposition
\[ H = \bigoplus_{(n_1,\ldots,n_k) \in \mathbb{Z}^k} H^{n_1,\ldots,n_k}, \]
where $k = |R_V|$. We denote by $p_{n_1,\ldots,n_k}$ the projection onto $H^{n_1,\ldots,n_k}$. Note that also $H^{0,\ldots,0} = \mathbb{C}1$.

Lemma 5. There is the following relation between the grading by loop number and the multigrading by number of vertices:
\[ \sum_{v \in R_V} (N(v) - 2)\, d_v = 2L, \]

where $N(v)$ is the valence of the vertex $v$.

Proof. This can be easily proved by induction on the number of internal edges, using invariance of the quantity $\sum_v (N(v) - 2) d_v - 2L$ under adjoining an edge.

The group $\mathrm{Hom}_{\mathbb{C}}(H, \mathbb{C})$ dual to $H$ is called the group of diffeographisms. This name was coined in [18], motivated by its relation with the group of (formal) diffeomorphisms of $\mathbb{C}$, whose definition we recall in the next section. Stated more precisely, the authors of [18] constructed a map from the group of diffeographisms to the group of formal diffeomorphisms. We will establish this result in general (i.e. for any quantum field theory) in Sect. 3 below.
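Returning to Lemma 5, the relation can be spot-checked on small connected graphs. The following is only an illustrative sketch under simplifying assumptions that are not part of the text: vertex types are distinguished by their valence alone, the graph is connected, and the helper names `loop_number` and `lemma5_lhs` are hypothetical. The loop number is obtained from Euler's formula $L = I - V + 1$, with $I$ the number of internal edges and $V$ the number of vertices.

```python
def loop_number(vertex_counts, n_external):
    """Loop number L = I - V + 1 of a connected graph.

    vertex_counts maps a valence N(v) to the number m of vertices of
    that valence (assumption: one vertex type per valence).
    """
    half_edges = sum(valence * m for valence, m in vertex_counts.items())
    internal_edges = (half_edges - n_external) // 2
    vertices = sum(vertex_counts.values())
    return internal_edges - vertices + 1


def lemma5_lhs(vertex_counts, residue_valence=None):
    """Left-hand side sum_v (N(v) - 2) d_v with d_v = m_v - n_v.

    residue_valence is the valence of the residue if it is a vertex,
    and None if the residue is an edge (then n_v = 0 for all v).
    """
    valences = set(vertex_counts)
    if residue_valence is not None:
        valences.add(residue_valence)
    total = 0
    for valence in valences:
        m = vertex_counts.get(valence, 0)
        n = 1 if valence == residue_valence else 0
        total += (valence - 2) * (m - n)
    return total


def check(vertex_counts, residue_valence=None):
    """Return the pair (sum_v (N(v) - 2) d_v, 2L) for one graph."""
    n_external = 2 if residue_valence is None else residue_valence
    return (lemma5_lhs(vertex_counts, residue_valence),
            2 * loop_number(vertex_counts, n_external))


# One-loop self-energy built from two cubic vertices (residue: an edge).
print(check({3: 2}))
# One-loop triangle built from three cubic vertices (residue: cubic vertex).
print(check({3: 3}, residue_valence=3))
# One-loop box of four cubic vertices whose residue is a quartic vertex.
print(check({3: 4}, residue_valence=4))
```

In each case the two entries of the returned pair agree, as the lemma predicts; the last example shows that a vertex type absent from the graph can still contribute through $d_v = -1$.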

2.2. Formal diffeomorphisms. Another Hopf algebra that will be of interest is that dual to the group $\mathrm{Diff}(\mathbb{C}, 0)$ of formal diffeomorphisms of $\mathbb{C}$ tangent to the identity; it is known in the literature as the Faà di Bruno Hopf algebra (see for instance the short review [20]). The elements of this group are given by formal power series
\[ f(x) = x \sum_{n \geq 0} a_n(f)\, x^n; \qquad a_0(f) = 1, \tag{1} \]
with the composition law given by $(f \circ g)(x) = f(g(x))$. The coordinates $\{a_n\}$ generate a Hopf algebra with the coproduct, counit and antipode defined in terms of the pairing $\langle a_n, f \rangle := a_n(f)$ as
\[ \langle \Delta(a_n), f \otimes g \rangle = \langle a_n, g \circ f \rangle, \qquad \epsilon(a_n) = \langle a_n, 1 \rangle, \qquad \langle S(a_n), f \rangle = \langle a_n, f^{-1} \rangle. \tag{2} \]
A convenient expression for the coproduct on $a_n$ can be given as follows [15]. Consider the generating series
\[ A(x) = x \sum_{n \geq 0} a_n x^n; \qquad a_0 = 1, \]
where $x$ is considered as a formal parameter. Then the coproduct can be written as
\[ \Delta A(x) = \sum_{n \geq 0} A(x)^{n+1} \otimes a_n. \tag{3} \]
One readily checks that indeed $\langle \Delta A(x), g \otimes f \rangle = f(g(x))$.

Remark 6. Actually, this Hopf algebra is the dual of the opposite group of $\mathrm{Diff}(\mathbb{C}, 0)$. Instead of acting on $\mathbb{C}$ as formal diffeomorphisms, the opposite group $\mathrm{Diff}(\mathbb{C}, 0)^{\mathrm{op}}$ can be characterized by its action on the algebra $\mathbb{C}[[x]]$ of formal power series in $x$. On the generator $x$, the action of $\mathrm{Diff}(\mathbb{C}, 0)^{\mathrm{op}}$ is defined by the same formula (1), but it is extended to all of $\mathbb{C}[[x]]$ as an algebra map. In the following we will denote this group by $\mathrm{Aut}_1(\mathbb{C}[[x]]) := \mathrm{Diff}(\mathbb{C}, 0)^{\mathrm{op}}$.

Clearly, we have an analogous definition of formal diffeomorphisms of $\mathbb{C}^k$ tangent to the identity. The group $\mathrm{Diff}(\mathbb{C}^k, 0)$ consists of elements $f(x) = (f_1(x), \ldots, f_k(x))$, where each $f_i$ is a formal power series of the following form:
\[ f_i(x) = x_i \Big( \sum_{n_1,\ldots,n_k} a^{(i)}_{n_1 \cdots n_k}(f)\, x_1^{n_1} \cdots x_k^{n_k} \Big) \]
with $a^{(i)}_{0,\ldots,0} = 1$ and $x = (x_1, \ldots, x_k)$.
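The check of Eq. (3) in the one-variable case can be carried out on truncated power series. The sketch below is illustrative only: the truncation order `ORDER`, the coefficient-list representation and the function names are assumptions made for this example. It pairs the right-hand side of Eq. (3) with $g \otimes f$, which gives $\sum_n g(x)^{n+1} a_n(f)$, and compares the result with the composition $f(g(x))$ computed independently by Horner's scheme.

```python
from fractions import Fraction

ORDER = 6  # keep terms up to x**ORDER (an arbitrary choice for this sketch)


def mul(p, q):
    """Product of two truncated series given as coefficient lists."""
    result = [Fraction(0)] * (ORDER + 1)
    for i, p_i in enumerate(p):
        if p_i == 0:
            continue
        for j, q_j in enumerate(q):
            if i + j > ORDER:
                break
            result[i + j] += p_i * q_j
    return result


def series(a):
    """Coefficient list of f(x) = x * sum_n a[n] x**n, as in Eq. (1)."""
    coeffs = [Fraction(0)] + [Fraction(c) for c in a]
    coeffs = coeffs[:ORDER + 1]
    return coeffs + [Fraction(0)] * (ORDER + 1 - len(coeffs))


def compose(f, g):
    """(f o g)(x) = f(g(x)) by Horner's scheme; g has no constant term."""
    result = [Fraction(0)] * (ORDER + 1)
    for c in reversed(f):
        result = mul(result, g)
        result[0] += c
    return result


def coproduct_pairing(a_of_f, g):
    """Pair Eq. (3) with g (first leg) and f (second leg):
    sum_n g(x)**(n+1) * a_n(f)."""
    total = [Fraction(0)] * (ORDER + 1)
    power = list(g)  # g**(n+1) for n = 0
    for a_n in a_of_f:
        total = [t + a_n * p for t, p in zip(total, power)]
        power = mul(power, g)
    return total


a_f = [1, 2, 3]      # a_n(f); a_0(f) = 1
a_g = [1, -1, 5, 7]  # a_n(g); a_0(g) = 1
f, g = series(a_f), series(a_g)
print(coproduct_pairing(a_f, g) == compose(f, g))
```

Exact rational arithmetic is used so that the comparison of the two coefficient lists is not affected by rounding.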

Again, there is a dual Hopf algebra generated by the coordinates $a^{(i)}_{n_1 \cdots n_k}$ with the coproduct, counit and antipode defined by the analogue of Eq. (2).

Lemma 7. On the generating series $A_i(x) = x_i \big( \sum_{n_1,\ldots,n_k} a^{(i)}_{n_1 \cdots n_k}\, x_1^{n_1} \cdots x_k^{n_k} \big)$ the coproduct equals
\[ \Delta(A_i(x)) = \sum_{n_1,\ldots,n_k} A_i(x)\, (A_1(x))^{n_1} \cdots (A_k(x))^{n_k} \otimes a^{(i)}_{n_1 \cdots n_k}. \]

Closely related to these groups of formal diffeomorphisms is the group of invertible power series in $k$ parameters, denoted $\mathbb{C}[[x_1, \ldots, x_k]]^\times$. As above, it consists of formal series $f$ with non-vanishing first coefficient $a_0(f) \neq 0$, but with product given by the algebra multiplication. The formula for the inverse is given by the Lagrange inversion formula for formal power series.
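For illustration, the multiplicative inverse of an invertible power series can be computed order by order. The sketch below treats a single variable and uses the elementary coefficient recursion $b_0 = 1/a_0$, $b_n = -a_0^{-1} \sum_{k=1}^{n} a_k b_{n-k}$ rather than a closed inversion formula; the function name `inverse_series` is hypothetical.

```python
from fractions import Fraction


def inverse_series(a, order):
    """Coefficients b_0..b_order of 1/f for f = sum_n a[n] x**n, a[0] != 0."""
    if a[0] == 0:
        raise ValueError("series is not invertible: a_0 must be non-zero")
    # Pad with zeros so that a[k] is defined for every k <= order.
    a = [Fraction(c) for c in a] + [Fraction(0)] * (order + 1)
    b = [1 / a[0]]
    for n in range(1, order + 1):
        # The x**n coefficient of f * (1/f) must vanish for n >= 1.
        s = sum(a[k] * b[n - k] for k in range(1, n + 1))
        b.append(-s / a[0])
    return b


# 1 / (1 - x) = 1 + x + x**2 + ... (geometric series)
print(inverse_series([1, -1], 4))
```

The condition $a_0(f) \neq 0$ is exactly what makes the recursion solvable at every order.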

2.3. Birkhoff decomposition. We now briefly recall how renormalization is an instance of a Birkhoff decomposition in the group of characters of $H$, as established in [17]. Let us first recall the definition of a Birkhoff decomposition. We let $\gamma : C \to G$ be a loop with values in an arbitrary complex Lie group $G$, defined on a smooth simple curve $C \subset \mathbb{P}^1(\mathbb{C})$. Let $C_\pm$ be the two complements of $C$ in $\mathbb{P}^1(\mathbb{C})$, with $\infty \in C_-$. A Birkhoff decomposition of $\gamma$ is a factorization of the form
\[ \gamma(z) = \gamma_-(z)^{-1} \gamma_+(z); \qquad (z \in C), \]
where $\gamma_\pm$ are (boundary values of) two holomorphic maps on $C_\pm$, respectively, with values in $G$. This decomposition gives a natural way to extract finite values from a divergent expression. Indeed, although $\gamma(z)$ might not holomorphically extend to $C_+$, $\gamma_+(z)$ is clearly finite as $z \to 0$.

Now consider a Feynman graph $\Gamma$ in the Hopf algebra $H$. Via the so-called Feynman rules – which are dictated by the Lagrangian of the theory – one associates to $\Gamma$ the Feynman amplitude $U(\Gamma)(z)$. It depends on some regularization parameter, which in the present case is a complex number $z$ (dimensional regularization). The famous divergences of quantum field theory are now ‘under control’ and appear as poles in the Laurent series expansion of $U(\Gamma)(z)$. On a curve around $0 \in \mathbb{P}^1(\mathbb{C})$ we can define a loop $\gamma$ by $\gamma(z)(\Gamma) := U(\Gamma)(z)$, which takes values in the group of diffeographisms $G = \mathrm{Hom}_{\mathbb{C}}(H, \mathbb{C})$. Connes and Kreimer proved the following general result in [17].

Theorem 8. Let $H$ be a graded connected commutative Hopf algebra with character group $G$. Then any loop $\gamma : C \to G$ admits a Birkhoff decomposition.

In fact, an explicit decomposition can be given in terms of the group $G(K) = \mathrm{Hom}_{\mathbb{C}}(H, K)$ of $K$-valued characters of $H$, where $K$ is the field of convergent Laurent series in $z$.² If one applies this to the above loop associated to the Feynman rules, the decomposition gives exactly renormalization of the Feynman amplitude $U(\Gamma)$: the map $\gamma_+$ gives the renormalized Feynman amplitude and $\gamma_-$ provides the counterterm.

Although the above construction gives a very nice geometrical description of the process of renormalization, it is a bit unphysical in that it relies on individual graphs that generate the Hopf algebra. Rather, as mentioned before, in physics the probability amplitudes are computed from the full expansion of Green’s functions.
Individual graphs do not correspond to physical processes and therefore a natural question to pose is how the Hopf algebra structure behaves at the level of the Green’s functions. We will see in the next section that they generate Hopf subalgebras, i.e. the coproduct closes on Green’s functions. Here the so-called Slavnov–Taylor identities for the couplings will play a prominent role.
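To make the Birkhoff decomposition of Sect. 2.3 concrete on individual graphs, the following sketch implements the standard recursive (BPHZ-type) form of the decomposition with minimal subtraction: the counterterm is $\gamma_-(\Gamma) = -T\big[U(\Gamma) + \sum \gamma_-(\gamma')\, U(\Gamma/\gamma')\big]$ with $T$ the projection onto the pole part, and $\gamma_+(\Gamma)$ is the remaining regular part. Everything specific here is an assumption for illustration: Laurent series are stored as dictionaries from powers of $z$ to coefficients, the graph labels and the toy amplitudes are made up, and the names `bphz`, `pole_part`, `add` and `mul` are hypothetical.

```python
def add(p, q):
    """Sum of two Laurent series stored as {power: coefficient}."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}


def mul(p, q):
    """Product of two Laurent series stored as {power: coefficient}."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}


def pole_part(p):
    """Projection T onto the strictly negative powers of z."""
    return {k: c for k, c in p.items() if k < 0}


def bphz(graph, U, reduced_coproduct):
    """Return (gamma_minus(graph), gamma_plus(graph)).

    reduced_coproduct[graph] lists pairs (subgraph, quotient) that
    appear in the coproduct besides graph (x) 1 and 1 (x) graph.
    """
    prepared = dict(U[graph])
    for sub, quotient in reduced_coproduct.get(graph, []):
        counterterm_sub, _ = bphz(sub, U, reduced_coproduct)
        prepared = add(prepared, mul(counterterm_sub, U[quotient]))
    counterterm = {k: -c for k, c in pole_part(prepared).items()}
    renormalized = add(prepared, counterterm)
    return counterterm, renormalized


# Made-up toy amplitudes (not derived from any Feynman rules):
U = {
    "sub": {-1: 1, 0: 1},          # 1/z + 1
    "quot": {-1: 1, 0: 2},         # 1/z + 2
    "full": {-2: 1, -1: 3, 0: 5},  # 1/z**2 + 3/z + 5
}
reduced = {"full": [("sub", "quot")]}

print(bphz("sub", U, reduced))
print(bphz("full", U, reduced))
```

In this toy example the $1/z^2$ pole of the full graph is cancelled by the subdivergence counterterm before the overall subtraction, and $\gamma_+$ has no pole at $z = 0$.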

3. Feynman Graphs and Formal Diffeomorphisms

In this section, the group of formal diffeomorphisms of $\mathbb{C}$ will be shown to arise as a quotient of the group of diffeographisms. As before, it is very convenient to work in a dual manner with the relevant Hopf algebras.

² In the language of algebraic geometry, there is an affine group scheme $G$ represented by $H$ in the category of commutative algebras. In other words, $G = \mathrm{Hom}_{\mathbb{C}}(H, \,\cdot\,)$ and $G(K)$ are the $K$-points of the group scheme.

We define the 1PI Green’s functions by
\[ G^e = 1 - \sum_{\mathrm{res}(\Gamma) = e} \frac{\Gamma}{\mathrm{Sym}(\Gamma)}, \qquad G^v = 1 + \sum_{\mathrm{res}(\Gamma) = v} \frac{\Gamma}{\mathrm{Sym}(\Gamma)}, \]
with $e \in R_E$, $v \in R_V$. The restriction of the sum to graphs $\Gamma$ at loop order $L(\Gamma) = l$ is denoted by $G^r_l$.

The following prepares for renormalization in the BV-formalism, which differs slightly from the usual wave function and coupling constant renormalization (see for instance [1, Sect. 6]). For each $\phi \in \Phi$ we assume that we are given elements $C^\phi \in H$ such that the following hold:
1. If $\phi$ only appears linearly in the Lagrangian, then $C^\phi C^{\phi_{i_1}} \cdots C^{\phi_{i_m}} = 1$ for $\iota(v) \propto \phi \phi_{i_1} \cdots \phi_{i_m}$.
2. If $\iota(e) \propto \phi \phi'$, then $C^\phi C^{\phi'} = G^e$.
3. For any field $\phi_i$ we have $C^{K_{\phi_i}} C^{\phi_i} = 1$.
Note that in general the $C^\phi$’s are not uniquely determined by these conditions. However, in theories of interest such as Yang–Mills gauge theories, they actually are, as illustrated by the next example.

Example 9. For pure Yang–Mills gauge theories (for notation, see Example 19 below) we have

[graphical formulas]

and $C^{K_\phi} = (C^\phi)^{-1}$ for $\phi = A, \omega, \bar\omega, h$. Note that $C^\omega C^{\bar\omega} =$ [graphical expression], which – as we shall see in Sect. 5 below – will be the usual wave function renormalization for the ghost propagator. Returning to the general setup, we assume that we have defined such elements $C^\phi$ for all $\phi \in \Phi$.

Remark 10. Let us pause to explain the meaning of the inverse of Green’s functions in our Hopf algebra. Since any Green’s function $G^r$ for $r \in R$ starts with the identity, we can surely write its inverse formally as a geometric series. Recall that the Hopf algebra is graded by loop number. Hence, the inverse of a Green’s function at a fixed loop order is in fact well-defined; it is given by restricting the above formal series expansion to this loop order. More generally, we understand any real power of a Green’s function in this manner.

In earlier work [42, Eq. (11)], we have shown that the coproduct on Green’s functions takes the following form:
\[ \Delta(G^r) = G^r \otimes 1 + \sum_{\mathrm{res}(\Gamma) = r} G^r \prod_{v \in R_V} \left( \frac{G^v}{\prod_\phi (C^\phi)^{N_\phi(v)}} \right)^{m_{\Gamma,v}} \otimes \frac{\Gamma}{\mathrm{Sym}(\Gamma)}. \tag{4} \]

Here $N_\phi(r)$ is the number of lines corresponding to the field $\phi \in \Phi$ attached to $r \in R$; clearly, the total number of lines attached to $r$ can be written as $N(r) = \sum_{\phi \in \Phi} N_\phi(r)$.

Remark 11. In order to reduce the above formula to Eq. (11) in [42], one observes that if $v$ does not involve a BRST-source term then
\[ \frac{G^v}{\prod_{e \in R_E} (G^e)^{N_e(v)/2}} = \frac{G^v}{\prod_\phi (C^\phi)^{N_\phi(v)}}, \]
since a fermionic field $\phi$ will always be accompanied by the field $\bar\phi$ on a vertex that does not involve a BRST-source (cf. Remark 3), thus reducing the above formula to Eq. (11) in loc. cit. It is sufficient to consider only the case of no BRST-sources, since in either case (for $r$ with or without BRST-source) the $v$’s appearing in the above formula will never involve a BRST-source.

Proposition 12. Define elements $Y_v \in H$ for $v \in R_V$ as formal expansions:
\[ Y_v := \frac{G^v}{\prod_\phi (C^\phi)^{N_\phi(v)}}. \]
The coproduct on $(Y_v)^\alpha$ with $\alpha \in \mathbb{R}$ is given by
\[ \Delta(Y_v^\alpha) = \sum_{n_1 \cdots n_k} Y_v^\alpha\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_{n_1 \cdots n_k}(Y_v^\alpha), \]
where $p_{n_1 \cdots n_k}$ is the projection onto graphs containing $n_i$ vertices of the type $v_i$ ($i = 1, \ldots, k = |R_V|$).

Proof. First, one can obtain from Eq. (4) the coproduct on $G^r$ as
\[ \Delta(G^r) = \sum_{n_1,\ldots,n_k} G^r\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_{n_1,\ldots,n_k}(G^r), \]
which holds for any $r \in R$. A long but straightforward computation involving formal power series expansions yields the following expression for real powers (in the above sense) of the Green’s functions:
\[ \Delta((G^r)^\alpha) = \sum_{n_1,\ldots,n_k} (G^r)^\alpha\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_{n_1,\ldots,n_k}((G^r)^\alpha), \tag{5} \]
for $r \in R$ and $\alpha \in \mathbb{R}$. Thus, also
\[ \Delta((C^\phi)^\alpha) = \sum_{n_1,\ldots,n_k} (C^\phi)^\alpha\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_{n_1,\ldots,n_k}((C^\phi)^\alpha), \tag{6} \]
and a combination of these formulas together with the fact that $\Delta$ is an algebra map yields the desired cancellations so as to obtain the stated formula.

Remark 13. In [41,42] we considered the elements $X_v := (Y_v)^{1/(N(v)-2)}$ for vertices $v$ of valence greater than 2. Currently, we are including vertices of valence 2 to incorporate mass terms, which motivates the definition of $Y_v$ instead.

There is a striking similarity between the above formula for $\Delta(Y_v)$ and the coproduct in the Hopf algebra dual to $\mathrm{Diff}(\mathbb{C}^k, 0)$, as in Lemma 7. In fact, we have the following:

Corollary 14. There is a surjective map from the Hopf algebra dual to the group $\mathrm{Diff}(\mathbb{C}^k, 0)^{\mathrm{op}}$ to the Hopf subalgebra in $H$ generated by $p_{n_1 \cdots n_k}(Y_v)$.

Proof. Whenever $(n_1, \ldots, n_k) \neq (0, \ldots, 0)$, we map the coordinates $a^{(i)}_{n_1,\ldots,n_k}$ of $\mathrm{Diff}(\mathbb{C}^k, 0)$ to the elements $p_{n_1,\ldots,n_i - 1,\ldots,n_k}(Y_{v_i}) \in H$, with $k = |R_V|$. Indeed, $p_{n_1 \cdots n_k}(Y_{v_i})$ vanishes for all $n_j < 0$ ($j \neq i$) and $n_i < -1$, explaining the shift in the $i$th index. Moreover, both $a^{(i)}_{0,\ldots,0}$ and $p_{0,\ldots,0}(Y_{v_i})$ are equal to the identity.

Actually, with Eq. (5) above it is easy to see that the algebra generated by $p_{n_1 \cdots n_k}(Y_v)$ and $p_{n_1 \cdots n_k}(G^e)$ for $v \in R_V$ and $e \in R_E$ is a Hopf subalgebra, which we denote by $H_R$. Equivalently, we can take as generators for $H_R$ the elements $p_{n_1 \cdots n_k}(Y_v)$ and $p_{n_1 \cdots n_k}(C^\phi)$. In Proposition 29 below we will show that the corresponding dual group is in fact a subgroup of the semi-direct product $(\mathbb{C}[[x_1, \ldots, x_k]]^\times)^{|R_E|} \rtimes \mathrm{Diff}(\mathbb{C}^k, 0)$.

We will next establish that a quotient of the Hopf algebra generated by $p_{n_1,\ldots,n_k}(Y_v)$ by a certain Hopf ideal is isomorphic to the Hopf algebra dual to (a subgroup of) $\mathrm{Aut}_1(\mathbb{C}[[x]]) \equiv \mathrm{Diff}(\mathbb{C}, 0)^{\mathrm{op}}$. The latter is indeed a subgroup of $\mathrm{Diff}(\mathbb{C}^k, 0)^{\mathrm{op}}$ under the diagonal embedding.

Theorem 15 [42]. The ideal $J$ in $H_R$ generated by $q_l\big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big)$ for $v, v' \in R_V$ of valence greater than 2 ($l \geq 0$), and $Y_v$ for all $v$ of valence 2, is a Hopf ideal, i.e. $\Delta(J) \subset J \otimes H_R + H_R \otimes J$.

Proof. First of all, with Proposition 12, the coproduct on $Y_v$ for $\mathrm{val}(v) = 2$ is readily found to be an element in $J \otimes H_R + H_R \otimes J$. With Proposition 12, we can write the coproduct on the other generators of $J$ as
\[ \Delta\big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big) = \sum_n Y_v^{N(v')-2}\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_n\big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big) + \sum_n \big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big)\, Y_{v_1}^{n_1} \cdots Y_{v_k}^{n_k} \otimes p_n\big( Y_{v'}^{N(v)-2} \big), \]
with $n$ the multi-index $(n_1, \ldots, n_k)$. The second term is clearly an element in $J \otimes H_R$. For the first term, note that each $n_i$th power of $Y_{v_i}$ can be written as
\[ Y_{v_i}^{n_i} = \big( Y_{v_i}^{N(v)-2} \big)^{\frac{n_i}{N(v)-2}} = Y_v^{n_i \frac{N(v_i)-2}{N(v)-2}} + J. \]
Hence, the first term becomes, modulo $J \otimes H_R$,
\[ \sum_{n_1 \cdots n_k} \big( Y_v^{1/(N(v)-2)} \big)^{n_1 (N(v_1)-2) + \cdots + n_k (N(v_k)-2)} \otimes p_{n_1 \cdots n_k}\big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big). \]
Appealing to Lemma 5 now allows us to write this in terms of the loop number $l$ to finally obtain for the first term
\[ \sum_{l=0}^{\infty} \big( Y_v^{1/(N(v)-2)} \big)^{2l+1} \otimes q_l\big( Y_{v'}^{N(v)-2} - Y_v^{N(v')-2} \big), \]
which is indeed an element in $H_R \otimes J$.

As a consequence, the quotient Hopf algebra $\widetilde{H}_R = H_R / J$ is well-defined. In $\widetilde{H}_R$ the relations $Y_v^{N(v')-2} = Y_{v'}^{N(v)-2}$ are satisfied or, in terms of the $X_v$ of Remark 13, they are simply $X_v = X_{v'}$. In physics these identities are called Slavnov–Taylor identities for the couplings; we will see later how they appear naturally from the relations between coupling constants. Moreover, the fact that we put $Y_v = 0$ for vertices of valence 2 means that we consider a massless theory. In $\widetilde{H}_R$ we can drop the subscript $v$ and use the notation $X := Y_v^{1/(N(v)-2)} \equiv X_v$, independent of $v \in R_V$ as long as $\mathrm{val}(v) > 2$.

Theorem 16. The coproduct in $\widetilde{H}_R$ takes the following form on the element $X$:
\[ \Delta(X) = \sum_{l=0}^{\infty} X^{2l+1} \otimes q_l(X), \]
where $q_l$ is the projection in $\widetilde{H}_R$ onto graphs of loop number $l$.

Proof. This follows directly by substituting $X$ for $X_v$ in the expression for $\Delta(X_v)$ in Proposition 12 and using the relation from Lemma 5 between the number of vertices and the loop number.

Thus, the Hopf algebra $\widetilde{H}_R$ contains a Hopf subalgebra that is generated by $q_l(X)$, and a comparison with Eq. (3) yields – after identifying $q_l(X)$ with $a_{2l}$ – the following result:

Theorem 17. The graded Hopf subalgebra in $\widetilde{H}_R$ generated by $q_l(X)$ for $l = 0, 1, \ldots$ is isomorphic to the Hopf algebra of the group of odd formal diffeomorphisms of $\mathbb{C}$ tangent to the identity.

In other words, there is a homomorphism from the group of diffeographisms to $\mathrm{Diff}(\mathbb{C}, 0)^{\mathrm{op}} \equiv \mathrm{Aut}_1(\mathbb{C}[[x]])$. This generalizes the result of [18], where such a map was constructed explicitly in the case of (massless) $\phi^3$-theory; for other theories a map has been constructed by Cartier and Krajewski. In the next section, we will explore its relation with the group of formal diffeomorphisms acting on the space of coupling constants.

4. Coaction on the Fields and Coupling Constants

In this section, we will establish a connection between the Hopf algebra of Feynman graphs defined above and the fields, coupling constants and masses that characterize the field theory. This allows us to derive the Hopf ideals encountered in the previous section from the so-called master equation satisfied by the Lagrangian. Let us start with a careful setup of the algebra of local functions and functionals in the fields that constitute the field theory. Readers already familiar with this might want to skip to Sect. 4.4, where the connection is established between the BV-algebra of fields and the renormalization Hopf algebras.

4.1. Fields and BRST-sources. Although we have already introduced the set of fields $\Phi$ above, we have not said precisely what we mean by a field. Let us do so in a bit more generality than needed.
A field $\phi$ is a section of a vector bundle $E \to M$ on the background manifold $M$. If the rank of the vector bundle $E$ is $r$, the field is said to have $r$ components, in which case we can write locally $\phi = \phi^a e_a$ in terms of a basis $e_a$ of $E$.

Example 18. If $E = M \times \mathbb{C}$, then a section $\phi$ is a complex scalar field $\phi : M \to \mathbb{C}$; it has one component.

Example 19. Gauge fields are sections $A$ of $E = \Lambda^1 \otimes (P \times_G \mathfrak{g})$ with $P$ a $G$-principal bundle and $\mathfrak{g} = \mathrm{Lie}(G)$. In the case that $P$ is trivial, this becomes a $\mathfrak{g}$-valued one-form on $M$, i.e. $A$ is a section of $\Lambda^1(\mathfrak{g})$. In this case, the rank of the vector bundle is $\dim(M) \cdot \mathrm{rank}(\mathfrak{g})$, which leads to the familiar decomposition $A = A^a_\mu\, dx^\mu\, T^a$, with $\{T^1, \ldots, T^{\mathrm{rank}(\mathfrak{g})}\}$ a basis for $\mathfrak{g}$ and summation understood.

If we consider a set $\Phi$ consisting of $2N$ fields, we have specified $2N$ (graded) vector bundles, each of which has a corresponding field as its section. As said, we will assume that the fields come in pairs of a field $\phi_i$ and a BRST-source $K_{\phi_i}$ ($i = 1, \ldots, N$), and we write $E_i$ and $E_i^\vee$ for the corresponding vector bundles, which are of equal rank. In fact, $E_i^\vee$ is the dual vector bundle of $E_i$, although shifted in degree, as we make more precise now. The fields $\phi_i$ are understood to have a so-called ghost degree $\mathrm{gh}(\phi_i) \in \mathbb{Z}$, which is then extended to the BRST-sources by $\mathrm{gh}(K_{\phi_i}) := -\mathrm{gh}(\phi_i) - 1$. In the physics literature, this is usually called the (total) ghost number. Summarizing, the elements of $\Phi$ constitute a section of the total vector bundle $E_{\mathrm{tot}}$:
\[ (\phi_1, K_{\phi_1}, \ldots, \phi_N, K_{\phi_N}) : M \to E_{\mathrm{tot}} = \bigoplus_{i=1}^{N} E_i \oplus E_i^\vee. \]
The grading on the fields turns $E_{\mathrm{tot}}$ into a graded vector bundle.

Example 20. In Sect. 5 below, we will focus on pure Yang–Mills gauge theories. In that case, there is the gauge field $A$ as in Example 19, which (in the trivial bundle case) is a section of $\Lambda^1 \otimes (M \times \mathfrak{g})$, i.e. an element of $\Omega^1(\mathfrak{g})$. The so-called ghost fields $\omega$ and $\bar\omega$ are assigned to each generator of $\mathfrak{g}$, in components $\omega = \omega^a T^a$ and $\bar\omega = \bar\omega^a T^a$. Their ghost degrees are defined to be $1$ and $-1$, respectively, so that $\omega$ is a section in $\Omega^0(\mathfrak{g}[-1])$ and $\bar\omega$ in $\Omega^0(\mathfrak{g}[1])$. Also, there is the so-called auxiliary – or Nakanishi–Lautrup – field $h = h^a T^a$, which is a section in $\Omega^0(\mathfrak{g})$ and of degree $0$. Corresponding to these fields, there are the BRST-sources $K_A$, $K_\omega$, $K_{\bar\omega}$ and $K_h$, which are of respective ghost degree $-1$, $-2$, $0$ and $-1$. Thus, the field content of pure Yang–Mills gauge theories can be summarized by the following sections:
\[ (A, \omega, \bar\omega, h) \in \Omega^1(\mathfrak{g}) \oplus \Omega^0(\mathfrak{g}[-1]) \oplus \Omega^0(\mathfrak{g}[1]) \oplus \Omega^0(\mathfrak{g}), \]
\[ (K_A, K_{\bar\omega}, K_\omega, K_h) \in \mathfrak{X}(\mathfrak{g}[1]) \oplus \Omega^0(\mathfrak{g}) \oplus \Omega^0(\mathfrak{g}[2]) \oplus \Omega^0(\mathfrak{g}[1]), \]
where $\mathfrak{X}(\mathfrak{g})$ denotes $\mathfrak{g}$-valued vector fields. Taken all together, they form a section of the total bundle.

4.2. Jet bundles in Lagrangian field theory. Let us now ‘prolong’ this total bundle $E_{\mathrm{tot}}$ and construct the jet bundle $J^\infty(E_{\mathrm{tot}})$. First, we generalize a little and briefly recall the theory of jet bundles. We refer to [38] for more details.

Let $\pi : E \to M$ be a vector bundle on an $m$-dimensional manifold $M$ and suppose $u \in \Gamma(M, E)$ is a smooth section. For each $x \in M$ consider a neighborhood $U$ and a local trivialization $\pi^{-1}(U) \simeq U \times \mathbb{R}^k$ with coordinates $x^\mu, u^a(x)$ with $\mu = 1, \ldots, m$ and $a = 1, \ldots, k$.

Definition 21. The first-order jet $j^1_x(\sigma)$ of a section $\sigma$ of $E$ at $x$ is the equivalence class of sections for the relation
\[ \sigma \sim \sigma' \iff \sigma(x) = \sigma'(x), \quad \partial_i \sigma(x) = \partial_i \sigma'(x) \qquad (i = 1, \ldots, m) \]
for $\sigma' \in \Gamma(M, E)$. The set $J^1(E)$ of all such equivalence classes,
\[ J^1(E) = \bigcup_{x \in M,\ \sigma \in \Gamma(M, E)} j^1_x(\sigma), \]
carries the structure of a vector bundle over $M$ – with projection map $\pi_1 : j^1_x(\sigma) \mapsto x$ – and is called the first-order jet bundle of $E$.

A local trivialization $\pi_1^{-1}(U) \simeq U \times \mathbb{R}^{k + km}$ is given in terms of the local coordinates $\{u^{a_1}, \partial_\mu u^{a_2}\}$. Besides the structure of a vector bundle over $M$, $J^1(E)$ is also a vector bundle over $E$, with projection map defined by $\pi_{1,0} : j^1_x(\sigma) \mapsto \sigma(x)$. If we apply this construction repeatedly to the jet bundle itself, we obtain the $n$th-order jet bundle of $E$ as
\[ J^n(E) := \underbrace{J^1(\cdots(J^1(E))\cdots)}_{n \text{ times}}. \]
In other words, $J^n(E)$ consists of equivalence classes of sections, which are identified when their values and the values of their partial derivatives up to order $n$ are equal. As a consequence, local coordinates on $J^n(E)$ are given by $\{u^{a_1}, \partial_\mu u^{a_2}, \ldots, \partial_{\mu_1} \cdots \partial_{\mu_n} u^{a_n}\}$. From the above construction, it is clear that we can define maps $\pi_{n,n'} : J^n(E) \to J^{n'}(E)$ for $1 \leq n' < n$, which can be extended to $n' = 0$ and $n' = n$ if we identify $J^0(E)$ with $E$ and $\pi_{n,n}$ with the identity map on $J^n(E)$. The inverse limit of the resulting inverse system $(J^n(E), \pi_{n,n'})$ is called the infinite jet bundle and is denoted by $J^\infty(E)$. As an infinite-dimensional vector bundle it has coordinates $\{u^{a_1}, \partial_\mu u^{a_2}, \partial_{\mu_1} \partial_{\mu_2} u^{a_3}, \ldots\}$.

The jet bundle formalism is very convenient for specifying a field theory in terms of a Lagrangian. Indeed, such a function does not only depend on the fields, but also on their partial derivatives. Nevertheless, the condition of locality imposes an upper bound on the order of these partial derivatives, which motivates the following definition.

Definition 22. A local form $L(x, u^{(n)})$ is a pullback of a horizontal differential form on some finite jet bundle $J^n(E)$ to $J^\infty(E)$, i.e. an element in $\Omega^{\bullet,0}(J^\infty E)$. The algebra of local forms is denoted by $\mathrm{Loc}(E)$.
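Definition 21 can be made concrete for a trivial line bundle over $\mathbb{R}^2$ with polynomial sections. In the sketch below, which is illustrative only, a polynomial is stored as a dictionary from exponent tuples to coefficients, and the hypothetical helper `first_jet` returns the data $(\sigma(x), \partial_1 \sigma(x), \partial_2 \sigma(x))$ that determine $j^1_x(\sigma)$; two sections define the same first-order jet at $x$ exactly when these data coincide.

```python
def evaluate(poly, point):
    """Evaluate a polynomial {exponents: coefficient} at a point."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exponents):
            term *= x ** e
        total += term
    return total


def partial(poly, i):
    """Partial derivative with respect to the i-th coordinate."""
    out = {}
    for exponents, coeff in poly.items():
        if exponents[i] == 0:
            continue
        lowered = list(exponents)
        lowered[i] -= 1
        key = tuple(lowered)
        out[key] = out.get(key, 0) + coeff * exponents[i]
    return out


def first_jet(poly, point):
    """Value and first partial derivatives of a section at a point."""
    derivatives = tuple(
        evaluate(partial(poly, i), point) for i in range(len(point))
    )
    return evaluate(poly, point), derivatives


sigma = {(2, 0): 1, (0, 1): 1}                    # x**2 + y
sigma_prime = {(1, 0): 2, (0, 1): 1, (0, 0): -1}  # 2*x + y - 1

# Same first-order jet at (1, 0), different jets at the origin.
print(first_jet(sigma, (1, 0)), first_jet(sigma_prime, (1, 0)))
print(first_jet(sigma, (0, 0)), first_jet(sigma_prime, (0, 0)))
```

The example shows that the equivalence relation is pointwise: the two sections agree to first order at $(1, 0)$ but not at the origin.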

Since $\Omega^{\bullet,0}(J^\infty E) \simeq \Omega^\bullet(M) \otimes C^\infty(J^\infty E)$, a local form is the tensor product of a differential form on $M$ with a smooth function in the coordinates $x^\mu$ and $\partial_{\mu_1} \cdots \partial_{\mu_{n'}} u^a$ ($0 \leq n' \leq n$) for some finite positive integer $n$. If the vector bundle $E$ carries a grading, $E = \oplus_q E^{(q)}$, the algebra $\mathrm{Loc}(E)$ becomes bigraded: $L \in \mathrm{Loc}(E)$ is of bidegree $(p, q)$ if $L$ has degree $p$ as a differential form and ghost degree $q$. In this case, we write
\[ \mathrm{Loc}(E) = \bigoplus_{p \geq 0,\ q \in \mathbb{Z}} \mathrm{Loc}^{(p,q)}(E), \]
and we have $\mathrm{Loc}^{(p,q)}(E) \simeq \Omega^p(M) \otimes_{C^\infty(M)} \mathrm{Loc}^{(0,q)}(E)$.

In the case that $E = E_{\mathrm{tot}}$, so that the sections of $E$ constitute a set of fields $\Phi$ as above, we also write $\mathrm{Loc}(\Phi)$ instead of $\mathrm{Loc}(E)$ (and similarly $\mathrm{Loc}^{(p,q)}(\Phi)$) and, with a slight abuse of notation, $u^a \equiv \phi^a$ for $\phi \in \Phi$. We distinguish between sections and coordinates by writing explicitly the dependence of the former on the position on $M$. Thus, the components of a section $\sigma \in \Gamma(M, E_{\mathrm{tot}})$ are given by $\phi^a(x) = (\phi^a \circ \sigma)(x) \in \mathbb{R}$, with $\phi^a$ on the right-hand side the fiber coordinates $u^a$ of the direct summand of $E_{\mathrm{tot}}$ corresponding to $\phi$. In a similar manner, we write for the coordinates of the higher order jet bundles $\partial_{\boldsymbol{\mu}} u^a \equiv \partial_{\boldsymbol{\mu}} \phi^a$ for a multi-index $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_k)$. The previous correspondence between coordinates and sections generalizes to $\partial_{\boldsymbol{\mu}} \phi^a(x) = (\partial_{\boldsymbol{\mu}} \phi^a \circ j^\infty \sigma)(x)$ in terms of the infinite jet $j^\infty \sigma \in \Gamma(M, J^\infty(E_{\mathrm{tot}}))$ defined by the smooth section $\sigma \in \Gamma(M, E_{\mathrm{tot}})$.

Example 23. A scalar field theory is defined by the following Lagrangian $L \in \mathrm{Loc}^{(m,0)}(M \times \mathbb{R})$:
\[ L(x, \phi, \partial_i \phi) = \tfrac{1}{2}\, d\phi \ast d\phi - V(\phi)(\ast 1), \]
with $V(\phi)$ a polynomial in the field $\phi \in \Gamma(M \times \mathbb{R})$.

To any Lagrangian $L$, defined in general as a local $m$-form of the fields ($m = \dim M$), one can associate the Lagrangian density $L(x, \phi(x)) := (j^\infty \sigma)^* L(x, \phi)$, evaluated at a section $\sigma$ of $E$. This density can be integrated to give the so-called action
\[ S[\phi] := \int_M L(x, \phi(x)). \]
In general, we make the following definition.

Definition 24. A local functional $F[\phi]$ is the integral of the pullback of a local $m$-form, i.e. $F[\phi] = \int_M L(x, \phi(x))$ for $L(x, \phi) \in \mathrm{Loc}^{(m,0)}(E)$. The free commutative algebra generated (over $\mathbb{C}$) by local functionals is denoted by $\mathcal{F}([E])$.

Again, in the case that $E = E_{\mathrm{tot}}$ is associated to a set of fields $\Phi$ as above, we write $\mathcal{F}([\Phi])$ instead of $\mathcal{F}([E])$. The grading by ghost degree on local $m$-forms carries over to a grading on local functionals, which we also denote by $\mathrm{gh}(F)$ for $F \in \mathcal{F}([E])$.

4.3. The anti-bracket. We will now try to elucidate the above ‘doubling’ of the fields (adding a BRST-source for every field) in terms of the structure of a Gerstenhaber algebra on the algebra of local functionals $\mathcal{F}([\Phi])$. Recall that a Gerstenhaber algebra [23] is a graded commutative algebra with a Lie bracket of degree 1 satisfying the graded Leibniz property:
\[ (x, yz) = (x, y) z + (-1)^{(|x|+1)|y|}\, y (x, z). \]
Batalin and Vilkovisky encountered this structure in their study of quantum gauge theories [5–7]. In fact, they invented what is now called a BV-algebra (see for instance [39]): a Gerstenhaber algebra with an additional operator $\Delta$ that satisfies
\[ (x, y) = \Delta(xy) - \Delta(x) y + (-1)^{|x|} x \Delta(y). \]
We will define such an anti-bracket on the algebra of local functionals using the functional derivative.

Definition 25. The left and right functional derivatives are the distributions defined by
\[ \frac{d}{dt} F[\phi + t \psi_\phi] \Big|_{t=0} = \int_M \frac{\delta_R F}{\delta \phi^a(x)}\, \psi^a_\phi(x)\, d\mu(x) = \int_M \psi^a_\phi(x)\, \frac{\delta_L F}{\delta \phi^a(x)}\, d\mu(x), \]
for test functions $\psi_\phi$ of the same ghost degree as $\phi \in \Phi$.

There is the following relation between the two functional derivatives:
\[ \frac{\delta_R F}{\delta \phi^a(x)} = (-1)^{\mathrm{gh}(\phi)(\mathrm{gh}(F) - \mathrm{gh}(\phi))}\, \frac{\delta_L F}{\delta \phi^a(x)}, \]
with $\mathrm{gh}$ the ghost degree.

Proposition 26. The bracket $(\cdot, \cdot)$ defined by
\[ (F_1, F_2) = \sum_{i=1}^{N} \sum_{a=1}^{\mathrm{rk} E_i} \int_M \left[ \frac{\delta_R F_1}{\delta \phi_i^a(x)} \frac{\delta_L F_2}{\delta K^a_{\phi_i}(x)} - \frac{\delta_R F_1}{\delta K^a_{\phi_i}(x)} \frac{\delta_L F_2}{\delta \phi_i^a(x)} \right] d\mu(x) \]
gives $\mathcal{F}([\Phi])$ the structure of a Gerstenhaber algebra with respect to the ghost degree. Moreover, with
\[ \Delta(F) = \sum_{i=1}^{N} \frac{\delta_L}{\delta K^a_{\phi_i}(x)} \frac{\delta_R}{\delta \phi_i^a(x)} (F) \]
it becomes a BV-algebra.

In the physics literature, it is common to write this anti-bracket on the generating fields in terms of the Dirac delta distribution as
\[ (K^a_{\phi_i}(x), \phi^b_j(y)) = \delta^{ab} \delta_{ij}\, \delta(x - y), \qquad (K^a_{\phi_i}(x), K^b_{\phi_j}(y)) = 0, \qquad (\phi^a_i(x), \phi^b_j(y)) = 0, \]
which is then extended to $\mathcal{F}([\Phi])$ using the graded Leibniz property.

4.4. The comodule BV-algebra of coupling constants and fields. Since the coupling constants measure the strength of the interactions, we label them by the elements $v \in R_V$ and write accordingly $\lambda_v$. We consider the algebra $A_R$ generated by local functionals in the fields and formal power series (over $\mathbb{C}$) in the coupling constants $\lambda_v$. In other words, we define $A_R := \mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]] \otimes_{\mathbb{C}} \mathcal{F}([\Phi])$, where $k = |R_V|$. The BV-algebra structure on $\mathcal{F}([\Phi])$ defined in the previous section induces a natural BV-algebra structure on $A_R$; we denote the bracket on it by $(\cdot, \cdot)$ as well. Recall the notation $H_R$ for the Hopf subalgebra generated by the elements $p_{n_1,\ldots,n_k}(Y_v)$ ($v \in R_V$) and $p_{n_1,\ldots,n_k}(C^\phi)$ ($\phi \in \Phi$) in the Hopf algebra of Feynman graphs.

Theorem 27. The algebra $A_R$ is a comodule BV-algebra for the Hopf algebra $H_R$. The coaction $\rho : A_R \to A_R \otimes H_R$ is given on the generators by
\[ \rho : \lambda_v \longmapsto \sum_{n_1 \cdots n_k} \lambda_v\, \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k} \otimes p_{n_1 \cdots n_k}(Y_v), \]
\[ \rho : \phi \longmapsto \sum_{n_1 \cdots n_k} \phi\, \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k} \otimes p_{n_1 \cdots n_k}(C^\phi), \]
for $\phi \in \Phi$, while it commutes with partial derivatives on $\phi$.

Proof. Since we work with graded Hopf algebras, it suffices to establish that $(\rho \otimes 1) \circ \rho = (1 \otimes \Delta) \circ \rho$. We claim that this follows from coassociativity (i.e. $(\Delta \otimes 1) \circ \Delta = (1 \otimes \Delta) \circ \Delta$) of the coproduct $\Delta$ of $H_R$. Indeed, the first expression very much resembles the form of the coproduct on $Y_v$ as derived in Proposition 12: replacing therein each $Y_v$ on the first leg of the tensor product by $\lambda_v$ and one $\Delta$ by $\rho$ gives the desired result. A similar argument applies to the second expression, using Eq. (6) above. Finally, since $C^{K_{\phi_i}} \equiv (C^{\phi_i})^{-1}$ by definition in $H_R$, it follows that $\rho$ respects the bracket and the operator $\Delta$, and thus the BV-algebra structure.

Corollary 28. The Green’s functions $G^v \in H_R$ can be obtained when coacting on the monomial $\int_M \lambda_v\, \iota(v)(x)\, d\mu(x) = \int_M \lambda_v\, \partial_{\mu_1} \phi_{i_1}(x) \cdots \partial_{\mu_M} \phi_{i_M}(x)\, d\mu(x)$ for some index set $\{i_1, \ldots, i_M\}$. Explicitly,
\[ \rho\left( \int_M \lambda_v\, \partial_{\mu_1} \phi_{i_1}(x) \cdots \partial_{\mu_M} \phi_{i_M}(x) \right) = \sum_{n_1 \cdots n_k} \int_M \lambda_v\, \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k}\, \partial_{\mu_1} \phi_{i_1}(x) \cdots \partial_{\mu_M} \phi_{i_M}(x) \otimes p_{n_1 \cdots n_k}(G^v). \]

Combining Theorem 27 with Corollary 14 yields an induced coaction on $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$ of the Hopf algebra dual to the group of diffeomorphisms on $\mathbb{C}^k$ tangent to the identity. The formula for this coaction can be obtained by substituting $a^{(i)}_{n_1 \cdots n_k}$ for $p_{n_1 \cdots n_k}(Y_{v_i})$ in the above formula for $\rho(\lambda_v)$. It induces a group action of $\mathrm{Diff}(\mathbb{C}^k, 0)$ on $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$ by $f(a) := (1 \otimes f) \rho(a)$ for $f \in \mathrm{Diff}(\mathbb{C}^k, 0)$ and $a \in \mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$. In fact, we have the following:

Proposition 29. Let $G$ be the group consisting of BV-algebra maps $f : A_R \to A_R$ given on the generators by
\[ f(\lambda_v) = \sum_{n_1 \cdots n_k} f^v_{n_1 \cdots n_k}\, \lambda_v\, \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k}; \qquad (v \in R_V), \]
\[ f(\phi_i) = \sum_{n_1 \cdots n_k} f^i_{n_1 \cdots n_k}\, \phi_i\, \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k}; \qquad (i = 1, \ldots, N), \]
where $f^v_{n_1 \cdots n_k}, f^i_{n_1 \cdots n_k} \in \mathbb{C}$ are such that $f^v_{0 \cdots 0} = f^i_{0 \cdots 0} = 1$. Then the following hold:
1. The character group $G_R$ of the Hopf algebra $H_R$ generated by $p_{n_1 \cdots n_k}(Y_v)$ and $p_{n_1 \cdots n_k}(C^\phi)$, with coproduct given in Proposition 12, is a subgroup of $G$.
2. The subgroup $N := \{ f : f(\lambda_v) = \lambda_v \}$ of $G$ is normal and isomorphic to $(\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]^\times)^{|R_E|}$.
3. $G \simeq (\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]^\times)^{|R_E|} \rtimes \mathrm{Diff}(\mathbb{C}^k, 0)$.

Proof. From Theorem 27, it follows that a character $\chi \in G_R$ acts on $A_R$ as in the above formula upon writing $f^v_{n_1 \cdots n_k} = \chi(p_{n_1 \cdots n_k}(Y_v))$ and $f^i_{n_1 \cdots n_k} = \chi(p_{n_1 \cdots n_k}(C^{\phi_i}))$. For 2, one checks by explicit computation that $N$ is indeed normal and that each series $f^i$ defines an element in $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]^\times$ of invertible formal power series. Then 3 follows from the existence of a homomorphism from $G$ to $\mathrm{Diff}(\mathbb{C}^k, 0)$. It is given by restricting an element $f$ to $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$. This is clearly the identity map on $\mathrm{Diff}(\mathbb{C}^k, 0)$ when considered as a subgroup of $G$, and its kernel is precisely $N$.

Remark 30. Note that the expression for the action of $f \in G$ on the BRST-sources $K_{\phi_i}$ can be derived from the expression for $f(\phi_i)$ above using the fact that $(f(K_{\phi_i})(x), f(\phi_i)(y)) = \delta(x - y)$.

The action of (the subgroup of) $(\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]^\times)^{|R_E|} \rtimes \mathrm{Diff}(\mathbb{C}^k, 0)$ on $A_R$ has a natural physical interpretation: the invertible formal power series act on the propagating fields as wave function renormalization, whereas the diffeomorphisms act on the coupling constants $\lambda_1, \ldots, \lambda_k$. The similarity with the semi-direct product structures obtained (via different approaches) in [24] for a scalar field theory and in [14,15] for quantum electrodynamics is striking.

Example 31. Consider again pure Yang–Mills theory with fields $A$, $\omega$, $\bar\omega$ and $h$. Then, under the counterterm map $\gamma_-(z) \in G_R$ (cf. Sect. 2.3) we can identify [graphical expression] with wave function renormalization for the gluon propagator, and the combination [graphical expression] with wave function renormalization for the ghost propagator. The above action of $\gamma_-(z)$ on the fields $A$, $\omega$, $\bar\omega$ is thus equivalent to wave function renormalization. We will come back to Yang–Mills theories in more detail in Sect. 5 below.

4.5. The master equation. The dynamics and interactions in the physical system are described by means of a so-called action $S$. In our formalism, $S$ will be an element in $A_R$ of polynomial degree $\geq 2$ of the form
\[ S[\phi] = \sum_{e \in R_E} \int d\mu(x)\, \iota(e)(x) + \sum_{v \in R_V} \int d\mu(x)\, \lambda_v\, \iota(v)(x). \tag{7} \]

The first sum in S describes the free field theory containing the propagators of the (massless) fields. The second term describes the interactions including the mass terms. Note that due to the restrictions in the sums, the action has finitely many terms, that is, it is a (local) polynomial functional in the fields rather than a formal power series. The action S is supposed to be invariant under some group of gauge transformations.3 We accomplish this in our setting by imposing the (classical) master equation, (S, S) = 0,

(8)

as relations in the BV-algebra A R . Proposition 32. The BV-ideal I = (S, S) is generated by polynomials in λv (v ∈ R V ), independent of the fields φ ∈ Φ. Proof. Let us write the master equation for the Lagrangian as a polynomial in A R : ∂µ1 φi1 (x) · · · ∂µ N φi N (x)dµ(x) ∈ A R (S, S) = ci1 ···i N M

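with $c_{i_1 \cdots i_N} \in \mathbb{C}[\lambda]$ a polynomial independent of the fields $\phi$. For I to be a BV-ideal it has to satisfy $(a, I) \subset I$ for any $a \in A_R$. The following property allows us to project onto each individual term in the above polynomial:
$$\left( \int_M f(x) K_{\phi_i}(x)\, d\mu(x),\ \int_M \partial_{\mu_1}\phi_{i_1}(y) \cdots \partial_{\mu_N}\phi_{i_N}(y)\, d\mu(y) \right) = \sum_{k \text{ s.t. } i_k = i} (\pm) \int_M \partial_{\mu_k} f(x)\, \partial_{\mu_1}\phi_{i_1}(x) \cdots \widehat{\partial_{\mu_k}\phi_{i_k}(x)} \cdots \partial_{\mu_N}\phi_{i_N}(x)\, d\mu(x).$$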

Here the hat means that this factor is absent and $f \in C_c^\infty(M)$ is a test function. Note that Loc(E) is indeed a $C_c^\infty(M)$-module. Iterating this property, we infer that
$$\left( \int_M f_1(x_1) K_{\phi_{i_1}}(x_1)\, d\mu(x_1),\ \cdots \left( \int_M f_N(x_N) K_{\phi_{i_N}}(x_N)\, d\mu(x_N),\ (S, S) \right) \cdots \right) \propto c_{i_1 \cdots i_N} F([f]),$$
with F([f]) a local functional of the test functions $f_1, \ldots, f_N$. Since these are arbitrary, it follows that $c_{i_1 \cdots i_N} \in I$, in other words, I is already generated by the coefficients of the polynomial (S, S), as claimed.

We still denote the image of the action S in $A_R/I$ under the quotient map by S; it satisfies the master equation (8) with the brackets as defined before. If we make the natural assumption that S is at most of order one in the BRST-sources, we can write
$$S = S_0[\lambda_v, \phi_i] + \sum_{i=1}^{N} \sum_{a=1}^{\mathrm{rk}\,E_i} \int d\mu(x)\, (s\phi_i)^a(x)\, K^a_{\phi_i}(x), \qquad (9)$$

3 In addition, it is supposed to be invariant under the symmetry group of the underlying spacetime one works on, typically the Lorentz group. However, these transformations are linear in the fields and will consequently not give rise to non-linear equations such as the master equation discussed here. See for instance [26] for more details.


W. D. van Suijlekom

with $s\phi_i$ dictated by the previous form of S. Of course, this is the familiar BRST-differential acting on the field $\phi_i$ as a graded derivation and obviously satisfies $s\phi_i(x) = (S, \phi_i(x))$. As usual, validity of the master equation (S, S) = 0 implies that s is nilpotent:
$$s^2(\phi_i) = (S, (S, \phi_i)) = \pm((S, S), \phi_i) = 0,$$
using the graded Jacobi identity. Moreover, the action $S_0$ depending on the fields is BRST-closed, i.e. $s S_0 = 0$, which follows by considering the part of the master equation (S, S) = 0 that is independent of the BRST-sources. The following result establishes an action and coaction on the quotient BV-algebra $A_R/I$.

Theorem 33. Let $G_R^I$ be the (closed) subgroup of $G_R$ defined in Proposition 29 consisting of diffeomorphisms f that leave I invariant, i.e. such that $f(I) \subset I$:
1. The group $G_R^I$ acts on the quotient BV-algebra $A_R/I$.
2. The ideal in $H_R$ defined by
$$J := \{ X \in H_R : f(X) = 0 \text{ for all } f \in G_R^I \} \qquad (10)$$
is a Hopf ideal.
Consequently, $G_R^I \simeq \mathrm{Hom}_{\mathbb{C}}(H_R/J, \mathbb{C})$ and the quotient Hopf algebra $\tilde{H}_R = H_R/J$ coacts on $A_R/I$.

Proof. First observe that $G_R^I$ is closed since it can be given as the zero-set of polynomials in $H_R$. Indeed, following [44, Lemma 12.4] we can write
$$\rho(w_j) = \sum_{k \in \mathcal{K}} f(a_{kj})\, w_k = \sum_{k \in \mathcal{K} - \mathcal{I}} f(a_{kj})\, w_k + \sum_{i \in \mathcal{I}} f(a_{ij})\, w_i,$$
where $\{w_k : k \in \mathcal{K}\}$ is a (countable) basis for $A_R$ and $\{w_i : i \in \mathcal{I}\}$ a basis for I. Thus, f should satisfy the equations $f(a_{kj}) = 0$ for $k \in \mathcal{K}$ and $i \in \mathcal{I}$. For 2, we adopt the standard practice in algebraic geometry to relate (closed) subspaces to (radical) ideals. In the present case, we have a one-to-one correspondence between closed subspaces of $\mathrm{Hom}_{\mathbb{C}}(H_R, \mathbb{C})$ and radical ideals in the algebra $H_R$ as follows: to each subspace G one associates an ideal $J_G$ (which is prime and hence radical) by the above formula (10) and vice versa, for every such ideal J there is a subspace $G_J := \mathrm{Hom}_{\mathbb{C}}(H/J, \mathbb{C})$. By [27, Prop. 1.2] it follows that $G_{J_G} = G$ and $J_{G_J} = J$. Furthermore, if G carries a group structure (as is the case for $G_R^I$), the algebra $H/J_G$ is in fact a Hopf algebra which implies that $J_G$ is a Hopf ideal. We denote the coaction of $\tilde{H}_R := H_R/J$ on $A_R$ by $\tilde{\rho}$; it is given explicitly by
$$\tilde{\rho}(a + I) = (\pi_I \otimes \pi_J)\, \rho(a), \qquad (11)$$
for $a \in A_R$; also, $\pi_I$ and $\pi_J$ are the projections onto the quotient algebra and Hopf algebra by I and J respectively.

Let us now justify the origin of the explicit Hopf ideals that we have encountered in the previous section in the case that all coupling constants coincide. This happens for instance in the case of Yang–Mills theory with a simple gauge group, which is discussed in Sect. 5. In general, we make the following definition:


Definition 34. A theory defined by S is called simple when the following holds modulo the ideal $\langle \lambda_v \rangle_{\mathrm{val}(v)=2}$:
$$I = \left\langle \lambda_{v'}^{N(v)-2} - \lambda_{v}^{N(v')-2} \right\rangle_{\mathrm{val}(v),\, \mathrm{val}(v') > 2}. \qquad (12)$$
In other words, if we put the mass terms in S to zero, then the ideal I should be generated by the differences $\lambda_{v'}^{N(v)-2} - \lambda_{v}^{N(v')-2}$ for vertices with valence greater than 2. We denote by I the ideal in Eq. (12) modulo $\langle \lambda_v \rangle_{\mathrm{val}(v)=2}$. A convenient choice of generators for I is the following. Fix a vertex $v \in R_V$ of valence three,4 and define $g := \lambda_v$ as the ‘fundamental’ coupling constant. Then I is generated by $\lambda_v$ with val(v) = 2 and $\lambda_{v'} - g^{N(v')-2}$ with val(v') > 2. Recall the ideal J from the previous section.

Theorem 35. Let S define a simple theory in the sense described above.

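1. The subgroup $G^I$ of diffeomorphisms that leave I invariant is isomorphic to $\mathrm{Hom}_{\mathbb{C}}(H_R/J, K)$.
2. The Hopf algebra $H_R/J$ coacts on $\mathbb{C}[[g, \phi]] := A_R/I$ via the map
$$\rho : g \longmapsto \sum_{l=0}^{\infty} g^{2l+1} \otimes q_l(X), \qquad \rho : \phi \longmapsto \sum_{l=0}^{\infty} g^{2l} \phi \otimes q_l(C^{\phi}).$$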

Proof. From the proof of Theorem 33 we see that 1 is equivalent to showing that $G_R^I \simeq (G_R)^J$. Indeed, $(G_R)^J$ is the subgroup of characters on $H_R$ that vanish on $J \subset H_R$, which is isomorphic to $\mathrm{Hom}_{\mathbb{C}}(H_R/J, K)$. On the generators of I, an element $f \in G_R$ acts as
$$f\bigl(\lambda_{v'} - g^{N(v')-2}\bigr) = \sum_{n_1, \ldots, n_k} \lambda_{v_1}^{n_1} \cdots \lambda_{v_k}^{n_k} \Bigl[ \lambda_{v'}\, f\bigl(p_{n_1, \ldots, n_k}(Y_{v'})\bigr) - g^{N(v')-2} f\bigl(p_{n_1, \ldots, n_k}(Y_v^{N(v')-2})\bigr) \Bigr],$$
where v is the chosen vertex of valence 3 corresponding to g. We will reduce this expression by replacing $\lambda_{v_i}$ by $g^{N(v_i)-2}$, modulo terms in I. Together with Lemma 5 this yields
$$f\bigl(\lambda_{v'} - g^{N(v')-2}\bigr) = \sum_{l=0}^{\infty} g^{2l+N(v')-2} f\Bigl(q_l\bigl(Y_{v'} - Y_v^{N(v')-2}\bigr)\Bigr) \mod I.$$
The requirement that this is an element in I is equivalent to the requirement that f vanishes on $q_l(Y_{v'} - Y_v^{N(v')-2})$, i.e. on the generators of J, establishing the isomorphism $G_R^I \simeq (G_R)^J$. For 2, one can easily compute $\rho(I) \subset I \otimes H_R + A_R \otimes J$ so that $H_R/J$ coacts on $A_R$ by projecting onto the two quotient algebras (as in Eq. (11)).

4 We suppose that there exists such a vertex; if not, the construction works equally well by choosing the vertex of lowest valence that is present in the set $R_V$.


Corollary 36. The group $G_R^I$ acts on $A_R/I$ as a subgroup of $(\mathbb{C}[[g]]^{\times})^{|R_E|} \rtimes \mathrm{Diff}(\mathbb{C}, 0)$.

This last result has a very nice physical interpretation: the invertible formal power series act on the $|R_E|$ propagating fields as wave function renormalization whereas the diffeomorphisms act on one fundamental coupling constant g. We will appreciate this even more in the next section where we discuss the renormalization group flow.

4.6. Renormalization group. We will now establish a connection between the group of diffeographisms and the renormalization group à la Gell-Mann and Low [22]. This group describes the dependence of the renormalized amplitudes $\phi_+(z)$ on a mass scale that is implicit in the renormalization procedure. In fact, in dimensional regularization, in order to keep the loop integrals $d^{4-z}k$ dimensionless for complex z, one introduces a factor of $\mu^z$ in front of them, where µ has dimension of mass and is called the unit of mass. For a Feynman graph Γ, Lemma 5 shows that this factor equals $\mu^{z \sum_v (N(v)-2)\, \delta_v(\Gamma)/2}$, reflecting the fact that the coupling constants appearing in the action get replaced by
$$\lambda_v \to \mu^{z (N(v)-2)/2}\, \lambda_v$$
for every vertex $v \in R_V$. As before, the Feynman rules define a loop $\gamma_\mu : C \to G \equiv G(C)$, which now depends on the mass scale µ. Consequently, there is a Birkhoff decomposition for each µ:
$$\gamma_\mu(z) = \gamma_{\mu,-}(z)^{-1}\, \gamma_{\mu,+}(z) \qquad (z \in C).$$
As was shown in [18], the negative part $\gamma_{\mu,-}(z)$ of this Birkhoff decomposition is independent of the mass scale, that is
$$\frac{\partial}{\partial \mu} \gamma_{\mu,-}(z) = 0.$$
Hence, we can drop the index µ and write $\gamma_-(z) := \gamma_{\mu,-}(z)$. In terms of the generator $\theta_t$ for the one-parameter subgroup of G(K) corresponding to the grading l on H, we can write
$$\gamma_{e^t \mu}(z) = \theta_{tz}\bigl(\gamma_\mu(z)\bigr), \qquad (t \in \mathbb{R}).$$
A proof of this and the following result can be found in [18].

Proposition 37. The limit
$$F_t := \lim_{z \to 0} \gamma_-(z)\, \theta_{tz}\bigl(\gamma_-(z)^{-1}\bigr)$$
exists and defines a 1-parameter subgroup of G which depends polynomially on t when evaluated on an element $X \in H$.

In physics, this 1-parameter subgroup goes under the name of renormalization group. In fact, using the Birkhoff decomposition, we can as well write
$$\gamma_{e^t \mu,+}(0) = F_t\, \gamma_{\mu,+}(0), \qquad (t \in \mathbb{R}).$$


This can be formulated in terms of the generator $\beta := \frac{d}{dt} F_t \big|_{t=0}$ of this 1-parameter group as
$$\mu \frac{\partial}{\partial \mu} \gamma_{\mu,+}(0) = \beta\, \gamma_{\mu,+}(0). \qquad (13)$$
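As a consistency check, Eq. (13) can be read off from the relation between $\gamma_{e^t\mu,+}(0)$ and $F_t\,\gamma_{\mu,+}(0)$ stated just before it, by differentiating both sides at t = 0. The sketch below assumes that $\gamma_{\mu,+}(0)$ depends differentiably on µ and that the product with $F_t$ may be differentiated termwise; neither assumption is spelled out in the text.

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} \gamma_{e^{t}\mu,+}(0)
  &= \frac{d(e^{t}\mu)}{dt}\Big|_{t=0}\,
     \frac{\partial}{\partial\mu}\gamma_{\mu,+}(0)
   = \mu\,\frac{\partial}{\partial\mu}\gamma_{\mu,+}(0), \\
\frac{d}{dt}\Big|_{t=0} \bigl( F_t\,\gamma_{\mu,+}(0) \bigr)
  &= \Bigl( \frac{d}{dt} F_t \Big|_{t=0} \Bigr)\,\gamma_{\mu,+}(0)
   = \beta\,\gamma_{\mu,+}(0).
\end{align*}
```

Equating the two right-hand sides gives the displayed equation (13).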

Let us now establish that this is indeed the beta-function familiar from physics by exploring how it acts on the coupling constants $\lambda_v$. First of all, although the name might suggest otherwise, the coupling constants depend on the energy or mass scale µ. Recall the action of $G_R$ on $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$ defined in the previous section. In the case of $\gamma_{\mu,+}(0) \in G_R$, we define the (renormalized) coupling constant at scale µ to be $\lambda_v(\mu) = \gamma_{\mu,+}(0)(\lambda_v)$. This function of µ (with coefficients in $\mathbb{C}[[\lambda_v]]$) satisfies the following differential equation:
$$\beta(\lambda_v(\mu)) = \mu \frac{\partial}{\partial \mu} (\lambda_v(\mu)),$$
which follows easily from Eq. (13). This is exactly the renormalization group equation expressing the flow of the coupling constants $\lambda_v$ as a function of the energy scale µ. Moreover, if we extend β by linearity to the action S of Eq. (7), we obtain Wilson’s continuous renormalization equation [45]:
$$\beta(S(\mu)) = \mu \frac{\partial}{\partial \mu} (S(\mu)).$$
This equation has been explored in the context of renormalization Hopf algebras in [25,30]. Equation (13) expresses β completely in terms of $\gamma_{\mu,+}$; as we will now demonstrate, this allows us to derive that in the case of a simple theory all β-functions coincide. First, recall that the maps $\gamma_\mu$ are the Feynman rules dictated by S in the presence of the mass scale µ, which we suppose to satisfy the master equation (8). In other words, we are in the quotient of $A_R$ by $I = \langle (S, S) \rangle$. In addition, assume that the theory defined by S is simple. If the regularization procedure respects gauge invariance, it is well-known that the Feynman amplitudes satisfy the Slavnov–Taylor identities for the couplings. In terms of the ideal J defined in the previous section, this means that $\gamma_\mu(J) = 0$. Since J is a Hopf ideal (Theorem 15), it follows that both $\gamma_{\mu,-}$ and $\gamma_{\mu,+}$ vanish on J. Indeed, the character γ given by the Feynman rules factorizes through $H_R/J$ for which the Birkhoff decomposition gives two characters $\gamma_+$ and $\gamma_-$ of $H_R/J$. In other words, if the unrenormalized Feynman amplitudes given by $\gamma_\mu$ satisfy the Slavnov–Taylor identities, so do the counterterms and the renormalized Feynman amplitudes. In particular, we find with Eq. (13) that β vanishes on the ideal I in $\mathbb{C}[[\lambda_{v_1}, \ldots, \lambda_{v_k}]]$. This implies the following result, which is well-known in the physics literature:

Proposition 38. For a simple theory, all β-functions are expressed in terms of β(g) for the fundamental coupling constant g: $\beta(\lambda_v) = \beta(g^{N(v)-2})$.
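To illustrate Proposition 38, the following sketch evaluates $\beta(g^{N(v)-2})$ for a vertex with N(v) = 4. It assumes that β acts on powers of g by the chain rule, as it does when read as $\mu\,\partial/\partial\mu$ along the flow g(µ) in the renormalization group equation above; this reading is an assumption made here for illustration and is not a statement taken from the text.

```latex
\begin{align*}
\beta(\lambda_v) = \beta\bigl(g^{N(v)-2}\bigr)
  &= \mu\,\frac{\partial}{\partial\mu}\, g(\mu)^{N(v)-2}
   = \bigl(N(v)-2\bigr)\, g^{N(v)-3}\, \mu\,\frac{\partial g}{\partial\mu}
   = \bigl(N(v)-2\bigr)\, g^{N(v)-3}\, \beta(g), \\
N(v) = 4: \qquad \beta(\lambda_v) &= 2\, g\, \beta(g).
\end{align*}
```

Under this assumption, a single function β(g) determines the flow of every coupling in a simple theory.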


5. Example: Pure Yang–Mills Theory

Let us now exemplify the above construction in the case of a pure Yang–Mills theory. Let G be a simple Lie group with Lie algebra g. The gauge field A is a g-valued one-form, that is, a section of $\Lambda^1 \otimes (M \times g)$. As before, we have in components $A = A_i^a\, dx^i\, T^a$, where the $\{T^a\}$ form a basis for g. The structure constants $\{f^{ab}_c\}$ of g are defined by $[T^a, T^b] = f^{ab}_c T^c$ and the normalization is such that $\mathrm{tr}(T^a T^b) = \delta^{ab}$. In addition to the gauge fields, there are ghost fields $\omega, \bar\omega$ which are sections of $M \times g[-1]$ and $M \times g[1]$, respectively, and we write $\omega = \omega^a T^a$ and $\bar\omega = \bar\omega^a T^a$. The auxiliary field – also known as the Nakanishi–Lautrup field – is denoted by $h = h^a T^a$ and is a section of $M \times g$. The form degree and ghost degree of the fields are combined in the total degree and summarized in the following table:

                 A      $\omega$   $\bar\omega$   h
ghost degree     0      +1         −1             0
form degree      +1     0          0              0
total degree     +1     +1         −1             0

We introduce BRST-sources for each of the above fields, $K_A$, $K_\omega$, $K_{\bar\omega}$ and $K_h$. The shift in ghost degree is illustrated by the following table:

                 $K_A$   $K_\omega$   $K_{\bar\omega}$   $K_h$
ghost degree     −1      −2           0                  −1
form degree      +1      0            0                  0
total degree     0       −2           0                  −1

With these degrees, we can generate the algebra of local forms Loc(Φ), which decomposes as before into $\mathrm{Loc}^{(p,q)}(\Phi)$ with p the form degree and q the ghost degree. The total degree is then p + q and Loc(Φ) is a graded Lie algebra by setting
$$[X, Y] = XY - (-1)^{\deg(X)\deg(Y)}\, YX,$$
with the grading given by this total degree. This bracket should not be confused with the anti-bracket defined on local functionals in Sect. 4.3. The present graded Lie bracket is of degree 0 with respect to the total degree, that is, deg([X, Y]) = deg(X) + deg(Y). It satisfies graded skew-symmetry, the graded Leibniz identity and the graded Jacobi identity:
$$[X, Y] = -(-1)^{\deg(X)\deg(Y)}\, [Y, X],$$
$$[XY, Z] = X[Y, Z] + (-1)^{\deg(Y)\deg(Z)}\, [X, Z]\, Y,$$
$$(-1)^{\deg(X)\deg(Z)}\, [[X, Y], Z] + (\text{cyclic perm.}) = 0.$$

5.1. The Yang–Mills action. In the setting of Sect. 4.5, the action S for pure Yang–Mills theory is the local functional
$$S = \int_M \mathrm{tr}\Bigl[ -dA * dA - \lambda_{A^3}\, dA * [A, A] - \tfrac{1}{4} \lambda_{A^4}\, [A, A] * [A, A] - A * dh + d\bar\omega * d\omega + \tfrac{1}{2} \xi\, h * h + \lambda_{\bar\omega A \omega}\, d\bar\omega * [A, \omega] - \Bigl( \langle d\omega, K_A \rangle + \lambda_{A \omega K_A} \langle [A, \omega], K_A \rangle + \langle h, K_{\bar\omega} \rangle + \tfrac{1}{2} \lambda_{\omega^2 K_\omega} \langle [\omega, \omega], K_\omega \rangle \Bigr) * 1 \Bigr], \qquad (14)$$


where ∗ denotes the Hodge star operator and ξ is the so-called gauge fixing (real) parameter. Also $\langle \cdot, \cdot \rangle$ denotes the pairing between 1-forms and vector fields (or 0-forms and 0-forms). In contrast with the usual formula for the action in the literature, we have inserted the different coupling constants $\lambda_v$ for each of the interaction monomials in the action. We will now show that validity of the master equation (S, S) = 0 implies that all these coupling constants are expressed in terms of one single coupling. First, using Eq. (9) we derive from the above expression the BRST-differential on the generators
$$sA = -d\omega - \lambda_{A \omega K_A} [A, \omega], \qquad s\omega = -\tfrac{1}{2} \lambda_{\omega^2 K_\omega} [\omega, \omega], \qquad s\bar\omega = -h, \qquad sh = 0.$$

The BRST-differential is extended to all of $\mathrm{Loc}^{(p,q)}(\Phi)$ by the graded Leibniz rule, and imposing it to anti-commute with the exterior derivative d. Actually, rather than on Loc(Φ), the BRST-differential is defined on the algebra $\mathbb{C}[[\lambda_{A^3}, \lambda_{A^4}, \lambda_{\bar\omega A \omega}, \lambda_{A \omega K_A}, \lambda_{\omega^2 K_\omega}]] \otimes \mathrm{Loc}(\Phi)$. However, in order not to lose ourselves in notational complexities, we denote this tensor product by Loc(Φ) as well. Now, validity of the master equation implies that $s^2 = 0$. One computes using the graded Jacobi identity that
$$s^2(A) = \bigl( \lambda_{A \omega K_A} - \lambda_{\omega^2 K_\omega} \bigr) [d\omega, \omega] + \tfrac{1}{2} \bigl( \lambda_{A \omega K_A}^2 - \lambda_{A \omega K_A} \lambda_{\omega^2 K_\omega} \bigr) [A, [\omega, \omega]],$$
from which it follows that $\lambda_{A \omega K_A} = \lambda_{\omega^2 K_\omega}$. Thus, with this relation s becomes a differential, and actually forms – together with the exterior derivative – a bicomplex in which $s \circ d + d \circ s = 0$. Next, the master equation implies that $s S_0 = 0$ and a lengthy computation yields for the first three terms in $S_0$ that

[Diagram: the bicomplex $\mathrm{Loc}^{(p,q)}(\Phi)$, shown for p = 0, 1, 2, ... and q = ..., −1, 0, 1, ..., with horizontal arrows $d : \mathrm{Loc}^{(p,q)} \to \mathrm{Loc}^{(p+1,q)}$ and vertical arrows $s : \mathrm{Loc}^{(p,q)} \to \mathrm{Loc}^{(p,q+1)}$.]


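$$s\Bigl( -dA * dA - \lambda_{A^3}\, dA * [A, A] - \tfrac{1}{4} \lambda_{A^4}\, [A, A] * [A, A] \Bigr) = 2 \bigl( \lambda_{A \omega K_A} - \lambda_{A^3} \bigr)\, dA * [A, d\omega] + \bigl( \lambda_{A^4} - \lambda_{A^3} \lambda_{A \omega K_A} \bigr) [d\omega, A] * [A, A] + \lambda_{A \omega K_A} \Bigl[ -dA * dA - \lambda_{A^3}\, dA * [A, A] - \tfrac{1}{4} \lambda_{A^4}\, [A, A] * [A, A],\ \omega \Bigr].$$
The last term is a commutator on which the trace vanishes and one is thus left with the equalities $\lambda_{A \omega K_A} = \lambda_{A^3}$ and $\lambda_{A^4} = \lambda_{A^3} \lambda_{A \omega K_A}$. The remaining terms in $S_0$ yield under the action of s,
$$s\Bigl( -A * dh + d\bar\omega * d\omega + \tfrac{1}{2} \xi\, h * h + \lambda_{\bar\omega A \omega}\, d\bar\omega * [A, \omega] \Bigr) = \bigl( \lambda_{A \omega K_A} - \lambda_{\bar\omega A \omega} \bigr) [A, \omega] * dh + \bigl( \lambda_{\omega^2 K_\omega} - \lambda_{\bar\omega A \omega} \bigr)\, d\bar\omega * [d\omega, \omega].$$
Thus, the master equation implies $\lambda_{A \omega K_A} = \lambda_{\bar\omega A \omega}$ and $\lambda_{\omega^2 K_\omega} = \lambda_{\bar\omega A \omega}$. Finally, if we write $g = \lambda_{A^3}$, the master equation implies that
$$\lambda_{A^4} = g^2 \quad \text{and} \quad \lambda_{\bar\omega A \omega} = \lambda_{A \omega K_A} = \lambda_{\omega^2 K_\omega} = g. \qquad (15)$$
This motivates our definition of a simple theory in Sect. 4.5 above. Imposing these relations reduces the action S to the usual
$$S = \int_M \mathrm{tr}\Bigl[ -F * F - A * dh + d\bar\omega * d\omega + g\, d\bar\omega * [A, \omega] + \tfrac{1}{2} \xi\, h * h + sA * K_A + s\omega * K_\omega + s\bar\omega * K_{\bar\omega} \Bigr]$$
with the field strength F given by $F = dA + \tfrac{g}{2} [A, A]$ and the BRST-differential now given by
$$sA = -d\omega - g [A, \omega], \qquad s\omega = -\tfrac{1}{2} g [\omega, \omega], \qquad s\bar\omega = -h, \qquad sh = 0.$$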

The extension to include fermions is straightforward, leading to similar expressions of the corresponding coupling constants in terms of g.
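As a cross-check of the identifications $\lambda_{A^3} = g$ and $\lambda_{A^4} = g^2$ made above, one can expand $F * F$ with $F = dA + \tfrac{g}{2}[A, A]$ and compare with the first three terms of Eq. (14). The sketch below assumes that $\mathrm{tr}(\alpha * \beta) = \mathrm{tr}(\beta * \alpha)$ for g-valued 2-forms α, β of ghost degree zero; this symmetry is not stated explicitly in the text and is an assumption made here.

```latex
\begin{align*}
F * F &= \bigl(dA + \tfrac{g}{2}[A,A]\bigr) * \bigl(dA + \tfrac{g}{2}[A,A]\bigr) \\
      &= dA * dA
       + \tfrac{g}{2}\bigl(dA * [A,A] + [A,A] * dA\bigr)
       + \tfrac{g^{2}}{4}\,[A,A] * [A,A], \\
\operatorname{tr}(F * F)
      &= \operatorname{tr}\Bigl(dA * dA + g\, dA * [A,A]
       + \tfrac{g^{2}}{4}\,[A,A] * [A,A]\Bigr).
\end{align*}
```

Term by term, $-\mathrm{tr}(F * F)$ then matches $-dA * dA - \lambda_{A^3}\, dA * [A, A] - \tfrac{1}{4}\lambda_{A^4}\,[A, A] * [A, A]$ precisely when $\lambda_{A^3} = g$ and $\lambda_{A^4} = g^2$.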

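5.2. The action of $G_R$. As alluded to before, when the counterterm map – seen as an element in $G_R$ – acts on the action S, it coincides with wave function renormalization. Let us make this precise in the present case. Clearly, wave function renormalization is given by the following factors:
$$Z_A = \gamma_-(z)\bigl(G^{\text{gluon}}\bigr); \qquad Z_\omega = Z_{\bar\omega} = \gamma_-(z)\bigl(G^{\text{ghost}}\bigr),$$
where the superscripts stand for the gluon and ghost propagator graphs, respectively. With this definition and Theorem 27 we find that $\gamma_-(z)$ acts as
$$\gamma_-(z) \cdot (dA * dA) = \gamma_-(z)\bigl( (C^A)^2 \bigr)\, dA * dA = Z_A\, dA * dA,$$
$$\gamma_-(z) \cdot (d\bar\omega * d\omega) = \gamma_-(z)\bigl( C^{\bar\omega} C^{\omega} \bigr)\, d\bar\omega * d\omega = Z_\omega\, d\bar\omega * d\omega,$$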


by definition of the $C^\phi$’s. This is precisely wave function renormalization for the gluon and ghost fields. Thus, renormalizing through the coefficients $\gamma_-(z)(C^\phi)$ – although more appropriate for the BV-formalism – is completely equivalent to the usual wave function renormalization (see also [1, Sect. 6]). By construction, the terms $-A * dh$ and $\langle h, K_{\bar\omega} \rangle$ do not receive radiative corrections. Indeed, this follows from the relations:
$$C^b C^A = 1; \qquad C^{K_{\bar\omega}} C^b = 1,$$
in $H_R$. Consequently, $G_R$ – and in particular the counterterm map $\gamma_-(z)$ – acts as the identity on these monomials. In fact, one realizes that $S_0 = \gamma_-(z) \cdot S$ is the renormalized action, and since $\gamma_-(z) \in G_R$ acts as a BV-algebra map, also $S_0$ satisfies the master equation $(S_0, S_0) = 0$. This will be further explored in future work.

5.3. The Slavnov–Taylor identities. We now use Theorem 35 to obtain the relations between the Green’s functions in Yang–Mills theory that are induced by the above master equation (S, S) = 0. In fact, the action S defines a simple theory in the sense defined before and Eq. (15) implies that the following relations hold in the quotient Hopf algebra $H_R/J$ (the graphs indexing the elements Y are denoted here by the corresponding vertex monomials):
$$Y_{A^4} = (Y_{A^3})^2 \quad \text{and} \quad Y_{\bar\omega A \omega} = Y_{A \omega K_A} = Y_{\omega^2 K_\omega} = Y_{A^3}.$$
In terms of the Green’s functions the most relevant read:
$$\frac{G^{A^4}}{(G^{\text{gluon}})^2} = \left( \frac{G^{A^3}}{(G^{\text{gluon}})^{3/2}} \right)^2, \qquad \frac{G^{\bar\omega A \omega}}{(G^{\text{gluon}})^{1/2}\, G^{\text{ghost}}} = \frac{G^{A^3}}{(G^{\text{gluon}})^{3/2}}, \qquad \text{and} \quad G^{[\ldots]} = G^{[\ldots]}.$$
These are precisely the Slavnov–Taylor identities for the coupling constants for pure Yang–Mills theory with a simple Lie group.

6. Outlook

The connection we have established between renormalization Hopf algebras for gauge theories and the BV-algebras generated by the relevant fields and coupling constants paves the way for an incorporation of the full BV-formalism in the context of Hopf algebras. This formalism is very powerful in that it can handle theories that are renormalizable ‘in the modern sense’. Instead of restricting to Lagrangians with a finite number of terms, one allows here a formal series admitting an infinite number of counterterms; the only condition is then the (quantum) master equation. We expect that in this case the group $(\mathbb{C}[[g]]^{\times})^{|R_E|} \rtimes \mathrm{Diff}(\mathbb{C}, 0)$ encountered above gets replaced by the semi-direct product of so-called canonical transformations with the diffeomorphism group. Here canonical transformations are automorphisms of the BV-algebra $A_R$, thus respecting the bracket. Another perspective of our work is in the direction of BRST-quantization. A description of the BRST-formalism – typically exploited in the physical literature involving functional methods – in the Hopf algebraic setting would elucidate the role it plays in renormalization of gauge theories.


There are potential applications of the current setup in the approach taken by Hollands in [28] to perturbatively quantizing Yang–Mills theories on curved spacetimes. There, Ward identities are formulated in terms of functionals as well and renormalization is supposed to respect them. Motivated by the present construction in momentum space, it is expected that these identities induce Hopf ideals in the Hopf algebra of [37] describing Epstein–Glaser renormalization. Another subject we have not touched is gauge theories with spontaneous symmetry breaking. It would be interesting to study renormalization of such theories in the present setup. Finally, the necessity of the Slavnov–Taylor-like identities in the work of Kreimer on quantum gravity [33,34] is quite intriguing. In fact, Theorem 15 can be extended [35] to the so-called core Hopf algebra that was introduced in [13], consisting of graphs with vertices of any valence. We postpone the study of the effective action in the Hopf algebraic setting to our next paper. The Zinn–Justin equation it satisfies will play a similar role as the (classical) master equation (8) in imposing identities between the 1PI Green’s functions, albeit now for any interaction and not only for those represented by the set R V as discussed in Sect. 4.5. Also, we will connect with the usual order-by-order in the loop number approach to renormalization of gauge theories that is taken in the physics literature. Acknowledgements. The author would like to thank Caterina Consani, George Elliott, Johan Martens and Jim Stasheff for their kind invitations in October 2007, where much of this work was initiated. I want to thank Glenn Barnich, Detlev Buchholz, Alain Connes, Klaus Fredenhagen, Eugene Ha, Dirk Kreimer, Matilde Marcolli and Jack Morava for valuable discussions and remarks. Finally, the Hausdorff Research Institute for Mathematics in Bonn is acknowledged for their hospitality during the final stages of this work. 
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Anselmi, D.: Removal of divergences with the Batalin-Vilkovisky formalism. Class. Quant. Grav. 11, 2181–2204 (1994)
2. van Baalen, G., Kreimer, D., Uminsky, D., Yeats, K.: The QED beta-function from global solutions to Dyson-Schwinger equations. Ann. Phys. 324, 205–219 (2009)
3. Barnich, G., Brandt, F., Henneaux, M.: Local BRST cohomology in the antifield formalism. I. General theorems. Commun. Math. Phys. 174, 57–92 (1995)
4. Barnich, G., Brandt, F., Henneaux, M.: Local BRST cohomology in the antifield formalism. II. Application to Yang-Mills theory. Commun. Math. Phys. 174, 93–116 (1995)
5. Batalin, I.A., Vilkovisky, G.A.: Gauge algebra and quantization. Phys. Lett. B102, 27–31 (1981)
6. Batalin, I.A., Vilkovisky, G.A.: Feynman rules for reducible gauge theories. Phys. Lett. B120, 166–170 (1983)
7. Batalin, I.A., Vilkovisky, G.A.: Quantization of Gauge theories with linearly dependent generators. Phys. Rev. D28, 2567–2582 (1983)
8. Becchi, C., Rouet, A., Stora, R.: The abelian Higgs-Kibble model. Unitarity of the S operator. Phys. Lett. B52, 344 (1974)
9. Becchi, C., Rouet, A., Stora, R.: Renormalization of the abelian Higgs-Kibble model. Commun. Math. Phys. 42, 127–162 (1975)
10. Becchi, C., Rouet, A., Stora, R.: Renormalization of gauge theories. Annals Phys. 98, 287–321 (1976)
11. Bergbauer, C., Kreimer, D.: The Hopf algebra of rooted trees in Epstein-Glaser renormalization. Ann. H. Poincaré 6, 343–367 (2005)
12. Bergbauer, C., Kreimer, D.: Hopf algebras in renormalization theory: Locality and Dyson-Schwinger equations from Hochschild cohomology. IRMA Lect. Math. Theor. Phys. 10, 133–164 (2006)
13. Bloch, S., Kreimer, D.: Mixed Hodge structures and renormalization in physics. Commun. Num. Theor. Phys. 2 4, 637–718 (2008)


14. Brouder, C., Frabetti, A.: Renormalization of QED with planar binary trees. Eur. Phys. J. C19, 715–741 (2001)
15. Brouder, C., Frabetti, A., Krattenthaler, C.: Non-commutative Hopf algebra of formal diffeomorphisms. Adv. Math. 200, 479–524 (2006)
16. Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds. Commun. Math. Phys. 208, 623–661 (2000)
17. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem. I: The Hopf algebra structure of graphs and the main theorem. Commun. Math. Phys. 210, 249–273 (2000)
18. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem. II: The beta-function, diffeomorphisms and the renormalization group. Commun. Math. Phys. 216, 215–241 (2001)
19. Dütsch, M., Fredenhagen, K.: Perturbative algebraic field theory, and deformation quantization. In: Mathematical physics in mathematics and physics (Siena, 2000), Volume 30 of Fields Inst. Commun., Providence, RI: Amer. Math. Soc., 2001, pp. 151–160
20. Figueroa, H., Gracia-Bondia, J.M., Varilly, J.C.: Faà di Bruno Hopf algebras. http://arXiv.org/List/math/0508337, 2005
21. Foissy, L.: Faà di Bruno subalgebras of the Hopf algebra of planar trees from combinatorial Dyson-Schwinger equations. Adv. Math. 218, 136–162 (2008)
22. Gell-Mann, M., Low, F.E.: Quantum electrodynamics at small distances. Phys. Rev. 95, 1300–1312 (1954)
23. Gerstenhaber, M.: The cohomology structure of an associative ring. Ann. of Math. 78, 267–288 (1963)
24. Girelli, F., Krajewski, T., Martinetti, P.: Wave-function renormalization and the Hopf algebra of Connes and Kreimer. Mod. Phys. Lett. A16, 299–303 (2001)
25. Girelli, F., Krajewski, T., Martinetti, P.: An algebraic Birkhoff decomposition for the continuous renormalization group. J. Math. Phys. 45, 4679–4697 (2004)
26. Gomis, J., Weinberg, S.: Are nonrenormalizable gauge theories renormalizable? Nucl. Phys. B469, 473–487 (1996)
27. Hartshorne, R.: Algebraic Geometry. Number 52 in Graduate Texts in Mathematics. New York: Springer-Verlag, 1977
28. Hollands, S.: Renormalized quantum Yang-Mills fields in curved spacetime. Rev. Math. Phys. 20, 1033–1172 (2008)
29. Hollands, S., Wald, R.M.: On the renormalization group in curved spacetime. Commun. Math. Phys. 237, 123–160 (2003)
30. Krajewski, T., Martinetti P.: Wilsonian renormalization, differential equations and Hopf algebras. http://arXiv.org/abs/:0806.4309v2[hepth], 2008
31. Kreimer, D.: On the Hopf algebra structure of perturbative quantum field theories. Adv. Theor. Math. Phys. 2, 303–334 (1998)
32. Kreimer, D.: Anatomy of a gauge theory. Ann. Phys. 321, 2757–2781 (2006)
33. Kreimer, D.: A remark on quantum gravity. Ann. Phys. 323, 49–60 (2008)
34. Kreimer, D.: Not so non-renormalizable gravity. In: Fauser, B., Tolksdorf, J., Zeidlers, E. (eds.) Quantum Field Theory: Competitive Models. Birkhaeuser (2009)
35. Kreimer, D., van Suijlekom, W.D.: Recursive relations in the core Hopf algebra. Nucl. Phys. B (to appear)
36. Kreimer, D., Yeats, K.: An étude in non-linear Dyson-Schwinger equations. Nucl. Phys. B Proc. Suppl. 160, 116–121 (2006)
37. Pinter, G.: The Hopf algebra structure of Connes and Kreimer in Epstein-Glaser renormalization. Lett. Math. Phys. 54, 227–233 (2000)
38. Saunders, D.J.: The Geometry of Jet Bundles. Cambridge: Cambridge University Press, 1989
39. Stasheff, J.: Deformation theory and the Batalin-Vilkovisky master equation. In: Deformation theory and symplectic geometry (Ascona, 1996), Volume 20 of Math. Phys. Stud., Dordrecht: Kluwer Acad. Publ., 1997, pp. 271–284
40. van Suijlekom, W.D.: Renormalization of gauge fields: A Hopf algebra approach. Commun. Math. Phys. 276, 773–798 (2007)
41. van Suijlekom, W.D.: Multiplicative renormalization and Hopf algebras. In: Ceyhan, O., Manin, Y.-I., Marcolli, M. eds, Arithmetic and Geometry Around Quantization. Basel: Birkhäuser Verlag, 2008
42. van Suijlekom, W.D.: Renormalization of gauge fields using Hopf algebras. In: Fauser, J.T.B., Zeidler, E., eds., Quantum Field Theory. Basel: Birkhäuser Verlag, 2008
43. Tyutin, I.V.: Gauge invariance in field theory and statistical physics in operator formalism. LEBEDEV-75-39, available at http://arXiv.org/abs/0812.0580v2[hepth], 2008
44. Waterhouse, W.C.: Introduction to Affine Group Schemes. New York: Springer, 1979
45. Wilson, K.G.: Renormalization group methods. Adv. in Math. 16, 170–186 (1975)

Communicated by A. Connes

Commun. Math. Phys. 290, 321–334 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0725-9

Communications in

Mathematical Physics

Large Deviations, Fluctuations and Shrinking Intervals Mark Pollicott1 , Richard Sharp2 1 Department of Mathematics, University of Warwick, Coventry CV4 7AL, UK.

E-mail: [email protected]

2 School of Mathematics, University of Manchester, Oxford Road,

Manchester M13 9PL, UK. E-mail: [email protected]; [email protected] Received: 29 August 2008 / Accepted: 23 September 2008 Published online: 14 January 2009 – © Springer-Verlag 2009

Abstract: This paper concerns the statistical properties of hyperbolic diffeomorphisms. We obtain a large deviation result with respect to slowly shrinking intervals for a large class of Hölder continuous functions. In case of time reversal symmetry, we obtain a corresponding version of the Fluctuation Theorem. 0. Introduction In statistical mechanics, the second law of thermodynamics states that the entropy of a system increases until it reaches an equilibrium. However, since this is a statistical law, away from thermodynamic equilibrium the entropy may increase or decrease over a given amount of time and the Cohen-Gallavotti fluctuation theorem implies that the relative probability that entropy will flow in a direction opposite to that given by the second law of thermodynamics decreases exponentially. To formulate a mathematical model for these results, let T : → be a mixing hyperbolic diffeomorphism, let µ be an equilibrium state for a Hölder continuous function and let : → R be a Hölder continuous function such that dµ > 0. Let MT denote the space of all T -invariant Borel probability measures on and write I = dm : m ∈ MT . If we denote n (x) = (x) + (T x) + · · · + (T n−1 x) then, by standard large deviation estimates, we can deduce that if (− p, p) ⊂ I then the limit µ x : n1 n (x) ∈ ( p − δ, p + δ) 1 lim log (0.1) n→+∞ n µ x : n1 n (x) ∈ (− p − δ, − p + δ) exists. The limit takes a particularly simple form if we assume that T : → has a time reversal symmetry (i.e., an involution i : → such that i ◦ T ◦ i = T −1 )


and we consider functions of the special form $\Psi = \Phi - \Phi \circ i \circ T$, for a given function $\Phi : \Lambda \to \mathbb{R}$. Let $\mu_\Phi$ and $\mu_\Psi$ be the equilibrium states of $\Phi$ and $\Psi$, respectively. A version of the following theorem was formulated by Gallavotti in 1995 [4,5] and particularly nice treatments appear in the work of Ruelle [19] and Maes and Verbitskiy [11], and in the book [8]. For a more abstract formulation, see Wojtkowski [20]. (See also Gentile [6] for the case of Anosov flows.)

Fluctuation Theorem. Suppose that $\mu_\Psi$ is not the measure of maximal entropy for $T$. Then (i) we have that $\int \Psi \, d\mu_\Phi > 0$; (ii) there exists $p^* > 0$ such that, if $|p| < p^*$, then
\[
\lim_{\delta \to 0} \lim_{n \to +\infty} \frac{1}{n} \log \frac{\mu_\Phi\left(\left\{x : \frac{1}{n}\Psi^n(x) \in (p - \delta, p + \delta)\right\}\right)}{\mu_\Phi\left(\left\{x : \frac{1}{n}\Psi^n(x) \in (-p - \delta, -p + \delta)\right\}\right)} = p.
\]

The physical interpretation corresponds to the particular choice of functions $\Phi(x) = -\log \|DT|E^u(x)\|$, for which the corresponding equilibrium state $\mu_\Phi$ is the Sinai-Ruelle-Bowen measure, and $\Psi(x) = -\log |\det(D_x f)|$. The quantity $\int \log |\det(D_x f)| \, d\mu_\Phi(x)$ is then the entropy production originally introduced by Ruelle [18]. The existence of the limit in (0.1) and part (ii) of the Fluctuation Theorem follow from a basic large deviation result. Note that, under the assumption that the equilibrium state for $\Psi$ is not the measure of maximal entropy, we have $\mathrm{int}(\mathcal{I}_\Psi) \neq \emptyset$. In the following, we adopt the convention that $\inf \emptyset = +\infty$, so that the right-hand side below is $-\infty$ when $J \cap \mathrm{int}(\mathcal{I}_\Psi)$ is empty.

Large Deviation Theorem. [9,13] Suppose that $\mu_\Psi$ is not the measure of maximal entropy for $T$. There is a real analytic rate function $I : \mathrm{int}(\mathcal{I}_\Psi) \to \mathbb{R}^+$, with $I(p) = 0$ if and only if $p = \int \Psi \, d\mu_\Phi$, such that, for an interval $J \subset \mathbb{R}$, we have
\[
\lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in J\right\}\right) = -\inf\{I(p) : p \in J \cap \mathrm{int}(\mathcal{I}_\Psi)\}.
\]
In particular, we have that for $p \in \mathrm{int}(\mathcal{I}_\Psi)$,
\[
\lim_{\delta \to 0} \lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in (p - \delta, p + \delta)\right\}\right) = -I(p).
\]

These theorems lead to the following natural question.

Question. Can we obtain similar results where $\delta$ is allowed to shrink as a function of $n$ (and we only need to take a single limit as $n \to +\infty$)? More precisely, if $\delta_n$ decreases to zero sufficiently slowly, do we have
\[
\lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) = -I(p)?
\]

We shall show that, subject to a modest condition on the function $\Psi$, the answer to the question is always in the affirmative provided $\delta_n^{-1}$ grows no faster than $n^{1+\kappa}$, for some $\kappa = \kappa(\Psi) > 0$. The condition on $\Psi$ that we require is the following.

Large Deviations, Fluctuations and Shrinking Intervals


Diophantine Condition. We say that a function $\Psi$ satisfies the Diophantine condition (with respect to a transformation $T$) if there are periodic orbits $T^{n_i} x_i = x_i$ ($i = 1, 2, 3$) such that
\[
\alpha = \frac{\Psi^{n_3}(x_3) - \Psi^{n_1}(x_1)}{\Psi^{n_2}(x_2) - \Psi^{n_1}(x_1)}
\]
is a diophantine number (i.e., there exist $c > 0$ and $\beta > 1$ such that $|q\alpha - p| \geq c q^{-\beta}$, for all $p \in \mathbb{Z}$ and $q \in \mathbb{N}$).

The Diophantine condition is strictly stronger than the condition that $\Psi$ is non-lattice, i.e., that if, for $a, b \in \mathbb{R}$, $\{a\Psi^n(x) + bn : T^n x = x, n \in \mathbb{N}\} \subset \mathbb{Z}$ then $a = b = 0$. A non-lattice condition is certainly necessary for our results since the local limit theorems for hyperbolic diffeomorphisms [7,10] which we use require it. In Sect. 1, we shall see, in particular, that if $\Psi$ satisfies the Diophantine condition then $\mu_\Psi$ is not the measure of maximal entropy for $T$. (Of course, this holds under much weaker conditions.) An answer to the above question is given by the following version of the large deviation theorem with shrinking intervals.

Theorem 1. Let $T : \Lambda \to \Lambda$ be a mixing hyperbolic diffeomorphism and let $\mu_\Phi$ be the equilibrium state of a Hölder continuous function $\Phi : \Lambda \to \mathbb{R}$. Let $\Psi : \Lambda \to \mathbb{R}$ be a Hölder continuous function which satisfies the Diophantine condition. Then there exists $\kappa > 0$ such that, if $\delta_n > 0$ decreases to zero and $\delta_n^{-1} = O(n^{1+\kappa})$, as $n \to +\infty$, we have, for $p \in \mathrm{int}(\mathcal{I}_\Psi)$,
\[
\lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) = -I(p).
\]

In fact, it follows from the Large Deviation Theorem that an upper bound holds for all Hölder continuous $\Psi : \Lambda \to \mathbb{R}$ and sequences $\delta_n \to 0$ without assuming any further condition.

Proposition 0.1. Suppose that $\Psi : \Lambda \to \mathbb{R}$ is a Hölder continuous function such that $\mu_\Psi$ is not the measure of maximal entropy. Let $\delta_n > 0$ be any sequence converging to zero. Then
\[
\limsup_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) \leq -I(p).
\]

Theorem 1 leads to a version of the fluctuation theorem for shrinking intervals.

Theorem 2. Let $T : \Lambda \to \Lambda$ be a mixing hyperbolic diffeomorphism with time reversal symmetry $i : \Lambda \to \Lambda$. Let $\Phi : \Lambda \to \mathbb{R}$ be Hölder continuous and let $\Psi = \Phi - \Phi \circ i \circ T$. Suppose that $\Psi$ satisfies the Diophantine condition. Then there exists $\kappa > 0$ such that, if $\delta_n > 0$ decreases to zero and $\delta_n^{-1} = O(n^{1+\kappa})$, as $n \to +\infty$, we have, for $|p| < p^*$,
\[
\lim_{n \to +\infty} \frac{1}{n} \log \frac{\mu_\Phi(\{x : \Psi^n(x)/n \in (p - \delta_n, p + \delta_n)\})}{\mu_\Phi(\{x : \Psi^n(x)/n \in (-p - \delta_n, -p + \delta_n)\})} = p. \tag{0.2}
\]

In Sect. 1 we recall some basic results about hyperbolic diffeomorphisms and the thermodynamic formalism associated to them. In Sect. 2, we discuss the corresponding properties of shifts of finite type. In Sect. 3, we describe some examples related to our results. In Sect. 4, we prove a large deviations result with shrinking intervals in the context of subshifts of finite type and deduce Theorem 1. In Sect. 5, we restrict to systems with time reversal symmetry and prove Theorem 2.


1. Hyperbolic Diffeomorphisms

In this section we recall some basic definitions and results. Let $M$ be a compact $C^\infty$ Riemannian manifold and let $T : M \to M$ be a $C^\infty$ diffeomorphism. We call a compact $T$-invariant set $\Lambda$ hyperbolic if:

(1) $T : \Lambda \to \Lambda$ is transitive;
(2) the periodic orbits for $T : \Lambda \to \Lambda$ are dense in $\Lambda$;
(3) there exists an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n=-\infty}^{\infty} T^{-n} U$;
(4) there exist $C > 0$, $0 < \lambda < 1$ and a splitting $T_\Lambda M = E^u \oplus E^s$ such that $\|DT^n v\| \leq C\lambda^n \|v\|$ where $v \in E^s$, and $\|DT^{-n} v\| \leq C\lambda^n \|v\|$ where $v \in E^u$, for $n \geq 0$.

We call the restriction $T : \Lambda \to \Lambda$ a hyperbolic diffeomorphism. In the case that $\Lambda = M$ we call $T$ a transitive Anosov diffeomorphism. We write $\mathcal{M}_T$ for the space of $T$-invariant measures on $\Lambda$. For a continuous function $\Psi : \Lambda \to \mathbb{R}$, we define its pressure by the variational principle:
\[
P(\Psi) = \sup\left\{h_m(T) + \int \Psi \, dm : m \in \mathcal{M}_T\right\},
\]
where $h_m(T)$ denotes the measure theoretic entropy. A function of the form $u \circ T - u$, where $u : \Lambda \to \mathbb{R}$ is continuous, is called a coboundary. We have $P(\Psi + u \circ T - u + c) = P(\Psi) + c$, where $c \in \mathbb{R}$ is a constant. Assume from now on that $\Psi$ is Hölder continuous. We write $\mu_\Psi$ for the equilibrium state of $\Psi$, i.e., the unique $\mu_\Psi \in \mathcal{M}_T$ such that $P(\Psi) = h_{\mu_\Psi}(T) + \int \Psi \, d\mu_\Psi$. This measure is unchanged by adding a coboundary and a constant to $\Psi$. The pressure of $\Psi$ may also be characterized in terms of periodic points, as in the following proposition.

Proposition 1.1. Suppose that $\Psi : \Lambda \to \mathbb{R}$ is Hölder continuous. Then
\[
P(\Psi) = \lim_{n \to +\infty} \frac{1}{n} \log \sum_{T^n x = x} e^{\Psi^n(x)}.
\]

Recall that we defined $\mathcal{I}_\Psi = \{\int \Psi \, dm : m \in \mathcal{M}_T\}$. If $\Psi$ is cohomologous to a constant $c$ then $\mathcal{I}_\Psi = \{c\}$; otherwise, $\mathcal{I}_\Psi$ is a non-trivial closed interval. The equilibrium state of the function which is identically zero is called the measure of maximal entropy. In view of the above discussion, $\mu_\Psi$ is the measure of maximal entropy if and only if $\Psi$ is cohomologous to a constant. If $\Psi$ satisfies the Diophantine condition then, in particular, there are two probability measures, $\nu_1$ and $\nu_2$, supported on periodic orbits, for which $\int \Psi \, d\nu_1 \neq \int \Psi \, d\nu_2$, so $\mathcal{I}_\Psi$ is not a single point. Hence, the Diophantine condition for $\Psi$ implies that $\mu_\Psi$ is not the measure of maximal entropy for $T$.

We shall now concentrate on a fixed Hölder continuous function $\Psi : \Lambda \to \mathbb{R}$ and a measure $\mu_\Phi$, the equilibrium state of another Hölder continuous function $\Phi : \Lambda \to \mathbb{R}$. To avoid a degenerate situation, we suppose that $\Psi$ is not cohomologous to a constant


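(i.e., $\mu_\Psi$ is not the measure of maximal entropy). Then the function $q \mapsto P(\Phi + q\Psi)$ is strictly convex and real analytic. Furthermore,
\[
\frac{dP(\Phi + q\Psi)}{dq} = \int \Psi \, d\mu_{\Phi + q\Psi}, \qquad \mathrm{int}(\mathcal{I}_\Psi) = \left\{\int \Psi \, d\mu_{\Phi + q\Psi} : q \in \mathbb{R}\right\}
\]
and the endpoints of $\mathcal{I}_\Psi$ are
\[
\lim_{q \to \pm\infty} \int \Psi \, d\mu_{\Phi + q\Psi}.
\]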

Let $I(p) : \mathrm{int}(\mathcal{I}_\Psi) \to \mathbb{R}$ denote the (real analytic) Legendre transform of $P(\Phi + q\Psi) - P(\Phi)$, i.e.,
\[
-I(p) = \inf\{P(\Phi + q\Psi) - P(\Phi) - qp : q \in \mathbb{R}\}.
\]
(Since we can always add a constant to $\Phi$ without affecting $I(p)$ or $\mu_\Phi$ we can assume without loss of generality that $P(\Phi) = 0$.) We also have that $-I(p) = P(\Phi + \xi_p \Psi) - \xi_p p$, where $\xi_p$ is the unique real number with
\[
\left.\frac{dP(\Phi + q\Psi)}{dq}\right|_{q = \xi_p} = \int \Psi \, d\mu_{\Phi + \xi_p \Psi} = p.
\]

Theorem 1 is relatively straightforward to prove in the case where $\delta_n$ decreases more slowly than $n^{-1}$. Indeed, in this case it holds assuming only that $\Psi$ is non-lattice. To see this, recall that local limit theorems for hyperbolic diffeomorphisms [7,10] imply that if $\Psi$ is non-lattice then
\[
-I(p) = \lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\{x : (\Psi - p)^n(x) \in (-\delta, \delta)\}\right) = \lim_{n \to +\infty} \frac{1}{n} \log \mu_\Phi\left(\left\{x : \frac{\Psi^n(x)}{n} \in \left(p - \frac{\delta}{n}, p + \frac{\delta}{n}\right)\right\}\right).
\]

This shows that the required growth rate holds if $\delta_n = \delta/n$ or decreases more slowly than this. However, to prove the full version of Theorem 1 we need to study a symbolic model for $T : \Lambda \to \Lambda$.

2. Subshifts of Finite Type

Let $A$ be a $k \times k$ matrix with entries 0 or 1, which is aperiodic (i.e., there exists $n \geq 1$ such that $A^n$ has all entries positive). We let
\[
X = \left\{x \in \prod_{n=-\infty}^{\infty} \{1, \ldots, k\} : A(x_n, x_{n+1}) = 1 \text{ for all } n \in \mathbb{Z}\right\},
\]
the space of two-sided sequences with adjacent entries allowed by $A$. Let $\sigma : X \to X$ be the two-sided subshift of finite type defined by $(\sigma x)_n = x_{n+1}$. We make $X$ into a compact metric space by defining $d(x, y) = \sum_{n=-\infty}^{\infty} (1 - \delta_{x_n y_n}) 2^{-|n|}$, where $\delta_{ij}$ is the Kronecker symbol. Aperiodicity of $A$ is equivalent to topological mixing for $\sigma$.
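The aperiodicity hypothesis on $A$ can be checked mechanically. The following is a minimal sketch, not part of the original argument; the function name `first_positive_power` is hypothetical. It tracks only the zero/non-zero pattern of the powers of $A$ and relies on Wielandt's classical bound that a $k \times k$ matrix with a positive power already has one with exponent at most $(k-1)^2 + 1$.

```python
def first_positive_power(A):
    """Return the smallest n >= 1 with every entry of A^n positive, else None."""
    k = len(A)
    # Track only the zero/non-zero pattern of A^n to avoid large integers.
    pattern = [[1 if a else 0 for a in row] for row in A]
    power = [row[:] for row in pattern]
    # Wielandt's bound: if some power is positive, one with n <= (k-1)^2 + 1 is.
    for n in range(1, (k - 1) ** 2 + 2):
        if all(all(row) for row in power):
            return n
        # Boolean matrix product: pattern of A^(n+1) from the pattern of A^n.
        power = [
            [1 if any(power[i][m] and pattern[m][j] for m in range(k)) else 0
             for j in range(k)]
            for i in range(k)
        ]
    return None


# Golden mean shift (the word "22" is forbidden): aperiodic.
print(first_positive_power([[1, 1], [1, 0]]))
# A permutation matrix: its powers alternate, so it is not aperiodic.
print(first_positive_power([[0, 1], [1, 0]]))
```

Working with boolean patterns rather than integer powers keeps the check exact for any matrix size.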


Similarly, let
\[
X^+ = \left\{x \in \prod_{n=0}^{\infty} \{1, \ldots, k\} : A(x_n, x_{n+1}) = 1 \text{ for all } n \in \mathbb{Z}^+\right\},
\]
the corresponding space of one-sided sequences. Let $\sigma : X^+ \to X^+$ be the one-sided subshift of finite type defined by $(\sigma x)_n = x_{n+1}$. As above, we make $X^+$ into a compact metric space by defining $d(x, y) = \sum_{n=0}^{\infty} (1 - \delta_{x_n y_n}) 2^{-n}$. Again, aperiodicity of $A$ is equivalent to topological mixing for $\sigma$.

By analogy with the previous section we write $\mathcal{M}_\sigma$ for the space of $\sigma$-invariant measures on $X$ and for a continuous function $\psi : X \to \mathbb{R}$, we define its pressure to be
\[
P(\psi) = \sup\left\{h_m(\sigma) + \int \psi \, dm : m \in \mathcal{M}_\sigma\right\},
\]
where $h_m(\sigma)$ denotes the measure theoretic entropy. A function of the form $u \circ \sigma - u$, where $u : X \to \mathbb{R}$ is continuous, is called a coboundary. We have $P(\psi + u \circ \sigma - u + c) = P(\psi) + c$, where $c \in \mathbb{R}$ is a constant. Assume from now on that $\psi$ is Hölder continuous. We write $\mu_\psi$ for the equilibrium state of $\psi$, i.e., the unique $\mu_\psi \in \mathcal{M}_\sigma$ such that $P(\psi) = h_{\mu_\psi}(\sigma) + \int \psi \, d\mu_\psi$. This measure is unchanged by adding a coboundary and a constant to $\psi$. Exactly the same results hold for $\sigma : X^+ \to X^+$.

A particularly useful property of subshifts of finite type, which is relevant for our analysis, is that they serve as models of hyperbolic diffeomorphisms, in the following precise sense.

Proposition 2.1. Given a mixing hyperbolic diffeomorphism $T : \Lambda \to \Lambda$, there exists a mixing (two-sided) subshift of finite type $\sigma : X \to X$ and a Hölder continuous surjective map $\pi : X \to \Lambda$ such that

(1) $T \circ \pi = \pi \circ \sigma$;
(2) $\pi$ is one-to-one almost everywhere with respect to the equilibrium states of $f \circ \pi$ and $f$, for any Hölder continuous function $f : \Lambda \to \mathbb{R}$ [1].

Given Hölder continuous functions $\Phi, \Psi : \Lambda \to \mathbb{R}$, we can use Proposition 2.1 to define $\phi = \Phi \circ \pi : X \to \mathbb{R}$ and $\psi = \Psi \circ \pi : X \to \mathbb{R}$, which are also Hölder continuous, and $\psi$ satisfies the Diophantine condition if and only if $\Psi$ does. Furthermore, $P(\Phi + q\Psi) = P(\phi + q\psi)$, for $q \in \mathbb{R}$. To shorten our subsequent notation, we shall write $\psi_p = \psi - p$. Notice that $\psi_p$ satisfies the Diophantine condition if and only if $\psi$ does. With this notation
\[
-I(p) = \inf_{q \in \mathbb{R}} P(\phi + q\psi_p) = P(\phi + \xi_p \psi_p)
\]
and
\[
\left.\frac{dP(\phi + q\psi_p)}{dq}\right|_{q = \xi_p} = \int \psi_p \, d\mu_{\phi + \xi_p \psi} = 0.
\]
Our analysis makes use of transfer operators and thus it is necessary to work initially with one-sided shifts of finite type.
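The identity $-I(p) = \inf_q P(\phi + q\psi_p)$ can be made concrete in a toy case. The sketch below is an illustration under stated assumptions, not the general setting of the paper: it takes the full shift on two symbols, the constant function $\phi = -\log 2$ (whose equilibrium state is the $(\frac12,\frac12)$-Bernoulli measure), and $\psi(x) = \beta$ if $x_0 = 0$, $\psi(x) = -1$ if $x_0 = 1$. In that case $P(\phi + q\psi) = \log\big((e^{q\beta} + e^{-q})/2\big)$, and the rate function is the relative entropy of a biased coin with respect to a fair one. The function names are hypothetical.

```python
import math


def pressure(q, beta):
    """P(phi + q*psi) for phi = -log 2 and psi = beta (x0 = 0), -1 (x0 = 1)."""
    top = max(q * beta, -q)  # factor out the larger exponent for stability
    return top + math.log(math.exp(q * beta - top) + math.exp(-q - top)) - math.log(2)


def rate(p, beta, lo=-50.0, hi=50.0, steps=200):
    """I(p) = -inf_q {P(phi + q*psi) - q*p}, by ternary search on a convex function."""
    def objective(q):
        return pressure(q, beta) - q * p

    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return -objective((lo + hi) / 2)


def rate_closed_form(p, beta):
    """Relative entropy form: t is the frequency of the symbol 0 giving average p."""
    t = (p + 1) / (beta + 1)
    return math.log(2) + t * math.log(t) + (1 - t) * math.log(1 - t)


beta = math.sqrt(2)
print(rate(0.3, beta), rate_closed_form(0.3, beta))
print(rate((beta - 1) / 2, beta))  # vanishes at the mean of psi
```

The search bracket $[-50, 50]$ is an arbitrary choice that is ample for values of $p$ away from the endpoints of the interval $(-1, \beta)$.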


Definition. For functions $\psi, \phi \in C^\alpha(X^+, \mathbb{R})$, define the family of transfer operators $\mathcal{L}_{\phi + (\xi + iu)\psi} : C^\alpha(X^+, \mathbb{C}) \to C^\alpha(X^+, \mathbb{C})$ by
\[
\mathcal{L}_{\phi + (\xi + iu)\psi} k(x) = \sum_{\sigma y = x} e^{\phi(y) + (\xi + iu)\psi(y)} k(y),
\]
for $\xi$ and $u \in \mathbb{R}$.

By adding a coboundary and a constant to $\phi$, we may assume that $\phi$ is normalized, i.e., that $\mathcal{L}_\phi 1 = 1$. In particular, if $\phi$ is normalized then $P(\phi) = 0$. Normalization leaves the equilibrium state $\mu_\phi$ unchanged. From now on, we shall write $\xi = \xi_p$.

Proposition 2.2.
(1) The operator $\mathcal{L}_{\phi + \xi\psi_p}$ has a simple eigenvalue $\lambda_\xi = e^{P(\phi + \xi\psi_p)}$ and the rest of the spectrum is contained in a disk of smaller radius.
(2) For $u \in \mathbb{R}$, the operator $\mathcal{L}_{\phi + (\xi + iu)\psi_p}$ has spectral radius $\leq \lambda_\xi$.
(3) There exists $a > 0$ such that, for $|u| < a$, $\mathcal{L}_{\phi + (\xi + iu)\psi_p}$ has a simple eigenvalue $e^{P(\phi + (\xi + iu)\psi_p)}$, depending analytically on $u$, with $|e^{P(\phi + (\xi + iu)\psi_p)}| < \lambda_\xi$ for $u \neq 0$. Furthermore, the rest of the spectrum of $\mathcal{L}_{\phi + (\xi + iu)\psi_p}$ is contained in a disk of radius $\theta\lambda_\xi$, for some $\theta < 1$.
(4) $\left.\dfrac{d^2 P(\phi + (\xi + iu)\psi_p)}{du^2}\right|_{u=0} = -\sigma^2 < 0$.

The following identity will be important in subsequent calculations.

Lemma 2.1. If $\phi$ is normalized then
\[
\int e^{(\xi_p + iu)\psi_p^n(x)} \, d\mu_\phi(x) = \int \mathcal{L}^n_{\phi + (\xi_p + iu)\psi_p} 1(x) \, d\mu_\phi(x).
\]
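Lemma 2.1 can be verified by brute force in a toy case. The sketch below is an illustration only and assumes the one-sided full shift on two symbols, the normalized function $\phi = -\log 2$ (so $\mu_\phi$ is the $(\frac12,\frac12)$-Bernoulli measure), and a function $\psi$ depending only on $x_0$. Every point then has the two preimages $0x$ and $1x$, the function $\mathcal{L}_{\phi + s\psi_p} 1$ is constant, and the right-hand side of the lemma is its $n$-th power. The names `lhs` and `rhs` are hypothetical.

```python
import cmath
import itertools
import math


def psi_p(symbol, beta, p):
    """psi_p = psi - p, with psi = beta on the symbol 0 and -1 on the symbol 1."""
    return (beta if symbol == 0 else -1.0) - p


def lhs(n, s, beta, p):
    """Integral of exp(s * psi_p^n) against the (1/2, 1/2)-Bernoulli measure."""
    total = 0j
    for word in itertools.product((0, 1), repeat=n):
        birkhoff_sum = sum(psi_p(a, beta, p) for a in word)
        total += cmath.exp(s * birkhoff_sum) / 2 ** n  # each n-word has mass 2^-n
    return total


def rhs(n, s, beta, p):
    """Integral of L^n 1; here L1 is a constant function, so L^n 1 = (L1)^n."""
    l1 = sum(0.5 * cmath.exp(s * psi_p(a, beta, p)) for a in (0, 1))
    return l1 ** n


s = 0.3 + 2.0j  # plays the role of xi + iu
print(lhs(8, s, math.sqrt(2), 0.1))
print(rhs(8, s, math.sqrt(2), 0.1))
```

Both quantities agree up to rounding, which is the multinomial expansion of $(\mathcal{L}1)^n$ written out word by word.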

Later, we shall need to bound iterates of $\mathcal{L}_{\phi + (\xi + iu)\psi_p}$. Estimates of the kind we require were developed in [15], following the ideas of Dolgopyat [3].

Lemma 2.2. Assume that $\psi$ satisfies the Diophantine condition. Then there exist $\gamma > 0$, $D > 0$ and $C, c > 0$ such that, for $|u| \geq a$, we have that
\[
\left\|\mathcal{L}^{2Nm}_{\phi + (\xi + iu)\psi_p} 1\right\|_\infty \leq C\lambda_\xi^{n} \left(1 - \frac{c}{|u|^\gamma}\right)^m, \quad \text{for } m \geq 1, \tag{2.1}
\]
where $N = [D \log |u|]$ and $n = 2Nm$.

Proof. Since we are assuming the Diophantine condition, the hypotheses of Proposition 2 in [15] hold. This gives the inequality (2.1).


3. Examples

In this section, we discuss some examples related to our theorems. We begin by considering two examples for subshifts of finite type. These can easily be adapted to Axiom A diffeomorphisms [2]. In particular, they show that the Diophantine condition is necessary for Theorem 1.

Example 1. Consider the case of a (two-sided) full shift on two symbols $\sigma : X \to X$, where $X = \{0, 1\}^{\mathbb{Z}}$, and a function $\psi : X \to \mathbb{R}$ defined by
\[
\psi(x) = \begin{cases} \beta & \text{if } x_0 = 0, \\ -1 & \text{if } x_0 = 1. \end{cases}
\]
By a judicious choice of $\beta$ we can arrange for the Diophantine condition to hold. An example of a hyperbolic diffeomorphism with the same behaviour is given by a suitable horseshoe.

However, as the next example shows, results such as Theorem 1 cannot hold without some assumption on the Hölder continuous function. In particular, it is necessary that a non-lattice condition is satisfied. It is not clear if shrinking interval results hold for functions which are non-lattice but fail to satisfy the Diophantine condition.

Example 2. Consider again a (two-sided) full shift on two symbols $\sigma : X \to X$. Now define a function $\psi : X \to \mathbb{R}$ by
\[
\psi(x) = \begin{cases} \beta - 1 & \text{if } x_0 = 0, \\ -1 & \text{if } x_0 = 1, \end{cases}
\]
with $\beta > 1$. For any $\sigma^n x = x$, $\beta^{-1}\psi^n(x) + \beta^{-1} n \in \mathbb{Z}$, so $\psi$ is not a non-lattice function (and hence, in particular, does not satisfy the Diophantine condition), regardless of the value of $\beta$. It is easy to see that $\mathcal{I}_\psi = [-1, \beta - 1]$ and that $p_0 := \int \psi \, d\mu_0 = -1 + \beta/2$, where $\mu_0$ is the $(\frac{1}{2}, \frac{1}{2})$-Bernoulli measure. Of course, $p_0 \in \mathrm{int}(\mathcal{I}_\psi)$. For any $x \in X$, we have that $\psi^n(x)/n = -1 + (m\beta/n)$, where $0 \leq m \leq n$ is the number of zeros among $x_0, \ldots, x_{n-1}$. Thus, for a sequence $\delta_n > 0$, $\psi^n(x)/n \in (p_0 - \delta_n, p_0 + \delta_n)$ is equivalent to
\[
-\delta_n < \left(\frac{m}{n} - \frac{1}{2}\right)\beta < \delta_n.
\]
If we restrict to odd values of $n$, then this fails for large $n$ as soon as we take $\delta_n = O(n^{-(1+\epsilon)})$, for $\epsilon > 0$. Thus, for large odd values of $n$, we see that $\{x : \psi^n(x)/n \in (p_0 - \delta_n, p_0 + \delta_n)\} = \emptyset$. In particular, for any choice of measure $\mu_\phi$, the exponential growth rate is less than the exponent $-I(p_0)$. As in Example 1, a suitable horseshoe gives a smooth version.

We next give two examples of Anosov diffeomorphisms for which there is a time-reversing involution.

Example 3. Let $T : M \to M$ be an Anosov diffeomorphism and consider the product diffeomorphism $\widetilde{T} : M \times M \to M \times M$ defined by $\widetilde{T}(x, y) = (Tx, T^{-1}y)$. This is again Anosov and satisfies $i \circ \widetilde{T} \circ i = \widetilde{T}^{-1}$, where $i(x, y) = (y, x)$.

Example 4. Consider the map $T(x, y) = (y, -x + Cy + f(y))$ on the 2-torus, where $C$ is an integer and $f$ is a small perturbation. If $f(\cdot)$ is $C^1$ small then this is a perturbation of the linear map associated to $\begin{pmatrix} 0 & 1 \\ -1 & C \end{pmatrix}$ and if $|C| > 2$ then this is Anosov. Let $S(x, y) = (y, x)$; then $S^2$ is the identity and $STS = T^{-1}$.
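The emptiness claim in Example 2 is a finite check for each $n$. The sketch below is an illustration with hypothetical names and the arbitrary choices $\beta = 1.5$, $\epsilon = 0.5$. It lists the counts $m$ of zeros among $x_0, \ldots, x_{n-1}$ for which $\psi^n(x)/n$ lands in $(p_0 - \delta_n, p_0 + \delta_n)$, using $\psi^n(x)/n - p_0 = \beta(2m - n)/(2n)$.

```python
def admissible_counts(n, beta, delta):
    """Counts m in 0..n with |psi^n(x)/n - p0| < delta, where m = number of zeros."""
    # psi^n(x)/n - p0 = beta * (m/n - 1/2) = beta * (2m - n) / (2n)
    return [m for m in range(n + 1) if abs(beta * (2 * m - n) / (2 * n)) < delta]


beta, eps = 1.5, 0.5
for n in (100, 101):
    delta_n = n ** -(1 + eps)
    print(n, admissible_counts(n, beta, delta_n))
```

For odd $n$ the quantity $|2m - n|$ is at least 1, so the deviation is at least $\beta/(2n)$, which exceeds $n^{-(1+\epsilon)}$ once $n$ is large; for even $n$ the count $m = n/2$ always qualifies.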


4. Large Deviations and Shrinking Intervals

We will first consider large deviations results for a one-sided subshift of finite type. The two-sided result can then be deduced from this and results for hyperbolic diffeomorphisms can be derived from this using symbolic dynamics (using Proposition 2.1). Let us consider a mixing one-sided subshift of finite type $\sigma : X^+ \to X^+$ and Hölder continuous functions $\psi, \phi : X^+ \to \mathbb{R}$. In this context, our shrinking large deviations result takes the following form.

Proposition 4.1. Suppose that $\psi : X^+ \to \mathbb{R}$ satisfies the Diophantine condition (with respect to $\sigma$). Then there exists $\kappa > 0$ such that, if $\delta_n > 0$ decreases to zero and $\delta_n^{-1} = O(n^{1+\kappa})$, as $n \to +\infty$, we have, for $p \in \mathrm{int}(\mathcal{I}_\psi)$,
\[
\lim_{n \to \infty} \frac{1}{n} \log \mu_\phi\left(\left\{x : \frac{\psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) = -I(p).
\]

We shall prove the theorem under the additional assumption that $\delta_n$ goes to zero faster than $1/n$, so $\epsilon_n := n\delta_n \to 0$. As we have seen, the result holds automatically if $\delta_n^{-1} = O(n)$. This assumption means that we can convert the problem to a local limit-type problem. However, matters are simplified by our only wanting weaker information, i.e., the exponential growth rate rather than an asymptotic. Recall that $\psi_p := \psi - p$. Then
\[
\left\{x : \frac{\psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\} = \{x : \psi_p^n(x) \in (-\epsilon_n, \epsilon_n)\}.
\]
We shall first prove a modified result, where the interval $(-\epsilon_n, \epsilon_n)$ is replaced by a sequence of smooth test functions. Let $\chi : \mathbb{R} \to \mathbb{R}$ be a compactly supported $C^k$ function (where $k$ will be chosen later). We shall write $\chi_n(y) = \chi(\epsilon_n^{-1} y)$ and we note that the Fourier transform satisfies $\hat{\chi}_n(u) = \epsilon_n \hat{\chi}(\epsilon_n u)$. Let us define
\[
\rho(n) := \int \chi_n(\psi_p^n(x)) \, d\mu_\phi(x).
\]

Proposition 4.2. There exists $\kappa > 0$ such that, provided $\epsilon_n^{-1} = O(n^\kappa)$, we have
\[
\lim_{n \to \infty} \frac{1}{n} \log \rho(n) = P(\phi + \xi_p \psi_p) = -I(p).
\]

For technical reasons it is useful to modify $\chi_n$ to $\omega_n(y) = e^{-\xi y}\chi_n(y)$ (where $\xi = \xi_p$). Then
\[
\rho(n) = \int e^{\xi\psi_p^n(x)} \omega_n(\psi_p^n(x)) \, d\mu_\phi(x).
\]
To prove Proposition 4.1 we first use the inverse Fourier transform and Fubini's Theorem to write
\[
\begin{aligned}
\rho(n) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left(\int e^{iu\psi_p^n(x)} \, d\mu_\phi(x)\right) \hat{\chi}_n(u) \, du \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left(\int e^{(\xi + iu)\psi_p^n(x)} \, d\mu_\phi(x)\right) \hat{\omega}_n(u) \, du \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\omega}_n(u) \, du,
\end{aligned} \tag{4.1}
\]


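using Lemma 2.1 for the last equality. We shall estimate $\rho(n)$ by splitting the outer integral over $\mathbb{R}$ into two pieces.

4.1. u close to zero. If we choose $a > 0$ sufficiently small, we can change coordinates on $(-a, a)$ to $v = v(u)$ and write $e^{P(\phi + (\xi + iu)\psi_p)} = \lambda_\xi (1 - v^2 + iQ(v))$, for $|v| < a$, say. If $\mathcal{P}_{\phi + (\xi + iu)\psi_p}$ is the associated one dimensional eigenprojection, then by perturbation theory $\mathcal{P}_{\phi + (\xi + iu)\psi_p}(1) = 1 + O(|v|)$. Using the formula $\mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1 = e^{nP(\phi + (\xi + iu)\psi_p)}(1 + O(|v|)) + O(\theta^n)$ ($0 < \theta < 1$), we may write
\[
\begin{aligned}
\int_{-a}^{a} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\omega}_n(u) \, du
&= \lambda_\xi^n \int_{-a}^{a} \hat{\omega}_n(u(v)) (1 - v^2 + iQ(v))^n (1 + O(|v|)) \frac{du}{dv} \, dv + O(\lambda_\xi^n \theta^n) \\
&= \frac{\epsilon_n \hat{\chi}(0) \sqrt{2} \lambda_\xi^n}{\sigma} \int_{-a}^{a} (1 - v^2 + iQ(v))^n (1 + O(|v|)) \, dv + O\left(\frac{\epsilon_n \lambda_\xi^n}{n}\right) + O(\lambda_\xi^n \theta^n),
\end{aligned} \tag{4.2}
\]
where the $O(\epsilon_n n^{-1})$ estimate follows from a simple calculation in [14, p. 409]. Using another calculation in [14, pp. 408–409] we see that the principal term in the last line of (4.2) is asymptotic to
\[
\frac{\epsilon_n \hat{\chi}(0) \sqrt{2}}{\sigma} \lambda_\xi^n \int_{-a}^{a} (1 - v^2)^n \, dv;
\]
we may estimate this as $\lambda_\xi^n$ multiplied by the factor below, by making the substitution $w = v^2$:
\[
\begin{aligned}
\frac{\epsilon_n \hat{\chi}(0) \sqrt{2}}{\sigma} \int_{-a}^{a} (1 - v^2)^n \, dv
&= 2 \frac{\epsilon_n \hat{\chi}(0) \sqrt{2}}{\sigma} \int_{0}^{a} (1 - v^2)^n \, dv \\
&= \frac{\epsilon_n \hat{\chi}(0) \sqrt{2}}{\sigma} \int_{0}^{a^2} \frac{(1 - w)^n}{w^{1/2}} \, dw \\
&= \frac{\epsilon_n \hat{\chi}(0) \sqrt{2}}{\sigma} \int_{0}^{1} \frac{(1 - w)^n}{w^{1/2}} \, dw + O((1 - a^2)^n) \\
&\sim \sqrt{2\pi} \, \frac{\hat{\chi}(0) \epsilon_n}{\sigma \sqrt{n}},
\end{aligned} \tag{4.3}
\]
as $n \to +\infty$. Moreover, the term arising from the $O(|v|)$ term in the integrand is of order
\[
\lambda_\xi^n \int_{-a}^{a} (1 - v^2)^n |v| \, dv = \lambda_\xi^n \int_{0}^{a^2} (1 - w)^n \, dw = O\left(\frac{\lambda_\xi^n}{n}\right).
\]
So, in particular, we have shown that
\[
\int_{-a}^{a} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\omega}_n(u) \, du = \sqrt{2\pi} \, \frac{\hat{\chi}(0) \epsilon_n \lambda_\xi^n}{\sigma \sqrt{n}} + O\left(\frac{\epsilon_n \lambda_\xi^n}{n}\right). \tag{4.4}
\]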
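The last step of (4.3) rests on the asymptotic $\int_0^1 (1 - w)^n w^{-1/2} \, dw \sim \sqrt{\pi/n}$. The integral is the Beta function $B(\frac12, n + 1) = \Gamma(\frac12)\Gamma(n + 1)/\Gamma(n + \frac32)$, so the asymptotic can be checked numerically. The sketch below is an illustration; the function name is hypothetical, and log-gamma is used to avoid overflow for large $n$.

```python
import math


def beta_integral(n):
    """Integral of (1 - w)^n * w^(-1/2) over [0, 1], i.e. B(1/2, n + 1)."""
    return math.exp(math.lgamma(0.5) + math.lgamma(n + 1) - math.lgamma(n + 1.5))


for n in (10, 100, 10_000):
    ratio = beta_integral(n) / math.sqrt(math.pi / n)
    print(n, ratio)  # the ratio approaches 1 as n grows
```

Multiplying $\sqrt{\pi/n}$ by the prefactor $\epsilon_n \hat{\chi}(0)\sqrt{2}/\sigma$ gives the constant $\sqrt{2\pi}\,\hat{\chi}(0)\epsilon_n/(\sigma\sqrt{n})$ appearing in (4.3) and (4.4).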


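4.2. Away from zero. It remains to estimate the integral in (4.1) over $|u| \geq a$ and, in particular, to show that its contribution is smaller than the above. To do this we shall use a bound on the transfer operators $\mathcal{L}_{\phi + (\xi + iu)\psi_p}$. We shall also use the following simple lemma.

Lemma 4.1. If $\chi : \mathbb{R} \to \mathbb{R}$ is $C^k$ and compactly supported then the Fourier transform $\hat{\chi}(u)$ satisfies $\hat{\chi}(u) = O(|u|^{-k})$, as $|u| \to \infty$.

Using Lemma 2.2, we have the bound
\[
\begin{aligned}
\left|\int_{|u| \geq a} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\omega}_n(u) \, du\right|
&= \left|\epsilon_n \int_{|u| \geq a} e^{izu} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\chi}(\epsilon_n u) \, du\right| \\
&= O\left(\frac{1}{\epsilon_n^{k-1}} \int_{a}^{\infty} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du\right).
\end{aligned} \tag{4.5}
\]
We need to show that this quantity tends to zero more quickly than $\epsilon_n n^{-1/2}$. To see this we shall split the integral in (4.5) into two parts:
\[
\int_{a}^{\infty} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du = \int_{a}^{n^{\delta'}} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du + \int_{n^{\delta'}}^{\infty} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du,
\]
where we choose $\delta < \delta' < 1/\gamma$. The first integral may be bounded by
\[
\int_{a}^{n^{\delta'}} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du = O\left(n^{\delta'} \left(1 - \frac{c}{n^{\delta'\gamma}}\right)^{n/2D\delta' \log n}\right)
\]
and, since $\delta'\gamma < 1$, this tends to zero faster than the reciprocal of any polynomial. The second integral may be bounded by
\[
\int_{n^{\delta'}}^{\infty} \left(1 - \frac{c}{u^\gamma}\right)^{n/2[D \log |u|]} u^{-k} \, du = O(n^{(1-k)\delta'}).
\]
Combining these estimates we see that
\[
\left|\int_{|u| \geq a} \left(\int \mathcal{L}^n_{\phi + (\xi + iu)\psi_p} 1(x) \, d\mu_\phi(x)\right) \hat{\chi}_n(u) \, du\right| = O(\epsilon_n^{-(k-1)} n^{(1-k)\delta'}) = O(n^{(k-1)(\delta - \delta')}).
\]
We obtain the required bound by choosing $k$ sufficiently large that $(k - 1)(\delta' - \delta), (k - 1)\delta' > 1$. Together with (4.4), this completes the proof of Proposition 4.2.

Proposition 4.1 follows by an approximation argument. Choose smooth functions $\chi^+, \chi^- : \mathbb{R} \to [0, 1]$ such that $\chi^- \leq \chi_{(-1,1)} \leq \chi^+$. Then
\[
\int \chi_n^-(\psi_p^n(x)) \, d\mu_\phi \leq \mu_\phi\left(\left\{x : \frac{\psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) \leq \int \chi_n^+(\psi_p^n(x)) \, d\mu_\phi,
\]
which gives the required estimate.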


Now let $\sigma : X \to X$ be the corresponding two-sided subshift of finite type. In [16, §3], it was shown how the analogue of Proposition 4.2 (and hence Proposition 4.1) may be deduced for $\sigma : X \to X$, given the result for $\sigma : X^+ \to X^+$. Thus the analogue of Proposition 4.1 holds in the two-sided case. As a consequence of the two-sided version of Proposition 4.1,
\[
\mu_\Phi\left(\left\{x \in \Lambda : \frac{\Psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right) = \mu_\phi\left(\left\{x \in X : \frac{\psi^n(x)}{n} \in (p - \delta_n, p + \delta_n)\right\}\right).
\]
This shows that the limit in Theorem 1 exists and is equal to
\[
P(\phi + \xi_p \psi_p) = \inf\{P(\phi + q\psi) - P(\phi) - qp : q \in \mathbb{R}\} = \inf\{P(\Phi + q\Psi) - P(\Phi) - qp : q \in \mathbb{R}\} = -I(p).
\]
This completes the proof of Theorem 1.

5. Fluctuation Theorems

In this final section, we derive our generalization of the Fluctuation Theorem for slowly shrinking intervals. In fact, Theorem 2 will follow directly from Theorem 1 once we show the limit has the desired form. This follows from standard arguments and is included for completeness.

Let $T : \Lambda \to \Lambda$ be a mixing hyperbolic diffeomorphism with a time-reversing involution $i : \Lambda \to \Lambda$, $i \circ T \circ i = T^{-1}$. For a Hölder continuous function $\Phi : \Lambda \to \mathbb{R}$, let $\Psi = \Phi - \Phi \circ i \circ T$. Observe that $P(\Phi) = P(\Phi \circ i \circ T)$ and thus by the variational principle,
\[
P(\Phi) = h(\mu_\Phi) + \int \Phi \, d\mu_\Phi \geq h(\mu_\Phi) + \int \Phi \circ i \circ T \, d\mu_\Phi,
\]
i.e., $\int \Psi \, d\mu_\Phi = \int (\Phi - \Phi \circ i \circ T) \, d\mu_\Phi \geq 0$. Moreover, equality occurs only when $\mu_\Phi$ is the equilibrium state of $\Phi \circ i \circ T$ (which is equivalent to $\mu_\Psi$ being the measure of maximal entropy). This explains part (i) of the Fluctuation Theorem.

Suppose that $\Psi$ satisfies the Diophantine condition and that $\delta_n > 0$ decreases to zero such that $\delta_n^{-1} = O(n^{1+\kappa})$, where $\kappa > 0$ is chosen so that Theorem 1 holds. By applying Theorem 1 to the numerator and denominator in (0.2), we obtain
\[
\lim_{n \to +\infty} \frac{1}{n} \log \frac{\mu_\Phi(\{x : \Psi^n(x)/n \in (p - \delta_n, p + \delta_n)\})}{\mu_\Phi(\{x : \Psi^n(x)/n \in (-p - \delta_n, -p + \delta_n)\})} = I(-p) - I(p) \tag{5.1}
\]
(provided $-p, p \in \mathrm{int}(\mathcal{I}_\Psi)$). We need to exploit some special symmetries of the pressure function $P(\Phi + q\Psi) - P(\Phi)$. More precisely, we have the following.

Lemma 5.1. $P(\Phi + q\Psi) = P(\Phi - (1 + q)\Psi)$.

Proof. Let $T^n x = x$ be a periodic point. Then since $(\Phi \circ T^{-1})^n(x) = \Phi^n(x)$ we have that
\[
\begin{aligned}
(\Phi \circ i)^n(x) + q(\Psi \circ i)^n(x)
&= (\Phi \circ i)^n(x) + q\big((\Phi \circ i)^n(x) - (\Phi \circ i \circ T \circ i)^n(x)\big) \\
&= (\Phi \circ i)^n(x) + q\big((\Phi \circ i)^n(x) - (\Phi \circ T^{-1})^n(x)\big) \\
&= (\Phi \circ i)^n(x) + q\big((\Phi \circ i)^n(x) - \Phi^n(x)\big) \\
&= \Phi^n(x) - (1 + q)\big(\Phi^n(x) - (\Phi \circ i)^n(x)\big) \\
&= \Phi^n(x) - (1 + q)\Psi^n(x).
\end{aligned}
\]


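Since $i$ acts as a bijection on the set of periodic orbits of period $n$, we obtain
\[
\sum_{T^n x = x} e^{\Phi^n(x) + q\Psi^n(x)} = \sum_{T^n x = x} e^{\Phi^n(x) - (1 + q)\Psi^n(x)}.
\]
In particular, the two sums have the same exponential growth rate and hence the result follows from Proposition 1.1.

Lemma 5.2. There exists $p^* > 0$ such that $\mathcal{I}_\Psi = [-p^*, p^*]$.

Proof. Since $\Psi$ is not cohomologous to a constant, $\mathcal{I}_\Psi$ is a non-trivial interval. By Lemma 5.1,
\[
\int \Psi \, d\mu_{\Phi + q\Psi} = \frac{dP(\Phi + q\Psi)}{dq} = \frac{dP(\Phi - (1 + q)\Psi)}{dq} = -\int \Psi \, d\mu_{\Phi - (1 + q)\Psi}.
\]
Thus
\[
\lim_{t \to +\infty} \int \Psi \, d\mu_{\Phi + t\Psi} = -\lim_{t \to -\infty} \int \Psi \, d\mu_{\Phi + t\Psi},
\]
which shows that $\mathcal{I}_\Psi$ has the required form.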

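Lemma 5.3. For $|p| < p^*$, $I(-p) - I(p) = p$.

Proof. Recall that $-I(p) = \inf_{q \in \mathbb{R}} (P(\Phi + q\Psi) - qp)$, with the normalization $P(\Phi) = 0$. Using Lemma 5.1 and the substitution $r = -(1 + q)$, we have
\[
\begin{aligned}
I(-p) - I(p) &= \inf_{q \in \mathbb{R}} (P(\Phi + q\Psi) - qp) - \inf_{q \in \mathbb{R}} (P(\Phi + q\Psi) + qp) \\
&= \inf_{q \in \mathbb{R}} (P(\Phi + q\Psi) - qp) - \inf_{q \in \mathbb{R}} (P(\Phi - (1 + q)\Psi) + qp) \\
&= \inf_{q \in \mathbb{R}} (P(\Phi + q\Psi) - qp) - \inf_{r \in \mathbb{R}} (P(\Phi + r\Psi) - rp - p) \\
&= p,
\end{aligned}
\]
as required.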
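The mechanism behind Lemma 5.3 uses only the convexity of the pressure and the symmetry of Lemma 5.1, so it can be illustrated with a toy function. In the sketch below, `toy_pressure` is a hypothetical stand-in, $q \mapsto \log(\cosh(q + \frac12)/\cosh\frac12)$, chosen because it is convex, vanishes at $q = 0$, and is symmetric under $q \mapsto -(1 + q)$; it is not the pressure of any particular system from the text.

```python
import math


def toy_pressure(q):
    """Convex toy pressure with P(0) = 0 and P(q) = P(-(1 + q))."""
    return math.log(math.cosh(q + 0.5) / math.cosh(0.5))


def toy_rate(p):
    """I(p) = -inf_q {P(q) - q*p}; the infimum sits where P'(q) = tanh(q + 1/2) = p."""
    q_star = math.atanh(p) - 0.5
    return -(toy_pressure(q_star) - q_star * p)


for p in (0.1, 0.5, 0.9):
    print(p, toy_rate(-p) - toy_rate(p))  # equals p up to rounding
```

Here the admissible values are $|p| < 1$, the range of the derivative $\tanh(q + \frac12)$, which plays the role of $(-p^*, p^*)$ in Lemma 5.2.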

Combining Eq. (5.1) and Lemmas 5.2 and 5.3 completes the proof of Theorem 2.

Remark. Some fluctuation theorems and large deviation results for Young towers (and thus for Billiards and Hénon attractors) have already been proved by Young and Rey-Bellet [17]. For shrinking intervals, the analogue of the upper bound presumably follows easily. The proof of the lower bound should follow from suitable properties of the corresponding transfer operator. In particular, [12] extended Dolgopyat's results on transfer operators to Hölder functions satisfying a Diophantine condition in terms of four periodic points.

References

1. Bowen, R.: Markov partitions for Axiom A diffeomorphisms. Amer. J. Math. 92, 907–918 (1970)
2. Bowen, R.: ω-limit sets for axiom A diffeomorphisms. J. Differ. Eq. 18, 333–339 (1975)
3. Dolgopyat, D.: Prevalence of rapid mixing in hyperbolic flows. Erg. Th. Dynam. Sys. 18, 1097–1114 (1998)
4. Gallavotti, G.: Reversible Anosov diffeomorphisms and large deviations. Math. Phys. Electron. J. 1, 1–12 (1995)
5. Gallavotti, G., Cohen, E.: Dynamical ensembles of stationary states. J. Stat. Phys. 80, 931–970 (1995)
6. Gentile, G.: Large deviation rule for Anosov flows. Forum Math. 10, 89–118 (1998)
7. Guivarc'h, Y., Hardy, J.: Théorèmes limites pour une classe de chaines de Markov et applications aux difféomorphismes d'Anosov. Ann. Inst. H. Poincaré Probab. Statist. 24, 73–98 (1988)
8. Jiang, D.-Q., Qian, M., Qian, M.-P.: Mathematical theory of nonequilibrium steady states. Lecture Notes in Mathematics, 1833, Berlin: Springer 2004
9. Kifer, Y.: Large deviations in dynamical systems and stochastic processes. Trans. Amer. Math. Soc. 321, 505–524 (1990)
10. Lalley, S.: Ruelle's Perron-Frobenius theorem and the central limit theorem for additive functionals of one-dimensional Gibbs states. In: Adaptive statistical procedures and related topics (Upton, N.Y., 1985) IMS Lecture Notes Monogr. Ser., 8, Hayward, CA: Inst. Math. Statist., 1986, pp. 428–446
11. Maes, C., Verbitskiy, E.: Large deviations and a fluctuation symmetry for chaotic homeomorphisms. Commun. Math. Phys. 233, 137–151 (2003)
12. Melbourne, I.: Rapid decay of correlations for nonuniformly hyperbolic flows. Trans. Amer. Math. Soc. 359, 2421–2441 (2007)
13. Orey, S., Pelikan, S.: Deviations of trajectory averages and the defect in Pesin's formula for Anosov diffeomorphisms. Trans. Amer. Math. Soc. 315, 741–753 (1989)
14. Pollicott, M., Sharp, R.: Rates of recurrence for Z^q and R^q extensions of subshifts of finite type. J. London Math. Soc. 49, 401–418 (1994)
15. Pollicott, M., Sharp, R.: Error terms for closed orbits of hyperbolic flows. Erg. Th. Dynam. Sys. 21, 545–562 (2001)
16. Pollicott, M., Sharp, R.: Distribution of ergodic sums for hyperbolic maps. In: Representation theory, dynamical systems, and asymptotic combinatorics, Amer. Math. Soc. Transl. Ser. 2, 217, Providence, RI: Amer. Math. Soc., 2006, pp. 167–183
17. Rey-Bellet, L., Young, L.-S.: Large deviations in non-uniformly hyperbolic dynamical systems. Erg. Th. Dynam. Sys. 28, 587–612 (2008)
18. Ruelle, D.: Positivity of entropy production in nonequilibrium statistical mechanics. J. Statist. Phys. 85, 1–23 (1996)
19. Ruelle, D.: Smooth dynamics and new theoretical ideas in nonequilibrium statistical mechanics. J. Statist. Phys. 95, 393–468 (1999)
20. Wojtkowski, M.: Abstract fluctuation theorems. Ergod. Th. Dynam. Sys., to appear: doi:10.1017/S014338570800163, 2008

Communicated by G. Gallavotti

Commun. Math. Phys. 290, 335–355 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0795-3

Communications in

Mathematical Physics

On the Structure of the Fusion Ideal Christopher L. Douglas Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected] Received: 2 September 2008 / Accepted: 29 December 2008 Published online: 21 April 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: We prove that there is a finite level-independent bound on the number of relations defining the fusion ring of positive energy representations of the loop group of a simple, simply connected Lie group. As an illustration, we compute the fusion ring of G 2 at all levels. 1. Introduction Background and overview. Two-dimensional rational conformal field theory has applications to the study of critical phenomena in statistical mechanics and condensed matter physics, and is intimately related to the theory of operator algebras and to three-dimensional topological field theory [3,12,21,23,24]. The basic invariant of a rational conformal field theory is the associated fusion ring, whose generators are the primary fields of the theory and whose multiplication is determined by the dimensions of the spaces of conformal blocks [22]. For a class of conformal field theories associated to loop groups, the fusion ring can be described as a Grothendieck group of positive energy representations, with product structure determined by Connes fusion. Our purpose in this note is to investigate algebraic properties of these fusion rings, and particularly to describe a certain bound on the complexity of the fusion ring of a loop group. The fusion ring Fk [G] of positive energy representations of a level k central extension of the loop group LG of the compact, simple, simply connected Lie group G can be presented as a quotient R[G]/Ik of the representation ring of G by the “fusion ideal” Ik . This quotient is freely generated as an abelian group by the irreducible representations of G whose highest weights have level less than or equal to k. The fusion ideal contains and is often generated by the level k + 1 irreducible representations of G, suggesting that the number of generators of the ideal goes to infinity with the level. However, Gepner [11] The author was supported in part by an NSF Graduate Research Fellowship and in part by a Miller Research Fellowship.

336

C. L. Douglas

conjectured that these fusion rings are all global complete intersections, that is the fusion ideal is generated by exactly n elements, for a group G of rank n. We describe a situation intermediate between these two extremes: there is a finite level-independent upper bound on the number of representations needed to generate the fusion ideal. Moreover, there is a uniform presentation of the fusion ring of G in terms of bases for the modules of representations of central extensions of the centralizer subgroups of G. As an example, we work out the necessary bases for the subgroups of G 2 and thereby give an explicit computation of the G 2 fusion ring. Our primary technique is to analyze a twisted Mayer-Vietoris spectral sequence converging to the twisted equivariant K-homology of the group G. By results of Freed, G (G) for twistHopkins, and Teleman [8–10] the twisted equivariant K-homology K k+h ∨ ∨ ing k + h is isomorphic to the fusion ring Fk [G] of positive energy representations of the loop group LG at level k. In fact, the twisted equivariant K-homology of G is a Frobenius algebra encoding the full two-dimensional topological field theory associated to the underlying conformal field theory. Moreover this twisted equivariant K-homology admits operations by K-theory classes of moduli spaces of surfaces, altogether forming the state space of a topological conformal field theory [7]. Though here we limit our attention to the multiplicative behavior of the fusion ring, computational tools from twisted homotopy theory could also be used to analyze the further topological conformal structure of these field theories. Results and organization. Parametrized stable homotopy theory is a convenient framework for studying twisted generalized cohomology theories, and in Sect. 2.1 we recall the definitions of parametrized spectra and their associated twisted homology and cohomology theories. We then introduce the twisted Mayer-Vietoris spectral sequence in Sect. 
2.2 and discuss a particular such spectral sequence for the K-homology and equivariant K-homology of G. In Sect. 2.3 we reformulate the equivariant spectral sequence in terms of “twisted representation modules”, that is, in terms of modules of representations of central extensions of subgroups of G, and we express the $d^1$ differentials of the spectral sequence as twisted holomorphic induction maps between these modules. The resulting resolution of the fusion ring is implicit in the work of Freed, Hopkins, and Teleman, is described in papers of Kitchloo and Morava [13,14], and is presented in detail from the point of view of the twisted K-theory of $C^*$-algebras by Meinrenken [16]. The reader already familiar with this resolution of the fusion ring may want to turn directly to the results of Sect. 3 and Sect. 4, after glancing at the list of Lie theory notation in Sect. 2.2.1.

In Sect. 3, we compute bases for the representation modules of the centralizer subgroups of $G_2$, with particular attention to the twisted module of representations of the nontrivial central extension of the SO(4) subgroup. We then exploit singular planes for the twisted holomorphic induction maps to simplify the analysis of the differential in the spectral sequence, thereby facilitating an explicit description of the fusion ring.

Theorem 1.1. For $k > 0$ the level $k$ fusion ring of $G_2$ is presented as follows:
\[
F_k[G_2] =
\begin{cases}
R[G_2]/\big(\rho_{(1,\frac{k}{2})};\ \rho_{(0,\frac{k}{2})} + \rho_{(0,\frac{k}{2}+1)};\ \rho_{(1,\frac{k}{2}-1)} + \rho_{(1,\frac{k}{2}+1)};\ \rho_{(k+2,0)}\big) & k \text{ even},\\[4pt]
R[G_2]/\big(\rho_{(0,\frac{k+1}{2})};\ \rho_{(1,\frac{k-1}{2})} + \rho_{(1,\frac{k+1}{2})};\ \rho_{(0,\frac{k-1}{2})} + \rho_{(0,\frac{k+3}{2})};\ \rho_{(k+2,0)}\big) & k \text{ odd}.
\end{cases}
\]
Here $\rho_{(a,b)}$ is the irreducible representation with highest weight $\lambda_s^a \lambda_l^b$, for $\lambda_s$ and $\lambda_l$ the short and long fundamental weights of $G_2$.
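As a sanity check on the indexing in Theorem 1.1, the following minimal sketch lists the highest weights appearing in the four generators at a given level and computes their dimensions with the Weyl dimension formula for $G_2$. The function names are hypothetical and not taken from the paper. The sketch assumes that a is the coefficient of the short fundamental weight and b that of the long one, so that (1,0) is the 7-dimensional representation and (0,1) is the 14-dimensional adjoint representation.

```python
def g2_dim(a: int, b: int) -> int:
    """Weyl dimension formula for G2.

    Assumption: a multiplies the short fundamental weight and b the long one.
    The six factors are the pairings of (lambda + rho) with the six positive
    coroots; 120 is the same product evaluated at lambda = 0.
    """
    num = ((a + 1) * (b + 1) * (a + b + 2) * (a + 2 * b + 3)
           * (a + 3 * b + 4) * (2 * a + 3 * b + 5))
    return num // 120


def fusion_ideal_generators(k: int):
    """Highest weights (a, b) in each generator of Theorem 1.1 at level k > 0.

    Each inner list is one generator, written as a sum of irreducibles.
    """
    if k <= 0:
        raise ValueError("Theorem 1.1 is stated for k > 0")
    if k % 2 == 0:
        h = k // 2
        return [[(1, h)],
                [(0, h), (0, h + 1)],
                [(1, h - 1), (1, h + 1)],
                [(k + 2, 0)]]
    return [[(0, (k + 1) // 2)],
            [(1, (k - 1) // 2), (1, (k + 1) // 2)],
            [(0, (k - 1) // 2), (0, (k + 3) // 2)],
            [(k + 2, 0)]]


if __name__ == "__main__":
    for level in (1, 2, 3):
        gens = fusion_ideal_generators(level)
        print(level, [[(w, g2_dim(*w)) for w in gen] for gen in gens])
```

The dimensions are only a bookkeeping aid; the theorem itself is a statement about classes in $R[G_2]$, not about dimensions.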


We abstract this $G_2$ computation to other simply-connected groups in Sect. 4. In particular, we establish that the $E^1$ term of the twisted Mayer-Vietoris spectral sequence for the twisted equivariant K-homology of G consists entirely of free R[G]-modules. This entails a generalization of a result of Pittie [17], which might be of independent interest:

Proposition 1.2. Let G be simple and simply connected, and let H be the centralizer of an element of G. The restriction of the generator of $H^3_G(G; \mathbb{Z})$ to the orbit G/H classifies an $S^1$-central extension of H, denoted $\tilde H$. The submodule $R_k[H] \subset R[\tilde H]$, consisting of representations of $\tilde H$ for which the central circle acts by the $k$th power of scalar multiplication, is free as an R[G]-module.

For convenient reference, we list all the centralizer subgroups H of all simple, simply connected groups G for which the twisted representation module $R_k[H]$ is in fact twisted, that is, not isomorphic to the representation ring R[H]. The fusion ring admits a presentation in terms of induction maps on bases for these twisted modules, and the level-independent finiteness of the fusion ideal follows.

Theorem 1.3. For G a compact, simple, and simply connected Lie group, there exists an integer $n_G$ such that for all levels k, the fusion ring of G at level k has a presentation of the form
\[
F_k[G] = R[G]/(\rho_1(k), \rho_2(k), \ldots, \rho_{n_G}(k)),
\]
for representations $\rho_i(k) \in R[G]$ depending on the level. An upper bound on the number of generators $n_G$ of the fusion ideal is $\sum_{i \in \bar D} |W_G|/|W_{Z(i)}|$; here we denote the Dynkin diagram of G by $\bar D$, the centralizer of the edge of the Weyl chamber containing the fundamental weight $\lambda_i$ by $Z(i)$, and the Weyl group of $H \subset G$ by $W_H$.
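The bound in Theorem 1.3 is easy to evaluate in small examples. The sketch below is an illustration under stated assumptions, not the paper's method: it takes a Cartan matrix as input, generates the Weyl group as a matrix group acting on the simple roots, and takes $W_{Z(i)}$ to be the subgroup generated by the simple reflections other than the $i$th (the reflections fixing $\lambda_i$). All function names are hypothetical.

```python
def simple_reflection(cartan, i):
    """Matrix of the i-th simple reflection in the basis of simple roots.

    Uses s_i(alpha_j) = alpha_j - cartan[j][i] * alpha_i; column j holds
    the coordinates of s_i(alpha_j).
    """
    n = len(cartan)
    return tuple(
        tuple((1 if r == c else 0) - (cartan[c][i] if r == i else 0)
              for c in range(n))
        for r in range(n)
    )


def matmul(x, y):
    n = len(x)
    return tuple(
        tuple(sum(x[r][m] * y[m][c] for m in range(n)) for c in range(n))
        for r in range(n)
    )


def generated_group(generators, n):
    """Finite group generated by the given n x n integer matrices."""
    identity = tuple(tuple(1 if r == c else 0 for c in range(n)) for r in range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def generator_bound(cartan):
    """Return (|W_G|, sum over i of |W_G| / |W_{Z(i)}|)."""
    n = len(cartan)
    refl = [simple_reflection(cartan, i) for i in range(n)]
    w_order = len(generated_group(refl, n))
    total = 0
    for i in range(n):
        stabilizer = [refl[j] for j in range(n) if j != i]
        total += w_order // len(generated_group(stabilizer, n))
    return w_order, total


# Cartan matrix of G2 with the first simple root short (an assumed convention).
G2_CARTAN = [[2, -1], [-3, 2]]

if __name__ == "__main__":
    print(generator_bound(G2_CARTAN))
```

Under this reading the bound for $G_2$ is 6 + 6 = 12, while Theorem 1.1 exhibits a presentation with four generators, so the bound need not be sharp.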
We have recently discovered two papers that also study the algebraic structure of the fusion ring: the first [4], by Bouwknegt and Ridout, shows that in type A and type C the fusion ring is always a global complete intersection over the integers (in fact the authors construct an integral fusion potential in these cases); the second paper [5], by Boysal and Kumar, describes a series of intriguing conjectures for finite but not complete intersection presentations of the fusion rings for the classical groups and for $G_2$.

2. Computing Twisted Equivariant K-Theory

We begin this section by reviewing a general framework for twisted equivariant homology theory, namely parametrized equivariant spectra. We then introduce our primary computational tool, the twisted Mayer-Vietoris spectral sequence, and describe it in detail for the elementary twists of equivariant K-theory over a simple simply connected compact Lie group. Finally, we reformulate this spectral sequence in representation-theoretic terms, obtaining a resolution of the fusion ring by explicit twisted representation modules.

2.1. Recollections on twisted equivariant homology theories. A function on a space X is a map $X \to \mathbb{C}$, and a twisted function is a section $X \dashrightarrow L$ of a line bundle $\mathbb{C} \to L \to X$ over X. Similarly, for a spectrum F, an F-cohomology class is a homotopy class of maps $X \to F$, and a twisted F-cohomology class is a homotopy class of sections $X \dashrightarrow E$ of a “bundle of spectra” $F \to E \to X$ over X with fiber F. If F is instead a G-equivariant spectrum, for some compact Lie group G, the same heuristic suggests a


notion of twisted G-equivariant F-cohomology, as equivariant sections of a bundle of equivariant spectra. These ideas can be made precise using parametrized equivariant spectra, a theory of which is developed in extensive detail by May and Sigurdsson [15]. Recall that a prespectrum is a collection of based spaces $\{F_i\}$ together with based maps $\Sigma F_i \to F_{i+1}$. A prespectrum parametrized over X is a collection of spaces $\{E_i\}$ equipped with maps $E_i \to X$ and sections $X \to E_i$, together with structure maps $\Sigma_X E_i \to E_{i+1}$ commuting with the maps to and from X. The cohomology and homology of X with coefficients in F and E are defined as follows:
\[
\begin{aligned}
F^n(X) &:= \pi_{-n}\,\mathrm{Map}(X, F) = \operatorname{colim} \pi_{i-n}\,\mathrm{Map}(X, F_i),\\
E^n(X) &:= \pi_{-n}\,\Gamma(X, E) = \operatorname{colim} \pi_{i-n}\,\Gamma(X, E_i),\\
F_n(X) &:= \pi_n\, X_+ \wedge F = \operatorname{colim} \pi_{i+n}\,(X \times F_i)/X,\\
E_n(X) &:= \pi_n\, E/X = \operatorname{colim} \pi_{i+n}\, E_i/X.
\end{aligned}
\]

Notice that E is not itself a spectrum, so the notation cannot cause confusion: $E^n(X)$ and $E_n(X)$ are twisted forms of F-cohomology and F-homology of X. Here in the first instance the expressions Map, $\Gamma$, $\wedge$, and $/X$ refer respectively to the derived mapping space, sections, smash product, and total prespectrum. The nonderived prespectrum of sections is defined levelwise by $\Gamma(X, E)_i = \Gamma(X, E_i)$; the nonderived total prespectrum is defined similarly by $(E/X)_i = E_i/X$. The section and total functors are respectively the right adjoint $p_*$ and the left adjoint $p_!$ to the pullback functor $p^* : \mathrm{Spec} \to \mathrm{Spec}/X$ for the projection $p : X \to *$; that is, $\Gamma(X, E) = p_*(E)$ and $E/X = p_!(E)$. Under appropriate assumptions on X, F, and E, the above four colimit expressions, interpreted without derivation, give a more concrete description of the corresponding untwisted and twisted cohomology and homology groups.

The twisted equivariant story is analogous. For a compact Lie group G, a G-equivariant prespectrum is a collection of based G-spaces $\{F_V\}$ for G-representations V, together with compatible equivariant structure maps $\Sigma^W F_V \to F_{V \oplus W}$. Similarly, a G-equivariant prespectrum parametrized over the G-space X is a collection of G-spaces $\{E_V\}$ over X, with sections from X, together with compatible equivariant structure maps $\Sigma^W_X E_V \to E_{V \oplus W}$. The equivariant cohomology and homology of X with coefficients in F and E are defined as follows:
\[
\begin{aligned}
F^n_G(X) &:= \pi_{-n}\,\mathrm{Map}(X, F)^G, & E^n_G(X) &:= \pi_{-n}\,\Gamma(X, E)^G,\\
F_n^G(X) &:= \pi_n\,(X_+ \wedge F)^G, & E_n^G(X) &:= \pi_n\,(E/X)^G.
\end{aligned}
\]
These compact expressions are a convenient mnemonic, though they suppress various details. As before, the expressions Map, $\Gamma$, $\wedge$, and $/X$ should all be understood as derived. Similarly $(-)^G$ is a derived functor, specifically the derivation of the composite
\[
G\text{-Spec}/X \xrightarrow{\ \text{forget}\ } \text{naive-}G\text{-Spec}/X \xrightarrow{\ G\text{-fixed}\ } \mathrm{Spec}/X;
\]
the first functor forgets to naive G-prespectra, that is, G-prespectra indexed only on trivial G-representations, and the second functor takes levelwise G-fixed points. Note that even if E is a fibrant G-prespectrum over X, the total spectrum E/X is unlikely to be fibrant, and so must be replaced by a fibrant G-prespectrum before taking levelwise fixed points; this complexity appears already in nonparametrized equivariant homology theory.


2.2. The twisted Mayer-Vietoris spectral sequence. Twisted homology theories enjoy all the good properties of ordinary homology theories, including the existence of a Mayer-Vietoris spectral sequence. Indeed the Mayer-Vietoris spectral sequence, which captures the idea that we can recover the value of a theory globally from the values locally, encodes the essence of what it means to be a homology theory. Suppose we have a G-prespectrum E parametrized over a G-space X. Given an open equivariant covering $U = \{U_i \to X\}$ of X, we can form the associated simplicial cover
\[
sU = \Big( \coprod_{i} U_i \;\leftleftarrows\; \coprod_{i,j} U_{ij} \;\Lleftarrow\; \coprod_{i,j,k} U_{ijk} \;\cdots \Big).
\]
Here $U_I$ refers to the intersection of the $U_i$ for $i \in I$. Note that the indexing set I may contain repeated indices, and the simplicial object sU does have degeneracy maps. Provided the cover is numerable, the realization $|sU|$ is homotopy equivalent to X, and we may hope to obtain information about the twisted equivariant homology $E^G_*(X)$ from the homology $E^G_*(U_I)$ of the pieces of the simplicial cover. Indeed, the simplicial filtration of $|sU|$ leads, à la Segal [19], to our desired spectral sequence:

Proposition 2.1. For E a G-prespectrum parametrized over a G-space X, and U a numerable open covering of X with associated simplicial cover sU, there are spectral sequences
\[
\begin{aligned}
E^2_{pq} &= H_p(E^G_q(sU)) \Rightarrow E^G_{p+q}(X),\\
E_2^{pq} &= H^p(E_G^q(sU)) \Rightarrow E_G^{p+q}(X).
\end{aligned}
\]
We think of the first spectral sequence as a twisted Mayer-Vietoris spectral sequence, and of the second as a twisted Bousfield-Kan spectral sequence. May and Sigurdsson [15] refer to both as “Mayer-Vietoris”, but Freed, Hopkins, and Teleman [8] consider the second of “Atiyah-Hirzebruch” type. Since both spectral sequences are constructed by a method of Segal, we see that, in any case, it’s a party.

2.2.1. TMVSS in K-homology. Here we describe the twisted Mayer-Vietoris spectral sequence for the twisted K-homology of a simple, simply connected Lie group G, and in the next section consider the corresponding equivariant spectral sequence. Certain twists of K-theory can be constructed as follows. Fix a Hilbert space $\mathcal{H}$ and an action of the projective unitary group $PU(\mathcal{H})$ on a fixed model K for the K-theory prespectrum; such a model can be built from spaces of Fredholm operators on $\mathcal{H}$, as in Atiyah-Singer [2] or more recently Atiyah-Segal [1]. Given a principal $PU(\mathcal{H})$ bundle P on a space X, form the parametrized prespectrum $E_P := P \times_{PU(\mathcal{H})} K$ over X; the spaces of this parametrized prespectrum are the levelwise associated bundles $P \times_{PU(\mathcal{H})} K_i$. Such principal $PU(\mathcal{H})$ bundles over X are classified by maps $X \to BPU(\mathcal{H}) \simeq K(\mathbb{Z}, 3)$ and therefore up to isomorphism by classes in $H^3(X; \mathbb{Z})$. Given a twisting class $\tau \in H^3(X; \mathbb{Z})$, define the $\tau$-twisted K-homology of X as $K_{\tau,i}(X) := (E_P)_i(X)$, for a principal bundle P classified by $\tau$. Fix a simple simply connected Lie group G of rank n. In order to describe the twisted Mayer-Vietoris spectral sequence for $K_{\tau,n}(G)$ for a twisting $\tau \in H^3(G; \mathbb{Z}) \cong \mathbb{Z}$, we


construct a particular cover $U = \{U_i\}_{i=0}^n$ of G as follows. We use the following notation:

T = maximal torus of G
$\mathfrak{g}$ = Lie algebra of G
$\mathfrak{t}$ = Lie algebra of T
$\Lambda_R$ = root lattice of $\mathfrak{g}$
$\Lambda_W$ = weight lattice of $\mathfrak{g}$
$\Lambda_R^\vee$ = coroot lattice of $\mathfrak{g}$
$\Lambda_W^\vee$ = coweight lattice of $\mathfrak{g}$
W = Weyl group of $\mathfrak{g}$
$W^{\mathrm{aff}}$ = affine Weyl group of $\mathfrak{g}$
A = Weyl alcove of $\mathfrak{g}$
$\{\alpha_i\}_{i=1}^n$ = simple roots of $\mathfrak{g}$
$\{\lambda_i\}_{i=1}^n$ = fundamental weights of $\mathfrak{g}$
D = affine Dynkin diagram of $\mathfrak{g}$
$h^\vee$ = dual Coxeter number of $\mathfrak{g}$

Recall that the root lattice $\Lambda_R \subset \mathfrak{t}^*$ is generated by the roots $\{\alpha\}$, the eigenvalues for the eigenspaces $\mathfrak{g}_\alpha$ of the restriction of the adjoint representation of $\mathfrak{g}$ to the maximal torus $\mathfrak{t}$. The coroot lattice $\Lambda_R^\vee \subset \mathfrak{t}$ is generated by the coroots $\{h_\alpha\}$, which by definition are the unique elements $h_\alpha \in [\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$ such that $\alpha(h_\alpha) = 2$. The weight lattice $\Lambda_W \subset \mathfrak{t}^*$ is by definition evaluation dual to the coroot lattice, and the coweight lattice $\Lambda_W^\vee \subset \mathfrak{t}$ is by definition evaluation dual to the root lattice. As G is simply connected, the coroot lattice can also be characterized as $\Lambda_R^\vee = \ker(\exp : \mathfrak{t} \to G)$. The roots $\alpha \in \mathfrak{t}^*$ of $\mathfrak{g}$ act on $\mathfrak{t}$ by reflections in the kernels $\ker \alpha$, and these reflections generate the Weyl group W. The affine Weyl group is the semi-direct product $\Lambda_R^\vee \rtimes W$, and the Weyl alcove is the quotient $A = \mathfrak{t}/W^{\mathrm{aff}}$.

The group G acts on itself by conjugation, with quotient G/G isomorphic to the Weyl alcove A; the isomorphism is the exponential map $A \xrightarrow{\exp} G/G$. The cover of G we are interested in is pulled back along $\pi : G \to G/G$ from a cover of the alcove. The Weyl alcove A is a simplex embedded in $\mathfrak{t}$. The n + 1 vertices $\{v_{\hat\imath}\}_{i=0,\ldots,n}$ of A are determined as follows: the vertex $v_{\hat\imath}$ is the fixed point of the subgroup of the affine Weyl group generated by the reflections corresponding to the roots $\alpha_0, \alpha_1, \ldots, \widehat{\alpha_i}, \ldots, \alpha_n$. Here $\alpha_0$ is the affine (lowest) root, which acts on $\mathfrak{t}$ by reflection in the plane P perpendicular to the coroot of the highest root, with P half way between the origin and that coroot. We have used the shorthand $\hat\imath$ to denote the complement in $\{0, 1, \ldots, n\}$ of the element i; indeed we will abbreviate the complement of any collection $S \subset \{0, 1, \ldots, n\}$ by $\hat S$. Generally, for a subset S of the simple affine roots $\{\alpha_i\}_{i=0}^n$ (that is, a subset of the affine Dynkin diagram D), there is a corresponding face $F_S$ of the alcove, defined as the fixed set of the subgroup of $W^{\mathrm{aff}}$ generated by the reflections corresponding to $\{\alpha_i\}_{i \in S}$.
Cover the alcove A by open sets $\{\tilde U_{\hat\imath}\}$, with $\tilde U_{\hat\imath} = A \setminus F_i$ (these open sets are the complements of the codimension-one faces), and define the cover $U = \{U_{\hat\imath}\}$ of G by $U_{\hat\imath} = \pi^{-1}(\tilde U_{\hat\imath})$. Let $c_S \in A$ denote both the barycenter of the face $F_S$ and the exponential of this point in $T \subset G$. The open set $U_{\hat\imath}$ deformation retracts (equivariantly with respect to the conjugation action) to the conjugacy class of $c_{\hat\imath}$, which is the quotient $G/Z(c_{\hat\imath})$. More generally, the intersection $\bigcap_{i \notin S} U_{\hat\imath}$ deformation retracts to the conjugacy class of


$c_S$, that is, to $G/Z(c_S)$. We can now describe the $E^1$ term of the twisted Mayer-Vietoris spectral sequence for $K_\tau(G)$, based on the cover U:
\[
E^1_{pq} = \bigoplus_{S \subset D,\ |S| = n-p} K_{\tau,q}(G/Z(c_S)) \Rightarrow K_{\tau,p+q}(G).
\]
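The indexing of this $E^1$ term is purely combinatorial: column p has one summand for each subset S of the n + 1 nodes of the affine Dynkin diagram with |S| = n − p. A minimal sketch of that bookkeeping follows; the function name and the labelling of the nodes by 0, …, n are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def e1_index_sets(n: int):
    """Subsets S of the affine Dynkin diagram nodes {0, ..., n} indexing
    the summands of the E^1 term, grouped by homological degree p.

    Degree p collects the subsets with |S| = n - p.
    """
    nodes = range(n + 1)
    return {p: [frozenset(c) for c in combinations(nodes, n - p)]
            for p in range(n + 1)}


if __name__ == "__main__":
    # Rank 2 (for example G2): three summands in degrees 0 and 1, one in degree 2.
    for p, subsets in e1_index_sets(2).items():
        print(p, [sorted(s) for s in subsets])
```

For rank 2 this gives three summands in degree 0, three in degree 1, and a single summand (S empty, centralizer the maximal torus) in degree 2, which is the shape of the $G_2$ complex used in Sect. 3.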

In an abuse of notation, here the twisting bundle $\tau$ on $G/Z(c_S)$ is implicitly the restriction of the twisting bundle $\tau$ along the inclusion $\phi : G/Z(c_S) \to G$; this twisting is classified, of course, by the image of the cohomology class $\tau \in H^3(G; \mathbb{Z})$ under the map $\phi^* : H^3(G; \mathbb{Z}) \to H^3(G/Z(c_S); \mathbb{Z})$. This $E^1$ term, and the corresponding $d^1$ differential, can be described more explicitly as follows. Note that for $S \subset T \subset D$, there is an inclusion $Z(c_S) \subset Z(c_T)$ and a corresponding projection $\pi_{S,T} : G/Z(c_S) \to G/Z(c_T)$. The group G is homotopy equivalent to the realization of the simplicial space $Z_k = \coprod_{S \subset D,\ |S| = n-k} G/Z(c_S)$. For each $i \in D$, pick a principal $PU(\mathcal{H})$ bundle $\tau_i$ on $G/Z(c_{\hat\imath})$ and pick isomorphisms
\[
\phi_{ij} : \pi^*_{\widehat{ij},\hat\imath}\,\tau_i \xrightarrow{\ \cong\ } \pi^*_{\widehat{ij},\hat\jmath}\,\tau_j
\]
such that $\phi_{ij}\phi_{jk}\phi_{ki} = 1$, and such that the resulting principal bundle on $|Z_\bullet| \simeq G$ is classified by $\tau \in H^3(G; \mathbb{Z})$. Fix an order 0, 1, …, n on the simple affine roots D such that the affine root is first. For $S \subset D$ let $t(S) \in D$ be the first root that is not in S. The $E^1$ term can now be written
\[
E^1_{pq} = \bigoplus_{S \subset D,\ |S| = n-p} K_{(\pi^*_{S,\widehat{t(S)}}\tau_{t(S)}),\,q}(G/Z(c_S)) \Rightarrow K_{\tau,p+q}(G).
\]
Let $\hat S = \{j_0, j_1, \ldots, j_p\}$ be the complement of S, with $j_0 < j_1 < \cdots < j_p$ in the chosen order of the affine roots, and set $T = S \cup j_s$ for some $0 \le s \le p$. With respect to this presentation, the S-T component of the $d^1$ differential is $(-1)^s$ times the composite
\[
K_{\pi^*_{S,\widehat{t(S)}}\tau_{t(S)}}(G/Z(c_S))
\xrightarrow{\ \phi\ }
K_{\pi^*_{S,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_S))
= K_{\pi^*_{S,T}\pi^*_{T,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_S))
\xrightarrow{\ (\pi_{S,T})_*\ }
K_{\pi^*_{T,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_T)).
\]

Here the isomorphism $\phi$ is induced by the chosen twisting isomorphisms $\phi_{ij}$, the map $(\pi_{S,T})_*$ is the natural map in twisted K-homology, and the homological degree q remains implicit. This particular presentation of the spectral sequence is unsightly in part because it is meant to isolate the exact role of the twisting isomorphism $\phi$. In our later computations, we will use that we can work with less standard presentations of the same spectral sequence.

2.2.2. TMVSS in equivariant K-homology. We have belabored the non-equivariant twisted Mayer-Vietoris spectral sequence in part because the equivariant spectral sequence is precisely analogous: replacing “$K$” by “$K^G$” everywhere in the previous section yields the correct equivariant spectral sequence. However, establishing that this is the case requires care, because the construction of twisted equivariant K-homology is more delicate than that of twisted K-homology or twisted equivariant K-cohomology. It is a technical headache to construct a G-prespectrum of the homotopy type of the equivariant K-theory spectrum that admits an equivariant action by a topological group whose homotopy type is $K(\mathbb{Z}, 2)$, such that on homotopy the action is given by tensoring an equivariant vector bundle by a line bundle. Indeed, no such construction


exists in the literature, and as a result there is at present no description of a parametrized G-prespectrum representing twisted equivariant K-homology. Our primary focus in this note is computational, and as such we do not undertake to build such a parametrized G-prespectrum; rather, we sidestep the problem by using a $C^*$-algebra approach to K-homology. Rosenberg [18] and Meinrenken [16] describe twisted equivariant K-homology as follows. Fix a G-Hilbert space $\mathcal{H}$ containing infinitely many copies of each finite-dimensional G-representation. Equivariant bundles of G-$C^*$-algebras over X with fiber the compact operators $\mathcal{K}_G(\mathcal{H})$ and structure group $PU_G(\mathcal{H})$ are classified up to Morita equivalence by $H^3_G(X; \mathbb{Z})$. Such bundles are called “Dixmier-Douady” bundles. Pick such a bundle $\mathcal{A}$ over X whose invariant class is $\tau \in H^3_G(X; \mathbb{Z})$, and for any space Y over X, define
\[
K^G_{\tau,i}(Y) := KK^G_i(\Gamma_0(Y, \mathcal{A}|_Y)),
\]
where $\Gamma_0$ is the G-$C^*$-algebra of sections vanishing at infinity, and $KK^G_i$ is Kasparov K-theory. In the following, we let $\tau$ refer not only to cohomology classes in $H^3_G(X; \mathbb{Z})$ but to implicit choices of representing Dixmier-Douady bundles $\mathcal{A}_\tau$. This $C^*$-algebra twisted K-homology is a generalized homology theory with closed support, that is, a generalized Borel-Moore homology, and as such has spectral sequences not for open covers but for closed filtrations. In particular it has a spectral sequence for the skeletal filtration of the realization $|Z_\bullet| \simeq G$ of the simplicial space $Z_k = \coprod_{S \subset D,\ |S| = n-k} G/Z(c_S)$, as follows.

Proposition 2.2. For G simple, simply connected, and $\tau \in H^3_G(G; \mathbb{Z})$ represented by a fixed Dixmier-Douady bundle $\mathcal{A}$, there is a spectral sequence of $K^G(*)$-modules:
\[
E^1_{pq} = \bigoplus_{S \subset D,\ |S| = n-p} K^G_{\tau,q}(G/Z(c_S)) \Rightarrow K^G_{\tau,p+q}(G).
\]

Here $c_S$ denotes the barycenter of the face of the Weyl alcove of $\mathfrak{g}$ indexed by the subset S of the affine Dynkin diagram D. For $S \subset T \subset D$, let $\pi_{S,T} : G/Z(c_S) \to G/Z(c_T)$ be the projection and let $\hat S$ denote the complement of S in D. Given classes $\tau_i \in H^3_G(G/Z(c_{\hat\imath}); \mathbb{Z})$ represented by fixed Dixmier-Douady bundles $\mathcal{A}_i$, together with isomorphisms $\phi_{ij} : \pi^*_{\widehat{ij},\hat\imath}\,\mathcal{A}_i \xrightarrow{\cong} \pi^*_{\widehat{ij},\hat\jmath}\,\mathcal{A}_j$ such that $\phi_{ij}\phi_{jk}\phi_{ki} = 1$, and such that the resulting Dixmier-Douady bundle on G is classified by $\tau \in H^3_G(G; \mathbb{Z})$, the above spectral sequence is presented by
\[
E^1_{pq} = \bigoplus_{S \subset D,\ |S| = n-p} K^G_{(\pi^*_{S,\widehat{t(S)}}\tau_{t(S)}),\,q}(G/Z(c_S)) \Rightarrow K^G_{\tau,p+q}(G).
\]
For $\hat S = \{j_0, j_1, \ldots, j_p\}$ and $T = S \cup j_s$, the S-T component of the $d^1$ differential is $(-1)^s$ times the map
\[
K^G_{\pi^*_{S,\widehat{t(S)}}\tau_{t(S)}}(G/Z(c_S))
\xrightarrow{\ \phi\ }
K^G_{\pi^*_{S,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_S))
= K^G_{\pi^*_{S,T}\pi^*_{T,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_S))
\xrightarrow{\ (\pi_{S,T})_*\ }
K^G_{\pi^*_{T,\widehat{t(T)}}\tau_{t(T)}}(G/Z(c_T)).
\]


This spectral sequence is not new. It arose originally in the work of Freed, Hopkins, and Teleman, though only a related spectral sequence in K-cohomology appears in their papers. It has been described explicitly by Meinrenken [16] and is discussed by Kitchloo and Morava [13,14]. The main theorem about this particular twisted Mayer-Vietoris spectral sequence in equivariant K-homology is the following.

Theorem 2.3. (Freed-Hopkins-Teleman [8]) The twisted Mayer-Vietoris spectral sequence for $K^G_{\tau+h^\vee}(G)$, as in Proposition 2.2, collapses at the $E^2$ term, and the $E^2$ term, as a $K^G(*) = R[G]$ module, is the fusion ring $F_\tau(G)$, that is, the ring of positive energy representations of the loop group LG at level $\tau$, concentrated in homological degree 0.

The spectral sequence is 2-periodic in the internal direction, with all groups of odd internal degree vanishing, and this structure is implicit in the statement of the above theorem. Freed, Hopkins, and Teleman also prove an analogous and much more difficult result for not-necessarily simply connected G.

2.3. A resolution of the fusion ring. We redescribe the above twisted Mayer-Vietoris spectral sequence for $K^G_\tau(G)$ in terms of twisted representation modules for subgroups of the group G, and then even more explicitly in terms of invariants for actions of subgroups of the affine Weyl group of G. In light of Theorem 2.3, the result is a resolution of $K^G_\tau(G)$, therefore of the fusion ring; our resolution differs from that of Meinrenken [16] at most by a change of presentation. The components of the $E^1$ term of our spectral sequence have the form $K^G_\tau(G/Z)$, for Z the centralizer of a point of the Weyl alcove. In order to give a representation-theoretic description of this twisted K-homology group, we dualize to twisted K-cohomology. Though G/Z may not be equivariantly $\mathrm{Spin}^c$, it is canonically $h^\vee$-twisted equivariantly $\mathrm{Spin}^c$, and so satisfies twisted Poincaré duality:
\[
K^G_{\tau+h^\vee}(G/Z) \cong K^\tau_G(G/Z).
\]
(See [6] for a definition and discussion of twisted $\mathrm{Spin}^c$ structures and [16] for a description of the $h^\vee$-twisted structure on adjoint orbits.) Next we translate the twisted K-theory of the quotient G/Z to a twisted K-theory of a point:
\[
K^\tau_G(G/Z) \cong K^{\bar\tau}_Z(*).
\]
Here $\bar\tau$ is the image of $\tau$ under the equivalence $H^3_G(G/Z) = H^3_Z(*)$; more precisely, of course, G-equivariant Dixmier-Douady bundles on G/Z correspond to Z-equivariant Dixmier-Douady bundles on a point, and $\bar\tau$ refers to the image under this correspondence of the bundle representing $\tau$. The group $H^3_Z(*)$ classifies $S^1$-central extensions $\tilde Z$ of Z, and we can express the twisted K-theory in terms of this extension:
\[
K^{\bar\tau,0}_Z(*) = R_{\bar\tau}[Z], \qquad K^{\bar\tau,1}_Z(*) = 0.
\]
The “twisted representation module” $R_{\bar\tau}[Z]$ is by definition the submodule of the representation ring of the $\bar\tau$-extension $\tilde Z$ consisting of those representations for which the central circle acts by scalar multiplication. Henceforth we simplify the notation by referring


to the Z-equivariant Dixmier-Douady bundle on a point corresponding to the G-bundle $\tau$ on G/Z not by $\bar\tau$ but simply by $\tau$. As in Proposition 2.2 we consider any twisting bundle $\tau$ on G as being built from twisting bundles $\tau_i$ on $G/Z(c_{\hat\imath})$ via compatible bundle isomorphisms $\phi_{ij} : \pi^*_{\widehat{ij},\hat\imath}\,\tau_i \xrightarrow{\cong} \pi^*_{\widehat{ij},\hat\jmath}\,\tau_j$. The map $\pi_{S,T} : G/Z(c_S) \to G/Z(c_T)$ is the projection corresponding to the inclusion $\iota_{S,T} : Z(c_S) \to Z(c_T)$ for a pair $S \subset T \subset D$ of subsets of the affine Dynkin diagram D. The above discussion allows us to write the $E^1$ term of the spectral sequence converging to $K^G_{\tau+h^\vee}(G)$ in terms of representation modules:
\[
E^1_{p,0} = \bigoplus_{S \subset D,\ |S| = n-p} R_{\iota^*_{S,\widehat{t(S)}}\tau_{t(S)}}[Z(c_S)].
\]

The spectral sequence is 2-periodic in the internal direction, and henceforth we write only a single homological line. Next we investigate the $d^1$ differential. As before let $T = S \cup j_s$ for $j_s \in \hat S$. In the absence of any twisting, the S-T component of the differential in the spectral sequence would be $(-1)^s$ times the natural map $K^G(G/Z(c_S)) \to K^G(G/Z(c_T))$. Given the chosen complex (or more generally $\mathrm{Spin}^c$) structures on $G/Z(c_S)$ and $G/Z(c_T)$, therefore the chosen Poincaré duality isomorphisms, this natural map could be described representation-theoretically by holomorphic induction $R[Z(c_S)] \to R[Z(c_T)]$. In the presence of twistings, the map is, as one might imagine, a twisted holomorphic induction:
\[
R_{\iota^*_{S,\widehat{t(S)}}\tau_{t(S)}}[Z(c_S)]
\xrightarrow{\ \phi\ }
R_{\iota^*_{S,\widehat{t(T)}}\tau_{t(T)}}[Z(c_S)]
= R_{\iota^*_{S,T}\iota^*_{T,\widehat{t(T)}}\tau_{t(T)}}[Z(c_S)]
\xrightarrow{\ (\iota_{S,T})_*\ }
R_{\iota^*_{T,\widehat{t(T)}}\tau_{t(T)}}[Z(c_T)].
\]
Here $(\iota_{S,T})_*$ is holomorphic induction for the inclusion of central extensions
\[
\widetilde{Z(c_S)}^{\,\iota^*_{S,T}\iota^*_{T,\widehat{t(T)}}\tau_{t(T)}} \subset \widetilde{Z(c_T)}^{\,\iota^*_{T,\widehat{t(T)}}\tau_{t(T)}}
\]
with respect to the canonical twisted $\mathrm{Spin}^c$ structures on $G/Z(c_S)$ and $G/Z(c_T)$. The real twist occurs in the map $\phi$, which transforms a representation of $\widetilde{Z(c_S)}^{\,\iota^*_{S,\widehat{t(S)}}\tau_{t(S)}}$ into a representation of $\widetilde{Z(c_S)}^{\,\iota^*_{S,\widehat{t(T)}}\tau_{t(T)}}$ according to the bundle isomorphisms $\phi_{ij}$ constructing the twisting $\tau$ from the twistings $\tau_i$. For example, even when all the twistings $\tau_i$ are trivial, the isomorphisms $\phi_{ij}$ may still involve tensoring by nontrivial line bundles, and the overall differential remains decidedly twisted; this is the type of twisted holomorphic induction that arises in non-equivariant computations of twisted K-homology [6].

We can eliminate the central extensions from our description of the spectral sequence by expressing the twisted representation modules as affine Weyl invariants in the representation ring of the maximal torus T of G. For $\alpha \in \mathfrak{t}^*$ a simple root of G, let $w_\alpha : \mathfrak{t}^* \to \mathfrak{t}^*$ denote the involution given, as usual, by $w_\alpha(\beta) = \beta - \beta(h_\alpha)\alpha$; this is the reflection in the plane $\ker h_\alpha$. For $\alpha_0 \in \mathfrak{t}^*$ the affine (lowest) root, by slight abuse of notation, let $w^k_{\alpha_0} : \mathfrak{t}^* \to \mathfrak{t}^*$ denote the involution given by reflection in the plane $-k\frac{\alpha_0}{2} + \ker h_{\alpha_0}$. For a subset $S \subset D \setminus \alpha_0$ of the affine Dynkin diagram not containing the affine root, let $W^k_S$ be the group of reflections of $\mathfrak{t}^*$ generated by $\{w_s\}_{s \in S}$; this does not depend on k. For $T = S \cup \alpha_0$, let $W^k_T$ be the group of reflections generated by $W^k_S$ and by $w^k_{\alpha_0}$. The reflection group $W^k_T$ fixes an affine face of the polyhedron bounded by the


planes $\{-k\frac{\alpha_0}{2} + \ker h_{\alpha_0};\ \ker h_{\alpha_1};\ \ker h_{\alpha_2};\ \ldots;\ \ker h_{\alpha_n}\}$; we will refer to this polyhedron as the level k Weyl alcove. We would like to identify the representation module $R_{\iota^*_{S,\widehat{t(S)}}\tau_{t(S)}}[Z(c_S)]$ in terms of affine Weyl invariants. For any integer $k_{t(S)} \in \mathbb{Z} \cong H^3_G(G)$ restricting to the twisting class $\tau_{t(S)} \in H^3_G(G/Z(c_{\widehat{t(S)}}))$, there is an isomorphism
\[
R_{\iota^*_{S,\widehat{t(S)}}\tau_{t(S)}}[Z(c_S)] \cong \mathbb{Z}[\Lambda_W]^{W_S^{k_{t(S)}}};
\]
this is seen by comparing the $k_{t(S)}$-affine Weyl invariants in the weight lattice of G to the Weyl invariants in the $k_{t(S)}$-affine slice of the weight lattice of the central extension $\widetilde{Z(c_S)}$ associated to the generating twist $1 \in \mathbb{Z} \cong H^3_G(G)$. Let $k \in \mathbb{Z}$ correspond to the class of the twisting $\tau \in H^3_G(G)$. Because the twistings $\tau_i$ glue together to the global twisting $\tau$, we can take $k_i = k$ for all i. This choice might seem strange when, for instance, $\tau_i$ is trivial (in which case $k_i = 0$ would also be a sensible convention), but the differentials in the spectral sequence take a convenient form with respect to this description of the representation modules in terms of $W^k_S$ invariants. Altogether, the $E^1$ term of the spectral sequence is now
\[
E^1_{p,0} = \bigoplus_{S \subset D,\ |S| = n-p} \mathbb{Z}[\Lambda_W]^{W^k_S}.
\]
This presentation allows us to absorb both the twisting maps $\phi$ and the induction maps $\iota_*$ composing the differential into a single weight induction map, as follows. For $T = S \cup j_s$ with $j_s \in \hat S$, the S-T component of the $d^1$ differential in the spectral sequence is $(-1)^s$ times the map
\[
\mathbb{Z}[\Lambda_W]^{W^k_S} \to \mathbb{Z}[\Lambda_W]^{W^k_T}, \qquad
\left[ \frac{A^{W^k_S}_{\mu+\rho_S}}{A^{W^0_S}_{\rho_S}} \right] \mapsto \left[ \frac{A^{W^k_T}_{\mu+\rho_T}}{A^{W^0_T}_{\rho_T}} \right].
\]
Here $\mu$ is a weight in $\Lambda_W$, and $\rho_S$ is the half sum of the positive roots of $Z(c_S)$, with respect to the Weyl chamber determined by the subset S of the affine Dynkin diagram D; similarly for $\rho_T$. For a weight $\lambda \in \Lambda_W$, the expression $A^W_\lambda$ denotes the antisymmetrization of $\lambda$ with respect to the reflection group W. Note that the invariant module $\mathbb{Z}[\Lambda_W]^{W^k_S}$ is spanned by the quotients $A^{W^k_S}_{\mu+\rho_S} / A^{W^0_S}_{\rho_S}$. Of course, this induction map is just the usual weight description of holomorphic induction, adjusted by an affine translation depending on the twisting k. Using Theorem 2.3 we can summarize this presentation of the twisted Mayer-Vietoris spectral sequence in twisted equivariant K-homology as follows.

Proposition 2.4. The complex of R[G]-modules $\bigoplus_{S \subset D,\ |S| = n-p} \mathbb{Z}[\Lambda_W]^{W^k_S}$, with differential having S-T component
\[
\left[ \frac{A^{W^k_S}_{\mu+\rho_S}}{A^{W^0_S}_{\rho_S}} \right] \mapsto (-1)^s \left[ \frac{A^{W^k_T}_{\mu+\rho_T}}{A^{W^0_T}_{\rho_T}} \right],
\]
where $T = S \cup j_s$, is acyclic except in degree zero, where it has homology the level k fusion ring $F_k[G]$.
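In the simplest untwisted situation (rank one, k = 0), the quotient of antisymmetrizations is the familiar Weyl character formula. The sketch below is only an illustration of that special case, not of the twisted induction map itself. It assumes G = SU(2), models elements of the group ring of the weight lattice as Laurent polynomials in one variable x (a dictionary from exponents to coefficients), and normalizes weights so that the fundamental weight is 1 and hence $\rho = 1$. The helper names are hypothetical.

```python
def antisym(m: int) -> dict:
    """Antisymmetrization x^m - x^(-m) under the Weyl group {1, s} of SU(2)."""
    return {m: 1, -m: -1} if m != 0 else {}


def character(mu: int) -> dict:
    """Character x^mu + x^(mu-2) + ... + x^(-mu) of the irreducible
    SU(2) representation with highest weight mu >= 0."""
    return {mu - 2 * j: 1 for j in range(mu + 1)}


def multiply(p: dict, q: dict) -> dict:
    """Product of two Laurent polynomials, dropping zero coefficients."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


if __name__ == "__main__":
    rho = 1
    for mu in range(4):
        # character(mu) * A_rho should equal A_{mu + rho}
        print(mu, multiply(character(mu), antisym(rho)) == antisym(mu + rho))
```

The check is done by multiplication rather than division: the character times $A_\rho$ reproduces $A_{\mu+\rho}$, which is the statement that the quotient $A_{\mu+\rho}/A_\rho$ is the Weyl-invariant element in question.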


3. The Fusion Ring of $G_2$

The complex coming from the Mayer-Vietoris spectral sequence in twisted equivariant K-homology provides a presentation of the fusion ring. We illustrate such presentations by explicitly describing the fusion ring of $G_2$.

Theorem 3.1. For $k > 0$, the level k fusion ring of $G_2$ is given as an $R[G_2]$-module, and therefore as a ring, by
\[
F_k[G_2] =
\begin{cases}
R[G_2]/\big(\rho_{(1,\frac{k}{2})};\ \rho_{(0,\frac{k}{2})} + \rho_{(0,\frac{k}{2}+1)};\ \rho_{(1,\frac{k}{2}-1)} + \rho_{(1,\frac{k}{2}+1)};\ \rho_{(k+2,0)}\big) & k \text{ even},\\[4pt]
R[G_2]/\big(\rho_{(0,\frac{k+1}{2})};\ \rho_{(1,\frac{k-1}{2})} + \rho_{(1,\frac{k+1}{2})};\ \rho_{(0,\frac{k-1}{2})} + \rho_{(0,\frac{k+3}{2})};\ \rho_{(k+2,0)}\big) & k \text{ odd}.
\end{cases}
\]
Here $\rho_{(a,b)}$ is the irreducible representation of $G_2$ with highest weight $\lambda_{1,0}^a \lambda_{0,1}^b$, for $\lambda_{1,0}$ and $\lambda_{0,1}$ respectively the short and long fundamental weights of $G_2$.

Proof. The $E^1$ term of the twisted Mayer-Vietoris spectral sequence has the form
\[
R[G_2] \oplus R_k[SO(4)] \oplus R[SU(3)] \xleftarrow{\ d^{tw}\ } R[U(2)_v] \oplus R[U(2)_d] \oplus R[U(2)_h] \xleftarrow{\ d^{tw}\ } R[T].
\]
Here $d^{tw}$ is twisted holomorphic induction. Further $U(2)_v$, $U(2)_d$, and $U(2)_h$ denote the subgroups of $G_2$ whose representation rings are $\mathbb{Z}[\Lambda_W]^{W_v}$, $\mathbb{Z}[\Lambda_W]^{W_d}$, and $\mathbb{Z}[\Lambda_W]^{W_h}$, where $W_v$, $W_d$, and $W_h$ are the reflections in the vertical, diagonal, and horizontal walls of the Weyl chamber decomposition of the dual of the Lie algebra of the torus of $G_2$. Note that $U(2)_d$ and $U(2)_h$ are generated by pairs of long roots of $G_2$, while $U(2)_v$ is generated by a pair of short roots. Also, up to $R[G_2]$-module isomorphism the twisted representation module $R_k[SO(4)]$ only depends on the parity of the level k. We first compute the various summands of this $E^1$ term as $R[G_2]$-modules.

Lemma 3.2. The representation rings $R[T]$, $R[U(2)_s]$, $R[U(2)_l]$, $R[SU(3)]$, and $R[SO(4)]$, and the twisted representation module $R_1[SO(4)]$ have the following structure, as $R := R[G_2]$-modules. Here $U(2)_s$, respectively $U(2)_l$, is the U(2) subgroup of $G_2$ generated by the eigenspaces for $\pm\alpha$ for $\alpha$ a short, respectively long, simple root of $G_2$:
\[
\begin{aligned}
R[T] &= R\rho^T_{0,0} \oplus R\rho^T_{0,1} \oplus R\rho^T_{0,2} \oplus R\rho^T_{1,0} \oplus R\rho^T_{1,1} \oplus R\rho^T_{1,2}\\
&\quad \oplus R\rho^T_{2,-1} \oplus R\rho^T_{2,0} \oplus R\rho^T_{2,1} \oplus R\rho^T_{3,-1} \oplus R\rho^T_{3,0} \oplus R\rho^T_{3,1},\\
R[U(2)_s] &= R\rho^{U(2)_s}_{0,0} \oplus R\rho^{U(2)_s}_{0,1} \oplus R\rho^{U(2)_s}_{0,2} \oplus R\rho^{U(2)_s}_{1,0} \oplus R\rho^{U(2)_s}_{1,1} \oplus R\rho^{U(2)_s}_{1,2},\\
R[U(2)_l] &= R\rho^{U(2)_l}_{0,0} \oplus R\rho^{U(2)_l}_{1,0} \oplus R\rho^{U(2)_l}_{2,0} \oplus R\rho^{U(2)_l}_{3,0} \oplus R\rho^{U(2)_l}_{0,1} \oplus R\rho^{U(2)_l}_{-2,2},\\
R[SU(3)] &= R\rho^{SU(3)}_{0,0} \oplus R\rho^{SU(3)}_{-1,0},\\
R[SO(4)] &= R\rho^{SO(4)}_{0,0} \oplus R\rho^{SO(4)}_{1,-1} \oplus R\rho^{SO(4)}_{0,-1},\\
R_1[SO(4)] &= R\rho^{SO(4),1}_{0,0} \oplus R\rho^{SO(4),1}_{1,0} \oplus R\rho^{SO(4),1}_{1,-1}.
\end{aligned}
\]
The notation $\rho^H_{a,b}$ refers to the irreducible representation of H with highest weight $\lambda_{1,0}^a \lambda_{0,1}^b$; as before, $\lambda_{1,0}$ and $\lambda_{0,1}$ are respectively the short and long fundamental weights of $G_2$. The element $\rho^{SO(4),1}_{a,b} \in \mathbb{Z}[\Lambda_W]^{W^1_{SO(4)}}$ is the irreducible representation of the generating central extension of SO(4) with the indicated highest weight. The Weyl chambers for all groups have been determined by the corresponding subsets of the affine simple roots.


Proof. The first five presentations are determined by explicit, and tedious, computation. For example, note that $R[SO(4)] \cong \mathbb{Z}[m, n, p]/(p^2 - mn - m - n - 1)$ with $R[G_2] \cong \mathbb{Z}[a, b]$ action by $a = m + p$ and $b = pm - p + n + m$; the elements $\{1, p, n\}$ provide a module basis; here $m$, $n$, and $p$ are respectively the irreducible representations of $SO(4)$ with highest weights $(2, -1)$, $(0, -1)$, and $(1, -1)$. (It is interesting to note that though $R[SO(4)]$ is free over $R[G_2]$ it is itself a singular variety.) Note that the basis we have chosen for $R[U(2)_l]$ is carefully tailored for use in the proof of Theorem 3.1, and does not arise in a natural way; for instance, we do not know if it is a Gröbner basis for any presentation of the module.

The calculation for the twisted representation module $R^1[SO(4)]$ relies on the untwisted case. As an $R[SO(4)]$-module, $R^1[SO(4)] = \mathbb{Z}[\Lambda_W]^{W^1_{SO(4)}}$ is generated by the two 2-dimensional irreducible modules $r$ and $s$, whose highest weights are $(1, 0)$ and $(0, 0)$ respectively. In light of the presentation of $R[SO(4)]$, this implies that $R^1[SO(4)]$ is generated over $R[G_2]$ by $\{r, s, r^2 s, s^2 r, s^3\}$. Note that the generating representations of $G_2$ are $a = rs + r^2 - 1$ and $b = r^3 s - 2rs + r^2 + s^2 - 2$, and that $as = s^2 r + r^2 s - s$ and $-a(r^2 s - 2s + r) + b(r + s) = r^2 s - 4s - r + r s^2 + s^3$. The indicated basis follows.

Proposition 2.4 presents the $E^1$ term of the twisted Mayer-Vietoris spectral sequence as a complex whose homology is the fusion ring, with terms given by affine Weyl invariants:

$$
\mathbb{Z}[\Lambda_W]^{W_{G_2}} \oplus \mathbb{Z}[\Lambda_W]^{W^k_{SO(4)}} \oplus \mathbb{Z}[\Lambda_W]^{W^k_{SU(3)}}
\xleftarrow{\;d_1\;}
\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}} \oplus \mathbb{Z}[\Lambda_W]^{W_{U(2)_d}} \oplus \mathbb{Z}[\Lambda_W]^{W^k_{U(2)_h}}
\xleftarrow{\;d_2\;}
\mathbb{Z}[\Lambda_W]^{W_T}.
$$

Substituting the bases from Lemma 3.2 into this presentation, the complex in question appears as in Fig. 1.
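The two identities quoted in the proof of Lemma 3.2, $as = s^2 r + r^2 s - s$ and $-a(r^2 s - 2s + r) + b(r + s) = r^2 s - 4s - r + r s^2 + s^3$, can be checked mechanically. The following is a minimal sketch that treats $r$ and $s$ as formal commuting variables, which is an assumption made only for this check: the actual module carries further relations, but an identity that holds formally holds there as well. The helper names (`poly`, `add`, `mul`, and so on) are illustrative and not taken from the paper.

```python
from collections import defaultdict

# A polynomial in commuting variables r, s is stored as
# {(degree_in_r, degree_in_s): integer_coefficient}.

def clean(p):
    """Drop monomials whose coefficient is zero."""
    return {m: c for m, c in p.items() if c != 0}

def poly(*terms):
    """Build a polynomial from (coefficient, deg_r, deg_s) triples."""
    p = defaultdict(int)
    for coeff, i, j in terms:
        p[(i, j)] += coeff
    return clean(p)

def add(p, q):
    out = defaultdict(int)
    for m, c in p.items():
        out[m] += c
    for m, c in q.items():
        out[m] += c
    return clean(out)

def neg(p):
    return {m: -c for m, c in p.items()}

def mul(p, q):
    out = defaultdict(int)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return clean(out)

r = poly((1, 1, 0))
s = poly((1, 0, 1))

# a = rs + r^2 - 1 and b = r^3 s - 2rs + r^2 + s^2 - 2, as in the proof.
a = poly((1, 1, 1), (1, 2, 0), (-1, 0, 0))
b = poly((1, 3, 1), (-2, 1, 1), (1, 2, 0), (1, 0, 2), (-2, 0, 0))

# First identity: a s = s^2 r + r^2 s - s.
lhs1 = mul(a, s)
rhs1 = poly((1, 1, 2), (1, 2, 1), (-1, 0, 1))

# Second identity: -a (r^2 s - 2s + r) + b (r + s)
#                  = r^2 s - 4s - r + r s^2 + s^3.
factor = poly((1, 2, 1), (-2, 0, 1), (1, 1, 0))
lhs2 = add(neg(mul(a, factor)), mul(b, add(r, s)))
rhs2 = poly((1, 2, 1), (-4, 0, 1), (-1, 1, 0), (1, 1, 2), (1, 0, 3))

identity_1_holds = lhs1 == rhs1
identity_2_holds = lhs2 == rhs2
print(identity_1_holds, identity_2_holds)
```

Both comparisons are between dictionaries of monomials, so the check is exact integer arithmetic rather than a numerical test.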

[Fig. 1. Basis of highest weights for an $R[G_2]$-module resolution of the fusion ring $F_k[G_2]$. Panels: even level, odd level. Axes: $\lambda_1$, $\lambda_2$. Legend: $G_2$, $SO(4)$, $SU(3)$, $U(2)_v$, $U(2)_d$, $U(2)_h$, $T$; alcove walls; singular locus; degree 0, degree 1, degree 2.]

To compute the desired cokernel of $d_1$, observe that the induction map

$$d_2 : \mathbb{Z}[\Lambda_W]^{W_T} \longrightarrow \mathbb{Z}[\Lambda_W]^{W^k_{U(2)_h}}$$

is surjective. This implies that

$$d_1\big(\mathbb{Z}[\Lambda_W]^{W^k_{U(2)_h}}\big) \subset d_1\big(\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}} \oplus \mathbb{Z}[\Lambda_W]^{W_{U(2)_d}}\big).$$

Next consider $d_1(\mathbb{Z}[\Lambda_W]^{W_{U(2)_d}})$. The representations $\rho^{U(2)_d}_{(k,0)}$ and $\rho^{U(2)_d}_{(k-1,0)}$ in $\mathbb{Z}[\Lambda_W]^{W_{U(2)_d}}$ induce respectively to the irreducible generators $\rho^{SU(3),k}_{(k,0)}$ and $\rho^{SU(3),k}_{(k-1,0)}$ of $\mathbb{Z}[\Lambda_W]^{W^k_{SU(3)}}$. A priori, the remaining four generators of $\mathbb{Z}[\Lambda_W]^{W_{U(2)_d}}$ would create four relations in the fusion ideal. However, three of the generators of the diagonal Weyl invariants appear on the singular wall for holomorphic induction from the representation ring of the torus to the affine representation ring of the horizontal $U(2)$. As a result, the differential has the form

$$
\begin{aligned}
d_2(\rho^{T}_{(k+1,0)}) &= (V, -\rho^{U(2)_d}_{(k+1,0)}, 0),\\
d_2(\rho^{T}_{(k-1,1)}) &= (V', -\rho^{U(2)_d}_{(k-1,1)}, 0),\\
d_2(\rho^{T}_{(k-3,2)}) &= (V'', -\rho^{U(2)_d}_{(k-3,2)}, 0),
\end{aligned}
$$

where $V$, $V'$, and $V''$ are elements of $\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}}$. It follows that

$$d_1\big(\rho^{U(2)_d}_{(k+1,0)}, \rho^{U(2)_d}_{(k-1,1)}, \rho^{U(2)_d}_{(k-3,2)}\big) \subset d_1\big(\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}}\big).$$

The remaining generator of $\mathbb{Z}[\Lambda_W]^{W_{U(2)_d}}$ has differential

$$d_1(\rho^{U(2)_d}_{(k+2,0)}) = (-\rho^{G_2}_{(k+2,0)}, 0) \in \mathbb{Z}[\Lambda_W]^{W_{G_2}} \oplus \mathbb{Z}[\Lambda_W]^{W^k_{SU(3)}}.$$

Finally, consider $d_1(\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}})$. If the level $k$ is even, the representations $\rho^{U(2)_v}_{(0,\frac{k}{2})}$, $\rho^{U(2)_v}_{(1,\frac{k}{2}-1)}$, and $\rho^{U(2)_v}_{(0,\frac{k}{2}-1)}$ induce on the one hand to the generators $\rho^{SO(4),k}_{(0,\frac{k}{2})}$, $\rho^{SO(4),k}_{(1,\frac{k}{2}-1)}$, and $\rho^{SO(4),k}_{(0,\frac{k}{2}-1)}$ of $\mathbb{Z}[\Lambda_W]^{W^k_{SO(4)}}$ and on the other hand to the corresponding irreducible representations in $\mathbb{Z}[\Lambda_W]^{W_{G_2}}$. The remaining generators of $\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}}$ have differentials

$$
\begin{aligned}
d_1(\rho^{U(2)_v}_{(1,\frac{k}{2})}) &= (-\rho^{G_2}_{(1,\frac{k}{2})}, 0),\\
d_1(\rho^{U(2)_v}_{(0,\frac{k}{2}+1)}) &= (-\rho^{G_2}_{(0,\frac{k}{2}+1)}, -\rho^{SO(4),k}_{(0,\frac{k}{2})}),\\
d_1(\rho^{U(2)_v}_{(1,\frac{k}{2}+1)}) &= (-\rho^{G_2}_{(1,\frac{k}{2}+1)}, -\rho^{SO(4),k}_{(1,\frac{k}{2}-1)}).
\end{aligned}
$$

If the level $k$ is odd, the representations $\rho^{U(2)_v}_{(1,\frac{k-1}{2})}$, $\rho^{U(2)_v}_{(0,\frac{k-1}{2})}$, and $\rho^{U(2)_v}_{(1,\frac{k-3}{2})}$ induce to the corresponding generators of $\mathbb{Z}[\Lambda_W]^{W^k_{SO(4)}}$ and to the corresponding irreducible representations in $\mathbb{Z}[\Lambda_W]^{W_{G_2}}$. The remaining generators of $\mathbb{Z}[\Lambda_W]^{W_{U(2)_v}}$ have differentials

$$
\begin{aligned}
d_1(\rho^{U(2)_v}_{(0,\frac{k+1}{2})}) &= (-\rho^{G_2}_{(0,\frac{k+1}{2})}, 0),\\
d_1(\rho^{U(2)_v}_{(1,\frac{k+1}{2})}) &= (-\rho^{G_2}_{(1,\frac{k+1}{2})}, -\rho^{SO(4),k}_{(1,\frac{k-1}{2})}),\\
d_1(\rho^{U(2)_v}_{(0,\frac{k+3}{2})}) &= (-\rho^{G_2}_{(0,\frac{k+3}{2})}, -\rho^{SO(4),k}_{(0,\frac{k-1}{2})}).
\end{aligned}
$$

The indicated presentations of the fusion ring follow.


The fusion ideal of $G$ at level $k$, that is the ideal $I_k \subset R[G]$ such that $F_k[G] = R[G]/I_k$, contains and is often generated by the irreducible representations of $G$ whose highest weights are on the affine wall of the level $k + 1$ Weyl alcove; the order of this collection of irreducible representations tends to infinity with the level. However, the above theorem shows that for $G_2$, the fusion ideal is generated by 4 representations, independent of the level; in Sect. 4 we will see that such a finite bound exists for all groups. Note that we expect that a minimal presentation of the $G_2$ fusion ring can be obtained from the presentation in Theorem 3.1, at both even and odd levels, by simply omitting the relation $\rho_{(k+2,0)}$.

For a group $G$ of rank $n$, the fusion ideal $I_k$ has at least $n$ generators. When this lower bound is achieved, the fusion ring is a zero-dimensional complete intersection. The complexification of the fusion ring $F_k[G] \otimes \mathbb{C}$ always has this form, but in general the structure of the integral fusion ring is more subtle.

4. Finiteness of the Fusion Ideal

The fusion ring of positive energy representations of the loop group $LG$ is the homology of a complex whose terms are sums of submodules of representation rings of central extensions of subgroups of $G$. In this section we describe the structure of these so-called twisted representation modules, proving in particular that they are always free as modules over the representation ring of $G$. We then generalize the $G_2$ computation from Sect. 3 to all simple simply connected groups, proving that there is a level-independent bound on the number of generators of the fusion ideal.

4.1. Twisted representation modules. In Sect. 2.3, we described a resolution of the fusion ring $F_k[G]$ whose terms had the form of twisted representation modules $R^k[H]$. Recall that the group $G$ is simply connected, the subgroup $H$ is the centralizer in $G$ of an element of the Weyl alcove of $G$, the central extension $\tilde H$ is classified by the image of the generating twist of $G$ under the map $H^3_G(G) \to H^3_G(G/H) = H^3_H(*)$, and the $R[H]$-module $R^k[H]$ is the submodule of representations of $\tilde H$ for which the central circle acts by the $k$th power of scalar multiplication. In this section we describe $R^k[H]$ as an $R[G]$-module, and enumerate all the subgroups $H$ of all groups $G$ for which there exists a $k$ for which the module $R^k[H]$ is indeed twisted, that is, not isomorphic to the representation ring $R[H]$. All semi-simple subgroups of $E_8$ arising as centralizers are twisted in this sense, and for entertainment we describe the structure of these groups in detail.

Modulo the Serre conjecture (that is the Quillen-Suslin theorem), Pittie [17] proved that for $G$ a compact connected simply connected Lie group and $H$ a closed connected subgroup of maximal rank, the representation ring $R[H]$ is a free $R[G]$-module. The same techniques are effective for studying twisted representation modules:

Proposition 4.1. Let $G$ be compact, connected, simply connected, simple, and let $H$ be the centralizer of an element of $G$. For all $k$ the twisted representation module $R^k[H]$ is a free $R[G]$-module.

Proof. The central extension $\tilde H$ is the circle bundle associated to a connected principal cyclic-group subbundle $\bar H \subset \tilde H$. Let $\bar T$ denote the maximal torus of $\bar H$. The twisted representation module $R^k[H]$ is an $R[H]$-submodule of $R[\bar H]$, and similarly $R^k[T]$ is an $R[H]$-submodule of $R[\bar T]$. Here the modules $R^k[H]$ and $R^k[T]$ are the submodules


of level $k$ representations of the extensions $\tilde H$ and $\tilde T$; these extensions are defined by the restriction of the generating twist of $G$ to the orbits $G/H$ and $G/T$. The inclusions $R^k[H] \to R[\bar H]$ and $R^k[T] \to R[\bar T]$ commute with both the restriction and the induction maps $R[\bar H] \rightleftarrows R[\bar T]$ and $R^k[H] \rightleftarrows R^k[T]$. The induction map from $R[\bar T]$ to $R[\bar H]$ splits the corresponding restriction, and as a result the induction map from $R^k[T]$ to $R^k[H]$ also splits the restriction. The module $R^k[H]$ is thereby an $R[H]$-direct summand of $R^k[T]$. The module $R^k[T]$ is isomorphic as an $R[H]$-module to the untwisted module $R[T]$, which in turn is free as an $R[G]$-module. Hence the module $R^k[H]$ is a projective, and therefore by the Quillen-Suslin theorem free, $R[G]$-module, as desired.

No doubt, twisted representation modules are free in greater generality than this proposition describes.

The splitting described in the above proof can also be seen in terms of affine Weyl invariants. The representation ring $R[H]$ is isomorphic to the Weyl invariants $\mathbb{Z}[\Lambda_W]^{W_H}$ in the weight lattice, and the twisted representation module $R^k[H]$ is isomorphic, as an $R[H]$-module, to the invariants $\mathbb{Z}[\Lambda_W]^{W^k_H}$; here the affine Weyl action "$W^k_H$" is defined, as in Sect. 2.3, by reflections in the Weyl hyperplanes containing a particular face of the level $k$ Weyl alcove. Let $\bar\Lambda_W \subset \mathfrak{t}^*$ denote the weight lattice of the covering group $\bar H$ of $H$; as a lattice in $\mathfrak{t}^*$, it inherits the various Weyl actions. There is a commutative diagram

$$
\begin{array}{ccccc}
\mathbb{Z}[\Lambda_W]^{W^k_H} & \longrightarrow & \mathbb{Z}[\bar\Lambda_W]^{W^k_H} & \xrightarrow{\;\cdot\mu^{-1}\;} & \mathbb{Z}[\bar\Lambda_W]^{W^0_H} \\
\uparrow & & \uparrow & & \uparrow \\
\mathbb{Z}[\Lambda_W] & \longrightarrow & \mathbb{Z}[\bar\Lambda_W] & \xrightarrow{\;\cdot\mu^{-1}\;} & \mathbb{Z}[\bar\Lambda_W]
\end{array}
$$

The upward maps are affine induction to the invariant modules. The righthand horizontal maps are multiplication by $\mu^{-1}$ for some weight $\mu \in (\bar\Lambda_W)^{W^k_H}$ fixed by the affine action. The rightmost induction map splits the corresponding inclusion, and so the leftmost induction must similarly provide a splitting.

Note, as follows, that the twisted representation module $R^k[H]$ is free of the same rank as $R[H]$, namely $|W_G|/|W_H|$. The rank of $R^k[H]$ as an $R[G]$-module is the same as the rank of $R^k[H] \otimes_{R[G]} \mathbb{Z}$ as a $\mathbb{Z}$-module. By the discussion in Sect. 2.3, this latter module is isomorphic to $K_G^{\tau_k}(G/H) \otimes_{K_G(*)} K_G(G)$. Here $\tau_k \in H^3_G(G/H)$ is the twisting determined by restriction from the twisting $k \in \mathbb{Z} \cong H^3_G(G)$, and the K-theory groups are implicitly $\mathbb{Z}/2$-graded. Because $K_G^{\tau_k}(G/H) \cong R^k[H]$ is free as a module over $K_G(*) \cong R[G]$, the Künneth spectral sequence for the twisted equivariant K-cohomology of $G/H \times G$ is concentrated in homological degree 0 and collapses at $E^2$. The tensor product $K_G^{\tau_k}(G/H) \otimes_{K_G(*)} K_G(G)$ is therefore isomorphic to $K_G^{\tau_k}(G/H \times G) \cong K^{[\tau_k]}(G/H)$. In this last expression $[\tau_k]$ is the image of $\tau_k$ under the map $H^3_G(G/H) \to H^3(G/H)$. The rank of $K^{[\tau_k]}(G/H)$ as a $\mathbb{Z}$-module is the same as the rank of $K^{[\tau_k]}(G/H) \otimes \mathbb{Q}$ as a $\mathbb{Q}$-module. Because the twisting class $[\tau_k]$ is torsion, the Atiyah-Hirzebruch spectral sequence for the rational $[\tau_k]$-twisted K-cohomology of $G/H$ collapses to an isomorphism $K^{[\tau_k]}(G/H) \otimes \mathbb{Q} \cong H^*(G/H) \otimes \mathbb{Q}$. The $\mathbb{Q}$-module $H^*(G/H) \otimes \mathbb{Q}$ has rank $|W_G|/|W_H|$, as desired.

We now briefly describe when $R^k[H]$ is indeed twisted, for the various centralizer subgroups of the simple simply connected groups $G$. This information is essential for explicit computations of the fusion rings, as illustrated for $G_2$ in Sect. 3. As before, for


a subset $S$ of the affine Dynkin diagram $D$ of $G$, the corresponding group $H_S$ is the centralizer of the face $F_S$ of the Weyl alcove, where $F_S$ is defined as the fixed set of the subgroup of the affine Weyl group generated by the reflections corresponding to the simple roots $\{\alpha_i\}_{i \in S}$.

Proposition 4.2. For a subset $S$ of the affine Dynkin diagram $D$ of $G$, the representation module $R^k[H_S]$ is isomorphic to $R[H_S]$ as an $R[H_S]$-module precisely when $\gcd_{i \notin S} h_i^\vee$ divides the level $k$.

Proof. Recall that the Coxeter labels $h_i \in \mathbb{N}$ of the nodes $\alpha_i$ of the nonaffine Dynkin diagram $\bar D$ are defined by $-\alpha_0 = \sum_{i=1}^{n} h_i \alpha_i$. The dual Coxeter labels $h_i^\vee$ are defined in terms of the Coxeter labels by $h_i^\vee = h_i F(\alpha_i, \alpha_i)/F(\alpha_0, \alpha_0)$, where $F$ is an invariant inner product. Let $\{b_i\} \subset \mathfrak{t}$ be the coroots, that is the unique vectors in the tori of the fundamental algebras $\mathfrak{su}(2)_i$ such that $\alpha_i(b_i) = 2$. The fundamental weights $\lambda_i$ are by definition evaluation dual to the coroots. Identify $\mathfrak{t}$ and $\mathfrak{t}^*$ using the inner product $F$. Observe that the plane spanned by the face $F^k_S$ of the level $k$ Weyl alcove contains a weight of the torus $T$ of $G$ precisely when the twisted representation module $R^k[H_S]$ is isomorphic to $R[H_S]$ as an $R[H_S]$-module. When $S = \hat\imath \subset D$ is a subset of the affine Dynkin diagram of order $n$, the vertex $F^k_S$ is a weight of $T$ when $t_i := -2F(\alpha_0, \lambda_i)/F(\alpha_0, \alpha_0)$ divides the level $k$. Note that

$$
t_i = \frac{-2F(\alpha_0, \lambda_i)}{F(\alpha_0, \alpha_0)}
= \frac{2F\big(\sum_{j=1}^{n} h_j \alpha_j, \lambda_i\big)}{F(\alpha_0, \alpha_0)}
= \sum_{j=1}^{n} \frac{h_j F(b_j, \lambda_i) F(\alpha_i, \alpha_i)}{F(\alpha_0, \alpha_0)}
= \frac{h_i F(\alpha_i, \alpha_i)}{F(\alpha_0, \alpha_0)} = h_i^\vee.
$$

More generally, for any subset $S \subset D$, the span of the face $F^k_S$ contains a weight when the greatest common divisor $\gcd_{i \notin S} h_i^\vee$ divides the level.

All the dual Coxeter labels for groups of type A and type C are 1. There are no twisted representation modules for these groups, and the resulting fusion rings are simpler as a result. The dual Coxeter labelings in types B and D are

[Affine Dynkin diagrams with dual Coxeter labels.
Type B: $\alpha_0$: 1, $\alpha_1$: 1, $\alpha_2$: 2, $\alpha_3$: 2, ..., $\alpha_{n-1}$: 2, $\alpha_n$: 1.
Type D: $\alpha_0$: 1, $\alpha_1$: 1, $\alpha_2$: 2, $\alpha_3$: 2, ..., $\alpha_{n-3}$: 2, $\alpha_{n-2}$: 2, $\alpha_{n-1}$: 1, $\alpha_n$: 1.]

In type B, the module $R^k[H_S]$ is twisted when $k$ is odd and $S$ contains $\alpha_0$, $\alpha_1$, and $\alpha_n$. In type D, the module $R^k[H_S]$ is twisted when $k$ is odd and $S$ contains $\alpha_0$, $\alpha_1$, $\alpha_{n-1}$, and $\alpha_n$. The exceptional dual Coxeter labels are

[Affine Dynkin diagrams with dual Coxeter labels.
$G_2$: $\alpha_0$: 1, $\alpha_1$: 2, $\alpha_2$: 1.
$F_4$: $\alpha_0$: 1, $\alpha_1$: 2, $\alpha_2$: 3, $\alpha_3$: 2, $\alpha_4$: 1.
$E_6$: chain $\alpha_1$: 1, $\alpha_3$: 2, $\alpha_4$: 3, $\alpha_5$: 2, $\alpha_6$: 1; branch $\alpha_2$: 2, $\alpha_0$: 1.
$E_7$: chain $\alpha_0$: 1, $\alpha_1$: 2, $\alpha_3$: 3, $\alpha_4$: 4, $\alpha_5$: 3, $\alpha_6$: 2, $\alpha_7$: 1; branch $\alpha_2$: 2.
$E_8$: chain $\alpha_1$: 2, $\alpha_3$: 4, $\alpha_4$: 6, $\alpha_5$: 5, $\alpha_6$: 4, $\alpha_7$: 3, $\alpha_8$: 2, $\alpha_0$: 1; branch $\alpha_2$: 3.]

Let $\langle \mathfrak{h} \rangle$ refer to the subgroup of $G$ whose Lie algebra is generated by $\mathfrak{h}$ and $\mathfrak{t}$. The only twisted representation module for $G_2$ is, as we saw in Sect. 3, the module $R^k[\langle \mathfrak{su}(2) \times \mathfrak{su}(2) \rangle]$ for odd level $k$. For the group $F_4$, the modules $R^k[\langle \mathfrak{sp}(3) \times \mathfrak{su}(2) \rangle]$ and $R^k[\langle \mathfrak{su}(4) \times \mathfrak{su}(2) \rangle]$ are twisted for odd $k$, while $R^k[\langle \mathfrak{su}(3)^2 \rangle]$ is twisted for $k$ not divisible by 3; the only other twisted module is $R^k[\langle \mathfrak{su}(2)^3 \rangle]$ at odd level. For $E_6$, the three modules of the form $R^k[\langle \mathfrak{su}(6) \times \mathfrak{su}(2) \rangle]$ are twisted at odd level, while $R^k[\langle \mathfrak{su}(3)^3 \rangle]$ is twisted when $k$ is not divisible by 3; similarly, three modules of the form $R^k[\langle \mathfrak{su}(4) \times \mathfrak{su}(2)^2 \rangle]$ and the module $R^k[\langle \mathfrak{su}(2)^4 \rangle]$ are twisted at odd level. For $E_7$, we encounter the first order 4 twisting, for $R^k[\langle \mathfrak{su}(4)^2 \times \mathfrak{su}(2) \rangle]$; the two modules $R^k[\langle \mathfrak{su}(6) \times \mathfrak{su}(3) \rangle]$ and the module $R^k[\langle \mathfrak{su}(3)^3 \rangle]$ have order 3 twisting; there are fourteen modules with order 2 twist, which we do not list. The eight centralizers of the affine vertices of the Weyl alcove of $E_8$ are listed in Table 1; the corresponding representation modules are all twisted, with order given by the corresponding dual Coxeter label. Besides these, there are two modules of the form $R^k[\langle \mathfrak{su}(6) \times \mathfrak{su}(3) \rangle]$ and modules $R^k[\langle \mathfrak{su}(3)^3 \rangle]$ and $R^k[\langle \mathfrak{su}(3)^3 \times \mathfrak{su}(2) \rangle]$ with order 3 twisting, one module $R^k[\langle \mathfrak{su}(4)^2 \times \mathfrak{su}(2) \rangle]$ with order 4 twisting, and twenty-five modules with order 2 twisting.

Table 1. Full rank semi-simple centralizers in $E_8$

Vertex    Centralizer
v1        Spin(16)/(Z/2)
v2        SU(9)/(Z/3)
v3        SU(2) ×_{Z/4} SU(8)
v4        {SU(2) × SU(3) × SU(6)}/(Z/6)
v5        SU(5) ×_{Z/5} SU(5)
v6        Spin(10) ×_{Z/4} SU(4)
v7        E_6 ×_{Z/3} SU(3)
v8        E_7 ×_{Z/2} SU(2)

Here the Z/2 action on Spin(16) is the one whose quotient is sSpin(16), and in all cases the finite cyclic group determining the quotient either injects into or surjects onto the center of each factor of the product of simply connected groups.
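The divisibility criterion of Proposition 4.2 lends itself to a quick numerical check. The sketch below is illustrative only: it assumes the gcd is taken over the dual Coxeter labels of the nodes outside $S$ (which is how the proof reads), it hard-codes the exceptional labels shown in the diagrams above, and the names `DUAL_COXETER_LABELS`, `twist_order`, `is_twisted`, and `first_untwisted_level` are hypothetical helpers, not notation from the paper.

```python
from functools import reduce
from math import gcd

# Dual Coxeter labels of the affine Dynkin diagrams of the exceptional groups,
# as listed in the diagrams above (the order within each list is irrelevant here).
DUAL_COXETER_LABELS = {
    "G2": [1, 2, 1],
    "F4": [1, 2, 3, 2, 1],
    "E6": [1, 2, 3, 2, 1, 2, 1],
    "E7": [1, 2, 3, 4, 3, 2, 1, 2],
    "E8": [1, 2, 3, 4, 5, 6, 4, 2, 3],
}

def lcm(x, y):
    return x * y // gcd(x, y)

def twist_order(labels_outside_S):
    """gcd of the dual Coxeter labels of the nodes not in S."""
    return reduce(gcd, labels_outside_S)

def is_twisted(labels_outside_S, k):
    """The module is twisted at level k exactly when the gcd does not divide k."""
    return k % twist_order(labels_outside_S) != 0

def first_untwisted_level(group):
    """Smallest positive level at which no module in the resolution is twisted."""
    return reduce(lcm, DUAL_COXETER_LABELS[group])

levels = {group: first_untwisted_level(group) for group in DUAL_COXETER_LABELS}
print(levels)

# The SO(4) vertex of the G2 alcove is the node with label 2:
print(is_twisted([2], 1), is_twisted([2], 2))
```

Under these assumptions the least common multiples come out as 2, 6, 6, 12, and 60, matching the levels $2k$, $6k$, $6k$, $12k$, and $60k$ singled out for $G_2$, $F_4$, $E_6$, $E_7$, and $E_8$ in Proposition 4.4.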


4.2. Abstract presentations of the fusion ring. Thanks to Proposition 4.1, the method we used to compute the fusion ring of $G_2$ suffices to determine a presentation of any fusion ring $F_k[G]$ in terms of bases for the various twisted representation modules $R^k[H]$ of centralizers $H$ in $G$. In particular, there is a level-independent bound on the number of generators of the fusion ideal.

Theorem 4.3. For $G$ a compact, simple, simply connected Lie group, there is a positive integer $n_G$ such that for all positive $k$, there is a presentation of the level $k$ fusion ring $F_k[G]$ of the form

$$F_k[G] = R[G]/(\rho_1(k), \rho_2(k), \ldots, \rho_{n_G}(k)),$$

where the $\rho_i(k) \in R[G]$ are representations of $G$ depending on the level. An upper bound on the number of generators $n_G$ of the fusion ideal is $\sum_{i \in \bar D} |W_G|/|W_{H_{\widehat{0i}}}|$, where $\bar D$ is the nonaffine Dynkin diagram of $G$, and $H_{\widehat{0i}}$ is the centralizer of the edge of the Weyl chamber of $G$ through the fundamental weight $\lambda_i$.

Proof. By the discussion in Sect. 2.3, the fusion ring $F_k[G]$ is the cokernel

$$\operatorname{coker}\Big( \bigoplus_{\{i<j\} \in D} R^k[H_{\widehat{\imath\jmath}}] \longrightarrow \bigoplus_{i \in D} R^k[H_{\hat\imath}] \Big).$$

This is the homology in degree zero of the complex coming from the twisted Mayer-Vietoris spectral sequence. The map here is twisted holomorphic induction, described explicitly in terms of Weyl invariants in the weight lattice in Proposition 2.4. The twisted induction map $R^k[H_{\widehat{0ij}}] \to R^k[H_{\widehat{\imath\jmath}}]$ is surjective, where $0$, $i$, and $j$ are distinct nodes of $D$. Because the complex of twisted representation modules is acyclic in homological degree 1, the cokernel in question is therefore equal to

$$\operatorname{coker}\Big( \bigoplus_{0<j \in D} R^k[H_{\widehat{0j}}] \longrightarrow \bigoplus_{i \in D} R^k[H_{\hat\imath}] \Big).$$

The map $R^k[H_{\widehat{0j}}] \to R^k[H_{\hat\jmath}]$ is surjective. Let $\{h^j_1, \ldots, h^j_{b(j)}\} \subset R^k[H_{\widehat{0j}}]$ denote a collection of elements mapping to a basis over $R[G]$ for $R^k[H_{\hat\jmath}]$, and let $\{g^j_1, \ldots, g^j_{b(0,j)}\} \subset R^k[H_{\widehat{0j}}]$ denote a basis over $R[G]$ of $R^k[H_{\widehat{0j}}]$. For $h \in R^k[H_{\widehat{0j}}]$, write the differential in the complex as $\partial h = \partial_0 h - \partial_1 h \in R^k[H_{\hat 0}] \oplus R^k[H_{\hat\jmath}]$. There are representations $c^j_{rs} \in R[G]$ such that $\partial_1 g^j_r = \sum_s c^j_{rs} \partial_1 h^j_s$. There results a presentation of the fusion ring:

$$
F_k[G] = R[G] \Big/ \Big(
\begin{array}{l}
\partial_0 g^1_1 - \sum_s c^1_{1s} \partial_0 h^1_s,\ \partial_0 g^1_2 - \sum_s c^1_{2s} \partial_0 h^1_s,\ \ldots,\ \partial_0 g^1_{b(0,1)} - \sum_s c^1_{b(0,1)s} \partial_0 h^1_s, \\
\partial_0 g^2_1 - \sum_s c^2_{1s} \partial_0 h^2_s,\ \partial_0 g^2_2 - \sum_s c^2_{2s} \partial_0 h^2_s,\ \ldots,\ \partial_0 g^2_{b(0,2)} - \sum_s c^2_{b(0,2)s} \partial_0 h^2_s, \\
\ldots, \\
\partial_0 g^n_1 - \sum_s c^n_{1s} \partial_0 h^n_s,\ \partial_0 g^n_2 - \sum_s c^n_{2s} \partial_0 h^n_s,\ \ldots,\ \partial_0 g^n_{b(0,n)} - \sum_s c^n_{b(0,n)s} \partial_0 h^n_s
\end{array}
\Big).
$$

The rank $b(0, j)$ of $R^k[H_{\widehat{0j}}]$ is equal to the ratio $|W_G|/|W_{H_{\widehat{0j}}}|$, where $W_{H_{\widehat{0j}}}$ is the Weyl group of the centralizer of the nonaffine edge $F_{\widehat{0j}}$ of the Weyl alcove. Note that in this presentation, the differential $\partial_0$ coming from twisted holomorphic induction depends on the level $k$, but the various representations $g^j_r$ and $h^j_s$ only depend on the level modulo the least common multiple of the dual Coxeter labels of the affine Dynkin diagram of $G$.
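To get a feel for the size of the bound in Theorem 4.3, one can evaluate it for $G_2$, where the centralizers are the groups that appear in the resolution of Sect. 3. The following is a rough sketch under stated assumptions: the Weyl group orders are the standard ones (12 for $G_2$, 4 for $SO(4)$, 6 for $SU(3)$, 2 for $U(2)$), both nonaffine edges of the $G_2$ alcove are taken to have centralizer $U(2)$, and the variable names are illustrative only. The second quantity is the refined count stated in Proposition 4.4, which the paper establishes only at the untwisted (even) levels for $G_2$.

```python
# Orders of the Weyl groups of G2 and of the centralizers in its alcove.
WEYL_ORDER = {"G2": 12, "SO(4)": 4, "SU(3)": 6, "U(2)": 2}

def rank_over_RG(group, subgroup):
    """Rank of the (twisted) representation module of `subgroup` over R[group],
    using the formula |W_G| / |W_H| from Sect. 4.1."""
    return WEYL_ORDER[group] // WEYL_ORDER[subgroup]

# Centralizers of the two nonaffine edges of the alcove (edges through the origin).
edge_centralizers = ["U(2)", "U(2)"]
# Centralizers of the two affine vertices those edges lead to.
vertex_centralizers = ["SO(4)", "SU(3)"]

# Theorem 4.3: sum of the ranks of the edge modules.
upper_bound = sum(rank_over_RG("G2", h) for h in edge_centralizers)

# Proposition 4.4: subtract the ranks of the vertex modules.
refined_bound = upper_bound - sum(rank_over_RG("G2", h) for h in vertex_centralizers)

print(upper_bound, refined_bound)
```

With these inputs the two bounds are 12 and 7, whereas the explicit $G_2$ computation in Sect. 3 needs only 4 relations, which illustrates how far the general bounds are from tight.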


When all of the modules appearing in the complex resolving the fusion ring are untwisted, the number of generators of the fusion ideal can be reduced by making convenient choices of bases.

Proposition 4.4. The ideals defining the fusion rings $F_k[SU(n)]$, $F_k[Sp(n)]$, $F_{2k}[Spin(n)]$, $F_{2k}[G_2]$, $F_{6k}[F_4]$, $F_{6k}[E_6]$, $F_{12k}[E_7]$, and $F_{60k}[E_8]$ are generated by $n_G$ representations, where

$$n_G = \sum_{i \in \bar D} \left( \frac{|W_G|}{|W_{H_{\widehat{0i}}}|} - \frac{|W_G|}{|W_{H_{\hat\imath}}|} \right).$$

Proof. By the discussion in Sect. 4.1, the indicated levels are those for which all of the modules appearing in the resolution of the fusion ring are untwisted; that is, for those $k$, we have $R^k[H_S] \cong R[H_S]$. By Steinberg's explicit presentations of representation rings [20], there exists a set $\{\mu_1, \ldots, \mu_{b(j)}\} \subset \Lambda_W$ of weights such that the corresponding highest weight irreducible representations $\{\langle\mu_1\rangle_{H_{\hat\jmath}}, \ldots, \langle\mu_{b(j)}\rangle_{H_{\hat\jmath}}\} \subset R^k[H_{\hat\jmath}]$ form a basis for $R^k[H_{\hat\jmath}]$ as an $R[G]$-module. Under the induction map $\partial_1$ the irreducible representations $\{\langle\mu_1\rangle_{H_{\widehat{0j}}}, \ldots, \langle\mu_{b(j)}\rangle_{H_{\widehat{0j}}}\} \subset R^k[H_{\widehat{0j}}]$ map to the basis elements $\{\langle\mu_1\rangle_{H_{\hat\jmath}}, \ldots, \langle\mu_{b(j)}\rangle_{H_{\hat\jmath}}\} \subset R^k[H_{\hat\jmath}]$. Moreover, Steinberg's construction ensures that the collection $\{\mu_1, \ldots, \mu_{b(j)}\}$ extends to a collection $\{\mu_1, \ldots, \mu_{b(j)}, \nu_1, \ldots, \nu_{b(0,j)-b(j)}\}$ of weights whose corresponding highest weight irreducible representations form a basis for $R^k[H_{\widehat{0j}}]$. In the proof of Theorem 4.3, we may take $h^j_s = \langle\mu_s\rangle_{H_{\widehat{0j}}}$ and $g^j_r = h^j_r$ for $r \le b(j)$, so that the submatrix $(c^j_{rs})_{1 \le r \le b(j)}$ of representations is the identity. As a result, the first $b(j)$ generators of the $b(0, j)$ generators of the fusion ideal induced from $R^k[H_{\widehat{0j}}]$ are zero.

We expect that the bound from this proposition applies at all levels, not only the levels at which the resolution of the fusion ring is untwisted. However, even this bound is extremely far from tight; in practice, by exploiting the singular planes for twisted holomorphic induction, one can produce a much smaller set of generators of the fusion ideal.

Acknowledgements. We would like to thank Mike Hopkins for introducing us to twisted K-theory, and for suggesting the twisted equivariant Mayer-Vietoris spectral sequence as an approach, initially, to computations in non-equivariant twisted K-theory. We became distracted by other techniques for those computations, and we apologize for the resulting extreme delay in the publication of these results. We thank Eckhard Meinrenken for various helpful conversations and particularly for sharing his $C^*$-algebra perspective on the twisted Mayer-Vietoris spectral sequence. We are also indebted to Mark Haiman, Max Lieblich, and André Henriques for informative and illuminating discussions.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Atiyah, M.F., Segal, G.: Twisted K-theory. Ukr. Mat. Visn. 1(3), 287–330 (2004)
2. Atiyah, M.F., Singer, I.M.: Index theory for skew-adjoint Fredholm operators. Inst. Hautes Études Sci. Publ. Math. 37, 5–26 (1969)
3. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B 241(2), 333–380 (1984)
4. Bouwknegt, P., Ridout, D.: Presentations of Wess-Zumino-Witten fusion rings. Rev. Math. Phys. 18, 201–232 (2006)
5. Boysal, A., Kumar, S.: A conjectural presentation of fusion algebras. Preprint, http://arxiv.org/abs/0802.3035v1 [math.GR], 2008
6. Douglas, C.L.: On the twisted K-homology of simple Lie groups. Topology 45(6), 955–988 (2006)
7. Freed, D., Hopkins, M., Teleman, C.: Consistent orientation of moduli spaces. Preprint, http://arxiv.org/abs/0711.1909v2 [math.AT], 2007
8. Freed, D., Hopkins, M., Teleman, C.: Loop groups and twisted K-theory I. Preprint, http://arxiv.org/abs/0711.1906v1 [math.AT], 2007
9. Freed, D., Hopkins, M., Teleman, C.: Loop groups and twisted K-theory II. Preprint, http://arxiv.org/abs/0511232v2 [math.AT], 2005
10. Freed, D., Hopkins, M., Teleman, C.: Loop groups and twisted K-theory III. Preprint, http://arxiv.org/abs/math/0312155v2 [math.AT], 2003
11. Gepner, D.: Fusion rings and geometry. Commun. Math. Phys. 141(2), 381–411 (1991)
12. Henkel, M.: Conformal Invariance and Critical Phenomena. Texts and Monographs in Physics. Berlin: Springer-Verlag, 1999
13. Kitchloo, N.: Dominant K-theory and integrable highest weight representations of Kac-Moody groups. Preprint, http://arxiv.org/abs/0710.0167v2 [math.AT], 2007
14. Kitchloo, N., Morava, J.: Thom prospectra for loopgroup representations. In: Elliptic cohomology, Volume 342 of London Math. Soc. Lecture Note Ser., Cambridge: Cambridge Univ. Press, 2007, pp. 214–238
15. May, J.P., Sigurdsson, J.: Parametrized homotopy theory, Volume 132 of Mathematical Surveys and Monographs. Providence, RI: Amer. Math. Soc., 2006
16. Meinrenken, E.: On the quantization of conjugacy classes. Preprint, http://arxiv.org/abs/0707.3963v1 [math.DG], 2007
17. Pittie, H.V.: Homogeneous vector bundles on homogeneous spaces. Topology 11, 199–203 (1972)
18. Rosenberg, J.: Continuous-trace algebras from the bundle theoretic point of view. J. Austral. Math. Soc. Ser. A 47(3), 368–381 (1989)
19. Segal, G.: Classifying spaces and spectral sequences. Inst. Hautes Études Sci. Publ. Math. 34, 105–112 (1968)
20. Steinberg, R.: On a theorem of Pittie. Topology 14, 173–177 (1975)
21. Tsvelik, A.M.: Quantum Field Theory in Condensed Matter Physics. Cambridge: Cambridge University Press, Second edition, 2003
22. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B 300(3), 360–376 (1988)
23. Wassermann, A.: Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N) using bounded operators. Invent. Math. 133(3), 467–538 (1998)
24. Witten, E.: Quantum field theory and the Jones polynomial. In: Braid group, knot theory and statistical mechanics, II. Volume 17 of Adv. Ser. Math. Phys., River Edge, NJ: World Sci. Publ., 1994, pp. 361–451

Communicated by Y. Kawahigashi

Commun. Math. Phys. 290, 357–369 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0831-3

Communications in

Mathematical Physics

Torelli Theorem for the Deligne–Hitchin Moduli Space Indranil Biswas1 , Tomás L. Gómez2,3 , Norbert Hoffmann4,5 , Marina Logares6 1 School of Mathematics, Tata Institute of Fundamental Research,

Homi Bhabha Road, Bombay 400005, India. E-mail: [email protected]

2 Instituto de Ciencias Matemáticas (CSIC-UAM-UC3M-UCM),

Serrano 113bis, 28006 Madrid, Spain

3 Facultad de Ciencias Matemáticas, Universidad Complutense de Madrid,

28040 Madrid, Spain. E-mail: [email protected]

4 Freie Universität Berlin, Institut für Mathematik, Arnimallee 3, 14195 Berlin, Germany 5 Universität Göttingen, Mathematisches Institut, Bunsenstrasse 3-5, 37073 Göttingen,

Germany. E-mail: [email protected]; [email protected]

6 Departamento de Matematica Pura, Facultade de Ciencias,

Rua do Campo Alegre 687, 4169-007 Porto, Portugal. E-mail: [email protected] Received: 9 September 2008 / Accepted: 2 March 2009 Published online: 7 May 2009 – © Springer-Verlag 2009

Abstract: Fix integers $g \ge 3$ and $r \ge 2$, with $r \ge 3$ if $g = 3$. Given a compact connected Riemann surface $X$ of genus $g$, let $\mathcal{M}_{DH}(X)$ denote the corresponding $SL(r, \mathbb{C})$ Deligne–Hitchin moduli space. We prove that the complex analytic space $\mathcal{M}_{DH}(X)$ determines (up to an isomorphism) the unordered pair $\{X, \bar X\}$, where $\bar X$ is the Riemann surface defined by the opposite almost complex structure on $X$.

1. Introduction

Let $X$ be a compact connected Riemann surface of genus $g$, with $g \ge 2$. We denote by $X_{\mathbb{R}}$ the $C^\infty$ real manifold of dimension two underlying $X$. Let $\bar X$ be the Riemann surface defined by the almost complex structure $-J_X$ on $X_{\mathbb{R}}$; here $J_X$ is the almost complex structure of $X$. Fix an integer $r \ge 2$. The main object of this paper is the $SL(r, \mathbb{C})$ Deligne–Hitchin moduli space $\mathcal{M}_{DH}(X) = \mathcal{M}_{DH}(X, SL(r, \mathbb{C}))$ associated to $X$. This moduli space $\mathcal{M}_{DH}(X)$ is a complex analytic space of complex dimension $1 + 2(r^2 - 1)(g - 1)$, which comes with a natural surjective holomorphic map $\mathcal{M}_{DH}(X) \longrightarrow \mathbb{CP}^1 = \mathbb{C} \cup \{\infty\}$. We briefly recall from [Si1, p. 7] the description of $\mathcal{M}_{DH}(X)$ (in [Si1], the group $GL(r, \mathbb{C})$ is considered instead of $SL(r, \mathbb{C})$).

• The fiber of $\mathcal{M}_{DH}(X)$ over $\lambda = 0 \in \mathbb{C} \subset \mathbb{CP}^1$ is the moduli space $\mathcal{M}_{Higgs}(X)$ of semistable $SL(r, \mathbb{C})$ Higgs bundles $(E, \theta)$ over $X$ (see Sect. 2 for details).


• The fiber of $\mathcal{M}_{DH}(X)$ over any $\lambda \in \mathbb{C}^* \subset \mathbb{CP}^1$ is canonically biholomorphic to the moduli space $\mathcal{M}_{conn}(X)$ of holomorphic $SL(r, \mathbb{C})$ connections $(E, \nabla)$ over $X$. In fact the restriction of $\mathcal{M}_{DH}(X)$ to $\mathbb{C} \subset \mathbb{CP}^1$ is the moduli space $\mathcal{M}_{Hod}(X) \longrightarrow \mathbb{C}$ of $\lambda$–connections over $X$ for the group $SL(r, \mathbb{C})$ (see Sect. 3 for details).
• The fiber of $\mathcal{M}_{DH}(X)$ over $\lambda = \infty \in \mathbb{CP}^1$ is the moduli space $\mathcal{M}_{Higgs}(\bar X)$ of semistable $SL(r, \mathbb{C})$ Higgs bundles over $\bar X$. Indeed, the complex analytic space $\mathcal{M}_{DH}(X)$ is constructed by glueing $\mathcal{M}_{Hod}(X)$ to the analogous moduli space $\mathcal{M}_{Hod}(\bar X) \longrightarrow \mathbb{C}$ of $\lambda$–connections over $\bar X$. One identifies the fiber of $\mathcal{M}_{Hod}(X)$ over $\lambda \in \mathbb{C}^*$ with the fiber of $\mathcal{M}_{Hod}(\bar X)$ over $1/\lambda \in \mathbb{C}^*$; the identification is done using the fact that the holomorphic connections over both $X$ and $\bar X$ correspond to representations of $\pi_1(X_{\mathbb{R}})$ in $SL(r, \mathbb{C})$ (see Sect. 4 for details).

This construction of $\mathcal{M}_{DH}(X)$ is due to Deligne [De]. In [Hi2], Hitchin constructed the twistor space for the hyper–Kähler structure of the moduli space $\mathcal{M}_{Higgs}(X)$; the complex analytic space $\mathcal{M}_{DH}(X)$ is identified with this twistor space (see [Si1, p. 8]). We note that while both $\mathcal{M}_{Hod}(X)$ and $\mathcal{M}_{Hod}(\bar X)$ are complex algebraic varieties, the moduli space $\mathcal{M}_{DH}(X)$ does not have any natural algebraic structure.

If we replace $X$ by $\bar X$, then the isomorphism class of the Deligne–Hitchin moduli space clearly remains unchanged. In fact, there is a canonical holomorphic isomorphism of $\mathcal{M}_{DH}(X)$ with $\mathcal{M}_{DH}(\bar X)$ over the automorphism of $\mathbb{CP}^1$ defined by $\lambda \longmapsto 1/\lambda$. We prove the following theorem (see Theorem 4.1):

Theorem 1.1. Assume that $g \ge 3$, and if $g = 3$, then assume that $r \ge 3$. The isomorphism class of the complex analytic space $\mathcal{M}_{DH}(X)$ determines uniquely the isomorphism class of the unordered pair of Riemann surfaces $\{X, \bar X\}$. In other words, if $\mathcal{M}_{DH}(X)$ is biholomorphic to the Deligne–Hitchin moduli space $\mathcal{M}_{DH}(Y)$ for another compact connected Riemann surface $Y$, then either $Y \cong X$ or $Y \cong \bar X$.

This paper is organized as follows. Higgs bundles are dealt with in Sect. 2; we also obtain a Torelli theorem for them (see Corollary 2.5). The $\lambda$–connections are considered in Sect. 3, which also contains a Torelli theorem for their moduli space (see Corollary 3.5). Finally, Sect. 4 deals with the Deligne–Hitchin moduli space; here we prove our main result.

2. Higgs Bundles

Let $X$ be a compact connected Riemann surface of genus $g$, with $g \ge 3$. Fix an integer $r \ge 2$. If $g = 3$, then we assume that $r \ge 3$. Let

$$\mathcal{M}_{r,\mathcal{O}_X} \qquad (2.1)$$

be the moduli space of semistable $SL(r, \mathbb{C})$–bundles on $X$. So $\mathcal{M}_{r,\mathcal{O}_X}$ parameterizes all S–equivalence classes of semistable vector bundles $E$ over $X$ of rank $r$ together with


an isomorphism $\bigwedge^r E \cong \mathcal{O}_X$. The moduli space $\mathcal{M}_{r,\mathcal{O}_X}$ is known to be an irreducible normal complex projective variety of dimension $(r^2 - 1)(g - 1)$. Let

$$\mathcal{M}^s_{r,\mathcal{O}_X} \subset \mathcal{M}_{r,\mathcal{O}_X} \qquad (2.2)$$

be the open subvariety parameterizing stable $SL(r, \mathbb{C})$ bundles on $X$. This open subvariety coincides with the smooth locus of $\mathcal{M}_{r,\mathcal{O}_X}$ according to [NR1, p. 20, Theorem 1].

Lemma 2.1. The holomorphic cotangent bundle

$$T^*\mathcal{M}^s_{r,\mathcal{O}_X} \longrightarrow \mathcal{M}^s_{r,\mathcal{O}_X}$$

does not admit any nonzero holomorphic section.

Proof. Fix a point $x_0 \in X$, and consider the Hecke correspondence

$$\mathcal{M}^s_{r,\mathcal{O}_X} \xleftarrow{\;q\;} \mathcal{P} \xrightarrow{\;p\;} \mathcal{U} \subseteq \mathcal{M}_{r,\mathcal{O}_X(x_0)}$$

defined as follows:

• $\mathcal{M}_{r,\mathcal{O}_X(x_0)}$ denotes the moduli space of stable vector bundles $F$ over $X$ of rank $r$ together with an isomorphism $\bigwedge^r F \cong \mathcal{O}_X(x_0)$.
• $\mathcal{U} \subseteq \mathcal{M}_{r,\mathcal{O}_X(x_0)}$ denotes the locus of all $F$ for which every subbundle $F' \subset F$ with $0 < \operatorname{rank}(F') < r$ has negative degree; such vector bundles $F$ are called $(0, 1)$–stable (see [NR2, p. 306, Def. 5.1], [BBGN, p. 563]).
• $p : \mathcal{P} \longrightarrow \mathcal{U}$ is the $\mathbb{P}^{r-1}$–bundle whose fiber over any vector bundle $F \in \mathcal{U}$ parameterizes all hyperplanes $H$ in the fiber $F_{x_0}$.
• $q : \mathcal{P} \longrightarrow \mathcal{M}^s_{r,\mathcal{O}_X}$ sends any vector bundle $F \in \mathcal{U}$ and hyperplane $H \subseteq F_{x_0}$ to the vector bundle $E$ given by the short exact sequence $0 \longrightarrow E \longrightarrow F \longrightarrow F_{x_0}/H \longrightarrow 0$ of coherent sheaves on $X$; here the quotient sheaf $F_{x_0}/H$ is supported at $x_0$.

As $\mathcal{M}_{r,\mathcal{O}_X(x_0)}$ is a smooth unirational projective variety (see [Se, p. 53]), it does not admit any nonzero holomorphic 1–form. The subset $\mathcal{U} \subseteq \mathcal{M}_{r,\mathcal{O}_X(x_0)}$ is open due to [BBGN, p. 563, Lemma 2], and the conditions on $r$ and $g$ ensure that the codimension of the complement $\mathcal{M}_{r,\mathcal{O}_X(x_0)} \setminus \mathcal{U}$ is at least two. Hence also $H^0(\mathcal{U}, T^*\mathcal{U}) = 0$ due to Hartogs' theorem. Since $H^0(\mathbb{P}^{r-1}, T^*\mathbb{P}^{r-1}) = 0$, any relative holomorphic 1–form on the $\mathbb{P}^{r-1}$–bundle $p : \mathcal{P} \longrightarrow \mathcal{U}$ vanishes identically. Thus we conclude that $H^0(\mathcal{P}, T^*\mathcal{P}) = 0$. The same follows for the variety $\mathcal{M}^s_{r,\mathcal{O}_X}$, because the algebraic map $q : \mathcal{P} \longrightarrow \mathcal{M}^s_{r,\mathcal{O}_X}$ is dominant.


We denote by $K_X$ the canonical line bundle on $X$. Let
$$\mathcal{M}_{\mathrm{Higgs}}(X) = \mathcal{M}_{\mathrm{Higgs}}(X, \mathrm{SL}(r,\mathbb{C}))$$
denote the moduli space of semistable $\mathrm{SL}(r,\mathbb{C})$ Higgs bundles over $X$. So $\mathcal{M}_{\mathrm{Higgs}}(X)$ parameterizes all S–equivalence classes of semistable pairs $(E, \theta)$ consisting of a vector bundle $E$ over $X$ of rank $r$ together with an isomorphism $\bigwedge^r E \cong \mathcal{O}_X$, and a Higgs field $\theta : E \to E \otimes K_X$ with $\mathrm{trace}(\theta) = 0$. The moduli space $\mathcal{M}_{\mathrm{Higgs}}(X)$ is an irreducible normal complex algebraic variety of dimension $2(r^2-1)(g-1)$ according to [Si3, p. 70, Theorem 11.1]. There is a natural embedding
$$\iota : \mathcal{M}_{r,\mathcal{O}_X} \hookrightarrow \mathcal{M}_{\mathrm{Higgs}}(X) \tag{2.3}$$
defined by $E \mapsto (E, 0)$. Let $\mathcal{M}^s_{\mathrm{Higgs}}(X) \subset \mathcal{M}_{\mathrm{Higgs}}(X)$ be the Zariski open locus of Higgs bundles $(E, \theta)$ whose underlying vector bundle $E$ is stable (openness of $\mathcal{M}^s_{\mathrm{Higgs}}(X)$ follows from [Ma, p. 635, Theorem 2.8(B)]). Let
$$\mathrm{pr}_E : \mathcal{M}^s_{\mathrm{Higgs}}(X) \to \mathcal{M}^s_{r,\mathcal{O}_X} \tag{2.4}$$
be the forgetful map defined by $(E, \theta) \mapsto E$, where $\mathcal{M}^s_{r,\mathcal{O}_X}$ is defined in (2.2). One has a canonical isomorphism
$$\mathcal{M}^s_{\mathrm{Higgs}}(X) \xrightarrow{\ \sim\ } T^*\mathcal{M}^s_{r,\mathcal{O}_X} \tag{2.5}$$
of varieties over $\mathcal{M}^s_{r,\mathcal{O}_X}$, because holomorphic cotangent vectors to a point $E \in \mathcal{M}^s_{r,\mathcal{O}_X}$ correspond, via deformation theory and Serre duality, to Higgs fields $\theta : E \to E \otimes K_X$ with $\mathrm{trace}(\theta) = 0$. In particular, $\mathcal{M}^s_{\mathrm{Higgs}}(X)$ is contained in the smooth locus
$$\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Higgs}}(X).$$

We recall that the Hitchin map
$$H : \mathcal{M}_{\mathrm{Higgs}}(X) \to \bigoplus_{i=2}^{r} H^0(X, K_X^{\otimes i}) \tag{2.6}$$
is defined by sending each Higgs bundle $(E, \theta)$ to the characteristic polynomial of $\theta$ [Hi1,Hi2]. The multiplicative group $\mathbb{C}^*$ acts on the moduli space $\mathcal{M}_{\mathrm{Higgs}}(X)$ as follows:
$$t \cdot (E, \theta) = (E, t\theta). \tag{2.7}$$
On the other hand, $\mathbb{C}^*$ acts on the Hitchin space $\bigoplus_{i=2}^{r} H^0(X, K_X^{\otimes i})$ as
$$t \cdot (v_2, \ldots, v_i, \ldots, v_r) = (t^2 v_2, \ldots, t^i v_i, \ldots, t^r v_r), \tag{2.8}$$
where $v_i \in H^0(X, K_X^{\otimes i})$ and $i \in \{2, \ldots, r\}$. The Hitchin map $H$ in (2.6) intertwines these two actions of $\mathbb{C}^*$. Note that there is no nonzero holomorphic function on the Hitchin space which is homogeneous of degree 1 for this action (a function $f$ is homogeneous of degree $d$ if $f(t \cdot (v_2, \ldots, v_r)) = t^d f((v_2, \ldots, v_r))$), because all the exponents of $t$ in (2.8) are at least two.
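The degree count behind this remark can be spelled out as follows. This is only a sketch: the linear coordinates $x_{i,j}$ on each summand $H^0(X, K_X^{\otimes i})$, the multi-index $a$ and the weight $w(a)$ are notation introduced here for illustration and do not appear in the paper.

```latex
% Assumption (ours): x_{i,j} are linear coordinates on H^0(X, K_X^{\otimes i}),
% 2 <= i <= r, so that t acts by x_{i,j} -> t^i x_{i,j} as in (2.8).
% A holomorphic function on the Hitchin space has a convergent power series:
\[
  f(x) = \sum_{a} c_a \prod_{i,j} x_{i,j}^{a_{i,j}}
  \quad\Longrightarrow\quad
  f(t\cdot x) = \sum_{a} c_a\, t^{w(a)} \prod_{i,j} x_{i,j}^{a_{i,j}},
  \qquad w(a) := \sum_{i,j} i\, a_{i,j}.
\]
% Homogeneity of degree 1 means f(t.x) = t f(x) for all t in C^*.
% Comparing coefficients of each monomial gives
\[
  c_a \neq 0 \;\Longrightarrow\; w(a) = 1 .
\]
% But w(0) = 0, and w(a) >= 2 for every a != 0 because each weight i is >= 2.
% Hence every c_a vanishes and f = 0.
```

The same comparison shows that a nonzero homogeneous function can only have a degree $d$ that is $0$ or at least $2$ for this action.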


Lemma 2.2. The holomorphic tangent bundle
$$T\mathcal{M}^s_{r,\mathcal{O}_X} \to \mathcal{M}^s_{r,\mathcal{O}_X}$$
does not admit any nonzero holomorphic section.

Proof. The proof of [Hi1, p. 110, Theorem 6.2] carries over to this situation as follows. A holomorphic section $s$ of $T\mathcal{M}^s_{r,\mathcal{O}_X}$ provides (by contraction) a holomorphic function
$$f : T^*\mathcal{M}^s_{r,\mathcal{O}_X} \to \mathbb{C} \tag{2.9}$$
on the total space of the cotangent bundle $T^*\mathcal{M}^s_{r,\mathcal{O}_X}$, which is linear on the fibers. Under the isomorphism in (2.5), it corresponds to a function on $\mathcal{M}^s_{\mathrm{Higgs}}(X)$. The conditions on $g$ and $r$ imply that the complement of $\mathcal{M}^s_{\mathrm{Higgs}}(X)$ has codimension at least two in $\mathcal{M}_{\mathrm{Higgs}}(X)$. Since the latter is normal, the function $f$ in (2.9) extends to a holomorphic function
$$\tilde{f} : \mathcal{M}_{\mathrm{Higgs}}(X) \to \mathbb{C},$$
for example by [Sc, p. 90, Cor. 2]. Since $f$ is linear on the fibers, we know that $\tilde{f}$ is homogeneous of degree 1 for the action (2.7) of $\mathbb{C}^*$. On the moduli space $\mathcal{M}_{\mathrm{Higgs}}(X)$, the Hitchin map (2.6) is proper [Ni, Theorem 6.1], and also its fibers are connected. Therefore, the function $\tilde{f}$ is constant on the fibers of the Hitchin map. Hence $\tilde{f}$ comes from a holomorphic function on the Hitchin space, which is still homogeneous of degree 1. We noted earlier that there are no nonzero holomorphic functions on the Hitchin space which are homogeneous of degree 1. Therefore, $\tilde{f} = 0$, and consequently we have $f = 0$ and $s = 0$.

Corollary 2.3. The restriction of the holomorphic tangent bundle
$$T\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}} \to \mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$$
to $\iota(\mathcal{M}^s_{r,\mathcal{O}_X}) \subset \mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$ does not admit any nonzero holomorphic section.

Proof. Using Lemma 2.2, it suffices to show that the normal bundle of the embedding
$$\iota : \mathcal{M}^s_{r,\mathcal{O}_X} \hookrightarrow \mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$$
has no nonzero holomorphic sections. The isomorphism in (2.5) allows us to identify this normal bundle with $T^*\mathcal{M}^s_{r,\mathcal{O}_X}$. Now the assertion follows from Lemma 2.1.

The next step is to show that the above property uniquely characterizes the subvariety $\iota(\mathcal{M}_{r,\mathcal{O}_X}) \subset \mathcal{M}_{\mathrm{Higgs}}(X)$. This will follow from the following proposition.

Proposition 2.4. Let $Z$ be an irreducible component of the fixed point locus
$$\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathbb{C}^*} \subseteq \mathcal{M}_{\mathrm{Higgs}}(X). \tag{2.10}$$
Then $\dim(Z) \le (r^2-1)(g-1)$, with equality only for $Z = \iota(\mathcal{M}_{r,\mathcal{O}_X})$.


Proof. The $\mathbb{C}^*$–equivariance of the Hitchin map $H$ in (2.6) implies
$$\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathbb{C}^*} \subseteq H^{-1}(0),$$
because 0 is the only fixed point in the Hitchin space. We recall that $H^{-1}(0)$ is called the nilpotent cone. The irreducible components of $H^{-1}(0)$ are parameterized by the conjugacy classes of the nilpotent elements in the Lie algebra $\mathfrak{sl}(r,\mathbb{C})$, and each irreducible component of $H^{-1}(0)$ is of dimension $(r^2-1)(g-1)$ [La]. Thus $\dim(Z) \le (r^2-1)(g-1)$, and if equality holds, then $Z$ is an irreducible component of the nilpotent cone $H^{-1}(0)$. A result due to Simpson, [Si3, p. 76, Lemma 11.9], implies that the only irreducible component of $H^{-1}(0)$ contained in the fixed point locus $\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathbb{C}^*}$ defined in (2.10) is the image $\iota(\mathcal{M}_{r,\mathcal{O}_X})$ of the embedding in (2.3).

Corollary 2.5. The isomorphism class of the complex analytic space $\mathcal{M}_{\mathrm{Higgs}}(X)$ determines uniquely the isomorphism class of the Riemann surface $X$, meaning if $\mathcal{M}_{\mathrm{Higgs}}(X)$ is biholomorphic to $\mathcal{M}_{\mathrm{Higgs}}(Y)$ for another compact connected Riemann surface $Y$ of the same genus $g$, then $Y \cong X$.

Proof. Let $Z \subset \mathcal{M}_{\mathrm{Higgs}}(X)$ be a closed analytic subset with the following three properties:
• $Z$ is irreducible and has complex dimension $(r^2-1)(g-1)$.
• The smooth locus $Z^{\mathrm{sm}} \subseteq Z$ lies in the smooth locus $\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Higgs}}(X)$.
• The restriction of the holomorphic tangent bundle $T\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$ to the subspace $Z^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$ has no nonzero holomorphic sections.

By Corollary 2.3, the image $\iota(\mathcal{M}_{r,\mathcal{O}_X})$ of the embedding $\iota$ in (2.3) has these properties. The action (2.7) of $\mathbb{C}^*$ on $\mathcal{M}_{\mathrm{Higgs}}(X)$ defines a holomorphic vector field
$$\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}} \to T\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}.$$
The third assumption on $Z$ says that any holomorphic vector field on $\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathrm{sm}}$ vanishes on $Z^{\mathrm{sm}}$. Therefore, it follows that the stabilizer of each point in $Z^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Higgs}}(X)$ has nontrivial tangent space at $1 \in \mathbb{C}^*$, and hence the stabilizer must be the full group $\mathbb{C}^*$. This shows that the fixed point locus $\mathcal{M}_{\mathrm{Higgs}}(X)^{\mathbb{C}^*} \subseteq \mathcal{M}_{\mathrm{Higgs}}(X)$ contains $Z^{\mathrm{sm}}$, and hence also contains its closure $Z$ in $\mathcal{M}_{\mathrm{Higgs}}(X)$.

Due to Proposition 2.4, this can only happen for $Z = \iota(\mathcal{M}_{r,\mathcal{O}_X})$. In particular, we have $Z \cong \mathcal{M}_{r,\mathcal{O}_X}$. We have just shown that the isomorphism class of $\mathcal{M}_{\mathrm{Higgs}}(X)$ determines the isomorphism class of $\mathcal{M}_{r,\mathcal{O}_X}$. The latter determines the isomorphism class of $X$ due to a theorem of Kouvidakis and Pantev [KP, p. 229, Theorem E].

Remark 2.6. In [BG], an analogous Torelli theorem is proved for Higgs bundles $(E, \theta)$ such that the rank and the degree of the underlying vector bundle $E$ are coprime.

3. The λ–Connections

In this section, we consider vector bundles with connections, and more generally with λ–connections in the sense of [Si2, p. 87] and [Si1, p. 4]. We denote by
$$\mathcal{M}_{\mathrm{Hod}}(X) = \mathcal{M}_{\mathrm{Hod}}(X, \mathrm{SL}(r,\mathbb{C}))$$
the moduli space of triples of the form $(\lambda, E, \nabla)$, where $\lambda$ is a complex number, and $(E, \nabla)$ is a λ–connection on $X$ for the group $\mathrm{SL}(r,\mathbb{C})$. We recall that given any $\lambda \in \mathbb{C}$, a λ–connection on $X$ for the group $\mathrm{SL}(r,\mathbb{C})$ is a pair $(E, \nabla)$, where


• $E \to X$ is a holomorphic vector bundle of rank $r$ together with an isomorphism $\bigwedge^r E \cong \mathcal{O}_X$.
• $\nabla : E \to E \otimes K_X$ is a $\mathbb{C}$–linear homomorphism of sheaves satisfying the following two conditions:
(1) If $f$ is a locally defined holomorphic function on $X$ and $s$ is a locally defined holomorphic section of $E$, then
$$\nabla(f s) = f \cdot \nabla(s) + \lambda \cdot s \otimes df.$$
(2) The operator $\bigwedge^r E \to (\bigwedge^r E) \otimes K_X$ induced by $\nabla$ coincides with $\lambda \cdot d$.
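As a quick illustration of the definition, the two conditions can be specialised to the two values of $\lambda$ that play a distinguished role in this section. The block below only restates conditions (1) and (2) for $\lambda = 0$ and $\lambda = 1$; it adds no hypothesis beyond the definition itself.

```latex
% Condition (1) for the two distinguished values of \lambda.
\[
  \lambda = 0:\qquad \nabla(fs) = f\cdot\nabla(s)
  \qquad\text{($\nabla$ is $\mathcal{O}_X$-linear, i.e. a Higgs field $\theta$),}
\]
\[
  \lambda = 1:\qquad \nabla(fs) = f\cdot\nabla(s) + s\otimes df
  \qquad\text{(the usual Leibniz rule of a holomorphic connection).}
\]
% Condition (2) for \lambda = 0: the operator induced by an O_X-linear \theta
% on \bigwedge^r E is multiplication by trace(\theta), so the condition reads
\[
  \operatorname{trace}(\theta) = 0 .
\]
```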

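The moduli space $\mathcal{M}_{\mathrm{Hod}}(X)$ is a complex algebraic variety of dimension $1 + 2(r^2-1)(g-1)$. It is equipped with a surjective algebraic morphism
$$\mathrm{pr}_\lambda : \mathcal{M}_{\mathrm{Hod}}(X) \to \mathbb{C} \tag{3.1}$$
defined by $(\lambda, E, \nabla) \mapsto \lambda$. A 0–connection is a Higgs bundle, so
$$\mathcal{M}_{\mathrm{Higgs}}(X) = \mathrm{pr}_\lambda^{-1}(0) \subset \mathcal{M}_{\mathrm{Hod}}(X)$$
is the moduli space of Higgs bundles considered in the previous section. In particular, the embedding (2.3) of $\mathcal{M}_{r,\mathcal{O}_X}$ into $\mathcal{M}_{\mathrm{Higgs}}(X)$ also gives an embedding of $\mathcal{M}_{r,\mathcal{O}_X}$ into $\mathcal{M}_{\mathrm{Hod}}(X)$. Slightly abusing notation, we denote this embedding again by
$$\iota : \mathcal{M}_{r,\mathcal{O}_X} \hookrightarrow \mathcal{M}_{\mathrm{Hod}}(X). \tag{3.2}$$
It maps the stable locus $\mathcal{M}^s_{r,\mathcal{O}_X} \subset \mathcal{M}_{r,\mathcal{O}_X}$ into the smooth locus
$$\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Hod}}(X). \tag{3.3}$$
We let $\mathbb{C}^*$ act on $\mathcal{M}_{\mathrm{Hod}}(X)$ as
$$t \cdot (\lambda, E, \nabla) = (t \cdot \lambda, E, t \cdot \nabla). \tag{3.4}$$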
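That formula (3.4) indeed lands in the moduli space, in other words that $t \cdot \nabla$ is a $(t\lambda)$–connection whenever $\nabla$ is a λ–connection, follows directly from the two defining conditions. A short check, using only the definition of a λ–connection recalled above:

```latex
% Condition (1): Leibniz-type rule with parameter t\lambda.
\[
  (t\nabla)(fs) = t\bigl(f\cdot\nabla(s) + \lambda\, s\otimes df\bigr)
               = f\cdot(t\nabla)(s) + (t\lambda)\, s\otimes df .
\]
% Condition (2): the operator induced by t\nabla on \bigwedge^r E is t times
% the operator induced by \nabla, hence
\[
  t\cdot(\lambda\cdot d) = (t\lambda)\cdot d .
\]
```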

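This extends the $\mathbb{C}^*$ action on $\mathcal{M}_{\mathrm{Higgs}}(X)$ introduced above in formula (2.7).

Proposition 3.1. Let $Z$ be an irreducible component of the fixed point locus
$$\mathcal{M}_{\mathrm{Hod}}(X)^{\mathbb{C}^*} \subseteq \mathcal{M}_{\mathrm{Hod}}(X).$$
Then $\dim(Z) \le (r^2-1)(g-1)$, with equality only for $Z = \iota(\mathcal{M}_{r,\mathcal{O}_X})$.

Proof. A point $(\lambda, E, \nabla) \in \mathcal{M}_{\mathrm{Hod}}(X)$ can only be fixed by $\mathbb{C}^*$ if $\lambda = 0$. Hence $Z$ is automatically contained in $\mathcal{M}_{\mathrm{Higgs}}(X)$. Now the claim follows from Proposition 2.4.

A 1–connection is a holomorphic connection in the usual sense, so
$$\mathcal{M}_{\mathrm{conn}}(X) := \mathrm{pr}_\lambda^{-1}(1) \subset \mathcal{M}_{\mathrm{Hod}}(X) \tag{3.5}$$
is the moduli space of $\mathrm{SL}(r,\mathbb{C})$ holomorphic connections $(E, \nabla)$ over $X$. We denote by
$$\mathcal{M}^s_{\mathrm{conn}}(X) \subset \mathcal{M}_{\mathrm{conn}}(X) \quad\text{and}\quad \mathcal{M}^s_{\mathrm{Hod}}(X) \subset \mathcal{M}_{\mathrm{Hod}}(X)$$
the Zariski open subvarieties where the underlying vector bundle $E$ is stable (openness follows from [Ma, p. 635, Theorem 2.8(B)]).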


Proposition 3.2. The forgetful map
$$\mathrm{pr}_E : \mathcal{M}^s_{\mathrm{conn}}(X) \to \mathcal{M}^s_{r,\mathcal{O}_X} \tag{3.6}$$
defined by $(E, \nabla) \mapsto E$ admits no holomorphic section.

Proof. This map $\mathrm{pr}_E$ is surjective, because a criterion due to Atiyah and Weil implies that every stable vector bundle $E$ on $X$ of degree zero admits a holomorphic connection. In fact, $E$ admits a unique unitary holomorphic connection according to a theorem of Narasimhan and Seshadri [NS]; this defines a canonical $C^\infty$ section
$$\mathcal{M}^s_{r,\mathcal{O}_X} \to \mathcal{M}^s_{\mathrm{conn}}(X) \tag{3.7}$$
of the map $\mathrm{pr}_E$. Since any two holomorphic $\mathrm{SL}(r,\mathbb{C})$–connections on $E$ differ by a Higgs field $\theta : E \to E \otimes K_X$ with $\mathrm{trace}(\theta) = 0$, the map $\mathrm{pr}_E$ in (3.6) is a holomorphic torsor under the holomorphic cotangent bundle $T^*\mathcal{M}^s_{r,\mathcal{O}_X} \to \mathcal{M}^s_{r,\mathcal{O}_X}$.

Given a complex manifold $M$, we denote by $T_{\mathbb{R}}M$ the tangent bundle of the underlying real manifold $M_{\mathbb{R}}$, and by $J_M : T_{\mathbb{R}}M \to T_{\mathbb{R}}M$ the almost complex structure of $M$. Let
$$\varpi : \mathcal{X} \to M \tag{3.8}$$
be a holomorphic torsor under a holomorphic vector bundle $V \to M$. To each $C^\infty$ section $s : M \to \mathcal{X}$ of $\varpi$, we can associate a (0, 1)–form
$$\bar\partial s \in C^\infty(M, \Omega^{0,1}_M \otimes V)$$
in the following way. The vector bundle homomorphism
$$\widetilde{ds} := ds + J_{\mathcal{X}} \circ ds \circ J_M : T_{\mathbb{R}}M \to s^*T_{\mathbb{R}}\mathcal{X}$$
satisfies the identity
$$J_{\mathcal{X}} \circ \widetilde{ds} + \widetilde{ds} \circ J_M = J_{\mathcal{X}} \circ ds - ds \circ J_M - J_{\mathcal{X}} \circ ds + ds \circ J_M = 0, \tag{3.9}$$
and, since $\varpi$ is holomorphic, we also have
$$d\varpi \circ \widetilde{ds} = d\varpi \circ ds + J_M \circ d\varpi \circ ds \circ J_M = \mathrm{id} - \mathrm{id} = 0. \tag{3.10}$$
The equation in (3.10) means that $\widetilde{ds}$ maps into the subbundle of vertical tangent vectors in $s^*T_{\mathbb{R}}\mathcal{X}$, which is canonically isomorphic to $V_{\mathbb{R}}$ (the real vector bundle under the complex vector bundle $V$). Thus we can consider $\widetilde{ds}$ as a real 1–form
$$\widetilde{ds} \in C^\infty(M, T^*_{\mathbb{R}}M \otimes V_{\mathbb{R}}).$$
Identify $T_{\mathbb{R}}M$ with $T^{0,1}M$ using the $\mathbb{R}$–linear isomorphism defined by
$$v \mapsto v - \sqrt{-1} \cdot J_M(v),$$
and also identify $V_{\mathbb{R}}$ with $V$ using the identity map. From (3.9) it follows that $\widetilde{ds}$ is actually a $\mathbb{C}$–linear homomorphism from $T^{0,1}M$ to $V$ in terms of these identifications. Let
$$\bar\partial s \in C^\infty(M, \Omega^{0,1}_M \otimes V)$$
be the (0, 1)–form with values in $V$ defined by $\widetilde{ds}$. From the construction of $\bar\partial s$ it is clear that


• $\bar\partial s$ vanishes if and only if $s$ is holomorphic, and
• $\bar\partial s$ is $\bar\partial$–closed.

Therefore, $\bar\partial s$ defines a Dolbeault cohomology class
$$[\varpi] := [\bar\partial s] \in H^{0,1}_{\bar\partial}(M, V) \cong H^1(M, V). \tag{3.11}$$
Since $V$ acts on $\varpi : \mathcal{X} \to M$, each section $v \in C^\infty(M, V)$ acts on the sections of $\varpi$; we denote this action by $s \mapsto v + s$. The above construction implies that
$$\bar\partial(v + s) = \bar\partial v + \bar\partial s. \tag{3.12}$$
Consequently, the Dolbeault cohomology class $[\varpi]$ in (3.11) does not depend on the choice of the $C^\infty$ section $s$. From (3.12) it also follows that $[\varpi]$ vanishes if and only if the torsor $\varpi$ in (3.8) admits a holomorphic section.

We now take $\varpi$ to be the torsor $\mathrm{pr}_E$ in (3.6) under the cotangent bundle $T^*\mathcal{M}^s_{r,\mathcal{O}_X}$, and we take $s$ to be the $C^\infty$ section in (3.7). For this case, the class
$$[\bar\partial s] \in H^1(\mathcal{M}^s_{r,\mathcal{O}_X}, T^*\mathcal{M}^s_{r,\mathcal{O}_X}) \tag{3.13}$$
has been computed in [BR, p. 308, Theorem 2.11]; the result is that it is a nonzero multiple of $c_1(\mathcal{L})$, where $\mathcal{L}$ is the ample generator of $\mathrm{Pic}(\mathcal{M}^s_{r,\mathcal{O}_X})$. In particular, the cohomology class (3.13) of the torsor $\mathrm{pr}_E$ in question is nonzero. Therefore, $\mathrm{pr}_E$ does not admit any holomorphic section.

We note that the forgetful map $\mathrm{pr}_E$ defined in Proposition 3.2 extends canonically from $\mathcal{M}^s_{\mathrm{conn}}(X)$ to $\mathcal{M}^s_{\mathrm{Hod}}(X)$. Slightly abusing notation, we denote this extended map again by
$$\mathrm{pr}_E : \mathcal{M}^s_{\mathrm{Hod}}(X) \to \mathcal{M}^s_{r,\mathcal{O}_X}.$$
This map is defined by $(\lambda, E, \nabla) \mapsto E$, and it also extends the map $\mathrm{pr}_E$ in (2.4).

Corollary 3.3. The only holomorphic map
$$s : \mathcal{M}^s_{r,\mathcal{O}_X} \to \mathcal{M}^s_{\mathrm{Hod}}(X)$$
with $\mathrm{pr}_E \circ s = \mathrm{id}$ is the restriction
$$\iota : \mathcal{M}^s_{r,\mathcal{O}_X} \hookrightarrow \mathcal{M}^s_{\mathrm{Hod}}(X)$$
of the embedding $\iota$ defined in (3.2).

Proof. The composition
$$\mathcal{M}^s_{r,\mathcal{O}_X} \xrightarrow{\ s\ } \mathcal{M}^s_{\mathrm{Hod}}(X) \xrightarrow{\ \mathrm{pr}_\lambda\ } \mathbb{C},$$
where $\mathrm{pr}_\lambda$ is the projection in (3.1), is a holomorphic function on $\mathcal{M}^s_{r,\mathcal{O}_X}$, and hence it is a constant function. Up to the $\mathbb{C}^*$ action in (3.4), we may assume that this constant is either 0 or 1.

If this constant were 1, then $s$ would factor through $\mathrm{pr}_\lambda^{-1}(1) = \mathcal{M}^s_{\mathrm{conn}}(X)$, which would contradict Proposition 3.2.

Hence this constant is 0, and $s$ factors through $\mathrm{pr}_\lambda^{-1}(0) = \mathcal{M}^s_{\mathrm{Higgs}}(X)$. Thus $s$ corresponds, under the isomorphism (2.5), to a holomorphic global section of the vector bundle $T^*\mathcal{M}^s_{r,\mathcal{O}_X}$. But any such section vanishes due to Lemma 2.1; this means that $s$ is indeed the restriction of the canonical embedding $\iota$ in (3.2).
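The normalisation step "up to the $\mathbb{C}^*$ action" in this proof can be made explicit. In the sketch below, the letter $c$ for the constant value of $\mathrm{pr}_\lambda \circ s$ and the rescaled map $s'$ are names introduced here for illustration; the computation uses only the formula (3.4) for the action.

```latex
% Assumption (ours): c denotes the constant value of pr_\lambda \circ s.
% If c != 0, rescale s by the element c^{-1} of C^*, acting as in (3.4):
\[
  s' := c^{-1}\cdot s, \qquad
  \mathrm{pr}_\lambda\circ s' = c^{-1}\,c = 1, \qquad
  \mathrm{pr}_E\circ s' = \mathrm{pr}_E\circ s = \mathrm{id},
\]
% because the action (3.4) multiplies \lambda by t and leaves the bundle E unchanged.
% So s' would be a holomorphic section of pr_E over M^s_conn(X),
% which is excluded by Proposition 3.2; therefore c = 0.
```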


Corollary 3.4. As in (3.3), let $\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$ be the smooth locus of $\mathcal{M}_{\mathrm{Hod}}(X)$. The restriction of the holomorphic tangent bundle
$$T\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}} \to \mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$$
to $\iota(\mathcal{M}^s_{r,\mathcal{O}_X}) \subset \mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$ does not admit any nonzero holomorphic section.

Proof. We denote the holomorphic normal bundle of the restricted embedding
$$\iota : \mathcal{M}^s_{r,\mathcal{O}_X} \hookrightarrow \mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$$
by $N$. Due to Lemma 2.2, it suffices to show that this vector bundle $N$ over $\mathcal{M}^s_{r,\mathcal{O}_X}$ has no nonzero holomorphic sections. One has a canonical isomorphism
$$\mathcal{M}^s_{\mathrm{Hod}}(X) \xrightarrow{\ \sim\ } N \tag{3.14}$$
of varieties over $\mathcal{M}^s_{r,\mathcal{O}_X}$, defined by sending any $(\lambda, E, \nabla)$ to the derivative at $t = 0$ of the map
$$\mathbb{C} \to \mathcal{M}_{\mathrm{Hod}}(X), \qquad t \mapsto (t \cdot \lambda, E, t \cdot \nabla).$$
Using this isomorphism, from Corollary 3.3 we conclude that the vector bundle $N$ over $\mathcal{M}^s_{r,\mathcal{O}_X}$ does not have any nonzero holomorphic sections. This completes the proof.

Corollary 3.5. The isomorphism class of the complex analytic space $\mathcal{M}_{\mathrm{Hod}}(X)$ determines uniquely the isomorphism class of the Riemann surface $X$.

Proof. The proof is similar to that of Corollary 2.5. Let $Z \subset \mathcal{M}_{\mathrm{Hod}}(X)$ be a closed analytic subset satisfying the following three conditions:
• $Z$ is irreducible and has complex dimension $(r^2-1)(g-1)$.
• The smooth locus $Z^{\mathrm{sm}} \subseteq Z$ lies in the smooth locus $\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{Hod}}(X)$.
• The restriction of the holomorphic tangent bundle $T\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$ to the subspace $Z^{\mathrm{sm}}$ has no nonzero holomorphic sections.

From Corollary 3.4 we know that $\iota(\mathcal{M}_{r,\mathcal{O}_X})$ satisfies all these conditions. Consider the vector field on $\mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}}$ given by the action of $\mathbb{C}^*$ on $\mathcal{M}_{\mathrm{Hod}}(X)$ in (3.4). From the third condition on $Z$ we know that this vector field vanishes on $Z^{\mathrm{sm}}$. This implies that the fixed point locus $\mathcal{M}_{\mathrm{Hod}}(X)^{\mathbb{C}^*}$ contains $Z^{\mathrm{sm}}$, and hence also contains its closure $Z$. Therefore, using Proposition 3.1 it follows that $Z = \iota(\mathcal{M}_{r,\mathcal{O}_X})$; in particular, $Z$ is isomorphic to $\mathcal{M}_{r,\mathcal{O}_X}$. Finally the isomorphism class of $X$ is recovered from the isomorphism class of $\mathcal{M}_{r,\mathcal{O}_X}$ using [KP, p. 229, Theorem E].

4. The Deligne–Hitchin Moduli Space

We recall Deligne's construction [De] of the Deligne–Hitchin moduli space $\mathcal{M}_{\mathrm{DH}}(X)$, as described in [Si1, p. 7]. Let $X_{\mathbb{R}}$ be the $C^\infty$ real manifold of dimension two underlying $X$. Fix a point $x_0 \in X_{\mathbb{R}}$. Let
$$\mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}}) := \mathrm{Hom}(\pi_1(X_{\mathbb{R}}, x_0), \mathrm{SL}(r,\mathbb{C}))/\!/\mathrm{SL}(r,\mathbb{C})$$


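denote the moduli space of representations $\rho : \pi_1(X_{\mathbb{R}}, x_0) \to \mathrm{SL}(r,\mathbb{C})$; the group $\mathrm{SL}(r,\mathbb{C})$ acts on $\mathrm{Hom}(\pi_1(X_{\mathbb{R}}, x_0), \mathrm{SL}(r,\mathbb{C}))$ through the adjoint action of $\mathrm{SL}(r,\mathbb{C})$ on itself. Since the fundamental groups for different base points are identified up to an inner automorphism, the space $\mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}})$ is independent of the choice of $x_0$. Hence we will omit any reference to $x_0$.

The Riemann–Hilbert correspondence defines a biholomorphic isomorphism
$$\mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}}) \xrightarrow{\ \sim\ } \mathcal{M}_{\mathrm{conn}}(X). \tag{4.1}$$
It sends a representation $\rho : \pi_1(X_{\mathbb{R}}) \to \mathrm{SL}(r,\mathbb{C})$ to the associated holomorphic $\mathrm{SL}(r,\mathbb{C})$–bundle $E^X_\rho$ over $X$, endowed with the induced connection $\nabla^X_\rho$. The inverse of (4.1) sends a connection to its monodromy representation, which makes sense because any holomorphic connection on a Riemann surface is automatically flat.

Given $\lambda \in \mathbb{C}^*$, we can similarly associate to a representation $\rho : \pi_1(X_{\mathbb{R}}) \to \mathrm{SL}(r,\mathbb{C})$ the λ–connection $(E^X_\rho, \lambda \cdot \nabla^X_\rho)$. This defines a holomorphic open embedding
$$\mathbb{C}^* \times \mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}}) \hookrightarrow \mathcal{M}_{\mathrm{Hod}}(X) \tag{4.2}$$
onto the open locus $\mathrm{pr}_\lambda^{-1}(\mathbb{C}^*) \subset \mathcal{M}_{\mathrm{Hod}}(X)$ of all triples $(\lambda, E, \nabla)$ with $\lambda \neq 0$.

Let $J_X$ denote the almost complex structure of the Riemann surface $X$. Then $-J_X$ is also an almost complex structure on $X_{\mathbb{R}}$; the Riemann surface defined by $-J_X$ will be denoted by $\overline{X}$. We can also consider the moduli space $\mathcal{M}_{\mathrm{Hod}}(\overline{X})$ of λ–connections on $\overline{X}$, etcetera. Now one defines the Deligne–Hitchin moduli space
$$\mathcal{M}_{\mathrm{DH}}(X) := \mathcal{M}_{\mathrm{Hod}}(X) \cup \mathcal{M}_{\mathrm{Hod}}(\overline{X})$$
by glueing $\mathcal{M}_{\mathrm{Hod}}(\overline{X})$ to $\mathcal{M}_{\mathrm{Hod}}(X)$, along the image of $\mathbb{C}^* \times \mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}})$ for the map in (4.2). More precisely, one identifies, for each $\lambda \in \mathbb{C}^*$ and each representation $\rho \in \mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}})$, the two points
$$(\lambda, E^X_\rho, \lambda \cdot \nabla^X_\rho) \in \mathcal{M}_{\mathrm{Hod}}(X) \quad\text{and}\quad (\lambda^{-1}, E^{\overline{X}}_\rho, \lambda^{-1} \cdot \nabla^{\overline{X}}_\rho) \in \mathcal{M}_{\mathrm{Hod}}(\overline{X}).$$
This identification yields a complex analytic space $\mathcal{M}_{\mathrm{DH}}(X)$ of dimension $2(r^2-1)(g-1) + 1$. This analytic space does not possess a natural algebraic structure since the Riemann–Hilbert correspondence (4.1) is holomorphic and not algebraic. The forgetful map $\mathrm{pr}_\lambda$ in (3.1) extends to a natural holomorphic morphism
$$\mathrm{pr} : \mathcal{M}_{\mathrm{DH}}(X) \to \mathbb{CP}^1 = \mathbb{C} \cup \{\infty\} \tag{4.3}$$
whose fiber over $\lambda \in \mathbb{CP}^1$ is canonically biholomorphic to
• the moduli space $\mathcal{M}_{\mathrm{Higgs}}(X)$ of $\mathrm{SL}(r,\mathbb{C})$ Higgs bundles on $X$ if $\lambda = 0$,
• the moduli space $\mathcal{M}_{\mathrm{Higgs}}(\overline{X})$ of $\mathrm{SL}(r,\mathbb{C})$ Higgs bundles on $\overline{X}$ if $\lambda = \infty$,
• the moduli space $\mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}})$ of equivalence classes of representations $\mathrm{Hom}(\pi_1(X_{\mathbb{R}}, x_0), \mathrm{SL}(r,\mathbb{C}))/\!/\mathrm{SL}(r,\mathbb{C})$ if $\lambda \neq 0, \infty$.

Now we are in a position to prove the main result.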


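Theorem 4.1. The isomorphism class of the complex analytic space $\mathcal{M}_{\mathrm{DH}}(X)$ determines uniquely the isomorphism class of the unordered pair of Riemann surfaces $\{X, \overline{X}\}$.

Proof. We denote by $\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{DH}}(X)$ the smooth locus, and by
$$T\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \to \mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}}$$
its holomorphic tangent bundle. Since $\mathcal{M}_{\mathrm{Hod}}(X)$ is open in $\mathcal{M}_{\mathrm{DH}}(X)$, Corollary 3.4 implies that the restriction of $T\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}}$ to
$$\iota(\mathcal{M}^s_{r,\mathcal{O}_X}) \subset \mathcal{M}_{\mathrm{Hod}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \tag{4.4}$$
does not admit any nonzero holomorphic section. The same argument applies if we replace $X$ by $\overline{X}$. Since $\mathcal{M}_{\mathrm{Hod}}(\overline{X})$ is also open in $\mathcal{M}_{\mathrm{DH}}(X)$, the restriction of $T\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}}$ to
$$\iota(\mathcal{M}^s_{r,\mathcal{O}_{\overline{X}}}) \subset \mathcal{M}_{\mathrm{Hod}}(\overline{X})^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \tag{4.5}$$
does not admit any nonzero holomorphic section either. Here $\mathcal{M}_{r,\mathcal{O}_{\overline{X}}}$ is the moduli space of holomorphic $\mathrm{SL}(r,\mathbb{C})$–bundles $E$ on $\overline{X}$, and $\iota$ denotes, as in (2.3) and in (3.2), the canonical embedding of $\mathcal{M}_{r,\mathcal{O}_{\overline{X}}}$ into $\mathcal{M}_{\mathrm{Higgs}}(\overline{X}) \subset \mathcal{M}_{\mathrm{Hod}}(\overline{X})$ defined by $E \mapsto (E, 0)$.

The rest of the proof is similar to that of Corollary 2.5. We will extend the $\mathbb{C}^*$ action on $\mathcal{M}_{\mathrm{Hod}}(X)$ in (3.4) to $\mathcal{M}_{\mathrm{DH}}(X)$. First consider the action of $\mathbb{C}^*$ on $\mathcal{M}_{\mathrm{Hod}}(\overline{X})$ defined as in (3.4) by substituting $\overline{X}$ in place of $X$. Note that the action of any $t \in \mathbb{C}^*$ on the open subset $\mathbb{C}^* \times \mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}}) \hookrightarrow \mathcal{M}_{\mathrm{Hod}}(X)$ in (4.2) coincides with the action of $1/t$ on $\mathbb{C}^* \times \mathcal{M}_{\mathrm{rep}}(X_{\mathbb{R}}) \hookrightarrow \mathcal{M}_{\mathrm{Hod}}(\overline{X})$. Therefore, we get an action of $\mathbb{C}^*$ on $\mathcal{M}_{\mathrm{DH}}(X)$. Let
$$\eta : \mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \to T\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \tag{4.6}$$
be the holomorphic vector field defined by this action of $\mathbb{C}^*$.

Let $Z \subset \mathcal{M}_{\mathrm{DH}}(X)$ be a closed analytic subset with the following three properties:
• $Z$ is irreducible and has complex dimension $(r^2-1)(g-1)$.
• The smooth locus $Z^{\mathrm{sm}} \subseteq Z$ lies in the smooth locus $\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}} \subset \mathcal{M}_{\mathrm{DH}}(X)$.
• The restriction of the holomorphic tangent bundle $T\mathcal{M}_{\mathrm{DH}}(X)^{\mathrm{sm}}$ to the subspace $Z^{\mathrm{sm}}$ has no nonzero holomorphic sections.

We noted above that both $\iota(\mathcal{M}_{r,\mathcal{O}_X})$ and $\iota(\mathcal{M}_{r,\mathcal{O}_{\overline{X}}})$ (see (4.4) and (4.5)) satisfy these conditions. The third condition on $Z$ implies that the vector field $\eta$ in (4.6) vanishes on $Z^{\mathrm{sm}}$. It follows that the fixed point locus $\mathcal{M}_{\mathrm{DH}}(X)^{\mathbb{C}^*}$ contains $Z^{\mathrm{sm}}$, and hence also contains its closure $Z$. Therefore, using Proposition 3.1 we conclude that $Z$ is one of $\iota(\mathcal{M}_{r,\mathcal{O}_X})$ and $\iota(\mathcal{M}_{r,\mathcal{O}_{\overline{X}}})$. Using [KP, p. 229, Theorem E] we now know that the isomorphism class of the analytic space $\mathcal{M}_{\mathrm{DH}}(X)$ determines the isomorphism class of the unordered pair of Riemann surfaces $\{X, \overline{X}\}$. This completes the proof of the theorem.

Acknowledgements. The first and second authors were supported by the grant MTM2007-63582 of the Spanish Ministerio de Educación y Ciencia. The second author was also supported by the grant 200650M066 of Comunidad Autónoma de Madrid. The third author was supported by the SFB/TR 45 'Periods, moduli spaces and arithmetic of algebraic varieties'. The fourth author was supported by the grant SFRH/BPD/27039/2006 of the Fundação para a Ciência e a Tecnologia and CMUP, financed by F.C.T. (Portugal) through the programmes POCTI and POSI, with national and European Community structural funds.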


References

[BBGN] Biswas, I., Brambila-Paz, L., Gómez, T.L., Newstead, P.E.: Stability of the Picard bundle. Bull. London Math. Soc. 34, 561–568 (2002)
[BG] Biswas, I., Gómez, T.L.: A Torelli theorem for the moduli space of Higgs bundles on a curve. Quart. J. Math. 54, 159–169 (2003)
[BR] Biswas, I., Raghavendra, N.: Curvature of the determinant bundle and the Kähler form over the moduli of parabolic bundles for a family of pointed curves. Asian J. Math. 2, 303–324 (1998)
[De] Deligne, P.: Letter to C. T. Simpson (March 20, 1989)
[Hi1] Hitchin, N.J.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
[Hi2] Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. Lond. Math. Soc. 55, 59–126 (1987)
[KP] Kouvidakis, A., Pantev, T.: The automorphism group of the moduli space of semistable vector bundles. Math. Ann. 302, 225–268 (1995)
[La] Laumon, G.: Un analogue global du cône nilpotent. Duke Math. J. 57, 647–671 (1988)
[Ma] Maruyama, M.: Openness of a family of torsion free sheaves. J. Math. Kyoto Univ. 16, 627–637 (1976)
[NR1] Narasimhan, M.S., Ramanan, S.: Moduli of vector bundles on a compact Riemann surface. Ann. Math. 89, 14–51 (1969)
[NR2] Narasimhan, M.S., Ramanan, S.: Geometry of Hecke cycles. I. In: C. P. Ramanujan—a tribute, Tata Inst. Fund. Res. Studies in Math. 8, Berlin-New York: Springer, 1978, pp. 291–345
[NS] Narasimhan, M.S., Seshadri, C.S.: Stable and unitary vector bundles on a compact Riemann surface. Ann. of Math. 82, 540–567 (1965)
[Ni] Nitsure, N.: Moduli space of semistable pairs on a curve. Proc. Lond. Math. Soc. 62, 275–300 (1991)
[Sc] Scheja, G.: Fortsetzungssätze der komplex-analytischen Cohomologie und ihre algebraische Charakterisierung. Math. Ann. 157, 75–94 (1964)
[Se] Seshadri, C.S.: Fibrés vectoriels sur les courbes algébriques (notes written by J.-M. Drézet), Astérisque 96, Paris: Société Math. de France, 1982
[Si1] Simpson, C.T.: A weight two phenomenon for the moduli of rank one local systems on open varieties. http://arxiv.org/abs/:0710.2800.v1[math.AG], 2007
[Si2] Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety. I. Inst. Hautes Études Sci. Publ. Math. 79, 47–129 (1994)
[Si3] Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety. II. Inst. Hautes Études Sci. Publ. Math. 80, 5–79 (1994)

Communicated by N. A. Nekrasov

Commun. Math. Phys. 290, 371–387 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0842-0

Communications in

Mathematical Physics

Growth of Sobolev Norms and Controllability of the Schrödinger Equation

Vahagn Nersesyan

Laboratoire de Mathématiques, Université de Paris-Sud XI, Bâtiment 425, 91405 Orsay Cedex, France. E-mail: [email protected]

Received: 18 September 2008 / Accepted: 27 February 2009
Published online: 15 May 2009 – © Springer-Verlag 2009

Abstract: In this paper we obtain a stabilization result for both linear and nonlinear Schrödinger equations under generic assumptions on the potential. Then we consider the Schrödinger equations with a potential which has a random time-dependent amplitude. We show that if the distribution of the amplitude is sufficiently non-degenerate, then any trajectory of the system is almost surely non-bounded in Sobolev spaces.

Contents

1. Introduction
2. Preliminaries
3. Controllability of the Schrödinger Equation
   3.1 Stabilization result
   3.2 Approximate controllability
   3.3 Proof of Theorem 3.3
   3.4 Genericity of Condition 3.2
4. Applications
   4.1 Nonlinear Schrödinger equation
   4.2 Randomly forced Schrödinger equation
       4.2.1 Growth of Sobolev norms
       4.2.2 Proof of Theorem 4.7
References

1. Introduction

We consider the problem
$$i\dot{z} = -\Delta z + V(x)z + u(t)Q(x)z, \quad x \in D, \tag{1.1}$$
$$z|_{\partial D} = 0, \tag{1.2}$$
$$z(0, x) = z_0(x), \tag{1.3}$$


where $D \subset \mathbb{R}^m$ is a bounded domain with smooth boundary, $V, Q \in C^\infty(\overline{D}, \mathbb{R})$ are given functions, $u$ is the control, and $z$ is the state. Under some hypotheses on $V$ and $Q$ (see Condition 3.2), we prove a stabilization result for problem (1.1), (1.2). This result is then applied to show that almost any trajectory of a random Schrödinger equation is non-bounded in Sobolev spaces. As shown in Sect. 3.4, the hypotheses on $V$ and $Q$ are in some sense generic.

Let us recall some previous results on the controllability of Schrödinger equations. A general negative result for bilinear control systems is obtained by Ball, Marsden and Slemrod [5]. Application of this result to (1.1), (1.2) implies that the set of attainable points from any initial data in $H^2$ admits a dense complement in $H^2$. We refer the reader to the papers [1,3,4,27,29] and the references therein for controllability of finite-dimensional systems. In [7], Beauchard proves that one can obtain an exact controllability result if the phase space is properly chosen. More precisely, in the case $m = 1$, $V(x) = 0$ and $Q(x) = x$, exact controllability of the problem is proved in $H^7$-neighborhoods of the eigenstates. Beauchard and Coron [8] later established a partial global exact controllability result, showing that the system in question is also controlled between some neighborhoods of any two eigenstates. A stabilization property for finite-dimensional approximations of the Schrödinger equation is obtained by Beauchard et al. in [9], which was generalized by Beauchard and Mirrahimi [10] to the infinite-dimensional case for $m = 1$, $V(x) = 0$ and $Q(x) = x$ (see also the paper by Mirrahimi [22]). Recently Chambrion et al. [14], under some assumptions on $V, Q \in C^\infty(\overline{D}, \mathbb{R})$, derived the approximate controllability of (1.1), (1.2) in $L^2$ from the controllability of finite-dimensional projections. See also the papers [6,12,16,20,21,32] and the references therein for controllability results by boundary controls and controls supported in a given subdomain and the book [15] by Coron for an introduction to the later developments and methods in the control theory of nonlinear systems.

The main result of this paper states that any neighborhood of the first eigenfunction of the operator $-\Delta + V$ is attainable from any initial point $z_0 \in H^2$. This result, combined with the time reversibility property of the system and the fact that the equation is linear, implies an approximate controllability property in $L^2$.

Let us describe in a few words the main ideas of the proof. As $V$, $Q$ and $u$ are real-valued, the $L^2$ norm is preserved by the flow of the system. Thus it suffices to consider the restriction of (1.1), (1.2) to the unit sphere $S$ in $L^2$. We introduce a Lyapunov function $\mathcal{V}(z)$ that controls the $H^2$-norm of $z$. The infimum of $\mathcal{V}$ on the sphere $S$ is attained at the first eigenfunction $e_{1,V}$ of the operator $-\Delta + V$. Using the same idea as in [9], which consists in generating trajectories with Lyapunov techniques, we choose a feedback law $u(z)$ such that the function $\mathcal{V}$ decreases on the solutions of the corresponding system:
$$\mathcal{V}(U_t(z_0, u)) < \mathcal{V}(z_0), \quad t > 0,$$
where $U_t(\cdot, u)$ is the resolving operator of (1.1), (1.2). Then iterating this construction and using the fact that the system is autonomous, we prove that the $H^2$-weak ω-limit set of any solution contains the minimum point of the function $\mathcal{V}$, i.e. the eigenfunction $e_{1,V}$ (see Sects. 3.1 and 3.3). The ideas of the proof work also in the case of the nonlinear equation. Furthermore, these controllability results are generalized in [24] to higher Sobolev spaces $H^l$, $l > 2$. Under the same hypotheses on $V$ and $Q$, we prove global approximate controllability for problem (1.1), (1.2).

We next use the above-mentioned controllability result to study the large time behavior of solutions of a random Schrödinger equation. We prove that if the distribution of the random potential is sufficiently non-degenerate (see Condition 4.6), then the trajectories of the system are almost surely non-bounded. It is interesting to compare this


result with that of Eliasson and Kuksin [17], where the KAM technique is applied to prove the reducibility of a linear Schrödinger equation with time-quasiperiodic potential. In particular, it is proved that for most values of the frequency vector the Sobolev norms of the solutions are bounded. Examples of unbounded solutions of 1D linear Schrödinger equations with some random potentials are constructed in [11,18], where also the growth rate estimates are given. Our assumptions on the distribution of the potential are more general, and the proof also works in the case of the nonlinear equation. However, at this level of generality, we do not have any lower bound on the rate of growth of Sobolev norms. Let us mention also the papers [30,31] by Wang, where the growth of Sobolev norms for linear Schrödinger equations is estimated and some time periodic potentials are constructed such that the Sobolev norms of the solutions for the corresponding problem remain bounded.

The idea of the proof is to show that the first entrance time to any ball centered at the origin in $H^{-\varepsilon}$ is almost surely finite. This implies immediately that almost any trajectory of the system approaches the origin arbitrarily closely in $H^{-\varepsilon}$. Combining this with the fact that the $L^2$-norm is preserved, we conclude that almost any trajectory is non-bounded in $H^l$ for any $l > 0$.

In conclusion, let us note that the results of this paper imply the irreducibility in $L^2$ of the Markov chain associated with (1.1). This property is not sufficient to prove the ergodicity of the dynamics generated by the Schrödinger equation with a random potential. However, in the case of finite-dimensional approximations, that question is treated in the paper [23], in which an exponential mixing property is established. We hope the methods developed in this work will help to tackle the infinite-dimensional case.

Notation. In this paper we use the following notation. Let $D \subset \mathbb{R}^m$, $m \ge 1$, be a bounded domain with smooth boundary. Let $H^s := H^s(D)$ be the Sobolev space of order $s \in \mathbb{R}$ endowed with the norm $\|\cdot\|_s$. Consider the operators $-\Delta z + Vz$, $z \in D(-\Delta + V) := H^1_0 \cap H^2$, where $V \in C^\infty(\overline{D}, \mathbb{R})$. We denote by $\{\lambda_{j,V}\}$ and $\{e_{j,V}\}$ the sets of eigenvalues and normalized eigenfunctions of $-\Delta + V$. Let $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ be the scalar product and the norm in the space $L^2$. Let $S$ be the unit sphere in $L^2$. For a Banach space $X$, we shall denote by $B_X(a, r)$ the open ball of radius $r > 0$ centered at $a \in X$.

2. Preliminaries

The following lemma establishes the well-posedness of system (1.1)–(1.3).

Lemma 2.1. For any $z_0 \in H^1_0 \cap H^2$ (resp. $z_0 \in L^2$) and for any $u \in L^1_{\mathrm{loc}}([0, \infty), \mathbb{R})$ problem (1.1)–(1.3) has a unique solution $z \in C([0, \infty), H^2)$ (resp. $z \in C([0, \infty), L^2)$). Furthermore, the resolving operator $U_t(\cdot, u) : L^2 \to L^2$ taking $z_0$ to $z(t)$ satisfies the relation
$$\|U_t(z_0, u)\| = \|z_0\|, \quad t \ge 0. \tag{2.1}$$

See [13] for the proof. Notice that the conservation of the $L^2$-norm implies that it suffices to consider the controllability properties of (1.1), (1.2) on the unit sphere $S$.

In Sect. 4.2, we replace the control $u$ by a random process. Namely, we consider the equation
$$i\dot{z} = -\Delta z + V(x)z + \beta(t)Q(x)z, \quad x \in D,$$


where β(t) is a random process of the form β(t) =

+∞

Ik (t)ηk (t − k), t ≥ 0.

(2.2)

k=0

Here $I_k(\cdot)$ is the indicator function of the interval $[k, k+1)$ and $\eta_k$ are independent identically distributed (i.i.d.) random variables in $L^2([0,1], \mathbb{R})$. Let $z_0$ be an $L^2$-valued random variable independent of $\{\eta_k\}$. Denote by $\mathcal{F}_k$ the $\sigma$-algebra generated by $z_0, \eta_0, \ldots, \eta_{k-1}$.

Lemma 2.2. Under the above conditions, $U_k(\cdot, \beta)$ is a homogeneous Markov chain with respect to $\mathcal{F}_k$.

This lemma is proved by standard arguments (e.g., see [25]).

3. Controllability of the Schrödinger Equation

3.1. Stabilization result. Let us introduce the Lyapunov function

$$\mathcal{V}(z) := \alpha\|(-\Delta + V)P_{1,V}z\|^2 + 1 - |\langle z, e_{1,V}\rangle|^2, \quad z \in S \cap H^1_0 \cap H^2,$$

where $\alpha > 0$ and $P_{1,V}z := z - \langle z, e_{1,V}\rangle e_{1,V}$ is the orthogonal projection in $L^2$ onto the closure of the vector span of $\{e_{k,V}\}_{k \ge 2}$. Notice that $\mathcal{V}(z) \ge 0$ for all $z \in S \cap H^1_0 \cap H^2$ and $\mathcal{V}(z) = 0$ if and only if $z = ce_{1,V}$, $|c| = 1$. For any $z \in S \cap H^1_0 \cap H^2$, we have

$$\mathcal{V}(z) \ge \alpha\|(-\Delta + V)P_{1,V}z\|^2 \ge \frac{\alpha}{2}\|\Delta(P_{1,V}z)\|^2 - C_1 \ge \frac{\alpha}{4}\|\Delta z\|^2 - C_2.$$

Thus

$$C(1 + \mathcal{V}(z)) \ge \|\Delta z\|^2 \tag{3.1}$$

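for some constant $C > 0$. Following the ideas of [9], we wish to choose a feedback law $u(\cdot)$ such that

$$\frac{d}{dt}\mathcal{V}(z(t)) \le 0$$

for the solution $z(t)$ of (1.1)–(1.3). Let us assume that $\Delta z(t) \in H^1_0 \cap H^2$ for all $t \ge 0$. Using (1.1), we get

$$\begin{aligned}
\frac{d}{dt}\mathcal{V}(z(t)) &= 2\alpha\,\mathrm{Re}\langle(-\Delta + V)P_{1,V}\dot z, (-\Delta + V)P_{1,V}z\rangle - 2\,\mathrm{Re}\big(\langle\dot z, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big) \\
&= 2\alpha\,\mathrm{Re}\langle(-\Delta + V)P_{1,V}(i\Delta z - iVz - iuQz), (-\Delta + V)P_{1,V}z\rangle \\
&\quad - 2\,\mathrm{Re}\big(\langle i\Delta z - iVz - iuQz, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big).
\end{aligned}$$

Integrating by parts and using the fact that

$$(-\Delta + V)P_{1,V}z|_{\partial D} = z|_{\partial D} = e_{1,V}|_{\partial D} = 0,$$

we obtain

$$\begin{aligned}
&2\alpha\,\mathrm{Re}\langle -i(-\Delta + V)^2 P_{1,V}z, (-\Delta + V)P_{1,V}z\rangle - 2\,\mathrm{Re}\big(\langle i\Delta z - iVz, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big) \\
&\quad = 2\alpha\,\mathrm{Re}\langle -i\nabla(-\Delta + V)P_{1,V}z, \nabla(-\Delta + V)P_{1,V}z\rangle \\
&\qquad + 2\alpha\,\mathrm{Re}\langle -iV(-\Delta + V)P_{1,V}z, (-\Delta + V)P_{1,V}z\rangle + 2\lambda_{1,V}\,\mathrm{Re}\big(i\langle z, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big) = 0.
\end{aligned}$$

Thus

$$\frac{d}{dt}\mathcal{V}(z(t)) = 2u\,\mathrm{Im}\big(\alpha\langle(-\Delta + V)P_{1,V}(Qz), (-\Delta + V)P_{1,V}z\rangle - \langle Qz, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big).$$

Let us take

$$u(z) := -\delta\,\mathrm{Im}\big(\alpha\langle(-\Delta + V)P_{1,V}(Qz), (-\Delta + V)P_{1,V}z\rangle - \langle Qz, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big), \tag{3.2}$$

where $\delta > 0$ is a small constant. Then

$$\frac{d}{dt}\mathcal{V}(z(t)) = -\frac{2}{\delta}u^2(z(t)). \tag{3.3}$$

Consider the equation

$$i\dot z = -\Delta z + V(x)z + u(z)Q(x)z, \quad x \in D. \tag{3.4}$$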

Proposition 3.1. For any $z_0 \in H^1_0 \cap H^2$ problem (3.4), (1.2), (1.3) has a unique solution $z \in C([0,\infty), H^1_0 \cap H^2)$. Moreover, the following properties hold:
(i) If $\Delta z_0 \in H^1_0 \cap H^2$, then $\Delta z \in C([0,\infty), H^1_0 \cap H^2)$.
(ii) Let $U_t(\cdot) : H^1_0 \cap H^2 \to H^1_0 \cap H^2$ be the resolving operator. If $T > 0$, $z_n \in H^1_0 \cap H^2$ and $z_n \rightharpoonup z_0$ in $H^2$, then $U_T(z_{k_n}) \rightharpoonup U_T(z_0)$ in $H^2$ for some sequence $k_n \ge 1$.

Sketch of the proof. The local well-posedness of (3.4), (1.2) and (1.3) is standard (see [13]). From the construction of the feedback $u$ it follows that a finite-time blow-up is impossible. Hence the solution is global in time. Let us show that $u(z_n) \to u(z_0)$ for any $z_n \in H^1_0 \cap H^2$ such that $z_n \rightharpoonup z_0$ in $H^2$. Notice that (3.2) and the fact that $Q$ is real imply that

$$u(z) = -\delta\,\mathrm{Im}\big(\alpha\langle Q(-\Delta + V)z, (-\Delta + V)z\rangle\big) + \tilde u(z) = \tilde u(z),$$

where

$$\begin{aligned}
\tilde u(z) = -\delta\,\mathrm{Im}\big(&\alpha\langle(-\Delta + V)P_{1,V}(Qz), (-\Delta + V)(-\langle z, e_{1,V}\rangle e_{1,V})\rangle \\
&+ \alpha\langle(-\Delta + V)(-\langle Qz, e_{1,V}\rangle e_{1,V}), (-\Delta + V)z\rangle \\
&+ \alpha\langle -2\nabla Q\cdot\nabla z - z\Delta Q, (-\Delta + V)z\rangle - \langle Qz, e_{1,V}\rangle\langle e_{1,V}, z\rangle\big).
\end{aligned}$$

Thus $\tilde u(z_n) \to \tilde u(z_0)$. As $U_t(z_n)$ is bounded in $C([0,T], H^2)$, for some sequence $k_n \ge 1$ we have $U_\cdot(z_{k_n}) \to U(\cdot)$ in $L^2([0,T], L^2)$ and $U_t(z_{k_n}) \rightharpoonup U(t)$ in $H^2$ for all $t \in [0,T]$. Passing to the limit in (3.4), we see that $U(t)$ is a solution of problem (3.4), (1.2), (1.3). Uniqueness gives $U(t) = U_t(z_0)$. This completes the proof.
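The mechanism behind (3.2) and (3.3) can be illustrated on a finite-dimensional model. The following is only a sketch under stated assumptions, not the construction used in the paper: the operator $-\Delta + V$ is replaced by a hypothetical diagonal matrix `diag(lam)` written in its own eigenbasis, `Q` is a hypothetical real symmetric matrix, and the control is frozen on each small time step so that every step is an exact unitary map. The names `lyapunov`, `feedback`, `step` and `simulate` are introduced here for illustration.

```python
import numpy as np

def lyapunov(z, lam, alpha):
    """V(z) = alpha*||A P1 z||^2 + 1 - |<z, e1>|^2 with A = diag(lam)."""
    p1z = z.copy()
    p1z[0] = 0.0                      # P1 removes the component along e1
    return alpha * np.linalg.norm(lam * p1z) ** 2 + 1.0 - abs(z[0]) ** 2

def feedback(z, lam, Q, alpha, delta):
    """Finite-dimensional analogue of the feedback law (3.2)."""
    qz = Q @ z
    p1z, p1qz = z.copy(), qz.copy()
    p1z[0] = 0.0
    p1qz[0] = 0.0
    # <a, b> is linear in a and conjugate-linear in b, i.e. np.vdot(b, a)
    first = alpha * np.vdot(lam * p1z, lam * p1qz)   # alpha*<A P1 Qz, A P1 z>
    second = qz[0] * np.conj(z[0])                   # <Qz, e1><e1, z>
    return -delta * (first - second).imag

def step(z, lam, Q, u, dt):
    """Exact unitary step for i z' = (A + u Q) z with u frozen on the step."""
    w, v = np.linalg.eigh(np.diag(lam) + u * Q)
    return v @ (np.exp(-1j * dt * w) * (v.conj().T @ z))

def simulate(z0, lam, Q, alpha=0.1, delta=0.5, dt=1e-3, n_steps=1000):
    z = z0.astype(complex)
    values = [lyapunov(z, lam, alpha)]
    for _ in range(n_steps):
        z = step(z, lam, Q, feedback(z, lam, Q, alpha, delta), dt)
        values.append(lyapunov(z, lam, alpha))
    return z, np.array(values)

# Hypothetical data: three levels, all off-diagonal couplings of Q nonzero.
lam = np.array([1.0, 2.0, 4.0])
Q = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
z0 = np.array([0.8, 0.6j, 0.0])
z_end, values = simulate(z0, lam, Q)
```

Each step preserves the norm of `z`, mirroring (2.1), and for a small step size the recorded values of the Lyapunov function decrease overall, as (3.3) suggests. The sketch says nothing about the infinite-dimensional questions (weak versus strong convergence in $H^2$) treated in the text.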


Thus if $z_0, \Delta z_0 \in H^1_0 \cap H^2$, then (3.3) is verified for $z(t) = U_t(z_0)$. A density argument proves the identity for any $z_0 \in H^1_0 \cap H^2$. Let us assume that the functions $V$ and $Q$ satisfy the following condition:

Condition 3.2. The functions $V, Q \in C^\infty(D, \mathbb{R})$ are such that:
(i) $\langle Qe_{1,V}, e_{j,V}\rangle \ne 0$ for all $j \ge 2$,
(ii) $\lambda_{1,V} - \lambda_{j,V} \ne \lambda_{p,V} - \lambda_{q,V}$ for all $j, p, q \ge 1$ such that $\{1, j\} \ne \{p, q\}$ and $j \ne 1$.

The theorem below is the main result of this section.

Theorem 3.3. Under Condition 3.2, there is a finite or countable set $J \subset \mathbb{R}^*_+$ such that for any $\alpha \notin J$ and $z_0 \in S \cap H^1_0 \cap H^2$ with $\langle z_0, e_{1,V}\rangle \ne 0$ and $0 < \mathcal{V}(z_0) < 1$ there is a sequence $k_n \ge 1$ verifying $U_{k_n}(z_0) \rightharpoonup ce_{1,V}$ in $H^2$, where $c \in \mathbb{C}$, $|c| = 1$.

See Subsect. 3.3 for the proof of this theorem. The following lemma shows that the hypothesis on the initial condition $z_0$ is not restrictive.

Lemma 3.4. For any $z_0 \in S$ there is a control $u \in C^\infty([0,\infty), \mathbb{R})$ and a time $k \ge 1$ such that $\langle U_k(z_0, u), e_{1,V}\rangle \ne 0$.

Proof. It suffices to find a control $u$ and a time $k \ge 1$ such that

$$\|U_k(z_0, u) - ce_{1,V}\| < \sqrt{2} \tag{3.5}$$

for some $c \in \mathbb{C}$, $|c| = 1$. Take any $\hat z_0 \in S \cap H^1_0 \cap H^2$ such that $\langle\hat z_0, e_{1,V}\rangle \ne 0$ and

$$\|z_0 - \hat z_0\| < \frac{\sqrt{2}}{2}.$$

By Theorem 3.3, there is a control $u \in C^\infty([0,\infty), \mathbb{R})$ and a time $k \ge 1$ such that

$$\|U_k(\hat z_0, u) - ce_{1,V}\| < \frac{\sqrt{2}}{2}.$$

Using the fact that the $L^2$-distance between two solutions of (1.1), (1.2) with the same control is constant, we obtain (3.5).

3.2. Approximate controllability. Before proving Theorem 3.3, let us give an application of the result. For any $d > 0$ define the set

$$C_d = \Big\{u \in C^\infty([0,\infty), \mathbb{R}) : \sup_{t \in [0,\infty)}|u(t)| < d\Big\}.$$

We say that problem (1.1), (1.2) is approximately controllable in $L^2$ at integer times if for any $\varepsilon, d > 0$ and for any points $z_0, z_1 \in S$ there is a time $k \in \mathbb{N}$ and a control $u \in C_d$ such that

$$\|U_k(z_0, u) - z_1\| < \varepsilon.$$

Theorem 3.5. Under Condition 3.2, problem (1.1), (1.2) is approximately controllable in $L^2$ at integer times.

Proof. Theorem 3.3 implies that for any $z \in S \cap H^1_0 \cap H^2$ there is $u \in C_d$ such that

$$\|U_k(z, u) - e_{1,V}\| < \frac{\varepsilon}{2} \tag{3.6}$$

for some $k \ge 1$. As the $L^2$-distance between two solutions of (1.1), (1.2) with the same control is constant, by a density argument, we get that for any $z \in S$ a control $u \in C_d$ exists such that (3.6) holds. Here we need the following result, often referred to as the time reversibility property of the Schrödinger equation.

Lemma 3.6. Suppose that $U_k(\bar z, w) = \bar y$ for some $z \in L^2$, $w \in C_d$ and $k \ge 1$. Then $U_k(y, u) = z$, where $u(t) = w(k - t)$.

The proof of this lemma is clear. Let us fix any $z_0, z_1 \in S$ and let $u_0, w \in C_d$ be such that

$$\|U_{k_0}(z_0, u_0) - e_{1,V}\| < \frac{\varepsilon}{2}, \qquad \|U_{k_1}(\bar z_1, w) - e_{1,V}\| < \frac{\varepsilon}{2}$$

for some $k_0, k_1 \ge 1$. Define $\bar y := U_{k_1}(\bar z_1, w)$. Then by Lemma 3.6, we have $U_{k_1}(y, u_1) = z_1$, where $u_1(t) := w(k_1 - t)$. Again using the fact that the $L^2$-distance between two solutions of (1.1), (1.2) with the same control is constant, we get

$$\|U_{k_1}(e_{1,V}, u_1) - z_1\| = \|e_{1,V} - y\| < \frac{\varepsilon}{2}.$$

Taking $k = k_0 + k_1$ and $\hat u(t) = u_0(t)$, $t \in [0, k_0)$ and $\hat u(t) = u_1(t - k_0)$, $t \in [k_0, \infty)$, we obtain

$$\|U_k(z_0, \hat u) - z_1\| < \varepsilon.$$

Finally, using the continuity of $U_k(z_0, \cdot)$, we find $u \in C_d$ satisfying

$$\|U_k(z_0, u) - z_1\| < \varepsilon.$$

Remark 3.7. We note that for m = 1, Q(x) = x a stronger result is obtained by K. Beauchard and M. Mirrahimi [10] in the case of the space L 2 . They prove a result of approximate stabilization of eigenstates. The proof of this result remains literally the same for system (1.1), (1.2) under Condition 3.2. One should just pay attention to the fact that in the case of any space dimension m the spectral gap property for the eigenvalues used in [10] does not hold. The argument can be replaced by Lemma 3.10.


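3.3. Proof of Theorem 3.3. Step 1. Let us suppose that $u(U_t(z_0)) = 0$ for all $t \ge 0$. Then

$$U_t(z_0) = \sum_{j=1}^{\infty} e^{-i\lambda_{j,V}t}\langle z_0, e_{j,V}\rangle e_{j,V}. \tag{3.7}$$

Substituting (3.7) into (3.2), we get

$$\begin{aligned}
0 &= \sum_{j=1,k=2}^{\infty} \alpha\lambda_{k,V}\langle z_0, e_{j,V}\rangle\langle e_{k,V}, z_0\rangle\langle(-\Delta + V)(P_{1,V}(Qe_{j,V})), e_{k,V}\rangle e^{-i(\lambda_{j,V} - \lambda_{k,V})t} \\
&\quad - \sum_{j=1,k=2}^{\infty} \alpha\lambda_{k,V}\langle e_{j,V}, z_0\rangle\langle z_0, e_{k,V}\rangle\langle e_{k,V}, (-\Delta + V)(P_{1,V}(Qe_{j,V}))\rangle e^{i(\lambda_{j,V} - \lambda_{k,V})t} \\
&\quad - \sum_{j=1}^{\infty} \langle z_0, e_{j,V}\rangle\langle e_{1,V}, z_0\rangle\langle Qe_{j,V}, e_{1,V}\rangle e^{i(\lambda_{1,V} - \lambda_{j,V})t} \\
&\quad + \sum_{j=1}^{\infty} \langle e_{j,V}, z_0\rangle\langle z_0, e_{1,V}\rangle\langle Qe_{j,V}, e_{1,V}\rangle e^{-i(\lambda_{1,V} - \lambda_{j,V})t} \\
&= \sum_{j=2,k=2}^{\infty} P(z_0, Q, j, k)\,e^{-i(\lambda_{j,V} - \lambda_{k,V})t} \\
&\quad + \sum_{j=2}^{\infty} \big(\alpha\lambda_{j,V}\langle e_{j,V}, (-\Delta + V)(P_{1,V}(Qe_{1,V}))\rangle + \langle Qe_{j,V}, e_{1,V}\rangle\big)\langle z_0, e_{1,V}\rangle\langle e_{j,V}, z_0\rangle e^{-i(\lambda_{1,V} - \lambda_{j,V})t} \\
&\quad - \sum_{j=2}^{\infty} \big(\alpha\lambda_{j,V}\langle(-\Delta + V)(P_{1,V}(Qe_{1,V})), e_{j,V}\rangle + \langle Qe_{j,V}, e_{1,V}\rangle\big)\langle e_{1,V}, z_0\rangle\langle z_0, e_{j,V}\rangle e^{i(\lambda_{1,V} - \lambda_{j,V})t},
\end{aligned} \tag{3.8}$$

where $P(z_0, Q, j, k)$ is a constant. In view of Condition 3.2, (ii), Lemma 3.10 below implies that the coefficients of exponential functions in (3.8) vanish. Condition 3.2, (i), implies that the set

$$J := \{\alpha \in \mathbb{R} : \alpha\lambda_{j,V}\langle(-\Delta + V)(P_{1,V}(Qe_{1,V})), e_{j,V}\rangle + \langle Qe_{j,V}, e_{1,V}\rangle = 0 \text{ for some } j \ge 2\}$$

is finite or countable. Thus we get that $z_0 = ce_{1,V}$ for some $c \in \mathbb{C}$, $|c| = 1$, which is in contradiction to $\mathcal{V}(z_0) > 0$. Thus there is a time $t_0 > 0$ such that $u(U_{t_0}(z_0)) \ne 0$ and

$$\mathcal{V}(U_k(z_0)) - \mathcal{V}(z_0) = -\frac{2}{\delta}\int_0^k u^2(U_s(z_0))\,ds < 0$$

for any $k \ge t_0$.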


Step 2. Let $\mathcal{K}$ be the $H^2$-weak $\omega$-limit set of the trajectory for (3.4), (1.2) issued from $z_0$, i.e.

$$\mathcal{K} := \{z \in H^1_0 \cap H^2 : U_{k_n}(z_0) \rightharpoonup z \text{ in } H^2 \text{ for some } k_n \to \infty\}.$$

Let

$$m := \inf_{z \in \mathcal{K}} \mathcal{V}(z).$$

This infimum is attained, i.e. there is $e \in \mathcal{K}$ such that

$$\mathcal{V}(e) = \inf_{z \in \mathcal{K}} \mathcal{V}(z).$$

Indeed, take any minimizing sequence $z_n \in \mathcal{K}$, so that $\mathcal{V}(z_n) \to m$. By (3.1), $z_n$ is bounded in $H^2$. Thus, without loss of generality, we can assume that $z_n \rightharpoonup e$ in $H^2$. This implies that $\mathcal{V}(e) \le \liminf_{n \to \infty} \mathcal{V}(z_n) = m$. Let us show that $e \in \mathcal{K}$. We can choose a sequence $k_n \ge 1$ such that

$$\|U_{k_n}(z_0) - z_n\| \le \frac{1}{n}. \tag{3.9}$$

As $U_{k_n}(z_0)$ is bounded in $H^2$, without loss of generality, we can suppose that $U_{k_n}(z_0) \rightharpoonup \tilde e$, $\tilde e \in S \cap H^1_0 \cap H^2$. Clearly, (3.9) implies that $e = \tilde e$, hence $e \in \mathcal{K}$ and $\mathcal{V}(e) = m$. Let us show that $\mathcal{V}(e) = 0$. Suppose that $\mathcal{V}(e) > 0$. As $\mathcal{V}(e) \le \mathcal{V}(z_0) < 1$, we have $\langle e, e_{1,V}\rangle \ne 0$. Then, by Step 1, there is a time $k \ge 1$ such that $\mathcal{V}(U_k(e)) < \mathcal{V}(e)$. Proposition 3.1 implies that $U_k(e) \in \mathcal{K}$. This contradicts the definition of $e$. Hence $\mathcal{V}(e) = 0$. Thus $e = ce_{1,V}$, $|c| = 1$ and $ce_{1,V} \in \mathcal{K}$.

Remark 3.8. We note that if there is a sequence $n_k \ge 1$ such that $U_{n_k}(z_0)$ converges in $H^2$ and $z_0$ satisfies the hypotheses of Theorem 3.3, then the proof of the stabilization result obtained in [9] for finite-dimensional approximations of the Schrödinger equation works, giving $U_{n_k}(z_0, u) \to e_{1,V}$ in $H^2$. However, the existence of such a sequence is an open question.

Remark 3.9. Modifying slightly Condition 3.2, Theorem 3.3 can be restated for the eigenfunction $e_{i,V}$, $i \ge 1$. Indeed, one should replace $\lambda_{1,V}$ and $e_{1,V}$ by $\lambda_{i,V}$ and $e_{i,V}$ in Condition 3.2 and use the Lyapunov function

$$\mathcal{V}_i(z) := \alpha\|(-\Delta + V)P_{i,V}z\|^2 + 1 - |\langle z, e_{i,V}\rangle|^2, \quad z \in S \cap H^1_0 \cap H^2, \tag{3.10}$$

where $P_{i,V}$ is the orthogonal projection in $L^2$ onto the closure of the vector span of $\{e_{k,V}\}_{k \ne i}$.

Lemma 3.10. Suppose that $r_j \in \mathbb{R}$ and $r_k \ne r_j$ for $k \ne j$. If

$$\sum_{j=1}^{\infty} c_j e^{ir_j t} = 0 \tag{3.11}$$

for any $t \ge 0$ and for some sequence $c_j \in \mathbb{C}$ such that $\sum_{j=1}^{\infty}|c_j| < \infty$, then $c_j = 0$ for all $j \ge 1$.


Proof. Multiplying (3.11) by $e^{-ir_n t}$ and integrating on the interval $[0, T]$, we get

$$c_n = -\frac{1}{T}\sum_{j=1, j \ne n}^{\infty} c_j\int_0^T e^{i(r_j - r_n)t}\,dt = -\frac{1}{T}\sum_{j=1, j \ne n}^{\infty} c_j\,\frac{e^{i(r_j - r_n)T} - 1}{i(r_j - r_n)} \to 0$$

as $T \to \infty$, by the Lebesgue theorem on dominated convergence.
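The averaging step in this proof can be checked numerically on a finite sum. The sketch below is illustrative only: the coefficients `c` and frequencies `r` are hypothetical, and the helper `time_average` evaluates the time average of $f(t)e^{-ir_n t}$ in closed form for $f(t) = \sum_j c_j e^{ir_j t}$, using the same integral as in the proof. The average tends to $c_n$ as $T$ grows, so a vanishing $f$ forces $c_n = 0$.

```python
import numpy as np

def time_average(c, r, n, T):
    """(1/T) * integral over [0, T] of f(t) * exp(-i r_n t) dt,
    where f(t) = sum_j c_j * exp(i r_j t), evaluated in closed form."""
    total = c[n]
    for j in range(len(c)):
        if j != n:
            d = r[j] - r[n]            # nonzero because the r_j are distinct
            total = total + c[j] * (np.exp(1j * d * T) - 1.0) / (1j * d * T)
    return total

# Hypothetical coefficients and pairwise distinct frequencies.
c = np.array([1.0 + 0.5j, -0.3j, 0.25])
r = np.array([0.0, 1.0, 2.5])

horizons = (1e1, 1e3, 1e5)
errors = [abs(time_average(c, r, 0, T) - c[0]) for T in horizons]
```

Here the deviation from $c_0$ is bounded by $\sum_{j \ne 0} 2|c_j|/(|r_j - r_0|T) = 0.8/T$, which is the finite-sum version of the convergence claimed in the proof.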

3.4. Genericity of Condition 3.2. Let us recall some definitions. Let $X$ be a complete metric space and $A \subset X$. Then $A$ is said to be a $G_\delta$ set if it is a countable intersection of dense open sets. It follows from the Baire theorem that any $G_\delta$ subset is dense. A set $B \subset X$ is called residual if it contains a $G_\delta$ subset. Let us endow the space $C^\infty(D, \mathbb{R})$ with its usual topology given by the countable family of norms:

$$p_n(Q) := \sum_{|\alpha| \le n}\sup_{x \in D}|\partial^\alpha Q(x)|.$$

The set $\mathcal{P}$ of all functions $Q \in C^\infty(D, \mathbb{R})$ such that property (i) of Condition 3.2 is verified is a $G_\delta$. Indeed, let us fix an integer $j \ge 1$ and let $\mathcal{P}_j$ be the set of functions $Q \in C^\infty(D, \mathbb{R})$ verifying $\langle Qe_{1,V}, e_{j,V}\rangle \ne 0$. The unique continuation theorem for the operator $-\Delta + V$ (see [19]) implies that there is a ball $B \subset D$ such that $e_{1,V}(x)e_{j,V}(x) \ne 0$ for all $x \in B$. Let $Q \in C^\infty(D, \mathbb{R})$ be such that $Q \not\equiv 0$, $\mathrm{supp}\,Q \subset B$ and $Q \ge 0$. Then $Q \in \mathcal{P}_j$, hence $\mathcal{P}_j$ is non-empty. Clearly, $\mathcal{P}_j$ is open. Take any $Q_1 \in C^\infty(D, \mathbb{R})$ such that $\langle Q_1 e_{1,V}, e_{j,V}\rangle = 0$ and $Q_2 \in \mathcal{P}_j$. Then $\langle(Q_1 + \tau Q_2)e_{1,V}, e_{j,V}\rangle \ne 0$ for all $\tau \ne 0$. Thus $\mathcal{P}_j$ is dense in $C^\infty(D, \mathbb{R})$ and $\mathcal{P} = \cap_{j=1}^{\infty}\mathcal{P}_j$ is a $G_\delta$ set. The following lemma shows that property (ii) of Condition 3.2 is generic in the 1D case.

Lemma 3.11. Let $I \subset \mathbb{R}$ be an interval and let $\mathcal{Q}$ be the set of all functions $V \in C^\infty(I, \mathbb{R})$ verifying

$$\lambda_{i,V} - \lambda_{j,V} \ne \lambda_{p,V} - \lambda_{q,V} \tag{3.12}$$

for all $i, j, p, q \ge 1$ such that $\{i, j\} \ne \{p, q\}$ and $i \ne j$. Then $\mathcal{Q}$ is a $G_\delta$ set.

Proof. It is well known that the spectrum $\{\lambda_{j,V}\}$ of $-\frac{d^2}{dx^2} + V$ is non-degenerate for any $V \in C^\infty(D, \mathbb{R})$, and $e_{j,V}$ and $\lambda_{j,V}$ are real-analytic in $V$ (e.g., see [26]). Let us introduce the set $\mathcal{Q}_n$, $n \ge 1$, of all functions $V \in C^\infty(D, \mathbb{R})$ such that (3.12) is satisfied for any $1 \le i, j, p, q \le n$. Clearly,

$$\mathcal{Q} = \bigcap_{n=1}^{\infty}\mathcal{Q}_n.$$

It suffices to prove that $\mathcal{Q}_n$ is open and dense in $C^\infty(D, \mathbb{R})$. The fact that $\mathcal{Q}_n$ is open follows directly from the continuity of $\lambda_{j,V}$ in $V$. Let us prove that $\mathcal{Q}_n$ is dense in $C^\infty(D, \mathbb{R})$.


Take any $1 \le i, j, p, q \le n$ such that $\{i, j\} \ne \{p, q\}$ and $i \ne j$, and let $\mathcal{Q}_{i,j,p,q}$ be the set of functions $V \in C^\infty(D, \mathbb{R})$ such that (3.12) is satisfied. Suppose we have proved that for any $V \in C^\infty(D, \mathbb{R})$ there is $\sigma \in C^\infty(D, \mathbb{R})$ such that

$$\lambda_{i,V+\tau\sigma} - \lambda_{j,V+\tau\sigma} \ne \lambda_{p,V+\tau\sigma} - \lambda_{q,V+\tau\sigma} \tag{3.13}$$

for any small $\tau > 0$. This implies that $\mathcal{Q}_{i,j,p,q}$ is dense. On the other hand, $\mathcal{Q}_{i,j,p,q}$ is open. Hence $\mathcal{Q}_n$ is dense, as

$$\mathcal{Q}_n = \bigcap_{1 \le i,j,p,q \le n}\mathcal{Q}_{i,j,p,q}.$$

To prove (3.13), following [2], let us write

$$\lambda_{j,V+\tau\sigma} = \lambda_{j,V} + \alpha_j\tau + \beta_j(\tau)\tau^2, \tag{3.14}$$

$$e_{j,V+\tau\sigma} = e_{j,V} + v_j\tau + w_j(\tau)\tau^2. \tag{3.15}$$

Differentiating the identity

$$\Big(-\frac{d^2}{dx^2} + V + \tau\sigma - \lambda_{j,V+\tau\sigma}\Big)e_{j,V+\tau\sigma} = 0$$

with respect to $\tau$ at $\tau = 0$ and using (3.14) and (3.15), we get

$$\Big(-\frac{d^2}{dx^2} + V - \lambda_{j,V}\Big)v_j + (\sigma - \alpha_j)e_{j,V} = 0.$$

Taking the scalar product of this identity with $e_{j,V}$, we obtain

$$\langle\sigma, |e_{j,V}|^2\rangle = \alpha_j. \tag{3.16}$$

Suppose that $\lambda_{i,V+\tau_n\sigma} - \lambda_{j,V+\tau_n\sigma} = \lambda_{p,V+\tau_n\sigma} - \lambda_{q,V+\tau_n\sigma}$ for any $\sigma \in C^\infty(D, \mathbb{R})$ and for some sequence $\tau_n \to 0$. Clearly, this implies that $\alpha_i - \alpha_j = \alpha_p - \alpha_q$. In view of (3.16), this gives

$$|e_{i,V}|^2 - |e_{j,V}|^2 = |e_{p,V}|^2 - |e_{q,V}|^2. \tag{3.17}$$

On the other hand, by Theorem 9 in [26] (see p. 46), the system $\{|e_{n,V}|^2\}$ is independent for any $V \in L^2$. This contradiction proves (3.13) and completes the proof of the lemma.
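The first-order formula (3.16) has an exact analogue for symmetric matrices, which gives a quick sanity check. The sketch below is an illustration under assumptions that are not part of the paper: the operator $-\frac{d^2}{dx^2} + V$ on $[0,1]$ with Dirichlet conditions is replaced by a standard second-order finite-difference matrix, and the potential `V`, the perturbation `sigma` and the step `tau` are hypothetical choices. The slope of each eigenvalue in $\tau$ is compared with the discrete version of $\langle\sigma, |e_{j,V}|^2\rangle$.

```python
import numpy as np

def dirichlet_schrodinger(V, h):
    """Finite-difference matrix for -d^2/dx^2 + V with Dirichlet conditions."""
    n = len(V)
    off = -np.ones(n - 1) / h**2
    return np.diag(2.0 / h**2 + V) + np.diag(off, 1) + np.diag(off, -1)

n = 60
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)        # interior grid points
V = np.cos(2.0 * np.pi * x)           # hypothetical potential
sigma = x**2                          # hypothetical perturbation direction
tau = 1e-3

lam0, vecs = np.linalg.eigh(dirichlet_schrodinger(V, h))
lam_plus = np.linalg.eigvalsh(dirichlet_schrodinger(V + tau * sigma, h))
lam_minus = np.linalg.eigvalsh(dirichlet_schrodinger(V - tau * sigma, h))

# Central difference approximation of alpha_j = d(lambda_j)/d(tau) at tau = 0.
slope = (lam_plus - lam_minus) / (2.0 * tau)
# Discrete analogue of <sigma, |e_j|^2> with unit-norm eigenvectors.
predicted = np.array([np.sum(sigma * vecs[:, j] ** 2) for j in range(n)])
max_gap = np.max(np.abs(slope[:5] - predicted[:5]))
```

For the lowest eigenvalues the two quantities agree up to the small error of the central difference, in line with (3.14) and (3.16). Only the first few eigenvalues are compared because the discretization is a poor model of the high modes.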

We now turn to the multidimensional case. Let us assume that $D = [0, 1]^n$ and introduce the space

$$\mathcal{G} := \{V \in C^\infty(D, \mathbb{R}) : V(x_1, \ldots, x_n) = V_1(x_1) + \cdots + V_n(x_n) \text{ for some } V_k \in C^\infty([0,1], \mathbb{R}),\ k = 1, \ldots, n\}.$$

Endow $\mathcal{G}$ with the metric of $C^\infty(D, \mathbb{R})$. It is not difficult to verify that $\mathcal{G}$ is a closed subspace in $C^\infty(D, \mathbb{R})$.

Lemma 3.12. The set of all functions $V \in \mathcal{G}$ verifying

$$\lambda_{i,V} - \lambda_{j,V} \ne \lambda_{p,V} - \lambda_{q,V} \tag{3.18}$$

for all $i, j, p, q \ge 1$ such that $\{i, j\} \ne \{p, q\}$ and $i \ne j$ is a $G_\delta$ set.

Proof. Notice that any eigenfunction of $-\Delta + V$, $V \in \mathcal{G}$, has the form

$$e_{l,V}(x_1, \ldots, x_n) = e_{l_1,V_1}(x_1)\cdots e_{l_n,V_n}(x_n), \tag{3.19}$$

where $e_{l_k,V_k}(x_k)$ is an eigenfunction of the operator $-\frac{d^2}{dx_k^2} + V_k$. Indeed, any function of the form (3.19) is an eigenfunction, and the set of all functions of this form is a basis in $L^2(D)$. Let $i, j, p, q \ge 1$ be such that $\{i, j\} \ne \{p, q\}$ and $i \ne j$, and let $e_{i_n,V_n}(x_n)$, $e_{j_n,V_n}(x_n)$, $e_{p_n,V_n}(x_n)$ and $e_{q_n,V_n}(x_n)$ be the eigenfunctions in (3.19). Without loss of generality, we can assume that the functions $(e_{i_n,V_n}(x_n))^2$, $(e_{j_n,V_n}(x_n))^2$, $(e_{p_n,V_n}(x_n))^2$ and $(e_{q_n,V_n}(x_n))^2$ are linearly independent (see Theorem 9 in [26]). Any eigenfunction $e_{l_k,V_k}$ has a finite number of zeros in the interval $[0, 1]$. Hence, choosing appropriately the point $x^* \in [0,1]^{n-1}$, we see that the functions $(e_{i,V}(x^*, x_n))^2$, $(e_{j,V}(x^*, x_n))^2$, $(e_{p,V}(x^*, x_n))^2$ and $(e_{q,V}(x^*, x_n))^2$, $x_n \in [0, 1]$, are linearly independent. This implies that relation (3.17) does not hold. Thus the proof of Lemma 3.11 works, implying the genericity.

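4. Applications

4.1. Nonlinear Schrödinger equation. Let us consider the nonlinear Schrödinger equation

$$i\dot z = -\Delta z + V(x)z + u(t)Q(x)|z|^2 z, \quad x \in D, \tag{4.1}$$
$$z|_{\partial D} = 0, \tag{4.2}$$
$$z(0, x) = z_0(x), \tag{4.3}$$

where $D \subset \mathbb{R}^3$ is a bounded domain with smooth boundary. Problem (4.1)–(4.3) is locally well-posed.

Lemma 4.1. For any $z_0 \in H^1_0 \cap H^2$ and for any $u \in L^1_{loc}([0,\infty), \mathbb{R})$ there is a time $T > 0$ such that problem (4.1)–(4.3) has a unique solution $z \in C([0,T], H^2)$. Furthermore, the resolving operator $U_t(\cdot, u) : H^1_0 \cap H^2 \to H^1_0 \cap H^2$ taking $z_0$ to $z(t)$ satisfies the relation

$$\|U_t(z_0, u)\| = \|z_0\|, \quad t \in [0, T].$$

See [13] for the proof. Define $z(t) = U_t(z_0, u)$ and let us calculate the derivative

$$\begin{aligned}
\frac{d}{dt}\mathcal{V}_i(z(t)) &= 2\alpha\,\mathrm{Re}\langle(-\Delta + V)P_{i,V}\dot z, P_{i,V}(-\Delta + V)z\rangle - 2\,\mathrm{Re}\big(\langle\dot z, e_{i,V}\rangle\langle e_{i,V}, z\rangle\big) \\
&= 2\alpha\,\mathrm{Re}\langle(-\Delta + V)P_{i,V}(i\Delta z - iVz - iuQ|z|^2 z), (-\Delta + V)P_{i,V}z\rangle \\
&\quad - 2\,\mathrm{Re}\big(\langle i\Delta z - iVz - iuQ|z|^2 z, e_{i,V}\rangle\langle e_{i,V}, z\rangle\big) \\
&= 2u\,\mathrm{Im}\big(\alpha\langle(-\Delta + V)P_{i,V}(Q|z|^2 z), (-\Delta + V)P_{i,V}z\rangle - \langle Q|z|^2 z, e_{i,V}\rangle\langle e_{i,V}, z\rangle\big),
\end{aligned}$$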


where $\mathcal{V}_i$ is defined by (3.10). Take

$$u(z) := -\mathrm{Im}\big(\alpha\langle(-\Delta + V)P_{i,V}(Q|z|^2 z), (-\Delta + V)P_{i,V}z\rangle - \langle Q|z|^2 z, e_{i,V}\rangle\langle e_{i,V}, z\rangle\big). \tag{4.4}$$

Problem (4.1)–(4.3) with feedback (4.4) is globally well-posed in $H^2$ (cf. Proposition 3.1). Let $U_t(\cdot) : H^1_0 \cap H^2 \to H^1_0 \cap H^2$ be the resolving operator. In order to formulate the main result, we introduce the following hypothesis:

Condition 4.2. The functions $V, Q \in C^\infty(D, \mathbb{R})$ are such that:
(i) $\langle Qe_{i,V}e_{j,V}, e_{p,V}e_{q,V}\rangle \ne 0$ for all $i, j, p, q \ge 1$,
(ii) $\lambda_{i,V} - \lambda_{j,V} + \lambda_{p,V} - \lambda_{q,V} \ne \lambda_{i',V} - \lambda_{j',V} + \lambda_{p',V} - \lambda_{q',V}$ for all integers $i, j, p, q, i', j', p', q'$ such that $\{i, j, p, q\} \ne \{i', j', p', q'\}$ and $\{i, p\} \ne \{j, q\}$.

The theorem below is the version of Theorem 3.3 for system (4.1)–(4.3).

Theorem 4.3. Under Condition 4.2, there is a finite or countable set $J \subset \mathbb{R}^*_+$ such that for any $\alpha \notin J$, $l \ge 1$ and $z_0 \in S \cap H^1_0 \cap H^2$ with $\langle z_0, e_{l,V}\rangle \ne 0$ and $0 < \mathcal{V}_l(z_0) < 1$ there is a sequence $k_n \ge 1$ verifying $U_{k_n}(z_0) \rightharpoonup ce_{l,V}$ in $H^2$, where $c \in \mathbb{C}$, $|c| = 1$.

The proof of this theorem is very close to that of Theorem 3.3. One should notice that, under Condition 4.2, there is a time $t_0 > 0$ such that $u(U_{t_0}(z_0, 0)) \ne 0$, and then conclude as in Step 2 in the proof of Theorem 3.3.

Remark 4.4. Notice that, as Eq. (4.1) is nonlinear, the distance between two solutions with the same control is not constant. Hence the proof of approximate controllability given in Theorem 3.5 does not work here.

Lemma 4.5. For any $l \ge 1$, $d > 0$ and $z_0 \in S$ there is a control $u \in C_d$ and a time $k \ge 1$ such that $\langle U_k(z_0, u), e_{l,V}\rangle \ne 0$.

Proof. Assume that $\langle z_0, e_{l,V}\rangle = 0$. Let us show that there is a control $u \in C_d$ such that $\langle U_k(z_0, u), e_{l,V}\rangle \ne 0$ for some $k \ge 0$. As (4.1) is nonlinear, the proof given in Lemma 3.4 does not work. If $z_0 \notin \{ce_{j,V} : c \in \mathbb{C}, |c| = 1, j \ge 1\}$, then, by Theorem 4.3, there is an integer $p \ge 1$, a sequence $k_n \ge 1$ and a constant $c \in \mathbb{C}$, $|c| = 1$, such that $U_{k_n}(z_0) \rightharpoonup ce_{p,V}$ in $H^2$. Hence, without loss of generality, we can assume that $z_0 = e_{p,V}$ for some $p \ne l$.

Let us introduce the following two-dimensional subspace of $L^2([0,1], \mathbb{R})$:

$$E = \{a\sin((\lambda_{p,V} - \lambda_{l,V})t) + b\cos((\lambda_{p,V} - \lambda_{l,V})t) : a, b \in \mathbb{R}\}.$$

For any $u \in E$, define the mapping $\Phi(u) = \langle U_1(e_{p,V}, u), e_{l,V}\rangle$, whenever the solution $U_t(e_{p,V}, u)$ exists up to time $t = 1$. Notice that $\Phi(0) = \langle e^{-i\lambda_{p,V}}e_{p,V}, e_{l,V}\rangle = 0$, hence $\Phi$ is well defined in a neighborhood of $0 \in E$. We are going to show that the conditions of the inverse mapping theorem are satisfied in a neighborhood of the point $0 \in E$. Clearly, $\Phi$ is continuously differentiable. Let us show that the mapping $D\Phi(0) : E \to \mathbb{C}$ is an isomorphism. Consider the linearization of (4.1), (4.2), $z_0 = e_{p,V}$ around $(e^{-i\lambda_{p,V}t}e_{p,V}, 0)$:

$$i\dot y = -\Delta y + V(x)y + u(t)Q(x)e^3_{p,V}e^{-i\lambda_{p,V}t}, \quad x \in D, \tag{4.5}$$
$$y|_{\partial D} = 0, \tag{4.6}$$
$$y(0) = 0. \tag{4.7}$$

One can verify that $D\Phi(0)(u) = \langle y(1), e_{l,V}\rangle$. System (4.5)–(4.7) is equivalent to

$$y = -i\int_0^t e^{-i\lambda_{p,V}s}u(s)S(t - s)(Qe^3_{p,V})\,ds, \tag{4.8}$$

where $S(t)$ is the unitary group associated with $i\Delta - iV$. Taking the scalar product of (4.8) with $e_{l,V}$, we obtain for $t = 1$,

$$\langle y, e_{l,V}\rangle = -ie^{-i\lambda_{l,V}}\langle Qe^3_{p,V}, e_{l,V}\rangle\int_0^1 e^{-i(\lambda_{p,V} - \lambda_{l,V})s}u(s)\,ds.$$

Condition 4.2 implies that $\lambda_{p,V} - \lambda_{l,V} \ne 0$, hence $D\Phi(0) : E \to \mathbb{C}$ is an isomorphism. Applying the inverse mapping theorem, we conclude that $\Phi$ is a $C^1$ diffeomorphism in a neighborhood of $0 \in E$. Thus there is a control $u \in C_d$ such that $\langle U_1(e_{p,V}, u), e_{l,V}\rangle \ne 0$.

4.2. Randomly forced Schrödinger equation.

4.2.1. Growth of Sobolev norms. Let us consider the problem

$$i\dot z = -\Delta z + V(x)z + \beta(t)Q(x)z, \quad x \in D, \tag{4.9}$$
$$z|_{\partial D} = 0, \tag{4.10}$$
$$z(0) = z_0, \tag{4.11}$$

where $V, Q \in C^\infty(D, \mathbb{R})$ are given functions. We assume that $\beta(t)$ is a random process of the form (2.2), where the random variables $\eta_k$ verify the following condition:

Condition 4.6. The random variables $\eta_k$ have the form

$$\eta_k(t) = \sum_{j=1}^{\infty} b_j\xi_{jk}g_j(t), \quad t \in [0, 1],$$

where $\{g_j\}$ is an orthonormal basis in $L^2([0,1], \mathbb{R})$, $b_j > 0$ are constants with

$$\sum_{j=1}^{\infty} b_j^2 < \infty,$$

and $\xi_{jk}$ are independent real-valued random variables such that $\mathbb{E}\xi_{jk}^2 = 1$. Moreover, the distribution of $\xi_{jk}$ possesses a continuous density $\rho_j$ with respect to the Lebesgue measure and $\rho_j(r) > 0$ for all $r \in \mathbb{R}$.


Notice that this condition in particular implies that $\mathbb{P}\{\|u - \beta\|_{L^2([0,l])} < \varepsilon\} > 0$ for any $u \in L^2([0, l])$ and $\varepsilon > 0$. Moreover, using the continuity of the mapping $U_l(z_0, \cdot) : L^2([0, l]) \to L^2(D)$, for any $\delta > 0$ we can find a constant $\varepsilon > 0$ such that

$$\mathbb{P}\{\|U_l(z_0, \beta) - U_l(z_0, u)\| < \delta\} \ge \mathbb{P}\{\|u - \beta\|_{L^2([0,l])} < \varepsilon\} > 0.$$

Hence, any point $U_l(z_0, u)$, $u \in L^2([0, l])$, is in the support of the measure $\mathcal{D}(U_l(z_0, \beta))$. The following theorem is the main result of this section.

Theorem 4.7. Suppose that Conditions 3.2 and 4.6 are satisfied. Then for any $s > 0$ and $z \in H^s \setminus \{0\}$ we have

$$\mathbb{P}\{\limsup_{k \to \infty}\|U_k(z, \beta)\|_s = \infty\} = 1. \tag{4.12}$$

4.2.2. Proof of Theorem 4.7. By Theorem 3.5, system (1.1), (1.2) is approximately controllable at integer times. Since the equation is linear in $z$, it suffices to prove (4.12) for any $z \in S \cap H^s$. Without loss of generality, we can assume that $s \in (0, 2]$.

Step 1. Let us fix a constant $r > 0$ and introduce the stopping time

$$\tau_r(z) = \min\{k \ge 0 : U_k(z, \beta) \in B_{H^{-s}}(0, r)\}, \quad z \in B_{L^2}(0, 1).$$

Then we have

$$\mathbb{P}\{\tau_r(z) < \infty\} = 1. \tag{4.13}$$

Indeed, choose an arbitrary point $\hat z \in S \cap B_{H^{-s}}(0, r)$. By the property of approximate controllability in $L^2$, there is a control $u \in C_d$ such that $U_l(z, u)$ is sufficiently close to $\hat z$ in $L^2$, hence $U_l(z, u) \in B_{H^{-s}}(0, r)$. As $U_l(z, u)$ is in the support of the measure $\mathcal{D}(U_l(z, \beta))$, we have $\mathbb{P}\{U_l(z, \beta) \in B_{H^{-s}}(0, r)\} > 0$. Using the continuity of the resolving operator in negative Sobolev norms, we see that there is an $H^{-s}$-neighborhood $O = O(z)$ of $z$ such that

$$\sup_{y \in O}\mathbb{P}\{\tau_r(y) > l\} < 1.$$

From the compactness of $B_{L^2}(0, 1)$ in $H^{-s}$ it follows that there is a time $k \ge 1$ such that

$$a := \sup_{y \in B_{L^2}(0,1)}\mathbb{P}\{\tau_r(y) > k\} < 1. \tag{4.14}$$

Using the Markov property and (4.14), we obtain

$$\mathbb{P}\{\tau_r(y) > nk\} = \mathbb{E}\big(I_{\{\tau_r(y) > (n-1)k\}}\,\mathbb{P}\{\tau_r(x) > k\}|_{x = U_{(n-1)k}(y, \beta)}\big) \le a\,\mathbb{P}\{\tau_r(y) > (n-1)k\}.$$

Hence

$$\mathbb{P}\{\tau_r(y) > nk\} \le a^n. \tag{4.15}$$

Using the Borel–Cantelli lemma, we arrive at (4.13).


Step 2. Take any $z \in S \cap H^s$. Choosing $r = \frac{1}{n}$ and using (4.13), we get

$$\mathbb{P}\{\liminf_{k \to \infty}\|U_k(z, \beta)\|_{-s} = 0\} = 1. \tag{4.16}$$

Define the event

$$A := \{\omega \in \Omega : \limsup_{k \to \infty}\|U_k(z, \beta)\|_s < \infty\}.$$

Suppose that $\mathbb{P}\{A\} > 0$. By (4.16), for almost any $\omega \in A$ there is a sequence $n_k \to \infty$ such that

$$\lim_{k \to \infty}\|U_{n_k}(z, \beta)\|_{-s} = 0. \tag{4.17}$$

On the other hand, for any $\omega \in A$, there is a subsequence of $n_k$ (which is also denoted by $n_k$) and an element $w \in S$ such that $\|U_{n_k}(z, \beta) - w\| \to 0$. This contradicts (4.17). Thus $\mathbb{P}\{A\} = 0$.

Remark 4.8. In view of Theorem 4.3, under Condition 4.2, Theorem 4.7 also holds in the case of the nonlinear Equation (4.1). The proof is literally the same. One should just pay attention to the fact that, as in this case finite-time blow-up is possible, the restriction of the solution to integer times forms a Markov chain with values in $H^s \cup \{\infty\}$ (e.g., see [28]).

Acknowledgements. The author would like to thank Armen Shirikyan for his guidance and encouragement.

References 1. Agrachev, A., Chambrion, T.: An estimation of the controllability time for single-input systems on compact Lie groups. J. ESAIM Control Optim. Calc. Var. 12(3), 409–441 (2006) 2. Albert, J.H.: Genericity of simple eigenvalues for elliptic PDE’s. Proc. Amer. Math. Soc. 48, 413–418 (1975) 3. Albertini, F., D’Alessandro, D.: Notions of controllability for bilinear multilevel quantum systems. IEEE Transactions on Automatic Control 48(8), 1399–1403 (2003) 4. Altafini, C.: Controllability of quantum mechanical systems by root space decomposition of su(n). J. Math. Phys. 43(5), 2051–2062 (2002) 5. Ball, J.M., Marsden, J.E., Slemrod, M.: Controllability for distributed bilinear systems. SIAM J. Control Optim. 20, 575–597 (1982) 6. Baudouin, L., Puel, J.-P.: Uniqueness and stability in an inverse problem for the Schrödinger equation. Inverse Problems 18, 1537–1554 (2001) 7. Beauchard, K.: Local controllability of a 1-D Schrödinger equation. J. Math. Pures Et Appl. 84(7), 851–956 (2005) 8. Beauchard, K., Coron, J.-M.: Controllability of a quantum particle in a moving potential well. J. Funct. Anal. 232(2), 328–389 (2006) 9. Beauchard, K., Coron, J.-M., Mirrahimi, M., Rouchon, P.: Implicit Lyapunov control of finite dimensional Schrödinger equations. Syst. Cont. Lett. 56, 388–395 (2007) 10. Beauchard, K., Mirrahimi, M.: Approximate stabilization of a quantum particle in a 1D infinite square potential well. http://arxiv.org/abs/0801.1522v1[math.AP], 2008, to apppear SIAMJ Cont. Opt.


11. Bourgain, J.: On growth of Sobolev norms in linear Schrödinger equations with smooth time dependent potential. J. Anal. Math. 77, 315–348 (1999) 12. Burq, N.: Contrôle de l’équation des plaques en présence d’obstacles strictement convexes. Mémoire de la S.M.F. 55, 126 (1993) 13. Cazenave, T.: Semilinear Schrödinger equations. Courant Lecture Notes in Mathematics, 10, Providence, RI: Amer. Math. Soc., 2003 14. Chambrion, T., Mason, P., Sigalotti, M., Boscain, U.: Controllability of the discrete-spectrum Schrödinger equation driven by an external field. http://arxiv.org/abs/0801.4893v3[math.OC], 2008 15. Coron, J.-M.: Control and nonlinearity. Mathematical Surveys and Monographs, Providence, RI: Amer. Math. Soc., 136, 2007 16. Dehman, B., Gérard, P., Lebeau, G.: Stabilization and control for the nonlinear Schrödinger equation on a compact surface. Math. Z. 254(4), 729–749 (2006) 17. Eliasson, L.H., Kuksin, S.B.: On reducibility of Schrödinger equations with quasiperiodic in time potentials. Commun. Math. Phys. 286(1), 125–135 (2009) 18. Erdogan, M.B., Killip, R., Schlag, W.: Energy growth in Schrödinger’s equation with Markovian forcing. Commun. Math. Phys. 240, 1–29 (2003) 19. Jerison, D., Kenig, C.E.: Unique continuation and absence of positive eigenvalues for Schrödinger operators (with an appendix by E. M. Stein). Ann. Math. 121(3), 463–494 (1985) 20. Lebeau, G.: Contrôle de l’équation de Schrödinger. J. Math. Pures Appl. 71, 267–291 (1992) 21. Machtyngier, E., Zuazua, E.: Stabilization of the Schrödinger equation. Portugaliae Matematica 51(2), 243–256 (1994) 22. Mirrahimi, M.: Lyapunov control of a particle in a finite quantum potential well. IEEE Conf. on Decision and Control, San Diego, 2006 23. Nersesyan, V.: Exponential mixing for finite-dimensional approximations of the Schrödinger equation with multiplicative noise. http://arxiv.org/abs/0710.3693v1[math-ph], 2007 24. 
Nersesyan, V.: Global approximate controllability for Schrödinger equation in higher Sobolev norms. In preparation, 2009 25. Øksendal, B.: Stochastic Differential Equations. Berlin-Heidelberg-New York: Springer–Verlag, 2003 26. Pöschel, J., Trubowitz, E.: Inverse Spectral Theory. New York: Academic Press, 1987 27. Ramakrishna, V., Salapaka, M., Dahleh, M., Rabitz, H., Pierce, A.: Controllability of molecular systems. Phys. Rev. A 51(2), 960–966 (1995) 28. Revuz, D.: Markov Chains. Amsterdam: North–Holland, 1984 29. Turinici, G., Rabitz, H.: Quantum wavefunction controllability. Chem. Phys. 267, 1–9 (2001) 30. Wang, W.-M.: Bounded Sobolev norms for linear Schrödinger equations under resonant perturbations. J. Func. Anal. 254(11), 2926–2946 (2008) 31. Wang, W.-M.: Logarithmic bounds on Sobolev norms for time dependent linear Schrödinger equations. Commun. PDE 33(12), 2164–2179 (2008) 32. Zuazua, E.: Remarks on the controllability of the Schrödinger equation. CRM Proc. Lecture Notes 33, 193–211 (2003) Communicated by I. M. Sigal

Commun. Math. Phys. 290, 389–398 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0758-8

Communications in

Mathematical Physics

A New Spectral Triple over a Space of Connections Johannes Aastrup1 , Jesper Møller Grimstrup2 , Ryszard Nest3 1 SFB 478 “Geometrische Strukturen in der Mathematik”, Hittorfstr. 27,

48149 Münster, Germany. E-mail: [email protected]

2 The Niels Bohr Institute, Blegdamsvej 17, DK-2100 Copenhagen,

Denmark. E-mail: [email protected]

3 Matematisk Institut, Universitetsparken 5, DK-2100 Copenhagen,

Denmark. E-mail: [email protected] Received: 18 September 2008 / Accepted: 23 November 2008 Published online: 17 March 2009 – © Springer-Verlag 2009

Abstract: A new construction of a semifinite spectral triple on an algebra of holonomy loops is presented. The construction is canonically associated to quantum gravity and is an alternative version of the spectral triple presented in [1].

Contents
1. Introduction
2. Completion of the Configuration Spaces
3. The Coordinate Change and the Riemannian Metric
4. The Algebra and the Dirac Type Operator
5. Semifiniteness
6. Concluding Remarks

1. Introduction In the papers [3,4] we commenced a programme of combining Connes noncommutative geometry with quantum gravity. This programme is motivated by the formulation of the Standard Model coupled to gravity in terms of noncommutative geometry, see [8]. There, the Standard Model coupled to gravity is formulated as a single gravitational model, a spectral triple, and the classical action is obtained via a spectral action principle natural to noncommutative geometry. The fact that the classical Standard Model is so readily translated into the language of noncommutative geometry raises the question whether there exists a corresponding translation of the quantization procedure of QFT. Since noncommutative geometry is essentially gravitational such a translation would presumably involve quantum gravity. Using this line of reasoning we successfully constructed a semifinite spectral triple over a space of connections [1,2]. The spectral triple involves an algebra of holonomy loops and the interaction between the Dirac type operator and the algebra reproduces the


Fig. 1. Repeated subdivisions of a lattice

Poisson structure of General Relativity when formulated in Ashtekar variables [6]. Furthermore, the associated Hilbert space corresponds, up to a discrete symmetry group, to the Hilbert space of diffeomorphism invariant states known from Loop Quantum Gravity [9]. In this paper we construct a new semifinite spectral triple which differs from the triple constructed in [1,2] through the form of the Dirac type operator. The operator presented in this paper is significantly simpler and thus possibly more suitable for actual spectral computations. The construction of the operator is based on a reparameterization of the space of connections, such that the structure maps are deleting copies of the structure group. Hence the spectral triple can be constructed by writing a Dirac operator on each copy of the structure group. Whereas the reparameterized Dirac operator is simpler than the one in [3,4], its interaction with the algebra of loops becomes more complicated.

2. Completion of the Configuration Spaces

We recall from [1] how we constructed the completion of spaces of connections. The construction is a variant of the Ashtekar-Lewandowski construction, see [5]. The setup is a manifold $M$ and a trivial $G$-principal fiber bundle over $M$, where $G$ is a compact connected Lie group. Denote by $\mathcal{A}$ the space of smooth $G$-connections. We start with a system $S$ of graphs on $M$. The system has to be dense and directed according to Definitions 2.1.6 and 2.1.7 in [1]. The specific examples we have in mind are the following two:

Example 1. Let $T$ be a triangulation of $M$. We let $\Gamma_0$ be the graph consisting of all the edges in this triangulation. Strictly speaking this is not a graph if the manifold is not compact, but in this case we can consider $\Gamma_0$ as a system of graphs instead. Let $T_n$ be the triangulation obtained by barycentrically subdividing each of the simplices in $T$ $n$ times. The graph $\Gamma_n$ is the graph consisting of the edges of $T_n$. In this way we get a directed and dense system $S = \{\Gamma_n\}$ of graphs.

Example 2. Let $\Gamma_0$ be a finite, $d$-dimensional lattice and let $\Gamma_1$ be the lattice obtained by subdividing each cell in $\Gamma_0$ into $2^d$ cells, see Fig. 1. Correspondingly, let $\Gamma_i$ be the lattice obtained by repeating $i$ such subdivisions of $\Gamma_0$. In this way we get a directed and dense system $S = \{\Gamma_n\}$ of graphs.

We assume that the system $S$ we are dealing with is of the form $S = \{S_n\}_{n \in \mathbb{N}}$, where $S_n$ are finite graphs and $S_n \subset S_{n+1}$. Also we assume the edges to be oriented and


Fig. 2. The structure map of one subdivision

we assume the embeddings $S_n \subset S_{n+1}$ to preserve the orientation. This is clearly the case in Examples 1 and 2.

We define $\mathcal{A}_n = G^{e(S_n)}$, where $e(S_n)$ denotes the number of edges in $S_n$. In other words, we have just associated to each edge a copy of $G$. We think of $\mathcal{A}_n$ as $\mathcal{A}$ restricted to $S_n$; namely, for each connection we associate to each edge in $S_n$ the holonomy of the connection along the edge, which is just an element of $G$. There are natural maps
\[ P_{n,n+1} : \mathcal{A}_{n+1} \to \mathcal{A}_n \]
defined in the following way: if an edge $e_i \in S_n$ is the composition $e_{i_1} e_{i_2} \cdots e_{i_k}$, where $e_{i_1}, e_{i_2}, \ldots, e_{i_k} \in S_{n+1}$, then $(g_{i_1}, \ldots, g_{i_k})$ gets mapped to $g_{i_1} \cdots g_{i_k}$ in the $i$'th component of $\mathcal{A}_n$. If $e_l \in S_{n+1}$ is not part of the subdivision of any edge in $S_n$, the map $P_{n,n+1}$ just forgets the $l$'th component in $\mathcal{A}_{n+1}$. See Fig. 2. Given these maps we can define
\[ \overline{\mathcal{A}}^{S} = \lim_{\leftarrow} \mathcal{A}_n . \]
Since $\mathcal{A}_n$ has a natural compact Hausdorff topology, and the maps $P_{n,n+1}$ are continuous, $\overline{\mathcal{A}}^{S}$ has a natural compact Hausdorff topology.

A smooth connection $\nabla$ gives rise to an element in $\overline{\mathcal{A}}^{S}$ by
\[ \nabla \mapsto \big(Hol(e_1, \nabla), \ldots, Hol(e_{e(S_n)}, \nabla)\big) \in \mathcal{A}_n , \]
where $Hol(e_i, \nabla)$ denotes the holonomy of $\nabla$ along $e_i$. We therefore get a map from $\mathcal{A}$ to $\overline{\mathcal{A}}^{S}$. This map is a dense embedding, see [1] for details.


3. The Coordinate Change and the Riemannian Metric

We will now further assume that edges from $S_n$ get subdivided into two in $S_{n+1}$. This is clearly the case in Examples 1 and 2. Therefore, for a single edge, the projective system looks like
\[ G \leftarrow G^2 \leftarrow G^4 \leftarrow \cdots \leftarrow G^{2^n} \leftarrow G^{2^{n+1}} \leftarrow \cdots \]
with structure maps
\[ P_{n,n+1}(g_1, \ldots, g_{2^{n+1}}) = (g_1 g_2, \ldots, g_{2^{n+1}-1}\, g_{2^{n+1}}). \]
We will from now on focus on the case of a single edge, since the general case is basically just more notation. Like in [1] we define the coordinate transformation
\[ \Theta_n : \mathcal{A}_n = G^{2^n} \to G^{2^n} \]
by
\[ \Theta_n(g_1, \ldots, g_{2^n}) = (g_1 g_2 \cdots g_{2^n},\; g_2 g_3 \cdots g_{2^n},\; \ldots,\; g_{2^n-1} g_{2^n},\; g_{2^n}). \]
It is easy to see that $\Theta_n$ preserves the Haar measure on $G^{2^n}$. The inverse of $\Theta_n$ is given by
\[ \Theta_n^{-1}(g_1, \ldots, g_{2^n}) = \big(g_1 g_2^{-1},\; g_2 g_3^{-1},\; \ldots,\; g_{2^n-1}\, g_{2^n}^{-1},\; g_{2^n}\big). \]
The important feature of the coordinate change is the following:
\[ \Theta_n P_{n,n+1} \Theta_{n+1}^{-1}(g_1, \ldots, g_{2^{n+1}}) = (g_1, g_3, \ldots, g_{2^{n+1}-1}). \]
We will from now on use $\Theta$ to identify $\overline{\mathcal{A}}^{S}$ with a projective system of the form
\[ G \leftarrow G^2 \leftarrow G^4 \leftarrow \cdots \leftarrow G^{2^n} \leftarrow G^{2^{n+1}} \leftarrow \cdots \]
with structure maps
\[ P_{n,n+1}(g_1, \ldots, g_{2^{n+1}}) = (g_1, g_3, \ldots, g_{2^{n+1}-1}). \]
Hence the structure maps have been simplified significantly.

This way of writing the projective system can be seen in the following way: the edge is divided into $2^n$ smaller edges. The coordinate $g_1$ corresponds to the holonomy along the entire edge. The coordinate $g_2$ corresponds to the holonomy along the entire edge minus the first of the $2^n$ edges. The coordinate $g_3$ corresponds to the holonomy along the entire edge minus the first two of the $2^n$ edges, and so on. See Fig. 3.

We now choose a left and right invariant metric $\langle \cdot, \cdot \rangle$ on $G$. We will consider it as a metric on $T^*G$. We equip $T^*\mathcal{A}_n = T^*G^{2^n}$ with the product metric and denote it by $\langle \cdot, \cdot \rangle_n$. Note that
\[ \langle P^*_{n,n+1}(v), P^*_{n,n+1}(u) \rangle_{n+1} = \langle v, u \rangle_n , \tag{1} \]


Fig. 3. The new parameterization

and hence the family of metrics $\langle \cdot, \cdot \rangle_n$ descends to a metric on $T^*\overline{\mathcal{A}}^{S} = \lim_n T^*\mathcal{A}_n$, which we will also denote by $\langle \cdot, \cdot \rangle$.

Denote by $L^2(\mathcal{A}_n, Cl(T^*\mathcal{A}_n))$ the Hilbert space $L^2(G^{2^n}, Cl(T^*G^{2^n}))$, where $Cl(T^*G^{2^n})$ is the Clifford bundle with respect to $\langle \cdot, \cdot \rangle_n$, and $G^{2^n}$ is equipped with the Haar measure. Because of (1), and because the Haar measure of $G^{2^n}$ is 1, the map $P^*_{n,n+1}$ defines a Hilbert space embedding
\[ P^*_{n,n+1} : L^2(\mathcal{A}_n, Cl(T^*\mathcal{A}_n)) \to L^2(\mathcal{A}_{n+1}, Cl(T^*\mathcal{A}_{n+1})). \]
We can thus define
\[ L^2(\overline{\mathcal{A}}^{S}, Cl(T^*\overline{\mathcal{A}}^{S})) = \lim_{\rightarrow} L^2(\mathcal{A}_n, Cl(T^*\mathcal{A}_n)). \]
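The coordinate change of Sect. 3 is a purely group-theoretic statement, so it can be sanity-checked on any group. The following is a minimal sketch, assuming the finite permutation group $S_4$ as a stand-in for the compact Lie group $G$; the function names (`structure_map`, `theta`, `theta_inv`, `reparameterized_map`) are illustrative and do not come from the paper. It checks that conjugating the pairwise-product structure map by the coordinate change leaves the map that keeps only the odd-numbered coordinates.

```python
import random
from itertools import permutations

# Stand-in group (an assumption): permutations of {0,1,2,3}, stored as tuples.
def mul(a, b):
    """Group product a*b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(a)))

def inv(a):
    """Inverse permutation."""
    out = [0] * len(a)
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

def structure_map(g):
    """Original P_{n,n+1}: multiply neighbouring pairs (g1 g2, g3 g4, ...)."""
    return [mul(g[i], g[i + 1]) for i in range(0, len(g), 2)]

def theta(g):
    """Coordinate change: k-th entry is the product g_k g_{k+1} ... g_last."""
    out, acc = [], None
    for x in reversed(g):
        acc = x if acc is None else mul(x, acc)
        out.append(acc)
    return out[::-1]

def theta_inv(g):
    """Inverse change: (g1 g2^-1, g2 g3^-1, ..., g_{last-1} g_last^-1, g_last)."""
    return [mul(g[i], inv(g[i + 1])) for i in range(len(g) - 1)] + [g[-1]]

def reparameterized_map(g):
    """The composite theta_n o P_{n,n+1} o theta_{n+1}^{-1}."""
    return theta(structure_map(theta_inv(g)))

rng = random.Random(0)
group = list(permutations(range(4)))
g = [rng.choice(group) for _ in range(8)]   # a point of G^(2^(n+1)) with n = 2

# theta and theta_inv are mutually inverse.
assert theta(theta_inv(g)) == g and theta_inv(theta(g)) == g
# In the new coordinates the structure map just deletes the even-numbered copies.
assert reparameterized_map(g) == g[0::2]
```

Since only associativity and inverses are used, the same check should go through for matrix groups such as $SU(2)$, up to floating-point tolerance.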

4. The Algebra and the Dirac Type Operator

We want to construct a spectral triple related to $\overline{\mathcal{A}}^{S}$. Let $v$ be a vertex in $S$ and assume that $G$ is unitarily represented as matrices. A loop $L$ in $S$ with base point $v$ defines a matrix valued function $h_L$ over $\overline{\mathcal{A}}^{S}$ via
\[ h_L(\nabla) = Hol(L, \nabla), \quad \nabla \in \overline{\mathcal{A}}^{S}. \]

Definition 4.0.1. The algebra $\mathcal{B}_v$ of holonomy loops based in $v$ is the $*$-algebra generated by the $h_L$'s, where $L$ is running through all the loops in $S$ based in $v$.

Since the representation of $G$ is unitary, the $h_L$'s are bounded functions and therefore define bounded operators on $L^2(\overline{\mathcal{A}}^{S}, M_N)$, where $M_N$ is the algebra of $N \times N$ matrices in which $G$ is represented. In particular $\mathcal{B}_v$ can be completed to a $C^*$-algebra.

We want to construct a spectral triple for $\mathcal{B}_v$. Since $\mathcal{B}_v$ is an algebra of functions over $\overline{\mathcal{A}}^{S}$, we will do this by constructing a Dirac type operator on $\overline{\mathcal{A}}^{S}$. To be more precise, the operator will act on $L^2(\overline{\mathcal{A}}^{S}, Cl(T^*\overline{\mathcal{A}}^{S}))$.


Let $\mathfrak{g}$ be the Lie algebra of $G$. We choose an orthonormal basis $\{e_i\}$ for $\mathfrak{g}$ with respect to $\langle \cdot, \cdot \rangle$. We also denote by $\{e_i\}$ the corresponding left translated vector fields. On $G$ define the bare Dirac type operator by
\[ D_b(\xi) = \sum_i e_i \cdot d_{e_i} \xi, \quad \xi \in L^2(G, Cl(TG)), \]
where $d_{e_i}$ means deriving with respect to $e_i$ in the trivialization given by $\{e_i\}$, and $\cdot$ means Clifford multiplication.

On $G^{2^n}$ we define the operator $D_{n,j}$ acting on $L^2(G^{2^n}, Cl(TG^{2^n}))$ simply as $D_b$ acting on the $j$'th copy of $G$. Since $\langle \cdot, \cdot \rangle_n$ identifies $TG^{2^n}$ with $T^*G^{2^n}$, we can consider $D_{n,j}$ as an operator acting on $L^2(\mathcal{A}_n, Cl(T^*\mathcal{A}_n))$.

Note that given a Dirac type operator $D$ acting on $L^2(\mathcal{A}_{n-1}, Cl(T^*\mathcal{A}_{n-1}))$ we can define an operator $E_n(D)$ acting on $L^2(\mathcal{A}_n, Cl(T^*\mathcal{A}_n))$ simply by letting it act on the odd variables of $\mathcal{A}_n = G^{2^n}$.

Definition 4.0.2. Let $\{a_{j,k}\}_{j \in \mathbb{N}_0,\, 1 \le k \le 2^{j-1}}$ (with the odd convention that $2^{-1} = 1$) be a sequence of nonzero real numbers. The $n$'th Dirac type operator is defined inductively via $D_0 = D_b$ and
\[ D_n = E_n(D_{n-1}) + \sum_k a_{n,k} D_{n,2k}. \]

By construction it is clear that
\[ P^*_{n,n+1}(D_n(\xi)) = D_{n+1}\big(P^*_{n,n+1}(\xi)\big). \tag{2} \]

Proposition 4.0.3. The family of operators $\{D_n\}$ descends to a densely defined essentially self adjoint operator $D$ on $L^2(\overline{\mathcal{A}}^{S}, Cl(T^*\overline{\mathcal{A}}^{S}))$.

Proof. By (2) it follows that $\{D_n\}$ descends to a densely defined operator $D$ on $L^2(\overline{\mathcal{A}}^{S}, Cl(T^*\overline{\mathcal{A}}^{S}))$. The operators $D_n$ are formally self adjoint elliptic differential operators on compact manifolds, and hence orthonormally diagonalizable. Because of (2) we can find an orthonormal basis for $L^2(\overline{\mathcal{A}}^{S}, Cl(T^*\overline{\mathcal{A}}^{S}))$ diagonalizing $D$ with real eigenvalues. In particular $D$ is essentially self adjoint.

Proposition 4.0.4. The commutator $[h_L, D]$ is bounded for all $h_L \in \mathcal{B}_v$.

Proof. A given loop $L$ belongs to $S_n$ for some $n$. Therefore the action of $h_L$ on $L^2(\overline{\mathcal{A}}^{S}, M_N \otimes Cl(T^*\overline{\mathcal{A}}^{S}))$ depends only, by construction of the coordinate change, on the copies of $G$ appearing at the $n$'th level. Therefore $[h_L, D] = [h_L, D_n]$. On the other hand $[h_L, D_n]$ is an order zero operator on a compact manifold, and hence bounded.

5. Semifiniteness

We will in this section assume that $G$ has the property that the kernel of the bare Dirac type operator is $Cl(\mathfrak{g})$, where $Cl(\mathfrak{g})$ is understood as the sections in $Cl(TG)$ generated by left invariant vector fields. For $U(1)$ this is trivial, and the computation in the Appendix of [1] shows that this is also the case for $SU(2)$, which is the example of most interest. We do not know if all compact Lie groups possess this property.


One of the crucial demands of being a unital spectral triple is that the Dirac operator should have compact resolvent. This is however clearly not the case for $D$, since it has infinite dimensional kernel. We will however see that we have a semifinite spectral triple. For a semifinite spectral triple one replaces the compact resolvent condition with the condition that
\[ \frac{1}{D^2 + 1} \]
is compact with respect to a certain trace, i.e. the trace should be thought of as integrating out the infinite degeneracy in the spectrum of $D$. The following definition first appeared in [7].

Definition 5.0.5. Let $\mathcal{N}$ be a semifinite von Neumann algebra with a semifinite trace $\tau$. Let $\mathcal{K}_\tau$ be the $\tau$-compact operators. A semifinite spectral triple $(\mathcal{B}, H, D)$ is a $*$-subalgebra $\mathcal{B}$ of $\mathcal{N}$, a representation of $\mathcal{N}$ on the Hilbert space $H$ and an unbounded densely defined self adjoint operator $D$ on $H$ affiliated with $\mathcal{N}$ satisfying
1. $b(\lambda - D)^{-1} \in \mathcal{K}_\tau$ for all $b \in \mathcal{B}$ and $\lambda \notin \mathbb{R}$.
2. $[b, D]$ is densely defined and extends to a bounded operator.

We will now prove that
\[ \big(\mathcal{B}_v,\; L^2(\overline{\mathcal{A}}^{S}, M_N \otimes Cl(T^*\overline{\mathcal{A}}^{S})),\; D\big) \]
is a semifinite spectral triple. We therefore need to specify a semifinite von Neumann algebra $\mathcal{N}$ with a semifinite trace $\tau$.

We can use $\{e_i\}$ to trivialize $T^*G$. Doing this in each copy of $G$ we can also trivialize $T^*\mathcal{A}_n$. Hence we can factorize
\[ L^2(\overline{\mathcal{A}}^{S}, M_N \otimes Cl(T^*\overline{\mathcal{A}}^{S})) = L^2(\overline{\mathcal{A}}^{S}) \otimes M_N \otimes Cl(T^*_{id}\overline{\mathcal{A}}^{S}), \]
where
\[ Cl(T^*_{id}\overline{\mathcal{A}}^{S}) = \lim_n Cl(T^*_{id}\mathcal{A}_n). \]
Since the problem arises from the infinite dimensionality of $Cl(T^*_{id}\overline{\mathcal{A}}^{S})$ we will take the algebra
\[ \mathcal{N} = \mathcal{B}\big(L^2(\overline{\mathcal{A}}^{S}) \otimes M_N\big) \otimes \mathcal{C}, \]
where $\mathcal{C}$ is the following von Neumann algebra acting on $Cl(T^*_{id}\overline{\mathcal{A}}^{S})$: We write
\[ T^*_{id}\mathcal{A}_{n+1} = T^*_{id}\mathcal{A}_n \oplus V_{n,n+1}, \]
and
\[ Cl(T^*_{id}\mathcal{A}_{n+1}) = Cl(T^*_{id}\mathcal{A}_n) \,\hat{\otimes}\, Cl(V_{n,n+1}); \]
then, with abuse of notation,
\[ P^*_{n,n+1} : Cl(T^*_{id}\mathcal{A}_n) \to Cl(T^*_{id}\mathcal{A}_{n+1}) \]


is given by
\[ P^*_{n,n+1}(v) = v \otimes 1_{Cl(V_{n,n+1})}. \]
Define $\mathcal{C}$ as the weak closure of the $C^*$-algebra
\[ B = \lim_{\rightarrow} Cl(T^*_{id}\mathcal{A}_n) \]
with respect to the representation on $Cl(T^*_{id}\overline{\mathcal{A}}^{S})$. We denote by $P_n^*$ the natural map from $Cl(T^*_{id}\mathcal{A}_n)$ to $B$.

Note that $B$ is a UHF-algebra. Since the dimension of the Clifford algebra is a power of 2 when $n \ge 1$, $B$ is the CAR-algebra and has a normalized trace. This trace can be described in the following way: $Cl(T^*_{id}\mathcal{A}_n)$ is a matrix algebra, and hence has a normalized trace $\tau_n$. By definition of the normalized trace we have
\[ \tau_{n+1} \circ P^*_{n,n+1} = \tau_n. \]
Thus $\{\tau_n\}$ descends to a trace $\tau$ on $B$. In particular $\tau(1) = 1$. This remedies the defect that $Cl(T^*_{id}\overline{\mathcal{A}}^{S})$ is infinite dimensional.
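The compatibility of the normalized traces can be illustrated in a toy model. The sketch below assumes that each Clifford algebra is represented as a full matrix algebra and that the embedding $v \mapsto v \otimes 1$ is modelled by an ordinary Kronecker product with an identity matrix (the grading of the tensor product is ignored, which does not affect the trace of $v \otimes 1$). The helper names `kron`, `embed` and `normalized_trace` are illustrative only.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def identity(m):
    """The m x m identity matrix."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def normalized_trace(a):
    """tau(a) = tr(a) / dim, so that tau(1) = 1 in every matrix size."""
    return Fraction(sum(a[i][i] for i in range(len(a))), len(a))

def embed(v, m):
    """Toy model of P*_{n,n+1}: v -> v (x) 1, with 1 the m x m identity."""
    return kron(v, identity(m))

v = [[3, 1], [4, -5]]

# tau_{n+1} o P*_{n,n+1} = tau_n, and the unit always has trace 1.
assert normalized_trace(embed(v, 4)) == normalized_trace(v)
assert normalized_trace(identity(8)) == 1
```

The unnormalized trace of `embed(v, m)` grows with `m`; only the normalized trace is stable along the inductive system, which is why it is the one that passes to the limit algebra.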

Note that the action of $B$ on $Cl(T^*_{id}\overline{\mathcal{A}}^{S})$ is just the GNS-representation of $B$ with respect to the normalized trace on $B$. Therefore $\mathcal{C}$ is the hyperfinite $II_1$ factor, and $\tau$ extends to a finite trace on $\mathcal{C}$.

Tensoring with the ordinary operator trace $tr$ on $\mathcal{B}(L^2(\overline{\mathcal{A}}^{S}) \otimes M_N)$ we obtain a semifinite trace $Tr$ on $\mathcal{N}$.

Theorem 5.0.6. The triple
\[ \big(\mathcal{B}_v,\; L^2(\overline{\mathcal{A}}^{S}, M_N \otimes Cl(T^*\overline{\mathcal{A}}^{S})),\; D\big) \]
is a semifinite spectral triple with respect to $(\mathcal{N}, Tr)$ when $a_{j,k} \to \infty$.

Proof. Clearly $\mathcal{B}_v \subset \mathcal{N}$. Also by Proposition 4.0.4, the commutators $[h_L, D]$ are bounded. We therefore only need to check that $D$ is affiliated with $\mathcal{N}$ and that $D$ has $Tr$-compact resolvent.

Let $P_{n,\lambda}$ be the spectral projection of $D_n$ corresponding to the eigenvalue $\lambda$. To this projection we associate a projection $P^{\infty}_{n,\lambda}$ in $\mathcal{N}$ in the following way: the embedding of $L^2(\mathcal{A}_n) \to L^2(\overline{\mathcal{A}}^{S})$ induces an embedding
\[ I_n : \mathcal{B}\big(L^2(\mathcal{A}_n) \otimes M_N\big) \to \mathcal{B}\big(L^2(\overline{\mathcal{A}}^{S}) \otimes M_N\big). \]
Define $P^{\infty}_{n,\lambda} = (I_n \otimes P_n^*)(P_{n,\lambda})$, where $P_n^* : Cl(T^*_{id}\mathcal{A}_n) \to B$ is the natural map.

Suppose $\xi$ is an eigenvector for $D_n$ with eigenvalue $\lambda$. Since
\[ D_{n+1}(v) = 0, \quad v \in Cl(V_{n,n+1}), \]
we see that $P^*_{n,n+1}(\xi) \otimes v$ is an eigenvector for $D_{n+1}$. This shows that $P^{\infty}_{n,\lambda}$ is a subprojection of $P_\lambda$, the spectral projection of $D$ corresponding to the eigenvalue $\lambda$.


Fig. 4. A different reparameterization

Since $P^{\infty}_{n,\lambda} \to P_\lambda$ weakly, $P_\lambda \in \mathcal{N}$, and hence $D$ is affiliated with $\mathcal{N}$.

By the assumption on the bare Dirac type operator and since $a_{j,k} \to \infty$, the only new eigenvectors with eigenvalues in a given bounded set introduced by going from $D_n$ to $D_{n+1}$ will, from a certain step, be of the form $P^*_{n,n+1}(\xi) \otimes v$, where $\xi$ is an eigenvector of $D_n$ with eigenvalue in the bounded set and $v \in Cl(V_{n,n+1})$. Thus in every bounded set of $\mathbb{R}$ there are only finitely many eigenvalues of $D$ and the associated spectral projections are finite with respect to $Tr$.

6. Concluding Remarks

The present construction of the spectral triple is based on the reparameterization of $\overline{\mathcal{A}}^{S}$ into a projective system of the form
\[ G \leftarrow G^2 \leftarrow G^4 \leftarrow \cdots \leftarrow G^{2^n} \leftarrow G^{2^{n+1}} \leftarrow \cdots \]
with structure maps
\[ P_{n,n+1}(g_1, \ldots, g_{2^{n+1}}) = (g_1, g_3, \ldots, g_{2^{n+1}-1}). \]
The Dirac operator we have constructed is just a weighted sum of Dirac operators on each of the copies of $G$.

The reparameterization we have chosen is by no means unique. The reparameterization relies on a choice of labeling of the new degrees of freedom which are generated by going from step $n$ to step $n + 1$. Another choice of labeling is indicated in Fig. 4. The labeling can in general be done in many different ways, where each choice of labeling gives rise to different spectral triples. In the end one would expect some symmetry condition singling out the spectral triple which might be relevant in physics.

One feature of the chosen reparameterization that seems appealing is that it works for the system $S$ of all subgraphs of the line segment. Since a subgraph of the line consists of points in the line, choose an endpoint and label the segments like in Fig. 3. This will define a reparameterization of $\overline{\mathcal{A}}^{S}$ into a projective system where the structure maps are just deleting copies of $G$. It is a priori not clear how this can be obtained with the labeling indicated in Fig. 4. The construction of the Dirac type operator readily carries over. However, $L^2(\overline{\mathcal{A}}^{S})$ is not separable anymore, and it is not clear how this can be turned into a semifinite spectral triple. Furthermore, this reparameterization of all subgraphs does not seem to work for the system of all analytic subgraphs when the manifold has dimension bigger than one.

The spectral triple constructed by means of the reparameterization in this article differs from the one constructed in [1]. The spectral analysis of the one constructed in [1] appears to be more complicated than the reparameterized ones. However the original


Fig. 5. Loop running through first half of an edge

Dirac type operator appears to be more natural since it is more symmetrical. This is related to the interaction between the Dirac type operators and the loop algebra. In fact, the interaction of the algebra with the reparameterized Dirac type operator seems to be less natural due to an asymmetry which arises through the reparameterization. For example, a loop $L$ running through the first half of an edge, see Fig. 5, has in the reparameterization an action of the form
\[ (L\xi)(\ldots, g_1, g_2, \ldots) = \cdots g_2^{-1} g_1 \cdots \xi(\ldots, g_1, g_2, \ldots), \]
whereas a loop running through the second half of the edge has an action of the form
\[ (L\xi)(\ldots, g_1, g_2, \ldots) = \cdots g_2 \cdots \xi(\ldots, g_1, g_2, \ldots). \]
Therefore the construction has a built in asymmetry. It remains to be clarified whether any of these different Dirac type operators are singled out by some arguments of symmetry related to physical principles.

Acknowledgement. We thank the referees for their careful reading of the paper and for useful suggestions. Johannes Aastrup was funded by the German Research Foundation (DFG) within the research project Geometrische Strukturen in der Mathematik (SFB 478).

References

1. Aastrup, J., Grimstrup, J., Nest, R.: On spectral triples in quantum gravity II. J. Noncomm. Geom. 3(1), 47–81 (2009)
2. Aastrup, J., Grimstrup, J., Nest, R.: On spectral triples in quantum gravity I. Class. Quant. Grav. http://arxiv.org/abs/0802.1783V1 [hep-th] (2008, in press)
3. Aastrup, J., Grimstrup, J.M.: Spectral triples of holonomy loops. Commun. Math. Phys. 264(3), 657–681 (2006)
4. Aastrup, J., Grimstrup, J.M.: Intersecting Connes noncommutative geometry with quantum gravity. Int. J. Mod. Phys. A 22(8–9), 1589–1603 (2007)
5. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C*-algebras. In: Knots and Quantum Gravity (Riverside, CA, 1993). Oxford Lecture Ser. Math. Appl., vol. 1, pp. 21–61. Oxford University Press, New York (1994)
6. Ashtekar, A., Lewandowski, J.: Background independent quantum gravity: a status report. Class. Quant. Grav. 21(15), R53–R152 (2004)
7. Carey, A.L., Phillips, J., Sukochev, F.A.: On unbounded p-summable Fredholm modules. Adv. Math. 151(2), 140–163 (2000)
8. Chamseddine, A.H., Connes, A., Marcolli, M.: Gravity and the standard model with neutrino mixing. Adv. Theor. Math. Phys. 11(6), 991–1089 (2007)
9. Fairbairn, W., Rovelli, C.: Separable Hilbert space in loop quantum gravity. J. Math. Phys. 45(7), 2802–2814 (2004)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 290, 399–435 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0849-6

Communications in

Mathematical Physics

A Class of Integrable Flows on the Space of Symmetric Matrices

Anthony M. Bloch¹, Vasile Brînzănescu²,³, Arieh Iserles⁴, Jerrold E. Marsden⁵, Tudor S. Ratiu⁶

1 Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA. E-mail: [email protected]
2 “Simion Stoilow” Institute of Mathematics of the Romanian Academy, P.O.Box 1-764, 014700 Bucharest, Romania
3 Department of Mathematics and Informatics, University of Pitești, 110040 Pitești, Romania. E-mail: [email protected]
4 Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK. E-mail: [email protected]
5 Control and Dynamical Systems 107-81, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]
6 Section de Mathématiques and Bernoulli Center, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland. E-mail: [email protected]

Received: 21 February 2007 / Accepted: 30 March 2009
Published online: 8 July 2009 – © Springer-Verlag 2009

Abstract: For a given skew symmetric real $n \times n$ matrix $N$, the bracket $[X, Y]_N = XNY - YNX$ defines a Lie algebra structure on the space $\operatorname{Sym}(n, N)$ of symmetric $n \times n$ real matrices and hence a corresponding Lie-Poisson structure. The purpose of this paper is to investigate the geometry, integrability, and linearizability of the Hamiltonian system $\dot X = [X^2, N]$, or equivalently in Lax form, the equation $\dot X = [X, XN + NX]$ on this space, along with a detailed study of the Poisson geometry itself. If $N$ has distinct eigenvalues, it is proved that this system is integrable on a generic symplectic leaf of the Lie-Poisson structure of $\operatorname{Sym}(n, N)$. This is established by finding another compatible Poisson structure. If $N$ is invertible, several remarkable identifications can be implemented. First, $(\operatorname{Sym}(n, N), [\cdot, \cdot])$ is Lie algebra isomorphic with the symplectic Lie algebra $\mathfrak{sp}(n, N^{-1})$ associated to the symplectic form on $\mathbb{R}^n$ given by $N^{-1}$. In this case, the system is the reduction of the geodesic flow of the left invariant Frobenius metric on the underlying symplectic group $Sp(n, N^{-1})$. Second, the trace of the product of matrices defines a non-invariant non-degenerate inner product on $\operatorname{Sym}(n, N)$ which identifies it with its dual. Therefore $\operatorname{Sym}(n, N)$ carries a natural Lie-Poisson structure as well as a compatible “frozen bracket” structure. The Poisson diffeomorphism from $\operatorname{Sym}(n, N)$ to $\mathfrak{sp}(n, N^{-1})$ maps our system to a Mischenko-Fomenko system, thereby providing another proof of its integrability if $N$ is invertible with distinct eigenvalues. Third, there is a second ad-invariant inner product on $\operatorname{Sym}(n, N)$; using it to identify $\operatorname{Sym}(n, N)$ with itself and composing it with the dual of the Lie algebra isomorphism with $\mathfrak{sp}(n, N^{-1})$, our system becomes a Mischenko-Fomenko system directly on $\operatorname{Sym}(n, N)$. If $N$ is invertible and has distinct eigenvalues, it is shown that this geodesic flow on $\operatorname{Sym}(n, N)$ is linearized on the Prym subvariety of the Jacobian of the spectral curve associated to a Lax pair formulation with parameter of the system. If, on the other hand, $N$ has nullity one and distinct eigenvalues, in spite of the fact that the system is completely integrable, it is shown that the flow does not linearize on the Jacobian of the spectral curve.

Research partially supported by NSF grants CMS-0408542 and DMS-0604307.
Research partially supported by the Swiss SCOPES grant IB7320-110721/1, 2005-2008, and MEdC Contract 2-CEx 06-11-22/25.07.2006.
Research partially supported by the California Institute of Technology and NSF-ITR Grant ACI-0204932.
Research partially supported by the Swiss NSF and the Swiss SCOPES grant IB7320-110721/1.

Contents

1. Introduction
2. The Lie Algebra and the Euler–Poincaré Form
3. Poisson Structures
4. The Sectional Operator Equations and Relation to Mischenko-Fomenko Flows
5. Lax Pairs with Parameter
6. Involution
7. Independence
8. Linearization of the Flow

1. Introduction

The problem and discussion of the results. Fix $N \in \mathfrak{so}(n)$, the space of skew symmetric $n \times n$ matrices, also regarded as the Lie algebra of $SO(n)$, the $n$-dimensional proper orthogonal group. This paper continues the analysis, begun by Bloch and Iserles in [5], of the following set of ordinary differential equations on $\operatorname{Sym}(n)$, the linear space of $n \times n$ symmetric matrices:
\[ \dot X = [X^2, N]. \tag{1.1} \]
Here, $X \in \operatorname{Sym}(n)$, $\dot X$ denotes the time derivative, and initial conditions are denoted $X(0) = X_0 \in \operatorname{Sym}(n)$. It is easy to check that $[X^2, N] \in \operatorname{Sym}(n)$, so that if the initial condition is in $\operatorname{Sym}(n)$ then $X(t) \in \operatorname{Sym}(n)$ for all $t$. As will be seen shortly, this system is Hamiltonian and, despite its quadratic dependence on $X$, conservation of energy guarantees that solutions of (1.1) exist for all $t \in \mathbb{R}$. Because of the obvious identity $[X^2, N] = [X, XN + NX] = X^2 N - N X^2$, Eq. (1.1) may be rewritten in the Lax form
\[ \dot X = [X, XN + NX], \tag{1.2} \]
again with initial conditions $X(0) = X_0 \in \operatorname{Sym}(n)$.¹

¹ Integrable equations that bear a formal resemblance to Eq. (1.1), that is, to (1.2), in the context of free associative algebras are given in [18 and 24].

Define the $N$-bracket by $[X, Y]_N := XNY - YNX$. It is easy to check that this makes $\operatorname{Sym}(n)$ into a Lie algebra and with this structure it will be denoted $\operatorname{Sym}(n, N)$. The structure of this Lie algebra is completely analyzed in the present paper. Using the trace inner product, identify $\operatorname{Sym}(n, N)$ with its dual and endow it with the associated Lie-Poisson structure. As will be done below, it is straightforward to show that the system (1.1) is Hamiltonian with respect to this Lie-Poisson structure with Hamiltonian equal to the quadratic form defined by the Frobenius metric. Interestingly, the system is also Hamiltonian with respect to a compatible “frozen” Poisson structure; this provides a bi-Hamiltonian structure for Eq. (1.1).

We study the Poisson geometry on $\operatorname{Sym}(n, N)$ for both Poisson structures and, in particular, determine the generic leaves and the Casimir functions of both Poisson structures relative to which the system (1.1) is bi-Hamiltonian. The Poisson geometry in the case $N$ is not invertible turns out to be particularly rich.

A key result of the paper is that if $N$ has distinct eigenvalues (one of which could be zero), this system is integrable on the generic symplectic leaf of $\operatorname{Sym}(n, N)$ (of either the Lie-Poisson or the frozen Lie-Poisson structures). The proof makes use of the Lax pair with parameter found in [5] to find a class of integrals that, as we show using the preceding bi-Hamiltonian structure together with a technique inspired by [22], are in involution.² Related work on bi-Hamiltonian structures may be found in [17 and 6]. Independence is proved directly.

We show that if $N$ is invertible, the Lie algebra $\operatorname{Sym}(n, N)$ is isomorphic to the symplectic Lie algebra $\mathfrak{sp}(n, N^{-1})$, where the symplectic form on $\mathbb{R}^n$ is given by $N^{-1}$. Thus, in this case, the system (1.1) is Lie-Poisson on (the dual of) $\mathfrak{sp}(n, N^{-1})$, and so the system is the (Euler-Poincaré or Lie-Poisson) reduction of the geodesic flow on the underlying symplectic group, denoted by $Sp(n, N^{-1})$, relative to the Frobenius metric.

If $N$ is invertible there is a Poisson diffeomorphism from $\mathfrak{sp}(n, N^{-1})$ to $\operatorname{Sym}(n, N)$, the inverse of which maps our system to a Mischenko-Fomenko system (see [19–21])³, thereby providing another proof of integrability in the case that $N$ is invertible with distinct eigenvalues.
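The algebraic facts quoted around Eqs. (1.1) and (1.2) are easy to confirm on a concrete example. The following sketch, using exact integer arithmetic and arbitrarily chosen illustrative matrices `X` (symmetric) and `N` (skew symmetric), checks that $[X^2, N] = [X, XN + NX]$, that the right-hand side of (1.1) is again symmetric, and that $\operatorname{tr}(X \dot X) = 0$, which is the statement that $\operatorname{tr}(X^2)$ has zero time derivative along the flow. The normalization of the Hamiltonian is not fixed here; only the vanishing derivative is tested.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Illustrative data (an assumption): any symmetric X and skew N would do.
X = [[2, -1, 0, 3], [-1, 0, 4, 1], [0, 4, 1, -2], [3, 1, -2, 5]]
N = [[0, 1, -2, 0], [-1, 0, 3, 1], [2, -3, 0, -4], [0, -1, 4, 0]]

rhs = comm(matmul(X, X), N)                   # right-hand side of (1.1)
lax = comm(X, add(matmul(X, N), matmul(N, X)))  # right-hand side of (1.2)

assert rhs == lax                   # (1.1) and (1.2) agree
assert rhs == transpose(rhs)        # the flow stays in Sym(n)
assert trace(matmul(X, rhs)) == 0   # d/dt tr(X^2) = 2 tr(X Xdot) = 0
```

Because the entries are integers, the three identities are verified exactly rather than up to rounding.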
In addition, by identifying the symmetric matrices with themselves by an ad-invariant inner product if $N$ is invertible (as opposed to the standard identification by the trace of the product used before, which is valid in general, even if $N$ is not invertible), our flow can be seen as a Mischenko-Fomenko flow on its dual. A byproduct of our work is thus the bi-Hamiltonian structure for the associated Mischenko-Fomenko system on $\mathfrak{sp}(n, N^{-1})$. Bi-Hamiltonian structures for Mischenko-Fomenko systems were first discussed in [6,17], and later in [22]. We also note that the sequence of integrals we produce by our Lax pair with parameter method on $\operatorname{Sym}(n, N)$ is not produced by shifting the arguments in Casimir functions. Relative to the Lie-Poisson structure on $\operatorname{Sym}(n, N)$, our method for analyzing this system appears to be fundamentally different from completely integrable systems either of rigid body or Toda type (on symmetric matrices) and none of the standard involution theorems (see e.g. [25]) seem to be applicable.

Since the system (1.1) is integrable and its integrals are polynomials, one would expect that this system may be algebraically completely integrable (as defined, for example, in [3]). It turns out that the situation is quite involved. If $N$ is invertible and has all eigenvalues distinct, then the linearization criterion in [3 or 11] applies and the system is linearizable on the Jacobian of the associated spectral curve. In spite of this fact, we could not prove that the system is algebraically completely integrable. However, the spectral curve has an involution, and thus the system is in fact linearizable on a Prym variety. If $N$ has odd size, distinct eigenvalues, and nullity one, we show by the concrete study of the case $n = 5$ that the system (1.1) is not linearizable on the Jacobian of the spectral curve.
On the other hand, it was already shown that the system is integrable, so this

² A related result on bi-Hamiltonian structures for rigid body type equations with a parameter can be found in [7]. Note that the bi-Hamiltonian structure in the present paper is for the equations without parameter, which is more relevant for the present study.
³ We thank A. Bolsinov for this observation and the referee for a related observation.


situation is an example of an integrable system all of whose integrals are polynomials but whose flow does not linearize on the Jacobian of the spectral curve.

The structure of the paper. In Sect. 2, the Lie algebra structure on the space of symmetric matrices induced by $N$ is introduced and in the case in which $N$ is invertible, the isomorphism with $\mathfrak{sp}(n, N^{-1})$ is set up. In Sect. 3, two compatible Poisson structures are defined and the associated bi-Hamiltonian structure is analyzed, and the symplectic leaves and Casimir functions of both Poisson structures are determined. In Sect. 4 the system (1.1) is shown not to directly lie in this family. However, the dual of a Lie algebra isomorphism defines a Poisson isomorphism from $\mathfrak{sp}(n, N^{-1})$ to $\operatorname{Sym}(n, N)$; its inverse maps (1.1) to a Mischenko-Fomenko system on $\mathfrak{sp}(n, N^{-1})$ if $N$ has distinct eigenvalues. This fact provides a proof of complete integrability of (1.1) if $N$ is invertible with distinct eigenvalues. Section 5 returns to the system (1.1) on $\operatorname{Sym}(n, N)$, presents the Lax pair with parameter, and finds a new family of functions containing the right number of functionally independent integrals of motion; this set of functions is thus a candidate for the Liouville integrals. In Sect. 6 involutivity of these integrals is shown using the bi-Hamiltonian structure and Sect. 7 proves the independence of these functions provided that $N$ has distinct eigenvalues and is either invertible or has nullity one. Finally, Sect. 8 is devoted to the proofs of the linearization statements given above.

2. The Lie Algebra and the Euler–Poincaré Form

Regarding $N$ as a Poisson tensor on $\mathbb{R}^n$, the bracket of two functions $f, g$ is defined in the standard way as
\[ \{f, g\}_N = (\nabla f)^T N \nabla g. \tag{2.1} \]
The Hamiltonian vector field associated with a function $h$ (with the convention that $\dot f(z) = X_h(z) \cdot \nabla f(z) = \{f, h\}(z)$) is easily checked to be given by
\[ X_h(z) = N \nabla h(z). \tag{2.2} \]

Quadratic functions. For each $X \in \operatorname{Sym}(n)$, define the quadratic Hamiltonian $Q_X$ by
\[ Q_X(z) := \frac{1}{2} z^T X z, \quad z \in \mathbb{R}^n. \]
Let $\mathcal{Q} := \{Q_X \mid X \in \operatorname{Sym}(n)\}$ be the vector space of all such functions. Note that the map $Q : X \in \operatorname{Sym}(n) \mapsto Q_X \in \mathcal{Q}$ is an isomorphism. Using (2.2) it follows that the Hamiltonian vector field of $Q_X$ has the form
\[ X_{Q_X}(z) = N X z. \tag{2.3} \]
The Poisson bracket of two such quadratic functions is easy to work out.

Lemma 2.1. For $X, Y \in \operatorname{Sym}(n)$, we have
\[ \{Q_X, Q_Y\}_N = Q_{[X,Y]_N}, \tag{2.4} \]
where, as earlier, $[X, Y]_N := XNY - YNX \in \operatorname{Sym}(n)$. In addition, $\operatorname{Sym}(n)$ is a Lie algebra relative to the Lie bracket $[\cdot, \cdot]_N$ and with this structure will be denoted $\operatorname{Sym}(n, N)$. Therefore, $Q : X \in (\operatorname{Sym}(n, N), [\cdot, \cdot]_N) \mapsto Q_X \in (\mathcal{Q}, \{\cdot, \cdot\}_N)$ is a Lie algebra isomorphism.


Proof. Using (2.1), we have
\[ \{Q_X, Q_Y\}_N(z) = (\nabla Q_X)(z)^T N (\nabla Q_Y)(z) = (Xz)^T N Y z = z^T X N Y z = \frac{1}{2} z^T (XNY - YNX) z = Q_{[X,Y]_N}(z). \]
Recall that the notation $Q_V$ is reserved only for symmetric matrices $V$. Since $X, Y \in \operatorname{Sym}(n, N)$ implies that $[X, Y]_N = XNY - YNX \in \operatorname{Sym}(n, N)$, we can write $Q_{[X,Y]_N}$ in the preceding equation. The bracket $[\cdot, \cdot]_N$ on $\operatorname{Sym}(n, N)$ is clearly bilinear and antisymmetric. The Jacobi identity follows by a straightforward direct verification.

It is a general fact that Hamiltonian vector fields and Poisson brackets are related by
\[ [X_f, X_g] = -X_{\{f,g\}}, \tag{2.5} \]
where the bracket on the left-hand side is the Jacobi-Lie bracket. Thus, it is natural to look at the corresponding algebra of Hamiltonian vector fields on the Poisson manifold $(\mathbb{R}^n, \{\cdot, \cdot\}_N)$ associated to quadratic Hamiltonians. If we take $f = Q_X$ and $g = Q_Y$, with $X_f = NX$ and $X_g = NY$, and recall that the Jacobi-Lie bracket of linear vector fields is the negative of the commutator of the associated matrices, then we have the following result, which can also be verified directly.

Proposition 2.2. Equations (2.4) and (2.5) imply
\[ N[X, Y]_N = [NX, NY]. \tag{2.6} \]
Letting $LH$ denote the Lie algebra of linear Hamiltonian vector fields on $\mathbb{R}^n$ relative to the commutator bracket of matrices, (2.6) states that the map $X \in (\operatorname{Sym}(n, N), [\cdot, \cdot]_N) \mapsto NX \in (LH, [\cdot, \cdot])$ is a homomorphism of Lie algebras.⁴

Invertible case. If $N$ is invertible, then this homomorphism is an isomorphism. In addition, the non-degeneracy of $N$ implies that $n$ is even and that $\mathbb{R}^n$ is a symplectic vector space relative to the symplectic form defined by $N^{-1}$, that is, $(u, v) \mapsto u \cdot N^{-1} v$ for $u, v \in \mathbb{R}^n$. Therefore, the Lie algebra $(LH, [\cdot, \cdot])$ is isomorphic to the Lie algebra $\mathfrak{sp}(n, N^{-1})$ of linear infinitesimally symplectic maps of $\mathbb{R}^n$ relative to the symplectic form defined above by $N^{-1}$. Recall that elements $Z \in \mathfrak{sp}(n, N^{-1})$ are characterized by the identity $Z^T N^{-1} + N^{-1} Z = 0$, which is equivalent to the statement that $N^{-1} Z$ is a symmetric $n \times n$ matrix. Thus $NX \in \mathfrak{sp}(n, N^{-1})$ is equivalent to $X = X^T$, as expected.

We summarize these considerations in the following statement that can also be found in [27] at the end of Remark 22 in Sect. 44, p. 245.

Proposition 2.3. Let $N \in \mathfrak{so}(n)$. The map $Q : X \in (\operatorname{Sym}(n, N), [\cdot, \cdot]_N) \mapsto Q_X \in (\mathcal{Q}, \{\cdot, \cdot\}_N)$ is a Lie algebra isomorphism. The map $X \in (\operatorname{Sym}(n, N), [\cdot, \cdot]_N) \mapsto NX \in (LH, [\cdot, \cdot])$ is a Lie algebra homomorphism and if $N$ is invertible it induces an isomorphism of $(\operatorname{Sym}(n, N), [\cdot, \cdot]_N)$ with $\mathfrak{sp}(n, N^{-1})$.

⁴ We thank Gopal Prasad for suggesting isomorphisms of this type; they are closely related to well-known properties of linear Hamiltonian vector fields, as in [16], Prop. 2.7.8.
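The statements of Lemma 2.1 and Proposition 2.2 can be spot-checked with exact integer arithmetic. The sketch below is only an illustration on randomly generated data (fixed seed, odd size $n = 5$ so that $N$ is necessarily singular); the helper names such as `nbracket` are not from the paper. It tests that $[X, Y]_N$ is symmetric, that the Jacobi identity holds, that $N[X, Y]_N = [NX, NY]$, and that $\{Q_X, Q_Y\}_N(z) = Q_{[X,Y]_N}(z)$ at a sample point $z$.

```python
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nbracket(x, y, n_mat):
    """The N-bracket [X, Y]_N = X N Y - Y N X."""
    return sub(matmul(matmul(x, n_mat), y), matmul(matmul(y, n_mat), x))

rng = random.Random(1)
n = 5

def rand_matrix(rows, cols):
    return [[rng.randint(-3, 3) for _ in range(cols)] for _ in range(rows)]

def rand_sym():
    a = rand_matrix(n, n)
    return add(a, transpose(a))

def rand_skew():
    a = rand_matrix(n, n)
    return sub(a, transpose(a))

X, Y, Z, N = rand_sym(), rand_sym(), rand_sym(), rand_skew()
zero = [[0] * n for _ in range(n)]

B = nbracket(X, Y, N)
assert B == transpose(B)                      # the bracket stays in Sym(n)

jacobi = add(add(nbracket(nbracket(X, Y, N), Z, N),
                 nbracket(nbracket(Y, Z, N), X, N)),
             nbracket(nbracket(Z, X, N), Y, N))
assert jacobi == zero                         # Jacobi identity

NX, NY = matmul(N, X), matmul(N, Y)
assert matmul(N, B) == sub(matmul(NX, NY), matmul(NY, NX))   # Eq. (2.6)

# Eq. (2.4) at a point z: (Xz)^T N (Yz) = (1/2) z^T [X,Y]_N z.
z = rand_matrix(n, 1)
lhs = matmul(transpose(matmul(X, z)), matmul(N, matmul(Y, z)))[0][0]
rhs = matmul(transpose(z), matmul(B, z))[0][0]
assert 2 * lhs == rhs
```

The Jacobi identity here reflects the fact that $(X, Y) \mapsto XNY$ is an associative product, of which $[\cdot, \cdot]_N$ is the commutator.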


A. M. Bloch, V. Brînz˘anescu, A. Iserles, J. E. Marsden, T. S. Ratiu

Noninvertible case. Assume that $N$ is a general skew-symmetric matrix, not necessarily invertible. We shall now determine the structure of the Lie algebra $(\mathrm{Sym}(n, N), [\cdot,\cdot]_N)$. The point of departure is the fact that if $N$ is non-degenerate, then $X \in (\mathrm{Sym}(n, N), [\cdot,\cdot]_N) \mapsto NX \in (LH, [\cdot,\cdot]) = (\mathfrak{sp}(n, N^{-1}), [\cdot,\cdot])$ is a Lie algebra isomorphism.

Recall that if $\mathbb{R}^n$ has an inner product, which we shall take in what follows to be the usual dot product associated to the basis in which the skew-symmetric matrix $N$ is given, and $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear map, then $\mathbb{R}^n$ decomposes orthogonally as $\mathbb{R}^n = \operatorname{im} L^T \oplus \ker L$. Taking $L = N$ in this statement and recalling that $N^T = -N$, we get the orthogonal decomposition $\mathbb{R}^n = \operatorname{im} N \oplus \ker N$. Let $2p = \operatorname{rank} N$ and $d := n - 2p$. Then $\bar{N} := N|_{\operatorname{im} N} : \operatorname{im} N \to \operatorname{im} N$ defines a non-degenerate skew-symmetric bilinear form and, by the previous proposition, $(\mathrm{Sym}(2p), [\cdot,\cdot]_{\bar{N}})$ is isomorphic as a Lie algebra to $(\mathfrak{sp}(2p, \bar{N}^{-1}), [\cdot,\cdot])$. In this direct sum decomposition of $\mathbb{R}^n$, the skew-symmetric matrix $N$ takes the form

\[
N = \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix},
\]

where $\bar{N}$ is a $(2p) \times (2p)$ skew-symmetric non-degenerate matrix.

The Lie algebra $(\mathrm{Sym}(2p), [\cdot,\cdot]_{\bar{N}})$ acts on the vector space $M_{(2p) \times d}$ of $(2p) \times d$ matrices (which we can think of as linear maps of $\ker N$ to $\operatorname{im} N$) by $S \cdot A := S \bar{N} A$, where $S \in (\mathrm{Sym}(2p), [\cdot,\cdot]_{\bar{N}})$ and $A \in M_{(2p) \times d}$. Indeed, if $S, S' \in \mathrm{Sym}(2p)$ and $A \in M_{(2p) \times d}$, then

\[
[S, S']_{\bar{N}} \cdot A = (S \bar{N} S' - S' \bar{N} S) \bar{N} A = S \bar{N} S' \bar{N} A - S' \bar{N} S \bar{N} A = S \cdot (S' \cdot A) - S' \cdot (S \cdot A). \tag{2.7}
\]

Now form the semidirect product $\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}$. Its bracket is defined by

\[
[(S, A), (S', A')] = ([S, S']_{\bar{N}}, S \cdot A' - S' \cdot A) = (S \bar{N} S' - S' \bar{N} S, \; S \bar{N} A' - S' \bar{N} A) \tag{2.8}
\]

for any $S, S' \in \mathrm{Sym}(2p)$ and $A, A' \in M_{(2p) \times d}$. Next, define the $\mathrm{Sym}(d)$-valued Lie algebra two-cocycle $C : (\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}) \times (\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}) \to \mathrm{Sym}(d)$ by

\[
C((S, A), (S', A')) := A^T \bar{N} A' - (A')^T \bar{N} A \tag{2.9}
\]

for any $S, S' \in \mathrm{Sym}(2p)$ and $A, A' \in M_{(2p) \times d}$. The cocycle identity

\[
C([(S, A), (S', A')], (S'', A'')) + C([(S', A'), (S'', A'')], (S, A)) + C([(S'', A''), (S, A)], (S', A')) = 0
\]

for any $S, S', S'' \in \mathrm{Sym}(2p)$ and $A, A', A'' \in M_{(2p) \times d}$ is a straightforward verification. Now extend $\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}$ by this cocycle. That is, form the vector space $(\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}) \oplus \mathrm{Sym}(d)$ and endow it with the bracket

\[
[(S, A, B), (S', A', B')]_C := \left( S \bar{N} S' - S' \bar{N} S, \; S \bar{N} A' - S' \bar{N} A, \; A^T \bar{N} A' - (A')^T \bar{N} A \right) \tag{2.10}
\]

for any $S, S' \in \mathrm{Sym}(2p)$, $A, A' \in M_{(2p) \times d}$, and $B, B' \in \mathrm{Sym}(d)$.

A Class of Integrable Flows on the Space of Symmetric Matrices


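Proposition 2.4. The map $\Psi : ((\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}) \oplus \mathrm{Sym}(d), [\cdot,\cdot]_C) \to (\mathrm{Sym}(n, N), [\cdot,\cdot]_N)$ given by

\[
\Psi(S, A, B) := \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \tag{2.11}
\]

is a Lie-algebra isomorphism.

Proof. It is obvious that $\Psi$ is a vector space isomorphism, therefore only the Lie-algebra homomorphism condition needs to be verified. So, let $(S, A, B), (S', A', B') \in (\mathrm{Sym}(2p) \ltimes M_{(2p) \times d}) \oplus \mathrm{Sym}(d)$ and compute

\[
\begin{aligned}
\Psi([(S, A, B), (S', A', B')]_C)
&= \Psi\left( S \bar{N} S' - S' \bar{N} S, \; S \bar{N} A' - S' \bar{N} A, \; A^T \bar{N} A' - (A')^T \bar{N} A \right) \\
&= \begin{bmatrix} S \bar{N} S' - S' \bar{N} S & S \bar{N} A' - S' \bar{N} A \\ (S \bar{N} A' - S' \bar{N} A)^T & A^T \bar{N} A' - (A')^T \bar{N} A \end{bmatrix} \\
&= \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} S' & A' \\ (A')^T & B' \end{bmatrix} - \begin{bmatrix} S' & A' \\ (A')^T & B' \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \\
&= [\Psi(S, A, B), \Psi(S', A', B')]_N
\end{aligned}
\]

as required.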
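The homomorphism computation in the proof of Proposition 2.4 can be spot-checked on concrete matrices. The sketch below is illustrative only: the function names (`block`, `cocycle_bracket`, `n_bracket`), the sizes $2p = 2$, $d = 3$, and the particular $\bar{N}$ are assumptions chosen for the example. It assembles the block matrix of (2.11) and compares the extended bracket (2.10) with the $N$-bracket in exact integer arithmetic.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]

def rand_matrix(rows, cols, rng):
    return [[rng.randint(-4, 4) for _ in range(cols)] for _ in range(rows)]

def rand_symmetric(n, rng):
    M = rand_matrix(n, n, rng)
    return [[M[i][j] + M[j][i] for j in range(n)] for i in range(n)]

def block(S, A, B):
    """Assemble [[S, A], [A^T, B]], the image of (S, A, B) under the map (2.11)."""
    top = [rs + ra for rs, ra in zip(S, A)]
    bottom = [rat + rb for rat, rb in zip(transpose(A), B)]
    return top + bottom

def cocycle_bracket(x, y, Nbar):
    """The bracket (2.10); it does not involve the Sym(d) components."""
    (S, A, _), (S2, A2, _) = x, y
    SN, S2N = matmul(S, Nbar), matmul(S2, Nbar)
    AtN, A2tN = matmul(transpose(A), Nbar), matmul(transpose(A2), Nbar)
    return (sub(matmul(SN, S2), matmul(S2N, S)),
            sub(matmul(SN, A2), matmul(S2N, A)),
            sub(matmul(AtN, A2), matmul(A2tN, A)))

def n_bracket(X, Y, N):
    """[X, Y]_N = X N Y - Y N X."""
    return sub(matmul(matmul(X, N), Y), matmul(matmul(Y, N), X))

rng = random.Random(1)
two_p, d = 2, 3
Nbar = [[0, 3], [-3, 0]]                       # invertible skew-symmetric block
N = block(Nbar, zeros(two_p, d), zeros(d, d))  # N = diag(Nbar, 0)

x = (rand_symmetric(two_p, rng), rand_matrix(two_p, d, rng), rand_symmetric(d, rng))
y = (rand_symmetric(two_p, rng), rand_matrix(two_p, d, rng), rand_symmetric(d, rng))

lhs = block(*cocycle_bracket(x, y, Nbar))
rhs = n_bracket(block(*x), block(*y), N)
homomorphism_holds = lhs == rhs
print(homomorphism_holds)
```

Note that `block` fills the lower-left corner with the transpose of the upper-right one, so the comparison also confirms that the $(2,1)$ block of the $N$-bracket is the transpose of its $(1,2)$ block.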

For a different description of the structure of this Lie algebra using its Levi decomposition and not involving cocycles see [27], Sect. 44, Remark 22, p. 245.

Euler–Poincaré form. The Euler–Poincaré form for the equations can be derived as follows. Identify $\mathrm{Sym}(n, N)$ with its dual using the positive definite inner product

\[
\langle\langle X, Y \rangle\rangle := \operatorname{trace}(XY), \quad \text{for } X, Y \in \mathrm{Sym}(n, N). \tag{2.12}
\]

Remark. The inner product $\langle\langle X, Y \rangle\rangle$ is not ad-invariant relative to the $N$-bracket, but the bilinear form

\[
\kappa_N(X, Y) := \operatorname{trace}(N X N Y) \tag{2.13}
\]

is invariant, as is easy to check. Note that for $N$ invertible $\kappa_N$ is non-degenerate and hence an inner product and provides another method of identifying $\mathrm{Sym}(n)$ with its dual. We shall return to this observation at the end of Sect. 4.

Define the Lagrangian $l : \mathrm{Sym}(n, N) \to \mathbb{R}$ on the Lie algebra $(\mathrm{Sym}(n, N), [\cdot,\cdot]_N)$ by

\[
l(X) = \frac{1}{2} \operatorname{trace}(X^2) = \frac{1}{2} \operatorname{trace}(X X^T) = \frac{1}{2} \langle\langle X, X \rangle\rangle. \tag{2.14}
\]

Proposition 2.5. The equations

\[
\dot{X} = [X^2, N] \tag{2.15}
\]

are the Euler-Poincaré equations$^5$ corresponding to the Lagrangian (2.14) on the Lie algebra $(\mathrm{Sym}(n, N), [\cdot,\cdot]_N)$.

5 For a general discussion of the Euler-Poincaré equations, see, for instance, [16].


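Proof. Recall that the general (left) Euler-Poincaré equations on a Lie algebra $\mathfrak{g}$ associated with a Lagrangian $l : \mathfrak{g} \to \mathbb{R}$ are given by

\[
\frac{d}{dt} Dl(\xi) = \operatorname{ad}^*_\xi Dl(\xi),
\]

where $Dl(\xi) \in \mathfrak{g}^*$ is the Fréchet derivative of $l$ at $\xi$. Equivalently, for each fixed $\eta \in \mathfrak{g}$, we have

\[
\frac{d}{dt} Dl(\xi) \cdot \eta = Dl(\xi) \cdot [\xi, \eta]. \tag{2.16}
\]

In our case, letting $\xi = X$ and $\eta = Y$ arbitrary and time-independent, Eqs. (2.16) become

\[
\frac{d}{dt} \langle\langle X, Y \rangle\rangle = \langle\langle X, [X, Y]_N \rangle\rangle = \langle\langle X, XNY - YNX \rangle\rangle;
\]

that is,

\[
\operatorname{trace}(\dot{X} Y) = \operatorname{trace}\left( X (XNY - YNX) \right) = \operatorname{trace}\left( (X^2 N - N X^2) Y \right),
\]

which gives the result.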

3. Poisson Structures

Two compatible Poisson structures on $\mathrm{Sym}(n, N)$ are introduced in this section. Their associated Poisson geometry is studied in detail. These two structures together with the bi-Hamiltonian methodology will be the key to proving integrability of (1.1).

Two Poisson structures. Identifying $\mathrm{Sym}(n, N)$ with its dual using the inner product $\langle\langle \cdot, \cdot \rangle\rangle$ defined in (2.12) endows $\mathrm{Sym}(n, N)$ with the (left, or minus) Lie-Poisson bracket

\[
\{f, g\}_N(X) = -\operatorname{trace}\left[ X \left( \nabla f(X) N \nabla g(X) - \nabla g(X) N \nabla f(X) \right) \right], \tag{3.1}
\]

where $\nabla f$ is the gradient of $f$ relative to the inner product $\langle\langle \cdot, \cdot \rangle\rangle$ on $\mathrm{Sym}(n, N)$. Later on we shall also need the frozen Poisson bracket

\[
\{f, g\}_{F_N}(X) = -\operatorname{trace}\left( \nabla f(X) N \nabla g(X) - \nabla g(X) N \nabla f(X) \right). \tag{3.2}
\]

It is a general fact that the Poisson structures (3.1) and (3.2) are compatible in the sense that their sum is a Poisson structure (see e.g. Exercise 10.1-5 in [16]).

For what follows it is important to compute the Poisson tensors corresponding to the above Poisson brackets. Recall that the Poisson tensor can be viewed as a vector bundle morphism $B : T^*(\mathrm{Sym}(n, N)) \to T(\mathrm{Sym}(n, N))$ covering the identity. It is defined by $B(dh) = \{\cdot, h\}_N$ for any locally defined smooth function $h$ on $\mathrm{Sym}(n, N)$. Since $\mathrm{Sym}(n, N)$ is a vector space, these bundles are trivial and hence the value $B_X$ at $X \in \mathrm{Sym}(n, N)$ of the Poisson tensor $B$ is a linear map $B_X : \mathrm{Sym}(n, N) \to \mathrm{Sym}(n, N)$ by identifying $\mathrm{Sym}(n, N)$ with its dual using the inner product $\langle\langle \cdot, \cdot \rangle\rangle$.


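Proposition 3.1. Denote the value at $X \in \mathrm{Sym}(n, N)$ of the Poisson tensors corresponding to the Lie-Poisson (3.1) and frozen (3.2) brackets by $B_X$ and $C_X$, respectively. Then for any $Y \in \mathrm{Sym}(n, N)$ we have

\[
B_X(Y) = XYN - NYX, \tag{3.3}
\]
\[
C_X(Y) = YN - NY. \tag{3.4}
\]

Proof. Let $f$ and $g$ be locally defined smooth functions on $\mathrm{Sym}(n, N)$. The definition of $B_X$ gives

\[
\begin{aligned}
\langle\langle \nabla f(X), B_X(\nabla g(X)) \rangle\rangle
&= \{f, g\}_N(X) \\
&= -\operatorname{trace}\left[ X \left( \nabla f(X) N \nabla g(X) - \nabla g(X) N \nabla f(X) \right) \right] \\
&= \operatorname{trace}\left[ \nabla f(X) \left( X \nabla g(X) N - N \nabla g(X) X \right) \right] \\
&= \langle\langle \nabla f(X), X \nabla g(X) N - N \nabla g(X) X \rangle\rangle,
\end{aligned}
\]

which implies (3.3) since any $Y \in \mathrm{Sym}(n, N)$ is of the form $\nabla g(X)$, where $g(X) = \langle\langle X, Y \rangle\rangle$. Similarly, the definition of $C_X$ gives

\[
\begin{aligned}
\langle\langle \nabla f(X), C_X(\nabla g(X)) \rangle\rangle
&= \{f, g\}_{F_N}(X) \\
&= -\operatorname{trace}\left( \nabla f(X) N \nabla g(X) - \nabla g(X) N \nabla f(X) \right) \\
&= \operatorname{trace}\left[ \nabla f(X) \left( \nabla g(X) N - N \nabla g(X) \right) \right] \\
&= \langle\langle \nabla f(X), \nabla g(X) N - N \nabla g(X) \rangle\rangle,
\end{aligned}
\]

which proves (3.4).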
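The trace manipulations in the proof of Proposition 3.1 can be confirmed on sample matrices. This is a hedged sketch, not the authors' computation: the symmetric matrices `F` and `G` stand in for the gradients $\nabla f(X)$ and $\nabla g(X)$, and all function names are invented for this illustration. Integer entries make the equalities exact.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def rand_symmetric(n, rng):
    M = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    return [[M[i][j] + M[j][i] for j in range(n)] for i in range(n)]

def rand_skew(n, rng):
    M = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    return [[M[i][j] - M[j][i] for j in range(n)] for i in range(n)]

def lie_poisson_bracket(X, F, G, N):
    """Right-hand side of (3.1) with gradients F and G."""
    inner = sub(matmul(matmul(F, N), G), matmul(matmul(G, N), F))
    return -trace(matmul(X, inner))

def frozen_bracket(F, G, N):
    """Right-hand side of (3.2) with gradients F and G."""
    return -trace(sub(matmul(matmul(F, N), G), matmul(matmul(G, N), F)))

def B_tensor(X, Y, N):
    """Formula (3.3): B_X(Y) = X Y N - N Y X."""
    return sub(matmul(matmul(X, Y), N), matmul(matmul(N, Y), X))

def C_tensor(Y, N):
    """Formula (3.4): C_X(Y) = Y N - N Y."""
    return sub(matmul(Y, N), matmul(N, Y))

rng = random.Random(2)
n = 4
N, X = rand_skew(n, rng), rand_symmetric(n, rng)
F, G = rand_symmetric(n, rng), rand_symmetric(n, rng)

lie_poisson_ok = trace(matmul(F, B_tensor(X, G, N))) == lie_poisson_bracket(X, F, G, N)
frozen_ok = trace(matmul(F, C_tensor(G, N))) == frozen_bracket(F, G, N)
# Both tensors map symmetric matrices to symmetric matrices.
images_symmetric = (B_tensor(X, G, N) == transpose(B_tensor(X, G, N))
                    and C_tensor(G, N) == transpose(C_tensor(G, N)))
print(lie_poisson_ok, frozen_ok, images_symmetric)
```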

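Hamiltonian vector fields. Let us determine the Hamiltonian vector fields associated to a smooth function for both Poisson brackets. Recall that if $\mathfrak{g}$ is a Lie algebra, the Lie-Poisson equations defined by $h \in C^\infty(\mathfrak{g}^*)$ relative to the minus Lie-Poisson bracket are $\dot{\mu} = \operatorname{ad}^*_{\delta h / \delta \mu} \mu$, where $\mu \in \mathfrak{g}^*$. We shall identify $\mathrm{Sym}(n, N)^*$ with itself via the inner product $\langle\langle \cdot, \cdot \rangle\rangle$. Therefore, for any $X, Y, Z \in \mathrm{Sym}(n, N)$, we have

\[
\langle\langle (\operatorname{ad}^N_Y)^* X, Z \rangle\rangle = \langle\langle X, [Y, Z]_N \rangle\rangle = \operatorname{trace}(XYNZ - XZNY) = \operatorname{trace}\left( (XYN - NYX) Z \right) = \langle\langle XYN - NYX, Z \rangle\rangle,
\]

and hence

\[
(\operatorname{ad}^N_Y)^* X = XYN - NYX.
\]

If $h \in C^\infty(\mathrm{Sym}(n, N))$, we denote by $\nabla h(X)$ the gradient relative to the inner product $\langle\langle \cdot, \cdot \rangle\rangle$. Therefore, the Lie-Poisson equations for $h \in C^\infty(\mathrm{Sym}(n, N))$ are $\dot{X} = (\operatorname{ad}^N_{\nabla h(X)})^* X$, that is,

\[
\dot{X} = X \nabla h(X) N - N \nabla h(X) X. \tag{3.5}
\]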


Similarly, Hamilton's equations for the frozen bracket are

\[
\dot{X} = \nabla h(X) N - N \nabla h(X). \tag{3.6}
\]

In particular, if $h(X) = \operatorname{trace}(X^2)/2$, Eq. (3.5) becomes $\dot{X} = [X^2, N]$. Similarly, if $h(X) = \operatorname{trace}(X^3)/3$, Eq. (3.6) becomes $\dot{X} = [X^2, N]$.

If $N$ is invertible, we have seen that there is an ad-invariant inner product $\kappa_N(X, Y) = \operatorname{trace}(N X N Y)$. Therefore, we can identify $\mathrm{Sym}(n, N)^*$ with itself using the inner product $\kappa_N$. Denote by $(\operatorname{ad}^N_Y)^\dagger$ the adjoint relative to $\kappa_N$ of the $N$-adjoint map $\operatorname{ad}^N_Y(Z) := [Y, Z]_N$, for any $Z \in \mathrm{Sym}(n, N)$. To determine it, let $M, Y, Z \in \mathrm{Sym}(n, N)$ be arbitrary ($M$ is thought of as an element in the dual), compute

\[
\begin{aligned}
\kappa_N\left( (\operatorname{ad}^N_Y)^\dagger M, Z \right) &= \kappa_N(M, [Y, Z]_N) = \operatorname{trace}\left( N M N (Y N Z - Z N Y) \right) \\
&= \operatorname{trace}\left( N (M N Y - Y N M) N Z \right) = \kappa_N\left( M N Y - Y N M, Z \right),
\end{aligned}
\]

and conclude that

\[
(\operatorname{ad}^N_Y)^\dagger M = M N Y - Y N M = [M, Y]_N.
\]

If $h \in C^\infty(\mathrm{Sym}(n, N))$, denote by $\nabla^N h(M)$ the gradient relative to the inner product $\kappa_N$. Therefore, the Lie-Poisson equations for $h \in C^\infty(\mathrm{Sym}(n, N))$ are

\[
\dot{M} = (\operatorname{ad}^N_{\nabla^N h(M)})^\dagger M = [M, \nabla^N h(M)]_N. \tag{3.7}
\]

For example, if $h(M) = \operatorname{trace}(N^2 M N^2 M)/2$, then for any $S \in \mathrm{Sym}(n, N)$ we get

\[
\operatorname{trace}(N^2 M N^2 S) = dh(M) \cdot S = \kappa_N\left( \nabla^N h(M), S \right) = \operatorname{trace}\left( N \nabla^N h(M) N S \right),
\]

and hence $\nabla^N h(M) = N M N$, so Hamilton's equations (3.7) are

\[
\dot{M} = [M, N M N]_N. \tag{3.8}
\]

Note that if $l(X) = \langle\langle X, X \rangle\rangle / 2 = \operatorname{trace}(X^2)/2$, then the Legendre transform $M := \nabla^N l(X) = N^{-1} X N^{-1}$ gives the Hamiltonian

\[
h(M) := \kappa_N(M, X) - l(X) = \frac{1}{2} \operatorname{trace}(N^2 M N^2 M).
\]

Hence the Lie-Poisson equation (3.8) is equivalent to the Euler-Poincaré equation (2.15). One can check this fact explicitly: substituting for $M$ in terms of $X$ in (3.8) gives (2.15) and vice versa.
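The explicit check mentioned at the end of this paragraph can be sketched as follows. The sketch parametrizes by $M$ and sets $X = NMN$, which is the same relation as $M = N^{-1} X N^{-1}$ when $N$ is invertible; this avoids computing an inverse and keeps the arithmetic in exact integers. The specific invertible $N$ and all helper names are assumptions made for the illustration.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def chain(*factors):
    """Product of several matrices, left to right."""
    result = factors[0]
    for factor in factors[1:]:
        result = matmul(result, factor)
    return result

def rand_symmetric(n, rng):
    R = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
    return [[R[i][j] + R[j][i] for j in range(n)] for i in range(n)]

# An invertible skew-symmetric N with distinct eigenvalue pairs (+-i, +-2i).
N = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 2], [0, 0, -2, 0]]

rng = random.Random(3)
M = rand_symmetric(4, rng)
X = chain(N, M, N)              # equivalently M = N^{-1} X N^{-1}

# Right-hand side of (3.8): [M, N M N]_N = M N (N M N) - (N M N) N M.
NMN = chain(N, M, N)
M_dot = sub(chain(M, N, NMN), chain(NMN, N, M))

# X = N M N implies X_dot = N M_dot N, which should equal [X^2, N] as in (2.15).
X_dot_from_M = chain(N, M_dot, N)
X2 = matmul(X, X)
X_dot_euler_poincare = sub(matmul(X2, N), matmul(N, X2))
equations_agree = X_dot_from_M == X_dot_euler_poincare

# Legendre transform: kappa_N(M, X) - l(X) = trace(N^2 M N^2 M) / 2.
N2 = matmul(N, N)
legendre_ok = (2 * trace(chain(N, M, N, X)) - trace(X2)
               == trace(chain(N2, M, N2, M)))
print(equations_agree, legendre_ok)
```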


Generic leaves. Next, the dimensions of the generic leaves of the two Poisson brackets are determined. The Lie-Poisson bracket is treated first. The following proposition follows from [27], Sect. 44, Prop. 23, p. 245. We give below an elementary proof.

Proposition 3.2. Let $n = 2p + d$, where $2p = \operatorname{rank} N$. The generic leaves of the Lie–Poisson bracket $\{\cdot,\cdot\}_N$ are $2p(p + d)$-dimensional.

Proof. As in the proof of Proposition 2.4, we orthogonally decompose $\mathbb{R}^n = \operatorname{im} N \oplus \ker N$ so that $\bar{N} = N|_{\operatorname{im} N} : \operatorname{im} N \to \operatorname{im} N$ is an isomorphism. In this decomposition the matrix $N$ takes the form

\[
N = \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix}
\]

and, according to the isomorphism in Proposition 2.4, the matrix $X$ can be written as

\[
X = \begin{bmatrix} S & A \\ A^T & B \end{bmatrix},
\]

where $S \in \mathrm{Sym}(2p)$, $B \in \mathrm{Sym}(d)$, and $A \in M_{(2p) \times d}$. Therefore, if

\[
Y = \begin{bmatrix} U & C \\ C^T & D \end{bmatrix} \in \mathrm{Sym}(n, N)
\]

with $U \in \mathrm{Sym}(2p)$, $D \in \mathrm{Sym}(d)$, $C \in M_{(2p) \times d}$, the Poisson tensor of the Lie-Poisson bracket $\{\cdot,\cdot\}_N$ takes the form (see Proposition 3.1)

\[
\begin{aligned}
B_X(Y) &= XYN - NYX \\
&= \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \begin{bmatrix} U & C \\ C^T & D \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U & C \\ C^T & D \end{bmatrix} \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \\
&= \begin{bmatrix} S U \bar{N} - \bar{N} U S + A C^T \bar{N} - \bar{N} C A^T & -\bar{N} U A - \bar{N} C B \\ A^T U \bar{N} + B C^T \bar{N} & 0 \end{bmatrix}.
\end{aligned}
\]

Since $\bar{N}$ is invertible, the kernel of $B_X : \mathrm{Sym}(n, N) \to \mathrm{Sym}(n, N)$ is therefore given by all $U \in \mathrm{Sym}(2p)$, $D \in \mathrm{Sym}(d)$, and $C \in M_{(2p) \times d}$ such that

\[
S U \bar{N} - \bar{N} U S + A C^T \bar{N} - \bar{N} C A^T = 0 \quad \text{and} \quad U A + C B = 0.
\]

To compute the dimension of the maximal symplectic leaves, we assume that the matrix $X$ is generic. So, supposing that $B$ is invertible, we have $C = -U A B^{-1}$ and

\[
\left( S - A B^{-1} A^T \right) U \bar{N} - \bar{N} U \left( S - A B^{-1} A^T \right) = 0.
\]

Since $S - A B^{-1} A^T \in \mathrm{Sym}(2p)$ is given, this condition is identical to the vanishing of the Poisson tensor on the dual of the Lie algebra $(\mathrm{Sym}(2p, \bar{N}), [\cdot,\cdot]_{\bar{N}})$ evaluated at $S - A B^{-1} A^T$. But $\bar{N}$ is invertible so, according to Proposition 2.3, this Lie algebra is isomorphic to $\mathfrak{sp}(2p, \bar{N}^{-1})$, whose rank is $p$. Therefore, the kernel of the map

\[
U \in \mathrm{Sym}(2p, \bar{N}) \mapsto \left( S - A B^{-1} A^T \right) U \bar{N} - \bar{N} U \left( S - A B^{-1} A^T \right) \in \mathrm{Sym}(2p, \bar{N})
\]

for generic $S - A B^{-1} A^T$ has dimension $p$.

Since $C = -U A B^{-1}$ is uniquely determined and $D \in \mathrm{Sym}(d)$ is arbitrary, we see that the kernel of $B_X$ for generic $X$ has dimension $p + d(d + 1)/2$. Thus, the dimension of the generic leaf of the Lie–Poisson bracket $\{\cdot,\cdot\}_N$ is

\[
\frac{1}{2} (2p + d)(2p + d + 1) - p - \frac{1}{2} d(d + 1) = 2p(p + d),
\]

as claimed in the statement of the proposition.
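The dimension count can be illustrated by computing the rank of the linear map $Y \mapsto XYN - NYX$ on a basis of symmetric matrices. The sketch below is an illustration under stated assumptions, not the authors' method: the two sample matrices are hand-picked with $p = 1$, with $B$ invertible and $S - AB^{-1}A^T \neq 0$, and the function names are invented here. Rational arithmetic from the standard library keeps the rank computation exact.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def symmetric_basis(n):
    """Basis E_ij + E_ji (i < j) and E_ii of the n x n symmetric matrices."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = [[0] * n for _ in range(n)]
            E[i][j] = E[j][i] = 1
            basis.append(E)
    return basis

def leaf_dimension(X, N):
    """Rank of Y -> X Y N - N Y X on Sym(n): the dimension of the leaf through X."""
    images = []
    for Y in symmetric_basis(len(X)):
        image = sub(matmul(matmul(X, Y), N), matmul(matmul(N, Y), X))
        images.append(sum(image, []))  # flatten the n x n image into one row
    return rank(images)

# p = 1, d = 1 (n = 3): expected leaf dimension 2p(p + d) = 4.
N3 = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
X3 = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
dim3 = leaf_dimension(X3, N3)

# p = 1, d = 2 (n = 4): expected leaf dimension 2p(p + d) = 6.
N4 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
X4 = [[1, 2, 1, 0], [2, 3, 0, 1], [1, 0, 1, 0], [0, 1, 0, 2]]
dim4 = leaf_dimension(X4, N4)
print(dim3, dim4)
```

In both cases the kernel has dimension $p + d(d+1)/2$, namely $2$ out of $6$ for $n = 3$ and $4$ out of $10$ for $n = 4$, in agreement with the proof.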

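Proposition 3.3. All the leaves of the frozen Poisson bracket $\{\cdot,\cdot\}_{F_N}$ are
(i) $2p(p + d)$-dimensional if $N$ is generic, that is, all its non-zero eigenvalues are distinct, and
(ii) $p(p + 1 + 2d)$-dimensional if all non-zero eigenvalue pairs of $N$ are equal.

Proof. Proceeding as in the proof of the previous proposition and using the same notation for $N$, $X$, and $Y$, the Poisson tensor of the frozen bracket takes the form

\[
C_X(Y) = YN - NY = \begin{bmatrix} U & C \\ C^T & D \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U & C \\ C^T & D \end{bmatrix} = \begin{bmatrix} U \bar{N} - \bar{N} U & -\bar{N} C \\ C^T \bar{N} & 0 \end{bmatrix}.
\]

Thus, since $\bar{N}$ is invertible, the kernel of $C_X$ is given by all $U \in \mathrm{Sym}(2p)$, $D \in \mathrm{Sym}(d)$, $C \in M_{(2p) \times d}$ such that $C = 0$ and $U \bar{N} - \bar{N} U = 0$. Since $\bar{N}$ is non-degenerate, there exists an orthogonal matrix $Q$ such that

\[
\bar{N} = Q^T \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} Q,
\]

where $V = \operatorname{diag}(v_1, \dots, v_p)$ and $v_i \in \mathbb{R}$, $v_i \neq 0$ for all $i = 1, \dots, p$. Therefore,

\[
\begin{aligned}
0 = U \bar{N} - \bar{N} U &= U Q^T \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} Q - Q^T \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} Q U \\
&= Q^T \left( Q U Q^T \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} - \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} Q U Q^T \right) Q
\end{aligned}
\]

is equivalent to

\[
\tilde{U} \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} - \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix} \tilde{U} = 0, \tag{3.9}
\]

where $\tilde{U} := Q U Q^T \in \mathrm{Sym}(2p)$. Write

\[
\tilde{U} = \begin{bmatrix} U_{11} & U_{12} \\ U_{12}^T & U_{22} \end{bmatrix}
\]

with $U_{11}$ and $U_{22}$ symmetric $p \times p$ matrices and $U_{12}$ an arbitrary $p \times p$ matrix. Then (3.9) is equivalent to

\[
U_{22} = V U_{11} V^{-1} = V^{-1} U_{11} V \quad \text{and} \quad U_{12}^T = -V^{-1} U_{12} V = -V U_{12} V^{-1}. \tag{3.10}
\]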


(i) Assume now that $v_i \neq v_j$ if $i \neq j$. Since $V U_{11} V^{-1} = V^{-1} U_{11} V$ is equivalent to $V^2 U_{11} V^{-2} = U_{11}$, it follows that

\[
\frac{v_i^2}{v_j^2} u_{11,ij} = u_{11,ij} \quad \text{for all } i, j = 1, \dots, p,
\]

where $u_{11,ij}$ are the entries of the symmetric matrix $U_{11}$. Since the fraction on the left hand side is never equal to one for $i \neq j$, this relation implies that $u_{11,ij} = 0$ for all $i \neq j$. Thus $U_{11}$ is diagonal and $U_{22} = U_{11}$. A similar argument shows that $U_{12}$ is diagonal. However, then it follows that $U_{12} = -U_{12}^T$, which implies that $U_{12} = 0$. Therefore, the kernel of the map $U \mapsto U \bar{N} - \bar{N} U$ is $p$-dimensional.

Concluding, the dimension of every leaf of the frozen Poisson structure equals $\frac{1}{2}(2p + d)(2p + d + 1) - p - \frac{1}{2} d(d + 1) = 2p(p + d)$.

(ii) The other extreme case is when $v_i = v_j =: v$ for all $i, j = 1, \dots, p$. Then $V = vI$, where $I$ is the identity matrix, and (3.10) becomes $U_{22} = U_{11}$, $U_{12}^T = -U_{12}$. Therefore, the kernel of the map $U \mapsto U \bar{N} - \bar{N} U$ has dimension equal to $\frac{1}{2} p(p + 1) + \frac{1}{2} p(p - 1) = p^2$.

Concluding, the dimension of every leaf of the frozen Poisson structure equals $\frac{1}{2}(2p + d)(2p + d + 1) - p^2 - \frac{1}{2} d(d + 1) = p(p + 1 + 2d)$.

Casimir functions. The next job will be to determine Casimir functions for both brackets. Here is the main result.

Proposition 3.4. Let the skew-symmetric matrix $N$ have rank $2p$ and size $n := 2p + d$. Choose an orthonormal basis of $\mathbb{R}^{2p+d}$ in which $N$ is written as

\[
N = \begin{bmatrix} 0 & V & 0 \\ -V & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\]

where $V$ is a real diagonal matrix whose entries are $v_1, \dots, v_p$.

(i) If $v_i \neq v_j$ for all $i \neq j$, then $p + d(d + 1)/2$ Casimir functions for the frozen Poisson structure (3.2) are given by

\[
C_F^i(X) = \operatorname{trace}(E_i X), \quad i = 1, \dots, p + \frac{1}{2} d(d + 1),
\]

where $E_i$ is any of the matrices

\[
\begin{bmatrix} S_{kk} & 0 & 0 \\ 0 & S_{kk} & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & S_{ab} \end{bmatrix}.
\]

Here $S_{kk}$ is the $p \times p$ matrix all of whose entries are zero except the diagonal $(k, k)$ entry, which is one, and $S_{ab}$ is the $d \times d$ symmetric matrix having all entries equal to zero except for the $(a, b)$ and $(b, a)$ entries, which are equal to one.


(ii) If $v_i = v_j$ for all $i, j = 1, \dots, p$, then $p^2 + d(d + 1)/2$ Casimir functions for the frozen Poisson structure (3.2) are given by

\[
C_F^i(X) = \operatorname{trace}(E_i X), \quad i = 1, \dots, p^2 + \frac{1}{2} d(d + 1),
\]

where $E_i$ is any of the matrices

\[
\begin{bmatrix} S_{kl} & 0 & 0 \\ 0 & S_{kl} & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & A_{kl} & 0 \\ -A_{kl} & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & S_{ab} \end{bmatrix}.
\]

Here $S_{kl}$ is the $p \times p$ symmetric matrix having all entries equal to zero except for the $(k, l)$ and $(l, k)$ entries, which are equal to one, and $A_{kl}$ is the $p \times p$ skew-symmetric matrix with all entries equal to zero except for the $(k, l)$ entry, which is $1$, and the $(l, k)$ entry, which is $-1$.

(iii) Denote

\[
\bar{N} = \begin{bmatrix} 0 & V \\ -V & 0 \end{bmatrix}.
\]

The $p + d(d + 1)/2$ Casimir functions for the Lie-Poisson bracket $\{\cdot,\cdot\}_N$ on the open set $\det(B) \neq 0$ (see (2.11)) of $\mathrm{Sym}(2p + d)$ are given by

\[
C^k(X) := \frac{1}{2k} \operatorname{trace}\left[ \left( \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \right)^{2k} \right], \quad \text{for } k = 1, \dots, p,
\]

and

\[
C^k(X) = \operatorname{trace}(X E_k), \quad \text{for } k = p + 1, \dots, p + \frac{1}{2} d(d + 1),
\]

where $E_k$ is any matrix of the form

\[
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & S_{ab} \end{bmatrix}.
\]

In the special case when $N$ is full rank the Casimir functions are just

\[
C^k(X) = \frac{1}{2k} \operatorname{trace}\left[ \left( X N^{-1} \right)^{2k} \right], \quad \text{for } k = 1, \dots, p.
\]

Proof. To prove (i), recall from Proposition 3.3(i) that the kernel of the Poisson tensor $C_X$ has dimension $p + \frac{1}{2} d(d + 1)$. Moreover, if $E$ belongs to this kernel, then the linear function given by $X \mapsto \operatorname{trace}(E X)$ has gradient $E$, which is annihilated by the Poisson tensor $C_X$. Thus all $C_F^i$ are Casimir functions. Since the gradients of all these functions are the $p + \frac{1}{2} d(d + 1)$ matrices in the statement, which are obviously linearly independent, it follows that the functions $C_F^i$ form a functionally independent set of Casimir functions for the frozen bracket $\{\cdot,\cdot\}_{F_N}$. Part (ii) has an identical proof.


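Now consider Part (iii). First, we compute the gradient relative to $\langle\langle \cdot, \cdot \rangle\rangle$. We compute for any

\[
\delta X = \begin{bmatrix} \delta S & \delta A \\ (\delta A)^T & \delta B \end{bmatrix} \in \mathrm{Sym}(n, N)
\]

the derivative

\[
\begin{aligned}
DC^k(X) \cdot \delta X = \operatorname{trace}\Big[ & \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \\
& \cdot \left( (\delta S) - (\delta A) B^{-1} A^T - A B^{-1} (\delta A)^T + A B^{-1} (\delta B) B^{-1} A^T \right) \Big].
\end{aligned} \tag{3.11}
\]

Now denote

\[
\nabla C^k(X) = \begin{bmatrix} \alpha & \beta \\ \beta^T & \gamma \end{bmatrix}
\]

so that

\[
\begin{aligned}
\operatorname{trace}\left( \nabla C^k(X) (\delta X) \right) = \langle\langle \nabla C^k(X), \delta X \rangle\rangle &= \operatorname{trace}\left( \begin{bmatrix} \alpha & \beta \\ \beta^T & \gamma \end{bmatrix} \begin{bmatrix} \delta S & \delta A \\ (\delta A)^T & \delta B \end{bmatrix} \right) \\
&= \operatorname{trace}\left( \alpha (\delta S) + \beta (\delta A)^T + \beta^T (\delta A) + \gamma (\delta B) \right).
\end{aligned} \tag{3.12}
\]

By (3.11) and (3.12) we have

\[
\begin{aligned}
\alpha &= \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1}, \\
\beta &= -\bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} A B^{-1}, \\
\gamma &= B^{-1} A^T \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} A B^{-1},
\end{aligned}
\]

where in each term we have $2k$ factors of $\bar{N}^{-1}$. Therefore

\[
\nabla C^k(X) = \begin{bmatrix} \alpha & -\alpha A B^{-1} \\ -B^{-1} A^T \alpha & B^{-1} A^T \alpha A B^{-1} \end{bmatrix}
\]

with $\alpha$ given above.

Now we check that all these matrices $\nabla C^k(X)$ are in the kernel of the Lie-Poisson operator $B_X Y = XYN - NYX$. Indeed,

\[
\begin{aligned}
& X \nabla C^k(X) N - N \nabla C^k(X) X \\
&= \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \begin{bmatrix} \alpha & -\alpha A B^{-1} \\ -B^{-1} A^T \alpha & B^{-1} A^T \alpha A B^{-1} \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha & -\alpha A B^{-1} \\ -B^{-1} A^T \alpha & B^{-1} A^T \alpha A B^{-1} \end{bmatrix} \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \\
&= \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \begin{bmatrix} \alpha \bar{N} & 0 \\ -B^{-1} A^T \alpha \bar{N} & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} \alpha & -\bar{N} \alpha A B^{-1} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \\
&= \begin{bmatrix} S \alpha \bar{N} - A B^{-1} A^T \alpha \bar{N} & 0 \\ A^T \alpha \bar{N} - B B^{-1} A^T \alpha \bar{N} & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} \alpha S - \bar{N} \alpha A B^{-1} A^T & \bar{N} \alpha A - \bar{N} \alpha A B^{-1} B \\ 0 & 0 \end{bmatrix} \\
&= \begin{bmatrix} (S - A B^{-1} A^T) \alpha \bar{N} - \bar{N} \alpha (S - A B^{-1} A^T) & 0 \\ 0 & 0 \end{bmatrix}.
\end{aligned}
\]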


This vanishes if and only if

\[
(S - A B^{-1} A^T) \alpha \bar{N} - \bar{N} \alpha (S - A B^{-1} A^T) = 0. \tag{3.13}
\]

However, we know that

\[
\alpha = \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1},
\]

where the product contains $2k$ factors of $\bar{N}^{-1}$. We replace $\alpha$ with this expression in (3.13) and get

\[
\begin{aligned}
& (S - A B^{-1} A^T) \alpha \bar{N} - \bar{N} \alpha (S - A B^{-1} A^T) \\
&= (S - A B^{-1} A^T) \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \bar{N} \\
&\quad - \bar{N} \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} (S - A B^{-1} A^T) \\
&= (S - A B^{-1} A^T) \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \\
&\quad - \left( S - A B^{-1} A^T \right) \bar{N}^{-1} \cdots \bar{N}^{-1} \left( S - A B^{-1} A^T \right) \bar{N}^{-1} (S - A B^{-1} A^T) = 0,
\end{aligned}
\]

since both terms are equal; each one contains $2k - 1$ factors of $\bar{N}^{-1}$.

However, $\mathfrak{sp}(2p, \bar{N}^{-1})$ is identified with the subalgebra consisting of the $(1, 1)$ blocks of elements of $\mathrm{Sym}(n, N)$ (see Proposition 2.4). The isomorphism $S \in \mathrm{Sym}(2p, \bar{N}) \mapsto \bar{N} S \in \mathfrak{sp}(2p, \bar{N}^{-1})$ given in Proposition 2.3 identifies the basis of $p$ Casimirs in the dual of $\mathfrak{sp}(2p, \bar{N}^{-1})$ (given by the even traces of the powers of a matrix) with the functions $S \mapsto \operatorname{trace}\left[ (S \bar{N}^{-1})^{2k} \right] / 2k$. Therefore the functions $C^k$ for $k = 1, \dots, p$ given in the statement of the proposition are functionally independent Casimirs for the Lie-Poisson bracket of $\mathrm{Sym}(n, N)$.

To see that the remaining functions $C^k(X) = \operatorname{trace}(X E_k)$ are Casimirs observe that in this case

\[
\nabla C^k(X) = \begin{bmatrix} 0 & 0 \\ 0 & S_{ab} \end{bmatrix}
\]

and

\[
B_X(\nabla C^k(X)) = \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & S_{ab} \end{bmatrix} \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & S_{ab} \end{bmatrix} \begin{bmatrix} S & A \\ A^T & B \end{bmatrix} = 0.
\]

Since the matrices $S_{ab}$ span the symmetric $d \times d$ matrices, these Casimirs are functionally independent. The two sets of Casimirs are also independent taken together, since each set depends only on a subset of independent variables and these two sets of variables are disjoint. We have thus obtained $p + d(d + 1)/2$ Casimirs, which is the codimension of the generic leaf, thus proving that they generate the space of all Casimir functions of the Lie-Poisson bracket.

The equations in the degenerate case. If $N$ is degenerate, representing it and the matrix $X \in \mathrm{Sym}(n, N)$ as in Proposition 2.4, the equations $\dot{X} = [X^2, N]$ are equivalent to the system

\[
\begin{cases}
\dot{S} = [S^2 + A A^T, \bar{N}], \\
\dot{A} = -\bar{N} (S A + A B), \\
\dot{B} = 0.
\end{cases}
\]


Fig. 3.1. Time plot of flow in the 3 by 3 case for a, b, c, e, f, and g

Example. It is illuminating to examine the system in the lowest dimensional degenerate case, i.e. $p = 1$ and $d = 1$. Let

\[
X = \begin{bmatrix} a & e & f \\ e & b & g \\ f & g & c \end{bmatrix} =: \begin{bmatrix} S & A \\ A^T & c \end{bmatrix}
\]

and

\[
N = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} =: \begin{bmatrix} \bar{N} & 0 \\ 0 & 0 \end{bmatrix}.
\]

Then the dynamics becomes

\[
\begin{aligned}
\dot{a} &= -2(ae + eb + fg), & \dot{b} &= 2(ae + eb + fg), & \dot{c} &= 0, \\
\dot{e} &= a^2 + f^2 - b^2 - g^2, & \dot{g} &= af + ge + cf, & \dot{f} &= -(ef + bg + gc).
\end{aligned}
\]

In this case the two Casimir functions of the Lie-Poisson bracket are given by

\[
C^1 = \frac{1}{2} \left( -ba + \frac{g^2 a}{c} + e^2 - 2 \frac{fge}{c} + \frac{f^2 b}{c} \right) = -\frac{\det X}{2c},
\]

and by $C^2 = c$, so that $\dot{c} = 0$ in the equations of motion expresses the conservation of this Casimir directly.

As we shall see in forthcoming sections, the two integrals of motion which prove integrability are $\operatorname{trace}(X)$ and $\operatorname{trace}(X^2)$. We already know these are conserved since the flow is isospectral. Observe also that conservation of $\operatorname{trace}(X)$ is given by summing the first two equations of motion, while $\operatorname{trace}(X^2)/2$ is the Hamiltonian. We illustrate this example with time plots in Fig. 3.1 and two phase plots in Fig. 3.2.
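The component equations of this example can be compared with the matrix form $\dot{X} = [X^2, N]$ at a sample point. The following sketch is only a consistency check under assumed sample values for $a, b, c, e, f, g$; the function names are invented for the illustration. It also evaluates the rates of change of $\operatorname{trace}(X)$, $\operatorname{trace}(X^2)/2$ and $\det X$ along the vector field, using Jacobi's formula for the determinant.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def flow(X, N):
    """Right-hand side of X' = [X^2, N] = X^2 N - N X^2."""
    X2 = matmul(X, X)
    return sub(matmul(X2, N), matmul(N, X2))

def component_flow(a, b, c, e, f, g):
    """The six scalar equations of the example, arranged as a symmetric matrix."""
    da = -2 * (a * e + e * b + f * g)
    db = 2 * (a * e + e * b + f * g)
    dc = 0
    de = a * a + f * f - b * b - g * g
    dg = a * f + g * e + c * f
    df = -(e * f + b * g + g * c)
    return [[da, de, df], [de, db, dg], [df, dg, dc]]

def cofactor(X, i, j):
    """Signed (i, j) minor of a 3 x 3 matrix."""
    rows = [r for r in range(3) if r != i]
    cols = [k for k in range(3) if k != j]
    minor = (X[rows[0]][cols[0]] * X[rows[1]][cols[1]]
             - X[rows[0]][cols[1]] * X[rows[1]][cols[0]])
    return (-1) ** (i + j) * minor

# Sample point (arbitrary integers chosen for this check).
a, b, c, e, f, g = 2, -1, 3, 1, 4, -2
X = [[a, e, f], [e, b, g], [f, g, c]]
N = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
X_dot = flow(X, N)

components_match = X_dot == component_flow(a, b, c, e, f, g)
# d/dt trace(X) and d/dt trace(X^2)/2 = trace(X X').
trace_rate = sum(X_dot[i][i] for i in range(3))
hamiltonian_rate = sum(X[i][j] * X_dot[j][i] for i in range(3) for j in range(3))
# Jacobi's formula: d/dt det X = sum_ij cofactor_ij * X'_ij.
det_rate = sum(cofactor(X, i, j) * X_dot[i][j] for i in range(3) for j in range(3))
print(components_match, trace_rate, hamiltonian_rate, det_rate)
```

Since $\dot{c} = 0$ and $\det X$ has zero rate of change, the Casimir $C^1 = -\det X / (2c)$ is constant along the flow wherever $c \neq 0$.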


Fig. 3.2. Phase plane portraits in the 3 by 3 case projected to the a-e and the b-e planes

4. The Sectional Operator Equations and Relation to Mischenko-Fomenko Flows

It is shown that Eq. (1.1) can be mapped to a Mischenko-Fomenko type system (see [19–21] or [27]) in the case $N$ is invertible with distinct eigenvalues.

The Mischenko-Fomenko construction. Consider a semisimple complex or real split Lie algebra $\mathfrak{g}$ with Killing form $\langle \cdot, \cdot \rangle$. Let $\mathfrak{h}$ be a Cartan subalgebra, let $a, b \in \mathfrak{h}$ and $a$ be regular (i.e. its value on every root is non-zero). Define the sectional operators $C_{a,b,D} : \mathfrak{g} \to \mathfrak{g}$ by

\[
C_{a,b,D}(\xi) := \operatorname{ad}_a^{-1} \operatorname{ad}_b(\xi_1) + D(\xi_2),
\]

where $\xi = \xi_1 + \xi_2$, $\xi_2 \in \mathfrak{h}$, $\xi_1 \in \mathfrak{h}^\perp$ (the perpendicular is taken relative to the Killing form and thus $\mathfrak{h}^\perp$ is the direct sum of all the root spaces), and $D : \mathfrak{h} \to \mathfrak{h}$ is an arbitrary invertible symmetric operator on $\mathfrak{h}$. Then $C_{a,b,D} : \mathfrak{g} \to \mathfrak{g}$ is an invertible symmetric operator (relative to the Killing form) satisfying the condition

\[
[C_{a,b,D}(\xi), a] = [\xi, b] \tag{4.1}
\]

for all $\xi \in \mathfrak{g}$. The Lie-Poisson bracket on $\mathfrak{g}^* \cong \mathfrak{g}$ (the isomorphism being given by the Killing form) has the expression

\[
\{f, g\}(\xi) = -\langle \xi, [\nabla f(\xi), \nabla g(\xi)] \rangle
\]

for any $f, g \in C^\infty(\mathfrak{g})$, where $\nabla$ is taken relative to $\langle \cdot, \cdot \rangle$. Hamilton's equations for $h \in C^\infty(\mathfrak{g})$ have thus the form $\dot{\xi} = [\xi, \nabla h(\xi)]$. In particular, if

\[
h(\xi) := \frac{1}{2} \langle C_{a,b,D}(\xi), \xi \rangle,
\]

then $\nabla h(\xi) = C_{a,b,D}(\xi)$ since $C_{a,b,D}$ is $\langle \cdot, \cdot \rangle$-symmetric. Thus the equations of motion are

\[
\dot{\xi} = [\xi, C_{a,b,D}(\xi)]. \tag{4.2}
\]


Example. For $\mathfrak{g} = \mathfrak{so}(n)$, the Killing form is a multiple of the symmetric bi-invariant two-form $(\Omega_1, \Omega_2) \mapsto \operatorname{tr}(\Omega_1 \Omega_2)$, and one chooses $C^{-1}(\Omega) := J \Omega + \Omega J$ for a given diagonal matrix $J$ satisfying $J_i + J_j > 0$ if $i \neq j$. We have $[C(M), J^2] = [M, J]$ for any $M \in \mathfrak{so}(n)$, which is condition (4.1) with $a = J^2$ and $b = J$. Then $\dot{M} = [M, C(M)]$ is the $n$-dimensional rigid body equation. Note in this case that $J$ and $J^2$ are not in the Cartan subalgebra of $\mathfrak{so}(n)$, but the general theory in [20,21] deals also with this situation for any semisimple complex or real split Lie algebra; $J$ and $J^2$ are in the Cartan subalgebra (after one makes them trace zero) of $\mathfrak{sl}(n, \mathbb{C})$.

Returning to the general case, note that (4.2) can be written as

\[
\frac{d}{dt} (\xi + \lambda a) = [\xi + \lambda a, C(\xi) + \lambda b] \tag{4.3}
\]

if and only if (4.1) holds. Now it is obvious that $\xi \mapsto f_k(\xi + \lambda a)$, $k = 1, \dots, \ell := \operatorname{rank}(\mathfrak{g}) = \dim \mathfrak{h}$, are conserved on the flow of (4.3), for any element of the basis of the polynomial Casimir functions $f_1, \dots, f_\ell$ and any parameter $\lambda$. Since the $f_k$ are polynomial, it follows that the coefficients of $\lambda^i$ in the expansion of $f_k(\xi + \lambda a)$ in powers of $\lambda$ are conserved along the flow of (4.2). There are redundancies: some coefficients of $\lambda^i$ vanish and other coefficients are Casimir functions. Mischenko and Fomenko ([20,21]) proved the following result.

Theorem 4.1. Let $\mathfrak{g}$ be a semisimple complex or real split Lie algebra and $C : \mathfrak{g} \to \mathfrak{g}$ a symmetric operator satisfying (4.1). Then the Lie-Poisson system $\dot{\xi} = [\xi, C(\xi)]$ on $\mathfrak{g}$ defined by the Hamiltonian $H(\xi) = \langle C(\xi), \xi \rangle / 2$ is completely integrable on the maximal dimensional adjoint orbits of the Lie algebra $\mathfrak{g}$ and its commuting generically independent first integrals are the non-trivial coefficients of $\lambda^i$ in the polynomial $\lambda$-expansion of $f_{i,\lambda}(\xi) = f_i(\xi + \lambda a)$ which are not Casimir functions; here $f_1, \dots, f_\ell$ is the basis of the ring of polynomial invariants of $\mathfrak{g}$. In addition, all functions $f_{i,\lambda}$ commute with $H$.

A Poisson isomorphism for $N$ invertible. We want to compare the Lie-Poisson bracket (3.1) on $\mathrm{Sym}(n, N)$ with that on $\mathfrak{sp}(n, N^{-1})^*$. To obtain the Lie-Poisson bracket on $\mathfrak{sp}(n, N^{-1})^*$ we identify $\mathfrak{sp}(n, N^{-1})^*$ with $\mathfrak{sp}(n, N^{-1})$ via the invariant non-degenerate symmetric bilinear form

\[
\langle\langle Z_1, Z_2 \rangle\rangle := \operatorname{trace}(Z_1 Z_2).
\]

Therefore, the Lie-Poisson bracket on $\mathfrak{sp}(n, N^{-1})^* \cong \mathfrak{sp}(n, N^{-1})$ is given by

\[
\{\varphi, \psi\}_{\mathfrak{sp}}(Z) := -\langle\langle Z, [\nabla \varphi(Z), \nabla \psi(Z)] \rangle\rangle, \tag{4.4}
\]

where $\nabla$ is taken relative to $\langle\langle \cdot, \cdot \rangle\rangle$ and $\varphi, \psi : \mathfrak{sp}(n, N^{-1}) \to \mathbb{R}$ are smooth functions. In the following proposition, $\mathrm{Sym}(n, N)^*$ is identified with itself using the noninvariant inner product $\langle\langle \cdot, \cdot \rangle\rangle$ (see (2.12)).


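Proposition 4.2. The map $Z \in (\mathfrak{sp}(n, N^{-1}), \{\cdot,\cdot\}_{\mathfrak{sp}}) \mapsto ZN \in (\mathrm{Sym}(n, N), \{\cdot,\cdot\}_N)$ is an isomorphism of Lie-Poisson spaces.

Proof. By Proposition 2.3, the map $\Phi : (\mathrm{Sym}(n, N), [\cdot,\cdot]_N) \to (\mathfrak{sp}(n, N^{-1}), [\cdot,\cdot])$ given by $\Phi(X) := NX$ is a Lie algebra isomorphism. Therefore its dual $\Phi^* : (\mathfrak{sp}(n, N^{-1}), \{\cdot,\cdot\}_{\mathfrak{sp}}) \to (\mathrm{Sym}(n, N), \{\cdot,\cdot\}_N)$ is an isomorphism of Lie-Poisson spaces (see, e.g., [16]). Since for any $Z \in \mathfrak{sp}(n, N^{-1})$ and $Y \in \mathrm{Sym}(n, N)$ we have

\[
\langle\langle \Phi^*(Z), Y \rangle\rangle = \langle\langle Z, \Phi(Y) \rangle\rangle = \langle\langle Z, NY \rangle\rangle = \operatorname{trace}(ZNY) = \langle\langle ZN, Y \rangle\rangle,
\]

it follows that $\Phi^*(Z) = ZN$.

Since $N$ is invertible, as we have seen in Sect. 3, $\mathrm{Sym}(n, N)^*$ can be identified with itself using the ad-invariant inner product $\kappa_N$. To compute the pull-back $\Phi^\dagger : \mathfrak{sp}(n, N^{-1}) \to \mathrm{Sym}(n, N)$ if we identify $\mathrm{Sym}(n, N)^*$ with itself using $\kappa_N$, let $Z \in \mathfrak{sp}(n, N^{-1})$ and $Y \in \mathrm{Sym}(n, N)$. We get

\[
\kappa_N(\Phi^\dagger(Z), Y) = \langle\langle Z, \Phi(Y) \rangle\rangle = \langle\langle Z, NY \rangle\rangle = \operatorname{trace}(ZNY) = \kappa_N(N^{-1} Z, Y),
\]

and hence

\[
\Phi^\dagger(Z) = N^{-1} Z. \tag{4.5}
\]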

The Mischenko-Fomenko system on $(\mathfrak{sp}(n, N^{-1}), \{\cdot,\cdot\}_{\mathfrak{sp}})$. We now show that, for $N$ with distinct eigenvalues, $\Phi^*$ maps the system (1.1) to a Mischenko-Fomenko system on $(\mathfrak{sp}(n, N^{-1}), \{\cdot,\cdot\}_{\mathfrak{sp}})$. Indeed, denoting $X := \Phi^*(Z) = ZN$, we get

\[
\dot{Z} = \dot{X} N^{-1} = [X^2, N] N^{-1} = X^2 - N X^2 N^{-1} = ZNZN - NZNZN N^{-1} = [Z, NZN].
\]

The following lemma, which can easily be verified, shows that the linear invertible operator $C : \mathfrak{sp}(n, N^{-1}) \to \mathfrak{sp}(n, N^{-1})$ defined by $C(Z) = NZN$ is a sectional operator.

Lemma 4.3. The map $C$
(i) is well-defined, i.e. $NZN$ indeed belongs to $\mathfrak{sp}(n, N^{-1})$,
(ii) is symmetric relative to $\langle\langle \cdot, \cdot \rangle\rangle$,
(iii) satisfies $[C(Z), N^{-1}] = [N, Z]$,
(iv) is of the form $C_{a,b,D}$ with $a = N^{-1}$, $b = -N$, and $D$ having the same formula as $C$ on the Cartan algebra.

Applying the Mischenko-Fomenko Theorem 4.1 we get the following result:

Proposition 4.4. Let $N$ be invertible with distinct eigenvalues. The system

\[
\dot{Z} = [Z, NZN] \tag{4.6}
\]

is integrable on the maximal dimensional orbits of $\mathfrak{sp}(n, N^{-1})$ and its generically independent integrals in involution are the non-trivial coefficients of $\lambda^i$ in the polynomial expansion of $\frac{1}{k} \operatorname{tr}(Z + \lambda N^{-1})^k$ that are not Casimir functions, $k = 2, \dots, n$. The Hamiltonian for (4.6) is $H(Z) := \operatorname{trace}((ZN)^2)/2$.

Pushing forward $Z$ by the map $\Phi^*$ we obtain the following statement.


Theorem 4.5. Let $N$ be invertible with distinct eigenvalues. The equation $\dot{X} = [X^2, N]$ is an integrable Hamiltonian system on the maximal dimensional symplectic leaf of $\mathrm{Sym}(n, N)$ defined by the function $l(X) = \operatorname{tr}(X^2)/2$ relative to the Lie-Poisson bracket (3.1). The independent integrals in involution are the non-trivial coefficients of $\lambda^i$ in the polynomial expansion of $\frac{1}{k} \operatorname{tr}(X N^{-1} + \lambda N^{-1})^k$ that are not Casimir functions, $k = 2, \dots, n$.

The Mischenko-Fomenko system on the dual of $\mathrm{Sym}(n)$. For $N$ invertible we can also show that our system (1.1) is a system of Mischenko-Fomenko type directly on $\mathrm{Sym}(n, N)$ viewed as its own dual under the ad-invariant inner product $\kappa_N(X, Y) = \operatorname{trace}(N X N Y)$ defined in Eq. (2.13). Recall from Proposition 2.3 the Lie algebra isomorphism $\Phi : X \in (\mathrm{Sym}(n, N), [\cdot,\cdot]_N) \mapsto Z := NX \in (\mathfrak{sp}(n, N^{-1}), [\cdot,\cdot])$. It is easy to see that the ad-invariant inner product $\kappa_N$ on $\mathrm{Sym}(n, N)$ is pushed forward by $\Phi$ to the non-degenerate ad-invariant form given by the trace of the product on $\mathfrak{sp}(n, N^{-1})$. Therefore, the pull back $\Phi^\dagger : \mathfrak{sp}(n, N^{-1}) \to \mathrm{Sym}(n, N)$, where $\mathrm{Sym}(n, N)^*$ is identified with itself using $\kappa_N$, is an isomorphism of Lie-Poisson spaces. Hence $\Phi^\dagger(Z) = N^{-1} Z$ maps the Mischenko-Fomenko system (4.6) on $\mathfrak{sp}(n, N^{-1})$ to a Mischenko-Fomenko system on $\mathrm{Sym}(n, N)$. A direct computation shows that $M := N^{-1} Z$ satisfies (3.8).

In the ensuing sections we provide a direct proof of integrability on $\mathrm{Sym}(n, N)$ for $N$ with distinct eigenvalues but not necessarily invertible, that is, $N$ has at most one zero eigenvalue. In the invertible case, we provide a different sequence of integrals and, in addition, derive a second Hamiltonian structure for the Mischenko-Fomenko system on $\mathfrak{sp}(n, N^{-1})$.

5. Lax Pairs with Parameter

To prove that system (1.1) is integrable for any $N$ having distinct eigenvalues, we will compute its flow invariants. Bear in mind that, by virtue of the isospectral representation (1.2), we already know that the eigenvalues of $X$, or alternatively, the quantities $\operatorname{trace} X^k$ for $k = 1, 2, \dots, n - 1$, are invariants. One way to compute additional invariants is to rewrite the system as a Lax pair with a parameter. One can do this in a fashion similar to that for the generalized rigid body equations (see [15]).

Theorem 5.1. Let $\lambda$ be a real parameter. The system (1.2) is equivalent to the following Lax pair system:

\[
\frac{d}{dt} (X + \lambda N) = \left[ X + \lambda N, \; NX + XN + \lambda N^2 \right]. \tag{5.1}
\]

Proof. The proof is a computation. The only nontrivial power of $\lambda$ to check is the first. In fact, the coefficient of $\lambda$ on the right hand side of Eq. (5.1) is

\[
[N, NX + XN] + [X, N^2] = N^2 X + NXN - NXN - X N^2 + X N^2 - N^2 X = 0,
\]

which proves (5.1).
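The computation in the proof of Theorem 5.1 can be spot-checked numerically. Since $N$ is constant, the left-hand side of (5.1) is $\dot{X}$, so the claim amounts to the right-hand side being independent of $\lambda$ and equal to $[X^2, N]$. The sketch below tests this for a few integer values of $\lambda$ on random integer matrices; the helper names and the sampled values are assumptions made for the illustration.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def commutator(A, B):
    return sub(matmul(A, B), matmul(B, A))

def rand_symmetric(n, rng):
    M = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    return [[M[i][j] + M[j][i] for j in range(n)] for i in range(n)]

def rand_skew(n, rng):
    M = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    return [[M[i][j] - M[j][i] for j in range(n)] for i in range(n)]

def lax_rhs(X, N, lam):
    """Right-hand side of (5.1): [X + lam N, N X + X N + lam N^2]."""
    L = add(X, scale(lam, N))
    P = add(add(matmul(N, X), matmul(X, N)), scale(lam, matmul(N, N)))
    return commutator(L, P)

rng = random.Random(5)
n = 4
X, N = rand_symmetric(n, rng), rand_skew(n, rng)

# The flow X' = [X^2, N]; the Lax right-hand side should reproduce it for every lam.
target = commutator(matmul(X, X), N)
lax_checks = [lax_rhs(X, N, lam) == target for lam in (-2, 0, 1, 3)]
print(lax_checks)
```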


Manakov [15] noticed that the generalized rigid body equations $\dot M = [M, \Omega]$ (see §4) can be written as a Lax equation with a parameter in the form

\[
\frac{d}{dt}(M + \lambda J^2) = [M + \lambda J^2,\; \Omega + \lambda J]. \tag{5.2}
\]

Note the following contrast with our setting: in the Manakov case the system matrix $M$ is in $\mathfrak{so}(n)$ and the parameter $J$ is a symmetric matrix, while in our case $X$ is symmetric and the parameter $N \in \mathfrak{so}(n)$. For the generalized rigid body the nontrivial coefficients of $\lambda^i$, $0 < i < k$, in the traces of the powers of $M + \lambda J^2$ then yield the right number of independent integrals in involution to prove integrability of the flow on a generic adjoint orbit of $SO(n)$ (identified with the corresponding coadjoint orbit). The case $i = 0$ needs to be eliminated, because these are Casimir functions. Similarly, in our case, the nontrivial coefficients of $\lambda^i$, $0 \le i \le k$, in

\[
h^{\lambda}_k(X) := \frac{1}{k}\operatorname{trace}(X + \lambda N)^k, \qquad k = 1, 2, \ldots, n-1, \tag{5.3}
\]

yield the conserved quantities. The coefficient of $\lambda^r$, $0 \le r \le k$, in (5.3) is

\[
\operatorname{trace} \sum_{|i| = k-r} \sum_{|j| = r} X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s}, \qquad r = 0, \ldots, k, \quad k = 1, \ldots, n-1,
\]

where $i = (i_1, i_2, \ldots, i_s)$, $j = (j_1, j_2, \ldots, j_s)$ are multi-indices, $i_q, j_q = 0, 1, \ldots, k$, and $|i| = \sum_{q=1}^{s} i_q$, $|j| = \sum_{q=1}^{s} j_q$. The coefficient of $\lambda^k$ is the constant $N^k$ so it should not be counted. Thus we have $r < k$. In addition, since the trace of a matrix equals the trace of its transpose, $X \in \mathrm{Sym}(n, N)$, and $N \in \mathfrak{so}(n)$, it follows that

\[
\operatorname{trace}\left( X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s} \right) = (-1)^{|j|} \operatorname{trace}\left( N^{j_s} X^{i_s} \cdots X^{i_2} N^{j_1} X^{i_1} \right).
\]

Therefore, if $r$ is odd, then necessarily

\[
\operatorname{trace} \sum_{|i| = k-r} \sum_{|j| = r} X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s} = 0
\]

and only for even $r$ we get an invariant. Thus, we are left with the invariants

\[
h_{k,2r}(X) := \operatorname{trace} \sum_{|i| = k-2r} \sum_{|j| = 2r} X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s} \tag{5.4}
\]

for $k = 1, \ldots, n-1$, $i_q = 1, \ldots, k$, $j_q = 0, \ldots, k-1$, $r = 0, \ldots, \left[\frac{k-1}{2}\right]$, where $[\ell]$ denotes the integer part of $\ell \in \mathbb{R}$. The integrals (5.4) are thus the coefficients of $\lambda^{2r}$, $0 \le 2r < k$, in the expansion of $\frac{1}{k}\operatorname{trace}(X + \lambda N)^k$. For example, if $k = 1$ or $k = 2$ then we have one integral, the coefficient of $\lambda^0$. If $k = 3$ or $k = 4$, only the coefficients of $\lambda^2$ and $\lambda^0$ yield non-trivial integrals. If $k = 5$ or $k = 6$ it is the coefficients of $\lambda^4$, $\lambda^2$, and $\lambda^0$ that give non-trivial integrals. In general, for the power $k$ we have $\left[\frac{k+1}{2}\right]$ integrals. Recall that $k = 1, \ldots, n-1$. If $n - 1 = 2\ell$, we have hence

\[
1 + 1 + 2 + 2 + \cdots + \left[\frac{n-1+1}{2}\right] + \left[\frac{n-1+1}{2}\right] = 1 + 1 + 2 + 2 + \cdots + \ell + \ell = \ell(\ell + 1) = \frac{n-1}{2}\left(\frac{n-1}{2} + 1\right) = \frac{n-1}{2} \cdot \frac{n+1}{2}
\]

A Class of Integrable Flows on the Space of Symmetric Matrices


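integrals. If $n - 1 = 2\ell + 1$ then we have

\[
1 + 1 + 2 + 2 + \cdots + \left[\frac{n-2+1}{2}\right] + \left[\frac{n-2+1}{2}\right] + \left[\frac{n-1+1}{2}\right] = 1 + 1 + 2 + 2 + \cdots + \ell + \ell + (\ell + 1) = \ell(\ell + 1) + (\ell + 1) = (\ell + 1)^2 = \left(\frac{n}{2}\right)^2
\]

integrals. However,

\[
\left[\frac{n}{2}\right]\left[\frac{n+1}{2}\right] =
\begin{cases}
\dfrac{n-1}{2} \cdot \dfrac{n+1}{2}, & \text{if } n \text{ is odd}, \\[2mm]
\left(\dfrac{n}{2}\right)^2, & \text{if } n \text{ is even}.
\end{cases}
\]

Concluding, we have $\left[\frac{n}{2}\right]\left[\frac{n+1}{2}\right]$ invariants which are the coefficients of $\lambda^{2r}$, $0 \le 2r < k$, in the expansion of $\frac{1}{k}\operatorname{trace}(X + \lambda N)^k$ for $k = 1, \ldots, n-1$. We now address the issue of whether or not these integrals are the right candidates to prove complete integrability of the system $\dot X = [X^2, N]$.

• If $N$ is invertible, then $n = 2p$ and hence

\[
\left[\frac{n}{2}\right]\left[\frac{n+1}{2}\right] = \left[\frac{2p}{2}\right]\left[\frac{2p+1}{2}\right] = p^2 = \frac{1}{2}\left(2p^2 + p - p\right) = \frac{1}{2}\left(\dim \mathfrak{sp}(2p, N^{-1}) - \operatorname{rank} \mathfrak{sp}(2p, N^{-1})\right),
\]

which is half the dimension of the generic adjoint orbit in $\mathfrak{sp}(2p, N^{-1})$. Therefore, these conserved quantities are the right candidates to prove that this system is integrable on the generic coadjoint orbit of $\mathrm{Sym}(n, N)$. This will be proved in the next sections.

• If $N$ is non-invertible (which is equivalent to $d \neq 0$), then $n = 2p + d$ and hence

\[
\left[\frac{n}{2}\right]\left[\frac{n+1}{2}\right] = \left[\frac{2p+d}{2}\right]\left[\frac{2p+d+1}{2}\right] = \left(p + \left[\frac{d}{2}\right]\right)\left(p + \left[\frac{d+1}{2}\right]\right) = p^2 + p\left(\left[\frac{d}{2}\right] + \left[\frac{d+1}{2}\right]\right) + \left[\frac{d}{2}\right]\left[\frac{d+1}{2}\right] = p^2 + pd + \left[\frac{d}{2}\right]\left[\frac{d+1}{2}\right].
\]

The right number of integrals is $p(p + d)$ according to Proposition 3.2, so this calculation seems to indicate that there are additional integrals. The situation is not so simple since there are redundancies due to the degeneracy of $N$. Note, however, that if $d = 1$, then we do get the right number of integrals. We shall return to the study of the degenerate case in Sect. 7.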
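The counting argument above is elementary arithmetic and can be spot-checked mechanically. The sketch below is only an illustration (the function names are hypothetical): it sums the number of integrals contributed by each power $k$ and compares the result with the closed forms quoted in the text.

```python
def count_by_degree(n):
    """Sum over k = 1, ..., n-1 of [(k+1)/2], the number of integrals from power k."""
    return sum((k + 1) // 2 for k in range(1, n))

def closed_form(n):
    """The closed form [n/2][(n+1)/2]."""
    return (n // 2) * ((n + 1) // 2)

def half_generic_orbit_dim_sp(p):
    """(dim sp(2p) - rank sp(2p)) / 2, with dim = p(2p+1) and rank = p."""
    return (p * (2 * p + 1) - p) // 2

def degenerate_count(p, d):
    """p^2 + p d + [d/2][(d+1)/2], the expansion for n = 2p + d."""
    return p * p + p * d + (d // 2) * ((d + 1) // 2)

for n in range(2, 9):
    print(n, count_by_degree(n), closed_form(n))
```

For $d = 1$ the last term of `degenerate_count` vanishes, which reproduces the observation that the count equals $p(p + d)$ in the nullity-one case.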


Remark. Recall that in the special case when $N$ is invertible, we found the sequence of integrals given in Theorem 4.5. Note that these integrals have a different form from the family of integrals in (5.4). This does not necessarily mean that the two sets of functions are functionally independent.

6. Involution

In this section we prove involution of the integrals found in the previous section for arbitrary $N \in \mathfrak{so}(n)$.

Bi-Hamiltonian structure. We begin with the following observation.

Proposition 6.1. The system $\dot X = X^2N - NX^2$ is Hamiltonian with respect to the bracket $\{f, g\}_N$ defined in (3.1) using the Hamiltonian $h_2(X) := \frac{1}{2}\operatorname{trace}(X^2)$ and is also Hamiltonian with respect to the compatible bracket $\{f, g\}_{FN}$ defined in (3.2) using the Hamiltonian $h_3(X) := \frac{1}{3}\operatorname{trace}(X^3)$.

Proof. We have already implicitly checked the first statement using Euler-Poincaré theory, but here is a direct verification. We want to show that the condition $\dot f = \{f, h_2\}_N$ for any $f$ determines the equations $\dot X = X^2N - NX^2$. First note that $\frac{d}{dt} f(X) = \operatorname{trace}(\nabla f(X)\, \dot X)$. Second, since $\nabla h_2(X) = X$, the right-hand side $\{f, h_2\}_N$ becomes, by (3.1),

\[
\{f, h_2\}_N(X) = -\operatorname{trace}\left[ X\left( \nabla f(X) N X - X N \nabla f(X) \right) \right] = -\operatorname{trace}\left[ \nabla f(X) N X^2 - \nabla f(X) X^2 N \right].
\]

Thus, $\dot X = X^2N - NX^2$ as required. To show that the same system is Hamiltonian relative to the frozen Poisson bracket, we proceed in a similar way. Noting that $\nabla h_3(X) = X^2$, we get from (3.2),

\[
\{f, h_3\}_{FN}(X) = -\operatorname{trace}\left[ \nabla f N X^2 - X^2 N \nabla f \right] = -\operatorname{trace}\left[ \nabla f N X^2 - \nabla f X^2 N \right],
\]

and hence $\dot X = X^2N - NX^2$, as before.
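To make the conservation statements concrete, one can integrate the flow $\dot X = X^2N - NX^2$ numerically and monitor the quantities $\frac{1}{k}\operatorname{trace}(X + \lambda N)^k$. The sketch below is an illustration only: it uses a classical fourth-order Runge-Kutta step (a generic integrator chosen for simplicity, not a method taken from the paper), random test data, and hypothetical function names. Conservation is therefore observed only up to discretisation error.

```python
import numpy as np

def vector_field(X, N):
    """Right-hand side of the flow: X^2 N - N X^2."""
    X2 = X @ X
    return X2 @ N - N @ X2

def rk4_step(X, N, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = vector_field(X, N)
    k2 = vector_field(X + 0.5 * dt * k1, N)
    k3 = vector_field(X + 0.5 * dt * k2, N)
    k4 = vector_field(X + dt * k3, N)
    return X + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def spectral_invariants(X, N, lams=(0.0, 0.5, -1.3)):
    """Values of (1/k) trace (X + lam N)^k for k = 1, ..., n-1 and a few sample lam."""
    n = X.shape[0]
    return np.array([
        np.trace(np.linalg.matrix_power(X + lam * N, k)) / k
        for lam in lams
        for k in range(1, n)
    ])

rng = np.random.default_rng(1)   # arbitrary seed
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
X0 = A + A.T
X0 /= np.linalg.norm(X0)         # symmetric, unit Frobenius norm
N = B - B.T
N /= np.linalg.norm(N)           # skew-symmetric, unit Frobenius norm

X = X0.copy()
dt, steps = 1e-3, 2000
for _ in range(steps):
    X = rk4_step(X, N, dt)

drift = np.max(np.abs(spectral_invariants(X, N) - spectral_invariants(X0, N)))
asymmetry = np.max(np.abs(X - X.T))
print(drift, asymmetry)
```

Both printed numbers should be tiny: the first measures how well the sampled invariants are preserved, the second confirms that the numerical solution stays symmetric.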

Involution. Next we begin the proof that the $\left[\frac{n}{2}\right]\left[\frac{n+1}{2}\right]$ integrals given in (5.4), namely

\[
h_{k,2r}(X) := \operatorname{trace} \sum_{|i| = k-2r} \sum_{|j| = 2r} X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s},
\]

where $k = 1, \ldots, n-1$, $i_q = 1, \ldots, k$, $j_q = 0, \ldots, k-1$, $r = 0, \ldots, \left[\frac{k-1}{2}\right]$, are in involution. It will be convenient below to write the expansion of $h^{\lambda}_k$ starting with the highest power of $\lambda$, that is,

\[
h^{\lambda}_k(X) = \frac{1}{k}\operatorname{trace}(X + \lambda N)^k = \sum_{r=0}^{k} \lambda^{k-r} h_{k,k-r}(X). \tag{6.1}
\]
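The coefficients in the expansion (6.1) can be computed by brute force for small $k$. The sketch below is an illustration under stated assumptions (random test matrices, a hypothetical helper `h_coefficients`): it enumerates all words of length $k$ in the letters $X$ and $N$, groups them by the number of factors $N$, and then checks that the odd coefficients vanish and that the $\lambda^2$ coefficients for $k = 3, 4$ agree with the closed forms $\operatorname{trace}(N^2X)$ and $\operatorname{trace}(N^2X^2) + \frac{1}{2}\operatorname{trace}(NXNX)$ used later in the text.

```python
import itertools
import numpy as np

def h_coefficients(X, N, k):
    """Return [h_{k,0}, ..., h_{k,k}]: coefficients of lam^m in (1/k) trace (X + lam N)^k."""
    n = X.shape[0]
    coeffs = np.zeros(k + 1)
    for word in itertools.product((0, 1), repeat=k):   # 0 stands for X, 1 for N
        M = np.eye(n)
        for letter in word:
            M = M @ (N if letter else X)
        coeffs[sum(word)] += np.trace(M) / k
    return coeffs

rng = np.random.default_rng(2)   # arbitrary seed
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
X = A + A.T
X /= np.linalg.norm(X)
N = B - B.T
N /= np.linalg.norm(N)

c3 = h_coefficients(X, N, 3)
c4 = h_coefficients(X, N, 4)

h32_closed = np.trace(N @ N @ X)
h42_closed = np.trace(N @ N @ X @ X) + 0.5 * np.trace(N @ X @ N @ X)

print(c3, c4)
```

The enumeration costs $2^k$ matrix products, so it is meant only as a check for small $k$, not as a practical way of evaluating the integrals.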


As explained before, not all of these coefficients should be counted: roughly half of them vanish and the last one, namely, $h_{k,k}$, is the constant $N^k$. Consistently with our notation for the Hamiltonians, we set $h_k = h_{k,0}$. Firstly we require the gradients of the functions $h^{\lambda}_k$.

Lemma 6.2. The gradients $\nabla h^{\lambda}_k$ are given by

\[
\nabla h^{\lambda}_k(X) = \frac{1}{2}(X + \lambda N)^{k-1} + \frac{1}{2}(X - \lambda N)^{k-1}. \tag{6.2}
\]

Proof. We have for any $Y \in \mathrm{Sym}(n, N)$,

\[
\left\langle \nabla h^{\lambda}_k(X), Y \right\rangle = dh^{\lambda}_k(X) \cdot Y = \operatorname{trace}\left( (X + \lambda N)^{k-1} Y \right) = \operatorname{trace}\left( \frac{1}{2}\left[ (X + \lambda N)^{k-1} + (X - \lambda N)^{k-1} \right] Y \right).
\]

Since $\langle\,,\rangle$ is non-degenerate on $\mathrm{Sym}(n, N)$, the result follows.

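Proposition 6.3.

\[
B_X(\nabla h^{\lambda}_k(X)) = C_X(\nabla h^{\lambda}_{k+1}(X)). \tag{6.3}
\]

Proof. Using (3.3) we get

\[
\begin{aligned}
B_X(\nabla h^{\lambda}_k(X)) &= X \nabla h^{\lambda}_k(X) N - N \nabla h^{\lambda}_k(X) X \\
&= \frac{1}{2}\Big[ X(X + \lambda N)^{k-1}N + X(X - \lambda N)^{k-1}N - N(X + \lambda N)^{k-1}X - N(X - \lambda N)^{k-1}X \Big] \\
&= \frac{1}{2}\Big[ (X + \lambda N)^{k}N - \lambda N(X + \lambda N)^{k-1}N + (X - \lambda N)^{k}N + \lambda N(X - \lambda N)^{k-1}N \\
&\qquad - N(X + \lambda N)^{k} + \lambda N(X + \lambda N)^{k-1}N - N(X - \lambda N)^{k} - \lambda N(X - \lambda N)^{k-1}N \Big] \\
&= \frac{1}{2}\Big[ (X + \lambda N)^{k}N + (X - \lambda N)^{k}N - N(X + \lambda N)^{k} - N(X - \lambda N)^{k} \Big] \\
&= \nabla h^{\lambda}_{k+1}(X) N - N \nabla h^{\lambda}_{k+1}(X) = C_X(\nabla h^{\lambda}_{k+1}(X))
\end{aligned}
\]

by (3.4), which proves the formula.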
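The recursion (6.3) is a polynomial identity in $X$, $N$, and $\lambda$, so it lends itself to a direct numerical check. In the sketch below the operators are taken in the form in which they appear in the proof above, $B_X(Y) = XYN - NYX$ and $C_X(Y) = YN - NY$, and the gradient is formula (6.2); the test matrices and the Python function names are illustrative assumptions.

```python
import numpy as np

def grad_h_lambda(X, N, lam, k):
    """Formula (6.2): (1/2)(X + lam N)^(k-1) + (1/2)(X - lam N)^(k-1)."""
    power = np.linalg.matrix_power
    return 0.5 * (power(X + lam * N, k - 1) + power(X - lam * N, k - 1))

def B_X(X, N, Y):
    """B_X(Y) = X Y N - N Y X, as used in the proof of Proposition 6.3."""
    return X @ Y @ N - N @ Y @ X

def C_X(N, Y):
    """C_X(Y) = Y N - N Y, as used in the proof of Proposition 6.3."""
    return Y @ N - N @ Y

rng = np.random.default_rng(3)   # arbitrary seed
n = 5
A = rng.standard_normal((n, n))
S = rng.standard_normal((n, n))
X = A + A.T
X /= np.linalg.norm(X)
N = S - S.T
N /= np.linalg.norm(N)

recursion_ok = all(
    np.allclose(
        B_X(X, N, grad_h_lambda(X, N, lam, k)),
        C_X(N, grad_h_lambda(X, N, lam, k + 1)),
        atol=1e-12,
    )
    for k in range(1, 6)
    for lam in (-1.5, 0.0, 0.4, 1.0)
)
print(recursion_ok)
```

Because the identity holds for every $\lambda$, comparing coefficients of like powers of $\lambda$ gives the coefficient-wise recursion stated next in the text.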

Proposition 6.4. The functions $h_{k,k-r}$ satisfy the recursion relation

\[
B_X(\nabla h_{k,k-r}(X)) = C_X(\nabla h_{k+1,k-r}(X)). \tag{6.4}
\]

Proof. Substituting (6.1) into (6.3) we obtain

\[
\sum_{r=0}^{k} \lambda^{k-r} B_X\left( \nabla h_{k,k-r}(X) \right) = \sum_{r=0}^{k+1} \lambda^{k+1-r} C_X\left( \nabla h_{k+1,k+1-r}(X) \right).
\]

Since $\nabla h_{k+1,k+1}(X) = N^{k+1}$, formula (3.4) implies that $C_X\left( \nabla h_{k+1,k+1}(X) \right) = 0$. Thus on the right-hand side the sum begins at $r = 1$. Changing the summation index on the right-hand side from $r$ to $r - 1$ and identifying the coefficients of like powers of $\lambda$ yields (6.4).


Remark. It is worth making a few remarks about Propositions 6.3 and 6.4. Note that unlike the similar recursion for the rigid body Manakov integrals (see e.g. [27 and 22]), our polynomial recursion relation (6.3) does not have a premultiplier $\lambda$ on the right-hand side and the polynomials on the left- and right-hand sides appear to be of different order. This cannot be and indeed is not so: the highest order coefficient on the right-hand side vanishes by virtue of the following result.

Corollary 6.5. The functions $h_{k,k-1}(X)$ are Casimirs for the frozen Poisson structure, i.e.

\[
C_X\left( \nabla h_{k,k-1}(X) \right) = 0 \tag{6.5}
\]

for all $k$.

Proof. By (6.1), $h_{k,k-1}(X) = \operatorname{trace}(N^{k-1}X)$, so its gradient equals $\nabla h_{k,k-1}(X) = N^{k-1}$. So (3.4) immediately gives (6.5).

The recursion relations (6.4) for $r = 0$ also imply the following relation between the Hamiltonians that can also be easily checked by hand:

Corollary 6.6.

\[
B_X(\nabla h_k(X)) = C_X(\nabla h_{k+1}(X)). \tag{6.6}
\]

Example. An interesting nontrivial example of the recursion relation to check is $B_X(dh_{3,2}(X)) = C_X(dh_{4,2}(X))$, where $h_{3,2}(X) = \operatorname{trace}(N^2X)$ and $h_{4,2}(X) = \operatorname{trace}(N^2X^2) + \frac{1}{2}\operatorname{trace}(NXNX)$. This example illustrates how the recursion relation works despite the apparent inconsistency in order.

Involution follows immediately, using the recursion relations.

Proposition 6.7. The invariants $h_{k,k-r}$ are in involution with respect to both Poisson brackets $\{f, g\}_N$ and $\{f, g\}_{FN}$.

Proof. The definition of the Poisson tensors $B_X$ and $C_X$ and the recursion relation (6.4) give

\[
\begin{aligned}
\{h_{k,k-r}, h_{l,l-q}\}_N &= \left\langle \nabla h_{k,k-r}(X), B_X(\nabla h_{l,l-q}(X)) \right\rangle \\
&= \left\langle \nabla h_{k,k-r}(X), C_X(\nabla h_{l+1,l-q}(X)) \right\rangle \\
&= \{h_{k,k-r}, h_{l+1,l-q}\}_{FN} = -\{h_{l+1,l-q}, h_{k,k-r}\}_{FN} \\
&= -\left\langle \nabla h_{l+1,l-q}(X), C_X(\nabla h_{k,k-r}(X)) \right\rangle \\
&= -\left\langle \nabla h_{l+1,l-q}(X), B_X(\nabla h_{k-1,k-r}(X)) \right\rangle \\
&= -\{h_{l+1,l-q}, h_{k-1,k-r}\}_N = \{h_{k-1,k-r}, h_{l+1,l-q}\}_N
\end{aligned}
\]

for any $k, l = 1, \ldots, n-1$, $r = 1, \ldots, k$ and $q = 0, \ldots, l-1$. Of course, in these relations we assume that $k - r$ and $l - q$ are even, for if at least one of them is odd, the identity above has zeros on both sides. Repeated application of this relation eventually leads to Hamiltonians $h_{k,k-r}$, where either $k - r$ is a power of $\lambda$ that does not exist for $k$, in which case the Hamiltonian is zero, or one is led to $h_{0,0}$ which is constant. This shows that $\{h_{k,k-r}, h_{l,l-q}\}_N = 0$ for any pair of indices. In a similar way one shows that $\{h_{k,k-r}, h_{l,l-q}\}_{FN} = 0$.
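Involution can be probed numerically for small $n$. The sketch below makes several assumptions that should be read as such: the pairing is the trace pairing, the brackets are evaluated as $\{f, g\}_N = \operatorname{trace}(\nabla f\, B_X(\nabla g))$ and $\{f, g\}_{FN} = \operatorname{trace}(\nabla f\, C_X(\nabla g))$ with $B_X$, $C_X$ as in the proof of Proposition 6.3, and the gradient of $h_{k,m}$ (for even $m$) is obtained from (6.2) as the sum of all words of length $k - 1$ containing $m$ factors $N$. The helper names and test data are hypothetical.

```python
import itertools
import numpy as np

def grad_h(X, N, k, m):
    """Coefficient of lam^m (m even) in (6.2): all words of length k-1 with m letters N."""
    n = X.shape[0]
    G = np.zeros((n, n))
    for word in itertools.product((0, 1), repeat=k - 1):   # 0 stands for X, 1 for N
        if sum(word) != m:
            continue
        M = np.eye(n)
        for letter in word:
            M = M @ (N if letter else X)
        G += M
    return G

def bracket_N(X, N, F, G):
    """trace(F B_X(G)) with B_X(G) = X G N - N G X."""
    return np.trace(F @ (X @ G @ N - N @ G @ X))

def bracket_FN(N, F, G):
    """trace(F C_X(G)) with C_X(G) = G N - N G."""
    return np.trace(F @ (G @ N - N @ G))

rng = np.random.default_rng(4)   # arbitrary seed
n = 5
A = rng.standard_normal((n, n))
S = rng.standard_normal((n, n))
X = A + A.T
X /= np.linalg.norm(X)
N = S - S.T
N /= np.linalg.norm(N)

# Index pairs (k, m) with m even and m < k, i.e. the integrals h_{k,2r} of (5.4).
indices = [(k, m) for k in range(1, n) for m in range(0, k, 2)]
grads = [grad_h(X, N, k, m) for k, m in indices]

max_bracket_N = max(abs(bracket_N(X, N, F, G)) for F in grads for G in grads)
max_bracket_FN = max(abs(bracket_FN(N, F, G)) for F in grads for G in grads)

# The example from the text: B_X(grad h_{3,2}) = C_X(grad h_{4,2}).
G32, G42 = grad_h(X, N, 3, 2), grad_h(X, N, 4, 2)
example_ok = np.allclose(X @ G32 @ N - N @ G32 @ X, G42 @ N - N @ G42, atol=1e-12)
print(max_bracket_N, max_bracket_FN, example_ok)
```

A numerical test of this kind only samples one point $X$; it illustrates the proposition but does not replace the recursion argument.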


Bi-Hamiltonian structure on $\mathfrak{sp}(n, N^{-1})$. Using the bi-Hamiltonian property of system (1.1) and the Poisson isomorphism in Proposition 4.2 we get the following statement:

Theorem 6.8. The Lie-Poisson isomorphism $Z \in \left( \mathfrak{sp}(n, N^{-1}), \{\cdot, \cdot\}_{\mathfrak{sp}} \right) \mapsto ZN \in \left( \mathrm{Sym}(n, N), \{\cdot, \cdot\}_N \right)$ induces a bi-Hamiltonian structure for the Mischenko-Fomenko equations (4.6) on $\mathfrak{sp}(n, N^{-1})$. The second Hamiltonian structure is

\[
\{f, g\}_{N^{-1}}(Z) = -\operatorname{trace}\left( N^{-1}[\nabla f(Z), \nabla g(Z)] \right)
\]

for any $f, g \in C^{\infty}(\mathfrak{sp}(n, N^{-1}))$ and the Hamiltonian corresponding to this Poisson structure is $h(Z) = \operatorname{trace}\left( (ZN)^3 \right)/3$.

7. Independence

To complete the proof of integrability we need to show that the integrals $h_{k,2r}$ are independent. We will demonstrate this first in the generic case when $N$ is invertible with distinct eigenvalues. By (5.4), the gradients of the integrals $h_{k,2r}$ have the form

\[
\nabla h_{k,2r}(X) := \sum_{|i| = k-2r-1} \sum_{|j| = 2r} X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s}, \tag{7.1}
\]

where $k = 1, \ldots, n-1$, $i_q = 1, \ldots, k$, $j_q = 0, \ldots, k-1$, $r = 0, \ldots, \left[\frac{k-1}{2}\right]$.
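Generic linear independence of the gradients (7.1) can be illustrated by a rank computation at a randomly chosen point. The following sketch assumes NumPy, random symmetric $X$ and skew-symmetric $N$, and a hypothetical helper `grad_h` that builds each gradient as the sum of all words of length $k - 1$ with $2r$ factors $N$; it covers $n = 4$ (generically $N$ invertible) and $n = 5$ (nullity one).

```python
import itertools
import numpy as np

def grad_h(X, N, k, m):
    """Sum of all words of length k-1 in X and N containing exactly m letters N."""
    n = X.shape[0]
    G = np.zeros((n, n))
    for word in itertools.product((0, 1), repeat=k - 1):   # 0 stands for X, 1 for N
        if sum(word) != m:
            continue
        M = np.eye(n)
        for letter in word:
            M = M @ (N if letter else X)
        G += M
    return G

def gradient_rank(n, seed):
    """Rank of the family {grad h_{k,2r}} at a random point (X, N)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    S = rng.standard_normal((n, n))
    X = A + A.T          # symmetric
    N = S - S.T          # skew-symmetric
    rows = [
        grad_h(X, N, k, m).ravel()
        for k in range(1, n)
        for m in range(0, k, 2)      # m = 2r with 2r < k
    ]
    return np.linalg.matrix_rank(np.array(rows)), len(rows)

rank4, count4 = gradient_rank(4, seed=5)
rank5, count5 = gradient_rank(5, seed=6)
print(rank4, count4, rank5, count5)
```

Full rank at one sample point is consistent with independence on an open dense set, but it is only evidence; the argument in the text is what establishes the result.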

The generic case. We consider the case $N$ invertible with distinct eigenvalues. Therefore $d = 0$ and $n = 2p$. In this case we show that the integrals $h_{k,2r}$ given in (5.4) are independent, and hence the system (1.1) is integrable.

Theorem 7.1. For $N$ invertible with distinct eigenvalues, the integrals $h_{k,2r}$ given by Eq. (5.4) are independent.

Proof. We are concerned with the linear independence (in a generic sense) of (7.1), where $k = 1, \ldots, n-1$, $i_q = 1, \ldots, k$, $j_q = 0, \ldots, k-1$ and $r = 0, \ldots, \left[\frac{1}{2}(k-1)\right]$. We recall that $N$ is invertible with distinct eigenvalues and, without loss of generality, assume that $X$ is diagonal, $X = \operatorname{diag} \mu$. This reduces the statement of the theorem to a problem about the independence of polynomials in a single matrix variable. Now, we aim to prove a stronger statement: the terms

\[
v_{i,j} = X^{i_1} N^{j_1} X^{i_2} \cdots X^{i_s} N^{j_s}
\]

are independent for all multi-indices $i$ and $j$ in the above range. Note however that each $v_{i,j}$ is a $q$-degree polynomial in $\mu_1, \mu_2, \ldots, \mu_n$, where $q = k - 2r - 1 \in \{0, \ldots, n-2\}$. Let

\[
H_q = \{ v_{i,j} \mid |i| = q, \ |j| \text{ even} \}.
\]

Clearly, in a generic sense, if linear dependence exists, it must exist within the set $H_q$. In other words, if we can prove that there is no linear dependence within each $H_q$, we are done. (Note that since $k \le n-1$ in the expression (7.1) there is no dependence of powers of $X$ on lower powers through the characteristic polynomial.)

There is nothing to prove for $q = 0$. For $q = 1$ we have

\[
H_1 = \{ XN^j \mid j \text{ even} \} \cup \{ N^jX \mid j \text{ even} \}.
\]

Suppose that there exists linear dependence in $H_1$. Then there necessarily exist $\rho_0, \rho_2, \ldots, \rho_{n-2}$ and $\kappa_0, \kappa_2, \ldots, \kappa_{n-2}$, not all zero, such that

\[
X \sum_j \rho_{2j} N^{2j} + \sum_j \kappa_{2j} N^{2j} X = X R(N) + K(N) X = 0.
\]

Therefore,

\[
\mu_a [R(N)]_{a,b} + [K(N)]_{a,b}\, \mu_b = 0, \qquad a, b = 1, \ldots, n.
\]

Generically (i.e., for all $\mu$ except for a set of measure zero) this can hold only if $R(N), K(N) = 0$. But $\deg R, \deg K \le n-1$ and, since the eigenvalues of $N$ are distinct, the degree of the minimal polynomial of $N$ is $n$. Therefore $K, R \equiv 0$, a contradiction. Hence there is no linear dependence.

We continue to $q = 2$. Now

\[
H_2 = \{ X^{i_1} N^{j_1} X^{i_2} N^{j_2} X^{i_3} : i_1 + i_2 + i_3 = 2, \ j_1 + j_2 \text{ even} \}.
\]

Assume that there exist $\rho_{i,j}$, not all zero, such that

\[
\sum_{i,j} \rho_{i,j} X^{i_1} N^{j_1} X^{i_2} N^{j_2} X^{i_3} = 0.
\]

Therefore

\[
\sum_{i,j} \rho_{i,j} \sum_{b} \mu_a^{i_1} \mu_b^{i_2} \mu_c^{i_3} (N^{j_1})_{a,b} (N^{j_2})_{b,c} = 0, \qquad a, c = 1, \ldots, n.
\]

Note that we want the above to hold for all real $\mu_k$, but this is possible only if

\[
0 = \sum_{i,j} \rho_{i,j} \sum_{b} (N^{j_1})_{a,b} (N^{j_2})_{b,c} = \sum_{i,j} \rho_{i,j} (N^{j_1 + j_2})_{a,c}, \qquad a, c = 1, \ldots, n,
\]

thus

\[
\sum_{i,j} \rho_{i,j} N^{j_1 + j_2} = 0.
\]

We again obtain a polynomial in $N^2$ of degree $< n/2$, which cannot be zero: a contradiction. We can continue for higher $s$ in an identical manner.

Hence, since we have involution and independence, we have proved the following.

Theorem 7.2. For $N$ invertible with distinct eigenvalues the system (1.1) is completely integrable.

Corollary 7.3. For $N$ odd dimensional with distinct eigenvalues and nullity one, the system (1.1) is completely integrable.


Proof. In this case we have $d = 1$ and $n = 2p + 1$. All eigenvalues are distinct with one of them being zero. The above proof of independence still holds, the only change being that the characteristic (and minimal) polynomial of $N$ is of the form $N w(N^2)$, where $w$ is a polynomial of degree $(n-1)/2$.

Remark. Independently Li and Tomei [14] have shown the integrability of the same system in precisely the two cases discussed in this paper employing different techniques; they use the loop group approach suggested by the Lax equation with parameter (5.1) and give the solution in terms of factorization and the Riemann-Hilbert problem.

8. Linearization of the Flow

We have demonstrated integrability of the system (1.1) for appropriate $N$ by showing involution and independence of a sufficient number of integrals. The purpose of this section is to analyze the linearization of this system on the Jacobi variety of the curve $\det(zI - \lambda N - X) = 0$ using the theory discussed in [3 and 11], for example (see also [1,9,12,13]).

Linearization on the Jacobian for N invertible and generic. Let us denote $X(\lambda) := X + \lambda N$ and $Y(\lambda) := NX + XN + \lambda N^2$. For $N$ invertible with distinct eigenvalues ($n := 2p$), choose an orthonormal basis of $\mathbb{R}^{2p}$ in which $N$ is written as

\[
N = \begin{pmatrix} 0 & V \\ -V & 0 \end{pmatrix},
\]

where $V$ is a real diagonal matrix whose entries are $v_1, \ldots, v_p$. Denote by $x_{k,l}$ the entries of the matrix $X$ and put it in the form

\[
X = \begin{pmatrix} U & C \\ C^T & R \end{pmatrix},
\]

where $U \in \mathrm{Sym}(p)$, $R \in \mathrm{Sym}(p)$, and $C \in M_{p \times p}$. Then the matrix $Y(\lambda)$ can be written as

\[
Y(\lambda) = \begin{pmatrix} -\lambda V^2 + VC^T - CV & VR + UV \\ -VU - RV & -\lambda V^2 + C^TV - VC \end{pmatrix}.
\]

The plane algebraic curve (called a spectral curve), associated to each $X(\lambda)$, namely,

\[
\Gamma_{X(\lambda)} := \{ (\lambda, z) \in \mathbb{C} \times \mathbb{C} \mid \det(zI - X(\lambda)) = 0 \},
\]

is preserved by the flow of (5.1); the functions which are defined by the coefficients of the characteristic polynomial $Q(\lambda, z)$ of $X(\lambda)$ are constants of motion of (5.1). Similarly, for each $X(\lambda)$ the isospectral variety of matrices $A_{X(\lambda)}$ defined by

\[
A_{X(\lambda)} := \{ X'(\lambda) \mid X'(\lambda) \text{ and } X(\lambda) \text{ have the same characteristic polynomial} \}
\]

is preserved by the flow of (5.1). Notice that the spectral curve and the isospectral variety depend on the values of the constants of motion only (i.e., on the vector $c = (q_{kl})$, where $q_{kl}$ is the coefficient of $\lambda^k z^l$ in $Q(\lambda, z)$). Sometimes one writes $\Gamma_c$ and $A_c$ instead of $\Gamma_{X(\lambda)}$ and $A_{X(\lambda)}$. Notice that the spectral curve $\Gamma_c$ is non-singular for generic values of $c$. Let $\bar\Gamma_c$ be the compactification in the projective plane $\mathbb{P}^2_{\mathbb{C}}$ of $\Gamma_c$. For generic values of $c$ the projective curve $\bar\Gamma_c$ is also non-singular.

Let us compute the points at infinity of the spectral curve. The equation of the affine spectral curve is:

\[
z^{2p} + v_1^2 v_2^2 \cdots v_p^2\, \lambda^{2p} + Q_1(\lambda, z) = 0, \tag{8.1}
\]

where the polynomial $Q_1(\lambda, z)$ has degree strictly less than $2p$. Put $\lambda = \nu/z_0$ and $z = \zeta/z_0$. Now, set $z_0 = 0$ in the equation

\[
z_0^{2p}\, Q(\nu/z_0, \zeta/z_0) = 0
\]

of the projective spectral curve $\bar\Gamma_c$. We get the points at infinity $\{P_1, \ldots, P_{2p}\} := \bar\Gamma_c \setminus \Gamma_c$, with $P_{k+1} = (1, \beta_{k+1}, 0)$, $k = 0, 1, \ldots, 2p-1$, where

\[
\beta_{k+1} := v^{1/p} \exp\left( i\, \frac{(2k+1)\pi}{2p} \right) \quad \text{and} \quad v := |v_1 v_2 \cdots v_p|.
\]

At each of these points the meromorphic functions $\lambda$ and $z$ on $\bar\Gamma_c$ have a pole of order 1. Note also that the genus of the plane curve $\bar\Gamma_c$ is $g := (p-1)(2p-1)$ (the genus of a non-singular plane curve is given by the well-known formula $g = (n-1)(n-2)/2$, where $n$ is the degree of the homogeneous polynomial equation of the curve; see also [11]).

Take now a generic value of the vector $c$ such that $\bar\Gamma_c$ is non-singular and note that for generic $(\lambda, z) \in \Gamma_c$, the eigenspace of $X(\lambda)$ with eigenvalue $z$ is one-dimensional. If we denote by $\Delta_{kl}(z, X(\lambda))$ the cofactor of the matrix $zI_{2p} - X(\lambda)$ corresponding to the $(k, l)$th entry, then the unique eigenvector of $X(\lambda)$ with eigenvalue $z$, normalized by $\xi_1 = 1$, is $\xi(z, X(\lambda)) := (\xi_1, \ldots, \xi_{2p})^T$, where $\xi_k = \Delta_{1k}(z, X(\lambda))/\Delta_{11}(z, X(\lambda))$. By [3], p. 187, when $X(\lambda, t)$ flows according to (5.1), the corresponding eigenvector $\xi(t) := \xi(z, X(\lambda, t))$ satisfies the autonomous equation

\[
\dot\xi + Y\xi = \rho\,\xi,
\]

where $Y := Y(\lambda, X(\lambda, t))$ and $\rho$ is the scalar function

\[
\rho := \rho(\lambda, z, X(\lambda, t)) = \sum_{l=1}^{2p} Y(\lambda, X(\lambda, t))_{1l}\, \frac{\Delta_{1l}(z, X(\lambda, t))}{\Delta_{11}(z, X(\lambda, t))}.
\]

The role of the eigenvector $\xi$ is to define the divisor map

\[
i_c : A_c \to \mathrm{Div}^d(\bar\Gamma_c), \qquad X(\lambda) \mapsto D_{X(\lambda)},
\]

where $D_{X(\lambda)}$ is the minimal effective divisor on $\bar\Gamma_c$ such that $(\xi_k)_{\bar\Gamma_c} \ge -D_{X(\lambda)}$, $k = 1, \ldots, 2p$.


Here, $d := \deg(D_{X(\lambda)})$ is independent of $X(\lambda) \in A_c$ (for generic $c$ we can assume $A_c$ connected) and so, $D_{X(\lambda)}$ defines an effective divisor of degree $d$ in $\bar\Gamma_c$. Now choose and fix a divisor $D_0 \in \mathrm{Div}^d(\bar\Gamma_c)$, a basis $(\omega_1, \ldots, \omega_g)$ of holomorphic differentials on $\bar\Gamma_c$, and consider the vector $\omega := (\omega_1, \ldots, \omega_g)^T$. One defines the linearizing map by

\[
j_c : A_c \to \mathrm{Jac}(\bar\Gamma_c), \qquad X \mapsto \int_{D_0}^{D_X} \omega,
\]

where $\mathrm{Jac}(\bar\Gamma_c)$ denotes the Jacobian of the curve $\bar\Gamma_c$. The role of the function $\rho$ is to linearize the isospectral flow of (5.1) on $A_c$, that is, to be able to write

\[
\int_{D_{X(0)}}^{D_{X(t)}} \omega = t \sum_{k=1}^{2p} \operatorname{Res}_{P_k}\left( \rho(\lambda, z, X(\lambda, 0))\, \omega \right), \qquad D_{X(0)} = D_0,
\]

if it is possible. The Linearization Criterion in [3], p. 195 says that this happens if and only if for each $X \in A_c$ there exists a meromorphic function $\Phi_X$ on $\bar\Gamma_c$ with

\[
(\Phi_X)_{\bar\Gamma_c} \ge -\sum_{k=1}^{2p} P_k,
\]

such that for all $P_k$,

\[
(\text{Laurent tail of } d\rho(\lambda, z, X)/dt \text{ at } P_k) = (\text{Laurent tail of } \Phi_X \text{ at } P_k);
\]

see also [11].

Now we shall apply the linearization criterion to our case. Firstly, we have:

\[
\Delta_{11}(z, X(\lambda)) = z^{2p-1} + v_2^2 \cdots v_p^2\, z \lambda^{2p-2} + Q_{11}(z, \lambda),
\]

where the polynomial $Q_{11}(z, \lambda)$ has degree strictly less than $2p - 1$. Then we compute

\[
\Delta_{12}(z, X(\lambda)) = M_{12}(z, \lambda) + Q_{12}(z, \lambda),
\]

where the polynomial $Q_{12}(z, \lambda)$ has degree strictly less than $2p - 2$ and the homogeneous polynomial

\[
M_{12}(z, \lambda) = -x_{1,2}\, z^{2p-2} + \cdots + x_{p+1,p+2}\, v_1 v_2 v_3^2 \cdots v_p^2\, \lambda^{2p-2}
\]

has degree $2p - 2$. Similarly, we get for $l = 3, \ldots, 2p$, $l \neq p + 1$,

\[
\Delta_{1l}(z, X(\lambda)) = M_{1l}(z, \lambda) + Q_{1l}(z, \lambda),
\]

where the polynomial $Q_{1l}(z, \lambda)$ has degree strictly less than $2p - 2$ and the homogeneous polynomial $M_{1l}(z, \lambda)$ has degree $2p - 2$. For $l = p + 1$, we get

\[
\Delta_{1,p+1}(z, X(\lambda)) = M_{1,p+1}(z, \lambda) + Q_{1,p+1}(z, \lambda),
\]

where the polynomial $Q_{1,p+1}(z, \lambda)$ has degree strictly less than $2p - 1$ and the homogeneous polynomial $M_{1,p+1}(z, \lambda)$ has degree $2p - 1$.


Let $z_k$ be a local parameter around the point at infinity $P_k$, $k = 1, \ldots, 2p$. The Laurent tail of $z$ at $P_k$ is $\beta_k/z_k$ and the Laurent tail of $\lambda$ at $P_k$ is $1/z_k$. By using the formulas above we conclude that the Laurent tail of

\[
\Delta_{1l}(z, X(\lambda))/\Delta_{11}(z, X(\lambda)), \qquad l = 2, \ldots, 2p,
\]

at $P_k$ is zero, since this meromorphic function is holomorphic at $P_k$. Moreover, this function has a zero at $P_k$ for each $k = 1, \ldots, 2p$, and $l \neq p + 1$ (note that on the denominator the constant term $\beta_k^{2p-1} + \beta_k v_2^2 \cdots v_p^2$ is non-zero for generic $c$).

Now we compute the Laurent tail of $d\rho(\lambda, z, X)/dt$ at $P_k$. We emphasize that $\rho$ only depends on $t$ through $X(\lambda)$. Firstly, we see that the Laurent tail of

\[
\frac{d}{dt}\left( \Delta_{1l}(z, X(\lambda))/\Delta_{11}(z, X(\lambda)) \right), \qquad l = 2, \ldots, 2p,
\]

at each $P_k$ is zero, because this meromorphic function is holomorphic at $P_k$, $k = 1, \ldots, 2p$. Since $Y_{11} = -\lambda v_1^2$, $Y_{1l} = v_1 x_{l,p+1} - v_l x_{1,p+l}$ for $l = 2, \ldots, p$, and $Y_{1,p+l} = v_l x_{1,l} + v_1 x_{p+1,p+l}$ for $l = 1, \ldots, p$, we conclude that the Laurent tail of

\[
\frac{d}{dt}\rho(\lambda, z, X) = \sum_{l=2}^{2p} \left( \frac{d}{dt} Y(\lambda, X(\lambda, t))_{1l} \right) \frac{\Delta_{1l}(z, X(\lambda, t))}{\Delta_{11}(z, X(\lambda, t))} + \sum_{l=2}^{2p} Y(\lambda, X(\lambda, t))_{1l}\, \frac{d}{dt}\left( \frac{\Delta_{1l}(z, X(\lambda, t))}{\Delta_{11}(z, X(\lambda, t))} \right)
\]

at each $P_k$ is zero for all $k = 1, \ldots, 2p$. Thus, the linearization criterion applies with $\Phi_X = 0$. We have proved the following.

Theorem 8.1. For $N$ invertible with distinct eigenvalues the map $j_c$ linearizes the isospectral flow of the system (5.1) on the Jacobian $\mathrm{Jac}(\bar\Gamma_c)$.

Linearization on the Prym variety for N invertible and generic. Since $(X + \lambda N)^T = X - \lambda N$, we have $Q(-\lambda, z) = Q(\lambda, z)$. Thus there is an involution $\tau : \bar\Gamma_c \to \bar\Gamma_c$ of the spectral curve defined by

\[
\tau(\lambda, z) = (-\lambda, z).
\]

In homogeneous coordinates $\lambda = \nu/z_0$, $z = \zeta/z_0$ this involution is given by

\[
\tau(\nu, \zeta, z_0) = (-\nu, \zeta, z_0).
\]


Notice that the involution $\tau$ has no fixed points at infinity ($z_0 = 0$ and $\nu = 0$ would imply $\zeta = 0$ from the homogeneous equation of the curve). Thus, the fixed points are obtained from the equation $Q(0, z) = 0$, which is the characteristic polynomial of the symmetric matrix $X$. Generically, we obtain $2p$ distinct points $Z_1, \ldots, Z_{2p}$ as its fixed (ramification) points, where $Z_k = (0, z_k, 1)$, $k = 1, \ldots, 2p$, with $z_k$ the (real) eigenvalues of the symmetric matrix $X$. By the Riemann-Hurwitz formula, the quotient (smooth) curve $C_1 := \bar\Gamma_c/\tau$ has genus $g_1 := (p-1)^2$. Associated to the double covering $\bar\Gamma_c \to C_1$ is the Prym variety $\mathrm{Prym}(\bar\Gamma_c/C_1)$, with the property that $\mathrm{Jac}(\bar\Gamma_c)$ is isogenous to $\mathrm{Jac}(C_1) \times \mathrm{Prym}(\bar\Gamma_c/C_1)$. It follows that

\[
\dim \mathrm{Prym}(\bar\Gamma_c/C_1) = g - g_1 = p^2 - p.
\]

Let us denote by $\Omega_{\bar\Gamma_c}$ the sheaf of holomorphic 1-forms on $\bar\Gamma_c$. Recall that

\[
\mathrm{Jac}(\bar\Gamma_c) \cong H^0(\bar\Gamma_c, \Omega_{\bar\Gamma_c})^*/H_1(\bar\Gamma_c, \mathbb{Z}).
\]

The involution $\tau$ acts on the vector space $H^0(\bar\Gamma_c, \Omega_{\bar\Gamma_c})$ and on the free group $H_1(\bar\Gamma_c, \mathbb{Z})$ having eigenvalues $\pm 1$. The Prym variety $\mathrm{Prym}(\bar\Gamma_c/C_1)$ can be equivalently described as the quotient

\[
H^0(\bar\Gamma_c, \Omega_{\bar\Gamma_c})^{-*}/H_1(\bar\Gamma_c, \mathbb{Z})^{-},
\]

where the upper $\pm$ index on a vector space denotes the $\pm 1$ eigenspaces. Note that

\[
\rho := \rho(\lambda, z, X(\lambda, t)) = \sum_{l=1}^{2p} Y(\lambda, X(\lambda, t))_{1l}\, \frac{\Delta_{1l}(z, X(\lambda, t))}{\Delta_{11}(z, X(\lambda, t))} = -\lambda v_1^2 + \rho_1(\lambda, z, X(\lambda, t)),
\]

where the meromorphic function $\rho_1(\lambda, z, X(\lambda, t))$ has residue zero at each $P_k$; see the computation above. By [11], or by direct computation, we have

\[
\operatorname{Res}_{P_k}(\tau\rho(\lambda, z, X(\lambda, 0))) = -\operatorname{Res}_{P_k}(\rho(\lambda, z, X(\lambda, 0))).
\]

It follows that the flow is actually linearized on $\mathrm{Prym}(\bar\Gamma_c/C_1)$. Thus we have proved:

Corollary 8.2. For $N$ invertible with distinct eigenvalues the map $j_c$ linearizes the isospectral flow of the system (5.1) on the Prym variety $\mathrm{Prym}(\bar\Gamma_c/C_1)$.
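Two ingredients of the Prym construction above are easy to verify by computation: the symmetry $Q(-\lambda, z) = Q(\lambda, z)$ of the characteristic polynomial, and the genus bookkeeping. The sketch below is illustrative only (random test matrices, hypothetical function names). It evaluates $Q$ numerically at sample points and checks that the quoted genera satisfy the Riemann-Hurwitz relation $2g - 2 = 2(2g_1 - 2) + 2p$ for a double cover with $2p$ ramification points.

```python
import numpy as np

rng = np.random.default_rng(7)   # arbitrary seed
p = 2
n = 2 * p
A = rng.standard_normal((n, n))
S = rng.standard_normal((n, n))
X = A + A.T
X /= np.linalg.norm(X)           # symmetric
N = S - S.T
N /= np.linalg.norm(N)           # skew-symmetric

def Q(lam, z):
    """Characteristic polynomial det(z I - X - lam N) evaluated at (lam, z)."""
    return np.linalg.det(z * np.eye(n) - X - lam * N)

samples = [(0.3, 1.1), (-1.7, 0.4), (2.0, -0.9)]
q_is_even_in_lambda = all(
    np.isclose(Q(lam, z), Q(-lam, z), rtol=1e-9, atol=1e-12) for lam, z in samples
)

def genus_plane_curve(degree):
    """Genus (degree-1)(degree-2)/2 of a non-singular plane curve."""
    return (degree - 1) * (degree - 2) // 2

def genus_quotient(p):
    """Genus g_1 = (p-1)^2 of the quotient curve, as stated in the text."""
    return (p - 1) ** 2

def riemann_hurwitz_holds(p):
    """Check 2g - 2 = 2(2 g_1 - 2) + (number of ramification points = 2p)."""
    g, g1 = genus_plane_curve(2 * p), genus_quotient(p)
    return 2 * g - 2 == 2 * (2 * g1 - 2) + 2 * p

def prym_dimension(p):
    """dim Prym = g - g_1."""
    return genus_plane_curve(2 * p) - genus_quotient(p)

print(q_is_even_in_lambda, [prym_dimension(q) for q in range(1, 6)])
```

The evenness of $Q$ in $\lambda$ follows from $\det(zI - X - \lambda N) = \det\left((zI - X - \lambda N)^T\right) = \det(zI - X + \lambda N)$, which is what the numerical comparison samples.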


The case of N maximal rank and nullity one. Let us consider now the case of $n$ odd and $N$ having distinct eigenvalues and nullity one, i.e., $n = 2p + 1$ and $\operatorname{rank} N = 2p$. Choose an orthonormal basis of $\mathbb{R}^{2p+1}$ in which $N$ is written as

\[
N = \begin{bmatrix} 0 & V & 0 \\ -V & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\]

where $V$ is a real diagonal matrix whose entries are $v_1, \ldots, v_p$. The equation of the affine spectral curve is:

\[
z^{2p+1} + v_1^2 v_2^2 \cdots v_p^2\, \lambda^{2p} z + Q^0_1(\lambda, z) = 0, \tag{8.2}
\]

where the polynomial $Q^0_1(\lambda, z)$ has degree strictly less than $2p + 1$. Put $\lambda = \nu/z_0$ and $z = \zeta/z_0$. Now set $z_0 = 0$ in the equation

\[
z_0^{2p+1}\, Q(\nu/z_0, \zeta/z_0) = 0
\]

of the projective spectral curve $\bar\Gamma_c$. We get the points at infinity $\{P_0, P_1, \ldots, P_{2p}\} := \bar\Gamma_c \setminus \Gamma_c$, with $P_0 = (1, 0, 0)$ and $P_{k+1} = (1, \beta_{k+1}, 0)$, $k = 0, 1, \ldots, 2p-1$, where

\[
\beta_{k+1} := v^{1/p} \exp\left( i\, \frac{(2k+1)\pi}{2p} \right) \quad \text{and} \quad v := |v_1 v_2 \cdots v_p|.
\]

Note that at each of these points, with the exception of $P_0$, the meromorphic functions $\lambda$ and $z$ on $\bar\Gamma_c$ have a pole of order 1. At $P_0$, the function $\lambda$ has a pole of order 1 and $z$ has a zero of order 1.

We shall analyze below in detail the particular case $p = 2$ (that is, $n = 5$). A direct computation shows that

\[
\begin{aligned}
\Delta_{11} &= (z^4 + v_2^2 z^2 \lambda^2) + Q^0_{11}(z, \lambda), & \deg Q^0_{11} &< 4, \\
\Delta_{12} &= (v_1 v_2 x_{34} z \lambda^2 - x_{12} z^3) + Q^0_{12}(z, \lambda), & \deg Q^0_{12} &< 3, \\
\Delta_{13} &= (-v_1 v_2^2 z \lambda^3 - v_1 z^3 \lambda) + Q^0_{13}(z, \lambda), & \deg Q^0_{13} &< 4, \\
\Delta_{14} &= (v_2 x_{12} z^2 \lambda + v_1 x_{34} z^2 \lambda - x_{14} z^3 - v_1 v_2 x_{23} z \lambda^2) + Q^0_{14}(z, \lambda), & \deg Q^0_{14} &< 3, \\
\Delta_{15} &= (-v_1 v_2^2 x_{35} \lambda^3 + v_2^2 x_{15} z \lambda^2 - v_1 x_{35} z^2 \lambda + x_{15} z^3) + Q^0_{15}(z, \lambda), & \deg Q^0_{15} &< 3.
\end{aligned}
\]

Let $z_k$ be a local parameter around the point at infinity $P_k$, $k = 1, \ldots, 4$. The Laurent tail of $z$ at $P_k$ is $\beta_k/z_k$ and the Laurent tail of $\lambda$ at $P_k$ is $1/z_k$. By using the formulas above we conclude that the Laurent tail of

\[
\Delta_{1l}(z, X(\lambda))/\Delta_{11}(z, X(\lambda)), \qquad l = 2, \ldots, 5,
\]

at $P_k$ is zero, since this meromorphic function is holomorphic at $P_k$.


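For $P_0$ the computation changes. Let $u$ be a local parameter around the point $P_0$. The Laurent tail of $z$ at $P_0$ is zero ($z$ has a simple zero at $P_0$) and the Laurent tail of $\lambda$ at $P_0$ is $1/u$. We shall emphasize the leading term for the Laurent tail of

\[
\begin{aligned}
\Delta_{11} &= v_2^2 (x_{33} x_{55} - x_{35}^2)/u^2 + \ldots, \\
\Delta_{12} &= v_1 v_2 (x_{35} x_{45} - x_{34} x_{55})/u^2 + \ldots, \\
\Delta_{13} &= v_1 v_2^2 x_{55}/u^3 + \ldots, \\
\Delta_{14} &= v_1 v_2 (x_{23} x_{55} - x_{25} x_{35})/u^2 + \ldots, \\
\Delta_{15} &= -v_1 v_2^2 x_{35}/u^3 + \ldots,
\end{aligned}
\]

and we get

\[
\frac{\Delta_{13}}{\Delta_{11}} = \frac{v_1 x_{55}}{x_{33} x_{55} - x_{35}^2} \cdot \frac{1}{u} + \ldots, \qquad
\frac{\Delta_{15}}{\Delta_{11}} = \frac{-v_1 x_{35}}{x_{33} x_{55} - x_{35}^2} \cdot \frac{1}{u} + \ldots,
\]

the other two quotients $\Delta_{12}/\Delta_{11}$ and $\Delta_{14}/\Delta_{11}$ being holomorphic around $P_0$. As in the case of $n$ even, we have

\[
\rho(\lambda, z, X) = -v_1^2 \lambda + (v_1 x_{23} - v_2 x_{14}) \frac{\Delta_{12}}{\Delta_{11}} + v_1 (x_{11} + x_{33}) \frac{\Delta_{13}}{\Delta_{11}} + (v_2 x_{12} + v_1 x_{34}) \frac{\Delta_{14}}{\Delta_{11}} + v_1 x_{35} \frac{\Delta_{15}}{\Delta_{11}},
\]

and hence

\[
\operatorname{Res}_{P_0} \rho = v_1^2 \left( -1 + \frac{(x_{11} + x_{33}) x_{55} - x_{35}^2}{x_{33} x_{55} - x_{35}^2} \right).
\]

From the system (1.2) we get $x_{11} + x_{33} = C_1$ and $x_{55} = C_2$, where $C_1, C_2$ are constants of the motion. Then a direct computation shows that

\[
\operatorname{Res}_{P_0} \frac{d\rho}{dt} = \frac{2 v_1^2 C_2\, x_{35}\, \dot x_{35}\, (C_1 - x_{33})}{(C_2 x_{33} - x_{35}^2)^2},
\]

which is non-zero generically. By applying Lemma 5.11 in [3] and the linearization criterion, we get the following result.

Proposition 8.3. For $N \in \mathfrak{so}(5)$ having distinct eigenvalues and nullity one, generically the map $j_c$ does not linearize the isospectral flow of the system (5.1) on the Jacobian $\mathrm{Jac}(\bar\Gamma_c)$.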


An easier computation gives the same result in the case n = 3. We carried out the case n = 5 as more representative of the general case; for n = 3, there are various non-typical simplifications of the computations leading to the non-linearizability result due to the low size of the matrices involved. We expect however that it will be possible to analyze linearization of the general case where N has distinct eigenvalues (i.e. either n = 2 p and N is invertible or, n = 2 p + 1, rank N = 2 p and N has nullity one) on the generalized Jacobian (see e.g. [26]). To do this we intend to follow [10 and 4] (see also [2,8 and 3]). We intend to carry out this study of generalized algebraic integrability of our system in a future publication. Acknowledgements. We thank G. Prasad for his observation regarding Lie algebras. Luc Haine and Pol Vanhaecke have our gratitude for many very illuminating discussions regarding algebraic complete integrability. We also thank Alexey Bolsinov, Percy Deift, Igor Dolgachev, Michael Gekhtman, Rob Lazarsfeld, Alejandro Uribe, and Nguyen Tien Zung for useful conversations that clarified various points in the paper and thereby improved our exposition. Finally we would like to thank the referee for an extraordinarily useful and insightful referee report and for pointing out reference [27] and we would also like to thank the editor for his very helpful input.

References 1. Adams, M.R., Harnad, J., Hurturbise, J.: Darboux coordinates and Liouville-Arnold integration in loop algebras. Comm. Math. Phys. 155, 385–413 (1993) 2. Adler, M., van Moerbeke, P.: Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Adv. Math. 38, 318–379 (1980) 3. Adler, M., van Moerbeke, P., Vanhaecke, P.: Algebraic Integrability, Painlevé Geometry and Lie algebras, Volume 47 of Ergebnisse der Mathematik und ihrer Grenzgebiete, Berlin-Heidelberg-New York: Springer-Verlag, 2004 4. Beauville, A.: Jacobiennes des courbes spectrales et systèmes hamiltoniens completement integrables. Acta Math. 164, 211–235 (1990) 5. Bloch, A.M., Iserles, A.: On an isospectral Lie–Poisson system and its Lie algebra. Found. of Comput. Math. 6, 121–144 (2006) 6. Bolsinov, A.V.: Compatible Poisson brackets on Lie algebras and completeness of families of functions in involution. Math. USSR. Izv. 38(1), 69–90 (1992) 7. Bolsinov, A.V., Borisov, A.V.: Compatible Poisson brackets on Lie algebras. Mat. Zametki, 72(1), 11–34 (2002) 8. Deift, P., Li, L.C., Tomei, C.: Matrix factorizations and integrable systems. Comm. Pure Appl. Math. XLII, 443–521 (1989) 9. Dubrovin, B.A., Novikov, S.P., Krichever, I.M.: Integrable Systems, Encyclopaedia of Mathematical Sciences. 4, Berlin: Springer-Verlag, 1989 10. Gavrilov, L.: Generalized Jacobians of spectral curves and completely integrable systems. Math. Z. 230, 487–508 (1999) 11. Griffiths, P.: Linearizing flows and a cohomological interpretation of Lax equations. Amer. J. Math. 107, 1445–1483 (1985) 12. Krichever, I.M.: Methods of algebraic geometry in the theory of nonlinear equations. Russ. Math. Surv. 32, 185–213 (1977) 13. Krichever, I.M., Novikov, S.P.: Holomorphich bundles over algebraic curves and nonlinear equations. Russ. Math. Surv. 35, 53–79 (1980) 14. Li, L.-C., Tomei, C.: The complete integrability of a Lie–Poisson system proposed by Bloch and Iserles. Intern. Math. Res. 
Notes 64949, 1–19 (2006) 15. Manakov, S.V.: Note on the integration of Euler’s equations of the dynamics of an n-dimensional rigid body. Funct. Anal. Appl. 10, 328–329 (1976) 16. Marsden, J.E., Ratiu, T.S.: Introduction to Mechanics and Symmetry. Volume 17 of Texts in Applied Mathematics; Second Edition, second printing, Berlin-Heidelberg-New York: Springer-Verlag, 2003 17. Meshcheryakov, M.V.: A characterisitic property of the inertial tensor of a multidimensional solid body. Russ. Math. Surv. 38(5), 201–202 (1983) 18. Mikhailov, A.V., Sokolov, V.V.: Integrable ODEs on associative algebras. Commun. Math. Phys. 211(1), 231–251 (2000)


19. Mishchenko, A.S., Fomenko, A.T.: On the integration of the Euler equations on semisimple Lie algebras. Sov. Math. Dokl. 17, 1591–1593 (1976)
20. Mishchenko, A.S., Fomenko, A.T.: Euler equations on finite-dimensional Lie groups. Izv. AN SSSR 42(2), 396–415 (1978)
21. Mishchenko, A.S., Fomenko, A.T.: Integration of Euler equations on semisimple Lie algebras. (In Russian), Trudy Sem. po Vekt. i Tenz. Analizu 19, Moscow MGU, 3–94 (1979)
22. Morosi, C., Pizzocchero, L.: On the Euler equation: bi-Hamiltonian structure and integrals in involution. Lett. Math. Phys. 37, 117–135 (1996)
23. Mumford, D.: Tata Lectures on Theta. II. Volume 43 of Progr. Math., Boston, MA: Birkhäuser, Boston, 1984
24. Odesskii, A.V., Sokolov, V.V.: Integrable matrix equations related to pairs of compatible associative algebras. J. Phys. A 39(40), 12447–12456 (2006)
25. Ratiu, T.S.: Involution theorems. In: Kaiser, G., Marsden, J. eds., Geometric Methods in Mathematical Physics, Volume 775 of Springer Lecture Notes, Berlin-Heidelberg-New York: Springer, 1980, pp. 219–257
26. Serre, J.P.: Groupes Algébriques et Corps de Classes, Paris: Hermann, 1959
27. Trofimov, V.V., Fomenko, A.: Algebra and geometry of integrable Hamiltonian differential equations. (In Russian), Moskva: Faktorial, 1995
28. Vanhaecke, P.: Integrable systems and symmetric products of curves. Math. Z. 227(1), 93–127 (1998)
29. Vanhaecke, P.: Integrable Systems in The Realm of Algebraic Geometry, Second edition. Volume 1638 of Lecture Notes in Mathematics, Berlin-Heidelberg-New York: Springer-Verlag, 2001

Communicated by L. Takhtajan

Commun. Math. Phys. 290, 437–477 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0850-0

Communications in Mathematical Physics

Rotational Surfaces in $\mathbb{L}^3$ and Solutions of the Nonlinear Sigma Model

Manuel Barros1, Magdalena Caballero2, Miguel Ortega1

1 Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain. E-mail: [email protected]; [email protected]
2 Departamento de Matemáticas, Universidad de Córdoba, 14071 Córdoba, Spain. E-mail: [email protected]

Received: 17 May 2007 / Revised: 11 November 2008 / Accepted: 25 March 2009
Published online: 19 June 2009 – © Springer-Verlag 2009

Abstract: The Gauss maps of non-degenerate surfaces in the three-dimensional Minkowski space are viewed as dynamical fields of the two-dimensional O(2, 1) Nonlinear Sigma Model. In this setting, the moduli space of solutions with rotational symmetry is completely determined. Essentially, the solutions are warped products of orbits of the 1-dimensional groups of isometries and elastic curves in either a de Sitter plane, a hyperbolic plane or an anti de Sitter plane. The main tools are the equivalence of the two-dimensional O(2, 1) Nonlinear Sigma Model and the Willmore problem, and the description of the surfaces with rotational symmetry. A complete classification of such surfaces is obtained in this paper. Indeed, a huge new family of Lorentzian rotational surfaces with a space-like axis is presented. The description of this new class of surfaces is based on a technique of surgery and a gluing process, which is illustrated by an algorithm.

This work was partially supported by MEC Grant MTM2007-60731 with FEDER funds and the Junta de Andalucía Grant PO6-FQM-01951.

1. Introduction

Nonlinear sigma models are field theories whose elementary fields, or dynamical variables, are maps, φ, from a space, M, the source space, to an auxiliary space, E, the target space, endowed with a non-degenerate metric. The Lagrangian governing the dynamics of the model measures the total energy of those maps. The classical solutions of the model, i.e., the solutions of the corresponding field equations, constitute the space of field configurations. The dimension of the source space is called the dimension of the model. The isometry group, A, of the target space is the symmetry of the model. In particular, when M is compact and Riemannian, each solution has finite energy. In this sense, we call them solitons.

Two-dimensional nonlinear sigma models, in particular those with symmetry O(3) and O(2, 1), are ubiquitous in Physics (see for example [13,37] and references therein),


with applications going from Condensed-matter Physics (see [7,22,23] and references therein) to High-energy Physics (see [1,2,20,28] and references therein) and, of course, Quantum Field Theory (see [24,33] and references therein). In particular, those with Minkowski signature metric on the target space are applied to Gauge Theories (see [1,35]), Quantum Gravity (see [36]), String Theories (see [10,36]), Quantum Mechanics (see [16]) and General Relativity, in particular Einstein and Ernst equations (see [14,18]). They are especially important in string theories where the model description is applicable.

This kind of universality is strongly related to the fact that these sigma models, and the equations governing their dynamics, have a deep underlying geometric meaning. This provides a powerful reason to explain the great interest of these models in Applied Mathematics and in Differential Geometry, even without mentioning any physical terminology, simply as a kind of constrained Willmore problem (see, for example, [3,4,9,12] and references therein).

In this framework, it seems natural to identify the dynamical variables of the two-dimensional O(3) Nonlinear Sigma Model with the Gauss maps of surfaces in the three-dimensional Euclidean space. This approach has been successfully used to obtain certain moduli spaces of solutions: those with constant mean curvature, [7,17,26,32], those admitting a rotational symmetry, [5], and those foliated by Villarceau circles, [6].

The study of moduli spaces of solutions (field configurations) of the two-dimensional O(2, 1) Nonlinear Sigma Model constitutes an ambitious program. We will develop it along a series of articles, starting with this one. Beforehand, it will be useful to remark on the following general points related to this model:

• The Gauss map of any non-degenerate surface in the three-dimensional Lorentz-Minkowski space, $\mathbb{L}^3$, is automatically an elementary field of this model. Therefore, the geometrical approach identifies the space of dynamical variables with that of Gauss maps of non-degenerate surfaces in $\mathbb{L}^3$.
• On the other hand, the underlying variational problem of this model turns out to be equivalent to the Willmore variational problem (see Theorem 3.2). This has important consequences:
  1. The field configurations of this model are nothing but the Willmore surfaces in $\mathbb{L}^3$.
  2. The model is invariant under conformal changes of the metric of $\mathbb{L}^3$.
  3. Since the Willmore functional is essentially the Polyakov action, the model can be regarded as a bosonic string theory in $\mathbb{L}^3$ that is governed by the Willmore-Polyakov action. In this sense, the solutions of the model provide the string world sheet configurations (see [30,31]).

Now, the first step in the above program, which constitutes the main aim of this paper, is stated as follows:

To determine the moduli space of solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model that admit a rotational symmetry. Equivalently, classify, up to congruences, those rotational surfaces in $\mathbb{L}^3$ that are critical points of the total energy.

This problem is much more difficult and subtle than its Riemannian partner, [5], and it will be treated according to the causal character of the symmetry axis. Indeed, in Sect. 4, we have studied and completely solved the case where the symmetry axis is time-like, that is, surfaces invariant under a one-parameter group, $A_1$, of elliptic motions. This can be summarized as follows:


1. Firstly, we consider the nonlinear sigma model with boundary and we determine the admissible boundary conditions.
2. Next, we obtain the space of surfaces that are invariant under rotations with time-like axis.
3. Then, since the orbits are circles, we use the principle of symmetric criticality, [29], and the conformal invariance of the model to make a suitable conformal change and obtain the following result:

The solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model that admit a rotational symmetry with time-like axis are obtained by rotating clamped free elastic curves (critical points of the total squared curvature) in the anti de Sitter plane.

The major part of the paper is devoted to obtaining rotational solutions with space-like axis. This case is the most complicated. The first important difficulty is to obtain the whole class of rotational surfaces in $\mathbb{L}^3$ with space-like axis, in other words, surfaces that are invariant under a one-parameter group, $A_2$, of hyperbolic motions. This problem, which has been usually avoided in the literature, perhaps because of its difficulty, is completely solved in Sect. 5. To understand this problem, assume that the space-like axis coincides with the {x}-axis. Then, the planes y = z and y = −z divide $\mathbb{L}^3$ into four open regions, which will be called fundamental regions. Certainly, for every fundamental region we can get a class of rotational surfaces, with {x}-axis, immersed in the region. These surfaces are well known in the literature (see for example [19]). However, there are rotational surfaces with {x}-axis in $\mathbb{L}^3$ that leave a fundamental region to emerge in another fundamental region. This family includes popular surfaces, such as a saddle surface and the one-sheet hyperboloid with {z}-axis. In some sense, these surfaces can be obtained by gluing two or more surfaces, each of them contained in a fundamental region. Throughout Sect. 5, we use surgery to dissect these surfaces and to understand the gluing mechanism. At the end of it, we obtain a classification theorem (see Theorem 5.14) and a construction algorithm (see Subsect. 5.7).

Once we have obtained the whole space of rotational surfaces with space-like axis, there are, at least, two different ways to get the corresponding solutions. On the one hand, one can try to carry out a symmetry reduction of the action principle. This procedure depends on a kind of symmetric criticality principle that should be established. On the other hand, there is a second way that consists in a direct variational approach, for which one needs to obtain the field equations governing the model. Since the model turns out to be equivalent to a constrained Willmore model, in Sect. 6 we obtain the first variation that provides Willmore surfaces in a general semi-Riemannian background.

In Sect. 7, we obtain the whole moduli space of Riemannian solutions with a rotational symmetry with space-like axis (see Theorem 7.1).

The Riemannian solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model that admit a rotational symmetry with space-like axis are obtained by rotating space-like clamped free elastic curves of the de Sitter plane.

In Sect. 8, the whole moduli space of Lorentzian solutions with a rotational symmetry with space-like axis is obtained. Firstly, we study those solutions that are contained in a fundamental region (fundamental solutions). On the one hand, we get solutions coming from time-like clamped free elastic curves in the de Sitter plane (Theorem 8.1). On the other hand, we also obtain


a second family of Lorentzian solutions, which are generated by clamped free elastic curves in the hyperbolic plane (Theorem 8.2). Certainly, each solution in those families is contained in a fundamental region. In contrast with the Riemannian case, we can find Lorentzian solutions in all the fundamental regions. This fact allows us to study the existence of solutions leaving a fundamental region and emerging in another one. In other words, we look for solutions obtained by gluing fundamental solutions. This problem is completely solved at the end of Sect. 8. In fact, such solutions are surfaces that are connected pieces of either a one-sheet hyperboloid with time-like axis and centered at any point of the space-like axis, or a Lorentzian plane orthogonal to the space-like axis (Theorem 8.3).

Finally, in the last section we consider the case where solutions admit a one-parameter group, $A_3$, of parabolic transformations. They are known in the literature as rotational surfaces with light-like axis, [19]. Now, parabolic rotational surfaces lie in two fundamental regions of $\mathbb{L}^3$. In contrast with the case of rotational surfaces with space-like axis, in Sect. 9 we prove that we cannot find parabolic rotational surfaces that leave a fundamental region to emerge in the other. At the end of that section, we obtain the complete classification of $A_3$-invariant solutions (see Theorem 9.2).

The solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model that admit a rotational symmetry with light-like axis are obtained by rotating clamped free elastic curves of the anti de Sitter plane.

The results of this paper can be summarized in the following statement:

The solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model that admit a rotational symmetry are the following surfaces:
1. A connected piece (with boundary) of a Lorentzian plane.
2. A connected piece (with boundary) of a one-sheet hyperboloid with time-like axis.
3. A surface generated, via rotations, by a clamped free elastic curve according to the following table:

Symmetry Group | Axis | Orbits | Character of the surface | Generating Curve
A1 | Time-like | Circles | Riemannian | Space-like free elastic curve in the anti de Sitter plane
A1 | Time-like | Circles | Lorentzian | Time-like free elastic curve in the anti de Sitter plane
A2 | Space-like | Hyperbolas | Riemannian | Space-like free elastic curve in the de Sitter plane
A2 | Space-like | Hyperbolas | Lorentzian | Time-like free elastic curve in the de Sitter plane
A2 | Space-like | Hyperbolas | Lorentzian | Free elastic curve in the hyperbolic plane
A3 | Light-like | Parabolas | Riemannian | Space-like free elastic curve in the anti de Sitter plane
A3 | Light-like | Parabolas | Lorentzian | Time-like free elastic curve in the anti de Sitter plane


2. Preliminaries and Generalities

Throughout this paper the geometrical objects are $C^\infty$, or equivalently smooth, though those appearing in the paper could be supposed to be only as differentiable as needed. Let $M$ be a surface with (or without) boundary and $\phi : M \to \mathbb{L}^3$ an immersion in the Lorentz-Minkowski three-space with flat metric $g = \langle\,,\rangle$. If $d\phi_p(T_pM)$ is a non-degenerate plane in $\mathbb{L}^3$ for any $p \in M$, then $\phi : (M, \phi^*(g)) \to \mathbb{L}^3$ is said to be a non-degenerate isometric immersion (or a non-degenerate surface). A non-degenerate surface can be oriented, at least locally, by a unitary normal vector field, say $N_\phi$. According to the causal character, $\langle N_\phi, N_\phi\rangle = \varepsilon$, we have two possibilities:

Lorentzian surfaces (also called time-like surfaces). When $\varepsilon = 1$, $N_\phi$ is space-like and so $(M, \phi^*(g))$ is a Lorentzian surface. The unitary normal vector field, $N_\phi$, can be viewed as a map, the Gauss map, $N_\phi : M \to \mathbb{S}^2_1$, where $\mathbb{S}^2_1 = \{\vec{v} \in \mathbb{L}^3 : \langle\vec{v},\vec{v}\rangle = 1\}$ is the de Sitter plane.

Riemannian surfaces (also named space-like surfaces). When $\varepsilon = -1$, $N_\phi$ is time-like and so $(M, \phi^*(g))$ is a Riemannian surface. In this case, the Gauss map is defined as $N_\phi : M \to \mathbb{H}^2$, where $\mathbb{H}^2 = \{\vec{v} \in C^{\uparrow} \subset \mathbb{L}^3 : \langle\vec{v},\vec{v}\rangle = -1\}$ is the hyperbolic plane and $C^{\uparrow}$ denotes the future cone.

Let us denote by O(2, 1) the matrix representation of the group of vectorial isometries of $\mathbb{L}^3$, $\mathrm{Iso}(\mathbb{L}^3)$, also known as the group of Lorentz transformations. Since the group of isometries of both $\mathbb{S}^2_1$ and $\mathbb{H}^2$ is O(2, 1), the Gauss maps of non-degenerate surfaces can be regarded as elementary fields in the two-dimensional O(2, 1) Nonlinear Sigma Model (sometimes we will abbreviate it as O(2, 1) NSM). The Lagrangian density governing this field theory is precisely $\|dN_\phi\|^2$, which can be computed, via the Gauss equation, in terms of the mean curvature, $H_\phi$, of $(M, \phi)$ and the Gaussian curvature, $K_\phi$, of $(M, \phi^*(g))$, i.e.,

$$\|dN_\phi\|^2 = 4H_\phi^2 - 2\varepsilon K_\phi. \tag{1}$$

Consider $\mathrm{Iso}^{+\uparrow}(\mathbb{L}^3) = \{f \in \mathrm{Iso}(\mathbb{L}^3) : \det(f) = 1,\ f(C^{\uparrow}) = C^{\uparrow}\}$. Its partner in O(2, 1) is denoted by $(O(2,1))^{+\uparrow}$. It is known that each Lorentz transformation, $f \in \mathrm{Iso}(\mathbb{L}^3)$, admits at least one eigenvector with eigenvalue ±1. Therefore, given $\vec{x} \in \mathbb{L}^3$, we wish to determine those Lorentz transformations, $f \in \mathrm{Iso}^{+\uparrow}(\mathbb{L}^3)$, such that $f(\vec{x}) = \vec{x}$. These vectorial isometries constitute a subgroup, $A$, of $\mathrm{Iso}^{+\uparrow}(\mathbb{L}^3)$, called the group of rotations with axis $\langle\vec{x}\rangle = \mathrm{Span}\{\vec{x}\}$. Certainly, $A$ acts naturally on the whole $\mathbb{L}^3$ producing orbits. However, these orbits are quite different according to the causal character of the axis. Next we summarize the corresponding discussion.

(1) Time-like axis. Choose an orthonormal basis in $\mathbb{L}^3$, $B = \{\vec{x}, \vec{y}, \vec{z}\}$. We work in coordinates with respect to $B$. Since the axis is time-like, $\{\vec{y}, \vec{z}\}$ determine a Euclidean plane. In this case, the group $A$ is identified with the following subgroup of $(O(2,1))^{+\uparrow}$:

$$A_1 = \{1\} \times O^{+}(2) = \left\{ \mu_t = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{pmatrix} : t \in \mathbb{R} \right\}.$$


Given a point $p = (a_1, a_2, a_3) \in \mathbb{L}^3$, denote by $P$ the Euclidean plane in $\mathbb{L}^3$ passing through $p$ and orthogonal to $\vec{x}$. The orbit of $p$ under the action of $A_1$, $[p]_1$, is just the circle in $P$ through $p$ with center $(a_1, 0, 0)$, i.e., $[p]_1 = \{(a_1, y, z) \in \mathbb{L}^3 : y^2 + z^2 = a_2^2 + a_3^2\}$. Certainly, $[p]_1 = \{p\}$ if $a_2 = a_3 = 0$. Therefore, sometimes we call transformations in $A_1$ elliptic motions or pure rotations.

(2) Space-like axis. Choose an orthonormal basis in $\mathbb{L}^3$, $B = \{\vec{x}, \vec{y}, \vec{z}\}$. We work in coordinates with respect to this basis. Since the axis is space-like, $\{\vec{y}, \vec{z}\}$ determine a Lorentzian plane. In this case, the group $A$ is identified with the following subgroup of $O_1^{+\uparrow}(3)$:

$$A_2 = \{1\} \times O_1^{+\uparrow}(2) = \left\{ \xi_t = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cosh t & \sinh t \\ 0 & \sinh t & \cosh t \end{pmatrix} : t \in \mathbb{R} \right\}. \tag{2}$$

Given a point $p = (a_1, a_2, a_3) \in \mathbb{L}^3$, denote by $P$ the Lorentzian plane in $\mathbb{L}^3$ passing through $p$ and orthogonal to $\vec{x}$. It is clear that the orbit of $p$, $[p]_2$, is contained in $P$. If $a_2 = a_3 = 0$, then $[p]_2 = \{p\}$. Otherwise, we must distinguish two cases:

• If $a_2^2 = a_3^2$, then $[p]_2$ is the open half straight line starting at $(a_1, 0, 0)$ and passing through $p$, i.e., $[p]_2 = \{(a_1, \lambda a_2, \lambda a_3) : 0 < \lambda\}$.
• If $a_2^2 \neq a_3^2$, then $[p]_2$ is the branch of hyperbola in $P$ centered at $(a_1, 0, 0)$ and passing through $p$, i.e., $[p]_2$ is the connected component of $\{(a_1, y, z) \in \mathbb{L}^3 : y^2 - z^2 = a_2^2 - a_3^2\}$ that contains $p$.

Transformations in $A_2$ will be usually called hyperbolic motions or hyperbolic rotations.

(3) Light-like axis. If $\vec{x}$ is null, then we consider a basis $B = \{\vec{x}, \vec{y}, \vec{z}\}$ of $\mathbb{L}^3$ such that: (1) $\vec{y}$ is a light-like vector with $\langle\vec{x}, \vec{y}\rangle = -1$, and (2) $\vec{z}$ is a unitary space-like vector orthogonal to the plane $\mathrm{Span}\{\vec{x}, \vec{y}\}$. We work in coordinates with respect to $B$. It can be checked that the group $A$ is identified with the following subgroup of $O_1^{+\uparrow}(3)$:

$$A_3 = \left\{ \varsigma_t = \begin{pmatrix} 1 & \frac{1}{2}t^2 & t \\ 0 & 1 & 0 \\ 0 & t & 1 \end{pmatrix} : t \in \mathbb{R} \right\}.$$

In this setting, to analyze the orbits, we must again consider a couple of cases different from that where $a_2 = a_3 = 0$, in which $[p]_3 = \{p\}$, for $p = (a_1, a_2, a_3) \in \mathbb{L}^3$.

• If $a_2 = 0$, then the orbit $[p]_3$ is a straight line, namely $[p]_3 = \{(t, 0, a_3) : t \in \mathbb{R}\}$.
• If $a_2 \neq 0$, then the orbit $[p]_3$ is a parabola in the plane $Q = \{(x, a_2, z) \in \mathbb{L}^3 : x, z \in \mathbb{R}\}$. In fact, $[p]_3 = \{(x, a_2, z) \in Q : x = \frac{1}{2a_2}(z - a_3)^2 + \frac{a_3}{a_2}(z - a_3) + a_1\}$.

Thus, $A_3$ will be called the group of parabolic rotations or parabolic motions.

To end this section, we define elastic curves, also known as elasticae. We consider a Riemannian or Lorentzian oriented 2-manifold $(M, h)$, with its Levi-Civita connection $\nabla$. We only work with Frenet curves, i.e., regular curves $\alpha : I \subset \mathbb{R} \to M$ such that the following Frenet equations are well-defined,

$$\nabla_T T = \varepsilon_2\, \kappa\, N, \qquad \nabla_T N = -\varepsilon_1\, \kappa\, T,$$

where $\{T = \alpha'/|\alpha'|, N\}$ is a positive orthonormal frame along $\alpha$, $\varepsilon_1 = h(T, T) = \pm 1$, $\varepsilon_2 = h(N, N) = \pm 1$ and $\kappa$ is a smooth function, usually called the (geodesic) curvature of $\alpha$. We recall that geodesics are curves such that $\kappa$ vanishes identically.
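The three one-parameter groups described above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original paper: it verifies that each matrix preserves the Lorentz product and that orbits satisfy the stated equations. The Gram matrices are assumptions about the coordinates. In particular, for the space-like axis we assume that the third basis vector is the time-like one, and for the light-like axis we use the non-orthonormal basis with ⟨x, y⟩ = −1 described in the text. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [list(row) for row in zip(*a)]

def apply(m, p):
    """Image of the coordinate vector p under the matrix m."""
    return [sum(m[i][k] * p[k] for k in range(3)) for i in range(3)]

# Gram matrices of the Lorentz product in the bases {x, y, z} of the text.
GRAM_TIMELIKE = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]    # x time-like
GRAM_SPACELIKE = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # assumption: z time-like
GRAM_LIGHTLIKE = [[0, -1, 0], [-1, 0, 0], [0, 0, 1]]  # x, y null, <x, y> = -1

def mu(t):
    """Elliptic motion (rotation with time-like axis)."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def xi(t):
    """Hyperbolic motion (rotation with space-like axis)."""
    c, s = math.cosh(t), math.sinh(t)
    return [[1, 0, 0], [0, c, s], [0, s, c]]

def sigma(t):
    """Parabolic motion (rotation with light-like axis)."""
    return [[1, t * t / 2, t], [0, 1, 0], [0, t, 1]]

def is_isometry(m, gram, tol=1e-9):
    """True when m^T * gram * m equals gram up to the tolerance tol."""
    lhs = matmul(matmul(transpose(m), gram), m)
    return all(abs(lhs[i][j] - gram[i][j]) < tol
               for i in range(3) for j in range(3))

def parabola_x(p, z):
    """x-coordinate of the parabolic orbit through p = (a1, a2, a3), a2 != 0."""
    a1, a2, a3 = p
    return (z - a3) ** 2 / (2 * a2) + a3 * (z - a3) / a2 + a1
```

For instance, with `p = (0.5, 2.0, -1.0)` the point `apply(xi(t), p)` should keep `y**2 - z**2` equal to 3 for every `t`, and the first coordinate of `apply(sigma(t), p)` should agree with `parabola_x(p, z)` evaluated at its third coordinate.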


Next, given two points $p_1, p_2 \in M$ and two tangent vectors $v_i \in T_{p_i}M$, $i = 1, 2$, we define a space of clamped curves

$$\Lambda = \{\alpha : [a_1, a_2] \to M : \alpha(a_i) = p_i,\ \alpha'(a_i) = v_i,\ i = 1, 2\}.$$

We also admit the case $p_1 = p_2$ and $v_1 = v_2$, and then we call $\Lambda$ a space of closed curves. Let us consider the total squared curvature action on $\Lambda$,

$$E : \Lambda \to \mathbb{R}, \qquad E(\alpha) = \int_\alpha (\kappa^2 + \lambda),$$

where $\lambda \in \mathbb{R}$ is a Lagrange multiplier. Thus, an elastica (or an elastic curve) is a critical point of $E$. In case $\lambda = 0$, we call it a free elastica.

3. Gaussian Map Approach and the Conformal Invariance of the Two-Dimensional O(2, 1) NSM

The elementary fields in the two-dimensional O(2, 1) Nonlinear Sigma Model are $\mathbb{L}^3$-valued unitary vector fields on surfaces with (or without) boundary. Therefore, they map a two-dimensional object (a surface) into either a de Sitter plane, $\mathbb{S}^2_1$, or a hyperbolic one, $\mathbb{H}^2$. Throughout this paper, we will assume that the source space is a surface with boundary and that elementary fields are subject to a natural constraint along the boundary. However, the free case, i.e., when the boundary of the surface is empty, can be regarded as a particular case with no constraints. Hence, it seems natural to approach the study of this sigma model, in connection with the differential geometry of surfaces in $\mathbb{L}^3$, by identifying the dynamical variables of the model with the Gauss maps of non-degenerate surfaces in the Lorentz-Minkowski three-space.

To be precise, let us state the general setting of this approach. Let $\Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_n\}$ be a finite set of non-null regular curves in $\mathbb{L}^3$ with $\gamma_i \cap \gamma_j = \emptyset$ if $i \neq j$. Let $N_o$ be a unitary vector field along $\Gamma$, orthogonal to $\Gamma$ and with constant causal character on the whole $\Gamma$, i.e., $\langle\Gamma'(p), N_o(p)\rangle = 0$, $\forall p \in \Gamma$, and $\langle N_o(p), N_o(p)\rangle = \varepsilon$, $\forall p \in \Gamma$, where $\varepsilon \in \{1, -1\}$. Notice that, if $\varepsilon = 1$, $\Gamma$ could consist of both time-like and space-like curves at the same time. Furthermore, $\Gamma$ and $N_o$ determine a third vector field, $\nu$, along $\Gamma$ given by $N_o = \Gamma' \wedge \nu$.

On the other hand, let $M$ be a connected smooth surface with boundary, $\partial M = c_1 \cup c_2 \cup \cdots \cup c_n$. We denote by $I^{\varepsilon}(M, \mathbb{L}^3)$ the space of immersions, $\phi : M \to \mathbb{L}^3$, with unitary normal vector field, $N_\phi$, satisfying $\langle N_\phi, N_\phi\rangle = \varepsilon$ and the following boundary conditions:

1. $\phi(\partial M) = \Gamma$, namely $\phi(c_j) = \gamma_j$, $1 \leq j \leq n$, and
2. $d\phi_q(T_qM)$ is orthogonal to $N_o(\phi(q))$, $\forall q \in \partial M$; this is equivalent to saying that $N_\phi = N_o$ along the boundary.

Roughly speaking, if we identify each immersion $\phi \in I^{\varepsilon}(M, \mathbb{L}^3)$ with its graph, $\phi(M)$, viewed as a surface with boundary in $\mathbb{L}^3$, then $I^{\varepsilon}(M, \mathbb{L}^3)$ can be regarded as the space of immersed surfaces in $\mathbb{L}^3$ having the same causal character, the same boundary and the same Gauss map along the common boundary.


In this setting, the action governing the two-dimensional O(2, 1) Nonlinear Sigma Model, $\mathcal{S} : I^{\varepsilon}(M, \mathbb{L}^3) \to \mathbb{R}$, can be written as

$$\mathcal{S}(\phi) = \int_M \|dN_\phi\|^2\, dA_\phi, \tag{3}$$

where $dA_\phi$ denotes the element of area of $(M, \phi^*(g))$. The solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model are just the critical points of $(I^{\varepsilon}(M, \mathbb{L}^3), \mathcal{S})$. Next, we define a non-null polygon as a connected, simply-connected, compact domain $K \subset M$ with nonempty interior and with piecewise smooth boundary, $\partial K$, made up of a finite number of smooth non-null curves. The concept of solution is made precise by the following definition.

Definition 3.1. $\phi \in I^{\varepsilon}(M, \mathbb{L}^3)$ is a critical point of $(I^{\varepsilon}(M, \mathbb{L}^3); \mathcal{S})$ if for any non-null polygon $K \subseteq M$, the restriction $\phi|_K$ is a critical point of $(I^{\varepsilon}_{\phi(\partial K)}(K, \mathbb{L}^3); \mathcal{S}_K)$, where

• $I^{\varepsilon}_{\phi(\partial K)}(K, \mathbb{L}^3)$ is the space of immersions, $\psi : K \to \mathbb{L}^3$, which satisfy $\psi|_{\partial K} = \phi|_{\partial K}$; $N_\psi|_{\partial K} = N_\phi|_{\partial K}$ and $\langle N_\psi, N_\psi\rangle = \varepsilon$, and
• $\mathcal{S}_K(\psi) = \int_K \|dN_\psi\|^2\, dA_\psi$, where $dA_\psi$ denotes the element of area of $(K, \psi^*(g))$.

Once we have shown the Gaussian map approach to the O(2, 1) Nonlinear Sigma Model, we focus on proving its conformal invariance. The Willmore functional for free boundary surfaces in the Euclidean space, [40], was extended to surfaces with boundary, [39]. This functional can also be considered for non-degenerate surfaces in the Lorentz-Minkowski space and so extended to those with non-null boundary. In particular, we can define $\mathcal{W} : I^{\varepsilon}(M, \mathbb{L}^3) \to \mathbb{R}$ as

$$\mathcal{W}(\phi) = \int_M H_\phi^2\, dA_\phi + \int_{\partial M} \kappa_\phi\, ds, \tag{4}$$

where $\kappa_\phi$ is the geodesic curvature of $\partial M$ in $(M, \phi^*(g))$. This action defines a variational problem which is invariant under conformal transformations in $\mathbb{L}^3$. The corresponding critical points, which can be defined similarly to those of the action $\mathcal{S}$, are called Willmore surfaces (or Willmore surfaces with prescribed Gauss map along the boundary). The following result provides a strong relationship between the variational problems associated with both functionals:

Theorem 3.2. The two-dimensional O(2, 1) Nonlinear Sigma Model, $(I^{\varepsilon}(M, \mathbb{L}^3); \mathcal{S})$, turns out to be equivalent to the Willmore variational problem $(I^{\varepsilon}(M, \mathbb{L}^3); \mathcal{W})$. In particular,

1. Both have the same critical points. That is, $\phi \in I^{\varepsilon}(M, \mathbb{L}^3)$ is a solution of the two-dimensional O(2, 1) Nonlinear Sigma Model if and only if $(M, \phi)$ is a Willmore surface.
2. The two-dimensional O(2, 1) Nonlinear Sigma Model is invariant under conformal changes in the metric of $\mathbb{L}^3$.


Proof. If $\varepsilon = -1$, then the immersions $\phi \in I^{-1}(M, \mathbb{L}^3)$ provide Riemannian surfaces, $\phi(M)$, in $\mathbb{L}^3$. Consequently, we can follow, up to slight changes, the proof made in [5] for surfaces in the Euclidean space. If $\varepsilon = 1$, choose an immersion $\phi \in I^{1}(M, \mathbb{L}^3)$. Then, it provides a Lorentzian surface, $\phi(M)$, in $\mathbb{L}^3$. Given any non-null polygon $K$, the main aim is to relate the actions $\mathcal{S}_K, \mathcal{W}_K : I^{-1}_{\phi(\partial K)}(K, \mathbb{L}^3) \to \mathbb{R}$. Therefore, the first step is to use the formula (1) and next to get control on the total Gaussian curvature. To do so, we need a Gauss-Bonnet formula working on non-null polygons, no matter the causal character of the boundary pieces. This is done, with details, in the Appendix. So, after applying this formula, we get

$$\mathcal{S}_K(\psi) = 4\,\mathcal{W}_K(\psi) - 6\int_{\partial K} \kappa_\psi\, ds - 2\sum_{j=1}^{r} \theta_j.$$

Finally, the nature of the hyperbolic angle and the formula (15) (see the Appendix) are used to show that the action measuring the total geodesic curvature of $\psi(\partial K) = \phi(\partial K)$ indeed does not depend on $\psi \in I^{-1}_{\phi(\partial K)}(K, \mathbb{L}^3)$. This concludes the proof of the result.

4. Solutions of the O(2, 1) NSM Which are $A_1$-Invariant

In this section, we completely determine the moduli space of solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model, $(I^{\varepsilon}(M, \mathbb{L}^3); \mathcal{S})$, which, in addition, are invariant under $A_1$, i.e., the group of rotations with time-like axis, $\langle\vec{x}\rangle$. Firstly, we need to pose this problem in a suitable way. In fact, the boundary conditions, $(\Gamma, N_o)$, cannot be arbitrary but must be invariant under the $A_1$-action. This invariance holds if and only if the following conditions are satisfied:

1. The boundary, $\Gamma$, consists of a pair of circles, $\{\gamma_1, \gamma_2\}$, contained in Euclidean planes, $P_1$, $P_2$, which are orthogonal to the time-like axis, $\langle\vec{x}\rangle$, and centered at the points $P_i \cap \langle\vec{x}\rangle$, $i = 1, 2$.
2. The unitary normal vector field, $N_o$, along $\Gamma = \{\gamma_1, \gamma_2\}$, satisfies $\langle N_o, \vec{x}\rangle = \text{constant}$.

Consequently, the topology of the surface is $M = [a_1, a_2] \times \mathbb{S}^1$. Secondly, the action of $A_1$ on $\mathbb{L}^3$ can be naturally extended to $I^{\varepsilon}(M, \mathbb{L}^3)$ as follows:

$$A_1 \times I^{\varepsilon}(M, \mathbb{L}^3) \to I^{\varepsilon}(M, \mathbb{L}^3), \qquad (\mu_t, \phi) \mapsto \mu_t \circ \phi.$$

It is obvious that both functionals, $\mathcal{S}$ and $\mathcal{W}$, are invariant under this action, i.e., $\mathcal{S}(\mu_t \circ \phi) = \mathcal{S}(\phi)$, $\mathcal{W}(\mu_t \circ \phi) = \mathcal{W}(\phi)$, $\forall t \in \mathbb{R}$ and $\phi \in I^{\varepsilon}(M, \mathbb{L}^3)$. Define the set of the immersions which are invariant under $A_1$, also called symmetric points, as $\Sigma_{\varepsilon} = \{\phi \in I^{\varepsilon}(M, \mathbb{L}^3) : \mu_t \circ \phi = \phi,\ \forall t \in \mathbb{R}\}$. To identify $\Sigma_{\varepsilon}$, choose an orthonormal basis, $B = \{\vec{x}, \vec{y}, \vec{z}\}$, in $\mathbb{L}^3$ and take the Lorentzian half-plane $AdS_2 = \{\vec{v} \in \mathbb{L}^3 : \langle\vec{v}, \vec{y}\rangle > 0,\ \langle\vec{v}, \vec{z}\rangle = 0\}$. Let $C_{\varepsilon}$ be the space of curves, $\alpha : [s_1, s_2] \to AdS_2$, satisfying the following conditions, up to a reparametrization:

• $\langle\alpha'(s), \alpha'(s)\rangle = -\varepsilon$,
• $\alpha(s_i) = \mathrm{trace}(\gamma_i) \cap AdS_2$, $i = 1, 2$, and
• $\alpha'(s_i) = \nu(\alpha(s_i))$, $i = 1, 2$.


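Since $M = [a_1, a_2] \times \mathbb{S}^1$, for each $\alpha \in C_{\varepsilon}$ we can construct the immersion $\phi_\alpha \in I^{\varepsilon}(M, \mathbb{L}^3)$ defined as $\phi_\alpha(s, e^{it}) = \mu_t(\alpha(s))$. It is obvious that $\phi_\alpha$ belongs to the set of symmetric points, $\Sigma_{\varepsilon}$. The converse also holds. Indeed, given $\phi \in \Sigma_{\varepsilon}$, we can find $\alpha \in C_{\varepsilon}$ such that $\phi = \phi_\alpha$. As a consequence, we can identify $\Sigma_{\varepsilon}$ with $C_{\varepsilon}$. On the other hand, since $A_1$ is compact, we can apply the principle of symmetric criticality [29]. According to this principle, the critical points of $\mathcal{W}$ (equivalently $\mathcal{S}$) which are symmetric are just the critical points of $\mathcal{W}$ (equivalently $\mathcal{S}$) when restricted to $\Sigma_{\varepsilon}$. The following result provides, up to similarities in $\mathbb{L}^3$, all the solutions of the two-dimensional O(2, 1) Nonlinear Sigma Model which are invariant under $A_1$.

Theorem 4.1. An immersion $\phi_\alpha \in I^{\varepsilon}(M, \mathbb{L}^3)$ is a solution of the two-dimensional O(2, 1) Nonlinear Sigma Model if and only if the curve $\alpha$ is a free elastica of $AdS_2$ when viewed as an anti de Sitter plane.

Proof. First, we view a piece of the Lorentz-Minkowski three-space as a warped product:

$$\left(\mathbb{L}^3 \setminus \langle\vec{x}\rangle,\ g\right) = (AdS_2, g) \times_h (\mathbb{S}^1, dt^2),$$

where the warping function, $h : AdS_2 \to \mathbb{R}^{+}$, is defined as $h(p) = \langle\vec{p}, \vec{y}\rangle$, where $\vec{p}$ denotes the position vector of the point $p$, and the metric in $AdS_2$ is induced from the usual one in $\mathbb{L}^3$. Next, we make an obvious conformal change to obtain a semi-Riemannian product:

$$\left(\mathbb{L}^3 \setminus \langle\vec{x}\rangle,\ \bar{g} = \frac{1}{h^2}\, g\right) = (AdS_2, \bar{g}) \times (\mathbb{S}^1, dt^2).$$

An easy computation shows that $(AdS_2, \bar{g})$ has constant Gaussian curvature −1, which proves that it is an anti de Sitter plane. Denote by $\mathcal{W}$ and $\bar{\mathcal{W}}$ the Willmore functionals of $g$ and $\bar{g}$, respectively. We compute their restriction to $\Sigma_{\varepsilon}$ as follows:

$$\mathcal{W}(\phi_\alpha) = \bar{\mathcal{W}}(\phi_\alpha) = \int_M \left(\bar{H}_\alpha^2 + \bar{R}_\alpha\right) d\bar{A}_\alpha + \int_{\partial M} \bar{\kappa}\, ds,$$

where $\bar{R}_\alpha$ stands for the sectional curvature of $(\mathbb{L}^3 \setminus \langle\vec{x}\rangle, \bar{g})$ along $d\phi(TM)$. Notice that in this case $\bar{R}_\alpha = 0$ because $d\phi(TM)$ is a mixed section in a semi-Riemannian product (see [27]). Furthermore, the geodesic curvature of $\partial M$ in $(M, \phi_\alpha^*(\bar{g}))$, $\bar{\kappa}$, also vanishes identically since $\Gamma = \phi(\partial M)$ is made up of two geodesics in $(\phi(M), \bar{g})$. Next, we compute the mean curvature function of $\phi(M)$ in $(\mathbb{L}^3 \setminus \langle\vec{x}\rangle, \bar{g})$, obtaining

$$\bar{H}_\alpha^2 = \frac{1}{4}\, \bar{\kappa}_\alpha^2,$$

where $\bar{\kappa}_\alpha$ denotes the curvature function of $\alpha$ in the anti de Sitter plane $(AdS_2, \bar{g})$. As a consequence, we have

$$\mathcal{W}(\phi_\alpha) = \bar{\mathcal{W}}(\phi_\alpha) = \frac{1}{4}\int_{[s_1, s_2] \times \mathbb{S}^1} \bar{\kappa}_\alpha^2\, ds\, dt = \frac{\pi}{2}\int_{[s_1, s_2]} \bar{\kappa}_\alpha^2\, ds.$$

This concludes the proof.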
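The "easy computation" of the curvature of the conformally changed half-plane can be illustrated numerically. The sketch below is not taken from the paper and rests on two assumptions: points of the half-plane are written as a·x + b·y with b > 0, so that the induced metric is −da² + db² and the warping function is h = b, and the Gaussian curvature of a metric exp(2u)(−da² + db²) is computed with the standard conformal-change formula K = −exp(−2u)(−u_aa + u_bb). The function names are hypothetical.

```python
import math

def conformal_exponent(a, b):
    """u such that g_bar = exp(2u) * (-da^2 + db^2), with exp(2u) = 1 / b^2."""
    return -math.log(b)

def gaussian_curvature(a, b, step=1e-3):
    """Finite-difference estimate of K = -exp(-2u) * (-u_aa + u_bb) at (a, b)."""
    u = conformal_exponent
    u_aa = (u(a + step, b) - 2 * u(a, b) + u(a - step, b)) / step ** 2
    u_bb = (u(a, b + step) - 2 * u(a, b) + u(a, b - step)) / step ** 2
    wave_operator = -u_aa + u_bb  # Laplace-Beltrami operator of -da^2 + db^2
    return -math.exp(-2 * u(a, b)) * wave_operator
```

Under these assumptions, `gaussian_curvature(a, b)` should return a value close to −1 at every point with `b > 0`, in agreement with the claim that the conformally changed half-plane is an anti de Sitter plane.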


The above result reduces the search for solutions with a pure rotational symmetry to curves in the anti de Sitter plane, $(AdS_2, \bar g)$, which are critical points of the following variational problem, known as the Bernoulli elastica in its Lorentzian version. The source space is the space of clamped curves, $C_\varepsilon$, and the Lagrangian is $E : C_\varepsilon \to \mathbb{R}$, defined by

\[ E(\alpha) = \int_\alpha \kappa_\alpha^2 \, ds. \]

The first variation, $\delta E(\alpha) : T_\alpha C_\varepsilon \to \mathbb{R}$, associated with this functional can be computed, using a standard method which involves some integration by parts (see for example [21] for details), to be

\[ \delta E(\alpha)[W] = \int_\alpha \bar g(\Omega(\alpha), W)\, ds + [B(\alpha, W)]_{s_1}^{s_2}, \]

where $\Omega$ and $B$ denote, respectively, the Euler-Lagrange and the boundary operators, given by

\[ \Omega(\alpha) = 2\varepsilon_2 \nabla_T^3 T + 3\varepsilon_1 \nabla_T\!\left(\bar\kappa_\alpha^2\, T\right) + 2\nabla_T T, \]
\[ B(\alpha, W) = 2\varepsilon_2\, \bar g(\nabla_T W, \nabla_T T) - \bar g\!\left(W,\ 2\varepsilon_2 \nabla_T^2 T + 3\varepsilon_1 \bar\kappa_\alpha^2\, T\right), \]

where $\nabla$ is the Levi-Civita connection of $(AdS_2, \bar g)$, $\varepsilon_1$ is the causal character of $T = \alpha'$ and $\varepsilon_2$ that of the normal. By using the boundary conditions of curves in $C_\varepsilon$ (clamped curves) we can see that $[B(\alpha, W)]_{s_1}^{s_2} = 0$. Therefore, the elasticae of $(AdS_2, \bar g)$ are those curves in $C_\varepsilon$ satisfying the Euler-Lagrange equation $\Omega(\alpha) = 0$. This equation can be transformed using the Frenet equations to obtain the following elastica equation:

\[ 2\bar\kappa_\alpha'' - \bar\kappa_\alpha^3 + 2\varepsilon_2 \bar\kappa_\alpha = 0. \qquad (5) \]

Certainly $\bar\kappa_\alpha = 0$ is a trivial solution of this equation, which means that geodesics of $(AdS_2, \bar g)$ are elasticae. Writing $u = \bar\kappa_\alpha$, the above elastica equation turns out to be $2u'' - u^3 + 2\varepsilon_2 u = 0$, which can be integrated by means of Jacobi elliptic functions, [15], to have

\[ u(s) = C\,\mathrm{cn}\!\left(\lambda(s - a_o),\ \tilde C\right), \]

where $\lambda \in \mathbb{C}\setminus\{0\}$ and $a_o \in \mathbb{R}$ are arbitrary constants, $C^2 = -2(\lambda^2 - \varepsilon_2)$ and $\tilde C^2 = \dfrac{\lambda^2 - \varepsilon_2}{2\lambda^2}$.
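The relations between $C$, $\tilde C$ and $\lambda$ can be checked directly. The following is a short worked verification, assuming only the standard identities $\mathrm{cn}' = -\mathrm{sn}\,\mathrm{dn}$, $\mathrm{sn}^2 + \mathrm{cn}^2 = 1$ and $\mathrm{dn}^2 + k^2\mathrm{sn}^2 = 1$ for Jacobi elliptic functions of modulus $k$; it is a sketch of one possible check, not necessarily the route taken in [15].

```latex
% Second derivative of cn with modulus k:
\mathrm{cn}'' = (2k^2 - 1)\,\mathrm{cn} - 2k^2\,\mathrm{cn}^3 .
% Hence, for u(s) = C\,\mathrm{cn}(\lambda(s - a_o), \tilde C):
u'' = \lambda^2 (2\tilde C^2 - 1)\,u - \frac{2\lambda^2 \tilde C^2}{C^2}\,u^3 .
% Substituting into 2u'' - u^3 + 2\varepsilon_2 u = 0:
\bigl[\,2\lambda^2(2\tilde C^2 - 1) + 2\varepsilon_2\,\bigr]\,u
  - \Bigl[\,\frac{4\lambda^2 \tilde C^2}{C^2} + 1\,\Bigr]\,u^3 = 0 .
% Both brackets vanish exactly when
\tilde C^2 = \frac{\lambda^2 - \varepsilon_2}{2\lambda^2},
\qquad
C^2 = -4\lambda^2 \tilde C^2 = -2(\lambda^2 - \varepsilon_2) .
```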

However, the Jacobi elliptic cosine is a complex-valued function, and the curvature must be a real-valued function. The real-valued solutions of the equation can be obtained using the properties of the Jacobi elliptic cosine, see [11]. They provide the following curvature functions:

\[
\bar\kappa_\alpha(s) = C\,\mathrm{cn}\!\left(\sqrt{\varepsilon_2 - \tfrac{C^2}{2}}\;(s - a_o),\ \tilde C\right),
\quad \text{for } s \in
\begin{cases}
\mathbb{R} \setminus \left\{ a_o + \dfrac{2n+1}{\sqrt{\frac{C^2}{2} - \varepsilon_2}}\, E \right\}_{n \in \mathbb{Z}} & \text{if } \varepsilon_2 - \dfrac{C^2}{2} < 0, \\[2ex]
\mathbb{R} & \text{if } \varepsilon_2 - \dfrac{C^2}{2} > 0,
\end{cases}
\]


M. Barros, M. Caballero, M. Ortega

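where $C \in \mathbb{R}\setminus\{-\sqrt{2\varepsilon_2}, \sqrt{2\varepsilon_2}\}$ and $a_o \in \mathbb{R}$ are arbitrary constants, $\tilde C^2 = \dfrac{C^2}{2C^2 - 4\varepsilon_2}$ and $E$ is the complete elliptic integral of the first kind with modulus $\sqrt{1 - \tilde C^2}$.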

5. $A_2$-Invariant Surfaces in $\mathbb{L}^3$

Next, our aim is to obtain, up to isometries of the Lorentz-Minkowski space, the whole class of solutions of the $O^1(3)$ Nonlinear Sigma Model which are invariant under the group $A_2$. A priori, this could seem similar to the case studied above, but it turns out to be more difficult and subtle. The main difficulty now is to find the symmetric points, i.e. the immersions that are invariant under $A_2$. Let us denote by $x$ the space-like axis and consider the only two degenerate planes containing $x$. $\mathbb{L}^3$ minus these two planes consists of four open regions that we will call fundamental regions. The $A_2$-invariant surfaces contained in a fundamental region will be named fundamental symmetric surfaces. Certainly, we can get a wide class of surfaces invariant under $A_2$ by taking a curve immersed in any non-degenerate plane of $\mathbb{L}^3$ containing $x$, whose trace does not intersect the axis, and rotating it by applying the elements of $A_2$. These surfaces are those known in the literature as rotational surfaces with space-like axis, see for example [19]. All of them are fundamental symmetric surfaces. However, we can also find symmetric surfaces that leave a fundamental region to emerge in another one. In some sense, they are obtained by gluing fundamental symmetric surfaces. These extended surfaces have usually been avoided in the literature because of their difficulty. Nevertheless, the class includes famous surfaces, such as a saddle surface and a one-sheet hyperboloid. In this long section, we make an exhaustive analysis to completely describe the whole class of surfaces in $\mathbb{L}^3$ that admit a rotational group of symmetries with space-like axis, i.e. surfaces that are invariant under a group of hyperbolic rotations. For the sake of clarity, we split our study into several subsections.

5.1. Fundamental regions and fundamental surfaces. Let $x$ be a unitary space-like vector in $\mathbb{L}^3$. We choose an orthonormal basis, $B = \{x, y, z\}$, where $y$ is also space-like and $z$ is time-like. We work in coordinates with respect to $B$, so that the metric in $\mathbb{L}^3$ is written as $g \equiv dx^2 + dy^2 - dz^2$. In $\mathbb{L}^3 \setminus x$ we will distinguish the following regions


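that will be called fundamental regions:

\[
\begin{aligned}
R^+ &= \{(x, y, z) \in \mathbb{L}^3 : z^2 - y^2 > 0,\ z > 0\}, \\
R^- &= \{(x, y, z) \in \mathbb{L}^3 : z^2 - y^2 > 0,\ z < 0\}, \\
Q^+ &= \{(x, y, z) \in \mathbb{L}^3 : z^2 - y^2 < 0,\ y > 0\}, \text{ and} \\
Q^- &= \{(x, y, z) \in \mathbb{L}^3 : z^2 - y^2 < 0,\ y < 0\}.
\end{aligned}
\]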
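The four fundamental regions are cut out by the sign of $z^2 - y^2$ together with the sign of $z$ or $y$, which makes them easy to test numerically. The following is a minimal sketch; the names `region` and `hyperbolic_rotation` are hypothetical, and the explicit form of the hyperbolic rotation about the $x$-axis is an assumption about how the elements $\xi_t$ of $A_2$ act in the coordinates above.

```python
import math

def hyperbolic_rotation(t, p):
    """Assumed action of xi_t in A2: a hyperbolic rotation fixing the x-axis."""
    x, y, z = p
    return (x,
            y * math.cosh(t) + z * math.sinh(t),
            y * math.sinh(t) + z * math.cosh(t))

def region(p, tol=1e-12):
    """Classify a point of L^3 according to the sign of z^2 - y^2."""
    x, y, z = p
    q = z * z - y * y
    if abs(q) <= tol:
        # y^2 = z^2: either the axis itself or one of the two degenerate planes
        return "axis" if abs(y) <= tol and abs(z) <= tol else "light-like plane"
    if q > 0:
        return "R+" if z > 0 else "R-"
    return "Q+" if y > 0 else "Q-"
```

Since $z^2 - y^2$ is preserved by the assumed rotation, a point and its rotated image should always be classified in the same region, which is the numerical counterpart of the regions being $A_2$-invariant.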

Definition 5.1. An $A_2$-invariant surface immersed in $\mathbb{L}^3$ is said to be a fundamental symmetric surface (or simply a fundamental surface) if it is contained in only one fundamental region.

We define $R = \{(x, y, z) \in \mathbb{L}^3 : y = 0\}$ and $Q = \{(x, y, z) \in \mathbb{L}^3 : z = 0\}$. In this setting, we can introduce the notion of rotational surface generated by a curve.

Definition 5.2. Let $\gamma$ be a curve immersed in either $R$ or $Q$, with domain $I \subseteq \mathbb{R}$. We define the rotational surface generated by $\gamma$ as $\Sigma_\gamma := \{\xi_t(\gamma(s)) : s \in I,\ t \in \mathbb{R}\}$.

Remark 5.3. Notice that, in the case $\gamma$ intersects $x$, $\Sigma_\gamma$ is not a topological surface.

Next, we consider the following open half planes:

\[
\begin{aligned}
\tilde R^+ &= R^+ \cap R = \{(x, 0, z) \in \mathbb{L}^3 : z > 0\}, \\
\tilde R^- &= R^- \cap R = \{(x, 0, z) \in \mathbb{L}^3 : z < 0\}, \\
\tilde Q^+ &= Q^+ \cap Q = \{(x, y, 0) \in \mathbb{L}^3 : y > 0\}, \\
\tilde Q^- &= Q^- \cap Q = \{(x, y, 0) \in \mathbb{L}^3 : y < 0\}, \\
P^+_+ &= \partial R^+ \cap \partial Q^+ \setminus x = \{(x, y, z) \in \mathbb{L}^3 : y = z > 0\}, \\
P^+_- &= \partial R^+ \cap \partial Q^- \setminus x = \{(x, y, z) \in \mathbb{L}^3 : -y = z > 0\}, \\
P^-_+ &= \partial R^- \cap \partial Q^+ \setminus x = \{(x, y, z) \in \mathbb{L}^3 : y = -z > 0\}, \text{ and} \\
P^-_- &= \partial R^- \cap \partial Q^- \setminus x = \{(x, y, z) \in \mathbb{L}^3 : y = z < 0\}.
\end{aligned}
\]

We also consider the following Lorentzian unitary circles:

\[
\begin{aligned}
H^+ &= \{(0, y, z) \in \mathbb{L}^3 : z^2 - y^2 = 1,\ z > 0\}, \\
H^- &= \{(0, y, z) \in \mathbb{L}^3 : z^2 - y^2 = 1,\ z < 0\}, \\
J^+ &= \{(0, y, z) \in \mathbb{L}^3 : y^2 - z^2 = 1,\ y > 0\}, \text{ and} \\
J^- &= \{(0, y, z) \in \mathbb{L}^3 : y^2 - z^2 = 1,\ y < 0\}.
\end{aligned}
\]

It should be noticed that while $H^+$ and $H^-$ are space-like in $\mathbb{L}^3$, with metric $dt^2$, $J^+$ and $J^-$ are time-like, with metric denoted by $-dt^2$. Next, we define the following positive functions:

\[
\begin{aligned}
f_+ &: \tilde R^+ \to \mathbb{R}, & f_+(x, 0, z) &= z, \\
f_- &: \tilde R^- \to \mathbb{R}, & f_-(x, 0, z) &= -z, \\
h_+ &: \tilde Q^+ \to \mathbb{R}, & h_+(x, y, 0) &= y, \text{ and} \\
h_- &: \tilde Q^- \to \mathbb{R}, & h_-(x, y, 0) &= -y.
\end{aligned}
\]


In this setting, it is not difficult to check the following warped product decompositions:

\[
\begin{aligned}
(R^+, g) &= (\tilde R^+, g) \times_{f_+} (H^+, dt^2), \\
(R^-, g) &= (\tilde R^-, g) \times_{f_-} (H^-, dt^2), \\
(Q^+, g) &= (\tilde Q^+, g) \times_{h_+} (J^+, -dt^2), \text{ and} \\
(Q^-, g) &= (\tilde Q^-, g) \times_{h_-} (J^-, -dt^2).
\end{aligned}
\]
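For the reader who wants to see the first and third decompositions explicitly, here is a short worked computation. It assumes the coordinates $(x, r, t)$, $r > 0$, obtained by applying a hyperbolic rotation of parameter $t$ to the point $(x, 0, r)$ of $\tilde R^+$ (respectively $(x, r, 0)$ of $\tilde Q^+$); this choice of coordinates is an assumption made for illustration only.

```latex
% Region R^+ :  (x, y, z) = (x,\ r\sinh t,\ r\cosh t), \quad r > 0.
dy = \sinh t \, dr + r\cosh t \, dt, \qquad dz = \cosh t \, dr + r \sinh t \, dt,
\\
g = dx^2 + dy^2 - dz^2 = \underbrace{dx^2 - dr^2}_{g|_{\tilde R^+}} + \; r^2\, dt^2,
\qquad f_+ = r .
% Region Q^+ :  (x, y, z) = (x,\ r\cosh t,\ r\sinh t), \quad r > 0.
\\
g = dx^2 + dy^2 - dz^2 = \underbrace{dx^2 + dr^2}_{g|_{\tilde Q^+}} - \; r^2\, dt^2,
\qquad h_+ = r .
```

In both cases the cross terms in $dr\,dt$ cancel, and the coefficient of $\pm dt^2$ is the square of the warping function, as the decompositions require.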

Furthermore, when we make the obvious conformal changes, it is easy to see that:
• $\left(\tilde R^+, \frac{1}{f_+^2}\, g\right)$ and $\left(\tilde R^-, \frac{1}{f_-^2}\, g\right)$ are de Sitter planes with curvature 1, and
• $\left(\tilde Q^+, \frac{1}{h_+^2}\, g\right)$ and $\left(\tilde Q^-, \frac{1}{h_-^2}\, g\right)$ are hyperbolic planes with curvature $-1$.

Consequently, we obtain the following result:

Lemma 5.1.
1. $\left(R^+, \frac{1}{f_+^2}\, g\right)$ and $\left(R^-, \frac{1}{f_-^2}\, g\right)$ are the semi-Riemannian product of a de Sitter plane and a space-like Lorentzian unitary circle.
2. $\left(Q^+, \frac{1}{h_+^2}\, g\right)$ and $\left(Q^-, \frac{1}{h_-^2}\, g\right)$ are the semi-Riemannian product of a hyperbolic plane and a time-like Lorentzian unitary circle.

5.2. Fundamental symmetric immersions. In this subsection, we completely describe those non-degenerate immersions, $\phi : M \to \mathbb{L}^3$, whose image is a fundamental symmetric surface. We will see that they correspond to rotational surfaces generated by non-degenerate curves that do not intersect $x$. To proceed, we consider separately the Riemannian and Lorentzian cases. Since $A_2$-invariant Riemannian surfaces automatically lie in $R^+$ or $R^-$, the following result ensures that all the $A_2$-invariant Riemannian surfaces are fundamental symmetric surfaces, and it classifies them.

Theorem 5.4. Let $M$ be a connected surface and $\phi : M \to \mathbb{L}^3$ an immersion. Then, $(M, \phi^*(g))$ is Riemannian and $A_2$-invariant if and only if there exists a smooth space-like curve, $\alpha$, contained in either $\tilde R^+$ or $\tilde R^-$, such that $\phi(M) = \Sigma_\alpha$. In particular, these surfaces lie in $R^+$ or $R^-$.

Proof. The sufficient condition is widely known, [19]. To prove the converse, assume that $\phi : M \to \mathbb{L}^3$ is an immersion such that $(M, \phi^*(g))$ is Riemannian and $\phi(M)$ is invariant under the action of $A_2$. This implies that $\phi(M)$ is foliated by space-like orbits. On the other hand, the orbits in $Q^+ \cup Q^-$ are time-like and those in $P^+_+ \cup P^+_- \cup P^-_+ \cup P^-_-$ are light-like. This shows that $\phi(M) \subset R^+ \cup R^- \cup x$. Consequently, there exists a space-like curve, $\alpha : J \subset \mathbb{R} \to R$, such that $\phi(M) = \{\xi_t(\alpha(s)) : s \in J,\ t \in \mathbb{R}\}$. However, if $\alpha(J) \cap \tilde R^+ \neq \emptyset$ then $\alpha(J) \subset \tilde R^+$. Indeed, given a point $s_0 \in J$ such that $\alpha(s_0) \in x$, the surface $\Sigma_\alpha$ is not even a topological manifold at $\alpha(s_0)$. This concludes the proof.

In the Lorentzian case, the behavior is different. The reason is that it is possible to find $A_2$-invariant Lorentzian surfaces in the four fundamental regions, according to the following result.


Theorem 5.5. Let $M$ be a connected surface and $\phi : M \to \mathbb{L}^3$ an immersion. Then, $(M, \phi^*(g))$ is a Lorentzian symmetric fundamental surface if and only if either
• there exists a time-like curve, $\alpha : I \to \tilde R^+$, such that $\phi(M) = \Sigma_\alpha$,
• there exists a time-like curve, $\alpha : I \to \tilde R^-$, such that $\phi(M) = \Sigma_\alpha$,
• there exists a space-like curve, $\alpha : I \to \tilde Q^+$, such that $\phi(M) = \Sigma_\alpha$, or
• there exists a space-like curve, $\alpha : I \to \tilde Q^-$, such that $\phi(M) = \Sigma_\alpha$.

The proof is left to the reader because it is similar to that of Theorem 5.4.

5.3. Some examples to motivate the extended Lorentzian case. For a better understanding of the general (or extended) Lorentzian case, we analyze two examples of $A_2$-invariant Lorentzian surfaces that intersect more than one fundamental region.

Example 1: A saddle surface. We consider the following saddle surface in $\mathbb{L}^3$:

\[ S = \left\{ (x, y, z) \in \mathbb{L}^3 : x = y^2 - z^2 > -\tfrac{1}{4} \right\}, \]

which admits a natural Monge parametrization as a graph. Indeed, in the plane $x = 0$, we consider the map

\[ X : \mathbb{R} \times \left(-\tfrac{1}{2}, \tfrac{1}{2}\right) \to S \subset \mathbb{L}^3, \qquad X(y, z) = (y^2 - z^2, y, z). \]

$S$ is a Lorentzian surface in $\mathbb{L}^3$, because we have considered only the piece where the induced metric is Lorentzian. $S$ is also invariant under the group $A_2$. In addition, every fundamental region contains a piece of this saddle surface. According to the notation we are using, these pieces can be described as follows:

• $\Sigma_{\alpha^+}$, where $\alpha^+ : (0, 1/2) \to \tilde R^+$, $\alpha^+(s) = (-s^2, 0, s)$,
• $\Sigma_{\alpha^-}$, where $\alpha^- : (-1/2, 0) \to \tilde R^-$, $\alpha^-(s) = (-s^2, 0, s)$,
• $\Sigma_{\beta^+}$, where $\beta^+ : (0, +\infty) \to \tilde Q^+$, $\beta^+(s) = (s^2, s, 0)$, and
• $\Sigma_{\beta^-}$, where $\beta^- : (-\infty, 0) \to \tilde Q^-$, $\beta^-(s) = (s^2, s, 0)$.

Obviously, these surfaces are glued along the common boundaries, obtaining $S$. Though the gluing mechanism is obvious in this case, we will emphasize it as a motivation for the later extension. We will work in a neighborhood of the boundaries of the above four pieces. Firstly, it should be noticed that we can work with the following couple of curves:

• a time-like curve $\alpha : (-\delta, \delta) \to R$, $\alpha(s) = (f_\alpha(s), 0, s) = (-s^2, 0, s)$, and
• a space-like curve $\beta : (-\delta, \delta) \to Q$, $\beta(s) = (f_\beta(s), s, 0) = (s^2, s, 0)$.

They are defined as graphs for a certain $\delta \in (0, 1/2)$. In addition, we have a smooth gluing function, $F : \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \delta^2\} \to \mathbb{R}$, defined as

\[
F(y, z) =
\begin{cases}
f_\alpha\!\left(\mathrm{sign}(z)\sqrt{z^2 - y^2}\right) = y^2 - z^2 & \text{if } z^2 \geq y^2, \\[1ex]
f_\beta\!\left(\mathrm{sign}(y)\sqrt{y^2 - z^2}\right) = y^2 - z^2 & \text{if } y^2 \geq z^2.
\end{cases}
\]
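The piecewise definition of the gluing function for the saddle can be evaluated numerically and compared with the closed form $y^2 - z^2$. The sketch below does just that; the function names `f_alpha`, `f_beta` and `gluing_F` are hypothetical labels for the objects of Example 1, and `math.copysign` is used to implement $\mathrm{sign}(\cdot)\sqrt{\cdot}$.

```python
import math

def f_alpha(s):
    """Profile of the time-like curve alpha(s) = (-s^2, 0, s)."""
    return -s * s

def f_beta(s):
    """Profile of the space-like curve beta(s) = (s^2, s, 0)."""
    return s * s

def gluing_F(y, z):
    """Piecewise gluing function of Example 1."""
    if z * z >= y * y:
        # region where z^2 >= y^2: use the time-like profile
        return f_alpha(math.copysign(math.sqrt(z * z - y * y), z))
    # region where y^2 > z^2: use the space-like profile
    return f_beta(math.copysign(math.sqrt(y * y - z * z), y))
```

On both branches the value reduces to $y^2 - z^2$, so the two definitions fit together into a single polynomial, which is why the gluing is smooth in this example.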


When we consider the four pieces altogether, the Monge parametrization is just obtained in terms of $F$.

Example 2: A one-sheet hyperboloid. We consider the following one-sheet hyperboloid:

\[ H = \{(x, y, z) \in \mathbb{L}^3 : x^2 + y^2 - z^2 = 1\}. \]

Clearly, it is $A_2$-invariant and it is not contained in any fundamental region. In fact, the intersection of $H$ and the fundamental regions consists of six connected pieces. We denote by $p = (1, 0, 0)$ and $q = (-1, 0, 0)$ the two points in which $H$ intersects the axis. Then, it is easy to check that the boundaries of the above six pieces are just the eight light-like orbits with boundary either $p$ or $q$. Thus, it is necessary to glue twice to obtain $H$. Firstly, we work around $p$. We choose $\delta$ satisfying $0 < \delta < 1$ and we define:

• a time-like curve $\alpha_p : (-\delta, \delta) \to R$, $\alpha_p(s) = (f_{\alpha_p}(s), 0, s) = (+\sqrt{1 + s^2}, 0, s)$, and
• a space-like curve $\beta_p : (-\delta, \delta) \to Q$, $\beta_p(s) = (f_{\beta_p}(s), s, 0) = (+\sqrt{1 - s^2}, s, 0)$,

which satisfy $\alpha_p(0) = \beta_p(0) = p$. Then, we define the smooth gluing function $F_p : \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \delta^2\} \to \mathbb{R}$ as

\[
F_p(y, z) =
\begin{cases}
f_{\alpha_p}\!\left(\mathrm{sign}(z)\sqrt{z^2 - y^2}\right) = +\sqrt{1 - y^2 + z^2} & \text{if } z^2 \geq y^2, \\[1ex]
f_{\beta_p}\!\left(\mathrm{sign}(y)\sqrt{y^2 - z^2}\right) = +\sqrt{1 - y^2 + z^2} & \text{if } y^2 \geq z^2.
\end{cases}
\]

Now, in terms of this gluing function, we can define a parametrization of the one-sheet hyperboloid around $p$ as follows:

\[ X_p : \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \delta^2\} \to H \subset \mathbb{L}^3, \qquad X_p(y, z) = (F_p(y, z), y, z) = \left(+\sqrt{1 - y^2 + z^2},\ y,\ z\right). \]

Finally, using the negative square root we obtain curves and a gluing function to paste the pieces around $q$.
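As a quick sanity check of Example 2, one can verify numerically that the parametrization around $p$ lands on $H$ and that $H$ is preserved by hyperbolic rotations. This is a minimal sketch; the names `X_p`, `xi` and `on_H` are hypothetical, and the matrix form of the rotation about the $x$-axis is an assumption about the action of $A_2$.

```python
import math

def xi(t, p):
    """Assumed hyperbolic rotation about the x-axis (element of A2)."""
    x, y, z = p
    return (x,
            y * math.cosh(t) + z * math.sinh(t),
            y * math.sinh(t) + z * math.cosh(t))

def X_p(y, z):
    """Parametrization of the hyperboloid around p = (1, 0, 0)."""
    return (math.sqrt(1 - y * y + z * z), y, z)

def on_H(p, tol=1e-9):
    """Check the defining equation x^2 + y^2 - z^2 = 1 up to a tolerance."""
    x, y, z = p
    return abs(x * x + y * y - z * z - 1) < tol
```

For instance, `on_H(xi(0.7, X_p(0.3, -0.2)))` evaluates the defining equation at a rotated point of the chart; replacing the positive root in `X_p` by the negative one gives the analogous chart around $q$.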


5.4. Dissection of an $A_2$-invariant Lorentzian surface. Throughout this subsection, we assume that $M$ is a connected smooth surface and $\phi : M \to \mathbb{L}^3$ is an immersion such that $(M, \phi^*(g))$ is Lorentzian and $\phi(M)$ is $A_2$-invariant. We are going to use surgery to study the pieces of $\phi(M)$ that lie in each of the fundamental regions and in $P^+_+ \cup P^+_- \cup P^-_+ \cup P^-_-$. The following assertions can be checked by the reader.

1. $\phi(M) \cap (R^+ \cup R^-)$ is empty or it is a countable union of Lorentzian fundamental surfaces that are generated by time-like curves immersed in either $\tilde R^+$ or $\tilde R^-$, i.e.,
\[ \Big( \bigcup_{e \in E} \Sigma_{\alpha_e} \Big) \cup \Big( \bigcup_{f \in F} \Sigma_{\alpha_f} \Big), \]
where $\{\alpha_e : e \in E\}$ and $\{\alpha_f : f \in F\}$ are countable families of time-like curves in $\tilde R^+$ and $\tilde R^-$, respectively.
2. $\phi(M) \cap (Q^+ \cup Q^-)$ is empty or it is a countable union of Lorentzian fundamental surfaces with profile curves immersed in either $\tilde Q^+$ or $\tilde Q^-$, i.e.,
\[ \Big( \bigcup_{\lambda \in \Lambda} \Sigma_{\beta_\lambda} \Big) \cup \Big( \bigcup_{\theta \in \Theta} \Sigma_{\beta_\theta} \Big), \]
where $\{\beta_\lambda : \lambda \in \Lambda\}$ and $\{\beta_\theta : \theta \in \Theta\}$ are countable families of curves in $\tilde Q^+$ and $\tilde Q^-$, respectively.
3. $\phi(M) \cap (P^+_+ \cup P^+_- \cup P^-_+ \cup P^-_-)$ is empty or a countable set of light-like orbits lying in the boundary of the Lorentzian fundamental surfaces mentioned in the previous items.

We have already studied the Lorentzian fundamental surfaces, so we will focus on the case in which $\phi(M)$ is not a fundamental surface. So, we assume that $\phi(M) \cap (R^+ \cup R^-) \neq \emptyset$ and $\phi(M) \cap (Q^+ \cup Q^-) \neq \emptyset$. Once we have made the dissection, we will study the curves immersed in $R$, the curves immersed in $Q$ and finally, how these curves are related.

Generating curves immersed in $R$. By using a connectedness argument, as well as the nonexistence of closed time-like curves in $R$, we can state the following facts about the curves in $\tilde R^+$ and $\tilde R^-$:

[R1] For each $\alpha \in \{\alpha_e : e \in E\} \cup \{\alpha_f : f \in F\}$, there exists a connected submanifold $U_\alpha \subseteq M$ such that $\phi(U_\alpha) = \Sigma_\alpha$. Moreover, if there exists more than one such submanifold, we include in $\{\alpha_e : e \in E\} \cup \{\alpha_f : f \in F\}$ as many copies of $\alpha$ as there are submanifolds, and we denote them with different subindices.

[R2] If $\alpha \in \{\alpha_e : e \in E\} \cup \{\alpha_f : f \in F\}$, then $\Sigma_\alpha$ is maximal in the sense that there is no connected submanifold $V$ satisfying $U_\alpha \subsetneq V$ such that $\phi(V)$ does not intersect $x$.

[R3] Many curves of $\{\alpha_e : e \in E\} \cup \{\alpha_f : f \in F\}$ can be glued to obtain smooth or piecewise smooth time-like curves in $R$. Indeed, given $p \in x$, if there exist $e \in E$ and $f \in F$ satisfying
• $p$ belongs to the boundary of these two curves, and
• there exists $U \subseteq M$ open and connected, such that $U \cap \phi^{-1}(R^+) = U_{\alpha_e}$ and $U \cap \phi^{-1}(R^-) = U_{\alpha_f}$,


then these two curves can be glued. If $p \in \phi(M)$, then the union is a time-like smooth curve in $R$. Otherwise, the union is time-like and smooth everywhere except at $p$, where we only know it is continuous. It is easy to check that this procedure cannot be applied to two curves of $\{\alpha_e : e \in E\}$ or two curves of $\{\alpha_f : f \in F\}$, because we would not obtain a Lorentzian surface.

[R4] After all the possible gluing processes, we obtain a countable family of continuous piecewise smooth time-like curves in $R$, $\{\alpha_i : J_i \to R : i \in I\}$, $J_i \subseteq \mathbb{R}$ being an interval for all $i \in I$ (they are smooth everywhere, except at those points satisfying $p \notin \phi(M)$ and $p \in x$).

[R5] As $R$ is a Lorentzian plane, we can deduce that for each $i \in I$ there exists a unique $s_i \in \mathrm{Closure}(J_i)$ such that $p_i = \lim_{s \to s_i} \alpha_i(s)$ belongs to $x$. In addition, either $s_i$ belongs to the boundary of $J_i$, or the curve changes from one fundamental region to another at $p_i$.

Generating curves immersed in $Q$. The following assertions hold:

[Q1] The properties analogous to [R1], [R2] and [R3] hold true for curves in $\{\beta_\lambda : \lambda \in \Lambda\} \cup \{\beta_\theta : \theta \in \Theta\}$.

[Q2] After all the possible gluing processes, we obtain only one continuous, piecewise smooth curve in $Q$ (smooth everywhere except at the intersections with $x$, where we only know the curve is continuous). We will denote this curve by $\beta$. This property is a consequence of the connectedness of $M$ and [R5].

[Q3] The domain of $\beta$, $J$, can be an interval or $S^1$. The first situation corresponds to the case in which either $\beta$ is not closed, or $\beta$ is closed but there are two curves in $\{\beta_\lambda : \lambda \in \Lambda\} \cup \{\beta_\theta : \theta \in \Theta\}$ that underwent only one gluing process.

[Q4] Each time $\beta$ intersects $x$, either the curve changes from one fundamental region to another, or the intersection is a boundary point of $\beta$, or the point belongs to the only two curves in $\{\beta_\lambda : \lambda \in \Lambda\} \cup \{\beta_\theta : \theta \in \Theta\}$ that underwent only one gluing process.

Connecting profile curves with different causal character. At this point, we know that the surface $\phi(M)$ is generated by $\{\alpha_i : i \in I\}$ and $\beta$. However, we need a deeper understanding of the relation between $\beta$ and the curves $\{\alpha_i : i \in I\}$, as well as the way of constructing the original surface from the generating curves. Thanks to the connectedness of $M$, Remark 5.3 and the properties above, it is easy to check the following assertions.

P1. $\forall s_o \in J$ such that $\beta(s_o) \in x$, there exist $\varepsilon > 0$ and $i \in I$, satisfying
• $\beta(s_o) \in \mathrm{Closure}(\mathrm{trace}(\alpha_i))$, and
• $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ is a smooth surface, where $J_o = \,]s_o - \varepsilon, s_o + \varepsilon[\, \cap J$.
Roughly speaking, for each $s_o \in J$ such that $\beta(s_o) \in x$, there exists $\alpha_i$ gluing appropriately with $\beta$ in a neighborhood of $s_o$.

P2. $\forall i \in I$, there exist $\varepsilon > 0$ and $s_o \in \mathrm{Closure}(J)$, such that
• $\lim_{s \to s_o} \beta(s) = \lim_{s \to s_i} \alpha_i(s)$, and
• $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ is a smooth surface, where $J_o = \,]s_o - \varepsilon, s_o + \varepsilon[\, \cap J$.
Roughly speaking, for each $\alpha_i$, there exists $s_o \in \mathrm{Closure}(J)$ such that $\beta$ and $\alpha_i$ glue appropriately in a neighborhood of $s_o$.


Remark 5.6. If we ask $U_{\alpha_i^+} \cup U_{\alpha_i^-} \cup U_{\beta^+} \cup U_{\beta^-}$ to be connected, where $\alpha_i^\pm = \alpha_i|_{J_i \cap R^\pm}$ and $\beta^\pm = \beta|_{J_o \cap R^\pm}$, then $i$ is unique in P1, and $s_o$ is unique in P2. In this case, P1 and P2 can be viewed as injective maps, $P1 : \{s_o \in J : \beta(s_o) \in x\} \to I$ and $P2 : I \to \{s_o \in \mathrm{Closure}(J) : \lim_{s \to s_o} \beta(s) \in x\}$, verifying $P2 \circ P1(s) = s$ for all $s \in J$ such that $\beta(s) \in x$.

Remark 5.7. When considering $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$, we are also adding the light-like orbits belonging to the boundary of these two surfaces and to $\phi(M)$ (otherwise the union would not be connected).

Light-like orbits connecting the pieces.

Definition 5.8. Given any point $p = (x_o, 0, 0) \in x$, we define the light-like orbital at $p$, $O_p$, as the set consisting of $p$ together with the four light-like orbits through this point, i.e. $O_p = \{(x_o, y, z) : y^2 = z^2\}$.

Definition 5.9. For each $s_o \in \mathrm{Closure}(J)$ such that $p = \lim_{s \to s_o} \beta(s) \in x$, we define the light-like patch at $s_o$ as $B_{s_o} = O_{p_o} \cap \phi(U)$, where $p_o = \lim_{s \to s_o} \beta(s)$ and $U \subseteq M$ is a domain such that:
• $\phi(U) \cap (Q^+ \cup Q^-) = \Sigma_{\beta|_{J_o}}$, and
• if $\exists\, i \in I$ satisfying $P2(i) = s_o$, then $\phi(U) \cap (R^+ \cup R^-) = \Sigma_{\alpha_i}$.

Remark 5.10. The light-like patch at $s_o$ is just the union of the elements of a subset of $\{\{p_o\},\ \{(x_o, a, a) : a > 0\},\ \{(x_o, a, a) : a < 0\},\ \{(x_o, -a, a) : a > 0\},\ \{(x_o, -a, a) : a < 0\}\}$, where $p_o = (x_o, 0, 0) = \lim_{s \to s_o} \beta(s)$.

Summary of the dissection. Given an $A_2$-invariant Lorentzian immersion, $\phi : M \to \mathbb{L}^3$, there exists a family of time-like curves in $R$, $\{\alpha_i : J_i \subseteq \mathbb{R} \to R : i \in I\}$, and a curve in $Q$, $\beta : J \to Q$, satisfying:
• All of them are smooth everywhere except at those points belonging to $x \setminus \phi(M)$, where the curve is only known to be continuous.
• For each $i \in I$, $J_i$ is an interval. Moreover, there exists only one $s_i \in \mathrm{Closure}(J_i)$ such that $p_i = \lim_{s \to s_i} \alpha_i(s)$ belongs to $x$. If $s_i$ is not a boundary point of $J_i$, then it is a point at which the curve goes from one fundamental region into another.
• The domain of $\beta$, $J$, is either an interval or $S^1$, see [Q3].
• Each time $\beta$ intersects $x$, either the curve goes from one fundamental region to another, or the point belongs to $\{\lim_{s \to s_o} \beta(s) : s_o \in \partial J\}$.
• P1 and P2 hold.

Also, for each $s_o \in \mathrm{Closure}(J)$ such that $\lim_{s \to s_o} \beta(s) \in x$, there exists a set, $B_{s_o}$, called light-like patch, consisting of the union of some of the following sets: $\{p_o\}$, $\{(x_o, a, a) : a > 0\}$, $\{(x_o, a, a) : a < 0\}$, $\{(x_o, -a, a) : a > 0\}$ and $\{(x_o, -a, a) : a < 0\}$ (where $p_o = (x_o, 0, 0) = \lim_{s \to s_o} \beta(s)$). In this setting,

\[ \phi(M) = \Sigma_\beta \cup \Big( \bigcup_{i \in I} \Sigma_{\alpha_i} \Big) \cup \Big( \bigcup \big\{ B_{s_o} : \lim_{s \to s_o} \beta(s) \in x \big\} \Big). \]

Note that $\Sigma_{\alpha_i}$ and $\Sigma_\beta$ correspond to Definition 5.2, but here we are removing the points of the axis. From now on, we will use this assumption freely.


5.5. Characterization of the gluing. In this subsection, we characterize the way to paste two Lorentzian fundamental symmetric surfaces, $\Sigma_\alpha$ and $\Sigma_\beta$, which are generated by suitable curves, $\alpha$ in $R$ and $\beta$ in $Q$. Let $\alpha$ be a time-like curve in $R$ and $\beta$ a curve in $Q$, such that $\lim_{s \to 0} \alpha(s) = \lim_{s \to 0} \beta(s) = p = (x_o, 0, 0) \in x$. Firstly, we choose an appropriate light-like patch, $B_0 \subset O_p = \{(x_o, y, z) : y^2 = z^2\}$. Secondly, we take a neighborhood of 0 in the domain of $\beta$ such that $\lim_{s \to \tilde s} \beta(s) \notin x$, for $\tilde s \neq s_o$. There is no loss of generality, because only that neighborhood is important in the gluing process. In this setting, the following result can be regarded as the key to understanding how $A_2$-invariant surfaces generated by curves glue smoothly. Moreover, it characterizes the gluing mechanism.

Theorem 5.11. Local Gluing Theorem. In the setting of this subsection, $\Sigma_\alpha$ and $\Sigma_\beta$ glue smoothly and the metric along the union is Lorentzian, i.e. $\Sigma = \Sigma_\alpha \cup \Sigma_\beta \cup B_0$ is a smooth Lorentzian surface in a neighborhood of $B_0$, if, and only if, there exist smooth functions $f_\alpha$ and $f_\beta$ such that the following assertions hold:

LG1. $\alpha(u) = (f_\alpha(u), 0, u)$ is a parametrization of $\alpha$ in a neighborhood of $p$.
LG2. $\beta(u) = (f_\beta(u), u, 0)$ is a parametrization of $\beta$ in a neighborhood of $p$.
LG3. The following function is smooth:
\[
F(y, z) =
\begin{cases}
f_\alpha\!\left(\mathrm{sign}(z)\sqrt{z^2 - y^2}\right) & \text{if } z^2 \geq y^2, \\[1ex]
f_\beta\!\left(\mathrm{sign}(y)\sqrt{y^2 - z^2}\right) & \text{if } y^2 \geq z^2,
\end{cases}
\]
defined on a neighborhood of $\{(y, z) : (x_o, y, z) \in B_0\}$. $F$ is called the gluing function.

Proof. Assume that $\Sigma$ is a smooth Lorentzian surface in a neighborhood of $B_0$. We split the proof of the necessary condition into two cases.

Case 1: If both $\alpha$ and $\beta$ do not cross the axis, each of them is contained in the union of a fundamental region and the axis. We can assume that $\mathrm{trace}(\alpha) \subseteq \tilde R^+ \cup x$ and $\mathrm{trace}(\beta) \subseteq \tilde Q^+ \cup x$, since the proof for the other cases works similarly.

We prove LG1. The curve $\alpha$ can be written down as $\alpha(s) = (\alpha_1(s), 0, \alpha_3(s))$, and we can assume it is arc-length parametrized. Then, $\alpha_3'(s)^2 = \alpha_1'(s)^2 + 1 > 0$. Thus, by using the Inverse Function Theorem, we obtain $\rho > 0$ such that $\alpha_3 : \,]0, \rho[\, \to \alpha_3(]0, \rho[)$ is a diffeomorphism. Now, from $\lim_{s \to 0} \alpha_3(s) = 0$ and $\alpha_3 > 0$, we get that $\alpha_3(]0, \rho[) = \,]0, \epsilon_\alpha[$ for a certain $\epsilon_\alpha > 0$. This provides a smooth function $f_\alpha : \,]0, \epsilon_\alpha[\, \to \mathbb{R}$, defined by $f_\alpha(u) = \alpha_1 \circ \alpha_3^{-1}(u)$, and so LG1 holds.

We prove LG2. We can write down $\beta$ as $\beta(s) = (\beta_1(s), \beta_2(s), 0)$, and we can assume this parametrization is arc-length. Then, we consider $X_\beta(s, t) := \xi_t(\beta(s))$. Our main aim is to show that $\lim_{s \to 0} \beta_2'(s) \neq 0$. Suppose, contrary to our claim, that $\lim_{s \to 0} \beta_2'(s) = 0$. It should be noticed that the light-like orbit $\{(x_o, y, y) \in \mathbb{L}^3 : y > 0\}$ is contained in $B_0$. Given a point in that orbit, $q = (x_o, a, a)$ with $a > 0$, let $\gamma_q(\tau) = X_\beta(s_q(\tau), t_q(\tau))$ be a smooth curve with $\lim_{\tau \to 0} \gamma_q(\tau) = q$. Obviously, this implies that $\lim_{\tau \to 0} s_q(\tau) = 0$. As $\lim_{s \to 0} \beta_2(s) = 0$, then

\[ \lim_{\tau \to 0} \cosh(t_q(\tau)) = \lim_{\tau \to 0} \sinh(t_q(\tau)) = +\infty. \qquad (6) \]

On the other hand, we compute the Riemannian normal to $\Sigma$ along $\gamma_q$, obtaining

\[
N_R(\gamma_q(\tau)) = \frac{\big(\beta_2'(s_q(\tau)),\ -\beta_1'(s_q(\tau)) \cosh(t_q(\tau)),\ \beta_1'(s_q(\tau)) \sinh(t_q(\tau))\big)}{\sqrt{\beta_2'(s_q(\tau))^2 + \beta_1'(s_q(\tau))^2 \big(\cosh^2(t_q(\tau)) + \sinh^2(t_q(\tau))\big)}}.
\]

Now, by using that $\lim_{s \to 0} \beta_2'(s) = 0$, as well as (6) and the previous expression, we obtain

\[ N_R(q) = \lim_{\tau \to 0} N_R(\gamma_q(\tau)) = \left(0, -\tfrac{1}{\sqrt 2}, \tfrac{1}{\sqrt 2}\right). \]

This implies that $T_q \Sigma$ is a degenerate plane, which provides a contradiction. Therefore, we have proven that $\lim_{s \to 0} \beta_2'(s) \neq 0$. Now, we can follow a similar argument to the one used with $\alpha$, concluding that LG2 holds.

We prove LG3. We define the following sets:

\[
\begin{aligned}
\Delta &= \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \epsilon^2,\ (x_o, y, z) \in R^+ \cup Q^+ \cup B_0\}, \\
\Delta_\alpha &= \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \epsilon^2,\ (x_o, y, z) \in R^+\}, \text{ and} \\
\Delta_\beta &= \{(y, z) \in \mathbb{R}^2 : |z^2 - y^2| < \epsilon^2,\ (x_o, y, z) \in Q^+\},
\end{aligned}
\]

$\epsilon$ being the minimum of $\{\epsilon_\alpha, \epsilon_\beta\}$. It is clear that $F|_{\Delta_\alpha}$ and $F|_{\Delta_\beta}$ are smooth. In order to study the smoothness of $F$ at the points of $B_0$, we define $\Sigma_\epsilon = \Sigma_{\alpha|_{]0,\epsilon[}} \cup \Sigma_{\beta|_{]0,\epsilon[}} \cup B_0$ and the map $\Pi : \Sigma_\epsilon \to \Delta$, $\Pi(x, y, z) = (y, z)$. $\Pi$ is smooth, bijective and $\Pi^{-1}(y, z) = (F(y, z), y, z)$. If we prove that $d\Pi_q$ is bijective $\forall q \in B_0$, the Inverse Function Theorem gives us the smoothness of $F$ at the points of $B_0$. It is enough to prove that $\partial_x(q) \notin T_q \Sigma_\epsilon$ $\forall q \in B_0$. To do so, we distinguish three cases:
• If $q = p$, then $\{(x_o, y, -y) \in \mathbb{L}^3 : y \in \mathbb{R}\} \subseteq B_0$, and so, $T_q \Sigma_\epsilon = \mathrm{Span}\{\partial_y(q) + \partial_z(q),\ \partial_y(q) - \partial_z(q)\}$.
• If $q \in \{(x_o, y, y) \in \mathbb{L}^3 : y \neq 0\}$, it is clear that $\partial_y(q) + \partial_z(q) \in T_q \Sigma_\epsilon$. As $T_q \Sigma_\epsilon$ is a Lorentzian plane, and $\mathrm{Span}\{\partial_x(q), \partial_y(q) + \partial_z(q)\}$ is a degenerate plane, we get $\partial_x(q) \notin T_q \Sigma_\epsilon$.
• If $q \in \{(x_o, y, -y) \in \mathbb{L}^3 : y \neq 0\}$, we proceed similarly to the previous item.

Case 2. At least one of the curves $\alpha$ and $\beta$ crosses the axis. From Case 1, conditions LG1, LG2 and LG3 hold except for the smoothness of $f_\alpha$ and $f_\beta$ at $u = 0$, and the smoothness of $F$ at $(0, 0)$ (in the case that they make sense). We will only prove the smoothness of $f_\alpha$, since the proof for $f_\beta$ is analogous. If either $\alpha \cap R^+ = \emptyset$ or $\alpha \cap R^- = \emptyset$, the proof is trivial. Consequently, consider that $\alpha \cap R^+ \neq \emptyset$ and $\alpha \cap R^- \neq \emptyset$. Then, either $\beta \cap Q^+ \neq \emptyset$ or $\beta \cap Q^- \neq \emptyset$. Without loss of generality, we may assume that $\beta \cap Q^+ \neq \emptyset$. Given $a > 0$, we choose


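the following curves, which are smooth because of Case 1:

\[
\gamma_1(t) = \left( F\!\left(a - \tfrac{t}{4a},\ a + \tfrac{t}{4a}\right),\ a - \tfrac{t}{4a},\ a + \tfrac{t}{4a} \right)
=
\begin{cases}
\left( f_\alpha(\sqrt{t}),\ a - \tfrac{t}{4a},\ a + \tfrac{t}{4a} \right) & \text{if } t \geq 0, \\[1ex]
\left( f_\beta(\sqrt{-t}),\ a - \tfrac{t}{4a},\ a + \tfrac{t}{4a} \right) & \text{if } t \leq 0,
\end{cases}
\]
\[
\gamma_2(t) = \left( F\!\left(a - \tfrac{t}{4a},\ -a - \tfrac{t}{4a}\right),\ a - \tfrac{t}{4a},\ -a - \tfrac{t}{4a} \right)
=
\begin{cases}
\left( f_\alpha(-\sqrt{t}),\ a - \tfrac{t}{4a},\ -a - \tfrac{t}{4a} \right) & \text{if } t \geq 0, \\[1ex]
\left( f_\beta(\sqrt{-t}),\ a - \tfrac{t}{4a},\ -a - \tfrac{t}{4a} \right) & \text{if } t \leq 0.
\end{cases}
\]

Then, we have

\[
\lim_{\substack{t \to 0 \\ t > 0}} \frac{d^n}{dt^n} f_\alpha(\sqrt{t}) = \lim_{\substack{t \to 0 \\ t < 0}} \frac{d^n}{dt^n} f_\beta(\sqrt{-t}) = \lim_{\substack{t \to 0 \\ t > 0}} \frac{d^n}{dt^n} f_\alpha(-\sqrt{t}) \qquad \forall n \in \mathbb{N},
\]

and consequently

\[
\lim_{\substack{u \to 0 \\ u > 0}} \frac{d^n}{du^n} f_\alpha(u) = \lim_{\substack{u \to 0 \\ u < 0}} \frac{d^n}{du^n} f_\alpha(u) \qquad \forall n \in \mathbb{N}.
\]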

Thanks to Case 1, we can ensure that the gluing function is smooth everywhere except at $(0, 0)$. The smoothness of $F$ at $(0, 0)$ is obtained by applying the Inverse Function Theorem to $\Pi$ at $p$.

Let us prove the converse. We suppose that LG1, LG2 and LG3 hold. It is enough to show that $\Sigma$ is a Lorentzian smooth surface in a neighborhood of $B_0$. Indeed, as the function $F$ can be used to define a parametrization of $\Sigma$ in a neighborhood of $B_0$, we only need to exhibit the Lorentzian character of the surface along $B_0$.

Firstly, we need to prove that $f_\alpha'(0) = f_\beta'(0) = 0$. If $p \in \Sigma$, then the proof is trivial. Otherwise, we have to prove that if either of those two equalities does not hold, then $F$ is not smooth. We know that $B_0$ contains at least one orbit of the light-like orbital $O_p$. We suppose $\{(x_o, y, y) \in \mathbb{L}^3 : y > 0\} \subseteq B_0$ and we consider $\{(y, y) : y > 0\}$. It is easy to see that the gradient of $F$ is not continuous along $\{(y, y) : y > 0\}$ when either of the equalities $f_\alpha'(0) = f_\beta'(0) = 0$ fails. This is a contradiction because $F$ is a smooth function.

Secondly, we prove that the surface along $B_0$ is Lorentzian. If $p \in B_0$, then $T_p \Sigma = \mathrm{Span}\{\lim_{s \to 0} \alpha'(s) = (0, 0, 1),\ \lim_{s \to 0} \beta'(s) = (0, 1, 0)\}$, which is Lorentzian. Next, we focus on studying the metric along the light-like orbits contained in $B_0$. If we suppose $\{(x_o, y, y) \in \mathbb{L}^3 : y > 0\} \subseteq B_0$, then either $\alpha \cap R^+ \neq \emptyset$ or $\beta \cap Q^+ \neq \emptyset$. We assume the first one holds and we take $(x_o, a, a)$ with $a > 0$. Now, choose the curve $\omega(t) = (F(at, a), at, a)$ for $t \leq 1$. Certainly, $\omega(1) = (x_o, a, a)$ and so, using that $f_\alpha'(0) = 0$, we compute $\omega'(1) = (-a^2 f_\alpha''(0), a, 0)$. Then, the vector

\[ \frac{-2}{a\left(1 + a^2 (f_\alpha''(0))^2\right)}\, \omega'(1) + (0, 1, 1) \]

is light-like, it belongs to $T_{(x_o, a, a)} \Sigma$ and it is not proportional to $(0, 1, 1)$, so $T_{(x_o, a, a)} \Sigma$ is a Lorentzian plane. The proof for the other light-like orbits is analogous.
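The last claim can be checked by a direct computation with the metric $g = dx^2 + dy^2 - dz^2$. The following worked lines are a sketch of that check; the abbreviations $c$ and $v$ are introduced here only for convenience and are not part of the original notation.

```latex
% For t \le 1:  F(at, a) = f_\alpha\bigl(a\sqrt{1 - t^2}\bigr), and f_\alpha'(0) = 0 gives
\omega'(1) = \bigl(-a^2 f_\alpha''(0),\ a,\ 0\bigr),
\qquad
g\bigl(\omega'(1), \omega'(1)\bigr) = a^2\bigl(1 + a^2 f_\alpha''(0)^2\bigr),
\qquad
g\bigl(\omega'(1), (0,1,1)\bigr) = a .
% With c = \dfrac{-2}{a\,(1 + a^2 f_\alpha''(0)^2)} and v = c\,\omega'(1) + (0,1,1):
g(v, v) = c^2\, a^2\bigl(1 + a^2 f_\alpha''(0)^2\bigr) + 2\,c\,a + g\bigl((0,1,1),(0,1,1)\bigr)
        = c\,a\,\bigl[-2 + 2\bigr] + 0 = 0 .
```

Since $c \neq 0$ and $\omega'(1)$ has vanishing $z$-component, $v$ is not proportional to $(0,1,1)$, so the tangent plane contains two independent light-like directions and is therefore Lorentzian.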


Remark. It should be noticed that we have shown that $f_\alpha'(0) = f_\beta'(0) = 0$ when $F$ is smooth. Then, $\alpha$ and $\beta$ are perpendicular to $x$, and so, there cannot exist a singularity at $p$. Another consequence of this fact is that for each $A_2$-invariant immersion, the curves $\{\alpha_i : i \in I\} \cup \{\beta\}$ are smooth.

5.6. Classification of Lorentzian $A_2$-invariant surfaces. As a summary of all the results obtained in the previous subsections, we exhibit the classification of Lorentzian $A_2$-invariant surfaces. As a previous step, we need the following definition.

Definition 5.12. Let

\[ \{\beta : J \to Q\} \cup \{\alpha_i : J_i \to R : i \in I\} \]

be a countable family of smooth curves such that $\alpha_i$ is time-like and $J_i \subseteq \mathbb{R}$ is an interval $\forall i \in I$, and $J$ is either an interval or $S^1$. We will say that these curves are in general position if they satisfy the following three conditions:

P1. $\forall s \in J$ such that $\beta(s) \in x$, there exist $\varepsilon > 0$ and $i \in I$ satisfying
• $\beta(s) \in \mathrm{Closure}(\mathrm{trace}(\alpha_i))$, and
• $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ is a smooth surface, where $J_o = \,]s - \varepsilon, s + \varepsilon[\, \cap J$.

P2. $\forall i \in I$, there exist $\varepsilon > 0$ and $s_o \in \mathrm{Closure}(J)$ such that
• $\lim_{s \to s_o} \beta(s) \in \mathrm{Closure}(\mathrm{trace}(\alpha_i))$, and
• $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ is smooth, where $J_o = \,]s_o - \varepsilon, s_o + \varepsilon[\, \cap J$.

P3. Using P1 and P2, two injective maps can be defined, $P1 : \{s \in J : \beta(s) \in x\} \to I$ and $P2 : I \to \{s_o \in \mathrm{Closure}(J) : \lim_{s \to s_o} \beta(s) \in x\}$, verifying $P2(P1(s)) = s$ for all $s \in J$ such that $\beta(s) \in x$.

Remark 5.13. In the previous definition, when we say $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ is smooth, it means that the union of $\Sigma_{\beta|_{J_o}} \cup \Sigma_{\alpha_i}$ and an appropriate $A_2$-invariant subset of $O_{\beta(s)}$ make a smooth surface.

Theorem 5.14. Let $M$ be a connected surface and $\phi : M \to \mathbb{L}^3$ an immersion. Then, $(M, \phi^*(g))$ is Lorentzian and $A_2$-invariant if and only if either
1. $\phi(M)$ is a Lorentzian fundamental symmetric surface (described in Theorem 5.5), or the union of such a surface and one, two, three or four light-like orbits, or
2. $\phi(M)$ is the union of the rotational surfaces generated by a family of curves in general position, $\{\beta : J \to Q\} \cup \{\alpha_i : J_i \to R : i \in I\}$, and the corresponding family of light-like patches, $\{B_{s_o} : \lim_{s \to s_o} \beta(s) \in x\}$;


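that is

\[ \phi(M) = \Sigma_\beta \cup \Big( \bigcup_{i \in I} \Sigma_{\alpha_i} \Big) \cup \Big( \bigcup \big\{ B_{s_o} : \lim_{s \to s_o} \beta(s) \in x \big\} \Big). \]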

5.7. An algorithm to construct $A_2$-invariant Lorentzian surfaces not contained in any fundamental region. To finish the study of $A_2$-invariant Lorentzian surfaces, we give an algorithm to construct many examples of this kind of surface that are not contained in any fundamental region.

1. Given $\delta > 0$, choose a smooth function $\varphi : (-\delta^2, \delta^2) \to \mathbb{R}$.
2. We consider the functions $f_\alpha : J_\alpha \subseteq (-\delta, \delta) \to \mathbb{R}$ and $f_\beta : J_\beta \subseteq (-\delta, \delta) \to \mathbb{R}$, defined as $f_\alpha(s) = \varphi(s^2)$ and $f_\beta(s) = \varphi(-s^2)$, where $J_\alpha$ and $J_\beta$ are intervals such that 0 lies in the closure of both of them and $(f_\alpha'(s))^2 < 1$ for all $s \in J_\alpha$.
3. We define the following curves, $\alpha : J_\alpha \to R$ and $\beta : J_\beta \to Q$, given by $\alpha(s) = (f_\alpha(s), 0, s)$ and $\beta(s) = (f_\beta(s), s, 0)$.
4. Choose $B_0 \subset \{(\varphi(0), y, z) \in \mathbb{L}^3 : y^2 = z^2\}$ to be $A_2$-invariant and such that $\Sigma_\alpha \cup \Sigma_\beta \cup B_0$ is a topological surface.
5. The surfaces $\Sigma_\alpha$ and $\Sigma_\beta$, generated by $\alpha$ and $\beta$ respectively, glue smoothly and $\Sigma = \Sigma_\alpha \cup \Sigma_\beta \cup B_0$ is an $A_2$-invariant Lorentzian surface. The gluing function (see Theorem 5.11) is given by $F(y, z) = \varphi(z^2 - y^2)$.

In addition, if we start with two functions, $f_\alpha$ and $f_\beta$, such that they are analytic at 0, $f_\alpha(0) = f_\beta(0)$ and the corresponding graphs in $R$ and $Q$ generate Lorentzian surfaces that glue smoothly with Lorentzian metric along the union (which means that the gluing function $F$ is smooth), then we can show the existence of a smooth function, $\varphi$, that allows us to write those graphs as $\alpha(s) = (\varphi(s^2), 0, s)$ and $\beta(s) = (\varphi(-s^2), s, 0)$. This result constitutes a kind of converse when starting from analytic data.

The following pictures illustrate the last four subsections. Picture A shows a surface touching the axis twice, and crossing all fundamental regions in both cases. Picture B shows a surface touching the axis at three points, crossing four fundamental regions around the first point, three regions around the second point and just one region around the final point.
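Steps 1 to 3 and the gluing function of step 5 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name `build_surface` is hypothetical, the caller supplies both $\varphi$ and its derivative, and the choice of the light-like patch $B_0$ in step 4 is not modelled.

```python
def build_surface(phi, dphi):
    """Return the profile curves, the gluing function and a time-like test.

    phi  : smooth function on (-delta^2, delta^2), chosen by the caller (step 1)
    dphi : derivative of phi, also supplied by the caller
    """
    f_alpha = lambda s: phi(s * s)        # step 2
    f_beta = lambda s: phi(-s * s)        # step 2
    alpha = lambda s: (f_alpha(s), 0.0, s)   # step 3: curve in the plane y = 0
    beta = lambda s: (f_beta(s), s, 0.0)     # step 3: curve in the plane z = 0
    F = lambda y, z: phi(z * z - y * y)      # step 5: gluing function
    # alpha is time-like where (f_alpha'(s))^2 < 1, with f_alpha'(s) = 2 s phi'(s^2)
    is_timelike = lambda s: (2 * s * dphi(s * s)) ** 2 < 1
    return alpha, beta, F, is_timelike

# phi(u) = -u recovers the saddle of Example 1
alpha, beta, F, is_timelike = build_surface(lambda u: -u, lambda u: -1.0)
```

With $\varphi(u) = -u$ the time-like test holds exactly for $|s| < 1/2$, matching the domain used for the saddle; taking $\varphi(u) = \sqrt{1 + u}$ instead gives the profile curves of the one-sheet hyperboloid around $p$.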

Rotational Surfaces in L3 and Solutions of the Nonlinear Sigma Model


6. First Variation of the Willmore Functional in a Semi-Riemannian Manifold

In contrast with the pure rotational case, i.e., the one associated with the group A1, we will now use a direct variational approach to study the case of a space-like axis. This means we will avoid the principle of symmetric criticality. Thus, it is necessary to obtain the Euler-Lagrange equations associated with the problem $(I^\varepsilon_\Gamma(M, L^3); S)$ or, equivalently, $(I^\varepsilon_\Gamma(M, L^3); W)$. These equations were computed in [39], when the target space was a Riemannian three-space with constant curvature. However, now we have semi-Riemannian target spaces, namely Lorentzian three-spaces. On the other hand, the constancy of the curvature is not enough for our purposes. In fact, we will need to make some suitable conformal changes in the Lorentz-Minkowski metric, which, obviously, will not preserve the constancy of the curvature. Consequently, the variational setting will be as general as possible and, later, computations will be particularized to our purposes.

To start with, we introduce some preliminaries following the notation of [39]. Let M be a compact orientable smooth surface with boundary (maybe empty) and $(\bar M, \bar g)$ a 3-dimensional semi-Riemannian manifold. Let $\phi : M \to \bar M$ be a non-degenerate immersion. Only in this section, $I_{\phi(\partial M)}(M, \bar M)$ will denote the space of non-degenerate immersions that fix the boundary, $\phi(\partial M)$, without further conditions on the normal field along the common boundary.

A variation of φ in $I_{\phi(\partial M)}(M, \bar M)$ is nothing but a smooth map, $\Phi : M \times (-\delta, \delta) \to \bar M$, satisfying the following conditions:

1. For each $v \in (-\delta, \delta)$, the map $\phi_v : M \to \bar M$, defined by $\phi_v(m) = \Phi(m, v)$, belongs to $I_{\phi(\partial M)}(M, \bar M)$, and
2. $\phi_0 = \phi$.

It should be noticed that $\phi_v^*(\bar g)$ is non-degenerate for any $v \in (-\delta, \delta)$, and $\Phi(m, v) = \phi(m)$ for any $m \in \partial M$. Now, we can use all the paraphernalia of geometrical objects along a map. In particular, we can talk about vector fields along Φ, or in other words, cross sections of the induced vector bundle $\Phi^*(T\bar M)$ over $M \times (-\delta, \delta)$. Thus, we can define the following vector field along Φ:

$$V(m, v) = \Phi_*\Big(\frac{\partial}{\partial v}(m, v)\Big).$$


In particular, it holds $V(m, v) = 0$, $\forall m \in \partial M$. This, when restricted to v = 0, provides a vector field along φ which vanishes along ∂M, called the variational vector field

$$V(m) = V(m, 0) = \Phi_*\Big(\frac{\partial}{\partial v}(m, 0)\Big).$$

Therefore, the tangent space $T_\phi I_{\phi(\partial M)}(M, \bar M)$ is made up of those vector fields along φ that vanish along ∂M. We consider the Willmore variational problem which is associated with the functional $W : I_{\phi(\partial M)}(M, \bar M) \to \mathbb{R}$ defined by

$$W(\psi) = \int_M \big(H_\psi^2 + R_\psi\big)\, dA_\psi + \int_{\partial M} \kappa_\psi\, ds,$$

where $H_\psi$ denotes the mean curvature function of (M, ψ), $R_\psi$ is the sectional curvature of the target space, $(\bar M, \bar g)$, restricted to the tangent plane dψ(TM), and $\kappa_\psi$ is the geodesic curvature of ∂M in $(M, \psi^*(\bar g))$. Now, we wish to determine the necessary and sufficient conditions for φ to be a critical point of the above functional, in other words, a Willmore surface of the conformal space $(\bar M, [\bar g])$. Therefore, we need to compute the differential of W at φ, i.e., $\delta W(\phi) : T_\phi I_{\phi(\partial M)}(M, \bar M) \to \mathbb{R}$. Given $V \in T_\phi I_{\phi(\partial M)}(M, \bar M)$, consider a variation $\Phi : M \times (-\delta, \delta) \to \bar M$ with $V(m) = V(m, 0) = \Phi_*\big(\frac{\partial}{\partial v}(m, 0)\big)$, $\forall m \in M$. To simplify the notation, we put $M_v = (M, \phi_v^*(\bar g))$, $H_v = H_{\phi_v}$, $R_v = R_{\phi_v}$, $\kappa_v = \kappa_{\phi_v}$, $dA_v = dA_{\phi_v}$, and then

$$\delta W(\phi)[V] = \frac{\partial}{\partial v}\Big|_{v=0}\left[\int_{M_v}\big(H_v^2 + R_v\big)\, dA_v + \int_{\partial M}\kappa_v\, ds\right]. \tag{7}$$

We will compute this step by step. First, we control the action on the boundary. To do so, we recall that we are using variations that fix the boundary, φ(∂M). Let {ν, T} be a unitary positively oriented frame field on φ(∂M), where T is tangent to φ(∂M) and ν is the outward normal to φ(∂M). Since T does not depend on v, $\{\nu^v, T\}$ is the orientation of φ(∂M) in $\phi_v(M)$, for each v. Moreover, the principal curvature vector field, $\eta = \bar\nabla_T T$, of φ(∂M) in $(\bar M, \bar g)$ does not depend on v, and so

$$\frac{\partial}{\partial v}\Big|_{v=0}\int_{\partial M}\kappa_v\, ds = -\int_{\partial M}\bar g(\eta, \bar\nabla_V \nu^v)\, ds = -\int_{\partial M}\bar g\big(\eta^\perp, D_\nu V^\perp\big)\, ds, \tag{8}$$

where ⊥ indicates normal component and D the normal connection of (M, φ) in $(\bar M, \bar g)$. To obtain the second equality we have used an argument similar to that used in the Riemannian case, [39].

Remark. Regarding (8), it should be noticed that under the boundary conditions we are considering in this paper (i.e., space of surfaces immersed in $L^3$, with the same causal character, the same boundary and the same Gauss map along the common boundary), $D_\nu V^\perp = 0$ and so

$$\frac{\partial}{\partial v}\Big|_{v=0}\int_{\partial M}\kappa_v\, ds = 0,$$

which is not surprising, because the total curvature of the boundary is a constant under these boundary conditions.


To obtain the variation of the two-dimensional integral appearing in (7), we need some formulae which can be obtained by using standard variational arguments and that we collect in the following

Lemma 6.1. The following statements hold:

1. Let $\mathbf{H}(m, v)$ be the vector field along the variation Φ that measures the mean curvature vector field of $(M, \phi_v)$ at $m \in M$, then

$$\Big(D_{\frac{\partial}{\partial v}}\mathbf{H}\Big)_{v=0} = \frac{1}{2}\Big[\Delta V^\perp + \tilde A(V^\perp) + \varepsilon\,\mathrm{Ric}(N_\phi, N_\phi)V^\perp\Big] + D_{V^\top}\mathbf{H},$$

where Δ is the Laplacian relative to the normal connection, D, $\tilde A$ is the Simons' operator (see [34]), Ric is the Ricci tensor of $(\bar M, \bar g)$ and $\varepsilon = \bar g(N_\phi, N_\phi)$.

2. The variation of the area element is given by the following formula

$$\frac{d}{dv}\Big|_{v=0}(dA_v) = -2\,\bar g(\mathbf{H}, V)\, dA + d\theta,$$

where $dA = dA_0$ and θ is the one-form defined by $\theta(Z) = dA(V^\top, Z)$.

Remark. The proof of 1 can be found in [39]. The proof of 2 is almost the same as in the Riemannian case.

Next, we use the above lemma and the fact that $H_v^2 = \varepsilon\,\bar g(\mathbf{H}, \mathbf{H})$ to obtain

$$\frac{\partial}{\partial v}\Big|_{v=0}\Big[\big(H_v^2 + R_v\big)\, dA_v\Big] = \Big(\varepsilon\,\bar g(\Delta V^\perp, \mathbf{H}) + V^\top(H^2) + \frac{\partial R_v}{\partial v}\Big|_{v=0}\Big)\, dA + \bar g\Big(\varepsilon\tilde A(\mathbf{H}) + \mathrm{Ric}(N_\phi, N_\phi)\mathbf{H} - 2(H^2 + R)\mathbf{H},\, V^\perp\Big)\, dA + (H^2 + R)\, d\theta.$$

On the other hand, $(H^2 + R)\, d\theta = d\big((H^2 + R)\,\theta\big) - V^\top(H^2 + R)\, dA$. Since θ vanishes on ∂M, we have

$$\int_M (H^2 + R)\, d\theta = -\int_M V^\top(H^2 + R)\, dA.$$

It is easy to see that Proposition 1.2 in [39] remains true in a semi-Riemannian setting. Then, we make use of it to have

$$\frac{\partial}{\partial v}\Big|_{v=0}\int_{M_v}\big(H_v^2 + R_v\big)\, dA_v = \int_M\Big(\bar g(\mathcal{R}(\mathbf{H}), V^\perp) + V^\perp(R^V)\Big)\, dA + \varepsilon\int_{\partial M}\bar g(\mathbf{H}, D_\nu V^\perp)\, ds,$$

where $\mathcal{R} = \varepsilon(\Delta + \tilde A) + \big(\mathrm{Ric}(N_\phi, N_\phi) - 2(H^2 + R)\big) I$ is a kind of Schrödinger operator, I is the identity map and $R^V(\Phi(m, v)) = R_v(m)$. Finally, we combine this formula with (8) to get

$$\delta W(\phi)[V] = \int_M \bar g\big(\mathcal{R}(\mathbf{H}) + (\nabla R^V)^\perp,\, V^\perp\big)\, dA + \int_{\partial M}\bar g\big(\varepsilon\mathbf{H} - \eta^\perp,\, D_\nu V^\perp\big)\, ds,$$

where $(\nabla R^V)^\perp = \varepsilon N_\phi(R^V) N_\phi$ is the normal component of the gradient of $R^V$. These computations can be summarized in the following result, which gives the first variation of the Willmore functional in a semi-Riemannian manifold, $(\bar M, [\bar g])$.


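Theorem 6.1. In the previous setting, (M, φ) is a Willmore surface in $(\bar M, [\bar g])$ with boundary data φ(∂M), if and only if

$$\int_M \bar g\big(\mathcal{R}(\mathbf{H}) + \varepsilon N_\phi(R^V) N_\phi,\, V^\perp\big)\, dA + \int_{\partial M}\bar g\big(\varepsilon\mathbf{H} - \eta^\perp,\, D_\nu V^\perp\big)\, ds = 0,$$

for any $V \in T_\phi I_{\phi(\partial M)}(M, \bar M)$.

From now on, we assume the boundary conditions we are considering throughout this paper. Namely, let $\Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_n\}$ be a finite set of non-null regular curves in $\bar M$ with $\gamma_i \cap \gamma_j = \emptyset$ if $i \neq j$, and choose $N_o$ to be a unitary vector field along Γ which is orthogonal to Γ and has constant causal character, ε, on the whole of Γ. Then, we consider the space of immersions, $I^\varepsilon_\Gamma(M, \bar M)$, made up of those immersions, $\phi : M \to \bar M$, satisfying

$$\phi(\partial M) = \Gamma, \qquad N_\phi = N_o \ \text{along}\ \partial M \qquad\text{and}\qquad \bar g(N_\phi, N_\phi) = \varepsilon.$$

In this case (see the above remark), $D_\nu(V)^\perp = 0$ and therefore, the boundary term vanishes. Consequently, the Willmore surfaces with prescribed Gauss map along the common boundary in $(\bar M, [\bar g])$ are characterized by the equation

$$\int_M \bar g\big(\mathcal{R}(\mathbf{H}) + \varepsilon N_\phi(R^V) N_\phi,\, V^\perp\big)\, dA = 0, \qquad \forall\, V \in T_\phi I^\varepsilon_\Gamma(M, \bar M). \tag{9}$$

An easy computation shows $\Delta\mathbf{H} = (\Delta H) N_\phi$ and $\tilde A(\mathbf{H}) = \varepsilon H \|dN_\phi\|^2 N_\phi$, allowing us to reduce (9) to

$$\int_M \Big(\varepsilon\Delta H + H\big(\|dN_\phi\|^2 + \mathrm{Ric}(N_\phi, N_\phi) - 2(H^2 + R)\big) + \varepsilon N_\phi(R^V)\Big)\,\bar g(N_\phi, V^\perp)\, dA = 0, \tag{10}$$

for any $V \in T_\phi I^\varepsilon_\Gamma(M, \bar M)$.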

7. A2-Invariant Riemannian Solutions

A2-invariant Riemannian surfaces in $L^3$ were classified in Theorem 5.4. Each surface of this type is generated by a curve, α, immersed in either $\tilde R^+$ or $\tilde R^-$. This curve evolves according to the motions of A2 to produce the symmetric surface $\Sigma_\alpha = \{\xi_t(\alpha(s)) : s \in I,\ t \in \mathbb{R}\}$, lying in either $R^+$ or $R^-$, respectively.


At this point, our aim is to find which curves generate surfaces $\Sigma_\alpha$ that are Riemannian solutions of the O(2, 1) Nonlinear Sigma Model. Since $(I^{-1}_{\partial\Sigma_\alpha}(M, L^3); S)$ is equivalent to $(I^{-1}_{\partial\Sigma_\alpha}(M, L^3); W)$, the above problem is reduced to finding which curves α generate Riemannian Willmore surfaces with prescribed time-like Gauss map along the boundary, $\partial\Sigma_\alpha$. The solution is made explicit in the following result. For a better understanding, we recall Lemma 5.1, where we showed that the Lorentz-Minkowski metric, g, on both $R^+$ and $R^-$ is conformal to a semi-Riemannian product of a de Sitter plane and a space-like Lorentzian circle.

Theorem 7.1. Let α be a space-like curve immersed in either $\tilde R^+$ or $\tilde R^-$. Then, $\Sigma_\alpha$ is a Riemannian solution of the O(2, 1) Nonlinear Sigma Model, that is, a Riemannian Willmore surface with prescribed Gauss map along the common boundary in $(L^3, g)$, if and only if α is a free elastica in the de Sitter plane $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ or in the de Sitter plane $\big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$, respectively.

Proof. We consider the case where α is immersed in $\tilde R^+$, since the other case is analogous. The proof is obtained from a chain of three steps.

Step 1. We make use of the conformal invariance of the Willmore functional. In this setting, the suitable boundary conditions for the problem are $\Gamma = \partial\Sigma_\alpha$ and $N_o = N_{\Sigma_\alpha}$. Even more, as $\Sigma_\alpha$ is the image of an immersion from $[a_1, a_2] \times \mathbb{R}$, we deduce that $M = [a_1, a_2] \times \mathbb{R}$. Notice that $I^{-1}_{\partial\Sigma_\alpha}(M, R^+)$ is an open set of $I^{-1}_{\partial\Sigma_\alpha}(M, L^3)$. Using this fact, as well as the conformal invariance of the Willmore functional, it is easy to see that $\Sigma_\alpha$ is a Riemannian Willmore surface with prescribed Gauss map along the common boundary in $(L^3, g)$, if and only if $\Sigma_\alpha$ is a solution of $(I^{-1}_{\partial\Sigma_\alpha}(M, R^+); W)$, where W is the Willmore functional of $\big(R^+, \tfrac{1}{f_+^2}g\big)$.

From now on, we denote with a bar all the elements in $\big(R^+, \tfrac{1}{f_+^2}g\big)$. Also, to simplify the notation, we put $\bar N_\alpha = \bar N_{\Sigma_\alpha}$ and $\bar H_\alpha = \bar H_{\Sigma_\alpha}$. Denote by $\bar R_\alpha$ the sectional curvature of $\big(R^+, \tfrac{1}{f_+^2}g\big)$ along $\Sigma_\alpha$ and by $\bar K_\alpha$ the Gaussian curvature of $\Sigma_\alpha$ with the induced metric from $\big(R^+, \tfrac{1}{f_+^2}g\big)$.

Step 2. We compute the term $\bar N_\alpha(\bar R^V)$. The surface $\Sigma_\alpha$ is a critical point of the problem $(I^{-1}_{\partial\Sigma_\alpha}(M, R^+); W)$ if and only if (10) holds on every non-null polygon contained in $\Sigma_\alpha$, for ε = −1. However, it will be interesting to obtain a characterization of the critical points of $(I^{-1}_{\partial\Sigma_\alpha}(M, R^+); W)$ as the solution of an equation involving terms that depend only on $\Sigma_\alpha$. In this sense, one needs to manipulate the term $\bar N_\alpha(\bar R^V)$ because, a priori, it is the only one that depends on V. Pick a point $p \in \Sigma_\alpha$, and compute $\bar R^V$ along a curve, γ, with γ(0) = p and $\gamma'(0) = \bar N_\alpha(p)$. Let $\phi_\alpha : I \times \mathbb{R} \to \big(R^+, \tfrac{1}{f_+^2}g\big)$ be the parametrization of $\Sigma_\alpha$ defined as $\phi_\alpha(s, t) = \xi_t(\alpha(s))$. Write $\alpha(s) = (\alpha_1(s), 0, \alpha_3(s))$ so that $\tfrac{1}{f_+^2}g(\alpha', \alpha') = 1$, then a unitary normal vector to $\Sigma_\alpha$ can be computed to be

$$\bar N_\alpha(\phi_\alpha(s, t)) = \bar N_\alpha(s, t) = \xi_t\big((\alpha_3'(s), 0, \alpha_1'(s))\big) = \big(\alpha_3'(s),\ \alpha_1'(s)\sinh t,\ \alpha_1'(s)\cosh t\big).$$


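Given $V \in T_{\phi_\alpha} I^{-1}_{\partial\Sigma_\alpha}(M, R^+)$ with compact support (remember we are working with non-null polygons in $\Sigma_\alpha$), an associated variation of $\phi_\alpha$ is just

$$\Phi : (-\delta, \delta) \times I \times \mathbb{R} \to \Big(R^+, \tfrac{1}{f_+^2}g\Big), \qquad \Phi(v, s, t) = \phi_\alpha(s, t) + v\,V(\phi_\alpha(s, t)).$$

Without loss of generality, we can assume $V^\top = 0$. In this case, there exists a function $f : I \times \mathbb{R} \to \mathbb{R}$ with compact support, such that $V = f\bar N_\alpha$, and so $\Phi(v, s, t) = \phi_\alpha(s, t) + v f(s, t)\bar N_\alpha(s, t)$. Given a point $\Phi(v_o, s_o, t_o)$, then $\bar R^V(\Phi(v_o, s_o, t_o))$ is the sectional curvature of the plane generated by $\{(\partial_s\Phi)(v_o, s_o, t_o), (\partial_t\Phi)(v_o, s_o, t_o)\}$.

Recall that $\big(R^+, \tfrac{1}{f_+^2}g\big) = \big(\tilde R^+, \tfrac{1}{f_+^2}g\big) \times (H^+, dt^2)$ and denote by $\Pi_1 : \big(R^+, \tfrac{1}{f_+^2}g\big) \to \big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ and $\Pi_2 : \big(R^+, \tfrac{1}{f_+^2}g\big) \to (H^+, dt^2)$ the canonical projections. It is clear that $\bar R(\Phi_s, \Phi_t, \Phi_t, \Phi_s) = \tilde R(E_s, E_t, E_t, E_s)$, where $\bar R$ is the curvature tensor of $\big(R^+, \tfrac{1}{f_+^2}g\big)$ and $\tilde R$ stands for the curvature tensor of $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$, $E_s = d\Pi_1(\partial_s\Phi)$ and $E_t = d\Pi_1(\partial_t\Phi)$. Therefore,

$$\bar R^V(\Phi(v_o, s_o, t_o)) = \frac{\tilde R(E_s, E_t, E_t, E_s)}{\tfrac{1}{f_+^2}g(\partial_s\Phi, \partial_s\Phi)\,\tfrac{1}{f_+^2}g(\partial_t\Phi, \partial_t\Phi) - \big(\tfrac{1}{f_+^2}g(\partial_s\Phi, \partial_t\Phi)\big)^2}.$$

Observe that the normal vector at $\phi_\alpha(s_o, t_o)$ is $\bar N_\alpha(s_o, t_o) = \gamma'(0)$, where

$$\gamma(\tau) = \phi_\alpha(s_o, t_o) + \tau\bar N_\alpha(s_o, t_o) = \Phi\Big(\frac{\tau}{f(s_o, t_o)}, s_o, t_o\Big).$$

To compute the value of $\bar R^V$ along γ, we have

$$\tilde R(E_s, E_t, E_t, E_s) = \tau^2\Big(\frac{\partial_t f}{f}\Big)^2\Big[\tilde R(U_o, V_1, V_1, U_o) + 2\tau\,\tilde R(U_o, V_1, V_1, U_1) + \tau^2\,\tilde R(U_1, V_1, V_1, U_1)\Big],$$

being $U_o = (\alpha_1', 0, \alpha_3')$, $U_1 = \big(\tfrac{\partial_s f}{f}\alpha_3' + \alpha_3'',\ 0,\ \tfrac{\partial_s f}{f}\alpha_1' + \alpha_1''\big)$ and $V_1 = (\alpha_3', 0, \alpha_1')$. Also

$$\tfrac{1}{f_+^2}g(\partial_s\Phi, \partial_s\Phi)\,\tfrac{1}{f_+^2}g(\partial_t\Phi, \partial_t\Phi) - \Big(\tfrac{1}{f_+^2}g(\partial_s\Phi, \partial_t\Phi)\Big)^2 = \frac{\alpha_3^4 + b_1\tau + b_2\tau^2 + b_3\tau^3 + b_4\tau^4}{(\alpha_3 + \tau\alpha_1')^4},$$

where $b_1$, $b_2$, $b_3$ and $b_4$ are functions that do not depend on τ. Then,

$$\bar R^V\Big(\Phi\Big(\frac{\tau}{f(s_o, t_o)}, s_o, t_o\Big)\Big) = \tau^2\Big(\frac{\partial_t f}{f}\Big)^2\frac{(\alpha_3 + \tau\alpha_1')^4\Big[\tilde R(U_o, V_1, V_1, U_o) + 2\tau\,\tilde R(U_o, V_1, V_1, U_1) + \tau^2\,\tilde R(U_1, V_1, V_1, U_1)\Big]}{\alpha_3^4 + b_1\tau + b_2\tau^2 + b_3\tau^3 + b_4\tau^4}.$$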


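Since $\alpha_3 > 0$, we obtain

$$\bar N_\alpha(\bar R^V)(\phi(s_o, t_o)) = \frac{d}{d\tau}\Big|_{\tau=0}\bar R^V\Big(\Phi\Big(\frac{\tau}{f(s_o, t_o)}, s_o, t_o\Big)\Big) = 0.$$

Step 3: The solutions come from clamped elasticae in the de Sitter plane $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$. The computation we did in the last step allows one to characterize the solutions, $\Sigma_\alpha$, of $(I^{-1}_{\partial\Sigma_\alpha}([a_1, a_2] \times \mathbb{R}, R^+); W)$ as the solutions of the following Euler-Lagrange equation:

$$-\Delta\bar H_\alpha + \bar H_\alpha\Big(\|d\bar N_\alpha\|^2 + \mathrm{Ric}(\bar N_\alpha, \bar N_\alpha) - 2(\bar H_\alpha^2 + \bar R_\alpha)\Big) = 0 \quad\text{in } \Sigma_\alpha. \tag{11}$$

Using $\|d\bar N_\alpha\|^2 = 4\bar H_\alpha^2 - 2\det(-d\bar N_\alpha)$ and $\bar K_\alpha = \bar R_\alpha - \det(-d\bar N_\alpha)$, Eq. (11) turns out to be equivalent to

$$-\Delta\bar H_\alpha + \bar H_\alpha\Big(2\bar H_\alpha^2 + 2\bar K_\alpha - 4\bar R_\alpha + \mathrm{Ric}(\bar N_\alpha, \bar N_\alpha)\Big) = 0.$$

However, $\bar R_\alpha = 0$ because it is a mixed sectional curvature in the semi-Riemannian product, [27], namely $\big(R^+, \tfrac{1}{f_+^2}g\big) = \big(\tilde R^+, \tfrac{1}{f_+^2}g\big) \times (H^+, dt^2)$. Also, $\Sigma_\alpha = \big(\mathrm{trace}(\alpha), \tfrac{1}{f_+^2}g\big) \times (H^+, dt^2)$ is a semi-Riemannian product and so $\bar K_\alpha = 0$. Recall $d\Pi_1(\partial_s\phi) = \alpha'$, $d\Pi_1(\partial_t\phi) = d\Pi_2(\bar N_\alpha) = 0$, and $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ has sectional curvature 1, then we get $\mathrm{Ric}(\bar N_\alpha, \bar N_\alpha) = -1$. Bearing in mind $d\Pi_2(\partial_s\phi) = 0$, we compute the mean curvature:

$$\bar{\mathbf{H}}_\alpha = \frac{1}{2}\big(\bar\nabla_{\partial_s\phi}\partial_s\phi\big)^\perp + \frac{1}{2}\big(\bar\nabla_{\partial_t\phi}\partial_t\phi\big)^\perp = \frac{1}{2}\big((d\Pi_1)^{-1}(\tilde\nabla_{\alpha'}\alpha')\big)^\perp = -\frac{1}{2}\tilde k_\alpha\bar N_\alpha,$$

where $\bar\nabla$ is the Levi-Civita connection of $\big(R^+, \tfrac{1}{f_+^2}g\big)$, $\tilde\nabla$ is the Levi-Civita connection of $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ and $\tilde k_\alpha$ is the curvature of α in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$. As a consequence, we obtain $\bar H_\alpha = -\tfrac{1}{2}\tilde k_\alpha$ and $\Delta\bar H_\alpha = -\tfrac{1}{2}\tilde k_\alpha''$. By using all these computations, we obtain that $\Sigma_\alpha$ is a solution of $(I^{-1}_{\partial\Sigma_\alpha}([a_1, a_2] \times \mathbb{R}, R^+); W)$ if and only if

$$2\tilde k_\alpha'' - \tilde k_\alpha^3 + 2\tilde k_\alpha = 0. \tag{12}$$

Finally, we check that this equation characterizes the elasticae in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$. In Sect. 4, we computed the equation of both space-like and time-like elastic curves in $\big(AdS_2, \tfrac{1}{h^2}g\big)$, see (5). Recall that in that section, the axis x was time-like, B = {x, y, z} was an orthonormal basis, $AdS_2 = \{v \in L^3 : \langle v, y\rangle > 0,\ \langle v, z\rangle = 0\}$ and $h(p) = \langle p, y\rangle$, p being identified with the position vector of the point p. Now, notice that $\big(\tilde R^+, \hat g = -\tfrac{1}{f_+^2}g\big)$ is an anti de Sitter space, and, with this new metric, x is also a time-like axis. So, a time-like curve in $(\tilde R^+, \hat g)$ is an elastica if and only if $2\hat k'' - \hat k^3 + 2\hat k = 0$, where $\hat k$ is the geodesic curvature of the curve. Both $\hat g$ and $\tfrac{1}{f_+^2}g$ have the same Levi-Civita connection. Thus, by comparing the Frenet equations, it is easy to see that $\hat k = -\tilde k$, where $\tilde k$ is the geodesic curvature in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$. And so, a space-like curve is a critical point of $\int_\alpha \tilde k^2$ in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$, if and only if $2\tilde k'' - \tilde k^3 + 2\tilde k = 0$. This ends the proof of the theorem.

Remark. The solutions of (12) are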

$$\bar\kappa_\alpha(s) = C\,\mathrm{cn}\Big(\sqrt{1 - \tfrac{C^2}{2}}\,(s - a_o),\ \tilde C\Big),$$

where $C \in \mathbb{R} \setminus \{-\sqrt{2}, \sqrt{2}\}$ and $a_o \in \mathbb{R}$ are arbitrary constants and $\tilde C^2 = \frac{C^2}{2C^2 - 4}$. If $C^2 < 2$, then $s \in \mathbb{R}$; otherwise, $s \in \mathbb{R} \setminus \Big\{a_o + \frac{(2n + 1)E}{\sqrt{C^2/2 - 1}}\Big\}_{n \in \mathbb{Z}}$, E being the complete elliptic integral of the first kind with modulus $\sqrt{1 - \tilde C^2}$.

8. A2-Invariant Lorentzian Solutions

Throughout this section, we describe the class of A2-invariant Lorentzian solutions. Firstly, we consider the case when the surfaces are contained in one fundamental region, i.e., we deal with the fundamental solutions. Certainly, this is the easy but basic case. As one can guess, two classification theorems are obtained according to the nature of the profile curve. On one hand, we obtain Lorentzian solutions, $\Sigma_\alpha$, contained in either $R^+$ or $R^-$, which are generated by time-like free elasticae in the de Sitter plane. On the other hand, we also obtain a second class of Lorentzian solutions, that lie in either $Q^+$ or $Q^-$, coming from free elasticae in the hyperbolic plane. To be precise, we consider the following problem: Let γ be a time-like curve in either $\tilde R^+$ or $\tilde R^-$, or a curve in either $\tilde Q^+$ or $\tilde Q^-$. What does it have to satisfy in order to make $\Sigma_\gamma$ a solution? Or equivalently, what does γ have to satisfy in order to make $\Sigma_\gamma$ a Willmore surface with prescribed space-like Gauss map along the boundary? The answer to this problem is given in the next pair of statements. We omit their proofs because they are quite similar to that of Theorem 7.1, with only technical changes.

Theorem 8.1. Let α be a time-like curve immersed in either $\tilde R^+$ or $\tilde R^-$. Then, $\Sigma_\alpha$ is a Lorentzian solution of the O(2, 1) Nonlinear Sigma Model, that is, a Lorentzian Willmore surface with prescribed Gauss map along the common boundary in $(L^3, g)$, if and only if α is a free elastica in the de Sitter plane $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ or in the de Sitter plane $\big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$, respectively.

Theorem 8.2. Let β be a curve immersed in either $\tilde Q^+$ or $\tilde Q^-$. Then, $\Sigma_\beta$ is a Lorentzian solution of the O(2, 1) Nonlinear Sigma Model, that is, a Lorentzian Willmore surface with prescribed Gauss map along the common boundary in $(L^3, g)$, if and only if β is a free elastica in the hyperbolic plane $\big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$ or in the hyperbolic plane $\big(\tilde Q^-, \tfrac{1}{h_-^2}g\big)$, respectively.
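Elastica equations of the type (12) can be explored numerically without any elliptic-function machinery. Multiplying $2k'' - k^3 + 2k = 0$ by $k'$ and integrating shows that $(k')^2 - k^4/4 + k^2$ is constant along solutions. The sketch below, which is only an illustration and not part of the paper's argument, integrates (12) with a hand-written fourth-order Runge-Kutta scheme (step size and initial data are arbitrary choices) and monitors this first integral.

```python
def rhs(k, dk):
    # First-order form of 2 k'' - k^3 + 2 k = 0: (k, k')' = (k', k^3/2 - k).
    return dk, 0.5 * k ** 3 - k

def first_integral(k, dk):
    # Conserved quantity obtained by multiplying the equation by k'.
    return dk * dk - 0.25 * k ** 4 + k * k

def rk4_step(k, dk, h):
    # One classical Runge-Kutta step of size h.
    a1, b1 = rhs(k, dk)
    a2, b2 = rhs(k + 0.5 * h * a1, dk + 0.5 * h * b1)
    a3, b3 = rhs(k + 0.5 * h * a2, dk + 0.5 * h * b2)
    a4, b4 = rhs(k + h * a3, dk + h * b3)
    return (k + h * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0,
            dk + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0)

def integrate(k, dk, h, steps):
    # Returns the state after `steps` steps and the largest |k| seen on the way.
    max_abs = abs(k)
    for _ in range(steps):
        k, dk = rk4_step(k, dk, h)
        max_abs = max(max_abs, abs(k))
    return k, dk, max_abs

if __name__ == "__main__":
    C = 0.5  # initial curvature with k'(0) = 0, so that C^2 < 2
    k, dk, max_abs = integrate(C, 0.0, 1e-3, 5000)
    print(first_integral(k, dk) - first_integral(C, 0.0), max_abs)
```

For an initial value with C² < 2 and k'(0) = 0, the numerical solution stays bounded by |C|, in line with the statement that the solutions are defined for all s in that case.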


Nevertheless, it seems convenient to give the curvature functions of the above solution generatrices. Time-like free elastic curves in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ and $\big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$ are those time-like curves with curvature

$$\tilde\kappa(s) = C\,\mathrm{cn}\Big(i\sqrt{1 + \tfrac{C^2}{2}}\,(s - a_o),\ \tilde C\Big), \quad\text{for } s \in \mathbb{R} \setminus \Big\{a_o + \frac{(2n + 1)E}{\sqrt{1 + C^2/2}}\Big\}_{n \in \mathbb{Z}}, \tag{13}$$

where $C \in \mathbb{R}$, $a_o \in \mathbb{R}$ are arbitrary constants, $\tilde C^2 = \frac{C^2}{2C^2 + 4}$ and E is the complete elliptic integral of the first kind with modulus $\sqrt{1 - \tilde C^2}$.

Free elastic curves in $\big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$ and $\big(\tilde Q^-, \tfrac{1}{h_-^2}g\big)$ are those curves with curvature

$$\tilde\kappa_\gamma(s) = C\,\mathrm{cn}\Big(\sqrt{\tfrac{C^2}{2} - 1}\,(s - a_o),\ \tilde C\Big), \quad\text{for } s \in \begin{cases}\mathbb{R} \setminus \Big\{a_o + \dfrac{(2n + 1)E}{\sqrt{1 - C^2/2}}\Big\}_{n \in \mathbb{Z}} & \text{if } C^2 < 2,\\[2mm] \mathbb{R} & \text{if } C^2 > 2,\end{cases} \tag{14}$$

where $C \in \mathbb{R} \setminus \{-\sqrt{2}, \sqrt{2}\}$ and $a_o \in \mathbb{R}$ are arbitrary constants, $\tilde C^2 = \frac{C^2}{2C^2 - 4}$ and E is the complete elliptic integral of the first kind with modulus $\sqrt{1 - \tilde C^2}$.

Moreover, we already know the existence of a wide family of A2-invariant Lorentzian surfaces which are not contained in a unique fundamental region. In other words, they are obtained by pasting fundamental rotational surfaces. Then, we investigate the existence of A2-invariant Lorentzian solutions which can be obtained by the gluing mechanism. The following theorem provides the connection between the variational approach and the gluing mechanism.

Theorem 8.3. Given a surface M, let $\phi \in I^1(M, L^3)$ be an A2-invariant immersion whose image intersects at least two different fundamental regions. Then, (M, φ) is a Lorentzian solution of the O(2, 1) Nonlinear Sigma Model, that is, a Lorentzian Willmore surface with prescribed Gauss map along the boundary in $(L^3, g)$, if and only if φ(M) is an A2-invariant connected piece (with boundary) of one of the following surfaces:
• A Lorentzian plane orthogonal to the axis.
• A one-sheet hyperboloid with arbitrary radius, centered at any point $p \in \langle x\rangle$ and with axis $p + \langle z\rangle$ (see Subsect. 5.3).

Proof. Given a surface M and an A2-invariant Lorentzian immersion $\phi \in I^1(M, L^3)$, we know φ(M) is generated by a curve β in Q and a countable family of time-like curves $\alpha_i$ in R, that are in general position. It should be noticed (see Theorems 8.1 and 8.2) that φ is a critical point of $(I^1(M, L^3); S)$ if and only if the following five conditions hold:


1. $\mathrm{trace}(\beta) \cap \tilde Q^+$ consists of free elastic curves in $\big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$,
2. $\mathrm{trace}(\beta) \cap \tilde Q^-$ consists of free elastic curves in $\big(\tilde Q^-, \tfrac{1}{h_-^2}g\big)$,
3. $\mathrm{trace}(\alpha_i) \cap \tilde R^+$ consists of time-like free elastic curves in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ ∀i,
4. $\mathrm{trace}(\alpha_i) \cap \tilde R^-$ consists of time-like free elastic curves in $\big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$ ∀i, and
5. $\phi|_K$ is a critical point of $(I^1_{\phi(\partial K)}(K, L^3); W_K)$ for any non-null polygon $K \subseteq M$, such that both $K \cap (R^+ \cup R^-)$ and $K \cap (Q^+ \cup Q^-)$ are not empty.

The last condition holds if and only if, for any non-null polygon K, (10) is satisfied for all $V \in T_\phi I^1(K, L^3)$. Notice that $N_\phi(R^V) = 0$ in this case, because $L^3$ is flat. Therefore, if φ satisfies the first four conditions, the last one is satisfied too.

Now, we wish to control the curves $\alpha_i$ for each i. It is known, for each $\alpha_i$, the existence of a function, $f_i : \,]-\varepsilon, \varepsilon[\, \to \mathbb{R}$, such that $\alpha_i(t) = (f_i(t), 0, t)$ is a parametrization of $\alpha_i$ in a neighborhood of $\mathrm{trace}(\alpha_i) \cap \langle x\rangle$. Even more, we know that $f_i'(0) = 0$ (see proof of Theorem 5.11). The curvature function of $\alpha_i : \,]0, \varepsilon[\, \to \big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ can be computed to be

$$\kappa_i(t) := \frac{-f_i'(t) + t f_i''(t) + (f_i'(t))^3}{\big(1 - (f_i'(t))^2\big)^{3/2}}.$$

Since $f_i'(0) = 0$, we obtain that $\lim_{t \to 0}\kappa_i(t) = 0$. On the other hand, the curvature of a time-like free elastic curve in $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ is given by (13). If we use the properties of the Jacobi elliptic cosine, it is easy to check that the absolute value of (13) is equal to or greater than |C|. So, condition 3 holds if and only if $\alpha_i$ is a geodesic of $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$. A similar reasoning works for $\alpha_i : \,]-\varepsilon, 0[\, \to \big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$. As a conclusion, we obtain that conditions 3 and 4 hold if and only if $\alpha_i$ is a geodesic in both $\big(\tilde R^+, \tfrac{1}{f_+^2}g\big)$ and $\big(\tilde R^-, \tfrac{1}{f_-^2}g\big)$ for all i.

Next, we work with β. For each piece of $\mathrm{trace}(\beta) \cap \tilde Q^+$, there exists $s_o$ in the closure of the domain of β, such that $p = \lim_{s \to s_o}\beta(s) \in \langle x\rangle$ and $\exists\, i \in I$ satisfying $\alpha_i = \alpha_{s_o}$ (see Definition 5.12). Then, there exists a function $f_\beta$, defined on $]-\varepsilon, \varepsilon[$, such that $\beta(t) = (f_\beta(t), t, 0)$ provides a parametrization of β. Again, we compute the curvature function of $\beta : \,]0, \varepsilon[\, \to \big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$, obtaining

$$\kappa(t) := \frac{f_\beta'(t) - t f_\beta''(t) + (f_\beta'(t))^3}{\big(1 + (f_\beta'(t))^2\big)^{3/2}}.$$
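The curvature formula for a graph β(t) = (f_β(t), t, 0) in the half-plane model can be sanity-checked on the two kinds of hyperbolic geodesics recalled later in the proof. The sketch below is only illustrative: the function names are hypothetical, and the derivatives f' and f'' are supplied in closed form rather than computed numerically. It evaluates the formula on a vertical ray (f constant), on the piece of a circle centered at the boundary written as a graph, and on one non-geodesic example.

```python
import math

def hyperbolic_graph_curvature(f1, f2, t):
    # kappa(t) = (f' - t f'' + f'^3) / (1 + f'^2)^(3/2) for beta(t) = (f(t), t, 0),
    # where f1 = f'(t) and f2 = f''(t).
    return (f1 - t * f2 + f1 ** 3) / (1.0 + f1 * f1) ** 1.5

def circle_graph_derivatives(rho, t):
    # f(t) = A + sqrt(rho^2 - t^2), 0 < t < rho: a piece of the circle of radius rho
    # centered at the boundary point (A, 0, 0). The constant A does not affect f', f''.
    s = math.sqrt(rho * rho - t * t)
    return -t / s, -rho * rho / s ** 3

def parabola_graph_derivatives(t):
    # f(t) = t^2, a curve that is not a geodesic.
    return 2.0 * t, 2.0

if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9):
        f1, f2 = circle_graph_derivatives(1.0, t)
        print(t, hyperbolic_graph_curvature(f1, f2, t))
    print(hyperbolic_graph_curvature(0.0, 0.0, 0.7))  # vertical ray, f constant
```

Both geodesic families give vanishing curvature up to rounding, while the parabola gives κ(t) = 8t³/(1 + 4t²)^{3/2}, which tends to 0 as t → 0 in agreement with f'(0) = 0.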

Bearing in mind that $f_\beta'(0) = 0$ (see the proof of Theorem 5.11), we also obtain $\lim_{t \to 0}\kappa(t) = 0$. We compare this expression with the curvature function of an elastica in $\big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$, see (14). If $C^2 < 2$, then the absolute value of the curvature is equal to or greater than |C|; otherwise, if s denotes the arc-length parameter, κ(s) is a periodic function that takes the value 0 at $s = a_o + (2n + 1)E/\sqrt{C^2/2 - 1}$, and the value C at $s = a_o + 4nE/\sqrt{C^2/2 - 1}$, for all $n \in \mathbb{Z}$, where E is the complete elliptic integral of the first kind with modulus $\tilde C$. Now, we combine the behavior of (14) with both $\lim_{t \to 0}\kappa(t) = 0$ and $\lim_{t \to 0}\beta(t) \in \langle x\rangle$, to conclude that condition 1 holds if and only if β is a geodesic in $\big(\tilde Q^+, \tfrac{1}{h_+^2}g\big)$. A similar argument can be used to see that condition 2 holds if and only if β is a geodesic in $\big(\tilde Q^-, \tfrac{1}{h_-^2}g\big)$.

At this point, we take advantage of what is known about the geodesics of both the hyperbolic plane and the de Sitter plane. Notice that we are regarding both surfaces as half-plane Poincaré models. We recall that the geodesics of the hyperbolic plane are those curves whose trace is either a ray perpendicular to the boundary or half a circle centered at the boundary. In the de Sitter plane, the time-like geodesics are those curves whose trace is either a ray perpendicular to the boundary or half of any of the connected components of a time-like hyperbola centered at any point of the boundary. Consequently, condition 1 holds if and only if β can be reparametrized, in Q̃+, as either:
• β(t) = (A, t, 0) for t ∈ ]0, b[ ⊆ ]0, +∞[, where A ∈ R, or
• β(t) = (A + ρ cos(t), ρ sin(t), 0) for t ∈ ]0, a[ ⊆ ]0, π[, where A ∈ R and ρ > 0.

Certainly, the same argument can be used for condition 2, just by replacing Q̃+ by Q̃−; ]0, ∞[ by ]−∞, 0[ and ]0, π[ by ]−π, 0[. Since β is C∞, we conclude that β intersects ⟨x⟩ at least once. Even more, it is either
• a segment perpendicular to ⟨x⟩, or
• a connected piece of a circle in (Q, g) centered at any point of ⟨x⟩.

As an important consequence, the cardinal of {αi : i ∈ I} is either 1 or 2. Based on the geodesics of the de Sitter plane, a similar reasoning can be applied to each αi, obtaining that conditions 3 and 4 hold if and only if, for each i, αi not only intersects ⟨x⟩ but also is either:
• a connected segment in (R, g) perpendicular to ⟨x⟩, or
• a connected piece of a connected component of a time-like Lorentzian circle in (R, g), centered at any point of ⟨x⟩.

The only remaining detail consists of finding which combinations of parametrizations of β and {αi : i ∈ I} give rise to C∞ gluing functions (recall Theorem 5.11). The proofs of the following assertions are left to the reader:
1. If β(t) = (A, t, 0) for an arbitrary constant A ∈ R, then the only α that gives a C∞ gluing function is α(t) = (A, 0, t). In this case, the surface generated by α and β is an A2-invariant connected piece of the plane {(A, y, z) / y, z ∈ R}.
2. If β is a piece of the circle in (Q, g) with radius ρ > 0 and center (A, 0, 0) that contains the point (A + ρ, 0, 0), then, to obtain a C∞ gluing function at that point, α must be a piece of the future-pointing connected component of the time-like Lorentzian circle with radius ρ and center (A, 0, 0).
3. If β is a piece of the circle in (Q, g) with radius ρ > 0 and center (A, 0, 0) that contains the point (A − ρ, 0, 0), then, to obtain a C∞ gluing function at that point, α must be a piece of the past-pointing connected component of the time-like Lorentzian circle with radius ρ and center (A, 0, 0).

Finally, in the last two cases, i.e., when β is a piece of a circle in (Q, g), the surface generated by {β} ∪ {αi : i ∈ I} is just an A2-invariant connected piece of a one-sheet hyperboloid with center p ∈ ⟨x⟩ and axis p + ⟨z⟩.


9. A3-Invariant Surfaces in L3

In this section, we obtain, up to similarities in the Lorentz-Minkowski space, the whole class of solutions of the O(2, 1) Nonlinear Sigma Model which are symmetric under the group A3. As above, the first step consists of finding the symmetric points, i.e., the immersions that are A3-invariant. In this sense, we will consider the corresponding fundamental regions where one can get fundamental symmetric surfaces, well-known in the literature as rotational surfaces with light-like axis, see for example [19]. Then, we will pay attention to the, a priori, reasonable problem of gluing two fundamental symmetric surfaces which are contained in different fundamental regions. Nevertheless, we will see that a gluing mechanism does not work in this case, so we cannot paste two rotational surfaces with light-like axis, lying in different fundamental regions, to provide an A3-invariant surface.

Given x ∈ L3 a light-like vector, we choose a basis B = {x, y, z} such that: (1) y is a light-like vector with ⟨x, y⟩ = −1, and (2) z is a unitary space-like vector orthogonal to the plane Span{x, y}. From now on, we will use coordinates with respect to B, so g ≡ −2dx dy + dz². In L3 \ ⟨x⟩ we will distinguish the following fundamental regions:

S+ = {(x, y, z) ∈ L3 : y > 0}, and S− = {(x, y, z) ∈ L3 : y < 0}.

Put S = {(x, y, z) ∈ L3 : z = 0} and T = {(x, y, z) ∈ L3 : y = 0}, and consider the following open half planes:

S̃+ = S+ ∩ S = {(x, y, 0) ∈ L3 : y > 0}, and S̃− = S− ∩ S = {(x, y, 0) ∈ L3 : y < 0}.

We also have the following parabolas, that are orbits under the action of A3:

P+ = {(x, 1, z) ∈ L3 : −2x + z² = 0} = {(t²/2, 1, t) ∈ L3 : t ∈ R}, and
P− = {(x, −1, z) ∈ L3 : 2x + z² = 0} = {(−t²/2, −1, t) ∈ L3 : t ∈ R}.

It should be noticed that P+ and P− are space-like in L3 with metric dt². Next, we define positive functions

l+ : S̃+ −→ R, l+(x, y, 0) = y, and l− : S̃− −→ R, l−(x, y, 0) = −y.
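The claims about the parabola P+ are easy to verify numerically in the coordinates just introduced. In the sketch below, `g` implements g ≡ −2dx dy + dz² as a bilinear form, which is taken from the text. The explicit formula for the action in `varsigma` is an assumption: it is one standard way to write rotations about a light-like axis in such a basis, chosen so that the orbit of (0, 1, 0) is exactly the stated parametrization of P+; it is not quoted from this section.

```python
def g(u, v):
    # Metric g = -2 dx dy + dz^2 written as a symmetric bilinear form.
    return -(u[0] * v[1] + u[1] * v[0]) + u[2] * v[2]

def varsigma(t, p):
    # Assumed explicit form of the A3 action (rotation about the light-like axis x).
    x, y, z = p
    return (x + t * z + 0.5 * t * t * y, y, z + t * y)

def parabola_plus(t):
    # Parametrization of P+ given in the text.
    return (0.5 * t * t, 1.0, t)

def parabola_plus_velocity(t):
    # Derivative of parabola_plus with respect to t.
    return (t, 0.0, 1.0)

if __name__ == "__main__":
    t = 0.7
    vel = parabola_plus_velocity(t)
    print(g(vel, vel))                    # induced metric coefficient of dt^2
    print(varsigma(t, (0.0, 1.0, 0.0)))   # orbit point through (0, 1, 0)
    print(parabola_plus(t))
```

Since `varsigma` is linear and preserves `g`, the same check applied to the orbit through (x, y, 0) gives an induced metric y² dt², which is consistent with the warping function l+(x, y, 0) = y.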


In this setting, it is not difficult to check the following warped product decompositions:

$$(S^+, g) = (\tilde S^+, g) \times_{l_+} (P^+, dt^2), \quad\text{and}\quad (S^-, g) = (\tilde S^-, g) \times_{l_-} (P^-, dt^2).$$

Furthermore, when we make the obvious conformal changes, it is easy to see that the surfaces $\big(\tilde S^+, \tfrac{1}{l_+^2}g\big)$ and $\big(\tilde S^-, \tfrac{1}{l_-^2}g\big)$ are anti de Sitter planes with curvature −1. Consequently, we obtain the following result.

Lemma 9.1. $\big(S^+, \tfrac{1}{l_+^2}g\big)$ and $\big(S^-, \tfrac{1}{l_-^2}g\big)$ are semi-Riemannian products of an anti de Sitter plane and a space-like parabola.

The following result classifies the A3-invariant surfaces. In particular, it proves that the surfaces of this class lie in one fundamental region.

Theorem 9.1. Let M be a connected surface and $\phi : M \to L^3$ a non-degenerate immersion. Then, $(M, \phi^*(g))$ is A3-invariant if and only if one of the following statements holds:
1. If $(M, \phi^*(g))$ is Riemannian, there exists a space-like curve, α, immersed in either $\tilde S^+$ or $\tilde S^-$, such that $\phi(M) = \{\varsigma_t(\mathrm{trace}(\alpha)) / t \in \mathbb{R}\}$.
2. If $(M, \phi^*(g))$ is Lorentzian, there exists a time-like curve, α, immersed in either $\tilde S^+$ or $\tilde S^-$, such that $\phi(M) = \{\varsigma_t(\mathrm{trace}(\alpha)) / t \in \mathbb{R}\}$.

Proof. It is clear that, given any non-null curve, α, immersed in either $\tilde S^+$ or $\tilde S^-$, the surface parametrized by $X(s, t) = \varsigma_t(\alpha(s))$ provides an A3-invariant surface, [19]. If α is space-like, then the surface is Riemannian, and, if α is time-like, the surface is Lorentzian. It is also easy to see the converse: given an A3-invariant surface immersed in either $S^+$ or $S^-$, there exists a non-null curve, α, immersed in either $\tilde S^+$ or $\tilde S^-$, such that the surface is parametrized as $X(s, t) = \varsigma_t(\alpha(s))$ and so it is A3-invariant. Thus, we only need to check that there do not exist A3-invariant surfaces intersecting the plane T. First, notice that the orbits contained in T are always light-like, so Riemannian A3-invariant surfaces intersecting the plane T cannot exist. On the other hand, let us assume there exists a Lorentzian A3-invariant surface immersed in $L^3$, φ(M), intersecting the plane T. Then, $\phi(M) \cap S^+$ and $\phi(M) \cap S^-$ are the union of a countable family of A3-invariant surfaces generated by time-like curves in $\tilde S^+$ and $\tilde S^-$, respectively. Note that the boundary of each of these curves has only one point in $\langle x\rangle$, because of its causality. Therefore, we must check that none of the following cases hold:
1. There exist two time-like curves $\alpha^+ : \,]0, \delta[\, \to \tilde S^+$ and $\alpha^- : \,]-\delta, 0[\, \to \tilde S^-$ (δ > 0) such that (a) $\lim_{s \to 0}\alpha^+(s) = \lim_{s \to 0}\alpha^-(s)$ is a point of $\langle x\rangle$. (b) The surfaces generated by both curves can be glued, obtaining a smooth A3-invariant Lorentzian surface.
2. There exist two time-like curves, both in either $\tilde S^+$ or $\tilde S^-$, satisfying (a) and (b).
3. There exists a time-like curve in either $\tilde S^+$ or $\tilde S^-$ that generates an A3-invariant surface that glues smoothly with $\langle x\rangle$, or a part of it, and the induced metric along the union is Lorentzian.


To do so, we study the behavior of the surfaces generated by a curve immersed in $\tilde S^+$ and by a curve immersed in $\tilde S^-$, in a neighborhood of $x$. Let $\alpha^+ : ]0,\delta[ \to \tilde S^+$ be a time-like curve such that $\lim_{s\to 0}\alpha^+(s) \in x$. We define $r^+$ as

$$r^+ = \{\lim_{s\to 0}\varsigma_t(\alpha^+(s)) \,/\, t \in \mathbb{R}\} = \{\lim_{s\to 0}\alpha^+(s) + \lambda x \,/\, \lambda > 0\}.$$

Let $\alpha^- : ]-\delta,0[ \to \tilde S^-$ be a time-like curve such that $\lim_{s\to 0}\alpha^-(s) \in x$. We define $r^-$ as

$$r^- = \{\lim_{s\to 0}\varsigma_t(\alpha^-(s)) \,/\, t \in \mathbb{R}\} = \{\lim_{s\to 0}\alpha^-(s) + \lambda x \,/\, \lambda < 0\}.$$

Then, cases 1 and 2 are not possible because, in both cases, the surface obtained after the gluing is not a $C^\infty$ surface in a neighborhood of the point at which both curves are glued. A necessary condition for case 3 to hold is that the curve obtained by gluing $\alpha$ and $r^+$ or $r^-$ (depending on whether $\alpha$ is immersed in $\tilde S^+$ or $\tilde S^-$, respectively) must be $C^\infty$. But in that case, the tangent plane of the surface along $r^+$ or $r^-$ is $\{(x,0,z) / x, z \in \mathbb{R}\}$, so the induced metric is not Lorentzian along the union.

Finally, the last result classifies the $A_3$-invariant surfaces which are also solutions of the O(2, 1) Nonlinear Sigma Model.

Theorem 9.2. Let $\alpha$ be a non-null curve immersed in S. Then the surface $\{\varsigma_t(\mathrm{trace}(\alpha)) / t \in \mathbb{R}\}$ generated by $\alpha$ is a non-degenerate solution of the O(2, 1) Nonlinear Sigma Model, that is, a non-degenerate Willmore surface with prescribed Gauss map along the common boundary in $(\mathbb{L}^3, g)$, if and only if $\alpha$ is a free elastica in the anti de Sitter plane $(\tilde S^+, \frac{1}{l_+^2} g)$ or in the anti de Sitter plane $(\tilde S^-, \frac{1}{l_-^2} g)$, respectively.

The proof is analogous to the one of Theorem 7.1, so it is left to the reader. We only have to recall that the curvature function for free elasticae in the de Sitter plane were made explicit at the end of Sect. 4, but in our case, we have to substitute ε2 by ε.


10. Appendix: A Gauss-Bonnet Formula for Non-Null Polygons

The Lorentzian version of the Gauss-Bonnet formula given in [8] is only true when the boundary, $\{\gamma_1, \gamma_2, \ldots, \gamma_n\}$, is made exclusively of time-like pieces. However, we are considering a more general setting where some pieces of the boundary might be space-like while others can be time-like. Therefore, we need to extend the Gauss-Bonnet formula to this more general context.

Let $(S, g)$ be a Lorentzian surface. We choose an orientation and a time-orientation in S. For any unitary vector $w \in T_pS$, denote by $w^\perp \in T_pS$ the unique unitary vector such that $\langle w, w^\perp\rangle = 0$ and the ordered basis $\{w, w^\perp\}$ is positively oriented. By choosing such a basis and expressing vectors by their corresponding coordinates, we can define the concept of hyperbolic angle, [8], made by a pair of time-like vectors. Let $u, v$ be two unitary time-like vectors. If they are both future-pointing (or both past-pointing), the angle, $\angle[u, v]$, from $u$ to $v$ is the number $\theta$ such that

$$A_\theta \cdot u = v, \quad\text{with}\quad A_\theta = \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix}.$$

Once we have defined the angle between two future-pointing (or past-pointing) unitary time-like vectors, we can define the angle between arbitrary unitary vectors according to the following cases [25]:
1. If $u$ is a future-pointing and $v$ a past-pointing (or vice versa) unitary time-like vector, define $\angle[u, v] = \angle[u, -v]$.
2. If $u$ and $v$ are unitary space-like vectors, then $u^\perp$ and $v^\perp$ are unitary time-like vectors and so we define $\angle[u, v] = \angle[u^\perp, v^\perp]$.
3. Finally, if $u$ is time-like and $v$ space-like, we define $\angle[u, v] = \angle[u, v^\perp]$; $\angle[v, u] = \angle[v^\perp, u]$.

With these definitions, Lemma 1 of [8] still holds, see [25]. The main purpose of the next step is to realize, à la Euler, the geodesic curvature of any non-null curve, $\delta(s)$, in S.
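The definition of the hyperbolic angle between two unitary time-like vectors with the same time-orientation can be illustrated numerically. The sketch below is not taken from [8] or [25]; it assumes coordinates in an orthonormal basis whose first vector is time-like (signature $(-,+)$), and the helper names `boost`, `apply`, `lorentz` and `hyperbolic_angle` are hypothetical.

```python
import math


def boost(theta):
    """Return the matrix A_theta = [[cosh, sinh], [sinh, cosh]]."""
    c, s = math.cosh(theta), math.sinh(theta)
    return ((c, s), (s, c))


def apply(m, u):
    """Multiply the 2x2 matrix m by the coordinate vector u."""
    return (m[0][0] * u[0] + m[0][1] * u[1],
            m[1][0] * u[0] + m[1][1] * u[1])


def lorentz(u, v):
    """Lorentzian product, assuming the first coordinate is time-like."""
    return -u[0] * v[0] + u[1] * v[1]


def hyperbolic_angle(u, v):
    """Angle from u to v for unitary time-like vectors that are both
    future-pointing or both past-pointing."""
    return math.atanh(v[1] / v[0]) - math.atanh(u[1] / u[0])


# Two future-pointing unitary time-like vectors with rapidities 0.3 and 1.1.
u = (math.cosh(0.3), math.sinh(0.3))
v = (math.cosh(1.1), math.sinh(1.1))

theta = hyperbolic_angle(u, v)
w = apply(boost(theta), u)  # should reproduce v, i.e. A_theta . u = v
```

In these coordinates the angle is simply the difference of the rapidities of the two vectors, which is why angles add up along a chain of time-like vectors.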
Without loss of generality, we may assume that $\delta(s)$ is arclength parametrized, with Frenet apparatus $\{T(s) = \delta'(s), T^\perp(s)\}$, curvature function $\kappa$, and Frenet equations

$$\nabla_T T = \varepsilon_2 \kappa T^\perp, \qquad \nabla_T T^\perp = -\varepsilon_1 \kappa T,$$

where $\nabla$ stands for the Levi-Civita connection of $(S, g)$, $\varepsilon_1 = g(T, T)$ and $\varepsilon_2 = g(T^\perp, T^\perp)$. On the other hand, let $Z(s)$ be a unitary time-like vector field parallel along $\delta(s)$. If $Z(0)$ is chosen future-pointing, then $Z(s)$ is future-pointing at every point, because parallel displacement preserves time-orientation. Notice that $Z^\perp(s)$ is also parallel along $\delta(s)$. Now, if we denote $\varphi(s) = \angle[T(s), Z(s)]$, the conclusion is

$$\varphi'(s) = -\kappa(s). \tag{15}$$

To check this formula, we distinguish two cases:
1. If $\delta(s)$ is time-like, the proof can be found in [8].
2. If $\delta(s)$ is space-like, then, in the basis $\{Z(s), Z^\perp(s)\}$, we have

$$A_{-\varphi(s)} \cdot Z(s) = \pm T^\perp(s), \tag{16}$$
$$A_{-\varphi(s)} \cdot Z^\perp(s) = \mp T(s), \tag{17}$$

depending on whether $T^\perp$ is future-pointing or past-pointing, respectively. Next, we differentiate (16) with respect to s, compare it with the Frenet equations and finally combine with (17) to obtain (15).

After this point, we can follow step by step the proof of [8], valid for time-like polygons (i.e., non-null polygons with time-like boundary), to obtain the following result:

Gauss-Bonnet formula for non-null polygons. Let $(S, g)$ be a Lorentzian surface and let $\mathcal{K} \subset S$ be a non-null polygon such that $\partial\mathcal{K}$ is a simple closed curve made up of a finite number of smooth non-null curves, $\delta_j(s)$, $1 \le j \le r$. Suppose that $\delta_j$ starts at $p_j \in S$ with initial unitary speed $t_j$ and ends at $p_{j+1} \in S$ with terminal unitary speed $u_j$, where $p_{r+1} = p_1$. Denote the exterior angles at vertices as follows: $\theta_1 = \angle[u_1, t_2]$, $\theta_2 = \angle[u_2, t_3]$, ..., $\theta_{r-1} = \angle[u_{r-1}, t_r]$ and $\theta_r = \angle[u_r, t_1]$. In this framework, we have

$$-\iint_{\mathcal{K}} K \, dA + \int_{\partial\mathcal{K}} \kappa \, ds + \sum_{j=1}^{r} \theta_j = 0,$$

where $K$ stands for the Gaussian curvature of $(S, g)$ and $\kappa$ denotes the geodesic curvature along $\partial\mathcal{K}$.

Acknowledgement. The authors would like to thank the referees for their useful comments, which helped us to improve this paper.

References

1. Albertsson, C., Lindstrom, U., Zabzine, M.: Commun. Math. Phys. 233, 403 (2003)
2. Albertsson, C., Lindstrom, U., Zabzine, M.: Nucl. Phys. B 678, 295 (2004)
3. Anzellotti, G., Serapioni, R., Tamanini, I.: Indiana Univ. Math. J. 39, 617 (1990)
4. Anzellotti, G., Delladio, S.: Proceedings of a Conference in Honor of the 70th Birthday of Robert Finn, Stanford University. Boston-Cambridge, MA: International Press Incorporated, 1995
5. Barros, M.: Phys. Lett. B 553, 325 (2003)
6. Barros, M., Caballero, M., Ortega, M.: J. Geom. Phys. 57, 177 (2006)
7. Belavin, A.A., Polyakov, A.M.: JETP Lett. 22, 245 (1975)
8. Birman, G.S., Nomizu, K.: Michigan Math. J. 31, 77 (1984)
9. Bracken, P.: The generalized Weierstrass system for nonconstant mean curvature surfaces and the nonlinear sigma model. http://arXiv.org/abs/math-ph/0607048v1, 2006
10. Bredthauer, A.: Tensionless Strings and Supersymmetric Sigma Models. Aspects of the Target Space Geometry. Uppsala: Acta Universitatis Upsaliensis, 2006
11. Byrd, P.F., Friedman, M.D.: Handbook of Elliptic Integrals for Engineers and Scientists. Berlin-Heidelberg-New York: Springer-Verlag, 1971
12. Capovilla, R., Guven, J.: J. Phys. A 38, 2593 (2005)
13. Cavalcante, F.S.A., Cunha, M.S., Almeida, C.A.S.: Phys. Lett. B 475, 315 (2000)
14. Chelnokov, V.E., Zeitlin, M.G.: Phys. Lett. A 104, 329 (1984)
15. Davis, H.T.: Introduction to Nonlinear Differential and Integral Equations. New York: Dover Publications, Inc., 1962
16. Davis, A.C., Macfarlane, A.J., van Holten, J.W.: Nucl. Phys. B 216, 493 (1983)
17. Do Carmo, M.P., Dajczer, M.: Tôhoku Math. J. 34, 425 (1982)
18. Gruszczak, J.: J. Phys. A 14, 3247 (1981)
19. Hano, J., Nomizu, K.: Tôhoku Math. J. (2) 36, 427 (1984)
20. Howe, P.S., Lindstrom, U., Stojevic, V.: JHEP 0601, 159 (2006)
21. Langer, J., Singer, D.A.: J. Diff. Geom. 20, 1 (1984)
22. Laughlin, R.B.: Phys. Rev. Lett. 60, 2677 (1988)
23. Mieck, B.: Nonlinear sigma model for a condensate composed of fermionic atoms. http://arXiv.org/abs/cond-mat/0501139v3[cond-mat.stat-mech], 2005
24. Mishchenko, Y., Chueng-Ryong Ji.: Int. J. Mod. Phys. A 20, 3488 (2005)
25. Nešović, E., Petrović-Torgašev, M., Verstraelen, L.: Bolletino U. M. I. (8) 8-B, 685 (2005)
26. Ody, M.S., Ryder, L.H.: Int. J. Mod. Phys. A 10, 337 (1995)
27. O’Neill, B.: Semi-Riemannian Geometry with Applications to Relativity. London-New York: Academic Press, 1983
28. Otsu, H., Sato, T., Ikemori, H., Kitakado, S.: JHEP 0507, 052 (2005)
29. Palais, R.S.: Commun. Math. Phys. 69, 19 (1979)
30. Polyakov, A.M.: Phys. Lett. B 103, 207 (1981)
31. Polyakov, A.M.: Phys. Lett. B 103, 211 (1981)
32. Purkait, S., Ray, D.: Phys. Lett. A 116, 247 (1986)
33. Schützhold, R., Mostane, S.: JETP Lett. 82, 248 (2005)
34. Simons, J.: Ann. Math. 88, 62 (1968)
35. Tseytlin, A.A.: Phys. Lett. B 288, 279 (1992)
36. Tseytlin, A.A.: Phys. Rev. D 47, 3421 (1993)
37. Tsurumaru, T., Tsutsui, I.: Phys. Lett. B 460, 94 (1999)
38. Vekslerchik, V.E.: J. Phys. A: Math. Gen. 27, 6299 (1994)
39. Weiner, J.L.: Indiana Univ. Math. J. 27, 19 (1978)
40. Willmore, T.J.: Total Curvature in Riemannian Geometry. New York: John Wiley and Sons, 1982

Communicated by N. A. Nekrasov

Commun. Math. Phys. 290, 479–522 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0848-7

Communications in

Mathematical Physics

Long-Time Asymptotics for the Focusing NLS Equation with Time-Periodic Boundary Condition on the Half-Line

Anne Boutet de Monvel1, Alexander Its2, Vladimir Kotlyarov3

1 IMJ, Université Paris 7, Case 7012, Site Chevaleret, 75205 Paris Cedex 13, France. E-mail: [email protected]
2 Indiana University - Purdue University, Indianapolis, IN, USA
3 Math. Div., Inst. B. Verkin, 47 Lenin Avenue, 61103 Kharkiv, Ukraine

Received: 29 January 2008 / Accepted: 23 March 2009 Published online: 24 June 2009 – © Springer-Verlag 2009

Abstract: We consider the focusing nonlinear Schrödinger equation on the quarter plane. The initial data vanish at infinity while the boundary data are time-periodic, of the form $a e^{i\alpha} e^{2i\omega t}$. The goal of this paper is to study the asymptotic behavior of the solution of this initial-boundary-value problem. The main tool is the asymptotic analysis of an associated matrix Riemann–Hilbert problem. We show that for $\omega < -3a^2$ the solution of the IBV problem has different asymptotic behaviors in different regions. In the region $x > 4bt$, where $b := \sqrt{(a^2 - \omega)/2} > 0$, the solution takes the form of the Zakharov-Manakov vanishing asymptotics. In a region of type $4bt - \frac{N+1}{2a}\log t < x < 4bt$, where N is any integer, the solution is asymptotic to a train of asymptotic solitons. In the region $4(b - a\sqrt{2})t < x < 4bt$, the solution takes the form of a modulated elliptic wave. In the region $0 < x < 4(b - a\sqrt{2})t$, the solution takes the form of a plane wave.

1. Introduction

The discovery of Lax pairs for nonlinear evolution equations and of the inverse scattering transform (IST) method for solving initial-value problems on the whole line has turned out to be very successful. This powerful method has produced a large number of results in different areas of mathematics and physics. In particular, at the beginning of the 1990s P. Deift and X. Zhou made a major advance in the development of the IST method: the nonlinear steepest descent method for oscillatory matrix Riemann–Hilbert problems. This method made it possible to rewrite known asymptotic results for various nonlinear integrable models in a rigorous and transparent form (see [11,14,15]) and to obtain numerous new significant results in the theory of completely integrable nonlinear equations, random matrix models, orthogonal polynomials, and integrable statistical mechanics.
This paper continues the study of the initial-boundary-value (IBV) problem which originated in [5–7], and which is related to the focusing nonlinear Schrödinger equation in the quarter plane x > 0, t > 0 with time periodic boundary data and vanishing at


infinity initial condition. The resulting spectral analysis [7] allows the solution to be represented in a Riemann–Hilbert form. The initial and boundary conditions must satisfy a certain global relation constraint for the IBV problem to be well posed. One of the important advantages of this method is that it yields the solution in a form that is very convenient for studying its long-time asymptotics. Using the Deift–Zhou steepest descent method [11,14,15] for oscillatory Riemann–Hilbert problems, the long-time asymptotics of several IBV problems have already been studied in [3,16–18,20], under the assumption that the boundary values at x = 0 vanish as t → +∞. To the best of our knowledge, IBV problems for the NLS equation with non-vanishing boundary data have not yet been considered in the framework of the RH method. We provide an implementation of the nonlinear steepest descent method for a matrix Riemann–Hilbert problem associated to the IBV problem with the simplest periodic boundary data. Fortunately, this simple case contains all the novel ingredients which are necessary to pose the corresponding RH problem for general periodic boundary data. The problem considered in this paper is similar, though not identical, to the shock problems arising for integrable PDEs on the whole line with two different finite-gap boundary conditions as x → ±∞. The development of the RH method for these problems goes back to the works done in the 1980s and 1990s by R. Bikbaev, P. Deift, V. Novokshenov, and S. Venakides. Most recently, an implementation of the RH scheme for the shock problem for the focusing nonlinear Schrödinger equation on the whole line and the evaluation of the long-time asymptotics of the corresponding solution have been performed in [10]. It is worth mentioning that our construction of phase g-functions is different from that in [10]. Note also that the numerical results obtained by Chunxiong Zheng are in good agreement with our theoretical results [9].
Another approach to IBV problems in the quarter plane, based on the general theory of PDEs, can be found in [2] and references therein. There the authors proved that, if the small-amplitude boundary forcing is periodic of period T, then the solution of the damped Korteweg–de Vries equation is asymptotically periodic for large t with the same period T. Note also that only local-in-time well-posedness is known [19], even with very simple initial and boundary data. The main results of this paper were announced in [4].

1.1. We consider the following initial-boundary value problem for the focusing nonlinear Schrödinger equation:

$$i q_t + q_{xx} + 2|q|^2 q = 0, \quad x, t \in \mathbb{R}_+, \tag{1.1a}$$
$$q(x, 0) = q_0(x), \tag{1.1b}$$
$$q(0, t) = g_0(t) = a e^{i\alpha} e^{2i\omega t}, \tag{1.1c}$$
$$q_0(0) = g_0(0) = a e^{i\alpha}, \tag{1.1d}$$

where $q_0(x)$ vanishes for $x \to +\infty$, $a > 0$, and $\alpha$ and $\omega$ are real numbers. We suppose that the solution $q(x, t)$ of the IBV problem exists for $x, t \in \mathbb{R}_+$. This solution is $C^\infty$, continuous with all its derivatives up to the boundary $\{x = 0\} \cup \{t = 0\}$ of the quarter xt-plane, and $q(x, t) \in S(\mathbb{R}_+)$ in x for any fixed $t \in \mathbb{R}_+$. Here $S(\mathbb{R}_+)$ is the space of Schwartz functions on $\mathbb{R}_+$:

$$S(\mathbb{R}_+) = \{u(x) \in C^\infty(\mathbb{R}_+) \mid x^n u^{(m)}(x) \in L^\infty(\mathbb{R}_+) \text{ for any } n, m \ge 0\}.$$

We assume the initial data $q_0(x) \in S(\mathbb{R}_+)$. It is also worth noticing that all the considerations of this paper are actually valid if the boundary condition (1.1c) is replaced by


its natural weaker version,

$$q(0, t) = g_0(t) = a e^{i\alpha} e^{2i\omega t} + v_0(t), \tag{1.2}$$

with $v_0(t) \in S(\mathbb{R}_+)$. The focusing nonlinear Schrödinger equation admits [21] a Lax pair consisting of the two linear eigenvalue problems (1.3) and (1.4) presented below. For the study of the initial-boundary value problem (1.1) we shall use, following the methodology of [17], a simultaneous spectral analysis of the two eigenvalue problems, one for the linear x-equation:

$$\Psi_x + ik\sigma_3 \Psi = Q(x, t)\Psi, \tag{1.3a}$$
$$Q(x, t) := \begin{pmatrix} 0 & q(x, t) \\ -\bar q(x, t) & 0 \end{pmatrix}, \qquad \sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{1.3b}$$

and the other for the linear t-equation

$$\Psi_t + 2ik^2\sigma_3 \Psi = \tilde Q(x, t; k)\Psi, \tag{1.4a}$$
$$\tilde Q(x, t; k) := 2kQ(x, t) - i\big(Q^2(x, t) + Q_x(x, t)\big)\sigma_3, \tag{1.4b}$$

where $\Psi(x, t; k)$ is a 2 × 2 matrix-valued function, $k \in \mathbb{C}$. It is well known that this system of linear equations is compatible [1,21] if and only if $q(x, t)$ solves the nonlinear Schrödinger equation.

1.2. To formulate the Riemann–Hilbert problem related to the IBV problem (1.1), we need to introduce spectral functions using the initial data, and the Dirichlet and Neumann boundary data. Thus we have to make an assumption on the structure of the Dirichlet to Neumann map. For $\omega < -3a^2$ we claim that this map takes the form (1.5) below.

Assumptions. We assume that the IBV problem (1.1) has a global solution $q(x, t)$, sufficiently smooth and with sufficient decay for $x \to +\infty$. We also assume that for $\omega < -3a^2$ the Neumann boundary values take the form

$$q_x(0, t) = g_1(t) = 2iab\, e^{i\alpha} e^{2i\omega t} + v_1(t), \quad\text{with } v_1(t) \in S(\mathbb{R}_+), \tag{1.5a}$$
$$b := \sqrt{\frac{a^2 - \omega}{2}} > 0. \tag{1.5b}$$

Note that numerical simulations provided by Chunxiong Zheng are in good agreement with this last assumption [9]. This assumption is also supported by the asymptotic results obtained in Sect. 5.4 (see Remark 4 at the end of this section).

Remark 1 (Dirichlet-to-Neumann). The structure of the Dirichlet to Neumann map depends essentially on the relation between the frequency $\omega$ and the amplitude $a$. An exact example in the case $\omega \ge a^2/2$ shows that the condition (1.5) is no longer valid for every initial-boundary data. Indeed, it follows from this example that there is a component in the space of the data where, instead of (1.5), the relevant assumption about the behavior of $q_x(0, t)$ is:

$$q_x(0, t) = 2a\hat b\, e^{i\alpha} e^{2i\omega t} + v_1(t), \quad\text{with } v_1(t) \in S(\mathbb{R}_+), \tag{1.6a}$$
$$\hat b := \sqrt{\frac{\omega}{2} - \frac{a^2}{4}} > 0. \tag{1.6b}$$

The example is the following exact solution (the stationary soliton) of the NLS equation:

$$q(x, t) = \frac{\sqrt{2\omega}\, e^{i\alpha} e^{2i\omega t}}{\cosh\big(\sqrt{2\omega}(x - x_0)\big)} = \frac{2\eta\, e^{i\alpha} e^{4i\eta^2 t}}{\cosh\big(2\eta(x - x_0)\big)}, \quad\text{where } \eta := \tfrac{1}{2}\sqrt{2\omega}.$$

For this solution we have $q(0, t) = a e^{i\alpha} e^{2i\omega t}$ and

$$q_x(0, t) = \frac{4\eta^2 e^{i\alpha} e^{4i\eta^2 t}}{\cosh(2\eta x_0)} \tanh(2\eta x_0) \equiv 2a\hat b\, e^{i\alpha} e^{2i\omega t},$$

where

$$a = \frac{2\eta}{\cosh 2\eta x_0}, \qquad \hat b = \eta \tanh 2\eta x_0.$$

In this case we have the relation $a^2 + 4\hat b^2 = 2\omega$. Thus the stationary soliton is the explicit solution of the IBV problem (1.1), satisfying (1.6), with initial data

$$q(x, 0) = \frac{\sqrt{2\omega}\, e^{i\alpha}}{\cosh\big(\sqrt{2\omega}(x - x_0)\big)}.$$
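The boundary values of the stationary soliton in Remark 1 can be checked numerically. The following is only an illustrative sketch: the parameter values and the names `soliton`, `a_amp` and `b_hat` are hypothetical choices, and the Neumann value is approximated by a central finite difference rather than computed in closed form.

```python
import cmath
import math


def soliton(x, t, omega, alpha, x0):
    """Stationary soliton 2*eta*e^{i alpha} e^{4 i eta^2 t} / cosh(2 eta (x - x0))."""
    eta = 0.5 * math.sqrt(2.0 * omega)
    phase = cmath.exp(1j * alpha) * cmath.exp(4j * eta ** 2 * t)
    return 2.0 * eta * phase / math.cosh(2.0 * eta * (x - x0))


# Hypothetical sample parameters (omega > 0 is required for a real eta).
omega, alpha, x0, t = 2.0, 0.4, 0.6, 0.9
eta = 0.5 * math.sqrt(2.0 * omega)

a_amp = 2.0 * eta / math.cosh(2.0 * eta * x0)   # amplitude a of the Dirichlet data
b_hat = eta * math.tanh(2.0 * eta * x0)         # the constant b-hat of (1.6b)

carrier = cmath.exp(1j * alpha) * cmath.exp(2j * omega * t)

# Dirichlet value q(0, t) versus a e^{i alpha} e^{2 i omega t}.
dirichlet_error = abs(soliton(0.0, t, omega, alpha, x0) - a_amp * carrier)

# Neumann value q_x(0, t) by a central difference, versus 2 a b-hat e^{...}.
h = 1e-5
qx0 = (soliton(h, t, omega, alpha, x0) - soliton(-h, t, omega, alpha, x0)) / (2 * h)
neumann_error = abs(qx0 - 2.0 * a_amp * b_hat * carrier)

# The relation a^2 + 4 b-hat^2 = 2 omega.
relation_error = abs(a_amp ** 2 + 4.0 * b_hat ** 2 - 2.0 * omega)
```

Note that the Neumann value here carries the factor $2a\hat b$ without the imaginary unit, in contrast with the factor $2iab$ in (1.5a).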

For a further analysis and for asymptotic results in the case $\omega \ge a^2/2$, see [8].

1.3. In this paper we restrict our attention to the case $\omega < -3a^2$. The focusing NLS equation admits the exact solution

$$q_p(x, t) = a e^{i\alpha} e^{2ibx + 2i\omega t}, \qquad b := \sqrt{(a^2 - \omega)/2} > 0, \quad a > 0.$$

In this case we have the relation $a^2 - 2b^2 = \omega$. Let $\tilde Q_p(t, k) = \tilde Q_p(0, t; k)$, where $\tilde Q_p(x, t; k)$ is defined like $\tilde Q(x, t; k)$ but starting from $q_p(x, t)$ instead of $q(x, t)$, i.e.

$$\tilde Q_p(t, k) := 2kQ_p(t) - i\big(Q_p^2(t) + (Q_p)_x(t)\big)\sigma_3,$$

with

$$Q_p(t) := Q_p(0, t) = \begin{pmatrix} 0 & a e^{i\alpha} e^{2i\omega t} \\ -a e^{-i\alpha} e^{-2i\omega t} & 0 \end{pmatrix},$$
$$(Q_p)_x(t) := (Q_p)_x(0, t) = \begin{pmatrix} 0 & 2iab\, e^{i\alpha} e^{2i\omega t} \\ 2iab\, e^{-i\alpha} e^{-2i\omega t} & 0 \end{pmatrix},$$
$$Q_p(x, t) := \begin{pmatrix} 0 & q_p(x, t) \\ -\bar q_p(x, t) & 0 \end{pmatrix}.$$
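The relation $a^2 - 2b^2 = \omega$ for the plane wave $q_p$ can be checked numerically. The sketch below is an illustration only: the sample values of $a$, $\alpha$, $b$ and the helper names are hypothetical, and the derivatives are approximated by central finite differences.

```python
import cmath


def q_plane(x, t, a, alpha, b, omega):
    """Plane wave a e^{i alpha} e^{2 i b x + 2 i omega t}."""
    return a * cmath.exp(1j * alpha) * cmath.exp(2j * b * x + 2j * omega * t)


def nls_residual(q, x, t, h=1e-4):
    """Finite-difference residual of i q_t + q_xx + 2 |q|^2 q at (x, t)."""
    q_t = (q(x, t + h) - q(x, t - h)) / (2 * h)
    q_xx = (q(x + h, t) - 2 * q(x, t) + q(x - h, t)) / h ** 2
    return 1j * q_t + q_xx + 2 * abs(q(x, t)) ** 2 * q(x, t)


# Hypothetical sample parameters; omega is fixed by a^2 - 2 b^2 = omega.
a, alpha, b = 1.0, 0.3, 2.0
omega = a ** 2 - 2 * b ** 2  # equals -7, which satisfies omega < -3 a^2


def q(x, t):
    return q_plane(x, t, a, alpha, b, omega)


def q_detuned(x, t):
    # Same wave with a frequency that violates the relation.
    return q_plane(x, t, a, alpha, b, omega + 1.0)


residual = nls_residual(q, 0.7, 1.3)
residual_detuned = nls_residual(q_detuned, 0.7, 1.3)

# Neumann value q_x(0, t), to be compared with 2 i a b e^{i alpha} e^{2 i omega t}.
h = 1e-6
neumann = (q(h, 1.3) - q(-h, 1.3)) / (2 * h)
neumann_expected = 2j * a * b * cmath.exp(1j * alpha) * cmath.exp(2j * omega * 1.3)
```

The residual is small only when the frequency satisfies the stated relation, and the Neumann value of the plane wave matches the leading term of (1.5a).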

Consider now the t-part (1.4a) of the Lax pair associated with $\tilde Q_p(t)$, i.e.

$$\Psi_t(t, k) + 2ik^2\sigma_3\Psi(t, k) = \tilde Q_p(t, k)\Psi(t, k), \quad t > 0, \ k \in \mathbb{C}, \tag{1.7}$$

where $\Psi(t, k)$ is 2 × 2 matrix-valued. A particular solution of (1.7) is given by

$$\Psi(t, k) = E(t, k)\, e^{i(\omega - \Omega(k))\sigma_3 t}, \tag{1.8a}$$
$$E(t, k) = e^{i\omega\hat\sigma_3 t} E(k) := e^{i\omega\sigma_3 t} E(k)\, e^{-i\omega\sigma_3 t}, \tag{1.8b}$$
$$E(k) = \frac{1}{2}\begin{pmatrix} \nu(k) + \dfrac{1}{\nu(k)} & e^{i\alpha}\Big(\nu(k) - \dfrac{1}{\nu(k)}\Big) \\ e^{-i\alpha}\Big(\nu(k) - \dfrac{1}{\nu(k)}\Big) & \nu(k) + \dfrac{1}{\nu(k)} \end{pmatrix}, \tag{1.8c}$$
$$\Omega(k) = 2(k - b)X(k), \tag{1.8d}$$
$$X(k) = \sqrt{(k + b)^2 + a^2}, \tag{1.8e}$$
$$\nu(k) = \Big(\frac{k + b - ia}{k + b + ia}\Big)^{1/4}. \tag{1.8f}$$

We fix the branches of the roots by their asymptotics, for $k \to \infty$:

$$X(k) = \sqrt{(k + b)^2 + a^2} = k + b + O(k^{-1}), \tag{1.9a}$$
$$\nu(k) = \Big(\frac{k + b - ia}{k + b + ia}\Big)^{1/4} = 1 - \frac{ia}{2k} + O(k^{-2}) \tag{1.9b}$$

on the complex k-plane cut along any curve connecting the two branch points $E$ and $\bar E$.

In this paper we carry out the principal ingredients of the asymptotic analysis of the basic Riemann–Hilbert problem which is formulated below and whose detailed presentation is given in [7]. Under certain further technical assumptions on the Riemann–Hilbert data, we shall describe the long time asymptotics of the solution of the related IBV problem.

Notations. (1) If $\mu$ is a 2 × 2 matrix we denote its columns by $[\mu]_1$ and $[\mu]_2$. (2) Let $f$ be a function defined in a neighborhood of an oriented contour $\Sigma$ in the Riemann sphere $\mathbb{C} \cup \{\infty\}$ or in some Riemann surface and let $k \in \Sigma$ be a non self-crossing point. We denote by $f_+(k)$ the boundary value of $f$ at $k$ from the left side and by $f_-(k)$ its boundary value from the right side.
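The normalizations (1.9) and the unit determinant of $E(k)$ can be checked numerically for large real $k$. The sketch below rests on an assumption: it uses the principal branches provided by Python's `cmath`, which agree with the asymptotic normalization only away from the cut (for instance for large positive real $k$). The function names and sample values are hypothetical.

```python
import cmath


def X(k, a, b):
    """Square root sqrt((k + b)^2 + a^2), principal branch."""
    return cmath.sqrt((k + b) ** 2 + a ** 2)


def nu(k, a, b):
    """Fourth root ((k + b - ia)/(k + b + ia))^(1/4), principal branch."""
    return ((k + b - 1j * a) / (k + b + 1j * a)) ** 0.25


def E_matrix(k, a, b, alpha):
    """Matrix E(k) built from nu(k) as in (1.8c)."""
    n = nu(k, a, b)
    plus = 0.5 * (n + 1 / n)
    minus = 0.5 * (n - 1 / n)
    return [[plus, cmath.exp(1j * alpha) * minus],
            [cmath.exp(-1j * alpha) * minus, plus]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


a, b, alpha = 1.0, 2.0, 0.3
k = 1.0e3

x_error = abs(X(k, a, b) - (k + b))                  # expected to be O(1/k)
nu_error = abs(nu(k, a, b) - (1 - 1j * a / (2 * k)))  # expected to be O(1/k^2)
det_error = abs(det2(E_matrix(k, a, b, alpha)) - 1)
```

The determinant identity follows from $(\nu + 1/\nu)^2 - (\nu - 1/\nu)^2 = 4$ and therefore holds for any nonzero value of $\nu$, independently of the branch.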


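2. Eigenfunctions

We first define the contour $\Sigma := \{k \in \mathbb{C} \mid \operatorname{Im}\Omega(k) = 0\}$, where $\Omega(k)$ is given by (1.8d). Let us put $k_1 = \operatorname{Re} k$ and $k_2 = \operatorname{Im} k$. Then the equation $\operatorname{Im}\Omega(k) = 0$ means $k_2 = 0$ or

$$k_1 k_2^2 = (k_1 - b)\Big(k_1^2 + bk_1 + \frac{a^2}{2}\Big) = (k_1 - b)(k_1 - \kappa_-)(k_1 - \kappa_+) \quad\text{with } |k_1| \le |b|,$$

where $2b^2 = a^2 - \omega$. In what follows we suppose $\omega < -3a^2$, i.e., $b^2 > 2a^2$, and $b > 0$. Therefore, $\kappa_\pm$ are real and

$$\kappa_\pm = -\frac{b}{2} \pm \sqrt{\frac{b^2}{4} - \frac{a^2}{2}},$$

with $-b < \kappa_- \le -b/2 \le \kappa_+ < 0$ (Fig. 1). The contour $\Sigma$ consists of the real axis $\mathbb{R}$, the finite arc $\gamma \cup \bar\gamma$ whose endpoints are the branch points $E = -b + ia$ and $\bar E = -b - ia$, and the contour $\Gamma \cup \bar\Gamma$:

$$\Sigma = \mathbb{R} \cup \gamma \cup \bar\gamma \cup \Gamma \cup \bar\Gamma.$$

The $D_j$, $j = 1, 2, 3, 4$, are the following domains:

$$D_1 := \{k \in \mathbb{C} \mid \operatorname{Im} k > 0, \ \operatorname{Im}\Omega(k) > 0\},$$
$$D_2 := \{k \in \mathbb{C} \mid \operatorname{Im} k > 0, \ \operatorname{Im}\Omega(k) < 0\},$$
$$D_3 := \{k \in \mathbb{C} \mid \operatorname{Im} k < 0, \ \operatorname{Im}\Omega(k) > 0\},$$
$$D_4 := \{k \in \mathbb{C} \mid \operatorname{Im} k < 0, \ \operatorname{Im}\Omega(k) < 0\}.$$

Fig. 1. The domains $D_j$ for $b^2 > 2a^2$, $a > 0$, $b > 0$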
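The description of the non-real part of the contour can be checked numerically. The sketch below is an illustration under stated assumptions: it writes the function of (1.8d) as $2(k-b)X(k)$ with $X(k)^2 = (k+b)^2 + a^2$ and, to stay independent of the choice of branch, only tests the necessary condition that its square is real on the curve. The helper names and sample values are hypothetical.

```python
import math


def kappa_pm(a, b):
    """Real roots of k1^2 + b k1 + a^2/2, which exist when b^2 >= 2 a^2."""
    disc = b * b / 4 - a * a / 2
    if disc < 0:
        raise ValueError("kappa_pm requires b^2 >= 2 a^2")
    r = math.sqrt(disc)
    return -b / 2 - r, -b / 2 + r


def curve_k2(k1, a, b):
    """Positive solution k2 of k1 k2^2 = (k1 - b)(k1^2 + b k1 + a^2/2)."""
    return math.sqrt((k1 - b) * (k1 * k1 + b * k1 + a * a / 2) / k1)


def omega_squared(k, a, b):
    """Square of 2 (k - b) X(k), which avoids choosing a branch of X."""
    return 4 * (k - b) ** 2 * ((k + b) ** 2 + a * a)


a, b = 1.0, 2.0  # sample values with b^2 > 2 a^2
k_minus, k_plus = kappa_pm(a, b)

# A point on the unbounded branch of the curve, with kappa_+ < k1 < 0.
k_on_curve = complex(-0.1, curve_k2(-0.1, a, b))
imag_part = omega_squared(k_on_curve, a, b).imag

# At k1 = -b the curve passes through the branch point E = -b + i a.
k2_at_minus_b = curve_k2(-b, a, b)
```

For the sample values the roots come out in the order stated in the text, and the curve meets the line $k_1 = -b$ exactly at height $a$.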


We also define

$$\Omega_+ := D_1 \cup D_3 = \{k \in \mathbb{C} \mid \operatorname{Im}\Omega(k) > 0\}, \qquad \Omega_- := D_2 \cup D_4 = \{k \in \mathbb{C} \mid \operatorname{Im}\Omega(k) < 0\}.$$

So we obtain a partition of the complex k-plane $\mathbb{C}$: $D_1 \cup D_2 \cup D_3 \cup D_4 \cup \Sigma = \mathbb{C}$.

We assume that there exists a unique global solution $q(x, t)$ satisfying (1.1) and (1.5) and we consider the associated functions $Q(x, t)$ and $\tilde Q(x, t; k)$ defined by (1.3b) and (1.4b), respectively. Define the 2 × 2 matrix-valued functions $\{\mu_j(x, t; k)\}_{j=1}^3$ for $0 < x < \infty$ and $0 < t < \infty$, as the solutions of the following Volterra integral equations:

$$\mu_3(x, t; k) = I - \int_x^\infty e^{ik(\xi - x)\hat\sigma_3}(Q\mu_3)(\xi, t; k)\, d\xi, \tag{2.1a}$$

$$\mu_2(x, t; k) = I + e^{-ikx\hat\sigma_3}\int_0^t e^{-2ik^2(t - \tau)\hat\sigma_3}(\tilde Q\mu_2)(0, \tau; k)\, d\tau + \int_0^x e^{-ik(x - \xi)\hat\sigma_3}(Q\mu_2)(\xi, t; k)\, d\xi, \tag{2.1b}$$

$$\mu_1(x, t; k) = e^{(-ikx + i\omega t)\hat\sigma_3}E(k) + e^{-ikx\hat\sigma_3}E(t, k)\int_\infty^t e^{i[\omega - \Omega(k)](t - \tau)\hat\sigma_3}E^{-1}(\tau; k)\tilde Q_0(\tau; k)\mu_1(0, \tau; k)\, d\tau + \int_0^x e^{-ik(x - \xi)\hat\sigma_3}(Q\mu_1)(\xi, t; k)\, d\xi, \tag{2.1c}$$

where $E(k)$, $E(t, k)$, and $\Omega(k)$ are defined by (1.8c), (1.8b), and (1.8d) respectively, and $\tilde Q_0(t; k) := \tilde Q(0, t; k) - \tilde Q_p(t; k)$.

Proposition 1. The 2 × 2 matrices $\{\mu_j(x, t; k)\}_{j=1}^3$ have the following properties:

(i) For $j = 1, 2, 3$:
$$\det\mu_j(x, t; k) \equiv 1. \tag{2.2}$$

(ii) The functions $\{\Psi_j\}_{j=1}^3$ defined by
$$\Psi_1(x, t; k) := \mu_1(x, t; k)\, e^{-ikx\sigma_3 + i[\omega - \Omega(k)]t\sigma_3}, \tag{2.3}$$
$$\Psi_j(x, t; k) := \mu_j(x, t; k)\, e^{-ikx\sigma_3 - 2ik^2 t\sigma_3}, \quad j = 2, 3 \tag{2.4}$$
satisfy the Lax pair (1.3)–(1.4).

(iii) For $j = 1, 2, 3$:
$$\mu_j(x, t; k) = I + O(k^{-1}), \quad k \to \infty, \ \operatorname{Im} k = 0. \tag{2.5}$$

(iv) Near $k = -b \pm ia$, the matrix $\mu_1(x, t; k)$ exhibits inverse fourth-root singularities like those the matrix $E(k)$ has.


(v) The matrix $\mu_1(x, t; k)$ has different boundary values along a cut $\gamma$ connecting the two points $k = -b \pm ia$, which are the branch points of the function $X(k)$.

(vi) The matrix $\mu_2(x, t; k)$ is entire in $k \in \mathbb{C}$.

Furthermore

$$\mu_1 = \big(\mu_1^{(2)}, \mu_1^{(3)}\big), \qquad \mu_2 = \big(\mu_2^{(1)}, \mu_2^{(4)}\big), \qquad \mu_3 = \big(\mu_3^{(34)}, \mu_3^{(12)}\big), \tag{2.6}$$

where
(a) $\mu_1^{(2)}$ means that the first column vector $[\mu_1(x, t; k)]_1$ is bounded and analytic in $D_2$,
(b) $\mu_1^{(3)}$ means that the second column $[\mu_1(x, t; k)]_2$ is bounded and analytic in $D_3$,
(c) $\mu_3^{(12)}$ means that $[\mu_3(x, t; k)]_2$ is bounded and analytic in $D_1 \cup D_2$, etc.

Let $\{\Psi_j\}_{j=1}^3$ be the 2 × 2 matrix-valued functions defined in Proposition 1. Then in their domains of definition the functions $\{\Psi_l(x, t; k)\}_{l=1}^3$ satisfy both equations of the Lax pair, and their determinants (2.2) do not vanish. Hence they are linearly dependent and satisfy the following dependence relations:

$$\Psi_3(x, t; k) = \Psi_2(x, t; k)\, s(k), \quad k \in \mathbb{R}, \tag{2.7}$$
$$\Psi_1(x, t; k) = \Psi_2(x, t; k)\, S(k), \quad k \in \Sigma, \tag{2.8}$$

where $s(k)$ and $S(k)$ are defined by

$$s(k) := \Psi_3(0, 0; k) =: \begin{pmatrix} \bar a(\bar k) & b(k) \\ -\bar b(\bar k) & a(k) \end{pmatrix}, \tag{2.9}$$
$$S(k) := \Psi_1(0, 0; k) =: \begin{pmatrix} \bar A(\bar k) & B(k) \\ -\bar B(\bar k) & A(k) \end{pmatrix}. \tag{2.10}$$

Furthermore, the scattering relations (2.7) and (2.8) yield

$$\Psi_1(x, t; k) = \Psi_3(x, t; k)\, T(k), \tag{2.11}$$

where $T(k) = s^{-1}(k)S(k)$. We denote by $\{T_{ij}(k)\}_{i,j=1}^2$ the entries of the 2 × 2 matrix $T(k)$. Then (2.9)–(2.10) imply:

$$T_{11}(k) = \bar T_{22}(\bar k) = a(k)\bar A(\bar k) + b(k)\bar B(\bar k), \tag{2.12}$$
$$T_{12}(k) = -\bar T_{21}(\bar k) = a(k)B(k) - b(k)A(k). \tag{2.13}$$

We define

$$c(k) := \frac{T_{21}(k)}{T_{11}(k)} - \frac{\bar b(\bar k)}{a(k)} = -\frac{\bar B(\bar k)}{a(k)T_{11}(k)}, \tag{2.14}$$

which is analytic in $k \in D_2 \setminus \{z_1, z_2, \ldots, z_m\}$ and is $O(k^{-1})$ as $k \to \infty$. Here, the $z_j$'s are the zeros of the function $T_{11}(k)$ in $D_2$. These properties of $c(k)$ follow from the definition of $c(k)$ and from the corresponding properties of the functions $a(k)$, $b(k)$,


$A(k)$, $B(k)$ described below and in [7]. We assume that there are only a finite number of zeros $z_j$ (i.e., $m < \infty$) and that they are all simple. Let us also denote by

$$r(k) := \frac{\bar b(k)}{a(k)} \quad\text{for } k \in \mathbb{R} \tag{2.15}$$

the “reflection coefficient” of the x-problem, and

$$\rho(k) := c(k) + r(k). \tag{2.16}$$

Let $q_0(x) \in S(\mathbb{R}_+)$. Then the map

$$S_x : \{q_0(x)\} \longrightarrow \{a(k), b(k)\} \tag{2.17}$$

defined by (2.7), (2.9) has the following properties:

Properties of a(k), b(k). The spectral functions $a(k)$ and $b(k)$ satisfy:
(i) $a(k)$, $b(k)$ are analytic and bounded for $k \in \mathbb{C}_+$.
(ii) $a(k), b(k) \in C^\infty(\mathbb{R})$.
(iii) $|a(k)|^2 + |b(k)|^2 \equiv 1$, $k \in \mathbb{R}$.
(iv) $a(k) = 1 + O(k^{-1})$, $b(k) = O(k^{-1})$, $k \to \infty$.

The map $S_x$ has an inverse $Q_x : \{a(k), b(k)\} \longrightarrow q_0(x)$ given by:

$$q_0(x) = 2i \lim_{k \to \infty} k\, M_{12}^{(x)}(x, k),$$

where $M^{(x)}(x, k)$ is the unique solution of some Riemann–Hilbert problem $RH_x$ [7].

Now let

$$g_0(t) := q(0, t) = a e^{2i\omega t}, \qquad g_1(t) := q_x(0, t) = 2iab\, e^{2i\omega t} + v_1(t) \quad\text{with } v_1(t) \in S(\mathbb{R}_+).$$

Then the map

$$S_t : \{g_0(t), g_1(t)\} \longrightarrow \{A(k), B(k)\} \tag{2.18}$$

defined by (2.8), (2.10) has the following properties:

Properties of A(k), B(k). The spectral functions $A(k)$ and $B(k)$ satisfy:
(i) $A(k)$, $B(k)$ are analytic and bounded for $k \in \Omega_+ = D_1 \cup D_3$.
(ii) $A(k), B(k) \in C^\infty(\Sigma \setminus \{E, \bar E\})$, $E = -b + ia$, $\bar E = -b - ia$.
(iii) $A(k)\bar A(\bar k) + B(k)\bar B(\bar k) \equiv 1$ for $k \in \Sigma$.
(iv) $A(k) - \frac{1}{2}\big(\nu(k) + \frac{1}{\nu(k)}\big)$ and $B(k) - \frac{1}{2}e^{i\alpha}\big(\nu(k) - \frac{1}{\nu(k)}\big)$ are bounded for $k \in \Omega_+$.
(v) $A(k) = 1 + O(k^{-1})$ and $B(k) = O(k^{-1})$ for $k \to \infty$.
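The algebra behind (2.12)–(2.14) can be checked with arbitrary complex numbers. The sketch below is an illustration under an explicit assumption: it reads (2.14) as $c = T_{21}/T_{11} - \bar b(\bar k)/a(k) = -\bar B(\bar k)/(a(k)T_{11}(k))$, and it treats the conjugated quantities $\bar a(\bar k)$, $\bar b(\bar k)$, $\bar A(\bar k)$, $\bar B(\bar k)$ as independent numbers (`a_s`, `b_s`, `A_s`, `B_s`, all hypothetical names) constrained only by $\det s = \det S = 1$.

```python
def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][n] * q[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Sample values; a_s and A_s are fixed by the unit-determinant conditions.
a, b, b_s = complex(1.2, 0.3), complex(0.4, -0.7), complex(-0.5, 0.2)
a_s = (1 - b * b_s) / a
A, B, B_s = complex(0.9, -0.4), complex(0.3, 0.6), complex(0.1, -0.8)
A_s = (1 - B * B_s) / A

s = [[a_s, b], [-b_s, a]]    # matrix s(k) of (2.9)
S = [[A_s, B], [-B_s, A]]    # matrix S(k) of (2.10)
T = mat_mul(mat_inv(s), S)   # T(k) = s^{-1}(k) S(k)

c_from_T = T[1][0] / T[0][0] - b_s / a
c_closed_form = -B_s / (a * T[0][0])
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
```

The two expressions for $c$ agree precisely because $a\,\bar a + b\,\bar b = 1$, which is the determinant condition on $s(k)$.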


The map $S_t$ has an inverse $Q_t : \{A(k), B(k)\} \longrightarrow \{g_0(t), g_1(t)\}$ given by

$$g_0(t) = 2i \lim_{k \to \infty} k\, M_{12}^{(t)}(t, k),$$
$$g_1(t) = \lim_{k \to \infty}\big[4k^2 M_{12}^{(t)}(t, k) + 2i g_0(t)\, k\, M_{22}^{(t)}(t, k)\big],$$

where $M^{(t)}(t, k)$ is the unique solution of some Riemann–Hilbert problem $RH_t$ [7].

Global relation. The spectral functions satisfy the “global relation”

$$b(k)A(k) - a(k)B(k) \equiv 0 \quad\text{for } k \in D_1. \tag{2.19}$$

The global relation yields that $T_{12}(k) \equiv 0$ for $k \in D_1$ and $T_{21}(k) \equiv 0$ for $k \in D_4$. In particular, it means that the spectral term (2.16) vanishes for $k \ge \kappa_+$:

$$\rho(k) \equiv 0 \quad\text{for } k \in [\kappa_+, +\infty). \tag{2.20}$$

The function $c(k)$ defined by (2.14) is analytic in $D_2$ and has a jump across the contour $\gamma$:

$$f(k) := c_-(k) - c_+(k) = \frac{-ie^{-i\alpha}}{T_{11}^-(k)\, T_{11}^+(k)} \quad\text{for } k \in \gamma. \tag{2.21}$$

The function $\rho(k)$ and all its derivatives have jumps at $k = \kappa_-$. Since $r(k)$ is smooth we have for $l = 0, 1, 2, \ldots$,

$$\frac{d^l}{dk^l}\rho(k)\Big|_{k = \kappa_- - 0} - \frac{d^l}{dk^l}\rho(k)\Big|_{k = \kappa_- + 0} = f_l(\kappa_-) := \frac{d^l}{dk^l}c(k)\Big|_{k = \kappa_- - 0} - \frac{d^l}{dk^l}c(k)\Big|_{k = \kappa_- + 0}. \tag{2.22}$$

Dispersion relation. The spectral functions $\rho(k)$, $f(k)$ and the discrete spectrum are not independent. They are connected by a very important formula, the “dispersion relation”:

$$\int_{\gamma \cup \bar\gamma} \log\Big[h(s)\prod_{j=1}^m \Big(\frac{s - \bar z_j}{s - z_j}\Big)^{-2}\delta^{-2}(s)\Big]\frac{ds}{X_+(s)} \equiv 0 \mod 4\pi^2, \tag{2.23}$$

where $\{z_1, z_2, \ldots, z_m\}$ are the zeros of $T_{11}(k)$ in $D_2$, and

$$h(k) = \begin{cases} -ie^{i\alpha} f(k), & k \in \gamma, \\ -ie^{i\alpha}\bar f^{-1}(\bar k), & k \in \bar\gamma, \end{cases}$$
$$\delta(k) = \exp\Big\{\frac{1}{2\pi i}\int_{-\infty}^{\kappa_+}\frac{\log(1 + |\rho(s)|^2)\, ds}{s - k}\Big\}, \quad k \in \mathbb{C} \setminus (-\infty, \kappa_+].$$


Proof (of the dispersion relation). The function $T_{11}(k)$ is analytic in $D_2$. The global relation (2.19) together with property (iii), i.e., $\det S(k) \equiv 1$ for $k \in \Sigma$, yields

$$T_{11}(k) = \frac{a(k)}{A(k)} \quad\text{for } k \in \Sigma.$$

If $A(k_j) = 0$ for $k_j \in D_1$ then $B(k_j) \ne 0$. Thus, due to the global relation, $a(k_j) = 0$ and vice versa. Hence $T_{11}(k)$ has an analytic continuation to the upper half-plane $\operatorname{Im} k > 0$ with a jump on $\gamma$, and $T_{11}(k) \ne 0$ in $D_1$. Similarly $T_{22}(k) = \bar T_{11}(\bar k)$ has an analytic continuation to the lower half-plane $\operatorname{Im} k < 0$ with a jump on $\bar\gamma$, and $T_{22}(k) \ne 0$ in $D_4$. Moreover, $T_{22}(k) = 1/T_{11}(k)$ for $k \in (\kappa_+, \infty)$ since $T_{12}(k) = T_{21}(k) = 0$ and $\det T(k) = 1$. Let, as before, $\{z_1, z_2, \ldots, z_m\}$ be the zeros of $T_{11}(k)$ in $D_2$, and $\{\bar z_1, \bar z_2, \ldots, \bar z_m\}$ the zeros of $T_{22}(k)$ in $D_3$. Then the function

$$\hat T(k) := \begin{cases} T_{11}(k)\displaystyle\prod_{j=1}^m \frac{k - \bar z_j}{k - z_j}, & k \in D_1 \cup D_2, \\[2ex] \dfrac{1}{T_{22}(k)}\displaystyle\prod_{j=1}^m \frac{k - \bar z_j}{k - z_j}, & k \in D_3 \cup D_4 \end{cases}$$

is analytic outside $(-\infty, \kappa_+]$. Using (2.14)–(2.16) we find $\rho(k) = T_{21}(k)/T_{22}(k)$ for $k \in \mathbb{R}$. Moreover, $\rho(k) \equiv 0$ for $k \in [\kappa_+, +\infty)$ by (2.20). From the determinant relation $T_{11}(k)T_{22}(k)[1 + |\rho(k)|^2] = 1$, we find

$$\hat T_-(k) = [1 + |\rho(k)|^2]\,\hat T_+(k), \quad k \in (-\infty, \kappa_+].$$

Since $\delta_+(k) = [1 + |\rho(k)|^2]\,\delta_-(k)$ for $k \in (-\infty, \kappa_+]$, the function $F(k) = \hat T^{-1}(k)\delta^{-1}(k)$ is analytic everywhere except in $\gamma \cup \bar\gamma$. Taking into account the definition (2.21) of $f(k)$ we have

$$F_-(k)F_+(k) = h(k)\prod_{j=1}^m \Big(\frac{k - \bar z_j}{k - z_j}\Big)^{-2}\delta^{-2}(k), \quad k \in \gamma \cup \bar\gamma. \tag{2.24}$$

Thus $F(k)$ solves a scalar Riemann–Hilbert problem: Find a scalar function $F(k)$ such that
• F is analytic outside the contour $\gamma \cup \bar\gamma$,
• F does not vanish,
• F is bounded at infinity,
• F has the jump (2.24) across $\gamma \cup \bar\gamma$.

To solve this RH problem let us use the function $X(k) = \sqrt{(k - E)(k - \bar E)}$. Since

$$\Big[\frac{\log F(k)}{X(k)}\Big]_+ - \Big[\frac{\log F(k)}{X(k)}\Big]_- = \frac{\log\Big[h(k)\prod_{j=1}^m \big(\frac{k - \bar z_j}{k - z_j}\big)^{-2}\delta^{-2}(k)\Big]}{X_+(k)}, \quad k \in \gamma \cup \bar\gamma,$$

we have

$$F(k) = \exp\Big\{\frac{X(k)}{2\pi i}\int_{\gamma \cup \bar\gamma}\frac{\log\Big[h(s)\prod_{j=1}^m \big(\frac{s - \bar z_j}{s - z_j}\big)^{-2}\delta^{-2}(s)\Big]}{s - k}\,\frac{ds}{X_+(s)}\Big\}, \tag{2.25}$$

and finally:

$$T_{11}(k) = \prod_{j=1}^m \frac{k - z_j}{k - \bar z_j}\,\delta^{-1}(k)F^{-1}(k).$$

Since $T_{11}(\infty) = \delta(\infty) = 1$, we get $F(\infty) = 1$. The latter equation yields the constraint (2.23) on the spectral functions.

3. The Basic Riemann–Hilbert Problem

The relations among the eigenfunctions (2.7)–(2.13) can be rewritten in the form of a Riemann–Hilbert problem $RH_{xt}$:

$$M_-(x, t; k) = M_+(x, t; k)J(x, t; k), \quad k \in \Sigma, \tag{3.1}$$

which is connected with the IBV problem (1.1). The orientation of the contour $\Sigma = \mathbb{R} \cup \gamma \cup \bar\gamma \cup \Gamma \cup \bar\Gamma$ is shown in Fig. 2.

The boundary values. In (3.1) $M_+(x, t; k)$ denotes the boundary value at $k \in \Sigma$ of the 2 × 2 matrix-valued function $M(x, t; k)$ from the left of the oriented contour $\Sigma$, while $M_-(x, t; k)$ denotes the boundary value from the right, see Fig. 2.

Fig. 2. The oriented contour $\Sigma$ for the case $b^2 > 2a^2$, $a > 0$, $b > 0$


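The phase function. We denote

$$\xi := \frac{x}{4t}. \tag{3.2}$$

Then the “phase function” is defined by

$$\theta(k, \xi) := 2k^2 + 4\xi k = 2k^2 + \frac{kx}{t}. \tag{3.3}$$

The jump matrix J(x, t; k). The jump matrix $J(x, t; k)$ is given by different formulas on the different parts of the contour:

$$J(x, t; k) = \begin{cases} \begin{pmatrix} 1 & -\bar\rho(k)e^{-2it\theta(k)} \\ -\rho(k)e^{2it\theta(k)} & 1 + |\rho(k)|^2 \end{pmatrix}, & k \in (-\infty, \kappa_+), \\[3ex] \begin{pmatrix} 1 & -\bar r(k)e^{-2it\theta(k)} \\ -r(k)e^{2it\theta(k)} & 1 + |r(k)|^2 \end{pmatrix}, & k \in (\kappa_+, \infty), \end{cases} \tag{3.4a}$$

$$J(x, t; k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ c(k)e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in \Gamma, \\[3ex] \begin{pmatrix} 1 & \bar c(\bar k)e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\Gamma, \end{cases} \tag{3.4b}$$

$$J(x, t; k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ f(k)e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in \gamma, \\[3ex] \begin{pmatrix} 1 & -\bar f(\bar k)e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\gamma. \end{cases} \tag{3.4c}$$

Here $c(k)$, $r(k)$, $\rho(k)$ and $f(k)$ are defined by (2.14), (2.15), (2.16) and (2.21), respectively.

In the presence of discrete spectrum the following residue conditions hold:

$$\operatorname{res}_{k = k_j}[M(x, t; k)]_1 = i m_j^1 e^{2i(k_j x + 2k_j^2 t)}[M(x, t; k_j)]_2, \quad k_j \in D_1, \tag{3.5a}$$
$$\operatorname{res}_{k = z_j}[M(x, t; k)]_1 = i m_j^2 e^{2i(z_j x + 2z_j^2 t)}[M(x, t; z_j)]_2, \quad z_j \in D_2, \tag{3.5b}$$
$$\operatorname{res}_{k = \bar z_j}[M(x, t; k)]_2 = -i\bar m_j^2 e^{-2i(\bar z_j x + 2\bar z_j^2 t)}[M(x, t; \bar z_j)]_1, \quad \bar z_j \in D_3, \tag{3.5c}$$
$$\operatorname{res}_{k = \bar k_j}[M(x, t; k)]_2 = -i\bar m_j^1 e^{-2i(\bar k_j x + 2\bar k_j^2 t)}[M(x, t; \bar k_j)]_1, \quad \bar k_j \in D_4, \tag{3.5d}$$

where $k_j$, $j = 1, 2, \ldots, n$ and $z_j$, $j = 1, 2, \ldots, m$ are simple zeros of the spectral functions $a(k)$ in $D_1$ and of $T_{11}(k)$ in $D_2$, respectively. The corresponding residues are as follows:

$$m_j^1 = \big(i b(k_j)\dot a(k_j)\big)^{-1}, \qquad m_j^2 = -i\operatorname{res}_{k = z_j} c(k), \qquad \bar m_j^1 = \big(i\bar b(\bar k_j)\dot{\bar a}(\bar k_j)\big)^{-1}, \qquad \bar m_j^2 = i\operatorname{res}_{k = \bar z_j}\bar c(\bar k).$$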


Then the solution $q(x,t)$ of the IBV problem (1.1) for the NLS equation is given by

\[ q(x,t) = 2i \lim_{k\to\infty} \bigl(k M(x,t;k)\bigr)_{12}. \tag{3.6} \]

4. Inverse xt-Scattering Problem: Reconstruction of the Solution q(x, t)

In this section we recall results from [7].

The Riemann–Hilbert problem $\mathrm{RH}_{xt}$. Find a $2 \times 2$ matrix-valued function $M(x,t;k)$ such that:

(4.1a) $M(x,t;k)$ is sectionally meromorphic in $k \in \mathbb{C} \setminus \Sigma$.

(4.1b) Its first column $[M(x,t;k)]_1$ has simple poles at $k_j \in D_1$ and $z_j \in D_2$; the second column $[M(x,t;k)]_2$ has simple poles at $\bar k_j \in D_4$ and $\bar z_j \in D_3$. The associated residues satisfy the relations (3.5).

(4.1c) $M(x,t;k)$ satisfies the jump condition $M_-(x,t;k) = M_+(x,t;k) J(x,t;k)$ for $k \in \Sigma$, where the jump matrix $J(x,t;k)$ is defined in terms of the spectral functions by (3.4).

(4.1d) $\det M(x,t;k) \equiv 1$.

(4.1e) Behavior at $k = \infty$: $M(x,t;k) = I + O(k^{-1})$.

Theorem 1 ([7]). Let $q_0(x) \in \mathcal{S}(\mathbb{R}_+)$. Suppose that the functions $g_0(t) = a e^{i\alpha} e^{2i\omega t}$ and $g_1(t) = 2iab\, e^{i\alpha} e^{2i\omega t} + v_1(t)$ are such that the spectral functions $\{a(k), b(k), A(k), B(k)\}$ satisfy the global relation

\[ b(k) A(k) - a(k) B(k) = 0, \qquad k \in D_1. \tag{4.2} \]

Then:
(i) The above Riemann–Hilbert problem $\mathrm{RH}_{xt}$ has a unique solution $M(x,t;k)$.
(ii) If we define $q(x,t)$ in terms of this solution by

\[ q(x,t) = 2i \lim_{k\to\infty} \bigl(k M(x,t;k)\bigr)_{12}, \]

then
(a) $q(x,t)$ solves the NLS equation (1.1a),
(b) $q(x,t)$ satisfies the initial-boundary conditions:

\[ q(x,0) = q_0(x), \qquad q(0,t) = g_0(t), \qquad q_x(0,t) = g_1(t). \tag{4.3} \]


Fig. 3. Regions in the (x, t)-quarter-plane: $\xi = \frac{x}{4t}$, $b = \sqrt{\frac{a^2 - \omega}{2}}$

5. Long-Time Asymptotic Analysis of the Riemann–Hilbert Problem

Assumptions. In what follows, we assume that the Riemann–Hilbert data, i.e., the functions $\rho(k)$, $r(k)$, $c(k)$ and $f(k)$, satisfy the following additional properties:

#1 The function $c(k)$ admits analytic continuation across the cut $\gamma \cup \bar\gamma$ connecting $E$ and $\bar E$ on the second sheet of the Riemann surface of the function $X(k)$.

#2 The function $f(k)$ admits a Taylor series expansion at $k = E = -b + ia$ of the form

\[ f(k) = \sum_{j=0}^{\infty} c_j\, (k - E)^{\frac{2j+1}{2}}. \tag{5.1} \]

We also assume for simplicity that

#3 The discrete spectrum of the problem is empty, i.e.,
• $a(k)$ does not vanish in $D_1$,
• $T_{11}(k) = a(k)\bar A(\bar k) + b(k)\bar B(\bar k)$ does not vanish in $D_2$.
It means that the set of eigenvalues $\{k_j\}_{j=1}^{n} \cup \{z_j\}_{j=1}^{m}$ is empty.

In the setting of the basic Riemann–Hilbert problem $\mathrm{RH}_{xt}$, Assumption #1 makes the choice of the contour $\gamma \cup \bar\gamma$ itself flexible, with condition (2.22) always satisfied at the point of intersection with the real axis.

In this section, we will show that there exist four different asymptotic formulae which describe the long-time behavior of the solution $q(x,t)$ of the IBV problem in four different regions of the first quarter of the xt-plane. For the first region, the so-called Zakharov–Manakov region, we do not actually need the first two extra assumptions just formulated. For the next, the asymptotic solitons region, see Sect. 5.2, we only need Assumption #2. For the two remaining regions, Assumption #1 plays an important technical role in Sects. 5.4 and 5.5.

It is worth noticing that if we assume the more general setting (1.2) and take the Riemann–Hilbert data $\rho(k)$, $r(k)$, $c(k)$ and $f(k)$ as the basic functional parameters of our IBV problem, then, of course, we will not have any problem with securing the validity of the assumptions above. The interesting question is: how big is the piece of the initial-boundary data which is excluded by the restrictions #1 and #2?
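As a rough orientation, the four regions can be told apart from the value of ξ = x/(4t) alone. The following is a minimal sketch, assuming the thresholds stated in Sects. 5.1, 5.2, 5.4 and 5.5 (ξ > b, the logarithmic strip below ξ = b, and ξ₀ = b − a√2); the function name, the returned labels and the rule that the soliton strip takes precedence over the elliptic sector are illustrative choices, not part of the paper.

```python
import math


def classify_region(x: float, t: float, a: float, b: float, N: int = 1) -> str:
    """Label the asymptotic region of the point (x, t), t > 0.

    Hypothetical helper: the labels and the precedence of the soliton strip
    over the elliptic sector are choices made for this illustration.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    xi = x / (4.0 * t)                       # slow variable, eq. (3.2)
    xi0 = b - a * math.sqrt(2.0)             # lower edge of the elliptic sector
    # Left edge of the asymptotic-soliton strip of Sect. 5.2.
    soliton_edge = b - (N + 1) / (8.0 * a) * math.log(t) / t
    if xi > b:
        return "zakharov-manakov"
    if soliton_edge < xi < b:
        return "asymptotic-solitons"
    if xi0 < xi < b:
        return "modulated-elliptic"
    if 0.0 <= xi < xi0:
        return "plane-wave"
    return "boundary"                        # xi == b, xi == xi0, or xi < 0
```

Note that the soliton strip is a thin layer inside the sector ξ₀ < ξ < b whose width shrinks like log t / t, so the order of the tests matters.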


5.1. The Zakharov–Manakov region $\xi \equiv \frac{x}{4t} > b$. To study the asymptotic behavior of the Riemann–Hilbert problem $\mathrm{RH}_{xt}$ in the region $x > 4bt$ we use well-known techniques from [11].

5.1.1. The first transform is as usual:

\[ M(x,t;k) = M^{(1)}(x,t;k)\, \delta^{\sigma_3}(k), \]

where ([17])

\[ \delta(k) = \exp\left\{ \frac{1}{2\pi i} \int_{-\infty}^{\kappa_0(\xi)} \frac{\log(1 + |\rho(s)|^2)\, ds}{s - k} \right\}, \qquad k \in \mathbb{C} \setminus (-\infty, \kappa_0(\xi)], \tag{5.2} \]

and

\[ \kappa_0(\xi) = -\xi = -\frac{x}{4t} \tag{5.3} \]

is the stationary point of the phase function

\[ \theta(k,\xi) = 2k^2 + 4\xi k = 2k^2 + \frac{kx}{t}. \]
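The definitions (3.2), (3.3) and (5.3) are easy to check numerically. The sketch below is only an illustration; the function names are ours and do not come from the paper.

```python
def xi_of(x: float, t: float) -> float:
    """Slow variable xi = x / (4 t) from (3.2); t must be positive."""
    if t <= 0:
        raise ValueError("t must be positive")
    return x / (4.0 * t)


def theta(k, xi: float):
    """Phase function theta(k, xi) = 2 k^2 + 4 xi k from (3.3)."""
    return 2.0 * k * k + 4.0 * xi * k


def stationary_point(xi: float) -> float:
    """Zero of d(theta)/dk = 4 k + 4 xi, i.e. kappa_0(xi) = -xi as in (5.3)."""
    return -xi
```

For instance, `theta(k, xi_of(x, t))` agrees with 2k² + kx/t, and a finite-difference derivative of `theta` vanishes at `stationary_point(xi)`.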

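5.1.2. The next transformation is:

\[ M^{(2)}(x,t;k) = M^{(1)}(x,t;k)\, G(k), \]

where

\[ G(k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ -\hat r(k)\, \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in D_1,\ \arg(k - \kappa_0) \in (0, \pi/4), \\[2ex] \begin{pmatrix} 1 & \bar{\hat r}(\bar k)\, \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in D_4,\ \arg(k - \kappa_0) \in (7\pi/4, 2\pi), \end{cases} \tag{5.4a} \]

\[ G(k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ \hat c(k)\, \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in D_1,\ \arg(k - \kappa_0) \in (\pi/4, \pi/2), \\[2ex] \begin{pmatrix} 1 & -\bar{\hat c}(\bar k)\, \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in D_4,\ \arg(k - \kappa_0) \in (3\pi/2, 7\pi/4), \end{cases} \tag{5.4b} \]

\[ G(k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ -\hat\rho(k)\, \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in D_2,\ \arg(k - \kappa_0) \in (0, \pi/4), \\[2ex] \begin{pmatrix} 1 & \bar{\hat\rho}(\bar k)\, \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in D_3,\ \arg(k - \kappa_0) \in (7\pi/4, 2\pi), \end{cases} \tag{5.4c} \]

\[ G(k) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{cases} k \in D_2,\ \pi/4 < \arg(k - \kappa_0) < 3\pi/4, \\ k \in D_3,\ 7\pi/4 > \arg(k - \kappa_0) > 5\pi/4, \end{cases} \tag{5.4d} \]

\[ G(k) = \begin{cases} \begin{pmatrix} 1 & -\bar{\hat\rho}_1(\bar k)\, \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in D_2,\ \arg(k - \kappa_0) \in (3\pi/4, \pi), \\[2ex] \begin{pmatrix} 1 & 0 \\ \hat\rho_1(k)\, \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in D_3,\ \arg(k - \kappa_0) \in (\pi, 5\pi/4), \end{cases} \tag{5.4e} \]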


Fig. 4. The contour and J (2) (x, t; k) for ξ > b

where $\hat r(k)$, $\hat c(k)$, $\hat\rho(k)$, $\hat\rho_1(k)$ are suitable analytic approximations of the functions $r(k)$, $c(k)$, $\rho(k)$, $\rho_1(k) = \rho(k)/(1 + |\rho(k)|^2)$ (cf. [14]). Then we obtain the RH problem

\[ M^{(2)}_-(x,t;k) = M^{(2)}_+(x,t;k)\, J^{(2)}(x,t;k) \]

on the contour $\gamma \cup \bar\gamma \cup \mathrm{cross}_{\kappa_0}$ depicted in Fig. 4. The jump matrices $J^{(2)}(x,t;k)$ are written for the rays of the cross in Fig. 4. Moreover,

\[ J^{(2)}(x,t;k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ \bigl(\hat\rho_+(k) - \hat\rho_-(k) + f(k)\bigr) \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in \gamma, \\[2ex] \begin{pmatrix} 1 & \bigl(\bar{\hat\rho}_-(\bar k) - \bar{\hat\rho}_+(\bar k) - \bar f(\bar k)\bigr) \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\gamma. \end{cases} \]

By virtue of the inequality $\xi \equiv \frac{x}{4t} > b$ and taking into account (2.22) we see that

\[ J^{(2)}(x,t;k) = I + O(t^{-\infty}) \quad \text{as } t \to +\infty, \]

uniformly in $k \in \gamma \cup \bar\gamma$. Hence, in the region $\xi > b$ the jump across the arc $\gamma \cup \bar\gamma$ does not contribute to the main term of the asymptotics of the solution.

5.1.3. The final transformation is:

\[ M^{(2)}(x,t;k) = X(x,t;k)\, M^{as}(x,t;k), \]

where the matrix $M^{as}(x,t;k)$ solves the standard model problem associated with the stationary phase point $\kappa_0(\xi)$ and is given explicitly in terms of parabolic cylinder functions (see e.g. [11]). The function $X(x,t;k)$ admits the estimate:

\[ X(x,t;k) = I + O\!\left( \frac{\log^2 t}{(1 + |k|)\, t^{1/2}} \right). \]

Thus we come to the following statement (cf. [17]).


Theorem 2 (Zakharov–Manakov region, ξ > b). Suppose that all conditions of Theorem 1 and Condition #3 are satisfied. Then in the region $\xi > b$ the asymptotics of the solution (4.3) has a quasi-linear dispersive character, i.e., it is described by Zakharov–Manakov type formulas:

\[ q(x,t) = t^{-1/2}\, \alpha\!\left(-\frac{x}{4t}\right) \exp\left\{ \frac{i x^2}{4t} + 2i\alpha^2\!\left(-\frac{x}{4t}\right) \log t + i\phi\!\left(-\frac{x}{4t}\right) \right\} + o(t^{-1/2}), \qquad t \to +\infty, \quad \frac{x}{4t} > b > 0, \tag{5.5} \]

with the amplitude $\alpha$ and the phase $\phi$ given by

\[ \alpha^2(k) = \frac{1}{4\pi} \log\bigl(1 + |\rho(k)|^2\bigr), \tag{5.6} \]

\[ \phi(k) = 6\alpha^2(k) \log 2 + \frac{3\pi}{4} + \arg\rho(k) + \arg\Gamma\bigl(-2i\alpha^2(k)\bigr) + 4 \int_{-\infty}^{k} \log|\mu - k|\, d\alpha^2(\mu), \tag{5.7} \]

where $\Gamma(z)$ denotes Euler’s gamma-function.

5.2. Asymptotic solitons region $b - \frac{N+1}{8a} \frac{\log t}{t} < \xi \equiv \frac{x}{4t} < b$. In what follows we always assume for simplicity Condition #3: the discrete spectrum of the problem is empty, i.e., $a(k)$ and $T_{11}(k)$ do not vanish.

5.2.1. Let us perform the same transforms as in Subsect. 5.1.1–5.1.2:

\[ M(x,t;k) \to M^{(1)}(x,t;k) \to M^{(2)}(x,t;k). \]

In this case the matrix Riemann–Hilbert problem is as follows:

\[ M^{(2)}_-(x,t;k) = M^{(2)}_+(x,t;k)\, J^{(2)}(x,t;k) \]

with contour $\gamma \cup \bar\gamma \cup \mathrm{cross}_{\kappa_0}$, see Fig. 5. For the jump matrix $J^{(2)}(x,t;k)$ for $k \in \mathrm{cross}_{\kappa_0}$, see Fig. 5. For $k \in \gamma \cup \bar\gamma$:

\[ J^{(2)}(x,t;k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ f(k)\, \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in \gamma,\ \arg(k - \kappa_0) > \pi/4, \\[2ex] \begin{pmatrix} 1 & 0 \\ \bigl(\hat\rho_+(k) - \hat\rho_-(k) + f(k)\bigr) \delta^{-2}(k) e^{2it\theta(k)} & 1 \end{pmatrix}, & k \in \gamma,\ 0 < \arg(k - \kappa_0) < \pi/4, \\[2ex] \begin{pmatrix} 1 & \bigl(\bar{\hat\rho}_-(\bar k) - \bar{\hat\rho}_+(\bar k) - \bar f(\bar k)\bigr) \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\gamma,\ 7\pi/4 < \arg(k - \kappa_0) < 2\pi, \\[2ex] \begin{pmatrix} 1 & -\bar f(\bar k)\, \delta^{2}(k) e^{-2it\theta(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\gamma,\ \arg(k - \kappa_0) < 7\pi/4. \end{cases} \]


Fig. 5. The contour and $J^{(2)}(x,t;k)$ for $b - \frac{N+1}{8a} \frac{\log t}{t} < \xi < b$

5.2.2. The next transformation is: ˆ M (2) (x, t; k) = (x, t; k)T f (x, t; k), where

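\[ T_f(x,t;k) = \begin{cases} \begin{pmatrix} 1 & 0 \\ K_f(x,t;k) & 1 \end{pmatrix}, & \text{for } |k - E| < \varepsilon, \\[2ex] \begin{pmatrix} 1 & \bar K_f(x,t;\bar k) \\ 0 & 1 \end{pmatrix}, & \text{for } |k - \bar E| < \varepsilon, \\[2ex] I, & \text{otherwise}, \end{cases} \]

with

\[ K_f(x,t;k) = \frac{1}{2\pi i} \int_{\gamma \cap \{|s - E| < \varepsilon\}} \frac{f(s)\, \delta^{-2}(s) e^{2it\theta(s)}}{s - k}\, ds, \]

\[ \bar K_f(x,t;\bar k) = -\frac{1}{2\pi i} \int_{\bar\gamma \cap \{|s - \bar E| < \varepsilon\}} \frac{\bar f(\bar s)\, \delta^{2}(s) e^{-2it\theta(s)}}{s - k}\, ds. \]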

Then the Riemann–Hilbert problem takes the form: ˆ − (x, t; k) = ˆ + (x, t; k)J ˆ (x, t; k), k ∈ on the contour = γε ∪ γ¯ε ∪ Cε ∪ C¯ ε ∪ crossκ0 , where γε = {k ∈ γ | |k − E| ≥ ε}, Cε = {k | |k − E| = ε}. See Fig. 6. The jump matrix is Jˆ (x, t; k) =

J (2) (x, t; k) for k ∈ γε ∪ γ¯ε ∪ crossκ0 , T f (x, t; k) for k ∈ Cε ∪ C¯ ε .


Fig. 6. The contour and $\hat J(x,t;k)$ for $b - \frac{N+1}{8a} \frac{\log t}{t} < \xi < b$

Due to Assumption #2 on the behavior of $f(k)$ near $k = E$, the functions $K_f(x,t;k)$ and $\bar K_f(x,t;\bar k)$ are of the form:

\[ K_f(x,t;k) = F_N(k,t,\xi) + R(k,t,\xi), \qquad \bar K_f(x,t;\bar k) = \bar F_N(\bar k,t,\xi) + \bar R(\bar k,t,\xi), \]

where

\[ F_N(k,t,\xi) = \sum_{j=0}^{N} \frac{d_j(t,\xi)}{t^{\,j+3/2}}\, \frac{e^{2it\theta(E)}}{(k - E)^{j+1}}. \]

In the region

\[ b - \frac{N+1}{8a} \frac{\log t}{t} < \xi = \frac{x}{4t} < b \]

we have, for $t \to +\infty$: $e^{2it\theta(E)} = O(t^{N+1})$, $d_j(t,\xi) = d_j + O(t^{-1})$, where $d_j = \mathrm{const} \neq 0$, and $R(k,t,\xi) = O(t^{-3/2})$. Hence, for $k \in C_\varepsilon \cup \bar C_\varepsilon$ the jump matrix $\hat J(x,t;k)$ can be written in the form:

\[ \hat J(x,t;k) = T_f(x,t;k) = T_f^{N}(k,t,\xi) + T_f^{reg}(k,t,\xi), \]

where

\[ T_f^{N}(k,t,\xi) = \begin{cases} \begin{pmatrix} 1 & 0 \\ F_N(k,t,\xi) & 1 \end{pmatrix}, & \text{for } |k - E| = \varepsilon, \\[2ex] \begin{pmatrix} 1 & -\bar F_N(\bar k,t,\xi) \\ 0 & 1 \end{pmatrix}, & \text{for } |k - \bar E| = \varepsilon, \end{cases} \]

\[ T_f^{reg}(k,t,\xi) = \begin{cases} \begin{pmatrix} 1 & 0 \\ R(k,t,\xi) & 1 \end{pmatrix}, & \text{for } |k - E| = \varepsilon, \\[2ex] \begin{pmatrix} 1 & -\bar R(\bar k,t,\xi) \\ 0 & 1 \end{pmatrix}, & \text{for } |k - \bar E| = \varepsilon. \end{cases} \]
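The bound on the exponential factor in this strip can be checked directly: since Im θ(E) = 4a(ξ − b) for E = −b + ia, one has |exp(2itθ(E))| = exp(8at(b − ξ)), which stays below t^{N+1} exactly when b − ξ < (N+1) log t / (8at). A small numerical sketch, with a function name of our own choosing:

```python
import math


def theta(k, xi):
    """Phase function theta(k, xi) = 2 k^2 + 4 xi k."""
    return 2.0 * k * k + 4.0 * xi * k


def growth_exponent(t: float, xi: float, a: float, b: float) -> float:
    """Return log|exp(2 i t theta(E))| for the branch point E = -b + i a."""
    E = complex(-b, a)
    # |exp(i z)| = exp(-Im z), applied to z = 2 t theta(E).
    return -2.0 * t * theta(E, xi).imag
```

Inside the strip the returned value is smaller than (N + 1) log t, which is the O(t^{N+1}) bound quoted above.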

Let reg (k, t, ξ ) denote the solution of the “regular” RH problem − (k, t, ξ ) = + (k, t, ξ )Jreg , k ∈ . reg

reg

The contour is the same as before. The jump matrix Jreg is obtained from Jˆ by the reg replacement T f (k, t, ξ ) T f (k, t, ξ ). It is clear that

reg

log2 t (k, t, ξ ) = I + O 1−2ε t

as (k, t, ξ ), 0 < ε < 1/2,

where as (k, t, ξ ) is the same parabolic cylinder model matrix function as above in Sect. 5.1.3. 5.2.3. Now we put ˆ (k, t, ξ ) = sol (k, t, ξ )reg (k, t, ξ ). Soliton model RH-problem The matrix-valued function sol (k, t, ξ ) solves the following model RH problem: • sol (k, t, ξ ) is analytic in k ∈ C\{Cε ∪ C¯ ε }. sol ¯ • sol − (k, t, ξ ) = + (k, t, ξ )Jsol (k, t, ξ ) for k ∈ C ε ∪ C ε .

• sol (k, t, ξ ) = I + O(k −1 ), as k → ∞. The contour Cε ∪ C¯ ε is the union of two circles of small radius ε > 0 centered at E and ¯ respectively. The jump matrix has the form E, Jsol (k, t, ξ ) = + (k, t, ξ )T fN (k, t, ξ )(+ (k, t, ξ ))−1 . reg

reg

This problem can be solved purely algebraically. Finally we obtain: 1 sol ˆ , t → +∞, (k, t, ξ ) = (k, t, ξ ) I + O (1 + |k|)t 1/2 that yields

q(x, t) = qsol (x, t) + O t −1/2 , t → +∞


for the solution of the IBV problem (1.1) in the region $4bt - \frac{N+1}{2a} \log t < x < 4bt$. The explicit formula for an asymptotic soliton chain $q_{sol}(x,t)$ can be deduced algebraically from the above Riemann–Hilbert problem. Alternatively, by using the Marchenko approach, this asymptotics was studied in [6], where the following theorem was obtained:

Theorem 3 (Asymptotic solitons, [6]). Suppose that all conditions of Theorem 1 and Conditions #2 and #3 are satisfied. Let $N$ be a positive integer. Then in the region $b - \frac{N+1}{8a} \frac{\log t}{t} < \frac{x}{4t} < b$ the solution (4.3) is an asymptotic soliton chain:

\[ |q(x,t)|^2 = \sum_{j=1}^{\left[\frac{N+1}{2}\right]} \frac{4a^2}{\cosh^2\bigl[2a(x - 4bt - x_j) + \log t^{\,2j - 1/2}\bigr]} + O(t^{-1/2}), \]

for $t \to +\infty$, where

\[ x_j = x_j^{(0)} - \frac{1}{2\pi} \int_{-\infty}^{-b} \frac{\log[1 + |\rho(s)|^2]}{(s + b)^2 + a^2}\, ds \]

and the numbers $x_j^{(0)}$ depend on $\gamma$ and $c(k)$. The first asymptotic soliton ($N = 1$) takes the form:

\[ q_{sol}(x,t) = \frac{2a\, e^{2ibx + 4i(a^2 - b^2)t + i\varphi_1}}{\cosh\bigl[2a\bigl(x - 4bt + \frac{1}{2a} \log t^{3/2} - x_1\bigr)\bigr]}, \]

where

\[ \varphi_1 = \arg z_0 - \frac{1}{\pi} \int_{-\infty}^{\kappa_0} \frac{(s + b) \log[1 + |\rho(s)|^2]}{(s + b)^2 + a^2}\, ds. \]

Here $z_0$ depends on $E = -b + ia$. Asymptotic solitons, as one can see, are generated by the continuous spectrum ($\gamma \cup \bar\gamma$) of the scattering problem, more precisely, by the end points $E$ and $\bar E$ of the contour $\gamma \cup \bar\gamma$. These solitons do not satisfy exactly the nonlinear equation because the argument of cosh contains $\log t$. Asymptotic solitons are observed for large time in a neighborhood of the leading edge of the solution while the original solitons, which are generated by the discrete spectrum of the corresponding scattering problem, can be observed at any time. More information about “asymptotic solitons” can be found in [6].

5.3 The sector $0 \le \xi \equiv \frac{x}{4t} < b$. For $0 \le \xi < b$, $\operatorname{Im}\theta(k)$ is negative along a part or all of the contour $\gamma \cup \bar\gamma$. Therefore the method used in the initial region ($\xi > b$) will not work. For $0 \le \xi < b$, we have to follow a modification of the nonlinear steepest descent method as suggested in [13]. Instead of $\theta(k) = 2k^2 + 4\xi k$ we should find a new phase function $g(k) = g(k,\xi)$, which transforms the original Riemann–Hilbert problem to a model RH problem of finite-gap type (see [12]). Such a g-function does really exist. It leads to a genus zero finite-gap model problem for $0 \le \xi < b - a\sqrt{2}$ and to a genus one finite-gap model problem for $b - a\sqrt{2} < \xi < b$. Both are explicitly solved, using elementary functions in the first region, Sect. 5.4, and elliptic theta functions in the second region, Sect. 5.5, respectively.
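The relation between the first asymptotic soliton of Theorem 3 and the j = 1 term of the chain can be illustrated numerically. In the sketch below the shift `x1` and the phase `phi1` are free placeholder parameters; in the paper they are given by integrals over the spectral data, which are not reproduced here.

```python
import cmath
import math


def q_sol(x, t, a, b, x1=0.0, phi1=0.0):
    """First asymptotic soliton (N = 1); x1 and phi1 are placeholders."""
    phase = 2.0 * b * x + 4.0 * (a * a - b * b) * t + phi1
    arg = 2.0 * a * (x - 4.0 * b * t + math.log(t ** 1.5) / (2.0 * a) - x1)
    return 2.0 * a * cmath.exp(1j * phase) / math.cosh(arg)


def chain_term_sq(x, t, a, b, j, xj=0.0):
    """j-th term of the |q|^2 sum in Theorem 3; xj is a placeholder shift."""
    arg = 2.0 * a * (x - 4.0 * b * t - xj) + math.log(t ** (2 * j - 0.5))
    return 4.0 * a * a / math.cosh(arg) ** 2
```

With the same shift, `abs(q_sol(...))**2` coincides with `chain_term_sq(..., j=1)`, and the soliton amplitude never exceeds 2a.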

5.4 Plane wave region $0 \le \xi \equiv \frac{x}{4t} < b - a\sqrt{2}$.

5.4.1 The g-function. In the region $0 \le \xi < b - a\sqrt{2}$ we take as g-function (cf. (1.8d)):

\[ g(k) = g(k,\xi) = 2(k - b + 2\xi)\, X(k) = \Omega(k) + 4\xi X(k), \tag{5.8} \]

where $X(k) = \sqrt{(k - E)(k - \bar E)} = \sqrt{(k + b)^2 + a^2}$. This function has the same asymptotic behavior for large $k$ as the initial phase function $\theta(k)$, i.e.

\[ g(k) = 2k^2 + 4\xi k + g_\infty(\xi) + O(k^{-1}), \qquad k \to \infty, \]

with

\[ g_\infty(\xi) = a^2 - 2b^2 + 4b\xi = \omega + 4b\xi, \qquad 0 \le \xi < b - a\sqrt{2}. \]

The zeros $\mu_\pm$ of the differential

\[ dg(k) = 4\, \frac{(k - \mu_-)(k - \mu_+)}{X(k)}\, dk = d\bigl(\Omega(k) + 4\xi X(k)\bigr) = 4\, \frac{k^2 + (b + \xi)k + a^2/2 + b\xi}{X(k)}\, dk \]

are as follows for $0 \le \xi \le b - a\sqrt{2}$:

\[ \mu_\pm(\xi) = -\frac{b + \xi}{2} \pm \sqrt{\frac{(b - \xi)^2}{4} - \frac{a^2}{2}}. \tag{5.9} \]

They are real while $0 \le \xi \le \xi_0 = b - a\sqrt{2}$ and complex conjugate if $\xi > \xi_0$. We will see in the next section that for $\xi > \xi_0$ the g-function should be chosen differently. In what follows the signature table of the function $\operatorname{Im} g(k)$ for different values of $\xi$ plays a very important role. The lines of separation between the different domains are the real axis $k_2 = 0$ and the algebraic curve

\[ k_2^2 = \left( k_1^2 + (b + \xi) k_1 + \frac{a^2}{2} + b\xi \right) \frac{k_1 - b + 2\xi}{k_1 + \xi}, \qquad k = k_1 + i k_2. \]

They are indeed given by $\operatorname{Im} g(k) = 0$. The signature table of the function $\operatorname{Im} g(k)$ is depicted in Fig. 7 for $0 \le \xi < \xi_0$ and in Fig. 10 for $\xi = \xi_0$.
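The properties of the g-function listed in this subsection (large-k behavior, the constant g∞(ξ), the rational form of dg and the zeros µ±) can be verified numerically. The sketch below assumes the branch of X(k) that behaves like k at infinity, realized with a cut along the straight segment from Ē to E; the paper's cut γ ∪ γ̄ may be a different curve, so values near the cut need not agree. All function names are ours.

```python
import cmath


def X(k, a, b):
    """sqrt((k + b)^2 + a^2) with X(k) ~ k at infinity (straight cut [E-bar, E])."""
    s = k + b
    return s * cmath.sqrt(1 + (a / s) ** 2)


def g(k, xi, a, b):
    """g-function (5.8): g = 2 (k - b + 2 xi) X(k)."""
    return 2 * (k - b + 2 * xi) * X(k, a, b)


def g_infinity(xi, a, b):
    """Constant term of g at infinity: a^2 - 2 b^2 + 4 b xi."""
    return a * a - 2 * b * b + 4 * b * xi


def dg_dk(k, xi, a, b):
    """Derivative of g in the rational form 4 (k^2 + (b+xi) k + a^2/2 + b xi) / X."""
    return 4 * (k * k + (b + xi) * k + a * a / 2 + b * xi) / X(k, a, b)


def mu_pm(xi, a, b):
    """Zeros (5.9) of dg; complex conjugate once xi exceeds b - a*sqrt(2)."""
    root = cmath.sqrt((b - xi) ** 2 / 4 - a * a / 2)
    mid = -(b + xi) / 2
    return mid - root, mid + root
```

For example, with a = 1, b = 3, ξ = 0.5 both zeros are real and `g(k) - (2k² + 4ξk)` approaches `g_infinity` as k grows.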


Fig. 7. The signature table of $\operatorname{Im} g(k)$ for $0 < \xi < b - a\sqrt{2}$

5.4.2 We shall now take advantage of Condition #1 imposed on the Riemann–Hilbert data $c(k)$. We deform the contour $\gamma \cup \bar\gamma$ to the contour $\gamma_g \cup \bar\gamma_g$ where $\operatorname{Im} g(k) = 0$. This contour depends on $\xi$ because it connects the points $E$, $\mu_-(\xi)$, and $\bar E$. All functions that had jumps across $\gamma \cup \bar\gamma$ now have jumps across $\gamma_g \cup \bar\gamma_g$ with the same jump relations as they had before the deformation. The basic Riemann–Hilbert problem $\mathrm{RH}_{xt}$ has to be considered now on a new contour $\Sigma = \mathbb{R} \cup \gamma_g \cup \bar\gamma_g \cup \Gamma \cup \bar\Gamma$ and with the new phase function. More precisely, we put

\[ M(x,t;k) = e^{itg_\infty(\xi)\sigma_3}\, M^{(1)}(x,t;k)\, e^{i[kx + 2k^2 t - t g(k)]\sigma_3}, \]

where the phase function $g(k) = g(k,\xi)$ is defined in (5.8). Then the matrix $M^{(1)}(x,t;k)$ satisfies the following RH problem:

\[ M^{(1)}_-(x,t;k) = M^{(1)}_+(x,t;k)\, J^{(1)}(x,t;k) \]

with the jump matrix

\[ J^{(1)}(x,t;k) = \begin{cases} \begin{pmatrix} 1 & -\bar\rho(k) e^{-2itg(k)} \\ -\rho(k) e^{2itg(k)} & 1 + |\rho(k)|^2 \end{pmatrix}, & k \in (-\infty, \kappa_+), \\[2ex] \begin{pmatrix} 1 & -\bar r(k) e^{-2itg(k)} \\ -r(k) e^{2itg(k)} & 1 + |r(k)|^2 \end{pmatrix}, & k \in (\kappa_+, \infty), \\[2ex] \begin{pmatrix} e^{-2itg_+(k)} & 0 \\ f(k) & e^{2itg_+(k)} \end{pmatrix}, & k \in \gamma_g, \\[2ex] \begin{pmatrix} e^{-2itg_+(k)} & -\bar f(\bar k) \\ 0 & e^{2itg_+(k)} \end{pmatrix}, & k \in \bar\gamma_g, \\[2ex] \begin{pmatrix} 1 & 0 \\ c(k) e^{2itg(k)} & 1 \end{pmatrix}, & k \in \Gamma, \\[2ex] \begin{pmatrix} 1 & \bar c(\bar k) e^{-2itg(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\Gamma. \end{cases} \]


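5.4.3 Let us perform the same transformation as in Sect. 5.1.1:

\[ M^{(1)}(x,t;k) = M^{(2)}(x,t;k)\, \delta^{\sigma_3}(k). \]

The function $\delta(k)$ is defined as in (5.2), but now $\kappa_0 = \mu_+(\xi)$, where $\mu_+(\xi)$ is the stationary point of the new phase function $g(k)$. On the real axis the corresponding jump matrix $J^{(2)}(x,t;k)$ is factorized as follows:

\[ J^{(2)}(x,t;k) = \begin{cases} \begin{pmatrix} 1 & -\bar\rho_1(k)\, \delta_+^{2}(k) e^{-2itg(k)} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\rho_1(k)\, \delta_-^{-2}(k) e^{2itg(k)} & 1 \end{pmatrix}, & k < \mu_+(\xi), \\[2ex] \begin{pmatrix} 1 & 0 \\ -\rho(k)\, \delta^{-2}(k) e^{2itg(k)} & 1 \end{pmatrix} \begin{pmatrix} 1 & -\bar\rho(k)\, \delta^{2}(k) e^{-2itg(k)} \\ 0 & 1 \end{pmatrix}, & \mu_+(\xi) < k < \kappa_+, \\[2ex] \begin{pmatrix} 1 & 0 \\ -r(k)\, \delta^{-2}(k) e^{2itg(k)} & 1 \end{pmatrix} \begin{pmatrix} 1 & -\bar r(k)\, \delta^{2}(k) e^{-2itg(k)} \\ 0 & 1 \end{pmatrix}, & k > \kappa_+. \end{cases} \]

For the remaining arcs of $\Sigma$ the jump matrix is given by:

\[ J^{(2)}(x,t;k) = \begin{cases} \begin{pmatrix} e^{-2itg_+(k)} & 0 \\ f(k)\, \delta^{-2}(k) & e^{2itg_+(k)} \end{pmatrix}, & k \in \gamma_g, \\[2ex] \begin{pmatrix} e^{-2itg_+(k)} & -\bar f(\bar k)\, \delta^{2}(k) \\ 0 & e^{2itg_+(k)} \end{pmatrix}, & k \in \bar\gamma_g, \\[2ex] \begin{pmatrix} 1 & 0 \\ c(k)\, \delta^{-2}(k) e^{2itg(k)} & 1 \end{pmatrix}, & k \in \Gamma, \\[2ex] \begin{pmatrix} 1 & \bar c(\bar k)\, \delta^{2}(k) e^{-2itg(k)} \\ 0 & 1 \end{pmatrix}, & k \in \bar\Gamma. \end{cases} \]

5.4.4 Now we use the same transformation as in Sect. 5.1.2:

\[ M^{(3)}(x,t;k) = M^{(2)}(x,t;k)\, G(k), \]

where $G(k)$ is given by (5.4) with $\theta(k)$ replaced by $g(k)$ and $\kappa_0$ by $\mu_+(\xi)$. After this G-transformation, the Riemann–Hilbert problem becomes:

\[ M^{(3)}_-(x,t;k) = M^{(3)}_+(x,t;k)\, J^{(3)}(x,t;k). \]

The contour $\Sigma^{(3)} = \gamma_g \cup \bar\gamma_g \cup \mathrm{cross}_{\mu_+}$ of this RH problem is depicted in Fig. 8. Let $\Sigma^{(3)}_{cross} = \{k \mid \arg(k - \mu_+) = \frac{\pi}{4}, \frac{3\pi}{4}, \frac{5\pi}{4}, \frac{7\pi}{4}\}$. Let $D_{\mu_+}$ be a small disk centered at $\mu_+$. For $k \in \Sigma^{(3)}_{cross} \setminus (D_{\mu_+} \cap \Sigma^{(3)}_{cross})$, the jump matrix admits the following estimate:

\[ J^{(3)}(x,t;k) = I + O(e^{-\varepsilon t}), \qquad \varepsilon > 0, \quad t \to +\infty. \]

Therefore the main attention has to be paid to $\gamma_g \cup \bar\gamma_g$, where the jump matrix

\[ J^{(3)}(x,t;k) := G_+^{-1}(k)\, J^{(2)}(x,t;k)\, G_-(k) \]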


Fig. 8. The contour $\Sigma^{(3)}$ of the RH problem for $0 < \xi < b - a\sqrt{2}$

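factorizes as follows:

\[ J^{(3)} = \begin{cases} \begin{pmatrix} 1 & \bar{\hat\rho}_{1+}(\bar k)\, \delta^{2}(k) e^{-2itg_+(k)} \\ 0 & 1 \end{pmatrix} J^{(2)}(x,t;k) \begin{pmatrix} 1 & -\bar{\hat\rho}_{1-}(\bar k)\, \delta^{2}(k) e^{2itg_+(k)} \\ 0 & 1 \end{pmatrix}, & k \in \gamma_g, \\[2ex] \begin{pmatrix} 1 & 0 \\ -\hat\rho_{1+}(k)\, \delta^{-2}(k) e^{2itg_+(k)} & 1 \end{pmatrix} J^{(2)}(x,t;k) \begin{pmatrix} 1 & 0 \\ \hat\rho_{1-}(k)\, \delta^{-2}(k) e^{-2itg_+(k)} & 1 \end{pmatrix}, & k \in \bar\gamma_g. \end{cases} \]

Let $F(k)$, $k \notin \gamma_g \cup \bar\gamma_g$, be a function such that

\[ F_-(k) F_+(k) = -i e^{i\alpha} f(k)\, \delta^{-2}(k) \qquad \text{for } k \in \gamma_g. \]

Here $\alpha$ is the same as in (1.1). Then for $k \in \gamma_g$ we can factorize $J^{(2)}(x,t;k)$ as follows:

\[ \begin{aligned} J^{(2)}(x,t;k) &:= \begin{pmatrix} e^{-2itg_+(k)} & 0 \\ f(k)\, \delta^{-2}(k) & e^{2itg_+(k)} \end{pmatrix} \\ &= \begin{pmatrix} F_+^{-1}(k) & 0 \\ 0 & F_+(k) \end{pmatrix} \begin{pmatrix} F_+(k) F_-^{-1}(k) e^{-2itg_+(k)} & 0 \\ i e^{-i\alpha} & F_-(k) F_+^{-1}(k) e^{2itg_+(k)} \end{pmatrix} \begin{pmatrix} F_-(k) & 0 \\ 0 & F_-^{-1}(k) \end{pmatrix} \\ &= F_+^{-\sigma_3}(k) \begin{pmatrix} 1 & -i e^{i\alpha} \psi(k) e^{-2itg_+(k)} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & i e^{i\alpha} \\ i e^{-i\alpha} & 0 \end{pmatrix} \begin{pmatrix} 1 & -i e^{i\alpha} \psi^{-1}(k) e^{2itg_+(k)} \\ 0 & 1 \end{pmatrix} F_-^{\sigma_3}(k), \end{aligned} \]

where $\psi(k) := F_+(k) F_-^{-1}(k)$. Similarly, if

\[ F_-(k) F_+(k) = -i e^{i\alpha} \bar f^{-1}(\bar k)\, \delta^{-2}(k) \qquad \text{for } k \in \bar\gamma_g, \]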


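then for $k \in \bar\gamma_g$ we can factorize $J^{(2)}(x,t;k)$ as follows:

\[ \begin{aligned} J^{(2)}(x,t;k) &:= \begin{pmatrix} e^{-2itg_+(k)} & -\bar f(\bar k)\, \delta^{2}(k) \\ 0 & e^{2itg_+(k)} \end{pmatrix} \\ &= \begin{pmatrix} F_+^{-1}(k) & 0 \\ 0 & F_+(k) \end{pmatrix} \begin{pmatrix} F_+(k) F_-^{-1}(k) e^{-2itg_+(k)} & i e^{i\alpha} \\ 0 & F_-(k) F_+^{-1}(k) e^{2itg_+(k)} \end{pmatrix} \begin{pmatrix} F_-(k) & 0 \\ 0 & F_-^{-1}(k) \end{pmatrix} \\ &= F_+^{-\sigma_3}(k) \begin{pmatrix} 1 & 0 \\ -i e^{-i\alpha} \psi^{-1}(k) e^{2itg_+(k)} & 1 \end{pmatrix} \begin{pmatrix} 0 & i e^{i\alpha} \\ i e^{-i\alpha} & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -i e^{-i\alpha} \psi(k) e^{-2itg_+(k)} & 1 \end{pmatrix} F_-^{\sigma_3}(k). \end{aligned} \]

This leads us to introduce the following scalar Riemann–Hilbert problem:

Scalar RH-problem. Find a scalar function $F(k)$ such that
• $F$ is analytic outside the contour $\gamma_g \cup \bar\gamma_g$,
• $F$ does not vanish,
• $F$ is bounded at infinity,
• $F$ satisfies the jump relation

\[ F_-(k) F_+(k) = h(k)\, \delta^{-2}(k), \qquad k \in \gamma_g \cup \bar\gamma_g, \]

where

\[ h(k) = \begin{cases} -i e^{i\alpha} f(k), & k \in \gamma_g, \\ -i e^{i\alpha} \bar f^{-1}(\bar k), & k \in \bar\gamma_g. \end{cases} \]

As in Sect. 2 we find the solution of this problem in the form:

\[ F(k) = \exp\left\{ \frac{X(k)}{2\pi i} \int_{\gamma_g \cup \bar\gamma_g} \frac{\log[h(s)\, \delta^{-2}(s,\xi)]}{s - k}\, \frac{ds}{X_+(s)} \right\}, \]

\[ F(\infty) = e^{i\phi(\xi)}, \qquad \text{where } \phi(\xi) = \frac{1}{2\pi} \int_{\gamma_g \cup \bar\gamma_g} \log[h(s)\, \delta^{-2}(s,\xi)]\, \frac{ds}{X_+(s)}. \]

Putting this solution $F(k)$ in the above factorization of $J^{(2)}(x,t;k)$ for $k \in \gamma_g$, and using

\[ \begin{pmatrix} 1 & \bar{\hat\rho}_{1+}(\bar k)\, \delta^{2}(k) e^{-2itg_+(k)} \\ 0 & 1 \end{pmatrix} F_+^{-\sigma_3}(k) = F_+^{-\sigma_3}(k) \begin{pmatrix} 1 & \bar{\hat\rho}_{1+}(\bar k)\, \delta^{2}(k) F_+^{2}(k) e^{-2itg_+(k)} \\ 0 & 1 \end{pmatrix}, \]

\[ F_-^{\sigma_3}(k) \begin{pmatrix} 1 & -\bar{\hat\rho}_{1-}(\bar k)\, \delta^{2}(k) e^{2itg_+(k)} \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -\bar{\hat\rho}_{1-}(\bar k)\, \delta^{2}(k) F_-^{2}(k) e^{2itg_+(k)} \\ 0 & 1 \end{pmatrix} F_-^{\sigma_3}(k), \]

we get for $k \in \gamma_g$:

\[ J^{(3)}(x,t;k) = F_+^{-\sigma_3}(k)\, N_{up}(k)\, J^{mod}\, \hat N_{up}(k)\, F_-^{\sigma_3}(k), \]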


Fig. 9. The lenses around $\gamma_g \cup \bar\gamma_g$ for $0 < \xi < b - a\sqrt{2}$

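where

\[ N_{up}(k) = \begin{pmatrix} 1 & \delta^{2}(k) F_+^{2}(k) \bigl[\bar{\hat\rho}_{1+}(\bar k) + f^{-1}(k)\bigr] e^{-2itg_+(k)} \\ 0 & 1 \end{pmatrix}, \qquad J^{mod} = \begin{pmatrix} 0 & i e^{i\alpha} \\ i e^{-i\alpha} & 0 \end{pmatrix}, \]

\[ \hat N_{up}(k) = \begin{pmatrix} 1 & -\delta^{2}(k) F_-^{2}(k) \bigl[\bar{\hat\rho}_{1-}(\bar k) - f^{-1}(k)\bigr] e^{-2itg_-(k)} \\ 0 & 1 \end{pmatrix}. \]

Similarly, for $k \in \bar\gamma_g$, we get:

\[ J^{(3)}(x,t;k) = F_+^{-\sigma_3}(k)\, N_{low}(k)\, J^{mod}\, \hat N_{low}(k)\, F_-^{\sigma_3}(k), \]

where

\[ N_{low}(k) = \begin{pmatrix} 1 & 0 \\ -\delta^{-2}(k) F_+^{-2}(k) \bigl[\hat\rho_{1+}(k) + \bar f^{-1}(\bar k)\bigr] e^{2itg_+(k)} & 1 \end{pmatrix}, \qquad J^{mod} = \begin{pmatrix} 0 & i e^{i\alpha} \\ i e^{-i\alpha} & 0 \end{pmatrix}, \]

\[ \hat N_{low}(k) = \begin{pmatrix} 1 & 0 \\ \delta^{-2}(k) F_-^{-2}(k) \bigl[\hat\rho_{1-}(k) - \bar f^{-1}(\bar k)\bigr] e^{2itg_-(k)} & 1 \end{pmatrix}. \]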
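The algebra behind the factorization of $J^{(2)}$ on $\gamma_g$ around the constant matrix $J^{mod}$ is a plain 2 × 2 identity and can be checked with arbitrary complex numbers. In the sketch below `psi` stands for ψ(k) = F₊(k)/F₋(k) and `e_minus` for exp(−2itg₊(k)); both are treated as free inputs, and the helper names are ours.

```python
import cmath


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def upper(u):
    """Upper triangular unipotent matrix [[1, u], [0, 1]]."""
    return [[1, u], [0, 1]]


def factor_gamma_g(psi, e_minus, alpha):
    """Return U1 * J_mod * U2, the middle part of the factorization on gamma_g."""
    e_plus = 1 / e_minus
    phase = cmath.exp(1j * alpha)
    U1 = upper(-1j * phase * psi * e_minus)
    J_mod = [[0, 1j * phase], [1j / phase, 0]]
    U2 = upper(-1j * phase / psi * e_plus)
    return matmul(matmul(U1, J_mod), U2)
```

The product equals the lower triangular matrix with diagonal entries ψ·e₋ and ψ⁻¹·e₊ and lower-left entry i·e^{−iα}, which is the middle factor obtained after pulling out $F_+^{-\sigma_3}$ and $F_-^{\sigma_3}$.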

5.4.5 The next step, the “opening lenses” step, is as follows (cf. [13]). Let

\[ M^{(4)}(x,t;k) := \begin{cases} F^{\sigma_3}(\infty)\, M^{(3)}(x,t;k)\, F^{-\sigma_3}(k), & k \text{ outside the lenses}, \\ F^{\sigma_3}(\infty)\, M^{(3)}(x,t;k)\, F^{-\sigma_3}(k)\, N_{up}(k), & k \text{ inside the upper right lens}, \\ F^{\sigma_3}(\infty)\, M^{(3)}(x,t;k)\, F^{-\sigma_3}(k)\, \hat N_{up}^{-1}(k), & k \text{ inside the upper left lens}, \\ F^{\sigma_3}(\infty)\, M^{(3)}(x,t;k)\, F^{-\sigma_3}(k)\, N_{low}(k), & k \text{ inside the lower right lens}, \\ F^{\sigma_3}(\infty)\, M^{(3)}(x,t;k)\, F^{-\sigma_3}(k)\, \hat N_{low}^{-1}(k), & k \text{ inside the lower left lens}. \end{cases} \]


Here we again use Property #1 to perform the analytic continuation of the matrices $N_{up}(k)$, $N_{low}(k)$, $\hat N_{up}(k)$ and $\hat N_{low}(k)$ to the indicated domains. Then we have

\[ M^{(4)}_-(x,t;k) = M^{(4)}_+(x,t;k)\, J^{(4)}(x,t;k), \]

where

\[ J^{(4)}(x,t;k) = \begin{cases} N_{up}(x,t;k) & \text{on the boundary of the right upper lens}, \\ \hat N_{up}(x,t;k) & \text{on the boundary of the left upper lens}, \\ J^{mod} & k \in \gamma_g \cup \bar\gamma_g, \\ N_{low}(x,t;k) & \text{on the boundary of the right lower lens}, \\ \hat N_{low}(x,t;k) & \text{on the boundary of the left lower lens}. \end{cases} \]

Due to the signature table of the phase function $g(k)$ the matrices $N_{up}(x,t;k)$, $\hat N_{up}(x,t;k)$, $N_{low}(x,t;k)$, $\hat N_{low}(x,t;k)$ are exponentially close to the unit matrix outside of small neighborhoods of the end points $E$ and $\bar E$ and of the stationary phase point $\mu_-$. The analysis of parametrix solutions near the points $E$, $\bar E$ and the point $\mu_-$ is very similar to the analysis done in [12] and [11], respectively. In the first case, the relevant model Riemann–Hilbert problem is solvable in terms of Bessel functions while in the second case this is again a parabolic cylinder Riemann–Hilbert problem. Skipping technical details, we arrive at the following asymptotic representation of the function $M^{(4)}(x,t;k)$:

\[ M^{(4)}(x,t;k) = \bigl(I + O(t^{-1/2})\bigr)\, M^{mod}(x,t;k), \tag{5.10} \]

where $M^{mod}(x,t;k)$ solves the 0-gap model problem $\mathrm{RH}^{mod}$ (cf. [12]):

Zero-gap model RH-problem.

\[ M^{mod}_-(x,t;k) = M^{mod}_+(x,t;k)\, J^{mod}, \qquad k \in \gamma_g \cup \bar\gamma_g, \]

with constant jump matrix:

\[ J^{mod} = \begin{pmatrix} 0 & i e^{i\alpha} \\ i e^{-i\alpha} & 0 \end{pmatrix}. \]

Let us recall the multi-valued function

\[ \nu(k) = \left( \frac{k + b - ia}{k + b + ia} \right)^{1/4} = \left( \frac{k - E}{k - \bar E} \right)^{1/4} \]

introduced in (1.8f), Sect. 1, and such that

\[ \nu(k) = 1 - \frac{ia}{2k} + O\!\left( \frac{1}{k^2} \right) \quad \text{as } k \to \infty. \]

Since $\nu_-(k) = i\nu_+(k)$ on the cut $\gamma_g \cup \bar\gamma_g$, the solution of $\mathrm{RH}^{mod}$ is explicitly given by

\[ M^{mod}(x,t;k) = \frac{1}{2} \begin{pmatrix} \nu(k) + \dfrac{1}{\nu(k)} & e^{i\alpha} \left( \nu(k) - \dfrac{1}{\nu(k)} \right) \\[2ex] e^{-i\alpha} \left( \nu(k) - \dfrac{1}{\nu(k)} \right) & \nu(k) + \dfrac{1}{\nu(k)} \end{pmatrix}. \]
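Two properties of this explicit model solution are easy to confirm numerically: its determinant is identically 1, and 2i·lim k·(M^{mod})₁₂ equals a·e^{iα}. The sketch below uses the principal branch of the fourth root, which matches ν(k) → 1 at infinity but places the cut along the straight segment between Ē and E rather than along γ_g ∪ γ̄_g; function names are ours.

```python
import cmath


def nu(k, a, b):
    """nu(k) = ((k + b - i a) / (k + b + i a))^(1/4), principal branch."""
    return ((k + b - 1j * a) / (k + b + 1j * a)) ** 0.25


def M_mod(k, a, b, alpha):
    """Explicit solution of the zero-gap model problem as a nested list."""
    v = nu(k, a, b)
    s = 0.5 * (v + 1 / v)          # diagonal entry
    d = 0.5 * (v - 1 / v)          # off-diagonal entry before the phase factor
    return [[s, cmath.exp(1j * alpha) * d],
            [cmath.exp(-1j * alpha) * d, s]]


def q_from_model(k, a, b, alpha):
    """Approximation of 2 i lim k (M_mod)_12, evaluated at a large finite k."""
    return 2j * k * M_mod(k, a, b, alpha)[0][1]
```

Evaluating `q_from_model` at a large real k reproduces the amplitude a·e^{iα} that enters the chain of equalities below.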


Let $M^{\bullet}(x,t;k)$ denote the solution of the Riemann–Hilbert problem $\mathrm{RH}^{\bullet}$ and

\[ m^{\bullet}_{12}(x,t) := \lim_{k\to\infty} \bigl(k M^{\bullet}(x,t;k)\bigr)_{12}. \]

The previous considerations yield the following chain of equalities:

\[ \begin{aligned} q(x,t) &= 2i\, m_{12}(x,t) \\ &= 2i\, e^{2itg_\infty(\xi)}\, m^{(1)}_{12}(x,t) \\ &= 2i\, e^{2itg_\infty(\xi)}\, m^{(2)}_{12}(x,t) \\ &= 2i\, e^{2itg_\infty(\xi)}\, m^{(3)}_{12}(x,t) + O(t^{-1/2}) \\ &= 2i\, e^{2itg_\infty(\xi)}\, m^{(4)}_{12}(x,t)\, F^{-2}(\infty) + O(t^{-1/2}) \\ &= 2i\, e^{2itg_\infty(\xi)}\, m^{mod}_{12}(x,t)\, F^{-2}(\infty) + O(t^{-1/2}). \end{aligned} \]

Taking into account that $g_\infty(\xi) = \omega + 4b\xi$, $2i\, m^{mod}_{12}(x,t) \equiv a e^{i\alpha}$, $F^{-2}(\infty) = e^{-2i\phi(\xi)}$, we get the following theorem:

Theorem 4 (Plane wave region, $0 \le \xi < b - a\sqrt{2}$). Suppose that all conditions of Theorem 1 and Conditions #1, #2, #3 are satisfied. Then in the region $0 \le \xi < b - a\sqrt{2}$ the solution (4.3) of the IBV problem takes the form of a plane wave:

\[ q(x,t) = a e^{i\alpha}\, e^{2i[bx + \omega t - \phi(\xi)]} + O(t^{-1/2}), \qquad t \to +\infty, \tag{5.11} \]

\[ \phi(\xi) = \frac{1}{2\pi} \int_{\gamma_g \cup \bar\gamma_g} \log[h(s)\, \delta^{-2}(s,\xi)]\, \frac{ds}{X_+(s)}. \]
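The last step of the chain relies on the identity exp(2it·g∞(ξ)) = exp(2i(bx + ωt)), which follows from g∞(ξ) = ω + 4bξ with ξ = x/(4t) and ω = a² − 2b². The sketch below checks it numerically; the phase shift `phi` is passed in as a plain number, whereas in Theorem 4 it is the integral φ(ξ), which is not evaluated here.

```python
import cmath


def plane_wave(x, t, a, b, alpha, phi=0.0):
    """Leading term of (5.11); phi is a placeholder for phi(xi)."""
    omega = a * a - 2.0 * b * b
    return a * cmath.exp(1j * alpha) * cmath.exp(2j * (b * x + omega * t - phi))


def plane_wave_from_chain(x, t, a, b, alpha, phi=0.0):
    """Same quantity assembled as in the chain: a e^{i alpha} e^{2 i t g_inf} e^{-2 i phi}."""
    xi = x / (4.0 * t)
    g_inf = (a * a - 2.0 * b * b) + 4.0 * b * xi
    return a * cmath.exp(1j * alpha) * cmath.exp(2j * t * g_inf) * cmath.exp(-2j * phi)
```

Both functions return the same complex number up to rounding, and its modulus equals a.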

Remark 2 (Matching). As indicated, the above arguments are valid in the case $x = 0$, i.e. $\xi = 0$, as well. This, in particular, implies the relation

\[ \phi(0) = 0 \mod 2\pi, \]

which is fulfilled due to the constraint (2.23) (in the absence of discrete spectrum). Hence the plane wave (5.11) matches with the boundary condition.

Remark 3. If $b \le a\sqrt{2}$ then the plane wave region is empty.

Remark 4. Equation (5.11) is consistent with our assumption (1.5) on the structure of the Dirichlet to Neumann map. It also should be noticed that in the case $x = 0$, i.e., $\xi = 0$, we do not need restrictions #1 and #2 on the RH data to proceed with our asymptotic approach. Indeed, in this case, $\gamma = \gamma_g$ and therefore no deformation of the original contour $\gamma \cup \bar\gamma$ is needed.

5.5 Modulated elliptic region $b - a\sqrt{2} < \xi \equiv \frac{x}{4t} < b$. When $\xi = \xi_0 = b - a\sqrt{2}$ the zeros $\mu_-(\xi_0)$ and $\mu_+(\xi_0)$, see (5.9), of the differential $dg(k)$ coincide (see Fig. 10):

\[ \mu_-(\xi_0) = \mu_+(\xi_0) = \mu = -\frac{b + \xi_0}{2}. \]

For $\xi > \xi_0$ the zeros $\mu_\pm(\xi)$ of $dg(k)$ become complex conjugate. As a result the previous considerations fail. We need to introduce a new g-function for this region.


Fig. 10. The signature table of $\operatorname{Im} g(k)$ for $\xi = \xi_0 = b - a\sqrt{2}$

Let $\xi > \xi_0$. We start by replacing the original RH problem by the analogous RH problem whose phase function is the previous g-function $g_0 = g(\xi_0)$ associated to $\xi = \xi_0$. The contour is $\Sigma = \mathbb{R} \cup \Gamma \cup \bar\Gamma \cup \gamma_0 \cup \bar\gamma_0$, where $\gamma_0 := \gamma_{g(\xi_0)}$ and $\bar\gamma_0 := \bar\gamma_{g(\xi_0)}$. As before this new RH problem is equivalent to the original one.

5.5.1 The new g-function. A suitable g-function for $\xi > \xi_0$ can be obtained as follows. First, we need to introduce a new real stationary point $\mu = \mu(\xi)$ which must be a zero of the new differential $dg$. On the other hand we have to preserve the asymptotic behavior of the g-function for large $k$. To do so we must change the denominator of the differential $dg$. Thus the new differential takes the form:

\[ dg(k) = 4\, \frac{(k - \mu(\xi))(k - \mu_-(\xi))(k - \mu_+(\xi))}{\sqrt{(k - E)(k - \bar E)(k - d(\xi))(k - \bar d(\xi))}}\, dk, \]

where $\mu(\xi)$, $\mu_\pm(\xi)$, and $d(\xi)$ are to be determined. It is easy to see that if we choose

\[ \mu(\xi_0) = \mu_\pm(\xi_0) = d(\xi_0) = \bar d(\xi_0) = -\frac{b + \xi_0}{2}, \]

then for $\xi = \xi_0$ the new differential coincides with the previous one:

\[ dg(k) = 4\, \frac{(k - \mu_-)(k - \mu_+)}{X(k)}\, dk. \]

One can also see that $dg$ is an Abelian (elliptic) differential of the second kind with poles at $\infty_\pm$ on the Riemann surface $\mathcal{X}$ of

\[ w(k) = \sqrt{(k - E)(k - \bar E)(k - d(\xi))(k - \bar d(\xi))}, \qquad E = -b + ia, \quad d(\xi) = d_1(\xi) + i d_2(\xi). \]


The branch of the square root is fixed by this asymptotics on the upper sheet $\mathcal{X}_+$: $w(k) = k^2 + O(k)$, $k \to \infty_+$. We choose on this Riemann surface a basis $\{a, b\}$ of cycles as follows. The b-cycle is a closed clock-wise oriented simple loop around the arc $\gamma_{E,d}$ joining $E$ and $d$. The a-cycle starts on the upper sheet $\mathcal{X}_+$ from the left side of the cut $\gamma_{E,d}$, goes to the left side of the cut $\gamma_{\bar d,\bar E}$, proceeds to the lower sheet $\mathcal{X}_-$, and then returns to the starting point. We write the Abelian differential $dg(k)$ in the form:

\[ dg(k) = 4\, \frac{k^3 + c_2 k^2 + c_1 k + c_0}{w(k)}\, dk \]

and normalize it so that its a-period vanishes. Since

\[ \oint_a dg = 2 \int_{\bar d}^{d} dg, \]

where the path of integration is the line segment $[\bar d, d]$, this normalization condition means

\[ c_0 = - \frac{\displaystyle\int_{\bar d}^{d} (k^3 + c_2 k^2 + c_1 k)\, \frac{dk}{w(k)}}{\displaystyle\int_{\bar d}^{d} \frac{dk}{w(k)}}. \]

Note that $c_0$ is real. It depends on $c_1$, $c_2$, $d_1(\xi)$ and $d_2(\xi)$. The requirement

\[ g(k) = \theta(k) + O(1) \quad \text{as } k \to \infty_+, \]

i.e. $g(k) = 2k^2 + 4\xi k + O(1)$, implies

\[ c_2 = b + \xi - d_1, \qquad c_1 = b\xi - (b + \xi) d_1 + \tfrac{1}{2}(d_2^2 + a^2). \]

Thus, $c_0 = c_0(\xi, d_1, d_2)$. Then the g-function, taken as the sum of two Abelian integrals:

\[ g(k) = 2 \left( \int_{E}^{k} + \int_{\bar E}^{k} \right) \frac{z^3 + c_2 z^2 + c_1 z + c_0}{w(z)}\, dz, \tag{5.12} \]

has real b-period$^1$

\[ B_g = 2 \left( \int_{E}^{d} + \int_{\bar E}^{\bar d} \right) \frac{z^3 + c_2 z^2 + c_1 z + c_0}{w(z)}\, dz \tag{5.13} \]

and the following asymptotics at $k = \infty_+$:

\[ g(k) = 2k^2 + 4\xi k + g_\infty(\xi) + O(k^{-1}), \]

where

\[ g_\infty(\xi) = 2 \left( \int_{E}^{\infty_+} + \int_{\bar E}^{\infty_+} \right) \left[ \frac{z^3 + c_2 z^2 + c_1 z + c_0}{w(z)} - (z + \xi) \right] dz + 2a^2 - 2b^2 + 4b\xi \tag{5.14} \]

is a real-valued function of $\xi$.

$^1$ Taking into account the relation $\int_{\bar d}^{d} dg = 0$ and the absence of residue of $dg$ at $k = \infty_\pm$ we see that in fact $\int_{\bar E}^{E} dg = 0$ as well, and that the function $g(k)$ can be written as a single Abelian integral:

\[ g(k) = 4 \int_{E}^{k} \frac{z^3 + c_2 z^2 + c_1 z + c_0}{w(z)}\, dz, \]

and, simultaneously, we have indeed $B_g = \oint_b dg$.

Convention (Integration paths). The contours of integration in both integrals in (5.12), (5.13) and (5.14) are chosen according to the following convention, which allows us to work with a single-valued branch of the multi-valued function $g(k)$. In what follows we use only the upper sheet of the Riemann surface. Moreover, let us complete the contour $\gamma_{E,d} \cup \gamma_{\bar E,\bar d} \cup [d, \bar d]$ by attaching to it the infinite vertical pieces $(-b + i\infty, E]$ and $[\bar E, -b - i\infty)$. Then the integration paths are chosen so that they do not intersect the augmented contour. Observe also that across the added pieces the function $g(k)$ does not jump:

\[ g_+(k) - g_-(k) = 0 \qquad \text{for } k \in (-b + i\infty, E) \cup (\bar E, -b - i\infty). \]

In order to determine $\mu$, $\mu_\pm$, $d$ and $\bar d$ as functions of $\xi$ let us rewrite the differential $dg$ in the form:

\[ dg(k) = 4\, \frac{(k - \mu)(k - \mu_-)(k - \mu_+)}{w(k)}\, dk, \]

where $\mu_\pm = \mu_1 \pm i\mu_2$ and $\mu$, $\mu_1$ are negative. Comparing with the previous form of the differential $dg$ we obtain

\[ \begin{aligned} \mu + 2\mu_1 - d_1 &= -(b + \xi), \\ 2\mu\mu_1 + \mu_1^2 + \mu_2^2 + (b + \xi) d_1 - \tfrac{1}{2} d_2^2 &= \tfrac{1}{2} a^2 + b\xi, \\ \mu(\mu_1^2 + \mu_2^2) &= -c_0(\xi, d_1, d_2). \end{aligned} \]

Let $(k - d)^{1/2}$ be the local parameter at $d$. It is easy to see that the local expansion of $g(k)$ at $d$ is of the form

\[ g(k) = B_g + g_1 (k - d)^{1/2} + g_2 (k - d)^{3/2} + \dots, \qquad B_g \in \mathbb{R}. \]

i. Since $g(E) = \operatorname{Im} g(d) = 0$, the points $E$ and $d = d(\xi)$ are connected by a curve $\gamma_d$ where $\operatorname{Im} g(k) \equiv 0$.
ii. Since $g(k)$ is real on the real axis, there exist a real point $\mu = \mu(\xi)$ and some curve connecting $\mu$ and $d$, where $\operatorname{Im} g(k) \equiv 0$.
iii. Since $g(k)$ behaves like $\theta(k)$ for large $k$, there exists a curve where $\operatorname{Im} g(k) \equiv 0$, starting from $d$ and going to infinity along the asymptotic line $\operatorname{Re} k = -\xi$. Here $-\xi$ is the stationary point of the phase function $\theta(k)$.


Fig. 11. The signature table of $\operatorname{Im} g(k)$ in the region $b - a\sqrt{2} < \xi < b$

Thus the curve $\operatorname{Im}g(k)=0$ must have three branches going out from the point $d$. This is possible if and only if $g_1=0$, i.e.
\[
(k-d)^{1/2}g'(k)\big|_{k=d}=4\,\frac{(d-\mu(\xi))(d-\mu_-(\xi))(d-\mu_+(\xi))}{\sqrt{(d-E)(d-\bar E)(d-\bar d)}}=0.
\]
Since $\mu$ is real then $\mu_+=d$ and $\mu_-=\bar d$ and finally we have
\[
dg(k)=4(k-\mu(\xi))\sqrt{\frac{(k-d(\xi))(k-\bar d(\xi))}{(k-E)(k-\bar E)}}\,dk.
\]
Previous considerations yield the signature table of the function $\operatorname{Im}g(k)$ shown in Fig. 11. The functions $\mu(\xi)$, $d(\xi)=d_1(\xi)+id_2(\xi)$ have to satisfy now the following equations:
\[
\begin{aligned}
&\mu+d_1=-(b+\xi),\\
&d_2^2-2\Big(d_1+\frac{b+\xi}{2}\Big)^2=a^2-\frac{(b-\xi)^2}{2},\\
&\mu(\mu_1^2+\mu_2^2)=-c_0(\xi,d_1,d_2).
\end{aligned}
\]
These three equations can be reduced to one equation with respect to $\mu(\xi)$ involving elliptic integrals. Indeed, let us put
\[
d_1=d_1(\xi)=-\mu-b-\xi,\qquad\text{then}\qquad d_2=d_2(\xi)=\sqrt{2\mu^2+2(b+\xi)\mu+a^2+2b\xi}.
\]

The a-period of the differential $dg$ vanishes. It means that
\[
\mu=\frac{I_1(d_1,d_2)}{I_0(d_1,d_2)},
\]
where
\[
I_1(d_1,d_2)=\int_{d_1-id_2}^{d_1+id_2}z\,\sqrt{\frac{(z-d_1)^2+d_2^2}{(z+b)^2+a^2}}\,dz,\qquad
I_0(d_1,d_2)=\int_{d_1-id_2}^{d_1+id_2}\sqrt{\frac{(z-d_1)^2+d_2^2}{(z+b)^2+a^2}}\,dz.
\]
Thus we have one functional equation for $\mu=\mu(\xi)$:
\[
\mu=H(\mu,\xi),\qquad
H(\mu,\xi)=\frac{I_1\big(-\mu-b-\xi,\ \sqrt{2\mu^2+2(b+\xi)\mu+a^2+2b\xi}\,\big)}{I_0\big(-\mu-b-\xi,\ \sqrt{2\mu^2+2(b+\xi)\mu+a^2+2b\xi}\,\big)}.
\tag{5.15}
\]

We are interested in the solution $\mu=\mu(\xi)$ which moves into the closed interval $[-b,\,-b+a/\sqrt2]$ when $\xi$ belongs to $[b-a\sqrt2,\,b]$. Equation (5.15) is consistent with this requirement. Indeed, if $\xi=b$ then by construction of the $g$-function, the branch point $d(b)=d_1(b)+id_2(b)$ must coincide with the branch point $E=-b+ia$. In this case the elliptic integrals degenerate and $I_0=2ia$, $I_1=-2iab$. Hence $\mu(b)=-b$. The function $g(k)$ reduces consequently (up to a constant) to the phase function $\theta(k)=\theta(k,\xi)=2k^2+4\xi k$ which attended the problem in the region $\xi>b$:
\[
g(k,b)-2|E|^2=\theta(k,b).
\]
On the other hand, let us note that after the change of variable $z=d_1+isd_2$ in the integrals $I_0$ and $I_1$ the functional equation takes the form:
\[
\mu=d_1-d_2\,
\frac{\displaystyle\int_{-1}^{1}\frac{s\sqrt{1-s^2}\,ds}{\sqrt{a^2+(b+d_1+isd_2)^2}}}
{\displaystyle\int_{-1}^{1}\frac{\sqrt{1-s^2}\,ds}{\sqrt{a^2+(b+d_1+isd_2)^2}}}
\]
with the substitution:
\[
d_1:=d_1(\mu,\xi)=-\mu-b-\xi,\qquad d_2:=d_2(\mu,\xi)=\sqrt{2\mu^2+2(b+\xi)\mu+a^2+2b\xi}.
\]
If $\xi=\xi_0=b-a\sqrt2$ then, due to the construction, $d_2(\xi_0)$ must be equal to zero and, hence, $d_1(\xi_0)=-(b+\xi_0)/2=-b+a/\sqrt2$. Therefore $\mu(\xi_0)=d_1(\xi_0)=-b+a/\sqrt2$. The function $g(k)$ reduces consequently to the $g$-function considered in the previous Subsect. 5.4 for the region $0<\xi<b-a\sqrt2$.

For $b-a\sqrt2<\xi<b$ we deform the contour $\gamma_0\cup\bar\gamma_0$ into the contour $\gamma_d\cup\bar\gamma_{\bar d}\cup[d,\bar d]$ (again using Property #1 of $c(k)$). The part $\gamma_d\cup\bar\gamma_{\bar d}$ of this contour is chosen in such a way that $\operatorname{Im}g(k)\equiv0$ on it. Then the function $g(k)$ has the following properties:
\[
\begin{aligned}
&g_+(k)+g_-(k)=0, && k\in\gamma_d\cup\bar\gamma_{\bar d};\\
&g_+(k)-g_-(k)=B_g,\quad \operatorname{Im}B_g=0, && k\in[d,\bar d].
\end{aligned}
\]
We recall that the function $g(k)$ is continuous in $k\in(-b+i\infty,E)\cup(\bar E,-b-i\infty)$:
\[
g_+(k)-g_-(k)=0,\qquad k\in(-b+i\infty,E)\cup(\bar E,-b-i\infty).
\]
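The two consistency checks above (the right edge $\xi=b$ and the left edge $\xi_0=b-a\sqrt2$) can be reproduced numerically. The following is a minimal sketch, not part of the original derivation: the function names `branch_point` and `period_integrals` are hypothetical, the sample values of `a` and `b` are arbitrary, and the principal branch of `cmath.sqrt` is an assumption that is only safe in the degenerate case tested here.

```python
import cmath
import math


def branch_point(mu, xi, a, b):
    """Return (d1, d2) computed from mu and xi by the substitution used in (5.15)."""
    d1 = -mu - b - xi
    d2_squared = 2 * mu**2 + 2 * (b + xi) * mu + a**2 + 2 * b * xi
    # Clamp tiny negative rounding errors at the edge xi = b - a*sqrt(2).
    return d1, math.sqrt(max(d2_squared, 0.0))


def period_integrals(d1, d2, a, b, n=2000):
    """Midpoint-rule approximation of (I0, I1) along z = d1 + i*s*d2, -1 < s < 1.

    Assumption: the integrand is evaluated with the principal branch of sqrt.
    """
    h = 2.0 / n
    i0 = 0j
    i1 = 0j
    for j in range(n):
        s = -1.0 + (j + 0.5) * h
        z = complex(d1, s * d2)
        ratio = cmath.sqrt(((z - d1) ** 2 + d2**2) / ((z + b) ** 2 + a**2))
        dz = 1j * d2 * h
        i0 += ratio * dz
        i1 += z * ratio * dz
    return i0, i1


if __name__ == "__main__":
    a, b = 1.0, 2.0  # arbitrary sample parameters
    d1, d2 = branch_point(-b, b, a, b)  # right edge: xi = b, mu = -b
    i0, i1 = period_integrals(d1, d2, a, b)
    print("d =", complex(d1, d2), "I0 =", i0, "I1 =", i1, "mu =", (i1 / i0).real)
```

At $\xi=b$ the integrand is identically one, so the quadrature only confirms the values $I_0=2ia$, $I_1=-2iab$ quoted in the text. Solving $\mu=H(\mu,\xi)$ inside the interval would require a careful choice of branch along the path, which this sketch does not attempt.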

Fig. 12. The contour $\Sigma^{(3)}$ for the elliptic region $b-a\sqrt2<\xi<b$

5.5.2 Let us perform the same transformations as in Subsect. 5.4.2–5.4.4:
\[
M(x,t;k)\to M^{(1)}(x,t;k)\to M^{(2)}(x,t;k)\to M^{(3)}(x,t;k).
\]
The function $\delta(k)$ is defined as in (5.2), but now $\kappa_0=\mu(\xi)$, where $\mu(\xi)$ is the real stationary point of the new phase function $g(k)$. The RH problem is considered now on the contour $\Sigma^{(3)}$, depicted in Fig. 12. The jump matrix
\[
J^{(3)}(x,t;k):=G_+^{-1}(k)J^{(2)}(x,t;k)G_-(k)
\]
admits the following estimates as $t\to+\infty$:
\[
J^{(3)}(x,t;k)=I+O(e^{-\varepsilon t}),\quad\varepsilon>0,\qquad k\in\Sigma^{(3)}\setminus\{C_\mu\cap\Sigma^{(3)}\},\quad\arg(k-\mu(\xi))=\frac{2j-1}{4}\pi,\quad j=1,2,3,4,
\]
where $C_\mu$ is a small circle centered at $\mu$. Furthermore, for $k\in[d,\bar d]$ the jump matrix
\[
J^{(3)}(x,t;k):=G_+^{-1}(k)J^{(2)}(x,t;k)G_-(k)
\]
takes different forms:
• For $k\in[d,\bar d]$ and $\arg(k-\mu)>\pi/4$,
\[
\begin{pmatrix}e^{-2itB_g}&0\\ f(k)\delta^{-2}(k)e^{it(g_+(k)+g_-(k))}&e^{2itB_g}\end{pmatrix}.
\]
• For $k\in[d,\bar d]$ and $0<\arg(k-\mu)<\pi/4$,
\[
\begin{pmatrix}1&0\\ \hat\rho_+(k)\delta^{-2}(k)e^{2itg_+(k)}&1\end{pmatrix}
\begin{pmatrix}e^{-2itB_g}&0\\ f(k)\delta^{-2}(k)e^{it(g_+(k)+g_-(k))}&e^{2itB_g}\end{pmatrix}
\times
\begin{pmatrix}1&0\\ -\hat\rho_-(k)\delta^{-2}(k)e^{2itg_-(k)}&1\end{pmatrix}.
\]

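• For $k\in[d,\bar d]$ and $-\pi/4<\arg(k-\mu)<0$,
\[
\begin{pmatrix}1&-\bar{\hat\rho}_+(k)\bar\delta^{2}(k)e^{-2itg_+(k)}\\ 0&1\end{pmatrix}
\begin{pmatrix}e^{-2itB_g}&-\bar f(k)\bar\delta^{2}(k)e^{-it(g_+(k)+g_-(k))}\\ 0&e^{2itB_g}\end{pmatrix}
\times
\begin{pmatrix}1&\bar{\hat\rho}_-(k)\bar\delta^{2}(k)e^{-2itg_-(k)}\\ 0&1\end{pmatrix}.
\]
• For $k\in[d,\bar d]$ and $\arg(k-\mu)<-\pi/4$,
\[
\begin{pmatrix}e^{-2itB_g}&-\bar f(k)\bar\delta^{2}(k)e^{-it(g_+(k)+g_-(k))}\\ 0&e^{2itB_g}\end{pmatrix}.
\]
Therefore, away from $d$ and $\bar d$, and for $t\to+\infty$, they are close to the diagonal matrix
\[
J^{\mathrm{mod}}=\begin{pmatrix}e^{-2itB_g}&0\\ 0&e^{2itB_g}\end{pmatrix}.
\tag{5.16}
\]
The jump matrix which has to be factorized is the part of the matrix $J^{(3)}(x,t;k)$ on the arcs $\gamma_d$ and $\bar\gamma_{\bar d}$:
\[
J^{(3)}(x,t;k)=
\begin{cases}
\begin{pmatrix}e^{-2itg_+(k)}&0\\ f(k)\delta^{-2}(k)&e^{2itg_+(k)}\end{pmatrix},&k\in\gamma_d,\\[3mm]
\begin{pmatrix}e^{-2itg_+(k)}&-\bar f(k)\bar\delta^{2}(k)\\ 0&e^{2itg_+(k)}\end{pmatrix},&k\in\bar\gamma_{\bar d}.
\end{cases}
\]
As in Sect. 5.4.4 we have to consider an auxiliary scalar Riemann–Hilbert problem:

Scalar RH-problem. Find a scalar function $F(k)$ such that
• $F$ is analytic outside the contour $\gamma_d\cup\bar\gamma_{\bar d}$,
• $F$ does not vanish,
• $F$ satisfies the jump relation
\[
F_-(k)F_+(k)=h(k)\delta^{-2}(k),\qquad k\in\gamma_d\cup\bar\gamma_{\bar d},
\]
where
\[
h(k)=\begin{cases}-ie^{i\alpha}f(k),&k\in\gamma_d,\\ -ie^{i\alpha}\bar f^{-1}(k),&k\in\bar\gamma_{\bar d}.\end{cases}
\]
To find the solution of this RH problem, let us use the function
\[
w(k)=\sqrt{(k-E)(k-\bar E)(k-d)(k-\bar d)}.
\]
Since
\[
\Big[\frac{\log F(k)}{w(k)}\Big]_+-\Big[\frac{\log F(k)}{w(k)}\Big]_-=\frac{\log[h(k)\delta^{-2}(k)]}{w_+(k)},\qquad k\in\gamma_d\cup\bar\gamma_{\bar d},
\]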

we have as in Sect. 5.4.4:
\[
F(k)=\exp\Big\{\frac{w(k)}{2\pi i}\int_{\gamma_d\cup\bar\gamma_{\bar d}}\frac{\log[h(s)\delta^{-2}(s,\xi)]}{s-k}\,\frac{ds}{w_+(s)}\Big\}.
\]
The important difference, however, is that now the function $F(k)$ has an essential singularity at infinity. Indeed we have
\[
F(k)=F_\infty e^{i\Delta k}\Big(1+O\Big(\frac1k\Big)\Big),\qquad k\to\infty,
\]
where
\[
\Delta\equiv\Delta(\xi)=\frac{1}{2\pi}\int_{\gamma_d\cup\bar\gamma_{\bar d}}\log[h(s)\delta^{-2}(s,\xi)]\,\frac{ds}{w_+(s)},\qquad
F_\infty=\exp\Big\{\frac{i}{2\pi}\int_{\gamma_d\cup\bar\gamma_{\bar d}}(s-e_1)\log[h(s)\delta^{-2}(s,\xi)]\,\frac{ds}{w_+(s)}\Big\}
\tag{5.17}
\]
with
\[
e_1=\frac{E+\bar E+d+\bar d}{2}.
\tag{5.18}
\]
In order to account for this singularity, let us introduce the normalized (the a-period vanishes) Abelian integral $\omega(k)$ of the second kind with simple poles at $\infty_\pm$:
\[
\omega(k)=\int_E^k\frac{z^2-e_1z+e_0}{w(z)}\,dz,
\]
where $e_1$ is the same as in (5.18) (therefore the differential $d\omega$ has no residues) and $e_0$ is defined from the condition
\[
\oint_a d\omega(k)=0,
\tag{5.19}
\]
i.e.
\[
e_0\int_d^{\bar d}\frac{dz}{w(z)}=-\int_d^{\bar d}(z^2-e_1z)\,\frac{dz}{w(z)}.
\]
The large $k$ expansion of $\omega(k)$ on the upper sheet is of the form
\[
\omega(k)=k+\omega_\infty(\xi)+O(k^{-1}),\qquad k\to\infty_+,
\]
where
\[
\omega_\infty=\int_E^{\infty_+}\Big[\frac{z^2-e_1z+e_0}{w(z)}-1\Big]dz-E
\equiv\frac12\Big(\int_E^{\infty_+}+\int_{\bar E}^{\infty_+}\Big)\Big[\frac{z^2-e_1z+e_0}{w(z)}-1\Big]dz+b
\tag{5.20}
\]
is a real function of $\xi$. The path of integration is any contour lying in the right half-plane $\operatorname{Re}k>-b$ of the upper sheet and going to infinity along the real axis. In the last identity, we have taken into account Eq. (5.19) and the absence of residue at infinity for $d\omega$. With

the same convention about the choice of the contour of integration as in Sect. 5.5.1 for the case of the Abelian integral $g(k)$ (the contour of integration does not intersect the augmented contour $\gamma_d\cup\bar\gamma_{\bar d}\cup[d,\bar d]\cup(-b+i\infty,E)\cup(\bar E,-b-i\infty)$), we see that the Abelian integral $\omega(k)$ satisfies similar jump relations:
\[
\begin{aligned}
&\omega_+(k)+\omega_-(k)=0, && k\in\gamma_d\cup\bar\gamma_{\bar d};\\
&\omega_+(k)-\omega_-(k)=B_\omega, && k\in[d,\bar d];\\
&\omega_+(k)-\omega_-(k)=0, && k\in(-b+i\infty,E)\cup(\bar E,-b-i\infty).
\end{aligned}
\]
Here, $B_\omega$ is the b-period of the integral $\omega(k)$:
\[
B_\omega=\oint_b d\omega=2\int_E^d\frac{z^2-e_1z+e_0}{w(z)}\,dz=\Big(\int_E^d+\int_{\bar E}^{\bar d}\Big)\frac{z^2-e_1z+e_0}{w(z)}\,dz,
\tag{5.21}
\]

where the last equation follows again from (5.19) and from the absence of residue at infinity. This equation explicitly indicates that $\operatorname{Im}B_\omega=0$. Let us now pass from the function $F(k)$ to the function
\[
\hat F(k):=F(k)e^{-i\Delta\omega(k)}.
\]
This function no longer has an essential singularity at $k=\infty$. Indeed
\[
\hat F(\infty,\xi)=\exp(i\phi(\xi)),
\]
where
\[
\phi(\xi)=\frac{1}{2\pi}\int_{\gamma_d\cup\bar\gamma_{\bar d}}(s-e_1)\log[-ie^{i\alpha}f(s)\delta^{-2}(s,\xi)]\,\frac{ds}{w_+(s)}-\Delta\omega_\infty.
\]
The function $\hat F(k)$ has the same jumps as $F(k)$ across the arcs $\gamma_d$ and $\bar\gamma_{\bar d}$, and an extra jump across the interval $[d,\bar d]$. Indeed, we have
\[
\frac{\hat F_+(k)}{\hat F_-(k)}=e^{-iB_\omega\Delta}.
\]

5.5.3 Let us now make the last step
\[
M^{(3)}(x,t;k)\to M^{(4)}(x,t;k),
\]
consisting in opening lenses around the contours $\gamma_d$ and $\bar\gamma_{\bar d}$. In this step we use the function $\hat F(k)$ in place of $F(k)$ when performing the $N_{\mathrm{up}}$-$N_{\mathrm{low}}$ factorizations of the jump matrix $J^{(3)}(x,t;k)$ and when defining the matrix $M^{(4)}(x,t;k)$. This step brings us to a one-gap model problem:

One-gap model RH-problem
\[
M_-^{\mathrm{mod}}(x,t;k)=M_+^{\mathrm{mod}}(x,t;k)J^{\mathrm{mod}}(k),\qquad k\in\gamma_d\cup\bar\gamma_{\bar d}\cup[d,\bar d],
\tag{5.22a}
\]
with jump matrix:
\[
J^{\mathrm{mod}}(k)=
\begin{cases}
\begin{pmatrix}0&ie^{i\alpha}\\ ie^{-i\alpha}&0\end{pmatrix},&k\in\gamma_d\cup\bar\gamma_{\bar d},\\[3mm]
\begin{pmatrix}e^{-itB_g-iB_\omega\Delta}&0\\ 0&e^{itB_g+iB_\omega\Delta}\end{pmatrix},&k\in[d,\bar d],
\end{cases}
\tag{5.22b}
\]
and with the asymptotic condition at $k=\infty$:
\[
M^{\mathrm{mod}}(x,t;\infty)=I.
\tag{5.22c}
\]

Note the difference with the “preliminary” matrix $J^{\mathrm{mod}}$ across the segment $[d,\bar d]$ indicated in (5.16). The rigorous asymptotic statement needs, of course, an analysis of the relevant parametrix in the neighborhoods of the end points. The local representation of $g(k)$ at $E$ and $\bar E$ is characterized by a square root type behavior:
\[
g(k)\sim\sqrt{k-E}+\mathrm{const}\ \text{ at }E,\qquad g(k)\sim\sqrt{k-\bar E}+\mathrm{const}\ \text{ at }\bar E.
\]
This means that the associated local model RH problems are solvable in terms of Bessel functions near $E$ and $\bar E$ (see again [12]). Similarly, the local representation of $g(k)$ at $d$ and $\bar d$ exhibits a 3/2-root type behavior:
\[
g(k)\sim(k-d)^{3/2}+\mathrm{const}\ \text{ at }d,\qquad g(k)\sim(k-\bar d)^{3/2}+\mathrm{const}\ \text{ at }\bar d.
\]
Hence, near $d$ and $\bar d$, the associated local model RH problems are solvable in terms of Airy functions (see again [13]). The resulting estimate for the matrix function $M^{(4)}(x,t;k)$ is
\[
M^{(4)}(x,t;k)=\big(I+O(t^{-1/2})\big)M^{\mathrm{mod}}(x,t;k).
\tag{5.23}
\]
It is worth mentioning that the error term of order $t^{-1/2}$ comes from the contribution of the stationary phase point $\mu(\xi)$.

5.5.4 The model problem (5.22) can be solved in terms of elliptic theta functions. For this purpose let us introduce the necessary ingredients. Consider the elliptic (two band) Riemann surface of
\[
w(k)=\sqrt{(k-E)(k-\bar E)(k-d(\xi))(k-\bar d(\xi))},
\]
where the branch points $d(\xi)$ and $\bar d(\xi)$ depend on $\xi$. Let
\[
U(k)=\frac1c\int_E^k\frac{dz}{w(z)}
\]
be the normalized Abelian integral, i.e., its a-period is equal to one, which means:
\[
c=2\int_{\bar d}^{d}\frac{dz}{w(z)}.
\]

Then,
\[
\tau:=\frac2c\int_E^d\frac{dz}{w(z)}
\tag{5.24}
\]
with $\operatorname{Im}\tau>0$. Furthermore, the following relations are valid:
\[
\begin{aligned}
&U_+(k)+U_-(k)=0, && k\in\gamma_d;\\
&U_+(k)+U_-(k)=-1, && k\in\bar\gamma_{\bar d};\\
&U_+(k)-U_-(k)=\tau, && k\in[d,\bar d].
\end{aligned}
\]
The next ingredient is the new function $\nu(k)$ defined by
\[
\nu(k)=\Big(\frac{(k-E)(k-d(\xi))}{(k-\bar E)(k-\bar d(\xi))}\Big)^{1/4}
\]
and by its asymptotic behavior as $k\to\infty$, $k\notin\gamma_d\cup\bar\gamma_{\bar d}$:
\[
\nu(k)=1+\frac{a+d_2(\xi)}{2ik}+O(k^{-2}).
\]
For the function $\nu(k)$ we have cuts along the contours $\gamma_d$, $\bar\gamma_{\bar d}$ and $\nu_-(k)=i\nu_+(k)$ along these cuts. If
\[
E_0=\frac{Ed(\xi)-\bar E\bar d(\xi)}{E-\bar E+d(\xi)-\bar d(\xi)}=\frac{ad_1(\xi)-bd_2(\xi)}{a+d_2(\xi)},
\]
which lies in the interval $[-b,\,-b+a/\sqrt2]$, then
\[
\nu(E_0)-\frac{1}{\nu(E_0)}=0,\qquad\nu(E_0)+\frac{1}{\nu(E_0)}\neq0.
\]
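The formula for $E_0$ can be sanity-checked numerically: at $k=E_0$ the fourth power of $\nu$ should equal one. The sketch below is only an illustration; the helper names and the sample values of `d1`, `d2` are hypothetical and are not taken from an actual solution of (5.15).

```python
def e0_point(a, b, d1, d2):
    """Real point E0 = (a*d1 - b*d2) / (a + d2) from the text."""
    return (a * d1 - b * d2) / (a + d2)


def e0_point_complex(a, b, d1, d2):
    """Same point written through E = -b + i*a and d = d1 + i*d2."""
    big_e = complex(-b, a)
    d = complex(d1, d2)
    numerator = big_e * d - big_e.conjugate() * d.conjugate()
    denominator = big_e - big_e.conjugate() + d - d.conjugate()
    return numerator / denominator


def nu_fourth_power(k, a, b, d1, d2):
    """nu(k)^4 = (k - E)(k - d) / ((k - conj E)(k - conj d))."""
    big_e = complex(-b, a)
    d = complex(d1, d2)
    return (k - big_e) * (k - d) / ((k - big_e.conjugate()) * (k - d.conjugate()))


if __name__ == "__main__":
    a, b, d1, d2 = 1.0, 2.0, -1.8, 0.5  # arbitrary sample values
    k0 = e0_point(a, b, d1, d2)
    print(k0, e0_point_complex(a, b, d1, d2), nu_fourth_power(k0, a, b, d1, d2))
```

Since $E_0$ is real, $\nu^4(E_0)$ is the ratio of a complex number to its conjugate, so it equals one exactly when $(E_0-E)(E_0-d)$ is real, which is the condition the formula encodes.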

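The last ingredient is the theta function with $\operatorname{Im}\tau=\operatorname{Im}\tau(\xi)>0$:
\[
\theta_3(z)=\sum_{m\in\mathbb{Z}}e^{\pi i\tau m^2+2\pi imz},
\]
which has the following properties:
\[
\theta_3(-z)=\theta_3(z),\qquad\theta_3(z+1)=\theta_3(z),\qquad\theta_3(z+\tau)=e^{-\pi i\tau-2\pi iz}\theta_3(z).
\]
Now introduce the matrix $\Theta(k)=\Theta(t,\xi;k)$ with entries:
\[
\begin{aligned}
\Theta_{11}(k)&=\frac12\Big(\nu(k)+\frac{1}{\nu(k)}\Big)\frac{\theta_3[U(k)+U(E_0^-)-1/2-\tau/2-B_gt/2\pi-B_\omega\Delta/2\pi]}{\theta_3[U(k)+U(E_0^-)-1/2-\tau/2]},\\
\Theta_{12}(k)&=\frac12e^{i\alpha}\Big(\nu(k)-\frac{1}{\nu(k)}\Big)\frac{\theta_3[U(k)-U(E_0^-)+1/2+\tau/2+B_gt/2\pi+B_\omega\Delta/2\pi]}{\theta_3[U(k)-U(E_0^-)+1/2+\tau/2]},\\
\Theta_{21}(k)&=\frac12e^{-i\alpha}\Big(\nu(k)-\frac{1}{\nu(k)}\Big)\frac{\theta_3[U(k)-U(E_0^-)-1/2-\tau/2-B_gt/2\pi-B_\omega\Delta/2\pi]}{\theta_3[U(k)-U(E_0^-)-1/2-\tau/2]},\\
\Theta_{22}(k)&=\frac12\Big(\nu(k)+\frac{1}{\nu(k)}\Big)\frac{\theta_3[U(k)+U(E_0^-)+1/2+\tau/2+B_gt/2\pi+B_\omega\Delta/2\pi]}{\theta_3[U(k)+U(E_0^-)+1/2+\tau/2]},
\end{aligned}
\]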

where $E_0^-$ is the preimage of $E_0$ on the second sheet of the Riemann surface. This function is analytic on the first sheet of the Riemann surface cut along $\gamma_d\cup\bar\gamma_{\bar d}\cup[d,\bar d]$, where it satisfies the jump conditions (5.22b) of the model RH problem (5.22a). Then the solution of this problem (5.22) is given by
\[
M^{\mathrm{mod}}(x,t;k)=\Theta^{-1}(x,t;\infty)\Theta(x,t;k).
\]
As in Subsect. 5.4.5, for the plane wave region,
\[
\begin{aligned}
q(x,t)&=2im_{12}(x,t)\\
&=2ie^{2itg_\infty(\xi)}m^{(1)}_{12}(x,t)\\
&=2ie^{2itg_\infty(\xi)}m^{(2)}_{12}(x,t)\\
&=2ie^{2itg_\infty(\xi)}\big(m^{(3)}_{12}(x,t)+O(t^{-1/2})\big)\\
&=2ie^{2itg_\infty(\xi)}\big(m^{(4)}_{12}(x,t)\hat F^{-2}(\infty)+O(t^{-1/2})\big)\\
&=2ie^{2itg_\infty(\xi)}\big(m^{\mathrm{mod}}_{12}(x,t)\hat F^{-2}(\infty)+O(t^{-1/2})\big).
\end{aligned}
\]
We denote
\[
U_0=U(E_0^-)-\frac12-\frac\tau2.
\]
Taking into account that
\[
2im^{\mathrm{mod}}_{12}(x,t)=[a+d_2(\xi)]\,e^{i\alpha}\,
\frac{\theta_3[B_gt/2\pi+B_\omega\Delta/2\pi+U(\infty_+)-U_0]}{\theta_3[B_gt/2\pi+B_\omega\Delta/2\pi-U(\infty_+)-U_0]}
\times\frac{\theta_3[U(\infty_+)+U_0]}{\theta_3[U(\infty_+)-U_0]},
\]
and $\hat F^{-2}(\infty)=e^{-2i\phi(\xi)}$, we get the asymptotics of the solution of the IBV problem (1.1) in the region $b-a\sqrt2<\xi<b$.

Theorem 5 (Elliptic region, $b-a\sqrt2<\xi<b$). Suppose that all conditions of Theorem 1 and Conditions #1, #2, #3 are satisfied. Then in the region $b-a\sqrt2<\xi<b$ the solution (4.3) of the IBV problem takes the form of a modulated elliptic wave:
\[
q(x,t)=[a+\operatorname{Im}d(\xi)]\,e^{i\alpha}\,
\frac{\theta_3(B_gt/2\pi+B_\omega\Delta/2\pi-V_-)\,\theta_3(V_+)}{\theta_3(B_gt/2\pi+B_\omega\Delta/2\pi-V_+)\,\theta_3(V_-)}\,
e^{2ig_\infty(\xi)t-2i\phi(\xi)}+O(t^{-1/2}).
\tag{5.25}
\]
Here, $B_g$, $B_\omega$, $\Delta$ are functions of the slow variable $\xi=\dfrac{x}{4t}$ defined by (5.13), (5.21), (5.17), respectively, and
\[
V_+=U(E_0^-)-\frac12-\frac{\tau(\xi)}{2}+U(\infty_+),\qquad
V_-=U(E_0^-)-\frac12-\frac{\tau(\xi)}{2}-U(\infty_+).
\]

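Furthermore,
\[
\theta_3(z)=\sum_{m\in\mathbb{Z}}e^{\pi i\tau m^2+2\pi imz}
\]
is the theta function of invariant $\tau=\tau(\xi)$, $\operatorname{Im}\tau>0$, defined in (5.24) and
\[
g_\infty(\xi)=2\Big(\int_E^\infty+\int_{\bar E}^\infty\Big)\Big[(z-\mu(\xi))\sqrt{\frac{(z-d(\xi))(z-\bar d(\xi))}{(z-E)(z-\bar E)}}-(z+\xi)\Big]dz+2a^2-2b^2+4b\xi
\]
is a regularization of the phase function $g(k)$. Finally, the phase shift is given by
\[
\phi(\xi)=\frac{1}{2\pi}\int_{\gamma_d\cup\bar\gamma_{\bar d}}(s-e_1-\omega_\infty)\log\big[h(s)\delta^{-2}(s,\xi)\big]\,\frac{ds}{w_+(s)},
\]
where
\[
h(k)=\begin{cases}-ie^{i\alpha}f(k),&k\in\gamma_d,\\ -ie^{i\alpha}\bar f^{-1}(k),&k\in\bar\gamma_{\bar d},\end{cases}
\qquad
\delta(k)=\exp\Big\{\frac{1}{2\pi i}\int_{-\infty}^{\mu(\xi)}\frac{\log(1+|r(s)+c(s)|^2)\,ds}{s-k}\Big\},\quad k\in\mathbb{C}\setminus(-\infty,\mu(\xi)],
\]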

and $e_1=e_1(\xi)$, $\Delta=\Delta(\xi)$, $\omega_\infty=\omega_\infty(\xi)$ and $\mu(\xi)$ are defined by (5.18), (5.17), (5.20) and (5.15), respectively. The spectral functions $c(k)$ and $r(k)$ are defined by the initial and boundary data, see (2.14) and (2.15), respectively, and $f(k)=c_-(k)-c_+(k)$ is the jump of $c(k)$, see (2.21).

Remark 5 (Matching). For $\xi=\xi_0$ we have $\operatorname{Im}d(\xi_0)=0$. Hence $g_\infty^{\mathrm{elliptic}}(\xi_0)=g_\infty^{\mathrm{planewave}}(\xi_0)$, and $\theta(\,\cdot\,,\xi_0)\equiv1$. We have then that the elliptic wave (5.25) coincides with the plane wave (5.11) at $\xi=\xi_0$. The problem of matching of the elliptic wave with the asymptotic solitons and these solitons with the vanishing Zakharov–Manakov asymptotics is much more complicated and will be addressed in detail in a separate publication. The underlying mechanism of these matchings can be briefly described as follows. On the right edge of the transition zone, that is when $\xi\sim b$, the error term in the asymptotic formula of Theorem 3 becomes dominant. Moreover, as it follows from ˆ the representation of the function (k, t, ξ ) as the product sol (k, t, ξ )reg (k, t, ξ ) (see Sect. 5.2.3), this term is given by an expression similar to the Zakharov–Manakov formula (5.5). On the left edge of the transition zone, when we have to assume that $N=O(t/\log t)$, the sum from Theorem 3 becomes an infinite sum of $\cosh^{-2}$ which will account for the appearance of the Weierstraß ℘-function in the asymptotics of $|q(x,t)|$. As we have already indicated, the details of this transformation will be presented elsewhere.

Remark 6. If $b^2=2a^2$, i.e., $\xi_0=0$, then the asymptotic behavior of the solution is only described by elliptic functions with modulated parameter, and the plane wave region disappears.
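The theta function entering Theorem 5 and the degeneration $\theta\equiv1$ mentioned in Remark 5 can be illustrated with a truncated series. This is a hedged sketch: the function name `theta3`, the truncation length and the sample values of $\tau$ and $z$ are illustrative choices, not quantities computed from the problem data.

```python
import cmath


def theta3(z, tau, terms=20):
    """Truncated series sum over |m| <= terms of exp(pi*i*tau*m^2 + 2*pi*i*m*z)."""
    if tau.imag <= 0:
        raise ValueError("the series converges only for Im tau > 0")
    total = 0j
    for m in range(-terms, terms + 1):
        total += cmath.exp(1j * cmath.pi * (tau * m * m + 2 * m * z))
    return total


if __name__ == "__main__":
    tau = 0.3 + 0.8j   # arbitrary sample invariant with Im tau > 0
    z = 0.17 - 0.05j   # arbitrary sample argument
    print("evenness      :", abs(theta3(-z, tau) - theta3(z, tau)))
    print("period 1      :", abs(theta3(z + 1, tau) - theta3(z, tau)))
    factor = cmath.exp(-1j * cmath.pi * tau - 2j * cmath.pi * z)
    print("quasi-period  :", abs(theta3(z + tau, tau) - factor * theta3(z, tau)))
    print("large Im tau  :", abs(theta3(z, 10j) - 1))
```

The last line shows the series collapsing to its $m=0$ term as $\operatorname{Im}\tau$ grows, which is one way to read the statement $\theta(\,\cdot\,,\xi_0)\equiv1$; whether $\operatorname{Im}\tau(\xi)\to\infty$ as $\xi\to\xi_0$ is an interpretation not spelled out in the text.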

Acknowledgements. The authors thank Percy Deift, John Elgin, John Gibbons and Dmitry Shepelsky for useful discussions, and Chunxiong Zheng for numeric simulations of the problem. The work of the first author was partially supported by the “Agence Nationale de la Recherche” grant ANR-08-BLAN-0311-01. The work of the second author was supported in part by NSF grant DMS-0401009.

References
1. Ablowitz, M.J., Segur, H.: Solitons and the Inverse Scattering Transform. SIAM Studies in Applied Mathematics, Vol. 4, Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), 1981
2. Bona, J.L., Sun, S.M., Zhang, B.-Y.: Forced oscillations of a damped Korteweg-de Vries equation in a quarter plane. Commun. Contemp. Math. 5(3), 369–400 (2003)
3. Boutet de Monvel, A., Fokas, A.S., Shepelsky, D.: The mKdV equation on the half-line. J. Inst. Math. Jussieu 3(2), 139–164 (2004)
4. Boutet de Monvel, A., Its, A.R., Kotlyarov, V.: Long-time asymptotics for the focusing NLS equation with time-periodic boundary condition. C. R. Math. Acad. Sci. Paris 345(11), 615–620 (2007)
5. Boutet de Monvel, A., Kotlyarov, V.: Scattering problem for the Zakharov-Shabat equations on the semiaxis. Inverse Problems 16(6), 1813–1837 (2000)
6. Boutet de Monvel, A., Kotlyarov, V.: Generation of asymptotic solitons of the nonlinear Schrödinger equation by boundary data. J. Math. Phys. 44(8), 3185–3215 (2003)
7. Boutet de Monvel, A., Kotlyarov, V.: The focusing nonlinear Schrödinger equation on the quarter plane with time-periodic boundary condition: a Riemann-Hilbert approach. J. Inst. Math. Jussieu 6(4), 579–611 (2007)
8. Boutet de Monvel, A., Kotlyarov, V., Shepelsky, D.: Decaying long-time asymptotics for the focusing NLS equation with periodic boundary condition. Int. Math. Res. Not. IMRN 2009(3), 547–577 (2009)
9. Boutet de Monvel, A., Kotlyarov, V.P., Shepelsky, D., Zheng, C.: Initial boundary value problems for integrable systems: towards the long time asymptotics. Preprint BiBoS n◦ 08-09-299 (2008), available at http://www.math.uni-bielefeld.de/~bibos/preprints/08-09-299.pdf
10. Buckingham, R., Venakides, S.: Long-time asymptotics of the nonlinear Schrödinger equation shock problem. Comm. Pure Appl. Math. 60(9), 1349–1414 (2007)
11. Deift, P.A., Its, A.R., Zhou, X.: Long-time asymptotics for integrable nonlinear wave equations. In: Important Developments in Soliton Theory, Springer Ser. Nonlinear Dynam., Berlin: Springer, 1993, pp. 181–204
12. Deift, P.A., Its, A.R., Zhou, X.: A Riemann-Hilbert approach to asymptotic problems arising in the theory of random matrix models, and also in the theory of integrable statistical mechanics. Ann. of Math. (2) 146(1), 149–235 (1997)
13. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure Appl. Math. 52(11), 1335–1425 (1999)
14. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Bull. Amer. Math. Soc. (N.S.) 26(1), 119–123 (1992)
15. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the MKdV equation. Ann. of Math. (2) 137(2), 295–368 (1993)
16. Fokas, A.S., Its, A.R.: The linearization of the initial-boundary value problem of the nonlinear Schrödinger equation. SIAM J. Math. Anal. 27(3), 738–764 (1996)
17. Fokas, A.S., Its, A.R., Sung, L.-Y.: The nonlinear Schrödinger equation on the half-line. Nonlinearity 18(4), 1771–1822 (2005)
18. Fokas, A.S., Menyuk, C.R.: Integrability and self-similarity in transient stimulated Raman scattering. J. Nonlinear Sci. 9(1), 1–31 (1999)
19. Holmer, J.: The initial-boundary-value problem for the 1D nonlinear Schrödinger equation on the half-line. Differ. Integral Equ. 18(6), 647–668 (2005)
20. Kaup, D.J., Steudel, H.: Virtual solitons and the asymptotics of second harmonic generation. Inverse Problems 17(4), 959–970 (2001). (Special issue to celebrate Pierre Sabatier’s 65th birthday (Montpellier, 2000))
21. Zakharov, V.E., Shabat, A.B.: Exact theory of two-dimensional self-focusing and one-dimensional self-modulation of waves in nonlinear media. Ž. Èksper. Teoret. Fiz. 61(1), 118–134 (1971) (Russian, with English summary); English transl., Sov. Phys. JETP 34(1), 62–69 (1972)

Communicated by M. Aizenman

Commun. Math. Phys. 290, 523–555 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0845-x

Communications in

Mathematical Physics

Relativistic Diffusion in Gödel’s Universe
Jacques Franchi
I.R.M.A., Université de Strasbourg, 7 rue René Descartes, 67084 Strasbourg cedex, France. E-mail: [email protected]
Received: 28 February 2008 / Accepted: 22 March 2009
Published online: 5 June 2009 – © Springer-Verlag 2009

Abstract: K. Gödel [G1] published his exact solution to Einstein’s field equations in 1949. On the other hand, a general Lorentz invariant operator, associated to the so-called “relativistic diffusion”, and making sense in any Lorentz manifold, was introduced recently by Franchi-Le Jan in [F-LJ]. Here a study of the relativistic diffusion is proposed in the framework of Gödel’s universe, which contains matter. Such a study is related to the determination of a boundary for this non-causal universe.

1. Introduction

K. Gödel [G1] published his exact solution to Einstein equations in 1949. The most striking feature of this cosmological model is that it is non-causal (though locally and geodesically causal), containing closed timelike curves. For this reason, it is generally considered as rather unphysical. Possessing a series of interesting properties, it has nevertheless aroused great interest among physicists. For example, it contains rotating matter, but no singularity. Moreover, explicit exact solutions to Einstein equations are not so numerous. W. Kundt [Ku] and S. Chandrasekhar-J.P. Wright [C-W] studied its geodesics, and S. Hawking-G. Ellis ([H-E], Sect. 5.7) emphasise coordinates (defined by Gödel himself) that exhibit its rotational symmetry (about any point), to draw a nice picture of its dynamics. D. Malament ([M1,M2]) calculated the minimal energy of a closed timelike curve. K. Gödel [G2] discussed other rotating universes, which are spatially homogeneous, finite, and expanding, and he showed in particular that there are many examples of such strongly causal cosmological models. A.V. Levichev [L] takes advantage of a group structure on Gödel’s universe G, to study all left-invariant Lorentz metrics on G (including Gödel’s).

The relativistic diffusion was introduced by J. Franchi and Y. Le Jan in [F-LJ], in the framework of general relativity, on an arbitrary Lorentz manifold, as the only diffusion which is covariant under Lorentz isometries.
In this sense, it is the Lorentzian

analogue of the Brownian motion on a Riemannian manifold. It can be seen as a random perturbation of the timelike geodesic flow. It lives on the pseudo-unit tangent bundle of the considered Lorentz manifold, and is roughly the development of the integrated Brownian motion of the unit pseudo-sphere (in a fixed tangent space). Its precise construction, by means of stochastic differential geometry on the frame bundle, is the purpose of ([F-LJ], Sect. 3). Note that another type of diffusion has been introduced on a Lorentz manifold in [De], maybe physically more significant, but without Lorentz covariance. As for the Brownian motion of a Riemannian manifold ([A,A-S,Ki,S]), there are natural questions about the relativistic diffusion of a Lorentz manifold, such as: What could be said about its long-proper time behaviour? Which knowledge about the manifold could it yield? It is reasonable to hope that the importance of heat kernels in Riemannian geometry could have an analogue in Lorentzian geometry. This seems to justify studies about the relativistic diffusion. As for Brownian motion, some answers could depend heavily on the base manifold. Anyway, it seems now hard to formulate and establish general results concerning a generic relativistic diffusion. To appreciate this difficulty, and why it is larger than in the Riemannian-Brownian case, recall that, beyond the non-positivity of the underlying metric, the relativistic diffusion does not live on the base manifold, but only on the pseudo-unit tangent bundle, implying in particular that it is basically seven-dimensional (the Lorentz manifolds of main interest, for obvious physical reasons, have four dimensions); and there is no general reason that it must contain lower-dimensional subdiffusions. 
Even in the highly symmetric Schwarzschild case (studied in [F-LJ]), the most reduced significant subdiffusion is three-dimensional, which is to be compared with the constantly curved Riemannian case, for which a crucial one-dimensional (radial) subdiffusion fortunately exists. Till now, the relativistic diffusion has been studied only in two space-times (namely, Minkowski and Schwarzschild-Kruskal-Szekeres ones), which are empty (having vanishing Ricci tensor) and satisfy all causality conditions considered in [H-E]. The simplest Lorentz manifold is Minkowski space, which is flat. The relativistic diffusion in Minkowski space, and especially its long-time behaviour, was first studied by Dudley [Du]. Very recently, Bailleul [B] performed the non-trivial determination of the Poisson boundary of Minkowski space (endowed with the relativistic operator), which is equivalent to the determination of the invariant σ-field of the natural filtration of the relativistic diffusion, and gave a geometric description of this boundary. The long-time (or long-proper time) behaviour of the relativistic diffusion in Schwarzschild-Kruskal-Szekeres space-time was studied in [F-LJ], however without reaching the full determination of the Poisson boundary, which appears complicated, even in this empty and highly symmetric space-time. Here a study of the relativistic diffusion on Gödel’s universe G is proposed. This universe also has a lot of symmetries (G admits a group structure with respect to which its metric is left-invariant), but it has a non-vanishing Ricci tensor and is non-causal, two significant differences with the two preceding examples. The leading purpose of the present work is to study, in the framework of G, the behaviour of the relativistic diffusion, with an emphasis on the asymptotic behaviour, which, as mentioned above, is intended to open on the Poisson boundary of G. 
The aim of this work is thus two-fold: on one hand, to show that, despite the non-existence of a causal boundary (but trivial), there is for a space-time like G a non-trivial intrinsic notion of boundary, which admits some geometric description (in terms of beams, i.e. classes of light rays, which can be seen as cylinders); on the other hand, to reinforce with this third

example a guess that should hold generally: relativistic diffusions should asymptotically behave as light rays. This article begins with a detailed study of timelike and lightlike geodesics, taking a different view from Kundt [Ku], Chandrasekhar-Wright [C-W], and going into more detail. For example, a detailed proof of (piece-wise) geodesic transitivity of G is given (Proposition 2). As a conclusion of the study of lightlike geodesics, a definition (Definition 2) of a beam (or boundary point, as an equivalence class of lightlike geodesics, without use of causality) and of convergence to a beam is given, which on one hand appears to be rather natural in this non-causal universe, and on the other hand becomes then justified by the determination of the Poisson boundary of G, even if incomplete. This notion of convergence is reinforced to a certain extent, in the last Sect. 3.8. Thus, the set B of beams has a natural structure of geometric 3-dimensional boundary, on which the isometry group of Gödel’s universe operates. To have a maybe more geometrical intuition or a picture of beams, it is possible to see them as oriented cylinders in G, as explained below, in Remark 4 and just after Definition 2. Then the relativistic diffusion of G is introduced. In order to study such a 7-dimensional diffusion, some sub-diffusions are considered, of dimensions 1, 2, and 4. A leading concern is here to bring out all asymptotic variables of the relativistic diffusion, or in other words, the invariant σ -field of its natural filtration, which is in turn closely related to the Poisson boundary of G (endowed with its invariant relativistic operator). The clue in this direction is the guess that convergence to a beam should eventually occur. The following theorem, progressively established in Sect. 3 below, shows that this general guess stands out as reinforced, and that some space of beams could generically yield a good notion of Lorentzian boundary, independent of global causality. 
The main results of the present article are summarised in the following (see precisely Theorem 1 in Sect. 3.7): Theorem. (i) The relativistic diffusion is irreducible (on its 7-dimensional phase space). (ii) Almost surely, the relativistic diffusion path possesses a 3-dimensional asymptotic random variable, and converges to a beam (in the sense of Definition 2 and Sect. 3.8). (iii) The support of possible beams the relativistic diffusion can converge to, is the whole 3-dimensional boundary space of beams. Note that Property (i) distinguishes strongly the relativistic diffusion of G from its analogues of Minkowski and Schwarzschild space-times, for which the absolute time component increases strictly with proper time. See Sect. 3.6 below. As a consequence of this theorem, and on the basis of some secondary results and considerations, the following conjecture appears to hold likely: by the determination of the 3-dimensional asymptotic random variable evoked in (ii) above, the whole invariant σ -field of the relativistic diffusion of Gödel’s universe G has been brought out, and then its whole Poisson boundary, which would thus identify with the geometric boundary B. The only relativistic case in which the Poisson boundary has been determined, up to now, is Minkowski space, by two different methods: Doob’s conditioning and then couplings, using an explicit expression of the laws of already found asymptotic variables, in [B]; or alternatively: study of the random walk associated with a lifted relativistic diffusion on Poincaré group, in [B-R]. The use of either method does not seem to be easy in the present curved case (likely as in any other curved case), since neither are the laws of the asymptotic variables explicit, nor is there any Poincaré group symmetry.

526

J. Franchi

The group structure of G (with left-invariant metric, see Sect. 2.1 below) does not seem to lift to some Poincaré-like group structure on the tangent bundle T 1 G (or on the frame bundle), as it should to prove efficient in the study of the relativistic diffusion (which lives in T 1 G), see Remark 5 below. Finally the Poisson boundary can be determined using time-reversing, in some Riemannian frameworks, see [A-T-U]. But this method does not seem to work here, where a (from infinity on) proper time-reversed relativistic diffusion appears out of reach. 2. Gödel’s Pseudo-Metric Definition 1. Gödel’s universe G is the manifold R4 , endowed with coordinates ξ := (t, x, y, z), and with the pseudo-metric (having signature (+, −, −, −)) defined by: ds 2 := dt 2 − d x 2 + 21 e2

√

2ωx

dy 2 + 2e

√

2ω x

dt dy − dz 2 ,

for some strictly positive constant ω. The inverse matrix of this pseudo-metric $((g_{ij}))$ is as follows:

\[ ((g^{ij})) = \begin{pmatrix} -1 & 0 & 2\,e^{-\sqrt2\,\omega x} & 0 \\ 0 & -1 & 0 & 0 \\ 2\,e^{-\sqrt2\,\omega x} & 0 & -2\,e^{-2\sqrt2\,\omega x} & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]

Recall that a vector $(\dot t, \dot x, \dot y, \dot z)$ above (t, x, y, z) is timelike if it belongs to the light cone based at (t, x, y, z), i.e. if and only if $\dot t^2 - \dot x^2 + \tfrac12 e^{2\sqrt2\,\omega x}\dot y^2 + 2 e^{\sqrt2\,\omega x}\dot t\,\dot y - \dot z^2 > 0$, or equivalently if and only if $\tfrac12\big(e^{\sqrt2\,\omega x}\dot y + 2\dot t\big)^2 > \dot t^2 + \dot x^2 + \dot z^2$. This same vector will be said to be future-directed if moreover $e^{\sqrt2\,\omega x}\dot y + 2\dot t > 0$. This prescribes continuously a half of the light cone as indicating a preferred direction, seen as that of future. Accordingly, a piece-wise $C^1$ path is said to be timelike if its tangent vector is everywhere timelike, and future-directed if its tangent vector is moreover everywhere future-directed. Along any timelike curve $(t_s, x_s, y_s, z_s)$, the unit pseudo-norm relation, defining proper time s, is:

\[ 1 + \dot t_s^2 + \dot x_s^2 + \dot z_s^2 = \tfrac12\big(e^{\sqrt2\,\omega x_s}\dot y_s + 2\dot t_s\big)^2. \tag{0} \]

The isometry group of Gödel’s universe is the five-dimensional Lie group generated by:
1) the translations $(t, x, y, z) \mapsto (t + t_0, x, y + y_0, z + z_0)$ of the linear (t, y, z) 3-subspace;
2) the hyperbolic dilatations $(t, x, y, z) \mapsto (t, x + x_0, y\, e^{-\sqrt2\,\omega x_0}, z)$;
3) the rotational symmetries $(u, r, \phi, z) \mapsto (u, r, \phi + \phi_0, z)$, in the new coordinates system $(u, r, \phi, z) \in \mathbb{R} \times \mathbb{R}_+ \times (\mathbb{R}/\tfrac{2\pi}{\omega}\mathbb{Z}) \times \mathbb{R}$ defined by $|t - u| < \pi/\omega$ and:

\[ e^{\sqrt2\,\omega x} = \cosh(2r) + \sinh(2r)\cos(\omega\phi); \qquad e^{\sqrt2\,\omega x}\,\omega y = \sinh(2r)\sin(\omega\phi); \]
\[ \mathrm{tg}\big[\tfrac{\omega}{2}(\phi + t - u)\big] = e^{-2r}\, \mathrm{tg}\big[\tfrac{\omega\phi}{2}\big]; \]

we have indeed:

\[ ds^2 = [du + 2\sinh^2 r\, d\phi]^2 - 2\omega^{-2} dr^2 - \tfrac12 \sinh^2(2r)\, d\phi^2 - dz^2. \]
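The inverse matrix and the two equivalent forms of the timelike condition stated after Definition 1 can be checked numerically. The following is a minimal sketch using NumPy; the function names are illustrative choices and do not come from the paper.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def godel_metric(x, omega):
    """Matrix ((g_ij)) of Definition 1 in coordinates (t, x, y, z)."""
    E = np.exp(SQRT2 * omega * x)
    return np.array([[1.0, 0.0, E, 0.0],
                     [0.0, -1.0, 0.0, 0.0],
                     [E, 0.0, 0.5 * E**2, 0.0],
                     [0.0, 0.0, 0.0, -1.0]])

def godel_inverse_metric(x, omega):
    """Matrix ((g^ij)) as displayed in the text."""
    E = np.exp(SQRT2 * omega * x)
    return np.array([[-1.0, 0.0, 2.0 / E, 0.0],
                     [0.0, -1.0, 0.0, 0.0],
                     [2.0 / E, 0.0, -2.0 / E**2, 0.0],
                     [0.0, 0.0, 0.0, -1.0]])

def pseudo_norm(v, x, omega):
    """g(v, v) for a tangent vector v = (tdot, xdot, ydot, zdot)."""
    v = np.asarray(v, dtype=float)
    return float(v @ godel_metric(x, omega) @ v)

def pseudo_norm_alt(v, x, omega):
    """Same quantity written as (1/2)(E*ydot + 2*tdot)^2 - tdot^2 - xdot^2 - zdot^2."""
    E = np.exp(SQRT2 * omega * x)
    td, xd, yd, zd = v
    return float(0.5 * (E * yd + 2.0 * td) ** 2 - td**2 - xd**2 - zd**2)
```

A vector is timelike exactly when either function returns a positive value; both expressions agree for every vector, which is the equivalence used in the text.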

Relativistic Diffusion in Gödel’s Universe


Gödel ([G1], Sect. 4) proved that these three types of isometries generate indeed the full isometry group. As the action of this group is clearly transitive on $\mathbb{R}^4$, Gödel’s universe is a homogeneous space-time. Letting ω go to 0, we recover Minkowski space-time as limit of Gödel’s universe. Some generalisations of Gödel’s pseudo-metric have been proposed since, which include a second parameter, see in particular [R-T]. Not to add computational complexity, we restrict here to the genuine Gödel pseudo-metric of Definition 1.

2.1. A group structure. Gödel’s universe can be viewed as a matrix group, hence a Lie group, by means of the following identification:

\[ G \ni \xi = (t, x, y, z) \equiv \begin{pmatrix} e^{-\sqrt2\,\omega x} & 0 & 0 & y \\ 0 & 1 & 0 & z \\ 0 & 0 & 1 & t \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \text{so that} \quad \xi^{-1} \equiv \begin{pmatrix} e^{\sqrt2\,\omega x} & 0 & 0 & -e^{\sqrt2\,\omega x}\, y \\ 0 & 1 & 0 & -z \\ 0 & 0 & 1 & -t \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

The Lie algebra $\mathcal{G}$ of G is generated by

\[ X := -\sqrt2\,\omega \begin{pmatrix} 1&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&0 \end{pmatrix}, \quad Y := \begin{pmatrix} 0&0&0&1\\0&0&0&0\\0&0&0&0\\0&0&0&0 \end{pmatrix}, \quad Z := \begin{pmatrix} 0&0&0&0\\0&0&0&1\\0&0&0&0\\0&0&0&0 \end{pmatrix}, \quad T := \begin{pmatrix} 0&0&0&0\\0&0&0&0\\0&0&0&1\\0&0&0&0 \end{pmatrix}. \]

The only non-trivial commutation relation is $[Y, X] = \sqrt2\,\omega\, Y$. For $(t, x, y, z) \in \mathbb{R}^4$ we have:

\[ \exp[tT + xX + yY + zZ] = \Big(t,\; x,\; \Big(\frac{1 - e^{-\sqrt2\,\omega x}}{\sqrt2\,\omega x}\Big) y,\; z\Big). \]

The left-invariant vector fields $L_A f(\xi) := \frac{d}{ds}\big|_{s=0} f(\xi e^{sA})$, $\forall\, \xi \in G$, $A \in \mathcal{G}$, $f \in C^1(G)$, are given by:

\[ (L_T, L_X, L_Y, L_Z) = \Big(\frac{\partial}{\partial t},\; \frac{\partial}{\partial x},\; e^{-\sqrt2\,\omega x}\frac{\partial}{\partial y},\; \frac{\partial}{\partial z}\Big). \]

Considering the Lorentz metric $g^0$ on $\mathcal{G}$, given in the basis (T, X, Y, Z) by:

\[ ((g^0_{ij})) = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & \tfrac12 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \]

we see that Gödel’s metric g happens to be the left-invariant metric generated on G by the Lorentz metric $g^0$ on $\mathcal{G}$: $\langle L_A, L_A\rangle_g = \langle A, A\rangle_{g^0}$ for any $A \in \mathcal{G}$. But g is not bi-invariant: the right-invariant vector field associated with the Lie derivative $\frac{d}{ds}\big|_{s=0} f(e^{sX}\xi)$ is $L'_X = \frac{\partial}{\partial x} - \sqrt2\,\omega\, y\,\frac{\partial}{\partial y}$, so that $\langle L'_X, L'_X\rangle_g = \omega^2 y^2 e^{2\sqrt2\,\omega x} - 1$ is not constant on G. All possible left-invariant metrics on G are considered in [L]. Among them, only one kind, of which Gödel’s metric is typical, happens to satisfy both important conditions: it is complete and it satisfies the weak energy condition (see Sect. 2.4 below).
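The matrix identification of Sect. 2.1 can be exercised numerically. The sketch below, with illustrative function names that are not taken from the paper, encodes a point ξ = (t, x, y, z) as a 4×4 matrix, decodes it back, and builds the generators T, X, Y, Z, so that the product law and the commutation relation [Y, X] = √2 ω Y can be verified.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def to_matrix(xi, omega):
    """Matrix representing xi = (t, x, y, z) in the identification of Sect. 2.1."""
    t, x, y, z = xi
    m = np.eye(4)
    m[0, 0] = np.exp(-SQRT2 * omega * x)
    m[0, 3] = y
    m[1, 3] = z
    m[2, 3] = t
    return m

def from_matrix(m, omega):
    """Recover (t, x, y, z) from a matrix of the above form."""
    x = -np.log(m[0, 0]) / (SQRT2 * omega)
    return (m[2, 3], x, m[0, 3], m[1, 3])

def product(xi0, xi, omega):
    """Group law written in coordinates: xi0 times xi."""
    t0, x0, y0, z0 = xi0
    t, x, y, z = xi
    return (t + t0, x + x0, y * np.exp(-SQRT2 * omega * x0) + y0, z + z0)

def generators(omega):
    """Generators (T, X, Y, Z) of the Lie algebra as 4x4 matrices."""
    X = np.zeros((4, 4)); X[0, 0] = -SQRT2 * omega
    Y = np.zeros((4, 4)); Y[0, 3] = 1.0
    Z = np.zeros((4, 4)); Z[1, 3] = 1.0
    T = np.zeros((4, 4)); T[2, 3] = 1.0
    return T, X, Y, Z
```

Multiplying `to_matrix(xi0, omega) @ to_matrix(xi, omega)` and decoding with `from_matrix` should agree with `product(xi0, xi, omega)`.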


J. Franchi

We have for any $\xi_0 = (t_0, x_0, y_0, z_0)$, $\xi = (t, x, y, z) \in G$:

\[ \xi_0 \times \xi = \big(t + t_0,\; x + x_0,\; y\, e^{-\sqrt2\,\omega x_0} + y_0,\; z + z_0\big). \]

Note that the two first types of isometries listed above are thus merely left translations on G. The third type is however not that simple. The map $\xi \mapsto (e^{-\sqrt2\,\omega x}, y, t, z)$ is a Lie group isomorphism from G onto $\mathrm{Aff}_+(\mathbb{R}) \times \mathbb{R}^2$, where $\mathrm{Aff}_+(\mathbb{R})$ denotes the group of increasing affine maps of $\mathbb{R}$. G is two-step solvable, but not nilpotent. It has left Haar measure $d\mu(\xi) = e^{\sqrt2\,\omega x}\, dt\, dx\, dy\, dz$. As $\int f(\xi\xi_0)\, d\mu(\xi) = e^{-\sqrt2\,\omega x_0} \int f\, d\mu$, G is not unimodular. It is not semi-simple: the only non-zero scalar product (in the basis (T, X, Y, Z)), with respect to its Killing metric $g_K(\cdot,\cdot) := \mathrm{Trace}(\mathrm{ad}(\cdot) \circ \mathrm{ad}(\cdot))$, is: $g_K(X, X) = 2\omega^2$.

2.2. Timelike geodesics. Geodesics are associated with the Lagrangian $L(\dot\xi, \xi)$, given by:

\[ L(\dot\xi_s, \xi_s) = \dot t_s^2 - \dot x_s^2 + \tfrac12 e^{2\sqrt2\,\omega x_s}\dot y_s^2 + 2 e^{\sqrt2\,\omega x_s}\dot t_s \dot y_s - \dot z_s^2. \]

The equation of geodesics $\frac{\partial}{\partial s}\frac{\partial L(\dot\xi_s, \xi_s)}{\partial \dot\xi^j_s} = \frac{\partial L(\dot\xi_s, \xi_s)}{\partial \xi^j_s}$ reads here:

\[ \dot t_s + e^{\sqrt2\,\omega x_s}\dot y_s = a; \tag{1} \]
\[ e^{2\sqrt2\,\omega x_s}\dot y_s + 2 e^{\sqrt2\,\omega x_s}\dot t_s = b; \tag{2} \]
\[ \dot z_s = c; \tag{3} \]
\[ \ddot x_s + (\omega/\sqrt2)\, e^{2\sqrt2\,\omega x_s}\dot y_s^2 + \sqrt2\,\omega\, e^{\sqrt2\,\omega x_s}\dot t_s \dot y_s = 0; \tag{4} \]

for constant a, b, c. Equations (1) and (2) jointly are equivalent to:

\[ \dot t_s = b\, e^{-\sqrt2\,\omega x_s} - a; \tag{1'} \]
\[ \dot y_s = 2a\, e^{-\sqrt2\,\omega x_s} - b\, e^{-2\sqrt2\,\omega x_s}; \tag{2'} \]

and then using Eqs. (1'), (2'), (3), we see that Eq. (0) is equivalent to:

\[ 1 + \big(b\, e^{-\sqrt2\,\omega x_s} - a\big)^2 + \dot x_s^2 + c^2 = \tfrac12 b^2 e^{-2\sqrt2\,\omega x_s}, \]

or equivalently:

\[ \dot x_s^2 + \tfrac12 \big(2a - b\, e^{-\sqrt2\,\omega x_s}\big)^2 = a^2 - c^2 - 1. \tag{0'} \]

Note that necessarily $a^2 \ge 1 + c^2$, and also $ab > 0$ (since by (0'), $ab \le 0$ would imply $a^2 - 1 \ge a^2 - c^2 - 1 - \dot x_s^2 = \tfrac12 [2a - b e^{-\sqrt2\,\omega x_s}]^2 \ge 2a^2$, which is clearly impossible).


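Then, owing to Eq. (2), Eq. (4) is equivalent to:

\[ \frac{\sqrt2}{\omega b}\,\dot x_s + y_s = Y, \quad \text{for some constant } Y. \tag{4'} \]

Set $k := \sqrt{\tfrac12 [1 - (1 + c^2)\, a^{-2}]} \in [0, \tfrac{1}{\sqrt2}[$. By (0'), we must have:

\[ \dot x_s = \sqrt2\, a\, k \cos(\omega\varphi_s), \qquad b\, e^{-\sqrt2\,\omega x_s} = 2a - 2a\, k \sin(\omega\varphi_s), \tag{5} \]

for some angular component $\varphi_s$; whence after derivation:

\[ \dot\varphi_s = b\, e^{-\sqrt2\,\omega x_s} = 2a\, [1 - k \sin(\omega\varphi_s)], \tag{5'} \]

and then:

\[ 2a\,\omega\,(s - s_0) = \int^{\varphi_s} \frac{\omega\, d\varphi}{1 - k \sin(\omega\varphi)} = \frac{2}{\sqrt{1 - k^2}}\, \mathrm{Arctg}\Big[\frac{\mathrm{tg}(\omega\varphi_s/2) - k}{\sqrt{1 - k^2}}\Big]. \]

Therefore

\[ \mathrm{tg}(\omega\varphi_s/2) = \sqrt{1 - k^2}\; \mathrm{tg}\big[a \sqrt{1 - k^2}\, \omega\, (s - s_0)\big] + k. \tag{6} \]

Moreover by (1'), (1) we have:

\[ \dot t_s = a - 2a\, k \sin(\omega\varphi_s) = \dot\varphi_s - a, \qquad e^{\sqrt2\,\omega x_s}\dot y_s + 2\dot t_s = \dot\varphi_s, \]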

so that we deduce now directly the following:

Proposition 1. The timelike geodesics of Gödel’s universe G are determined by the following equations:

\[ e^{-\sqrt2\,\omega x_s} = \frac{2a}{b}\,[1 - k \sin(\omega\varphi_s)], \qquad y_s = Y - \frac{2a}{\omega b}\, k \cos(\omega\varphi_s), \]
\[ t_s = T_0 - a\, s + \varphi_s, \qquad z_s = z_0 + c\, s, \]

the smooth angle $\varphi_s$ being determined by:

\[ \mathrm{tg}(\omega\varphi_s/2) = \sqrt{1 - k^2}\; \mathrm{tg}\big[a \sqrt{1 - k^2}\, \omega\, (s - s_0)\big] + k. \tag{6} \]

The real parameters $a, b, c, Y, T_0, z_0, s_0$ are constant along each geodesic, and such that $a^2 \ge 1 + c^2$, $ab > 0$, without any other constraint, and $k = \sqrt{\tfrac12 [1 - (1 + c^2)\, a^{-2}]} \in [0, \tfrac{1}{\sqrt2}[$. The projection in the (x, y)-plane is bounded and periodic, and satisfies:

\[ \Big[\frac{b}{2a}\, e^{-\sqrt2\,\omega x_s} - 1\Big]^2 + \Big[\frac{\omega b}{2a}\,(y_s - Y)\Big]^2 = k^2. \]

Such timelike geodesic is future-directed if and only if the non-vanishing speed

\[ e^{\sqrt2\,\omega x_s}\dot y_s + 2\dot t_s = \dot\varphi_s = 2a\, [1 - k \sin(\omega\varphi_s)] \tag{7} \]

is positive, hence if and only if $a > 0$.
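The closed form of Proposition 1 can be evaluated and checked against the first integrals (1), (2) and the unit pseudo-norm relation (0) by finite differences. This is only a sketch: the function name and the way the continuous branch of Arctg is selected (adding π times the nearest integer to θ/π) are choices made here, not taken from the paper.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def timelike_geodesic(s, a, b, c, omega, Y=0.0, T0=0.0, z0=0.0, s0=0.0):
    """Return (t, x, y, z) at proper time(s) s, following Proposition 1.

    Requires a**2 >= 1 + c**2 and a*b > 0.
    """
    s = np.asarray(s, dtype=float)
    k = np.sqrt(0.5 * (1.0 - (1.0 + c**2) / a**2))
    root = np.sqrt(1.0 - k**2)
    theta = a * root * omega * (s - s0)
    # Continuous determination of omega*phi/2 from Eq. (6):
    # arctan jumps by -pi at each pole of tan(theta), compensated by the rounding term.
    half_angle = np.arctan(root * np.tan(theta) + k) + np.pi * np.round(theta / np.pi)
    phi = 2.0 * half_angle / omega
    x = -np.log((2.0 * a / b) * (1.0 - k * np.sin(omega * phi))) / (SQRT2 * omega)
    y = Y - (2.0 * a / (omega * b)) * k * np.cos(omega * phi)
    t = T0 - a * s + phi
    z = z0 + c * s
    return t, x, y, z
```

Differentiating the returned coordinates numerically and inserting them into (1), (2) and (0) should give back a, b and 1 respectively, up to discretisation error.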


Remark 1. We deduce from Proposition 1 the following formula:

\[ t_s = T_0 - a\, s + \frac{2}{\omega}\, \mathrm{Arctg}\Big[\sqrt{1 - k^2}\; \mathrm{tg}\big[a \sqrt{1 - k^2}\, \omega\, (s - s_0)\big] + k\Big], \tag{8} \]

in which the successive determinations of Arctg, at the successive proper time values $s \in s_0 + \frac{\pi}{a\sqrt{1-k^2}\,\omega}\,(\tfrac12 + \mathbb{Z})$, are understood to be chosen conveniently, in order that the absolute time coordinate $(t_s)$ be continuous, as it must be. Observe that $(t_s)$ is strictly monotonic if and only if $k \le \tfrac12$, or equivalently, if and only if $(1 + c^2) \le a^2 \le 2(1 + c^2)$. Note that this last observation is related to the global non-causality of Gödel’s space-time: there are future-directed geodesic arcs along which absolute time $(t_s)$ decreases. For example, taking $a > 0$ and $k = 1/\sqrt3$, and running the proper time interval $\big[s_0,\; s_0 + \frac{\sqrt3\, \mathrm{Arctg}\sqrt2}{\sqrt2\, a\, \omega}\big]$, we see $\mathrm{tg}(\frac{\omega\varphi_s}{2})$ running the interval $[\frac{1}{\sqrt3}, \sqrt3]$, and then $\varphi_s$ running an interval $[\frac{\pi}{3\omega} + \frac{2n\pi}{\omega},\; \frac{2\pi}{3\omega} + \frac{2n\pi}{\omega}]$, so that

\[ t_{s_0 + \frac{\sqrt3\, \mathrm{Arctg}\sqrt2}{\sqrt2\, a\, \omega}} - t_{s_0} = \frac{1}{\omega}\Big[\frac{\pi}{3} - \frac{\sqrt3}{\sqrt2}\, \mathrm{Arctg}\sqrt2\Big] < 0. \]
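The numerical value of the time decrease in the example of Remark 1 can be computed directly. The helper below is an illustrative sketch (its name and return convention are assumptions): it evaluates the proper-time length of the arc from Eq. (6) and then the increment of t from t_s = T0 − a s + φ_s.

```python
import math

def absolute_time_increment(omega, a):
    """Increment of t along the arc of Remark 1 (k = 1/sqrt(3), a > 0).

    Returns (delta_s, delta_t): the proper-time length of the arc and the
    corresponding change of the absolute time coordinate.
    """
    k = 1.0 / math.sqrt(3.0)
    root = math.sqrt(1.0 - k * k)
    # tg(omega*phi/2) goes from 1/sqrt(3) to sqrt(3) while the argument of tg in (6)
    # goes from 0 to Arctg(sqrt(2)).
    delta_s = math.atan(math.sqrt(2.0)) / (a * root * omega)
    # omega*phi goes from pi/3 to 2*pi/3.
    delta_phi = (math.pi / 3.0) / omega
    delta_t = delta_phi - a * delta_s
    return delta_s, delta_t
```

For ω = 1 the increment is close to −0.1228, whatever the value of a > 0, since a cancels in the product a · Δs.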

Remark 2. The case $k = 0$ is particular. It implies (using Eqs. (0'), (1'), (2')): $\dot t^2 = 1 + \dot z^2$ and $\dot x = \dot y = 0$, and then: $(x_s, y_s)$ constant and $t_s = t_0 + a s$, $z_s = z_0 + c\, s$, with $a^2 = 1 + c^2$. Reciprocally, if $\dot x_0 = \dot y_0 = 0$, then (by Eqs. (2'), (0')) the corresponding geodesic must satisfy also $k = 0$, and then be included in the phase subspace defined by:

\[ E_0 := \{\dot x = \dot y = 0\} = \{\dot t^2 = 1 + \dot z^2;\; \dot x = \dot y = 0\}. \]

Therefore, the case $k = 0$ corresponds to the geodesically stable phase subspace $E_0$.

Remark 3. Every timelike geodesic is defined for all proper times s, unbounded and causal. Moreover, it never accumulates near its past.

Proof. This is clear for timelike geodesics such that $k = 0$ by Remark 2, so that we can restrict to a timelike geodesic such that $k > 0$. Then, if for proper times $s < s'$ we had $(x_{s'}, y_{s'}, t_{s'}) = (x_s, y_s, t_s)$, then by Proposition 1 we should have: $\mathrm{tg}\big[a\sqrt{1-k^2}\,\omega(s' - s_0)\big] = \mathrm{tg}\big[a\sqrt{1-k^2}\,\omega(s - s_0)\big]$, then $s' = s + \frac{\pi n}{|a|\,\omega\sqrt{1-k^2}}$ with $n \in \mathbb{N}^*$, whence by Eq. (8): $t_{s'} - t_s = \mathrm{sign}(a)\Big[\frac{2\pi n}{\omega} - \frac{\pi n}{\omega\sqrt{1-k^2}}\Big]$, and then the contradiction:

\[ 0 = \frac{n\pi}{\omega}\,\big(2 - (1 - k^2)^{-1/2}\big) > 0, \]

the right-hand side being positive since $k < 1/\sqrt2$. Similarly, if we had proper times $s < s'$ with difference $s' - s$ bounded away from 0, such that $(x_{s'}, y_{s'}, t_{s'})$ be arbitrarily close to $(x_s, y_s, t_s)$, then by Proposition 1 we should have $\mathrm{tg}\big[a\sqrt{1-k^2}\,\omega(s' - s_0)\big]$ arbitrarily close to $\mathrm{tg}\big[a\sqrt{1-k^2}\,\omega(s - s_0)\big]$, whence $s' - s$ arbitrarily close to $\frac{\pi}{|a|\,\omega\sqrt{1-k^2}}\,\mathbb{N}^*$, whence by Eq. (8), 0 arbitrarily close to $\Big[\frac{2\pi}{\omega} - \frac{\pi}{\omega\sqrt{1-k^2}}\Big]\mathbb{Z}^*$, which yields the same contradiction as above.


The following statement, which will be used later to ensure the irreducibility of the relativistic diffusion, shows up the non-causal structure of Gödel’s universe, despite the preceding Remark 3: the causal past of any point of Gödel’s universe is the whole Gödel’s universe. In particular, the causal boundary, in the sense of Penrose (or Geroch-Kronheimer-Penrose, see ([H-E], Sect. 6.8)), reduces to a single point. Proposition 2. Gödel’s universe G is piece-wise geodesically transitive: any two points of it can be linked by a piece-wise geodesic future-directed timelike continuous path. Proof. This result derives from the two following references: – Property (6) of [G1] shows up future-directed timelike curves, whose spacial projection draws a circle and along which absolute time decreases, which entails in fact (though this is not explained in [G1]) the transitivity of space-time under futuredirected timelike curves; – [P], according to which a point p of any given space-time can be linked to another point p by a future-directed timelike curve if and only if p can be linked to p by a piece-wise geodesic future-directed timelike path. However, for the sake of completeness, let us deduce the claim directly from the above. √ Using first Remark 2, by means of a single geodesic arc, with a = 2 and c = ±1, we reduce the proof to the case of points p, p (to be linked) having the same coordinate z. Fixing from now on c = 0, we can forget the coordinate z. By homogeneity of the space-time, we can then consider that p = (t, 0, 0, 0) and p = (0, x, y, 0), for arbitrary given (t, x, y) ∈ R3 . Let us use Proposition 1, considering the three following examples of future-directed geodesic moves: √ π 1) Using Remark 1, i.e. taking k = √1 , a = 3, ϕs = 3ω , ϕs = 2π 3ω , we get: 3

x s = x s , ys − ys =

2√ , ωb 3

ts − ts =

1 ω

π 3

−

√

√ 3 Arctg 2 √ 2

< 0.

Hence, moving so several times, we can decrease the absolute time t arbitrarily largely, without changing the x coordinate, and, choosing b large, almost without changing the y 6|x| 1 coordinate. We can thus suppose now that p = (t, 0, y0 , 0), with t < − log 3 1 − √ − 3 π and |y | < 1. 0 ω √ π 3π π 2) Taking k = 21 , a = 2, and (ϕs , ϕs ) = ( 2ω , 2ω ) or ( −π 2ω , 2ω ), we get: xs − xs = ± log√3 , ys = ys , ts − ts = πω 1 − √1 . ω 2

Hence, moving so approximately

3

|x|

log 3 |x| p = (t , x , y0 , 0), with t < − log 3 1−

√

ω 2 times, we can suppose now that √1 − πω and 0 < x − x < log√3 . Movω 2

3

π , a large b, and a convenient ing once more in a similar way, i.e. taking k = 21 , ϕs = 2ω π 3π ϕs ∈] 2ω , 2ω [, we link p to p1 = (t1 , x, y1 , 0), with t1 < − πω and |y1 | < 2. √ 3) Taking k = 21 , a = 2, and (ϕs , ϕs ) = (0, πω ) or ( πω , 2π ω ), we get: √ 2 xs = xs , ys − ys = ± 2ωb2 , ts − ts = πω 1 − √ . 3 3


Hence, moving so once, and choosing b conveniently, we link p1 to p2 = (t2 , x, y, 0), with t2 < 0. Finally, observe from Remark 2 that (taking k = c = 0) there are future-directed timelike geodesics increasing at will the coordinate t, without changing any other coordinate: this allows to link p2 to p . 2.3. Lightlike geodesics and boundary of G. Equations (1), (2), (3), (4) remain the same, while the pseudo-norm Eq. (0) is replaced by: √ 2 t˙s2 + x˙s2 + z˙ s2 = 21 e 2 ω xs y˙s + 2 t˙s . (0

) As previously, knowing Eqs. (1) and (2), Eq. (4) is equivalent to √ 2 x˙ + y = Y, for some constant Y, (4 ) ωb and is implied by Eqs. (0

), (1), (2), (3). Thus lightlike geodesics are the solutions to the system: √

t˙s = b e− √

y˙s = 2 a e−

2 ω xs

2 ω xs

(1 )

− a;

− b e−2

√

2 ω xs

(2 )

;

z˙ s = c; and x˙s2 +

1 2

(3)

2 √ 2 a − b e− 2 ω xs = a 2 − c2 .

(0

)

Hence we must have again a 2 ≥ c2 , and ab > 0 (if not, the trajectory must be constant), and, setting now κ := 21 (1 − c2 /a 2 ) ∈ [0, √1 ], 2

we get the same equations for the geodesic motion as in Proposition 1, merely with κ replacing k. As the parameter s cannot here any longer have a meaning like proper time, but stands only for an affine parameter, determined up to a change s → us +v, the constant (a, b, c) is now irrelevant. Note that on the contrary, the constant Y of Eq. (4 ) is relevant. The only meaningful a priori parameter for a lightlike geodesic is the “impact” parameter: c b B = ( , , Y ) := , , Y ∈ B := [−1, 1] × R∗+ × R. (9) a a Eliminating s, we see indeed that a lightlike geodesic having impact parameter B solves: √

dt = ( e−

2ωx

√

√

√

− 1) dz; (2 − e− 2 ω x ) e− 2 ω x dt = ( e− 2 ω x − 1) dy; √ √ ( e− 2 ω x − 1) d x = ± 1 − 2 − 21 (2 − e− 2 ω x )2 dt.

Let us sum up the description of lightlike geodesics in the following statement.


Proposition 3. Any lightlike geodesic (xτ , yτ , z τ , tτ ) having impact parameter B = ( , , Y ) ∈ B satisfies, for an additional parameter (Z 0 , T0 ) ∈ R2 and for any real τ : ⎛ √

⎞ √ √ 2 2 2 2 1− 1 + tg τ + 1 − √ 2 ⎜ ⎟ e− 2 ω xτ = × ⎝1 − ⎠;

2 √ √

2+ 1 + 2 tg τ + 1 − 2 yτ = Y +

⎛ 2(1 − 2 ) ω

⎞

⎜ ⎝1 − 2+

zτ = Z 0 +

tτ = T0 −

τ

ω (1 + 2 )/2

⎟

2 ⎠; √ 1 + 2 tg τ + 1 − 2

τ

ω (1 + 2 )/2

2 Arctg ω

4

;

(1 + 2 )/2 tg τ +

(1 − 2 )/2 ;

2 ω

2 1 − 2 (yτ − Y ) = . −1 + 2 2 2 Remark 4. The last equation in Proposition 3 shows that to any given lightlike geodesic is associated a cylinder C B , parallel to the (t, z)-coordinate plane. Reciprocally, by Proposition 3 again, any lightlike geodesic which is drawn on the cylinder C B has a prescribed projection on the (x, y)-coordinate plane (up to changing affine parameter τ ). The equations displayed in Proposition 3 define a lightlike geodesic associated to any given B ∈ B. The successive determinations of Arctg in the above expression of tτ are understood as in Remark 1, to be consistent with the continuity of t· Considering then any continuous angular parameter ϕ = ϕτ (determined modulo 2π/ω) such that tg (ω ϕτ /2) = (1 + 2 )/2 tg τ + (1 − 2 )/2, CB :

+

√

√

e−

2 ω xτ

by Proposition 3 we have: √

e−

2 ω xτ

= 2 − 2 (1 − 2 ) sin(ω ϕτ ) − 2 (1 − 2 ) cos(ω ϕτ ),

and

ω yτ = ω Y

together with: tτ = T0 −

ω

√

τ (1+ 2 )/2

+ ϕτ , z τ + tτ = Z 0 + T0 + ϕτ .

Since the function τ → ωϕτ − 2τ is π -periodic, the functions tτ τ → ω tτ − 2 (1 − [2(1 + 2 )]−1/2 ) τ and τ → z τ − 2(1 + 2 ) − 1 √0 +ϕτ −2τ/ω) = Z 0 − (T 2 2(1+ ) −1


are π -periodic too. This implies that tτ wanders out to infinity, nearly linearly, and that the projection on the (t, z)-coordinate plane of any lightlike geodesic has an asymptotic direction: lim

τ →±∞

zτ . = tτ 2(1 + 2 ) − 1

This prescribes geometrically the sign of parameter , which was not determined by the cylinder C B alone, which however determines (| |, , Y ). Note that the impact parameter B = ( , , Y ) ∈ B has thus indeed a clear geometrical meaning or picture: it can be identified with the oriented cylinder (C B = C B (| |, , Y ), sign( )). Note finally that the additional parameter (Z 0 , T0 ) depends on a translation on the parameter (τ, ϕτ ), and then, contrary to B = ( , , Y ), is geometrically irrelevant. Recall that in a strongly causal space-time, it seems natural to use the causal boundary, in the sense of Penrose, to classify lightlike geodesics by gathering in an equivalence class, called a beam, all geodesics which converge to a given causal boundary point (having asymptotically the same past, see ([H-E], Sect. 6.8)). On the contrary, in the present setting (recall Proposition 2) such classification is totally inoperative. It seems that no alternative classification has been proposed so far, which is relevant in a non-causal setting. Now, owing to the above Remark 4, we are led to adopt here the following alternative classification of lightlike geodesics into beams, and then also, to see the 3-dimensional space of beams as an alternative notion of (non-causal) boundary, as follows: Definition 2. Let us call beam, or boundary point, of Gödel’s universe, any equivalence class of lightlike geodesics, identifying those which have the same impact parameter B = ( , , Y ) ∈ B. Thus B = [−1, 1] × R∗+ × R is the boundary of Gödel’s universe. Let us say that a curve s → ξs = (ts , xs , ys , z s ) of class C 1 in Gödel’s universe converges to the beam B = ( , , Y ) if, setting: as := t˙s + e

√

2 ω xs

y˙s , bs := e

√

2 ω xs

(2 t˙s + e

√

2 ω xs

y˙s ), and Ys :=

√ 2 x˙s + ys , ω bs

the following convergences hold, as s → +∞: z˙ s −→ , as

√ 2 ω

2 bs 1− 2 e− 2 ω xs −1 + (ys −Y ) −→ . −→ , Ys −→ Y, as 2 2 2

Recall from Remark 4 the following picture of beams (or impact parameters): any beam B = ( , , Y ) ∈ B can be identified with the oriented cylinder (C B = C B (| |, , Y ), sign( )). The notion of convergence to the boundary B can be reinforced to a certain extent: see Corollary 7 and Remark 13, in the last Sect. 3.8. We saw above that any lightlike geodesic belonging to a beam B converges to it. On the contrary, a timelike geodesic does not converge to any beam: by Proposition 1, we get indeed B = ( ac , ab , Y ) ∈ B, but the cylinder C B has too small a “radius”, since we have k 2 = [1− 2 −a −2 ]/2 < [1− 2 ]/2.


Proposition 4. The isometry group of Gödel’s universe (recall Sect. 2) operates on the boundary B, so that the above Definition 2 is consistent. It acts precisely as follows: (1) The translation (t, x, y, z) → (t + t0 , x, y + y0 , z + z 0 ) changes ( , , Y ) into ( , , Y + y0 ). √ (2) The hyperbolic dilatation (t, x, y, z) → (t, x + x0 , y e− 2 ω x0 , z) changes ( , , Y ) into ( , e

√

2 ω x0

√

, Y e−

2 ω x0

).

(3) The rotational symmetry (u, r, φ, z) → (u, r, φ + φ0 , z) changes ( , , Y ) into , α + [ − α] cos(ω φ0 ) − ω Y sin(ω φ0 ), ω cos(ω φ0 ) + [ − α] sin(ω φ0 ) , ω [α + [ − α] cos(ω φ0 ) − ω sin(ω φ0 )] 2a cosh (2r )−sinh 2 (2r ) φ˙ is constant under φ → φ + φ0 , and on each geodesic. u+2 ˙ sinh 2 r φ˙ 2

1+ 2 2 2 We have indeed: α = 2 (1+ω2 Y 2 )+ 1+

, or equivalently: −α = 2 (1−ω Y )− .

where α :=

Proof. The two first items are straightforward. On the other hand, the action of the rotational symmetry (u, r, φ, z) → (u, r, φ + φ0 , z) is not so obvious. However, a com˙ and: putation shows that we have in coordinates (u, r, φ, z): a = u˙ + 2(sinh r )2 φ, b = A+ cos(ω φ) − 2ω−1r˙ sin(ω φ),

Z := ω b Y = sin(ω φ)+2ω−1 r˙ cos(ω φ),

with A := 2a cosh (2r ) − sinh 2 (2r ) φ˙

and

˙ sinh (2r ).

:= [2a − cosh (2r ) φ]

∂L is seen to be constant on each ∂ φ˙ geodesic, by looking at the expression of the Lagrangian L in coordinates (u, r,√φ, z). Alternatively, a computation yields: 2 A = b + ω2 b (2Y y − y 2 ) + (4a − b e− 2 ω x ) √ − 2 ω x , whence by using Proposition 3: e 2 4 2(1− )2 2 2 2A 2α − = a − = ω Y − ω2 2 1 − 2 + tg 2 ( ωϕ2 τ ) ⎤ ⎡ √ 2 4 1 − 2 tg 2 ( ωϕ2 τ ) 1⎣ ⎦ + 4−

2 + tg 2 ( ωϕ2 τ )

˙ sinh2 r = 2a + 2 Note that A = 2a + 4[a − cosh2 r φ]

= ω2 Y 2 +

(1 − 2 ) 4 −2 , whence

α=

1 + 2 (1 + ω2 Y 2 ) + . 2

Now we have at once: under φ → φ + φ0 , (a, b, Z ) is changed into (a, A + [b − A] cos(ω φ0 ) − Z sin(ω φ0 ), Z cos(ω φ0 ) + [b − A] sin(ω φ0 )),


so that ( , , Y ) is changed into (recall from the above that α = A/a): , aA + [ − aA ] cos(ω φ0 ) − ω Y sin(ω φ0 ), ω cos(ω φ0 ) + [ − aA ] sin(ω φ0 ) . ω [ aA + [ − aA ] cos(ω φ0 ) − ω sin(ω φ0 )] 2.4. Ricci curvature and energy tensor. Recall that the Christoffel symbols are computed by: ∂gi j ∂gi k 1 k ∂g j , i j = 2 g + − ∂ξ i ∂ξ j ∂ξ or by the fact that geodesics solve ξ¨ k + ikj ξ˙ i ξ˙ j = 0, and that the Ricci tensor (Ri j ) is computed by: Ri j = From Eqs. t xy

(1 ),

∂ikj ∂ξ k

−

k ∂ik k k + k i j − i jk . ∂ξ j

(2 ),

(3), we find all non-vanishing Christoffel coefficients: √ √ √ √ √ x x = ty = (ω/ 2 ) e 2 ω x , tt x = 2 ω, yy = (ω/ 2 ) e2 2 ω x , √ √ y t x = − 2 ω e − 2 ω x .

Therefore we get all non-vanishing Ricci coefficients: Rtt = 2ω2 ; Rty = Ryt = 2ω2 e Hence, the scalar curvature is Einstein equations Ri j −

√

2ωx

; Ryy = 2ω2 e2

√

2ωx

.

R = g i j Ri j = 2ω2 . 1 2

R gi j + gi j = Ti j

are satisfied, with cosmological constant = ω2 representing a positive pressure, and √ √ √ 2 energy tensor (Ti j ) = (Ri j ) = (u i u j ), where u := ( 2 ω, 0, 2 ω e ω x , 0) represents the four-velocity of matter, which rotates with constant velocity ω. The energy is thus: 2 √ E(ξ, ξ˙ ) := Ti j (ξ ) ξ˙ i ξ˙ j = 2ω2 t˙ + e 2 ω x y˙ = 2ω2 a(ξ, ξ˙ )2 . In particular, the weak energy condition of [H-E] holds: E(ξ, ξ˙ ) ≥ 0 for any timelike vector ξ˙ = (ξ˙ i ). The dominant energy condition of [H-E] means that moreover the vector (T i j (ξ )ξ˙ j ) must be non-spacelike, which is equivalent to g k (ξ )Tki (ξ )T j (ξ )ξ˙ i ξ˙ j ≥ 0. It holds here too, since we get: g k (ξ )Tki (ξ )T j (ξ )ξ˙ i ξ˙ j = g k (ξ )u k u × (u i ξ˙ i )2 = 4ω4 a(ξ, ξ˙ )2 . Similarly, the strong energy condition of [H-E]: E(ξ, ξ˙ ) ≥ 21 g k (ξ )Tk (ξ )gi j (ξ )ξ˙ i ξ˙ j , √ is satisfied too, since it is here equivalent to: 2 a(ξ, ξ˙ )2 ≥ t˙2 − x˙ 2 + 21 e2 2 ω x y˙ 2 + √

2 e 2 ω x t˙y˙ − z˙ 2 , or also to: t˙2 + 27 e2 clearly for any timelike ξ˙ ∈ Tξ G.

√

2ωx

y˙ 2 + 2 e

√

2ωx

t˙y˙ + x˙ 2 + z˙ 2 ≥ 0, which holds


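3. Relativistic Diffusion $(\xi_s, \dot\xi_s)$ on G

Recall from [F-LJ] that the general expression of the relativistic operator $\mathcal{L}$ is:

\[ \mathcal{L} = \dot\xi^k \frac{\partial}{\partial \xi^k} + \Big(\frac{3\sigma^2}{2}\,\dot\xi^k - \dot\xi^i \dot\xi^j\, \Gamma^k_{ij}(\xi)\Big) \frac{\partial}{\partial \dot\xi^k} + \frac{\sigma^2}{2}\,\big(\dot\xi^k \dot\xi^l - g^{kl}(\xi)\big) \frac{\partial^2}{\partial \dot\xi^k\, \partial \dot\xi^l}, \]

σ being an arbitrary fixed strictly positive (speed or heat) parameter. Equivalently in the present setting, the relativistic diffusion $(\xi_s, \dot\xi_s)$, in coordinates $\xi = (t, x, y, z)$, solves the following system of stochastic differential equations:

\[ dt_s = \dot t_s\, ds; \quad dx_s = \dot x_s\, ds; \quad dy_s = \dot y_s\, ds; \quad dz_s = \dot z_s\, ds; \]
\[ d\dot t_s = -2\sqrt2\,\omega\, \dot t_s \dot x_s\, ds - \sqrt2\,\omega\, e^{\sqrt2\,\omega x_s} \dot x_s \dot y_s\, ds + \tfrac{3\sigma^2}{2}\, \dot t_s\, ds + \sigma\, dM^t_s; \]
\[ d\dot x_s = -\sqrt2\,\omega\, e^{\sqrt2\,\omega x_s} \dot t_s \dot y_s\, ds - (\omega/\sqrt2)\, e^{2\sqrt2\,\omega x_s} \dot y_s^2\, ds + \tfrac{3\sigma^2}{2}\, \dot x_s\, ds + \sigma\, dM^x_s; \]
\[ d\dot y_s = 2\sqrt2\,\omega\, e^{-\sqrt2\,\omega x_s} \dot t_s \dot x_s\, ds + \tfrac{3\sigma^2}{2}\, \dot y_s\, ds + \sigma\, dM^y_s; \]
\[ d\dot z_s = \tfrac{3\sigma^2}{2}\, \dot z_s\, ds + \sigma\, dM^z_s; \]

where the $\mathbb{R}^4$-valued martingale $M_s := (M^t_s, M^x_s, M^y_s, M^z_s)$ has (rank 3) quadratic covariation matrix:

\[ ((K^{ij}_s)) := \frac{\langle dM^i_s, dM^j_s\rangle}{ds} = \begin{pmatrix} \dot t_s^2 + 1 & \dot t_s \dot x_s & \dot t_s \dot y_s - 2 e^{-\sqrt2\,\omega x_s} & \dot t_s \dot z_s \\ \dot t_s \dot x_s & \dot x_s^2 + 1 & \dot x_s \dot y_s & \dot x_s \dot z_s \\ \dot t_s \dot y_s - 2 e^{-\sqrt2\,\omega x_s} & \dot x_s \dot y_s & \dot y_s^2 + 2 e^{-2\sqrt2\,\omega x_s} & \dot y_s \dot z_s \\ \dot t_s \dot z_s & \dot x_s \dot z_s & \dot y_s \dot z_s & \dot z_s^2 + 1 \end{pmatrix}. \]

Recall that the unit pseudo-norm relation reads:

\[ 1 + \dot t_s^2 + \dot x_s^2 + \dot z_s^2 = \tfrac12 \big(e^{\sqrt2\,\omega x_s} \dot y_s + 2\dot t_s\big)^2. \tag{0} \]

Thus, the relativistic diffusion $(\xi_s, \dot\xi_s)$ is 7-dimensional, having phase space:

\[ E := \Big\{(t, x, y, z, \dot t, \dot x, \dot y, \dot z) \in \mathbb{R}^8 \;\Big|\; 1 + \dot t^2 + \dot x^2 + \dot z^2 = \tfrac12 \big(e^{\sqrt2\,\omega x} \dot y + 2\dot t\big)^2\Big\}, \]

or equivalently:

\[ E = \Big\{(t, x, y, z, \dot t, \dot x, \dot y, \dot z) \in \mathbb{R}^8 \;\Big|\; 1 + \dot x^2 + \dot z^2 + \tfrac12 e^{2\sqrt2\,\omega x} \dot y^2 = \big(\dot t + e^{\sqrt2\,\omega x} \dot y\big)^2\Big\}. \]

Note that the particular phase subspace distinguished in Remark 2: $E_0 := E \cap \{\dot x = \dot y = 0\} = E \cap \{\dot t^2 = 1 + \dot z^2;\; \dot x = \dot y = 0\}$ is clearly not stable under the relativistic diffusion $(\xi_s, \dot\xi_s)$, contrary to the geodesic flow, and even instantly unstable: starting from any point in $E_0$, its exit time from $E_0$ is null.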
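The covariation matrix above has the form K = ξ̇ ξ̇ᵀ − g⁻¹, where g⁻¹ denotes the inverse pseudo-metric, and its rank 3 on the phase space E can be checked numerically. The sketch below uses illustrative helper names and picks the branch a > 0 when solving the unit pseudo-norm relation for ṫ; both are assumptions made for this example.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def metric(x, omega):
    """Matrix ((g_ij)) of Gödel's pseudo-metric in coordinates (t, x, y, z)."""
    E = np.exp(SQRT2 * omega * x)
    return np.array([[1.0, 0.0, E, 0.0],
                     [0.0, -1.0, 0.0, 0.0],
                     [E, 0.0, 0.5 * E**2, 0.0],
                     [0.0, 0.0, 0.0, -1.0]])

def inverse_metric(x, omega):
    """Matrix ((g^ij))."""
    E = np.exp(SQRT2 * omega * x)
    return np.array([[-1.0, 0.0, 2.0 / E, 0.0],
                     [0.0, -1.0, 0.0, 0.0],
                     [2.0 / E, 0.0, -2.0 / E**2, 0.0],
                     [0.0, 0.0, 0.0, -1.0]])

def unit_velocity(x, xd, yd, zd, omega):
    """Velocity (tdot, xdot, ydot, zdot) on the phase space E, branch a > 0."""
    E = np.exp(SQRT2 * omega * x)
    a = np.sqrt(1.0 + xd**2 + zd**2 + 0.5 * E**2 * yd**2)
    return np.array([a - E * yd, xd, yd, zd])

def covariation_matrix(v, x, omega):
    """K^{ij} = v^i v^j - g^{ij}, the matrix displayed in the text."""
    return np.outer(v, v) - inverse_metric(x, omega)
```

On E the covector g ξ̇ is annihilated by K, which accounts for the rank being 3 rather than 4.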


Remark 5. On one hand, it is convenient in the Riemannian case (see for example [N-R-W]) to get the Brownian motion on a symmetric space by means of a left Brownian motion on a covering Lie group. On the other hand, Theorem 3.2(ii) of [F-LJ] states that the relativistic diffusion on a Lorentz manifold is obtained by development (from a fixed tangent space) of its Minkowskian analogue, the Minkowskian relativistic diffusion being essentially an integrated hyperbolic Brownian motion (living however on the tangent bundle). Thus, in the case of G, Sect. 2.1 could suggest that developing such an integrated hyperbolic Brownian motion would yield a more invariant and tractable presentation of the relativistic diffusion. However, on one hand Gödel’s metric is not the Killing one (which is degenerate), and solving explicitly the equation of parallel transport is not easy here. And on the other hand, the group structure of G does not seem to lift to a natural group structure on or over the pseudo-unit tangent bundle $T^1G$, and therefore does not seem to allow a more invariant presentation of the relativistic diffusion (this latter exists only at the level of the tangent bundle). Even geodesics of G through the unit are not simply deduced from geodesics of $\mathcal{G}$ (which are straight lines), despite the Gavrilov equation $\dot\xi_s = \mathrm{Jac}_1 L_{\xi_s} \times (g^0)^{-1} \times \mathrm{Jac}_1 \mathrm{Ad}(\xi_s) \times g^0 \times \dot\xi_0$ appearing in ([L], (6)) (which does not simplify the solution in Sect. 2.2 above).

3.1. Reduction of the dimension. The study of geodesics induces consideration of the following quantities (which, as $\dot z_s$, are constant along each geodesic), setting (as in Definition 2):

\[ a_s := \dot t_s + e^{\sqrt2\,\omega x_s} \dot y_s \quad \text{and} \quad b_s := e^{\sqrt2\,\omega x_s} \big(2\dot t_s + e^{\sqrt2\,\omega x_s} \dot y_s\big). \tag{10} \]

Recall from Sect. 2.4 that $a_s^2$ is an energy. Then we have:

\[ da_s = \tfrac{3\sigma^2}{2}\, a_s\, ds + \sigma\, dM^a_s = \tfrac{3\sigma^2}{2}\, a_s\, ds + \sigma\, \big(dM^t_s + e^{\sqrt2\,\omega x_s}\, dM^y_s\big); \]

and

\[ db_s = \tfrac{3\sigma^2}{2}\, b_s\, ds + \sigma\, dM^b_s = \tfrac{3\sigma^2}{2}\, b_s\, ds + \sigma\, e^{\sqrt2\,\omega x_s} \big(2\, dM^t_s + e^{\sqrt2\,\omega x_s}\, dM^y_s\big). \]

Moreover we have:

\[ d\dot x_s = (\omega/\sqrt2)\, e^{-2\sqrt2\,\omega x_s}\, b_s^2\, ds - \sqrt2\,\omega\, e^{-\sqrt2\,\omega x_s}\, a_s b_s\, ds + \tfrac{3\sigma^2}{2}\, \dot x_s\, ds + \sigma\, dM^x_s, \]

and the $\mathbb{R}^4$-valued martingale $\tilde M_s := (M^a_s, M^b_s, M^x_s, M^z_s)$ has (rank 3) quadratic covariation matrix:

\[ ((\tilde K^{ij}_s)) = \begin{pmatrix} a_s^2 - 1 & a_s b_s - 2 e^{\sqrt2\,\omega x_s} & a_s \dot x_s & a_s \dot z_s \\ a_s b_s - 2 e^{\sqrt2\,\omega x_s} & b_s^2 - 2 e^{2\sqrt2\,\omega x_s} & b_s \dot x_s & b_s \dot z_s \\ a_s \dot x_s & b_s \dot x_s & \dot x_s^2 + 1 & \dot x_s \dot z_s \\ a_s \dot z_s & b_s \dot z_s & \dot x_s \dot z_s & \dot z_s^2 + 1 \end{pmatrix}. \]

From this, we deduce the following, which will allow, in the following sections, the asymptotic study (as proper time s goes to infinity) of relativistic paths.

Corollary 1. The (7-dimensional) relativistic diffusion $(\xi_s, \dot\xi_s)$ admits the following sub-diffusions: $(a_s)$; $(\dot z_s)$; $(a_s, \dot z_s)$; $(x_s, \dot x_s, a_s, b_s)$.
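The covariation matrix of (M^a, M^b, M^x, M^z) follows from that of (M^t, M^x, M^y, M^z) by the linear change of variables read off from the expressions of dM^a and dM^b. The sketch below (function names are illustrative, not from the paper) computes J K Jᵀ, where the rows of J express (dM^a, dM^b, dM^x, dM^z) in terms of (dM^t, dM^x, dM^y, dM^z), and compares it with the closed form displayed above.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def reduced_covariation_by_change_of_variables(v, x, omega):
    """J K J^T with K = v v^T - g^{-1} and J the matrix taking dM to dM-tilde."""
    E = np.exp(SQRT2 * omega * x)
    g_inv = np.array([[-1.0, 0.0, 2.0 / E, 0.0],
                      [0.0, -1.0, 0.0, 0.0],
                      [2.0 / E, 0.0, -2.0 / E**2, 0.0],
                      [0.0, 0.0, 0.0, -1.0]])
    K = np.outer(v, v) - g_inv
    J = np.array([[1.0, 0.0, E, 0.0],          # dM^a = dM^t + E dM^y
                  [2.0 * E, 0.0, E**2, 0.0],   # dM^b = E (2 dM^t + E dM^y)
                  [0.0, 1.0, 0.0, 0.0],        # dM^x
                  [0.0, 0.0, 0.0, 1.0]])       # dM^z
    return J @ K @ J.T

def reduced_covariation_closed_form(v, x, omega):
    """The matrix K-tilde as displayed in Sect. 3.1."""
    E = np.exp(SQRT2 * omega * x)
    td, xd, yd, zd = v
    a = td + E * yd
    b = E * (2.0 * td + E * yd)
    return np.array([[a**2 - 1.0, a * b - 2.0 * E, a * xd, a * zd],
                     [a * b - 2.0 * E, b**2 - 2.0 * E**2, b * xd, b * zd],
                     [a * xd, b * xd, xd**2 + 1.0, xd * zd],
                     [a * zd, b * zd, xd * zd, zd**2 + 1.0]])
```

The two functions agree for any velocity vector, since the identity is purely algebraic and does not use the unit pseudo-norm relation.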


The unit pseudo-norm relation can be written:

\[ 1 + \dot x_s^2 + \dot z_s^2 + \big(a_s - e^{-\sqrt2\,\omega x_s}\, b_s\big)^2 = \tfrac12\, e^{-2\sqrt2\,\omega x_s}\, b_s^2, \tag{00} \]

or equivalently:

\[ 1 + \dot x_s^2 + \dot z_s^2 + \tfrac12 \big(2 a_s - e^{-\sqrt2\,\omega x_s}\, b_s\big)^2 = a_s^2. \tag{00'} \]

Hence the phase space E of the relativistic diffusion $(\xi_s, \dot\xi_s)$ can be written equivalently:

\[ E = \Big\{(t, x, y, z, a, b, \dot x, \dot z) \in \mathbb{R}^8 \;\Big|\; 1 + \dot x^2 + \dot z^2 + \tfrac12 \big(2a - e^{-\sqrt2\,\omega x}\, b\big)^2 = a^2\Big\}. \]

And the particular phase subspace $E_0$ distinguished in Remark 2 can be written:

\[ E_0 = E \cap \big\{a^2 = 1 + \dot z^2;\; 2a = e^{-\sqrt2\,\omega x}\, b;\; \dot x = 0\big\} = E \cap \big\{a^2 = 1 + \dot z^2\big\}. \]

Remark 6. We see in particular that $a_s^2 \ge 1$ and that $b_s^2 \ge 2\, e^{2\sqrt2\,\omega x_s}$, for any proper time $s \ge 0$. Therefore, $(a_s)$ and $(b_s)$ almost surely never vanish. Moreover, they must have the same sign, since (00') implies $\big|e^{-\sqrt2\,\omega x_s}\, \frac{b_s}{a_s} - 2\big| \le \sqrt2$ and then $e^{-\sqrt2\,\omega x_s}\, \frac{b_s}{a_s} \ge 2 - \sqrt2$. This implies also $\big|e^{\sqrt2\,\omega x_s}\, \frac{a_s}{b_s} - 1\big| \le 1/\sqrt2$.

Remark 7. The phase space E splits into two connected components: $E = E^+ \sqcup E^-$, with $E^+ := E \cap \{a \ge 1, b > 0\}$ and $E^- := E \cap \{a \le -1, b < 0\}$. Similarly, $E_0 = E_0^+ \sqcup E_0^-$, with $E_0^+ := E_0 \cap E^+$ and $E_0^- := E_0 \cap E^-$. Note that since $2\dot t_s + e^{\sqrt2\,\omega x_s} \dot y_s = e^{-\sqrt2\,\omega x_s}\, b_s$, the paths in $E^+$ are always future-directed. Since the symmetry $(a, b) \mapsto (-a, -b)$ exchanges $(E^+, E_0^+)$ and $(E^-, E_0^-)$, from now on, we can restrict the phase space of the relativistic diffusion $(\xi_s, \dot\xi_s)$ to $E^+$ (its behaviour on $E^-$ being trivially related).

3.2. Sub-processes $(\lambda_s)$ and $(\varphi_s)$. The unit pseudo-norm relation (00') and the determination of timelike geodesics (recall Sect. 2.2) induce to consider two real sub-processes $(\lambda_s)$ and $(\varphi_s)$, defined by:

\[ \lambda_s := \mathrm{argch} \sqrt{a_s^2 - \dot z_s^2}, \qquad \dot x_s = \sinh(\lambda_s) \cos(\omega\varphi_s), \qquad e^{-\sqrt2\,\omega x_s}\, b_s = 2 a_s - \sqrt2\, \sinh(\lambda_s) \sin(\omega\varphi_s). \]

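Proposition 5. The sub-processes $(\lambda_s)$ and $(\varphi_s)$ satisfy the following equations:

\[ d\lambda_s = \sigma\, d\beta^\lambda_s + \sigma^2 \coth(2\lambda_s)\, ds, \qquad d\varphi_s = \sigma\, dM^\varphi_s + e^{-\sqrt2\,\omega x_s}\, b_s\, ds, \]

where $\beta^\lambda_s := \int_0^s \frac{2}{\sinh(2\lambda_\tau)}\, [a_\tau\, dM^a_\tau - \dot z_\tau\, dM^z_\tau]$ is a Brownian motion, and the martingale $(M^\varphi_s)$ defined by

\[ dM^\varphi_s := \frac{\cosh(\lambda_s) \cos(\omega\varphi_s)\, d\beta^\lambda_s - dM^x_s}{\omega\, \sinh(\lambda_s) \sin(\omega\varphi_s)} \]

satisfies:

\[ \langle dM^\varphi_s\rangle = \frac{ds}{\omega^2 \sinh^2(\lambda_s)}, \qquad \langle dM^\varphi_s, d\beta^\lambda_s\rangle = 0. \]

In particular, $(\lambda_s)$ is a non-negative sub-diffusion (of the relativistic diffusion).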


Proof. We have first (using Sect. 3.1): sinh(2λs)dλs + cosh(2λs)dλs = d[cosh2 λs ] = d[as2 − z˙ s2 ] = 2σ [as d Msa − z˙ s d Msz ] + 2σ 2 cosh(2λs)ds, whence dλs + coth (2λs )dλs = σ dβsλ + 2σ 2 coth (2λs )ds, and it is easily verified that dβsλ = ds, whence the first formula. The two other quadratic covariations displayed in the statement are also directly computed. Then, d x˙s = d[sinh(λs ) cos(ωϕs )] = σ cosh(λs ) cos(ωϕs )dβsλ − ω sinh(λs ) sin(ωϕs )dϕs + σ2

2

cos(ωϕs ) + 3 x˙s ds − sinh(λs )

ω2 2

x˙s dϕs −

ω 2

x˙s cosh(λs ) sin(ωϕs )dλs , dϕs .

ϕ

Using the definition of (Ms ), and Sect. 3.1 again, we deduce the equation giving dϕs . Remark 8. The phase subspace E0 of Remarks 2 and 7 is precisely: E0 = E ∩ {λ = 0}. The equation satisfied by (λs ) can be precisely solved as follows, provided λ0 > 0, using some real Brownian motion β, started from β0 = 21 log[coth λ0 ]:

λs =

1 2

log coth β inf u

u

0

coth (2βv ) dv = σ s . 2

2

This implies that we have almost surely: λs > 0 for any s > 0: the state subspace E0 of Remark 2 is polar for the relativistic diffusion. It is also instantly unstable. Hence we can finally restrict the state space E + (recall Remark 7) of the relativistic diffusion to E + \E0+ . 3.3. Study of the one-dimensional sub-diffusions (as ), (˙z s ), (λs ). These three onedimensional sub-diffusions are easily handled. Lemma 1. There exist three standard real Brownian motions (ws ), (ws ), (w˜ s ), and three almost surely converging processes (ηs ), (ηs ), (η˜ s ), such that we have: as = exp σ 2 s + σ ws + ηs |˙z s | = exp σ 2 s + σ ws + ηs

for any proper time s ≥ 0,

for any sufficiently large proper time s,

and λs = σ 2 s + σ w˜ s + η˜ s

for any proper time s ≥ 0.


Proof. The stochastic differential equations satisfied by (as ) and (˙z s ) are respectively: 2 2 das = 3 2σ as ds + σ as2 − 1 dws , and d z˙ s = 3 2σ z˙ s ds + σ z˙ s2 + 1 dws , for two standard real Brownian motions (ws ) and (ws ). These equations are solved as follows; we have real Brownian motions (Wu ) and (Wu ) such that: u 2 −2 as = F W inf u (Wv − 1) dv > σ s , 0

and

u (1 − |Wv |2 )−2 dv > σ s , z˙ s = G W inf u 0

with

F(W ) :=

√ W W 2 −1

W

. 1−|W |2

and G(W ) := √

u

Clearly, as u increases to the hitting time of 1 by W , then 0

(Wv2 − 1)−2 dv increases

to infinity, Wu goes to 1, and F(Wu ) goes to infinity, showing that as goes almost surely to infinity with s. The same reasoning holds for (˙z s ) (except that |W0 | must be smaller than 1, while W0 must be larger than 1 (recall Remark 7)), so that, in the same way, |˙z s | goes almost surely to infinity with s. Then we have almost surely, for any sufficiently large proper time s: 2 1 d log as = (1 + 2 a 2 )σ ds + σ 1 − as−2 dws , s d log |˙z s | = (1 − 2 1z˙ 2 )σ 2 ds + σ 1 + z˙ s−2 dws . s

Whence, for real Brownian motions w, ˜ w˜ and for sufficiently large proper times s0 , s: ⎤ ⎡

s 2

s 2 −4 σ du σ a du ⎥ ⎢ log as = log a0 + σ 2 s + σ ws + − w˜ ⎣ u

2 ⎦ 2 0 2 au 0 1+ 1− 1 2 a u

= σ 2 s + o(s) > σ 2 s/2, and similarly:

log

|˙z s | = σ 2 (s − s0 ) + σ (ws − ws 0 ) − |˙z s0 |

s

s0

σ 2 du 2 z˙ u2

⎡ ⎤

s 2 −4 σ z˙ du ⎥ ⎢ + w˜ ⎣ u

2 ⎦ s0 1 + 1+ 1 z˙ 2 u

= σ 2 s + o(s) > σ 2 s/2. The convergence of the integrals in the above formulas follows, implying the two first claims. Then, since coth (2λs ) > 1, the comparison theorem and Proposition 5 ensure that we have almost surely: λs ≥ λ0 + σ w˜ s + σ 2 s −→ +∞.


Moreover, we have almost surely $\lambda_s \ge \sigma^2 s/2$, for large enough $s_0$ and for $s \ge s_0$. We deduce that

$$\tilde\eta_s := \lambda_s - \sigma^2 s - \sigma \tilde w_s = \tilde\eta_{s_0} + 2\sigma^2 \int_{s_0}^s \frac{du}{e^{4\lambda_u} - 1}$$

converges almost surely.

Lemma 1 and Proposition 5 imply immediately the following.

Corollary 2. There exists an almost surely converging process $(\check\eta_s)$ such that:

$$\varphi_s = \int_0^s e^{-\sqrt{2}\,\omega x_u}\, b_u\, du + \check\eta_s, \quad \text{for any } s \ge 0.$$

And $|z_s| = e^{\sigma^2 s + o(s^{5/9})}$ as $s \to \infty$.
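The growth rate asserted in Lemma 1 can be probed numerically. The following Python sketch is not part of the original argument: the function names, step size, time horizon and seed are illustrative assumptions. It first checks by finite differences that $F(W) = W/\sqrt{W^2-1}$ is compatible with the stochastic differential equation of $(a_s)$ when $W$ is run at the clock rate $\sigma^2 (W^2-1)^2$, and then applies a plain Euler scheme to $d\log a_s = \sigma^2 (1 + \tfrac{1}{2} a_s^{-2})\, ds + \sigma \sqrt{1 - a_s^{-2}}\, dw_s$, clamping $\log a_s$ at 0 to absorb discretisation error.

```python
import math
import random


def F(W: float) -> float:
    """F(W) = W / sqrt(W^2 - 1), defined for W > 1."""
    return W / math.sqrt(W * W - 1.0)


def generator_residuals(W: float, sigma: float, h: float = 1e-4):
    """Finite-difference check of the time-change representation.

    If W is a Brownian motion run at clock rate sigma^2 (W^2 - 1)^2, then
    a = F(W) should have drift (3 sigma^2 / 2) a and squared diffusion
    coefficient sigma^2 (a^2 - 1). Both residuals should be close to 0.
    """
    d1 = (F(W + h) - F(W - h)) / (2.0 * h)
    d2 = (F(W + h) - 2.0 * F(W) + F(W - h)) / (h * h)
    rate = sigma ** 2 * (W * W - 1.0) ** 2  # d<W>/ds after the time change
    drift_residual = 0.5 * rate * d2 - 1.5 * sigma ** 2 * F(W)
    diffusion_residual = rate * d1 ** 2 - sigma ** 2 * (F(W) ** 2 - 1.0)
    return drift_residual, diffusion_residual


def simulate_log_a(sigma: float, a0: float, T: float, dt: float, seed: int) -> float:
    """Euler scheme for y = log a_s on [0, T]; returns y at time T."""
    rng = random.Random(seed)
    y = math.log(a0)
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        inv_a2 = math.exp(-2.0 * y)
        drift = sigma ** 2 * (1.0 + 0.5 * inv_a2)
        diffusion = sigma * math.sqrt(max(0.0, 1.0 - inv_a2))
        y += drift * dt + diffusion * sqrt_dt * rng.gauss(0.0, 1.0)
        y = max(y, 0.0)  # keep a_s >= 1 despite discretisation error
    return y


if __name__ == "__main__":
    print("residuals:", generator_residuals(2.0, 1.3))
    T = 200.0
    y_T = simulate_log_a(sigma=1.0, a0=2.0, T=T, dt=0.01, seed=1)
    print("log(a_T) / T =", y_T / T, "(Lemma 1 suggests a value near sigma^2 = 1)")
```

On a single path the ratio $\log a_T / T$ fluctuates around $\sigma^2$ with a standard deviation of order $\sigma/\sqrt{T}$, so only rough agreement should be expected at moderate horizons.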

We have also the following lower control, which we shall use later.

Lemma 2. For any $A > \sqrt{3}$, we have

$$P\Big[(\exists\, s > 0)\ |\dot z_s| \le A\, e^{\sigma^2 s/2} \;\Big|\; |\dot z_0| \ge A^2\Big] < 1/\sqrt{A}, \qquad\text{and}\qquad P\Big[(\exists\, s > 0)\ a_s \le A\, e^{\sigma^2 s/2} \;\Big|\; a_0 \ge A^2\Big] \le 1/A.$$

Proof. Fix $A > \sqrt{3}$ and $|\dot z_0| \ge A^2$. The stochastic differential equation satisfied by $(\log|\dot z_s|)$, already written in the proof of Lemma 1, is equivalent to:

$$d\log\big(|\dot z_s|\, e^{-\sigma^2 s/2}\big) = \tfrac{\sigma^2}{2}\,(1 - \dot z_s^{-2})\, ds + \sigma\sqrt{1 + \dot z_s^{-2}}\; dw'_s.$$

Let us apply the comparison theorem (see for example ([I-W], Theorem 4.1)): setting

$$T_A^z := \inf\big\{s \;\big|\; |\dot z_s| = A\, e^{\sigma^2 s/2}\big\} \qquad\text{and}\qquad \log r_s := \log A^2 + \tfrac{\sigma^2}{2}\,(1 - A^{-2})\, s + \sigma\sqrt{1 + A^{-2}}\; w'_s,$$

we have:

$$\inf_{0 \le s \le T_A^z} |\dot z_s|\, e^{-\sigma^2 s/2} \ \ge \inf_{0 \le s \le T_A^z} r_s,$$

whence

$$P[T_A^z < \infty] \le P[\log r_s \text{ hits } \log A] = P\Big[w_s - \frac{1 - A^{-2}}{1 + A^{-2}}\, s/2 \text{ hits } \log A\Big] = A^{-\frac{1 - A^{-2}}{1 + A^{-2}}} < 1/\sqrt{A}.$$

Similarly, for $a_0 \ge A^2$ and $T_A^a := \inf\{u \mid a_u = A\, e^{\sigma^2 u/2}\}$: since

$$d\log\big(a_s\, e^{-\sigma^2 s/2}\big) = \tfrac{\sigma^2}{2}\,(1 + a_s^{-2})\, ds + \sigma\sqrt{1 - a_s^{-2}}\; dw_s,$$

we get:

$$P[T_A^a < \infty] \le P[w_s - s/2 \text{ hits } \log A] = 1/A.$$

3.4. Study of the two-dimensional sub-diffusion $(a_s, \dot z_s)$. We get easily an asymptotic variable of the sub-diffusion $(a_s, \dot z_s)$.

Proposition 6. The process $(\dot z_s / a_s)$ converges almost surely, toward some random limit such that 0 < | | ≤ 1.


Proof. Using Itô’s Formula and Sect. 3.1, we get: d

d z˙ s σ 2 z˙ s z˙ s z˙ s das z˙ s das das , d z˙ s σ z˙ s z a = d M − + − = − d M ds, s s − 2 3 2 as as as as as as as as3

with % $ 1 − |˙z s /as |2 z˙ s as−1 d Msz − d Msa = ds. as as2 Hence, we have some real Brownian motion Wˇ such that almost surely, for any s ≥ 0: z˙ s z˙ 0 = − σ2 as a0

0

s

z˙ u du + σ Wˇ au3

0

s

1 − |˙z u /au |2 du , au2

which almost surely converges toward some random limit ∈ R, by Lemma 1. The unit pseudo-norm relation (00 ) implies as2 − 1 ≥ z˙ s2 , and then 2 ≤ 1. Similarly, using Lemma 1 again, for any large enough proper time s we have: 2 das σ as as as d z˙ s as d z˙ s das , d z˙ s σ as a z = d Ms − d − + − = d Ms + 3 ds, 2 3 2 z˙ s z˙ s z˙ s z˙ s z˙ s z˙ s z˙ s z˙ s

with $

z˙ s−1

% (as /˙z s )2 − 1 as a z d Ms − = d Ms ds, z˙ s z˙ s2

whence the almost sure convergence of (as /˙z s ), which proves that = 0 almost surely. Remark 9. It can be shown that the process (˙z s − as ) converges in law, toward the law of 1/2

∞ 2 −2 u−2 wu (1 − ) e du × N , N denoting a N (0, 1) Gaussian variable, indepen0

dent from ( , w). It can also be shown that it does not converge in probability, letting one think that the asymptotic σ -algebra of (as , z˙ s ) is generated by the only variable . The following statement ensures that the range of possible limits in Proposition 6, is the whole [−1, 0[ ∪ ]0, 1]. This provides a continuum of non-trivial bounded harmonic functions for the relativistic operator L on T 1 G. Proposition 7. For any real 0 suchthat 0 < | 0 | ≤ 1, and for any ε > 0, we have z˙ s P 0 − ε < = s→∞ lim < 0 + ε > 1 − ε, provided z˙ 0 /a0 is close enough from as 0 and a0 is large enough. Proof. Fix A > 9, a0 > A2 and |˙z 0 | ≥ A2 , such that z˙ 0 /a0 is close to 0 (precisely, we demand | log( a0z˙ 0 0 )| < A−2 ), and consider the event: 2 2 A := as2 > 1 + A2 eσ s and z˙ s2 > A2 eσ s

for all s ≥ 0 .


√ By Lemma 2 we have: P(A) > 1 − 2/ A. Now, on A we have:

∞

∞

∞ du du du −2 −2 + ≤ 2σ A and ≤ σ −2 A−2 . 2−1 2 2 a z ˙ z ˙ 0 0 0 u u u |˙z s | , displayed in the proof of Hence, we see from the expression giving log as2 − 1 Proposition 6, that we have on A: | log( / 0 )| ≤ 2 A−2 + σ max{|Wˇ s | | 0 ≤ s ≤ σ −2 A−2 }. Finally, as

P[σ max{|Wˇ s | | 0 ≤ s ≤ σ −2 A−2 } > A−1/2 ] ≤ 2 P[max{Wˇ s | 0 ≤ s ≤ A−2 } > A−1/2 ] √ = 2 P[ |Wˇ A−2 | > A−1/2 ] = 4 P[Wˇ 1 > A ] < e−A/2 ,

√ we obtain: P[ | log( / 0 )| ≤ 2 A−2 + A−1/2 ] > 1 − 2/ A − e−A/2 .
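The elementary estimates used in this proof and in Lemma 2 are easy to check numerically. The sketch below is an illustration only (the function names are hypothetical and not taken from the paper): it evaluates the Gaussian tail $4\,P[W_1 > \sqrt{A}]$ against $e^{-A/2}$, and the two hitting-probability bounds of Lemma 2 against the combined bound $2/\sqrt{A}$ that yields the lower estimate $1 - 2/\sqrt{A}$ for the probability of the event considered above.

```python
import math


def gaussian_upper_tail(x: float) -> float:
    """P[N > x] for a standard Gaussian N, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def lemma2_exponent_bound(A: float) -> float:
    """A ** (-(1 - A^-2) / (1 + A^-2)), the hitting probability from the proof of Lemma 2."""
    return A ** (-(1.0 - A ** -2) / (1.0 + A ** -2))


def check_bounds(A: float) -> dict:
    """Collect the quantities compared in the proofs, for a given A."""
    return {
        "tail": 4.0 * gaussian_upper_tail(math.sqrt(A)),  # 4 P[W_1 > sqrt(A)]
        "tail_bound": math.exp(-A / 2.0),                 # claimed upper bound
        "lemma2_z": lemma2_exponent_bound(A),             # bound for the z-dot event
        "lemma2_a": 1.0 / A,                              # bound for the a event
        "union_bound": 2.0 / math.sqrt(A),                # bound used for the joint event
    }


if __name__ == "__main__":
    for A in (9.5, 16.0, 25.0, 100.0):
        print(A, check_bounds(A))
```

The comparison $A^{-(1-A^{-2})/(1+A^{-2})} < 1/\sqrt{A}$ holds exactly when the exponent exceeds $1/2$, that is when $A^2 > 3$, which accounts for the threshold $A > \sqrt{3}$ in Lemma 2.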

We shall need to know that in fact | | < 1 almost surely. Lemma 3. The random limit = lim (˙z s /as ) of Proposition 6 satisfies almost surely: s→∞

0 < | | < 1.

Proof. Let us consider As := 1 − (1 + z˙ s2 )as−2 = as−1 sinh λs (recall Sect. 3.2), which √ goes almost surely to 1 − 2 , by Lemma 1 and Proposition 6. On the other hand, using Itô’s Formula and Proposition 5, we get: d(log As ) = d(log[sinh λs ]) − d(log as ) dλs σ = coth λs dλs − − σ 2 (1 + 21 as−2 )ds − d Msa 2 2 sinh λs as as 1 z˙ s a z d Ms − = σ − d Ms sinh2 λs as sinh2 λs σ2 1 1 + 2 coth (λs )coth (2λs ) − − 2 − 2 ds 2 sinh2 λs as 2 σ σ (˙z s2 + 1) d Msa − as z˙ s d Msz − = ds. as sinh2 λs 2 as2 ˇ Hence, we have for some real Brownian motion B: ∞

∞ z˙ s2 + 1 ds 2 2 ˇ log(1 − ) = 2 log A0 + 2 σ B ds − σ , 2 2 as sinh λs as2 0 0 which converges (in R) almost surely, by Lemma 1 and Proposition 6, showing that indeed 2 < 1 almost surely. We have furthermore the following. Proposition 8. The law of the random limit = lim (˙z s /as ) has no atom. s→∞


Proof. Fix any 0 ∈ ] − 1, 1[, and set δs := z˙ s − 0 as . The stochastic differential equation satisfied by (δs ) is easily seen to be: dδs =

3σ2 2

δs ds + σ

δs2 + 1 − 20 dβs ,

for some standard real Brownian motion (βs ). This diffusion equation can be solved as −1 1 follows: we have a real Brownian motion (Wu ) (started from W0 ∈] 1− 2 , 1− 2 [ ) such 0

that:

δs = F W inf u

0

with F(Wu ) :=

(1 − 20 ) dv

u

0

1 − (1 − 20 )2 Wv2

>σs

,

(1− 20 )3/2 Wu . 1−(1− 20 )2 Wu2

As u increases to the hitting time of ±(1 −

20 )−1

u

by W , then

(1 − 20 ) dv

1 − (1 − 20 )2 Wv2 increases to infinity, Wu goes to ±(1 − and F(Wu ) goes to ±∞, showing that |δ | goes almost surely to infinity with s. (The invariant measure of the diffusion (δs ) is s 0

20 )−1 ,

δ 2 + 1 − 20 dδ.) Then we have almost surely, for any sufficiently large proper time s: d log |δs | = (1 −

1− 20 ) σ 2 ds 2 δs2

& ±σ 1+

1− 20 δs2

dβs .

Whence, for real Brownian motions w, w˜ and for sufficiently large proper times s0 , s: log

|δs | = σ 2 (s − s0 ) + σ (ws − ws0 ) − |δs0 |

σ 2 (1− 20 ) 2

s

s0

du + σ 1− 20 w˜ 2 δu

s du s0

δu2

= σ 2 s + o(s) > σ 2 s/2. This implies the convergence of the integrals in the above formula, and then the existence of a standard real Brownian motion (ws 0 ) and of an almost surely converging process (ηs 0 ), such that almost surely, for any sufficiently large proper time s we have: |δs | = |˙z s − 0 as | = exp σ 2 s + σ ws 0 + ηs 0 = exp σ 2 s + o(s 5/9 ) . For the same 0 and (δs ) as above, we get as in the proof of Proposition 6: log

|δs /as | = −σ 2 |δs0 /as0 |

s

s0

z˙ u du σ 2 − au2 δu 2

s

s0

1 − (˙z u /au )2 du +σ Wˇ δu2

s

s0

1 − (˙z u /au )2 du , δu2

almost surely for any sufficiently large s0 , s. Using the above, this shows the almost sure z˙ s convergence of log as − 0 , hence by Proposition 6, that indeed P[ = 0 ] = 0.


3.5. Study of the four-dimensional sub-diffusion $(x_s, \dot x_s, a_s, b_s)$. The process $(b_s)$ alone is easily handled, in a way similar to Lemma 2.

Lemma 4. There exist a standard real Brownian motion $(w''_s)$, and an almost surely converging process $(\eta''_s)$, such that we have: $b_s = \exp\big(\sigma^2 s + \sigma w''_s + \eta''_s\big)$ for any proper time $s$.

Proof. By Remark 7 and Sect. 3.1, there exists a standard real Brownian motion $(w''_s)$ such that for any proper time $s$ we have:

$$d\log b_s = \sigma^2\, ds + \sigma^2 e^{2\sqrt{2}\,\omega x_s}\, b_s^{-2}\, ds + \sigma\sqrt{1 - 2\, e^{2\sqrt{2}\,\omega x_s}\, b_s^{-2}}\; dw''_s,$$

so that there exists another real Brownian motion $\tilde w''$ such that:

$$\log b_s = \log b_0 + \sigma^2 s + \sigma^2 \int_0^s e^{2\sqrt{2}\,\omega x_u}\, \frac{du}{b_u^2} + \sigma w''_s + \sigma\, \tilde w''\Bigg(\int_0^s \Bigg[\frac{2\, e^{2\sqrt{2}\,\omega x_u}\, b_u^{-1}}{b_u + \sqrt{b_u^2 - 2\, e^{2\sqrt{2}\,\omega x_u}}}\Bigg]^2 du\Bigg).$$

Now using Remark 6 and Lemma 1 we get:

$$\int_0^\infty e^{2\sqrt{2}\,\omega x_u}\, \frac{du}{b_u^2} = \int_0^\infty \Big(\frac{e^{\sqrt{2}\,\omega x_u}\, a_u}{b_u}\Big)^2 a_u^{-2}\, du < \infty,$$

which shows that $\eta''_s := \log b_s - \sigma^2 s - \sigma w''_s$ almost surely converges, as $s \to \infty$.
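The passage from the equation for $d\log b_s$ to the integrated formula in the proof of Lemma 4 rests on an elementary identity, the same one that underlies the corresponding step in the proof of Lemma 1. It can be spelled out as follows; the shorthand $c_s$ is introduced here only for the illustration and is not notation from the paper, and $w$ stands for the Brownian motion driving $(b_s)$ in Lemma 4.

```latex
% Elementary identity, valid for 0 <= c <= 1:
\[
  1 - \sqrt{1 - c} \;=\; \frac{c}{1 + \sqrt{1 - c}} .
\]
% Applied with c_s := 2 e^{2\sqrt{2}\,\omega x_s} b_s^{-2} (and b_s > 0):
\[
  \sigma \sqrt{1 - c_s}\; dw_s
  \;=\; \sigma\, dw_s \;-\; \sigma\, \frac{c_s}{1 + \sqrt{1 - c_s}}\; dw_s ,
  \qquad
  \frac{c_s}{1 + \sqrt{1 - c_s}}
  \;=\; \frac{2\, e^{2\sqrt{2}\,\omega x_s}\, b_s^{-1}}
             {b_s + \sqrt{b_s^{2} - 2\, e^{2\sqrt{2}\,\omega x_s}}} .
\]
```

The correction term is a martingale whose quadratic variation is $\sigma^2$ times the integral of the square of this ratio, which is the argument of the time-changed Brownian motion in the proof. Since the ratio is at most $c_s$, that quadratic variation is controlled by $\int_0^\infty e^{2\sqrt{2}\,\omega x_u}\, b_u^{-2}\, du$ as soon as $c_u$ stays bounded, so the martingale converges once this integral is known to be finite.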

Then we get easily a new asymptotic variable of the relativistic diffusion. Lemma 5. The process log(bs /as ) converges almost surely, as s → ∞. Proof. Recalling from Remark 6 that bs /as > 0, we have for any proper time s ≥ 0: √ bs = σ 2 e2 2 ω xs bs−2 ds − 21 σ 2 as−2 ds + σ (bs−1 d Msb − as−1 d Msa ), d log as or equivalently, for some real Brownian motion W :

s √

s bs b0 du du log − log = σ2 e2 2 ω xu 2 − σ 2 as a0 b 2 au2 0 0 √u √ s e2 2 ω xu 1 e 2 ω xu +σ W − − 2 du . 4 au bu bu2 au 0

s √ du Now, as already noticed in the proof of Lemma 4 for e2 2 ω xu 2 , by Remark 6 bu 0 and Lemma 1 we get the following, which guarantees the almost sure convergence of log(bs /as ):

∞ √2 ω x u

∞ √ e 2 ω x u au e a −2 du < ∞. du = au bu bu u 0 0


By Remark 6 again, we deduce at once the following: Corollary 3. The process (xs ) is almost surely bounded. Setting := lim

s→∞

0< 1−

√1 2

≤ lim inf e s→∞

√

2 ω xs

≤ lim sup e s→∞

√

2 ω xs

≤ 1+

√1 2

bs , we have: as

< ∞.

Remark 10. It is possible to complete Corollary 3, as Remark 9 completed Proposition 6: it can be shown that the process (bs − as ) converges in law, but not in probability. The following statement, analogous to Proposition 7, ensures that the range of possible limits in Corollary 3 (and Lemma 5), is the whole ]0, ∞[. This provides another continuum of non-trivial bounded harmonic functions for the relativistic operator L . Proposition 9. For any real 0 such that 0 < 0 < ∞, and for any ε > 0, we have b P 0 − ε < = lim s < 0 + ε > 1 − ε, provided b0 /a0 is close enough from s→∞ as

0 and a0 is large enough. √ Proof. Fix A > 3, a0 > A2 , and use the expression displayed for log(bs /as ) in the proof of Lemma 5, Remark 6, and Lemma 2, to get an event of probability ≥ 1 − 1/A:

∞ du log − log b0 ≤ σ 2 (1 + √1 )2 2 a0 a2 0 u

∞ du 1 +σ max |Wu | 0 ≤ u ≤ 4(1 + √ ) 2 au2 0 ≤ 3 A−2 + σ max |Wu | 0 ≤ u ≤ 7 σ −2 A−2 , b0 −2 −1/2 +3A + A so that P | log − log 0 | ≤ log > 1−1/A − e−A/14 . a 0 0 3.6. Irreducibility. We establish here mainly the irreducibility property (i) of the main result (Theorem 1 below, also stated in the Introduction). In particular, the relativistic diffusion of G can perform any absolute time-stemming, in accordance with the noncausality of G. Note that this distinguishes it strongly from its analogues of Minkowski and Schwarzschild space-times, for which the absolute time component increases strictly with proper time. Proposition 10. (i) The relativistic diffusion is irreducible: from any starting point, it hits any non-empty open subset of the phase space E + \ E0+ with strictly positive probability. (ii) For any starting point (in E + ), the law of the asymptotic variable ( , ) charges any non-empty open subset of the range (] − 1, 0[ ∪ ]0, 1[) ×]0, ∞[. Proof. (i) We know from Proposition 2 (in Sect. 2.2) that there are piece-wise geodesic future-directed timelike continuous paths, and then trajectories in the support of the relativistic diffusion (ξ· , ξ˙· ), moving at will the coordinates (t, x, y, z).


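Owing to the quadratic covariation (rank 3) matrix of the $\mathbb{R}^3$-valued martingale $(M^a_s, M^b_s, M^z_s)$ (recall Sect. 3.1) and to the unit pseudo-norm relation (00), we can find three independent standard real Brownian motions $(w^1, w^2, w^3)$ such that:

$$d\dot z_s = \tfrac{3\sigma^2}{2}\, \dot z_s\, ds + \sigma\sqrt{\dot z_s^2 + 1}\; dw^1_s;$$

$$da_s = \tfrac{3\sigma^2}{2}\, a_s\, ds + \sigma\, \frac{a_s \dot z_s}{\sqrt{\dot z_s^2 + 1}}\; dw^1_s + \sigma\sqrt{\frac{a_s^2 - \dot z_s^2 - 1}{\dot z_s^2 + 1}}\; dw^2_s;$$

$$db_s = \tfrac{3\sigma^2}{2}\, b_s\, ds + \sigma\, \frac{b_s \dot z_s}{\sqrt{\dot z_s^2 + 1}}\; dw^1_s + \sigma\, \frac{a_s b_s - 2\, e^{\sqrt{2}\,\omega x_s} (\dot z_s^2 + 1)}{\sqrt{(\dot z_s^2 + 1)(a_s^2 - \dot z_s^2 - 1)}}\; dw^2_s + \sigma\, \frac{\sqrt{2}\; e^{\sqrt{2}\,\omega x_s}\, \dot x_s}{\sqrt{a_s^2 - \dot z_s^2 - 1}}\; dw^3_s.$$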

Let us use the support theorem of Stroock and Varadhan (see for example ([I-W], Theorem VI.8.1)). We see thus from the above stochastic differential system, that the following trajectories belong to the support of (ξ· , ξ˙· ) ≡ (t· , x· , y· , z · , z˙ · , a· , b· , x˙· ): – trajectories moving at will the coordinate z˙ , without changing the coordinates (t, x, y, z); – trajectories moving at will the coordinate a, without changing the coordinates (t, x, y, z, z˙ ); – trajectories moving at will the coordinate b, provided x˙ = 0, without changing the coordinates (t, x, y, z, x, ˙ a). So far, it has become clear that it is possible, within the support of the relativistic diffusion, to move any point of the phase space E + \E0+ having given first coordinates (t, x, y, z, z˙ , a) ∈ R6 , onto some point of the phase space E + \E0+ having prescribed first coordinates (t , x , y , z , z˙ , a ) ∈ R6 . It remains only to consider the last two coordinates (b, x). ˙ They are of course constrained by the unit pseudo-norm relation (00 ), which tells precisely that they run some ellipse of this plane of coordinates, which is centred on the axis {x˙ = 0}. The last type of trajectory mentioned above allows now to move (b, x) ˙ arbitrarily on the upper half and on the lower half this ellipse, without changing the other coordinates, within the support of the relativistic diffusion. Hence, we have shown that the support of the relativistic diffusion connects any couple of points belonging to the same connected component of (E + \ E0+ ) ∩ {x˙ = 0}. Now, by Sect. 3.1 again, we can find three independent standard real Brownian motions (w¯ 1 , w¯ 2 , w¯ 3 ) such that: √ √ √ √ d x˙s = (ω/ 2 ) e−2 2 ω xs bs2 ds − 2 ω e− 2 ω xs as bs ds 2 + 3 2σ x˙s ds + σ x˙s2 + 1 d w¯ s1 ; ' x ˙ as2 − x˙s2 − 1 a 2 s s das = 3 2σ as ds + σ d w¯ s1 + σ d w¯ s2 ; x˙s2 + 1 x˙s2 + 1

√

bs x˙s as bs − 2 e 2 ω xu (x˙s2 + 1) d w¯ s1 + σ d w¯ s2 dbs = bs ds + σ x˙s2 + 1 (x˙s2 + 1)(as2 − x˙s2 − 1) √ √2 ω x u z 2e ˙s d w¯ s3 . +σ as2 − x˙s2 − 1 3σ2 2

The same argument as above, applied to this new decomposition, shows similarly that the support of the relativistic diffusion connects any couple of points belonging to the same connected component of (E + \ E0+ ) ∩ {˙z = 0}, hence of (E + \ E0+ ) ∩ {x˙ 2 + z˙ 2 = 0}. This ends actually the proof of irreducibility, since the latter set is a connected and dense open subset of the whole phase space E + \ E0+ . (ii) This is a direct consequence of (i) above and of Propositions 7 and 9: by (i), it is indeed enough to start the relativistic diffusion so that z˙ 0 /a0 be close to a given 0 ∈ (] − 1, 0[ ∪ ]0, 1[), b0 /a0 be close to a given 0 > 0, and a0 be large enough. 3.7. Convergence to a beam. Recall that, owing to Definition 2, we are looking for a limiting beam B = ( , , Y ). Let us exhibit now the third asymptotic random variable Y , for the relativistic diffusion. Proposition 11. The process Ys := ys + toward some real random variable Y .

√

2 ω

x˙s converges almost surely, as s → ∞, bs √

√

Proof. Recall from Formulas (10) that we have y˙s = e− 2 ω xs (2 as − e− 2 ω xs bs ). We have then: x˙s d x˙s x˙s dbs x˙s dbs dbs , d x˙s d = − + − bs bs bs2 bs3 bs2 √ √ √ √ x˙s = √ω e−2 2 ω xs bs ds − 2 ω e− 2 ω xs as ds − 2σ 2 e2 2 ω xs 3 ds 2 bs σ x˙s + d Msx − σ 2 d Msb , bs bs whence ω √ 2

dYs = −2σ 2 e2

√

2 ω xs

x˙s σ x˙s ds + d Msx − σ 2 d Msb , bs3 bs bs

and for some Brownian motion W : s

s √ 2 du √ 2 2 2 ω xu x˙ u

2 2 ω xu x˙ u ω ω √ √ 1− 2e . Y = Y − 2σ e du + σ W 2 s 2 0 bu3 bu bu2 0 0 By Corollary 3, Lemma 4, and Proposition 6 (which implies, according to Sect. 3.2, that x˙s /bs is bounded), the two above integrals, and then Ys , converge almost surely. Corollary 4. We have almost surely, as s → ∞: √ 2 ω

2 e− 2 ω xs − 1 + (ys − Y ) −→ 2 2

2 1 2 (1 − ).


Proof. By definition of Ys (in Proposition 11) and Sect. 3.2, the left-hand side equals:

2 2

as

sinh (λs ) ω

sinh (λs ) (Ys − Y ) − √ −1− √ sin(ω ϕs ) + cos(ω ϕs ) bs 2 2 bs 2 bs 2 √

sinh λs

as 2 sinh λs s = √ + − 1 a sin(ωϕs ) bs − 1 − bs b 2 bs s √ ω

2 sinh λs (Y − Y ) (Y − Y ) − cos(ωϕ ) , + ω

s s s 2 2 bs

which goes to

1− 2 2 ,

by Proposition 6, Corollary 3, Proposition 11, and since (by ' z˙ 2 as

sinh λs s = × 1 − − as−2 −→ Lemma 1, Corollary 3, and Proposition 6): bs bs as 1 − 2 . The following statement, analogous to Propositions 7 and 9, ensures that the range of possible limits Y in Proposition 11 is the whole R. This provides again another continuum of non-trivial bounded harmonic functions for the relativistic operator L acting on T 1 G. Proposition 12. For any real y and any ε > 0, we have P[y − ε < Y < y + ε] > 1 − ε, provided Y0 is close enough from y and a0 is large enough. √ 2 Proof. Recall from Lemma 2 that the event A := as ≥ a0 eσ s/2 for any s ≥ 0 has −1/2

(for a0 > 3) probability larger than 1 − a0

|Y − Y0 | = O(1) =

so that

∞

0 1 O( a0 ) +

. The proof of Proposition 11 shows that

∞ du du + max |Ws | 0 ≤ s ≤ O(1) 2 au au2 0

W ∗ [O( a10 )] on A,

P |Y − Y0 | ≤ 2 a0−1/3 > 1 − 2 a0−1/2 , for large enough a0 .

Proposition 12 improves Proposition (10,(ii)). We deduce indeed at once the following. Corollary 5. For any starting point (in E), the law of the asymptotic variable ( , , Y ) charges any non-empty open subset of the range (] − 1, 0[ ∪ ]0, 1[) ×]0, ∞[×R. More precisely, if the starting point of the relativistic diffusion satisfies: z˙ 0 /a0 close enough to 0 ∈] − 1, 1[, b0 /a0 close enough to 0 > 0, Y0 close enough to y ∈ R, and a0 large enough, then with arbitrary large probability, ( , , Y ) is arbitrary close to ( 0 , 0 , y). The theorem of the Introduction (Sect. 1) is now established. Indeed, gathering successively Remark 8 and Proposition 10, Propositions 6, Lemma 3, Corollary 3, Proposition 11 and Corollary 4, and Corollary 5, we get the following main result (for which σ > 0 is necessary, due to the observation made after Definition 2, in Sect. 2.3).


Theorem 1. (i) The relativistic diffusion is irreducible, on its phase space E + \E0+ . (ii) Almost surely, the relativistic diffusion path possesses a 3-dimensional asymptotic random variable B = ( , , Y ) ∈ B, and converges to this beam B, in the sense of Definition 2. Indeed, we have almost surely, as proper time s goes to infinity: z˙ s /as −→ ∈] − 1, 0[ ∪ ]0, 1[; bs /as −→ ∈ ]0, ∞[; Ys −→ Y ∈ R;

2

√

e−

2ω xs

2 ω

2 −1 + (ys − Y ) −→ 2

2 1 2 (1 − ).

(iii) The asymptotic random variable ( , , Y ) can be arbitrary close to any given ( 0 , 0 , y) ∈ ] − 1, 1[×]0, ∞[×R, with positive probability. Hence, the whole boundary (space of beams) B is the support of beams the relativistic diffusion can converge to. Remark 11. A rapid look at Remark 4 could let one think that there could be a fourth asymptotic random variable for the relativistic diffusion, namely a possible almost sure limit for X s := z s + ts − ϕs . But as a matter of fact, it can be shown (in the same vein as Remarks 9 and 10) that there is no such limit, in accordance with the last sentence of Remark 4, on the geometric irrelevance of the additional parameter (Z 0 , T0 ). Remark 12. From the proofs of Proposition 14, Proposition 11, and Proposition 6, we have the following representation of the asymptotic variable B = ( , , Y ):

∞

∞ √

∞ b0 du bu du bu −1 b a

= d M + 2σ 2 e 2 ω xu 2 − σ 2 + σ a − d M u u u ; au a0 au au3 0 0 0

∞ √ √ 2 ∞ √ x˙u Y = Y0 − 2 ω2 σ e2 2 ω xu 3 du + ω2 σ bu−1 d Mux − bx˙uu d Mub ; bu 0 0

∞

∞ z˙ 0 z ˙ u z˙ u −1 z a = d M − σ2 du + σ a − d M u u u . au a0 au3 0 0 By Proposition 8, the law of the asymptotic variable B has no atom, and by Theorem (1, (iii)), it is really three-dimensional. None of , , Y is a function of the other two. Theorem 1 and Remarks 9, 10, 11, 12, incite to believe in the following, which, by classical methods, would imply that the Poisson boundary of G identifies with its geometric boundary B: Conjecture. The invariant σ -field of the relativistic diffusion in Gödel’s universe is the σ -field generated by the asymptotic three-dimensional random variable B = ( , , Y ) of Theorem 1 (exhibited by Proposition 6, Corollary 3, and Proposition 11).

3.8. Improvement of convergence. We show here that the convergence of Theorem (1,(ii)), of the generic diffusion path (ξs , ξ˙s ), occurs in fact in some stronger sense: on one hand, it is exponentially fast, as stated in Corollary 7 below, and on the other hand, it holds partially in the sense of Skorohod topology, as explained in the concluding Remark 13.

Lemma 6. We have almost surely, for any $n \ge 1$ and $\varepsilon > 0$: $\displaystyle \lim_{s \to \infty}\, a_s^{\,n - \varepsilon} \int_s^\infty a_u^{-n}\, du = 0$.


Proof. Recall from the proof of Lemma 1 that we have for any 0 ≤ s ≤ u:

s+u

s+u dv av−2 dwv as 2 σ2 = exp −σ u − σ (ws+u − ws ) − 2 +σ . as+u av2 s s 1 + 1 − av−2 Hence, by Lemma 1, for s → ∞:

∞

∞

−n −n au du ∼ as exp −n σ 2 u − n σ (ws+u − ws ) du s 0

( ε−n ) ∞ = o as exp −n σ 2 u − n σ (ws+u − ws ) − 2ε (σ 2 s + σ ws ) du

0

( ε−n ) ∞ 2 (n− 2ε )ws ε s+u s du exp −σ (n − 4ε )u + 4ε + σnw = o as (s+u) (s + u)+ 4 − σs 0 ( ) = o asε−n . Proposition 13. We have for any ε > 0, almost surely:

lim (˙z s − as ) as−ε = 0 .

s→+∞

z˙ 2 % $ z˙ s z˙ s s Proof. Since d Msz − d Msa = 1 − ds, using the expression for d as as as displayed in the proof of Proposition 6, we have: ∞

∞ z˙ 2 du z˙ u u 2 ˜ 1− , z˙ s − as = σ as du − σ as W au3 au au2 s s for some Brownian Motion W˜ . Now, by Proposition 6 and Lemma 6, we have almost surely: ∞

∞

∞ 2

z˙ u du z˙ u −2 ε−2 ˜ 1 − , W du = O(a )du = o a au a 2 u s a3 s

u

s

=o s

∞

1−ε du au2

2

u

s

= o as2ε−1 .

Therefore we get finally: z˙ s − as = o asε−1 + o as2ε = o as2ε .

Lemma 3, Proposition 13 and Sect. 3.2 imply at once the following. Corollary 6. For any ε > 0, we have almost surely: √

e−

2 ω xs

sinh λs = 1 − 2 + o(asε−1 ), as

bs = 2 − 2(1 − 2 ) sin(ω ϕs ) + o(asε−1 ) and as x˙s = 1 − 2 as cos(ω ϕs ) + o(asε ).

In the same vein as Proposition 13, we have the following. Proposition 14. We have for any ε > 0, almost surely:

lim (bs − as ) as−ε = 0.

s→+∞


bs d as

Proof. We have: $ and

d Msb

√

= 2σ

2

e

2 ω xs

as2

bs d Msa , d Msb − as

bs σ ds + 3 as as

ds − σ 2

% √ √ bs bs2 a 2 ω xs bs 2 2 ω xs − d Ms = 4 e −2e − 2 ds, as as as

so that there exists a standard real Brownian motion W such that:

∞ √

∞ bs bu 2 e 2 ω xu

− = 2σ 2 du − σ du 3 au2 as a s s u ∞ √ √ bu2 du 2 ω xu bu 2 2 ω xu +σ W 4e . −2e − 2 au au au2 s Now, by Corollary 3 and Lemma 6, we have almost surely:

∞

e

√ 2 ω xu au2

du + s

ε−2 = o as ,

∞

s ∞

bu du au3

4e

s

√

2 ω x u bu au

− 2 e2

√

2 ω xu

−

bu2 au2

du au2

= o asε−2 .

Hence, as in the proof of Proposition 13, we deduce − bs /as = o as2ε−1 .

) ( Proposition 15. We have Y − Ys = o asε−1 almost surely, for any ε > 0. Proof. From the proof of Proposition 11, we have (for some Brownian motion W ):

∞ √ x˙u Ys − Y = Ys − Y∞ = e2 2 ω xu 3 du bu s

∞ x˙ 2 du √ 2 u 1 − 2 e2 2 ω xu +W 2ωσ2 bu bu2 s

∞

∞

as−2 ds + W O(1) au−2 du = o asε−1 , = O(1) √ 2 2 σ2 ω

s

as in the proof of Proposition 13, by Lemma 6.

s

As Proposition 6, Corollary 3, and Proposition 11 implied Corollary 4 in Sect. 3.7, the following is easily implied by the above Propositions 13, 14, 15. Corollary 7. For any ε > 0, we have almost surely, as s → ∞:

2

√

e−

2 ω xs

2 ω

2 (ys − Y ) = 21 (1 − 2 ) + o asε−1 . −1 + 2

In Theorem (1, (ii)), the four converging processes are at distance o(e(ε−1)σ s ) of their limits. 2


Finally, another type of improvement of the convergence is noticed in the following. Remark 13. By Corollary 6 and Propositions 14 and 15, we have on one hand: √ √ 2 2(1− 2 ) sin(ω ϕs ) + o(e(ε−1)σ s ) and e− 2 ω xs = 2 −

√ 2 2(1− 2 ) ys = Y − cos(ω ϕs ) + o(e(ε−1)σ s ), ω

while by Remark 4, for any light ray ξ = (t, x, y, z) belonging to the beam B = ( , , Y ), using the increasing diffeomorphism ϕ¯ = (τ → ϕ¯τ ), we have on the other hand: √ √ 2(1− 2 ) exp[− 2 ω x ϕ¯ −1 (ϕs ) ] = 2 − sin(ω ϕs ) and

√ 2(1− 2 ) y ϕ¯ −1 (ϕs ) = Y − cos(ω ϕs ). ω

Hence, we have in the (x, y)-plane a strong convergence, of the projection of the generic relativistic diffusion path to the projection of a lightlike geodesic, in the Skorohod topology: 2 xs − x ϕ¯ −1 (ϕs ) + ys − y ϕ¯ −1 (ϕs ) = o(e(ε−1)σ s ). Otherwise, by Corollary 6, Propositions 5, 13, and Remark 4, we have:

s √ z s + ts = z s + t0 + (e− 2 ω xu bu − au )du 0

s = ϕs + O(1) + (˙z u − au )du = ϕs + o(eε s ) 0

= z ϕ¯ −1 (ϕs ) + t ϕ¯ −1 (ϕs ) + o(eε s )

$= \big(z_{\bar\varphi^{-1}(\varphi_s)} + t_{\bar\varphi^{-1}(\varphi_s)}\big)\big(1 + o\big(e^{(\varepsilon - 1)\sigma^2 s}\big)\big) \longrightarrow \infty.$

Hence, again in the Skorohod topology, and in the (z, t)-plane, the projection of the limiting light ray ξ stands for an asymptotic direction for the projection of the generic relativistic diffusion path, but there is no exactly asymptotic lightlike geodesic (a parabolic branch occurs).

References

[A] Ancona, A.: Convexity at infinity and Brownian motion on manifolds with unbounded negative curvature. Rev. Mat. Iberoamer. 10, no 1, 189–220 (1994)
[A-S] Anderson, M.T., Schoen, R.: Positive harmonic functions on complete manifolds of negative curvature. Ann. Math. (2) 121, no 3, 429–461 (1985)
[A-T-U] Arnaudon, M., Thalmaier, A., Ulsamer, S.: Existence of non-trivial harmonic functions on Cartan-Hadamard manifolds of unbounded curvature. To appear Math. Zeit., doi:10.1007/s00209-008-0422-6, 2009
[B] Bailleul, I.: Poisson boundary of a relativistic diffusion. Prob. Th. Rel. Fields 141, no 1-2, 283–329 (2008)
[B-R] Bailleul, I., Raugi, A.: Where does randomness lead in space-time? To appear in ESAIM, Probab. and Stat., 2009, doi:10.1051/ps:2008021, 39 pp
[C-W] Chandrasekhar, S., Wright, J.P.: The geodesics in Gödel universe. Proc. Nat. Ac. Sc. U.S.A. 47, no 3, 341–347 (1961)
[De] Debbasch, F.: A diffusion process in curved space-time. J. Math. Phys. 45, no 7, 2744–2760 (2004)
[Du] Dudley, R.M.: Lorentz-invariant Markov processes in relativistic phase space. Arkiv för Mat. 6, no 14, 241–268 (1965)
[F-LJ] Franchi, J., Le Jan, Y.: Relativistic diffusions and Schwarzschild geometry. Comm. Pure Appl. Math. LX, no 2, 187–251 (2007)
[G1] Gödel, K.: An example of a new type of cosmological solution of Einstein’s field equations of gravitation. Rev. Mod. Phys. 21, 447–450 (1949)
[G2] Gödel, K.: Rotating universes in general relativity theory. Proc. Int. Congress Math., Cambridge, Mass., 1950, vol. 1, Providence, RI: Amer. Math. Soc., 1952, pp. 175–181
[H-E] Hawking, S.W., Ellis, G.F.R.: The Large-Scale Structure of Space-Time. Cambridge: Cambridge University Press, 1973
[I-W] Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. Amsterdam/Tokyo: North-Holland/Kodansha, 1981
[Ki] Kifer, J.I.: Brownian motion and positive harmonic functions on complete manifolds of nonpositive curvature. In: From Local Times to Global Geometry, Control and Physics (Coventry 1984–85), Pitman Res. Notes Math. Vol. 150, Harlow: Longman Sci. Tech., 1986, pp. 187–232
[Ku] Kundt, W.: Trägheitsbahnen in einem von Gödel angegebenen kosmologischen Modell. Zeit. für Phys. 145, 611–620 (1956)
[L] Levichev, A.V.: Causal structure of left-invariant Lorentz metrics on the group M2 ⊗ R2. Siberian Math. J. 31, 607–614 (1990)
[M1] Malament, D.: Minimal acceleration requirement for time travel in Gödel space-time. J. Math. Phys. 26, no 4, 774–777 (1985)
[M2] Malament, D.: A note about closed timelike curves in Gödel space-time. J. Math. Phys. 28, no 10, 2427–2430 (1987)
[N-R-W] Norris, J.R., Rogers, L.C.G., Williams, D.: Brownian motions of ellipsoids. Trans. Amer. Math. Soc. 294, no 2, 757–765 (1986)
[P] Penrose, R.: Techniques of Differential Topology in Relativity. J. Conf. Board of the Math. Sciences Regional Conf. Series in Applied Math., no 7. Philadelphia, PA: Society for Industrial and Applied Mathematics, 1972
[R-T] Rebouças, M.J., Tiomno, J.: Homogeneity of Riemannian space-times of Gödel type. Phys. Rev. D 28, no 6, 1251–1264 (1993)
[S] Sullivan, D.: The Dirichlet problem at infinity for a negatively curved manifold. J. Differ. Geom. 18, no 4, 723–732 (1984)

Communicated by G. W. Gibbons

Commun. Math. Phys. 290, 557–576 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0846-9

Communications in

Mathematical Physics

A Numerical Study of Arnold Diffusion in a Priori Unstable Systems

Massimiliano Guzzo², Elena Lega¹, Claude Froeschlé¹

¹ UNSA, CNRS UMR 6202, Observatoire de Nice, Bv. de l’Observatoire, B.P. 4229, 06304 Nice cedex 4, France
² Università degli Studi di Padova, Dipartimento di Matematica Pura ed Applicata, via Trieste 63, 35121 Padova, Italy. E-mail: [email protected]

Received: 28 February 2008 / Revised: 25 March 2009 / Accepted: 30 March 2009
Published online: 13 June 2009 – © Springer-Verlag 2009

Abstract: This paper concerns the problem of the numerical detection of Arnold diffusion in a priori unstable systems. Specifically, we introduce a new definition of Arnold diffusion which is adapted to the numerical investigation of the problem, and is based on the numerical computation of the stable and unstable manifolds of the system. Examples of this Arnold diffusion are provided in a model system. In this model, we also find that Arnold diffusion behaves as an approximate Markovian process, thus it becomes possible to compute diffusion coefficients. The values of the diffusion coefficients satisfy the scaling $D(\varepsilon) \simeq \varepsilon^2$. We also find that this law is correlated to the validity of the Melnikov approximation: in fact, the $D(\varepsilon) \simeq \varepsilon^2$ law is valid up to the same critical value of $\varepsilon$ for which the error terms of Melnikov approximations have a sharp increment.

1. Introduction

Diffusion in conservative dynamical systems has been a much studied subject in the last decades. Apart from specific examples, the understanding of the general mechanisms which can produce drift and diffusion in the phase space of such systems is an interesting, and still open, problem. The existence of a slow diffusion of the actions in a specific quasi-integrable system has been proved for the first time by Arnold [1]. The proof of the Arnold diffusion is based on the existence of chains of whiskered tori such that, under the effect of a perturbation, the unstable manifold of one intersects the stable manifold of the next one. The sequence of such invariant tori is called a transition chain and the shadowing argument used to prove diffusion through the transition chain is called a transition chain mechanism. A non-generic feature of Arnold’s example is that the hyperbolic invariant manifold along which Arnold proves the existence of diffusion is fibered by invariant tori for all values of the perturbing parameter. That is, the restriction of the dynamical system to the invariant manifold is integrable.
Generalizations of Arnold’s example consider normally hyperbolic invariant manifolds such that the restriction of the dynamics to them is not integrable. As a consequence, the distribution of the invariant whiskered tori has gaps which correspond to the resonances of the dynamical system restricted to the invariant manifold. In [6] the existence of transition chains in regions of the invariant manifold which do not contain a selected number of main resonances is proved. In [8] transition chains crossing the main resonances are constructed by including also stable and unstable manifolds of invariant sets which are topologically different from invariant whiskered tori. The existence of diffusing motions has been proved also in [2–4] using different models and techniques, including variational methods based on Mather theory, and in [26,27] using the so-called separatrix map. The most important techniques to prove the existence of transition chains (used in [6,8,26,27]) are based on the so-called Melnikov theory, which provides first order approximations of the stable and unstable manifolds. Arnold’s paper motivated a great debate about the possibility of numerical detection of Arnold diffusion. A few years after the first numerical detections of chaotic motions [12], problems related to the numerical detection of Arnold diffusion were discussed in [7]. In the following decades, many authors studied numerically the diffusion through resonances, referring to it as Arnold diffusion. For example, explicit reference to possible interpretations of numerical diffusion as Arnold diffusion can be found in [17]. Other papers, such as [9,16,20,29], studied the numerical diffusion of orbits in coupled standard maps by changing the perturbation parameters and the number of coupled maps (for a review see also [19] and references therein). Computations of the stable and unstable manifolds of hyperbolic tori related to an Arnold diffusion problem can be found in [25].
In [10,11,18] we studied the diffusion of orbits in quasi-integrable systems for values of the perturbing parameters for which there is numerical evidence of applicability of the KAM and Nekhoroshev theorems. In this paper we study the problem of the numerical detection of Arnold diffusion for an important class of conservative systems, the so-called a priori unstable ones, following the terminology introduced in [6]. To do this, we first need to define precisely which Arnold diffusion can be measured with numerical experiments, which can be repeated only for a finite number of initial conditions and values of the perturbing parameter. We therefore provide a new definition of Arnold diffusion which is based on the computation of the stable and unstable manifolds of two whiskered tori (or other invariant hyperbolic objects) for specific values of the perturbing parameter ε. The perturbing parameter ε is required to be sufficiently small so that the normally hyperbolic invariant manifold is filled by a large volume of whiskered tori. An ideal verification of Arnold diffusion would correspond to the detection of heteroclinic transitions among the stable and unstable manifolds of different whiskered tori. However, the probability of finding an orbit which passes near a selected number of heteroclinic points is very small (in [6], based on a prescribed selection of the successive passages, this is reflected in superexponential estimates of the time of diffusion, in the sense used there), and moreover the time needed for an orbit to perform an exact heteroclinic excursion would be infinite. Therefore, we base our definition on the detection of approximate heteroclinic transitions, corresponding to the existence of orbits with initial conditions in a neighbourhood of a whiskered torus which enter a finite neighbourhood of another whiskered torus in some finite time, with suitably ‘large’ variation of the action variables which are constants of motion for ε = 0.
On the one hand, the new notion of Arnold diffusion is weaker than the usual one, because it refers to finite values of ε and to specific neighbourhoods of the whiskered tori. On the other hand, it provides a numerical verification of the topological mechanism which is behind the diffusion of the actions, inspired by the proofs of Arnold diffusion such as [6] and [8]. We provide numerical examples of this Arnold diffusion,

Hyperbolic Manifolds and Arnold Diffusion


with a control of the numerical errors, including also a transition among whiskered tori with large gaps between them. The detection of Arnold diffusion in the sense stated by the new definition requires the precise detection of one approximate heteroclinic transition. Therefore, we needed to set the numerical precision to the high value of 400 digits in the numerical experiments. By relaxing the numerical precision of the computations (precisely, we switched to double precision) we could compute the statistical properties of many of these transitions. We remark that, while the lower precision drastically affects the individual integrated orbits after some Lyapunov times, it affects much less the computation of statistical quantities, such as the Lyapunov exponents and the diffusion coefficients (see, for example, [23]). We find that, for small values of ε, Arnold diffusion behaves as an approximate Markovian process (see Sects. 4 and 5.2 for the precise meaning of ‘approximate’), allowing one to compute diffusion coefficients, and the values of these diffusion coefficients satisfy a scaling D(ε) ≃ ε^2. For higher values of ε, data cannot be fitted by the ε^2 law, and we do not try any fit because their statistics is poor (see Sect. 4 and the technical Sect. 5.2 for details). It is remarkable that the D(ε) ≃ ε^2 law is correlated to the validity of the Melnikov approximation, in the sense that the D(ε) ≃ ε^2 law is valid up to the same critical value of ε for which the error terms of Melnikov approximations (which increase with ε as well) have a sharp increment. The paper is organized as follows: in Sect. 2 we introduce a new definition of Arnold diffusion suited for numerical experiments; in Sect. 3 we provide numerical examples of Arnold diffusion; in Sect. 4 we numerically show that, for small values of ε, Arnold diffusion behaves as an approximate Markovian process and we discuss the relevance of Melnikov approximations.
The technical tools used throughout the paper, that is the tools related to normal hyperbolicity, the computation of the stable and unstable manifolds, the statistical tools and the computation of Melnikov approximations, are reported in Sect. 5. Conclusions are provided in Sect. 6.

2. A Definition of Arnold Diffusion Suited to Numerical Experiments

We consider dynamical systems defined by a family of smooth symplectic maps: (I′, ϕ′) = φ_ε(I, ϕ), with the action–angle variables (I, ϕ) defined on a domain B × T^n, with B ⊆ R^n open bounded. The family φ_ε depends smoothly on the parameter ε. We assume that:
– some actions I_j, ..., I_n, with j > 1, are constants of motion of the unperturbed map φ_0;
– φ_0 has an invariant sub-manifold Λ_0 which is normally hyperbolic and symplectic (the definition of normal hyperbolicity is recalled in Sect. 5.1);
– the restriction of φ_0 to Λ_0 is an integrable anisochronous map.^1
We call such maps a priori unstable. We will consider suitably small |ε| such that the map φ_ε has an invariant sub-manifold Λ_ε which is normally hyperbolic, symplectic, and canonically smoothly conjugate to Λ_0. Therefore, the KAM theorem for maps applies to the restriction of φ_ε to Λ_ε, and ε is the small parameter. For some c_0 > 0, and any small ε, we require that any orbit (I(t), ϕ(t)) = φ_ε^t(I(0), ϕ(0)), with (I(0), ϕ(0)) ∈ Λ_ε, satisfies:

$$\|I(t) - I(0)\| < c_0 \tag{1}$$

for any t ∈ Z. That is, the motions of φ_ε with initial conditions on Λ_ε are uniformly bounded in the actions. Therefore, Arnold diffusion concerns the dynamics in neighborhoods of Λ_ε.

^1 The complete integrability of φ_0 restricted to Λ_0 is intended with reference to the symplectic form dI ∧ dϕ restricted to Λ_0.

Definition 1. The problem of Arnold diffusion for φ_ε consists in proving that, for any suitably small ε ≠ 0 and for any neighbourhood V of Λ_ε, there exist motions such that for some t ∈ Z it is:

$$(I(0), \varphi(0)),\ (I(t), \varphi(t)) \in V, \quad \text{and} \quad \|I(t) - I(0)\| > 2c_0. \tag{2}$$

The above definition of Arnold diffusion is not well suited for the numerical study of the problem, because numerical integrations cannot span any value of the perturbing parameter and any small neighbourhood of Λ_ε. Therefore, we give below a definition which is better adapted to the numerical investigation, and which still contains most of the complexity of Arnold diffusion.

Definition 2. The problem of the numerical detection of Arnold diffusion for φ_ε in the subset Λ̃_ε ⊆ Λ_ε consists in the numerical detection of:
• two points x′ = (I′, ϕ′), x″ = (I″, ϕ″) ∈ Λ̃_ε such that the closures C(x′), C(x″) of their orbits have empty intersection;
• two vectors Δx′ = (ΔI′, Δϕ′), Δx″ = (ΔI″, Δϕ″) ∈ R^{2n};
• a positive t ∈ N and an index k ∈ {j, ..., n};
such that:
• x′ + Δx′ ∈ W_u(x′), where W_u(x′) denotes the unstable manifold of x′;
• φ_ε^t(x′ + Δx′) + Δx″ ∈ W_s(x″), where W_s(x″) denotes the stable manifold of x″;
• for any (Ĩ′, ϕ̃′) ∈ C(x′), (Ĩ″, ϕ̃″) ∈ C(x″) it is:

$$|\tilde I''_k - \tilde I'_k| > c_k + |\Delta I'_k| + |\Delta I''_k|, \tag{3}$$

where c_k is such that any orbit (I(h), ϕ(h)) = φ_ε^h(I(0), ϕ(0)), with (I(0), ϕ(0)) ∈ Λ̃_ε, satisfies:

$$|I_k(h) - I_k(0)| < c_k \quad \forall\, h \in \mathbb{Z}; \tag{4}$$

for some values of the perturbing parameter ε such that the KAM theorem applies to the restriction of φ_ε to Λ_ε and inequality (1) is satisfied.

We remark that Arnold diffusion in the sense of Definition 2 does not exist for the unperturbed map φ_0, because in such a case the actions I_j, ..., I_n are constants of motion. The above definition is clearly inspired by the proofs of existence of Arnold diffusion which show that the stable and unstable manifolds of different invariant tori of Λ_ε intersect transversely (for a precise statement we refer to [6]). An ideal verification of Arnold diffusion would correspond to the detection of a heteroclinic intersection between W_u(x′) and W_s(x″). However, the probability of finding an orbit passing near a selected number of heteroclinic points is very small (in [6], based on a prescribed selection of the successive passages, this is reflected in superexponential estimates of the time of diffusion, in the sense used there), and moreover the time needed to perform an exact heteroclinic excursion would be infinite. Therefore, Definition 2 is based on the detection of approximate


heteroclinic transitions, corresponding to the existence of orbits with initial conditions in a neighbourhood of x′ which enter a finite neighbourhood of x″ in some finite time, with variation of the action variable I_k as in (3). On the one hand, Definition 2 is weaker than Definition 1 because it refers to finite values of ε and to specific neighbourhoods of the whiskered tori. On the other hand, it also provides a numerical verification of the topological mechanism which is behind the diffusion of the actions, inspired by the proofs of Arnold diffusion, such as in [6] and [8]. In Sect. 3 we provide an example of numerical detection of Arnold diffusion for which Eq. (3) is verified within the numerical errors, i.e. we find:

$$|\tilde I''_k - \tilde I'_k| > c_k + |\Delta I'_k| + |\Delta I''_k| + \rho, \tag{5}$$

where ρ > 0 is an estimator of the numerical errors (see technical Sect. 5.1 for details).

3. Arnold Diffusion in a Model Problem

We detect Arnold diffusion in the a priori unstable system defined by the family of maps:

$$\phi_\epsilon : \mathbb{R}^2 \times \mathbb{T}^2 \longrightarrow \mathbb{R}^2 \times \mathbb{T}^2, \qquad (\varphi_1, \varphi_2, I_1, I_2) \longmapsto (\varphi_1', \varphi_2', I_1', I_2') \tag{6}$$

such that:

$$
\begin{aligned}
\varphi_1' &= \varphi_1 + I_1, \qquad \varphi_2' = \varphi_2 + I_2,\\
I_1' &= I_1 - a \sin\varphi_1' + \epsilon\,\frac{\sin\varphi_1'}{(\cos\varphi_1' + \cos\varphi_2' + c)^2},\\
I_2' &= I_2 + \epsilon\,\frac{\sin\varphi_2'}{(\cos\varphi_1' + \cos\varphi_2' + c)^2},
\end{aligned} \tag{7}
$$

where a > 0 and c > 2 are parameters (in all the numerical experiments we set a = 0.4, c = 2.1). The symplectic structure on R^2 × T^2 is the standard one: dϕ_1 ∧ dI_1 + dϕ_2 ∧ dI_2. According to the definitions given in Sect. 2, the family (7) indeed defines an a priori unstable system. In fact:
– The action I_2 is a first integral of the unperturbed map φ_0.
– φ_0 has an invariant manifold Λ_0 defined by:

$$\Lambda_0 = \{(I_1, \varphi_1, I_2, \varphi_2) : (I_1, \varphi_1) = (0, \pi)\}, \tag{8}$$

which is normally hyperbolic and symplectic. The stable and unstable manifolds of Λ_0 are the product of the stable and unstable manifolds of the hyperbolic fixed point of the standard map:

$$\varphi_1' = \varphi_1 + I_1, \qquad I_1' = I_1 - a \sin\varphi_1',$$

with R × T, domain of (I_2, ϕ_2).


– The restriction of φ_0 to Λ_0 is represented by the 2-dimensional map:

$$\varphi_2' = \varphi_2 + I_2, \qquad I_2' = I_2, \tag{9}$$

while I_1′ = I_1, ϕ_1′ = ϕ_1. The 2-dimensional map (9) is integrable anisochronous, with first integral I_2.

The manifold Λ_0 is invariant also for the map φ_ε for any ε, and it is also normally hyperbolic if ε is suitably small. Therefore, according to the notations and definitions given in Sect. 2, for all such ε it is Λ_ε = Λ_0 and the restriction of φ_ε to the invariant manifold Λ_ε is explicitly represented by the 2-dimensional map:

$$\varphi_2' = \varphi_2 + I_2, \qquad I_2' = I_2 + \epsilon\,\frac{\sin\varphi_2'}{(\cos\varphi_2' + c - 1)^2}, \tag{10}$$

while I_1′ = I_1, ϕ_1′ = ϕ_1. The explicit representation of Λ_ε = Λ_0 and of the restriction of φ_ε to Λ_ε greatly simplifies the technical implementation of the numerical experiments. In the following, to simplify the notations, we denote φ = φ_ε and Λ = Λ_0 = Λ_ε. To numerically detect Arnold diffusion for the map φ in the sense provided by Definition 2 we need to study in more detail the dynamics of the restricted map (10). First, we fix an interval D = [0.26, 0.38] of the action I_2 and we determine with a numerical method the value ε_c such that the KAM theorem is valid for any 0 ≠ ε ≤ ε_c in some open domain of (I_2, ϕ_2) containing D × T. Precisely, because we detected the presence of KAM curves in numerically computed phase portraits of (10) for 0 ≤ ε ≤ 0.002, while we did not detect any KAM curve for ε = 0.0026, we inferred that ε_c ∈ (0.002, 0.0026). Because the KAM curves of (10) are topological barriers for the variation of the action I_2 for any motion with initial condition on Λ, condition (1) is satisfied for some c_0 > 0 for any 0 ≤ ε < 0.002. We numerically find Arnold diffusion for ε = 10^−6, 10^−4 < ε_c (of course, other values can be investigated with the same techniques). The phase portraits of the restricted map φ|Λ are reported in Fig. 1, top–left panel (ε = 10^−6) and bottom–left panel (ε = 10^−4): in both cases the phase portraits contain several KAM curves, and for ε = 10^−4 also an evident resonance. We remark that, in studies of Arnold diffusion, resonances of this kind are usually called large gaps in the distribution of invariant tori. The variations of the action I_2 for initial conditions on Λ are bounded by a constant c_2 (see (4)) which can be computed from the phase portraits (see Table 1). The points x′, x″ that we find to satisfy Eqs. (3) and (5) belong to the bold KAM curves marked in the two phase portraits (the bottom ones for x′ and the upper ones for x″).
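As an illustration of the map (7) and of its restriction (10), the following Python sketch implements one iteration of each. It assumes that the kicks are evaluated at the updated angles ϕ_1′, ϕ_2′ (the choice that makes each step symplectic); the function names `full_map` and `restricted_map` are hypothetical, and the angles are left unreduced rather than taken modulo 2π.

```python
import math

A = 0.4   # parameter a used in the paper's numerical experiments
C = 2.1   # parameter c used in the paper's numerical experiments


def full_map(phi1, phi2, I1, I2, eps, a=A, c=C):
    """One iteration of the map (7); angles are not reduced modulo 2*pi."""
    phi1_new = phi1 + I1
    phi2_new = phi2 + I2
    # Assumption: the kicks are evaluated at the updated angles.
    denom = (math.cos(phi1_new) + math.cos(phi2_new) + c) ** 2
    I1_new = I1 - a * math.sin(phi1_new) + eps * math.sin(phi1_new) / denom
    I2_new = I2 + eps * math.sin(phi2_new) / denom
    return phi1_new, phi2_new, I1_new, I2_new


def restricted_map(phi2, I2, eps, c=C):
    """One iteration of the restricted map (10) on the invariant manifold."""
    phi2_new = phi2 + I2
    I2_new = I2 + eps * math.sin(phi2_new) / (math.cos(phi2_new) + c - 1.0) ** 2
    return phi2_new, I2_new


if __name__ == "__main__":
    eps = 1e-4
    # A point of the invariant manifold: (phi1, I1) = (pi, 0).
    print(full_map(math.pi, 0.0, 0.0, 0.324, eps))
    print(restricted_map(0.0, 0.324, eps))
```

In double precision sin(π) is not exactly zero, so an orbit of `full_map` started on the invariant manifold drifts away from it at the hyperbolic rate after a few tens of iterations; iterating `restricted_map` avoids this, which is one way to read the remark that the explicit representation simplifies the implementation.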
In both cases ε = 10^−4, 10^−6 we found that the crucial tolerances for the corrections ΔI_2′, ΔI_2″ on the action I_2 and the estimator ρ of the numerical error are respected within many orders of magnitude (see Table 1 and Fig. 1, right panels). Because inequalities (3) and (5) are satisfied (see Table 1, last column), the two computations are a numerical detection of Arnold diffusion in the sense stated by Definition 2. We remark (see Fig. 1, bottom–left panel) the presence of an evident resonance between the invariant tori containing x′ and x″, and therefore the unstable manifold of x′ has crossed this large gap before arriving near the stable manifold of x″. All details of these numerical computations are reported in Sect. 5.1. The above numerical detections of Arnold diffusion are based on the computation of pieces of the stable and unstable manifolds. To give also a global geometric idea of how the stable and unstable manifolds support Arnold diffusion we computed their parametrization with respect to their arc–length. Figure 2 reports the computation of the unstable manifold of a point on the torus containing x′ for ε = 10^−6 (see Sect. 5.1 for the computational details). On the top right panel it appears clearly that I_2 undergoes relatively large fluctuations. The unstable manifolds, which are contained in a plane of constant I_2 for ε = 0, are unrolled along the I_2 direction for ε > 0, thus supporting diffusion in the neighborhood of Λ. The (many) returns of W_u(x′) near the manifold Λ can be well appreciated in the three–dimensional representation (bottom right panel).

[Figure 1: four panels. Left panels: phase portraits in the plane (ϕ_2, I_2). Right panels: three-dimensional projections.]

Fig. 1. Numerical test of Eq. (3). On the top–left. Phase plane of the map φ restricted to Λ for ε = 10^−6: the invariant tori containing x′ and x″ are represented by bold curves. On the top–right. Projection on the space of variables ϕ_1 − π, I_1, I_2 − I_2″ of the point φ^T(x′ + Δx′) in the orbit of x′ + Δx′ and of the stable manifold of x″, for ε = 10^−6. We remark that with a correction Δx″ characterized by ΔI_2″ of order 10^−13 the point φ^T(x′ + Δx′) + Δx″ belongs to the stable manifold of x″. On the bottom–left. Phase plane of the map φ restricted to Λ for ε = 10^−4: the invariant tori containing x′ and x″ are represented by bold curves. We remark the presence of a large resonance between the two invariant tori. On the bottom–right. Projection on the space of variables ϕ_1 − π, I_1, I_2 − I_2″ of the point φ^T(x′ + Δx′) in the orbit of x′ + Δx′ and of the stable manifold of x″, for ε = 10^−4. We remark that with a correction Δx″ characterized by ΔI_2″ of order 10^−8 the point φ^T(x′ + Δx′) + Δx″ belongs to the stable manifold of x″.

Table 1. Tolerances for the verification of Eqs. (3) and (5). Because the last column has values bigger than 1, inequalities (3) and (5) are satisfied.

ε        c_2        |ΔI_2″|      ρ            |Ĩ_2″ − Ĩ_2′| / (c_2 + |ΔI_2′| + |ΔI_2″| + ρ)
10^−6    4·10^−5    < 10^−12     < 10^−173    > 6
10^−4    4·10^−3    < 3·10^−7    < 10^−113    > 1.6

[Figure 2: four panels. Top left: I_1(s) versus s. Top right: I_2(s) − I_2(0) versus s. Bottom left: I_2 − I_2(0) versus ϕ_2. Bottom right: three-dimensional view in (ϕ_1 − π, I_2 − I_2(0), I_1).]

Fig. 2. Computation of a parametrization of the unstable manifold of a point x = (ϕ_1, ϕ_2, I_1, I_2) = φ^10(π, 0, 0, 0.324) on a KAM curve, with respect to its arc–length s, for ε = 10^−6. On the top: Representation of I_1(s) (on the left) and I_2(s) (on the right). On the bottom left: The orbit of φ|Λ is on a KAM torus. The vertical segment contains the projection on the plane (I_2 − I_2(0), ϕ_2) of the points of W_u(x) with |ϕ_1 − π| ≤ 0.5 (reducing the tolerance on ϕ_1 decreases the number of points on the figure, but does not decrease the amplitude of the segment). The fluctuations of W_u(x) along I_2 are definitely bigger than the variation of I_2 along the torus. On the bottom right: Representation of the unstable manifold of x in the three dimensional space ϕ_1 − π, I_2 − I_2(0), I_1.

To compare the value of the action I_2 of these return points with the variation of I_2 along the torus, we represent in the bottom left panel the KAM curve of x′ and the vertical segment which corresponds to the representation on the plane (I_2, ϕ_2) of the points of the unstable manifold with |ϕ_1 − π| ≤ 0.5 (reducing the tolerance on ϕ_1 decreases the number of points on the figure, but does not decrease the amplitude of the segment). The amplitude of this segment is larger than c_2, providing an indication that these returns of W_u(x′) near Λ support Arnold diffusion.

4. Statistical Properties of Arnold Diffusion and Melnikov Approximations

The definitions of Arnold diffusion given in Sect. 2 characterize it as the possibility of orbits with initial conditions near Λ of returning near Λ with a suitably large variation of the actions (compared to c_k, see (3)). In particular, because Definition 2 requires the detection of one orbit returning to Λ, we needed to set the numerical precision to the high number of 400 digits, necessary to detect precisely that return. By relaxing the numerical precision of the computations (precisely, we switched to double precision) we computed the statistical properties of many of these returns. We remark that, while the lower precision affects the individual integrated orbits after some Lyapunov times, it affects much less the computation of statistical quantities, such as the Lyapunov exponents and the diffusion coefficients (see, for example, [23]). Our statistical study is based on the following considerations. First, we remark that the map (7) depends periodically on all the actions. This property simplifies the definition of the diffusion process because, in principle, the actions are allowed to diffuse indefinitely on R^2. Then, we consider the curve γ ⊆ Λ obtained by fixing all the variables except for the action I_2:

$$\gamma = \{(\varphi_1, \varphi_2, I_1, I_2) \ \text{with}\ \varphi_1 = \pi,\ \varphi_2 = 0,\ I_1 = 0\}, \tag{11}$$

we choose a neighbourhood W of γ, and perform a statistical analysis on the variations of I_2 for orbits with initial conditions in W, returning to W after some time. For initial conditions x = (ϕ_1, ϕ_2, I_1, I_2) ∈ W \ Λ we denote by t(x) the return time to W and by ψ(x) = φ^{t(x)}(x) the return map to W. The set W_* on which ψ is defined can be a proper subset of W, but by the Poincaré recurrence theorem (which applies to the present case because the map φ is periodic with respect to the actions) it has the same Lebesgue measure as W. Then, let us denote I_2(x) = I_2, and ΔI_2^i(x) = I_2(ψ^{i+1}(x)) − I_2(ψ^i(x)), i.e. the action variation occurred in the i-th return. A statistical approach to the dynamics in W_*, such as the one described in [28], would be justified by the existence of a set W̃_* ⊆ W_* of points x such that the sequence ΔI_2^1, ΔI_2^2, ... is a sequence of independent random variables. This is a very strong requirement that, to our knowledge, can represent only an approximate description of the dynamics of the system. In this spirit, the traditional statistical approaches, such as for example those based on random phase approximations, first replace the true dynamics with an approximate one which behaves as a Markovian process, and then compute statistical quantities that can be defined precisely via the Markovian approximation. Here, we proceed in a different way: instead of performing statistical approximations on the dynamics, we check that finite sets of initial conditions x_1, ..., x_N and the finite sequence ΔI_2^1, ..., ΔI_2^T averaged over these initial conditions behave as if the process were approximately Markovian, i.e. we check that the variable

$$Y_T = \frac{\Delta I_2^1 + \cdots + \Delta I_2^T}{T}$$

is normally distributed within a tolerance admitted for the central limit theorem convergence (see Sect. 5.2 for all the technical details). Then, we compute the diffusion coefficient D of the initial conditions x_1, ..., x_N, as if the process were a Markovian one:

$$D = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{T} \sum_{i=1}^{T} \frac{\left[\Delta I_2^i(x_j)\right]^2}{t_i(x_j)},$$

where t_i(x_j) = t(ψ^{i−1}(x_j)) denotes the i-th return time of the j-th initial condition. We remark that positive diffusion coefficients can be measured only for ε ≠ 0, because, for ε = 0, it is ΔI_2^i(x) = 0 for all i, for all x ∈ W, for any choice of W. For ε ≠ 0, the values of the diffusion coefficients depend also on the choice of W.
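The double average defining D can be evaluated directly once the action jumps and the return times have been recorded. The sketch below is only an illustration: the function name `diffusion_coefficient` and the (N, T) array layout are assumptions, and the numbers in the usage example are synthetic, not data from the paper.

```python
import numpy as np


def diffusion_coefficient(delta_I2, return_times):
    """Average of (action jump)^2 / (return time) over T returns and N orbits.

    delta_I2[j, i]     : variation of I_2 during the i-th return of orbit j
    return_times[j, i] : number of map iterations taken by that return
    """
    delta_I2 = np.asarray(delta_I2, dtype=float)
    return_times = np.asarray(return_times, dtype=float)
    if delta_I2.ndim != 2 or delta_I2.shape != return_times.shape:
        raise ValueError("expected two arrays of identical shape (N, T)")
    if np.any(return_times <= 0):
        raise ValueError("return times must be positive")
    # The mean over both axes equals (1/N) sum_j (1/T) sum_i of the ratio.
    return float(np.mean(delta_I2 ** 2 / return_times))


if __name__ == "__main__":
    # Synthetic values for N = 2 orbits and T = 2 returns (not from the paper).
    jumps = [[1e-6, -2e-6], [3e-6, 0.0]]
    times = [[100, 400], [900, 250]]
    print(diffusion_coefficient(jumps, times))
```

Collecting the jumps themselves requires iterating the map until each orbit re-enters W, which can take many iterations per return; that loop is deliberately left out of the sketch.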

[Figure 3: two panels, both plotting Log10(1/D) versus Log10(1/ε).]

Fig. 3. On the left: Computation of the diffusion coefficient for different values of ε ∈ (10^−13, 10^−4) for a set of N = 100 initial conditions in W (the initial conditions are I_2 = 0.324, I_1 ∈ [−10^−5, 10^−5], ϕ_1 = π, ϕ_2 = 0), using 100 return times to W. Data are very well fitted to a power law D(ε) ≃ ε^2 for 10^−13 ≤ ε ≤ 10^−6. Data diffusing with regular statistics (precisely, satisfying (s1), (s2), see Sect. 5.2) are represented with a cross symbol, while the other data are represented by squares. On the right: Representation of a zoom of the data of the left panel with their error bars.
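A power-law fit of the kind mentioned in the caption of Fig. 3 amounts to a straight-line fit in log–log coordinates. The following sketch shows one possible way to do it with an ordinary least-squares fit; the function name `fit_power_law` is hypothetical, the paper does not state which fitting procedure was used, and the data in the usage example are synthetic.

```python
import numpy as np


def fit_power_law(eps, D):
    """Least-squares fit of log10 D = log10 K + p * log10 eps; returns (p, K)."""
    log_eps = np.log10(np.asarray(eps, dtype=float))
    log_D = np.log10(np.asarray(D, dtype=float))
    p, log_K = np.polyfit(log_eps, log_D, 1)
    return float(p), float(10.0 ** log_K)


if __name__ == "__main__":
    # SYNTHETIC data generated from an exact eps**2 law, used only to
    # exercise the fit; these are not values from the paper.
    eps = np.logspace(-13, -6, 8)
    D = 3.0 * eps ** 2
    print(fit_power_law(eps, D))
```

On measured data one would restrict the fit to the range of ε where the statistics is regular, as done in the text, and inspect the residuals rather than rely on the exponent alone.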

For example, we expect that the dynamics better approximates a Markovian process when the neighbourhood W is restricted. In Fig. 3 we report the computation of the diffusion coefficient for different values of ε ∈ (10^−13, 10^−4) for a set of N = 100 initial conditions in a set W defined by:

$$W = \{(I_1, \varphi_1, I_2, \varphi_2) : \max\{|I_1|, |\varphi_1 - \pi|, |\varphi_2|\} < 0.01\}, \tag{12}$$

using T = 100 returns to W. We find that the values of the diffusion coefficients reported in Fig. 3 are well fitted by a power law D(ε) ≃ ε^2 for ε ≤ 10^−6. For these small values of ε, the sets of integrated initial conditions behave as approximate Markovian processes (that is, they satisfy conditions (s1), (s2), see Sect. 5.2), allowing us to compute diffusion coefficients, and the values of these diffusion coefficients satisfy the scaling law D(ε) ≃ ε^2. For higher values of ε, that is for ε ≥ 10^−6, data cannot be fitted by the ε^2 law, and we do not try any fit because their statistics is poor (i.e. (s1), (s2) are not satisfied, see Sect. 5.2). It is remarkable that the critical value ε = 10^−6 is close to the value for which the error terms of Melnikov approximations (which increase with ε) have a sharp increment. Precisely, in Fig. 4 we report the computation of a distance (defined in (28)) between the unstable manifold of a point of an invariant KAM curve and the unstable manifold computed using the Melnikov approximation (see Sect. 5.3 for the technical details). The distance between the two manifolds increases by two orders of magnitude between ε = 10^−6 and ε = 10^−4.

5. Technical Tools

5.1. Normal hyperbolic invariant manifolds: numerical check of normal hyperbolicity and computation of the stable and unstable manifolds. The notion of normally hyperbolic invariant manifolds is extensively studied in [14], and can be stated as follows (see, for example, [13,14]):

[Figure 4: d versus ε, both axes on logarithmic scale.]

Fig. 4. Computation of the quantity d defined in (28) as a function of ε. The quantity d represents a distance between the unstable manifold of a point x and its Melnikov approximation, divided by ε. From the values reported in the figure we can appreciate that d increases by two orders of magnitude between ε = 10^−6 and ε = 10^−4. We also find that d decreases slowly for ε < 10^−8 (for example, we measured d ∼ 10^−7 for ε ∼ 10^−20).

Definition 3. Let M be a C^q (q ≥ 1) compact connected manifold; let U ⊆ M be open and let φ : U → M be a C^q embedding; let Λ be a sub-manifold of M which is invariant by φ. The map φ is said to be normally hyperbolic on Λ (Λ is also said to be a normally hyperbolic invariant manifold) if there exists a Riemannian structure on M such that for any point x ∈ Λ the tangent space T_xM has the following splitting:

$$T_x M = E^s(x) \oplus T_x\Lambda \oplus E^u(x),$$

which is continuous and invariant, i.e. the linear spaces E^s(x), E^u(x) are invariant by φ:

$$D\phi\, E^s(x) \subseteq E^s(\phi(x)), \qquad D\phi\, E^u(x) \subseteq E^u(\phi(x)),$$

and there exist constants λ_1, λ_2, λ_3, µ_1, µ_2, µ_3 satisfying:

$$0 < \lambda_1 \le \mu_1 < \lambda_2 \le \mu_2 < \lambda_3 \le \mu_3, \qquad \mu_1 < 1 < \lambda_3, \tag{13}$$

such that:

$$
\begin{aligned}
\lambda_1 &\le \inf_{\xi \in E^s(x)\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \sup_{\xi \in E^s(x)\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \mu_1,\\
\lambda_2 &\le \inf_{\xi \in T_x\Lambda\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \sup_{\xi \in T_x\Lambda\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \mu_2,\\
\lambda_3 &\le \inf_{\xi \in E^u(x)\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \sup_{\xi \in E^u(x)\setminus 0} \frac{\|D\phi(x)\xi\|}{\|\xi\|} \le \mu_3.
\end{aligned} \tag{14}
$$

Normally hyperbolic invariant manifolds have stable and unstable manifolds. Precisely, for any x ∈ Λ there exist the smooth manifolds W_s^loc(x), W_u^loc(x) (see [14]) such that: x ∈ W_s^loc(x), W_u^loc(x), T_xW_s^loc(x) = E^s(x), T_xW_u^loc(x) = E^u(x) and for any n ≥ 0:

$$
\begin{aligned}
y \in W_s^{loc}(x) &\Rightarrow d(\phi^n(x), \phi^n(y)) \le C(\mu_1 + c)^n\, d(x, y),\\
y \in W_u^{loc}(x) &\Rightarrow d(\phi^{-n}(x), \phi^{-n}(y)) \le C(\lambda_3 - c)^{-n}\, d(x, y),
\end{aligned} \tag{15}
$$

with C, c > 0 suitable constants (c suitably small) and where d(·, ·) denotes a distance on M. The manifolds W_s(x), W_u(x) are then obtained by iterating the local manifolds W_s^loc(x), W_u^loc(x) with φ^−1 and φ respectively. The local stable and unstable manifolds of Λ are defined by:

$$W_s^{loc} = \bigcup_{x \in \Lambda} W_s^{loc}(x), \qquad W_u^{loc} = \bigcup_{x \in \Lambda} W_u^{loc}(x), \tag{16}$$

while the stable and unstable manifolds of Λ are:

$$W_s = \bigcup_{x \in \Lambda} W_s(x), \qquad W_u = \bigcup_{x \in \Lambda} W_u(x). \tag{17}$$

Below we describe the numerical methods that we use to compute points of the stable and unstable manifolds of the map (7), adapting the method of propagation of sets commonly used for hyperbolic fixed points of two dimensional maps. A sophisticated version of this method providing high precision computations and good visualizations of pieces of the manifold can be found in [24]. Different sophisticated methods can be found in the literature for computing unstable manifolds for the higher dimensional cases (see [15] for a detailed review with applications to the visualization of two dimensional manifolds). The common point of all these methods is that the manifolds are constructed from local linear approximations (see, for example, [5]). A technique specifically adapted to compute stable manifolds of hyperbolic tori is described in [25]. A numerical study of the relation between splittings of stable and unstable manifolds and normal forms is done in [22]. The methods that we used in Sect. 3 adapt these known techniques (for example of [24]) to the present case, and consist of the following steps:

i) Verification that the manifold Λ is normally hyperbolic. We numerically check that the invariant manifold Λ is normally hyperbolic for ε = 0.0001, which is the largest value of the perturbing parameter used in this paper. Precisely, we check that a compact invariant region of Λ, delimited by two invariant KAM curves containing (I_2, ϕ_2) = (±2, 0), is normally hyperbolic with respect to the map φ^N for some integer N. For each point x of a grid of initial conditions with I_2 ∈ [−2, 2], I_1 = 0, ϕ_1 = π, ϕ_2 = 0 we first compute the Lyapunov exponents of the map φ (up to N = 10^3 iterations) for initial tangent vectors in the tangent space T_xΛ^ort orthogonal to T_xΛ, i.e. for vectors of the form ξ = (ξ_ϕ1, 0, ξ_I1, 0). We measure a positive Lyapunov exponent bigger than 0.62 for all the points of the grid, and of course a negative Lyapunov exponent smaller than −0.62. This is an indication of the hyperbolic splitting of the space T_xΛ^ort as a direct sum of a stable space E^s(x) and an unstable space E^u(x). The numerical algorithm for the computation of the Lyapunov characteristic exponents also provides an estimate for λ_1 = µ_1 and λ_3 = µ_3 related to φ^N. It remains to estimate the constants λ_2, µ_2 for the map φ^N at the point x. Because in this case the growth of initial tangent vectors ξ = (0, ξ_ϕ2, 0, ξ_I2) ∈ T_xΛ is not always exponential, we do not compute the Lyapunov characteristic exponents, but we compute numerically the two dimensional matrix representing the restriction of Dφ^N(x) to the space T_xΛ and the quantities:

$$\lambda_2 \le \inf_{\xi \in T_x\Lambda\setminus 0} \frac{\|D\phi^N(x)\xi\|}{\|\xi\|} \le \sup_{\xi \in T_x\Lambda\setminus 0} \frac{\|D\phi^N(x)\xi\|}{\|\xi\|} \le \mu_2.$$

Figure 5 (left panel) shows the numerical computation of log λ_2/N and log µ_2/N for N = 1000. From the comparison of the four computed quantities log λ_1, log λ_2, log µ_2, log λ_3 we infer that they satisfy (13).
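The check of step i) can be sketched in a few lines of Python. The sketch rests on several assumptions that are not stated in the text: the kicks of (7) are evaluated at the updated angles, the Euclidean norm is used, and the Jacobian of (7) at points of Λ is block diagonal (which follows from that form of the map because the mixed derivatives vanish at ϕ_1 = π), so that the normal and tangent blocks can be multiplied separately. The function name `hyperbolicity_rates` is hypothetical.

```python
import numpy as np

A, C = 0.4, 2.1  # parameter values a, c quoted in the paper


def hyperbolicity_rates(I2_0, eps, n_iter=1000, phi2_0=0.0, a=A, c=C):
    """Return (lyap_normal, log(lambda2)/N, log(mu2)/N) for one orbit on Lambda.

    The orbit is generated with the restricted map (10); the normal block
    (phi_1, I_1) and the tangent block (phi_2, I_2) of the Jacobian are
    propagated separately (assumed block-diagonal structure on Lambda).
    """
    phi2, I2 = phi2_0, I2_0
    xi = np.array([1.0, 0.0])      # generic normal vector (xi_phi1, xi_I1)
    log_growth = 0.0
    M_tan = np.eye(2)              # accumulates D(phi^N) restricted to T_x Lambda
    for _ in range(n_iter):
        phi2 = phi2 + I2
        d = np.cos(phi2) + c - 1.0
        I2 = I2 + eps * np.sin(phi2) / d ** 2
        # derivative of the I_1 kick with respect to phi_1', at phi_1' = pi
        g_n = a - eps / d ** 2
        # derivative of the I_2 kick with respect to phi_2'
        g_t = eps * (np.cos(phi2) / d ** 2 + 2.0 * np.sin(phi2) ** 2 / d ** 3)
        xi = np.array([[1.0, 1.0], [g_n, 1.0 + g_n]]) @ xi
        norm = np.linalg.norm(xi)
        log_growth += np.log(norm)  # renormalise at each step to avoid overflow
        xi /= norm
        M_tan = np.array([[1.0, 1.0], [g_t, 1.0 + g_t]]) @ M_tan
    # Singular values give the sup and the inf of |M xi| / |xi|.
    sigma = np.linalg.svd(M_tan, compute_uv=False)
    return (log_growth / n_iter,
            np.log(sigma[-1]) / n_iter,
            np.log(sigma[0]) / n_iter)


if __name__ == "__main__":
    for I2_0 in (-1.0, 0.324, 1.0):
        lyap, lo, hi = hyperbolicity_rates(I2_0, eps=1e-4)
        # condition (13) in logarithmic form: -lyap < lo <= hi < lyap
        print(I2_0, lyap, lo, hi, -lyap < lo <= hi < lyap)
```

The finite-time exponent computed from one generic vector is used here as a stand-in for both λ_3 and µ_3, in the spirit of the estimate λ_3 = µ_3 mentioned above; a stricter check would bound the growth over the whole unstable space.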

[Figure 5: two panels. Left: log(λ_2)/N and log(µ_2)/N versus I_2. Right: Log10 |slope E^u(x_k′) − slope E^u(x_k)| versus Log10 ‖x_k − x_k′‖.]

Fig. 5. On the left: Numerical estimates of log λ_2/N and log µ_2/N, N = 1000, computed on a grid of 1000 initial conditions with I_2 ∈ [−2, 2], I_1 = 0, ϕ_1 = π, ϕ_2 = 0 and ε = 10^−4. On the right: Test of the numerical precision in the computation of E^u(x_k). The figure reports on logarithmic scale the difference between the slope of E^u(x_k′) and the slope of E^u(x_k) (on the y axis) versus ‖x_k − x_k′‖ (on the x axis), for those k′ > k such that ‖x_k − x_k′‖ ≤ 10^−6. The upper curve refers to the case k = 10, which provides poor precision of the computation (of order 10^−6), the lower curve refers to the case k = 10^5, which provides good precision (better than 10^−12). The data for k = 10^5 can be fitted by a straight line of slope 1. We can therefore infer that, within this precision, E^u(x) is compatible with a Lipschitz condition in a neighbourhood of x.

ii) Computation of the linear stable–unstable spaces. To compute numerical approximations of the linear space E u (x) we can now take advantage of the hyperbolicity of the dynamics. Precisely, we take a generic initial tangent vector ξ = (ξϕ1 , 0, ξ I1 , 0) ∈ E s (x) ⊕ E u (x) and we define the sequence: ξk = Dφ k (x)ξ = (ξϕk1 , 0, ξ Ik1 , 0) ∈ E s (φ k (x)) ⊕ E u (φ k (x)). The components (ξ Ik1 , ξϕk1 ) do not necessarily converge to limit values, but we know from hyperbolicity that the component of ξk on the space E u (φ k (x)) expands exponentially, while the component of ξk on the space E s (φ k (x)) contracts exponentially. Therefore, if k is a suitably high number (compared to the exponent of the expanding direction), the direction of the unstable space E u (φ k (x)) is determined by ξk . For example, for the initial condition (ϕ1 , ϕ2 , I1 , I2 ) = (π, 0, 0, 0.324), = 10−4 , after k = 105 iterations we obtain: xk = φ k (x) ∼ (π, 4.070625, 0, 0.324319),

E u (xk ) ∼< (0.652, 0, 0.75749, 0) >

and x j = φ j (x), E u (x j ) can be easily computed for any j needed. A test of the precision reached by these computations is done by computing E u (xk ) for k > k and by analyzing the variation of the slope of E u (xk ) as xk approaches xk . Two computations are reported in Fig. 5 (right panel): one for k = 105 as above, and another one for k = 10. The computation for k = 105 shows that the slope of E u (xk ) converges to the slope of E u (xk ) as xk approaches xk . This confirms that k = 105 is sufficient to compute the unstable space with an error smaller than 10−12 . Moreover, because the data in the figure can be fitted


M. Guzzo, E. Lega, C. Froeschlé

by a straight line of slope 1, we can infer that $E^u(x)$ is compatible with a Lipschitz condition in a neighbourhood of $x$. For comparison, the figure also reports the same computation for $k = 10$: in this case the slope of $E^u(x_{k'})$ does not converge to the slope of $E^u(x_k)$ as $x_{k'}$ approaches $x_k$; instead, the difference between the slopes converges to a quantity of order $10^{-6}$.

iii) Computation of the stable–unstable manifolds. For any point $x_j$, denoting by $\xi_j$ the unit vector generating the unstable space $E^u(x_j)$, we use the linear approximation:

\[ W_u^{loc}(x_j) \sim \{x_j + s\,\xi_j,\ s \in [0, \rho)\}, \tag{18} \]

which is good as soon as $\rho$ is very small (we use $\rho = 10^{-10}$ in our computations). Then, we compute finite pieces of the unstable manifold using:

\[ \phi^j(W_u^{loc}(x_{-j})) \subseteq W_u(x). \tag{19} \]
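The idea behind step ii), that iterating a generic tangent vector aligns it with the unstable direction, can be illustrated on a toy example. The sketch below is not the map (7): it assumes a constant Jacobian [[2, 1], [1, 1]] (a hypothetical stand-in for $D\phi$), for which the unstable direction is known exactly, and compares a short and a long iteration.

```python
import math

def tangent_map(v):
    """Apply the constant Jacobian [[2, 1], [1, 1]] of a toy hyperbolic map (assumption)."""
    x, y = v
    return (2.0 * x + y, x + y)

def unstable_direction(v0, k):
    """Iterate a generic tangent vector k times, renormalising at each step."""
    v = v0
    for _ in range(k):
        v = tangent_map(v)
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
    return v

def slope(v):
    return v[1] / v[0]

# Exact slope of the expanding eigenvector of [[2, 1], [1, 1]].
exact_slope = (math.sqrt(5.0) - 1.0) / 2.0

err_short = abs(slope(unstable_direction((1.0, 0.0), 3)) - exact_slope)
err_long = abs(slope(unstable_direction((1.0, 0.0), 30)) - exact_slope)
```

In this toy setting the error of the short iteration is still visible, while the long iteration reaches machine precision, which mirrors the qualitative difference between the two values of $k$ discussed in the text.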

The small errors done by using the linear approximation for the local manifold do not accumulate at successive iterations, because the hyperbolic dynamics tends to reduce them (see [24]). iv) Computation of Wu (x ), Ws (x ) of Fig. 1. To detect Arnold diffusion in the system 3 (7) we compute points of Wu (x ) using Eq. (19), with x = φ 10 (π, 0, 0, 0.324), −6 −4 = 10 , 10 , with the high numerical precision of 400 digits. Then, we check if some of the computed points of Wu (x ) are good candidates to satisfy condition (3), that is if they have a variation of the action I2 bigger than c2 , with respect to −6 −4 x . Then, we choose the correction x . For both cases = 10 , 10 we found I of many orders of magnitude smaller than c2 (see Table 1), and I are 2 2 even much smaller, because the I2 component of E u (x ) is 0 and x ≤ 10−10 . The error estimator ρ is computed as follows. We considered a set of 10 points in a segment of amplitude 10−N aligned to E u (x ), in a neighbourhood of x + x . Then, we computed the orbits of these 10 points for the number of iterations T such that φ T (x + x ) + x ∈ Ws (x ) with a numerical precision of 2N . We decide that N is sufficiently large, compared to T , when the map φ separates the 10 points by a quantity ρ which is much smaller than the precision required to verify Eq. (3). For example, for = 10−6 and T = 1382, we found that ρ < 10−93 for N = 120, ρ < 10−153 for N = 180, while ρ = 10−173 with the actual precision of 400 digits. For = 10−4 and T = 1951 we found that ρ < 10−33 for N = 120, ρ < 10−93 for N = 180, and ρ < 10−113 for the actual precision of 400 digits. v) Computation of the parametrization of the manifold with respect to its arc length. To compute a parametrization of the manifold with respect to its arc length we proceed in two steps. First, we set K such that W K (x) = ∪ Kj=1 φ j (Wuloc (x− j )) ⊆ Wu (x)

(20)

can be parametrized by the ϕ1 coordinate, so that we can order the points in W K (x) with respect to ϕ1 . This allows one to construct a parametrization of W K (x) with respect to its arc–length, that we denote by s −→ (ϕ1 (s), ϕ2 (s), I1 (s), I2 (s)). Then, we reconstruct the unstable manifold for an arc–length much longer than the one obtained at the first step, to include many lobes of the manifold. This can be done by mapping with φ K additional points of the linear approximation of the

Hyperbolic Manifolds and Arnold Diffusion


local manifold, but paying attention to obtain a uniform sampling of the manifold with respect to its arc–length. This problem was already discussed in [24] and we use a similar procedure for the choice of the initial conditions on $W_u^{loc}(x_{-K})$. More precisely, let us denote by $x^m$, $x^{m+1}$ the last two points of $W_u^{loc}(x_{-K})$ used to compute $W^K(x)$, by $\Delta x^m = d(x^m, x^{m+1})$, and by $\Delta s^m = s^{m+1} - s^m$ the difference between the arc–lengths of the points $\phi^K(x^m)$, $\phi^K(x^{m+1})$. The choice of the point $x^{m+2}$ will be done depending on $\Delta s^m$ as follows:

\[ x^{m+2} = \begin{cases} x^{m+1} + \Delta x^m & \text{if } \Delta s_1 < \Delta s^m < \Delta s, \\ x^{m+1} + \eta\,\Delta x^m & \text{if } \Delta s^m > \Delta s, \\ x^{m+1} + \frac{1}{\eta}\,\Delta x^m & \text{if } \Delta s^m < \Delta s_1, \end{cases} \tag{21} \]

with $\Delta s = 10^{-2}$, $\Delta s_1 = 10^{-3}$ and $\eta = 0.1$. The result of the computation is reported in Fig. 2.

5.2. Computation of diffusion coefficients. In this section we describe the method that we use to estimate the diffusion coefficient related to Arnold diffusion for the map (7). We consider the curve $\gamma$ on the invariant manifold defined by (11), a neighbourhood $W$, and we perform a statistical analysis on the variations of $I_2$ for orbits with initial conditions in $W$. We define the return map to $W$ as follows: if there exists a minimum integer $t(x) \ge 1$ such that $\phi^{t(x)-1}(x) \notin W$ and $\phi^{t(x)}(x) \in W$, we denote $\psi(x) = \phi^{t(x)}(x)$. Then, let us denote $I_2(x) = I_2$, and $\Delta I_2^i(x) = I_2(\psi^{i+1}(x)) - I_2(\psi^i(x))$. A statistical approach to the dynamics in $W_*$, such as the one described in [28], would be justified by the existence of a set $\tilde W_* \subseteq W_*$ of points $x$ such that the sequence $\Delta I_2^1, \Delta I_2^2, \ldots$ is a sequence of independent random variables. This is a very strong requirement that, to our knowledge, can represent only an approximate description of the dynamics of the system.
In this spirit, the traditional statistical approaches, such as those based on random phase approximations, first replace the true dynamics with an approximate one which behaves like a Markovian process, and then compute statistical quantities that can be defined precisely via the Markovian approximation. Here, we proceed in a different way: we fix a set $W$ and then, instead of performing statistical approximations on the dynamics, we check that finite sets of initial conditions and the finite sequence $\Delta I_2^1, \ldots, \Delta I_2^T$ averaged over these initial conditions behave as if the process were approximately Markovian. Because the variables $\Delta I_2^1, \ldots, \Delta I_2^T$ have the same mean and variance, but are not necessarily normally distributed, we check that the variable

\[ Y_T = \frac{\Delta I_2^1 + \cdots + \Delta I_2^T}{T} \]

is normally distributed within a tolerance admitted for the central limit theorem convergence. Precisely:

s1) denoting by $E(Y_T) = \frac{1}{N}\sum_{j=1}^{N} Y_T(x_j)$ the average of the variable $Y_T$ over the set of $N$ initial conditions $x_1, \ldots, x_N$, we require:

\[ |E(Y_T)| \le \sqrt{\frac{1}{N}\,E(Y_T^2)}; \tag{22} \]

s2) the cumulative density function $\Phi_T$ of $Y_T$ satisfies the Berry–Esseen inequality (see, for example, [21]):

\[ \left| \Phi_T(X) - \Phi\!\left(\frac{\sqrt{T}}{\sigma}\,X\right) \right| \le C\,\frac{\rho}{\sigma^3\sqrt{T}}, \qquad \forall X \in \mathbb{R}, \tag{23} \]


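with $C = 0.8$, where:

\[ \sigma^2 = \frac{1}{N}\,\frac{1}{T}\sum_{j=1}^{N}\sum_{i=1}^{T} \Delta I_2^i(x_j)^2 \]

is the mean variance of $\Delta I_2^1, \ldots, \Delta I_2^T$ averaged over the $N$ initial conditions,

\[ \rho = \frac{1}{N}\,\frac{1}{T}\sum_{j=1}^{N}\sum_{i=1}^{T} \left|\Delta I_2^i(x_j)\right|^3, \]

and

\[ \Phi(X) = \frac{1}{2}\left(1 + \mathrm{erf}\,\frac{X}{\sqrt{2}}\right) \]

is the cumulative normal distribution.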
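The quantities entering condition (s2) are straightforward to evaluate from a table of increments. The following is a minimal sketch, assuming that the increments of $I_2$ are stored as a list of rows (one row per initial condition) and that the right-hand side of the inequality reads $C\rho/(\sigma^3\sqrt{T})$; the sample data are hypothetical and serve only to exercise the functions.

```python
import math

def normal_cdf(x):
    """Cumulative normal distribution Phi(X) = (1 + erf(X / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def moments(increments):
    """increments[j][i] is the i-th increment of I2 for the j-th initial condition.

    Returns (sigma2, rho): the mean second moment and the mean absolute
    third moment over all initial conditions and all increments.
    """
    n = len(increments)
    t = len(increments[0])
    sigma2 = sum(d * d for row in increments for d in row) / (n * t)
    rho = sum(abs(d) ** 3 for row in increments for d in row) / (n * t)
    return sigma2, rho

def berry_esseen_bound(sigma2, rho, t, c=0.8):
    """Assumed right-hand side C * rho / (sigma^3 * sqrt(T)) of the inequality."""
    return c * rho / (sigma2 ** 1.5 * math.sqrt(t))

# Hypothetical data: N = 2 initial conditions, T = 4 increments each.
sample = [[1.0, -1.0, 1.0, -1.0], [-1.0, 1.0, 1.0, -1.0]]
sigma2, rho = moments(sample)
bound = berry_esseen_bound(sigma2, rho, t=4)
```

In an actual test one would compare this bound with the largest deviation between the empirical distribution of $Y_T$ and the suitably rescaled normal distribution.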

By denoting:

\[ T_0 = \min_{i=1,\ldots,N}\ \sup_{j=1,\ldots,T} t(\psi^j(x_i)), \tag{24} \]

we say that a set of $N$ initial conditions has regular statistics in the time interval $[0, T_0]$ if conditions (s1), (s2) are satisfied. Then, we compute the diffusion coefficient $D$ on the set of $N$ initial conditions $x_1, \ldots, x_N$ as if the process were Markovian, as the following average:

\[ D = \frac{1}{N}\,\frac{1}{T}\sum_{j=1}^{N}\sum_{i=1}^{T} \frac{\Delta I_2^i(x_j)^2}{t_i(x_j)}, \]

where $t_i(x_j) = t(\psi^{i-1}(x_j))$ denotes the $i$th return time of the $j$th initial condition.

Remarks. (i) The quantity $D$ is different from the variance $\sigma^2$ because it takes into account the individual return times $t_i(x_j)$. (ii) In view of the central limit theorem, the diffusion coefficient and the variance of the variable $Y_T$ are computed by averaging over the variables $\Delta I_2^i$, while their errors are estimated as the normal errors of the normal distribution of $Y$. Therefore, the error on $D$ can be estimated by $D\sqrt{2/N}$. (iii) The results of this statistical analysis depend on the choice of $W$: on the one hand, we expect that the dynamics in neighbourhoods of $\gamma$ better approximates a Markovian process by restricting the neighbourhood $W$; on the other hand, for $\varepsilon = 0$ it is $X^i(x) = 0$ for all $i$ and for all $x \in W$, for any choice of the neighbourhood $W$.

5.3. Melnikov approximations. The Melnikov approximations of a priori unstable systems are obtained by neglecting the perturbation on the hyperbolic part of the system, as follows:

Definition. Let us consider the map (7), a point $x = (\varphi_1, \varphi_2, I_1, I_2)$ of the normally hyperbolic invariant manifold, and denote $J = I_2$. The Melnikov approximation of $W_u(x)$ is the unstable manifold of $x$ with respect to the following simplified map $\tilde\phi$:

\[ \varphi_1' = \varphi_1 + I_1, \quad \varphi_2' = \varphi_2 + J, \quad I_1' = I_1 - a \sin\varphi_1', \quad I_2' = I_2 + \varepsilon\,\frac{\sin\varphi_2'}{(\cos\varphi_1' + \cos\varphi_2' + c)^2}. \tag{25} \]
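One step of the simplified map can be sketched as follows. This is an illustration under assumptions, not the authors' code: the updated angles are assumed to appear on the right-hand sides of the action equations (as suggested by the sums used in the proof of the Proposition), and the parameter values `A`, `EPS`, `C` are hypothetical. The constant `C` is taken larger than 2 so that the denominator cannot vanish.

```python
import math

def melnikov_step(state, J, a, eps, c):
    """One step of the simplified map.

    The hyperbolic pair (phi1, I1) evolves without perturbation, phi2 rotates
    with the frozen frequency J, and only I2 is affected by eps.
    """
    phi1, phi2, I1, I2 = state
    phi1_new = phi1 + I1
    phi2_new = phi2 + J
    I1_new = I1 - a * math.sin(phi1_new)
    denom = (math.cos(phi1_new) + math.cos(phi2_new) + c) ** 2
    I2_new = I2 + eps * math.sin(phi2_new) / denom
    return (phi1_new, phi2_new, I1_new, I2_new)

# Hypothetical parameter values, for illustration only.
A, EPS, C = 0.4, 1e-4, 2.1

start = (math.pi, 0.0, 0.0, 0.324)
after = melnikov_step(start, J=start[3], a=A, eps=EPS, c=C)
unperturbed = melnikov_step(start, J=start[3], a=A, eps=0.0, c=C)
```

Starting on the invariant manifold $(\varphi_1, I_1) = (\pi, 0)$, the hyperbolic variables stay fixed up to rounding, and the action $I_2$ changes only when `eps` is non-zero.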

[Figure 6: two panels, each plotting $(I_2(s) - I_2(0))/\varepsilon$ against $s$ for the Melnikov approximation and for the full map.]

Fig. 6. Each panel represents two parametrizations $s \mapsto (I_2(s) - I_2(0))/\varepsilon$ of the manifold $W_u(x)$, with $x = \phi^{10^5}(\pi, 0, 0, 0.324)$: one is obtained using the Melnikov approximation, while the other one is obtained using the full map. The left panel is for $\varepsilon = 10^{-6}$: the two parametrizations are close to each other. The right panel is for $\varepsilon = 10^{-4}$: the Melnikov approximation is not valid

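The numerical computation of the Melnikov approximation of $W_u(x)$ is based on the following representation:

Proposition. Let us consider a point $x = (\tilde\varphi_1, \tilde\varphi_2, \tilde I_1, \tilde I_2)$ of the normally hyperbolic invariant manifold and denote $J = \tilde I_2$. The Melnikov approximation of $W_u(x)$ is represented by all points $z = (\varphi_1, \varphi_2, I_1, I_2)$ such that $(\varphi_1, I_1)$ is in the unstable manifold $W_u^*$ of the fixed point $(\pi, 0)$ of the standard map:

\[ \varphi_1' = \varphi_1 + I_1, \qquad I_1' = I_1 - a \sin\varphi_1', \tag{26} \]

while $\varphi_2 = \tilde\varphi_2$ and:

\[ I_2 = \tilde I_2 - \varepsilon \sum_{k=-1}^{-\infty} \left[ \frac{\sin(\tilde\varphi_2 - kJ)}{(\cos\varphi_1(k) + \cos(\tilde\varphi_2 - kJ) + c)^2} - \frac{\sin(\tilde\varphi_2 - kJ)}{(\cos(\tilde\varphi_2 - kJ) + c - 1)^2} \right], \tag{27} \]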

where $(\varphi_1(j), I_1(j))$ denotes the orbit with initial condition $(\varphi_1, I_1) \in W_u^*$ with respect to the map (26). The proof of this proposition is reported at the end of this section. In Fig. 6 we compare two parametrizations $s \mapsto (I_2(s) - I_2(0))$ of the manifold $W_u(x)$: one is obtained with the Melnikov approximation (27), while the other one is obtained using the full map (7). The left panel shows that for $\varepsilon = 10^{-6}$ the two parametrizations are indeed very close to one another. The right panel shows that for $\varepsilon = 10^{-4}$ the Melnikov approximation is not valid at all. In order to quantify the relevance of the error terms of the Melnikov approximation we have computed, for $10^{-8} < \varepsilon < 10^{-3}$, the histograms $H_f$ and $H_M$ of $(I_2(s) - I_2(0))/\varepsilon$ for the full map and the Melnikov approximation respectively. We consider as an indicator of the distance between the two distributions the quantity:

\[ d = \frac{\sum_{i=1}^{N} (H_f(i) - H_M(i))^2}{N}, \tag{28} \]

where $N = 100$ is the number of bins. The quantity $d$ (Fig. 4) increases by two orders of magnitude between $\varepsilon = 10^{-6}$ and $\varepsilon = 10^{-4}$, while it slowly decreases for $\varepsilon < 10^{-8}$ (not reported in Fig. 4).

Proof of the Proposition. Let us denote by $z(j) = (\varphi_1(j), \varphi_2(j), I_1(j), I_2(j))$ the orbit of $z = z(0) = (\varphi_1, \varphi_2, I_1, I_2)$ and by $x(j) = (\tilde\varphi_1(j), \tilde\varphi_2(j), \tilde I_1(j), \tilde I_2(j))$ the orbit of $x = x(0) = (\tilde\varphi_1, \tilde\varphi_2, \tilde I_1, \tilde I_2)$ with respect to the map $\tilde\phi$. The point $z$ is in the unstable manifold of $x$ if and only if $\lim_{j\to-\infty} \|z(j) - x(j)\| = 0$. Therefore, $(I_1(j), \varphi_1(j))$ tends to $(0, \pi)$ as $j \to -\infty$ if and only if $(I_1(0), \varphi_1(0))$ is in the unstable manifold $W_u^*$ of the fixed point $(\pi, 0)$ of the map (26). Let us now prove (27). For any $j \le -1$ it holds:

\[ I_2(j) = \sum_{k=-1}^{j} (I_2(k) - I_2(k+1)) + I_2(0) \tag{29} \]

\[ \phantom{I_2(j)} = \varepsilon \sum_{k=-1}^{j} \frac{\sin\varphi_2(k+1)}{(\cos\varphi_1(k+1) + \cos\varphi_2(k+1) + c)^2} + I_2, \tag{30} \]

as well as:

\[ \tilde I_2(j) = \sum_{k=-1}^{j} (\tilde I_2(k) - \tilde I_2(k+1)) + \tilde I_2(0) = \varepsilon \sum_{k=-1}^{j} \frac{\sin\tilde\varphi_2(k+1)}{(\cos\tilde\varphi_2(k+1) + c - 1)^2} + \tilde I_2. \]

Therefore, $\lim_{j\to-\infty} (I_2(j) - \tilde I_2(j)) = 0$ if and only if (27) holds.

6. Conclusions

We have studied the Arnold diffusion along a normally hyperbolic manifold in a model of a priori unstable dynamical systems. We have introduced a definition of Arnold diffusion which is adapted to the numerical investigation of the problem, and is based on the numerical computation of the stable and unstable manifolds of the system. We have shown that the numerically computed stable and unstable manifolds indeed support this kind of Arnold diffusion. We also performed a numerical statistical study of Arnold diffusion, and we found that, for small values of $\varepsilon$, Arnold diffusion behaves as an approximate Markovian process, allowing one to compute diffusion coefficients. The dependence of the diffusion coefficient $D$ on the perturbing parameter satisfies the scaling $D(\varepsilon) \sim \varepsilon^2$ for small values of $\varepsilon$. We also find that this law is correlated to the validity of the Melnikov approximation, in the sense that it is valid up to the same critical value of $\varepsilon$ for which the error terms of Melnikov approximations have a sharp increment. This suggests that the Melnikov approximation is not only a technical tool which allows one to compute accurate approximations of the manifolds at small values of the perturbing parameters, but is related to a dynamical regime, and possibly it could be used to explain the statistical properties of Arnold diffusion.


Acknowledgements. Guzzo acknowledges the project CPDA063945/06 of the University of Padova.

References

1. Arnold, V.I.: Instability of dynamical systems with several degrees of freedom. Sov. Math. Dokl. 6, 581–585 (1964)
2. Berti, M., Biasco, L., Bolle, P.: Drift in phase space: a new variational mechanism with optimal diffusion time. J. Math. Pures Appl. 82(6), 613–664 (2003)
3. Berti, M., Bolle, P.: A functional analysis approach to Arnold diffusion. Annales de l'Institut Henri Poincaré (C) Non Linear Analysis 19(4), 395–450 (2002)
4. Bessi, U., Chierchia, L., Valdinoci, E.: Upper bounds on Arnold diffusion times via Mather theory. J. Math. Pures Appl. 80, 105–129 (2001)
5. Broer, H.W., Osinga, H.M., Vegter, G.: Algorithms for computing normally hyperbolic invariant manifolds. ZAMP 48, 480–524 (1997)
6. Chierchia, L., Gallavotti, G.: Drift and diffusion in phase space. Ann. Inst. H. Poincaré 60, 1–144 (1994)
7. Chirikov, B.V.: Research concerning the theory of nonlinear resonance and stochasticity. Preprint N 267, Institute of Nuclear Physics, Novosibirsk (1969). Engl. Trans., CERN Trans. 71–40 (1971)
8. Delshams, A., de la Llave, R., Seara, T.M.: A geometric mechanism for diffusion in Hamiltonian systems overcoming the large gap problem: heuristics and rigorous verification on a model. Mem. Amer. Math. Soc. 179(844) (2006)
9. Efthymiopoulos, C., Voglis, N., Contopoulos, G.: Diffusion and Transient Spectra in 4-dimensional symplectic mapping. In: Analysis and Modelling of Discrete Dynamical Systems. Benest, D., Froeschlé, C., eds. New York: Gordon and Breach Science Publishers, 1998
10. Froeschlé, C., Guzzo, M., Lega, E.: Local and global diffusion along resonant lines in discrete quasi–integrable dynamical systems. Celest. Mech. Dyn. Astron. 92(1–3), 243–255 (2005)
11. Guzzo, M., Lega, E., Froeschlé, C.: First numerical evidence of Arnold diffusion in quasi–integrable systems. DCDS B 5(3) (2005)
12. Hénon, M., Heiles, C.: The Applicability of the third integral of motion: some numerical experiments. Astron. J. 69, 73–79 (1964)
13. Hasselblatt, B., Pesin, Y.: Partially Hyperbolic Dynamical Systems. Handbook of dynamical systems. Vol. 1B, Amsterdam: Elsevier B. V., 2006
14. Hirsch, M.W., Pugh, C.C., Shub, M.: Invariant Manifolds. Lecture Notes in Mathematics, Vol. 583. Berlin-New York: Springer-Verlag, 1977
15. Krauskopf, B., Osinga, H.M., Doedel, E.J., Henderson, M.E., Guckenheimer, J., Vladimirsky, A., Dellnitz, M., Junge, O.: A survey of methods for computing (un)stable manifolds of vector fields. Int. J. Bif. Chaos 15, 763–791 (2005)
16. Konishi, T., Kaneko, K.: Diffusion in Hamiltonian chaos and its size dependence. J. Phys. A: Math. Gen. 23(15), L715–L720 (1990)
17. Laskar, J.: Frequency analysis for multi-dimensional systems. Global Dynamics and Diffusion. Physica D 67, 257–281 (1993)
18. Lega, E., Guzzo, M., Froeschlé, C.: Detection of Arnold diffusion in Hamiltonian systems. Physica D 182, 179–187 (2003)
19. Lega, E., Froeschlé, C., Guzzo, M.: Diffusion in Hamiltonian quasi–integrable systems. In: Lecture Notes in Physics 729, Topics in gravitational dynamics, Benest, Froeschlé, Lega eds., Berlin-Heidelberg-New York: Springer, 2007
20. Lichtenberg, A., Aswani, M.A.: Arnold diffusion in many weakly coupled mappings. Phys. Rev. E 57(5), 5321–5325 (1998)
21. Manoukian, E.: Modern Concepts and Theorems of Mathematical Statistics. Berlin-Heidelberg-New York: Springer, 1986
22. Morbidelli, A., Giorgilli, A.: On the role of high order resonances in normal forms and in separatrix splitting. Physica D 102, 195–207 (1997)
23. Ralston, A., Rabinowitz, P.: A First Course in Numerical Analysis, 2nd ed. New York: McGraw-Hill, 1978
24. Simó, C.: On the analytical and numerical approximation of invariant manifolds. In: Modern Methods in Celestial Mechanics, D. Benest, Cl. Froeschlé, eds, Gif-sur-Yvette: Editions Frontières, 1989, pp. 285–329
25. Simó, C., Valls, C.: A formal approximation of the splitting of separatrices in the classical Arnold's example of diffusion with two equal parameters. Nonlinearity 14(6), 1707–1760 (2001)
26. Treschev, D.: Trajectories in a neighbourhood of asymptotic surfaces of a priori unstable Hamiltonian systems. Nonlinearity 15, 2033–2052 (2002)
27. Treschev, D.: Evolution of slow variables in a priori unstable Hamiltonian systems. Nonlinearity 17, 1803–1841 (2004)
28. Varvoglis, H.: Chaos, random walks and diffusion in Hamiltonian systems. In: Hamiltonian systems and Fourier Analysis, Benest, Froeschlé and Lega, editors. Cambridge: Cambridge Scientific Publishers, 2005, pp. 247–287
29. Wood, B.P., Lichtenberg, A., Lieberman, M.A.: Arnold diffusion in weakly coupled standard map. Phys. Rev. A 42, 5885–5893 (1990)

Communicated by G. Gallavotti

Commun. Math. Phys. 290, 577–595 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0864-7

Communications in

Mathematical Physics

Harmonic Measure and SLE

D. Beliaev1,2, S. Smirnov3

1 Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]
2 IAS, Princeton, NJ 08544, USA
3 Section de Mathématiques, Université de Genève, 2-4 rue du Lièvre, CH-1211 Genève 4, Switzerland. E-mail: [email protected]

Received: 27 March 2008 / Accepted: 10 April 2009
Published online: 7 July 2009 – © Springer-Verlag 2009

Abstract: In this paper we study the multifractal structure of Schramm’s SLE curves. We derive the values of the (average) spectrum of harmonic measure and prove Duplantier’s prediction for the multifractal spectrum of SLE curves. The spectrum can also be used to derive estimates of the dimension, Hölder exponent and other geometrical quantities. The SLE curves provide perhaps the only example of sets where the spectrum is non-trivial yet exactly computable.

1. Introduction

The motivation for this paper is twofold: to study the multifractal spectrum of the harmonic measure and to better describe the geometry of Schramm's SLE curves (see Sects. 1.1 and 1.2 for brief introductions to the respective subjects). Our main result is the following theorem, in which we rigorously compute the average spectrum of harmonic measure on domains bounded by SLE curves (see below for precise definitions).

Theorem 1. The average integral means spectrum $\bar\beta(t)$ of SLE$_\kappa$ is equal to

\[ \begin{array}{ll} -t + \kappa\,\dfrac{4 + \kappa - \sqrt{(4+\kappa)^2 - 8t\kappa}}{4\kappa}, & t \le -1 - \dfrac{3\kappa}{8}, \\[2ex] -t + \dfrac{(4+\kappa)\left(4 + \kappa - \sqrt{(4+\kappa)^2 - 8t\kappa}\right)}{4\kappa}, & -1 - \dfrac{3\kappa}{8} \le t \le \dfrac{3(4+\kappa)^2}{32\kappa}, \\[2ex] t - \dfrac{(4+\kappa)^2}{16\kappa}, & t \ge \dfrac{3(4+\kappa)^2}{32\kappa}. \end{array} \]


The average integral means spectrum $\bar\beta(t)$ of the bulk of SLE (see definition below) is equal to

\[ \begin{array}{ll} -t + \dfrac{(4+\kappa)\left(4 + \kappa - \sqrt{(4+\kappa)^2 - 8t\kappa}\right)}{4\kappa}, & t \le \dfrac{3(4+\kappa)^2}{32\kappa}, \\[2ex] t - \dfrac{(4+\kappa)^2}{16\kappa}, & t \ge \dfrac{3(4+\kappa)^2}{32\kappa}. \end{array} \]

Several results can be easily derived from this theorem: dimension estimates of the boundary of SLE$_\kappa$ hulls, Hölder continuity of SLE$_\kappa$ Riemann maps, Hölder continuity of the SLE$_\kappa$ trace, and more. We would also like to point out that SLE$_\kappa$ seems to be the only family of models where the spectrum (even average) of harmonic measure is non-trivial and known explicitly.

1.1. Integral means spectrum. There are several equivalent definitions of harmonic measure that are useful in different contexts. For a domain $\Omega$ with a regular boundary we define the harmonic measure with a pole at $z \in \Omega$ as the exit distribution of the standard Brownian motion started at $z$. Namely, $\omega_z(A) = P(B^z_\tau \in A)$, where $\tau = \inf\{t : B^z_t \notin \Omega\}$ is the first time the standard two-dimensional Brownian motion started at $z$ leaves $\Omega$. Alternatively, for a simply connected planar domain the harmonic measure is the image of the normalized length on the unit circle under the Riemann mapping that sends the origin to $z$. It is easy to see that harmonic measure depends on $z$ in a smooth (actually harmonic) way, thus the geometric properties do not depend on the choice of the pole. So we fix the pole to be the origin or infinity and eliminate it from the notation.

Over the last twenty years it became clear that many extremal problems in geometric function theory are related to the geometrical properties of harmonic measure, and that the proper language for these problems is multifractal analysis. Multifractal analysis operates with different spectra of measures and relations between them. In this paper we study the harmonic measure on simply connected domains, so we give the rigorous definition for this case only. Let $\Omega = \mathbb{C} \setminus K$, where $K$ is a connected compact set, and let $\phi$ be a Riemann mapping from $\mathbb{D}_-$ (i.e. the complement of the unit disc) onto $\Omega$ such that $\phi(\infty) = \infty$. The integral means spectrum of $\phi$ (or $\Omega$) is defined as

\[ \beta_\phi(t) = \beta_\Omega(t) = \limsup_{r \to 1+} \frac{\log \int_0^{2\pi} |\phi'(re^{i\theta})|^t \, d\theta}{-\log(r - 1)}. \]

The universal integral means spectrum is defined as $B(t) = \sup \beta_\Omega(t)$, where the supremum is taken over all simply connected domains with compact boundary. On the basis of work of Brennan, Carleson, Clunie, Jones, Makarov, Pommerenke and computer experiments for quadratic Julia sets, Kraetzer [17] in 1996 formulated the following universal conjecture:

\[ B(t) = t^2/4, \quad |t| < 2, \qquad B(t) = |t| - 1, \quad |t| \ge 2. \]


It is known that many other conjectures follow from Kraetzer's conjecture. In particular, Brennan's conjecture [5] about integrability of $|\psi'|$, where $\psi$ is a conformal map from a domain to the unit disc, is equivalent to $B(-2) = 1$, while the Carleson-Jones conjecture [6] that if $\phi(z) = z + \sum a_n z^n$ is a bounded univalent function in the unit disc then $|a_n| \lesssim n^{-3/4}$ is equivalent to $B(1) = 1/4$.

There are many partial results in both directions: estimates of $B(t)$ from above and below (see surveys [3,15]). Upper bounds are more difficult and they are still not that far from the trivial bounds like $B(1) \le 1/2$. Currently the best upper bound is $B(1) \le 0.46$ [14]. Until recently lower bounds were also quite far from the conjectured value. The main problem in finding lower bounds is that it is almost impossible to compute the spectrum explicitly for any non-trivial domain. The origin of the difficulties is easy to see: only fractal domains have interesting spectrum, but for them the boundary behavior of $|\phi'(re^{i\theta})|^t$ depends on $\theta$ in a very non-smooth way, making it hard to find the average growth rate.

We claim that in order to overcome these problems one should work with regular random fractals instead of deterministic ones. For random fractals it is natural to study the average integral means spectrum, which is defined as

\[ \bar\beta(t) = \limsup_{r \to 1} \frac{\log \int_0^{2\pi} \mathbf{E}\,|\phi'(re^{i\theta})|^t \, d\theta}{-\log|r - 1|}. \]

The advantage of this approach is that for many random fractals the average boundary behavior of $|\phi'|$ is a very smooth function of $\theta$. Therefore it is sufficient to study the average behavior along any particular radius. Regular (random) fractals are invariant under some (random) transformation, making $\mathbf{E}|\phi'|^t$ a solution of a specific equation. Solving this equation one can find the average spectrum.

Note that $\bar\beta(t)$ and $\beta(t)$ do not necessarily coincide. It can even happen (and in this paper we consider exactly this case) that $\bar\beta(t)$ is not a spectrum of any particular domain. But $\bar\beta(t)$ is still bounded by the universal spectrum $B(t)$. If there is a random fractal with $\bar\beta(t) > B(t)$, then for each scale $r_n = 1 + 1/2^n$ there is a realization of the random fractal for which the integral mean on the scale $r_n$ is at least $c\,2^{n(\bar\beta(t) - \varepsilon)}$, where $c$ is a universal constant. Then by Makarov's fractal approximation [25] we can glue together all these realizations and find a domain which has a large spectrum on all scales.

Another important notion is the dimension or multifractal spectrum of harmonic measure, which can be non-rigorously defined as

\[ f(\alpha) = \dim\{z : \omega(B(z, r)) \approx r^\alpha\}, \qquad \alpha \ge 1/2, \]

where $\omega(B(z, r))$ is the harmonic measure of the disc of radius $r$ centered at $z$. The condition $\alpha \ge 1/2$ is equivalent to Beurling's estimate $\omega(B(z, r)) \le c\,r^{1/2}$. There are several ways to make this definition rigorous, leading to slightly different notions of spectrum. But it is known [25] that the universal spectrum $F(\alpha) = \sup_\Omega f(\alpha)$ is the same for all definitions of $f(\alpha)$. For regular (in some sense) fractals the integral means and dimension spectra are related by a Legendre type transform (for general domains there is only a one-sided inequality). It is also known [25] that the universal spectra are related by a Legendre type transform:

\[ F(\alpha) = \inf_t \left( t + \alpha(B(t) + 1 - t) \right), \qquad B(t) = \sup_{\alpha > 0} \left( \frac{F(\alpha) - t}{\alpha} + t - 1 \right). \]
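To make the relation between the two universal spectra concrete, the sketch below evaluates the conjectured form of $B(t)$ and approximates the first Legendre type transform on a grid. It is only an illustration: the function names, the grid and its bounds are assumptions, and the values of $B$ are Kraetzer's conjecture, not established facts.

```python
def kraetzer_B(t):
    """Conjectured universal spectrum: t^2/4 for |t| < 2 and |t| - 1 otherwise."""
    return t * t / 4.0 if abs(t) < 2.0 else abs(t) - 1.0

def legendre_F(alpha, t_min=-20.0, t_max=20.0, steps=40000):
    """Grid approximation of F(alpha) = inf_t (t + alpha * (B(t) + 1 - t)).

    Meaningful only for alpha >= 1/2; for smaller alpha the infimum over
    all real t is not attained on a bounded grid.
    """
    best = float("inf")
    for i in range(steps + 1):
        t = t_min + (t_max - t_min) * i / steps
        best = min(best, t + alpha * (kraetzer_B(t) + 1.0 - t))
    return best

brennan_value = kraetzer_B(-2.0)          # value entering Brennan's conjecture
carleson_jones_value = kraetzer_B(1.0)    # value entering the Carleson-Jones conjecture
F_at_one = legendre_F(1.0)
```

The two special values reproduce the equivalences $B(-2) = 1$ and $B(1) = 1/4$ quoted in the text, under the assumption that the conjecture holds.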


1.2. Schramm-Loewner Evolution. It is a common belief (and it has been proved in a few cases) that planar lattice models at criticality have conformally invariant scaling limits as the mesh of the lattice tends to zero. Schramm [32] introduced a one-parameter family of random curves, called SLE$_\kappa$ (SLE stands for Stochastic Loewner Evolution or Schramm-Loewner Evolution), that are the only possible limits of cluster perimeters for critical lattice models. It also turned out to be a very useful tool in many related problems. In this section we give the definition of SLE and the necessary background. The discussion of various versions of SLE and relations between them can be found in Lawler's book [19].

To define SLE we need a classical tool from complex analysis: the Loewner evolution. In general this is a method to describe by an ODE the evolution of the Riemann map from a growing (shrinking) domain to a uniformization domain. In this paper we use the radial Loewner evolution (where the uniformization domain is the complement of the unit disc) and its modifications.

Definition 1. The radial Loewner evolution in the complement of the unit disc with driving function $\xi(t) : \mathbb{R}_+ \to \mathbb{T}$ is the solution of the following ODE:

\[ \partial_t g_t(z) = g_t(z)\,\frac{\xi(t) + g_t(z)}{\xi(t) - g_t(z)}, \qquad g_0(z) = z. \tag{1} \]

It is a classical fact [19] that for any driving function $\xi$, $g_t$ is a conformal map from $\Omega_t$ to $\mathbb{D}_-$, where $\mathbb{D}_-$ is the complement of the unit disc and $\Omega_t = \mathbb{D}_- \setminus K_t$ is the set of all points where the solution of (1) exists up to the time $t$. The Schramm-Loewner Evolution SLE$_\kappa$ is defined as a Loewner evolution driven by the Brownian motion with speed $\sqrt{\kappa}$ on the unit circle, namely $\xi(t) = e^{i\sqrt{\kappa} B_t}$, where $B_t$ is the standard Brownian motion and $\kappa$ is a positive parameter. Since $\xi$ is random, we obtain a family of random sets. The corresponding family of compacts $K_t$ is also called SLE (or the hull of SLE).

A number of theorems have already been established about SLE curves. Rohde and Schramm [29] proved that SLE for $\kappa \ne 8$ is a.s. generated by a curve. Namely, almost surely there is a random curve $\gamma$ (called the trace) such that $\Omega_t$ is the unbounded component of $\mathbb{D}_- \setminus \gamma_t$, where $\gamma_t = \gamma([0, t])$. The trace is almost surely a simple curve when $\kappa \le 4$. In this case the hull $K_t$ is the same as the curve $\gamma_t$. For $\kappa \ge 8$ the trace $\gamma_t$ is a space-filling curve. In the same paper they also proved that almost surely the Minkowski (and hence the Hausdorff) dimension of the SLE$_\kappa$ trace is no more than $1 + \kappa/8$ for $\kappa \le 8$. Beffara [2] proved that the Hausdorff dimension is equal to $1 + \kappa/8$ for $\kappa = 6$, later expanding the result to all $\kappa \le 8$. Recently Lawler presented in [20] a completely different proof of the Hausdorff dimension of SLE paths. Lind [24] proved that the trace is Hölder continuous.

Another natural object is the boundary of the SLE hull, namely the boundary of $K_t$. For $\kappa \le 4$ the boundary of SLE is the same as the SLE trace (since the trace is a simple curve). For $\kappa > 4$ the boundary is a subset of the trace. Rohde and Schramm [29] proved that for $\kappa > 4$ the dimension of the boundary is no more than $1 + 2/\kappa$.

In 1998 Lawler [18] proved that the a.s. multifractal spectrum of the Brownian frontier (which is the same as the boundary of SLE$_6$) can be expressed in terms of intersection exponents. He also showed that these exponents are non-trivial. They were later computed by Lawler, Schramm, and Werner in [21–23]. In [9,10], the physicist Duplantier, using quantum gravity methods, predicted the average multifractal spectrum of SLE. The same result was later derived using conformal field theory by Bettelheim, Rushkin, Gruzberg, and Wiegmann [4,30].
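The radial Loewner equation (1) with a Brownian driving function can be explored numerically. The following is a rough sketch under stated assumptions: it uses a plain explicit Euler scheme on a uniform time grid (adequate only for points that stay away from the unit circle, since the right-hand side blows up when $g_t(z)$ approaches $\xi(t)$), and all function names and parameter values are hypothetical.

```python
import cmath
import math
import random

def loewner_rhs(g, xi):
    """Right-hand side of the radial Loewner ODE: g * (xi + g) / (xi - g)."""
    return g * (xi + g) / (xi - g)

def brownian_driving(kappa, t_end, n_steps, seed=0):
    """Sample xi(t_k) = exp(i sqrt(kappa) B_{t_k}) on a uniform time grid."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    b, path = 0.0, []
    for _ in range(n_steps):
        path.append(cmath.exp(1j * math.sqrt(kappa) * b))
        b += rng.gauss(0.0, math.sqrt(dt))
    return path

def radial_loewner(z, driving, t_end):
    """Integrate dg/dt = g (xi + g) / (xi - g), g_0 = z, with explicit Euler steps."""
    n_steps = len(driving)
    dt = t_end / n_steps
    g = complex(z)
    for xi in driving:
        g += dt * loewner_rhs(g, xi)
    return g

T_END, N_STEPS = 0.1, 2000
xi_path = brownian_driving(kappa=2.0, t_end=T_END, n_steps=N_STEPS)
g_far = radial_loewner(100.0, xi_path, T_END)
```

For a point far from the unit circle the flow is close to multiplication by $e^{-t}$, whatever the driving function, which gives a simple sanity check of the integration.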


Another important property of SLE curves is the so-called duality property: the boundary of the SLE$_\kappa$ hull for $\kappa > 4$ is in the same measure class as the trace of SLE$_{16/\kappa}$. This property was first discovered by Duplantier, and much later proved by Zhan [33] and Dubedat [8]. In this paper we rigorously compute the average integral means spectrum of SLE and show that it coincides with Duplantier's prediction. This gives new proofs that the dimension of the boundary is no more than $1 + 2/\kappa$ for $\kappa > 4$ and that SLE maps are Hölder continuous, and provides more evidence which supports the duality conjecture.

Since $\bar\beta$ is defined in terms of a Riemann mapping, it is more convenient to work with $f_t = g_t^{-1}$. From Eq. (1) one can derive an equation on $f_t$. Unfortunately this equation involves $f_t'$ as well as $\partial_t f_t$, so we have a PDE instead of an ODE. There is another approach which leads to a nice equation. Changing the direction of the flow defined by Eq. (1) we get the equation for the "inverse" function $g_{-t}$. For a given driving function $\xi$, the maps $g_t^{-1}$ and $g_{-t}$ are different, but in the case of Brownian motion they have the same distribution. The precise meaning is given by the following lemma (which is an analog of Lemma 3.1 from [29]):

Lemma 1. Let $g_t$ be a radial SLE, then for all $t \in \mathbb{R}$ the map $z \mapsto g_{-t}(z)$ has the same distribution as the map $z \mapsto \hat f_t(z)/\xi_t$, where $\hat f_t(z) = g_t^{-1}(z\xi_t)$.

Proof. Fix $s \in \mathbb{R}$. Let $\hat\xi(t) = \xi(s + t)/\xi(s)$. Then $\hat\xi$ has the same distribution as $\xi$. Let $\hat g_t(z) = g_{s+t}(g_s^{-1}(z\xi(s)))/\xi(s)$. It is easy to check that $\hat g_0(z) = z$ and $\hat g_{-s}(z) = g_0(g_s^{-1}(z\xi(s)))/\xi(s) = \hat f_s(z)/\xi(s)$. Differentiating $\hat g_t(z)$ with respect to $t$ we obtain

\[ \partial_t \hat g_t(z) = \hat g_t(z)\,\frac{\hat\xi(t) + \hat g_t(z)}{\hat\xi(t) - \hat g_t(z)}, \]

hence $\hat g_t$ has the same distribution as SLE.

This lemma proves that the solution of the equation

\[ \partial_t f_t(z) = f_t(z)\,\frac{f_t(z) + \xi(t)}{f_t(z) - \xi(t)}, \qquad f_0(z) = z, \tag{2} \]

where $\xi(t) = e^{i\sqrt{\kappa} B_t}$, has the same distribution as $g_t^{-1}$. Abusing notation, we also call it SLE$_\kappa$.

One of the most important properties of SLE is the Markov property: roughly speaking, it means that the composition of two independent copies of SLE is an SLE. The rigorous formulation is given by the following lemma.

Lemma 2. Let $f_\tau^{(1)}$ be an SLE$_\kappa$ driven by $\xi^{(1)}(\tau)$, $0 < \tau < t$, and $f_\tau^{(2)}$ be an SLE$_\kappa$ driven by $\xi^{(2)}(\tau)$, $0 < \tau < s$, where $\xi^{(1)}$ and $\xi^{(2)}$ are two independent Brownian motions on the circle. Then $f_{s+t}(z) = f_s^{(2)}\big(f_t^{(1)}(z)/\xi^{(1)}(t)\big)\,\xi^{(1)}(t)$ is SLE$_\kappa$ at time $t + s$.


Proof. This composition is the solution of the Loewner Evolution driven by $\xi(\tau)$, where
$$\xi(\tau) = \begin{cases} \xi^{(1)}(\tau), & 0 < \tau \le t, \\ \xi^{(2)}(\tau - t)\,\xi^{(1)}(t), & t < \tau \le t + s. \end{cases}$$
It is easy to see that $\xi(\tau)$ is also a Brownian motion on the circle with the same speed $\sqrt{\kappa}$, hence $f_{t+s}$ is also $\mathrm{SLE}_\kappa$.

We will need yet another modification of SLE which is in fact a manifestation of stationarity of radial SLE.

Definition 2. Let $\xi(t) = \exp(i\sqrt{\kappa}B_t)$ be a two-sided Brownian motion on the unit circle. The whole-plane $\mathrm{SLE}_\kappa$ is the family of conformal maps $g_t$ satisfying
$$\partial_t g_t(z) = g_t(z)\,\frac{\xi(t) + g_t(z)}{\xi(t) - g_t(z)}$$
with initial condition
$$\lim_{t\to-\infty} e^{t} g_t(z) = z, \qquad z \in \mathbb{C}\setminus\{0\}.$$

The whole-plane SLE satisfies the same differential equation as the radial SLE; the difference is in the initial conditions. One can think of the whole-plane SLE as the radial SLE started at $t = -\infty$, and this is the way to construct the whole-plane SLE and prove its existence. Proposition 4.21 in [19] proves that the whole-plane Loewner Evolution $g_t$ with the driving function $\xi(t)$ is the limit as $s \to -\infty$ of the following maps: $g_t^{(s)}(z) = e^{-t}z$ if $t \le s$, and otherwise $g_t^{(s)}(z)$ is the solution to (1) with initial condition $g_s^{(s)}(z) = e^{-s}z$. The same is also true for inverse maps. We use this argument to prove that there is a limit of $e^{-t}f_t$ as $t \to \infty$.

Lemma 3. Let $f_t$ be a radial $\mathrm{SLE}_\kappa$; then there is a limit in law of $e^{-t}f_t(z)$ as $t \to \infty$.

Proof. The function $e^{-t}f_t$ is exactly the function which is used to define the whole-plane SLE. Multiplication by the exponent corresponds to the shift in time in the driving function. The function $e^{-t}f_t(z)$ has the same distribution as the inverse of $g_0^{(-t)}(z)$, hence it converges to $F_0$, where $F_\tau = g_\tau^{-1}$ and $g_\tau$ is a whole-plane SLE.

1.3. Results, conjectures, and organization of the paper. It is easy to see that the geometry near "the tip" of SLE (the point of growth) is different from the geometry near "generic" points. This means that for some problems it is more convenient to work with the so-called bulk of SLE, i.e. the part of the SLE hull which is away from the tip. We repeat the statement of the main theorem in which we compute the average spectrum of the SLE hull and the SLE bulk.

Theorem 1. The average integral means spectrum $\bar\beta(t)$ of SLE is equal to
$$\bar\beta(t) = \begin{cases} -t + \kappa\,\dfrac{4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}}{4\kappa} - 1, & t \le -1-\dfrac{3\kappa}{8}, \\[2mm] -t + \dfrac{(4+\kappa)\bigl(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}\bigr)}{4\kappa}, & -1-\dfrac{3\kappa}{8} \le t \le \dfrac{3(4+\kappa)^2}{32\kappa}, \\[2mm] t - \dfrac{(4+\kappa)^2}{16\kappa}, & t \ge \dfrac{3(4+\kappa)^2}{32\kappa}. \end{cases}$$
The average integral means spectrum $\bar\beta(t)$ of the bulk of SLE is equal to
$$\bar\beta(t) = \begin{cases} -t + \dfrac{(4+\kappa)\bigl(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}\bigr)}{4\kappa}, & t \le \dfrac{3(4+\kappa)^2}{32\kappa}, \\[2mm] t - \dfrac{(4+\kappa)^2}{16\kappa}, & t \ge \dfrac{3(4+\kappa)^2}{32\kappa}. \end{cases}$$

Remark 1. The local structure of the SLE bulk is the same for all versions of SLE, which means that they all have the same average spectrum.

Remark 2. To prove this theorem we show that
$$E|f'(re^{i\theta})|^t \asymp (r-1)^{\beta}\bigl((r-1)^2+\theta^2\bigr)^{\gamma},$$
where $\beta$ and $\gamma$ are given by (12) and (11). We would like to point out that $\beta$ and $\gamma$ are local exponents, so they are the same for different versions of SLE.

There are several corollaries that one can derive from Theorem 1:

Corollary 1. The SLE map $f$ is Hölder continuous with any exponent less than
$$\alpha_\kappa = 1 - \frac{1}{\mu} - \sqrt{\frac{1}{\mu^2} + \frac{2}{\mu}},$$
where $\mu = (4+\kappa)^2/4\kappa$.

Corollary 2. The Hausdorff dimension of the boundary of the SLE hull for $\kappa \ge 4$ is at most $1 + 2/\kappa$.

Corollary 3. The SLE trace with time parametrization of SLE maps is Hölder continuous. The Hölder exponent is
$$1 - \frac{\kappa}{24 + 2\kappa - 8\sqrt{8+\kappa}}.$$

The first two results are conjectured to be sharp. They have been previously published in [16] and [29], respectively. Both results can be easily derived from the properties of the spectrum (see [25]) and Theorem 1. The third corollary first appeared in a paper by Lind [24], where she uses derivative estimates by Rohde and Schramm. One can use Theorem 1 to prove this result.

Theorem 1 gives the average spectrum of SLE. The question about spectra of individual realizations of SLE remains open. We believe that with probability one they all have the same spectrum $\beta(t)$, which we call the a.s. spectrum. It is immediate that the tangent line at $t = 3(4+\kappa)^2/32\kappa$ intersects the y-axis at $-(4+\kappa)^2/16\kappa < -1$. This contradicts Makarov's characterization of possible spectra [25], which in particular states that the tangent line to $\beta(t)$ should intersect the y-axis between $0$ and $-1$. Thus $\bar\beta$ cannot be the spectrum of any given domain. In particular $\bar\beta$ is not the a.s. spectrum of SLE.
On the other hand it suggests that the following conjecture is true.

Conjecture 1. Let $t_{\min}$ and $t_{\max}$ be the two points such that the tangents to $\bar\beta(t)$ at $t_{\min}$ and $t_{\max}$ intersect the y-axis at $-1$. The almost sure value of the spectrum is equal to $\bar\beta(t)$ for $t_{\min} \le t \le t_{\max}$ and continues as the tangents for $t < t_{\min}$ and $t > t_{\max}$. Explicit formulas for $t_{\min}$, $t_{\max}$, and the tangent lines are given in (4) and (5). See Fig. 1 for plots of $\beta$ and $\bar\beta$.
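The three branches of Theorem 1 can be checked numerically. The following is a minimal sketch, assuming the first branch carries the constant $-1$ (as required for agreement with (14) in Theorem 2); the function name `average_spectrum` is illustrative and not from the paper. It evaluates the spectrum and lets one confirm that the branches match at $t = -1 - 3\kappa/8$ and $t = 3(4+\kappa)^2/32\kappa$, and that the tangent at the latter point meets the y-axis at $-(4+\kappa)^2/16\kappa$.

```python
import math


def average_spectrum(t, kappa):
    """Average integral means spectrum of SLE, following the branches of Theorem 1."""
    s = 4 + kappa
    t_tip = -1 - 3 * kappa / 8           # transition caused by the tip
    t_c = 3 * s ** 2 / (32 * kappa)      # transition where the slope reaches 1
    if t >= t_c:
        return t - s ** 2 / (16 * kappa)
    root = math.sqrt(s ** 2 - 8 * t * kappa)
    if t <= t_tip:
        # Assumption: the trailing -1 is part of this branch (it equals -beta - 2*gamma - 1).
        return -t + kappa * (s - root) / (4 * kappa) - 1
    return -t + s * (s - root) / (4 * kappa)


if __name__ == "__main__":
    kappa = 6.0
    t_c = 3 * (4 + kappa) ** 2 / (32 * kappa)
    # Intercept of the slope-one tangent at t_c, to compare with -(4 + kappa)^2 / (16 kappa).
    print(average_spectrum(t_c, kappa) - t_c, -(4 + kappa) ** 2 / (16 * kappa))
```

For $\kappa = 4$ the intercept equals $-1$ exactly, which is the borderline case of Makarov's condition.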

Fig. 1. Plots of the $\beta$ and $\bar\beta$ spectra (the points $t_{\min}$, $1$, $t_{\max}$ and $t_c$ are marked on the horizontal axis). We also show the graph of $\tilde\beta$ (the analytic part of the spectra) as well as the tangent lines at $t_{\min}$, $t_{\max}$, and $t_c = 3(4+\kappa)^2/32\kappa$. The almost sure spectrum is equal to $\tilde\beta$ as long as it does not violate Makarov's condition that tangent lines should intersect the y-axis above $-1$. This happens for $t_{\min} < t < t_{\max}$. Outside of this interval $\beta$ continues as tangent lines. The average spectrum is given by $\tilde\beta$ as long as the derivative is less than 1. At $t = t_c$ the derivative is equal to 1 and $\bar\beta$ continues as a straight line for $t > t_c$

The rest of the paper is organized in the following way. In the first part of Sect. 2 we discuss Duplantier's prediction and Conjecture 1. In the second part we compute the moments of $|f'|$ and prove Theorem 1. In Sect. 3.1 we make some remarks about possible generalizations of SLE. In the last Sect. 3.2 we explain a possible approach to Conjecture 1.

2. Integral Means Spectrum of SLE

2.1. Duplantier's prediction for the spectrum of the bulk. In 2000, by means of quantum gravity, Duplantier predicted that the Hausdorff dimension spectrum of the bulk of SLE is
$$f(\alpha) = \alpha - \frac{(25-c)(\alpha-1)^2}{12(2\alpha-1)}, \qquad \alpha \ge 1/2,$$
where $c$ is the central charge, which is related to $\kappa$ by
$$c = \frac{(6-\kappa)(6-16/\kappa)}{4}.$$

The negative values of $f$ do not have a simple geometric interpretation; they correspond to negative dimensions (see papers by Mandelbrot [26,27]), which appear only in the random setting. They correspond to events that have zero probability in the limit, but appear on finite scales as exceptional events. There is another interpretation in terms of the $\beta$ spectrum which we explain below. Since negative values of $f$ correspond to zero probability events, it makes sense to introduce the positive part of the spectrum: $f^+ = \max\{f, 0\}$. We believe that $f^+$ is the almost sure value of the dimension spectrum. This is the dimension spectrum counterpart of Conjecture 1. The function $f^+$ is equal to $f$ for $\alpha \in [\alpha_{\min}, \alpha_{\max}]$, where
$$\alpha_{\min} = \frac{16 + 4\kappa + \kappa^2 - 2\sqrt{2}\sqrt{16\kappa + 10\kappa^2 + \kappa^3}}{(4-\kappa)^2}, \qquad \kappa \ne 4,$$
$$\alpha_{\max} = \frac{16 + 4\kappa + \kappa^2 + 2\sqrt{2}\sqrt{16\kappa + 10\kappa^2 + \kappa^3}}{(4-\kappa)^2}, \qquad \kappa \ne 4,$$
$$\alpha_{\min} = \frac{2}{3}, \qquad \kappa = 4,$$
$$\alpha_{\max} = \infty, \qquad \kappa = 4.$$
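The endpoints $\alpha_{\min}$ and $\alpha_{\max}$ are the zeros of Duplantier's $f(\alpha)$, which is easy to confirm numerically. The sketch below assumes the closed forms as reconstructed above (with square roots over $16\kappa + 10\kappa^2 + \kappa^3$ and the condition $\kappa \ne 4$); the helper names are illustrative only.

```python
import math


def central_charge(kappa):
    """Central charge c as a function of kappa."""
    return (6 - kappa) * (6 - 16 / kappa) / 4


def f_spectrum(alpha, kappa):
    """Duplantier's predicted dimension spectrum f(alpha) for the bulk."""
    c = central_charge(kappa)
    return alpha - (25 - c) * (alpha - 1) ** 2 / (12 * (2 * alpha - 1))


def alpha_roots(kappa):
    """Return (alpha_min, alpha_max) for kappa != 4, where f vanishes."""
    base = 16 + 4 * kappa + kappa ** 2
    root = 2 * math.sqrt(2) * math.sqrt(16 * kappa + 10 * kappa ** 2 + kappa ** 3)
    denom = (4 - kappa) ** 2
    return (base - root) / denom, (base + root) / denom


if __name__ == "__main__":
    for kappa in (2.0, 6.0, 8.0):
        a_min, a_max = alpha_roots(kappa)
        print(kappa, a_min, a_max, f_spectrum(a_min, kappa), f_spectrum(a_max, kappa))
```

Note that $\kappa = 2$ and $\kappa = 8$ give the same pair of roots, since both values correspond to the same central charge.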

It is known (see [25]) that for regular fractals the $\beta(t)$ spectrum is related to the $f(\alpha)$ spectrum by the Legendre transform. We believe those relations to hold for SLE as well:
$$\beta(t) - t + 1 = \sup_{\alpha > 0}\,(f(\alpha) - t)/\alpha, \qquad f(\alpha) = \inf_{t}\,\bigl(t + \alpha(\beta(t) - t + 1)\bigr).$$

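The Legendre transform of $f^+$ is supposed to be equal to the almost sure value of the integral means spectrum $\beta(t)$, while the Legendre transform of $f$ is believed to be equal to the average integral means spectrum $\bar\beta(t)$. The Legendre transform of $f^+$ has two phase transitions: one for negative $t$ and one for positive. The Legendre transform of $f^+$ is equal to
$$\beta(t) = \begin{cases} t\Bigl(1 - \dfrac{1}{\alpha_{\min}}\Bigr) - 1, & t \le t_{\min}, \\[2mm] -t + \dfrac{(4+\kappa)\bigl(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}\bigr)}{4\kappa}, & t_{\min} < t < t_{\max}, \\[2mm] t\Bigl(1 - \dfrac{1}{\alpha_{\max}}\Bigr) - 1, & t \ge t_{\max}, \end{cases} \tag{3}$$
where
$$t_{\min} = -f'(\alpha_{\min})\,\alpha_{\min}, \quad \kappa > 0, \qquad t_{\max} = -f'(\alpha_{\max})\,\alpha_{\max}, \quad \kappa \ne 4, \qquad t_{\max} = 3/2, \quad \kappa = 4.$$
We can also express $t_{\min}$ and $t_{\max}$ in terms of $\mu = 4/\kappa + 2 + \kappa/4 = (4+\kappa)^2/4\kappa$:
$$t_{\min} = \frac{-1 - 2\mu - (1+\mu)\sqrt{1+2\mu}}{\mu}, \qquad t_{\max} = \frac{-1 - 2\mu + (1+\mu)\sqrt{1+2\mu}}{\mu}.$$
And the linear functions in (3) can be written as
$$t\Bigl(\frac{1}{\sqrt{1 - 2t_{\min}/\mu}} - 1\Bigr) - 1, \tag{4}$$
$$t\Bigl(\frac{1}{\sqrt{1 - 2t_{\max}/\mu}} - 1\Bigr) - 1. \tag{5}$$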
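The defining property of $t_{\min}$ and $t_{\max}$, namely that the tangent to the analytic branch of (3) meets the y-axis at $-1$, can be verified directly from the formulas in terms of $\mu$. This is a small sketch under the assumption that the analytic branch is $-t + (4+\kappa)(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa})/4\kappa$; the function names are hypothetical.

```python
import math


def beta_tilde(t, kappa):
    """Analytic branch of the spectrum (middle line of (3))."""
    s = 4 + kappa
    return -t + s * (s - math.sqrt(s ** 2 - 8 * t * kappa)) / (4 * kappa)


def beta_tilde_prime(t, kappa):
    """Derivative of the analytic branch with respect to t."""
    s = 4 + kappa
    return -1 + s / math.sqrt(s ** 2 - 8 * t * kappa)


def critical_points(kappa):
    """Return (t_min, t_max) expressed through mu = (4 + kappa)^2 / (4 kappa)."""
    mu = (4 + kappa) ** 2 / (4 * kappa)
    root = (1 + mu) * math.sqrt(1 + 2 * mu)
    return (-1 - 2 * mu - root) / mu, (-1 - 2 * mu + root) / mu


def tangent_intercept(t, kappa):
    """Value at 0 of the tangent line to beta_tilde at the point t."""
    return beta_tilde(t, kappa) - t * beta_tilde_prime(t, kappa)


if __name__ == "__main__":
    for kappa in (1.0, 4.0, 8.0):
        t_min, t_max = critical_points(kappa)
        print(kappa, t_min, t_max, tangent_intercept(t_min, kappa), tangent_intercept(t_max, kappa))
```

The slope returned by `beta_tilde_prime` at $t_{\min}$ or $t_{\max}$ coincides with the coefficient $1/\sqrt{1 - 2t/\mu} - 1$ appearing in (4) and (5).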


For convenience we introduce
$$\tilde\beta(t) = -t + \frac{(4+\kappa)\bigl(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}\bigr)}{4\kappa},$$
which is the analytic part of the spectrum and is defined for all $t < (4+\kappa)^2/8\kappa$. This function is the analytic part of the Legendre transform of $f$. The critical points $t_{\max}$ and $t_{\min}$ are the points where the tangent line to the graph of $\bar\beta(t)$ intersects the y-axis at $-1$. The Legendre transform of $f^+$ is equal to $\tilde\beta(t)$ between these two critical points and then continues as a linear function. Note that Makarov's theorem [25] states that all possible integral means spectra satisfy the following conditions: they are non-negative convex functions bounded by the universal spectrum such that the tangent line at any point intersects the y-axis between $0$ and $-1$. So there is another way to describe the Legendre transform of $f^+$: it coincides with $\tilde\beta$ as long as this does not contradict Makarov's criteria and then continues in the only possible way.

If we do not cut off the negative part of $f$, then the picture is a bit different. There is no phase transition for negative $t$. For positive $t$, the phase transition occurs later, and it happens because the derivative of $f(\alpha)$ is bounded at infinity. For large $\alpha$,
$$f(\alpha) = \alpha\Bigl(1 - \frac{(4+\kappa)^2}{16\kappa}\Bigr) + \frac{3(4+\kappa)^2}{32\kappa} + O\Bigl(\frac{1}{\alpha}\Bigr),$$
hence
$$\bar\beta(t) = -t + \frac{(4+\kappa)\bigl(4+\kappa-\sqrt{(4+\kappa)^2-8t\kappa}\bigr)}{4\kappa}, \qquad t \le \frac{3(4+\kappa)^2}{32\kappa},$$
$$\bar\beta(t) = 1 - \frac{(4+\kappa)^2}{16\kappa} + t - 1 = t - \frac{(4+\kappa)^2}{16\kappa}, \qquad t > \frac{3(4+\kappa)^2}{32\kappa}.$$
The explanation of this phase transition is rather simple. It is obvious that $\bar\beta(t)$ is a convex function, and it follows from Makarov's fractal approximation that the average spectrum is bounded by the universal spectrum. It is known that for large values of $|t|$ the universal spectrum is equal to $|t| - 1$. Altogether this implies that $|\bar\beta'(t)| \le 1$, and if it is equal to 1 at some point then $\bar\beta$ should be linear after this point. And $\bar\beta' = 1$ exactly at $t = 3(4+\kappa)^2/32\kappa$.

2.2. Rigorous computation of the spectrum. In this section we compute the average integral means spectrum of SLE (and its bulk) and show that it coincides with the Legendre transform of the dimension spectrum predicted by Duplantier.

The average integral means spectrum is the growth rate of $\tilde F(z, \tau) = E|f_\tau'(z)|^t$, where $f_\tau$ is a radial $\mathrm{SLE}_\kappa$. Actually, this function depends also on $t$ and $\kappa$, but they are fixed throughout the proof and we will not mention this dependence, to simplify the notation.

Lemma 4. The function $\tilde F(z, \tau)$ is a solution of
$$t\,\frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2}\,\tilde F + \frac{r(r^2-1)}{r^2 - 2r\cos\theta + 1}\,\tilde F_r - \frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,\tilde F_\theta + \frac{\kappa}{2}\,\tilde F_{\theta,\theta} - \tilde F_\tau = 0, \tag{6}$$
where $z = re^{i\theta}$.


Proof. The idea of the proof is to construct a martingale $M_s$ (w.r.t. the filtration defining SLE) which involves $\tilde F$. The $ds$ term in its Itô derivative should vanish. This will give us a partial differential equation on $\tilde F$. We set $M_s = E\bigl[|f_\tau'(z)|^t \mid \mathcal{F}_s\bigr]$. By Lemma 2,
$$E\bigl[|f_\tau'(z)|^t \mid \mathcal{F}_s\bigr] = E\bigl[|f_s'(z)|^t\,|f_{\tau-s}'(f_s(z)/\xi_s)|^t \mid \mathcal{F}_s\bigr] = |f_s'(z)|^t\,\tilde F(z_s, \tau - s),$$
where $z_s = f_s(z)/\xi_s$. We will need derivatives of $z_s$ and $|f_s'|^t$,
$$\partial_s \log|f_s'(z)| = \mathrm{Re}\,\frac{\partial_z\Bigl(f_s\,\dfrac{f_s + \xi_s}{f_s - \xi_s}\Bigr)}{f_s'} = \mathrm{Re}\Bigl(\frac{f_s + \xi_s}{f_s - \xi_s} - \frac{2\xi_s f_s}{(f_s - \xi_s)^2}\Bigr) = \mathrm{Re}\,\frac{z_s^2 - 1 - 2z_s}{(z_s - 1)^2} = \frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2},$$
where $z_s = r\exp(i\theta)$. Next we have to find the derivative of $z_s$,
$$d\log z_s = d\log r + i\,d\theta = d\log f_s - i\sqrt{\kappa}\,dB_s,$$
where
$$d\log f_s = \frac{df_s}{f_s} = \frac{z_s + 1}{z_s - 1}\,ds.$$
Writing everything in terms of $r$ and $\theta$ we get
$$d\log r + i\,d\theta = \frac{z_s + 1}{z_s - 1}\,ds - i\sqrt{\kappa}\,dB_s = \frac{r^2 - 1}{r^2 - 2r\cos\theta + 1}\,ds + i\Bigl(-\frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,ds - \sqrt{\kappa}\,dB_s\Bigr).$$
Summing it all up we obtain
$$\partial_s \log|f_s'(z)| = \frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2}, \tag{7}$$
$$d\theta = -\frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,ds - \sqrt{\kappa}\,dB_s, \tag{8}$$
$$dr = r\,d\log r = \frac{r(r^2 - 1)}{r^2 - 2r\cos\theta + 1}\,ds. \tag{9}$$
Let us write $\tilde F(z, \tau)$ as $\tilde F(r, \theta, \tau)$. The $ds$ term in the Itô derivative of $M$ is equal to
$$|f_s'(z)|^t\Bigl(t\,\frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2}\,\tilde F + \frac{r(r^2-1)}{r^2 - 2r\cos\theta + 1}\,\tilde F_r - \frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,\tilde F_\theta + \frac{\kappa}{2}\,\tilde F_{\theta,\theta} - \tilde F_\tau\Bigr).$$
This derivative should be 0 and, since $f_s$ is a univalent function and its derivative never vanishes, $\tilde F$ is a solution of (6).
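The passage from complex to polar form in the proof of Lemma 4 is a routine but error-prone computation. The sketch below compares the complex expressions $\mathrm{Re}\,(z^2 - 1 - 2z)/(z-1)^2$ and $(z+1)/(z-1)$ with the polar formulas in (7), (8) and (9) at a few sample points; the function names are illustrative and not from the paper.

```python
import cmath
import math


def rate_complex(r, theta):
    """Re (z^2 - 1 - 2z) / (z - 1)^2 for z = r e^{i theta}."""
    z = cmath.rect(r, theta)
    return ((z * z - 1 - 2 * z) / (z - 1) ** 2).real


def rate_polar(r, theta):
    """Right-hand side of (7)."""
    d = r * r - 2 * r * math.cos(theta) + 1
    return (r ** 4 + 4 * r ** 2 * (1 - r * math.cos(theta)) - 1) / d ** 2


def drift_complex(r, theta):
    """(z + 1) / (z - 1), the ds coefficient of d log z_s."""
    z = cmath.rect(r, theta)
    return (z + 1) / (z - 1)


def drift_polar(r, theta):
    """Real part: ds coefficient of d log r; imaginary part: ds coefficient of d theta."""
    d = r * r - 2 * r * math.cos(theta) + 1
    return complex((r * r - 1) / d, -2 * r * math.sin(theta) / d)


if __name__ == "__main__":
    for r, theta in ((1.5, 0.3), (2.0, math.pi / 2), (1.01, -1.2)):
        print(rate_complex(r, theta), rate_polar(r, theta))
```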


By Lemma 3 there is a limit of $e^{-\tau}f_\tau$ as $\tau \to \infty$. Hence we can introduce
$$F(z) = E\bigl[|F_0'(z)|^t\bigr] = \lim_{\tau\to\infty} e^{-\tau t}\,\tilde F(z, \tau),$$
where $F_0$ is a whole-plane SLE map at time zero. Passing to the limit in (6) we can see that $F(z)$ is a solution of
$$t\Bigl(\frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2} - 1\Bigr)F + \frac{r(r^2-1)}{r^2 - 2r\cos\theta + 1}\,F_r - \frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,F_\theta + \frac{\kappa}{2}\,F_{\theta,\theta} = 0. \tag{10}$$

Notation 1. We define two constants $\beta$ and $\gamma$:
$$\gamma = \gamma(t, \kappa) = \frac{4 + \kappa - \sqrt{(4+\kappa)^2 - 8t\kappa}}{2\kappa}, \tag{11}$$
$$\beta = \beta(t, \kappa) = t - \frac{(4+\kappa)\gamma}{2}. \tag{12}$$
It is easy to see that the second constant $\beta$ is equal to $-\tilde\beta$.

Let us explain where these constants come from. Roughly speaking, the spectrum $\beta(t)$ is the growth rate of $F$ as $r \to 1$. $F$ is a solution of Eq. (10), which is parabolic as $r \to 1$. It has a singularity when $|z| = 1$ which corresponds to the large time singularity in the usual parabolic equation. Coefficients of (10) have singularities at $z = 1$, which means that solutions could have an additional singularity at $z = 1$. Let us assume that $F$ has a power series expansion near 1. Then we can write the power series expansion of the coefficients of (10) and, assuming that the leading term is $(r-1)^{\beta}((r-1)^2 + \theta^2)^{\gamma}$, we get equations on $\beta$ and $\gamma$. The constants $\gamma$ and $\beta$ are solutions of these equations.

Now let us explain why it makes sense to consider this expansion. There is another (and more popular) version of SLE: the chordal SLE in the upper half-plane, which is defined as the solution of
$$\partial_\tau f_\tau(z) = -\frac{2}{f_\tau(z) - \sqrt{\kappa}B_\tau}.$$
If we define $F(x, y, \tau) = E|f_\tau'(x + iy)|^t$, then an argument similar to the one presented above proves that $F$ satisfies a certain PDE. If we remove the $F_\tau$ term (which should be irrelevant for large $\tau$) then the equation will be
$$2t\,\frac{x^2 - y^2}{(x^2 + y^2)^2}\,F - \frac{2x}{x^2 + y^2}\,F_x + \frac{2y}{x^2 + y^2}\,F_y + \frac{\kappa}{2}\,F_{xx} = 0. \tag{13}$$
This equation is "tangent" to (10) at $r = 1$ and $\theta = 0$. This equation has a solution of the form $y^{\beta}(x^2 + y^2)^{\gamma}$, where $\beta$ and $\gamma$ are as above. Actually, this is the way we found these exponents. This approach seems to be easier, but there are two major problems. First, it is not easy to argue that we can neglect the derivative with respect to $\tau$. Another problem is that $y^{\beta}(x^2 + y^2)^{\gamma}$ cannot be equal to $F$ since it blows up at infinity, and we have to show that the local behavior does not depend on the boundary conditions at infinity. When this work was finished we learned from Gruzberg that several years ago Hastings in [13] derived Eq. (13) by completely different methods (and for completely different purposes).
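The claim that $y^{\beta}(x^2+y^2)^{\gamma}$ solves (13) can be checked by substituting the closed-form derivatives of this function. The following sketch assumes that the zeroth-order coefficient of (13) is $2t(x^2-y^2)/(x^2+y^2)^2$ and uses $\beta$, $\gamma$ from (11) and (12); the helper names are illustrative.

```python
import math


def exponents(t, kappa):
    """Return (beta, gamma) as defined in (12) and (11)."""
    gamma = (4 + kappa - math.sqrt((4 + kappa) ** 2 - 8 * t * kappa)) / (2 * kappa)
    beta = t - (4 + kappa) * gamma / 2
    return beta, gamma


def residual(x, y, t, kappa):
    """Left-hand side of (13) evaluated on F = y^beta (x^2 + y^2)^gamma, y > 0."""
    beta, gamma = exponents(t, kappa)
    rho = x * x + y * y
    F = y ** beta * rho ** gamma
    # Exact partial derivatives of F.
    F_x = 2 * gamma * x * F / rho
    F_y = F * (beta / y + 2 * gamma * y / rho)
    F_xx = F * (2 * gamma / rho + 4 * gamma * (gamma - 1) * x * x / rho ** 2)
    return (2 * t * (x * x - y * y) / rho ** 2 * F
            - 2 * x / rho * F_x
            + 2 * y / rho * F_y
            + kappa / 2 * F_xx)


if __name__ == "__main__":
    print(residual(0.7, 1.3, 0.5, 2.0))
    print(residual(-1.1, 0.4, -2.0, 6.0))
```

Collecting the coefficients of $x^2$ and $y^2$ in the residual gives exactly $\kappa\gamma^2 - (4+\kappa)\gamma + 2t = 0$ and $\beta = t - (4+\kappa)\gamma/2$, which is how (11) and (12) arise from this ansatz.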

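Theorem 2. Let
$$t \le \frac{3(4+\kappa)^2}{32\kappa}.$$
Then we have
$$E\int_{|z|=r} |F_0'(re^{i\theta})|^t\,d\theta \asymp \Bigl(\frac{1}{r-1}\Bigr)^{\bar\beta(t)},$$
where the expectation is taken for a whole-plane SLE map $F_0 = \lim e^{-\tau}f_\tau$ and $\bar\beta(t)$ is equal to
$$\bar\beta(t) = \begin{cases} -\beta(t, \kappa), & t > -1 - \dfrac{3\kappa}{8}, \\[2mm] -\beta(t, \kappa) - 2\gamma(t, \kappa) - 1, & t \le -1 - \dfrac{3\kappa}{8}. \end{cases} \tag{14}$$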

Proof. Let $\Lambda$ be the differential operator which corresponds to Eq. (10). This is a parabolic operator where $\theta$ corresponds to the spatial variable and $r \to 1$ corresponds to the time variable. It is clear that $F(z)$ is bounded on any circle of radius $r_0 > 1$. Suppose that we can find positive functions $\varphi_+$ and $\varphi_-$ which are bounded on the circle of radius $r_0$ and such that $\Lambda\varphi_- < 0$ and $\Lambda\varphi_+ > 0$. Then there are positive constants $c_+$ and $c_-$ such that $F$ is between $c_+\varphi_+$ and $c_-\varphi_-$ on the circle of radius $r_0$. By the maximum principle it will be between $c_+\varphi_+$ and $c_-\varphi_-$ for all $1 < r < r_0$. In Lemma 6 we will construct such functions $\varphi_-$ and $\varphi_+$. They are of the form
$$\varphi_\pm = (r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}(-\log(r-1))^{\mp 1}\,g(r^2 - 2r\cos\theta + 1),$$
where $g > 0$ for $r = 1$. Both functions have the same polynomial growth rate as $r \to 1$, thus $F$ also has the same growth rate. By the Tonelli theorem
$$E\int |F_0'|^t = \int E|F_0'(r, \theta)|^t\,d\theta \approx \int (r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}\,d\theta,$$
where $\approx$ means that the functions have the same polynomial growth rate. For $\gamma > -1/2$ the weight $(r^2 - 2r\cos\theta + 1)^{\gamma}$ is integrable up to the boundary and we immediately get
$$E\int_{|z|=r} |F_0'|^t \approx \Bigl(\frac{1}{r-1}\Bigr)^{-\beta}.$$
For $\gamma \le -1/2$ the situation is a bit different. In this case the integral of the weight blows up as $(r-1)^{2\gamma+1}$, which gives us
$$E\int |F_0'|^t\,d\theta \approx (r-1)^{\beta + 2\gamma + 1}.$$
It is easy to check that $\gamma \le -1/2$ if and only if $t \le -1 - 3\kappa/8$.

Remark 3. The growth rate of $E\int |F_0'|^t$ is similar to $\bar\beta(t)$ predicted by Duplantier. The phase transition at $t = -1 - 3\kappa/8$ is due to the exceptional behavior of SLE at the tip. If we integrate over values of $\theta$ bounded away from 0 then the weight $|z-1|^{2\gamma}$ does not blow up and we have no phase transition at $t = -1 - 3\kappa/8$ any more. This gives us the spectrum of the bulk of SLE.

Now we can prove Theorem 1, which is actually Theorem 2 stated in terms of the integral means spectrum. This theorem proves that Duplantier's prediction for $\bar\beta(t)$ is correct.
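The two facts used at the end of the proof of Theorem 2 are easy to probe numerically: that $\gamma = -1/2$ exactly at $t = -1 - 3\kappa/8$, and that for $\gamma < -1/2$ the integral of the weight $(r^2 - 2r\cos\theta + 1)^{\gamma}$ over the circle grows like $(r-1)^{2\gamma+1}$. The sketch below uses a plain midpoint rule and two radii to estimate the growth exponent; the grid size, radii and function names are arbitrary choices made for illustration.

```python
import math


def gamma_exponent(t, kappa):
    """gamma(t, kappa) as in (11)."""
    return (4 + kappa - math.sqrt((4 + kappa) ** 2 - 8 * t * kappa)) / (2 * kappa)


def weight_integral(r, gamma, n=100_000):
    """Midpoint rule for the integral of (r^2 - 2 r cos(theta) + 1)^gamma over [-pi, pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi + (k + 0.5) * h
        total += (r * r - 2 * r * math.cos(theta) + 1) ** gamma
    return total * h


def growth_exponent(gamma, eps1=1e-2, eps2=1e-3):
    """Estimate p such that the weight integral behaves like (r - 1)^p as r -> 1."""
    i1 = weight_integral(1 + eps1, gamma)
    i2 = weight_integral(1 + eps2, gamma)
    return math.log(i2 / i1) / math.log(eps2 / eps1)


if __name__ == "__main__":
    gamma = -1.0
    print(growth_exponent(gamma), 2 * gamma + 1)
```

For $\gamma = -1$ the integral is the Poisson-kernel integral $2\pi/(r^2-1)$, so the estimated exponent should be close to $-1 = 2\gamma + 1$.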


Proof (Theorem 1). Theorem 2 gives us the value of $\bar\beta(t)$ for $t \le 3(4+\kappa)^2/32\kappa$. Direct computations show that the derivative of $-\beta(t, \kappa)$ at $t = 3(4+\kappa)^2/32\kappa$ is equal to one. As we mentioned before, the $\bar\beta$ spectrum is a convex function bounded by the universal spectrum, and the universal spectrum is equal to $|t| - 1$ for large values of $|t|$ (see [7]). This means that if $\bar\beta' = 1$ at some point then it should continue as a linear function with slope one. Hence $\bar\beta$ should continue as $t - (4+\kappa)^2/16\kappa$ for $t > 3(4+\kappa)^2/32\kappa$. Plugging in the values of $\beta$ and $\gamma$ we finish the proof of the theorem.

To complete the proof of Theorem 2 we have to construct functions $\varphi_-$ and $\varphi_+$. We do it in three steps: first we write the restriction of Eq. (10) to the unit circle, then we find a positive solution $g$ of the resulting equation. Finally we construct $\varphi_-$ and $\varphi_+$ out of $g$. We look for a solution in the following form:
$$f(r, \theta) = (r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}\,g(r^2 - 2r\cos\theta + 1).$$
Plugging $f$ into (10), factoring $(r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma-2}$ out, and taking $r = 1$, we obtain a differential equation on $g(2 - 2\cos\theta)$. Using relations between $\beta$, $\gamma$, $t$, and $\kappa$ we can simplify coefficients and write the equation in the following form:
$$-2(2+\kappa)\gamma(1-\cos\theta)^2\,g(2 - 2\cos\theta) + (2 - 2\cos\theta)\bigl(-2 - \kappa + 2\gamma\kappa + 2\kappa\cos\theta - (\kappa - 2 + 2\gamma\kappa)\cos(2\theta)\bigr)\,g'(2 - 2\cos\theta) + 2\kappa(2 - 2\cos\theta)^2\sin^2\theta\;g''(2 - 2\cos\theta) = 0. \tag{15}$$

Lemma 5. Equation (15) has a smooth (with possible exception at $\theta = 0$) positive bounded solution on the circle if and only if
$$t \le \frac{3(4+\kappa)^2}{32\kappa}. \tag{16}$$

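Proof. Changing the variable to $x = 2 - 2\cos\theta$ we rewrite (15) as a hypergeometric equation
$$\gamma(2+\kappa)\,g(x) + \bigl(8 - 2x + \kappa(x-2) + 2\gamma\kappa(x-4)\bigr)\,g'(x) + \kappa(x-4)x\,g''(x) = 0, \tag{17}$$
which has two independent solutions
$$g_1(x) = {}_2F_1\Bigl(a, b, \frac{1}{2} + a + b, \frac{x}{4}\Bigr)$$
and
$$g_2(x) = x^{1/2 - a - b}\;{}_2F_1\Bigl(\frac{1}{2} - a, \frac{1}{2} - b, \frac{3}{2} - a - b, \frac{x}{4}\Bigr),$$
where
$$a = \gamma - \frac{1}{\kappa} - \frac{\sqrt{1 - 2t\kappa}}{\kappa}, \qquad b = \gamma - \frac{1}{\kappa} + \frac{\sqrt{1 - 2t\kappa}}{\kappa}.$$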
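That (17) is a hypergeometric equation in the variable $x/4$ with these parameters amounts to the identities $a + b = 2\gamma - 2/\kappa$ and $ab = \gamma(2+\kappa)/\kappa$, where the second one uses $\kappa\gamma^2 - (4+\kappa)\gamma + 2t = 0$. The sketch below checks them in floating point, including the case $1 - 2t\kappa < 0$ where $a$ and $b$ are complex conjugates, and evaluates the quantity $1/2 - a - b$ whose sign governs boundedness of $g_2$ at $0$. Function names are illustrative.

```python
import cmath
import math


def gamma_exponent(t, kappa):
    """gamma(t, kappa) as in (11)."""
    return (4 + kappa - math.sqrt((4 + kappa) ** 2 - 8 * t * kappa)) / (2 * kappa)


def hypergeometric_parameters(t, kappa):
    """Return (a, b); they are complex conjugates when t > 1 / (2 kappa)."""
    g = gamma_exponent(t, kappa)
    root = cmath.sqrt(1 - 2 * t * kappa)
    return g - 1 / kappa - root / kappa, g - 1 / kappa + root / kappa


def half_minus_sum(t, kappa):
    """The exponent 1/2 - a - b of the second solution g2 (always real)."""
    a, b = hypergeometric_parameters(t, kappa)
    return (0.5 - a - b).real


if __name__ == "__main__":
    kappa = 2.0
    t_c = 3 * (4 + kappa) ** 2 / (32 * kappa)
    print(half_minus_sum(t_c - 0.1, kappa), half_minus_sum(t_c + 0.1, kappa))
```

The sign change of `half_minus_sum` at $t = 3(4+\kappa)^2/32\kappa$ is the origin of condition (16).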


The function $g(2 - 2\cos\theta)$ is a non-singular part of $F$ and should have a second derivative everywhere on the unit circle except at the point $\theta = 0$ (the equation on $F$ has a singularity at this point). Note that $2 - 2\cos\theta = 4$ corresponds to the point $-1$ on the unit circle: this is not a singular point, hence $g(x)$ should have the expansion $c + O(4 - x)$ at the endpoint 4. Any solution of (15) is a linear combination of $g_1$ and $g_2$: $g = c_1 g_1 + c_2 g_2$. We want to find coefficients $c_1$ and $c_2$ such that this sum is bounded and has a correct expansion at $x = 4$. Expansions of $g_1$ and $g_2$ at 4 are
$$g_1(x) = \frac{\sqrt{\pi}\,\Gamma(1/2 + a + b)}{\Gamma(1/2 + a)\Gamma(1/2 + b)} - \frac{\sqrt{\pi}\,\Gamma(1/2 + a + b)}{\Gamma(a)\Gamma(b)}\sqrt{4 - x} + O(4 - x),$$
and
$$g_2(x) = \frac{2^{1-2a-2b}\sqrt{\pi}\,\Gamma(3/2 - a - b)}{\Gamma(1 - a)\Gamma(1 - b)} - \frac{2^{1-2a-2b}\sqrt{\pi}\,\Gamma(3/2 - a - b)}{\Gamma(1/2 - a)\Gamma(1/2 - b)}\sqrt{4 - x} + O(4 - x).$$
If $c_2 \ne 0$ then $1/2 - a - b$ should be nonnegative, otherwise $g$ is not bounded at 0. Note that
$$\frac{1}{2} - a - b = \frac{4 + \kappa - 4\gamma\kappa}{2\kappa},$$
which is nonnegative if and only if
$$t \le \frac{3(4+\kappa)^2}{32\kappa},$$
which is exactly the restriction from the statement of the lemma. If $t > 3(4+\kappa)^2/32\kappa$, then $c_2 = 0$. In this case $g$ has a correct expansion at 4 if and only if $1/\Gamma(a) = 0$ or $1/\Gamma(b) = 0$; but $1 - 2t\kappa < 0$, so neither $a$ nor $b$ is a real number, while the gamma function has only real poles.

We can introduce
$$C = \frac{\Gamma(1/2 + a + b)\,\Gamma(1/2 - a)\,\Gamma(1/2 - b)}{2^{1-2a-2b}\,\Gamma(a)\,\Gamma(b)\,\Gamma(3/2 - a - b)},$$
and $g_3(x) = g_1(x) - Cg_2(x)$. By construction $g_3(x) = \mathrm{const} + O(4 - x)$ near 4. Finally we have to prove that $g_3$ is a positive function. Note that in (17) $g$ and $g''$ have coefficients of different signs. Obviously, $g_3(0) = 1$. Suppose that $g_3$ has a local minimum inside the interval $(0, 4)$,


then $g_3' = 0$ and $g_3'' \ge 0$ at this point, hence $g_3$ is also positive. Thus it is sufficient to check that $g_3(4) > 0$. The value of $g_3(4)$ is easy to evaluate:
$$g_3(4) = \sqrt{\pi}\,\Gamma(1/2 + a + b)\Bigl(\frac{1}{\Gamma(1/2 + a)\Gamma(1/2 + b)} - \frac{\Gamma(1/2 - a)\Gamma(1/2 - b)}{\Gamma(a)\Gamma(b)\Gamma(1 - a)\Gamma(1 - b)}\Bigr)$$
$$= \frac{\sqrt{\pi}\,\Gamma(1/2 + a + b)}{\Gamma(1/2 + a)\Gamma(1/2 + b)}\cdot\frac{\cos(\pi(a + b))}{\cos(\pi a)\cos(\pi b)} = \pi^{-3/2}\,\Gamma(1/2 + a + b)\cos(\pi(a + b))\,\Gamma(1/2 - a)\Gamma(1/2 - b).$$
By (16), $a + b < 1/2$, hence $\Gamma(1/2 + a + b)\cos(\pi(a + b)) > 0$. Finally we have to show that $\Gamma(1/2 - a)\Gamma(1/2 - b) > 0$. We consider two different cases: when $t \le 1/2\kappa$ and $t > 1/2\kappa$. In the second case $a$ and $b$ are conjugate and $\Gamma(1/2 - a)\Gamma(1/2 - b) = |\Gamma(1/2 - a)|^2 > 0$. In the first case, we will prove that $1/2 - a > 0$ and $1/2 - b > 0$. It is easy to see that $1/2 - b < 1/2 - a$, hence it is sufficient to prove that $1/2 - b > 0$. Recall that
$$\frac{1}{2} - b = \frac{1}{2} - \gamma + \frac{1}{\kappa} - \frac{\sqrt{1 - 2t\kappa}}{\kappa},$$
hence
$$\partial_t(1/2 - b) = \frac{1}{\sqrt{1 - 2\kappa t}} - \frac{2}{\sqrt{(4+\kappa)^2 - 8t\kappa}} > 0.$$
This means that $1/2 - b$ has a minimum when $t = 0$; this minimum is
$$\frac{1}{2} - b(0) = \frac{1}{2} - \gamma(0) = \frac{1}{2} > 0.$$
This proves that $g_3(x) > 0$ on $[0, 4]$.

Lemma 6. Let $g$ be a positive bounded solution of (15) and
$$F = f(r, \theta)(-\log(r-1))^{\delta} = (r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}\,g(r^2 - 2r\cos\theta + 1)\,(-\log(r-1))^{\delta}.$$
Then $\Lambda F > 0$ if $\delta < 0$, and $\Lambda F < 0$ if $\delta > 0$, for $r$ sufficiently close to 1.

Proof. Applying $\Lambda$ we find
$$\Lambda F = (-\log(r-1))^{\delta}\Bigl(\Lambda f - f\,\frac{r(r+1)\delta}{(r^2 - 2r\cos\theta + 1)(-\log(r-1))}\Bigr).$$
By Lemma 5, $\Lambda f = (r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}\,O(r-1)$, hence
$$\Lambda F = (-\log(r-1))^{\delta}(r-1)^{\beta}(r^2 - 2r\cos\theta + 1)^{\gamma}\Bigl(O(r-1) - \frac{r(r+1)\delta\,\bigl(g(2 - 2\cos\theta) + O(r-1)\bigr)}{(r^2 - 2r\cos\theta + 1)(-\log(r-1))}\Bigr).$$
The sign of the main term is opposite to the sign of $\delta$. This proves the claim.
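The simplification of $g_3(4)$ in the proof of Lemma 5 rests on the reflection formulas $\Gamma(z)\Gamma(1-z) = \pi/\sin(\pi z)$ and $\Gamma(1/2+z)\Gamma(1/2-z) = \pi/\cos(\pi z)$. The sketch below compares the bracketed form with the final closed form for real sample values of $a$ and $b$ only (the standard-library `math.gamma` does not accept complex arguments, so the conjugate case is not covered); the function names are illustrative.

```python
import math


def g3_at_4_bracket(a, b):
    """sqrt(pi) Gamma(1/2+a+b) * (1/(G(1/2+a)G(1/2+b)) - G(1/2-a)G(1/2-b)/(G(a)G(b)G(1-a)G(1-b)))."""
    prefactor = math.sqrt(math.pi) * math.gamma(0.5 + a + b)
    first = 1 / (math.gamma(0.5 + a) * math.gamma(0.5 + b))
    second = (math.gamma(0.5 - a) * math.gamma(0.5 - b)
              / (math.gamma(a) * math.gamma(b) * math.gamma(1 - a) * math.gamma(1 - b)))
    return prefactor * (first - second)


def g3_at_4_closed(a, b):
    """pi^(-3/2) Gamma(1/2+a+b) cos(pi(a+b)) Gamma(1/2-a) Gamma(1/2-b)."""
    return (math.pi ** -1.5 * math.gamma(0.5 + a + b) * math.cos(math.pi * (a + b))
            * math.gamma(0.5 - a) * math.gamma(0.5 - b))


if __name__ == "__main__":
    for a, b in ((0.1, 0.2), (-0.3, 0.15), (0.05, 0.3)):
        print(g3_at_4_bracket(a, b), g3_at_4_closed(a, b))
```

The sample values avoid integers and half-integers, where individual gamma factors have poles.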


Remark 4. Note that we proved a stronger result than announced in Theorem 2: $E|F'|^t$ has growth rate $(r-1)^{\beta}$ up to a factor $\log^{\delta}(r-1)$ for arbitrarily small $|\delta|$.

3. Concluding Remarks

3.1. Loewner Evolution driven by other processes. It is known that Loewner Evolution can be defined for a very large class of driving functions. In particular, they do not have to be continuous. In [3], we proposed to study Lévy-Loewner Evolution (LLE), which is the Loewner Evolution driven by a Lévy process (i.e. a process with independent stationary increments). This defines a very rich class of random fractals. It seems that it is still possible to find the spectrum of harmonic measure for this class explicitly. In the fundamental Lemma 4 we only use the fact that the Brownian motion is a Lévy process. So the same argument can be applied to LLE. As a result we get that $F = E|e^{-\tau}f_\tau'(z)|^t$ is the solution of
$$t\Bigl(\frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2} - 1\Bigr)F + \frac{r(r^2 - 1)}{r^2 - 2r\cos\theta + 1}\,F_r - \frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,F_\theta + \Lambda F = 0,$$
where $\Lambda$ is the generator of the driving Lévy process. Thus again finding the spectrum boils down to the analysis of a parabolic type integro-differential equation. We have freedom to choose the driving process (and the generator $\Lambda$), so it seems possible to find a driving process such that this equation can be solved and gives a large spectrum.

Unpublished computer experiments by Meyer [28] suggested that the spectrum for the 1-stable process could be large (and possibly equal to the conjectured universal spectrum). Unfortunately, later work by Gruzberg, Guan, Kadanoff, Oikonomou, Rohde, Rushkin, Winkel, and others [11,12,31] showed that this is wrong. But there is still a possibility that the computer experiments exposed an existing phenomenon. It could be that the integral means grow fast for a few (relatively) large scales, and when we approach the boundary their growth slows down.
If this is true, one can use LLE as a building block in a snowflake (or any other construction which allows one to replicate scales). In this way one can hope to construct a domain with large integral means on all scales.

3.2. Almost sure value of the spectrum. In this section, we speculate about what should be done to prove that the almost sure value of the spectrum is given by (3).

Let us introduce random variables $X_k(n) = |f'((1 + 2^{-n})e^{2\pi i k/2^n})|^t$. The spectrum is the growth rate of $2^{-n}\sum_k X_k$. We know that
$$2^{-n}\sum_{k=1}^{2^n} E X_k \asymp 2^{n\bar\beta(t)}.$$
We want to show that the probability
$$P\Bigl(2^{-n}\Bigl|\sum X_k - E X_k\Bigr| > 2^{n(\bar\beta(t) - \delta)}\Bigr) \tag{18}$$

is summable for some positive δ. This will clearly imply that the spectrum of SLE is equal to β(t) with probability one.


Conformal field theory considerations suggest that $X_k$ and $X_l$ are essentially independent if $|k - l| \gg 1$ (in other words the distance between points should be much larger than their distance to the boundary). In fact it is believed that derivatives are essentially independent if the distance between points is greater than any power (less than one) of the distance to the boundary. Let us exaggerate it a little bit more and assume that $X_k$ and $X_l$ are independent for any $k \ne l$. Let us denote $X_k - EX_k$ by $Y_k$. By the Chebyshev inequality the probability (18) is less than
$$\frac{E\bigl|\sum Y_k\bigr|^{1+\epsilon}}{2^{n(1+\epsilon)(\bar\beta(t) + 1 - \delta)}}.$$
It is known (see [1]) that for independent random variables with zero mean
$$E\Bigl|\sum Y_k\Bigr|^{1+\epsilon} \le c\sum E|Y_k|^{1+\epsilon},$$
where $c$ is an absolute constant which does not depend on the number of terms. Using this we can estimate the fraction above by
$$c\,\frac{\sum E|Y_k|^{1+\epsilon}}{2^{n(1+\epsilon)(\bar\beta(t) + 1 - \delta)}} \le c\,\frac{2^n\,2^{n\bar\beta(t + t\epsilon)}}{2^{n(1+\epsilon)(\bar\beta(t) + 1 - \delta)}} = c\,2^{n(1 + \bar\beta(t + t\epsilon) - \bar\beta(t) - 1 + \delta - \epsilon\bar\beta(t) - \epsilon + \epsilon\delta)}. \tag{19}$$
For small $\epsilon < \epsilon_0(t)$ the exponent in the last formula is bounded by
$$n\bigl(\bar\beta'(t)t\epsilon + \epsilon^{3/2} + \delta - \epsilon\bar\beta(t) - \epsilon + \epsilon\delta\bigr) = n\bigl(\epsilon(\bar\beta'(t)t - \bar\beta(t) - 1) + \epsilon^{3/2} + \delta + \epsilon\delta\bigr).$$
If $\bar\beta'(t)t - \bar\beta(t) - 1 = c(t) < 0$, then we can find a small $\epsilon_t$ (depending on $t$ only) such that $\epsilon_t(\bar\beta'(t)t - \bar\beta(t) - 1) + \epsilon_t^{3/2} < c(t)\epsilon_t/2$. Fix $\delta = -\epsilon_t c(t)/4$, then the exponent in (19) is negative. This implies that the probability in (18) is summable if $-1 < \bar\beta(t) - t\bar\beta'(t)$. The last inequality means that the tangent line to $\bar\beta$ at the point $t$ intersects the y-axis above $-1$. This is exactly the condition which appeared in (3).

Thus, assuming the independence of derivatives, we can prove that the almost sure value of the spectrum is equal to $\bar\beta(t)$ for $t_{\min} < t < t_{\max}$. For other values of $t$ Makarov's theorem implies that the spectrum should continue as a straight line tangent to $\bar\beta(t)$ at $t_{\min}$ and $t_{\max}$, respectively.

Acknowledgements. Work supported in part by Swiss National Science foundation, STINT, Göran Gustafsson Foundation, and Knut and Alice Wallenberg Foundation.

References
1. Bahr, B.v., Esseen, C.-G.: Inequalities for the rth absolute moment of a sum of random variables, 1 ≤ r ≤ 2. Ann. Math. Stat. 36, 299–303 (1965)
2. Beffara, V.: Hausdorff dimensions for SLE6. Ann. Probab. 32(3B), 2606–2629 (2004)
3. Beliaev, D., Smirnov, S.: Harmonic measure on fractal sets. In: European Congress of Mathematics. Zürich: Eur. Math. Soc., 2005, pp. 41–59
4. Bettelheim, E., Rushkin, I., Gruzberg, I.A., Wiegmann, P.: Harmonic measure of critical curves. Phys. Rev. Lett. 95(17), 170602 (2005)
5. Brennan, J.E.: The integrability of the derivative in conformal mapping. J. London Math. Soc. (2) 18(2), 261–272 (1978)
6. Carleson, L., Jones, P.W.: On coefficient problems for univalent functions and conformal dimension. Duke Math. J. 66(2), 169–206 (1992)
7. Carleson, L., Makarov, N.G.: Some results connected with Brennan's conjecture. Ark. Mat. 32(1), 33–62 (1994)
8. Dubedat, J.: Duality of Schramm-Loewner evolutions. http://arxiv.org/abs/0711.1884v2[math.PR], 2007
9. Duplantier, B.: Conformally invariant fractals and potential theory. Phys. Rev. Lett. 84(7), 1363–1367 (2000)
10. Duplantier, B.: Higher conformal multifractality. J. Stat. Phys. 110(3-6), 691–738 (2003)
11. Guan, Q.-Y.: Cadlag curves of SLE driven by Levy processes. http://arxiv.org/abs/0705.2321v2[math.PR], 2008
12. Guan, Q.-Y., Winkel, M.: SLE and alpha-SLE driven by Levy processes. http://arxiv.org/abs/math/0606685v1[math.PR], 2006
13. Hastings, M.B.: Exact multifractal spectra for arbitrary laplacian random walks. Phys. Rev. Lett. 88(5), 055506 (2002)
14. Hedenmalm, H., Shimorin, S.: Weighted Bergman spaces and the integral means spectrum of conformal mappings. Duke Math. J. 127(2), 341–393 (2005)
15. Hedenmalm, H., Sola, A.: Spectral notions for conformal maps: a survey. Comput. Methods Funct. Theory 8(1-2), 447–474 (2008)
16. Kang, N.-G.: Boundary behavior of SLE. J. Amer. Math. Soc. 20(1), 185–210 (electronic) (2007)
17. Kraetzer, P.: Experimental bounds for the universal integral means spectrum of conformal maps. Complex Variables Theory Appl. 31(4), 305–309 (1996)
18. Lawler, G.: The frontier of a brownian path is multifractal. Preprint, 1998
19. Lawler, G.: Conformally Invariant Processes in the Plane. Volume 114 of Mathematical Surveys and Monographs. Providence, RI: Amer. Math. Soc., 2005
20. Lawler, G.: Dimension and natural parametrization for sle curves. http://arxiv.org/abs/0712.3263v1[math.PR], 2007
21. Lawler, G.F., Schramm, O., Werner, W.: Values of Brownian intersection exponents. I. Half-plane exponents. Acta Math. 187(2), 237–273 (2001)
22. Lawler, G.F., Schramm, O., Werner, W.: Values of Brownian intersection exponents. II. Plane exponents. Acta Math. 187(2), 275–308 (2001)
23. Lawler, G.F., Schramm, O., Werner, W.: Values of Brownian intersection exponents. III. Two-sided exponents. Ann. Inst. H. Poincaré Probab. Statist. 38(1), 109–123 (2002)
24. Lind, J.R.: Hölder regularity of the SLE trace. Trans. Amer. Math. Soc. 360(7), 3557–3578 (2008)
25. Makarov, N.G.: Fine structure of harmonic measure. St. Petersburg Math. J. 10(2), 217–268 (1999)
26. Mandelbrot, B.: Negative fractal dimensions and multifractals. Phys. A 163(1), 306–315 (1990)
27. Mandelbrot, B.: Multifractal power law distributions: negative and critical dimensions and other "anomalies," explained by a simple example. J. Stat. Phys. 110(3–6), 739–774 (2003)
28. Meyer, D.: Private communications
29. Rohde, S., Schramm, O.: Basic properties of SLE. Ann. of Math. (2) 161(2), 883–924 (2005)
30. Rushkin, I., Bettelheim, E., Gruzberg, I.A., Wiegmann, P.: Critical curves in conformally invariant statistical systems. J. Phys. A 40(9), 2165–2195 (2007)
31. Rushkin, I., Oikonomou, P., Kadanoff, L.P., Gruzberg, I.A.: Stochastic Loewner evolution driven by Lévy processes. J. Stat. Mech. Theory Exp. 1, P01001, 21 pp. (electronic) (2006)
32. Schramm, O.: Scaling limits of loop-erased random walks and uniform spanning trees. Israel J. Math. 118, 221–288 (2000)
33. Zhan, D.: Duality of chordal SLE. Invent. Math. 174(2), 309–353 (2008)

Communicated by M. Aizenman

Commun. Math. Phys. 290, 597–632 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0844-y

Communications in

Mathematical Physics

Adiabatic Limit and the Slow Motion of Vortices in a Chern-Simons-Schrödinger System

Sophia Demoulini1, David Stuart2

1 Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WB, England. E-mail: [email protected]
2 Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WA, England. E-mail: [email protected]

Received: 17 April 2008 / Accepted: 22 March 2009
Published online: 18 June 2009 – © Springer-Verlag 2009

Abstract: We study a nonlinear system of partial differential equations in which a complex field (the Higgs field) evolves according to a nonlinear Schrödinger equation, coupled to an electromagnetic field whose time evolution is determined by a Chern-Simons term in the action. In two space dimensions, the Chern-Simons dynamics is a Galileo invariant evolution for A, which is an interesting alternative to the Lorentz invariant Maxwell evolution, and is finding increasing numbers of applications in two dimensional condensed matter field theory. The system we study, introduced by Manton, is a special case (for constant external magnetic field, and a point interaction) of the effective field theory of Zhang, Hansson and Kivelson arising in studies of the fractional quantum Hall effect. From the mathematical perspective the system is a natural gauge invariant generalization of the nonlinear Schrödinger equation, which is also Galileo invariant and admits a self-dual structure with a resulting large space of topological solitons (the moduli space of self-dual Ginzburg-Landau vortices). We prove a theorem describing the adiabatic approximation of this system by a Hamiltonian system on the moduli space. The approximation holds for values of the Higgs self-coupling constant λ close to the self-dual (Bogomolny) value of 1. The viability of the approximation scheme depends upon the fact that self-dual vortices form a symplectic submanifold of the phase space (modulo gauge invariance). The theorem provides a rigorous description of slow vortex dynamics in the near self-dual limit.

1. Introduction and Statement of Results

In this article we study vortex dynamics in a nonlinear system of evolution equations (1.5) introduced by Manton (1997). This system is in fact a special case of an effective field theory for the fractional quantum Hall effect (the Zhang-Hansson-Kivelson, or ZHK, model). In addition it is a natural gauge invariant generalization of the nonlinear Schrödinger equation, possessing important structural features (Galileo invariance and self-dual structure with existence of related moduli spaces of solitons) which make it interesting to study for mathematical reasons. After introducing the system under study, and putting it into mathematical and physical context, we explain the necessary background material in order to state our results, which appear in Sect. 1.7.

S. Demoulini, D. Stuart

1.1. Chern-Simons vortex dynamics. We start by motivating the study of Manton’s system from the mathematical perspective, before going on to show that it is equivalent to a special case of the ZHK model, and discussing its physical significance.

1.1.1. Manton’s system on $\mathbb{R}^2$: mathematical context. To introduce Manton’s system, we start with the nonlinear Schrödinger equation on $\mathbb{R}^2$:

$$ i\frac{\partial\Phi}{\partial t} = -\Delta\Phi - \frac{\lambda}{2}(1-|\Phi|^2)\Phi, \tag{1.1} $$

to be solved for $\Phi : \mathbb{R}\times\mathbb{R}^2 \to \mathbb{C}$; λ is a positive number. This has the following properties: (i) it defines a globally well-posed Cauchy problem, (ii) it admits topological soliton solutions, the Ginzburg-Landau vortices, and (iii) it is invariant under the group of Galilean transformations. Manton’s system is a generalization of (1.1), sharing these properties, which describes the evolution of a complex field $\Phi$, coupled to a dynamically evolving electromagnetic potential $A = A_0\,dt + A_1\,dx^1 + A_2\,dx^2$. On $\mathbb{R}^2$ the system reads explicitly (writing $\langle a, b\rangle = \operatorname{Re}(\bar a b)$):

$$
\begin{aligned}
\frac{\partial A_1}{\partial t} + \frac{\partial}{\partial x^1}\Big(\frac{\partial A_2}{\partial x^1} - \frac{\partial A_1}{\partial x^2}\Big) &= -\Big\langle i\Phi, \Big(\frac{\partial}{\partial x^2} - iA_2\Big)\Phi\Big\rangle,\\
\frac{\partial A_2}{\partial t} + \frac{\partial}{\partial x^2}\Big(\frac{\partial A_2}{\partial x^1} - \frac{\partial A_1}{\partial x^2}\Big) &= +\Big\langle i\Phi, \Big(\frac{\partial}{\partial x^1} - iA_1\Big)\Phi\Big\rangle,\\
i\Big(\frac{\partial}{\partial t} - iA_0\Big)\Phi + \sum_{j=1}^{2}\Big(\frac{\partial}{\partial x^j} - iA_j\Big)^2\Phi &= -\frac{\lambda}{2}(1-|\Phi|^2)\Phi,\\
\frac{\partial A_2}{\partial x^1} - \frac{\partial A_1}{\partial x^2} &= +\frac{1}{2}(1-|\Phi|^2).
\end{aligned}
\tag{1.2}
$$

In addition to (i)-(iii) above, this system has the following mathematical properties: (iv) it is gauge invariant, (v) self-dual structure and a large space of topological solitons (see Sect. 1.6). These properties make the study of vortex dynamics in Manton’s system interesting, since the self-dual structure makes a rigorous analysis possible when the vortices are arbitrarily close (see Sect. 1.6-1.7). The proof of our results makes use of special mathematical features present due to self-duality which are explained in Sect. 3; these features include complex and symplectic structures on the soliton moduli space, and a foliation of the phase space which we call the Bogomolny foliation.

Adiabatic Limit and Vortices Slow Motion in Chern-Simons-Schrödinger System


1.1.2. Equivalence of Manton’s system and a special case of the ZHK model. The system (1.3) can be derived from the action $S = c_0\int s(A,\Phi)\,d^2x\,dt$, where

$$ s(A,\Phi) = -\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + \langle i\Phi, (\partial_t - iA_0)\Phi\rangle + A_0 + |\epsilon^{jk}\partial_j A_k|^2 + |(\partial_j - iA_j)\Phi|^2 + \frac{\lambda}{4}(1-|\Phi|^2)^2, $$

where Greek indices run over {0, 1, 2} for space-time tensorial quantities, Roman indices run over {1, 2}, and $\epsilon^{\mu\nu\rho}$, $\epsilon^{jk}$ are the completely anti-symmetric symbols and the summation convention is understood. This action is one of a class involving the Chern-Simons term $\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho$, see [17,22] for a review. It is characteristic of these theories that variation of the action with respect to $A_0$ gives a constraint equation involving the magnetic field, in this case the final equation of (1.3). This equation is analogous to the Gauss law in ordinary Maxwell theory, and is referred to as a constraint because the previous (dynamical) equations in (1.3) imply that its time derivative vanishes (exactly as do the dynamical Maxwell equations for the Gauss law). This constraint means that many apparently different actions give rise to the same Euler-Lagrange equations: in particular we can replace the above action density with

$$ \tilde s(A,\Phi) = -\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + \langle i\Phi, (\partial_t - iA_0)\Phi\rangle + A_0 + |(\nabla - iA)\Phi|^2 + \frac{\lambda+1}{4}(1-|\Phi|^2)^2. $$

We now introduce the ZHK action $S_{\mathrm{ZHK}}(a,\Phi;A^{\mathrm{ext}}) = c\int s_{\mathrm{ZHK}}\,d^2x\,dt$ and show that Manton’s action $S$ is in fact a special case of $S_{\mathrm{ZHK}}$; essentially the same observation appears also in [22, p. 54]. The ZHK action is the action for a mean field description of the quantum Hall effect. This effect refers to the current $J_j = \sigma_{jk}E_k^{\mathrm{ext}}$ produced in an effectively two dimensional system of electrons in a strong transverse magnetic field, by application of an applied electric field $E_k^{\mathrm{ext}}$. In the right experimental situation the conductivity tensor $\sigma_{jk}$ is found to be off-diagonal (i.e.
$\sigma_{11} = 0 = \sigma_{22}$), with the non-zero entries $\sigma_{12} = -\sigma_{21} = f e^2/\hbar$, where $f$ is an integer, or a fraction, for (respectively) the integer and fractional quantum Hall effect. This quantization of the values of $\sigma_{12}$ means that as the number of charge carriers is increased there is no corresponding increase in the current - it lies on a plateau - at least until the number of carriers is sufficiently greatly increased, at which point the conductivity moves to another of the quantized values, and the current moves to another plateau. In the mean field description the field $\Phi$ interacts with an external (applied) electromagnetic potential $A^{\mathrm{ext}}$ and a “statistical” potential $a$, according to:

$$ s_{\mathrm{ZHK}} = \frac{\kappa}{2}\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho + \langle i\Phi, (\partial_t - ia_0 - iA_0^{\mathrm{ext}})\Phi\rangle + \frac{1}{2m}|(\nabla - ia - iA^{\mathrm{ext}})\Phi|^2 + \int (1-|\Phi(x)|^2)\,V(x-x')\,(1-|\Phi(x')|^2)\,d^2x', $$

(see [51], [17, Sect. 4.6], or [52, Eqs. (7)-(8)], taking note of the published erratum for the latter reference). To reduce this to $\tilde s$ we consider the case of a constant external magnetic field $B^{\mathrm{ext}} = \partial_1 A_2^{\mathrm{ext}} - \partial_2 A_1^{\mathrm{ext}}$ with $A_0^{\mathrm{ext}} = 0$. (The standard configuration in quantum Hall experiments involves a strong transverse magnetic field applied to an effectively two dimensional electron gas, with relatively small electric potentials applied along one of the planar directions.) Define $A = a + A^{\mathrm{ext}}$. Now check that

$$
\begin{aligned}
\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho
&= a_0(\partial_1 a_2 - \partial_2 a_1) - a_1(\partial_t a_2 - \partial_2 a_0) + a_2(\partial_t a_1 - \partial_1 a_0)\\
&= A_0(\partial_1 A_2 - \partial_2 A_1 - B^{\mathrm{ext}}) - (A_1 - A_1^{\mathrm{ext}})(\partial_t A_2 - \partial_2 A_0) + (A_2 - A_2^{\mathrm{ext}})(\partial_t A_1 - \partial_1 A_0)\\
&= \epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho - 2A_0 B^{\mathrm{ext}} + \partial_t(A_1^{\mathrm{ext}}A_2 - A_2^{\mathrm{ext}}A_1) + \partial_1(A_0 A_2^{\mathrm{ext}}) - \partial_2(A_0 A_1^{\mathrm{ext}})\\
&= \epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho - 2A_0 B^{\mathrm{ext}} + \epsilon^{\mu\nu\rho}\partial_\mu(A_\nu^{\mathrm{ext}}A_\rho)
\end{aligned}
$$

and deduce that $s_{\mathrm{ZHK}} - \tilde s_{\mathrm{ZHK}}$ is a derivative, where

$$ \tilde s_{\mathrm{ZHK}} = \frac{\kappa}{2}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho - \kappa B^{\mathrm{ext}}A_0 + \langle i\Phi, (\partial_t - iA_0)\Phi\rangle + \frac{1}{2m}|(\nabla - iA)\Phi|^2 + \int (1-|\Phi(x)|^2)\,V(x-x')\,(1-|\Phi(x')|^2)\,d^2x'. \tag{1.3} $$

Now recall that derivatives in the action density do not affect the corresponding Euler-Lagrange equations (they are null Lagrangians). It follows by comparing $\tilde s$ and $\tilde s_{\mathrm{ZHK}}$ that the equations of motion for $S_{\mathrm{ZHK}}$ will be identical to those of Manton if we choose $V(x) = (\lambda+1)\delta(x)/4$, $\kappa = -2$, $B^{\mathrm{ext}} = 1/2$ and $m = 1/2$. Therefore, we conclude that, at least as far as the classical equations of motion are concerned, the ZHK model with these values is the same as Manton’s system in the case of
• a constant external magnetic field of appropriate value, and
• a point interaction $V(x) \propto \delta(x)$.

We now discuss the physical interpretation of the model in the fractional quantum Hall context. There is a microscopic model, due to Laughlin, which explains the observed phenomena in a well-accepted way in terms of a new phase of the two dimensional electron gas (for low temperature and high magnetic fields), with ground state described by the Laughlin wave function. There is an energy gap in the spectrum, so that the excitations above this new ground state have strictly positive energy - the Laughlin quasi-particles and quasi-holes, which have fractional statistics and fractional charge. It is this fractional charge, combined with the explanation of the integer quantum Hall effect, which gives rise to the fractional quantum Hall effect. The effective field theory proposed in [52], and reviewed at length in [51], gives a mean field description which is not expected to be accurate on microscopic length scales, but which does give an alternative explanation of all the main observed phenomena. In this mean field theory, the elementary excitations are described by the topological vortices, which are endowed with the same fractional charge and fractional statistics as the Laughlin quasi-particles. (The Chern-Simons term for the statistical gauge field $a$ in the action serves to change the statistics in the well known way explained in [2,51]).
It is understood that, in the mean field picture, it is the pinning of vortices which explains the observed plateaus in the Hall conductance ([34,42,51]). Thus a good understanding of vortex dynamics in the ZHK model should be useful to gain a better explanation of the phenomena within the context of the mean field approach. Needless to say there is still much work to be done to go from the results of this paper to results which would apply directly to the experimental situation: even apart from issues like the spatial domain and the real values of the coefficients in the model, it will be necessary to treat the applied electric potential which produces the Hall
current flow. This means that in the above derivation we should allow for an external electric field E kext = −∂k A0 , in addition to the static magnetic field, and investigate its effect on the vortex motion. Since the magnetic field is very strong in the experimental situation, it should be reasonable to treat the electric field perturbatively. To conclude the discussion of the motivation for our work, the system (1.3) is one of a class of dynamical Chern-Simons vortex models whose study is mathematically interesting (due to properties (i)–(v) above), and which is physically relevant (as we have just discussed). The use of such models in condensed matter applications is phenomenological, so the precise Lagrangian and many values of the coupling constants, etc. are not precisely known. (Actually, in [51,52] the ZHK action (1.3) is derived formally from an ostensibly microscopic, second quantized, action. However, this microscopic action itself seems to have a phenomenological character, since it involves excitations which are not fundamental electrons, but rather collective excitations - see the discussion following (2.6) in [42]). In any case our main result provides a rigorous basis for understanding vortex dynamics in a prototype for a class of theories which are of interest in two dimensional condensed matter theory. The adiabatic limit system (1.22) which we derive for the vortex dynamics cannot usually be written down explicitly, but as discussed in Remark 1.7.4, the behaviour of some of its solutions can be understood in many cases, and thus information on the dynamics of vortices can be deduced within the framework of this approximation. There are reasons to hope that qualitative features of the motion in this limiting situation will have a wider validity: see Remark 1.7.4. 
As a final comment on the quantum Hall effect, there is another type of soliton a nonlocal Skyrmion - which appears in treatments of the ferromagnetic properties of quantum Hall samples (see [18,41,49 and 14] for some analytic properties of these Skyrmions in a particular case). 1.1.3. General physical context for Chern-Simons models There has been a fairly long standing interest in systems of the type (1.3) in the physics literature; we give a brief summary and refer the reader to [17,22,23] for detailed reviews. The study of ChernSimons dynamics in 2+1 dimensional Maxwell and (non-abelian gauge) theories was started in the early 80’s (see e.g. [11]) and the incorporation of vortices into this dynamics (in systems with coupling to a nonlinear Schrödinger equation) has been studied since at least the early 90’s by theoretical physicists (see the review [23] for early work on Chern-Simons vortices). The reason for this interest is both because (i) the Chern-Simons models are used widely in condensed matter physics in descriptions of the quantum Hall effect and high T superconductivity, and (ii) because they provide a useful scenario in which to probe certain complex issues in field theories. Regarding the first point, there are various time-dependent models for magnetic vortices but at very low temperatures it is argued ([3,43]) that the motion should be non-dissipative so the usual Eliashberg-Gorkov equation is not appropriate, and the Chern-Simons coupled to Schrödinger vortex dynamics is widely used instead in the condensed matter literature, both in superconductivity and the quantum Hall effect; see [34, Sect. 10.7], [32, Chap. 6] for general discussions, in addition to the references for the ZHK model in the previous section. (Relativistic invariance is broken in these condensed matter applications, so the corresponding relativistic abelian Higgs model, whose vortex dynamics are studied in [44], is not appropriate. 
The main application which has been suggested for the relativistic dynamics appears to be cosmic string evolution.) There have been explanations offered for the wide occurrence of Chern-Simons type models in two dimensional condensed matter applications in terms of universality features of
large scale effective actions for two dimensional interacting electronic and magnetic systems with spin ([16, Sect. 3]). Regarding the second reason for interest in these models, it was realized in the 1980’s that in two dimensions there were possible quantum statistics other than the usual fermionic and bosonic types - anyons, are two dimensional quantum particles undergoing an arbitrary phase shift on interchange. Furthermore, composite objects made up from charged particles orbiting vortices (or flux tubes) have fractional spin and statistics ([50]). In [15] the authors study the quantum theory of a Lagrangian which is closely related to (1.12), and use it to investigate the quantization of solitons, quantum statistics and anyons in a rigorous quantum field theory setting. 1.2. Organization of the article. The article is organized as follows. Our main aim is the study of vortex dynamics in the Chern-Simons-Schrödinger system with spatial domain a Riemann surface, so we start in the next section by writing down the equations in this case, and then giving necessary background including a discussion of the self-dual vortices in Sect. 1.6. We then state our main result, Theorem 1.7.2, which describes the adiabatic approximation of vortex motion in the self-dual limit. This is proved in Sect. 2 following a strategy explained in the context of a simple model problem in Sect. 1.8. The proof uses some specialized identities related to the self-dual (or Bogomolny) structure, presented in Sect. 3 (which may be read separately). Various subsidiary facts and lemmas are given in the Appendix. 1.3. The equations on a surface. The dependent variables are a complex field (t, x), and an electromagnetic potential 1-form A0 dt + A1 d x 1 + A2 d x 2 . 
This 1-form determines a covariant derivative operator

$$ D = (D_0, D_1, D_2) = \Big(\frac{\partial}{\partial t} - iA_0,\; D_1,\; D_2\Big) = \Big(\frac{\partial}{\partial t} - iA_0,\; \nabla_1 - iA_1,\; \nabla_2 - iA_2\Big), \tag{1.4} $$

which in turn determines the electric field $E = E_j\,dx^j$ and magnetic field $B(t,x)$ via (1.6); all these fields are defined for $(t,x) \in \mathbb{R}\times\Sigma$, where $\Sigma$ is a two dimensional spatial domain, taken to be a Riemann surface with metric $g_{jk}\,dx^j dx^k$, area form $d\mu_g$ and complex structure $J : T^*\Sigma \to T^*\Sigma$ (where $j, k, \ldots$ take values in {1, 2} and we use the summation convention). Introducing a covariant Laplacian operator by

$$ -\Delta_A\Phi = -\frac{1}{\sqrt{\det g}}\,D_j\big(g^{ij}\sqrt{\det g}\,D_i\Phi\big) $$

(using a local frame and coordinates), the equations are

$$
\begin{aligned}
E_j + \frac{\partial B}{\partial x^j} &= -J_j^{\;k}\,\langle i\Phi, D_k\Phi\rangle,\\
i\Big(\frac{\partial}{\partial t} - iA_0\Big)\Phi &= -\Delta_A\Phi - \frac{\lambda}{2}(1-|\Phi|^2)\Phi,\\
B &= \frac{1}{2}(1-|\Phi|^2).
\end{aligned}
\tag{1.5}
$$

The electric and magnetic field can be combined to give the space-time electromagnetic field $F_{\mu\nu}\,dx^\mu\wedge dx^\nu = E_j\,dt\wedge dx^j + B\,d\mu_g$. This two form is obtained as the commutator of the space-time covariant derivative (1.4) which mediates the coupling in (1.5):

$$ [D_\mu, D_\nu] = -iF_{\mu\nu}, \quad\text{where } F_{0k} = E_k, \text{ and } \tfrac{1}{2}F_{jk}\,dx^j\wedge dx^k = B\,d\mu_g. \tag{1.6} $$

(Greek indices run through 0, 1, 2 and Latin indices through 1, 2 only. Boldface is used to indicate the spatial part of a vector or one-form etc., except in Sect. 3 where time does not appear at all.)

We now describe this set-up briefly in geometrical terms. Assume given a one dimensional complex vector bundle $L \to \Sigma$, with a real inner product $h$ locally of the form $\langle a, b\rangle = h\operatorname{Re}(\bar a b)$, and corresponding norm $|a|^2 = \langle a, a\rangle$; if we employ a unitary frame over some chart then $\langle a, b\rangle = \operatorname{Re}(\bar a b)$. We are then solving for an $S^1$ connection on the bundle $\mathbb{L} \equiv \mathbb{R}\times L \to \mathbb{R}\times\Sigma$, with associated covariant derivative $D$, and a section $\Phi$ of $\mathbb{L}$. To be more explicit, fix a smooth connection on $L$ determined by a covariant derivative operator $\nabla$, so that the spatial part of $D$, which will be written $\mathbf{D}$, takes the form $\mathbf{D}_j = \nabla_j - iA_j$ for a real 1-form $\mathbf{A} = A_j\,dx^j \in \Omega^1_{\mathbb{R}}(\Sigma)$; here $\nabla$ is independent of time. (It is generally not possible to choose $\nabla$ to be flat, and it will have a curvature, determined by a function $b$ such that $[\nabla_j, \nabla_k]\,dx^j dx^k = -ib\,d\mu_g$; it is always possible to choose $b = \text{const.}$, and we will do this throughout.) In any case, with this procedure the space of connections on $L$ can be identified with the space of real one-forms. Then at each time $t \in \mathbb{R}$ we are solving for a section $\Phi(t)$ of $L$, a 1-form $\mathbf{A}(t) = A_1(t)\,dx^1 + A_2(t)\,dx^2$ on $\Sigma$, and a real valued function $A_0(t)$ on $\Sigma$. The electric field is given by

$$ E_j = \frac{\partial A_j}{\partial t} - \frac{\partial A_0}{\partial x^j}, $$

and the magnetic field by $B\,d\mu_g = b\,d\mu_g + \mathbf{d}\mathbf{A}$. (Here, and elsewhere, we write $\mathbf{d}$ in boldface when it is necessary to indicate that only the spatial part is taken.) The 2-form $-iE_j\,dt\wedge dx^j - iB\,d\mu_g$ is the curvature associated to the space-time covariant derivative $D$, as in (1.6). For the case $\Sigma = \mathbb{R}^2$, the system was proposed by Manton (1997), who derived it as the Euler-Lagrange equation for the Lagrangian (1.12).

Notation 1.3.1. We shall always consider conformal co-ordinate systems on $\Sigma$ in which the metric is of the form $g = e^{2\rho}\big((dx^1)^2 + (dx^2)^2\big)$ and the volume element is then $e^{2\rho}\,dx^1\wedge dx^2$. On functions the Hodge operator acts as $*f = f\,d\mu_g = f e^{2\rho}\,dx^1\wedge dx^2$ and $*^2 = 1$, so that $*d\omega = e^{-2\rho}\big(\frac{\partial\omega_2}{\partial x^1} - \frac{\partial\omega_1}{\partial x^2}\big)$ for 1-forms $\omega$. On 1-forms $*(\omega_1\,dx^1 + \omega_2\,dx^2) = \omega_1\,dx^2 - \omega_2\,dx^1$, which is just the negative of the complex structure $J$, represented in conformal co-ordinates by the anti-symmetric tensor $J_i^{\;j}$ with $J_2^{\;1} = -1$, $J_1^{\;2} = +1$, the other components being zero. Correspondingly we decompose a one-form as $\omega = \omega^{(1,0)}\,dz + \omega^{(0,1)}\,d\bar z$; in particular for the derivative $df = \partial f\,dz + \bar\partial f\,d\bar z$, with $\bar\partial f = \frac{1}{2}\big(\frac{\partial f}{\partial x^1} + i\frac{\partial f}{\partial x^2}\big)$, and $D = D^{(1,0)} + D^{(0,1)} = \partial_A\,dz + \bar\partial_A\,d\bar z$, with $\bar\partial_A = \frac{1}{2}\big((\nabla_1 - iA_1) + i(\nabla_2 - iA_2)\big)$ etc.; see Sect. 3. For a 1-form $A$ we write the co-differential $\mathbf{d}^*A = -\operatorname{div}A$, with $\operatorname{div}A = e^{-2\rho}\big(\frac{\partial A_1}{\partial x^1} + \frac{\partial A_2}{\partial x^2}\big)$, and the Laplacian on real functions is $\Delta f = e^{-2\rho}\frac{\partial^2 f}{\partial x^i\partial x^i}$ (with the summation convention), and on sections of $L$ the covariant Laplacian is $\Delta_A = e^{-2\rho}(D_1^2 + D_2^2)$ when a unitary frame is used. The operators $\operatorname{div}$, $*d$, $\Delta$ (resp. $\Delta_A$) all depend on $g$ (resp. $g, h$), but this is not indicated as $g, h$ are fixed, and similarly dependence of constants in estimates on $(\Sigma, g)$ and $h$ will be suppressed throughout the article.
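The sign conventions of Notation 1.3.1 can be checked mechanically. The following is a minimal sketch, not part of the original article: it stores a 1-form $\omega_1\,dx^1 + \omega_2\,dx^2$ as the pair `(w1, w2)` (an illustrative assumption, as are the function names `hodge_star` and `complex_structure`) and confirms that $J = -*$ and that both operators square to $-1$ on 1-forms. The conformal factor $e^{2\rho}$ does not appear because the Hodge star on 1-forms in two dimensions is conformally invariant.

```python
# Sketch: Hodge star and complex structure J on 1-forms in conformal
# coordinates. A 1-form w1 dx^1 + w2 dx^2 is stored as the tuple (w1, w2);
# this representation and the function names are illustrative assumptions.

def hodge_star(w):
    """*(w1 dx^1 + w2 dx^2) = w1 dx^2 - w2 dx^1."""
    w1, w2 = w
    return (-w2, w1)  # (coefficient of dx^1, coefficient of dx^2)

def complex_structure(w):
    """(J w)_i = J_i^j w_j with J_2^1 = -1, J_1^2 = +1, other components zero."""
    w1, w2 = w
    return (w2, -w1)

w = (3, 5)
star_w = hodge_star(w)
J_w = complex_structure(w)

# J is the negative of the Hodge star on 1-forms.
assert J_w == (-star_w[0], -star_w[1])
# Both operators square to minus the identity on 1-forms.
assert hodge_star(star_w) == (-3, -5)
assert complex_structure(J_w) == (-3, -5)
```

The same component rule reproduces the signs in the first equation of (1.5): for $j = 1$ only $J_1^{\;2} = +1$ contributes, and for $j = 2$ only $J_2^{\;1} = -1$.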

Notation 1.3.2. We are dealing with sections of smooth vector bundles $V$ over $\Sigma$ with an inner product $\langle\cdot,\cdot\rangle$ induced from the Riemannian metric $g$ and the metric $h$ on $L$ in the standard way; since $g, h$ are fixed throughout they will not be indicated. Thus, for example, $|D\Phi|^2 = e^{-2\rho}\big(\langle D_1\Phi, D_1\Phi\rangle + \langle D_2\Phi, D_2\Phi\rangle\big)$. We write $\Omega^0(V)$ for the smooth sections of $V$ and $\Omega^p(V)$ for the smooth $p$-forms taking values in $V$. We will make use of the Sobolev spaces $H^s(V)$ of sections of $V$ whose coefficient functions (in any frame over any open subset of $\Sigma$) lie in the standard Sobolev space $H^s$ on that subset; the corresponding Sobolev space of $V$-valued $p$-forms is denoted $H^s(\Omega^p(V))$. In Sect. 1 and Sect. 2 we shall generally omit explicit reference to the vector bundle, since this is usually clear, and write $H^s$ in place of $H^s(V)$ etc. (and $\|\cdot\|_{H^s}$ for the corresponding norms). However if it is necessary to emphasize that time is fixed, and the norm is taken over $\Sigma$, we shall write $H^s(\Sigma)$. Further notational conventions are given in the Appendix and in Sect. 3, particularly in relation to the complex structure (see also the textbook [24, Sect. 9.1] for a treatment of the background material).

1.4. Existence theory for the Cauchy problem. Inherent to the system (1.5) is the property of gauge invariance: let $\chi(t,x)$ be a smooth real valued function, then $(A, \Phi)$ is a smooth solution if and only if $(d\chi + A, e^{i\chi}\Phi)$ is. This introduces a large degeneracy to the solution space which may be removed by a choice of gauge in various ways. We will adopt here the following gauge condition which involves the time derivatives $\dot A, \dot\Phi$ of $A, \Phi$:

$$ \operatorname{div}\dot A - \langle i\Phi, \dot\Phi\rangle \equiv e^{-2\rho}(\partial_1\dot A_1 + \partial_2\dot A_2) - \langle i\Phi, \dot\Phi\rangle = 0. \tag{1.7} $$

We make this choice because it allows a convenient description of the complex and symplectic structures on the moduli space of vortices (see Remark 1.6.3 and Sect. 3), and also is useful in the derivation of energy estimates for the time derivatives (see Sect. 2.2 and Sect. 2.3). In this gauge global existence can be stated as follows:

Theorem 1.4.1 (Global existence in gauge (1.7)). Consider the Cauchy problem for (1.5) with initial data $\Phi(0) \in H^2(\Sigma)$ and $A(0) \in H^1(\Sigma)$. There exists a global solution satisfying (1.7) and the estimate

$$ |\Phi(t)|_{H^2(\Sigma)} \le c\,e^{\alpha e^{\beta t}} \tag{1.8} $$

for some positive constants $c, \alpha, \beta$ depending only on $(\Sigma, g)$, the equations, and the initial data. The solution has regularity $\Phi \in C\big([0,\infty); H^2(\Sigma)\big) \cap C^1\big([0,\infty); L^2(\Sigma)\big)$ and $A \in C^1\big([0,\infty); H^1(\Sigma)\big)$. If the initial data are smooth, then the solution is also smooth.

It is explained in Appendix A.3 how to derive this theorem from the global existence result of [13], which is stated in another gauge. Bounds of the type (1.8) were derived in [10] for the cubic nonlinear Schrödinger equation on $\mathbb{R}^2$, by means of the inequality

$$ |u|_{L^\infty} \le C\big[1 + \ln(1 + \|u\|_{H^2})\big], \tag{1.9} $$

valid for $u \in H^2(\mathbb{R}^2)$ and with $C = C(\|u\|_{H^1})$. The proof of global regularity for (1.5) depends on a covariant version of this inequality (given in Lemma A.11), and a careful treatment of various commutator terms $[D_\mu, D_\nu]$ which indicates that they have a comparable strength to the cubic nonlinear term. In conclusion, Theorem 1.4.1 provides a global solution which is a continuous curve in the space $\mathcal{H}_2$, where for $s \in \mathbb{R}$ we define

$$ \mathcal{H}_s \equiv \{(A, \Phi) \in H^{s-1}(\Sigma) \times H^s(\Sigma)\}, \tag{1.10} $$

with the corresponding norm $\|\cdot\|_{\mathcal{H}_s}$. From now on we will consider only $(A, \Phi)$ which lie (at a given time) in the space $\mathcal{H}_2$. The gauge group at fixed time is given by

$$ \mathcal{G} \equiv \{g \in H^2(\Sigma; S^1)\} \tag{1.11} $$

and acts on $\mathcal{H}_2$ according to $g\cdot(A, \Phi) = (A + g^{-1}dg, g\Phi)$. (Restricting to the set where $\Phi$ is not identically zero the action is free and gives a principal $\mathcal{G}$-bundle structure. The gauge condition (1.7) can be then regarded as giving a connection - i.e. a family of horizontal subspaces - on this bundle.)
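The gauge invariance stated at the start of Sect. 1.4 rests on the covariance of the covariant derivative: under $(A, \Phi) \mapsto (A + d\chi, e^{i\chi}\Phi)$ one has $D\Phi \mapsto e^{i\chi}D\Phi$, so that $|\Phi|$ and $|D\Phi|$ are unchanged. The following is a minimal numerical sketch of this identity in one space dimension. The particular functions `phi`, `A`, `chi` are arbitrary choices made for illustration only, and the derivative is approximated by a central difference, so the check holds up to discretization error.

```python
import cmath
import math

# Illustrative one-dimensional fields (assumptions, chosen only for the check).
def phi(x):  return complex(1.0 + x, x * x)   # complex field
def A(x):    return x * x                     # gauge potential component
def chi(x):  return math.sin(x)               # gauge parameter
def dchi(x): return math.cos(x)               # derivative of the gauge parameter

def phi_gauged(x): return cmath.exp(1j * chi(x)) * phi(x)   # e^{i chi} Phi
def A_gauged(x):   return A(x) + dchi(x)                    # A + d chi

def cov_deriv(field, potential, x, h=1e-5):
    """(d/dx - i A) field, with d/dx approximated by a central difference."""
    d = (field(x + h) - field(x - h)) / (2.0 * h)
    return d - 1j * potential(x) * field(x)

x0 = 0.7
lhs = cov_deriv(phi_gauged, A_gauged, x0)              # D' Phi'
rhs = cmath.exp(1j * chi(x0)) * cov_deriv(phi, A, x0)  # e^{i chi} D Phi

assert abs(lhs - rhs) < 1e-7                               # covariance of D
assert abs(abs(phi_gauged(x0)) - abs(phi(x0))) < 1e-12     # |Phi| is invariant
```

Because the covariant derivative only picks up the phase factor, every term in the energy density that is built from $|\Phi|$, $|D\Phi|$ and the curvature is gauge invariant.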

1.5. Variational and Hamiltonian formulation. Equations (1.5) can be derived formally as the Euler-Lagrange equations associated to the functional

$$ S(A, \Phi) = \int_{\mathbb{R}\times\Sigma} -\frac{1}{2}A\wedge F + \big(\langle i\Phi, D_0\Phi\rangle + A_0 + 2v_\lambda(A, \Phi)\big)\,dt\,d\mu_g, \tag{1.12} $$

where

$$ v_\lambda(A, \Phi) = \frac{1}{2}\Big(B^2 + |D\Phi|^2 + \frac{\lambda}{4}(1-|\Phi|^2)^2\Big) \tag{1.13} $$

is the density of the Ginzburg-Landau static energy. (The parameter λ is a positive real number.) Although $S$ is not manifestly gauge invariant it changes by an exact form under gauge transformation, and the Euler-Lagrange equations (1.5) are gauge invariant. Vortices are critical points of the static energy

$$ V_\lambda(A, \Phi) = \int_\Sigma v_\lambda(A, \Phi)\,d\mu_g, $$

as will be discussed further in the next section. To see that the system (1.5) is Hamiltonian, observe that there is a complex structure on the phase space $\mathcal{H}_2$ given by $\mathbf{J} : (\dot A, \dot\Phi) \mapsto (-J\dot A, i\dot\Phi)$, which allows the introduction of a symplectic structure $\Omega(v, w) = \langle \mathbf{J}v, w\rangle$, where $\langle\cdot,\cdot\rangle$ is the $L^2$ inner product. Using this symplectic form the system (1.5), in temporal gauge $A_0 = 0$, is a Hamiltonian flow generated by the Hamiltonian functional $V_\lambda(A, \Phi)$, which was just defined. (A short calculation reveals that the third equation of (1.5) is preserved by the evolution, and as such is really only a condition on the initial data. It will be referred to as the constraint equation.)

1.6. Self-dual vortices and dynamics in the limit λ → 1. The system (1.5) admits soliton solutions, called abelian Higgs, or Ginzburg-Landau, vortices, which are energy minimizing critical points of the static energy functional $V_\lambda(A, \Phi)$. We now discuss these solutions and their uses in understanding the dynamical system (1.5) via the adiabatic approximation. There is a special case, λ = 1, in which the adiabatic approximation is particularly powerful because the space of vortices is then unusually large - large enough that the motion on it can provide information on the dynamical interaction of several vortices. We call this the self-dual, or Bogomolny, case, and the corresponding solutions are called self-dual vortices. Now for such a solution, $(A, \Phi)$, with a given value of the topological integer $N$ (the degree of $L$), the field $\Phi$ will have $N$ zeros, counted with multiplicity. Each of these zeros can be thought of as the centre of a vortex. Thus the static solitons can be thought of as a nonlinear superposition of $N$ vortices which do not interact. This was first fully understood in the case that $\Sigma$ is the upper half plane with canonical metric, when the equations were solved exactly by Witten (1977) by reducing them to the Liouville equation.
In general it is still possible to make a reduction to a nonlinear elliptic equation of Kazdan-Warner type, whose solutions can be completely parametrized although not explicitly given. Following this, Taubes proved an existence theorem when $\Sigma$ is the Euclidean plane (Jaffe and Taubes 1982), and Bradlow (1988) did likewise for a compact Riemann surface, proving the following:

Theorem 1.6.1 (Existence of vortices on a surface, [8]). If the area $|\Sigma|$ of a closed Riemann surface $\Sigma$ is such that $|\Sigma| > 4\pi N$ the Bogomolny bound is saturated: in fact the minimum value $\pi N$ of $V$, where $V : \mathcal{H}_2 \to \mathbb{R}$,

$$ V(A, \Phi) \equiv V_1(A, \Phi) = \frac{1}{2}\int_\Sigma \Big(B^2 + |D\Phi|^2 + \frac{1}{4}(1-|\Phi|^2)^2\Big)\,d\mu_g, \tag{1.14} $$

is achieved on a set $S_N \subset \mathcal{H}_2$ of pairs $(A, \Phi)$ which solve the Bogomolny, or self-dual vortex, equations:

$$ \bar\partial_A\Phi = 0, \qquad B - \frac{1}{2}(1-|\Phi|^2) = 0. $$

These minimizers will be referred to as the self-dual vortices, or just vortices. The quotient of $S_N$ by the gauge group $\mathcal{G}$ can be identified with $\mathrm{Sym}^N(\Sigma)$, the symmetric $N$-fold product of $\Sigma$, via the mapping which takes $\Phi$ to the set of its zeros.

Remark 1.6.2 (Interaction and stability of vortices). The physical interpretation of Theorem 1.6.1 is that for λ = 1 the vortices do not interact; see [28] for a discussion of this, and some related conjectures, and [19] for some stability theorems.

Remark 1.6.3 (Bogomolny structure and Bogomolny operator). The structural feature of $V$ which makes Theorem 1.6.1 possible was identified by Bogomolny in [7]. In this instance it amounts to the fact that if we introduce the Bogomolny operator $\mathcal{B}$ to be the nonlinear operator which maps $(A, \Phi) \mapsto \big(B - \frac{1}{2}(1-|\Phi|^2),\ \bar\partial_A\Phi\big)$ then

$$ V = \frac{1}{2}\int_\Sigma |\mathcal{B}(A, \Phi)|^2\,d\mu_g + \pi N $$

(see Sect. 3 for more information in this regard). Also see [8] for higher dimensional versions of this decomposition, and [21] for generalizations to solutions with non-vanishing electric field.

Remark 1.6.4 (Geometry of moduli space). Quotient spaces of the type arising in Theorem 1.6.1 are usually known as moduli spaces: in this case we define the moduli space $M_N$ to be the space of gauge equivalence classes of self-dual vortices, so that $M_N \equiv \mathrm{Sym}^N(\Sigma)$. We call the space $S_N$ the vortex space and $\mathrm{proj} : S_N \to M_N$ the natural projection which takes $(A, \Phi)$ to its gauge equivalence class $[(A, \Phi)]$. The space $M_N$ inherits both a metric (induced from the $L^2$ metric) and a symplectic structure and is a Kaehler manifold (see [9]). Explicitly, we can identify the tangent space to $M_N$ with solutions $(\dot A, \dot\Phi)$ of the linearized Bogomolny equations which also satisfy the condition (1.7). The complex structure and symplectic structure on $M_N$ are then given by restricting the formulas given in the previous section to such $(\dot A, \dot\Phi)$, and consequently we will use the same notation, $\mathbf{J}$ and $\Omega$, for these objects. The existence of this complex structure on $M_N$ can be seen very clearly in the formulas in Sect. 3, in which complex notation is used to combine the linearized Bogomolny equations with (1.7) into a manifestly complex linear operator $\mathcal{D}_\psi$, for $\psi = (A, \Phi) \in S_N$. This can all be summarized by saying that we have an identification

$$ T_{[\psi]}M_N \approx \operatorname{Ker}\mathcal{D}_\psi \equiv \{(\dot A, \dot\Phi) : D\mathcal{B}_\psi[\dot A, \dot\Phi] = 0, \text{ and (1.7) holds}\}. \tag{1.15} $$

1.7. Statement of the adiabatic limit theorem. In order to define the adiabatic limit system, we now define a Hamiltonian function $M_N \to \mathbb{R}$ by restricting the energy $V_\lambda$ to the space of vortices, and observing that by gauge invariance this actually gives a smooth function on the quotient space $M_N$. The corresponding Hamiltonian flow determines the slow motion of vortices for λ close to 1: For $\epsilon = |\lambda - 1|$ sufficiently small, the system (1.5) can be approximated, for times of order $1/\epsilon$, by the Hamiltonian flow on the phase space $M_N = \mathrm{Sym}^N(\Sigma)$ associated to the Hamiltonian function $V_\lambda|_{M_N}$ via the symplectic form $\Omega$. We now move towards a precise formulation of this in Theorem 1.7.2. Since we are interested in the regime in which $|\lambda - 1| \ll 1$, it is useful to introduce a large parameter

$$ \mu = \frac{1}{|\lambda - 1|} \tag{1.16} $$

and let also, for $\lambda \neq 1$,

$$ \sigma = \frac{\lambda - 1}{|\lambda - 1|} = \pm 1 \tag{1.17} $$

(also defining $\sigma = 0$ for $\lambda = 1$ where necessary). We rescale time by $\tau = t/\mu$, and $A_0$ similarly, leading to the following rescaled equations:

$$
\begin{aligned}
\frac{\partial A_1}{\partial\tau} &= \mu\big(-\partial_1 B - \langle i\Phi, D_2\Phi\rangle\big) + \frac{\partial A_0}{\partial x^1},\\
\frac{\partial A_2}{\partial\tau} &= \mu\big(-\partial_2 B + \langle i\Phi, D_1\Phi\rangle\big) + \frac{\partial A_0}{\partial x^2},\\
i\Big(\frac{\partial}{\partial\tau} - iA_0\Big)\Phi &= \mu\Big(-\Delta_A\Phi - \frac{1}{2}(1-|\Phi|^2)\Phi\Big) - \frac{\sigma}{2}(1-|\Phi|^2)\Phi.
\end{aligned}
\tag{1.18}
$$

It is also natural to separate the energy $V_\lambda$ into the (main) self-dual piece $V = V_1$, and a perturbation term proportional to $\lambda - 1$. Under the rescaling just introduced, the energy rescales by a factor $\mu$, leading us to consider the Hamiltonian $H = \mu V + U$, where $V \equiv V_1$ is as in (1.14), and the energy correction away from the self-dual, or Bogomolny, regime is given by

$$ U(\Phi) = \frac{\sigma}{8}\int_\Sigma (1-|\Phi|^2)^2\,d\mu_g. \tag{1.19} $$

The rescaled Eqs. (1.18) can be written as a Hamiltonian evolution for $\psi = (A, \Phi)$ in the form

$$ \mathbf{J}\frac{\partial\psi}{\partial\tau} = \mu V' + U' + \mathbf{J}(dA_0, iA_0\Phi), \tag{1.20} $$

where $\mathbf{J}$ is the complex structure introduced at the end of Sect. 1.5,

$$ \mathbf{J}(\dot A_1\,dx^1 + \dot A_2\,dx^2, \dot\Phi) = (-\dot A_2\,dx^1 + \dot A_1\,dx^2, i\dot\Phi), \tag{1.21} $$

with $\dot A = \frac{\partial A}{\partial\tau}$, $\dot\Phi = \frac{\partial\Phi}{\partial\tau}$.
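The structure of (1.20)-(1.21), an evolution of the form "complex structure applied to the velocity equals the gradient of the Hamiltonian", can be illustrated in the simplest finite-dimensional setting. The sketch below is a toy analogue and not the vortex system: the phase space is the plane, the complex structure is taken to be $(x, y) \mapsto (-y, x)$, and the Hamiltonian is the hypothetical choice $\frac{\omega}{2}(x^2 + y^2)$. It integrates the flow with a standard fourth order Runge-Kutta scheme, compares with the exact rotation, and checks that reversing the sign of the Hamiltonian reverses the direction of rotation.

```python
import math

def apply_J(v):
    """Toy complex structure on the plane: (x, y) -> (-y, x)."""
    x, y = v
    return (-y, x)

def grad_H(v, omega):
    """Gradient of the toy Hamiltonian H = (omega / 2) * (x^2 + y^2)."""
    x, y = v
    return (omega * x, omega * y)

def velocity(v, omega):
    """Solve J dz/dt = grad H for dz/dt, using J^2 = -1, i.e. dz/dt = -J grad H."""
    jx, jy = apply_J(grad_H(v, omega))
    return (-jx, -jy)

def rk4_flow(v0, omega, t_final, steps):
    """Integrate the flow with the classical fourth order Runge-Kutta method."""
    h = t_final / steps
    x, y = v0
    for _ in range(steps):
        k1 = velocity((x, y), omega)
        k2 = velocity((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]), omega)
        k3 = velocity((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]), omega)
        k4 = velocity((x + h * k3[0], y + h * k3[1]), omega)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return (x, y)

def exact_flow(v0, omega, t):
    """Exact solution: rigid rotation of the plane at angular speed omega."""
    x0, y0 = v0
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (x0 * c + y0 * s, -x0 * s + y0 * c)

z_num = rk4_flow((1.0, 0.0), 2.0, 1.0, 1000)
z_ref = exact_flow((1.0, 0.0), 2.0, 1.0)
assert abs(z_num[0] - z_ref[0]) < 1e-8 and abs(z_num[1] - z_ref[1]) < 1e-8
# The Hamiltonian (here proportional to x^2 + y^2) is conserved along the flow.
assert abs(z_num[0] ** 2 + z_num[1] ** 2 - 1.0) < 1e-8
```

In this toy model the sign of `omega` plays a role loosely comparable to that of $\sigma = \pm 1$ in (1.19): it does not change the level sets of the Hamiltonian, only the sense in which they are traversed.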

Remark 1.7.1 (Explicit formulation of adiabatic limit system). We now write the equations for the adiabatic limit system in an explicit way which will be useful later. The function U is clearly gauge invariant and defines by restriction a smooth function u on M N . Now recall (1.15): under this identification, the gradient of the function u on M N at [ S ] is identified with P S U , where P S is the orthogonal projector onto Ker D S (see Lemma 3.3.2). The Hamiltonian differential equations for u are then equivalent to J

∂ S = P S U . ∂τ

(1.22)

Given an initial value $\Psi^S(0) = \psi_0 \in S_N$, this equation has a unique solution $\tau \mapsto \Psi^S(\tau) \in S_N$ which satisfies the gauge condition (1.7).

Main Theorem 1.7.2 (Adiabatic limit). Let $\Psi_\mu$ be the smooth solution of (1.20), satisfying the gauge condition (1.7), with smooth initial data $\Psi_\mu(0)$, such that
(i) $\lim_{\mu\to+\infty}\|\Psi_\mu(0) - \psi_0\|_{\mathcal{H}_2} = 0$, for some smooth $\psi_0 \in S_N$, and
(ii) $\sup_{\mu\ge 1}\bigl(\|\Psi_\mu(0)\|_{\mathcal{H}_2} + \|\dot\Psi_\mu(0)\|_{H^1}\bigr) \le K < \infty$.
Then there exists $\tau_* > 0$, independent of $\mu \ge 1$, such that for $s < 2$,
\[
\lim_{\mu\to\infty}\ \sup_{[-\tau_*,\tau_*]}\|\Psi_\mu(\tau) - \Psi^S(\tau)\|_{\mathcal{H}_s} = 0, \tag{1.23}
\]

where $\tau \mapsto \Psi^S(\tau) \in S_N$ is a curve in the vortex space $S_N$, also satisfying (1.7), which is the unique solution of (1.22) with initial data $\Psi^S(0) = \psi_0$. The projection onto the moduli space $M_N$, $\tau \mapsto [\Psi^S(\tau)] \in M_N$, is the unique solution of the Hamiltonian system on $(\mathrm{Sym}^N(\Sigma), \Omega)$ associated to the Hamiltonian $u$ defined in Remark 1.7.1, with initial value $[\psi_0] \in M_N$.

This theorem is proved in Sect. 2, employing a strategy which is explained in Sect. 1.8, following discussion of a very simple model problem. Some of the novel features which arise in the implementation of this strategy for (1.18) are highlighted at the beginning of Sect. 2.

Remark 1.7.3 (Related work). The approximation of the dynamical system (1.18) by a dynamical system through a space of equilibria (in this case the self-dual vortices, which are the equilibria for $\lambda = 1$) is referred to as an adiabatic limit or approximation. It was suggested in [30], following earlier conjectures of the same author on vortex and monopole dynamics in second order Lorentz invariant systems discussed in [31]. Proofs of the validity of the approximation in the case of second order dynamics were given in [44,45]; the strategy for the proof here, however, is different from that adopted in those references - see the discussion in Sect. 1.8. There has also been work on corresponding problems for $\sigma$-models, see [20,36]. A review of the analysis of adiabatic limit problems is given in [47], mostly directed towards infinite dimensional natural Lagrangian systems of the type appearing in classical field theory. (Natural Lagrangian systems are those derivable from Lagrangians of the classical “kinetic energy minus potential energy” form.)

Remark 1.7.4 (Implications for Chern-Simons vortex dynamics).
Although it is not generally possible to evaluate explicitly the Hamiltonian and symplectic form in the reduced system (1.22), it is possible to understand some basic features of the vortex dynamics in this model, see [27,30,31,38]. This work has been directed mostly to the case when the spatial domain is $\mathbb{R}^2$, so our Theorem 1.7.2 does not imply the validity of the approximation (1.22) in this case, see below. One general conclusion is that in the Chern-Simons model a force acting on the vortex produces motion at right angles to the direction of the force (in distinction to the behaviour in the relativistic case [31,44]). Now it is known computationally (see [28,31] and references therein), and in some special cases analytically ([46]), that the potential energy between two vortices depends on the distance between them, and is attractive for $\lambda < 1$ and repulsive for $\lambda > 1$. From this it can be deduced that two vortices will circle about one another, the direction of rotation depending upon whether $\lambda < 1$ or $\lambda > 1$. See [31, Sect. 7.13] for a discussion of these solutions in the $\mathbb{R}^2$ case. Also in the same reference it is observed that (1.22) possesses another related type of solution: a rigidly rotating $p$-gon, with $p$ vortices placed at the vertices of a regular $p$-gon. Many of the arguments and calculations leading to the conclusions about vortex dynamics can be carried out equally well with spatial domain the standard sphere $\Sigma = S^2$ ([37]), even with explicit formulae in special limiting cases ([46]), in which case Theorem 1.7.2 implies rigorously the rotational behaviour for vortices described above. In future work results on the existence and stability of such periodic solutions for the full system (1.5) will be presented.

It is to be hoped that some of these qualitative conclusions about vortex dynamics (which are justified for (1.5) by the Main Theorem 1.7.2) would have a wider validity for Chern-Simons models of vortex dynamics, not necessarily close to any self-dual limit. There is some numerical evidence for this in related situations: for example, the scattering of vortices in the relativistic abelian Higgs model is qualitatively similar for all values of the Higgs coupling constant, even though a rigorous analysis in which the vortices actually collide is only possible in the self-dual limit; see [31,44]. On the other hand, the case of first order dynamics is in some ways numerically more problematic since it is not possible to produce any motion via choice of initial conditions (as can be done in the second order case), and it is necessary to have $\lambda$ deviate from the self-dual value 1, and quite substantially so, in order to get motion which is easily computationally observable. A numerical study in [27] which compares the approximation (1.22) with a computer simulation of (1.5) finds that, in the case of spatial domain $\Sigma = \mathbb{R}^2$, while the qualitative behaviour of two vortices is similar to that implied by (1.22) for $|\lambda - 1|$ small, there are quantitative differences between the full dynamics and the adiabatic limit, which become quite marked as $\lambda$ moves away from the value 1. As the authors of [27] say, it is unclear to what extent some of these differences are genuine errors due to the neglect of radiation in the finite dimensional truncation (1.22), as compared to being a numerical artefact; certainly some of the observed behaviour is consistent with energy being transferred into radiative modes, causing the vortices to spiral in towards one another in the attractive case ([27, Fig. 6]).
In any case, there is no issue with radiation when $\Sigma$ is a compact spatial domain, in which case Theorem 1.7.2 does imply the validity of the approximation (1.22) for sufficiently small $|\lambda - 1|$, and it seems reasonable to expect that in this case the dynamical behaviour predicted by our analysis (relating (1.5) to (1.22) for small $|\lambda - 1|$) is at least qualitatively relevant to the applications in the theoretical physics literature.

1.8. A simple model problem and discussion of methodology. We consider here a simple two-dimensional example in order to exhibit as clearly as possible the phenomenon under study, and the strategy which will be employed in the proof of Theorem 1.7.2. (It is the basic strategy taken in [39] for finite dimensional natural Lagrangian systems, here adapted to the case of infinite dimensions and to take advantage of the Bogomolny structure.) For real numbers $\beta$ and $\mu \gg 1$, we consider a linear first order Hamiltonian system for $z(\tau) = (z^1(\tau), z^2(\tau)) \in \mathbb{C}^2$:

Theorem 1.8.1. For each $\mu \gg 1$, let $\tau \mapsto Z_\mu(\tau) \in \mathbb{C}^2$ be the solution of
\[
\dot z^1 = i(z^1 + \beta z^2), \qquad \dot z^2 = i(\beta z^1 + \mu z^2), \tag{1.24}
\]
with initial data satisfying $|(Z_\mu^1(0), Z_\mu^2(0)) - (\gamma, 0)| = O(\mu^{-1})$ as $\mu \to +\infty$, for some fixed $\gamma \in \mathbb{C}$. Then
\[
\lim_{\mu\to+\infty}\ \max_{\tau\in\mathbb{R}} |Z_\mu(\tau) - (\gamma e^{i\tau}, 0)| = 0. \tag{1.25}
\]
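As an informal numerical check of the model problem (not part of the original argument), the following Python sketch evaluates the exact solution of the linear system (1.24) through the characteristic values of its coefficient matrix and measures the distance to $(\gamma e^{i\tau}, 0)$ on a bounded interval, which is the setting of Theorem 1.8.4. The function names, the initial data $(\gamma, 1/\mu)$, the interval and the parameter values are illustrative assumptions.

```python
import cmath
import math


def characteristic_values(mu, beta):
    """Return (lambda_plus, lambda_minus) for the matrix [[1, beta], [beta, mu]]."""
    # (1 + mu)^2 - 4 (mu - beta^2) = (mu - 1)^2 + 4 beta^2 is never negative.
    root = math.sqrt((mu - 1.0) ** 2 + 4.0 * beta ** 2)
    return 0.5 * (1.0 + mu + root), 0.5 * (1.0 + mu - root)


def solve_model(mu, beta, z0, tau):
    """Exact solution z(tau) of (1.24) with z(0) = z0 (assumes lambda_+ != lambda_-)."""
    lam_p, lam_m = characteristic_values(mu, beta)
    e_p = cmath.exp(1j * lam_p * tau)
    e_m = cmath.exp(1j * lam_m * tau)
    gap = lam_p - lam_m
    z1_0, z2_0 = z0
    z1 = (((1 - lam_m) * e_p - (1 - lam_p) * e_m) * z1_0
          + beta * (e_p - e_m) * z2_0) / gap
    # The coefficient of z1_0 uses the identity (1 - lam_p)(1 - lam_m) = -beta^2.
    z2 = (beta * (e_p - e_m) * z1_0
          - ((1 - lam_p) * e_p - (1 - lam_m) * e_m) * z2_0) / gap
    return z1, z2


def max_error(mu, beta=0.5, gamma=1.0 + 0.5j, a=-5.0, b=5.0, samples=2001):
    """Largest sampled distance between Z_mu(tau) and (gamma e^{i tau}, 0) on [a, b]."""
    z0 = (gamma, 1.0 / mu)  # assumed initial data, within O(1/mu) of (gamma, 0)
    worst = 0.0
    for k in range(samples):
        tau = a + (b - a) * k / (samples - 1)
        z1, z2 = solve_model(mu, beta, z0, tau)
        err = math.hypot(abs(z1 - gamma * cmath.exp(1j * tau)), abs(z2))
        worst = max(worst, err)
    return worst


if __name__ == "__main__":
    for mu in (10.0, 100.0, 1000.0, 10000.0):
        print(mu, max_error(mu))
```

The sampled error should shrink as $\mu$ grows; the interval $[a, b]$ is kept bounded on purpose, since the sketch only probes the local-in-time statement.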


Remark 1.8.2. The system (1.24) is Hamiltonian with the standard symplectic structure on $\mathbb{C}^2$ and with Hamiltonian function $\mu V + U$ with $V(z) = \frac{1}{2}\bar z^2 z^2$ and
\[
U(z) = \frac{1}{2}\bar z^1 z^1 + \frac{1}{2}\beta(\bar z^1 z^2 + \bar z^2 z^1).
\]
Thus $V$ acts as a constraining potential for $\mu \to +\infty$, forcing the solution onto the set $S = \mathbb{C}\times\{0\} \subset \mathbb{C}^2$, where $z^2 = 0$. Projecting the system to $S$ gives, formally,
\[
i\dot z^1 + z^1 = 0. \tag{1.26}
\]
The theorem asserts that (1.26) indeed governs the behaviour of the limit of appropriate sequences of solutions to (1.24).

Proof. The solution with initial data $z(0) = (z^1(0), z^2(0))$ is given by:
\[
z^1(\tau) = \frac{\beta\bigl[(1-\lambda_-)e^{i\lambda_+\tau} - (1-\lambda_+)e^{i\lambda_-\tau}\bigr]z^1(0) + \beta^2\bigl[e^{i\lambda_+\tau} - e^{i\lambda_-\tau}\bigr]z^2(0)}{\beta(\lambda_+ - \lambda_-)},
\]
\[
z^2(\tau) = \frac{-(1-\lambda_+)(1-\lambda_-)\bigl(e^{i\lambda_+\tau} - e^{i\lambda_-\tau}\bigr)z^1(0) - \beta\bigl[(1-\lambda_+)e^{i\lambda_+\tau} - (1-\lambda_-)e^{i\lambda_-\tau}\bigr]z^2(0)}{\beta(\lambda_+ - \lambda_-)}.
\]
Here the $\lambda_\pm$ are the characteristic values of the system:
\[
\lambda_\pm = \frac{1}{2}(1+\mu)\left[1 \pm \left(1 - \frac{4(\mu-\beta^2)}{(1+\mu)^2}\right)^{1/2}\right],
\]
which satisfy, by the binomial expansion, $|\lambda_+ - \mu| = O(1)$, $|\lambda_- - 1| = O(\mu^{-1})$ as $\mu \to \infty$. From this, and the fact that $\lambda_\pm \in \mathbb{R}$ for large $\mu$ so that $|e^{i\lambda_\pm\tau}| = 1$, the behaviour in (1.25) follows for the solutions $Z_\mu(\tau)$ with initial data as described.

Remark 1.8.3. In this example the exact solutions indicate that while $Z_\mu^2 \to 0$, the time derivatives $\dot Z_\mu^2$ are bounded, but cannot generally be expected to have limit zero. In the absence of explicit formulae for $Z_\mu(\tau)$, it is still possible to prove results like Theorem 1.8.1, either (i) by explicit perturbative construction of solutions to the full system, using solutions of the restricted system as a starting point, or (ii) by obtaining uniform bounds for the $Z_\mu(\tau)$ which allow the extraction of convergent subsequences, and then identifying the unique limit of all such subsequences as the corresponding solution of the restricted system with Hamiltonian $U|_S$. In the present article we will adopt the second strategy in our proof of Theorem 1.7.2 (although it would be possible to use the first strategy, as in [44]). To make the structure of the proof transparent, it is useful to consider in some detail how to execute the second strategy to prove a variant of Theorem 1.8.1:

Theorem 1.8.4 (Weaker version of Theorem 1.8.1). In the situation of Theorem 1.8.1,
\[
\lim_{\mu\to+\infty}\ \max_{a<\tau<b} |Z_\mu(\tau) - (\gamma e^{i\tau}, 0)| = 0, \tag{1.27}
\]
for every bounded interval $[a, b] \subset \mathbb{R}$.

Remark 1.8.5. Although weaker than Theorem 1.8.1, the proof of Theorem 1.8.4 that we give generalizes to the infinite dimensional problem (1.5), (1.18), in which the explicit solutions corresponding to those used in the proof of Theorem 1.8.1 are of course not available.

Proof.
• Differentiation of Eqs. (1.24) in time gives the identical system for $\zeta = \dot z$. Use the energy identity $\mu V(\zeta(\tau)) + U(\zeta(\tau)) = \mu V(\zeta(0)) + U(\zeta(0))$, together with the identical estimate for $z(\tau)$, to deduce (using Cauchy-Schwarz) that the solutions $Z_\mu$ of Theorem 1.8.1 satisfy $|Z_\mu(\tau)| + |\dot Z_\mu(\tau)| \le C$, with $C$ independent of $\mu \gg 1$.
• By the previous item, deduce that the family of functions $\tau \mapsto Z_\mu(\tau)$ is uniformly (in $\mu \gg 1$) bounded and equicontinuous, and so the Arzela-Ascoli theorem implies subsequential convergence $Z_{\mu_j} \to Z$ in $C(I)$ for any bounded interval $I \subset \mathbb{R}$.
• The energy estimate implies that, for large $\mu$, there exists $C > 0$, independent of $\mu$, such that $\mu\,\bar Z_\mu^2 Z_\mu^2 \le C$. It follows that $Z_\mu^2 \to 0$ along any convergent subsequence. Now consider the integrated form of the first equation of (1.24) (i.e. project the system onto $S = \mathbb{C}\times\{0\} \subset \mathbb{C}^2$, where $z^2 = 0$). Taking the limit $\mu_j \to \infty$, it follows that the limit $Z = (Z^1, Z^2)$ of any convergent subsequence satisfies $Z^1(\tau) = Z^1(0) + i\int_0^\tau Z^1(s)\,ds$ and $Z^1(0) = \gamma$. This integral equation has unique solution $Z^1(\tau) = \gamma e^{i\tau}$, and hence the $C_{loc}$ limit of any convergent subsequence is $(\gamma e^{i\tau}, 0)$. It follows that $Z_\mu$ converges to this limit in $C_{loc}$ without restriction to subsequences. This proves Theorem 1.8.4. (In view of Remark 1.8.3 we should not expect this convergence to be in $C^1_{loc}$.)

The general situation to which Theorem 1.8.4, and its proof, potentially generalize is the following: on a phase space $\mathcal{H}$ we consider the integral curves $Z_\mu(\tau)$ for a Hamiltonian $\mu V + U$ for large $\mu$ (“the full system”).
Under the assumption that $S = \{z \in \mathcal{H} : \min V = V(z)\}$ is a symplectic submanifold of $\mathcal{H}$, we can consider the “restricted system” on $S$ determined by the Hamiltonian $U|_S$, and try to prove that this Hamiltonian system can be used to describe the limiting behaviour of $Z_\mu(\tau)$ as $\mu \to +\infty$. An infinite dimensional example of this situation is provided by the Chern-Simons-Schrödinger system (1.18): in the next section we will provide a proof of Theorem 1.7.2 employing the same strategy as that used in the proof of Theorem 1.8.4 just given.

2. Uniform Bounds and Proof of the Main Theorem

In this section we prove our main result, Theorem 1.7.2, along the lines suggested by the discussion of the simple model problem in the last section. The crucial stage is the proof of the main estimate, Theorem 2.3.1, which asserts the existence of a time interval, independent of $\mu$, on which the solution $\psi = (\mathbf{A}, \Phi)$ is uniformly bounded in $\mathcal{H}_2$, and its time derivative is uniformly bounded in $H^1$ as $\mu \to +\infty$. Given this bound, Theorem 1.7.2 can be deduced using a variant of the Lions-Aubin lemma, and a careful analysis of the $\mu \to +\infty$ limit of (1.18). Before obtaining the uniform bound, we collect some identities used in the proof. Some more specialized identities related to the self-dual structure are collected separately in Sect. 3, and referred to as needed. Specifically, we draw the reader’s attention to the following two uses made of these more specialized identities:

(i) Differentiation in time gives rise to Eq. (2.31) for $\zeta = \dot\psi$ in which the dominant term (as $\mu \to +\infty$) involves $L_\psi$, the Hessian of $V$ defined in (2.40). It is shown in Sect. 3 that this operator takes the special form
\[
L_\psi = D_\psi^* D_\psi + O(|B|), \tag{2.28}
\]
with $D_\psi$ complex linear (see (3.62)), and $B$ as in Remark 1.6.3. Observing that the $L^2$ norm is exactly preserved for equations of the form $J\dot\zeta = D_\psi^* D_\psi\zeta$, it is easy to believe that the stated structure of $L_\psi$ is useful in the derivation of $\mu$-independent bounds for (2.31) (for initial data as in the theorem); this indeed turns out to be the case - see the proof of Theorem 2.3.1.

(ii) After obtaining a convergent subsequence of solutions of (1.20) it is necessary to take the limit of the equation itself along the subsequence $\mu = \mu_j \to +\infty$. For this purpose it is very convenient to be able to eradicate the term $\mu V'$ on the right hand side, since this is clearly hard to control for large $\mu$: this can be done by applying a projection operator $P_\mu$ whose existence close to the set of self-dual vortices is assured by the Bogomolny structure: see Lemmas 3.3.1 and 3.3.2. (In geometrical terms there is a foliation of the phase space $\mathcal{H}_2$, and the range of $P_\mu$ is the tangent space to the leaves of this foliation, after dividing out by the action of the gauge group using (1.7).)

Although our final conclusions are in terms of the standard Sobolev norms based on the fixed connection $\nabla$, it will be convenient to obtain bounds for the corresponding Sobolev norms defined at each fixed time with respect to the connection $D = \nabla - i\mathbf{A}$, see (A.2). These can be related to the standard norms by (A.3)–(A.5).

2.1. The evolution equations and associated identities. In addition to the rescaled Eq. (1.20) for $\psi = (\mathbf{A}, \Phi)$:
\[
J\frac{\partial\psi}{\partial\tau} = \mu V' + U' + J(dA_0, iA_0\Phi),
\]

we will use the differentiated equation for $\zeta = \dot\psi \equiv \partial\psi/\partial\tau$. To write this down we need the linearization of the operator $V'(\psi)$, i.e. the second order linear differential operator $\mathcal{L}_\psi$ obtained by differentiation of the map $\psi \mapsto V'(\psi)$: $\mathcal{L}_\psi = DV'(\psi)$, or equivalently, $\langle\zeta, \mathcal{L}_\psi\zeta\rangle_{L^2} = \frac{d^2}{ds^2}V(\psi + s\zeta)\big|_{s=0}$. Explicitly, with $\zeta = (\dot{\mathbf{A}}, \dot\Phi)$, we have
\[
\begin{aligned}
\langle\zeta, \mathcal{L}_\psi\zeta\rangle_{L^2} = \int \Bigl( &|d\dot{\mathbf{A}}|^2 + |D\dot\Phi|^2 - 2\langle D\Phi, i\dot{\mathbf{A}}\dot\Phi\rangle + |\Phi|^2|\dot{\mathbf{A}}|^2 - 2\langle D\dot\Phi, i\dot{\mathbf{A}}\Phi\rangle\\
&+ \langle\Phi, \dot\Phi\rangle^2 - \frac{1}{2}(1-|\Phi|^2)|\dot\Phi|^2 \Bigr)\, d\mu_g.
\end{aligned} \tag{2.29}
\]

Remark 2.1.1. There is a slightly simpler version of this formula, given in (2.40) below, when $\zeta$ is restricted by the gauge condition (1.7). Furthermore in Sect. 3 it is shown that the self-dual structure provides a useful way of rewriting this formula as in (2.28), in terms of the complex structure defined in (1.21), and using the complex one-form $\dot\alpha\,dz$, where $\dot\alpha = \frac{\dot A_1 - i\dot A_2}{2}$, in place of the real one-form $\dot A_1\,dx^1 + \dot A_2\,dx^2$, see (3.60). Since this is used only at one point in the proof - in Lemma 2.3.8 - this formulation is presented separately in Sect. 3, and referred to only as needed.

The linearization of $U'$ is the linear operator $K_\psi = DU'(\psi)$, given by
\[
K_\psi : (\dot{\mathbf{A}}, \dot\Phi) \mapsto \Bigl(0,\ -\frac{\sigma}{2}(1-|\Phi|^2)\dot\Phi + \sigma\langle\Phi, \dot\Phi\rangle\Phi\Bigr), \tag{2.30}
\]
with $\sigma$ defined in (1.17). Given these definitions, the chain rule implies that, if $\psi$ is a smooth solution of (1.20), then $\zeta(\tau) = \dot\psi(\tau)$ solves
\[
J\frac{\partial\zeta}{\partial\tau} = \mu\mathcal{L}_\psi\zeta + K_\psi\zeta + J\frac{\partial}{\partial\tau}(dA_0, iA_0\Phi). \tag{2.31}
\]

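We also need identities for the evolution of the Bogomolny operator $B$ defined in Remark 1.6.3 and discussed in more detail in Sect. 3. The first component is preserved
\[
\frac{\partial}{\partial\tau}\Bigl(B - \frac{1}{2}(1-|\Phi|^2)\Bigr) = e^{-2\rho}(\partial_1\dot A_2 - \partial_2\dot A_1) + \langle\Phi, \dot\Phi\rangle = 0, \tag{2.32}
\]
as a consequence of (1.18). We will require that the initial data are such that $B - \frac{1}{2}(1-|\Phi|^2) = 0$ initially, and hence for all times. The second component of the Bogomolny operator $B$ will be denoted
\[
\eta = \bar\partial_{\mathbf{A}}\Phi = \frac{1}{2}(D_1 + iD_2)\Phi, \tag{2.33}
\]
(see Sect. 3), and we have the following identity:
\[
i(\partial_\tau - iA_0)\eta = \mu\bigl(-4\bar\partial_{\mathbf{A}}(e^{-2\rho}\partial_{\mathbf{A}}\eta) + |\Phi|^2\eta\bigr) - \frac{\sigma}{2}\bar\partial_{\mathbf{A}}\bigl((1-|\Phi|^2)\Phi\bigr). \tag{2.34}
\]
(To verify this identity: substitute $\Delta_{\mathbf{A}}\Phi = 4e^{-2\rho}\partial_{\mathbf{A}}\bar\partial_{\mathbf{A}}\Phi - B\Phi$ into the third line of (1.18) and then apply $\bar\partial_{\mathbf{A}}$ to the resulting equation and use the identity $(E_1 + iE_2)\Phi = -2\mu|\Phi|^2\bar\partial_{\mathbf{A}}\Phi$ which follows from the first two lines of (1.18).) Of course, the energy
\[
\mathcal{E}(\tau) = \mu V(\psi(\tau)) + U(\psi(\tau)) = \mathcal{E}_0 > 0 \tag{2.35}
\]
is independent of time $\tau$ for regular solutions, as is the $L^2$ norm,
\[
\|\Phi(\tau)\|_{L^2} = L > 0. \tag{2.36}
\]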


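2.2. Choice of gauge condition and related estimates. The divergence of $E$ can be calculated to be:
\[
\begin{aligned}
\operatorname{div} E = e^{-2\rho}(\partial_1 E_1 + \partial_2 E_2) &= \mu\bigl(-\Delta B - e^{-2\rho}\partial_1\langle i\Phi, D_2\Phi\rangle + e^{-2\rho}\partial_2\langle i\Phi, D_1\Phi\rangle\bigr)\\
&= \mu(4e^{-2\rho}|\eta|^2) + \Bigl\langle i\Phi, \Bigl(\frac{\partial}{\partial t} - iA_0\Bigr)\Phi\Bigr\rangle - \sigma B|\Phi|^2.
\end{aligned}
\]
In the last line we have used $B = \frac{1}{2}(1-|\Phi|^2)$, so that $\Delta B = -\langle\Phi, \Delta_{\mathbf{A}}\Phi\rangle - e^{-2\rho}(|D_1\Phi|^2 + |D_2\Phi|^2)$, the equation for $\Phi$ and the definition of $\eta$ in (2.33). Under the gauge condition (1.7) we get the following equation for $A_0$:
\[
(-\Delta + |\Phi|^2)A_0 = 4\mu e^{-2\rho}|\eta|^2 - \sigma B|\Phi|^2. \tag{2.37}
\]

Lemma 2.2.1 (Estimates for $A_0$). Assume $\tau \mapsto \psi(\tau) = (\mathbf{A}(\tau), \Phi(\tau))$ is a smooth solution of (1.20) which satisfies the gauge condition (1.7), (2.35) and (2.36). Then for all $r < \infty$, there exists $c_0(\mathcal{E}_0, L, r) > 0$ such that
\[
\|A_0(\tau)\|_{L^r} \le c_0(\mathcal{E}_0, L, r) \tag{2.38}
\]
and there exists $c_0(\mathcal{E}_0, L) > 0$ such that
\[
\|A_0(\tau)\|_{H^2} \le c_0(\mathcal{E}_0, L)\bigl(1 + \mu\|\bar\partial_{\mathbf{A}}\Phi(\tau)\|_{L^\infty}\bigr). \tag{2.39}
\]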

Remark 2.2.2. This shows that in the original system (before rescaling) the time component of the potential $A_0$ is $O(|\lambda - 1|)$ in the gauge defined by (1.7).

Proof. The crucial point here is the $\mu$ independence of the bounds. The second inequality follows from standard elliptic theory once the first is established. By (2.37) it is possible to write $A_0 = A_0^+ + \hat A_0$, where $(-\Delta + |\Phi|^2)A_0^+ = 4\mu e^{-2\rho}|\eta|^2$, so that $A_0^+ \ge 0$ by the maximum principle, and $(-\Delta + |\Phi|^2)\hat A_0 = -\sigma B|\Phi|^2$. The bounds stated in the lemma will follow by the triangle inequality once they are proved for $A_0^+$, since they are immediate for $\hat A_0$. Now integrating the equation for $A_0^+$ implies that $\||\Phi|^2 A_0^+\|_{L^1} = \int |\Phi|^2 A_0^+\, d\mu_g \le C(\mathcal{E}_0, L)$ since $A_0^+ \ge 0$; this bound is independent of $\mu \gg 1$ on account of (2.35). The standard elliptic theory for $-\Delta u = f \in L^1$ now gives the $L^r$ estimates for $A_0^+$ and hence the lemma.

Lemma 2.2.3 (Estimates for $\dot{\mathbf{A}}$). Let $\zeta = (\dot{\mathbf{A}}, \dot\Phi)$ satisfy the gauge condition (1.7), as well as the linearized constraint equation (2.32). Then there exists a constant $c_1 > 0$ such that $\|\dot{\mathbf{A}}\|_{H^1} \le c_1\|\dot\Phi\|_{L^2}$, and more generally, for any 1 0 such that A for a smooth solution, $\tau \mapsto \psi(\tau) = (\mathbf{A}(\tau), \Phi(\tau))$, of (1.20) which satisfies the gauge condition (1.7).

Proof. These are the standard estimates for the Hodge system, proved by using the Hodge decomposition to reduce to the Calderon-Zygmund estimate for the Laplacian.


On the subspace of $\zeta = (\dot{\mathbf{A}}, \dot\Phi)$ satisfying the gauge condition (1.7), the operator $\mathcal{L}_\psi$ has a simpler form: $\mathcal{L}_\psi\zeta = L_\psi\zeta$, where $L_\psi$ is the operator defined by
\[
\begin{aligned}
\langle\zeta, L_\psi\zeta\rangle_{L^2} = \int \Bigl( &|d\dot{\mathbf{A}}|^2 + |\operatorname{div}\dot{\mathbf{A}}|^2 + |D\dot\Phi|^2 + |\Phi|^2\bigl(|\dot{\mathbf{A}}|^2 + |\dot\Phi|^2\bigr)\\
&- 4\langle D\Phi, i\dot{\mathbf{A}}\dot\Phi\rangle - \frac{1}{2}(1-|\Phi|^2)|\dot\Phi|^2 \Bigr)\, d\mu_g.
\end{aligned} \tag{2.40}
\]

Lemma 2.2.4 (The Hessian). Let $\psi = (\mathbf{A}, \Phi)$ be smooth. Then the second order differential operator $L_\psi$ is a self-adjoint operator with domain $H^2$, and there exist numbers $c_2, c_3$ such that
\[
\langle\zeta, L_\psi\zeta\rangle_{L^2} \ge c_2\|\zeta\|^2_{H^1_{\mathbf{A}}} - c_3\|\zeta\|^2_{L^2}.
\]
The numbers $c_2, c_3$ depend only on the numbers $L$ and $\mathcal{E}_0$, defined as in (2.35), (2.36).

Proof. First of all, observe that
\[
\int \Bigl(|d\dot{\mathbf{A}}|^2 + |\operatorname{div}\dot{\mathbf{A}}|^2 + |D\dot\Phi|^2 + |\Phi|^2\bigl(|\dot{\mathbf{A}}|^2 + |\dot\Phi|^2\bigr)\Bigr)\, d\mu_g \ge c(\mathcal{E}_0, L)\|(\dot{\mathbf{A}}, \dot\Phi)\|^2_{H^1_{\mathbf{A}}}.
\]
This can be proved by a straightforward contradiction argument that is very similar to the proof of Lemma 3.2.2 given below, so the details will be omitted. Next, to deduce the stated result, just bound the final two terms in (2.40) using the Holder inequality with $1 = \frac{1}{2} + \frac{1}{4} + \frac{1}{4}$, the interpolation inequality in Lemma A.9 and Cauchy-Schwarz.

Corollary 2.2.5. Assume given a smooth solution, $\tau \mapsto \psi(\tau) = (\mathbf{A}(\tau), \Phi(\tau))$, of (1.20) which satisfies the gauge condition (1.7), (2.35) and (2.36). Then the quantity
\[
\mathcal{E}_1(\tau) = \frac{1}{2}\langle\zeta(\tau), (L_\psi + \mu^{-1}K_\psi)\zeta(\tau)\rangle_{L^2}, \tag{2.41}
\]
where $\psi = \psi(\tau)$, satisfies for $\mu \ge 1$,
\[
\mathcal{E}_1 \ge c_4\|\zeta\|^2_{H^1_{\mathbf{A}}} - c_5\|\zeta\|^2_{L^2}
\]
with $c_4, c_5$ depending only on $\mathcal{E}_0, L$.

2.3. The main estimate. We say that a smooth solution, $\tau \mapsto \psi(\tau) = (\mathbf{A}(\tau), \Phi(\tau))$, of (1.20) satisfies conditions (AE) and (AI), if the following conditions hold:
(AE) There exist positive numbers $\mathcal{E}_0, L$ such that $\|\Phi(\tau)\|_{L^2} = L$ and $\mathcal{E}(\tau) = \mathcal{E}_0$, for all times $\tau \in \mathbb{R}$, where $\mathcal{E}(\tau)$ is the energy (2.35). (Recall that both these quantities are independent of $\tau$.)
(AI) The initial data are such that $\|\psi(0)\|_{\mathcal{H}_2} + \|\dot\psi(0)\|_{H^1} \le K < \infty$. (Recall the definition of the norms in (1.10).)

Theorem 2.3.1. For $\mu \ge 1$ let $\tau \mapsto \psi(\tau)$ be a smooth solution of (1.20) satisfying conditions (AE) and (AI), for some fixed numbers $K, L, \mathcal{E}_0$. There exist numbers $\tau_* > 0$ and $M_* > 0$, independent of $\mu$, such that
\[
\max_{|\tau|\le\tau_*}\Bigl\|\Bigl(\psi(\tau), \frac{\partial\psi}{\partial\tau}(\tau)\Bigr)\Bigr\|_{\mathcal{H}_2\times H^1} \le M_*. \tag{2.42}
\]


Beginning of proof of Theorem 2.3.1. By time reversal invariance it is sufficient to prove the bound for $0 \le \tau \le \tau_*$, for some $\tau_* > 0$ independent of $\mu$. Let
\[
\zeta(\tau) = \frac{\partial\psi}{\partial\tau}(\tau) = \dot\psi(\tau).
\]
For any $M > \|\zeta(0)\|_{L^2}$ there exists a time $T(M, \mu) > 0$ such that
\[
\sup_{0\le\tau\le T(M,\mu)}\|\zeta(\tau)\|_{L^2} \le M. \tag{2.43}
\]
We will prove that there exist positive numbers $M_*, \tau_*$, independent of $\mu$, such that $T(M_*, \mu) \ge \tau_*$, and hence $\sup_{0\le\tau\le\tau_*}\|\zeta(\tau)\|_{L^2} \le M_*$. The proof proceeds by obtaining a series of $\mu$-independent bounds, predicated upon (2.43), which imply boundedness of $\psi(\tau), \dot\psi(\tau)$ in the Hilbert space $\mathcal{H}_2$ defined in (1.10) for $0 \le \tau \le \tau_*$. These bounds are now stated in a sequence of lemmas, all of which refer to a smooth solution of (1.20), (1.7) which verifies (AE), (AI) and (2.43) for all $\tau$ under consideration.

Lemma 2.3.2 (Estimate for $\Phi$ in $H^2$). There exists $C_1 = C_1(\mathcal{E}_0, L) > 0$, independent of $\mu$, such that $\|\Phi(\tau)\|_{H^2_{\mathbf{A}}} \le C_1(1 + \|\zeta(\tau)\|_{L^2}) \le C_1(1 + M)$.

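Proof. Using the third equation of (1.18) for $\Phi$, we bound
\[
\|\Delta_{\mathbf{A}}\Phi\|_{L^2} \le \|\dot\Phi\|_{L^2} + \|A_0\Phi\|_{L^2} + \frac{1}{2}\|(1-|\Phi|^2)\Phi\|_{L^2}.
\]
Now, by Lemma A.2.2, we can bound $\|\nabla_{\mathbf{A}}\nabla_{\mathbf{A}}\Phi\|_{L^2} \le \|\Delta_{\mathbf{A}}\Phi\|_{L^2} + c(\mathcal{E}_0)\|\nabla_{\mathbf{A}}\Phi\|_{L^4}$, and hence, by Lemma A.9 and Cauchy-Schwarz: $\|\nabla_{\mathbf{A}}\nabla_{\mathbf{A}}\Phi\|_{L^2} \le 2\|\Delta_{\mathbf{A}}\Phi\|_{L^2} + c(\mathcal{E}_0, L)$. Therefore, using also Lemma 2.2.1, we deduce the bound $\|\Phi(\tau)\|_{H^2} \le c(1 + \|\zeta(\tau)\|_{L^2}) \le c(1 + M)$, for some $c = c(\mathcal{E}_0, L) > 0$, and the result follows.

Corollary 2.3.3. $\exists C_2 = C_2(\mathcal{E}_0, L) > 0$ such that $\|\Phi(\tau)\|_{L^\infty} \le C_2\sqrt{1 + \ln(1 + M)}$.

Proof. This follows from Lemma A.11 and the previous lemma.

Lemma 2.3.4 (Energy estimate for $\zeta = \dot\psi$). There is a constant $C_3(\mathcal{E}_0, L) > 0$ such that
\[
\Bigl|\frac{d\mathcal{E}_1}{d\tau}\Bigr| \le C_3(1 + \|\Phi\|^2_{L^\infty})\|\zeta\|^2_{H^1_{\mathbf{A}}} + C_3\|\zeta\|^6_{L^2} + C_3\|\zeta\|^4_{L^2}, \tag{2.44}
\]
where $\mathcal{E}_1$ is the quantity defined in (2.41).

Proof. Compute $\frac{d}{dt}\mathcal{E}_1$, substitute from (2.31), and use the observation that
\[
\langle J\dot\zeta, (d\dot A_0, i\dot A_0\Phi)\rangle_{L^2} = 0,
\]
by the constraint equation $B = \frac{1}{2}(1-|\Phi|^2)$ in (1.5), to obtain
\[
\frac{d\mathcal{E}_1}{d\tau} = \langle i\dot\Phi, iA_0\dot\Phi\rangle_{L^2} + \frac{1}{2}\Bigl\langle\zeta, \Bigl[\frac{\partial}{\partial\tau}, L_\psi + \mu^{-1}K_\psi\Bigr]\zeta\Bigr\rangle_{L^2}. \tag{2.45}
\]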


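To handle the second term, we make use of the following bounds (written schematically, i.e. suppressing indices and inner products which play no role):
\[
\begin{aligned}
\|\Phi\zeta^3\|_{L^1} &\le \|\Phi\|_{L^\infty}\|\zeta\|_{L^2}\|\zeta\|^2_{L^4} \le c\|\Phi\|_{L^\infty}\|\zeta\|^2_{L^2}\|\zeta\|_{H^1_{\mathbf{A}}},\\
\|\dot\Phi\,\dot{\mathbf{A}}\,\nabla_{\mathbf{A}}\dot\Phi\|_{L^1} &\le \|\nabla_{\mathbf{A}}\dot\Phi\|_{L^2}\|\dot{\mathbf{A}}\|_{L^4}\|\dot\Phi\|_{L^4} \le c\|\Phi\|_{L^\infty}\|\zeta\|^{3/2}_{H^1_{\mathbf{A}}}\|\zeta\|^{3/2}_{L^2},\\
\|\dot\Phi^2\,\nabla\dot{\mathbf{A}}\|_{L^1} &\le \|\zeta\|^2_{L^4}\|\nabla\dot{\mathbf{A}}\|_{L^2} \le c\|\Phi\|_{L^\infty}\|\zeta\|^2_{L^2}\|\zeta\|_{H^1_{\mathbf{A}}}.
\end{aligned}
\]
All of these bounds follow directly from Holder’s inequality, the interpolation inequality in Lemma A.2.1, Lemma 2.2.3 and the bound
\[
\|\dot{\mathbf{A}}\|_{H^1} \le c\|\Phi\|_{L^\infty}\|\dot\Phi\|_{L^4} + \|\dot{\mathbf{A}}\|_{L^2}.
\]
It then follows, by inspection of the formulae for $L_\psi$, $K_\psi$ in (2.29) and (2.30), that the second term in $\frac{d\mathcal{E}_1}{d\tau}$ can be bounded by a sum of terms of this type, and hence:
\[
\Bigl|\Bigl\langle\zeta, \Bigl[\frac{\partial}{\partial\tau}, L_\psi + \mu^{-1}K_\psi\Bigr]\zeta\Bigr\rangle_{L^2}\Bigr| \le c(1 + \|\Phi\|^2_{L^\infty})\|\zeta\|^2_{H^1_{\mathbf{A}}} + c\|\zeta\|^6_{L^2} + c\|\zeta\|^4_{L^2}.
\]
Also, we can bound
\[
|\langle i\dot\Phi, iA_0\dot\Phi\rangle_{L^2}| \le c\|A_0\|_{L^r}\|\dot\Phi\|^2_{L^{2r'}} \le c\|A_0\|_{L^r}\|\dot\Phi\|^2_{H^1_{\mathbf{A}}},
\]
where $r > 1$ and $1/r + 1/r' = 1$. Combining these with Lemma 2.2.1, we obtain (2.44), completing the proof of the lemma.

Corollary 2.3.5. There is a constant $C_4 = C_4(\mathcal{E}_0, K, L, M) > 0$ such that $\|\zeta(\tau)\|_{H^1_{\mathbf{A}}} \le C_4(1 + \tau)$, for all times $\tau \in [0, T(M, \mu)]$.

Lemma 2.3.6 (Estimate for $\eta = \bar\partial_{\mathbf{A}}\Phi$). There exists $C_5 = C_5(\mathcal{E}_0) > 0$ such that, at each time $\tau$,
\[
\mu\|\eta\|_{H^2_{\mathbf{A}}} \le C_5\bigl(\|\dot\Phi\|_{H^1_{\mathbf{A}}} + \|\dot{\mathbf{A}}\|^2_{L^2} + \|\Phi\|^2_{L^\infty}\bigr). \tag{2.46}
\]

Proof. From Eq. (2.34) for $\eta$, and using the interpolation inequality in Lemma A.9, the elliptic term $L_{(\mathbf{A},\Phi)}\eta \equiv (-4\bar\partial_{\mathbf{A}}(e^{-2\rho}\partial_{\mathbf{A}}\eta) + |\Phi|^2\eta)$ satisfies, for some $c = c(\mathcal{E}_0) > 0$,
\[
\mu\|L_{(\mathbf{A},\Phi)}\eta\|_{L^2} \le \|\dot\Phi\|_{H^1_{\mathbf{A}}} + \|\Phi\|_{L^\infty}\|\dot{\mathbf{A}}\|_{L^2} + c\|A_0\|_{L^4}\bigl(1 + \|\eta\|^{1/2}_{H^1_{\mathbf{A}}}\bigr) + c\|\Phi\|^2_{L^\infty}. \tag{2.47}
\]
We next see that (2.46) follows from the usual elliptic regularity estimate. Firstly, observe that associated to the operator $L_{(\mathbf{A},\Phi)}$ is the quadratic form
\[
Q_{(\mathbf{A},\Phi)}(\eta) = \langle\eta, L_{(\mathbf{A},\Phi)}\eta\rangle_{L^2(\Sigma)} = \int_\Sigma \bigl(4|\partial_{\mathbf{A}}\eta|^2 e^{-4\rho} + |\Phi|^2|\eta|^2 e^{-2\rho}\bigr)\, d\mu_g,
\]
which is bounded below by $c\|\eta\|^2_{H^1_{\mathbf{A}}}$, where $c = c(\mathcal{E}_0, L) > 0$ by Lemma 3.2.2. It follows that $\|\eta\|_{H^1_{\mathbf{A}}} \le c\|L_{(\mathbf{A},\Phi)}\eta\|_{L^2}$, a result which can be strengthened by the following: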


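Claim. $\|\nabla_{\mathbf{A}}\nabla_{\mathbf{A}}\eta\|_{L^2} \le c\|L_{(\mathbf{A},\Phi)}\eta\|_{L^2}$ where $c = c(\mathcal{E}_0, L) > 0$. By the Garding inequality,
\[
\|\nabla_{\mathbf{A}}\nabla_{\mathbf{A}}\eta\|_{L^2} \le \|L_{(\mathbf{A},\Phi)}\eta\|_{L^2} + c(\mathcal{E}_0, L)\bigl(\|\nabla_{\mathbf{A}}\eta\|_{L^4} + \|\eta\|_{H^1_{\mathbf{A}}}\bigr).
\]
Finally, using the interpolation inequality (A.9) and the Cauchy-Schwarz inequality, we deduce the inequality claimed.

Corollary 2.3.7. There is a constant $C_6 = C_6(\mathcal{E}_0, K, L, M) > 0$ such that $\mu\|\bar\partial_{\mathbf{A}}\Phi(\tau)\|_{L^\infty} \le C_6(1 + \tau)$.

Lemma 2.3.8 (Closing the argument: estimate for $\zeta$ in $L^2$). There is a constant $C_7(\mathcal{E}_0, L, M)$ such that $\zeta = \partial\psi/\partial\tau$ satisfies
\[
\|\zeta(\tau)\|^2_{L^2} \le \|\zeta(0)\|^2_{L^2}\exp\Bigl(C_7\int_0^\tau\bigl(\mu\|\bar\partial_{\mathbf{A}}\Phi(s)\|_{L^\infty} + \|\Phi(s)\|^2_{L^\infty}\bigr)\, ds\Bigr).
\]

Proof. Compute, using (2.31), that
\[
\frac{d}{d\tau}\|\zeta(\tau)\|^2_{L^2} = 2\langle J\zeta, (\mu L_\psi + K_\psi)\zeta\rangle_{L^2}
\]
since (by the gauge condition) $\langle\zeta, (d\dot A_0, i\dot A_0\Phi)\rangle_{L^2} = 0$, and $\langle\zeta, (0, iA_0\dot\Phi)\rangle_{L^2} = 0$ (using $\langle i\dot\Phi, \dot\Phi\rangle = 0$ pointwise). By Corollary 3.2.1 and the formula for $K_\psi$, there exists $C_7 = C_7(\mathcal{E}_0, L) > 0$ such that
\[
\Bigl|\frac{d}{d\tau}\|\zeta(\tau)\|^2_{L^2}\Bigr| \le C_7\bigl(\mu\|\bar\partial_{\mathbf{A}}\Phi(\tau)\|_{L^\infty} + \|\Phi(\tau)\|^2_{L^\infty}\bigr)\|\zeta(\tau)\|^2_{L^2}
\]
and so the stated inequality follows by the Gronwall lemma.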

Completion of proof of Theorem 2.3.1. The previous lemma allows us to validate the claim that (2.43), and thus all the bounds in Lemmas 2.3.2-2.3.8, in fact hold on a $\mu$-independent interval $[0, \tau_*]$, thus closing the argument. Indeed, by Corollaries 2.3.3 and 2.3.7 we have $\mu\|\bar\partial_{\mathbf{A}}\Phi(\tau)\|_{L^\infty} + \|\Phi(\tau)\|^2_{L^\infty} \le C_8(1 + \tau)$ for some $C_8 = C_8(\mathcal{E}_0, L, M)$. Now let $\tau_*, M_*$ be such that
\[
\|\zeta(0)\|^2_{L^2}\, e^{C_7 C_8(\tau_* + \tau_*^2/2)} \le M_*^2.
\]
(This is always possible for $M_* > \|\zeta(0)\|_{L^2}$ and $\tau_*$ small.) Then it follows that (2.43) holds with $T(M_*, \mu) \ge \tau_*$, and that the bounds given in Lemma 2.3.2 through Corollary 2.3.7 hold on the interval $[0, \tau_*]$. To conclude, we explain how to derive the bounds in (2.42). For $\zeta = \dot\psi$ we have boundedness of $\|\zeta(\tau)\|_{H^1_{\mathbf{A}}}$ by Corollary 2.3.5. Integrating in $\tau$ gives the bound for $\|\mathbf{A}\|_{H^1}$ in (2.42). Also the Kato and Sobolev inequalities ([28]) give a bound for $\dot\Phi$ in $L^p$ for $2 \le p < \infty$. Together with the boundedness of $\|\Phi\|_{L^\infty}$ this implies boundedness of $\|\dot{\mathbf{A}}\|_{W^{1,p}}$ by Lemma 2.2.3. Hence, integrating in $\tau$ and applying Sobolev’s inequality we deduce boundedness of $\|\mathbf{A}\|_{L^\infty}$. Putting all this information into (A.3), (A.4) we can deduce, from Lemma 2.3.2 and Corollary 2.3.5, that $(\Phi(\tau), \dot\Phi(\tau))$ is bounded in the ($\tau$-independent) norm $H^2 \times H^1$ as claimed in (2.42).
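Once the constants are fixed, the choice of $\tau_*$ in this closing step is elementary arithmetic: the condition $\|\zeta(0)\|^2_{L^2} e^{C_7 C_8(\tau_* + \tau_*^2/2)} \le M_*^2$ is a quadratic inequality in $\tau_*$. The following Python sketch solves it, under the simplifying assumption that $C_7$ and $C_8$ are given numbers (in the proof they depend on $M_*$ through $M$); the function name and the sample values are hypothetical.

```python
import math


def largest_tau_star(zeta0_norm, m_star, c7, c8):
    """Largest tau >= 0 with zeta0_norm**2 * exp(c7*c8*(tau + tau**2/2)) <= m_star**2."""
    if not 0.0 < zeta0_norm < m_star:
        raise ValueError("need 0 < ||zeta(0)|| < M_*")
    if c7 <= 0.0 or c8 <= 0.0:
        raise ValueError("constants must be positive")
    # Taking logarithms: tau + tau^2/2 <= budget, with
    # budget = ln(M_*^2 / ||zeta(0)||^2) / (C7 * C8).
    budget = 2.0 * math.log(m_star / zeta0_norm) / (c7 * c8)
    # Positive root of tau^2 + 2*tau - 2*budget = 0.
    return -1.0 + math.sqrt(1.0 + 2.0 * budget)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(largest_tau_star(zeta0_norm=1.0, m_star=2.0, c7=3.0, c8=4.0))
```

The returned value is positive whenever $M_* > \|\zeta(0)\|_{L^2}$, which matches the parenthetical remark that a suitable small $\tau_*$ always exists.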


2.4. Proof of Theorem 1.7.2. There are three stages to the proof:
• Deduce, from the uniform bounds of Theorem 2.3.1 and the compactness Lemma 2.4.1, that for any sequence $\mu_j \to +\infty$, there exists a subsequence along which the $\Psi_{\mu_j}$ converge.
• Identify the limit of these convergent subsequences.
• Deduce, from the uniqueness of the limit just identified, that the $\Psi_\mu$ do in fact converge as $\mu \to +\infty$ (without restriction to subsequences).

The first stage of the proof depends upon the following version of the Lions-Aubin Compactness Lemma (see [29, Lemma 10.4]), which is proved by a modification of the standard proof of the usual Ascoli-Arzela theorem:

Lemma 2.4.1. Assume that $(V, h)$ is a smooth vector bundle with inner product, over a compact Riemannian manifold $(\Sigma, g)$, which is endowed with a smooth unitary connection $\nabla$ and corresponding Sobolev norms $\|\cdot\|_{H^s}$ on the space of sections defined as in [33]. Assume that $l, s$ are positive numbers with $l < s$. Assume $f_n(\tau)$ is a sequence of smooth time-dependent sections of $V$ which satisfy
\[
\max_{|\tau|\le\tau_*}\bigl(\|f_n(\tau)\|_{H^s} + \|\dot f_n(\tau)\|_{H^l}\bigr) \le C.
\]
Then there exists a subsequence $\{f_{n_j}\}_{j=1}^\infty$ which converges to a limiting time-dependent section $f \in C([-\tau_*, \tau_*]; H^s(V))$, in the sense that $\max_{|\tau|\le\tau_*}\|f_{n_j}(\tau, \cdot) - f(\tau, \cdot)\|_{H^r} \to 0$, for every $r < s$.

Applying this we infer immediately the existence of a subsequence $\mu_j \to +\infty$ along which the solutions $\Psi_{\mu_j} = (\mathbf{A}_{\mu_j}, \Phi_{\mu_j})$ converge to a limit $\Psi^S(\tau)$ in the sense that
\[
\lim_{\mu_j\to\infty}\ \sup_{[-\tau_*,\tau_*]}\|\Psi_{\mu_j}(\tau) - \Psi^S(\tau)\|_{\mathcal{H}_r} = 0, \tag{2.48}
\]
for $r < 2$. It follows from Corollary 2.3.7 that
\[
\lim_{\mu\to+\infty}\ \sup_{[-\tau_*,\tau_*]}\|\bar\partial_{\mathbf{A}_\mu}\Phi_\mu\|_{L^\infty} = 0,
\]
and since the other Bogomolny equation $B = \frac{1}{2}(1-|\Phi|^2)$ is satisfied as a constraint, we deduce by Theorem 1.6.1 that $\Psi^S(\tau) \in S_N$, i.e. the limit $\Psi^S(\tau)$ is a self-dual vortex for each $\tau \in [-\tau_*, \tau_*]$. In addition, by (2.42) we have $\|\Psi_\mu(\tau_1) - \Psi_\mu(\tau_2)\|_{H^1} \le M_*|\tau_1 - \tau_2|$ so that, by (2.48), the limit $\Psi^S$ will also satisfy $\|\Psi^S(\tau_1) - \Psi^S(\tau_2)\|_{H^r} \le c|\tau_1 - \tau_2|$ for $r < 1$, i.e. the limit is Lipschitz, and in particular lies in $W^{1,\infty}([-\tau_*, \tau_*]; L^2)$.

For the second stage, we need to identify the limiting curve $\tau \mapsto \Psi^S(\tau) \in S_N$ as that described in Remark 1.7.1. It is clear, from the conditions on the initial data in the statement of Theorem 1.7.2, that $\Psi^S(0) = \psi_0 \in S_N$, and so it remains to deduce the ordinary differential equation (1.22) which then determines the curve completely. To do this it is necessary to take the limit of (1.20):
\[
J\frac{\partial\Psi_\mu}{\partial\tau} = \mu V'(\Psi_\mu) + U'(\Psi_\mu) + J(dA_0^\mu, iA_0^\mu\Phi_\mu) \tag{2.49}
\]


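as $\mu \to \infty$. The first term on the right hand side is the most evidently problematic. However, since the limiting motion is constrained to the vortex space $S_N$, it is only necessary to take a limit projected onto the tangent space $T_{\Psi^S}S_N$. To this end, it is actually most convenient to introduce $P_\mu(\tau) = P_{\Psi_\mu(\tau)}$, the spectral projection operator onto $\mathrm{Ker}\, D_{\Psi_\mu(\tau)} = \mathrm{Ker}\, D^*_{\Psi_\mu(\tau)}D_{\Psi_\mu(\tau)}$, discussed in Lemma 3.3.2. By the final statement of Lemma 3.3.2, and the convergence of $\Psi_{\mu_j}$ in (2.48), we know that $P_\mu(\tau)$ converge, in the $L^2 \to L^2$ operator norm, to the operator $P_{\Psi^S(\tau)}$, which is the spectral projection operator onto $\mathrm{Ker}\, D_{\Psi^S(\tau)} = \mathrm{Ker}\, D^*_{\Psi^S(\tau)}D_{\Psi^S(\tau)}$. (This latter operator is also the orthogonal $L^2$ projector onto the tangent space $T_{\Psi^S}S_N$, subject to the gauge condition (1.7).) Apply the operator $P_\mu(\tau)$ to Eq. (1.20), to obtain:
\[
P_\mu(\tau)J\frac{\partial\Psi_\mu}{\partial\tau} = P_\mu(\tau)U'(\Psi_\mu(\tau)), \tag{2.50}
\]
since $J(dA_0, iA_0\Phi_\mu)$ and $V'(\Psi_\mu)$ are both in the kernel of $P_\mu$ by Lemma 3.3.2. We can now identify the limit of the right hand side as $P_{\Psi^S(\tau)}U'(\Psi^S(\tau))$ at each $\tau$, and the convergence is strong in $L^2(\Sigma)$, by (2.48) and the above mentioned convergence of $P_\mu(\tau)$. For the left hand side it is necessary to consider the limit of the derivatives $\partial\Psi_\mu/\partial\tau$. Noting that these are bounded in e.g. $L^2([-\tau_*, \tau_*]; L^2(\Sigma))$, we may assume (by restricting to a further subsequence if necessary) the weak in $L^2$ subsequential convergence to a limit which is the weak time derivative of $\Psi^S$:
\[
\Bigl\langle\tilde f, \frac{\partial\Psi_{\mu_j}}{\partial\tau}\Bigr\rangle_{L^2([-\tau_*,\tau_*];L^2(\Sigma))} \to \Bigl\langle\tilde f, \frac{\partial\Psi^S}{\partial\tau}\Bigr\rangle_{L^2([-\tau_*,\tau_*];L^2(\Sigma))},
\]
for every $\tilde f \in L^2([-\tau_*, \tau_*]; L^2(\Sigma))$. Now to identify the limit along a convergent subsequence $\mu_j \to +\infty$, consider the projection operator $P_{\Psi^S(\tau)}$. Choosing $\tilde f(\tau, \cdot) = P_{\Psi^S(\tau)}(f(\tau, \cdot))$, and using the symmetry of $P_{\Psi^S(\tau)}$, this implies that
\[
\begin{aligned}
\int_{-\tau_*}^{+\tau_*}\Bigl\langle f, P_{\Psi^S(\tau)}J\frac{\partial\Psi_{\mu_j}}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau &= \int_{-\tau_*}^{+\tau_*}\Bigl\langle P_{\Psi^S(\tau)}f, J\frac{\partial\Psi_{\mu_j}}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau\\
\to \int_{-\tau_*}^{+\tau_*}\Bigl\langle P_{\Psi^S(\tau)}f, J\frac{\partial\Psi^S}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau &= \int_{-\tau_*}^{+\tau_*}\Bigl\langle f, P_{\Psi^S(\tau)}J\frac{\partial\Psi^S}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau,
\end{aligned}
\]
for any $f \in L^2([-\tau_*, \tau_*]; L^2(\Sigma))$. On the other hand, by the above mentioned convergence of $P_\mu(\tau)$ to $P_{\Psi^S(\tau)}$ and the bounded convergence theorem we have
\[
\int_{-\tau_*}^{+\tau_*}\Bigl\langle P_{\mu_j}(\tau)f, J\frac{\partial\Psi_{\mu_j}}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau - \int_{-\tau_*}^{+\tau_*}\Bigl\langle P_{\Psi^S(\tau)}f, J\frac{\partial\Psi_{\mu_j}}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau \to 0,
\]
on account of the bound (2.42). Therefore, we have in the limit:
\[
\int_{-\tau_*}^{+\tau_*}\Bigl\langle f, P_{\Psi^S(\tau)}J\frac{\partial\Psi^S}{\partial\tau}\Bigr\rangle_{L^2(\Sigma)} d\tau = \int_{-\tau_*}^{+\tau_*}\bigl\langle f, P_{\Psi^S(\tau)}U'(\Psi^S(\tau))\bigr\rangle_{L^2(\Sigma)} d\tau, \tag{2.51}
\]
for any $f \in L^2([-\tau_*, \tau_*]; L^2(\Sigma))$. But since the limit is known by the above to be in $W^{1,\infty}([-\tau_*, \tau_*]; L^2)$, it is differentiable (with respect to $\tau$, as a map into $L^2$) almost everywhere (the standard result extends to Hilbert space-valued functions, see, e.g., [4, Prop. 6.41]); the derivative lies in the tangent space $T_{\Psi^S}S_N$, which is the range of the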


projector P S (τ ) . Consequently (2.51) implies that τ → S (τ ) is a solution of (1.22), with equality holding in L 2 for almost every τ . But this in turn implies that τ → S (τ ) is actually continuously differentiable into L 2 , and we have a classical solution of (1.22). Finally for the third stage: we have now identified the limit as a solution of the limiting Hamiltonian system specified using Remark 1.7.1. Choosing smooth co-ordinates on M N as in [46] we see that this is a smooth finite dimensional Hamiltonian system, and as such its solutions (for given initial data) are unique. Therefore all subsequences have the same limit, and so we can assert full convergence without resorting to subsequences. 3. Equations and Identities Related to the Self-Dual Structure Notation change: In this section time does not appear at all, and so the boldface A for the spatial component is not used: i.e. in this section only, A refers to the spatial part of the connection, A = A1 d x 1 + A2 d x 2 . Ginzburg-Landau vortices are critical points of the static Ginzburg Landau energy functional Vλ = vλ (A, )dµg introduced following (1.13). The coupling constant λ > 0 is central to the theory of critical points of the Ginzburg-Landau functional and the value λ = 1 is special as in this case the functional admits the Bogomolny decomposition introduced in Remark 1.6.3. This allows for a detailed understanding of the critical points not available for general values of λ, and the theory of critical points for such general values is incomplete. (There is, however, a substantial literature on the asymptotic behaviour of critical points in the λ → +∞ limit, starting with [6]; see [40] and references therein.) This decomposition of V ≡ V1 has proved to be very useful not only for the analysis of critical points, but also for the associated time-dependent equations of vortex motion. 
For our purposes we need in particular to derive a special form for the operator $L_\psi$ associated to the Hessian of $\mathcal V$, see (3.61).

3.1. Complex structure. To discuss the Bogomolny structure in detail it is useful to use a complex formulation, so we introduce the complex co-ordinate $z = x^1 + i x^2$ for the complex structure $J$ on $\Sigma$. Using this, there is a decomposition of the complex 1-forms $\Omega^1_{\mathbb C} = \Omega^{1,0} \oplus \Omega^{0,1}$ into the $\pm i$ eigenspaces of $J$, see Notation 1.3.1. Let $\Omega^p(L)$ be the space of p-forms taking values in the bundle $L$: then for $p = 1$ there is a similar decomposition, $\Omega^1(L) = \Omega^{1,0}(L) \oplus \Omega^{0,1}(L)$. Applying this decomposition to $D\Phi \in \Omega^1(L)$ we are led to introduce the operator $D^{0,1}$ given by
\[ D^{0,1}\Phi = \tfrac12\big((\nabla_1 - iA_1) + i(\nabla_2 - iA_2)\big)\Phi\, d\bar z = \bar\partial_A\Phi\, d\bar z. \]
For real 1-forms $A_1 dx^1 + A_2 dx^2 \in \Omega^1_{\mathbb R}$ this decomposition reads
\[ A_1 dx^1 + A_2 dx^2 = \alpha\, dz + \bar\alpha\, d\bar z, \]
where $\alpha = \frac{A_1 - iA_2}{2}$, and the map $A \mapsto \alpha$ (resp. $A \mapsto \bar\alpha$) is an $\mathbb R$-linear isomorphism from $\Omega^1_{\mathbb R}$ to $\Omega^{1,0}$ (resp. $\Omega^{0,1}$), and $\|A\|^2_{L^2} = 4\int_\Sigma \alpha\bar\alpha\, e^{-2\rho}\, d\mu_g$. With this notation we can write
\[ \bar\partial_A = \frac{\partial}{\partial\bar z} - i\bar\alpha. \]
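The two pointwise identities of this subsection can be sanity-checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, and the conformal factor $e^{-2\rho}$ is left out, so it checks the algebra at a single point rather than the integrated norm. It confirms that $\alpha\,dz + \bar\alpha\,d\bar z$ agrees with $A_1dx^1 + A_2dx^2$ on a tangent vector and that $A_1^2 + A_2^2 = 4\alpha\bar\alpha$.

```python
def alpha_of(a1, a2):
    """Complex coefficient alpha = (A1 - i A2)/2 of the real 1-form A1 dx1 + A2 dx2."""
    return complex(a1, -a2) / 2


def real_form(a1, a2, v):
    """Evaluate A1 dx1 + A2 dx2 on the tangent vector v = (v1, v2)."""
    return a1 * v[0] + a2 * v[1]


def complex_form(a1, a2, v):
    """Evaluate alpha dz + conj(alpha) dzbar on v, using dz(v) = v1 + i v2."""
    alpha = alpha_of(a1, a2)
    dz = complex(v[0], v[1])
    return alpha * dz + alpha.conjugate() * dz.conjugate()


if __name__ == "__main__":
    a1, a2, v = 0.3, -0.7, (1.5, 2.0)
    # The two evaluations should agree; the imaginary part should vanish.
    print(real_form(a1, a2, v), complex_form(a1, a2, v))
    # Pointwise version of |A|^2 = 4 alpha conj(alpha).
    print(a1 ** 2 + a2 ** 2, 4 * abs(alpha_of(a1, a2)) ** 2)
```

The factor 4 in the norm identity is the same factor that appears in the weighted inner products used later in the section.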


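3.2. The Hessian. The Bogomolny decomposition amounts to the observation that, with $\lambda = 1$,
\[ \mathcal V(A,\Phi) \equiv \mathcal V_1(A,\Phi) = \frac12\int_\Sigma \Big( \big(B - \tfrac12(1-|\Phi|^2)\big)^2 + 4|\bar\partial_A\Phi|^2 e^{-2\rho} \Big)\, d\mu_g + \pi N, \]
where $N = \deg L$. If the following first order equations, called the Bogomolny equations,
\[ \bar\partial_A\Phi = 0, \qquad B - \tfrac12(1-|\Phi|^2) = 0, \tag{3.52} \]
have solutions in a given class, they will automatically minimize $\mathcal V$ within that class. We introduce the nonlinear Bogomolny operator associated to this decomposition,
\[ \mathcal B : \Omega^1_{\mathbb R} \oplus \Omega^0(L) \longrightarrow \Omega^0_{\mathbb R} \oplus \Omega^{0,1}(L), \qquad (A,\Phi) \mapsto \big( B - \tfrac12(1-|\Phi|^2),\ \bar\partial_A\Phi \big). \]
Using the norm $\|(\beta,\eta)\|^2_{L^2} = \int_\Sigma (|\beta|^2 + 4e^{-2\rho}|\eta|^2)\, d\mu_g$ induced from the metric on the target space, we see that $\mathcal V(A,\Phi) = \frac12\|\mathcal B(A,\Phi)\|^2_{L^2} + \pi N$ as in Remark 1.6.3; see [8]. The derivative of $\mathcal B$ at $\psi = (A,\Phi)$ is the map $D\mathcal B_\psi : \Omega^1_{\mathbb R} \oplus \Omega^0(L) \longrightarrow \Omega^0_{\mathbb R} \oplus \Omega^{0,1}(L)$ given by
\[ (\dot A, \dot\Phi) \mapsto \big( *d\dot A + \langle\Phi,\dot\Phi\rangle,\ \bar\partial_A\dot\Phi - i\bar{\dot\alpha}\Phi \big), \tag{3.53} \]
where $\alpha = \frac{A_1 - iA_2}{2}$ and $\dot\alpha = \frac{\dot A_1 - i\dot A_2}{2}$. Using this complex notation allows a simple unified formulation, which takes account of the gauge condition (1.7): this condition is the real part of
\[ 4e^{-2\rho}\bar\partial\dot\alpha - i\Phi\bar{\dot\Phi} = 0, \tag{3.54} \]
while the imaginary part of this expression is just the condition $*d\dot A + \langle\Phi,\dot\Phi\rangle = 0$ appearing in the linearized Bogomolny equations. This suggests the introduction of the operators
\[ \mathcal D_\psi : \Omega^{1,0} \oplus \Omega^0(L) \longrightarrow \Omega^0_{\mathbb C} \oplus \Omega^{0,1}(L), \qquad \mathcal D^*_\psi : \Omega^0_{\mathbb C} \oplus \Omega^{0,1}(L) \longrightarrow \Omega^{1,0} \oplus \Omega^0(L), \tag{3.55} \]
given by
\[ \mathcal D_\psi(\dot\alpha,\dot\Phi) = \big( 4e^{-2\rho}\bar\partial\dot\alpha - i\Phi\bar{\dot\Phi},\ \bar\partial_A\dot\Phi - i\bar{\dot\alpha}\Phi \big), \qquad \mathcal D^*_\psi(\beta,\eta) = \big( -\partial\beta - i\bar\eta\Phi,\ -4e^{-2\rho}\partial_A\eta - i\bar\beta\Phi \big). \tag{3.56} \]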

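We use the real inner product associated to the $L^2$ norms induced from the metric as above, i.e.:
\[ \langle (\dot\alpha,\dot\Phi), (\alpha',\Phi') \rangle_{L^2} = \mathrm{Re}\int_\Sigma \big( 4e^{-2\rho}\bar{\dot\alpha}\alpha' + \bar{\dot\Phi}\Phi' \big)\, d\mu_g \quad \text{on } \Omega^{1,0} \oplus \Omega^0(L), \]
\[ \langle (\beta,\eta), (\beta',\eta') \rangle_{L^2} = \mathrm{Re}\int_\Sigma \big( \bar\beta\beta' + 4e^{-2\rho}\bar\eta\eta' \big)\, d\mu_g \quad \text{on } \Omega^0_{\mathbb C} \oplus \Omega^{0,1}(L). \]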


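Integrating by parts we deduce that
\[ \langle \mathcal D_\psi(\dot\alpha,\dot\Phi), (\beta,\eta) \rangle_{L^2} = \langle (\dot\alpha,\dot\Phi), \mathcal D^*_\psi(\beta,\eta) \rangle_{L^2}, \]
so that $\mathcal D^*_\psi$ is the $L^2$ adjoint of $\mathcal D_\psi$ and
\[ \mathcal D^*_\psi \mathcal D_\psi(\dot\alpha,\dot\Phi) = \Big( -\partial\big(4e^{-2\rho}\bar\partial\dot\alpha - i\Phi\bar{\dot\Phi}\big) - i\big(\overline{\bar\partial_A\dot\Phi} + i\dot\alpha\bar\Phi\big)\Phi,\ -4e^{-2\rho}\partial_A\big(\bar\partial_A\dot\Phi - i\bar{\dot\alpha}\Phi\big) - i\big(4e^{-2\rho}\partial\bar{\dot\alpha} + i\bar\Phi\dot\Phi\big)\Phi \Big) \]
\[ = \Big( -\partial\big(4e^{-2\rho}\bar\partial\dot\alpha\big) + i(\partial_A\Phi)\bar{\dot\Phi} + |\Phi|^2\dot\alpha,\ -4e^{-2\rho}\partial_A\bar\partial_A\dot\Phi + |\Phi|^2\dot\Phi + i4e^{-2\rho}\bar{\dot\alpha}\,\partial_A\Phi \Big). \tag{3.57} \]
We compare this expression with the operator defined in (2.40):
\[ L_\psi : \Omega^{1,0} \oplus \Omega^0(L) \longrightarrow \Omega^{1,0} \oplus \Omega^0(L), \tag{3.58} \]
which defines the Hessian of $\mathcal V$ on the subspace on which the gauge condition (1.7) is satisfied, i.e.,
\[ \langle \dot\psi, L_\psi\dot\psi \rangle_{L^2} = D^2\mathcal V_\psi(\dot\psi,\dot\psi) = \frac{d^2}{d\varepsilon^2}\Big|_{\varepsilon=0} \mathcal V(\psi + \varepsilon\dot\psi), \tag{3.59} \]
for $\dot\psi = (\dot A,\dot\Phi)$ satisfying (1.7). Using mixed real/complex notation for $A/\alpha$, (2.40) implies the following formula:
\[ L_\psi\dot\psi = \Big( -4\partial\big(e^{-2\rho}\bar\partial\dot\alpha\big) + |\Phi|^2\dot\alpha - (i\dot\Phi, D_1\Phi) + i(i\dot\Phi, D_2\Phi),\ -\Delta_A\dot\Phi - \tfrac12(1 - 3|\Phi|^2)\dot\Phi + 2ie^{-2\rho}\dot A\cdot D\Phi \Big). \tag{3.60} \]
Calculate $\dot A\cdot D\Phi = 2\dot\alpha\,\bar\partial_A\Phi + 2\bar{\dot\alpha}\,\partial_A\Phi$ and $-(i\dot\Phi, D_1\Phi) + i(i\dot\Phi, D_2\Phi) = i\bar{\dot\Phi}\,\partial_A\Phi - i\dot\Phi\,\overline{\bar\partial_A\Phi}$, from which it follows that
\[ (L_\psi - \mathcal D^*_\psi\mathcal D_\psi)\dot\psi = \begin{pmatrix} -i\dot\Phi\,\overline{\bar\partial_A\Phi} \\ \big(B - \tfrac12(1-|\Phi|^2)\big)\dot\Phi + 4ie^{-2\rho}\dot\alpha\,\bar\partial_A\Phi \end{pmatrix}. \tag{3.61} \]
(Incidentally, observing that
\[ \mathcal B(A + \dot A, \Phi + \dot\Phi) = \mathcal B(A,\Phi) + \mathcal D_\psi\dot\psi + \big( \tfrac12|\dot\Phi|^2,\ -i\bar{\dot\alpha}\dot\Phi \big), \]
with $\dot\psi = (\dot A,\dot\Phi)$ satisfying (1.7), the identity (3.61) can also be read off from the quadratic part of the Taylor expansion for $\mathcal V(A + \dot A, \Phi + \dot\Phi)$:
\[ \tfrac12\langle \dot\psi, L_\psi\dot\psi \rangle_{L^2} = \tfrac12|\mathcal D_\psi\dot\psi|^2_{L^2} + \big\langle \mathcal B(\psi), \big( \tfrac12|\dot\Phi|^2, -i\bar{\dot\alpha}\dot\Phi \big) \big\rangle_{L^2} \]
\[ = \tfrac12|\mathcal D_\psi\dot\psi|^2_{L^2} + \int_\Sigma \Big( \tfrac12\big(B - \tfrac12(1-|\Phi|^2)\big)|\dot\Phi|^2 + 4e^{-2\rho}\big\langle \bar\partial_A\Phi, -i\bar{\dot\alpha}\dot\Phi \big\rangle \Big)\, d\mu_g, \]
using the inner product on $\Omega^{1,0} \oplus \Omega^0(L)$ defined above.)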


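Corollary 3.2.1. Let $J$ denote the complex structure defined in (1.21). There exists a number $c > 0$, independent of $\psi = (\alpha,\Phi)$ and $\zeta = \dot\psi = (\dot\alpha,\dot\Phi) \in \Omega^{1,0} \oplus \Omega^0(L)$, such that
\[ |\langle J\zeta, L_\psi\zeta \rangle_{L^2}| \le c\, |\mathcal B(\psi)|_{L^\infty} |\zeta|^2_{L^2}. \]

Proof. By (3.61),
\[ |\langle J\zeta, L_\psi\zeta \rangle_{L^2} - \langle \mathcal D_\psi J\zeta, \mathcal D_\psi\zeta \rangle_{L^2}| \le |\mathcal B(\psi)|_{L^\infty} |\zeta|^2_{L^2}. \]
Now the complex structure $J$ written in complex notation, i.e. acting on $\Omega^{1,0} \oplus \Omega^0(L)$, is given by $J(\dot\alpha,\dot\Phi) = (-i\dot\alpha, i\dot\Phi)$. Correspondingly, on $\Omega^0_{\mathbb C} \oplus \Omega^{0,1}(L)$ we introduce the complex structure $J'(\beta,\eta) = (i\beta, -i\eta)$. Then, by observation,
\[ \mathcal D_\psi J\zeta = -J'\mathcal D_\psi\zeta. \tag{3.62} \]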
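The anticommutation relation (3.62), and the skew-symmetry it is combined with, can be checked in a finite-dimensional toy model. In the Python sketch below the differential operators $4e^{-2\rho}\bar\partial$ and $\bar\partial_A$ are replaced by multiplication with fixed complex numbers `x` and `y`; this is an assumption made purely for illustration, since only complex linearity of those terms matters for the identity. All names are hypothetical, and the metric weights in the inner product are omitted.

```python
def D(zeta, phi, x=2.0 + 1.0j, y=-0.5 + 3.0j):
    """Toy version of D_psi: complex-linear terms x*a, y*p plus the
    conjugate-linear zeroth-order couplings to the fixed field phi."""
    a, p = zeta
    return (x * a - 1j * phi * p.conjugate(),
            y * p - 1j * a.conjugate() * phi)


def J(zeta):
    """Complex structure on the domain: (a, p) -> (-i a, i p)."""
    a, p = zeta
    return (-1j * a, 1j * p)


def J_prime(w):
    """Complex structure on the target: (b, e) -> (i b, -i e)."""
    b, e = w
    return (1j * b, -1j * e)


def real_inner(u, v):
    """Unweighted real inner product Re(sum conj(u_k) v_k)."""
    return sum((s.conjugate() * t).real for s, t in zip(u, v))


if __name__ == "__main__":
    phi = 0.8 - 0.3j
    zeta = (1.2 + 0.4j, -0.7 + 2.1j)
    lhs = D(J(zeta), phi)
    rhs = tuple(-c for c in J_prime(D(zeta, phi)))
    print(lhs, rhs)                      # the two pairs should coincide
    w = D(zeta, phi)
    print(real_inner(J_prime(w), w))     # vanishes by skew-symmetry
```

The second printed value vanishes because multiplication by $\pm i$ is skew with respect to the real inner product, which is all the concluding step of the proof uses.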

Therefore, writing $w = \mathcal D_\psi\zeta$, we have $\langle \mathcal D_\psi J\zeta, \mathcal D_\psi\zeta \rangle_{L^2} = -\langle J'w, w \rangle_{L^2} = 0$ by skew-symmetry, and the result follows.

Lemma 3.2.2. Assume there are positive numbers $L, E_0$ such that $\|\Phi\|_{L^2} = L$, and $\mathcal V_\lambda(A,\Phi) = E_0$ and $\lambda > 0$. Then the quadratic forms
\[ \tilde Q_\Phi(\beta) = \int_\Sigma \big( 4|\partial\beta|^2 e^{-2\rho} + |\Phi|^2|\beta|^2 \big)\, d\mu_g \ \text{ on } \Omega^0_{\mathbb C} \quad \text{and} \quad Q_{(A,\Phi)}(\eta) = \int_\Sigma \big( 4e^{-4\rho}|\partial_A\eta|^2 + e^{-2\rho}|\Phi|^2|\eta|^2 \big)\, d\mu_g \ \text{ on } \Omega^{0,1}(L) \]
are strictly positive, and in fact bounded below by (respectively) $C\|\beta\|^2_{H^1}$ and $C\|\eta\|^2_{H^1_A}$, where $C$ is a positive number depending only upon the numbers $L, E_0$.

Proof. We will present the proof for the quadratic form $Q_{(A,\Phi)}(\eta)$ as the other is similar but easier. Clearly $Q_{(A,\Phi)}(\eta) \ge 0$ and in fact $Q_{(A,\Phi)}(\eta) = 0$ if and only if $\eta \equiv 0$ on $\Sigma$ (because if $\partial_A\eta \equiv 0$ then $\eta$ has isolated zeros (as in [28], Sect. 3.5); if $\Phi\eta \equiv 0$ then $\eta \equiv 0$, since $\Phi = 0$ a.e. contradicts $\|\Phi\|_{L^2} = L > 0$). Furthermore, we show that $Q_{(A,\Phi)}(\eta) \ge c|\eta|^2_{L^2}$ for a constant $c$; to be precise there exists $c = c(L, E_0)$ such that
\[ Q_{(A,\Phi)}(\eta) \ge c, \quad \text{for all } \eta \text{ such that } \|\eta\|_{L^2} = 1. \tag{3.63} \]
We will prove this by contradiction. First we obtain some bounds. By gauge invariance we are free to assume that the Coulomb gauge condition $\operatorname{div} A = 0$ holds. With this gauge condition, we have the bound $\|A\|_{H^1} \le c(E_0)$, and so $A$ is bounded in every $L^p$ space. Now use $\|\partial\eta\|_{L^p} \le \|\partial_A\eta\|_{L^p} + \|A\eta\|_{L^p}$ to deduce that $\|\partial\eta\|^2_{L^p} \le C(1 + Q_{(A,\Phi)}(\eta))$ for every $p < 2$, by Hölder's inequality. This in turn implies, by the $L^p$ estimate for the inhomogeneous Cauchy-Riemann system, that $\eta$ is bounded similarly in $L^4$, and so since $A$ is also we can bound $\partial\eta$ in $L^2$ and hence $\eta$ in $H^1$. Finally, since $A$ and $\eta$ are bounded similarly in $L^4$, this implies that $\|\eta\|^2_{H^1_A} \le C(1 + Q_{(A,\Phi)}(\eta))$, with $C$ depending only upon $E_0, L$. To conclude, in Coulomb gauge the $A, \Phi, \eta$ are all bounded in $H^1$ in terms of $L, E_0, Q_{(A,\Phi)}(\eta)$.


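The contradiction argument now starts: assume (3.63) fails. Then, by the bounds just obtained and the Banach-Alaoglu and Rellich theorems, there is a sequence $(A_\nu, \Phi_\nu, \eta_\nu)$ with $\|A_\nu\|_{H^1} + \|\nabla\Phi_\nu\|_{L^2} \le K(E_0, L)$, $\|\Phi_\nu\|_{L^2} = L$ and $\|\eta_\nu\|_{L^2} = 1$, such that
\[ Q_{(A_\nu,\Phi_\nu)}(\eta_\nu) \longrightarrow 0, \]
\[ A_\nu \longrightarrow A \ \text{ weakly in } H^1, \]
\[ \Phi_\nu \longrightarrow \Phi \ \text{ weakly in } H^1 \text{ and strongly in } L^p \text{ for any } p < \infty, \]
\[ \eta_\nu \longrightarrow \eta \ \text{ weakly in } H^1 \text{ and strongly in } L^p. \]
This implies that $|\Phi|_{L^2} = L > 0$, $Q_{(A,\Phi)}(\eta) = 0$ which implies as above that $\Phi = 0$ a.e. and contradicts as above that $|\Phi|_{L^2}$ is constant. This leads to $Q_{(A,\Phi)}(\eta) \ge c_1|\eta|^2_{L^2}$ where $c_1 = c_1(L, E_0)$. Finally just apply the bound above for $\|D\eta\|_{L^2}$ to improve this up to the $H^1_A$ lower bound claimed.

3.3. The Bogomolny foliation. We introduce a foliation associated to the Bogomolny operator, which we regard as a map between the following Hilbert spaces:
\[ \mathcal B : H^1\big(\Omega^1_{\mathbb R} \oplus \Omega^0(L)\big) \longrightarrow L^2\big(\Omega^0_{\mathbb R} \oplus \Omega^{0,1}(L)\big), \qquad (A,\Phi) \mapsto \big( B - \tfrac12(1-|\Phi|^2),\ \bar\partial_A\Phi \big). \]
With this choice of norms $\mathcal B$ is a smooth function. The next result shows that it is a submersion if the energy is close to the minimum value:

Lemma 3.3.1. There exists $\theta_* > 0$ such that $\|\bar\partial_A\Phi\|_{L^2} < \theta_*$ implies that $\operatorname{Ker}\mathcal D^* = \{0\}$, and $\operatorname{Ker}\mathcal D$ is $2N$ dimensional (where $N = \deg L$).

Proof. $\mathcal D^*(\beta,\eta) = 0$ is equivalent to
\[ -\partial\beta - i\bar\eta\Phi = 0, \qquad -4e^{-2\rho}\partial_A\eta - i\bar\beta\Phi = 0. \]
Apply the operations $4\bar\partial$ to the first and $4\bar\partial_A$ to the second of these equations to deduce that
\[ -4e^{-2\rho}\bar\partial\partial\beta + |\Phi|^2\beta - 4ie^{-2\rho}(\bar\partial_A\Phi)\bar\eta = 0, \]
\[ -4\bar\partial_A\big(e^{-2\rho}\partial_A\eta\big) + |\Phi|^2\eta - i(\bar\partial_A\Phi)\bar\beta = 0. \]
The first two terms of these two equations are respectively the Euler-Lagrange operators associated to the quadratic forms $\tilde Q_\Phi(\beta)$ and $Q_{A,\Phi}(\eta)$ studied in the previous lemma. Then we get the estimates
\[ \tilde Q_\Phi(\beta) \le c|\bar\partial_A\Phi|_{L^2}|\beta|_{L^4}|\eta|_{L^4}, \qquad Q_{A,\Phi}(\eta) \le c|\bar\partial_A\Phi|_{L^2}|\beta|_{L^4}|\eta|_{L^4}, \]
which implies the result, since $\tilde Q_\Phi(\beta) \ge c|\beta|^2_{H^1}$ and $Q_{A,\Phi}(\eta) \ge c|\eta|^2_{H^1_A}$.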


The natural geometrical context for the results of this section will now be explained. Define O∗ ≡ {(A, ) ∈ H 1 ( 1 ⊕ 0 (L)) : ∂¯ A L 2 < θ∗ } which is an open set R containing {ψ = (A, ) : B(ψ) = 0} ⊂ H 1 ( 1 ⊕ 0 (L)). Furthermore, the preR vious lemma implies that Dψ : ( 1,0 ⊕ 0 (L)) −→ ( 0 ⊕ 0,1 (L)) is surjective C for ψ ∈ O∗ . By the discussion in the paragraph preceding (3.55), this implies that 1 0 0 0,1 DBψ : ⊕ (L) −→ ⊕ (L) is also surjective for ψ ∈ O∗ , and hence R R the level sets of B form a foliation of O∗ whose leaves have tangent space equal to Ker DBψ by [1, Sect. 3.5 and Sect. 4.4]. The intersection of this tangent space with ˙ ) ˙ ) ˙ : ( A, ˙ satisfies (1.7)} is Ker D . S L ψ = {( A, ∗ D defined in Lemma 3.3.2. Assume ψ ∈ ( 1,0 ⊕ 0 (L)) ∩ O∗ . The operators Dψ ψ C 2 2 (3.57) are self-adjoint operators on L , with domain H , with 2N -dimensional kernel equal to Ker Dψ , and ∗ Dψ Dψ ζ L 2 + ζ L 2 ≥ cζ H 2 .

(3.64)

∗ D = Ker D . Then P (V (ψ)) Let Pψ be the orthogonal spectral projector onto Ker Dψ ψ ψ ψ = 0 and Pψ (J(dχ , iχ µ )) = 0 for any smooth real valued function χ . Finally, if also ψ ( j) ∈ ( 1,0 ⊕ 0 (L))∩O∗ , and sup j ψ ( j) H2 < ∞ and lim j→+∞ ψ ( j) −ψHr = C 0, for all r < 2, the corresponding projectors Pψ ( j) converge to Pψ in L 2 → L 2 operator norm.

Proof. The first assertion and the bound (3.64) follow from Lemma 3.3.1 and standard elliptic theory. The next statement follows by noting that if n ∈ Ker Dψ , then differen tiation of V(ψ) = 21 |B(ψ)|2 dµg + π N yields d V(ψ + sn) = B(ψ), DBψ (n) L 2 = 0, n, V (ψ) L 2 = ds s=0 since Ker Dψ ⊂ Ker DBψ by the discussion preceding (3.55). Next, n ∈ Ker Dψ implies that Pψ (J(dχ , iχ µ )) = 0 since integration by parts reduces this to the fact that n solves the first component of DBψ n = 0 in (3.53). The final statement follows by [25, Sect. IV.3], if it can be established that T j ≡ ∗ D ∗D Dψ ≡ Dψ ψ in the generalized sense of Kato (see ( j) ψ ( j) converges to T [25, Sect. IV.2.6]), or equivalently in the norm resolvent sense: lim (i + T )−1 − (i + T j )−1 L 2 →L 2 = 0.

j→∞

(3.65)

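To verify this convergence, it is convenient first of all to verify it in Coulomb gauge. So let $\tilde\psi^{(j)} = (\tilde A^{(j)}, \tilde\Phi^{(j)}) = e^{i\chi_j}\cdot\psi^{(j)}$ and $\tilde\psi = (\tilde A, \tilde\Phi) = e^{i\chi}\cdot\psi$ be gauge transforms (as defined following (1.11)), such that $\operatorname{div}\tilde A^{(j)} = 0 = \operatorname{div}\tilde A$. The assumed properties of $\psi^{(j)}$ ensure that $\sup_j \|\chi_j\|_{H^2} < \infty$ and that $\lim_{j\to\infty}\|\chi_j - \chi\|_{H^r} = 0$, $\forall\, r < 2$, so that also $\tilde\psi^{(j)} \to \tilde\psi$ in $H^r$ for $r < 2$. Now observe that in Coulomb gauge the formula (3.57) does not involve any derivatives of the connection one-form $A$ at all. From this it is then immediate by inspection that (writing $\tilde T_j \equiv \mathcal D^*_{\tilde\psi^{(j)}}\mathcal D_{\tilde\psi^{(j)}}$, and $\tilde T \equiv \mathcal D^*_{\tilde\psi}\mathcal D_{\tilde\psi}$)
\[ \|(\tilde T - \tilde T_j)\zeta\|_{L^2} \le \delta_j\|\zeta\|_{H^2} \le c\,\delta_j\big( \|\zeta\|_{L^2} + \|\tilde T\zeta\|_{L^2} \big), \tag{3.66} \]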


where $\delta_j \to 0$ as $j \to +\infty$. But this last fact implies (by [25, Theorems IV.2.24-25]) that $\tilde T_j$ converges to $\tilde T$ in the generalized sense, and hence in the resolvent sense:
\[ \lim_{j\to\infty} \big\| (i + \tilde T)^{-1} - (i + \tilde T_j)^{-1} \big\|_{L^2 \to L^2} = 0. \tag{3.67} \]
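The link between norm resolvent convergence and convergence of the kernel projectors can be illustrated in a two-dimensional toy model. The Python sketch below is an assumption-laden illustration, not the operators $\mathcal D^*_\psi\mathcal D_\psi$ of the text: `rotated_T(theta)` is a hypothetical symmetric matrix with eigenvalues 0 and 1 whose kernel direction is rotated by `theta`. The resolvent $(i + T)^{-1}$ is computed directly, and the kernel projector is recovered from the resolvent by a Riesz contour integral around the eigenvalue 0, discretised with the trapezoidal rule. The Frobenius norm is used as a simple upper bound for the operator norm.

```python
import cmath
import math


def inverse_2x2(m):
    """Inverse of a 2x2 complex matrix given as a tuple of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def shifted(T, z, sign=1):
    """Return the matrix z*I + sign*T."""
    return ((z + sign * T[0][0], sign * T[0][1]),
            (sign * T[1][0], z + sign * T[1][1]))


def frobenius_distance(m, n):
    """Frobenius norm of m - n (an upper bound for the operator norm)."""
    return math.sqrt(sum(abs(m[r][c] - n[r][c]) ** 2
                         for r in range(2) for c in range(2)))


def rotated_T(theta):
    """Symmetric matrix with eigenvalue 0 on (cos, sin) and 1 on (-sin, cos)."""
    s, c = math.sin(theta), math.cos(theta)
    return ((s * s, -s * c), (-s * c, c * c))


def resolvent(T):
    """The resolvent (i + T)^{-1} appearing in (3.65) and (3.67)."""
    return inverse_2x2(shifted(T, 1j))


def kernel_projector(T, radius=0.5, nodes=64):
    """Riesz projector (1/(2 pi i)) * contour integral of (z - T)^{-1} dz
    over |z| = radius, which encloses only the eigenvalue 0."""
    acc = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        z = radius * cmath.exp(2j * math.pi * k / nodes)
        R = inverse_2x2(shifted(T, z, sign=-1))
        for r in range(2):
            for c in range(2):
                acc[r][c] += z * R[r][c] / nodes
    return tuple(tuple(row) for row in acc)


if __name__ == "__main__":
    T = rotated_T(0.0)
    for j in (1, 10, 100):
        Tj = rotated_T(1.0 / j)
        print(j,
              frobenius_distance(resolvent(T), resolvent(Tj)),
              frobenius_distance(kernel_projector(T), kernel_projector(Tj)))
```

Both printed distances shrink together as `j` grows, which is the finite-dimensional shadow of the statement that resolvent convergence yields convergence of the spectral projectors.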

This would establish the convergence of the corresponding spectral projectors in Coulomb gauge. To go back to the original $\psi_j$ it is just necessary to make use of the following gauge invariance property: on $\zeta = (\dot\alpha,\dot\Phi)$ the induced action of the gauge group is $g \bullet (\dot\alpha,\dot\Phi) = (\dot\alpha, g\dot\Phi)$ for any $S^1$ valued function $g$, and
\[ \tilde T\big( e^{i\chi} \bullet \zeta \big) = e^{i\chi} \bullet (T\zeta), \]
and similarly with $T_j, \chi_j$ in place of $T, \chi$. This gauge invariance property implies that
\[ (i + T_j)^{-1} = e^{-i\chi_j} \circ (i + \tilde T_j)^{-1} \circ e^{i\chi_j} \quad \text{and} \quad (i + T)^{-1} = e^{-i\chi} \circ (i + \tilde T)^{-1} \circ e^{i\chi}, \]
where by $\circ$ we mean operator composition, and $e^{i\chi}$ is shorthand for the operator $e^{i\chi}\bullet$ etc. Finally, using $\lim_{j\to\infty}\|\chi_j - \chi\|_{H^r} = 0$, $\forall\, r < 2$, we see that (3.66) and (3.67) imply (3.65), completing the proof.

Appendix

A.1. Operators. To describe in detail the Laplacian operators which appear in the text, we assume $\Sigma$ to be covered by an atlas of charts $U_\alpha$ on each of which is a local trivialisation of $L$ determined by a choice of a local unitary frame. (A smooth section $\Phi$ of $L$ then corresponds to a family of smooth functions $\Phi_\alpha : U_\alpha \to \mathbb C$ so that on $U_\alpha \cap U_\beta$ we have $\Phi_\alpha = e^{i\theta_{\alpha\beta}}\Phi_\beta$ with $e^{i\theta_{\alpha\beta}} : U_\alpha \cap U_\beta \to S^1$ smooth.) We assume given a smooth connection $D = \nabla - iA$ on $L$ acting as a covariant derivative operator on sections of $L$. Working in such a chart, and suppressing the index $\alpha$, the Laplacian on sections of $L$ is given by
\[ -\Delta_A\Phi = -\frac{1}{\sqrt g} D_j\big( g^{ij}\sqrt g\, D_i\Phi \big) = -e^{-2\rho}(D_i D_i\Phi). \tag{A.1} \]
This satisfies $\langle -\Delta_A\Phi, \dot\Phi \rangle_{L^2} = \frac{d}{d\varepsilon}\frac12|D(\Phi + \varepsilon\dot\Phi)|^2_{L^2}\big|_{\varepsilon=0}$.

Next we need the Laplacian on one-forms. Starting with $A = A_1 dx^1 + A_2 dx^2 \in \Omega^1_{\mathbb R}$, the negative Laplacian is the Euler-Lagrange operator associated to the Dirichlet form $\frac12\int_\Sigma (|\operatorname{div} A|^2 + |dA|^2)\, d\mu_g$ (with the norms inside the integral determined by $g$ in the standard way). Transferring to complex form $\alpha = \frac12(A_1 - iA_2) \in \Omega^{1,0}$, this Dirichlet form is just $I(\alpha) = 8\int_\Sigma e^{-4\rho}\,\overline{\bar\partial\alpha}\,\bar\partial\alpha\, d\mu_g$. The corresponding negative Laplacian $-\Delta^{1,0}$ is then defined by $\langle -\Delta^{1,0}\alpha, \beta \rangle_{L^2} = \frac{d}{d\varepsilon} I(\alpha + \varepsilon\beta)\big|_{\varepsilon=0}$, where we use the induced inner product on $\Omega^{1,0}$ as in Sect. 3. This leads to the following formula for the negative Laplacian $-\Delta^{1,0}$ on $\alpha \in \Omega^{1,0}$:
\[ -\Delta^{1,0}\alpha = -4\partial\big( e^{-2\rho}\bar\partial\alpha \big), \]
which is precisely the operator appearing in Sect. 3. Similarly, on $\Omega^{0,1}(L)$ the negative Laplacian is
\[ -\Delta^{0,1}_A\eta = -4\bar\partial_A\big( e^{-2\rho}\partial_A\eta \big), \]
which is the operator in (2.34).


A.2. Norms and inequalities. We define the Sobolev norms defined with the covariant derivative D = ∇A = ∇ − iA. (We write ∇A in place of D for emphasis here.) The first Sobolev norm is defined by ||2 + |∇ A |2 dµg . (A.2) ||2H 1 =

A

In the above integral the inner products are the standard ones induced from h and g. k, p The higher norms HA2 , . . . are defined similarly, as are the WA norms for integral k p and any p ∈ [1, ∞]. The L norms of the higher covariant derivatives arising from the connections ∇A and ∇ are related as expressed schematically in the following: ∇ L p ≤ ∇A L p + cA L ∞ L p , ∇∇ L p ≤ ∇A ∇A L p + cA L ∞ ∇A L p + c(1 + ∇A L p + A2L ∞ L p ), ∇∇∇ L p ≤ ∇A ∇A ∇A L p + cA L ∞ ∇A ∇A L p + c(1 + ∇A L ∞ + A2L ∞ )∇A L p +c 1 + ∇ 2 A L q L r + A3L ∞ L p ,

(A.3) (A.4)

(A.5)

where q −1 + r −1 = p −1 . We now collect together some inequalities from [13]. The system of equations B= f 1

div A = g

(A.6)

(where as above div : → is minus the adjoint of d) is a first order elliptic system which can be solved for A subject to the condition on f dµg dictated by an integer N , the degree of L. It can be rewritten 0

dA = ( f − b)dµg

div A = g

(A.7)

and solved via Hodge decomposition as long as the right hand sides have zero integral. There is a solution unique up to addition of harmonic 1-forms which satisfies AW 1, p ≤ c p (1 + f L p + g L p ) for p < ∞. Lemma A.2.1 (Covariant Sobolev and Gagliardo-Nirenberg inequalities). For (, g) as above and for (A, ) ∈ (H 1 × HA2 )(), then ∇A ∈ L 4 () and ∇A L 4 ≤ c∇A H 1

(A.8)

A

and also for all 1 ≤ p < ∞, HA2 → WA → L ∞ continuously on . Also 1/2 1/2 1/2 ∇A L 4 ≤ c∇A L 2 ∇A L 2 + ∇A ∇A L 2 , 1, p

(A.9)

where c depends only on (, g). Lemma A.2.2 (Covariant version of the Garding inequality). For = (A, ) such that the norms on appearing below are finite we have 1/2

∇A ∇A L 2 ≤ A L 2 + cB L ∞ ∇A L 2 1/2

1/2

1/2

+c L ∞ ∇A L 2 ∇ B L 2 , where c is a number depending only on (, g).

(A.10)


Lemma A.2.3 (Covariant version of the Brezis-Gallouet inequality). If A ∈ H 1 () and ∈ HA2 () then (A.11) L ∞ () ≤ c 1 + H 1 ln(1 + H 2 ) , A

A

where $c$ depends only on $(\Sigma, g)$.

A.3. Global existence results and different choices of gauge. In this section we will summarize the existence theory for (1.5) from [5] and [13], and explain how Theorem 1.4.1 can be deduced from it. Existence theory can be worked out using various gauge conditions, and a choice of gauge is usually made to facilitate the calculations. The simplest condition for the statement of the theorem, which also is convenient if we wish to make the Hamiltonian structure manifest (see Sect. 1.5), is the temporal gauge condition $A_0 = 0$; however, the regularity is stronger in Coulomb gauge $\operatorname{div} A = 0$. We have the following statements.

Theorem A.3.1 (Global existence in temporal gauge). Given data $\Phi(0) \in H^2(\Sigma)$ and $A(0) \in H^1(\Sigma)$, there exists a global solution for the Cauchy problem for (1.5) satisfying $A_0 = 0$, with regularity $\Phi \in C\big([0,\infty); H^2(\Sigma)\big) \cap C^1\big([0,\infty); L^2(\Sigma)\big)$ and $A \in C^1\big([0,\infty); H^1(\Sigma)\big)$. Furthermore, it is the unique such solution satisfying $A_0 = 0$ and satisfies the estimate
\[ \|\Phi(t)\|_{H^2(\Sigma)} \le c\, e^{\alpha e^{\beta t}}, \]
for some positive constants $c, \alpha, \beta$ depending only on $(\Sigma, g)$, the equations, and the initial data.

This can be derived from Theorem 1.1 in [13], by applying a gauge transformation to put the solution obtained there into temporal gauge. To be precise the cited result gives a global solution $(a_0, a, \phi)$ of the system (1.5) satisfying the parabolic gauge condition $a_0 = \operatorname{div} a$, and the gauge invariant growth estimate
\[ \|\phi\|_{H^2_a(\Sigma)}(t) \le c\, e^{\alpha e^{\beta t}}. \tag{A.12} \]

The solution satisfies $\phi \in C\big([0,\infty); H^2(\Sigma)\big) \cap C^1\big([0,\infty); L^2(\Sigma)\big)$, $a \in C\big([0,\infty); H^1(\Sigma)\big)$ and $a_0 \in C\big([0,\infty); L^2(\Sigma)\big)$. Now define $\chi \in C^1\big([0,\infty); L^2(\Sigma)\big)$ by $\partial_t\chi + a_0 = 0$ and $\chi(0) = 0$. Define $(\Phi, A) = (\phi e^{i\chi}, a + d\chi)$: this gives a solution to (1.5) satisfying the properties asserted in Theorem A.3.1. (Most of this can be read off immediately, except perhaps to verify that $A \in C^1\big([0,\infty); H^1(\Sigma)\big)$, but this follows from the first equation in (1.5), using the fact that $A_0 = 0$ and the right hand side is continuous into $L^2$.)

An alternative approach to local existence is given in [5], where it is shown that, in Coulomb gauge, systems of the type (1.5) can be put in the form of an abstract evolution equation to which Kato's theory ([26]) applies. This yields the existence of a local solution denoted $(A', \Phi')$ with $\Phi'$ continuous into $H^2$ on a time interval of length determined by the $H^2$ norm of the initial data. But the estimate (A.12) above is gauge invariant, and allows continuation of the local solution to provide a global solution in Coulomb gauge with regularity $\Phi' \in C\big([0,\infty); H^2(\Sigma)\big) \cap C^1\big([0,\infty); L^2(\Sigma)\big)$


and A ∈ C [0, ∞); H 3 () ∩ C 1 [0, ∞); H 1 () satisfying the Coulomb gauge condition div A = 0. Finally, we explain how to obtain Theorem 1.4.1 from these results. Given a solution A , in Coulomb gauge, as just described, define χ (t, x) to be the solution of ˙ − i , ˙ = −i , ˙ , (− + | |2 )χ˙ = div A with χ (0, x) = 0. Then it is easy to verify that A = A + dχ , = eiχ satisfies (1.7). Under the condition (t)2L 2 () = L > 0 the solution exists and is unique at time t; this condition is natural because (t) L 2 () is independent of time for solutions of (1.5). Now by the above mentioned Coulomb gauge regularity and the basic estimates for the Laplacian we deduce that χ ∈ C 1 ([0, ∞); H 2 ). This gives the global existence theorem in the gauge stated in Theorem 1.4.1. References 1. Abraham, R., Marsden, J., Ratiu, T.: Manifolds, Tensor Analysis and Applications. New York: SpringerVerlag, 1988 2. Arovas, D., Schrieffer, R., Wilczek, F., Zee, A.: Statistical mechanics of anyons. Nucl. Phys. B 251, 117–126 (1985) 3. Aitchison, I.J.R., Ao, P., Thouless, D., Zhu, X.: Phys. Rev B. 51, 6531 (1995) 4. Benyamini, Y., Lindenstrauss, J.: Geometric Nonlinear Functional Analysis. Providence, RI: American Mathematical Society, 2000 5. Berge, L., de Bouard, A., Saut, J.: Blowing up time-dependent solutions of the planar Chern-Simons gauged nonlinear Schrödinger equation. Nonlinearity 8, 235–253 (1995) 6. Bethuel, F., Riviere, T.: Vortices for a variational problem related to superconductivity. Ann. Inst. H. Poincaré Anal. Non Linéaire 12(3), 243–303 (1995) 7. Bogomolny, E.: Stability of Classical Solutions. Sov. J. Nucl. Phys. 24, 861–870 (1976) 8. Bradlow, S.: Vortices in holomorphic line bundles and closed Kaehler manifolds. Commun. Math. Phys. 118, 1–17 (1990) 9. Bradlow, S., Daskalopoulos, G.: Moduli of stable pairs for holomorphic bundles over Riemann surfaces. Internat. J. Math. 2, 477–513 (1991) 10. Brezis, H., Gallouet, T.: Nonlinear Schrödinger evolution equation. Nonlin. 
Anal. T.M.A. 4(4), 677–681 (1980) 11. Deser, S., Jackiw, R., Templeton, S.: Topologically massive gauge theories. Ann. Phys. 140, 372– 411 (1982) 12. Demoulini, S., Stuart, D.: Gradient flow of the superconducting Ginzburg-Landau functional on the plane. Commun. Anal. Geom. 5(1), 121–198 (1997) 13. Demoulini, S.: Global existence for a nonlinear Schrödinger-Chern-Simons system on a surface. Ann. Inst. H. Poincaré Anal. Non Linéaire 24(2), 207–225 (2007) 14. Demoulini, S., Stuart, D.M.A.: Existence and regularity for generalised harmonic maps associated to a nonlocal polyconvex energy of Skyrme type. Calc. Var. PDE 30(4), 523–546 (2007) 15. Froehlich, J., Marchetti, P-A.: Commun. Math. Phys. 121, 177–221 (1989) 16. Froehlich, J., Studer, U.M.: U (1) × SU (2) - gauge invariance of non-relativistic quantum mechanics and generalized Hall effects. Commun. Math. Phys. 148, 553–600 (1992) 17. Dunne, G.: Aspects of Chern-Simons theory. In: Les Houches Lectures on Topological Aspects of Low Dimensional Systems, EDP Sci., Les Ulis, 1998. Available online at http://arxiv.org/abs/hep-th/ 9902115v1, 1991 18. Girvin, S.: The Quantum hall effect: novel excitations and broken symmetries. In: Les Houches Lectures on Topological Aspects of Low Dimensional Systems, EDP Sci., Les Ulis, 1998. Available online at http:// arxiv.org/abs/cond-mat/9907002v1[cond-mat.mes-hall], 1999 19. Gustafson, S., Sigal, I.M.: The stability of magnetic vortices. Commun. Math. Phys. 212, 257–275 (2000) 20. Haskins, M., Speight, J.M.: The geodesic approximation for lump dynamics and coercivity of the Hessian for harmonic maps. J. Math. Phys. 44, 3470–3494 (2003) 21. Hassaïne, M., Horvathy, P.: Non-relativistic Maxwell-Chern-Simons vortices. Ann. Phys. 263(2), 276–294 (1998) 22. Horvathy, P., Zhang, P.: Vortices in abelian Chern-Simons gauge theory. http://arxiv.org/abs/0811. 2094v3[hep-th], 2009


23. Jackiw, R., Pi, So-Young: Self-dual Chern-Simons solitons. In: Low-Dimensional Field Theories and Condensed Matter Physics (Kyoto,1991), Progr. Theoret. Phys. Suppl. 107, 1–40 (1992) 24. Jost, J.: Riemannian Geometry and Geometric Analysis. Berlin-Heidlberg-NewYork: Springer-Verlag, 1988 25. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidlberg-NewYork: Springer-Verlag, 1980 26. Kato, T.: Quasi-linear Equations of Evolution with Applications to Partial Differential Equations. Springer Lecture Notes in Mathematics 448 Berlin-Heidlberg-NewYork: Springer-Verlag, 1975, pp. 27– 50 27. Krusch, S., Sutcliffe, P.: Schrödinger-Chern-Simons vortex dynamics. Nonlinearity 19, 1515–1534 (2006) 28. Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhauser, 1982 29. Majda, A., Bertozzi, A.: Vorticity and Incompressible Fluid Flow. Cambridge: Cambridge University Press, 2001 30. Manton, N.: First order vortex dynamics. Ann. Phys. 256, 114–131 (1997) 31. Manton, N., Sutcliffe, P.: Topological Solitons. Cambridge: Cambridge University Press, 2004 32. Nagosa, N.: Quantum Field Theory in Condensed Matter Physics. Berlin: Springer, 1999 33. Palais, R.: Foundations of Global Nonlinear Analysis. Mathematics lecture note series, NewYork: W.A. Benjamin, 1968 34. Prange, R., Girvin, S.: The Quantum Hall Effect. 2nd edition, New York: Springer-Verlag, 1990 35. Reed, M., Simon, B.: Functional Analysis. San Diego CA: Academic Press, 1980 36. Rodnianski, I., Sterbenz, J.: On the formation of singularities in the critical O(3) σ -model. http://arxiv. org/abs/math/0605023v3[math.AP], 2008 37. Romao, N.: Quantum Chern-Simons vortices on a sphere. J. Math. Phys. 42, 3445–3469 (2001) 38. Romao, N., Speight, J.M.: Slow Schrödinger dynamics of gauged vortices. Nonlinearity 17(4), 1337–1355 (2004) 39. Rubin, H., Ungar, P.: Motion under a strong constraining force. Commun. Pure Appl. Math. 10, 65–87 (1957) 40. 
Sandier, E., Serfaty, S.: Vortices in the Magnetic Ginzburg-Landau Model. Progress in Nonlinear Differential Equations and their Applications 70 Basel-Boston: Birkhauser, 2007 41. Sondhi, S.L., Karlhede, A., Kivelson, S.A., Rezayi, E.H.: Skyrmions and the crossover from the integer to fractional quantum Hall effect at small Zeeman energies. Phys. Rev. B 47, 16419 (1993) 42. Stone, M.: Superfluid dynamics of the fractional quantum Hall state. Phys. Rev. B 42, 1, 212 (1990) 43. Stone, M.: Int. J. Mod. Phys. B 9, 1359 (1995) 44. Stuart, D.: Dynamics of Abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994) 45. Stuart, D.: The geodesic approximation for the Yang-Mills-Higgs equations. Commun. Math. Phys. 166, 149–190 (1994) 46. Stuart, D.: Periodic solutions of the Abelian Higgs model and rigid rotation of vortices. Geom. Funct. Anal. 9, 1–28 (1999) 47. Stuart, D.: Analysis of the adiabatic limit for solitons in classical field theory. Proc R Soc A 463, 2753–2781 (2007) 48. Taylor, M.: Partial Differential Equations. Applied Mathematical Sciences, Vol 117, Berlin-HeidelbergNewYork: Springer-Verlag, 1996 49. Tsvelik, A.M.: Quantum Field Theory in Condensed Matter Physics. Cambridge: Cambridge University Press, 2003 50. Wilczek, F.: Quantum mechanics of fractional spin particles. Phys. Rev. Lett. 49, 1, 957 (1982) 51. Zhang, S.C.: The Chern-Simons-Landau-Ginzburg theory of the fractional quantum Hall effect. Int. J. Mod. Phys. B 6(1), 43–77 (1992) 52. Zhang, S.C., Hansson, T.H., Kivelson, S.: Effective field theory model for the fractional quantum Hall effect. Phys. Rev. Lett. 62, 82 (1989), Erratum: Phys. Rev. Lett. 62, 980 (1989) Communicated by I. M. Sigal

Commun. Math. Phys. 290, 633–649 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0866-5

Communications in

Mathematical Physics

A Toroidal Magnetic Field Theorem

R. Kaiser

Fakultät für Mathematik und Physik, Universität Bayreuth, D-95440 Bayreuth, Germany. E-mail: [email protected]

Received: 29 May 2008 / Accepted: 17 March 2009
Published online: 1 July 2009 – © Springer-Verlag 2009

Abstract: In the framework of magnetohydrodynamics the kinematic dynamo problem in a spherical fluid volume as well as in a plane layer is considered. On the premises of a purely toroidal magnetic field a nonlinear evolution equation for the toroidal scalar is derived. In this equation the flow field is constrained in such a way that no poloidal magnetic field can arise, but is otherwise arbitrary; the magnetic diffusivity is assumed to be spherically (horizontally, resp.) symmetric. Solutions of this problem are of particular interest since the magnetic field is confined to the fluid volume and therefore invisible to an external observer. It is proved in this paper that the maximum norm of smooth solutions of this equation decays exponentially fast to zero. Thus, dynamo solutions, i.e. nondecaying solutions, of this type do not exist.

1. Introduction

It is common belief that the magnetic field of most stars and planets is generated by motions in an approximately spherically symmetric liquid conducting zone deep in the interior of the celestial body. The only trace of this process detectable for an external observer is the poloidal magnetic field component which extends into the surrounding vacuum, whereas the toroidal field component remains trapped in the conducting zone. So, a dynamo in a celestial body generating a purely toroidal magnetic field might remain invisible for an external observer. However, the mechanism of field generation as described by the induction equation is usually explained in terms of an interaction of toroidal and poloidal magnetic field components (cf. [1]). So, several researchers in the field conjectured that dynamos generating a purely toroidal magnetic field do not exist [2–6]. This conjecture is similar to other well-established “antidynamo” theorems [7], e.g. Cowling’s theorem excluding dynamos generating a purely axisymmetric magnetic field [8] or the “toroidal velocity theorem”, which excludes field generation if the fluid motion is purely toroidal [2,9]. The present situation, however, is more complicated than those described by these theorems.

634

R. Kaiser

On the assumption of a purely toroidal magnetic field one obtains from the induction equation an evolution equation for the toroidal scalar T and, additionally, a constraint equation which is not automatically preserved by the evolution equation. In an earlier publication [10] a solution of this constraint has been proposed resulting in a nonlinear evolution equation for T . The character of this equation is that of a second order parabolic equation with vanishing zeroth-order term and first-order terms which have neither pure advection form nor pure divergence form. It is this mixed form of the first-order terms, which prevents the straightforward application of maximum principles or L 2 - (i.e. energy) or L 1 -estimates, which are the mathematical essence of the above mentioned antidynamo results (cf. [11–13]). Instead, in [10] the monotonous decay of T in a mixed L 1 -L ∞ -norm, corresponding to the structure of the first-order terms, has been shown. From a mathematical point of view, however, the proof of this decay result cannot be considered as rigorous, as it depends on assumptions about T , which are not guaranteed for arbitrary (smooth) solutions of the governing equation. It is the aim of the present paper to make the arguments in [10] rigorous. Moreover, sharper decay results are derived, in particular, exponential decay of T to zero in the maximum norm. The plan of the paper is as follows: In Sect. 2 the mathematical framework of the dynamo problem is introduced, in particular, the induction equation and the poloidal/toroidal decomposition of solenoidal vector fields. Using this decomposition and a solution of the constraint, the governing equation for the toroidal scalar T is derived, and previous results about solutions of this equation are reviewed. 
Section 3 presents a method to eliminate from a parabolic equation of mixed advection/divergence form those variables which correspond to the advection part: taking maxima and minima of a smooth solution with respect to these variables defines a “reduced function”, which itself is a weak subsolution of a “reduced equation” of pure divergence form. In Sect. 4 it is shown that decay results in [14] apply with minor modifications to solutions of this reduced equation. Finally, in Sect. 5, the foregoing results are applied to the governing equation of the toroidal scalar, and corresponding decay results are formulated in the case of a spherical fluid domain as well as for a layer-like domain.

2. Mathematical Setting and Previous Results

Given a volume $V \subset \mathbb{R}^3$ filled with a fluid of conductivity $\lambda > 0$, which is in prescribed motion according to a flow field $\mathbf{v}$, the kinematic dynamo problem asks for solutions $\mathbf{B}$ which do not decay in time (the so-called dynamo solutions) of the following initial-value problem:
\[
\begin{cases}
\partial_t \mathbf{B} = \nabla \times (\mathbf{v} \times \mathbf{B}) - \nabla \times (\lambda \nabla \times \mathbf{B}), \quad \nabla \cdot \mathbf{B} = 0 & \text{in } V \times (0, \infty),\\
\nabla \times \mathbf{B} = 0, \quad \nabla \cdot \mathbf{B} = 0 & \text{in } (\mathbb{R}^3 \setminus V) \times (0, \infty),\\
\mathbf{B} \text{ continuous} & \text{in } \mathbb{R}^3 \times (0, \infty),\\
|\mathbf{B}(\mathbf{r}, \cdot)| \to 0 & \text{for } |\mathbf{r}| \to \infty,\\
\mathbf{B}(\cdot, 0) = \mathbf{B}_0, \quad \nabla \cdot \mathbf{B}_0 = 0 & \text{on } V \times \{t = 0\}.
\end{cases}
\tag{1}
\]
The induction equation (1)$_1$ describes the generation of the magnetic field $\mathbf{B}$ by the motion of a conducting fluid. Outside the fluid volume we assume no further sources of magnetic field. Thus, $\mathbf{B}$ continues outside the fluid volume as a vacuum field, which vanishes at (spatial) infinity (cf. [12]). The fluid volume V is in the following either a ball $B_R$ of radius $R > 0$ or a plane layer $L := \mathbb{R}^2 \times [0, l]$ of thickness l. In the latter case $\mathbf{B}$ and $\mathbf{v}$ are assumed to be

Toroidal Magnetic Field Theorem


periodic in the “horizontal” variables x and y with periodicity cell $P = [0, l_x] \times [0, l_y]$, and condition (1)$_4$ refers now to the “vertical” variable z, i.e. $|\mathbf{r}| \to \infty$ is replaced by $|z| \to \infty$. The conductivity is assumed to be spherically symmetric, i.e. $\lambda = \lambda(r, t)$, respectively $\lambda = \lambda(z, t)$ in the plane case, and bounded from below:
\[
\lambda \ge \lambda_0 > 0. \tag{2}
\]

The flow field has a vanishing normal component at the boundary of the fluid volume, $\mathbf{n} \cdot \mathbf{v}|_{\partial V} = 0$, and satisfies some constraint (see below), but need not be divergence-free or symmetric in any sense. The plane and the spherical version of the dynamo problem have much in common; for instance, all known antidynamo theorems are equally valid in a ball or a plane layer (but not in a cylinder). The spherical case is closer to applications, whereas the plane case often serves as a toy model avoiding the complications due to spherical geometry.

We deal with the plane case first. The poloidal/toroidal decomposition reads in this case [15]:
\[
\mathbf{B} = \nabla \times (\nabla \times S\, \mathbf{e}_z) + \nabla \times T\, \mathbf{e}_z + \mathbf{b}(z)
= \big(\partial_x \partial_z S + \partial_y T + b_x,\; \partial_y \partial_z S - \partial_x T + b_y,\; -\Delta_h S + b_z\big)^T .
\]
Here $\mathbf{e}_z$ is the unit vector in z-direction, $\Delta_h$ means $\partial_x^2 + \partial_y^2$, and $\mathbf{b}$ is the horizontal mean of $\mathbf{B}$, $\mathbf{b} = \langle \mathbf{B} \rangle := |P|^{-1} \int_P \mathbf{B}\, dx\, dy$, depending only on z ($b_z$ being constant). Under the conditions $\langle S \rangle = \langle T \rangle = 0$ this decomposition is unique and allows the equivalent formulation of (1)$_1$ in terms of the toroidal scalar T, the poloidal one S, and the horizontal mean $\mathbf{b}$. If S and $\mathbf{b}$ are assumed to be zero, the horizontal mean of (1)$_1$, and the z-components of (1)$_1$ and of its curl yield:
\[
\partial_z \langle v_z\, \partial_x T \rangle = \partial_z \langle v_z\, \partial_y T \rangle = 0, \tag{3}
\]
\[
\nabla T \times \nabla v_z \cdot \mathbf{e}_z = 0, \tag{4}
\]
\[
\Delta_h \big(\partial_t + \mathbf{v} \cdot \nabla - \nabla \cdot \lambda \nabla\big) T = \nabla \times (\nabla T \times \nabla v_z) \cdot \mathbf{e}_z . \tag{5}
\]
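The component form of the plane-layer decomposition can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the scalars `S` and `T`, the step `H` and the helper names are hypothetical choices, and the mean field b is set to zero. Because central differences commute, the discrete divergence vanishes up to rounding, so this checks the bookkeeping of the three components rather than proving anything.

```python
import math

H = 1e-2  # finite-difference step (illustrative choice, not from the paper)

def d(f, axis, h=H):
    """Return the central-difference derivative of f(x, y, z) along axis 0, 1 or 2."""
    def df(x, y, z):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (f(*plus) - f(*minus)) / (2.0 * h)
    return df

# Hypothetical smooth scalars chosen only for illustration.
def S(x, y, z):
    return math.sin(x) * math.cos(2.0 * y) * math.sin(z)

def T(x, y, z):
    return math.cos(x) * math.sin(y) * z * (1.0 - z)

def field(S, T):
    """Components (d_x d_z S + d_y T, d_y d_z S - d_x T, -Delta_h S), with b = 0."""
    bx = lambda x, y, z: d(d(S, 2), 0)(x, y, z) + d(T, 1)(x, y, z)
    by = lambda x, y, z: d(d(S, 2), 1)(x, y, z) - d(T, 0)(x, y, z)
    bz = lambda x, y, z: -(d(d(S, 0), 0)(x, y, z) + d(d(S, 1), 1)(x, y, z))
    return bx, by, bz

def divergence(b, x, y, z):
    """Discrete divergence of the vector field b = (bx, by, bz) at one point."""
    bx, by, bz = b
    return d(bx, 0)(x, y, z) + d(by, 1)(x, y, z) + d(bz, 2)(x, y, z)

B = field(S, T)
div_at_point = divergence(B, 0.3, 0.7, 0.4)
```

The value `div_at_point` should be zero up to floating-point rounding, while the individual components of `B` are of order one.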

Equations (3) and (4) constrain (the z-component of) the flow field in such a way that neither a poloidal magnetic field nor a mean field can arise. Unfortunately, these constraints are not preserved by the evolution equation (5), and must be incorporated in some way into (5).¹ A function $w(\tau, z, t)$ allowing the representation
\[
v_z(x, y, z, t) = w(T(x, y, z, t), z, t) =: w_T(x, y, z, t) \tag{6}
\]

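for $v_z$ clearly solves (3) and (4), and is motivated by the fact that (4) implies the existence of such a function locally if $\mathbf{e}_z \times \nabla T \neq 0$. With (6) the right-hand side in (5) can be reformulated:
\[
\begin{aligned}
(\mathbf{e}_z \times \nabla) \cdot [\nabla T \times \nabla w_T]
&= -(\mathbf{e}_z \times \nabla) \cdot \big[\partial_z w|_{\tau = T}\; \mathbf{e}_z \times \nabla T\big]
 = -(\mathbf{e}_z \times \nabla) \cdot (\mathbf{e}_z \times \nabla) \int_0^T \partial_z w \, d\tau \\
&= -\Delta_h \big[\partial_z W|_{\tau = T}\big]
 = -\Delta_h \big[\partial_z W_T - v_z\, \partial_z T\big]
\end{aligned}
\]
with $W(T, z, t) := \int_0^T w(\tau, z, t)\, d\tau$ and $W_T(x, y, z, t) := W(T(x, y, z, t), z, t)$. Thus (5) takes the form
\[
\Delta_h \big[ (\partial_t + \mathbf{v} \cdot \nabla_h - \nabla \cdot \lambda \nabla)\, T + \partial_z W_T \big] = 0, \tag{7}
\]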

¹ This is different from the divergence constraint in (1)$_1$. Imposing the constraint on the initial value of the induction equation already guarantees a divergence-free solution.


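where $\nabla_h = \nabla - \mathbf{e}_z \partial_z$. Note that $\Delta_h$ can be removed from Eq. (7) if the bracket has zero mean. Since T vanishes outside the fluid layer, the problem for purely toroidal dynamo fields reads now:
\[
\begin{cases}
\partial_t T = \partial_z(\lambda\, \partial_z T) + \lambda \nabla_h \cdot \nabla_h T - \partial_z W_T - \mathbf{v} \cdot \nabla_h T
 + \langle \partial_z W_T + \mathbf{v} \cdot \nabla_h T \rangle & \text{in } L \times (0, \infty),\\
T = 0 & \text{on } \mathbb{R}^2 \times \{z = 0, l\} \times (0, \infty),\\
T(\cdot, 0) = T_0, \quad \langle T_0 \rangle = 0 & \text{on } L \times \{t = 0\}.
\end{cases}
\tag{8}
\]
In [10] the decay of smooth solutions of problem (8) is based on the following reasoning: Consider the horizontal maximum of T,
\[
T_{\max}(z, t) := \max_{x, y} T(x, y, z, t),
\]
and the associated “path” $\mathbf{r}_h^{\max}(z, t) = (x^{\max}(z, t), y^{\max}(z, t))$, where the maximum is attained: $T_{\max}(z, t) = T(\mathbf{r}_h^{\max}(z, t), z, t)$. At maximum points we clearly have
\[
\nabla_h T(\mathbf{r}_h^{\max}, z, t) = 0, \qquad \Delta_h T(\mathbf{r}_h^{\max}, z, t) \le 0 .
\]
Moreover, we have
\[
\partial_z T_{\max}(z, t) = \partial_z T(\mathbf{r}_h^{\max}, z, t) + \nabla_h T(\mathbf{r}_h^{\max}, z, t) \cdot \partial_z \mathbf{r}_h^{\max}(z, t) = \partial_z T(\mathbf{r}_h^{\max}, z, t),
\]
and similarly,
\[
\partial_t T_{\max} = \partial_t T|_{\mathbf{r}_h = \mathbf{r}_h^{\max}}, \qquad
\partial_z^2 T_{\max} \ge \partial_z^2 T|_{\mathbf{r}_h = \mathbf{r}_h^{\max}} .
\]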

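Thus, evaluating (8)$_1$ at $\mathbf{r}_h^{\max}$ yields the inequality for $T_{\max}$:
\[
\partial_t T_{\max} \le \partial_z(\lambda\, \partial_z T_{\max}) - \partial_z W_{T_{\max}} + m \tag{9}
\]
with $m = m(z, t)$ denoting the mean value in (8)$_1$. Analogous arguments for $T_{\min} := \min_{x, y} T$ yield the inequality
\[
\partial_t T_{\min} \ge \partial_z(\lambda\, \partial_z T_{\min}) - \partial_z W_{T_{\min}} + m . \tag{10}
\]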

Note that $T_{\max} - T_{\min}$ is nonnegative due to the zero-mean condition. Thus, integrating (9) and (10) with respect to z over the interval [0, l] and using the boundary conditions for T one obtains the decay result:
\[
\partial_t \int_0^l (T_{\max} - T_{\min})\, dz \le 0 . \tag{11}
\]
The derivation of (11) relies on the existence of smooth functions $\mathbf{r}_h^{\max/\min}(z, t)$. These functions, however, are not well-defined. In general, they are not unique and cannot be defined as continuous functions (cf. Fig. 1 in [10]). In [10] inequality (11) has been proved on the assumption of piecewise smooth functions $\mathbf{r}_h^{\max/\min}$. But even this is not


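guaranteed for solutions of (8), not even for arbitrarily smooth solutions. It is the aim of the present paper to prove (in fact more than) (11) without falling back on functions like $\mathbf{r}_h^{\max/\min}$.

In the spherical case the poloidal/toroidal decomposition reads [12,16]:
\[
\mathbf{B} = \nabla \times (\nabla \times S\, \mathbf{r}) + \nabla \times T\, \mathbf{r} = -\nabla \times \boldsymbol{\Lambda} S - \boldsymbol{\Lambda} T \tag{12}
\]
with $\boldsymbol{\Lambda} := \mathbf{r} \times \nabla$. The mean $\langle \cdot \rangle$ is now taken over spheres $S_r$ of radius r, $\langle \cdot \rangle := (4\pi r^2)^{-1} \int_{S_r} \cdot \; ds$, a mean field does not arise, and $\mathcal{L} := \boldsymbol{\Lambda} \cdot \boldsymbol{\Lambda}$ denotes now the Laplace-Beltrami operator on the unit sphere. On the assumption of a purely toroidal magnetic field the equations corresponding to (4), (5) read
\[
\nabla T \times \nabla(\mathbf{v} \cdot \mathbf{r}) \cdot \mathbf{r} = 0, \tag{13}
\]
\[
\mathcal{L} \Big( \partial_t + \mathbf{v} \cdot \nabla - \lambda \Delta - \frac{\lambda'}{r} (1 + r\, \partial_r) \Big) T = \boldsymbol{\Lambda} \cdot \big(\nabla T \times \nabla(\mathbf{v} \cdot \mathbf{r})\big) \tag{14}
\]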

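with $\partial_r := (\mathbf{r}/r) \cdot \nabla$ and $\lambda' := \partial_r \lambda$. Introducing the variable $\mathcal{T} := r T$, (13) takes the form $\nabla \mathcal{T} \times \nabla v_r \cdot \mathbf{r} = 0$, and is solved by a function $w(\tau, r, t)$ allowing the representation for $v_r$:
\[
v_r(\mathbf{r}, t) = w(\mathcal{T}(\mathbf{r}, t), r, t) =: w_{\mathcal{T}}(\mathbf{r}, t). \tag{15}
\]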

With the identity ∇T × ∇(r vr ) =

1 1 T × vr − ∂r (r vr ) T + ∂r T vr , r r

the right-hand side of Eq. (14) can be rewritten using (15) as2 1 1 · [∇T × ∇(r vr )] = · − ∂r (r vr ) T − (r vr ) ∂r T + (vr ∂r T ) r r 1 = − · ∂r [vr (r T )] + · (vr ∂r T ) = −L [∂r WT − vr ∂r T ] r T with W (T , r, t) := 0 w(τ, r, t) dτ and WT (r, t) := W (T (r, t), r, t). Thus, writing (14) in terms of T and discarding L, we arrive at a problem analogous to (8): ⎧ ⎪ ⎪ ∂t T = ∂r (λ ∂r T ) + λ∇nr · ∇nr T − ∂r WT ⎪ ⎨ −v · ∇nr T + ∂r WT + v · ∇nr T in B R × (0, ∞), (16) ⎪ T =0 on (S R ∪ {r = 0}) × (0, ∞), ⎪ ⎪ ⎩ on B R × {t = 0}. T (·, 0) = T0 , T0 = 0 ∇nr denotes here the non-radial gradient ∇ − (r/r )∂r . 2 Note that the operator interchanges with r as well as with ∂ . r


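Questions of existence and uniqueness of solutions have been treated in [17] for the spherical case (16).³ According to this reference the following regularity assumptions⁴ on the data of (16), whose unknown is the modified scalar $\mathcal{T} = rT$:
\[
\mathbf{v}_{nr} := \mathbf{v} - v_r (\mathbf{r}/r) \in C^{\alpha, \alpha/2}(B_R \times [0, t_0]), \quad
\lambda \in C^{1+\alpha, (1+\alpha)/2}(B_R \times [0, t_0]), \quad
\mathcal{T}_0 \in C^{2+\alpha}(B_R)
\]
for any $t_0 > 0$,
\[
w(0, r, t) = 0, \qquad \frac{\partial^{k+l+m} w}{\partial \tau^k\, \partial r^l\, \partial t^m} \ \text{exists and is continuous on } \mathbb{R} \times [0, R] \times [0, \infty)
\]
for any $k + l + m \le 3$, $m \le 1$, $l \le 2 - m$, together with the compatibility conditions
\[
\mathcal{T}_0 = \partial_r(\lambda\, \partial_r \mathcal{T}_0) = 0 \quad \text{on } S_R
\]
and the ellipticity condition (2), guarantee a unique local solution
\[
\mathcal{T} \in C^{2+\alpha, (2+\alpha)/2}(B_R \times [0, t^*)) \tag{17}
\]
of the nonlinear problem (16). Moreover, if the maximal time of existence $t^*$ is finite, the solution blows up in the $(1+\alpha)$-norm:
\[
\lim_{t \to t^*} \|\mathcal{T}\|_{1+\alpha, (1+\alpha)/2} = \infty .
\]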

Analogous results hold also in the (simpler) plane case and can be proved along the lines of ref. [17]. On the other hand, the toroidal magnetic field $\mathbf{B}_T$ belonging to a reasonable solution T of (8) ($\mathcal{T}$ of (16), resp.) should also be a solution of the original linear system (1) with the additional property $\mathbf{B}_T \cdot \nabla v_z = 0$ ($\mathbf{B}_T \cdot \nabla(\mathbf{v} \cdot \mathbf{r}) = 0$, resp.). Defining $\mathbf{B}_T =: (T_y, -T_x, 0)$ this system takes in the plane case the form
\[
\begin{aligned}
\partial_t T_x &= \nabla \cdot (\lambda \nabla T_x) - \partial_x(v_x T_x) - \partial_z(v_z T_x) - \partial_x(v_y T_y),\\
\partial_t T_y &= \nabla \cdot (\lambda \nabla T_y) - \partial_y(v_y T_y) - \partial_z(v_z T_y) - \partial_y(v_x T_x),
\end{aligned}
\]
which is of type (1.1) in [18, Chap. VII]. In the spherical case an analogous system is obtained for the nonradial derivatives of $\mathcal{T}$. According to Theorem 3.1 in [18, Chap. VII], condition (2), mild regularity assumptions on the coefficients λ and $\mathbf{v}$ (e.g. $\lambda, \mathbf{v} \in C^{\alpha, \alpha/2}(V \times [0, t_0])$ is enough), and appropriate conditions on the initial value already imply
\[
\mathbf{B}_T \in C^{\alpha, \alpha/2}(V \times [0, t_0]) \tag{18}
\]
for any $t_0 > 0$. Improved regularity of the coefficients and initial value leads to improved regularity of $\mathbf{B}_T$ (cf. Theorem 4.1 in [18, Chap. VII]). Property (18) already implies boundedness of the horizontal derivatives of T (the nonradial derivatives of $\mathcal{T}$, resp.). Boundedness of the vertical (radial, resp.) derivative is implied by $\mathbf{B}_T \in C^{1+\alpha, (1+\alpha)/2}(V \times [0, t_0])$. In fact, observing that $\Delta_h^{-1} (\mathbf{e}_z \times \nabla) \cdot$ is a bounded operator $\mathcal{V}^2 \to \mathcal{V}$ with
\[
\mathcal{V} := \{ f : \mathbb{R}^2 \to \mathbb{R} \mid f \ P\text{-periodic},\ \langle f \rangle = 0,\ \|f\|_{C^\alpha} < \infty \},
\]

³ In fact, the problem considered in [17] differs slightly from (16), since the author of [17] uses T instead of $\mathcal{T} = rT$ (which is more appropriate in our case) as dynamic variable.
⁴ Concerning (uniform) Hölder continuity and Hölder norms in parabolic problems we refer to [17] or [18].


one obtains
\[
\partial_z T = \Delta_h^{-1} (\mathbf{e}_z \times \nabla) \cdot \partial_z \mathbf{B}_T \in C^{\alpha, \alpha/2}(L \times [0, t_0])
\]
and an analogous result holds in the spherical case (cf. [19]). Thus, T resp. $\mathcal{T}$ remain bounded in $C^{1+\alpha, (1+\alpha)/2}(V \times [0, t_0])$ for any $t_0 > 0$. In conclusion, any regular solution of problem (8) or (16) whose associated magnetic field $\mathbf{B}_T$ solves the original problem (1) with sufficiently regular data is global in time. Solutions of this type are considered henceforth.

3. A Reduction Method for Parabolic Equations

Let us consider a parabolic equation of the form
\[
\partial_t T = \sum_{i, j = 1}^{m} \partial_{x_i}(a_{ij}\, \partial_{x_j} T) + \sum_{k, l = 1}^{n} b_{kl}\, \partial_{y_k} \partial_{y_l} T - \nabla_x \cdot \mathbf{W}_T - \mathbf{v} \cdot \nabla_y T + c, \tag{19}
\]

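which distinguishes two types of spatial variables: $\mathbf{x} \in G$ and $\mathbf{y} \in H$ with open bounded sets $G \subset \mathbb{R}^m$ and $H \subset \mathbb{R}^n$. Gradients with respect to $\mathbf{x}$ and $\mathbf{y}$ are denoted by $\nabla_x$ and $\nabla_y$, respectively; ∇ refers to both variables. T is assumed to be a smooth solution of (19),
\[
T \in C^2_1(G \times H \times [0, t_0]), \qquad t_0 > 0, \tag{20}
\]
i.e. T, $\partial_t T$, $\nabla T$, and $\nabla \nabla T$ are all continuous functions on $G \times H \times [0, t_0]$. The various coefficients in Eq. (19) depend on different variables: $a_{ij} = a_{ij}(\mathbf{x}, t)$, $b_{kl} = b_{kl}(T, \mathbf{x}, \mathbf{y}, t)$, $\mathbf{W} = \mathbf{W}(\tau, \mathbf{x}, t)$, $\mathbf{v} = \mathbf{v}(T, \mathbf{x}, \mathbf{y}, t)$, and $c = c(\mathbf{x}, t)$. $\mathbf{W}_T$ means again $\mathbf{W}(T(\cdot), \cdot)$, thus $\nabla_x \cdot \mathbf{W}_T = \nabla_x \cdot \mathbf{W}|_{\tau = T} + \partial_\tau \mathbf{W}|_{\tau = T} \cdot \nabla_x T$ being of divergence form. $(a_{ij})$ and $(b_{kl})$ are symmetric, positive definite matrices uniformly bounded away from zero. All coefficients are assumed to be sufficiently smooth to insure a solution of type (20); however, in this section we will make use only of boundedness of $a_{ij}$, $\mathbf{W}$, and $\partial_\tau \mathbf{W} =: \mathbf{w}$ on their respective domains.

In order to eliminate the advection term from Eq. (19) we define
\[
T_{\max}(\mathbf{x}, t) := \max_{\mathbf{y} \in H} T(\mathbf{x}, \mathbf{y}, t), \qquad
T_{\min}(\mathbf{x}, t) := \min_{\mathbf{y} \in H} T(\mathbf{x}, \mathbf{y}, t), \tag{21}
\]
and for fixed $(\mathbf{x}, t) \in G \times [0, t_0]$ the set of “extremal points” $\{\mathbf{y}_m(\mathbf{x}, t)\} := \{\mathbf{y} \in H : T(\mathbf{x}, \mathbf{y}, t) = T_m(\mathbf{x}, t)\}$ with m = max or min. Thus, for any $\mathbf{y}_m(\mathbf{x}, t) \in \{\mathbf{y}_m(\mathbf{x}, t)\}$ we have
\[
T_m(\mathbf{x}, t) = T(\mathbf{x}, \mathbf{y}_m(\mathbf{x}, t), t), \qquad m \in \{\max, \min\}. \tag{22}
\]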

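Despite its pointwise definition $T_m$ has some smoothness:

Lemma 1. Let T, $\nabla_x T$, and $\partial_t T$ be continuous on $G \times H \times [0, t_0]$. Then $T_m : G \times [0, t_0] \to \mathbb{R}$ with $m \in \{\max, \min\}$ defined by (21) is Lipschitz continuous, a.e. differentiable, and there holds a.e.
\[
\nabla_x T_m(\mathbf{x}, t) = \nabla_x T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_m(\mathbf{x}, t)}, \qquad
\partial_t T_m(\mathbf{x}, t) = \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_m(\mathbf{x}, t)} \tag{23}
\]
with arbitrary extremal points $\mathbf{y}_m(\mathbf{x}, t) \in \{\mathbf{y}_m(\mathbf{x}, t)\}$.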


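Proof. Let $L > 0$ be such that
\[
|\nabla_x T| < L, \quad |\partial_t T| < L \qquad \text{on } G \times H \times [0, t_0],
\]
and w.l.o.g. $T_m(\mathbf{x}, t) \ge T_m(\tilde{\mathbf{x}}, \tilde{t})$. With $\mathbf{y}_m(\mathbf{x}, t) \in \{\mathbf{y}_m(\mathbf{x}, t)\}$, (22) and (21) there follows
\[
\begin{aligned}
|T_{\max}(\mathbf{x}, t) - T_{\max}(\tilde{\mathbf{x}}, \tilde{t})|
&= T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t) - T(\tilde{\mathbf{x}}, \mathbf{y}_{\max}(\tilde{\mathbf{x}}, \tilde{t}), \tilde{t}) \\
&\le T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t) - T(\tilde{\mathbf{x}}, \mathbf{y}_{\max}(\mathbf{x}, t), \tilde{t})
\le L\, |(\mathbf{x} - \tilde{\mathbf{x}}, t - \tilde{t})|
\end{aligned}
\]
and
\[
\begin{aligned}
|T_{\min}(\mathbf{x}, t) - T_{\min}(\tilde{\mathbf{x}}, \tilde{t})|
&= T(\mathbf{x}, \mathbf{y}_{\min}(\mathbf{x}, t), t) - T(\tilde{\mathbf{x}}, \mathbf{y}_{\min}(\tilde{\mathbf{x}}, \tilde{t}), \tilde{t}) \\
&\le T(\mathbf{x}, \mathbf{y}_{\min}(\tilde{\mathbf{x}}, \tilde{t}), t) - T(\tilde{\mathbf{x}}, \mathbf{y}_{\min}(\tilde{\mathbf{x}}, \tilde{t}), \tilde{t})
\le L\, |(\mathbf{x} - \tilde{\mathbf{x}}, t - \tilde{t})| .
\end{aligned}
\]
Thus, $T_m$ is Lipschitz continuous and, according to Rademacher’s theorem, a.e. differentiable.

Consider next forward and backward difference quotients with respect to t. We have with $h > 0$:
\[
\begin{aligned}
\frac{1}{h}\big(T_{\max}(\mathbf{x}, t + h) - T_{\max}(\mathbf{x}, t)\big)
&= \frac{1}{h}\big(T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t + h), t + h) - T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t)\big) \\
&\ge \frac{1}{h}\big(T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t + h) - T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t)\big).
\end{aligned}
\]
As the last expression has a well-defined limit for $h \to 0$ we obtain
\[
\liminf_{h \downarrow 0} \frac{1}{h}\big(T_{\max}(\mathbf{x}, t + h) - T_{\max}(\mathbf{x}, t)\big) \ge \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\max}(\mathbf{x}, t)} . \tag{24}
\]
On the other hand,
\[
\frac{1}{h}\big(T_{\max}(\mathbf{x}, t) - T_{\max}(\mathbf{x}, t - h)\big) \le \frac{1}{h}\big(T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t) - T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t - h)\big),
\]
which implies
\[
\limsup_{h \downarrow 0} \frac{1}{h}\big(T_{\max}(\mathbf{x}, t) - T_{\max}(\mathbf{x}, t - h)\big) \le \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\max}(\mathbf{x}, t)} . \tag{25}
\]
So, if $T_{\max}$ is differentiable in $(\mathbf{x}, t)$, (24) and (25) imply $\partial_t T_{\max}(\mathbf{x}, t) = \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\max}(\mathbf{x}, t)}$. For $T_{\min}$ one obtains similarly with $h > 0$,
\[
\begin{aligned}
\limsup_{h \downarrow 0} \frac{1}{h}\big(T_{\min}(\mathbf{x}, t + h) - T_{\min}(\mathbf{x}, t)\big) &\le \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\min}(\mathbf{x}, t)},\\
\liminf_{h \downarrow 0} \frac{1}{h}\big(T_{\min}(\mathbf{x}, t) - T_{\min}(\mathbf{x}, t - h)\big) &\ge \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\min}(\mathbf{x}, t)},
\end{aligned}
\]
which imply in the case of differentiability again $\partial_t T_{\min}(\mathbf{x}, t) = \partial_t T(\mathbf{x}, \mathbf{y}, t)|_{\mathbf{y} = \mathbf{y}_{\min}(\mathbf{x}, t)}$. Analogous results hold for the spatial derivatives $\partial_{x_i}$, $i = 1, \ldots, m$.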
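The content of Lemma 1 can be illustrated on a toy example. The sketch below is an assumption-laden illustration, not taken from the paper: it uses the hypothetical function T(x, y) = x cos y + ½ cos 2y on a periodic y-grid, for which the maximum over y equals |x| + ½. The maximizer jumps from y = π to y = 0 as x crosses zero, so the pointwise maximum is Lipschitz but has a kink, and away from the kink its slope agrees with the x-derivative of T evaluated at an extremal point, as in (23).

```python
import math

def T(x, y):
    # Hypothetical smooth test function (not from the paper).
    return x * math.cos(y) + 0.5 * math.cos(2.0 * y)

def dT_dx(x, y):
    # Exact partial derivative of the test function with respect to x.
    return math.cos(y)

# Periodic grid in y; it contains y = 0 and (up to rounding) y = pi.
Y_GRID = [2.0 * math.pi * j / 400 for j in range(400)]

def t_max(x):
    """Return (max over the y-grid of T(x, y), one maximizing grid point)."""
    y_star = max(Y_GRID, key=lambda y: T(x, y))
    return T(x, y_star), y_star

def slope(x, h=1e-4):
    """Central difference quotient of x -> max_y T(x, y)."""
    return (t_max(x + h)[0] - t_max(x - h)[0]) / (2.0 * h)

# Away from the kink at x = 0 the slope matches dT/dx at the extremal point.
right = (slope(0.3), dT_dx(0.3, t_max(0.3)[1]))
left = (slope(-0.3), dT_dx(-0.3, t_max(-0.3)[1]))
```

Here `right` and `left` each hold a difference quotient of the reduced function next to the derivative of T at the corresponding extremal point; the two entries of each pair should agree up to discretization error.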


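If T is of class (20) we have at extremal points $\mathbf{y}_m$ clearly $\nabla_y T|_{\mathbf{y} = \mathbf{y}_m} = 0$ and
\[
\sum_{k, l = 1}^{n} (b_{kl}\, \partial_{y_k} \partial_{y_l} T)|_{\mathbf{y} = \mathbf{y}_{\max}} \le 0, \qquad
\sum_{k, l = 1}^{n} (b_{kl}\, \partial_{y_k} \partial_{y_l} T)|_{\mathbf{y} = \mathbf{y}_{\min}} \ge 0 .
\]
Therefore, evaluating Eq. (19) at extremal points yields the inequalities
\[
\begin{cases}
\partial_t T_{\max} \le \displaystyle\sum_{i, j = 1}^{m} \partial_{x_i}(a_{ij}\, \partial_{x_j} T)|_{\mathbf{y} = \mathbf{y}_{\max}} - \nabla_x \cdot \mathbf{W}_{T_{\max}} + c,\\[2mm]
\partial_t T_{\min} \ge \displaystyle\sum_{i, j = 1}^{m} \partial_{x_i}(a_{ij}\, \partial_{x_j} T)|_{\mathbf{y} = \mathbf{y}_{\min}} - \nabla_x \cdot \mathbf{W}_{T_{\min}} + c .
\end{cases}
\tag{26}
\]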

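We used here Lemma 1 on the left-hand side and for rewriting the second term on the right-hand side:
\[
\nabla_x \cdot \mathbf{W}_T|_{\mathbf{y} = \mathbf{y}_m} = \nabla_x \cdot \mathbf{W}|_{\tau = T_m} + \partial_\tau \mathbf{W} \cdot \nabla_x T_m = \nabla_x \cdot \mathbf{W}_{T_m} .
\]
The inequalities (26) cannot completely be expressed in terms of $T_m$, since second-order derivatives of $T_m$ do not make sense. In fact, more than Lipschitz continuity and hence differentiability a.e. cannot be expected (cf. Fig. 1 in [10]). In a weak form, however, (26) can be reduced to inequalities for $T_m$. For this purpose we define the set of nonnegative test functions by
\[
C^\infty_{0,+} := C^\infty_{0,+}(G \times (-\infty, t_0)) := \{ \phi \in C^\infty_0(G \times (-\infty, t_0)) \mid \phi \ge 0 \},
\]
and prove

Lemma 2. Let T be of class (20), $T_m$ and $\mathbf{y}_m$ as in Lemma 1 with $m \in \{\max, \min\}$, and $(a_{ij})$ as explained after (20). Then
\[
\begin{aligned}
\sum_{i, j = 1}^{m} \int_G \partial_{x_i}(a_{ij}\, \partial_{x_j} T)|_{\mathbf{y} = \mathbf{y}_{\max}}\, \phi \, dx &\le - \sum_{i, j = 1}^{m} \int_G a_{ij}\, \partial_{x_j} T_{\max}\, \partial_{x_i} \phi \, dx,\\
\sum_{i, j = 1}^{m} \int_G \partial_{x_i}(a_{ij}\, \partial_{x_j} T)|_{\mathbf{y} = \mathbf{y}_{\min}}\, \phi \, dx &\ge - \sum_{i, j = 1}^{m} \int_G a_{ij}\, \partial_{x_j} T_{\min}\, \partial_{x_i} \phi \, dx
\end{aligned}
\]
for a.e. $t \in (0, t_0)$ and arbitrary $\phi \in C^\infty_{0,+}(G \times (-\infty, t_0))$.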

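Proof. Only the case m = max is proved; the other case follows with minor modifications. Denoting forward and backward difference quotients in direction $\mathbf{n}$ with step width $h > 0$ by
\[
D^{h}_{x \cdot n} u(\mathbf{x}) := \frac{1}{h}\big(u(\mathbf{x} + h\mathbf{n}) - u(\mathbf{x})\big) \quad \text{and} \quad
D^{-h}_{x \cdot n} u(\mathbf{x}) := \frac{1}{h}\big(u(\mathbf{x}) - u(\mathbf{x} - h\mathbf{n})\big),
\]
respectively, one obtains with $\mathbf{y}_m(\mathbf{x}, t) \in \{\mathbf{y}_m(\mathbf{x}, t)\}$, (22) and (21):
\[
\begin{aligned}
\big(D^{-h}_{x \cdot n} D^{h}_{x \cdot n} T(\mathbf{x}, \mathbf{y}, t)\big)|_{\mathbf{y} = \mathbf{y}_{\max}(\mathbf{x}, t)}
&= \frac{1}{h^2}\big(T(\mathbf{x} + h\mathbf{n}, \mathbf{y}_{\max}(\mathbf{x}, t), t) + T(\mathbf{x} - h\mathbf{n}, \mathbf{y}_{\max}(\mathbf{x}, t), t) - 2\, T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t)\big) \\
&\le \frac{1}{h^2}\big(T(\mathbf{x} + h\mathbf{n}, \mathbf{y}_{\max}(\mathbf{x} + h\mathbf{n}, t), t) + T(\mathbf{x} - h\mathbf{n}, \mathbf{y}_{\max}(\mathbf{x} - h\mathbf{n}, t), t) - 2\, T(\mathbf{x}, \mathbf{y}_{\max}(\mathbf{x}, t), t)\big) \\
&= D^{-h}_{x \cdot n} D^{h}_{x \cdot n} T_{\max}(\mathbf{x}, t).
\end{aligned}
\tag{27}
\]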


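Equation (27) implies the matrix
\[
D^{-h}_{x_i} D^{h}_{x_j} T_{\max} - \big(D^{-h}_{x_i} D^{h}_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}} \tag{28}
\]
being positive semidefinite. Thus, contracting (28) with $(a_{ij})$ and multiplying with $\phi \in C^\infty_{0,+}$ yields
\[
\begin{aligned}
\sum_{i, j = 1}^{m} \big(D^{-h}_{x_i} D^{h}_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}}\, a_{ij}\, \phi
&\le \sum_{i, j = 1}^{m} D^{-h}_{x_i} D^{h}_{x_j} T_{\max}\; a_{ij}\, \phi \\
&= - \sum_{i, j = 1}^{m} D^{h}_{x_j} T_{\max}\; D^{h}_{x_i}(a_{ij}\, \phi) \\
&\quad - \sum_{i, j = 1}^{m} \frac{1}{h} \Big[ \big(D^{h}_{x_j} T_{\max}\big)|_{(\mathbf{x} - h\mathbf{e}_i, t)}\; a_{ij}\, \phi - D^{h}_{x_j} T_{\max}\; (a_{ij}\, \phi)|_{(\mathbf{x} + h\mathbf{e}_i, t)} \Big],
\end{aligned}
\]
and integration over G with $h < \operatorname{dist}(\operatorname{supp}(\phi), \partial(G \times (-\infty, t_0)))$:
\[
\sum_{i, j = 1}^{m} \int_G \big(D^{-h}_{x_i} D^{h}_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}}\, a_{ij}\, \phi \, dx \le - \sum_{i, j = 1}^{m} \int_G D^{h}_{x_j} T_{\max}\; D^{h}_{x_i}(a_{ij}\, \phi) \, dx .
\]
Observe now that (20) implies for fixed $\mathbf{y}_{\max}(\mathbf{x}, t)$ the pointwise limit on $G \times [0, t_0]$:
\[
\big(D^{-h}_{x_i} D^{h}_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}} \to \big(\partial_{x_i} \partial_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}} \qquad \text{for } h \to 0,
\]
and Lemma 1 the limit a.e. on $G \times [0, t_0]$:
\[
D^{h}_{x_j} T_{\max} \to \partial_{x_j} T_{\max} \qquad \text{for } h \to 0 .
\]
Moreover, there are constants $L, \tilde{L} > 0$ such that
\[
\operatorname*{ess\,sup}_{G \times [0, t_0]} |D^{h}_{x_j} T_{\max}| \le L, \qquad
\max_{G \times H \times [0, t_0]} |D^{-h}_{x_i} D^{h}_{x_j} T| \le \tilde{L} .
\]
Thus, Lebesgue’s theorem yields
\[
\sum_{i, j = 1}^{m} \int_G \big(\partial_{x_i} \partial_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}}\, a_{ij}\, \phi \, dx \le - \sum_{i, j = 1}^{m} \int_G \partial_{x_j} T_{\max}\; \partial_{x_i}(a_{ij}\, \phi) \, dx .
\]
Lemma 1 implies, moreover, the equation
\[
\sum_{i, j = 1}^{m} \int_G \partial_{x_i} a_{ij}\, \big(\partial_{x_j} T\big)|_{\mathbf{y} = \mathbf{y}_{\max}}\, \phi \, dx = \sum_{i, j = 1}^{m} \int_G \partial_{x_j} T_{\max}\; \partial_{x_i} a_{ij}\, \phi \, dx,
\]
which concludes the proof.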
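The key step of the proof, inequality (27), says that the second difference quotient of the pointwise maximum dominates the second difference quotient of T with the extremal point frozen. A minimal numerical sketch follows; the test function, the grid and all helper names are hypothetical choices for illustration only.

```python
import math

def T(x, y):
    # Hypothetical smooth test function (not from the paper).
    return x * math.cos(y) + 0.5 * math.cos(2.0 * y)

Y_GRID = [2.0 * math.pi * j / 400 for j in range(400)]

def arg_max(x):
    """One grid point y at which T(x, .) attains its grid maximum."""
    return max(Y_GRID, key=lambda y: T(x, y))

def t_max(x):
    return T(x, arg_max(x))

def second_diff(f, x, h):
    """Symmetric second difference quotient, i.e. backward applied to forward."""
    return (f(x + h) + f(x - h) - 2.0 * f(x)) / (h * h)

def gap(x, h):
    """Second difference of T_max minus second difference of T at frozen y_max(x)."""
    y_star = arg_max(x)
    frozen = second_diff(lambda s: T(s, y_star), x, h)
    return second_diff(t_max, x, h) - frozen

gaps = [gap(x, 0.05) for x in (-0.5, -0.2, 0.0, 0.3, 0.7)]
kink_gap = gap(0.0, 0.1)
```

For this example the maximum over y is |x| + ½, so the gap is zero away from x = 0 and strictly positive (of order 2/h) at the kink, where second derivatives of the reduced function no longer make sense pointwise.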


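Lemma 2 enables us to write (26) in a weak form as inequalities for $T_m$:
\[
\int_0^{t_0}\!\! \int_G T_m\, \partial_t \phi \, dx\, dt + \int_G T_m(\cdot, 0)\, \phi(\cdot, 0)\, dx
\;\substack{\ge \\ \le}\;
\sum_{i, j = 1}^{m} \int_0^{t_0}\!\! \int_G a_{ij}\, \partial_{x_j} T_m\, \partial_{x_i} \phi \, dx\, dt
- \int_0^{t_0}\!\! \int_G \mathbf{W}_{T_m} \cdot \nabla \phi \, dx\, dt
- \int_0^{t_0}\!\! \int_G c\, \phi \, dx\, dt, \tag{29}
\]
with the upper inequality referring to m = max and the lower one to m = min. Introducing
\[
\delta T := T_{\max} - T_{\min} \tag{30}
\]
the inequalities (29) can be combined to a single one. Writing
\[
\mathbf{W}(T_{\max}, \cdot, \cdot) - \mathbf{W}(T_{\min}, \cdot, \cdot) = \frac{\mathbf{W}(T_{\max}, \cdot, \cdot) - \mathbf{W}(T_{\min}, \cdot, \cdot)}{T_{\max} - T_{\min}}\; \delta T
\]
and defining the bounded function
\[
\boldsymbol{\omega}(\mathbf{x}, t) := \frac{\mathbf{W}(\tau, \mathbf{x}, t) - \mathbf{W}(\sigma, \mathbf{x}, t)}{\tau - \sigma} \bigg|_{\tau = T_{\max}(\mathbf{x}, t),\ \sigma = T_{\min}(\mathbf{x}, t)} \tag{31}
\]
we obtain from (29),
\[
\int_0^{t_0}\!\! \int_G \delta T\, \partial_t \phi \, dx\, dt + \int_G \delta T(\cdot, 0)\, \phi(\cdot, 0)\, dx
\ge \sum_{i, j = 1}^{m} \int_0^{t_0}\!\! \int_G a_{ij}\, \partial_{x_j} \delta T\, \partial_{x_i} \phi \, dx\, dt
- \int_0^{t_0}\!\! \int_G \delta T\, \boldsymbol{\omega} \cdot \nabla \phi \, dx\, dt \tag{32}
\]
with $\phi \in C^\infty_{0,+}$. Summarizing Lemmata 1 and 2 we have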

Theorem 1. Let T be a solution of class (20) of the parabolic Eq. (19). Then, δT defined by (30) and (22) is of class $C^{0,1}(G \times [0, t_0])$ and is, moreover, a weak subsolution (in the sense of (32)) of the “reduced equation”
\[
\partial_t \delta T = \sum_{i, j = 1}^{m} \partial_{x_i}(a_{ij}\, \partial_{x_j} \delta T) - \nabla_x \cdot (\boldsymbol{\omega}\, \delta T) \tag{33}
\]
with $\boldsymbol{\omega}$ given in (31).


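4. Decay in Divergence-type Parabolic Equations

We consider in this section (sub-)solutions of the following initial-boundary-value problem:
\[
\begin{cases}
\partial_t h = \nabla \cdot (\lambda \nabla h - \boldsymbol{\omega}\, h) & \text{in } G \times (0, \infty),\\
h = 0 & \text{on } \partial G \times (0, \infty),\\
h(\cdot, 0) = h_0 & \text{on } G \times \{t = 0\}.
\end{cases}
\tag{34}
\]
Lortz et al. prove in [14] the following results: Let $G \subset \mathbb{R}^m$ be a bounded, open set with ∂G of class $C^{2+\alpha}$. λ and $\boldsymbol{\omega}$ are assumed to be measurable and bounded on $G \times (0, \infty)$; in particular, there are constants $\lambda_0 > 0$, $M > 0$ such that
\[
\lambda \ge \lambda_0, \quad |\boldsymbol{\omega}| \le M \qquad \text{in } G \times (0, \infty). \tag{35}
\]
In order to formulate an appropriate notion of weak solution consider for $t_0 > 0$ the spaces
\[
\begin{aligned}
V_2 &= \{ f \in L^2(G \times (0, t_0)) \mid \nabla f \text{ exists weakly and } \|f\|_{V_2} < \infty \},\\
\mathring{V}_2 &= \{ f \in V_2 \mid f(\cdot, t)|_{\partial G} = 0 \text{ in the trace sense for a.e. } t \in (0, t_0) \}
\end{aligned}
\]
with norm
\[
\|f\|^2_{V_2} := \sup_{(0, t_0)} \int_G |f|^2 \, dx + \int_0^{t_0}\!\! \int_G |\nabla f|^2 \, dx\, dt .
\]
A function $h \in \mathring{V}_2$ is called a weak solution of (34) in $G \times (0, t_0)$ with initial value $h_0 \in L^2(G)$ if and only if h satisfies
\[
\int_0^{t_0}\!\! \int_G h\, \partial_t \phi \, dx\, dt + \int_G h_0\, \phi(\cdot, 0)\, dx = \int_0^{t_0}\!\! \int_G (\lambda \nabla h - \boldsymbol{\omega}\, h) \cdot \nabla \phi \, dx\, dt \tag{36}
\]
for any $\phi \in C^\infty_0(G \times (-\infty, t_0))$. h is called a weak solution in $G \times (0, \infty)$ if and only if $h|_{G \times (0, t_0)}$ is a weak solution in $G \times (0, t_0)$ for any $t_0 > 0$. Solutions of this type decay (in time) in the following sense (cf. Theorem 1 in [14]):

Theorem 2. (Lortz et al.) Let $h_0 \in L^\infty(G)$ and $\rho > 0$. There exist positive constants $c_1$, $c_2$, $c_3$, and d, depending on $\lambda_0$, M, G, and m, $c_2$ depending additionally on ρ, such that any weak solution h of (34) satisfies
\[
\|h(\cdot, t)\|_{L^2(G)} \le c_1 \|h_0\|_{L^2(G)}\, e^{-dt} \qquad \text{in } (0, \infty), \tag{37}
\]
\[
\|h(\cdot, t)\|_{L^\infty(G)} \le c_2 \|h_0\|_{L^2(G)}\, e^{-dt} \qquad \text{in } (\rho, \infty), \tag{38}
\]
\[
\|h(\cdot, t)\|_{L^\infty(G)} \le c_3 \|h_0\|_{L^\infty(G)}\, e^{-dt} \qquad \text{in } (0, \infty). \tag{39}
\]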
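The decay asserted in (37) can be observed in a simple one-dimensional experiment. The following is a minimal sketch under stated assumptions: G = (0, 1), constant λ = 1, a hypothetical bounded drift ω(z) = 2 sin(2πz), initial value sin(πz), and an explicit conservative finite-difference scheme whose grid size and time step are illustrative choices. It is not the method of [14], only a numerical illustration of problem (34).

```python
import math

def simulate(n=50, dt=1e-4, t_end=0.5, lam=1.0):
    """Explicit scheme for h_t = (lam*h_z - omega*h)_z on (0, 1) with h = 0 at both ends.

    Returns the list of discrete L2 norms of h after each time step.
    """
    dz = 1.0 / n
    # Hypothetical bounded drift, sampled at the half nodes z_{j+1/2}.
    omega_half = [2.0 * math.sin(2.0 * math.pi * (j + 0.5) * dz) for j in range(n)]
    h = [math.sin(math.pi * j * dz) for j in range(n + 1)]
    h[0] = h[n] = 0.0  # Dirichlet boundary values

    def l2_norm(v):
        return math.sqrt(dz * sum(x * x for x in v))

    norms = [l2_norm(h)]
    for _ in range(int(round(t_end / dt))):
        # Flux lam*h_z - omega*h at half nodes (central differences and averages).
        flux = [lam * (h[j + 1] - h[j]) / dz - omega_half[j] * 0.5 * (h[j] + h[j + 1])
                for j in range(n)]
        for j in range(1, n):
            h[j] += dt * (flux[j] - flux[j - 1]) / dz
        norms.append(l2_norm(h))
    return norms

norms = simulate()
ratio = norms[-1] / norms[0]
```

With these parameters the diffusion number λ·dt/dz² equals 0.25, within the usual stability range of the explicit scheme. The quantity `ratio` compares the discrete L² norm at t = 0.5 with its initial value and should be well below one.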

It is the aim of the present section to verify that Theorem 2 remains valid for nonnegative Lipschitz-continuous subsolutions of (34). The proof in [14] relies mainly on the construction of a positive solution of an auxiliary problem to which Harnack-type inequalities apply. The basic methods used are nontrivial adaptations of Moser’s iteration techniques to obtain, finally, pointwise estimates for solutions of (34) and the auxiliary problem. Large parts of the proof apply


without changes to our situation and need not be repeated here. We discuss in the following only those points in the proof which require modifications and cite results about the (unchanged) auxiliary problem necessary for understanding the rest.

(i) Without any modification we can take over the results about weak solutions of the following auxiliary problem:
\[
\begin{cases}
\partial_t u = \nabla \cdot (\lambda \nabla u - \boldsymbol{\omega}\, u) & \text{in } G \times (0, \infty),\\
\mathbf{n} \cdot \nabla u = 0 & \text{on } \partial G \times (0, \infty),\\
u(\cdot, 0) = u_0 & \text{on } G \times \{t = 0\}.
\end{cases}
\tag{40}
\]
λ, $\boldsymbol{\omega}$, and G are assumed to be as in problem (34), $\mathbf{n}$ means the exterior normal at ∂G, and $\boldsymbol{\omega}$ satisfies, additionally, $\mathbf{n} \cdot \boldsymbol{\omega}|_{\partial G \times (0, \infty)} = 0$. $u \in V_2$ is called a weak solution in $G \times (0, t_0)$ with initial value $u_0 \in L^2(G)$ if and only if u satisfies (36) for any $\phi \in C^\infty_0(\mathbb{R}^m \times (-\infty, t_0))$. Weak solutions in $G \times (0, \infty)$ are defined as above. Lortz et al. show that weak solutions of problem (40) (which are in fact locally Hölder continuous) with positive bounded initial values remain positive and bounded for all times, more precisely (cf. Theorem 2 in [14]): There are positive constants $c_4$, $c_5$, depending only on $\lambda_0$, M, G, and m, such that the bound on $u_0$,
\[
\underline{u}_0 \le u_0 \le \overline{u}_0 \qquad \text{on } G \tag{41}
\]
with positive constants $\underline{u}_0$, $\overline{u}_0$, implies the bound
\[
c_4\, \underline{u}_0 \le u \le c_5\, \overline{u}_0 \qquad \text{in } G \times (0, \infty). \tag{42}
\]

If, moreover, $\lambda, \boldsymbol{\omega} \in C^{1+\alpha, (1+\alpha)/2}(G \times [0, t_0])$ and $u_0 \in C^{2+\alpha}(G)$ (and satisfying the compatibility conditions), then u is in fact a classical solution $\in C^{2+\alpha, (2+\alpha)/2}(G \times [0, t_0])$.

(ii) The proof in [14] proceeds in two steps: Theorem 2 is proved first for classical solutions, and then extended to weak solutions via approximation by convex combinations of classical solutions. The last point requires uniqueness of the weak solution, which is obviously not given for subsolutions. Fortunately, the subsolution possesses in our case enough regularity to avoid the approximation argument. Concerning the auxiliary problem, λ is by assumption sufficiently smooth to allow classical solutions, but $\boldsymbol{\omega}$ is by definition (31) not better than (Lipschitz-)continuous; so we need here some approximation: Let $t_0 > 0$ and $(\boldsymbol{\omega}_i) \subset C^\infty(G \times [0, t_0])$ be an approximating sequence with $\mathbf{n} \cdot \boldsymbol{\omega}_i|_{\partial G \times [0, t_0]} = 0$, $\|\boldsymbol{\omega}_i\|_{C^0} \le M$, $i \in \mathbb{N}$, and
\[
\|\boldsymbol{\omega}_i - \boldsymbol{\omega}\|_{C^0} \to 0 \qquad \text{for } i \to \infty . \tag{43}
\]
Let $(u_i)$ be the associated sequence of classical solutions of (40) with $\lambda \in C^{1+\alpha, (1+\alpha)/2}(B_R \times [0, t_0])$ and $u_0 \in C^{2+\alpha}(G)$ satisfying (42) uniformly with respect to i. According to the estimate (5.3) in [14] $u_i$ is, moreover, uniformly bounded in $V_2$:
\[
\|u_i\|_{V_2} < K = K(\lambda_0, M, G, m; u_0, t_0), \qquad i \in \mathbb{N}. \tag{44}
\]

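Now we prove (37). Multiplying (40)$_1$, with $u_i$ and $\boldsymbol{\omega}_i$, by $(h/u_i)^2$ and integrating over G one obtains after integrating by parts
\[
\int_G \partial_t u_i\, (h/u_i)^2 \, dx = -2 \int_G (h/u_i)\, (\lambda \nabla u_i - \boldsymbol{\omega}_i u_i) \cdot \nabla(h/u_i) \, dx \tag{45}
\]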


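for a.e. $t \in (0, t_0)$. Note that (45) makes sense for Lipschitz-continuous h when observing the correspondence $C^{0,1}(B_R \times [0, t_0]) \sim W^{1,\infty}(G \times (0, t_0))$. h itself satisfies (36) as inequality for any $\phi \in C^\infty_{0,+}(G \times (-\infty, t_0))$. Choosing for fixed $i \in \mathbb{N}$ a sequence of test functions $(\phi_{ij})_{j \in \mathbb{N}}$ of the form $\phi_{ij}(\mathbf{x}, t) = \psi_{ij}(\mathbf{x}, t)\, \chi(t)$ with $\|\psi_{ij} - h/u_i\|_H \to 0$ for $j \to \infty$ and $\chi \in C^\infty_{0,+}((-\infty, t_0))$ one obtains after integration by parts (in t):
\[
\int_0^{t_0}\!\! \int_G \partial_t h\, \psi_{ij}\, \chi \, dx\, dt \le - \int_0^{t_0}\!\! \int_G (\lambda \nabla h - \boldsymbol{\omega} h) \cdot \nabla \psi_{ij}\, \chi \, dx\, dt .
\]
Here $\|\cdot\|_H$ denotes the Hilbert space norm
\[
\|f\|^2_H := \int_0^{t_0}\!\! \int_G \big(|f|^2 + |\nabla f|^2\big) \, dx\, dt .
\]
Thus, in the limit $j \to \infty$, one obtains
\[
\int_0^{t_0} \Big[ \int_G \partial_t h\; h/u_i \, dx \Big] \chi \, dt \le - \int_0^{t_0} \Big[ \int_G (\lambda \nabla h - \boldsymbol{\omega} h) \cdot \nabla(h/u_i) \, dx \Big] \chi \, dt,
\]
and, furthermore, since $\chi \in C^\infty_{0,+}((-\infty, t_0))$ is arbitrary,
\[
\int_G \partial_t h\; h/u_i \, dx \le - \int_G (\lambda \nabla h - \boldsymbol{\omega} h) \cdot \nabla(h/u_i) \, dx \tag{46}
\]
for a.e. $t \in (0, t_0)$. Combining (45) and (46) we arrive at
\[
\begin{aligned}
\frac{d}{dt} \int_G h^2/u_i \, dx &= \int_G \big[ 2\, \partial_t h\; h/u_i - \partial_t u_i\, (h/u_i)^2 \big] \, dx \\
&\le -2 \int_G \lambda\, u_i\, |\nabla(h/u_i)|^2 \, dx + 2 \int_G h\, (\boldsymbol{\omega} - \boldsymbol{\omega}_i) \cdot \nabla(h/u_i) \, dx .
\end{aligned}
\tag{47}
\]
Denoting $\|\boldsymbol{\omega} - \boldsymbol{\omega}_i\|_{C^0}$ by $\epsilon_i$, Poincaré’s constant in G with zero boundary conditions by $C_G$, and observing (35), (42), and (44), the right-hand side in (47) can be estimated by
\[
\begin{aligned}
\frac{d}{dt} \int_G h^2/u_i \, dx
&\le -2\, \lambda_0\, c_4\, \underline{u}_0 \int_G |\nabla(h/u_i)|^2 \, dx + 2\, \|\boldsymbol{\omega} - \boldsymbol{\omega}_i\|_{C^0} \int_G h\, |\nabla(h/u_i)| \, dx \\
&\le -2\, \frac{\lambda_0\, c_4\, \underline{u}_0}{C_G^2\, c_5\, \overline{u}_0} \int_G h^2/u_i \, dx + 2\, \epsilon_i \int_G h\, \big|(\nabla h/u_i) - (h/u_i^2) \nabla u_i\big| \, dx \\
&\le -K_1 \int_G h^2/u_i \, dx + K_2\, \epsilon_i
\end{aligned}
\]
with positive constants $K_1 = K_1(\lambda_0, M, G, m; \overline{u}_0/\underline{u}_0)$ and $K_2 = K_2(\lambda_0, M, G, m; u_0, h, t_0)$. Applying Gronwall’s inequality yields
\[
\int_G h^2/u_i \, dx \le \int_G h_0^2/u_0 \, dx \; e^{-K_1 t} + \epsilon_i\, K_2/K_1 .
\]
Taking now the limit $i \to \infty$ and applying once more (42) yields, finally,
\[
\int_G h^2 \, dx \le \frac{c_5\, \overline{u}_0}{c_4\, \underline{u}_0} \int_G h_0^2 \, dx \; e^{-K_1 t},
\]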


which is (37). Note that we are free to choose $u_0 = \text{const} > 0$ and hence $\underline{u}_0 = \overline{u}_0$, and that $K_1$ does not depend on $t_0$.

(iii) The proofs of the inequalities (38) and (39) are based on (37) and the following estimates (cf. Theorem 3 in [14]): Let ρ, t, and s be such that $0 < \rho < t$, $\rho \le 1/2$, and $0 < s \le 1$. There exist then positive constants $c_6$ and $c_7$ such that
\[
\|h\|_{L^\infty(G \times (t, t+s))} \le c_6\, \|h\|_{L^2(G \times (t-\rho, t+s+\rho))}, \tag{48}
\]
\[
\|h\|^2_{L^\infty(G \times (0, t))} \le c_7\, \|h\|^2_{L^2(G \times (0, t))} + \|h_0\|^2_{L^\infty(G)} \tag{49}
\]
for any classical solution h of problem (34). $c_6$, $c_7$ depend on $\lambda_0$, M, G, and m; $c_6$ depends, additionally, on ρ and $c_7$ on t. The proof proceeds via a series of partly rather tricky estimates, which apply without changes to our situation as well. The only difference is in the “starting” Eq. (2.1a) leading to inequality (3.1) in [14]. Instead, our starting point is again (36) as inequality. Choosing a sequence of test functions $(\phi_i) \subset C^\infty_{0,+}(G \times (-\infty, t_0))$ of the form $\phi_i(\mathbf{x}, t) = \psi_i(\mathbf{x}, t)\, \chi_i(t)$ with $\|\psi_i - h^{\gamma-1}\|_H \to 0$, $\|\chi_i - \chi\|_{L^2} \to 0$ for $i \to \infty$ one obtains after integration by parts (in t) and in the limit $i \to \infty$:
\[
\int_{t_1}^{t_2} \Big[ \int_G \partial_t h\; h^{\gamma-1} \, dx \Big] \chi \, dt \le - \int_{t_1}^{t_2} \Big[ \int_G (\lambda \nabla h - \boldsymbol{\omega} h) \cdot \nabla h^{\gamma-1} \, dx \Big] \chi \, dt . \tag{50}
\]
$\chi \ge 0$ is here some differentiable function in $(t_1, t_2) \subset [0, t_0]$ vanishing outside. Note that only the case $\gamma \ge 2$ is relevant here. Rearranging (50) and using (35) yields
\[
\frac{1}{\gamma(\gamma-1)} \int_{t_1}^{t_2}\!\! \int_G \partial_t h^{\gamma} \, dx\; \chi \, dt + \int_{t_1}^{t_2}\!\! \int_G \lambda\, h^{\gamma-2} |\nabla h|^2 \, dx\; \chi \, dt \le M \int_{t_1}^{t_2}\!\! \int_G h^{\gamma-1} |\nabla h| \, dx\; \chi \, dt .
\]
Integration by parts on the left-hand side and using the estimate (3.3) in [14] on the right-hand side yields then
\[
\frac{1}{\gamma(\gamma-1)} \int_G h^{\gamma} \, dx\; \chi\, \Big|_{t_1}^{t_2} + \frac{3}{4} \lambda_0 \int_{t_1}^{t_2}\!\! \int_G h^{\gamma-2} |\nabla h|^2 \, dx\; \chi \, dt
\le \int_{t_1}^{t_2} \Big( \frac{|\chi'|}{\gamma(\gamma-1)} + \frac{M^2}{\lambda_0}\, \chi \Big) \int_G h^{\gamma} \, dx \, dt,
\]
which is (3.1) in [14]. Note, finally, that the technical restriction $h > 0$ with subsequent relaxation is not necessary here, since negative powers of h and log h-estimates do not appear.

5. Decay of Purely Toroidal Dynamo Fields

In this section we apply the foregoing results to purely toroidal dynamo fields in a plane layer $L = \mathbb{R}^2 \times (0, l)$ and in a ball $B_R$. In the former case we assume the toroidal scalar


T to be a smooth, i.e. of class $C^{2+\alpha, (2+\alpha)/2}(L \times [0, \infty))$, P-periodic solution of the following problem (cf. (8)):
\[
\begin{cases}
\partial_t T = \partial_z(\lambda\, \partial_z T) + \lambda \nabla_h \cdot \nabla_h T - \partial_z W_T - \mathbf{v} \cdot \nabla_h T
 + \langle \partial_z W_T + \mathbf{v} \cdot \nabla_h T \rangle & \text{in } \mathbb{R}^2 \times [0, l] \times (0, \infty),\\
T \ P\text{-periodic} & \text{in } \mathbb{R}^2 \times [0, l] \times (0, \infty),\\
T = 0 & \text{on } \mathbb{R}^2 \times \{z = 0, l\} \times (0, \infty),\\
T(\cdot, 0) = T_0, \quad \langle T_0 \rangle = 0 & \text{on } \mathbb{R}^2 \times [0, l] \times \{t = 0\}.
\end{cases}
\tag{51}
\]
Replacing $\mathbb{R}^2$ by the bounded region P, Eq. (51)$_1$ fits into the framework of Sect. 3. According to Theorem 1 the quantity $\delta T = \max_P T - \min_P T$ is of class $C^{0,1}([0, l] \times [0, t_0])$ and is, furthermore, a weak subsolution of the problem
\[
\begin{cases}
\partial_t \delta T = \partial_z(\lambda\, \partial_z \delta T) - \partial_z(\omega\, \delta T) & \text{in } (0, l) \times (0, t_0),\\
\delta T = 0 & \text{on } \{z = 0, l\} \times (0, t_0),\\
\delta T(\cdot, 0) = \delta T_0 & \text{on } (0, l) \times \{t = 0\}
\end{cases}
\]
for any $t_0 > 0$. ω is here determined according to (31) by W and T, and $\delta T_0 := \max_P T_0 - \min_P T_0$. Note that $\lambda \in C^{1+\alpha, (1+\alpha)/2}(L \times [0, t_0])$ and is horizontally symmetric by assumption, whereas ω is merely bounded and continuous by definition. δT is nonnegative due to the zero-mean condition $\langle T \rangle = 0$. Recall, furthermore, the correspondence $C^{0,1}([0, l] \times [0, t_0]) \sim W^{1,\infty}((0, l) \times (0, t_0))$, which implies that δT is also a weak (sub)solution in the sense of Theorem 2, and observe the equivalence of norms:
\[
\max_L |T| \le \max_{[0, l]} \delta T \le 2 \max_L |T| .
\]
Therefore, Theorem 2 yields the result:

Theorem 3. Any smooth, P-periodic solution of problem (51) decays in time according to
\[
\max_L |T(\cdot, t)| \le C \max_L |T_0|\; e^{-dt}, \qquad t \ge 0 .
\]

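C and d depend only on l, $\lambda_0$, and a pointwise bound on $\partial_\tau W$ (or $v_z$).

In the spherical case the modified toroidal scalar $\mathcal{T} = rT$ is governed by (cf. (16)):
\[
\begin{cases}
\partial_t \mathcal{T} = \partial_r(\lambda\, \partial_r \mathcal{T}) + \lambda \nabla_{nr} \cdot \nabla_{nr} \mathcal{T} - \partial_r W_{\mathcal{T}} - \mathbf{v} \cdot \nabla_{nr} \mathcal{T}
 + \langle \partial_r W_{\mathcal{T}} + \mathbf{v} \cdot \nabla_{nr} \mathcal{T} \rangle & \text{in } B_R \times (0, \infty),\\
\mathcal{T} = 0 & \text{on } (S_R \cup \{r = 0\}) \times (0, \infty),\\
\mathcal{T}(\cdot, 0) = \mathcal{T}_0, \quad \langle \mathcal{T}_0 \rangle = 0 & \text{on } B_R \times \{t = 0\}.
\end{cases}
\tag{52}
\]
Note that (52)$_1$ does not fit precisely into the framework of Sect. 3, since $B_R$ is a “warped” product of its factors [0, R] and $S_r$. In particular, the nonradial gradient $\nabla_{nr}$ depends also on r. However, checking the proofs in Sect. 3 one makes sure that Theorem 1 applies to this situation as well. Thus, starting with a smooth solution of (52), Theorem 1 yields $\delta \mathcal{T}(r, t) = \max_{S_r} \mathcal{T}(\cdot, t) - \min_{S_r} \mathcal{T}(\cdot, t)$ to be of class $C^{0,1}([0, R] \times [0, t_0])$ and to be a weak subsolution of the problem
\[
\begin{cases}
\partial_t \delta \mathcal{T} = \partial_r(\lambda\, \partial_r \delta \mathcal{T}) - \partial_r(\omega\, \delta \mathcal{T}) & \text{in } (0, R) \times (0, t_0),\\
\delta \mathcal{T} = 0 & \text{on } \{r = 0, R\} \times (0, t_0),\\
\delta \mathcal{T}(\cdot, 0) = \delta \mathcal{T}_0 & \text{on } (0, R) \times \{t = 0\}
\end{cases}
\]
for any $t_0 > 0$. Proceeding as in the plane case, Theorem 2 yields then the result: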


Theorem 4. Any smooth solution $\mathcal{T}$ of problem (52) decays in time according to
\[
\max_{B_R} |\mathcal{T}(\cdot, t)| \le C \max_{B_R} |\mathcal{T}_0|\; e^{-dt}, \qquad t \ge 0 . \tag{53}
\]
C and d depend only on R, $\lambda_0$, and a pointwise bound on $\partial_\tau W$ (or $v_r$). Note that (53), if expressed in terms of the original variable $T = \mathcal{T}/r$, amounts to the nonuniform bound
\[
\max_{S_r} |T(\cdot, t)| \le C\, \frac{R}{r} \max_{B_R} |T_0|\; e^{-dt}, \qquad 0 < r \le R, \ t \ge 0 .
\]
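The equivalence of norms used to pass from the decay of the oscillation δT to the decay of the maximum norm relies only on the zero-mean condition: a function with vanishing mean takes both nonnegative and nonpositive values, so its oscillation controls its maximum norm and vice versa. A minimal sketch on sampled data follows; the sample function and helper names are hypothetical and serve only as an illustration.

```python
import math

def zero_mean_samples(n=256):
    """Samples of a hypothetical periodic function with the discrete mean removed."""
    raw = [math.sin(2.0 * math.pi * k / n) + 0.4 * math.cos(6.0 * math.pi * k / n + 1.0)
           for k in range(n)]
    mean = sum(raw) / n
    return [v - mean for v in raw]

def oscillation(values):
    """max - min, the discrete analogue of delta T."""
    return max(values) - min(values)

def sup_norm(values):
    return max(abs(v) for v in values)

vals = zero_mean_samples()
osc, sup = oscillation(vals), sup_norm(vals)
```

For zero-mean data one expects `sup <= osc <= 2 * sup`, which is the discrete counterpart of the two-sided bound between max |T| and max δT.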

Acknowledgements. The author would like to thank M. Seehafer for stimulating discussions and valuable comments.

References
1. Moffatt, H.K.: Magnetic Field Generation in Electrically Conducting Fluids. Cambridge: Cambridge University Press, 1978
2. Bullard, E.C., Gellman, H.: Homogeneous dynamos and terrestrial magnetism. Phil. Trans. R. Soc. Lond. A 247, 213–278 (1954)
3. Elsasser, W.M.: Hydromagnetic dynamo theory. Rev. Mod. Phys. 28, 135–163 (1956)
4. Childress, S.: Théorie Magnétohydrodynamique de l’Effet Dynamo. Report of Département Mécanique de la Faculté des Sciences, Paris, 1969
5. Busse, F.H.: Mathematical problems of dynamo theory. In: Applications of Bifurcation Theory, Ed. P.H. Rabinowitz. New York-San Francisco-London: Academic Press, 1977, pp. 175–202
6. Ivers, D.J., James, R.W.: An antidynamo theorem for partly symmetric flows. Geophys. Astrophys. Fluid Dynam. 44, 271–278 (1988)
7. Ivers, D.J.: Antidynamo Theorems. Ph.D. Thesis, University of Sydney, 1984
8. Cowling, T.G.: The magnetic field of sunspots. Mon. Not. R. Astr. Soc. 94, 39–48 (1934)
9. Elsasser, W.M.: Induction effects in terrestrial magnetism. I. Theory. Phys. Rev. 69, 106–116 (1946)
10. Kaiser, R., Schmitt, B.J., Busse, F.H.: On the invisible dynamo. Geophys. Astrophys. Fluid Dynam. 77, 93–109 (1994)
11. Ivers, D.J., James, R.W.: Axisymmetric antidynamo theorems in compressible non-uniform conducting fluids. Phil. Trans. R. Soc. Lond. A 312, 179–218 (1984)
12. Backus, G.E.: A class of self-sustaining dissipative spherical dynamos. Ann. Phys. 4, 372–447 (1958)
13. Kaiser, R.: The non-radial velocity theorem revisited. Geophys. Astrophys. Fluid Dynam. 101, 185–197 (2007)
14. Lortz, D., Meyer-Spasche, R., Stredulinsky, E.W.: Asymptotic behavior of the solutions of certain parabolic equations. Comm. Pure Appl. Math. 37, 677–703 (1984)
15. Schmitt, B.J., von Wahl, W.: Decomposition of solenoidal fields into poloidal fields, toroidal fields, and the mean flow. Applications to the Boussinesq-equations. In: The Navier Stokes Equations II – Theory and Numerical Methods, Lecture Notes in Mathematics 1530, Ed. I. G. Heywood, K. Masuda, R. Rautmann, S. A. Solonnikov. Berlin-Heidelberg-New York: Springer-Verlag, 1992, pp. 291–305
16. Schmitt, B.J.: The poloidal-toroidal representation of solenoidal fields in spherical domains. Analysis 15, 257–277 (1995)
17. Schmitt, B.J.: Purely toroidal solutions of the kinematic dynamo equation. Nonlinear Differ. Equ. Appl. (NoDEA) 4, 217–231 (1997)
18. Ladyženskaja, O.A., Solonnikov, V.A., Ural’ceva, N.N.: Linear and Quasilinear Equations of Parabolic Type. Translations of Mathematical Monographs, Vol. 23, Providence, RI: Amer. Math. Soc., 1968
19. Schmitt, B.J.: Abschätzungen für die Poloidal-Toroidal-Zerlegung von Solenoidalen Vektorfeldern in Dreidimensionalen Kugeln, Anwendungen auf Toroidale Lösungen der Kinematischen Dynamogleichung. Thesis, University of Bayreuth, 1994, p. 63

Communicated by P. Constantin

Commun. Math. Phys. 290, 651–677 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0720-1

On a Constrained 2-D Navier-Stokes Equation

E. Caglioti1, M. Pulvirenti1, F. Rousset2

1 Dipartimento di Matematica, Università di Roma ‘La Sapienza’, P.le Aldo Moro 2, 00185 Roma, Italy. E-mail: [email protected]; [email protected]
2 CNRS, Laboratoire Dieudonné, Université de Nice, Parc Valrose, 06108 Nice cedex 2, France. E-mail: [email protected]

Received: 24 July 2008 / Accepted: 11 September 2008
Published online: 8 January 2009 – © Springer-Verlag 2008

Abstract: The planar Navier-Stokes equation exhibits, in the absence of external forces, a trivial asymptotics in time. Nevertheless the appearance of coherent structures suggests non-trivial intermediate asymptotics which should be explained in terms of the equation itself. Motivated by the separation of the different time scales observed in the dynamics of the Navier-Stokes equation, we study the well-posedness and asymptotic behaviour of a constrained equation which neglects the variation of the energy and moment of inertia.

1. Introduction

Consider the two-dimensional Euler equation in vorticity form
$$(\partial_t + u\cdot\nabla)\,\omega(x,t) = 0, \quad x\in\mathbb{R}^2, \tag{1.1}$$
where the divergence free velocity field $u$ is given by $u = \nabla^\perp\psi$, $\psi = -\Delta^{-1}\omega$. Explicitly, we can write:
$$u = K*\omega, \quad K(x) = \nabla^\perp g = -\frac{1}{2\pi}\frac{x^\perp}{|x|^2}, \quad g(x) = -\frac{1}{2\pi}\log|x|. \tag{1.2}$$

Current address: IRMAR, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes cedex, France. E-mail: [email protected]

The rigorous justification of the formation of coherent structures in two-dimensional fluid-dynamics, which is observed in real and numerical experiments (see e.g. [22]) remains a widely open problem. An attempt to justify the appearance of these coherent structures is due to Onsager [27], see also [17,25], and [11] for a recent review. The main idea is to replace the incompressible Euler equation by the system of $N$ point vortices and to study the Statistical Mechanics of these point vortices. In the mean field limit $N\to+\infty$, the Gibbs measure associated to the point vortices concentrates on some special stationary solutions of the Euler equation (called mean field solutions); we refer to [7,8,15,16] for the rigorous justification. These states are of the form:
$$\omega = \frac{e^{b\psi + a\frac{|x|^2}{2}}}{Z}, \quad Z = \int_{\mathbb{R}^2} e^{b\psi + a\frac{|x|^2}{2}}\,dx. \tag{1.3}$$

In this last expression, $Z$ is a normalization factor to have $\int\omega = 1$, and $b$ real and $a<0$ are parameters. From a mathematical point of view this equation enters in the framework of a general class of nonlinear elliptic equations given by
$$-\Delta\psi = \frac{e^{b\psi + a\frac{|x|^2}{2}}}{Z} \tag{1.4}$$
which has been studied in [7,8]. Nevertheless, there is no justification of the fact that, among the infinite number of stationary stable solutions of the Euler equation, the mean field solutions play indeed a special role in the dynamics. Another justification of (1.3) could come from the intermediate asymptotic behaviour of the two-dimensional Navier-Stokes equation which, in vorticity form, reads
$$(\partial_t + u\cdot\nabla)\,\omega(x,t) = \nu\Delta\omega(x,t), \quad x\in\mathbb{R}^2, \tag{1.5}$$
where $\nu>0$ is the viscosity coefficient. Indeed, due to the dissipation term in the right-hand side of Eq. (1.5), the asymptotic behaviour of the solutions is trivial, namely, when $t\to+\infty$, $\omega(x,t)\to0$ pointwise and in the $L^p$ sense, for $p>1$. Consequently, the states (1.3) could play a part only in the intermediate behaviour of the equation before the dissipation scale. To give a more quantitative description of this idea, it is useful to recall that the solutions of (1.3) can be studied through a variational principle: the radial solutions are obtained as minimizers of the Boltzmann entropy (this is proven in the Appendix)
$$S(\omega) = \int_{\mathbb{R}^2}\omega\log\omega\,dx$$
under the constraints
$$E(\omega) = E, \quad M(\omega) = 0, \quad I(\omega) = I, \quad \omega\ge0, \quad \int_{\mathbb{R}^2}\omega\,dx = 1,$$
for some fixed $E$ and $I$, where the energy $E(\omega)$, the center of vorticity $M(\omega)$ and the moment of inertia $I(\omega)$ are respectively given by
$$E(\omega) = \frac12\int_{\mathbb{R}^2}\psi\,\omega\,dx, \quad M(\omega) = \int_{\mathbb{R}^2}x\,\omega\,dx, \quad I(\omega) = \frac12\int_{\mathbb{R}^2}|x-M|^2\,\omega.$$
Note that it is always possible to choose the coordinates so that $M(\omega)=0$. It is thus interesting to study how these quantities, which are conserved by the Euler equation (1.1), evolve under the Navier-Stokes flow. At first, it is well known that the Navier-Stokes equation preserves the nonnegativity and that $\int\omega$ and $M$ are conserved. Consequently, throughout this paper, we will focus on non-negative solutions which are normalized such that $\int\omega = 1$ and $M(\omega)=0$. Next, we also observe that
$$\dot I(\omega) = 2\nu, \quad \dot E = -\nu\int_{\mathbb{R}^2}\omega^2, \quad \dot S(\omega) = -\nu\int_{\mathbb{R}^2}\frac{|\nabla\omega|^2}{\omega}.$$


It is easy to see that $I$ can be considered as constant for times $t\ll\nu^{-1}$. Moreover, it is likely that in certain cases (see for instance [22]) the energy dissipation rate is much smaller than the entropy dissipation rate. Coming back to an attempt to justify Eq. (1.4) in terms of the Navier-Stokes evolution, the first naive remark is that, if the energy and the moment of inertia are assumed to vary on a long time scale, they can be considered as constant in a first approximation. On such a time scale, the motion should be governed by a master equation, which modifies the Navier-Stokes equation leaving constant both energy and moment of inertia, but retaining all the other features of the Navier-Stokes dynamics. Therefore such a master equation, dissipating the entropy at constant energy and moment of inertia, would lead to the solution to Eq. (1.4) as $t\to\infty$. By using a recent geometric gradient flow characterization of the Navier-Stokes equation (see [35] for example) connected with the mass transport problem and the associated differential calculus introduced in [29] (see also [1]) we have derived such an equation in [10]. In this framework the Navier-Stokes equation can be written as a differential equation for a vector field which is the sum of a dissipative part, which is the gradient flow of the entropy, and a conservative part, corresponding to the Euler equation, which is the orthogonal gradient of the energy. The equation we were looking for was obtained by keeping only the orthogonal projection of the vector field in the tangent space of the manifold $I=\mathrm{const}$ and $E=\mathrm{const}$. Here “orthogonal projection” means orthogonal for the Riemannian metric in the framework of Otto’s calculus.

We just recall that in this framework the set of probability measures $\mathcal{M}$ is equipped with a structure of Riemannian manifold where the tangent space at $\rho$ is parametrized as
$$T_\rho\mathcal{M} = \{\dot\rho = -\nabla\cdot(\rho u)\}$$
and the Riemannian metric is given by
$$g_\rho(\dot\rho_1,\dot\rho_2) = \int\rho\,u_1\cdot u_2, \quad \dot\rho_i = -\nabla\cdot(\rho u_i).$$
Due to the structure of the Navier-Stokes equation, this procedure modifies the dissipative part while leaving invariant the conservative part. Thus we have found the equation:
$$\partial_t\omega + u\cdot\nabla\omega = \nu\,\mathrm{div}\big(\nabla\omega - b\,\omega\nabla\psi - a\,\omega x\big) = \nu\,\mathrm{div}\Big(\omega\nabla\Big(\log\omega - b\psi - a\frac{|x|^2}{2}\Big)\Big), \tag{1.6}$$
where the Lagrange multipliers $b$ and $a$ are given by
$$b = b(\omega) = \frac{2I\int\omega^2 + 2V}{2I\int\omega|\nabla\psi|^2 - V^2}; \quad a = a(\omega) = -\frac{2\int\omega|\nabla\psi|^2 + V\int\omega^2}{2I\int\omega|\nabla\psi|^2 - V^2}, \tag{1.7}$$
and
$$V = \int\omega\,x\cdot\nabla\psi = \int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\omega(x,t)\,\omega(y,t)\,x\cdot\nabla g(x-y)\,dx\,dy = -\frac{1}{4\pi}. \tag{1.8}$$
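The multipliers (1.7) can be recovered by a short formal computation, sketched below under assumptions that are not spelled out at this point of the text: $\omega$ decays fast enough for every integration by parts, $\int\omega = 1$, $M(\omega)=0$, and the transport term $u\cdot\nabla\omega$ contributes neither to $\dot I$ nor to $\dot E$. Requiring $\dot I = \dot E = 0$ along (1.6) gives a two by two linear system for $(b,a)$, which is the same system that appears later in the proof of Theorem 3.

```latex
% Formal sketch: constancy of I and E along (1.6) determines b and a.
% F denotes the flux in (1.6); this shorthand is introduced here for convenience.
\begin{align*}
F &:= \nabla\omega - b\,\omega\nabla\psi - a\,\omega x, \\
\frac{1}{\nu}\,\dot I &= \int\frac{|x|^2}{2}\,\mathrm{div}\,F
   = -\int x\cdot F = 2 + bV + 2aI, \\
\frac{1}{\nu}\,\dot E &= \int\psi\,\mathrm{div}\,F
   = -\int\nabla\psi\cdot F = -\int\omega^2 + b\int\omega|\nabla\psi|^2 + aV .
\end{align*}
% Setting both derivatives to zero:
\begin{align*}
b\int\omega|\nabla\psi|^2 + aV &= \int\omega^2, &
bV + 2aI &= -2 .
\end{align*}
% Cramer's rule with determinant D = 2I\int\omega|\nabla\psi|^2 - V^2:
\begin{align*}
b &= \frac{2I\int\omega^2 + 2V}{D}, &
a &= -\frac{2\int\omega|\nabla\psi|^2 + V\int\omega^2}{D}.
\end{align*}
```

The identities $-\int x\cdot\nabla\omega = 2\int\omega = 2$ and $\int\nabla\psi\cdot\nabla\omega = -\int\omega\,\Delta\psi = \int\omega^2$ are the only ingredients beyond the definitions of $V$ and $I$.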

A way to validate this approach is to test it in the following simpler and well understood case. A special self-similar solution to Eq. (1.5) is the so-called Oseen vortex:
$$\omega(x,t) = \frac{1}{4\pi\nu(t+1)}\,e^{-\frac{|x|^2}{4\nu(t+1)}}.$$
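As a sanity check, one can verify by hand that this profile solves $\partial_t\omega = \nu\Delta\omega$. The sketch below uses the shorthand $s = t+1$, which is introduced only for this computation. Since $\omega$ is radial, the velocity $u = K*\omega$ is tangent to circles centered at the origin, so $u\cdot\nabla\omega = 0$ and (1.5) holds as well.

```latex
% Shorthand (not from the paper): s = t + 1.
\begin{align*}
\omega &= \frac{1}{4\pi\nu s}\,e^{-\frac{|x|^2}{4\nu s}}, &
\partial_t\omega &= \Big(-\frac{1}{s} + \frac{|x|^2}{4\nu s^2}\Big)\omega, \\
\nabla\omega &= -\frac{x}{2\nu s}\,\omega, &
\nu\,\Delta\omega &= \nu\Big(-\frac{1}{\nu s} + \frac{|x|^2}{4\nu^2 s^2}\Big)\omega
   = \Big(-\frac{1}{s} + \frac{|x|^2}{4\nu s^2}\Big)\omega .
\end{align*}
```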


Note that this is also a solution to the heat equation. It was shown by Gallay and Wayne [12] that this solution describes the long time asymptotics of the Navier-Stokes equation in $L^1$. Indeed, with the change of variables
$$\xi = \frac{x}{\sqrt{1+t}}; \quad \tau = \log(1+t), \quad \omega(x,t) = (1+t)^{-1}\,w(\xi,\tau),$$
the Navier-Stokes equation in the new variables reads:
$$\partial_\tau w + v\cdot\nabla_\xi w = \nu\Delta_\xi w + \nu\,\nabla_\xi\cdot\Big(\frac12\,\xi w\Big). \tag{1.9}$$
It is possible to show that $w\to W$ in $L^1$ as $\tau\to\infty$, where $W(\xi)$ is the rescaled Oseen vortex. As a consequence the Oseen vortex can be thought of as characterizing an intermediate asymptotics for times $\nu t\ll1$. This analysis enters perfectly in the context of the projected gradient flows. Indeed imposing the constancy of $I$ in the Navier-Stokes equation we find
$$\partial_t\omega + u\cdot\nabla\omega = \nu\Delta\omega + \nu\,\frac{1}{I}\,\nabla\cdot(\omega x), \tag{1.10}$$

which is Eq. (1.9) for $I = 2$. Moreover, the rescaled Oseen vortex $W(\xi)$ is a solution of the Mean Field equation (1.4) for $b = 0$ and $a = -1/2\nu$. This suggests also imposing the constancy of $E$ in the attempt to outline what happens before the occurrence of the Oseen vortex. Indeed one could argue that $I$ is more robust than $E$ in many interesting physical situations. If so, Eq. (1.6) should be more appropriate on the time scale when $E$ is practically constant, while Eq. (1.9) should describe the fluid when $E$ starts to be dissipated at constant $I$. The aim of this paper is the mathematical study of Eqs. (1.6), (1.7). For more details on the derivation of these equations and on the physical motivations, we refer to [10]. As explained in [10], the procedure of constraining a diffusion equation is highly non-unique. Nevertheless, an interesting feature of Eq. (1.6) is that it can be obtained by many different methods. As already explained, it appears naturally by using the geometric structure of the Navier-Stokes equation. Moreover, it was noticed in [10] that Eq. (1.6) can also be obtained by constraining the stochastic vortex dynamics which is a finite dimensional approximation to the Navier-Stokes equation (see [18,19,28]). Indeed (at a formal level), it is shown that the stochastic process for a system of $N$ stochastic vortices, once constrained on the $E=\mathrm{const}$ manifold, produces, in the mean-field limit, exactly Eq. (1.6), which turns out to be compatible with this particle approximation. Equation (1.6) however is not new. It was previously derived by Chavanis in [5] and [6] following a completely different approach based on the kinetic theory of (deterministic) point vortices. We finally note that the projection on the manifold $I=\mathrm{const}$, according to the gradient notion used here, has been considered by Carlen and Gangbo in [4] for a different class of equations.
A different approach to the one of Onsager [27] in order to understand the coherent structures arising in 2D flows was proposed by Robert and Sommeria [32] and Miller [24]. The equilibrium solutions are obtained by using the maximum entropy principle over a state space formed by a selected family of possible values of the vorticity. Note that this procedure preserves all the Euler invariants so that, as far as the equilibrium is concerned, the Robert-Sommeria-Miller theory is quite different from the approaches,


as the present one, based on the mean field equation. As regards the dynamics, a class of master equations leading to such equilibria has been introduced in [33]. One of them, which has some formal similarities with our model, has been systematically investigated from a mathematical point of view in [23]. We remark that such an equation exhibits a maximum principle leading to useful a priori estimates, for instance the $L^\infty$ norm of the vorticity is uniformly bounded while, in our case, we do not have such a priori control. Our aim is to establish global existence results for (1.6) and to study the asymptotic behaviour of global solutions. There are two main difficulties. The first one is that Eq. (1.6) makes no sense whenever the denominator in the definition of $a$ and $b$ (see Eq. (1.7)) vanishes. Note that by the Cauchy-Schwarz inequality we have
$$V^2 = \Big(\int\omega\,x\cdot\nabla\psi\Big)^2 \le 2I\int\omega|\nabla\psi|^2,$$
and thus this denominator is always non-negative. Nevertheless, when $\omega\nabla\psi$ and $\omega x$ are collinear, it vanishes. This happens for the one-dimensional family of circular vortex patches:
$$\omega = \frac{1}{\pi R^2}\,\chi_{B(0,R)},$$
where $\chi_{B(0,R)}$ is the characteristic function of $B(0,R)$, the disk of center $0$ and radius $R$. Indeed, we have:
$$\omega\nabla\psi = -\frac{\omega x}{2\pi R^2}\,\chi_{B(0,R)}.$$

The other difficulty is that $b$ is well-defined if $\omega\in L^2$ but there are no a priori estimates available for the $L^2$ norm of the vorticity. The only a priori information we have at our disposal is that $E$ and $I$ are conserved (note that in this setting $E$ is not very useful since the energy has no sign) and that the entropy $S$ decays. Indeed, we formally have
$$\frac{d}{dt}S(\omega) = -\nu\int\omega\,\nabla\log\omega\cdot\nabla\Big(\log\omega - b\psi - a\frac{|x|^2}{2}\Big) = -\nu\int\omega\,\Big|\nabla\Big(\log\omega - a\frac{|x|^2}{2} - b\psi\Big)\Big|^2.$$
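The direct computation behind this identity can be sketched as follows, assuming enough decay and regularity to integrate by parts. The shorthand $\Phi$ is introduced here for convenience and is not notation from the paper. The transport term drops out of $\dot S$ because $\mathrm{div}\,u = 0$, and the passage from the first to the second expression uses exactly the two constraints $\dot E = \dot I = 0$ that define $b$ and $a$.

```latex
% Shorthand (ours): \Phi = \log\omega - b\psi - a|x|^2/2, so that (1.6) reads
% \partial_t\omega + u\cdot\nabla\omega = \nu\,\mathrm{div}(\omega\nabla\Phi).
\begin{align*}
\frac{d}{dt}S(\omega) &= \int(\log\omega + 1)\,\partial_t\omega
   = -\nu\int\omega\,\nabla\log\omega\cdot\nabla\Phi, \\
0 = \dot E &= \int\psi\,\partial_t\omega = -\nu\int\omega\,\nabla\psi\cdot\nabla\Phi, \qquad
0 = \dot I = \int\frac{|x|^2}{2}\,\partial_t\omega = -\nu\int\omega\,x\cdot\nabla\Phi, \\
\frac{d}{dt}S(\omega) &= \frac{d}{dt}S(\omega) - b\,\dot E - a\,\dot I
   = -\nu\int\omega\,\nabla\Big(\log\omega - b\psi - a\frac{|x|^2}{2}\Big)\cdot\nabla\Phi
   = -\nu\int\omega\,|\nabla\Phi|^2 .
\end{align*}
```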

This identity can be checked by direct computation and it is obvious by using the geometric interpretation of the equation (see [10]). Note that this identity is also useful to guess that asymptotic states should be given by the mean field solutions (1.3) since the entropy dissipation vanishes precisely on these states. Because of these difficulties, we will be able to get global existence results only for data sufficiently close to a mean field solution. Nevertheless, we point out that our smallness constraint is independent of the viscosity parameter $\nu$. It remains an open problem to establish if Eq. (1.6) can produce a singularity in a finite time and in particular if the $L^2$ norm of $\omega$ can blow up. Note that if we consider (1.6) without (1.7), i.e. we consider the equation with some given parameters $a<0$ and $b$ fixed, it is easy to establish the existence of solutions that blow up. Indeed, since the inertial term does not play any part in the estimates, all the results established in [3] for the Keller-Segel equation remain


true for this equation. In particular we have that the evolution of $I$, which is nonnegative, is given by
$$\dot I = aI + \Big(2 - \frac{b}{4\pi}\Big)$$
and hence, if the fixed parameter $b$ is larger than $8\pi$, the solution must blow up in finite time. Of course, this argument is useless if $a$, $b$ are given by (1.7) and thus not fixed. Consequently, it would be very interesting to know if there is a nonlinear stabilization for (1.6), (1.7). The paper is organized as follows: the global existence proof is presented in Sect. 3 after some preliminary steps are discussed in Sect. 2. Section 4 is devoted to the proof of $L^p$ estimates necessary both for the existence part and the asymptotic behavior discussed in Sect. 5. Finally, the Appendix is devoted to the study of Eq. (1.3) and the connected variational principles. The main ideas are in [7,8], but the adaptation of these results to the $\mathbb{R}^2$ case requires some care.

2. Preliminaries

Let us introduce the submanifold of probability densities
$$\mathcal{M}(E,I) = \Big\{\omega,\ \omega\ge0,\ \int\omega = 1,\ E(\omega) = E,\ I(\omega) = I\Big\} \tag{2.1}$$

for some fixed $E$ and $I>0$. Next, by using Theorems 12, 14 in the Appendix, we denote the unique minimizer of the entropy functional $S(\omega)$ on $\mathcal{M}(E,I)$ by $\omega_{MF}$. Note that $\omega_{MF}$ is a (radial) solution to Eq. (1.3) with parameters $b$ and $a$ that we denote by $b_{MF}$ and $a_{MF}$ respectively.

2.1. A stability property. We first establish a crucial result asserting the continuity of the $L^1$ norm with respect to variation of the entropy, in a neighbourhood of $\omega_{MF}$, in the manifold $\mathcal{M}(E,I)$.

Theorem 1. For any $\varepsilon>0$ there exists $\delta>0$ such that, for all $\omega\in\mathcal{M}(E,I)$ for which
$$S(\omega) - S(\omega_{MF}) \le \delta, \tag{2.2}$$
we have
$$\|\omega - \omega_{MF}\|_{L^1} \le \varepsilon. \tag{2.3}$$

We remark that Thm. 1 provides a proof of the Lyapounov stability of $\omega_{MF}$ with respect to the Euler flow (by virtue of the time invariance of $S(\omega)$). The stability of $\omega_{MF}$ was also proved in [7] by using the arguments of [20] in which all the conserved quantities of the Euler equation are used.

Proof of Theorem 1. Assume, by contradiction, the existence of a sequence $\omega_n\in\mathcal{M}(E,I)$ and $\varepsilon>0$ such that
$$\lim_n S(\omega_n) = S(\omega_{MF}) \quad\text{and}\quad \|\omega_n - \omega_{MF}\|_{L^1} \ge \varepsilon. \tag{2.4}$$


Thanks to the entropy bound, we can find a probability distribution $\omega\in L^1$ such that, up to the extraction of a subsequence, $\lim_n\omega_n = \omega$ in the sense of the weak convergence of measures. Next, we also have that
$$\lim_n E(\omega_n) = E(\omega) = E$$
(see the proof of (A.11) in the Appendix) and that
$$\lim_n I(\omega_n) = I \ge I(\omega), \quad \lim_n S(\omega_n) \ge S(\omega)$$
by convexity. Let now $a_{MF}$, $b_{MF}$ be the multipliers for which $\omega_{MF}$ solves Eq. (1.3) with those values of parameters, and $F_{(a_{MF},b_{MF})}$ the free energy functional (see (A.1) for the definition). We get, since $a_{MF}<0$, that
$$F_{(a_{MF},b_{MF})}(\omega) \le \lim_n\big(S(\omega_n) - b_{MF}E(\omega_n) - a_{MF}I(\omega_n)\big) = S(\omega_{MF}) - b_{MF}E(\omega_{MF}) - a_{MF}I(\omega_{MF}) = F_{(a_{MF},b_{MF})}(\omega_{MF}).$$
Since $\omega_{MF}$ is the unique minimizer of $F_{(b_{MF},a_{MF})}$ (see Theorem 12), it follows that $\omega = \omega_{MF}$. As a consequence, we also get that
$$\lim_n I(\omega_n) = I(\omega) = I(\omega_{MF}). \tag{2.5}$$
Finally, we consider the relative entropy
$$S(\omega_n|\omega_{MF}) = \int\omega_n\log\Big(\frac{\omega_n}{\omega_{MF}}\Big) = S(\omega_n) - S(\omega_{MF}) + b_{MF}\int(\omega_{MF}-\omega_n)\,\psi_{MF} + a_{MF}\big(I(\omega_{MF}) - I(\omega)\big).$$
Now, we observe that $S(\omega_n) - S(\omega)$ goes to zero thanks to (2.4) and that
$$b_{MF}\int(\omega_{MF}-\omega_n)\,\psi_{MF} + a_{MF}\big(I(\omega_{MF}) - I(\omega)\big)$$
also goes to zero by weak convergence and (2.5). Consequently, the relative entropy $S(\omega_n|\omega_{MF})$ goes to zero. Thus we conclude by using the Csiszár-Kullback inequality
$$\|\omega_n - \omega_{MF}\|^2_{L^1} \le 2\,S(\omega_n|\omega_{MF}) \to 0$$
which yields the desired contradiction.


2.2. Properties of the coefficients $a(\omega)$, $b(\omega)$. Our next step is the study of the properties of $a$ and $b$. We shall use the notation
$$b(\omega) = \frac{2I\int\omega^2 + 2V}{D(\omega)}, \quad a(\omega) = -\frac{2\int\omega|\nabla\psi|^2 + V\int\omega^2}{D(\omega)}, \tag{2.6}$$
where
$$D(\omega) = 2I\int\omega|\nabla\psi|^2 - V^2, \quad V = -\frac{1}{4\pi}. \tag{2.7}$$
As we have seen in the introduction, one of the main difficulties is that the denominator $D(\omega)$ may vanish for a vortex patch. Before stating the result, we shall recall a useful set of inequalities which will be used throughout the paper (see e.g. [34]).

Lemma 2 (Useful inequalities in $\mathbb{R}^2$).
i) Sobolev-Gagliardo-Nirenberg: The following inequalities hold for some $C>0$:
$$\|\omega\|_{L^2} \le C\,\|\nabla\omega\|_{L^1}, \tag{2.8}$$
$$\|\omega\|^2_{L^2} \le C\,\|\omega\|_{L^1}\|\nabla\omega\|_{L^2}. \tag{2.9}$$
ii) Biot and Savart law: Let $u = K*\omega$ with $K$ defined by (1.2), then we have
$$1<p<2,\ 2<q<+\infty,\ \frac1q = \frac1p - \frac12, \quad \|u\|_{L^q} \le C\,\|\omega\|_{L^p}, \tag{2.10}$$
$$1\le p<2,\ 2<q\le\infty,\ \frac12 = \frac{\alpha}{p} + \frac{1-\alpha}{q}, \quad \|u\|_{L^\infty} \le C\,\|\omega\|^{\alpha}_{L^p}\|\omega\|^{1-\alpha}_{L^q}, \tag{2.11}$$
$$1<p<+\infty, \quad \|\nabla u\|_{L^p} \le C\,\|\omega\|_{L^p}. \tag{2.12}$$
iii) Interpolation in $L^p$ spaces:
$$1\le p<r<q\le+\infty, \quad \|\omega\|_{L^r} \le 2\,\|\omega\|^{\alpha}_{L^p}\|\omega\|^{1-\alpha}_{L^q}, \quad \alpha = \frac{p}{r}\,\frac{1-\frac{r}{q}}{1-\frac{p}{q}}. \tag{2.13}$$

We shall prove the following result:

Theorem 3. Suppose $\omega\in L^p\cap\mathcal{M}(E,I)$ for some $p\ge2$.
1. We have:
$$b(\omega_{MF}) = b_{MF}, \quad a(\omega_{MF}) = a_{MF} \tag{2.14}$$
and $D(\omega_{MF})>0$.
2. $D(\omega)$, $a(\omega)$ and $b(\omega)$ are continuous on $L^2$.
3. Assume that $p>2$, then if $S(\omega) - S(\omega_{MF})$ is sufficiently small, we have
$$|D(\omega) - D(\omega_{MF})| + |a(\omega) - a(\omega_{MF})| + |b(\omega) - b(\omega_{MF})| \le C\,\|\omega - \omega_{MF}\|^{\alpha}_{L^1}, \tag{2.15}$$
for some $\alpha<1$. Here $C$ depends on $E$, $I$ and $\|\omega\|_{L^p}$.


Proof of Theorem 3. We first prove 1. Since $\omega_{MF}$ is a solution of the mean field equation (1.3), we have
$$\nabla\omega_{MF} = b_{MF}\,\omega_{MF}\nabla\psi_{MF} + a_{MF}\,\omega_{MF}\,x.$$
Taking the scalar product of this equation by $\nabla\psi_{MF}$ and $x$, and integrating, we find
$$b_{MF}\int\omega_{MF}|\nabla\psi_{MF}|^2 + a_{MF}V = \int\omega_{MF}^2, \quad b_{MF}V + 2a_{MF}I = -2.$$
The resolution of this two by two linear system precisely gives that $b(\omega_{MF}) = b_{MF}$, $a(\omega_{MF}) = a_{MF}$. Consequently, since the numerator in the definition of $b(\omega_{MF})$ and $a(\omega_{MF})$ is finite, we find that
$$D(\omega_{MF}) > 0. \tag{2.16}$$
Next, we shall estimate the differences
$$\int\omega^2 - \int\omega_{MF}^2, \tag{2.17}$$
$$\int\omega|\nabla\psi|^2 - \int\omega_{MF}|\nabla\psi_{MF}|^2. \tag{2.18}$$
By Cauchy-Schwarz, we obtain
$$|(2.17)| \le \|\omega - \omega_{MF}\|_{L^2}\big(\|\omega\|_{L^2} + \|\omega_{MF}\|_{L^2}\big),$$
and hence we find
$$|(2.17)| \le C\,\|\omega - \omega_{MF}\|^{\alpha}_{L^1} \tag{2.19}$$
for some $\alpha>0$, by using (2.13) with $r = 2$ and $q = 1$. Next, we split (2.18) as
$$|(2.18)| \le \Big|\int(\omega - \omega_{MF})|\nabla\psi|^2\Big| + \Big|\int\omega_{MF}\,\nabla(\psi - \psi_{MF})\cdot\nabla(\psi + \psi_{MF})\Big| \le \|\omega - \omega_{MF}\|_{L^2}\|u\|^2_{L^4} + \|\omega_{MF}\|_{L^2}\big(\|u\|_{L^4} + \|u_{MF}\|_{L^4}\big)\|u - u_{MF}\|_{L^4}.$$
We notice that thanks to (2.10), the $L^4$ norm of the velocity is bounded in terms of the $L^{4/3}$ norm of the vorticity. Therefore, by a new use of (2.13), we find
$$|(2.18)| \le C\big(\|\omega - \omega_{MF}\|_{L^2} + \|\omega - \omega_{MF}\|_{L^{4/3}}\big) \le C\,\|\omega - \omega_{MF}\|^{\alpha}_{L^1}. \tag{2.20}$$
Next, by using (2.19), (2.20), we get that
$$|D(\omega) - D(\omega_{MF})| \le C\,\|\omega - \omega_{MF}\|^{\alpha}_{L^1}.$$
Consequently, we can use Theorem 1 to get that
$$D(\omega) \ge D(\omega_{MF}) - |D(\omega) - D(\omega_{MF})| \ge \frac12 D(\omega_{MF}), \tag{2.21}$$
provided $S(\omega) - S(\omega_{MF})$ is sufficiently small. Finally, by using (2.21), (2.19), (2.20) and (2.16), we easily conclude the proof.


3. Global Existence and Uniqueness

We start with a brief explanation about the construction of a classical local solution. Let $\omega_0\in L^p$ with $p>2$ be the initial condition such that $\omega_0\in\mathcal{M}(E,I)$. Let $\omega_{MF}$ be the unique Mean-Field solution associated to $\mathcal{M}(E,I)$ as above. We assume that $S(\omega_0) - S(\omega_{MF})$ is small enough so that, by virtue of Theorems 1, 3, we have
$$|D(\omega_0) - D(\omega_{MF})| \le \frac12 D(\omega_{MF})$$
and also
$$|a(\omega_0) - a(\omega_{MF})| \le \frac12|a(\omega_{MF})|, \quad |b(\omega_0) - b(\omega_{MF})| \le \frac12|b(\omega_{MF})|.$$
This implies that we have the upper bound
$$|a(\omega_0)| + |b(\omega_0)| \le 2\big(|a(\omega_{MF})| + |b(\omega_{MF})|\big)$$
and also that
$$D(\omega_0) \ge \frac12 D(\omega_{MF}) > 0.$$
Note that the positivity of $D$ is important in order to stay away from the singularity. For every $p\ge2$, by using a standard iterative scheme, we can easily establish a local existence and uniqueness result for a classical solution $\omega\in C([0,T],L^p)\cap L^\infty([0,T],L^1((1+|x|^2)\,dx))$, for which
$$\|\omega(t)\|_{L^p} \le 2\big(\|\omega_0\|_{L^p} + \|\omega_{MF}\|_{L^p}\big), \tag{3.1}$$
and
$$D(\omega(t)) \ge \frac14 D(\omega_{MF}), \quad \forall t\in[0,T]. \tag{3.2}$$
Moreover, we can continue the solution as long as the $L^p$ norm of $\omega$ remains finite and the denominator $D(\omega)$ remains positive. Note that $L^2$ seems the natural space for our equation in order to have $a$ and $b$ well-defined. Note also that the condition (3.2) allows us to avoid the singularity of $D(\omega)$ by Theorem 3. Let $T>0$ be the maximal time for which the estimates (3.1), (3.2) are verified. Our purpose is to prove, by a priori estimates, that $T=+\infty$. To do this we shall use the $L^p$ estimates given by the following theorem which will be proven in the next section.

Theorem 4. Consider a local solution as above of (1.6) such that
$$|a(\omega(t))| + |b(\omega(t))| \le C_0, \quad \forall t\in[0,T]. \tag{3.3}$$
Assume also that $\omega_0\in L^p$, $p\in[2,+\infty)$, is a probability density. Then there exists $C_p$ which depends only on $\omega_0$, $C_0$ and $p$ (and hence does not depend on $\nu$ and $T$ if $C_0$ does not) such that
$$\|\omega(t)\|_{L^p} \le C_p, \quad \forall t\in[0,T]. \tag{3.4}$$


We are now in position to get a global existence result by showing that $T=+\infty$. Indeed by the H-Theorem (the entropy dissipation identity of the Introduction), we have
$$S(\omega(t)) - S(\omega_{MF}) \le S(\omega_0) - S(\omega_{MF}), \quad \forall t\in[0,T], \tag{3.5}$$
and thus by Theorem 1, we get that
$$\|\omega(t) - \omega_{MF}\|_{L^1} \le \varepsilon, \quad \forall t\in[0,T]$$
(with $\varepsilon$ independent of $T$), provided that $S(\omega_0) - S(\omega_{MF})$ is sufficiently small. Next, we can use Theorem 4. Indeed, because of (3.1), (3.2) we have the bound $|a|+|b|\le C_0$ on $[0,T]$ for some $C_0>0$. This yields a control of the $L^p$ norm of $\omega$ with $p>2$, depending only on $C_0$. In particular, thanks to (3.2), we get that
$$|D(\omega(t)) - D(\omega_{MF})| \le C\varepsilon^{\alpha}, \quad \|\omega(t) - \omega_{MF}\|_{L^p} \le C\varepsilon^{\alpha}, \quad \forall t\in[0,T],$$
where $C$ depends on $C_0$ only. Consequently, we can choose $\varepsilon$ sufficiently small to have
$$\|\omega(t)\|_{L^p} < 2\big(\|\omega_0\|_{L^p} + \|\omega_{MF}\|_{L^p}\big), \quad D(\omega(t)) > \frac14 D(\omega_{MF}), \quad \forall t\in[0,T],$$
and hence $T=+\infty$. Therefore we have proven the following global existence result:

Theorem 5. There exists $\delta_0>0$ (independent of $\nu>0$) such that, for any initial datum $\omega_0\in L^p\cap\mathcal{M}(E,I)$ with $p>2$, close to $\omega_{MF}$ in the sense that
$$S(\omega_0) - S(\omega_{MF}) \le \delta_0, \tag{3.6}$$
there exists a unique classical solution $\omega(t)\in C([0,+\infty[,L^p)\cap L^\infty([0,T],\mathcal{M}(E,I))$ to Eq. (1.6) with initial datum $\omega_0$. Moreover, we have the Lyapounov stability of $\omega_{MF}$, namely, for any $\varepsilon>0$, there exists $\delta$, $0<\delta\le\delta_0$, such that if $S(\omega_0) - S(\omega_{MF})\le\delta$, then
$$\|\omega(t) - \omega_{MF}\|_{L^1} \le \varepsilon, \quad \forall t\ge0.$$

As noticed after Theorem 1, $\omega_{MF}$ is also Lyapounov stable as a stationary solution of the Euler equation (1.1). Consequently, we also have the following global stability result between the flows of (1.6) and (1.1) in the vicinity of $\omega_{MF}$:

Corollary 6. For every $\varepsilon>0$, there exists $\delta>0$ such that if $\omega_0\in L^\infty\cap\mathcal{M}(E,I)$ verifies $S(\omega_0) - S(\omega_{MF})\le\delta$, then the global solution $\omega^E$ of the Euler equation (1.1) and the global solution $\omega^\nu$ of (1.6) for $\nu>0$ with the same initial datum $\omega_0$ satisfy
$$\|\omega^\nu(t) - \omega^E(t)\|_{L^1} \le \varepsilon, \quad \forall t\ge0.$$

Note that this global approximation property of the Euler evolution (even for $\nu$ large) is of course false for the Navier-Stokes evolution.


4. Propagation of $L^p$ Regularity

In this section we prove Theorem 4. Our strategy will be based on weighted energy estimates because in this way we can use the fact that the inertial term $u\cdot\nabla\omega$ does not contribute. This is crucial in order to find estimates independent of $\nu$. We first focus on the $L^2$ estimate. We shall use the notation $B = \sup_{t\in[0,T]}\big(|a(t)| + |b(t)|\big)$. The standard $L^2$ energy estimate for (1.6) gives:
$$\frac{d}{dt}\frac12\|\omega(t)\|^2_{L^2} + \nu\|\nabla\omega(t)\|^2_{L^2} = \nu\Big(b(t)\int\omega(t)^3 - a(t)\int\omega(t)^2\Big). \tag{4.1}$$
Next, as in [3,14], we can use the Sobolev inequality (2.8) to get (we recall that $\int\omega = 1$)
$$\int\omega^3 = \int\big(\omega^{3/2}\big)^2 \le \frac94\,C\Big(\int|\nabla\omega|\,\omega^{1/2}\Big)^2 \le \frac94\,C\,\|\nabla\omega\|^2_{L^2},$$
where $C$ is the best constant in the Gagliardo-Nirenberg-Sobolev inequality. Consequently we get
$$\frac{d}{dt}\frac12\|\omega(t)\|^2_{L^2} + \nu\|\nabla\omega(t)\|^2_{L^2} \le \nu CB\,\|\nabla\omega(t)\|^2_{L^2} + \nu B\,\|\omega(t)\|^2_{L^2},$$
where $C$ is an explicit harmless number. Now, let us assume for the moment that $CB$ is sufficiently small (less than $1/2$ for example), then we can deduce that
$$\frac{d}{dt}\frac12\|\omega(t)\|^2_{L^2} + \frac{\nu}{2}\|\nabla\omega(t)\|^2_{L^2} \le \nu B\,\|\omega(t)\|^2_{L^2}. \tag{4.2}$$
If we directly integrate this differential inequality, we still cannot conclude. Indeed we shall find that $\|\omega(t)\|_{L^2}$ grows exponentially in time and this does not allow us to get a uniform in time estimate. Note however that the bad term in the right-hand side of (4.2) comes from the linear term $\nabla\cdot(x\,\omega)$ in Eq. (1.6). The explanation for the bad behaviour we get is simple: the semigroup generated by the linear Fokker-Planck operator $L\omega = \Delta\omega + B\,\nabla\cdot(x\omega)$, with $B>0$, is not uniformly bounded in time as an operator in $\mathcal{L}(L^2)$. Nevertheless, it is bounded as an operator in $\mathcal{L}(L^1\cap L^2, L^2)$. A very simple way to see this property is to use a weighted energy estimate. Indeed, multiplying (4.2) by $e^{\nu Bt}$, we find
$$\frac{d}{dt}\Big(\frac{e^{\nu Bt}}{2}\|\omega(t)\|^2_{L^2}\Big) + \frac{\nu\,e^{\nu Bt}}{2}\|\nabla\omega(t)\|^2_{L^2} \le \frac32\,\nu B\,e^{\nu Bt}\|\omega(t)\|^2_{L^2}, \quad \forall t\in[0,T]. \tag{4.3}$$
Now, we can use (2.9) and the Young inequality to get, for some harmless explicit number $C$ (independent of $\nu$) which changes from line to line,
$$\frac32\,\nu B\,e^{\nu Bt}\|\omega(t)\|^2_{L^2} \le C\,\nu B\,e^{\nu Bt}\|\nabla\omega(t)\|_{L^2} \le \frac14\,\nu\,e^{\nu Bt}\|\nabla\omega(t)\|^2_{L^2} + C\,\nu B^2 e^{\nu Bt}.$$


Consequently, we can plug this last inequality in (4.3) to get
$$\frac{d}{dt}\Big(\frac{e^{\nu Bt}}{2}\|\omega(t)\|^2_{L^2}\Big) \le C\,B^2\,\nu\,e^{\nu Bt}.$$
The integration finally gives
$$\|\omega(t)\|^2_{L^2} \le \|\omega_0\|^2_{L^2} + CB, \quad \forall t\in[0,T].$$
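The integration step can be written out as follows. This is only a sketch of the integrating factor argument, with the shorthand $y(t)$ introduced here and with $B>0$ assumed; as in the text, $C$ denotes a harmless constant that may change from line to line.

```latex
% Shorthand (ours): y(t) = \tfrac12\|\omega(t)\|_{L^2}^2, and B > 0 is assumed.
\begin{align*}
\frac{d}{dt}\big(e^{\nu Bt}\,y(t)\big) &\le C B^2\nu\,e^{\nu Bt}
  \;\Longrightarrow\;
  e^{\nu Bt}\,y(t) - y(0) \le C B^2\nu\int_0^t e^{\nu Bs}\,ds = C B\,\big(e^{\nu Bt} - 1\big), \\
y(t) &\le e^{-\nu Bt}\,y(0) + C B\,\big(1 - e^{-\nu Bt}\big) \le y(0) + C B .
\end{align*}
```

The weight $e^{\nu Bt}$ is what makes the bound uniform in time and independent of $\nu$: the factor $\nu$ produced by the right-hand side is cancelled by the $1/(\nu B)$ coming from the time integral.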

We now remove the assumption on the smallness of $B$ and prove also the propagation of the $L^p$ regularity in the general case $p\ge2$. The starting point is to use the idea of [3,14] in the study of the Keller-Segel equation. For $K>1$, a parameter which will be fixed later, we define
$$m_K(t) = \int(\omega(t) - K)_+\,dx.$$
We note that
$$m_K(t) \le \int_{\omega(t)\ge K}\omega(t) \le \frac{1}{\log K}\int_{\omega\ge K}\omega(t)\log(\omega(t))\,dx \le \frac{1}{\log K}\int\omega(t)\,|\log\omega(t)|. \tag{4.4}$$
Now we can use the following useful inequality:

Lemma 7. There exists $C>0$ such that, for all probability distributions $\omega$, we have
$$\int\omega\,|\log\omega| \le S(\omega) + C\big(1 + I(\omega)\big). \tag{4.5}$$

This is a very classical estimate (see [30] for example). For the sake of completeness, we shall give a proof of this lemma at the end of the section. Thanks to (4.5), we find that $\int\omega|\log\omega|$ is bounded in terms of the initial datum because the entropy is decreasing and $I(\omega)$ is constant. Thus
$$m_K(t) \le \frac{1}{\log K}\,C(\omega_0), \quad \forall t\ge0. \tag{4.6}$$
Next, we can perform a modified $L^p$ energy estimate for the solution of (1.6). After a few integrations by parts, we find
$$\frac{d}{dt}\frac1p\int(\omega-K)_+^p + \nu(p-1)\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2 = \nu B\Big(\int(\omega-K)_+^{p+1} + \Big(2K+\frac2p\Big)\int(\omega-K)_+^p + (K^2+2K)\int(\omega-K)_+^{p-1}\Big). \tag{4.7}$$
To estimate the first term in the right-hand side of (4.7), we use the Sobolev-Gagliardo-Nirenberg inequality (2.8) and Cauchy-Schwarz. We have
$$\int(\omega-K)_+^{p+1} = \int\Big((\omega-K)_+^{\frac{p+1}{2}}\Big)^2 \le C\,\frac{(p+1)^2}{4}\Big(\int(\omega-K)_+^{\frac{p-1}{2}}|\nabla(\omega-K)_+|\Big)^2 \le C\,\frac{(p+1)^2}{4}\,m_K(t)\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2.$$


This yields, thanks to our assumption (3.3) and (4.6),
$$\nu B\int(\omega-K)_+^{p+1} \le \nu\,\frac{C_0 S_0\,(p+1)^2}{4\log K}\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2.$$
Hence, by choosing $K$ such that
$$\frac{C_0 S_0\,(p+1)^2}{4\log K} = \frac12(p-1),$$
we obtain
$$\nu B\int(\omega-K)_+^{p+1} \le \frac{\nu}{2}(p-1)\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2. \tag{4.8}$$
Note that $K$ depends only on $C_0$, $S_0$, $p$ and is diverging with $p$. To estimate the last term in (4.7), we write
$$\int(\omega-K)_+^{p-1} \le \int_{K\le\omega\le K+1}(\omega-K)_+^{p-1} + \int_{\omega\ge K+1}(\omega-K)_+^{p-1} \le 1 + \int(\omega-K)_+^p. \tag{4.9}$$
Indeed, to estimate the first integral, we have used that
$$|\{\omega\ge K\}| \le \frac1K\int\omega \le 1,$$
thanks to the Markov inequality since $\int\omega = 1$ and $K>1$. By plugging (4.9) and (4.8) in (4.7), we find
$$\frac{d}{dt}\frac1p\int(\omega-K)_+^p + \frac{\nu(p-1)}{2}\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2 \le \nu C_0\Big(\Big(K^2+4K+\frac2p\Big)\int(\omega-K)_+^p + K^2+2K\Big) \le C_0\,C\,\nu\Big(\int(\omega-K)_+^p + 1\Big),$$
where, from now on, $C$ is a harmless number which depends only on $K$ and $p$. Again, we note that we cannot directly conclude by using the Gronwall Lemma in the last differential inequality because it gives an estimate which is not uniform in time. We now use the technique that we have explained in the beginning. We find
$$\frac{d}{dt}\Big(e^{\nu t}\,\frac1p\int(\omega-K)_+^p\Big) + \frac{\nu(p-1)}{2}\,e^{\nu t}\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2 \le C(1+C_0)\,\nu\,e^{\nu t}\Big(\int(\omega-K)_+^p + 1\Big). \tag{4.10}$$
Next, we can use the inequality (2.9) to get
$$\int(\omega-K)_+^p = \int\Big((\omega-K)_+^{\frac p2}\Big)^2 \le C\int(\omega-K)_+^{\frac p2}\,\Big(\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2\Big)^{\frac12}. \tag{4.11}$$


By using the interpolation inequality (2.13) of $L^{p/2}$ between $L^1$ and $L^p$, we have
$$\int(\omega-K)_+^{\frac p2} \le C\Big(\int(\omega-K)_+^p\Big)^{\frac{p-2}{2(p-1)}},$$
and hence, we deduce from (4.11) that
$$\int(\omega-K)_+^p \le C\Big(\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2\Big)^{\frac1q},$$
where $q$ is such that $p^{-1} + q^{-1} = 1$. Thanks to the inequality
$$ab \le p^{-1}a^p + q^{-1}b^q, \quad a\ge0,\ b\ge0, \tag{4.12}$$
we finally obtain
$$C(1+C_0)\int(\omega-K)_+^p \le \frac{p-1}{4}\int(\omega-K)_+^{p-2}|\nabla(\omega-K)_+|^2 + C_p,$$
where $C_p$ will now stand for a number which depends only on $\omega_0$, $C_0$ and $p$. By using this last inequality in (4.10), we finally arrive at
$$\frac{d}{dt}\Big(e^{\nu t}\,\frac1p\int(\omega-K)_+^p\Big) \le C_p\,\nu\,e^{\nu t}.$$
The integration gives
$$\int(\omega-K)_+^p \le C_p. \tag{4.13}$$
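The exponent bookkeeping in the last few steps can be made explicit as follows. This is a sketch for $p>2$ only; the shorthands $X$ and $Y$ are introduced here for readability and are not notation from the paper.

```latex
% Shorthands (ours), for p > 2:
%   X = \int (\omega-K)_+^p ,   Y = \int (\omega-K)_+^{p-2} |\nabla(\omega-K)_+|^2 .
\begin{align*}
&\text{Interpolation of } L^{p/2} \text{ between } L^1 \text{ and } L^p:\quad
  \frac{2}{p} = \alpha + \frac{1-\alpha}{p}
  \;\Longrightarrow\; \alpha = \frac{1}{p-1},\quad 1-\alpha = \frac{p-2}{p-1}, \\
&\int(\omega-K)_+^{p/2} \le C\,m_K^{\frac{p\,\alpha}{2}}\;X^{\frac{1-\alpha}{2}}
  \le C\,X^{\frac{p-2}{2(p-1)}}
  \quad\text{(using the bound (4.6) on } m_K\text{)}, \\
&X \le C\,X^{\frac{p-2}{2(p-1)}}\,Y^{\frac12}
  \;\Longrightarrow\; X^{\frac{p}{2(p-1)}} \le C\,Y^{\frac12}
  \;\Longrightarrow\; X \le C\,Y^{\frac{p-1}{p}} = C\,Y^{\frac1q}.
\end{align*}
```

Since $1/q<1$, the Young inequality (4.12) then bounds $C(1+C_0)\,Y^{1/q}$ by an arbitrarily small multiple of $Y$ plus a constant, which is how the gradient term on the left-hand side of (4.10) absorbs the right-hand side.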

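Now we can conclude as in [3,14]. By using the inequality
$$x^p \le \Big(\frac{\lambda}{\lambda-1}\Big)^{p-1}(x-1)^p$$
for every $x\ge\lambda>1$, we find
$$\int\omega^p \le \int_{\omega\le K}\omega^p + \int_{\omega>K}\omega^p \le K^{p-1} + \int_{K<\omega\le\lambda K}\omega^p + \int_{\omega\ge\lambda K}\omega^p \le K^{p-1} + (\lambda K)^{p-1} + K^p\Big(\frac{\lambda}{\lambda-1}\Big)^{p-1}\int_{\omega\ge\lambda K}\Big(\frac{\omega}{K}-1\Big)^p \le K^{p-1} + (\lambda K)^{p-1} + \Big(\frac{\lambda}{\lambda-1}\Big)^{p-1}\int(\omega-K)_+^p.$$
This ends the proof of Theorem 4. It remains to prove Lemma 7, which is a classical estimate we present for completeness.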


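Define $\bar\omega = \omega\,1_{\{|\omega|\le1\}}$. Since we have
$$\int\omega\,|\log\omega| = S(\omega) - 2\int\bar\omega\log\bar\omega, \tag{4.14}$$
it suffices to find a bound from below of $\int\bar\omega\log\bar\omega$. By using the fact that the relative entropy between two probability measures is non-negative (this is an easy consequence of the Jensen inequality) we get
$$\int_{\mathbb{R}^2}(\bar\omega/m)\,\log\left(\frac{\bar\omega/m}{\frac{1}{2\pi}\,e^{-\frac{|x|^2}{2}}}\right) \ge 0,$$
where $m = \int_{\mathbb{R}^2}\bar\omega \le 1$. Then we get
$$\int\bar\omega\log\bar\omega \ge -m\,I(\bar\omega) + m\log\frac{1}{2\pi} - m\log m \ge -I(\omega) + \log\frac{1}{2\pi} - \frac1e,$$
and hence we get (4.5) by using this last estimate and (4.14).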

5. Asymptotic Behaviour

In this section we investigate the asymptotic behaviour of the global solutions given by Theorem 5. More generally, one can consider a global solution $\omega(t)$ of (1.6) such that $\omega \in C([0,+\infty[,\ L^2 \cap L^1((1+|x|^2)\,dx))$, such that $\omega(t) \in M(E,I)$, and which satisfies, for some $C > 0$, the uniform estimates
$$\|\omega(t)\|_{L^2} \le C, \quad |b(\omega(t))| + |a(\omega(t))| \le C, \quad \forall t \ge 0. \tag{5.1}$$

The main result of this section is given by the following theorem.

Theorem 8. Let $\omega(t)$ be a global solution of (1.6) as above which satisfies (5.1). Then $\omega(t)$ converges in $L^1$, as $t \to \infty$, to the unique solution $\omega_{MF} \in M(E,I)$ of the associated microcanonical variational problem.

Note that the solutions constructed in Theorem 5 satisfy the estimate (5.1) and hence their asymptotic behaviour is given by Theorem 8.

Proof of Theorem 8. The first step consists in proving that the orbit $\{\omega(t)\}_{t \ge 0}$ is relatively compact in $L^1$ and uniformly bounded in $L^\infty$. Beforehand, we need to study the evolution operator generated by the non-autonomous Fokker-Planck type operator
$$L_\gamma\, \omega = \nu\big(\Delta\omega + \gamma(t)\,\nabla\cdot(x\omega)\big),$$
where $\gamma(t)$ is a given continuous curve. Denoting by $S_\gamma(t,\tau)\omega_0$ the solution of
$$\partial_t \omega = L_\gamma\, \omega, \quad t > \tau, \qquad \omega(\tau) = \omega_0,$$
we have the following estimates:


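Lemma 9. Suppose that for all $t \ge 0$, $|\gamma(t)| \le K_0$, for some $K_0 > 0$. Then, there exists $C > 0$ independent of $\nu > 0$ such that, for $(p,q,r) \in [1,+\infty]^3$ with $\frac{1}{r} + \frac{1}{q} = 1 + \frac{1}{p}$, we have:
$$\|S_\gamma(t,\tau)\omega\|_{L^p} \le C K_0^{1-\frac{1}{r}}\, \frac{e^{2\nu K_0 (1-\frac{1}{q})(t-\tau)}}{\big(1 - e^{-2\nu K_0 (t-\tau)}\big)^{1-\frac{1}{r}}}\, \|\omega\|_{L^q}, \tag{5.2}$$
$$\|\nabla S_\gamma(t,\tau)\omega\|_{L^p} \le C K_0^{\frac{3}{2}-\frac{1}{r}}\, \frac{e^{2\nu K_0 (1-\frac{1}{q})(t-\tau)}}{\big(1 - e^{-2\nu K_0 (t-\tau)}\big)^{\frac{3}{2}-\frac{1}{r}}}\, \|\omega\|_{L^q}, \tag{5.3}$$
$$\|S_\gamma(t,\tau)\nabla\omega\|_{L^p} \le C K_0^{\frac{3}{2}-\frac{1}{r}}\, \frac{e^{\nu K_0 (1-\frac{2}{q})(t-\tau)}}{\big(1 - e^{-2\nu K_0 (t-\tau)}\big)^{\frac{3}{2}-\frac{1}{r}}}\, \|\omega\|_{L^q}, \tag{5.4}$$
for all $p \in [1,+\infty]$.

Proof of Lemma 9. A simple computation in Fourier space yields the explicit representation
$$S_\gamma(t,\tau)\omega_0(x) = \frac{e^{2\nu(B(t)-B(\tau))}}{4\pi\nu \int_\tau^t e^{2\nu(B(s)-B(t))}\,ds} \int \exp\Bigg(-\frac{|x-y|^2}{4\nu \int_\tau^t e^{2\nu(B(s)-B(t))}\,ds}\Bigg)\, \omega_0\big(e^{\nu(B(t)-B(\tau))}\,y\big)\,dy, \tag{5.5}$$
where $B(t) = \int_0^t \gamma(s)\,ds$. The result of Lemma 9 then follows by standard convolution estimates.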
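To see where the factors $1 - e^{-2\nu K_0(t-\tau)}$ in Lemma 9 come from, it may help to look at the Gaussian width in (5.5) for a constant drift. The sketch below assumes $\gamma(s) \equiv \gamma > 0$ (an illustrative special case, not the general setting of the lemma), so that $B(s) = \gamma s$ and the quantity $\nu\int_\tau^t e^{2\nu(B(s)-B(t))}\,ds$ has the closed form $(1 - e^{-2\nu\gamma(t-\tau)})/(2\gamma)$. The function names are hypothetical.

```python
import math


def effective_time(t, tau, nu, gamma, n=20000):
    """Midpoint quadrature of nu * int_tau^t exp(2 nu (B(s) - B(t))) ds, B(s) = gamma * s."""
    h = (t - tau) / n
    total = 0.0
    for k in range(n):
        s = tau + (k + 0.5) * h
        total += math.exp(2.0 * nu * gamma * (s - t))
    return nu * total * h


def effective_time_closed_form(t, tau, nu, gamma):
    """Closed form (1 - exp(-2 nu gamma (t - tau))) / (2 gamma) for constant gamma."""
    return -math.expm1(-2.0 * nu * gamma * (t - tau)) / (2.0 * gamma)


if __name__ == "__main__":
    args = (1.5, 0.2, 0.7, 1.3)
    print(effective_time(*args), effective_time_closed_form(*args))
```

With $\gamma = K_0$ the closed form is $a_\nu(t-\tau)/(2K_0)$, where $a_\nu(t) = 1 - e^{-2\nu K_0 t}$ is the function used later in the proof of Theorem 8; this matches the ratio $K_0/(1 - e^{-2\nu K_0(t-\tau)})$ appearing in (5.2)-(5.4).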

We come back to the proof of Theorem 8. We shall prove that $\|\nabla\omega(t)\|_{L^1}$ is uniformly bounded. We use the same idea as in [12]. Note that the solution of (1.6) can be written as
$$\omega(t) = S_a(t,0)\omega_0 + \int_0^t S_a(t,\tau)\,\nabla\cdot\big(-u\,\omega + \nu b\, u^\perp \omega\big)(\tau)\,d\tau. \tag{5.6}$$
Moreover, thanks to (5.1), we have a uniform estimate on $\|\omega(t)\|_{L^2}$ and on $|a(t)| + |b(t)|$ for all times: there exist $C_0 > 0$ and $K_0 > 0$ such that
$$\|\omega(t)\|_{L^2} \le C_0, \quad |a(t)| + |b(t)| \le K_0, \quad \forall t \ge 0. \tag{5.7}$$
Consequently, in (5.6), we can consider $a$ and $b$ as known and we can use the estimates of Lemma 9. Let us define $F(\omega)$ as the right-hand side of (5.6). We have
$$\|F(\omega(t))\|_{L^\infty} \le C\Bigg(\frac{C_0\, e^{\nu K_0 t}}{a_\nu(t)^{1/2}} + C_0(1+\nu)\int_0^t \frac{e^{\nu K_0 (t-s)}}{a_\nu(t-s)^{3/4}}\, \|u\,\omega\|_{L^4}\,ds\Bigg),$$
where $a_\nu(t) = 1 - e^{-2\nu K_0 t}$. Since by (2.10), (2.13) and the uniform $L^2$ bound, we have
$$\|u\,\omega\|_{L^4} \le \|\omega\|_{L^\infty}\,\|u\|_{L^4} \le C\,\|\omega\|_{L^\infty}\,\|\omega\|_{L^{4/3}} \le C C_0\,\|\omega\|_{L^\infty},$$
we finally get
$$\|F(\omega(t))\|_{L^\infty} \le C\Bigg(\frac{C_0\, e^{\nu K_0 t}}{a_\nu(t)^{1/2}} + C_0(1+\nu)\int_0^t \frac{e^{\nu K_0 (t-s)}}{a_\nu(t-s)^{3/4}}\, \|\omega(s)\|_{L^\infty}\,ds\Bigg).$$


Consequently, we can set
$$z(T) = \sup_{[0,T]}\, e^{-\nu K_0 t}\, a_\nu(t)^{1/2}\, \|\omega(t)\|_{L^\infty}$$
to get
$$z(T) \le C C_0 (1+\nu)\Bigg(1 + a_\nu(T)^{1/2}\int_0^T \frac{ds}{a_\nu(T-s)^{3/4}\, a_\nu(s)^{1/2}}\; z(T)\Bigg).$$
Next, we notice that
$$\lim_{T \to 0}\, a_\nu(T)^{1/2}\int_0^T \frac{ds}{a_\nu(T-s)^{3/4}\, a_\nu(s)^{1/2}} = 0,$$
therefore, there exists $T(\nu, C_0) > 0$ such that
$$z(T(\nu,C_0)) \le C C_0 (1+\nu) + \frac{1}{2}\, z(T(\nu,C_0))$$
and hence, we get that
$$\|\omega(t)\|_{L^\infty} \le \frac{C(\nu,C_0)}{a_\nu(t)^{1/2}}, \quad \forall t \in [0, T(\nu,C_0)]. \tag{5.8}$$
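The vanishing of the weighted integral as $T \to 0$ can be made quantitative. Since $a_\nu(s) \approx 2\nu K_0 s$ for small $s$, one expects the leading behaviour $(2\nu K_0)^{-3/4}\, B(\tfrac14,\tfrac12)\, T^{1/4}$, where $B$ is the Euler Beta function. The following sketch (function names and the default values $\nu = K_0 = 1$ are assumptions made for illustration) evaluates the integral numerically, removing the endpoint singularities by the substitutions $s = u^2$ near $0$ and $T - s = v^4$ near $T$, and compares it with this leading-order expression.

```python
import math


def a_nu(t, nu, K0):
    """a_nu(t) = 1 - exp(-2 nu K0 t), computed stably for small t."""
    return -math.expm1(-2.0 * nu * K0 * t)


def weighted_integral(T, nu=1.0, K0=1.0, n=4000):
    """a_nu(T)^{1/2} * int_0^T ds / (a_nu(T - s)^{3/4} a_nu(s)^{1/2}) by midpoint rule."""
    half = 0.5 * T
    # Left half [0, T/2]: substitute s = u^2 to remove the s^{-1/2} singularity.
    h = math.sqrt(half) / n
    left = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        s = u * u
        left += 2.0 * u / (a_nu(T - s, nu, K0) ** 0.75 * math.sqrt(a_nu(s, nu, K0)))
    left *= h
    # Right half [T/2, T]: substitute T - s = v^4 to remove the (T - s)^{-3/4} singularity.
    h = half ** 0.25 / n
    right = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        r = v ** 4  # r = T - s
        right += 4.0 * v ** 3 / (a_nu(r, nu, K0) ** 0.75 * math.sqrt(a_nu(T - r, nu, K0)))
    right *= h
    return math.sqrt(a_nu(T, nu, K0)) * (left + right)


def leading_order(T, nu=1.0, K0=1.0):
    """Small-T approximation (2 nu K0)^{-3/4} * Beta(1/4, 1/2) * T^{1/4}."""
    beta = math.gamma(0.25) * math.gamma(0.5) / math.gamma(0.75)
    return (2.0 * nu * K0) ** -0.75 * beta * T ** 0.25


if __name__ == "__main__":
    for T in (1e-2, 1e-4, 1e-8):
        print(T, weighted_integral(T), leading_order(T))
```

The $T^{1/4}$ decay is what allows the term containing $z(T)$ to be absorbed into the left-hand side for $T$ small enough.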

Next, since to establish (5.8) we have only used (5.7), we can consider, for every $n \in \mathbb{N}$, the solution $\tilde\omega$ of (1.6) with initial value $\omega(nT(\nu,C_0)/2)$. By the above argument, we get that $\tilde\omega$ satisfies the estimate (5.8). By uniqueness, we have $\tilde\omega(t) = \omega(t + nT(\nu,C_0)/2)$, $\forall t \in [0, T(\nu,C_0)]$, and hence,
$$\|\omega(t)\|_{L^\infty} \le C(\nu,C_0)\Big(1 + \frac{1}{a_\nu(t)^{1/2}}\Big), \quad \forall t \ge 0 \tag{5.9}$$

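for some $C(\nu,C_0)$. In a similar way we have by Duhamel's formula
$$\nabla\omega(t) = \nabla S_a(t,0)\omega_0 + \int_0^t \nabla S_a(t,s)\big(-u\cdot\nabla\omega + \nu b\, u^\perp\cdot\nabla\omega + \nu b\,\omega^2\big)(s)\,ds, \tag{5.10}$$
so that
$$\|\nabla F(\omega)\|_{L^1} \le C\Bigg(\frac{C_0}{a_\nu(t)^{1/2}} + (1+\nu)C_0\int_0^t \frac{1}{a_\nu(t-s)^{1/2}}\,\big(\|u\,\nabla\omega(s)\|_{L^1} + C_0\big)\,ds\Bigg),$$
and since we have by (2.11)
$$\|u\,\nabla\omega\|_{L^1} \le C\,\|u\|_{L^\infty}\,\|\nabla\omega\|_{L^1} \le C\,\|\omega\|_{L^\infty}^{1/2}\,\|\nabla\omega\|_{L^1},$$
we get, thanks to (5.9),
$$\|\nabla F(\omega)\|_{L^1} \le C\Bigg(\frac{C_0}{a_\nu(t)^{1/2}} + C_0\int_0^t \frac{ds}{a_\nu(t-s)^{1/2}} + C(\nu,C_0)\int_0^t \frac{1}{a_\nu(t-s)^{1/2}\, a_\nu(s)^{1/4}}\,\|\nabla\omega(s)\|_{L^1}\,ds\Bigg).$$
Consequently, by using the same method as before, we can easily obtain
$$\|\nabla\omega(t)\|_{L^1} \le C(\nu,C_0)\Big(1 + \frac{1}{a_\nu(t)^{1/2}}\Big), \quad \forall t \ge 0 \tag{5.11}$$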

for some $C(\nu,C_0)$.

We now consider $\Omega$, the omega limit set of the trajectory $(\omega(t))_{t \ge 0}$. We deduce from the previous estimates that the positive orbit $\{\omega(t)\}_{t \ge 0}$ is relatively compact in $X = L^1((1+|x|^\alpha)\,dx) \cap L^2$ for $\alpha < 2$. Indeed, since $\omega(t) \in C([0,+\infty[, X)$, it suffices to prove that $\{\omega(t)\}_{t \ge 1}$ is relatively compact. The compactness in $L^1((1+|x|^\alpha)\,dx)$ follows immediately from the Riesz-Fréchet-Kolmogorov criterion: $\omega(t)$ is uniformly bounded in $L^1$, the uniform (for $t \ge 1$) bound (5.11) gives the equi-integrability, and we have a uniform bound on the moment of inertia to control the mass far away. Next, thanks to the uniform $L^\infty$ estimate for $t \ge 1$ given by (5.9) and the relative compactness in $L^1$, we also get that $\{\omega(t)\}_{t \ge 0}$ is relatively compact in $L^p$ for every $p < +\infty$.

By the relative compactness properties that we have just proven, we get that $\Omega$ is nonempty and actually made of smooth $L^p$ functions, thanks to the smoothing effect of the parabolicity. Moreover, the elements of $\Omega$ are probability densities. Also, if $\omega \in \Omega$, since there exists an increasing sequence $t_n$ such that $\omega(t_n)$ tends to $\omega$ in $X$, we also have
$$E(\omega) = \lim_n E(\omega(t_n)) = E, \qquad I(\omega) \le \lim_n I(\omega(t_n)) = I. \tag{5.12}$$
The first equality is proven in the Appendix, see (A.11). Finally, we notice that the entropy $S$ is constant on $\Omega$. Indeed, if $\omega_1, \omega_2 \in \Omega$, we can construct an increasing sequence $t_n$ such that $\omega(t_{2n})$ tends to $\omega_1$ and $\omega(t_{2n+1})$ tends to $\omega_2$ almost everywhere, and such that there exist $g_1, g_2 \in L^1((1+|x|^2)\,dx) \cap L^2$ with $\omega(t_{2n}) \le g_1$, $\omega(t_{2n+1}) \le g_2$. By using that
$$\omega|\log\omega| \le C\big(\omega^2 + |\omega|^{3/4}\big) \le C\Big(\omega^2 + (1+|x|)\,\omega + \frac{1}{(1+|x|)^3}\Big),$$
we find by the Lebesgue Theorem that
$$S(\omega_1) = \lim_n S(\omega(t_{2n})), \qquad S(\omega_2) = \lim_n S(\omega(t_{2n+1})).$$
But, since the entropy is decreasing, we also have
$$S(\omega(t_{2n})) \ge S(\omega(t_{2n+1})) \ge S(\omega(t_{2n+2})),$$
so that, passing to the limit, we get $S(\omega_1) \ge S(\omega_2) \ge S(\omega_1)$, and hence $S(\omega_1) = S(\omega_2)$.
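The pointwise bound on $\omega|\log\omega|$ used for the dominated convergence argument can be unpacked as follows. This is a sketch of one standard way to obtain it, with explicit constants that are not claimed to be the ones the authors had in mind; it uses only elementary calculus and the Young inequality $ab \le a^{r}/r + b^{r'}/r'$ with $r = 4/3$, $r' = 4$.

```latex
% Step 1: split according to the size of \omega \ge 0.
%   For \omega \ge 1:      \omega |\log \omega| = \omega \log \omega \le \omega^2 .
%   For 0 < \omega \le 1:  \omega |\log \omega| = \omega^{3/4} \cdot \omega^{1/4} |\log \omega|
%                          \le \tfrac{4}{e}\, \omega^{3/4},
%   since s^{1/4} |\log s| attains its maximum 4/e on (0,1] at s = e^{-4}.
\omega |\log \omega| \le \omega^2 + \tfrac{4}{e}\, \omega^{3/4}.

% Step 2: Young's inequality with exponents 4/3 and 4.
\omega^{3/4}
  = \big[(1+|x|)\,\omega\big]^{3/4} \, (1+|x|)^{-3/4}
  \le \tfrac{3}{4}\,(1+|x|)\,\omega + \tfrac{1}{4}\,(1+|x|)^{-3}.

% Step 3: each term on the right is integrable on \mathbb{R}^2 when
% \omega \le g with g \in L^1((1+|x|^2)\,dx) \cap L^2, because
% (1+|x|) \le 2(1+|x|^2) and \int_{\mathbb{R}^2} (1+|x|)^{-3}\,dx < \infty.
```

Applied with $\omega$ replaced by the dominating functions $g_1$ and $g_2$, this gives the integrable majorant needed to pass to the limit in $S(\omega(t_{2n}))$ and $S(\omega(t_{2n+1}))$.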


Finally, we can prove that the elements of $\Omega$ are solutions of the mean field equation. If $\omega \in \Omega$, consider $\omega(t)$ the solution of (1.6) with initial value $\omega$. By the strong parabolic maximum principle, $\omega(t)$ is smooth and strictly positive for $t > 0$. Since $\Omega$ is invariant and $S$ is constant on it, the entropy dissipation identity (1.22) gives that for $t > 0$,
$$\nabla\Big(\log\omega(t) - b\,\psi(t) - a\,\frac{|x|^2}{2}\Big) = 0.$$
By continuity in time we get that $\omega$ actually solves the mean field equation
$$\omega = \frac{1}{Z}\, e^{b\psi + a\frac{|x|^2}{2}} \tag{5.13}$$
in $\mathbb{R}^2$. Finally, by using the result of [26] and Lemma 4.3 of [3], we get that $\omega$ is radially symmetric. Note that we also necessarily have that $a < 0$ and $b < 8\pi$.

To summarize, we have proven that the omega limit set of $\{\omega(t)\}_{t \ge 0}$ is made of probability densities which are radially symmetric solutions of the mean field equation (5.13) with finite energy equal to $E$ and finite moment of inertia. Since the entropy separates the radial mean field solutions (see Remark 15 in the Appendix), we conclude that $\Omega$ consists of a single point.

A. The Mean-Field Equation and Related Variational Problems

In this Appendix we collect some useful facts concerning the Mean-Field Equation (MFE) in $\mathbb{R}^2$. The main ideas are in [7,8]; here, we adapt the results to the $\mathbb{R}^2$ case. For the microcanonical problem, the strategy of the proof is slightly different: we do not prove directly the existence of a solution. We focus on the negative temperature case (which corresponds to $b > 0$), which is the most interesting case.

Definition 10 (Canonical Variational Principle). For $a < 0$, $b > 0$, consider the free-energy functional
$$F_{a,b}(\omega) = S(\omega) - b\,E(\omega) - a\,I(\omega) \tag{A.1}$$

defined on the space of probability densities on R2 for which E(ω), I (ω) and S(ω) are finite. We set F(a, b) = inf Fa,b (ω). ω∈

(A.2)

Definition 11 (Microcanonical Variational Principle). For E ∈ R and I > 0 let us define M(E, I ) = {ω ∈ : E(ω) = E, I (ω) = I }. We set S(I, E) =

$\inf_{\omega \in M(E,I)} S(\omega)$.

(A.3)

The main results of this Appendix are the two following theorems where we focus on the negative temperature case which is more interesting. Note that in the positive temperature case (i.e. b < 0) since Fb,a is a convex functional and since there exists a unique solution to the Mean Field Equation [13], all the following results are obvious.


Theorem 12 (Canonical Variational Principle). For $a < 0$ and $0 < b < 8\pi$:
i) There exists $\omega$ in the space of probability densities of Definition 10 such that $F_{a,b}(\omega) = F(a,b)$. Moreover $\omega$ is radially symmetric and solves the mean field equation (1.4).
ii) There is only one radially symmetric solution of the mean field equation (1.4).
iii) As a consequence, there exists a unique minimizer $\omega_{a,b}$ of $F_{a,b}$ over this space.

Remark 13. It is easy to prove that when $b \to 8\pi$, the solutions to the MFE concentrate at the origin. Indeed, by multiplying the equation by $x\cdot\nabla\psi$ and integrating by parts, we arrive at the identity (this is the same argument that leads to the Pohozaev inequality):
$$1 - \frac{8\pi}{b} = \frac{2\pi a I}{b}. \tag{A.4}$$

Hence when b → 8π , I → 0 and the concentration takes place. For b > 8π we do not have solutions. As a consequence, we can solve the microcanonical variational principle. Theorem 14 (Microcanonical Variational Principle). For a < 0, and 0 0, E ∈ R let us define S ∗ (I, E) as S ∗ (I, E) = sup (F(a, b) + bE + a I ).

(A.6)

a,b

Denote by a(I, E), b(I, E) the unique maximizer of (A.6), then for any I > 0, E ∈ R, S(I, E) = S ∗ (I, E), and hence S is a smooth convex function. Moreover, the microcanonical variational principle admits a unique minimizer ω˜ I,E in I,E . Finally ω˜ I,E = ωa(I,E),b(I,E) (equivalence of the ensembles). Remark 15. We finally underline that the function I → S(E, I ) is strictly decreasing (∂ S/∂ I = a < 0) so that different radial solutions of the MFE with the same energy cannot have the same entropy. Proof of Theorem 12. By the logarithmic Hardy-Littlewood-Sobolev inequality (see [2, 9]), we have S(ω) − 8π E(ω) ≥ −(1 + log π ).

(A.7)

Note that we also have the inequality (see (A.9))
$$E(\omega) \ge -\frac{1}{8\pi}\log(4 I(\omega)). \tag{A.8}$$
Indeed, we can write
$$E = -\frac{1}{8\pi}\int\!\!\int \log|x-y|^2\, \omega(x)\,\omega(y) \ge -\frac{1}{8\pi}\log\int\!\!\int |x-y|^2\, \omega(x)\,\omega(y), \tag{A.9}$$
and hence
$$E \ge -\frac{1}{8\pi}\log\Big(4I - 2\Big|\int x\,\omega\Big|^2\Big) \ge -\frac{1}{8\pi}\log(4I).$$

Consequently, thanks to (A.7), (A.8), we get that
$$F_{a,b}(\omega) \ge -\frac{8\pi - b}{8\pi}\log(4I) - a I - (1 + \log\pi) \tag{A.10}$$
and hence, we find that $F_{a,b}$ is bounded from below. Let $\omega_n$ be a minimizing sequence. Up to the extraction of a subsequence, $\omega_n$ converges in the sense of weak convergence of measures. Moreover, thanks to (A.10), we get that $I(\omega_n)$ is uniformly bounded and, by using again (A.7), we also have
$$\Big(1 - \frac{b}{8\pi}\Big) S(\omega_n) \le F(\omega_n) + \frac{1 + \log\pi}{8\pi},$$
and hence $S$ is bounded from above. Therefore the uniform integrability given by the bounds on $S$ and $I$ (which yields a bound on $\int \omega|\log\omega|$ thanks to Lemma 7) implies that $\omega_n$ converges to a nonnegative function $\omega$. Moreover, the uniform estimate on the moment of inertia provides the tightness of the sequence $\omega_n$, so that we obtain
$$\lim_n \int \omega_n = \int \omega,$$
i.e. $\omega$ is again a probability density. Next, by lower semi-continuity, we have
$$S(\omega) \le \lim_n S(\omega_n), \qquad I(\omega) \le \lim_n I(\omega_n),$$
and we claim that
$$E(\omega) = \lim_n E(\omega_n). \tag{A.11}$$
This proves that $F_{a,b}(\omega) \le \lim_n F_{a,b}(\omega_n)$ and hence that $\omega$ is a minimizer. It remains to prove (A.11). We write
$$E(\omega_n) = -\frac{1}{4\pi}\int\!\!\int \log|x-y|\,\omega_n(x)\,\omega_n(y)\,dx\,dy = I(\varepsilon) + J(\varepsilon),$$
where
$$I(\varepsilon) = -\frac{1}{4\pi}\int\!\!\int_{|x-y| \le \varepsilon} \log|x-y|\,\omega_n(x)\,\omega_n(y)\,dx\,dy, \qquad J(\varepsilon) = -\frac{1}{4\pi}\int\!\!\int_{|x-y| \ge \varepsilon} \log|x-y|\,\omega_n(x)\,\omega_n(y)\,dx\,dy$$
for every $\varepsilon \in (0,1)$. By splitting the integration domain into $\{\omega_n(x)\,\omega_n(y) \le |x-y|^{-1}\}$ and its complement, we easily get that
$$\begin{aligned}
I(\varepsilon) &\le -C\int\!\!\int_{|x-y| \le \varepsilon} |x-y|^{-1}\log|x-y|\,dx\,dy + 2C\int \omega_n|\log\omega_n|\ \sup_x \int_{|x-y| \le \varepsilon} \omega_n(y)\,dy \\
&\le -C\int\!\!\int_{|x-y| \le \varepsilon} |x-y|^{-1}\log|x-y|\,dx\,dy + C\,\sup_x \int_{|x-y| \le \varepsilon} \omega_n(y)\,dy.
\end{aligned}$$

For the last line, we have used that the entropy and the moment of inertia are uniformly bounded in $n$; thanks to Lemma 7, $\int \omega_n|\log\omega_n|$ is then also uniformly bounded. This yields the uniform integrability:
$$\lim_{\varepsilon \to 0}\, \sup_x \int_{|x-y| \le \varepsilon} \omega_n(y)\,dy = 0.$$
Consequently, we get that $\lim_{\varepsilon \to 0} I(\varepsilon) = 0$ uniformly in $n$. Finally, by weak convergence, we have that
$$\lim_n J(\varepsilon) = -\frac{1}{4\pi}\int\!\!\int_{|x-y| \ge \varepsilon} \log|x-y|\,\omega(x)\,\omega(y)\,dx\,dy$$
and since
$$\lim_{\varepsilon \to 0}\, -\frac{1}{4\pi}\int\!\!\int_{|x-y| \ge \varepsilon} \log|x-y|\,\omega(x)\,\omega(y)\,dx\,dy = E(\omega),$$
the conclusion follows easily.

By symmetrizing $\omega$ (around the origin) we find that $F_{a,b}$ is decreasing. Indeed $S$ and $I$ are unchanged and $E$ is increasing. Thus $\omega$ must be radially symmetric. It is not difficult to show that $\omega > 0$ (otherwise one could find a better distribution as regards the minimization problem). Hence $\omega$ satisfies the MFE.

Next we show that such a solution is also unique among all the radial solutions to the MFE. Setting $r = |x|$ and $\psi(r) = \psi(x)$ (by an obvious notational abuse), we have
$$\frac{1}{r}\,(r\psi')' = -e^{b\psi + \frac{a}{2}r^2}. \tag{A.12}$$
We are assuming that $Z = 2\pi\int_0^\infty dr\, r\, e^{b\psi + \frac{a}{2}r^2} = 1$, adding, if necessary, a constant to $\psi$. After the change of variable $t = \log r$, setting $H = b\psi + 2t$, we readily arrive at the following equation:
$$\ddot{H} = -F(t)\,e^{H}, \tag{A.13}$$
where
$$F(t) = b\, e^{a\frac{e^{2t}}{2}}. \tag{A.14}$$
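For completeness, the passage from the radial equation (A.12) to (A.13)-(A.14) can be written out. The computation below is a routine verification under the stated change of variables and introduces no new assumptions.

```latex
% With t = \log r one has r \frac{d}{dr} = \frac{d}{dt}, hence
(r\psi')' = \frac{d}{dr}\Big(\frac{d\psi}{dt}\Big) = \frac{1}{r}\,\ddot{\psi},
\qquad\text{so (A.12) becomes}\qquad
\ddot{\psi} = -r^{2}\, e^{b\psi + \frac{a}{2} r^{2}}
            = -e^{\,2t + b\psi + \frac{a}{2} e^{2t}} .

% Setting H = b\psi + 2t, the term 2t is linear in t, so \ddot{H} = b\,\ddot{\psi} and
\ddot{H} = -b\, e^{\frac{a}{2} e^{2t}}\, e^{\,b\psi + 2t}
         = -F(t)\, e^{H},
\qquad F(t) = b\, e^{a \frac{e^{2t}}{2}} .

% The normalization transforms in the same way (dr\, r = e^{2t}\, dt):
Z = 2\pi \int_{0}^{\infty} dr\, r\, e^{b\psi + \frac{a}{2} r^{2}}
  = 2\pi \int_{-\infty}^{\infty} dt\, e^{H}\, e^{\frac{a}{2} e^{2t}}
  = \frac{2\pi}{b} \int_{-\infty}^{\infty} dt\, e^{H} F(t) .
```

The last line is the starting point of the identity for $Z$ in terms of $\dot{H}(\pm\infty)$, obtained by integrating $\ddot{H} = -F e^{H}$ over the real line.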

We are looking for smooth solutions to Eq. (18) and hence
$$\lim_{t \to -\infty} \dot{H} = 2 \tag{A.15}$$
as a consequence of the fact that $\lim_{r \to 0} r\psi'(r) = 0$. $H(t)$ behaves as $2t + \chi$ as $t \to -\infty$, and $\chi$ must be chosen in such a way that $Z = 1$. In the new variables:
$$Z = \frac{2\pi}{b}\int_{-\infty}^{\infty} dt\, e^{H} F(t) = \frac{2\pi}{b}\big(\dot{H}(-\infty) - \dot{H}(\infty)\big). \tag{A.16}$$
It is convenient to change the time variable by setting $2t \to 2t - \chi$, so that the problem can be reformulated as
$$\ddot{H} = -F\Big(t - \frac{\chi}{2}\Big)e^{H}, \qquad \dot{H}(-\infty) = 2, \qquad H(t) \approx 2t \ \text{ for } t \to -\infty. \tag{A.17}$$


Note that
$$Z(\chi) \to 0 \quad \text{for } \chi \to -\infty \tag{A.18}$$
and
$$Z(\chi) \to \frac{8\pi}{b} \quad \text{for } \chi \to +\infty. \tag{A.19}$$
Equation (A.18) is obvious, while Eq. (A.19) comes out by integrating the Hamiltonian system
$$\ddot{H} = -e^{H}, \tag{A.20}$$
for which the energy conservation yields $\dot{H}(\infty) = -2$. Since $\frac{8\pi}{b} > 1$, the value $Z = 1$ is certainly taken at least once. In order to get uniqueness it remains to show that $\chi \to Z(\chi)$ is a monotone function; actually it is nondecreasing. Defining
$$G = H + \frac{a}{2}\, e^{2(t - \frac{\chi}{2})}, \tag{A.21}$$
we find the following set of non-autonomous equations:
$$\ddot{H} = -b\,e^{G}, \qquad \ddot{G} = -b\,e^{G} - 4(H - G). \tag{A.22}$$
Note that $H, \dot{H}$ and $G, \dot{G}$ satisfy the same condition at $t = -\infty$. On the other hand the derivatives $\partial_\chi H = h$ and $\partial_\chi G = g$ satisfy
$$\ddot{h} = -b\,e^{G} g, \qquad \ddot{g} = (4 - b\,e^{G})\,g - 4h. \tag{A.23}$$
The conditions at $t \to -\infty$ are vanishing for both $h, \dot{h}$ and $g, \dot{g}$. Introducing the energy $\mathcal{E} = \frac{1}{2}\dot{H}^2 + b\,e^{G}$ we get
$$\dot{\mathcal{E}} = b\,a\,e^{G}\, e^{2(t - \frac{\chi}{2})} \le 0$$
and hence
$$b\,e^{G(t)} \le \mathcal{E}(t) \le \mathcal{E}(-\infty) = 2. \tag{A.24}$$
Therefore $\ddot{g} \ge 0$ as long as $h \le 0$, and $\ddot{h} \le 0$ as long as $g \ge 0$. These conditions are indeed verified for $t \approx -\infty$, so that they are true for all times. Then $G$ is increasing, as well as
$$Z = \frac{2\pi}{b}\int_{-\infty}^{\infty} dt\, e^{G}.$$


Proof of Theorem 14. To prove the concavity of $F$ we will prove that, for any $a_1, b_1, a_2, b_2$:
$$F\Big(\frac{a_1 + a_2}{2}, \frac{b_1 + b_2}{2}\Big) > \frac{1}{2}\big(F(a_1,b_1) + F(a_2,b_2)\big).$$
Let $a = \frac{a_1 + a_2}{2}$ and $b = \frac{b_1 + b_2}{2}$. By the linearity of $F_{a,b}(\omega)$ as a function of $a, b$ at fixed $\omega$, we get that
$$F(a,b) = F_{a,b}(\omega_{a,b}) = \frac{1}{2}F_{a_1,b_1}(\omega_{a,b}) + \frac{1}{2}F_{a_2,b_2}(\omega_{a,b}) > \frac{1}{2}F_{a_1,b_1}(\omega_{a_1,b_1}) + \frac{1}{2}F_{a_2,b_2}(\omega_{a_2,b_2}) = \frac{1}{2}\big(F(a_1,b_1) + F(a_2,b_2)\big),$$
where we used the fact that $\omega_{a_1,b_1}$ and $\omega_{a_2,b_2}$ are the minimizers for $F_{a_1,b_1}$ and $F_{a_2,b_2}$ respectively. The smoothness of $F$ comes from the fact that the solution of the canonical variational principle depends smoothly upon $a, b$. By taking the derivative of $F$ with respect to $a$ we get
$$\frac{\partial F}{\partial a} = \frac{\partial F_{a,b}}{\partial a}(\omega_{a,b}) = -I(\omega_{a,b}).$$
Here we have used the fact that the derivative of $F$ with respect to $\omega$ evaluated at $\omega_{a,b}$ vanishes, and the fact that the derivative of $F_{a,b}$ with respect to the parameter $a$ is given by $-I$. In the same way we get $\partial F(a,b)/\partial b = -E(\omega_{a,b})$. Finally, the concavity of $F(a,b)$ implies that $\partial I/\partial a > 0$ and that $\partial E/\partial b > 0$, again with the notation $I(a,b) = I(\omega_{a,b})$, $E(a,b) = E(\omega_{a,b})$.

Now it remains to prove iii). Again, the concavity of $F(a,b)$ implies the existence of the convex function $S^*(I,E)$ defined in (A.6). Now we want to prove that $S(I,E) = S^*(I,E)$. First of all let us notice that
$$S^*(I,E) = \sup_{a,b}\,\big(F(a,b) + bE + aI\big) = F(\bar a, \bar b) + \bar b E + \bar a I,$$
where $\bar a, \bar b$ is the unique maximum point for $S^*(I,E)$. Therefore, for any $a, b$,
$$S^*(I,E) \ge F(a,b) + bE + aI = S(\omega_{a,b}) - b\big(E(\omega_{a,b}) - E\big) - a\big(I(\omega_{a,b}) - I\big).$$
Now, since $F$ is concave and smooth, we know that for any $I, E$ there exist unique $a, b$ such that $I(\omega_{a,b}) = I$ and $E(\omega_{a,b}) = E$. By choosing $a, b$ in this way in the previous equation we get
$$S^*(I,E) \ge S(\omega_{a,b}) \ge S(I,E). \tag{A.25}$$
On the other hand, let $\omega_k$, $k = 1, 2, \ldots$, be a minimizing sequence for $S(I,E)$, and $\omega$ a limit point for it. By lower semicontinuity of $S$ we know that $S(\omega) \le S(I,E)$. Therefore, for any $a, b$,
$$S(I,E) \ge S(\omega) = S(\omega) - bE(\omega) - aI(\omega) + bE(\omega) + aI(\omega) \ge F(a,b) + bE(\omega) + aI(\omega) \ge F(a,b) + bE + aI, \tag{A.26}$$
where we have used the continuity of $E$, from which $E(\omega) = E$, and the lower semicontinuity of $I$, from which $I(\omega) \le I$.


Since $a, b$ are arbitrary in (A.26), we get $S(I,E) \ge S^*(I,E)$. Since we have already proven (A.25), this yields $S(I,E) = S^*(I,E)$. Finally let us notice that $S(I,E) = S^*(I,E) = S(\omega_{\bar a, \bar b})$, where $\bar a, \bar b$ is the unique maximum point for $S^*(I,E)$, and where the relation between $a, b$ and $I, E$ is smooth and bijective (equivalence of the ensembles).

References

1. Ambrosio, L., Gigli, N., Savaré, G.: Gradient flows in metric spaces and in the space of probability measures. Lectures in Mathematics ETH Zürich, Basel: Birkhäuser Verlag, 2005
2. Beckner, W.: Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality. Ann. of Math. (2) 138(1), 213–242 (1993)
3. Blanchet, A., Dolbeault, J., Perthame, B.: Two-dimensional Keller-Segel model: optimal critical mass and qualitative properties of the solutions. Electron. J. Diff. Eqs. No. 44, 32 pp. (2006), (electronic)
4. Carlen, E.A., Gangbo, W.: Constrained steepest descent in the 2-Wasserstein metric. Ann. of Math. (2) 157(3), 807–846 (2003)
5. Chavanis, P.H.: Systematic drift experiences by a point vortex in two-dimensional turbulence. Phys. Rev. E 58, R1199 (1998)
6. Chavanis, P.H.: Kinetic theory of point vortices: diffusion coefficient and systematic drift. Phys. Rev. E 64, 016309 (2001)
7. Caglioti, E., Lions, P.-L., Marchioro, C., Pulvirenti, M.: A special class of stationary flows for two-dimensional Euler equations: a statistical mechanics description. Commun. Math. Phys. 143(3), 501–525 (1992)
8. Caglioti, E., Lions, P.-L., Marchioro, C., Pulvirenti, M.: A special class of stationary flows for two-dimensional Euler equations: a statistical mechanics description. II. Commun. Math. Phys. 174(2), 229–260 (1995)
9. Carlen, E., Loss, M.: Competing symmetries, the logarithmic HLS inequality and Onofri's inequality on S^n. Geom. Funct. Anal. 2(1), 90–104 (1992)
10. Caglioti, E., Pulvirenti, M., Rousset, F.: On the constrained Navier-Stokes equation and intermediate asymptotics. Physica A, to appear
11. Eyink, G.L., Sreenivasan, K.R.: Onsager and the theory of hydrodynamic turbulence. Rev. Mod. Phys. 78(1), 87–136 (2006)
12. Gallay, T., Wayne, C.E.: Global stability of vortex solutions of the two-dimensional Navier-Stokes equation. Commun. Math. Phys. 255(1), 97–129 (2005)
13. Gogny, D., Lions, P.-L.: Sur les états d'équilibre pour les densités électroniques dans les plasmas. RAIRO Modél. Math. Anal. Numér. 23(1), 137–153 (1989)
14. Jäger, W., Luckhaus, S.: On explosions of solutions to a system of partial differential equations modelling chemotaxis. Trans. Amer. Math. Soc. 329(2), 819–824 (1992)
15. Kiessling, M.K.-H.: Statistical mechanics of classical particles with logarithmic interactions. Comm. Pure Appl. Math. 46(1), 27–56 (1993)
16. Kiessling, M.K.-H., Lebowitz, J.L.: The micro-canonical point vortex ensemble: beyond equivalence. Lett. Math. Phys. 42(1), 43–58 (1997)
17. Lungren, T.S., Pointin, Y.B.: Statistical mechanics of two-dimensional vortices in a bounded container. Phys. Fluids 19, 1459–1470 (1976)
18. Marchioro, C., Pulvirenti, M.: Hydrodynamics in two dimensions and vortex theory. Commun. Math. Phys. 84(4), 483–503 (1982)
19. Marchioro, C., Pulvirenti, M.: Vortex methods in two-dimensional fluid dynamics, vol. 203 of Lecture Notes in Physics. Berlin: Springer-Verlag, 1984
20. Marchioro, C., Pulvirenti, M.: Some considerations on the nonlinear stability of stationary planar Euler flows. Commun. Math. Phys. 100(3), 343–354 (1985)
21. Marchioro, C., Pulvirenti, M.: Mathematical theory of incompressible nonviscous fluids, vol. 96 of Applied Mathematical Sciences, New York: Springer-Verlag, 1994
22. Matthaeus, W.H., Stribling, T., Martinez, D., Oughton, S., Montgomery, D.: Selective decay and coherent vortices in two-dimensional incompressible turbulence. Phys. Rev. Lett. 66, 2731–2734 (1991)
23. Mikelić, A., Robert, R.: On the equations describing a relaxation toward a statistical equilibrium state in the two-dimensional perfect fluid dynamics. SIAM J. Math. Anal. 29, 5, 1238–1255 (1998), (electronic)
24. Miller, J.: Statistical mechanics of Euler equations in two dimensions. Phys. Rev. Lett. 65(17), 2137–2140 (1990)
25. Montgomery, D., Joyce, G.: Statistical mechanics of "negative temperature" states. Phys. Fluids 17, 1139–1145 (1971)
26. Naito, Y.: Symmetry results for semilinear elliptic equations in R^2. In: Proceedings of the Third World Congress of Nonlinear Analysts, Part 6 (Catania, 2000), vol. 47, (2001), pp. 3661–3670
27. Onsager, L.: Statistical hydrodynamics. Nuovo Cimento (9) 6, Supplemento, 2 (Convegno Internazionale di Meccanica Statistica), 279–287 (1949)
28. Osada, H.: Limit points of empirical distributions of vortices with small viscosity. In: Hydrodynamic behavior and interacting particle systems (Minneapolis, Minn., 1986), vol. 9 of IMA Vol. Math. Appl., New York: Springer, 1987, pp. 117–126
29. Otto, F.: The geometry of dissipative evolution equations: the porous medium equation. Comm. Part. Diff. Eqs. 26(1-2), 101–174 (2001)
30. Pinsker, M.S.: Information and information stability of random variables and processes. San Francisco, CA: Holden-Day Inc., 1964
31. Rothaus, O.: Diffusion on compact Riemannian manifolds and logarithmic Sobolev inequalities. J. Funct. Anal. 42, 102–109 (1981)
32. Robert, R., Sommeria, J.: Statistical equilibrium states for two-dimensional flows. J. Fluid Mech. 229, 291–310 (1991)
33. Robert, R., Sommeria, J.: Relaxation towards a statistical equilibrium state in two-dimensional perfect fluid dynamics. Phys. Rev. Lett. 69(19), 2776–2779 (1992)
34. Stein, E.M.: Singular integrals and differentiability properties of functions. Princeton Mathematical Series, No. 30. Princeton, NJ: Princeton University Press, 1970
35. Villani, C.: Topics in optimal transportation, vol. 58 of Graduate Studies in Mathematics. Providence, RI: Amer. Math. Soc., 2003

Communicated by P. Constantin

Commun. Math. Phys. 290, 679–717 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0827-z

Communications in

Mathematical Physics

On the Helicity in 3D-Periodic Navier–Stokes Equations II: The Statistical Case Ciprian Foias1,2 , Luan Hoang3 , Basil Nicolaenko4 1 Department of Mathematics, 3368 TAMU, Texas A&M University,

College Station, TX 77843-3368, U.S.A.

2 Department of Mathematics, Indiana University, Bloomington, IN 47405, U.S.A. 3 Department of Mathematics and Statistics, Texas Tech University,

Box 41042, Lubbock, TX 79409-1042, U.S.A. E-mail: [email protected]

4 Department of Mathematics and Statistics, Arizona State University,

Tempe, AZ 85287-1804, U.S.A. Received: 4 August 2008 / Accepted: 25 February 2009 Published online: 7 May 2009 – © Springer-Verlag 2009

Our collaborator and friend Basil Nicolaenko passed away in September of 2007, after this work was completed. Honoring his contribution and friendship, we dedicate this article to him.

Abstract: We study the asymptotic behavior of the statistical solutions to the Navier–Stokes equations using the normalization map [9]. It is then applied to the study of mean energy, mean dissipation rate of energy, and mean helicity of the spatial periodic flows driven by potential body forces. The statistical distribution of the asymptotic Beltrami flows is also investigated. We connect our mathematical analysis with the empirical theory of decaying turbulence. With appropriate mathematically defined ensemble averages, the Kolmogorov universal features are shown to be transient in time. We provide an estimate for the time interval in which those features may still be present.

Contents
1. Introduction
2. Preliminaries
2.1 Deterministic solutions of the Navier–Stokes equations
2.2 Statistical solutions of the Navier–Stokes equations
3. Supplementary Properties of the Normalization Map
4. Asymptotic Behavior of the Mean Flows
5. Statistical Solutions with Initial Gaussian Measures
6. Asymptotic Beltrami Flows
7. Some Generic Properties of VF Measures
8. A Connection to the Empirical Theory of Turbulence
Appendix A
References

1. Introduction This paper is the continuation of our previous work [5]. In that paper, we study the asymptotic behavior of the helicity associated with the deterministic solution of the


Navier–Stokes equations. The current paper is our study of the asymptotic properties of the statistical distributions of the solutions of the Navier–Stokes equations, including the asymptotic behavior of the statistical dynamics of the helicity. In this paper, as in [5], we study the incompressible viscous flows which are periodic in the space variables and are driven by potential body forces. Then the velocity field $u(x,t)$, $x = (x_1, x_2, x_3) \in \mathbb{R}^3$, satisfies the periodicity condition
$$u(x,t) = u(x + L e_j, t), \quad x \in \mathbb{R}^3,\ j = 1, 2, 3, \tag{1.1}$$
(where $L > 0$ is the spatial period and $\{e_1, e_2, e_3\}$ is the standard basis of $\mathbb{R}^3$) as well as the Navier–Stokes equations
$$\frac{\partial u(x,t)}{\partial t} + (u\cdot\nabla)u(x,t) - \nu\Delta u(x,t) = -\nabla p(x,t) - \nabla\varphi(x,t), \tag{1.2}$$
$$\nabla\cdot u(x,t) = 0, \tag{1.3}$$
where $\nu$ is the viscosity of the fluid, $p$ is the pressure and $\varphi$ is the potential of the body force; here we assume that the mass density is equal to one. Using the well-known Galilean invariance of the Navier–Stokes equations we can also (by a change of the reference system) consider only the flows satisfying the following zero space average condition:
$$\int_\Omega u(x,t)\,dx = 0, \quad \Omega = (-L/2, L/2)^3, \tag{1.4}$$
where $dx = dx_1\,dx_2\,dx_3$ is the usual volume element in $\mathbb{R}^3$. Recall that the curl of the velocity, i.e. $\nabla\times u$, is usually called the vorticity of the flow and is denoted by $\omega$. The kinetic energy/mass, the dissipation rate of energy/mass, and the helicity/mass are defined by
$$E(t) = \frac{1}{2}\int_\Omega |u(x,t)|^2\,dx, \qquad F(t) = \int_\Omega |\omega(x,t)|^2\,dx \tag{1.5}$$
and, respectively,
$$H(t) = \int_\Omega u(x,t)\cdot\omega(x,t)\,dx. \tag{1.6}$$

Above, $u\cdot\omega$ is the helicity density of the flow. For the physical importance of the helicity in fluid dynamics, see the pioneering work by Moffatt [18] and also other surveys on this topic (e.g. [19]). In our previous paper [5], we studied mainly the asymptotic behavior of the helicity and its connections with that of the energy, which had been previously determined in [8–10]. Unlike the latter behavior, the former is quite sensitive to the presence of the inertial nonlinear terms in the Navier–Stokes equations. In this paper we will present the asymptotic behavior of the statistical dynamics of all of the above three quantities $E(t)$, $F(t)$ and $H(t)$ for our type of flows. Our main concern is to identify the asymptotic properties of $e^{2\nu(2\pi/L)^2 t}\langle E(t)\rangle$, $e^{2\nu(2\pi/L)^2 t}\langle F(t)\rangle$ and $e^{2\nu(2\pi/L)^2 t}\langle H(t)\rangle$, where $\langle\cdot\rangle$ denotes some appropriate ensemble averages, whose rigorous mathematical definitions will be given in Sect. 8. We prove that they all have limits when $t \to \infty$ and that generically these limits are not zero. One interesting feature in the numerical simulation of the turbulent flows is their statistical tendency to approximate Beltrami flows (see, e.g. [1,21]), i.e., the flows whose


velocity and vorticity are parallel. We show that at least asymptotically this tendency is not generic. Our rigorous results confirm the well known empirical and computational evidence that the Kolmogorov type estimates for the decaying turbulence of the spatially periodic flows lose validity for large times (even after rescaling of the physical entities in order to obtain a simulacrum of stationarity). However, as presented in Sect. 8, our methods yield some estimates of the length of the time interval in which the universal features may still be present.

In our analysis, we use the mathematically defined statistical solutions of the Navier–Stokes equations both on the phase space and the trajectory space. Since this is the first time we develop the asymptotic theory for those statistical solutions, we extend our studies to both the energy and the energy dissipation rate. Another ingredient of our method is the use of the normalization map constructed for the regular solutions to the Navier–Stokes equations in [9,10]. At this stage, we focus on the first rate of decay of the solutions, hence the first component of that normalization map is used throughout. Because the natural space to study the statistical solutions is the space of weak solutions, we extend the definition of that component of the normalization map to those solutions. This newly defined map turns out to be an essential tool in describing the asymptotic behavior of the statistical solutions of the Navier–Stokes equations. In particular, it determines the limits of the ensemble averages referred to above. We also use this map to study the flows which are asymptotically Beltrami. Moreover, the asymptotic behavior of the mean flows is connected with the nonlinear manifold $M_1$ ([8–10]) of the initial data $u_0$ such that the corresponding solution $u(t)$ is regular for all $t \ge 0$ and decays exponentially faster than $e^{-\nu(2\pi/L)^2 t}$.

This paper is organized as follows. In Sect. 2, we present the functional setting of the Navier–Stokes equations and the asymptotic behavior of the deterministic solutions. The definitions of the statistical solutions both in the phase space and the trajectory space are recalled, as well as their fundamental existence theorem. In Sect. 3, we extend the definition of the first component of the normalization map to the set of Leray-Hopf weak solutions. We prove some basic properties of that map. In Sect. 4, we study the asymptotic behavior of the mean energy, mean energy dissipation rate and mean helicity using the Vishik-Fursikov statistical solutions. For the latter two mean quantities, the moving averages in time are used to overcome the lack of regularity of the weak solutions of the Navier–Stokes equations. In Sect. 5, we construct some initial Gaussian probability measures to show that the asymptotic behavior of the above three mean quantities is not trivial. In Sect. 6, we focus on the solutions which are asymptotically Beltrami (see Definition 6.3). Different equivalent conditions for those flows are given. We prove the existence of a Vishik-Fursikov measure with Gaussian initial data which is not asymptotically Beltrami (see Definition 6.7). In Sect. 7, we first show the connections between the mean flows which decay faster than $e^{-\nu(2\pi/L)^2 t}$ and the above nonlinear manifold $M_1$. We then prove the genericity of the mean flows with non-trivial energy or dissipation rate of energy or helicity, and of the flows which are not asymptotically Beltrami. In Sect. 8, we apply our study in the previous sections to the conventional theory of decaying turbulence. We show that the Kolmogorov type estimates are transient in time and estimate the time interval for which those estimates may still be valid. We obtain in Proposition 8.3 certain lower and upper bounds for the Kolmogorov quotient (see (8.7)). The Appendix provides various basic facts on the Navier–Stokes equations which are used in this paper.
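As a concrete illustration of the quantities in (1.5)-(1.6) and of the notion of a Beltrami flow, the following sketch evaluates the energy, the dissipation-rate integral and the helicity on a periodic grid. The Arnold-Beltrami-Childress (ABC) field is used as the test flow; it is a standard textbook example chosen here for illustration and is not taken from the paper. It is divergence-free, has zero average on $\Omega = (-\pi,\pi)^3$ (so $L = 2\pi$), and satisfies $\nabla\times u = u$, hence one expects $F = H = 2E$ up to discretization error. The curl is approximated by central differences, so the agreement is only approximate; function names and grid size are assumptions.

```python
import math


def abc_velocity(x, y, z, A=1.0, B=1.0, C=1.0):
    """Arnold-Beltrami-Childress field: divergence-free, 2*pi-periodic, curl u = u."""
    return (A * math.sin(z) + C * math.cos(y),
            B * math.sin(x) + A * math.cos(z),
            C * math.sin(y) + B * math.cos(x))


def flow_invariants(velocity, n=24, L=2.0 * math.pi):
    """Riemann-sum approximations of E, F and H from (1.5)-(1.6) on (-L/2, L/2)^3."""
    h = L / n
    pts = [-L / 2.0 + i * h for i in range(n)]

    def d(comp, axis, p):
        # Central difference of velocity component `comp` along coordinate `axis`.
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        return (velocity(*plus)[comp] - velocity(*minus)[comp]) / (2.0 * h)

    energy = dissipation = helicity = 0.0
    for x in pts:
        for y in pts:
            for z in pts:
                p = (x, y, z)
                u = velocity(*p)
                # Vorticity omega = curl u, component by component.
                w = (d(2, 1, p) - d(1, 2, p),
                     d(0, 2, p) - d(2, 0, p),
                     d(1, 0, p) - d(0, 1, p))
                energy += 0.5 * sum(c * c for c in u)
                dissipation += sum(c * c for c in w)
                helicity += sum(a * b for a, b in zip(u, w))
    vol = h ** 3
    return energy * vol, dissipation * vol, helicity * vol


if __name__ == "__main__":
    E, F, H = flow_invariants(abc_velocity)
    print(E, F / (2.0 * E), H / (2.0 * E))
```

For a Beltrami flow the helicity is as large as the Cauchy-Schwarz inequality $|H| \le (2E)^{1/2} F^{1/2}$ allows, which is one way to read the statement that velocity and vorticity are parallel.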


C. Foias, L. Hoang, B. Nicolaenko

2. Preliminaries In this paper, we use the same notation as in [5]. We will briefly recall some of them in Subsect. 2.1. Subsection 2.2 consists of the definitions of statistical solutions — both in the phase space and the trajectory space — and their basic existence theorem. 2.1. Deterministic solutions of the Navier–Stokes equations. The initial value problem for the Navier–Stokes equations in the three-dimensional space R3 with a potential body force consists of Eqs. (1.2), (1.3) and the initial condition u(x, 0) = u0 (x),

(2.1)

where $u_0(x)$ is the known initial velocity field. We consider only solutions $u(x,t)$ that satisfy the periodicity condition (1.1) and the zero average condition (1.4). Let $\mathcal V$ be the set of all $L$-periodic trigonometric polynomials with values in $\mathbb R^3$ which are divergence-free and have zero average on $\Omega$ (see (1.1), (1.3) and (1.4)). We define $H$ = closure of $\mathcal V$ in $L^2(\Omega)^3$ and $V$ = closure of $\mathcal V$ in $H^1(\Omega)^3$. Denote by $\langle\cdot,\cdot\rangle$ and $|\cdot|$ the inner product and norm in $L^2(\Omega)^3$. (Note that we also use $|\cdot|$ for the length of vectors in $\mathbb R^3$, but the context will clarify its meaning.) On $V$ we consider the inner product $\langle\langle\cdot,\cdot\rangle\rangle$ and the norm $\|\cdot\|$ defined by

$$\langle\langle u,v\rangle\rangle = \sum_{j,k=1}^{3}\int_\Omega \frac{\partial u_j(x)}{\partial x_k}\,\frac{\partial v_j(x)}{\partial x_k}\,dx \quad\text{and}\quad \|u\| = \langle\langle u,u\rangle\rangle^{1/2},$$

for $u = u(\cdot) = (u_1,u_2,u_3)$ and $v = v(\cdot) = (v_1,v_2,v_3)$ in $V$. Let $A = -\Delta$ be the Stokes operator on the domain $D_A = V\cap H^2(\Omega)^3$. Let $P_L$ denote the orthogonal projector in $L^2(\Omega)^3$ onto $H$. We define $B(u,v) = P_L(u\cdot\nabla v)$ for all $u,v\in D_A$. We denote by $\mathcal R$ the set of all initial values $u_0\in V$ such that there is a (unique) solution $u(t)$, $t>0$, satisfying

$$\frac{du(t)}{dt} + Au(t) + B(u(t),u(t)) = 0,\quad t>0, \qquad u(0) = u_0\in V, \tag{2.2}$$

where the equation holds in $H$, and $u(t)$ is continuous from $[0,\infty)$ into $V$. Such $u(t)$ is called a regular solution of the Navier–Stokes equations. A classical result (see, e.g., [13–15]) is that for any initial data $u_0(x)$ in $H$ there exists a weak solution $u(x,t)$ defined for all $x\in\mathbb R^3$ and $t>0$ which eventually becomes analytic in space and time (see also [4,7,11]), hence regular on $[t_0,\infty)$ for some $t_0\ge 0$. We denote by $S(t)$, $t\ge 0$, the semigroup generated by the regular solutions of the Navier–Stokes equations, i.e., $S(t)u_0$, $u_0\in\mathcal R$, denotes the regular solution of (2.2). Throughout this paper, except for Sect. 8, we take $L = 2\pi$ and $\nu = 1$. The general case is easily recovered by a change of scales. Let $\sigma(A)$ be the spectrum of the Stokes operator $A$. For $n\in\sigma(A)$ we denote by $R_n$ the orthogonal projection of $H$ onto the eigenspace of the Stokes operator $A$ associated to $n$. Let $R_n = 0$ for $n\notin\sigma(A)$. Let $C = \nabla\times$ be the curl operator mapping $V$ into $H$. For each $n\in\sigma(A)$ we have

$$R_n = R_n^+ + R_n^- \quad\text{and}\quad R_nH = R_n^+H\oplus R_n^-H, \tag{2.3}$$

On the Helicity in 3D-Periodic Navier–Stokes Equations II


where $R_n^+$, resp. $R_n^-$, is the orthogonal projection of $H$ onto the eigenspace of the curl operator $C$ associated to $\sqrt n$, resp. $-\sqrt n$, and

$$R_n^\pm H = \{u\in H : Cu = \pm\sqrt n\,u\}. \tag{2.4}$$

It is easy to see that

$$B(u,u) = 0 \quad\text{if } u\in R_n^+H\cup R_n^-H,\ n\in\sigma(A). \tag{2.5}$$
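The identity (2.5) can be verified by a short computation. The following is only a sketch, assuming $u$ is smooth (which holds here, since $R_nH$ consists of trigonometric polynomials) and using the standard vector identity for $(u\cdot\nabla)u$ together with the fact that the projector $P_L$ annihilates gradients.

```latex
\begin{align*}
(u\cdot\nabla)u
  &= \nabla\Bigl(\tfrac12 |u|^2\Bigr) - u\times(\nabla\times u)
  && \text{(vector identity)}\\
  &= \nabla\Bigl(\tfrac12 |u|^2\Bigr) \mp \sqrt{n}\,(u\times u)
  && \text{(since } Cu = \pm\sqrt{n}\,u\text{)}\\
  &= \nabla\Bigl(\tfrac12 |u|^2\Bigr),\\[4pt]
B(u,u)
  &= P_L\bigl((u\cdot\nabla)u\bigr)
   = P_L\,\nabla\Bigl(\tfrac12 |u|^2\Bigr) = 0 .
\end{align*}
```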

Let us recall some known results on the asymptotic expansion of the regular solutions to the Navier–Stokes equations and its associated normalization map (see [8–10,12] for more details). For any $u_0\in\mathcal R$ the regular solution $u(t)$ has the asymptotic expansion

$$u(t)\sim q_1(t)e^{-t} + q_2(t)e^{-2t} + q_3(t)e^{-3t} + \cdots, \tag{2.6}$$

where $q_j(t)$ is a $\mathcal V$-valued polynomial in $t$. For any $N\in\mathbb N$ and $m\in\mathbb N$ one has

$$\Bigl\|u(t) - \sum_{j=1}^{N} q_j(t)e^{-jt}\Bigr\|_{H^m(\Omega)} = O\bigl(e^{-(N+\varepsilon)t}\bigr) \quad\text{as } t\to\infty, \text{ for some } \varepsilon = \varepsilon_{N,m} > 0.$$

The normalization map $W$ is defined by $W(u_0) = W_1(u_0)\oplus W_2(u_0)\oplus\cdots$, where $W_j(u_0) = R_jq_j(0)$ for $j\in\mathbb N$. Then $W$ is a one-to-one analytic mapping from $\mathcal R$ to the Fréchet space $\mathcal S_A = R_1H\oplus R_2H\oplus\cdots$ endowed with the component-wise topology. One has $W'(0) = \mathrm{Id}$, that is,

$$W'(0)u_0 = R_1u_0\oplus R_2u_0\oplus R_3u_0\oplus\cdots. \tag{2.7}$$

For $u_0\in\mathcal R\setminus\{0\}$, there is an eigenvalue $n_0$ of $A$ such that

$$\lim_{t\to\infty}\frac{\|u(t)\|^2}{|u(t)|^2} = n_0 \quad\text{and}\quad \lim_{t\to\infty} u(t)e^{n_0t} = w_{n_0}(u_0)\in R_{n_0}H\setminus\{0\}. \tag{2.8}$$

In this case $W_j(u_0) = 0$, $q_j = 0$ for $j<n_0$, and $q_{n_0} = w_{n_0}(u_0) = W_{n_0}(u_0)\ne 0$. In particular, for $n_0 = 1$, we have the following limits in $V$:

$$W_1(u_0) = \lim_{\tau\to\infty} e^{\tau}u(\tau) = \lim_{\tau\to\infty} e^{\tau}R_1u(\tau). \tag{2.9}$$

2.2. Statistical solutions of the Navier–Stokes equations.

Definition 2.1. We denote by $\mathcal T$ the class of test functionals

$$\Phi(u) = \phi(\langle u,g_1\rangle, \langle u,g_2\rangle, \ldots, \langle u,g_k\rangle),\quad u\in H,$$

for some $k>0$, where $\phi$ is a $C^1$ function on $\mathbb R^k$ with compact support and $g_1, g_2, \ldots, g_k$ are in $V$.

Definition 2.2. A family $\{\mu_t\}_{t\ge 0}$ of Borel probability measures on $H$ is called a statistical solution of the Navier–Stokes equations with the initial data $\mu_0$ if
(i) the initial kinetic energy $\int_H |u|^2\,d\mu_0(u)$ is finite;
(ii) the function $t\mapsto\int_H\varphi(u)\,d\mu_t(u)$ is measurable for every bounded and continuous function $\varphi$ on $H$;
(iii) the function $t\mapsto\int_H|u|^2\,d\mu_t(u)$ belongs to $L^\infty_{loc}([0,\infty))$;
(iv) the function $t\mapsto\int_H\|u\|^2\,d\mu_t(u)$ belongs to $L^1_{loc}([0,\infty))$;
(v) $\mu_t$ satisfies the Liouville equation

$$\int_H\Phi(u)\,d\mu_t(u) = \int_H\Phi(u)\,d\mu_0(u) - \int_0^t\!\!\int_H\langle Au + B(u,u), \Phi'(u)\rangle\,d\mu_s(u)\,ds, \tag{2.10}$$

for all $t\ge 0$ and $\Phi\in\mathcal T$;
(vi) the following energy inequality holds:

$$\int_H|u|^2\,d\mu_t(u) + 2\int_0^t\!\!\int_H\|u\|^2\,d\mu_s(u)\,ds \le \int_H|u|^2\,d\mu_0(u). \tag{2.11}$$

Recall that for each $u_0\in H$, there exists a Leray–Hopf weak solution $u(t)$ of the Navier–Stokes equations with $u(0) = u_0$ (cf. [4,17,22]). This weak solution satisfies $u\in C([0,\infty),H_{weak})\cap L^\infty((0,\infty),H)\cap L^2((0,\infty),V)$. Additionally, let

$$\mathcal G = \mathcal G(u(\cdot)) = \{t_0\ge 0 : \lim_{\tau\searrow 0}|u(t_0+\tau) - u(t_0)| = 0\}, \tag{2.12}$$

then $0\in\mathcal G$, the Lebesgue measure of $[0,\infty)\setminus\mathcal G$ is zero and for any $t_0\in\mathcal G$,

$$|u(t)|^2 + 2\int_{t_0}^{t}\|u(s)\|^2\,ds \le |u(t_0)|^2,\quad t\ge t_0. \tag{2.13}$$

Denote by $\Sigma$ the set of the Leray–Hopf weak solutions of the Navier–Stokes equations on $[0,\infty)$. Hence $\Sigma\subset C([0,\infty),H_{weak})$.

Definition 2.3. A statistical solution $\{\mu_t\}_{t\ge 0}$ of the Navier–Stokes equations in the sense of Definition 2.2 is called a Vishik–Fursikov (VF) statistical solution if there is a Borel probability measure $\hat\mu$, called the Vishik–Fursikov (VF) measure, on the space $C([0,\infty),H_{weak})$, such that
(i) $\hat\mu(\Sigma) = 1$;
(ii) for each $t\ge 0$, $\mu_t$ is the projection measure $\mathrm{Pr}_t\hat\mu$ on $H$, i.e.

$$\int_H\Phi(u)\,d\mu_t(u) = \int_\Sigma\Phi(v(t))\,d\hat\mu(v(\cdot)),\quad\text{for all }\Phi\in C(H_{weak}). \tag{2.14}$$

For convenience, we also call $\mathrm{Pr}_0\hat\mu$ the initial data of $\hat\mu$. The existence theorems of the statistical solutions are summarized in the following ([2,3,16]).

Theorem 2.4. Let $m$ be a Borel probability measure on $H$ such that $\int_H|u|^2\,dm(u)$ is finite. Then there exists a VF statistical solution $\{\mu_t\}_{t\ge 0}$ with $\mu_0 = m$.

Note that such a VF statistical solution and VF measure in Theorem 2.4 are not necessarily unique.

Remark 2.5. If $u(\cdot)\in\Sigma$ then the Dirac measure $\delta_{u(\cdot)}$ is a VF measure. If $\hat\mu$ and $\hat m$ are two VF measures, so is their convex combination $(1-\theta)\hat\mu + \theta\hat m$, for any $\theta\in(0,1)$.


3. Supplementary Properties of the Normalization Map

In this section, we first extend the definition of $W_1(u_0)$, for $u_0\in\mathcal R$, to $W_1(u(\cdot))$, for $u(\cdot)\in\Sigma$. This definition is more suitable for our study of the asymptotic behavior of the statistical solutions to the Navier–Stokes equations. Basic properties of $W_1(u(\cdot))$ are derived, in particular its relations with the initial value $u(0)$, hence showing the connections between the asymptotic and the initial values of the Leray–Hopf weak solutions. First, we prove an invariance property of the first component of the normalization map which leads to the extension of that component map later. We recall from (2.9) that for $u_0\in\mathcal R$,

$$W_1(u_0) = \lim_{\tau\to\infty}e^{\tau}u(\tau) = \lim_{\tau\to\infty}e^{\tau}R_1u(\tau),$$

where the limits are in $V$.

Lemma 3.1. Let $u(\cdot)\in\Sigma$ and $t_0\ge 0$ such that $u(t_0)\in\mathcal R$. Then

$$e^{t}W_1(u(t)) = e^{t_0}W_1(u(t_0)),\quad t\ge t_0. \tag{3.1}$$

Proof. If $u(0) = u_0\in\mathcal R$ then $S(t)u_0\in\mathcal R$ for $t\ge 0$, and

$$W_1(u_0) = \lim_{\tau\to\infty}e^{t+\tau}u(t+\tau) = e^{t}\lim_{\tau\to\infty}e^{\tau}S(\tau)S(t)u_0 = e^{t}W_1(S(t)u_0). \tag{3.2}$$

In general, when $u_0\in H$, let $t_0\ge 0$ be such that $u(t_0)$ is small in $V$ and hence belongs to $\mathcal R$. By (3.2), for $\tau\ge 0$ and $t = \tau+t_0\ge t_0$,

$$W_1(u(t_0)) = e^{\tau}W_1(S(\tau)u(t_0)) = e^{-t_0}e^{t}W_1(u(t)), \tag{3.3}$$

thus proving (3.1).
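As an illustration of (3.1), consider the explicit two-mode solution that also appears in the proof of Remark 3.5. The sketch below assumes $\xi_1\in R_1H$ and $\xi_4\in R_4H$ with all the bilinear terms $B(\xi_i,\xi_j)$ vanishing, so that the flow reduces to the linear Stokes flow; the symbol $n$ is a fixed real parameter.

```latex
\begin{align*}
u(t) &= \xi_1 e^{-t} + n\,\xi_4 e^{-4t}, \qquad t \ge 0,\\
W_1(u(t)) &= \lim_{\tau\to\infty} e^{\tau} u(t+\tau)
  = \lim_{\tau\to\infty}\bigl(\xi_1 e^{-t} + n\,\xi_4 e^{-4t-3\tau}\bigr)
  = \xi_1 e^{-t},\\
e^{t}\,W_1(u(t)) &= \xi_1 = e^{t_0}\,W_1(u(t_0))
  \qquad\text{for all } t \ge t_0 \ge 0 .
\end{align*}
```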

Remark 3.2. For the existence and estimate of the above $t_0$ see, e.g., Lemmas A.1 and A.2 below.

Definition 3.3. Let $u(\cdot)\in\Sigma$. By virtue of Lemma 3.1, we define

$$W_1(u(\cdot)) = e^{t_0}W_1(u(t_0)), \tag{3.4}$$

where $t_0\ge 0$ is such that $u(t_0)\in\mathcal R$.

We then have the following equivalent definition of $W_1(u(\cdot))$ which does not involve $t_0$ explicitly:

$$W_1(u(\cdot)) = e^{t_0}W_1(u(t_0)) = e^{t_0}\lim_{\tau\to\infty}e^{\tau}S(\tau)u(t_0) = \lim_{t\to\infty}e^{t}u(t), \tag{3.5}$$

where the limit is taken in $V$. Similarly, using the second limit in (2.9) we have

$$W_1(u(\cdot)) = \lim_{t\to\infty}e^{t}R_1u(t). \tag{3.6}$$

Note that if $u_0 = u(0)\in\mathcal R$, then we may take $t_0 = 0$ and $W_1(u(\cdot)) = W_1(u_0)$. Thus $W_1(u(\cdot))$ is an extension of $W_1(u_0)$, $u_0\in\mathcal R$. The following is a simple bound of $W_1(u(\cdot))$ in terms of the values $u(t)$, $t\in\mathcal G(u(\cdot))$, in particular, the initial value $u(0)$.

Lemma 3.4. Let $u(\cdot)\in\Sigma$. Then

$$|W_1(u(\cdot))| \le e^{t}|u(t)|,\quad t\in\mathcal G(u(\cdot)). \tag{3.7}$$

In particular,

$$|W_1(u(\cdot))| \le |u(0)|. \tag{3.8}$$

Proof. Let $t\in\mathcal G(u(\cdot))$ and $\tau\ge t$. It follows from (A.5) that $e^{\tau}|u(\tau)| \le e^{\tau}e^{-(\tau-t)}|u(t)| = e^{t}|u(t)|$. Hence $|W_1(u(\cdot))| = \lim_{\tau\to\infty}e^{\tau}|u(\tau)| \le e^{t}|u(t)|$.

Remark 3.5. The estimate (3.8) gives an upper bound for $|W_1(u(\cdot))|/|u(0)|$. However, there is no positive lower bound for the quotient. Indeed, there is a sequence of solutions $u^n(\cdot)$ with nonzero $W_1(u^n(\cdot))$, such that

$$\lim_{n\to\infty}\frac{|W_1(u^n(\cdot))|}{|u^n(0)|} = 0.$$

Proof. Let $u^n_0 = \xi_1 + n\xi_4$, where $\xi_j\in R_jH$, $j = 1, 4$, are such that $|\xi_1| = |\xi_4| = 1$ and $B(\xi_1,\xi_1) = B(\xi_4,\xi_4) = B(\xi_1,\xi_4) = B(\xi_4,\xi_1) = 0$. For instance, we can take

$$\xi_1 = \frac{e_2}{\sqrt{2(2\pi)^3}}\bigl(e^{ie_1\cdot x} + e^{-ie_1\cdot x}\bigr),\qquad \xi_4 = \frac{e_2}{\sqrt{2(2\pi)^3}}\bigl(e^{2ie_1\cdot x} + e^{-2ie_1\cdot x}\bigr).$$

Then $u^n(t) = \xi_1e^{-t} + n\xi_4e^{-4t}$, $n\in\mathbb N$, are the corresponding regular solutions with initial data $u^n_0$. We thus have

$$\frac{|W_1(u^n(\cdot))|}{|u^n(0)|} = \frac{|\xi_1|}{|\xi_1 + n\xi_4|} = \frac{1}{\sqrt{1+n^2}}\to 0,\quad n\to\infty.$$

In the case $|W_1(u(\cdot))|$ attains its maximum value $|u(0)|$, we have the following maximum principle.

Proposition 3.6. Let $u(\cdot)\in\Sigma$ and $u(0) = u_0$. If $|W_1(u(\cdot))| = |u_0|$, then $u_0\in\mathcal R$, $u_0 = W_1(u_0)$ and $u(t) = u_0e^{-t}$ for all $t\ge 0$.

Proof. By Lemma 3.4 and (A.6), we have $|u_0| = |W_1(u(\cdot))| = e^{t}|u(t)|$, for $t\in\mathcal G(u(\cdot))$. Let $I = (t_0, t_0+s)$, $s\in(0,\infty]$, be an interval of regularity of $u(\cdot)$. Then $|u(t)|^2 = e^{-2t}|u_0|^2$ for $t\in I$, hence

$$\frac{d|u(t)|^2}{dt} + 2|u(t)|^2 = 0,\quad t\in I.$$

Comparing with the energy balance equation

$$\frac{d|u(t)|^2}{dt} + 2\|u(t)\|^2 = 0,\quad t\in I,$$

we infer that $\|u(t)\| = |u(t)|$ for all $t\in I$. Hence $u(t)\in R_1H$ in any interval of regularity. Due to its weak continuity, $u(t)\in R_1H$ for all $t\in[0,\infty)$. Consequently, one can check that $B(u(t),u(t))\in R_2H$ for all $t\in[0,\infty)$. In any interval of regularity, $du(t)/dt + Au(t) = -B(u(t),u(t))$, which belongs to both $R_1H$ and $R_2H$. This is possible only if both sides of the equation are zero. The weak continuity of $u(\cdot)$ now implies that $u(t) = u_0e^{-t}$ for all $t\ge 0$, hence $W_1(u(\cdot)) = u_0$.

Another consequence of Lemma 3.4 is the following.

Corollary 3.7. Let $u(\cdot)\in\Sigma$, then for $t\ge 0$ and $T>0$ we have

$$|W_1(u(\cdot))|^2 \le \frac1T\int_t^{t+T}e^{2\tau}|u(\tau)|^2\,d\tau \le |u(0)|^2, \tag{3.9}$$

and

$$\int_t^{t+T}\|u(\tau)\|^2\,d\tau \le \frac{e^{-2t}}{2}|u(0)|^2 - \frac{e^{-2(t+T)}}{2}|W_1(u(\cdot))|^2. \tag{3.10}$$

Proof. The first inequality of (3.9) comes from (3.7) and the fact that $\mathcal G(u(\cdot))$ is dense in $[0,\infty)$. The second inequality of (3.9) is from (A.6). For $t_0, t_0'\in[t,t+T]\cap\mathcal G(u(\cdot))$ such that $t_0<t_0'$, we have from the inequalities (2.13), (A.6) and (3.7) that

$$\int_{t_0}^{t_0'}\|u(\tau)\|^2\,d\tau \le \frac{e^{-2t_0}}{2}|u(0)|^2 - \frac{e^{-2t_0'}}{2}|W_1(u(\cdot))|^2.$$

Then (3.10) follows by taking $t_0\searrow t$ and $t_0'\nearrow t+T$.
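The bounds (3.9) and (3.10) can be sanity-checked on the explicit two-mode solution $u(t) = \xi_1e^{-t} + n\xi_4e^{-4t}$ from the proof of Remark 3.5. The sketch below is only an illustration under that assumption: it uses $|u(t)|^2 = e^{-2t} + n^2e^{-8t}$, $\|u(t)\|^2 = e^{-2t} + 4n^2e^{-8t}$ and $|W_1(u(\cdot))| = 1$, with the integrals evaluated in closed form. The function names are hypothetical.

```python
import math


def energy(t, n):
    """|u(t)|^2 for u(t) = xi1*e^{-t} + n*xi4*e^{-4t}, xi1 and xi4 orthonormal in H."""
    return math.exp(-2 * t) + n ** 2 * math.exp(-8 * t)


def weighted_energy_average(t, T, n):
    """(1/T) * integral over [t, t+T] of e^{2 tau} |u(tau)|^2, in closed form."""
    return 1.0 + n ** 2 * (math.exp(-6 * t) - math.exp(-6 * (t + T))) / (6 * T)


def enstrophy_integral(t, T, n):
    """Integral over [t, t+T] of ||u(tau)||^2 = e^{-2 tau} + 4 n^2 e^{-8 tau}."""
    return ((math.exp(-2 * t) - math.exp(-2 * (t + T))) / 2
            + n ** 2 * (math.exp(-8 * t) - math.exp(-8 * (t + T))) / 2)


def check_corollary_3_7(t, T, n, tol=1e-12):
    """Return (holds_39, holds_310) for the two-mode example."""
    w1_sq = 1.0              # |W_1(u(.))|^2 = |xi1|^2 = 1
    u0_sq = energy(0.0, n)   # |u(0)|^2 = 1 + n^2
    avg = weighted_energy_average(t, T, n)
    holds_39 = (w1_sq <= avg + tol) and (avg <= u0_sq + tol)
    rhs_310 = (math.exp(-2 * t) / 2) * u0_sq - (math.exp(-2 * (t + T)) / 2) * w1_sq
    holds_310 = enstrophy_integral(t, T, n) <= rhs_310 + tol
    return holds_39, holds_310
```

For this example the gap in (3.10) equals $\tfrac{n^2}{2}\bigl(e^{-2t} - e^{-8t} + e^{-8(t+T)}\bigr)\ge 0$, so equality holds only when $n = 0$.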

Note that (3.10) is a slightly better estimate than (A.7). It follows from (2.7) that the map $W_1 : \mathcal R\to\mathcal S_A$ is differentiable at $0$ and $W_1'(0)u = R_1u$ for all $u\in V$. Noting that $W_1(0) = 0$, we then have

$$|W_1(u) - R_1u| = o(\|u\|),\quad\text{for } u\to 0.$$

The following lemma provides an explicit estimate for $|W_1(u) - R_1u|$ in terms of $|u|^2$. This approximation of $W_1(u)$ by $R_1u$ using the quadratic term $|u|^2$ when $|u|\to 0$ will be exploited in Sects. 4 and 6. But first let us note from (A.3) that

$$|\langle R_1B(u,v),w\rangle| = |\langle B(u,v),R_1w\rangle| = |-\langle B(u,R_1w),v\rangle| \le c_3|u|\,|v|\,|AR_1w|^{1/2}|A^{3/2}R_1w|^{1/2}.$$

Since $|R_1w| = |AR_1w| = |A^{3/2}R_1w|$, we obtain

$$|R_1B(u,v)| \le c_3|u|\,|v|,\quad u\in V,\ v\in D_A. \tag{3.11}$$

Lemma 3.8. Let $u(\cdot)\in\Sigma$, then

$$|R_1u(0) - W_1(u(\cdot))| \le c_3|u(0)|^2. \tag{3.12}$$

More generally,

$$|e^{t}R_1(u(t)) - W_1(u(\cdot))| \le c_3e^{t}|u(t)|^2,\quad t\in\mathcal G(u(\cdot)). \tag{3.13}$$

Consequently, when eventually $u(t_0)\in\mathcal R$ then

$$|R_1(u(t)) - W_1(u(t))| \le c_3|u(t)|^2,\quad t\ge t_0. \tag{3.14}$$

Proof. We have

$$\frac{dR_1u}{dt} + R_1u = -R_1B(u,u),$$

whence

$$e^{t}R_1u(t) = e^{t'}R_1u(t') + \int_t^{t'}e^{\tau}R_1B(u(\tau),u(\tau))\,d\tau,\quad t'>t\ge 0.$$

Let $\xi_1 = W_1(u(\cdot))$ and $t\in\mathcal G(u(\cdot))$. Using (3.11) and (A.6), we derive

$$\begin{aligned}
|e^{t}R_1u(t) - \xi_1| &\le |e^{t'}R_1u(t') - \xi_1| + \int_t^{t'}e^{\tau}c_3e^{-2(\tau-t)}|u(t)|^2\,d\tau\\
&\le |e^{t'}R_1u(t') - \xi_1| + c_3|u(t)|^2e^{2t}\bigl(e^{-t} - e^{-t'}\bigr)\\
&\le |e^{t'}R_1u(t') - \xi_1| + c_3e^{t}|u(t)|^2.
\end{aligned}$$

Letting $t'\to\infty$ gives (3.13), by (3.6). Also, when $u(t_0)\in\mathcal R$ and $t\ge t_0$, $\xi_1 = e^{t}W_1(u(t))$, hence (3.14) follows. By setting $t = 0$ in (3.13), we obtain (3.12).

According to Remark 3.5, the quotient $|W_1(u(\cdot))|/|u(0)|$ is not bounded below by a positive constant in general. However, Lemma 3.8 immediately shows that this can be the case for $u(0)$ belonging to some “cones” in $H$ near the origin.

Corollary 3.9. Given $\theta\in(0,1)$, there are positive numbers $\alpha_1$ and $\alpha_2$ such that if $u(\cdot)\in\Sigma$ satisfies $|u(0)|\le\alpha_2$ and $|u(0) - R_1u(0)|\le\alpha_1|u(0)|$ then

$$|W_1(u(\cdot))| \ge \theta|u(0)|. \tag{3.15}$$

Proof. By Lemma 3.8,

$$|W_1(u(\cdot))| \ge |R_1u(0)| - |R_1u(0) - W_1(u(\cdot))| \ge |u(0)| - |u(0) - R_1u(0)| - c_3|u(0)|^2 \ge (1 - \alpha_1 - c_3|u(0)|)\,|u(0)|.$$

Then (3.15) follows with $\alpha_1 < 1-\theta$ and $\alpha_2 = (1-\theta-\alpha_1)/c_3$.
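The choice of constants in the proof of Corollary 3.9 can be made concrete as follows. This is a minimal sketch: the value of $c_3$ is not specified in this section, so it is passed in as a parameter, and the default $\alpha_1 = (1-\theta)/2$ is just one admissible choice, not one prescribed by the paper.

```python
def cone_constants(theta, c3, alpha1=None):
    """Return (alpha1, alpha2) with alpha1 < 1 - theta and
    alpha2 = (1 - theta - alpha1) / c3, as in the proof of Corollary 3.9."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    if c3 <= 0.0:
        raise ValueError("c3 must be positive")
    if alpha1 is None:
        alpha1 = (1.0 - theta) / 2.0  # hypothetical default, any value in (0, 1 - theta) works
    if not 0.0 < alpha1 < 1.0 - theta:
        raise ValueError("alpha1 must lie in (0, 1 - theta)")
    alpha2 = (1.0 - theta - alpha1) / c3
    return alpha1, alpha2


def lower_bound_factor(norm_u0, alpha1, c3):
    """The factor (1 - alpha1 - c3*|u(0)|) multiplying |u(0)| in the proof."""
    return 1.0 - alpha1 - c3 * norm_u0
```

With these constants the factor equals $\theta$ exactly at $|u(0)| = \alpha_2$ and is larger for smaller $|u(0)|$, which is what (3.15) requires.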


The conditions in Corollary 3.9 require small $|u(0)|$. We show below that the conclusion in Corollary 3.9 still holds for $u$ in a $V$-neighborhood of a special unbounded set $\mathcal B_1$ in $H$. Let

$$\mathcal B_1 = \{u\in R_1H : u\ne 0,\ B(u,u) = 0\}. \tag{3.16}$$

By (2.5), the set $\mathcal B_1$ contains the nonzero elements of $R_1^+H\cup R_1^-H$, hence is not empty. Also, if $u\in\mathcal B_1$, then $e^{-t}u$ is the regular solution with initial data $u$, hence $W_1(u) = u$.

Proposition 3.10. Let $u^*\in\mathcal B_1$ and $\theta\in(0,1)$. There exists $\varepsilon_2 = \varepsilon_2(|u^*|,\theta) > 0$ such that if $\|v_0\|\le\varepsilon_2$ then $u_0 = u^* + v_0\in\mathcal R$ and

$$\theta|u_0| \le |W_1(u_0)| \le |u_0|. \tag{3.17}$$

Proof. Let $\varepsilon(|u^*|)$ be defined as in Lemma A.3,

$$\varepsilon_1 = \min\Bigl\{\frac{|u^*|}{2},\ \varepsilon(|u^*|)\Bigr\} \quad\text{and}\quad \varepsilon_2 = \min\Bigl\{\varepsilon_1,\ \frac{(1-\theta)(|u^*| - \varepsilon_1)}{1 + e^{c_3|u^*|}}\Bigr\}, \tag{3.18}$$

where $c_3>0$ is given in the Appendix. Since $\|v_0\|\le\varepsilon_1$, we have $u_0\in\mathcal R$ according to Lemma A.3. Let $u(t) = S(t)u_0$ and $v(t) = u(t) - e^{-t}u^*$. By (A.12), we obtain

$$e^{t}|u(t)| \ge |u^*| - e^{t}|v(t)| \ge |u^*| - |v_0|e^{c_3|u^*|}.$$

It follows that

$$|W_1(u_0)| = \lim_{t\to\infty}e^{t}|u(t)| \ge |u^*| - |v_0|e^{c_3|u^*|},$$

and we obtain

$$|W_1(u_0)| \ge |u_0| - |v_0| - |v_0|e^{c_3|u^*|} = |u_0| - |v_0|\bigl(1 + e^{c_3|u^*|}\bigr). \tag{3.19}$$

Note that $|u_0|\ge|u^*| - \varepsilon_1 > 0$, then $\|v_0\|\le\varepsilon_2$ implies

$$|v_0| \le \frac{(1-\theta)|u_0|}{1 + e^{c_3|u^*|}},$$

hence (3.19) yields $|W_1(u_0)| \ge |u_0| - (1-\theta)|u_0| = \theta|u_0|$. The upper bound in (3.17) is (3.8).

Lemma 3.11. The function $F : u(\cdot)\in\Sigma\mapsto W_1(u(\cdot))$ is Borel measurable. Consequently, $F$ is $\hat\mu$-measurable for any VF measure $\hat\mu$.

Proof. We have for each $t\ge 0$ that the function $F_t : u(\cdot)\in\Sigma\mapsto e^{t}u(t)\in H$ is weakly continuous, hence $H_{weak}$-Borel measurable. Since the Borel sets of $H_{weak}$ are the same as those of $H$, the function $F_t$ is (strongly) Borel measurable. The fact that $W_1(u(\cdot)) = \lim_{t\to\infty}e^{t}u(t)$ implies that $F$ is Borel measurable.


4. Asymptotic Behavior of the Mean Flows

In this section, let $\hat\mu$ be a VF measure on the trajectory space (see Definition 2.3) and let $\{\mu_t\}_{t\ge 0}$ be the family of its projections which is a statistical solution on the phase space $H$ (see Definition 2.2). Recall that $\mu_0$ satisfies the finite initial energy condition:

$$\int|u(0)|^2\,d\hat\mu(u(\cdot)) = \int_H|u|^2\,d\mu_0(u) < \infty. \tag{4.1}$$

This condition, (3.8) and the fact that $W_1(u(\cdot))\in R_1H$ imply

$$\int\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)) = \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \int|u(0)|^2\,d\hat\mu(u(\cdot)) < \infty.$$

We first describe the asymptotic behavior of the mean energy.

Proposition 4.1. We have

$$\lim_{t\to\infty}e^{2t}\int_H|u|^2\,d\mu_t(u) = \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)). \tag{4.2}$$

Proof. First, $e^{2t}\int_H|u|^2\,d\mu_t(u) = \int e^{2t}|u(t)|^2\,d\hat\mu(u(\cdot))$. Since $e^{2t}|u(t)|^2\le|u(0)|^2$, by (A.5), and $\int|u(0)|^2\,d\hat\mu(u(\cdot)) < \infty$, applying Lebesgue's dominated convergence theorem gives

$$\lim_{t\to\infty}e^{2t}\int_H|u|^2\,d\mu_t(u) = \lim_{t\to\infty}\int e^{2t}|u(t)|^2\,d\hat\mu(u(\cdot)) = \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)).$$

For the mean energy dissipation rate, $\int\|u(t)\|^2\,d\hat\mu(u(\cdot))$ is only defined almost everywhere on $(0,\infty)$. However, by virtue of Lemma A.2, we can study the asymptotic behavior of the mean energy dissipation rate on the set of solutions with uniformly bounded initial values in $H$. More precisely, we obtain:

Lemma 4.2. For any $r>0$, we have

$$\lim_{t\to\infty}\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}e^{2t}\|u(t)\|^2\,d\hat\mu(u(\cdot)) = \int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)) \tag{4.3}$$

and

$$\lim_{t\to\infty}\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}e^{2t}\mathcal H(u(t))\,d\hat\mu(u(\cdot)) = \int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)). \tag{4.4}$$

Proof. By virtue of Lemma A.2, there is $t(r)>0$ such that for any $u(\cdot)\in\Sigma$ with $|u(0)|<r$ we have $u(t)\in\mathcal R$ and $e^{t}\|u(t)\|\le 2e|u(0)|$, $t\ge t(r)$. Note that the integrals on the left-hand sides of (4.3) and (4.4) are well-defined for $t\ge t(r)$. Then noting that

$$\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}|u(0)|^2\,d\hat\mu(u(\cdot)) \le \int|u(0)|^2\,d\hat\mu(u(\cdot)) < \infty,$$

we apply Lebesgue's dominated convergence theorem.


Since $\int_H\|u\|^2\,d\mu_t(u)$ is not known to be defined for all $t\in[t_0,\infty)$ for some $t_0\ge 0$, we cannot obtain the same result as Proposition 4.1 for the dissipation rate of energy. However, the energy inequality (2.11) suggests the consideration of the moving average in time $\frac1T\int_t^{t+T}e^{2s}\int_H\|u\|^2\,d\mu_s(u)\,ds$ and its limit as $t\to\infty$. We also consider similar moving averages of the mean energy and helicity.

Proposition 4.3. For any $T>0$, we have

$$\lim_{t\to\infty}\frac1T\int_t^{t+T}e^{2s}\int_H|u|^2\,d\mu_s(u)\,ds = \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)), \tag{4.5}$$

$$\lim_{t\to\infty}\frac1T\int_t^{t+T}e^{2s}\int_H\|u\|^2\,d\mu_s(u)\,ds = \int\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)), \tag{4.6}$$

and

$$\lim_{t\to\infty}\frac1T\int_t^{t+T}e^{2s}\int_H\mathcal H(u)\,d\mu_s(u)\,ds = \int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)), \tag{4.7}$$

where $\mathcal H(u) = \langle Cu,u\rangle$, for $u\in V$.

Proof. Fix $T>0$. For the mean energy, (4.5) is a consequence of Proposition 4.1. We prove (4.6) next. For $t\ge 0$ and $r>0$, let

$$I(t) = \frac1T\int_t^{t+T}e^{2s}\int_H\|u\|^2\,d\mu_s(u)\,ds = I_1(t,r) + I_2(t,r),$$

where

$$I_1(t,r) = \frac1T\int_t^{t+T}\!\!\int_{\{u\in H:\,|u|<r\}}e^{2s}\|u\|^2\,d\mu_s(u)\,ds,\qquad I_2(t,r) = \frac1T\int_t^{t+T}\!\!\int_{\{u\in H:\,|u|\ge r\}}e^{2s}\|u\|^2\,d\mu_s(u)\,ds.$$

Also, let

$$J = \int\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)) = J_1(r) + J_2(r),$$

where

$$J_1(r) = \int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)),\qquad J_2(r) = \int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}\|W_1(u(\cdot))\|^2\,d\hat\mu(u(\cdot)).$$

Then

$$|I(t) - J| \le |I_1(t,r) - J_1(r)| + I_2(t,r) + J_2(r). \tag{4.8}$$

First, by Lemma 3.4, we have

$$J_2(r) \le \int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}|u(0)|^2\,d\hat\mu(u(\cdot)) = \int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu_0(u). \tag{4.9}$$

Second, by using Fubini's theorem,

$$\begin{aligned}
I_2(t,r) &= \frac1T\int_t^{t+T}\!\!\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}e^{2s}\|u(s)\|^2\,d\hat\mu(u(\cdot))\,ds\\
&= \frac1T\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}\int_t^{t+T}e^{2s}\|u(s)\|^2\,ds\,d\hat\mu(u(\cdot))\\
&\le \frac1T\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}e^{2(t+T)}\int_t^{t+T}\|u(s)\|^2\,ds\,d\hat\mu(u(\cdot)).
\end{aligned}$$

Using (A.7), we continue to estimate

$$I_2(t,r) \le \frac{e^{2T}}{T}\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}\frac12|u(0)|^2\,d\hat\mu(u(\cdot)) = \frac{e^{2T}}{2T}\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu_0(u).$$

Let $\varepsilon>0$ be given. By (4.1), there is $r = r(\varepsilon)>0$ such that

$$\frac{e^{2T}}{2T}\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu_0(u) < \varepsilon/3,$$

hence

$$J_2(r) < \varepsilon/3 \quad\text{and}\quad I_2(t,r) < \varepsilon/3,\quad t\ge 0. \tag{4.10}$$

By Lemma 4.2, there is $t_0 = t_0(r)\ge 0$ such that for all $s\ge t_0$,

$$\Bigl|\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}e^{2s}\|u(s)\|^2\,d\hat\mu(u(\cdot)) - J_1(r)\Bigr| < \varepsilon/3.$$

Hence

$$|I_1(t,r) - J_1(r)| \le \frac1T\int_t^{t+T}\Bigl|\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}e^{2s}\|u(s)\|^2\,d\hat\mu(u(\cdot)) - J_1(r)\Bigr|\,ds < \frac1T\int_t^{t+T}\varepsilon/3\,ds = \varepsilon/3. \tag{4.11}$$

Combining (4.8), (4.10) and (4.11), we have that for all $t\ge t_0$, $|I(t) - J| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon$, thus proving (4.6). For the mean helicity, the proof of (4.7) is similar.


Motivated by the existence of the limits in Proposition 4.3 we now study the following ensemble averages of the energy, energy dissipation rate and helicity:

$$\frac1T\int_t^{t+T}\!\!\int_H|u|^2\,d\mu_s(u)\,ds,\qquad \frac1T\int_t^{t+T}\!\!\int_H\|u\|^2\,d\mu_s(u)\,ds \quad\text{and}\quad \frac1T\int_t^{t+T}\!\!\int_H\mathcal H(u)\,d\mu_s(u)\,ds. \tag{4.12}$$

The following is a direct consequence of Proposition 4.3 and the elementary fact that if $f$ is a measurable function on some interval $(c,\infty)$ such that $\lim_{t\to\infty}e^{2t}f(t) = a\in\mathbb R$, then

$$\lim_{t\to\infty}\frac{e^{2t}}{T}\int_t^{t+T}f(s)\,ds = \frac{1-e^{-2T}}{2T}\,a \tag{4.13}$$

for any fixed $T>0$.

Corollary 4.4. We have for any $T>0$ that

$$\lim_{t\to\infty}\frac{e^{2t}}{T}\int_t^{t+T}\!\!\int_H|u|^2\,d\mu_s(u)\,ds = \frac{1-e^{-2T}}{2T}\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)), \tag{4.14}$$

$$\lim_{t\to\infty}\frac{e^{2t}}{T}\int_t^{t+T}\!\!\int_H\|u\|^2\,d\mu_s(u)\,ds = \frac{1-e^{-2T}}{2T}\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)), \tag{4.15}$$

and

$$\lim_{t\to\infty}\frac{e^{2t}}{T}\int_t^{t+T}\!\!\int_H\mathcal H(u)\,d\mu_s(u)\,ds = \frac{1-e^{-2T}}{2T}\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)). \tag{4.16}$$

Remark 4.5. The limits in Corollary 4.4 yield the lower and upper bounds for the ensemble averages in (4.12) when $t$ is large. However, we need later bounds valid for all $t\ge 0$, namely,

$$e^{-2(t+T)}\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \frac1T\int_t^{t+T}\!\!\int_H|u|^2\,d\mu_\tau(u)\,d\tau \le e^{-2t}\int_H|u|^2\,d\mu_0(u), \tag{4.17}$$

$$e^{-2(t+T)}\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \frac1T\int_t^{t+T}\!\!\int_H\|u\|^2\,d\mu_\tau(u)\,d\tau \le \frac{e^{-2t}}{2T}\int_H|u|^2\,d\mu_0(u) - \frac{e^{-2(t+T)}}{2T}\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)), \tag{4.18}$$

for $T>0$ and $t\ge 0$. They follow readily from (3.10) and (3.9).
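The elementary fact (4.13) behind Corollary 4.4 can be checked numerically. The sketch below is an illustration only: the test function $f(s) = a\,e^{-2s}(1+e^{-s})$ is a hypothetical choice satisfying $e^{2t}f(t)\to a$, and the integral is approximated with a composite Simpson rule written from scratch.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson approximation of the integral of f over [a, b]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def scaled_moving_average(f, t, T):
    """(e^{2t}/T) * integral of f over [t, t+T], the left-hand side of (4.13)."""
    return math.exp(2 * t) / T * simpson(f, t, t + T)


def predicted_limit(a, T):
    """(1 - e^{-2T})/(2T) * a, the right-hand side of (4.13)."""
    return (1 - math.exp(-2 * T)) / (2 * T) * a
```

For the test function above the difference between the two sides is $a\,e^{-t}(1-e^{-3T})/(3T)$, so it decays like $e^{-t}$ as $t$ grows.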

According to Proposition 4.1, one can understand the asymptotic behavior of the mean energy by studying $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))$. However, there is yet no explicit way to find $W_1(u(\cdot))$ and $\hat\mu$. Fortunately, $W_1(u(\cdot))$ is related to $R_1u(0)$ by (3.12). Therefore, in some cases, we can reduce our study to $\int_H|R_1u|^2\,d\mu_0(u)$, which only involves the initial measure $\mu_0$ and the finite rank projection $R_1$. Similarly, the study of the asymptotic behavior of the mean helicity can be reduced to $\int_H\mathcal H(R_1u)\,d\mu_0(u)$. To start, we derive some bounds for $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))$ using $\mu_0$.

Proposition 4.6. We have

$$\int_{R_1^+H\cup R_1^-H}|u|^2\,d\mu_0(u) \le \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \int_H|u|^2\,d\mu_0(u). \tag{4.19}$$

Proof. The second half of (4.19) follows from (3.8) and (2.14). If $u_0$ belongs to $R_1^+H$ or $R_1^-H$, then $B(u_0,u_0) = 0$ and hence the corresponding solution is $u(t) = u_0e^{-t}$, which implies $W_1(u(\cdot)) = u_0$. Therefore,

$$\begin{aligned}
\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) &\ge \int_{\{u(\cdot)\in\Sigma:\,u(0)\in R_1^+H\cup R_1^-H\}}|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))\\
&= \int_{\{u(\cdot)\in\Sigma:\,u(0)\in R_1^+H\cup R_1^-H\}}|u(0)|^2\,d\hat\mu(u(\cdot))\\
&= \int_{R_1^+H\cup R_1^-H}|u|^2\,d\mu_0(u),
\end{aligned}$$

which yields the first half of (4.19).

Next, we want to find some sufficient conditions in order that

$$\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \ne 0 \quad\text{or}\quad \int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) \ne 0.$$

Note from Proposition 4.6 that the integral $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))$ is positive whenever $\int_{R_1^+H\cup R_1^-H}|u|^2\,d\mu_0(u)$ is positive. However, the latter condition does not hold even when $\mu_0$ is a Gaussian measure on $R_1H$. Therefore we need to study other criteria which cover more classes of measures. We turn to a statistical version of (3.12) and its similar estimate for the helicity.

Lemma 4.7. We have

$$\int|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) \le c_3\int_H|u|^2\,d\mu_0(u), \tag{4.20}$$

and for any $r>0$,

$$\int|\mathcal H(R_1u(0)) - \mathcal H(W_1(u(\cdot)))|\,d\hat\mu(u(\cdot)) \le I_r, \tag{4.21}$$

where

$$I_r = 2c_3r\int_{\{u\in H:\,|u|<r\}}|u|^2\,d\mu_0(u) + 4\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu_0(u). \tag{4.22}$$


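Proof. The inequality (4.20) follows directly from (3.12):

$$\int|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) \le c_3\int|u(0)|^2\,d\hat\mu(u(\cdot)) = c_3\int_H|u|^2\,d\mu_0(u).$$

Note that $|CR_1u| = \|R_1u\| = |R_1u|$. For the helicity,

$$\begin{aligned}
\int|\mathcal H(R_1u(0)) - \mathcal H(W_1(u(\cdot)))|\,d\hat\mu(u(\cdot)) &\le \int|CR_1u(0) - CW_1(u(\cdot))|\,|R_1u(0)|\,d\hat\mu(u(\cdot))\\
&\quad + \int|CW_1(u(\cdot))|\,|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot))\\
&\le 2\int|u(0)|\,|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot))\\
&= 2\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}|u(0)|\,|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot))\\
&\quad + 2\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}|u(0)|\,|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot)).
\end{aligned}$$

Using (3.12) for the integral on $\{u(\cdot)\in\Sigma : |u(0)|<r\}$, and using (3.8) for the integral on $\{u(\cdot)\in\Sigma : |u(0)|\ge r\}$, we obtain

$$\begin{aligned}
\int|\mathcal H(R_1u(0)) - \mathcal H(W_1(u(\cdot)))|\,d\hat\mu(u(\cdot)) &\le 2c_3\int_{\{u(\cdot)\in\Sigma:\,|u(0)|<r\}}|u(0)|^3\,d\hat\mu(u(\cdot)) + 4\int_{\{u(\cdot)\in\Sigma:\,|u(0)|\ge r\}}|u(0)|^2\,d\hat\mu(u(\cdot))\\
&\le 2c_3r\int_{\{u\in H:\,|u|<r\}}|u|^2\,d\mu_0(u) + 4\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu_0(u) = I_r.
\end{aligned}$$

Using Lemma 4.7, we establish some sufficient conditions under which the integral $\int|W_1(u(\cdot))|\,d\hat\mu(u(\cdot))$ or $\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot))$ does not vanish.

Corollary 4.8. We have the following:
(i) If $\int_H|R_1u|\,d\mu_0(u) > c_3\int_H|u|^2\,d\mu_0(u)$, then $\int|W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) > 0$ and subsequently, $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) > 0$.
(ii) If $\int_H\mathcal H(R_1u)\,d\mu_0(u) > I_r$, resp. $\int_H\mathcal H(R_1u)\,d\mu_0(u) < -I_r$, for some $r>0$, where $I_r$ is defined by (4.22), then $\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) > 0$, resp. $\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) < 0$.

Proof. By (4.20), we have

$$\int|W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) \ge \int|R_1u(0)|\,d\hat\mu(u(\cdot)) - \int|R_1u(0) - W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) \ge \int_H|R_1u|\,d\mu_0(u) - c_3\int_H|u|^2\,d\mu_0(u),$$

hence obtaining (i). The proof of (ii) follows from (4.21) and the following triangle inequalities:

$$\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) \ge \int\mathcal H(R_1u(0))\,d\hat\mu(u(\cdot)) - \int|\mathcal H(R_1u(0)) - \mathcal H(W_1(u(\cdot)))|\,d\hat\mu(u(\cdot)),$$

$$\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) \le \int\mathcal H(R_1u(0))\,d\hat\mu(u(\cdot)) + \int|\mathcal H(W_1(u(\cdot))) - \mathcal H(R_1u(0))|\,d\hat\mu(u(\cdot)).$$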

5. Statistical Solutions with Initial Gaussian Measures

In this section, we focus on VF statistical solutions of the Navier–Stokes equations with initial Gaussian probability measures. In particular, we will construct some Gaussian measures on $H$ to which we can apply Corollary 4.8 to obtain

$$\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) > 0 \quad\text{or}\quad \int\mathcal H(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) \ne 0$$

for any VF measure $\hat\mu$ having one of those Gaussian measures as the initial data.

Example 5.1. Let $N_1$ ($= 12$) be the dimension of $R_1H$ and $\{w_j,\ j = 1,\ldots,N_1\}$ be an orthonormal basis in $R_1H$ and $\{w_n,\ n>N_1\}$ be an orthonormal basis in $(I-R_1)H$. For each $u$ of $H$, we write

$$u = \sum_{j=1}^{\infty}x_jw_j. \tag{5.1}$$

Let $\mu$ be a Gaussian probability measure on $H$ such that the density of the distribution of the random variable $x_j$ is given by

$$\frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\Bigl(-\frac{x_j^2}{2\sigma_j^2}\Bigr),\quad\text{with }\sigma_j>0\text{ for all }j\in\mathbb N,$$

and the random variables $x_j$, $j\in\mathbb N$, are independent (see e.g. [23]). Of course, the $\sigma_j$ must satisfy the condition

$$\sum_{j=1}^{\infty}\sigma_j^2 < \infty.$$

The variance of $\mu$ is

$$\sigma^2 = \int_H|u|^2\,d\mu(u) = \sum_{j=1}^{\infty}\sigma_j^2. \tag{5.2}$$

This measure satisfies the following:

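Lemma 5.2. For every $r>0$, we have

$$\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu(u) \le \int_{\{u\in H:\,|R_1u|\ge r/2\}}|R_1u|^2\,d\mu(u) + \frac{4\sigma'^2\sigma''^2}{r^2} + \sigma''^2, \tag{5.3}$$

where $\sigma'^2 = \sum_{j=1}^{N_1}\sigma_j^2$ and $\sigma''^2 = \sigma^2 - \sigma'^2$.

Proof. Note that

$$\begin{aligned}
\int_{\{u\in H:\,|u|\ge r\}}|u|^2\,d\mu(u) &\le \int_{\{u\in H:\,|u|\ge r\}}|R_1u|^2\,d\mu(u) + \int_H|(I-R_1)u|^2\,d\mu(u)\\
&\le \int_{\{u\in H:\,|R_1u|\ge r/2\}}|R_1u|^2\,d\mu(u) + \int_{\{u\in H:\,|(I-R_1)u|\ge r/2\}}|R_1u|^2\,d\mu(u) + \sigma''^2
\end{aligned}$$

and

$$\begin{aligned}
\int_{\{u\in H:\,|(I-R_1)u|\ge r/2\}}|R_1u|^2\,d\mu(u) &\le \int_H|R_1u|^2\,d\mu(u)\;\mu(\{u\in H : |(I-R_1)u|\ge r/2\})\\
&\le \sigma'^2\,\frac{\int_{\{u\in H:\,|(I-R_1)u|\ge r/2\}}|(I-R_1)u|^2\,d\mu(u)}{(r/2)^2} \le \frac{4\sigma'^2\sigma''^2}{r^2}.
\end{aligned}$$

Thus (5.3) follows.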

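Proposition 5.3. Let $0<\epsilon<1/(c_3\sqrt{2\pi N_1})$ and $\mu$ be the Gaussian probability measure defined in Example 5.1 with $\sigma_j = \epsilon$, $j = 1,\ldots,N_1$, and $\sigma^2 = 2N_1\epsilon^2$. For any VF measure $\hat\mu$ with initial data $\mu$, we have $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) > 0$.

Proof. From (5.2), we have

$$\int_H|u|^2\,d\mu(u) = 2N_1\epsilon^2. \tag{5.4}$$

Also,

$$\int_H|R_1u|\,d\mu(u) \ge \frac{1}{\sqrt{N_1}}\int_{\mathbb R^{N_1}}(|x_1| + |x_2| + \cdots + |x_{N_1}|)\prod_{j=1}^{N_1}\frac{1}{\sqrt{2\pi}\,\sigma_j}\exp\Bigl(-\frac{x_j^2}{2\sigma_j^2}\Bigr)\,dx_1\ldots dx_{N_1} = \frac{\sqrt2}{\sqrt{\pi N_1}}\sum_{j=1}^{N_1}\sigma_j. \tag{5.5}$$

Hence

$$\int_H|R_1u|\,d\mu(u) \ge \frac{\sqrt2}{\sqrt{\pi N_1}}\,N_1\epsilon = \frac{\sqrt{2N_1}\,\epsilon}{\sqrt\pi}. \tag{5.6}$$

Thus, $\int_H|R_1u|\,d\mu(u) > c_3\int_H|u|^2\,d\mu(u)$. Then we apply Corollary 4.8.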
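The threshold on $\epsilon$ in Proposition 5.3 is exactly where the lower bound (5.6) overtakes $c_3$ times the second moment (5.4). The following sketch illustrates this; the numerical value of $c_3$ is not given in this section, so it is treated as an input, and the function names are hypothetical.

```python
import math


def mean_abs_gaussian(sigma):
    """E|x| for a centered Gaussian with standard deviation sigma."""
    return sigma * math.sqrt(2.0 / math.pi)


def first_moment_lower_bound(eps, n1):
    """Right-hand side of (5.6): (1/sqrt(N1)) * sum_j E|x_j| with sigma_j = eps."""
    return n1 * mean_abs_gaussian(eps) / math.sqrt(n1)


def second_moment(eps, n1):
    """Right-hand side of (5.4): the total variance 2*N1*eps^2."""
    return 2.0 * n1 * eps ** 2


def epsilon_threshold(c3, n1):
    """Upper bound 1/(c3*sqrt(2*pi*N1)) on eps from Proposition 5.3."""
    return 1.0 / (c3 * math.sqrt(2.0 * math.pi * n1))


def satisfies_corollary_4_8_i(eps, c3, n1=12):
    """Check the sufficient condition of Corollary 4.8 (i) via the bound (5.6)."""
    return first_moment_lower_bound(eps, n1) > c3 * second_moment(eps, n1)
```

Solving $\sqrt{2N_1}\,\epsilon/\sqrt\pi > 2c_3N_1\epsilon^2$ for $\epsilon$ gives $\epsilon < 1/(c_3\sqrt{2\pi N_1})$, which is the stated range.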

If the upper bound of $\epsilon$ is slightly smaller, we obtain a lower bound for the integral $\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))$ which is comparable with the initial kinetic energy.

Corollary 5.4. Let $\mu$ and $\hat\mu$ be the measures in Proposition 5.3. If $0<\epsilon<1/(2c_3\sqrt{2\pi N_1})$, then

$$\frac{1}{4\pi}\int_H|u|^2\,d\mu(u) < \int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \int_H|u|^2\,d\mu(u). \tag{5.7}$$

Proof. The second inequality of (5.7) is from Proposition 4.6. For the first inequality, we use (4.20), (5.6) and (5.4) to have

$$\begin{aligned}
\int|W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) &\ge \Bigl(\int|W_1(u(\cdot))|\,d\hat\mu(u(\cdot))\Bigr)^2 \ge \Bigl(\int_H|R_1u|\,d\mu(u) - c_3\int_H|u|^2\,d\mu(u)\Bigr)^2\\
&\ge \Bigl(\frac{\sqrt{2N_1}\,\epsilon}{\sqrt\pi} - 2N_1c_3\epsilon^2\Bigr)^2 > \frac{\epsilon^2N_1}{2\pi} = \frac{1}{4\pi}\int_H|u|^2\,d\mu(u).
\end{aligned}$$

Remark 5.5. It is already known that

$$\int_H|u|^2\,d\mu_t(u) \le e^{-2t}\int_H|u|^2\,d\mu_0(u),\quad t\ge 0. \tag{5.8}$$

If $\mu_0 = \mu$ is a measure satisfying the conditions in Corollary 5.4, then Proposition 4.1 and Corollary 5.4 now imply that

$$\int_H|u|^2\,d\mu_t(u) \ge \frac{1}{4\pi}e^{-2t}\int_H|u|^2\,d\mu_0(u),\quad t\ge t_0, \tag{5.9}$$

for some $t_0\ge 0$.

Example 5.6. We consider the Gaussian measure $\mu$ defined in Example 5.1. To find $\mu_0 = \mu$ that satisfies the condition in Corollary 4.8, we will be more specific in choosing the orthonormal system $w_1, w_2, \ldots, w_{N_1}$ in $R_1H$. First we recall that $R_1^+$, resp. $R_1^-$, is the orthogonal projection of $H$ onto the eigenspace of the curl operator $C$ corresponding to the eigenvalue $1$, resp. $(-1)$. Since $\dim R_1^+H = \dim R_1^-H = N = N_1/2 = 6$, we choose $\{w_1,\ldots,w_N\}$ to be an orthonormal basis in $R_1^+H$ and $\{w_{N+1},\ldots,w_{2N}\}$ to be one in $R_1^-H$. Then

$$\int_H\mathcal H(R_1u)\,d\mu(u) = \int_H|R_1^+u|^2\,d\mu(u) - \int_H|R_1^-u|^2\,d\mu(u) = \sigma_+^2 - \sigma_-^2, \tag{5.10}$$

where $\sigma_+^2 = \sum_{j=1}^{N}\sigma_j^2$ and $\sigma_-^2 = \sum_{j=N+1}^{2N}\sigma_j^2$.

We will find a Gaussian measure that satisfies part (ii) of Corollary 4.8. For that purpose, we need to estimate the quantity $I_r$ defined by (4.22).


Lemma 5.7. Let r > 0, δ ∈ (0, 1) and µ be the measure constructed in Example 5.1 with 0 < σ j ≤ r/(2N1 Mδ ), j = 1, 2, . . . , N1 , where Mδ > 0 satisfies 1 √ 2π

{t∈R:|t|≥Mδ }

t 2 e−t

2 /2

dt = δ.

(5.11)

Let Ir be defined by (4.22) with µ0 = µ. Then Ir ≤ σ

2

16σ 2 2c3r + 4δ + 2 r

Proof. Recall that, for r > 0, Ir = 2c3 r

+ (2c3r + 4)σ 2 .

(5.12)

|u| dµ(u) + 4 2

{u∈H :|u|
{u∈H :|u|≥r }

|u|2 dµ(u).

By the change of variable t = x j /σ j , we have {x j ∈R:|x j |≥r/(2N1 )}

1 =√ 2π

1

x 2j

|x j | √ exp − 2 2σ j 2π σ j 2

{|t|≥r/(2N1 σ j )}

σ j2 |t|2 e−|t|

2 /2

dx j

1 dt ≤ √ 2π

{|t|≥Mδ }

σ j2 |t|2 e−|t|

2 /2

dt ≤ δσ j2 .

Hence

{u∈H :|R1 u|≥r/2}

≤

|R1 u|2 dµ(u) =

N1

|x j | √ 2

j=1 {x j ∈R:|x j |≥r/(2N1 )}

{x∈R N1 :|x|≥r/2}

1 2π σ j

|x|2

exp −

N1 j=1

x 2j 2σ j2

1

x 2j

exp − 2 √ 2σ j 2π σ j

dx

dx j

≤ δσ 2 . Applying Lemma 5.2, we obtain {u∈H :|u|≥r }

Since 2c3r

{u∈H :|u|
|u|2 dµ(u) ≤ δσ 2 +

2 dµ(u)

4σ 2 σ 2 + σ 2. r2

≤ 2c3r σ 2 , we have

2 2 4σ σ 2 2 , Ir ≤ 2c3 r σ 2 + σ 2 + 4 + σ + δσ r2 and (5.12) follows.

Now, the condition in the second statement of Corollary 4.8 can be fulfilled by some explicit Gaussian measures.


Proposition 5.8. There exists a Gaussian probability measure $\mu_{0,+}$, resp. $\mu_{0,-}$, as defined in Example 5.6 such that any VF measure $\hat\mu_+$, resp. $\hat\mu_-$, with initial data $\mu_{0,+}$, resp. $\mu_{0,-}$, satisfies

$$\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu_+(u(\cdot)) > 0, \tag{5.13}$$

resp.

$$\int\mathcal H(W_1(u(\cdot)))\,d\hat\mu_-(u(\cdot)) < 0. \tag{5.14}$$

Proof. Let σ j = + for j = 1, √ . . . , N , and σ j = − for j = N + 1, . . . , 2N . For (5.13), we take + = 2 and − = , for some > 0. By (5.10), we have 2 H(R1 u)dµ(u) = N (+2 − − ) = N 2. (5.15) H

Note that σ 2 = 3N 2 . In addition, if 2c3r <

1 r 1 1 16σ 2 1 2 N 2 , 0 < 4δ < , 0 < < √ and ( + 4)σ , , < < 18 18 r2 18 18 2 4 2N1 Mδ

where Mδ is defined in (5.11), then it follows from Lemma 5.7 that 16σ 2 ) + (2c3r + 4)σ 2 r2 1 1 2 N 1 + )+ < 3N 2 ( + 18 2 18 18 H(R1 u)dµ(u). = N 2 =

Ir ≤ 3N 2 (2c3r + 4δ +

H

Then applying Corollary 4.8 (ii),√ we obtain (5.13). 2 is still 3N 2 , For (5.14), we choose − = 2 and + = , then the sum +2 + − 2 2 hence Ir remains less than N . However, H H(R1 u)dµ(u) = −N and therefore H H(R1 u)dµ(u) < −Ir . Again, (5.14) follows from Corollary 4.8 (ii). Next, we construct Gaussian measures µ’s such that for the corresponding VF mea sures µ’s ˆ the integrals |W1 (u(·))|2 d µ(u(·)) ˆ are arbitrarily large. Proposition 5.9. For any M > 0, there exists a VF measure µˆ such that its initial data is a Gaussian probability measure and |W1 (u(·))|2 d µ(u(·)) ˆ ≥ M. (5.16)

Proof. Let µ be a Gaussian measure as in Example 5.6 and let µˆ be a VF measure with initial data µ. Fix M > 0 and θ ∈ (0, 1). Take σ+ > 0 such that θ 5 σ+2 ≥ M. Since lim K →∞ {u∈H :1/K ≤|R + u|≤K } |R1+ u|2 dµ(u) = σ+2 , we can choose K sufficiently large 1 so that |R1+ u|2 dµ(u) ≥ θ σ+2 . (5.17) {u∈H :1/K ≤|R1+ u|≤K }

On the Helicity in 3D-Periodic Navier–Stokes Equations II


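Let $B_1(\theta) = \{u + v : u \in B_1,\ \|v\| < \varepsilon_2(|u|, \theta)\}$, where $B_1$ is defined by (3.16) and $\varepsilon_2(|u|, \theta)$ is in (3.18). By virtue of Proposition 3.10, (3.18) and (A.14), there is $\varepsilon > 0$ depending on $K$ and $\theta$ such that

$$B_1^{\pm}(K, \theta) \overset{\mathrm{def}}{=} \{u + v : u \in R_1^{\pm} H,\ 1/K \le |u| \le K,\ \|v\| \le 2\varepsilon\} \subset B_1(\theta).$$

According to Proposition 3.10, $|W_1(u)| \ge \theta |u|$ for $u \in B_1^{\pm}(K, \theta)$. Thus we have

$$\begin{aligned}
\int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot))
&\ge \theta^2 \int_{\{u(\cdot)\in\Sigma:\,u(0)\in B_1(\theta)\}} |u(0)|^2\,d\hat\mu(u(\cdot))
= \theta^2 \int_{\{u\in H:\,u\in B_1(\theta)\}} |u|^2\,d\mu(u) \\
&\ge \theta^2 \int_{B_1^+(K,\theta)} |u|^2\,d\mu(u) \\
&\ge \theta^2 \int_{\{u\in H:\,1/K\le|R_1^+u|\le K,\ |R_1^-u|\le\varepsilon,\ \|(I-R_1)u\|\le\varepsilon\}} |R_1^+ u|^2\,d\mu(u) \\
&= \theta^2 \int_{\{u\in H:\,1/K\le|R_1^+u|\le K\}} |R_1^+ u|^2\,d\mu(u)\;
\mu(\{u\in H : |R_1^- u| \le \varepsilon\}) \\
&\qquad \times \mu(\{u\in H : \|(I-R_1)u\| \le \varepsilon\}).
\end{aligned} \tag{5.18}$$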

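Assume for the moment that there are $\sigma_j$, for $j > N$, such that

$$\mu(\{u \in H : |R_1^- u| \le \varepsilon\}) \ge \theta, \tag{5.19}$$

$$\mu(\{u \in H : \|(I - R_1)u\| \le \varepsilon\}) \ge \theta. \tag{5.20}$$

Then combining (5.17), (5.18), (5.19) and (5.20) we obtain

$$\int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \ge \theta^2(\theta\sigma_+^2)\theta^2 = \theta^5\sigma_+^2 \ge M, \tag{5.21}$$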

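hence (5.16). It remains to verify (5.19) and (5.20).

Verification of (5.19). We have

$$\begin{aligned}
\mu(\{u \in H : |R_1^- u| \le \varepsilon\})
&= \int_{\{x\in\mathbb{R}^N:\,|x|\le\varepsilon\}} \prod_{j=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_{N+j}}\, e^{-\frac{x_j^2}{2\sigma_{N+j}^2}}\,dx \\
&\ge \prod_{j=1}^{N} \int_{\{|x_j|\le\varepsilon/N\}} \frac{1}{\sqrt{2\pi}\,\sigma_{N+j}}\, e^{-\frac{x_j^2}{2\sigma_{N+j}^2}}\,dx_j \\
&= \prod_{j=1}^{N} \int_{-\varepsilon/(N\sigma_{N+j})}^{\varepsilon/(N\sigma_{N+j})} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y_j^2}{2}}\,dy_j.
\end{aligned}$$

For $\delta \in (0, 1)$, let $m(\delta)$ be the positive number such that

$$\int_{-m(\delta)}^{m(\delta)} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\,dt = \delta.$$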


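Let $\sigma_{N+j} = \epsilon_- \le \varepsilon/(N\,m(\theta^{1/N}))$ for $j = 1, \ldots, N$. We then obtain

$$\mu(\{u \in H : |R_1^- u| \le \varepsilon\}) \ge \left( \int_{-m(\theta^{1/N})}^{m(\theta^{1/N})} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\,dt \right)^{N} = \theta.$$

Verification of (5.20). We will determine $\sigma_j$ for $j > N_1$ such that (5.20) holds. Suppose $A w_j = \lambda_j w_j$, where $(\lambda_j)_{j=1}^{\infty}$ is the increasing sequence of eigenvalues of the Stokes operator. For $w = \sum_{j=N_1+1}^{\infty} x_j w_j \in (I - R_1)H$, we have $\|w\|^2 = \sum_{j=N_1+1}^{\infty} \lambda_j |x_j|^2$. Note that

$$\Big\{ \sum_{j=N_1+1}^{\infty} x_j w_j : |x_j| \le \frac{2^{N_1/2}\,\varepsilon}{2^{j/2}\lambda_j^{1/2}} \Big\} \subset \{ w \in (I - R_1)H : \|w\| \le \varepsilon \}.$$

For $j > N_1$, let $\sigma_j > 0$ be sufficiently small such that

$$\frac{2^{N_1/2}\,\varepsilon}{2^{j/2}\lambda_j^{1/2}\sigma_j} \ge m\big(\theta^{2^{N_1}/2^{j}}\big).$$

We obtain

$$\begin{aligned}
\mu(\{u \in H : \|(I - R_1)u\| \le \varepsilon\})
&\ge \prod_{j=N_1+1}^{\infty} \int_{\big\{|x_j| \le \frac{2^{N_1/2}\varepsilon}{2^{j/2}\lambda_j^{1/2}}\big\}} \frac{1}{\sqrt{2\pi}\,\sigma_j}\, e^{-\frac{x_j^2}{2\sigma_j^2}}\,dx_j \\
&\ge \prod_{j=N_1+1}^{\infty} \int_{-m(\theta^{2^{N_1}/2^{j}})}^{m(\theta^{2^{N_1}/2^{j}})} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\,dt \\
&= \prod_{j=N_1+1}^{\infty} \theta^{2^{N_1}/2^{j}} = \theta,
\end{aligned}$$

hence (5.20) is satisfied. The proof is complete.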
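The two product identities used in the verification of (5.19) and (5.20) can be checked numerically. The following is only a sketch under stated assumptions: the function names `m`, `central_mass`, `finite_block_mass` and `tail_block_mass` are hypothetical, $m(\delta)$ is computed as the standard normal quantile at $(1+\delta)/2$, and the infinite product is truncated after finitely many factors, so it returns $\theta^{1-2^{-\text{terms}}}$ rather than $\theta$ itself.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def m(delta: float) -> float:
    """Half-width m(delta) with P(|Z| <= m(delta)) = delta for Z ~ N(0, 1)."""
    return STD_NORMAL.inv_cdf((1.0 + delta) / 2.0)

def central_mass(half_width: float) -> float:
    """P(|Z| <= half_width) for a standard normal Z."""
    return 2.0 * STD_NORMAL.cdf(half_width) - 1.0

def finite_block_mass(theta: float, n: int) -> float:
    """Product of n equal one-dimensional masses, as in the bound for (5.19)."""
    return central_mass(m(theta ** (1.0 / n))) ** n

def tail_block_mass(theta: float, terms: int) -> float:
    """Partial product of the one-dimensional masses used for (5.20).

    The factor with index j = N_1 + k has mass theta ** (2 ** -k), so the
    partial product over k = 1, ..., terms equals theta ** (1 - 2 ** -terms).
    """
    product = 1.0
    for k in range(1, terms + 1):
        product *= central_mass(m(theta ** (2.0 ** (-k))))
    return product
```

For instance, `finite_block_mass(0.5, 7)` recovers $\theta = 0.5$ up to rounding, and `tail_block_mass(0.5, 30)` is slightly above $0.5$ and decreases to it as more factors are included.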

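Remark 5.10. In the above proof, if $\sigma_-^2, \tilde\sigma^2 \le 1$ and $\sigma_+^2 \ge 2\alpha$, where $\alpha = \theta/(1 - \theta)$, then $\sigma_+^2 \ge \alpha(\sigma_-^2 + \tilde\sigma^2)$ and

$$\sigma_+^2 \ge \alpha\sigma^2/(1 + \alpha) = \theta\sigma^2. \tag{5.22}$$

From (5.22) and (5.21), we have

$$\int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \ge \theta^6 \int_H |u|^2\,d\mu(u).$$

Hence we have proved that for any given $M > 0$ and $\theta \in (0, 1)$, there exists a VF measure $\hat\mu$ with Gaussian initial data $\mu$ such that

$$\int_H |u|^2\,d\mu(u) \ge M \tag{5.23}$$

and

$$\theta \int_H |u|^2\,d\mu(u) \le \int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) \le \int_H |u|^2\,d\mu(u). \tag{5.24}$$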


6. Asymptotic Beltrami Flows

A $C^1$ vector field $u(x)$ in $\mathbb{R}^3$ is said to be Beltrami if

$$\nabla \times u(x) = \alpha(x)u(x), \quad x \in \mathbb{R}^3, \ \text{for some } \alpha(x) \in \mathbb{R}. \tag{6.1}$$

Note that if $u = u(\cdot)$ is an eigenfunction of the curl operator $C$, then (6.1) holds with $\alpha \equiv \pm\sqrt{n}$, for some $n \in \sigma(A)$. The converse is considered in the following:

Lemma 6.1. ([1]) Let $u = u(\cdot) \in R_n H \setminus \{0\}$, where $n \in \sigma(A)$. If $\nabla \times u(x) = \alpha(x)u(x)$ a.e., for some $\alpha(\cdot) \in \mathbb{R}$, then $u$ is an eigenfunction of the curl operator, i.e., $Cu = \sqrt{n}\,u$ or $Cu = -\sqrt{n}\,u$.

Corollary 6.2. Let $u \in R_n H \setminus \{0\}$, where $n \in \sigma(A)$. Then $u$ is Beltrami if and only if $u$ is an eigenfunction of the curl operator, i.e., $Cu = \sqrt{n}\,u$ or $Cu = -\sqrt{n}\,u$.

Let $u(\cdot) \in \Sigma$ be such that there is $t_0 \ge 0$ with $u(t_0) \in \mathcal{R} \setminus \{0\}$. Then

$$n = \lim_{\tau\to\infty} \frac{\|u(t_0 + \tau)\|^2}{|u(t_0 + \tau)|^2} = \lim_{t\to\infty} \frac{\|u(t)\|^2}{|u(t)|^2}$$

is an eigenvalue of the Stokes operator $A$. Note that the eigenvalue $n$ above depends on the asymptotic behavior of the solution but not on the value of $t_0$. Therefore we define

$$n_*(u(\cdot)) = \lim_{t\to\infty} \frac{\|u(t)\|^2}{|u(t)|^2}. \tag{6.2}$$

Denote $n_* = n_*(u(\cdot))$. Define

$$W_*(u(\cdot)) = \lim_{t\to\infty} e^{n_* t} u(t), \tag{6.3}$$

and

$$\overline{W}_*(u(\cdot)) = \frac{W_*(u(\cdot))}{|W_*(u(\cdot))|} = \lim_{t\to\infty} \frac{u(t)}{|u(t)|}, \tag{6.4}$$

where the limits in both (6.3) and (6.4) are taken in either $H$ or $V$. Recall that both $W_*(u(\cdot))$ and $\overline{W}_*(u(\cdot))$ belong to $R_{n_*} H \setminus \{0\}$. Since $u(t_0) \in \mathcal{R}$, we have

$$\lim_{t\to\infty} e^{n_* t} u(t) = e^{n_* t_0} \lim_{\tau\to\infty} e^{n_* \tau} u(t_0 + \tau) = e^{n_* t_0} W_{n_*}(u(t_0)),$$

hence

$$W_*(u(\cdot)) = e^{n_* t_0} W_{n_*}(u(t_0)). \tag{6.5}$$

In particular, if $t_0 = 0$, i.e., $u_0 = u(0) \in \mathcal{R}$, then $W_*(u(\cdot)) = W_{n_*}(u_0)$. In the case $u(t_0) = 0$ for some $t_0 \ge 0$, we let

$$n_*(u(\cdot)) = \infty \quad \text{and} \quad W_*(u(\cdot)) = 0. \tag{6.6}$$

Denote $n_* = n_*(u(\cdot))$ and $\xi_{n_*} = W_*(u(\cdot))$, $\bar\xi_{n_*} = \overline{W}_*(u(\cdot))$. Recall that

$$\frac{u(t)}{|u(t)|} \to \bar\xi_{n_*} \ \text{in } H \text{ and } V, \quad t \to \infty, \tag{6.7}$$


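and hence

$$\frac{Cu(t)}{|u(t)|} \to C\bar\xi_{n_*} \ \text{in } H, \quad t \to \infty. \tag{6.8}$$

It is known (e.g. [12]) that $e^{n_* t} u(t) \to \xi_{n_*}$, for $t \to \infty$, in any $H^m(\Omega)$ norm, $m \in \mathbb{N}$, and consequently in the sup norm. Hence for $x \in \mathbb{R}^3$,

$$\lim_{t\to\infty} e^{n_* t} u(t, x) = \xi_{n_*}(x) \quad \text{and} \quad \lim_{t\to\infty} e^{n_* t}\, \nabla \times u(t, x) = \nabla \times \xi_{n_*}(x). \tag{6.9}$$

Since $\xi_{n_*}(x)$ is analytic, we have

$$\lim_{t\to\infty} e^{n_* t} |u(t, x)| = |\xi_{n_*}(x)| \ne 0, \quad \text{a.e.} \tag{6.10}$$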

Definition 6.3. We say that a time dependent vector field $u(x, t)$ is asymptotically Beltrami if there are $\alpha(x, t) \in \mathbb{R}$ such that

$$\lim_{t\to\infty} \frac{\nabla \times u(x, t) - \alpha(x, t)u(x, t)}{|u(x, t)|} = 0, \quad \text{a.e. on } \mathbb{R}^3. \tag{6.11}$$

Remark 6.4. The limit in (6.11) requires that a.e. on $\mathbb{R}^3$, $u(x, t) \ne 0$, for all $t \ge t_0(x)$.

We obtain the following equivalent conditions for a Leray–Hopf solution to be asymptotically Beltrami.

Theorem 6.5. Let $u(\cdot) \in \Sigma$ be such that $u(t_0) \in \mathcal{R} \setminus \{0\}$, for some $t_0 > 0$. The following are equivalent:

(i) $u(\cdot)$ is asymptotically Beltrami.

(ii) There is a sequence $t_k \nearrow \infty$ and $\alpha(x, t_k) \in \mathbb{R}$ such that

$$\lim_{k\to\infty} \frac{\nabla \times u(x, t_k) - \alpha(x, t_k)u(x, t_k)}{|u(x, t_k)|} = 0, \quad \text{a.e. on } \mathbb{R}^3. \tag{6.12}$$

(iii) $W_*(u(\cdot))$ is a Beltrami vector field.

(iv) For $n_* = n_*(u(\cdot))$,

$$\lim_{t\to\infty} \frac{|Cu(t) - \varepsilon\sqrt{n_*}\,u(t)|}{|u(t)|} = 0, \tag{6.13}$$

where $\varepsilon = 1$ or $-1$.

Proof. Assume (i). Of course, (ii) follows.

Assume (ii). From (6.7) and (6.8) we can assume, without loss of generality, that

$$\lim_{k\to\infty} u(x, t_k)/|u(t_k)| = \bar\xi_{n_*}(x) \ne 0, \quad \text{a.e.,} \tag{6.14}$$

$$\lim_{k\to\infty} \nabla \times u(x, t_k)/|u(t_k)| = C\bar\xi_{n_*}(x), \quad \text{a.e.} \tag{6.15}$$

Thus from (6.14),

$$\lim_{k\to\infty} |u(x, t_k)|/|u(t_k)| = |\bar\xi_{n_*}(x)| \ne 0, \quad \text{a.e.,} \tag{6.16}$$


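and together with (6.12),

$$\lim_{k\to\infty} \frac{\nabla \times u(x, t_k) - \alpha(x, t_k)u(x, t_k)}{|u(t_k)|} = 0, \quad \text{a.e.} \tag{6.17}$$

From (6.15) and (6.17), it follows that

$$\lim_{k\to\infty} \alpha(x, t_k) \frac{u(x, t_k)}{|u(t_k)|} = C\bar\xi_{n_*}(x), \quad \text{a.e.,} \tag{6.18}$$

and hence $\lim_{k\to\infty} \alpha(x, t_k) = \alpha(x)$ exists a.e. on $\mathbb{R}^3$ and $C\bar\xi_{n_*}(x) = \alpha(x)\bar\xi_{n_*}(x)$ a.e. on $\mathbb{R}^3$. By Lemma 6.1, $\bar\xi_{n_*}$ is an eigenfunction of the curl operator $C$. Hence $\bar\xi_{n_*}$ is Beltrami, so is $W_*(u(\cdot))$, and we have (iii).

Assume (iii). Then $C\bar\xi_{n_*} = \varepsilon\sqrt{n_*}\,\bar\xi_{n_*}$, where $\varepsilon = 1$ or $-1$. By (6.7) and (6.8),

$$\lim_{t\to\infty} \left( \frac{Cu(t)}{|u(t)|} - \varepsilon\sqrt{n_*}\,\frac{u(t)}{|u(t)|} \right) = C\bar\xi_{n_*} - \varepsilon\sqrt{n_*}\,\bar\xi_{n_*} = 0,$$

where the limit is taken in $H$, thus proving (iv).

Assume (iv). The limit in (iv) is $|C\bar\xi_{n_*} - \varepsilon\sqrt{n_*}\,\bar\xi_{n_*}|$, hence $C\bar\xi_{n_*} = \varepsilon\sqrt{n_*}\,\bar\xi_{n_*}$ and

$$\nabla \times \xi_{n_*}(x) = \varepsilon\sqrt{n_*}\,\xi_{n_*}(x), \quad x \in \mathbb{R}^3. \tag{6.19}$$

By (6.9) and (6.10),

$$\lim_{t\to\infty} \frac{\nabla \times u(x, t) - \varepsilon\sqrt{n_*}\,u(x, t)}{|u(x, t)|} = \frac{\nabla \times \xi_{n_*}(x) - \varepsilon\sqrt{n_*}\,\xi_{n_*}(x)}{|\xi_{n_*}(x)|}, \quad \text{a.e.}$$

This limit is zero by (6.19), hence (6.11) holds with $\alpha(x, t) \equiv \varepsilon\sqrt{n_*}$.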

Corollary 6.6. Let $u(\cdot) \in \Sigma$ be not identically zero in $(t_0, \infty)$, for some $t_0 > 0$. If $u(t)$ is asymptotically Beltrami then $CW_*(u(\cdot)) = \varepsilon\sqrt{n_*(u(\cdot))}\,W_*(u(\cdot))$, with $\varepsilon = 1$ or $\varepsilon = -1$, and (6.11) holds with $\alpha = \varepsilon\sqrt{n_*(u(\cdot))}$.

We now turn to the statistical study of the asymptotically Beltrami flows using the statistical solutions of the Navier–Stokes equations.

Definition 6.7. Let $\hat\mu$ be a VF measure on $\Sigma$ as in Definition 2.3. We say that $\hat\mu$ is asymptotically Beltrami if almost surely every solution $u(\cdot)$ in $\Sigma$ is asymptotically Beltrami; more precisely,

$$\hat\mu(\{u(\cdot) \in \Sigma : u(\cdot) \text{ is asymptotically Beltrami}\}) = 1. \tag{6.20}$$

We infer from Theorem 6.5 and Corollary 6.2 that if a Leray–Hopf solution $u(\cdot)$ is asymptotically Beltrami then $CW_1(u(\cdot)) = W_1(u(\cdot))$ or $CW_1(u(\cdot)) = -W_1(u(\cdot))$ (this trivially holds if $W_1(u(\cdot)) = 0$), or equivalently, $R_1^- W_1(u(\cdot)) = 0$ or $R_1^+ W_1(u(\cdot)) = 0$. Therefore, a necessary condition for $\hat\mu$ to be asymptotically Beltrami is that

$$\hat\mu(\{u(\cdot) \in \Sigma : |R_1^+ W_1(u(\cdot))|\,|R_1^- W_1(u(\cdot))| = 0\}) = 1, \tag{6.21}$$


or equivalently,

$$\int_\Sigma |R_1^+ W_1(u(\cdot))|\,|R_1^- W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) = 0. \tag{6.22}$$

Another interpretation of (6.22) is the following. Since $R_1^+ u$ and $R_1^- u$ are orthogonal, we have that $|R_1^+ u|\,|R_1^- u| = 0$ if and only if $|R_1^+ u| + |R_1^- u| - |R_1 u| = 0$. Therefore, the necessary condition (6.22) for $\hat\mu$ to be asymptotically Beltrami is equivalent to

$$\int_\Sigma \big( |R_1^+ W_1(u(\cdot))| + |R_1^- W_1(u(\cdot))| - |W_1(u(\cdot))| \big)\,d\hat\mu(u(\cdot)) = 0. \tag{6.23}$$

Proposition 6.8. If $\hat\mu$ is a VF measure with initial data $\mu$ satisfying

$$\int_H \big( |R_1^+ u| + |R_1^- u| - |R_1 u| \big)\,d\mu(u) > 3c_3 \int_H |u|^2\,d\mu(u), \tag{6.24}$$

then $\hat\mu$ is not asymptotically Beltrami.

Proof. Suppose (6.24) holds. Using (4.20) with $\mu_0 = \mu$, one can show that (6.23) does not hold.

Theorem 6.9. There exists a VF measure $\hat\mu$ with initial Gaussian probability measure such that $\hat\mu$ is not asymptotically Beltrami.

Proof. Let $\mu$ be a Gaussian measure as in Example 5.6 and $\hat\mu$ be a VF measure with initial data $\mu$. Let $\sigma_j = \epsilon > 0$ for $j = 1, \ldots, 2N$. Let $\omega_n$ be the area of the $(n-1)$-dimensional unit sphere in $\mathbb{R}^n$, $n \ge 2$. With the substitution $y = r/(\sqrt{2}\,\epsilon)$, we have

$$\begin{aligned}
\int_{\mathbb{R}^n} |z|\, \frac{e^{-|z|^2/(2\epsilon^2)}}{(2\pi)^{n/2}\epsilon^n}\,dz
&= \frac{1}{(2\pi)^{n/2}\epsilon^n} \int_0^\infty r\, e^{-r^2/(2\epsilon^2)}\, \omega_n r^{n-1}\,dr \\
&= \epsilon\, \frac{\omega_n\sqrt{2}}{\pi^{n/2}} \int_0^\infty y^n e^{-y^2}\,dy \\
&= \epsilon\,\alpha_n.
\end{aligned}$$

Condition (6.24) is now equivalent to

$$\epsilon(2\alpha_N - \alpha_{2N}) > 3c_3(N\epsilon^2 + N\epsilon^2 + \tilde\sigma^2). \tag{6.25}$$

Since $2\alpha_N - \alpha_{2N} > 0$, condition (6.25) is satisfied with $\tilde\sigma^2 = N\epsilon^2$ and $\epsilon < (2\alpha_N - \alpha_{2N})/(9Nc_3)$.
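The constant $\alpha_n$ has a closed form that makes the positivity of $2\alpha_N - \alpha_{2N}$ easy to inspect numerically. The sketch below assumes the standard identities $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$ and $\int_0^\infty y^n e^{-y^2}\,dy = \Gamma((n+1)/2)/2$, which give $\alpha_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$. The function names and the sample value of $c_3$ are hypothetical, since the actual constant $c_3$ of (A.3) is not specified numerically here.

```python
import math

def alpha(n: int) -> float:
    """alpha_n = sqrt(2) * Gamma((n + 1) / 2) / Gamma(n / 2).

    lgamma is used so that large n does not overflow.
    """
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

def epsilon_threshold(N: int, c3: float) -> float:
    """Upper limit (2 alpha_N - alpha_2N) / (9 N c3) on epsilon from (6.25)."""
    return (2.0 * alpha(N) - alpha(2 * N)) / (9.0 * N * c3)

def condition_6_25(eps: float, N: int, c3: float) -> bool:
    """Check (6.25) with the tail variance chosen equal to N * eps**2."""
    left = eps * (2.0 * alpha(N) - alpha(2 * N))
    right = 3.0 * c3 * (N * eps**2 + N * eps**2 + N * eps**2)
    return left > right
```

With the placeholder value `c3 = 1.0` and `N = 3`, any `eps` below `epsilon_threshold(3, 1.0)` makes `condition_6_25` return `True`, in line with the last sentence of the proof.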


Remark 6.10. In Proposition 6.8, we can use (6.22) instead of (6.23) to replace (6.24) with the following condition:

$$\int_H |R_1^+ u|\,|R_1^- u|\,d\mu(u) > I_r, \tag{6.26}$$

for some $r > 0$, where $I_r$ is defined by (4.22) with $\mu_0 = \mu$. Also, one can adjust the construction of $\mu$ in the proof of Theorem 6.9 so that $\mu$ satisfies (6.26), hence

$$\int_\Sigma |R_1^+ W_1(u(\cdot))|\,|R_1^- W_1(u(\cdot))|\,d\hat\mu(u(\cdot)) > 0$$

and $\hat\mu$ is not asymptotically Beltrami.

7. Some Generic Properties of VF Measures

First we will show that

$$\int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) > 0 \tag{7.1}$$

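is a generic property for a VF measure $\hat\mu$. For this we will give a useful characterization of a VF measure with that property. Let

$$\Sigma_1 = \{u(\cdot) \in \Sigma : W_1(u(\cdot)) = 0\}. \tag{7.2}$$

It is easy to see that

$$\Sigma_1 = \{u(\cdot) \in \Sigma : u(t) \in \mathcal{M}_1, \text{ for all } t \ge t_0 = t_0(u(\cdot))\}, \tag{7.3}$$

where $\mathcal{M}_1 = \{u \in \mathcal{R} : W_1(u) = 0\}$ (see [8]). It is worth mentioning that $\mathcal{M}_1$ is a manifold in $V$. For our convenience, we will also define

$$\Sigma_{1,t} = \{u(\cdot) \in \Sigma : u(t) \in \mathcal{M}_1\}. \tag{7.4}$$

Then $\Sigma_{1,t} \subset \Sigma_{1,t'}$ for $t \le t'$ and $\Sigma_1 = \cup_{t\ge0}\, \Sigma_{1,t}$.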
Proposition 7.1. Relation (7.1) holds if and only if µ( ˆ 1 ) < 1. Proof. Suppose (7.1) does not hold. Let r > 0. According to Lemma A.2, there is t1 = t1 (r ) > 0 such that u(t1 ) ∈ R whenever |u(0)| < r . Then 0= |W1 (u(·))|2 d µ(u(·)) ˆ {u(·)∈ :|u(0)|
Hence W1 (u(t1 )) = 0 µ-a.e. ˆ on {u(·) ∈ : |u(0)| < r }. Thus ˆ ∈ : |u(0)| < r, u(t1 (r )) ∈ M1 }) 1 ≥ µ( ˆ 1 ) ≥ µ({u(·) = µ({u(·) ˆ ∈ : |u(0)| < r, W1 (u(t1 (r ))) = 0}) = µ({u(·) ˆ ∈ : |u(0)| < r }). Letting r → ∞, we obtain µ( ˆ 1 ) = 1. We now assume that µ( ˆ 1 ) = 1. Since W1 ( 1 ) = {0}, we have W1 (u(·)) = 0 µ-a.e. ˆ on , thus (7.1) fails. For the initial data µ0 of µˆ we obtain the following.


Corollary 7.2. Let N1 = {u 0 ∈ H : ∃u(·) ∈ , u(0) = u 0 , and u(t) ∈ M1 , t ≥ t0 = t0 (u(·))}. If µ0 (N1 ) < 1 then (7.1) holds. Proof. Since N1 = Pr0 1 , we have −1 1 > µ0 (N1 ) = µ(Pr ˆ ˆ 1 ). 0 N1 ) ≥ µ(

Hence (7.1) holds, by virtue of Proposition 7.1.

Remark 7.3. We do not know if µ0 (N1 ) = 1 implies µ( ˆ 1 ) = 1. Definition 7.4. Let µˆ and µ˜ be two Borel measures on . We define d1 (µ, ˆ µ) ˜ by the total variation of the measure µˆ − µ, ˜ that is, ⎧ ⎫ N ⎨ ⎬ d1 (µ, ˆ µ) ˜ = sup |µ(E ˆ j ) − µ(E ˜ j )| , (7.5) ⎩ ⎭ j=1

where the supremum is taken over all Borel partitions {E 1 , E 2 , . . . , E N }, N ∈ N, of . It is known that the space of finite Borel measures on with metric d1 is complete. For our study, it is more suitable to let M be the set of all VF measures and define the following metric for µˆ and µ˜ in M: d(µ, ˆ µ) ˜ = d1 (µ, ˆ µ) ˜ + |u(0)|2 d|µˆ − µ|(u(·)), ˜ (7.6)

where |µˆ − µ| ˜ is the total variation measure of the signed measure (µˆ − µ). ˜ We have: Proposition 7.5. The metric space (M, d) is complete. Proof. Let (µˆ n )∞ ˆ n )∞ n=1 be a Cauchy sequence in (M, d). Then (µ n=1 is a Cauchy sequence with respect to d1 . Therefore there is a Borel measure µˆ on such that limn→∞ d1 (µˆ n , µ) ˆ = 0. Obviously, µˆ is a probability measure on . For r > 0, let B (r ; 0) = {u(·) ∈ : |u(0)| < r }. We have the function u(·) ∈ B (r ; 0) → Pk u(0) is continuous for r > 0, k ∈ N. Given ε > 0, there is N > 0 such that for n > n > N , we have |Pk u(0)|2 d|µˆ n − µˆ n |(u(·)) < ε, B (r ;0)

for any r > 0 and k ∈ N. Letting n → ∞ and then r → ∞, k → ∞, we obtain |u(0)|2 d|µˆ n − µ|(u(·)) ˆ ≤ ε.

Thus limn→∞ = 0. Since |u(0)|2 d µˆ n (u(·)) is finite for each n, it fol lows that |u(0)|2 d µ(u(·)) ˆ is finite. Hence µˆ is a VF measure. Therefore (M, d) is complete. d(µˆ n , µ) ˆ


In what follows, M is considered as a metric space with metric d. A property P(µ) ˆ of a VF measure µˆ is called generic if the set of all VF measures µˆ enjoying the property P(µ) ˆ contains an intersection of dense open sets in M. Lemma 7.6. Let µ, ˆ mˆ ∈ M, ε ∈ (0, 1) and µ˜ = (1 − ε)µˆ + εm. ˆ Then µ˜ ∈ M and d(µ, ˜ µ) ˆ ≤ 2ε + ε |u(0)|2 d µ(u(·)) ˆ + |u(0)|2 d m(u(·)) ˆ . (7.7)

Proof. The fact that µ˜ ∈ M follows from Remark 2.5. For any Borel partition {E j , j = 1, . . . , N }, some N ∈ N, of , we have N

|µ(E ˜ j ) − µ(E ˆ j )| = ε

j=1

N ˆ j ) ≤ 2ε, µ(E ˆ j ) + m(E j=1

thus yielding d1 (µ, ˆ µ) ˜ ≤ 2ε. Moreover, |u(0)|2 d|µ˜ − µ|(u(·)) ˆ = |u(0)|2 d|εmˆ − εµ|(u(·)) ˆ ≤ε |u(0)|2 d µ(u(·)) ˆ +ε |u(0)|2 d m(u(·)). ˆ

Hence (7.7) follows.

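Theorem 7.7. The set $\mathcal{M}_E$ of all $\hat\mu \in \mathcal{M}$ such that (7.1) holds is open and dense in $\mathcal{M}$. Consequently, (7.1) is generic.

Proof. For the density, suppose $\hat\mu \in \mathcal{M} \setminus \mathcal{M}_E$ and $\varepsilon \in (0, 1)$. Denote $M = \int_\Sigma |u(0)|^2\,d\hat\mu(u(\cdot))$. Let $u_0 \in R_1 H \setminus \{0\}$ be such that $Cu_0 = u_0$ and $|u_0| = 1$. Then $S(t)u_0 = u_0(t) = e^{-t}u_0$, for all $t \ge 0$. Clearly, $u_0 \in \mathcal{R}$, $W_1(u_0) = u_0$ and $W_1(u_0(\cdot)) = u_0$, by Definition 3.3. Set $\tilde\mu = (1 - \varepsilon)\hat\mu + \varepsilon\delta_{u_0(\cdot)}$. Then $\tilde\mu \in \mathcal{M}$ and

$$\int_\Sigma |W_1(u(\cdot))|^2\,d\tilde\mu(u(\cdot)) = (1 - \varepsilon) \int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)) + \varepsilon|W_1(u_0(\cdot))|^2 = 0 + \varepsilon|u_0|^2 \ne 0,$$

hence $\tilde\mu \in \mathcal{M}_E$. By Lemma 7.6, we have $d(\hat\mu, \tilde\mu) \le \varepsilon(M + 3)$. Therefore $\mathcal{M}_E$ is dense in $\mathcal{M}$.

Now suppose $\hat\mu \in \mathcal{M}_E$. By Proposition 7.1, we have $\hat\mu(\Sigma_1) < 1$, hence $\delta = \hat\mu(\Sigma \setminus \Sigma_1) > 0$. Assume $\tilde\mu \in \mathcal{M}$ satisfies $d(\tilde\mu, \hat\mu) < \delta$. We have

$$\tilde\mu(\Sigma_1) \le \hat\mu(\Sigma_1) + d_1(\tilde\mu, \hat\mu) < \hat\mu(\Sigma_1) + \delta = 1,$$

thus $\tilde\mu \in \mathcal{M}_E$ thanks to Proposition 7.1 again. Thus $\mathcal{M}_E$ is open.

We now study the genericity of the following property:

$$\int_\Sigma \mathcal{H}(W_1(u(\cdot)))\,d\hat\mu(u(\cdot)) \ne 0. \tag{7.8}$$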


For that purpose, we denote by M H the set of all µˆ ∈ M such that (7.8), holds. Note that M H = M+H ∪ M− H , where M+H = µˆ ∈ M : H(W1 (u(·)))d µ(u(·)) ˆ >0 , (7.9) M− ˆ ∈M: H(W1 (u(·)))d µ(u(·)) ˆ <0 . (7.10) H = µ

Theorem 7.8. The set M H is open and dense in M. Subsequently, (7.8) is generic. Proof. First, let µˆ ∈ M\M H and ε ∈ (0, 1). Let u 0 (·), µ˜ and M be as in Theorem 7.7. Above we proved d(µ, ˜ µ) ≤ ε(3 + M). Also H(W1 (u(·)))d µ(u(·)) ˜ = (1 − ε) H(W1 (u(·)))d µ(u(·)) ˆ + εH(W1 (u 0 (·)))

= εCu 0 , u 0 = ε|u 0 |2 > 0, hence µ˜ ∈ M H . Thus M H is dense in M. Second, let µˆ ∈ M+H such that H(W1 (u(·)))d µ(u(·)) ˆ = δ > 0.

Suppose µ˜ ∈ M satisfies d(µ, ˜ µ) ˆ < δ. Then we have

H(W1 (u(·)))d µ(u(·))

˜ − H(W1 (u(·)))d µ(u(·)) ˆ

≤ |H(W1 (u(·)))|d|µ˜ − µ|(u(·)) ˆ ≤ |u(0)|2 d|µ˜ − µ|(u(·)) ˆ < δ.

M+H . Therefore M+H

Thus H(W1 (u(·)))d µ(u(·)) ˜ > 0 or µ˜ ∈ is open and hence so is M H . The proof is complete.

is open. Similarly, M− H

We now discuss the genericity of the VF measures which are asymptotically Beltrami (see Definition 6.7). We let M B = {µˆ ∈ M : µˆ is asymptotically Beltrami}.

(7.11)

Proposition 7.9. M\M B contains an open and dense subset of M. Consequently, the property “µˆ is not asymptotically Beltrami” for a VF measure µˆ is generic. Proof. Let

|R1+ W1 (u(·))| |R1− W1 (u(·))|d µ(u(·)) ˆ >0 . N B = µˆ ∈ M :

We know from the necessary condition (6.22) that N B is a subset of M\M B . Similar to Theorem 7.8, one can easily prove that N B is open. It suffices to show that N B is dense. µˆ ∈ M\N B . Let mˆ be in M having initial data µ0 as in Remark 6.10. We have Suppose + W (u(·))| |R − W (u(·))|d m(u(·)) |R ˆ > 0. Given ε ∈ (0, 1), let µ˜ = (1 − ε)µˆ + εm. ˆ 1 1 1 1 Then |R1+ W1 (u(·))| |R1− W1 (u(·))|d µ(u(·)) ˜ =ε |R1+ W1 (u(·))| |R1− W1 (u(·))|d m(u(·)) ˆ ,


which is positive, hence µ˜ ∈ N B . Also, it follows from Lemma 7.6 that 2 |u(0)| d µ(u(·)) ˆ + |u(0)|2 d m(u(·)). ˆ d(µ, ˜ µ) ˆ < ε(M + 2) where M =

Therefore N B is dense. The proof is complete.

8. A Connection to the Empirical Theory of Turbulence

In this section, we connect our analytic study of the statistical solutions of the Navier–Stokes equations to the empirical theory of decaying turbulence. However, unlike the preceding sections, which are based on rigorous mathematical arguments, the following discussion also involves heuristic inferences. Here $x$, $L$, $t$, $\nu$ are the dimensional spatial variable, period, time and viscosity. To apply the results established in the previous sections, we use the following change of scales:

$$x = \frac{\tilde x}{\sqrt{\lambda_1}}, \qquad t = \frac{1}{\nu\lambda_1}\,\tilde t,$$

where $\lambda_1 = (2\pi/L)^2$ denotes the first eigenvalue of the Stokes operator. Then $\tilde x$ and $\tilde t$ play the roles of the corresponding adimensional variables of the previous sections.

Let us recall the basic features of Kolmogorov's empirical theory of turbulence. In that theory, the following quantities are essential:

$$U^2 = \frac{1}{L^3} \Big\langle \int_{[0,L]^3} |u(x, t)|^2\,dx \Big\rangle \quad \text{and} \quad \epsilon = \frac{\nu}{L^3} \Big\langle \int_{[0,L]^3} |\nabla \times u(x, t)|^2\,dx \Big\rangle,$$

where $\langle \cdot \rangle$ denotes an "adequate" ensemble average. Note that $U^2$ is twice the mean energy/mass and $\epsilon$ is the mean energy dissipation rate/mass. These two quantities are connected by

$$U^2 \sim \int_{k_i}^{k_d} S(k)\,dk, \qquad \epsilon \sim \nu \int_{k_i}^{k_d} k^2 S(k)\,dk,$$

where $S(k)$ is the energy spectrum and $[k_i, k_d]$ is called the "inertial range" of the turbulent flows. Assuming $k_i \sim k_0 = \sqrt{\lambda_1} = 2\pi/L$, $k_d \sim (\epsilon/\nu^3)^{1/4}$ and $S(k) \sim \epsilon^{2/3} k^{-5/3}$ (based on dimensional analysis), we obtain

$$U^2 \sim \epsilon^{2/3} \int_{k_i}^{k_d} k^{-5/3}\,dk \sim \epsilon^{2/3} k_i^{-2/3} \sim (\epsilon L)^{2/3}. \tag{8.1}$$
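The scaling (8.1) can be illustrated with a short computation. This is a heuristic sketch, not part of the theory: all proportionality constants hidden in the $\sim$ relations are set to 1 by assumption, and the function names are hypothetical. It evaluates $\int_{k_i}^{k_d} \epsilon^{2/3} k^{-5/3}\,dk$ in closed form and compares it with $(\epsilon L)^{2/3}$.

```python
import math

def energy_from_spectrum(eps: float, k_i: float, k_d: float) -> float:
    """Closed form of the integral of eps^(2/3) * k^(-5/3) over [k_i, k_d]."""
    return 1.5 * eps ** (2.0 / 3.0) * (k_i ** (-2.0 / 3.0) - k_d ** (-2.0 / 3.0))

def inertial_range(eps: float, nu: float, L: float):
    """Return (k_i, k_d) = (2 pi / L, (eps / nu^3)^(1/4)), constants set to 1."""
    return 2.0 * math.pi / L, (eps / nu ** 3) ** 0.25

def ratio_to_scaling(eps: float, nu: float, L: float) -> float:
    """U^2 / (eps * L)^(2/3) for the model spectrum above."""
    k_i, k_d = inertial_range(eps, nu, L)
    return energy_from_spectrum(eps, k_i, k_d) / (eps * L) ** (2.0 / 3.0)
```

Under these assumptions the ratio increases toward the constant $\tfrac{3}{2}(2\pi)^{-2/3}$ as $\nu \to 0$ (that is, as $k_d/k_i \to \infty$), which is the sense in which $U^2 \sim (\epsilon L)^{2/3}$.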

In the empirical theory of turbulence, both quantities $U^2$ and $\epsilon$ are often considered time-independent. However, in our study, the body force is potential, hence they decay exponentially. We propose the following seemingly suitable candidates for these quantities based on our mathematical studies in the previous sections. Let $(\mu_t)_{t\ge0}$ be a VF statistical solution to the Navier–Stokes equations with the VF measure $\hat\mu$ and $T > 0$. We define for $t \ge 0$,

$$U_t^2 = \lambda_1^{3/2}\, \frac{1}{T} \int_t^{t+T} \int_H |u|^2\,d\mu_\tau(u)\,d\tau, \tag{8.2}$$


and

$$\epsilon_t = \nu\lambda_1^{3/2}\, \frac{1}{T} \int_t^{t+T} \int_H \|u\|^2\,d\mu_\tau(u)\,d\tau, \tag{8.3}$$

where we recall that $|u|$ denotes the $L^2$-norm on $\Omega = (-L/2, L/2)^3$ and $\|u\| = |\nabla u| = |\nabla \times u|$. For our asymptotic study, the first component of the normalization map is defined now by

$$W_1(u(\cdot)) = \lim_{t\to\infty} e^{\nu\lambda_1 t} u(t), \tag{8.4}$$

where the limit is taken in any Sobolev norm. We also let

$$\alpha_0^2 = \lambda_1^{3/2} \int_H |u|^2\,d\mu_0(u) \quad \text{and} \quad \alpha_1^2 = \lambda_1^{3/2} \int_\Sigma |W_1(u(\cdot))|^2\,d\hat\mu(u(\cdot)).$$

For the long time dynamics of $U_t^2$ and $\epsilon_t$, we have the following dimensional version of the related results in Corollary 4.4.

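Proposition 8.1. We have for each $T > 0$ that

$$\lim_{t\to\infty} e^{2\nu\lambda_1 t}\, U_t^2 = \frac{1 - e^{-2T}}{2T}\, \alpha_1^2, \tag{8.5}$$

$$\lim_{t\to\infty} e^{2\nu\lambda_1 t}\, \epsilon_t = \frac{1 - e^{-2T}}{2T}\, \alpha_1^2. \tag{8.6}$$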

If (8.1) applies to $U_t^2$ and $\epsilon_t$ then there are absolute positive constants $c_K$ and $C_K$ such that

$$c_K \le \frac{U_t^2}{(L/2\pi)^{2/3}\,\epsilon_t^{2/3}} = \frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le C_K. \tag{8.7}$$

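By virtue of Proposition 8.1, relation (8.7) will not hold when $t$ is sufficiently large and $\alpha_1^2 > 0$. (The case $\alpha_1^2 > 0$ is, in fact, generic according to our study in Sect. 7.) We will estimate the time interval on which (8.7) may still be valid, hence the universal features of the turbulent flows may only be observed on that interval of time. Furthermore, we find rigorous lower and upper bounds for the quotient $\lambda_1^{1/3} U_t^2/\epsilon_t^{2/3}$. To start, we restate the inequalities in Remark 4.5 in their dimensional forms.

Lemma 8.2. We have for $T > 0$ and $t \ge 0$ that

$$e^{-2\nu\lambda_1(t+T)}\alpha_1^2 \le U_t^2 \le e^{-2\nu\lambda_1 t}\alpha_0^2, \tag{8.8}$$

$$\nu\lambda_1 U_t^2 \le \epsilon_t \le \frac{e^{-2\nu\lambda_1 t}}{2T}\big(\alpha_0^2 - e^{-2\nu\lambda_1 T}\alpha_1^2\big). \tag{8.9}$$

Proposition 8.3. Let $Q = \alpha_1^2/\alpha_0^2$. We have for $t \ge 0$ that

$$Q \left( \frac{2\nu\lambda_1 T\, e^{-2\nu\lambda_1 T}}{1 - e^{-2\nu\lambda_1 T} Q} \right)^{2/3} \left( \frac{e^{-2\nu\lambda_1 t}\alpha_0^2}{\lambda_1\nu^2} \right)^{1/3} \le \frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le \left( \frac{e^{-2\nu\lambda_1 t}\alpha_0^2}{\lambda_1\nu^2} \right)^{1/3}, \tag{8.10}$$

or, equivalently,

$$\left( \frac{2\nu\lambda_1 T}{Q^{-1} e^{2\nu\lambda_1 T} - 1} \right)^{2/3} \left( \frac{e^{-2\nu\lambda_1 t}\alpha_1^2}{\lambda_1\nu^2} \right)^{1/3} \le \frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le \left( \frac{e^{-2\nu\lambda_1 t}\alpha_0^2}{\lambda_1\nu^2} \right)^{1/3}. \tag{8.11}$$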

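Proof. For the upper bound of $\lambda_1^{1/3} U_t^2/\epsilon_t^{2/3}$, we have from Lemma 8.2,

$$\frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le \frac{\lambda_1^{1/3} U_t^2}{(\lambda_1\nu U_t^2)^{2/3}} = \frac{U_t^{2/3}}{\lambda_1^{1/3}\nu^{2/3}} \le \frac{\big(e^{-2\nu\lambda_1 t}\alpha_0^2\big)^{1/3}}{\lambda_1^{1/3}\nu^{2/3}}.$$

For the lower bound:

$$\frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \ge \frac{\lambda_1^{1/3}\, e^{-2\nu\lambda_1(t+T)}\, Q\alpha_0^2}{\Big[ \frac{e^{-2\nu\lambda_1 t}}{2T}\big(\alpha_0^2 - e^{-2\nu\lambda_1 T}\alpha_1^2\big) \Big]^{2/3}} = Q \left( \frac{e^{-2\nu\lambda_1 t}\alpha_0^2}{\lambda_1\nu^2} \right)^{1/3} \left( \frac{2\nu\lambda_1 T\, e^{-2\nu\lambda_1 T}}{1 - e^{-2\nu\lambda_1 T} Q} \right)^{2/3}.$$

Hence we obtain (8.10). The estimates in (8.11) follow immediately.

Corollary 8.4. The relation (8.1) may only be valid on the time interval $[t_K, T_K]$ where

$$t_K = \frac{1}{2\nu\lambda_1} \left\{ \log\frac{\alpha_1^2}{\lambda_1\nu^2} - 3\log C_K - 2\log\frac{Q^{-1} e^{2\nu\lambda_1 T} - 1}{2\nu\lambda_1 T} \right\}, \tag{8.12}$$

$$T_K = \frac{1}{2\nu\lambda_1} \left\{ \log\frac{\alpha_0^2}{\lambda_1\nu^2} - 3\log c_K \right\}. \tag{8.13}$$

Proof. For $t \ge 0$ such that (8.1) holds, it follows from (8.8) that

$$c_K \le \frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le \frac{\big(e^{-2\nu\lambda_1 t}\alpha_0^2\big)^{1/3}}{\lambda_1^{1/3}\nu^{2/3}},$$

thus yielding $t \le T_K$. Similarly, using (8.9), we have

$$\left( \frac{2\nu\lambda_1 T}{Q^{-1} e^{2\nu\lambda_1 T} - 1} \right)^{2/3} \left( \frac{e^{-2\nu\lambda_1 t}\alpha_1^2}{\lambda_1\nu^2} \right)^{1/3} \le C_K,$$

hence we obtain $t \ge t_K$.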
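The endpoints in Corollary 8.4 are exactly the times at which the two outer bounds in (8.11) meet the constants $c_K$ and $C_K$. The sketch below makes that relationship explicit; the function names and all numerical values used with it are hypothetical, chosen only to illustrate the formulas (8.11), (8.12) and (8.13).

```python
import math

def upper_bound(t, alpha0_sq, nu, lam1):
    """Right-hand side of (8.11)."""
    return (math.exp(-2 * nu * lam1 * t) * alpha0_sq / (lam1 * nu**2)) ** (1 / 3)

def lower_bound(t, T, alpha0_sq, alpha1_sq, nu, lam1):
    """Left-hand side of (8.11), with Q = alpha1_sq / alpha0_sq."""
    Q = alpha1_sq / alpha0_sq
    x = 2 * nu * lam1 * T
    decay = math.exp(-2 * nu * lam1 * t) * alpha1_sq / (lam1 * nu**2)
    return (x / (math.exp(x) / Q - 1)) ** (2 / 3) * decay ** (1 / 3)

def time_window(T, alpha0_sq, alpha1_sq, nu, lam1, c_K, C_K):
    """Return (t_K, T_K) as given by (8.12) and (8.13)."""
    Q = alpha1_sq / alpha0_sq
    x = 2 * nu * lam1 * T
    t_K = (math.log(alpha1_sq / (lam1 * nu**2))
           - 3 * math.log(C_K)
           - 2 * math.log((math.exp(x) / Q - 1) / x)) / (2 * nu * lam1)
    T_K = (math.log(alpha0_sq / (lam1 * nu**2))
           - 3 * math.log(c_K)) / (2 * nu * lam1)
    return t_K, T_K
```

Evaluating `upper_bound` at `T_K` returns `c_K`, and evaluating `lower_bound` at `t_K` returns `C_K`, so outside $[t_K, T_K]$ at least one of the inequalities in (8.7) is ruled out by (8.11).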

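Example 8.5. Let $L = 2\pi$ ($\lambda_1 = 1$), $\nu = 1$ and $\hat\mu$ be the VF measure in Corollary 5.4. We have $(4\pi)^{-1} \le Q \le 1$. It follows from Proposition 8.3 that

$$\left( \frac{1}{4\pi} \right)^{1/3} \left( \frac{2T}{4\pi e^{2T} - 1} \right)^{2/3} \big(e^{-2t}\alpha_0^2\big)^{1/3} \le \frac{\lambda_1^{1/3} U_t^2}{\epsilon_t^{2/3}} \le \big(e^{-2t}\alpha_0^2\big)^{1/3},$$

for all $t \ge 0$. Also, by Corollary 8.4, we derive

$$T_K = \frac{1}{2}\big\{ \log(\alpha_0^2) - 3\log c_K \big\},$$

$$t_K \ge \frac{1}{2}\left\{ \log\frac{\alpha_0^2}{4\pi} - 3\log C_K - 2\log\frac{4\pi e^{2T} - 1}{2T} \right\}.$$

Now, if we let $M > 0$, $\theta \in (0, 1)$ and $\hat\mu$ be a VF measure satisfying (5.23) and (5.24), then $\alpha_0^2 \ge M$ and $\theta \le Q \le 1$, and $t_K$ in (8.12) can be bounded below by

$$t_K \ge \frac{1}{2}\left\{ \log M - 3\log C_K - 2\log\frac{\theta^{-1} e^{2T} - 1}{2T} \right\}.$$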


Appendix A

In this paper we need several well-known estimates for the non-linear term $B(u, u)$ in the Navier–Stokes equations (2.2). For the convenience of the reader, we list them below. There are positive constants $c_j$, $j = 1, 2, 3$, such that

$$|\langle B(u, v), w \rangle| \le c_1 \|u\|\, \|v\|^{1/2} |Av|^{1/2} |w|, \tag{A.1}$$

$$|\langle B(u, v), w \rangle| \le c_2 \|u\|^{1/2} |Au|^{1/2} \|v\|\, |w|, \tag{A.2}$$

$$|\langle B(u, v), w \rangle| \le c_3 |u|\, |Av|^{1/2} |A^{3/2} v|^{1/2} |w|. \tag{A.3}$$

The numbering of the constants is done in order to indicate the estimate in which the constant $c_j$ appears. Thus

$$|B(u, v)| \le \min\{ c_1 \|u\|\, \|v\|^{1/2} |Av|^{1/2},\ c_2 \|u\|^{1/2} |Au|^{1/2} \|v\|,\ c_3 |u|\, |Av|^{1/2} |A^{3/2} v|^{1/2} \}. \tag{A.4}$$

Let $u(\cdot)$ be a Leray–Hopf solution on $[0, \infty)$ and $G = G(u(\cdot))$ be defined by (2.12). It is known that for $t_0 \in G$, we have

$$|u(t)| \le e^{-(t - t_0)} |u(t_0)|, \quad t \ge t_0. \tag{A.5}$$

In particular, $0 \in G$ and

$$|u(t)| \le e^{-t} |u(0)|, \quad t \ge 0. \tag{A.6}$$

For $t' > t \ge 0$, let $t_0 \in [t, t') \cap G$; then by (2.13)

$$2 \int_{t_0}^{t'} \|u(s)\|^2\,ds \le |u(t_0)|^2 \le e^{-2t_0} |u(0)|^2.$$

Letting $t_0 \to t$, we obtain

$$\int_t^{t'} \|u(s)\|^2\,ds \le \frac{e^{-2t}}{2} |u(0)|^2, \quad t' > t \ge 0. \tag{A.7}$$

Lemma A.1. There is $\varepsilon_0 > 0$ such that if $\|u_0\| \le \varepsilon_0$ then $u_0 \in \mathcal{R}$ and

$$\|u(t)\| \le 2e^{-t} \|u_0\|, \quad t > 0. \tag{A.8}$$

Proof. Though this is a consequence of the convergence of the asymptotic expansion of the regular solution when the initial data is small (cf. [6]), we present below an elementary proof to make our paper self-contained. The calculations are formal but can be made rigorous using the Galerkin approximations. Let C0 = min{c1 , c2 }. It follows from (2.2) and (A.4) that 1 d 1 u2 + |Au|2 ≤ |B(u, u), Au| ≤ C0 |Au|3/2 u3/2 ≤ |Au|2 + 2C04 u6 . 2 dt 2 √ Let C1 = 1/(2C0 4 2) and u 0 < C1 . By the standard small initial data argument, we have u 0 ∈ R and u(t) ≤ e−t/2 u 0 . Now, using interpolating inequality u2 ≤ |u||Au|, we obtain 1 d u2 + |Au|2 ≤ C0 |Au|3/2 u3/2 ≤ C0 |Au|2 |u|1/2 u1/2 , 2 dt


hence 1 d u2 + (1 − C0 |u|1/2 u1/2 )|Au|2 ≤ 0. 2 dt Using |u| ≤ u ≤ |Au| and u(t) ≤ u 0 ≤ C1 , we derive u(t)2 ≤ e−2

t

≤ e2C0

0 (1−C 0 |u(τ )|

∞ 0

e−τ/2 u

1/2 u(τ )1/2 )dτ

0 dτ

u 0 2 ≤ e2C0

∞ 0

u(τ )2 dτ −2t

e

u 0 2

e−2t u 0 2 ≤ e4C0 u 0 e−2t u 0 2 .

Thus (A.8) holds for log 2 > 0. ε0 = min C1 , 2C0

(A.9)

We give an estimate of $t_0$ for which $u(t_0) \in \mathcal{R}$ in terms of $|u_0|$ and $\varepsilon_0$ defined in (A.9).

Lemma A.2. Let $u(\cdot) \in \Sigma$; then there is $t_0 \in [0, \log^+(|u(0)|/\varepsilon_0) + 1)$ such that $u(t_0) \in \mathcal{R}$ and

$$\|u(t)\| \le 2e\,|u(0)|\,e^{-t}, \quad t \ge t_0. \tag{A.10}$$

(Above $\log^+ \alpha = \log(\max\{1, \alpha\})$, for $\alpha \in \mathbb{R}$.)

Proof. Let $u_0 = u(0)$. Take $t_* = \log^+(|u_0|/\varepsilon_0)$. By (A.7),

$$\int_{t_*}^{t_* + 1} \|u(s)\|^2\,ds \le \frac{e^{-2t_*}}{2} |u_0|^2. \tag{A.11}$$

This implies that the Lebesgue measure of $\{s \in (t_*, t_* + 1) : \|u(s)\|^2 \le e^{-2t_*} |u_0|^2\}$ is greater than or equal to $1/2$. Hence there is $t_0 \in (t_*, t_* + 1)$ such that $\|u(t_0)\| \le e^{-t_*} |u_0| \le \varepsilon_0$. Applying Lemma A.1 to $u(t_0)$ gives

$$\|u(t)\| \le 2e^{-(t - t_0)} \|u(t_0)\| \le 2e^{-t} e^{t_* + 1} e^{-t_*} |u_0| = 2e^{-t+1} |u_0|, \quad t \ge t_0,$$

thus proving (A.10).

Concerning the perturbation problem for the Navier–Stokes equations when the initial data $u_0$ is in a neighborhood of a fixed $u_0^* \in \mathcal{R}$, we have the following result, which is similar to but much simpler than that in [20]. For our purpose, we focus on the case of $u_0^*$ belonging to the set $B_1$ consisting of $u \in R_1 H \setminus \{0\}$ such that $B(u, u) = 0$.

Lemma A.3. Let $u_0^* \in B_1$. There is $\varepsilon = \varepsilon(|u_0^*|)$ such that if $\|v_0\| \le \varepsilon$ then $u_0 = u_0^* + v_0 \in \mathcal{R}$ and

$$|S(t)u_0 - e^{-t} u_0^*| \le |v_0|\, e^{c_3 |u_0^*|}\, e^{-t}, \quad t > 0. \tag{A.12}$$


Proof. Let $u^*(t) = S(t)u_0^* = e^{-t} u_0^*$ and $v(t) = S(t)u_0 - u^*(t)$. The equation for $v(t)$ is

$$\frac{dv}{dt} + Av + B(v, v) + B(u^*, v) + B(v, u^*) = 0. \tag{A.13}$$

Using (A.4) and the fact that $u^* \in R_1 H$, we have

$$\frac{1}{2} \frac{d|v|^2}{dt} + \|v\|^2 \le |\langle B(v, u^*), v \rangle| \le c_3 |v|^2 |u^*| \le c_3 |v|^2 |u_0^*|\, e^{-t}.$$

Hence

$$|v(t)|^2 \le |v_0|^2\, e^{-2t}\, e^{2c_3 |u_0^*| \int_0^t e^{-\tau}\,d\tau} \le |v_0|^2\, e^{2c_3 |u_0^*|}\, e^{-2t},$$

thus yielding (A.12). We also have 1 dv2 + |Av|2 ≤ C0 |Av|3/2 v3/2 + c2 |Av|v|u ∗ | + c3 |Av||v||u ∗ | 2 dt ≤ C0 |Av|3/2 v3/2 + c2 |Av|3/2 |v|1/2 |u ∗ | + c3 |Av||v||u ∗ | 1 ≤ |Av|2 + C2 v6 + C3 |v|2 |u ∗ |2 (1 + |u ∗ |2 ) 2 1 ∗ ≤ |Av|2 + C2 v6 + C3 |v0 |2 e2c3 |u 0 | |u ∗0 |2 (1 + |u ∗0 |2 )e−2t , 2 where C2 , C3 > 0. Take ε > 0 satisfying ∗

C2 ε4 + C3 ε2 e2c3 |u 0 | |u ∗0 |2 (1 + |u ∗0 |2 ) <

1 . 4

The argument becomes standard now and we omit the details.

(A.14)

References 1. Constantin, P., Majda, A.: The Beltrami spectrum for incompressible fluid flows. Commun. Math. Phys. 115(3), 435–456 (1988) 2. Foias, C.: Statistical study of the Navier–Stokes equations I. Rend. Sem. Mat. Univ. Padova 48, 219–348 (1972) 3. Foias, C.: Statistical study of the Navier–Stokes equations II. Rend. Sem. Mat. Univ. Padova 48, 9–123 (1973) 4. Constantin, P., Foias, C.: Navier–Stokes equations. Chicago: University of Chicago Press, 1988 5. Foias, C., Hoang, L., Nicolaenko, B.: On the helicity in 3D–periodic Navier–Stokes equations I: The nonstatistical case. Proc. London Math. Soc., 94(1), 53–90 (2007) 6. Foias, C., Hoang, L., Olson, E., Ziane, M.: On the solutions to the normal form of the Navier–Stokes equations. Indiana Univ. Math J 55, 631–686 (2006) 7. Foias, C., Manley, O., Rosa, R., Temam, R.: Navier–Stokes equations and turbulence. In: Encyclopedia of Mathematics and its Applications, Cambridge: Cambridge University Press, 2001, p. 83 8. Foias, C., Saut, J.C.: Asymptotic behavior, as t → +∞ of solutions of Navier–Stokes equations and nonlinear spectral manifolds. Indiana Univ. Math. J. 33(3), 459–477 (1984) 9. Foias, C., Saut, J.C.: Linearization and normal form of the Navier–Stokes equations with potential forces. Ann. Inst H. Poincaré, Anal. Non Linéaire 4, 1–47 (1987) 10. Foias, C., Saut, J.C.: Asymptotic integration of Navier–Stokes equations with potential forces. I. Indiana Univ. Math. J. 40(1), 305–320 (1991) 11. Foias, C., Temam, R.: Gevrey class regularity for the solutions of the Navier–Stokes equations. J. Funct. Anal. 87(2), 359–369 (1989)


12. Guillope, C.: Remarques à propos du comportement lorsque t → ∞, des solutions des équations de Navier–Stokes associées à une force nulle. Bull. Soc. Math. France 111, 151–180 (1983) 13. Leray, J.: Étude de diverse équations intégrales non linéares et de quelques problèmes que pose l’hydrodynamique. J. Math. Pures Appl. 12, 1–82 (1933) 14. Leray, J.: Essai sur les mouvements plans d’un liquide visqueux que limitent des parois. J. Math. Pures Appl 13, 331–418 (1934) 15. Leray, J.: Essai sur le mouvement d’un liquide visqueux emplissant l’espace. Acta Math. 63, 193–248 (1934) 16. Vishik, M.I., Fursikov, A.V.: Mathematical Problems of Mathematical Statistical Hydrodynamics. Dordrecht: Kluwer Acad. Press, 1980 17. Ladyzhenskaya, O.A.: Mathematical Theory of Viscous Incompressible Flow. New York: Gordon and Breach, 1969 18. Moffatt, H.K.: The degree of knottedness of tangled vortex lines. J. Fluid Mech. 36, 117–129 (1969) 19. Moffatt, H.K., Tsinober, A.: Helicity in laminar and turbulent flow. Ann. Rev. Fluid Mech. 24, 281–312 (1992) 20. Ponce, G., Racke, R., Sideris, T.C., Titi, E.S.: Global stability of large solutions to the 3D Navier–Stokes Equations. Commun. Math. Phys. 159, 329–341 (1994) 21. Shtilman, L., Pelz, R.B., Tsinober, A.: Numerical investigation of helicity in turbulence flow. Computers and Fluids 16(3), 341–347 (1988) 22. Temam, R.: Navier–Stokes Equations: Theory and Numerical Analysis. AMS Chelsea Publishing, Providence, RI: Amer. Math. soc., 2001 23. Varadhan, S.R.S.: Stochastic Processes. New York: Courant Institute of Mathematical Sciences, 1968 Communicated by P. Constantin

Commun. Math. Phys. 290, 719–736 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0863-8


Moduli Spaces of Instantons on the Taub-NUT Space Sergey A. Cherkis School of Mathematics and Hamilton Mathematics Institute, Trinity College, Dublin, Ireland. E-mail: [email protected] Received: 5 August 2008 / Accepted: 25 March 2009 Published online: 26 June 2009 – © Springer-Verlag 2009

Abstract: We present ADHM-Nahm data for instantons on the Taub-NUT space and encode these data in terms of Bow Diagrams. We study the moduli spaces of the instantons and present these spaces as finite hyperkähler quotients. As an example, we find an explicit expression for the metric on the moduli space of one SU (2) instanton. We motivate our construction by identifying a corresponding string theory brane configuration. By following string theory dualities we are led to supersymmetric gauge theories with impurities.

Contents

1. Introduction
2. Self-Dual Connections on Taub-NUT
3. Instanton Data as a Bow Diagram
4. Structure of Moduli Spaces
   4.1 Algebraic description
   4.2 Comparison with ADHM construction
5. Some Generalizations
   5.1 U(n) instantons
   5.2 Instantons on multi-Taub-NUT
6. Brane Configuration Analysis
   6.1 D3-D5 intersection
   6.2 D3-NS5 intersection
7. Impurity Theory and Mirror Symmetry
   7.1 Theory with impurities and its D-flatness conditions
   7.2 Mirror symmetry and bow reciprocity
8. Moduli Space of One Instanton
9. Conclusions


1. Introduction

The celebrated construction of Atiyah, Drinfeld, Hitchin, and Manin [1] provided a description of all instantons on $\mathbb{R}^4$ in terms of algebraic data. It has been generalized in a number of ways. Werner Nahm [2] discovered the construction of calorons, i.e. instantons on $\mathbb{R}^3 \times S^1$, as well as that of magnetic monopoles, in terms of solutions of an integrable system of ordinary differential equations. Kronheimer and Nakajima [3] constructed instantons on ALE spaces in terms of algebraic data organized into a quiver diagram. Nekrasov and Schwarz [4] modified the original ADHM construction to obtain instantons on noncommutative $\mathbb{R}^4$. Here, we present data describing instantons on the Taub-NUT and multi-Taub-NUT spaces [5], which combine all elements of the above-mentioned constructions.

We motivate our construction by a string theory analysis of a Chalmers-Hanany-Witten [6,7] brane configuration that is T-dual to the brane realization of instantons on a Taub-NUT space. This analysis is akin to the string theory derivation of the original ADHM and Kronheimer-Nakajima constructions for A-type ALE spaces by Douglas and Moore [8,9]. D-brane analysis leading to the Kronheimer-Nakajima construction on a general ALE space appeared in [10].

The Kronheimer-Nakajima construction was used to obtain some explicit instanton solutions on the Eguchi-Hanson space in [11,12]. For the case of calorons, explicit solutions were found in [13,14]. A detailed transform from the data we present here to the gauge field on the Taub-NUT, as well as an explicit solution of one instanton on the Taub-NUT, will appear in [15].

Let us emphasize that the ADHM and Nahm constructions produce instantons on the flat backgrounds $\mathbb{R}^4$ and $\mathbb{R}^3 \times S^1$. The Kronheimer-Nakajima construction does lead to instantons in nontrivial geometric backgrounds of ALE spaces; however, it is based on the fact that any ALE space is a deformation of an orbifold $\mathbb{R}^4/\Gamma$ of flat space. Here, we study Yang-Mills instantons on essentially curved ALF spaces which do not possess any useful flat limit.

There have been a number of attempts at constructing instantons on the Taub-NUT space. Some isolated solutions were found in [16,17] and particular families of solutions appear in [18–20]. We claim that the construction presented here produces all solutions for generic boundary conditions. We explore the geometry of the moduli space of solutions and present, as an illustration, the explicit metric on the moduli space of instanton configurations of instanton number one.

2. Self-Dual Connections on Taub-NUT

A Taub-NUT space is described by the following metric:

$$ds^2 = \frac{1}{4}\left[\left(l + \frac{1}{|\vec r|}\right) d\vec r^{\,2} + \frac{(d\tau + \omega)^2}{l + \frac{1}{|\vec r|}}\right], \tag{1}$$

where $\tau \sim \tau + 4\pi$ is the periodic coordinate on the Taub-NUT, and $\omega = \vec\omega\cdot d\vec r$ is a one-form such that $d\omega = *_3\, d\frac{1}{|\vec r|}$ (here $*_3$ is the Hodge star operation for a flat three-dimensional space parameterized by $\vec r$).
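As a small numerical reading of the metric (1), the sketch below evaluates the length of the $\tau$ circle at a given radius. It assumes only what (1) states: the $\tau\tau$ component is $1/(4(l + 1/r))$ and $\tau$ has period $4\pi$, so the circle has length $2\pi/\sqrt{l + 1/r}$. The function names are illustrative and do not come from the paper.

```python
import math


def fiber_length(r, l):
    """Length of the tau circle at radial distance r for the metric (1).

    Uses g_tautau = 1 / (4 (l + 1/r)) and the period 4*pi of tau.
    """
    return 4.0 * math.pi * math.sqrt(1.0 / (4.0 * (l + 1.0 / r)))


def asymptotic_fiber_length(l):
    """Limit of fiber_length as r -> infinity, namely 2*pi / sqrt(l)."""
    return 2.0 * math.pi / math.sqrt(l)
```

The circle shrinks to zero size at the origin and approaches the finite value $2\pi/\sqrt{l}$ at large $r$, which is the sense in which the fiber has a finite size at infinity.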


An instanton on this space is a Hermitian connection $A$ with curvature $F = dA - iA\wedge A$ that has finite action and whose curvature form is self-dual:

$$F = *F. \tag{2}$$

Here $*$ denotes the Hodge dual operation for the Taub-NUT space (1). There are a number of topological invariants associated with such a connection. The action is given by the integral of the Chern character:

$$S = -\frac{1}{8\pi^2}\int_{\text{Taub-NUT}} \operatorname{tr} F\wedge F, \tag{3}$$

and monopole charges are defined in the following way. Since the action is finite, the curvature tends to zero as we approach infinity, thus the connection tends to a flat one on the squashed three-sphere $S^3_r: \{|\vec r| = r\}$ as $r\to\infty$. This three-sphere is Hopf-fibered over $S^2_r$, with the fiber $S^1$ parameterized by $\tau$. The fiber has a finite size at infinity. Since the limiting connection is flat, the monodromy of the connection along the fiber $S^1$ has eigenvalues independent of the point on the base $S^2_r$. Let the limiting values of the eigenvalues be $\exp(2\pi i\lambda/l)$ and $\exp(-2\pi i\lambda/l)$, with $0\le\lambda\le l/2$, as $r\to\infty$. Generically $\lambda\neq 0$ and thus the bundle over $S^2_r$ splits into eigen line bundles $L_+\to S^2_r$ and $L_-\to S^2_r$, with degrees $d_+$ and $d_-$. We define the monopole charge by $m = |d_+ - d_-|$.

In intuitive physics terms, at infinity the gauge field becomes independent of $\tau$, and after the dimensional reduction it looks like a monopole field. The charge $m$, defined above, is the charge of this monopole. To be more precise, let us denote the limiting $\tau$-independent connection along the $\tau$-circle by $A_\tau d\tau$ and the other three horizontal components of the connection by $A_j$, so that $A = A_j dx^j + A_\tau d\tau$. Dimensionally reducing along the finite $\tau$ circle, as in [21], we obtain the Higgs field $\Phi = \left(l + \frac{1}{|\vec r|}\right)A_\tau$ and the gauge field $\hat A_j = A_j - \omega_j A_\tau$ in $\mathbb{R}^3$. If $\hat F$ denotes the three-dimensional curvature of $\hat A$, the pair $(\Phi, \hat A)$ satisfies the Bogomolny equation $\hat F = *_3[D,\Phi]$. Thus at infinity the pair $(\Phi,\hat A)$ behaves as a monopole and $m$ is its charge.

As demonstrated in [21], the case of nonzero $m$ and vanishing instanton number reduces to the study of singular monopoles. Nahm data for singular monopoles were identified in [22] and further explored in [23,24]. Explicit singular monopole solutions were constructed in [25,26]. In this paper we focus on the pure instanton case, i.e. we set $m = 0$.

3. Instanton Data as a Bow Diagram

We consider an SU(2) instanton of zero monopole charge and instanton number $N$ with maximal symmetry breaking, that is, with eigenvalues of the monodromy matrix at infinity $\exp(\pm 2\pi i\lambda/l)$ with $l > \lambda > 0$. Each such instanton configuration is determined uniquely by the data we describe below. These data can be organized into the diagram in Fig. 1. In the limit $l\to 0$ the Taub-NUT space degenerates to flat $\mathbb{R}^4$ and the above diagram becomes the ADHM quiver diagram for instantons on $\mathbb{R}^4$. We shall refer to diagrams such as the one in Fig. 1 as Bow Diagrams.

Fig. 1. Bow diagram corresponding to SU(2) Instantons on Taub-NUT

Each interval represented by a wavy line connecting two dots corresponds to $U(N)$ Nahm data. In the diagram above there are three such intervals, $[-l/2, -\lambda]$, $[-\lambda, \lambda]$, and $[\lambda, l/2]$; we shall refer to these as the ‘Left’, ‘Middle’, and ‘Right’ intervals respectively and parameterize each by a coordinate $s$ taking values in these ranges. We denote the lengths of these intervals by $d_L = l/2 - \lambda$, $d_M = 2\lambda$, and $d_R = l/2 - \lambda$. Nahm data consist of a quadruplet $(T_0(s), T_1(s), T_2(s), T_3(s))$ of $s$-dependent Hermitian $N\times N$ matrices continuous on each interval and satisfying the Nahm equations

$$\begin{aligned}
i\frac{d}{ds}T_1 + [T_0, T_1] - [T_2, T_3] &= 0,\\
i\frac{d}{ds}T_2 + [T_0, T_2] - [T_3, T_1] &= 0,\\
i\frac{d}{ds}T_3 + [T_0, T_3] - [T_1, T_2] &= 0.
\end{aligned} \tag{4}$$

Geometrically, one can think of a Hermitian vector bundle $E$ of complex dimension $N$ over each interval with a connection $\frac{d}{ds} - iT_0$ and Hermitian Higgs fields (i.e. endomorphisms of the bundle $E$) $T_1$, $T_2$, and $T_3$. We use subscripts and superscripts $L$, $M$, and $R$ to specify the interval to which the Nahm data belong.

The dots on the wavy line represent the fibers $E_{-l/2}$, $E_{-\lambda}$, $E_{\lambda}$, and $E_{l/2}$ of this bundle over the points $s = -l/2, -\lambda, \lambda$, and $l/2$. External dots represent one-dimensional vector spaces $W_L$ and $W_R$. Each arrow connects two dots and represents a map from the space at its tail to the space at its head. For example, in any given trivialization $B_{10}$ is represented by an $N\times N$ matrix corresponding to a map from $E_{-l/2}$ to $E_{l/2}$.

These data transform under a unitary gauge transformation $g(s)\in U(N)$ as

$$g(s):\ \begin{pmatrix} T_0(s)\\ T_j(s)\\ B_{01}\\ B_{10}\\ I_L\\ J_L\\ I_R\\ J_R \end{pmatrix} \mapsto \begin{pmatrix} g^{-1}(s)\,T_0\,g(s) - i\,g^{-1}(s)\frac{d}{ds}g(s)\\ g^{-1}(s)\,T_j\,g(s)\\ g^{-1}(-l/2)\,B_{01}\,g(l/2)\\ g^{-1}(l/2)\,B_{10}\,g(-l/2)\\ g^{-1}(-\lambda)\,I_L\\ J_L\,g(-\lambda)\\ g^{-1}(\lambda)\,I_R\\ J_R\,g(\lambda) \end{pmatrix}. \tag{5}$$
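To make the Nahm equations (4) concrete, the following sketch evaluates their left-hand sides for one simple $N = 2$ configuration: $T_0 = 0$ and $T_j = \sigma_j / (2(s_0 - s))$ with Pauli matrices $\sigma_j$. This particular configuration and all function names are assumptions chosen for illustration; it is not claimed to be the data of any instanton discussed in the text, only a check that the sign conventions of (4) admit such a solution away from $s = s_0$.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]
ZERO = [[0j, 0j], [0j, 0j]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lincomb(ca, a, cb, b):
    """Entrywise ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def comm(a, b):
    """Commutator [a, b]."""
    return lincomb(1, matmul(a, b), -1, matmul(b, a))


def pole_data(s, s0):
    """Illustrative data T_0 = 0, T_j = sigma_j / (2 (s0 - s)) and their s-derivatives."""
    f = 1.0 / (s0 - s)
    df = f * f  # derivative of 1/(s0 - s) with respect to s
    T = [ZERO] + [lincomb(f / 2, sig, 0, ZERO) for sig in SIGMA]
    dT = [ZERO] + [lincomb(df / 2, sig, 0, ZERO) for sig in SIGMA]
    return T, dT


def nahm_residuals(T, dT):
    """Left-hand sides i dT_a/ds + [T_0, T_a] - [T_b, T_c] of Eq. (4), (a, b, c) cyclic."""
    out = []
    for a, b, c in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
        lhs = lincomb(1j, dT[a], 1, comm(T[0], T[a]))
        out.append(lincomb(1, lhs, -1, comm(T[b], T[c])))
    return out


def max_abs(mats):
    """Largest absolute value of any entry in a list of matrices."""
    return max(abs(x) for m in mats for row in m for x in row)
```

Calling `nahm_residuals(*pole_data(s, s0))` for any `s != s0` should return three matrices that vanish up to rounding, since $[\sigma_2, \sigma_3] = 2i\sigma_1$ and its cyclic permutations.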


Besides the Nahm Eqs. (4), these data satisfy certain conditions at $s = \pm l/2$ and $s = \pm\lambda$; to write these in a compact form we introduce an auxiliary twistorial variable $\zeta$ and the combination

$$A = T_1 + iT_2 + 2\zeta T_3 - \zeta^2(T_1 - iT_2) \tag{6}$$

for each interval. Then the following conditions are satisfied for all values of $\zeta$:

$$A_R(l/2) = (B_{10} + \zeta B_{01}^\dagger)(B_{01} - \zeta B_{10}^\dagger), \tag{7}$$

$$A_L(-l/2) = (B_{01} - \zeta B_{10}^\dagger)(B_{10} + \zeta B_{01}^\dagger), \tag{8}$$

$$A_R(\lambda) - A_M(\lambda) = -(I_R + \zeta J_R^\dagger)(J_R - \zeta I_R^\dagger), \tag{9}$$

$$A_M(-\lambda) - A_L(-\lambda) = -(I_L + \zeta J_L^\dagger)(J_L - \zeta I_L^\dagger). \tag{10}$$
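Each $\zeta$-dependent condition packages three matrix equations, one per power of $\zeta$. The sketch below illustrates this for the boundary condition (8), read here as $A_L(-l/2) = (B_{01} - \zeta B_{10}^\dagger)(B_{10} + \zeta B_{01}^\dagger)$ with $A$ as in (6); that reading of the dagger placement is an assumption about the garbled source, fixed by requiring the product to act on $E_{-l/2}$. Matching powers of $\zeta$ gives $T_1 + iT_2 = B_{01}B_{10}$ and $2T_3 = B_{01}B_{01}^\dagger - B_{10}^\dagger B_{10}$, which the code checks numerically. All function names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Entrywise ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def t_from_b(B01, B10):
    """Hermitian T_1, T_2, T_3 at s = -l/2 read off from the powers of zeta in (8)."""
    P = matmul(B01, B10)                 # coefficient of zeta^0: T_1 + i T_2
    Pd = dagger(P)                       # minus the zeta^2 coefficient: T_1 - i T_2
    T1 = lincomb(0.5, P, 0.5, Pd)
    T2 = lincomb(-0.5j, P, 0.5j, Pd)
    T3 = lincomb(0.5, matmul(B01, dagger(B01)), -0.5, matmul(dagger(B10), B10))
    return T1, T2, T3


def a_of_zeta(T1, T2, T3, z):
    """The combination (6): T_1 + i T_2 + 2 z T_3 - z^2 (T_1 - i T_2)."""
    plus = lincomb(1, T1, 1j, T2)
    minus = lincomb(1, T1, -1j, T2)
    return lincomb(1, lincomb(1, plus, 2 * z, T3), -z * z, minus)


def rhs_of_zeta(B01, B10, z):
    """Right-hand side of (8) as read above."""
    left = lincomb(1, B01, -z, dagger(B10))
    right = lincomb(1, B10, z, dagger(B01))
    return matmul(left, right)


def max_diff(a, b):
    """Largest absolute entrywise difference of two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

For arbitrary complex matrices `B01` and `B10`, `a_of_zeta(*t_from_b(B01, B10), z)` and `rhs_of_zeta(B01, B10, z)` should agree for every complex `z`, and the three matrices returned by `t_from_b` should be Hermitian.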

Let us point out that the matching conditions (9) and (10) are identical to those of monopole Nahm data [27]. We claim that the set of bow data $(T_0, T_1, T_2, T_3, I_L, J_L, I_R, J_R, B_{01}, B_{10})$ satisfying Eqs. (4, 7–10) determines the instanton configuration on the Taub-NUT space up to a gauge transformation. The equivalence classes of the bow data up to the transformation (5) are in one-to-one correspondence with such instantons. Moreover, this correspondence is an isometry of the corresponding hyperkähler moduli spaces. For the case of the Nahm transform for monopoles, Hitchin proved in [28] that it is one-to-one, and the proof of the isometry statement was given by Nakajima in [29].

4. Structure of Moduli Spaces

The moduli space of instantons on the Taub-NUT is hyperkähler; moreover, the corresponding bow data present it as an infinite hyperkähler quotient of linear spaces, since the Nahm equations (4) and the matching conditions of Eqs. (7–10) can be viewed as moment maps with respect to the gauge group action of Eq. (5).

The moduli space can also be viewed as a finite hyperkähler quotient of a product of linear spaces and a number of copies of $T^*U^{\mathbb{C}}(N) = T^*Gl(N)$. The hyperkähler structure on the cotangent bundle to a group was studied in [30], where this space emerges as a moduli space of Nahm data with regular boundary conditions on an interval. It carries a natural triholomorphic action of $G\times G$, with the first and second factors corresponding to the value of the gauge transformation at one and the other end of the interval. In terms of these building blocks the moduli space $\mathcal{M}_{N,\lambda;l}$ of charge $N$ instantons on the Taub-NUT space (1) is given by the following hyperkähler quotient:

$$T^*G^{\mathbb{C}}_{d_L}\times\mathbb{H}^N\times T^*G^{\mathbb{C}}_{d_M}\times\mathbb{H}^N\times T^*G^{\mathbb{C}}_{d_R}\times\mathbb{H}^{N^2}\ /\!/\!/\ G_{-l/2}\times G_{-\lambda}\times G_{\lambda}\times G_{l/2}, \tag{11}$$

where $\mathbb{H}^N\approx\mathbb{C}^N\times\mathbb{C}^N\ni(I^\dagger, J)$, $\mathbb{H}^{N^2}\approx\mathbb{C}^{N^2}\times\mathbb{C}^{N^2}\ni(B_{01}^\dagger, B_{10})$, each group $G$ is a $U(N)$, and

• $G_{-l/2}$ acts on $T^*G^{\mathbb{C}}_{d_L}$ on the left and on $\mathbb{H}^{N^2}$ on the right,
• $G_{-\lambda}$ acts on the first $\mathbb{H}^N$ factor, acting on $T^*G^{\mathbb{C}}_{d_L}$ on the right and on $T^*G^{\mathbb{C}}_{d_M}$ on the left,
• $G_{\lambda}$ acts on the second $\mathbb{H}^N$ factor, acting on $T^*G^{\mathbb{C}}_{d_M}$ on the right and on $T^*G^{\mathbb{C}}_{d_R}$ on the left,
• $G_{l/2}$ acts on $T^*G^{\mathbb{C}}_{d_R}$ on the right and on $\mathbb{H}^{N^2}$ on the left.

This allows one to construct more or less explicitly the twistor space of $\mathcal{M}_{N,\lambda;l}$. Let us now explain how this finite quotient construction arises from our bow diagram description.

Let us use the language of the hyperkähler reduction to give geometric meaning to the equations of Sect. 3. Given the bow data, one views the space of all such unconstrained data

$$\mathcal{B} = \{(T_0(s), T_1(s), T_2(s), T_3(s), I_L, J_L, I_R, J_R, B_{01}, B_{10})\} \tag{12}$$

as a direct product of infinite-dimensional vector spaces of unconstrained Nahm quadruplets $(T_0(s), \vec T(s))$ for each interval and the linear spaces of $(B_{01}, B_{10})$, $(I_L, J_L)$, and $(I_R, J_R)$. The metric on the space of Nahm quadruplets on each interval is given by

$$ds^2 = \int \operatorname{tr}\left(\delta T_0\,\delta T_0^\dagger + \delta T_1\,\delta T_1^\dagger + \delta T_2\,\delta T_2^\dagger + \delta T_3\,\delta T_3^\dagger\right) ds, \tag{13}$$

and the natural metrics on the rest of the data are

$$\delta I_L^\dagger\,\delta I_L + \delta J_L\,\delta J_L^\dagger,\qquad \operatorname{tr}\left(\delta B_{01}\,\delta B_{01}^\dagger + \delta B_{10}\,\delta B_{10}^\dagger\right),\qquad \delta I_R^\dagger\,\delta I_R + \delta J_R\,\delta J_R^\dagger. \tag{14}$$

There are three natural complex structures on these spaces coming from identifying $T_0, T_1, T_2, T_3$ as components of a quaternion and from the identification of the spaces parameterized by the $B$'s and the pairs $(I, J)$ with quaternions. These complex structures are spelled out explicitly in Sect. 8. The transformations (5) leave the metric and the complex structures invariant. Let $\mathcal{G}$ denote the group of all $U(N)$ gauge transformations on $[-l/2, l/2]$. Our claim is that the moduli space of instantons $\mathcal{M}_{N,\lambda;l}$ is isometric to the hyperkähler quotient of $\mathcal{B}$ by $\mathcal{G}$:

$$\mathcal{M}_{N,\lambda;l} = \mathcal{B}/\!/\!/\mathcal{G}. \tag{15}$$

$\mathcal{B}/\!/\!/\mathcal{G}$ is exactly the space of equivalence classes under the gauge transformation (5) of bow data satisfying the moment map Eqs. (4, 7–10). This is the infinite hyperkähler quotient of linear spaces.

Now we compare this formula to Eq. (11). Let $\mathcal{G}_0$ denote the subgroup of $\mathcal{G}$ consisting of all gauge transformations $g(s)$ that are equal to the identity at the marked points (i.e. with $g(-l/2) = g(-\lambda) = g(\lambda) = g(l/2) = 1$); then

$$\mathcal{G}/\mathcal{G}_0 = G_{-l/2}\times G_{-\lambda}\times G_{\lambda}\times G_{l/2}. \tag{16}$$

Thus, we can perform the above reduction $\mathcal{B}/\!/\!/\mathcal{G}$ in two steps, first performing the quotient with respect to $\mathcal{G}_0$ and then with respect to $\mathcal{G}/\mathcal{G}_0$. The moment maps of the $\mathcal{G}_0$ action are exactly the left-hand sides of the Nahm Eqs. (4). Taking the zero level of the moment map and dividing by the group action we are left with $T^*G^{\mathbb{C}}_d$ on each interval. The second step of the hyperkähler reduction amounts to formula (11).


4.1. Algebraic description. Let us give a more algebraic description of this space. Selecting a particular complex structure leads to a description of the Nahm data as a connection $D$ on the vector bundle and its endomorphism $T$ (neither of which is restricted to be Hermitian). For example, for one of the complex structures, $D = \frac{\partial}{\partial s} - iT_0 - T_3$ and $T = T_1 + iT_2$. With the complex structure selected, we can combine the three Nahm equations into one complex and one real equation. The complex Nahm equation within the interval simply reads $[D, T] = 0$.

Now, we introduce the parallel transport $H$ from one end of the interval to the other. Let $H(s)$ be covariantly constant with respect to $D$ (i.e. $DH(s) = 0$) such that the value of $H(s)$ at the left end of the corresponding interval equals the identity. Then we denote the value of $H(s)$ at the right end of the interval by $H$. This yields natural complex coordinates $(H, T)$ on $T^*G^{\mathbb{C}}$, which is the moduli space of all the regular Nahm data satisfying the Nahm equations.

As in [31], in a given complex structure the moduli space $\mathcal{M}_{N,\lambda;l} = \{\vec\mu = 0\}/\mathcal{G}$ is equivalent as a complex variety to $\{\mu^{\mathbb{C}} = 0\}/\mathcal{G}^{\mathbb{C}}$. The latter is given by

$$\{(T_L, H_L, I_L, J_L, T_M, H_M, I_R, J_R, T_R, H_R, B_{01}, B_{10})\}, \tag{17}$$

where

$$T_{L,M,R}\in gl(N,\mathbb{C}),\qquad I_{L,R}\in\operatorname{Hom}(\mathbb{C},\mathbb{C}^N),\qquad B_{01}\in\operatorname{Hom}(\mathbb{C}^N,\mathbb{C}^N), \tag{18}$$

$$H_{L,M,R}\in Gl(N,\mathbb{C}),\qquad J_{L,R}\in\operatorname{Hom}(\mathbb{C}^N,\mathbb{C}),\qquad B_{10}\in\operatorname{Hom}(\mathbb{C}^N,\mathbb{C}^N), \tag{19}$$

satisfying the complex moment map conditions

$$T_R - H_M^{-1}T_M H_M = I_R J_R,\qquad H_R^{-1}T_R H_R = B_{10}B_{01}, \tag{20}$$

$$T_M - H_L^{-1}T_L H_L = I_L J_L,\qquad T_L = B_{01}B_{10}, \tag{21}$$

modulo the gauge equivalence

$$\begin{pmatrix} T_L, & H_L\\ I_L, & J_L\\ T_M, & H_M\\ I_R, & J_R\\ T_R, & H_R\\ B_{01}, & B_{10} \end{pmatrix} \sim \begin{pmatrix} h_{-l/2}^{-1}T_L h_{-l/2}, & h_{-l/2}^{-1}H_L h_{-\lambda}\\ h_{-\lambda}^{-1}I_L, & J_L h_{-\lambda}\\ h_{-\lambda}^{-1}T_M h_{-\lambda}, & h_{-\lambda}^{-1}H_M h_{\lambda}\\ h_{\lambda}^{-1}I_R, & J_R h_{\lambda}\\ h_{\lambda}^{-1}T_R h_{\lambda}, & h_{\lambda}^{-1}H_R h_{l/2}\\ h_{-l/2}^{-1}B_{01}h_{l/2}, & h_{l/2}^{-1}B_{10}h_{-l/2} \end{pmatrix}, \tag{22}$$

here $h_{-l/2}$, $h_{-\lambda}$, $h_{\lambda}$, and $h_{l/2}\in Gl(N,\mathbb{C})$.

4.2. Comparison with ADHM construction. We can use the gauge transformations given by $h_{-\lambda}$, $h_{\lambda}$, and $h_{l/2}$ to put $H_L = H_M = H_R = 1$; then $T_L = B_{01}B_{10}$, $T_M = B_{10}B_{01} - I_R J_R$, $T_R = B_{10}B_{01}$, and the only nontrivial remaining relation is

$$B_{10}B_{01} - B_{01}B_{10} = I_L J_L + I_R J_R, \tag{23}$$

with the gauge equivalence given by the remaining group, with the action of an element $h = h_{-l/2}$:

$$h:\ \begin{pmatrix} B_{01}, & B_{10}\\ I_L, & J_L\\ I_R, & J_R \end{pmatrix} \mapsto \begin{pmatrix} h^{-1}B_{01}h, & h^{-1}B_{10}h\\ h^{-1}I_L, & J_L h\\ h^{-1}I_R, & J_R h \end{pmatrix}. \tag{24}$$

Equation (23) with the equivalence (24) is exactly the ADHM condition for instantons on the flat space. This establishes the isomorphism of the moduli space of instantons on a Taub-NUT space and the moduli space of unframed instantons on $\mathbb{R}^4$ as complex varieties. We would like to emphasize, however, that even though in any given complex structure these two moduli spaces are isomorphic, their twistor spaces and metrics differ.
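The compatibility of the relation (23) with the equivalence (24) can be checked numerically. The sketch below is restricted to $N = 2$, where any matrix $C = B_{10}B_{01} - B_{01}B_{10}$ can be split as $I_L J_L + I_R J_R$ by taking $I_L$, $I_R$ to be the columns of $C$ and $J_L$, $J_R$ the standard row vectors; this way of producing a solution, and all function names, are assumptions made for the illustration and are not the paper's construction.

```python
def matmul(a, b):
    """Product of two (possibly rectangular) matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Entrywise sum a + b."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse2(h):
    """Inverse of an invertible 2x2 matrix."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [[h[1][1] / det, -h[0][1] / det], [-h[1][0] / det, h[0][0] / det]]


def adhm_residual(B01, B10, IL, JL, IR, JR):
    """Largest entry of B10 B01 - B01 B10 - (I_L J_L + I_R J_R), cf. Eq. (23)."""
    lhs = sub(matmul(B10, B01), matmul(B01, B10))
    rhs = add(matmul(IL, JL), matmul(IR, JR))
    return max(abs(x) for row in sub(lhs, rhs) for x in row)


def solve_n2(B01, B10):
    """For N = 2, pick (I_L, J_L, I_R, J_R) so that Eq. (23) holds (illustrative choice)."""
    C = sub(matmul(B10, B01), matmul(B01, B10))
    IL = [[C[0][0]], [C[1][0]]]      # first column of C
    IR = [[C[0][1]], [C[1][1]]]      # second column of C
    JL = [[1 + 0j, 0j]]
    JR = [[0j, 1 + 0j]]
    return IL, JL, IR, JR


def transform(h, B01, B10, IL, JL, IR, JR):
    """Action (24) of an invertible matrix h on the data."""
    hi = inverse2(h)
    return (matmul(matmul(hi, B01), h), matmul(matmul(hi, B10), h),
            matmul(hi, IL), matmul(JL, h), matmul(hi, IR), matmul(JR, h))
```

Because both sides of (23) are conjugated by the same $h$, the residual should stay at rounding level after `transform` is applied to any solution.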

5. Some Generalizations

5.1. U(n) instantons. $N$ instantons on the Taub-NUT space with the gauge group $U(n)$ and with maximal symmetry breaking at infinity are given in terms of the bow diagram in Fig. 2. Here $\exp(2\pi i\lambda_1/l), \exp(2\pi i\lambda_2/l), \ldots, \exp(2\pi i\lambda_n/l)$ are the eigenvalues of the monodromy at infinity and, if all of them are distinct, the auxiliary spaces $W_1, W_2, \ldots, W_n$ are one-dimensional. The moduli space is again given by a finite hyperkähler quotient of a product of linear spaces and $n + 1$ copies of $T^*Gl(N)$.

Fig. 2. A bow corresponding to SU(n) Instantons on Taub-NUT

5.2. Instantons on multi-Taub-NUT. Given the bow data for SU(2) instantons on the Taub-NUT, it is relatively easy to obtain the bow data on a general $A_k$ ALF space, which is a $(k+1)$-centered multi-Taub-NUT space. In order to achieve this, similarly to [3], one can consider the quotient of the Taub-NUT by the cyclic group $\mathbb{Z}_{k+1}$ rotating the Taub-NUT circle, $\tau\to\tau + \frac{4\pi}{k+1}$. If we are to obtain $N$ instantons on the $(k+1)$-centered degenerate Taub-NUT, we should consider rank $N(k+1)$ ADHM data for its covering Taub-NUT space. In other words, the bundle $E$ of Sect. 3 is of rank $N(k+1)$. The equivariance conditions one should impose are

$$g(s):\ \begin{pmatrix} T_0(s)\\ T_j(s)\\ B_{01}\\ B_{10}\\ I_L\\ J_L\\ I_R\\ J_R \end{pmatrix} \mapsto \begin{pmatrix} U^{-1}T_0 U\\ U^{-1}T_j U\\ U^{-1}B_{01}U\, e^{\frac{2\pi i}{k+1}}\\ e^{-\frac{2\pi i}{k+1}}\, U^{-1}B_{10}U\\ U^{-1}I_L\,\mu_L\\ \mu_L^{-1}J_L U\\ U^{-1}I_R\,\mu_R\\ \mu_R^{-1}J_R U \end{pmatrix}. \tag{25}$$

Here the multiplication by $e^{\pm\frac{2\pi i}{k+1}}$ rotates the circle of the Taub-NUT, and the factors $\mu_L$ and $\mu_R$ with $|\mu_L| = |\mu_R| = 1$ are the linear transformations of the auxiliary spaces $W_L$ and $W_R$. The factors $\mu_L$ and $\mu_R$ correspond to the accompanying gauge transformation of the gauge fields on the Taub-NUT. $U = I_{N\times N}\otimes C_{k+1}$, with $C_{k+1}$ being a permutation matrix permuting the $(k+1)$ blocks. We can choose it to be a clock matrix $C_{k+1} = \operatorname{diag}\left(1, e^{\frac{2\pi i}{k+1}}, e^{2\frac{2\pi i}{k+1}}, \ldots, e^{k\frac{2\pi i}{k+1}}\right)$. In order to satisfy the equivariance conditions, $\mu_L$ and $\mu_R$ have to be some integer powers of the $(k+1)$st root of unity, say $\mu_L = e^{\frac{2\pi i}{k+1}p_L}$ and $\mu_R = e^{\frac{2\pi i}{k+1}p_R}$. The equivariant bow data thus consist of the block-diagonal $T_0$ and $T_j$,

$$B_{01} = \begin{pmatrix} 0 & 0 & 0 & \cdots & B_{01}^{(k)}\\ B_{01}^{(1)} & 0 & 0 & \cdots & 0\\ 0 & B_{01}^{(2)} & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \ddots & \vdots \end{pmatrix},\qquad B_{10} = \begin{pmatrix} 0 & B_{10}^{(1)} & 0 & \cdots & 0\\ 0 & 0 & B_{10}^{(2)} & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ B_{10}^{(k)} & 0 & 0 & \cdots & 0 \end{pmatrix}, \tag{26}$$

and $I_L$ and $J_L$ are such that they have only the $p_L$-th block nonvanishing; similarly $I_R$ and $J_R$ have only the $p_R$-th block nonvanishing. The remaining gauge transformations that respect the equivariance conditions are block diagonal.

These data are naturally arranged into a bow diagram with $k + 1$ wavy lines connected cyclically by the components $B_{01}^{(j)}$ and $B_{10}^{(j)}$ and two auxiliary spaces $W_L$ and $W_R$ connected to some points on the $p_L$-th and $p_R$-th wavy lines.

If we are to start with $U(n)$ instantons on the Taub-NUT (instead of the SU(2) instantons discussed so far in this section), the only difference would be to have $n$ auxiliary spaces and $n$ (instead of two) pairs of maps $(I_l, J_l)$. The maps $I_l$ and $J_l$ map between these auxiliary spaces and some fibers above $n$ points on some of the wavy intervals.
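The block structure forced by the equivariance conditions can be worked out mechanically. The sketch below assumes the clock-matrix choice of $C_{k+1}$ quoted above and labels the $n = k+1$ blocks by $0, \ldots, n-1$ (a labeling chosen here for convenience, not the paper's). Since $(U^{-1}MU)_{ab} = w^{\,b-a}M_{ab}$ with $w = e^{2\pi i/n}$, the condition $U^{-1}MU\,w^{\text{shift}} = M$ allows a nonzero block only where $w^{\,b-a+\text{shift}} = 1$. The function name and the `shift` parameter are illustrative.

```python
import cmath


def allowed_blocks(n, shift):
    """Block positions (a, b) that may be nonzero when U^{-1} M U w^shift = M.

    U is the clock matrix diag(1, w, ..., w^(n-1)) with w = exp(2 pi i / n),
    so the condition on block (a, b) reads w^(b - a + shift) = 1.
    """
    w = cmath.exp(2j * cmath.pi / n)
    return {(a, b) for a in range(n) for b in range(n)
            if abs(w ** (b - a + shift) - 1) < 1e-9}
```

With `shift=1` (the condition on $B_{01}$) the allowed positions form the cyclic subdiagonal, and with `shift=-1` (the condition on $B_{10}$) the cyclic superdiagonal, so the blocks connect the wavy lines cyclically.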


Fig. 3. A bow corresponding to U (n) Instantons on multi-Taub-NUT

Thus, the bow data for the U (n) instantons in a nondegenerate multi-Taub-NUT background is given by Fig. 3. The positions of the dots on the wavy lines are given by the eigenvalues of the monodromy at infinity while the differences of the lengths of the wavy lines are determined by the fluxes of the B field on the multi-Taub-NUT. Just as in the ALE case [3], for a generic (nondegenerate) multi-Taub-NUT, the moment maps no longer vanish, but are given by the multi-Taub-NUT resolution parameters.

6. Brane Configuration Analysis

We begin with a setup similar to that of Douglas and Moore [8] by realizing a $U(n)$ charge $N$ instanton configuration on a Taub-NUT space in Type IIA string theory. Namely, consider Type IIA string theory on a direct product of the Taub-NUT space and the six-dimensional Minkowski space-time. Let the coordinates 0, 1, 2, 7, 8, 9 be in the Minkowski space and the remaining coordinates 3, 4, 5, and 6 in the Taub-NUT space, with the sixth coordinate being the periodic coordinate of the circle of the Taub-NUT. Now introduce $n$ D6-branes wrapping the Taub-NUT space with world-volumes in the 0, 1, 2, 3, 4, 5, 6 directions, and $N$ D2-branes positioned at points on the Taub-NUT space with world-volumes in the 0, 1, 2 directions. In the effective world-volume theory on the D6-brane, this configuration is described by $N$ instantons in the $U(n)$ gauge group on the Taub-NUT.

Performing the T-duality along the periodic Taub-NUT direction 6, we obtain a Chalmers-Hanany-Witten configuration [6,7] of $n$ D5-branes, $N$ D3-branes, and one NS5-brane in a flat ten-dimensional Minkowski space with direction 6 periodic. The exact orientation of the branes is specified in Table 1.

Table 1. Brane configuration and bosonic bulk matter content of the impurity theory on the D3-branes

                 0     1     2     3       4       5       6       7     8     9
D5               ×     ×     ×     ×       ×       ×       -       -     -     -
D3               ×     ×     ×     -       -       -       ×       -     -     -
NS5              ×     ×     ×     -       -       -       -       ×     ×     ×
Vector           A0    A1    A2    -       -       -       -       Y1    Y2    Y3
Adjoint Hyper    -     -     -     Re H2   Im H2   Re H1   Im H1   -     -     -


Fig. 4. Fundamental multiplet

Now that we have identified the relevant brane configuration, we would like to describe the theory on the world-volume of the D3-branes. In the absence of the five-branes this would be a maximally supersymmetric Yang-Mills theory with the gauge group $U(N)$. As indicated in Table 1, the Vector multiplet consists of the three-dimensional gauge field with components $A_0, A_1, A_2$ and the Higgs fields $Y_1, Y_2, Y_3$, corresponding to the transverse directions 7, 8, 9 to the D3-branes, as well as two Majorana fermions. The Adjoint Hypermultiplet consists of one Dirac fermion, three Hermitian Higgs fields, and one Hermitian connection. We combine these Higgs fields and this connection into two complex fields $H_1$ and $H_2$, so that $\mathrm{Re}\,H_2$, $\mathrm{Im}\,H_2$, and $\mathrm{Re}\,H_1$ are the Higgs fields corresponding to the transverse directions 3, 4, 5, while $\mathrm{Im}\,H_1$ is the gauge field connection in direction 6. The presence of the five-branes breaks the maximal $\mathcal{N} = 4$ supersymmetry to $\mathcal{N} = 2$, $d = 4$ and introduces additional degrees of freedom in the effective world-volume theory on the D3-brane. Even though the sigma model analysis of this configuration is difficult due to the presence of both Ramond and Neveu-Schwarz charges, let us give some local arguments for the inhomogeneities we are about to introduce.

6.1. D3-D5 intersection. Let us focus first on one of the D5-branes with a number of D3-branes ending on it on the left and the same number ending on the right as in the leftmost diagram in Fig. 4. Assembling the D3-branes together, we can end up in the middle configuration of Fig. 4. Now, we can move the D3-branes in the direction transverse to the D5, thus ending with the configuration on the right diagram of Fig. 4. This brane configuration contains a massive string state corresponding to the lowest excitation of the string connecting the D3’s and the D5’s. Its mass is proportional to the distance between the D3-branes and the D5-brane and it is clearly in the fundamental representation of the gauge group on the D3-branes. Running this argument backwards, moving the D3’s so that they intersect the D5 renders this fundamental multiplet f massless, and separating the two parts of the D3-brane amounts to giving it a vacuum expectation value.

6.2. D3-NS5 intersection. Here we focus on the local configuration with the D3-branes in the vicinity of an NS5-brane, as on the left diagram in Fig. 5. Moving the D3-branes towards the NS5-brane, so that they intersect as in the middle diagram, we can now separate the left and the right parts of the D3's along the NS5-brane, as in the rightmost diagram of Fig. 5. This configuration has a string state corresponding to the lowest excitation of a string stretching between the left and right D3-brane ends. This state is in the bifundamental of the gauge group on the D3-brane and its mass is proportional to the distance between the two D3 ends. Again, running the arguments in reverse, we move the left and right D3-branes' ends together, so that the bifundamental multiplet $B$ becomes massless. Now moving the D3-brane off the NS5-brane amounts to giving this multiplet some vacuum expectation value.

Fig. 5. Bifundamental multiplet

It is clear from the leftmost diagram of Fig. 5 that there appears to be a single gauge group on the D3-brane, so one might ask how this fits with our description of two gauge groups and a bifundamental multiplet. As we are about to see in the next section, the D-flatness conditions state that the expectation value of the Higgs field corresponding to the D3-brane position transverse to the NS5-brane equals the bilinear combination of the bifundamental multiplet $B$. As a result the Higgs field breaks the product of the left and right gauge groups to a subgroup of the skew-diagonal that leaves the bifundamental field invariant.

7. Impurity Theory and Mirror Symmetry

7.1. Theory with impurities and its D-flatness conditions. Supersymmetric gauge theories with impurities of codimension one and two were studied in [32] via T-duality. We adapt these results to our case in this section. We should mention here the superfield formulations of impurity theories in $\mathcal{N} = 1$, $d = 4$ [34] and $\mathcal{N} = 2$, $d = 3$ [35] superspace, which can be adapted to our context by introducing defects with bifundamental superfields. Following the approach of [32], here we work in components. The bosonic matter content in the bulk is

• the vector multiplet containing the gauge fields $A_M$ with $M = 0, 1, 2$, and three Higgs fields $Y^i$,
• the adjoint hypermultiplet containing two complex Higgs fields $H_1$ and $H_2$.

The degrees of freedom localized at the inhomogeneities in the interior at $x_6 = \lambda_p$ are two complex fundamental scalars $f_{1p}$ and $f_{2p}$. The main difference from [32] is that we introduce a bifundamental hypermultiplet $B$.
This multiplet contains two scalar fields $B_1$ and $B_2$ transforming as bifundamental fields with respect to the gauge groups acting on the left ($x_6 = 0$) and the right ($x_6 = l$) ends of the interval. Augmenting the Lagrangian of [32] with these bifundamental fields, we obtain the following component form of the bosonic field Lagrangian $L = L_1 + L_2$ of the effective theory on the D3-branes. The bulk Lagrangian $L_1$ (with the index ranges $M = 0, 1, 2$ and $\mu, \nu = 0, 1, 2, 6$) is given by

$$L_1 = \frac{1}{l}\int d^3x_M\, dx_6\left[\frac{1}{2}F_{\mu\nu}^2 + \frac{1}{2}\left(D_\mu Y^i\right)^2 + \frac{1}{2}\left|D_\mu H_j\right|^2 - \frac{1}{2}\sum_{i<j}\left|[Y^i, Y^j]\right|^2 - \sum_{i,j}\left|[Y^i, H^j]\right|^2\right], \tag{27}$$


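and the inhomogeneity Lagrangian

$$\begin{aligned}
L_2 = \frac{1}{l}\int d^3x_M\, dx_6\Bigg[\, & l\sum_p\delta(x_6 - \lambda_p)\left(\left|D_M f_p\right|^2 - \left|Y^i f_p\right|^2\right)\\
& + \delta(x_6)\left(\left|D_M B\right|^2 - \left|Y^j|_{x_6=0+}\,B - B\,Y^j|_{x_6=l-}\right|^2\right)\\
& + \frac{1}{2}|D|^2 + \operatorname{Tr}\, i D^\alpha_\beta\Big([H_\alpha, H^{\dagger\beta}] + l\sum_p\delta(x_6 - \lambda_p)\, f_{\alpha p}\otimes f^{\dagger\beta}_p\\
& \qquad - \delta(x_6)\, B^{\dagger\beta}\otimes B_\alpha + \delta(x_6 - l)\, B_\alpha\otimes B^{\dagger\beta}\Big)\Bigg],
\end{aligned} \tag{28}$$

where $\mathrm{Im}\,H_1$ is understood to be a covariant derivative $D_6$ in the sixth direction (in other words $H_1 = D_6 + T_3 = \frac{\partial}{\partial x_6} - iT_0 + T_3$), the auxiliary field is $D^\alpha_\beta = D_j(\sigma_j)^\alpha_\beta$, and $D_M B$ is the covariant derivative naturally acting on the bifundamental, e.g. $D_M B_1 = \partial_M B_1 - i(A_M|_{x_6=0+})B_1 + iB_1(A_M|_{x_6=l-})$.

In order to make contact with our notation we let $\sqrt{l}\, f_p = \begin{pmatrix} J_p^\dagger\\ I_p \end{pmatrix}$, $B = \begin{pmatrix} B_{01}^\dagger\\ B_{10} \end{pmatrix}$, and $H_2 = T_1 + iT_2$; then the D-flatness conditions that are easily read off from the above Lagrangian become:

$$\frac{dT_1}{dx_6} - i[T_0, T_1] + i[T_2, T_3] = -\frac{1}{2}\Bigg[\sum_{p=1}^{k}\delta(s - \lambda_p)\left(IJ - J^\dagger I^\dagger\right) - \delta(s)\left(B_{01}B_{10} - B_{10}^\dagger B_{01}^\dagger\right) + \delta(s - l)\left(B_{10}B_{01} - B_{01}^\dagger B_{10}^\dagger\right)\Bigg], \tag{29}$$

$$\frac{dT_2}{dx_6} - i[T_0, T_2] + i[T_3, T_1] = \frac{i}{2}\Bigg[\sum_{p=1}^{k}\delta(s - \lambda_p)\left(IJ + J^\dagger I^\dagger\right) - \delta(s)\left(B_{01}B_{10} + B_{10}^\dagger B_{01}^\dagger\right) + \delta(s - l)\left(B_{10}B_{01} + B_{01}^\dagger B_{10}^\dagger\right)\Bigg], \tag{30}$$

$$\frac{dT_3}{dx_6} - i[T_0, T_3] + i[T_1, T_2] = -\frac{1}{2}\Bigg[\sum_{p=1}^{k}\delta(s - \lambda_p)\left(J^\dagger J - II^\dagger\right) - \delta(s)\left(B_{01}B_{01}^\dagger - B_{10}^\dagger B_{10}\right) + \delta(s - l)\left(B_{01}^\dagger B_{01} - B_{10}B_{10}^\dagger\right)\Bigg]. \tag{31}$$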

These reproduce exactly the Nahm equations (4), as well as the boundary conditions (7,8) and the matching conditions (9,10).

7.2. Mirror symmetry and bow reciprocity. The brane configuration identified in Sect. 6 allows for an S-dual description. S-duality leads to an analogous configuration with NS five-branes in place of D5-branes and vice versa. From the point of view of the gauge theory on the D3-brane identified in Sect. 7.1, this duality is the Montonen-Olive electric-magnetic duality. The brane picture leads us to conclude that this duality effectively interchanges the two types of inhomogeneities in our theory, leading to a reciprocity among bows. The reciprocity rule is represented in Fig. 6. It amounts to interchanging the fundamental and bifundamental multiplets and splitting and rejoining the Nahm data intervals accordingly.

Fig. 6. Reciprocity rule

Fig. 7. A bow pair

If we are to consider a theory with impurities specified by the diagram on the right in Fig. 7, it is a theory with $SU(N)$ gauge group with three inhomogeneity hyperplanes: two with fundamental and one with bifundamental hypermultiplets. Two of the branches of vacua of this theory are described by the two bow diagrams of Fig. 7. The Coulomb branch of the theory is isometric to the moduli space of $N$ $U(1)$ instantons on a two-centered Taub-NUT (also called the $A_1$ ALF space)¹, while the Higgs branch is given by the moduli space of $N$ $U(2)$ instantons on a Taub-NUT space ($A_0$ ALF). As usual, S-duality maps the above impurity theory to another $U(N)$ impurity theory with two bifundamental and one fundamental matter multiplets.

To give an example, we apply this rule to the bow diagram of SU(2) instantons on the Taub-NUT space. The reciprocal pair of diagrams is presented in Fig. 7. The resulting bow diagram describes Abelian self-dual connections on a two-centered Taub-NUT space.

Let us emphasize that the moduli spaces of a bow and its reciprocal bow are generically different. What this reciprocity suggests, however, is that the two reciprocal bows should be considered together as a pair and that the moduli space of each bow in this pair should be viewed as one of the branches of a larger moduli space. These two branches can be interpreted as the Coulomb branch and the Higgs branch of the moduli space of vacua of a corresponding impurity gauge theory of Sect. 7.1. These branches are interchanged by the action of the gauge theory mirror symmetry. This mirror symmetry action manifests itself as the bow reciprocity we formulated here.

¹ At first sight this moduli space might appear to be empty; however, in the presence of noncommutativity, indicated here by the difference of the wavy interval lengths, this space is not trivial.

8. Moduli Space of One Instanton

For the case of a single instanton all the corresponding bow data are Abelian and the Nahm equations imply that all $T_j$'s for $j = 1, 2, 3$ are constant on each interval, while $T_0$ can be made constant by a gauge transformation that acts trivially at the boundary. If the interval has length $d$, due to a gauge transformation $\exp\left(2\pi i\,\frac{s - s_0}{d}\right)$, the corresponding $T_0$ is periodically identified with the period $2\pi/d$. Thus we have three copies of $\mathbb{R}^3 \times S^1$ parameterized by the Nahm data on the three intervals and three copies of $\mathbb{R}^4$ parameterized by $(I_L, J_L)$, $(I_R, J_R)$, and $(B_{01}, B_{10})$. The remaining gauge group acting on these data is $U(1)^{\times 4}$ with the following action:

\[
e^{i\phi_0} \times e^{i\phi_1} \times e^{i\phi_2} \times e^{i\phi_3} :
\begin{pmatrix} B_1 \\ B_2 \\ I_L \\ J_L \\ I_R \\ J_R \\ T_0^L \\ T_0^M \\ T_0^R \\ T_j \end{pmatrix}
\mapsto
\begin{pmatrix}
e^{-i\phi_0} B_1 e^{i\phi_3} \\
e^{-i\phi_3} B_2 e^{i\phi_0} \\
e^{-i\phi_1} I_L \\
J_L e^{i\phi_1} \\
e^{-i\phi_2} I_R \\
J_R e^{i\phi_2} \\
T_0^L + (\phi_1 - \phi_0)/d_L \\
T_0^M + (\phi_2 - \phi_1)/d_M \\
T_0^R + (\phi_3 - \phi_2)/d_R \\
T_j
\end{pmatrix}.
\tag{32}
\]

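It is convenient to introduce quaternionic notation for this data:

\[
X_B = \begin{pmatrix} B_2^\dagger & B_1^\dagger \\ -B_1 & B_2 \end{pmatrix}, \qquad
X_L = \begin{pmatrix} I_L^\dagger & -J_L^\dagger \\ J_L & I_L \end{pmatrix}, \qquad
X_R = \begin{pmatrix} I_R^\dagger & -J_R^\dagger \\ J_R & I_R \end{pmatrix}.
\tag{33}
\]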

Then the complex structures are given by $I = -i\sigma_3$, $J = -i\sigma_1$, and $K = -i\sigma_2$. The following computation is close to that of [33]. For each of $X_B$, $X_L$, and $X_R$ we introduce coordinates $\Psi_B$, $\Psi_L$, and $\Psi_R$ via the following decomposition: $X = Q \exp(-i\sigma_3 \Psi/2)$, with an anti-Hermitian $Q$ and a periodic $\Psi \sim \Psi + 4\pi$. For any such $X$, the combination $X \sigma_3 X^\dagger$ is traceless and Hermitian, thus it can be written in terms of a real three-vector $\vec{R}$, so that $\vec{R} \cdot \vec{\sigma} = X \sigma_3 X^\dagger$. The decomposition of $X$ in terms of $Q$ and $\Psi$ is unique up to a change of the sign of $Q$ accompanied by a $2\pi$ shift in $\Psi$, thus $X$ is determined uniquely in terms of $\vec{R}$ and $\Psi$. Moreover, the flat metric takes the form

\[
I_{2\times 2}\, ds_X^2 = dX\, dX^\dagger = \frac{1}{4}\left( \frac{1}{|\vec{R}|}\, d\vec{R}^{\,2} + |\vec{R}|\,(d\Psi + \omega)^2 \right) I_{2\times 2},
\tag{34}
\]

where $d\omega = *d\,\frac{1}{|\vec{R}|}$.

If $X$ transforms as $X \to X \exp(-i\sigma_3 \phi)$ (i.e. $\Psi \to \Psi + 2\phi$), the corresponding moment map is $\vec{\mu} = \frac{1}{2}\vec{R}$. On the other hand, if the Nahm data transforms as $(T_0, T_j) \to (T_0 + \phi, T_j)$, the moment map is $\vec{\mu} = \vec{T}$. With this in mind, we find the following moment maps for the $U(1)^{\times 4}$ action of Eq. (32):

\[
\vec{\mu}_0 = -\tfrac{1}{2}\vec{R}_B - \vec{T}_L, \qquad
\vec{\mu}_1 = \vec{T}_L - \vec{T}_M - \tfrac{1}{2}\vec{R}_L,
\tag{35}
\]
\[
\vec{\mu}_2 = \vec{T}_M - \vec{T}_R + \tfrac{1}{2}\vec{R}_R, \qquad
\vec{\mu}_3 = \vec{T}_R + \tfrac{1}{2}\vec{R}_B.
\tag{36}
\]

The following invariants of the gauge transformations (32),

\[
\theta = \Psi_B - 2 d_L T_0^L - \Psi_L + \frac{d_R}{d_L}\left(\Psi_R + 2 d_R T_0^R\right),
\tag{37}
\]
\[
\alpha = \Psi_L + \Psi_R - 2 d_M T_0^M,
\tag{38}
\]

provide two periodic coordinates of period $4\pi$ on the moduli space. In our case $d_L = d_R = d/2$, $l = d_L + d_M + d_R$, $d_M = 2\lambda$. Imposing the vanishing of the moment maps above we have

\[
\vec{T}_L = \vec{T}_R = -\tfrac{1}{2}\vec{R}_B = \vec{R}_1, \qquad
\vec{T}_M = \vec{R}_2, \qquad
\vec{r} = \tfrac{1}{2}\vec{R}_L = \tfrac{1}{2}\vec{R}_R = \vec{R}_1 - \vec{R}_2,
\tag{39}
\]

1 1 2 2 2 d R1 − 4λd R1 d r + 2λ + d r ds = l + 2R1 r 1 2 1 1 (dα + ωr )2 2 dθ − 4 ω R1 + , (40) + l − 2λ + 1/r + 1/(2R1 ) 4 2λ + 1/r

where $\theta \sim \theta + 4\pi$, $\alpha \sim \alpha + 4\pi$ and $d\omega_{R_1} = *d(1/R_1)$, $d\omega_r = *d(1/r)$. This metric matches that of [36], where the moduli spaces of the corresponding three-dimensional gauge theories were studied. We interpret this moduli space metric in the following way. Just as in the case of a caloron, an instanton on the Taub-NUT space has two monopole-like constituents. These constituents are characterized by their positions in the three-space, given by the vectors $\vec{R}_1$ and $\vec{R}_2$, and by the phases $\theta_1$ and $\theta_2$ respectively. The metric above is written in terms of the coordinates and the phase of the first constituent, $\vec{R}_1$ and $\theta = \theta_1$, and the relative position $\vec{r} = \vec{R}_1 - \vec{R}_2$ and the relative phase $\alpha = \theta_2 - \theta_1$ of the two constituents.

9. Conclusions

We presented the Bow Diagram formalism which encodes the data determining instanton configurations on the Taub-NUT and multi-Taub-NUT spaces. Our discussion here was limited to zero monopole charge and a generic Wilson line at infinity. We motivated our construction by identifying a corresponding string theory brane configuration and analyzing the theory on D-branes’ world-volume. The resulting impurity theory on a four-dimensional space-time with one compact dimension contains two types of impurities localized on hyperplanes perpendicular to the periodic direction. We formulate a reciprocity rule that interchanges the two types of impurities. Applying this rule to all impurities in a bow leads to a dual bow. Supersymmetry conditions defining the moduli space of the quantum gauge theory are exactly the moment maps of the corresponding bow.
The bow formulation allows us to find the moduli space of instantons on the Taub-NUT. We identify it as a finite hyperkähler quotient and establish its holomorphic equivalence with the moduli space of instantons on R4. As an example, we find the metric on the moduli space of a single instanton on the Taub-NUT space.

Acknowledgements. It is our pleasure to thank Tamas Hausel and Juan Maldacena for illuminating discussions. We are grateful to the Institute for Advanced Study, Princeton and Princeton University Mathematics Department for hospitality. This work is supported by the Science Foundation Ireland Grant No. 06/RFP/MAT050 and by the European Commission FP6 program MRTN-CT-2004-005104.

References

1. Atiyah, M.F., Hitchin, N.J., Drinfeld, V.G., Manin, Yu.I.: Construction of instantons. Phys. Lett. A 65, 185 (1978)
2. Nahm, W.: Selfdual monopoles and calorons. BONN-HE-83-16. Presented at 12th Colloq. on Group Theoretical Methods in Physics, Trieste, Italy, Sep 5–10, 1983; published in Lecture Notes in Physics 201, New York: Springer, 1984, pp. 189–200
3. Kronheimer, P.B., Nakajima, H.: Yang-Mills instantons on ALE gravitational instantons. Math. Ann. 288(2), 263–307 (1990)
4. Nekrasov, N., Schwarz, A.S.: Instantons on noncommutative R4 and (2,0) superconformal six dimensional theory. Commun. Math. Phys. 198, 689 (1998)
5. Taub, A.H.: Empty space-times admitting a three parameter group of motions. Ann. Math. 53(3), 472–490 (1951); Newman, E., Tamburino, L., Unti, T.: Empty-space generalization of the Schwarzschild metric. J. Math. Phys. 4, 915 (1963); Hawking, S.W.: Gravitational instantons. Phys. Lett. A 60, 81 (1977)
6. Chalmers, G., Hanany, A.: Three dimensional gauge theories and monopoles. Nucl. Phys. B 489, 223 (1997)
7. Hanany, A., Witten, E.: Type IIB superstrings, BPS monopoles, and three-dimensional gauge dynamics. Nucl. Phys. B 492, 152 (1997)
8. Douglas, M.R., Moore, G.W.: D-branes, quivers, and ALE instantons. http://arxiv.org/abs/hep-th/9603167v1, 1996
9. Douglas, M.R.: Gauge fields and D-branes. J. Geom. Phys. 28, 255 (1998)
10. Johnson, C.V., Myers, R.C.: Aspects of type IIB theory on ALE spaces. Phys. Rev. D 55, 6382 (1997)
11. Bianchi, M., Fucito, F., Rossi, G., Martellini, M.: Explicit construction of Yang-Mills instantons on ALE spaces. Nucl. Phys. B 473, 367 (1996)
12. Bianchi, M., Fucito, F., Rossi, G., Martellini, M.: On the ADHM construction on ALE gravitational backgrounds. Phys. Lett. B 359, 49 (1995)
13. Kraan, T.C., van Baal, P.: Exact T-duality between calorons and Taub-NUT spaces. Phys. Lett. B 428, 268 (1998); Kraan, T.C., van Baal, P.: Periodic instantons with non-trivial holonomy. Nucl. Phys. B 533, 627 (1998); Kraan, T.C., van Baal, P.: New instanton solutions at finite temperature. Nucl. Phys. A 642, 299 (1998)
14. Lee, K.M., Lu, C.h.: SU(2) calorons and magnetic monopoles. Phys. Rev. D 58, 025011 (1998)
15. Cherkis, S.A.: Instantons on the Taub-NUT space. http://arxiv.org/abs/0902.4724
16. Pope, C.N., Yuille, A.L.: A Yang-Mills instanton in Taub-NUT space. Phys. Lett. B 78, 424 (1978); Yuille, A.L.: Phys. Lett. B 81, 321 (1979)
17. Kim, H., Yoon, Y.: Instanton-meron hybrid in the background of gravitational instantons. Phys. Rev. D 63, 125002 (2001); Kim, H., Yoon, Y.: Effects of gravitational instantons on Yang-Mills instanton. Phys. Lett. B 495, 169 (2000)
18. Etesi, G.: Classification of ’t Hooft instantons over multi-centered gravitational instantons. Nucl. Phys. B 662, 511 (2003)
19. Etesi, G., Hausel, T.: New Yang-Mills instantons over multi-centered gravitational instantons. Commun. Math. Phys. 235, 275 (2003)
20. Etesi, G., Hausel, T.: Geometric construction of new Taub-NUT instantons. Phys. Lett. B 514, 189 (2001)
21. Kronheimer, P.B.: Monopoles and Taub-NUT Metrics. M. Sc. Thesis, Oxford, 1985
22. Cherkis, S.A., Kapustin, A.: Singular monopoles and supersymmetric gauge theories in three dimensions. Nucl. Phys. B 525, 215 (1998)
23. Cherkis, S.A., Kapustin, A.: Singular monopoles and gravitational instantons. Commun. Math. Phys. 203, 713 (1999)
24. Cherkis, S.A., Kapustin, A.: Dk gravitational instantons and Nahm equations. Adv. Theor. Math. Phys. 2, 1287 (1999)
25. Cherkis, S.A., Durcan, B.: Singular Monopoles via the Nahm Transform. JHEP 0804, 070 (2008)
26. Cherkis, S.A., Durcan, B.: The ’t Hooft-Polyakov monopole in the presence of an ’t Hooft operator. Phys. Lett. B 671, 123–127 (2009)
27. Hurtubise, J., Murray, M.K.: On the construction of monopoles for the classical groups. Commun. Math. Phys. 122, 35 (1989)
28. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145 (1983)
29. Nakajima, H.: Monopoles and Nahm’s equations. In: Sanda 1990, Proceedings, Einstein metrics and Yang-Mills connections. Mabuchi, T., Mukai, S., eds., New York: Marcel Dekker, 1993, pp. 193–211
30. Dancer, A., Swann, A.: Hyperkähler metrics associated to compact Lie groups. Math. Proc. Cambridge Philos. Soc. 120(1), 61–69 (1996)
31. Donaldson, S.K.: Nahm’s equations and the classification of monopoles. Commun. Math. Phys. 96, 387 (1984)
32. Kapustin, A., Sethi, S.: The Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571 (1998)
33. Gibbons, G.W., Rychenkova, P.: HyperKaehler quotient construction of BPS monopole moduli spaces. Commun. Math. Phys. 186, 585 (1997)
34. DeWolfe, O., Freedman, D.Z., Ooguri, H.: Holography and defect conformal field theories. Phys. Rev. D 66, 025009 (2002)
35. Erdmenger, J., Guralnik, Z., Kirsch, I.: Four-dimensional superconformal theories with interacting boundaries or defects. Phys. Rev. D 66, 025020 (2002)
36. de Boer, J., Hori, K., Ooguri, H., Oz, Y.: Mirror symmetry in three-dimensional gauge theories, quivers and D-branes. Nucl. Phys. B 493, 101 (1997)

Communicated by N. A. Nekrasov

Commun. Math. Phys. 290, 737–777 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0860-y

Communications in

Mathematical Physics

Index and Stability of Symmetric Periodic Orbits in Hamiltonian Systems with Application to Figure-Eight Orbit

Xijun Hu¹, Shanzhong Sun²

¹ Department of Mathematics, Shandong University, Jinan, Shandong, 250100, The People’s Republic of China. E-mail: [email protected]
² Department of Mathematics, Capital Normal University, Beijing, 100048, The People’s Republic of China. E-mail: [email protected]

Received: 8 August 2008 / Accepted: 10 April 2009
Published online: 7 July 2009 – © Springer-Verlag 2009

Abstract: In this paper, using the Maslov index theory in symplectic geometry, we build up some stability criteria for symmetric periodic orbits in a Hamiltonian system, which is motivated by the recent discoveries in the n-body problem. The key ingredient is a generalized Bott-type iteration formula for periodic solution in the presence of finite group action on the orbit. For second order system, we prove, under general boundary conditions, the close formula for the relationship between the Morse index of an orbit in a Lagrangian system and the Maslov index of the fundamental solution for the corresponding orbit in its Hamiltonian system counterpart, and the boundary conditions cover the cases which appeared in the n-body problem. As an application we consider the stability problem of the celebrated figure-eight orbit due to Chenciner and Montgomery in the planar three-body problem with equal masses, and we clarify the relationship between linear stability and its variational nature on various loop spaces. The basic idea is as follows: the variational characterization of the figure-eight orbit provides information about its Morse index; based on its relation to the Maslov index, our stability criteria come into play.

Partially supported by NSFC (No. 10801127) and the knowledge innovation program of the Chinese Academy of Science.
Partially supported by NSFC (Nos. 10401025, 10571123 and 10731080) and NSFB-FBEC (No. KZ200610028015).

1. Introduction and Main Results

The purpose of this paper is to study the stability of the symmetric periodic solutions in Hamiltonian systems. Our main idea is to make full use of the variational nature of periodic solutions in the presence of symmetry. The main motivation for us is periodic orbits with symmetry in celestial mechanics, especially in the n-body problem, obtained via the direct variational method. The idea of minimization in the three-body problem goes back to
Poincaré. Although his attack to this problem is not so successful, it reveals the essential difficulty for the Newtonian potential other than the strong force case. Except for the now classical work of Gordon [G] on the minimizing property of elliptic Keplerian orbits, there has been very little progress in this direction for more than one century. Until recently in 2000, in their celebrated paper [CM], Chenciner and Montgomery made a major breakthrough by considering minimization in some symmetric loop space in the equal mass case to find the now-famous figure-eight orbit in the planar three-body problem with equal masses. Here the symmetry of the loop plays an important role to avoid collisions. In fact they get their solution by considering the minimizing problem in some path space and then use the symmetry to glue the pieces together to get the whole orbit. This is the starting point of later exciting developments in this field. Now a lot of solutions, rigorous or numerical, have been derived by these ideas. They depend heavily on the equal mass assumption [FT]. Note that only quite recently, Chen [CH] relaxes this condition. The symmetric nature of these solutions are our main concern in this paper. We should note that the general theory developed in the paper works also for periodic orbits with variational characterizations other than minimizers. Let z ≡ z(t) be a periodic solution of the Hamiltonian system with Hamiltonian H , namely

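\[
\dot{z}(t) = J H'(t, z(t)), \tag{1.1}
\]
\[
z(0) = z(T), \tag{1.2}
\]

with $J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}$ and $I_n$ the identity matrix on $\mathbb{R}^n$. Its associated fundamental solution $\gamma \equiv \gamma_z(t)$ is such that

\[
\dot{\gamma}(t) = J H''(t, z(t))\, \gamma(t), \tag{1.3}
\]
\[
\gamma(0) = I_{2n}. \tag{1.4}
\]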
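As a small numerical illustration of (1.3)–(1.4), the following sketch integrates the fundamental solution for an assumed toy example, the harmonic oscillator $H(z) = |z|^2/2$ on $\mathbb{R}^2$ (so $n = 1$ and $H'' = I_2$), and measures how far $\gamma(T)$ is from satisfying $M^T J M = J$. The example Hamiltonian, the RK4 integrator and all function names are illustrative choices, not taken from the paper.

```python
import math

J = [[0.0, -1.0], [1.0, 0.0]]     # standard J for n = 1
HESS = [[1.0, 0.0], [0.0, 1.0]]   # H'' for the assumed example H(z) = |z|^2 / 2


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def axpy(a, b, s):
    """Return a + s * b entrywise."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]


def rhs(gamma):
    """Right-hand side of (1.3): J H'' gamma (H'' is constant here)."""
    return matmul(matmul(J, HESS), gamma)


def fundamental_solution(T, steps=2000):
    """Integrate (1.3) with initial condition (1.4) on [0, T] by classical RK4."""
    h = T / steps
    g = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        k1 = rhs(g)
        k2 = rhs(axpy(g, k1, h / 2))
        k3 = rhs(axpy(g, k2, h / 2))
        k4 = rhs(axpy(g, k3, h))
        incr = axpy(axpy(axpy(k1, k2, 2.0), k3, 2.0), k4, 1.0)  # k1 + 2k2 + 2k3 + k4
        g = axpy(g, incr, h / 6)
    return g


def symplectic_defect(m):
    """Largest absolute entry of M^T J M - J; zero for a symplectic matrix."""
    d = axpy(matmul(matmul(transpose(m), J), m), J, -1.0)
    return max(abs(x) for row in d for x in row)


gamma_T = fundamental_solution(2 * math.pi)
print("symplectic defect of gamma(T):", symplectic_defect(gamma_T))
```

For this example the exact solution is the rotation $\gamma(t) = e^{tJ}$, so the defect should be at the level of the integration error.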

It is well-known that γ is a path in the symplectic group Sp(2n) = {M ∈ G L(2n) | M T J M = J }, and γ (T ) is called the linear Poincaré map. Solution z is called linear stable if ||γ (T )k || is bounded for all k ∈ N and spectral stable (or elliptic) if all the eigenvalues of γ (T ) are on U, the unit circle in the complex plane C. For M ∈ Sp(2n), let the elliptic height e(M) be the total algebraic multiplicity of all eigenvalues of M on U. In studying the existence, multiplicity and stability of periodic solutions of a Hamiltonian system, a very successful index theory for such symplectic paths is introduced by Conley and Zehnder [CZ] and developed by Long and others (see [L3] for details). We call it Maslov-type index theory (or Morse index for the Lagrangian systems). For ω ∈ U, let i ω (γ ) be the Maslov-type index of γ . For the solution z satisfing (1.1–1.2) with γ its fundamental solution, its Maslov-type index is defined to be i ω (z) := i ω (γ ). We review the Maslov-type index in Sect. 4.1 for the completeness. The key idea for the stability problem is to use the Bott-type iteration formula for the Maslov-type index [B,BTZ,E,L1] to get an estimate of the elliptic height. A very interesting application is the ellipticity of closed characteristics on the compact convex

hypersurface in $\mathbb{R}^{2n}$. Dell’Antonio, D’Onofrio and Ekeland [DDE] prove that there is at least one elliptic closed characteristic on any symmetric compact convex hypersurface in $\mathbb{R}^{2n}$. Ekeland and Hofer ([EH], Prop. 2, [E], Th. V.3.4) provide a close relationship between the set of Maslov-type indices of closed characteristics and the set of even positive integers, which is the core in studying the multiplicity and ellipticity of the closed characteristics. For the compact convex hypersurface in $\mathbb{R}^4$, a spectacular theorem of Hofer, Wysocki and Zehnder [HWZ] says that the number of geometrically different closed characteristics is either 2 or infinity, and Long [L2] further proves that all the closed characteristics are elliptic if the number is finite. Recently, Long and Zhu [LZ1] prove that there is at least one elliptic closed characteristic if the number is finite on any compact convex energy hypersurface in $\mathbb{R}^{2n}$.

Bearing these in mind, in this paper we use Maslov index theory [CLM] (including its complex version [LZ2,Z]) to study the stability of the symmetric periodic solutions in Hamiltonian systems, especially in the n-body problem. To this purpose, at first we need to generalize the Bott-type iteration formula to the case with group action on the orbit. Let $E$ be a Hilbert space, and $G$ a finite subgroup of the orthogonal group acting on $E$. If $f$ is a $G$-invariant functional on $E$, and $x$ is a critical point of $f$ restricted to the $G$-invariant subspace, then it is also a critical point on the total space following Palais’ principle. The symmetry of solutions used recently in celestial mechanics is essentially generated by the next two kinds of group actions.

Type $I_H$: Cyclic Symmetry. Let $Sp(2n)$ and $O(2n)$ be the symplectic and orthogonal group respectively, and let $Q \in Sp(2n) \cap O(2n)$ be fixed. Take the function space to be $E = \{z \in W^{1,2}(\mathbb{R}/T\mathbb{Z}, \mathbb{R}^{2n}) \mid z(t) = Qz(t + T)\}$. The $\mathbb{Z}_m$-group action for the generator $g \in \mathbb{Z}_m$ is as follows:

\[
g : E \to E, \qquad z(t) \mapsto S z\!\left(t + \frac{T}{m}\right),
\]

where $S$ is an orthogonal symplectic matrix such that $SJ = JS$ and $S^m = Q$, hence $g^m = \mathrm{id}$. Suppose the Hamiltonian function $H(t, z) \in C^2(\mathbb{R} \times \mathbb{R}^{2n}, \mathbb{R})$ satisfies $H(t - T/m, Sz) = H(t, z)$ ($H(Sz) = H(z)$ in the autonomous case), then the functional $f(z) = \int_0^T \left[\left(-J\frac{dz(t)}{dt}, z(t)\right) - H(t, z(t))\right] dt$ is $\mathbb{Z}_m$-invariant.

Type $II_H$: Generalized Brake Symmetry. This is some generalized time-reversal $\mathbb{Z}_2$-group action. Let $S, N \in O(2n)$ satisfy $SJ = JS$, $N^2 = I_{2n}$, $NJ = -JN$, $N = N^T$. Moreover, suppose

\[
N S^T = S N, \tag{1.5}
\]

and the Hamiltonian function $H(t, z)$ satisfies $H(T - t, Nz) = H(t, z)$ ($H(Nz) = H(z)$ in the autonomous case). Let $E = \{z \in W^{1,2}([0, T], \mathbb{R}^{2n}) \mid z(0) = Sz(T)\}$, then $g : E \to E$, $z(t) \mapsto N z(T - t)$, generates a $\mathbb{Z}_2$ group action on $E$. Obviously, the functional $f(z) = \int_0^T \left[\left(-J\frac{dz(t)}{dt}, z(t)\right) - H(t, z(t))\right] dt$ is $\mathbb{Z}_2$-invariant on $E$. Let $V^+(SN)$ ($V^+(N)$) and $V^-(SN)$ ($V^-(N)$) be
the positive and negative definite subspaces of $SN$ (or $N$ respectively) in $\mathbb{R}^{2n}$, then both $V^\pm(SN)$ and $V^\pm(N)$ are Lagrangian subspaces of $(\mathbb{R}^{2n}, \omega)$ with $\omega$ the standard symplectic form on $\mathbb{R}^{2n}$.

These two group actions are motivated by the periodic solutions of the n-body problems appearing in recent literature [C1,CM,CH,FT]. For the notations of cyclic type and brake type, we follow [FT]. However, in our setting the function space need not be a loop space, and it can be a path space, which is different from [FT].

Note that for $M \in Sp(2n)$, $Gr(M) := \{(x, Mx) \mid x \in \mathbb{R}^{2n}\}$ is a Lagrangian subspace of the symplectic vector space $(\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}, -\omega \oplus \omega)$. For some solution $z$ of the Hamiltonian system with fundamental solution $\gamma$, $Gr(\gamma)$ is then a path of Lagrangian subspaces. We denote by $\mu(Gr(Q^T), Gr(\gamma(t)), t \in [0, T])$ the Maslov index of a symplectic path $\gamma(t)$ ($t \in [0, T]$) with respect to $Q \in Sp(2n)$, and its definition will be given in detail in (2.8) following [CLM].

Theorem 1.1. Let $z$ be a solution to the Hamiltonian system (1.1) with $\gamma$ its fundamental solution. Under the above notations, for Type $I_H$ symmetry,

\[
\mu(Gr(Q^T), Gr(\gamma(t)), t \in [0, T]) = \sum_{i=1}^{m} \mu\!\left(Gr\!\left(\exp\!\left(\tfrac{2\pi i}{m}\sqrt{-1}\right) S^T\right), Gr(\gamma(t)), t \in [0, T/m]\right), \tag{1.6}
\]

and for Type $II_H$ symmetry,

\[
\mu(Gr(S^T), Gr(\gamma(t)), t \in [0, T]) = \mu\!\left(V^+(N), \gamma(t)V^+(SN), t \in [0, \tfrac{T}{2}]\right) + \mu\!\left(V^-(N), \gamma(t)V^-(SN), t \in [0, \tfrac{T}{2}]\right). \tag{1.7}
\]
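As a sanity check of (1.6), one can specialize to the simplest cyclic case, taking $Q = S = I_{2n}$ and $m = 2$ (an illustrative choice for which $S^m = Q$ holds trivially). The phases $\exp(\frac{2\pi i}{m}\sqrt{-1})$ for $i = 1, 2$ are then $-1$ and $1$, so by direct substitution the formula reads:

```latex
\[
\mu(Gr(I_{2n}), Gr(\gamma(t)), t \in [0, T])
= \mu(Gr(-I_{2n}), Gr(\gamma(t)), t \in [0, T/2])
+ \mu(Gr(I_{2n}), Gr(\gamma(t)), t \in [0, T/2]).
\]
```

In words, the index over the full period splits into a periodic and an anti-periodic contribution over half the period.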

This is a generalization of the Bott-type iteration formula for Maslov index of periodic solutions as remarked above. If $Q = I_{2n}$, $S = I_{2n}$, then (1.6) is the usual iteration formula [L3]. For Type $II_H$ symmetry, note that when $S = I_{2n}$, $z(t)$ is the usual brake orbit which has time-reversal symmetry. In this case, the formula has been derived by [LZZ]. For the even more general group action situation, see Theorem 2.7.

It is well-known that the n-body problem is also a second order system. Its solution, as the critical point of the action functional, also has Morse index. The relation between Morse index and Maslov index is an intriguing problem, and the investigation has been initiated by Duistermaat [D]. However, he only defines the Maslov index without boundary condition, and his formula is not suitable for our purpose. In our setting we can get similar relations between these indices. For $x \in W^{1,2}([0, T], \mathbb{R}^n)$, set

\[
F(x) = \int_0^T L(t, x, \dot{x})\, dt, \tag{1.8}
\]

where $L \in C^2([0, T] \times \mathbb{R}^{2n}, \mathbb{R})$ and it satisfies the Legendrian convexity condition. Corresponding to the above two kinds of group actions, we are interested in the following boundary conditions:

\[
\Lambda_1 = Gr(\bar{S}^T) = \{(x(0), x(T)) \in Gr(\bar{S}^T)\}, \tag{1.9}
\]
\[
\Lambda_2 = V_1 \times V_2 = \{(x(0), x(T)) \mid x(0) \in V_1,\ x(T) \in V_2\}, \tag{1.10}
\]

where $\bar{S}$ is an orthogonal matrix on $\mathbb{R}^n$, $V_1$ and $V_2$ are subspaces of $\mathbb{R}^n$. Let $S = \begin{pmatrix} \bar{S} & 0 \\ 0 & \bar{S} \end{pmatrix}$, and $\bar{\Lambda}_i = J V_i^\perp \oplus V_i$ for $i = 1, 2$. Then the boundary conditions $\Lambda_1$, $\Lambda_2$ correspond to $Gr(S^T)$ and $\bar{\Lambda}_1 \oplus \bar{\Lambda}_2$ when the system is switched to the first order Hamiltonian system. For a self-adjoint operator $A$, we denote its Morse index by $m^-(A)$, the dimension of maximal negative definite subspaces of $A$. For a critical point $x$ of $F$, we denote its Morse index by

\[
m^-(x) = m^-(F''),
\]

where $F''$ is the Hessian of $F$ at $x$.

Theorem 1.2. For a critical point $x$ of (1.8), under boundary condition $\Lambda_1$,

\[
m^-(x) + \nu_1(\bar{S}) = \mu(Gr(S^T), Gr(\gamma(t))), \tag{1.11}
\]

where $\nu_1(\bar{S}) = \dim \ker(\bar{S} - I_n)$. Under boundary condition $\Lambda_2$,

\[
m^-(x) + \dim V_1^\perp \cap V_2^\perp = \mu(\bar{\Lambda}_2, \gamma(t)\bar{\Lambda}_1), \tag{1.12}
\]

where $\gamma(t)$ is the fundamental solution of the corresponding solution in the Hamiltonian system.

We will use the theory developed by Long [L3] and Theorem 1.1 to give stability criteria. Given $M \in Sp(2n)$, for any symplectic path $\eta$ from $I_{2n}$ to $M$, we define $D_\omega(M)$ for $\omega \in U$ by

\[
D_\omega(M) = i_\omega(\eta) - i_1(\eta). \tag{1.13}
\]

Following [L3], this definition is independent of the choice of $\eta$, i.e. it only depends on the end points, and we will give an explanation of this fact in Sect. 4.2. For a function $g(w)$ defined on some closed interval $[a, b]$, we define its variation by

\[
\mathrm{var}(g(w), [a, b]) = \max\Big\{ \sum_{j=0}^{k-1} |g(w_{j+1}) - g(w_j)| \ :\ a = w_0 < \cdots < w_k = b \text{ is any partition} \Big\}. \tag{1.14}
\]
Let

√ f (θ ) ≡ f γz ,S (θ ) = µ(Gr (exp( −1θ )S T ), Gr (γz (t)), t ∈ [0, T /m]) + Dexp(√−1θ) (S),

then we have Theorem 1.3. Suppose the periodic solution z to (1.1–1.2) has Type I H symmetry with fundamental solution γz , then e(γz (T ))/2 ≥ var ( f (θ ), θ ∈ [0, π ]).

(1.15)

We also give a criterium to instability Theorem 1.4. Suppose the periodic solution z to (1.1–1.2) has Type I H symmetry, then the solution is linearly unstable if µ(Gr (S T ), Gr (γz (t)), t ∈ [0, T /m]) is odd.

742

X. Hu, S. Sun

A famous result of Poincaré states that a closed minimizing geodesic on any Riemann surface is unstable. Our theorem can be seen as a generalization. The linear stability of the figure-eight orbit was numerically observed by Simó [S] by verifying that all eigenvalues of the monodromy matrix are on the unit circle. Note also that it was numerically confirmed by Simó that the figure-eight is KAM stable on the manifold of zero angular momentum, by numerically computing the Poincaré map to higher order and its normal form around the fixed point corresponding to the eight. Then Kapela and Simó [KS], and also Roberts [R], rigorously established the linear stability. However, their proofs are computer assisted. In this paper, we try to understand why this linear stability is possible from a variational viewpoint by the index theory of the Hamiltonian system developed above. The key property of the figure-eight orbit is that it has $D_6$ full symmetry [C1], which mixes Type $I_H$ and Type $II_H$ boundary conditions [R]. Let $\hat{X}$ be the configuration space of the planar three-body problem with center-of-masses fixed. We denote by $m_1$, $m_2$, $m_3$ the Morse indices of the eight as the critical point of the action functional in the total loop space $W^{1,2}(\mathbb{R}/\mathbb{Z}, \hat{X})$, and in some $\mathbb{Z}/2\mathbb{Z}$ (cyclic-type)- and $\mathbb{Z}/3\mathbb{Z}$-invariant loop subspaces which will be defined in Sect. 5.

Theorem 1.5. For the figure-eight orbit, if $a = 1$ and $m_2 = m_3 = 0$, it is linear stable. Here $a$ appears in $N_1(1, a) = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$, which is the symplectic Jordan form corresponding to the angular momentum of the monodromy matrix.
The idea for the proof of the linear stability via Maslov index is as follows: from the variational characterization of periodic solution of a Hamiltonian system, we get its Morse index; for a periodic solution with symmetry, when we focus on some part of the orbit on account of symmetry, we are mainly interested in the solution with general boundary condition and we can draw conclusions about its Maslov index from its Morse index by Theorem 1.2; for Theorem 1.5, this is Lemma 5.7 which links together the Morse indices on various invariant loop spaces and the Maslov indices in different period segments; in turn, these Maslov indices are related through Maslov indices with respect to ω ∈ U on the same basic period segment (T /6 in our figure-eight case at hand) by Theorem 1.1, the generalized Bott-type iteration formula; now a key formula for the stability is (5.51) which implies that for different ω’s on the unit circle, their Maslov index differences are essentially due to the splitting number’s jumps at the eigenvalues of the monodromy matrix; we can use this way to detect the distributions of the eigenvalues of the monodromy matrix on the unit circle, whence the stability; the properties of splitting numbers are given in Lemmata 4.4 and 4.5. Please see Sect. 5 for the details and its variants. Using Matlab, we have strong numerical evidences for our assumptions, and all theories fit well with numerical results. We would like to point out that the stability of the eight is quite special among the solutions found recently by minimization methods. In fact numerical simulation shows that it is one of several examples of stable solutions, and the others are all unstable. Our theorem on the instability will be useful for this purpose. In a recent preprint [HS] we have used the Maslov-type index theory to study the stability problem of elliptic relative equilibrium, and get the relation between stability and Morse index. 
In that paper we do not use the symmetry property of the solution which is essential to this paper. The paper is organized as follows. In Sect. 2, we firstly review the definitions of relative Morse index by spectral flow and Maslov index, then we prove that they are equal for

each orbit of a Hamiltonian system via crossing forms; for the orbit with symmetry, we prove the decomposition theorem on Morse index and relative Morse index with respect to the group action and we call this generalized Bott-type iteration formula; as an application, we prove Theorem 1.1 which will be used for the stability of the figure-eight. In Sect. 3, for an orbit under general boundary conditions in a Lagrangian system, we study the relationship between its Morse index as a critical point of the action functional and its Maslov index in the corresponding first order Hamiltonian system, and prove Theorem 1.2; this is also inspired by studies in celestial mechanics. In Sect. 4, after reviewing the Maslov-type index for the symplectic paths, we give the stability/instability criteria by Maslov index; this is the core of the paper and can be used in the situations other than the figure-eight orbit. In Sect. 5, we analyze in detail the stability of the figure-eight orbit of the planar three-body problem with equal masses.

2. Maslov Index with Symmetry

2.1. Relative Morse index and Maslov index via spectral flow. As is well known, spectral flow was introduced by Atiyah, Patodi and Singer [APS] in their study of index theory on manifolds with boundary. Since then, this concept has been studied extensively. In particular, many interesting properties and applications to the Maslov index [CLM,CHH,RS2,ZL] have been established. For completeness, we recall here the basic properties which will be used later. Let $\{A(\theta), \theta \in [0, 1]\}$ be a continuous path of self-adjoint Fredholm operators on a Hilbert space $E$.
Roughly speaking, the spectral flow of the path $\{A(\theta), \theta \in [0, 1]\}$ counts the net change in the number of negative eigenvalues of $A(\theta)$ as $\theta$ goes from 0 to 1, where the enumeration follows from the rule that each negative eigenvalue crossing to the positive axis contributes $+1$ and each positive eigenvalue crossing to the negative axis contributes $-1$, and for each crossing the multiplicity of the eigenvalue is counted. More precisely, as shown in [APS], let

\[
\wp = \bigcup_{\theta \in [0,1]} \mathrm{Spec}(A(\theta)),
\]

which is a closed subset of the $(\theta, \lambda)$-plane; the spectral flow $Sf(\{A(\theta)\})$ is defined to be the intersection number of $\wp$ with the line $\lambda = -\epsilon$ with respect to the usual orientation for some small positive $\epsilon$.

In order to calculate the spectral flow, we will use the crossing operator introduced in [RS2]. Take a $C^1$-path $\{A(\theta), \theta \in [0, 1]\}$, which always exists in the homotopy class of the original path, and let $P_\theta$ be the orthogonal projection from $E$ to $E_0(A(\theta))$, the kernel of $A(\theta)$. When an eigenvalue crossing occurs at $A(\theta)$, the operator

\[
P_\theta \frac{\partial}{\partial\theta} A(\theta) P_\theta : E_0(A(\theta)) \to E_0(A(\theta)) \tag{2.1}
\]

is called a crossing operator, denoted by $Cr[A(\theta)]$. An eigenvalue crossing at $A(\theta)$ is said to be regular if the null space of $Cr[A(\theta)]$ is trivial. In this case, we define for each such $\theta$,

\[
\mathrm{sign}\, Cr[A(\theta)] = \dim E_+(Cr[A(\theta)]) - \dim E_-(Cr[A(\theta)]), \tag{2.2}
\]

where $E_+(Cr[A(\theta)])$ ($E_-(Cr[A(\theta)])$) is the positive (negative, respectively) definite subspace of the crossing operator $Cr[A(\theta)]$.
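To make the counting rule concrete, here is a minimal finite-dimensional sketch. It assumes a hypothetical diagonal path $A(\theta) = \mathrm{diag}(\lambda_1(\theta), \lambda_2(\theta), \lambda_3(\theta))$ given directly by its eigenvalue branches, and computes the intersection number with the line $\lambda = -\epsilon$ in two ways: from the endpoint counts of eigenvalues below $-\epsilon$, and by walking along each branch and recording signed crossings. The branches, the value of $\epsilon$ and the function names are illustrative assumptions.

```python
EPS = 1e-3  # small positive shift; crossings are counted on the line lambda = -EPS

# Hypothetical eigenvalue branches of a diagonal path A(theta), theta in [0, 1].
BRANCHES = [
    lambda t: t - 0.5,        # negative -> positive: contributes +1
    lambda t: 2.0 * t - 0.4,  # negative -> positive: contributes +1
    lambda t: 0.8 - t,        # positive -> negative: contributes -1
]


def negative_count(branches, theta, eps=EPS):
    """Number of eigenvalues of A(theta) lying below the line lambda = -eps."""
    return sum(1 for lam in branches if lam(theta) < -eps)


def spectral_flow_endpoints(branches, eps=EPS):
    """In finite dimensions the net upward crossing count is the endpoint difference."""
    return negative_count(branches, 0.0, eps) - negative_count(branches, 1.0, eps)


def spectral_flow_crossings(branches, eps=EPS, steps=1000):
    """Walk each branch on a grid and add +1 / -1 for upward / downward crossings."""
    flow = 0
    for lam in branches:
        above = lam(0.0) > -eps
        for j in range(1, steps + 1):
            now = lam(j / steps) > -eps
            if now and not above:
                flow += 1
            elif above and not now:
                flow -= 1
            above = now
    return flow


print(spectral_flow_endpoints(BRANCHES), spectral_flow_crossings(BRANCHES))
```

For a diagonal path the crossing operator (2.1) at a zero of a branch is just the derivative of that branch, so the three crossings here are regular with signs $+1$, $+1$, $-1$.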

744

X. Hu, S. Sun

Suppose that all the crossings are regular. Let S be the set of θ ∈ [0, 1] at which the crossing occurs. Then S contains only finitely many points. The spectral flow of {A(θ ), θ ∈ [0, 1]} is [CHH] S f ({A(θ ), θ ∈ [0, 1]}) = signCr [A(θ )] − dim E − (Cr [A(0)]) θ∈S∗

+ dim E + (Cr [A(1)]),

(2.3)

where S∗ = S ∩ (0, 1). In what follows, the spectral flow of {A(θ ), θ ∈ [0, 1]} will be simply denoted by S f ({A(θ )}) when the starting and end points of the flow are clear from the context. One can prove that [APS,ZL], S f ({A(θ )}) = S f ({A(θ ) + id}) if id is the identity operator on E and 0 ≤ ≤ 0 for some 0 sufficiently small positive number. Furthermore, there always exists some ∈ (0, 0 ) such that all the eigenvalue crossings occurred in {A(θ ) + id | θ ∈ [0, 1]} are regular [RS2] (Theorem 4.22). In fact, it holds for almost all ∈ (0, 0 ). Using homotopy invariance of spectral flow, we may assume, without loss of generality, that {A(θ ), θ ∈ [0, 1]} is continuously differentiable in θ and all the eigenvalue crossings are regular. Details can be found in [RS2,ZL]. ¯ if the difference A¯ − A is relative For two self-adjoint Fredholm operators A and A, compact [Ka] (p.194) with respect to A, then there is a path of self-adjoint Fredholm operators A(s) with end points A¯ and A such that A(s) − A is relative compact with respect to A for any s ∈ [0, 1]. In this case the spectral flow S f ({A(s), s ∈ [0, 1]}) will be independent of the path A(s) [ZL]. For such two operators we can define relative Morse index via spectral flow. Definition 2.1. The relative Morse index of A¯ with respect to A is defined to be the spectral flow ¯ = −S f ({A(s), s ∈ [0, 1]}). I (A, A) For another beautiful treatment of the relative Morse index, we refer the readers to [Ab]. Suppose A˜ is another self-adjoint Fredhlom operator such that A˜ − A is relative compact with respect to A, then ¯ + I ( A, ¯ A) ˜ = I (A, A). ˜ I (A, A)

(2.4)

Our main interest in this paper is Hamiltonian dynamics. Given a Hamiltonian system,

$$\dot z = J H'(t, z), \quad z \in \mathbb{R}^{2n}, \tag{2.5}$$

its periodic solutions have been extensively studied. In this paper we will consider its solutions under more general boundary conditions. Note that if $(\mathbb{R}^{2n}, \omega)$ is the standard symplectic vector space, then $\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}$ with the 2-form $(-\omega) \oplus \omega$ is also symplectic. Let $\Lambda$ be a Lagrangian subspace of $(\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}, (-\omega) \oplus \omega)$. The general boundary condition of (2.5) can be given by

$$(z(0), z(T)) \in \Lambda, \tag{2.6}$$

and we call this the Lagrangian boundary condition. It is easy to see that the periodic solution is a special case with $\Lambda = \Delta := \{(z, z) \mid z \in \mathbb{R}^{2n}\}$.

Stability of Symmetric Periodic Solutions in Hamiltonian Systems


If $z$ is a solution to (2.5), the fundamental solution of the linearized Hamiltonian system is $\gamma : [0, T] \to \mathrm{Sp}(2n)$ with $\gamma(0) = I_{2n}$. The corresponding linear operator is $-J\frac{d}{dt} - B(t)$ with $B(t) = H''(t, z(t))$. Let $D(T, \Lambda) = \{z \in W^{1,2}([0,T], \mathbb{R}^{2n}) \mid (z(0), z(T)) \in \Lambda\}$, then both $-J\frac{d}{dt}$ and $-J\frac{d}{dt} - B(t)$ are self-adjoint Fredholm operators with domain $D(T, \Lambda)$.

Now the relative Morse index $m^-(z)$ of $-J\frac{d}{dt} - B(t)$ with respect to $-J\frac{d}{dt}$ can be defined as above by spectral flow:

$$m^-(z) := I\Big(-J\frac{d}{dt},\, -J\frac{d}{dt} - B(t)\Big) = -Sf\Big(\Big\{-J\frac{d}{dt} - B_s(t),\ s \in [0,1]\Big\}\Big), \tag{2.7}$$

where $B_s(t)$ is a family of symmetric matrices with $s \in [0,1]$, $B_0(t) = 0_{2n}$, $B_1(t) = B(t)$, and each operator $-J\frac{d}{dt} - B_s(t)$ has domain $D(T, \Lambda)$.

In Hamiltonian system theory, there is another natural topological characterization for each orbit of the system, namely its Maslov index. First, we review the definition of the Maslov index [Ar,CLM]. For simplicity, we only consider the Maslov index of a path of Lagrangian subspaces with respect to a fixed Lagrangian subspace, which is enough for our purpose. The general case of two paths of Lagrangian subspaces is not treated here [CLM].

Let $\mathrm{Lag}(2n)$, the Lagrangian Grassmannian, be the set of Lagrangian subspaces of $(\mathbb{R}^{2n}, \omega)$. It is known that $\mathrm{Lag}(2n)$ is a manifold. For each fixed Lagrangian subspace $W \in \mathrm{Lag}(2n)$, set $O_j(W) = \{V \mid V \in \mathrm{Lag}(2n) \text{ and } \dim(V \cap W) = j\}$. It is a connected submanifold of codimension $j(j+1)/2$. The union of all strata $\bigcup_{j=1}^{n} O_j(W)$ is the closure of $O_1(W)$. This closure $\overline{O_1(W)}$ is a singular cycle of codimension 1, and it is called the Maslov cycle [Ar]. The top stratum $O_1(W)$ has a canonical transverse orientation. To be more precise, for each $\eta \in O_1(W)$, the path $\{e^{tJ}\eta, t \in (-\delta, \delta)\}$ of Lagrangian subspaces crosses $\overline{O_1(W)}$ transversally, and as $t$ increases the path points in the desired transverse direction. Thus this singular cycle is two-sidedly embedded in $\mathrm{Lag}(2n)$ [Ar].
Based on these properties, Arnold defined the Maslov index for a closed loop of Lagrangian subspaces via a transversality argument. This general position argument also works for a path of Lagrangian subspaces with end points not in the Maslov cycle. Motivated by concrete applications, the Maslov index is also considered for paths whose end points may lie anywhere. Let $W \in \mathrm{Lag}(2n)$, and let $\Lambda(t)$ be a continuous path in $\mathrm{Lag}(2n)$ with $t \in [a, b]$. The definition of the Maslov index is as follows:

Definition 2.2 [CLM].

$$\mu(W, \Lambda(t)) := [e^{-\varepsilon J}\Lambda(t) : \overline{O_1(W)}], \tag{2.8}$$

where the right-hand side is the intersection number and $0 < \varepsilon \ll 1$. For the readers' convenience, we give more explanation of the definition. For $\varepsilon$ small enough, $e^{-\varepsilon J}\Lambda(a)$ and $e^{-\varepsilon J}\Lambda(b)$ are not in the singular cycle [CLM]. For fixed end points of $e^{-\varepsilon J}\Lambda(t)$, there exists a perturbed path $\tilde\Lambda(t)$ such that: if $\tilde\Lambda(t) \cap \overline{O_1(W)} \neq \emptyset$, then $\tilde\Lambda(t) \cap \overline{O_1(W)} \subset O_1(W)$. In fact, we can require that $\tilde\Lambda(t)$ intersects $O_1(W)$ transversally and only finitely many times. Suppose $t_i$, $i = 1, \ldots, m$, are the crossing times, and define $s(t_i) = 1$ ($s(t_i) = -1$) if $\tilde\Lambda(t)$ near $t_i$ has the same (respectively opposite) direction as $e^{tJ}\tilde\Lambda(t_i)$. Then the intersection number is the sum of $s(t_i)$ taken over all such crossings.


X. Hu, S. Sun

All the intersection numbers used below are understood in this manner. We list several properties of the Maslov index; the details can be found in [CLM].

Property I (Reparametrization invariance). Let $\psi : [a, b] \to [c, d]$ be a continuous and piecewise smooth function with $\psi(a) = c$, $\psi(b) = d$, then

$$\mu(W, \Lambda(t)) = \mu(W, \Lambda(\psi(t))). \tag{2.9}$$

Property II (Homotopy invariance with respect to end points). For a continuous family of Lagrangian paths $\Lambda(s, t)$, $0 \le s \le 1$, $a \le t \le b$, such that $\dim(\Lambda(s, a) \cap W)$ and $\dim(\Lambda(s, b) \cap W)$ are constant, we have

$$\mu(W, \Lambda(0, t)) = \mu(W, \Lambda(1, t)). \tag{2.10}$$

Property III (Path additivity). If $a < c < b$, then

$$\mu(W, \Lambda(t)) = \mu(W, \Lambda(t)|_{[a,c]}) + \mu(W, \Lambda(t)|_{[c,b]}). \tag{2.11}$$

Property IV (Symplectic invariance). Let $\phi(t)$, $t \in [a, b]$, be a continuous path in $\mathrm{Sp}(2n)$, then

$$\mu(W, \Lambda(t)) = \mu(\phi(t)W, \phi(t)\Lambda(t)). \tag{2.12}$$

One efficient way to study the Maslov index is via the crossing form introduced by [RS1] as follows. Let $\Lambda(t)$ be a $C^1$-curve of Lagrangian subspaces with $\Lambda(0) = \Lambda$, and let $V$ be a fixed Lagrangian subspace which is transversal to $\Lambda$. For $v \in \Lambda$ and small $t$, define $w(t) \in V$ by $v + w(t) \in \Lambda(t)$. Then the form

$$Q(v) = \frac{d}{dt}\Big|_{t=0} \omega(v, w(t)) \tag{2.13}$$

is independent of the choice of $V$ [RS1]. A crossing for $\Lambda(t)$ is some $t$ for which $\Lambda(t)$ intersects $W$ nontrivially, i.e. for which $\Lambda(t) \in \overline{O_1(W)}$. The set of crossings is compact. At each crossing, the crossing form is defined to be

$$\Gamma(\Lambda(t), W, t) = Q|_{\Lambda(t) \cap W}. \tag{2.14}$$

A crossing is called regular if the crossing form is non-degenerate. If the path is given by $\Lambda(t) = \gamma(t)\Lambda$ with $\gamma(t) \in \mathrm{Sp}(2n)$ and $\Lambda$ some fixed Lagrangian subspace, then the crossing form is equal to $(-\gamma(t)^T J \dot\gamma(t) v, v)$, for $v \in \gamma(t)^{-1}(\Lambda(t) \cap W)$, where $(\cdot, \cdot)$ is the standard inner product on $\mathbb{R}^{2n}$. For $\Lambda(t)$ and $W$ as before, if the path only has regular crossings, following [LZ], the Maslov index is equal to

$$\mu(W, \Lambda(t)) = m^+(\Gamma(\Lambda(a), W, a)) + \sum_{a < t < b} \mathrm{Sign}(\Gamma(\Lambda(t), W, t)) - m^-(\Gamma(\Lambda(b), W, b)), \tag{2.15}$$

where the summation runs over all crossings $t \in (a, b)$, $m^+$ and $m^-$ are the dimensions of the positive and negative definite subspaces, and $\mathrm{Sign} = m^+ - m^-$ is the signature. Note that a $C^1$-path $\Lambda(t)$ with fixed end points can be made to have only regular crossings by a small perturbation.
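As a quick sanity check of the crossing-form expression and of formula (2.15), consider the rotation path $\gamma(t) = e^{tJ}$. The following worked example is only a sketch: it assumes the convention $J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}$, and the specific choice $n = 1$, $W = \Lambda_0 = \mathbb{R} \times \{0\}$, $[a,b] = [0, 3\pi/2]$ is ours, not the authors'.

```latex
% Crossing form along the rotation path \Lambda(t) = e^{tJ}\Lambda_0.
\begin{align*}
\dot\gamma(t) &= J\gamma(t), \qquad \gamma(t)^T\gamma(t) = I_{2n}
   \quad (\text{since } J^T = -J), \\
-\gamma(t)^T J \dot\gamma(t) &= -\gamma(t)^T J^2 \gamma(t)
   = \gamma(t)^T \gamma(t) = I_{2n}, \\
(-\gamma(t)^T J \dot\gamma(t)\, v, v) &= |v|^2 > 0 \quad (v \neq 0).
\end{align*}
% Every crossing is therefore regular with
% Sign = m^+ = \dim(\Lambda(t) \cap W) and m^- = 0.
% Illustration (n = 1, W = \Lambda_0 = \mathbb{R}\times\{0\}, t \in [0, 3\pi/2]):
% crossings occur at t = 0 (left end point) and t = \pi (interior), so
\[
\mu(W, \Lambda(t)) = \underbrace{1}_{m^+ \text{ at } t=0}
  + \underbrace{1}_{\mathrm{Sign} \text{ at } t=\pi}
  - \underbrace{0}_{m^- \text{ at } t=3\pi/2} = 2 .
\]
```

The positivity of the form along $e^{tJ}$ is consistent with the canonical transverse orientation of the Maslov cycle described earlier.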


To prove the generalized Bott iteration formula, we review the Maslov index theory of complex Lagrangian subspaces, the details of which can be found in [Z]. Let $(\mathbb{C}^{2n}, \omega)$ be the complex symplectic vector space with symplectic form $\omega(x, y) = \langle Jx, y \rangle$, $\forall x, y \in \mathbb{C}^{2n}$, with $\langle \cdot, \cdot \rangle$ the standard Hermitian inner product and $J$ defined as before. A complex subspace $V$ is Lagrangian if $\omega|_V = 0$ and $\dim_{\mathbb{C}} V = n$. Let $V^\pm = \ker(iJ \mp I_{2n})$, then any Lagrangian subspace can be expressed as $Gr(U) = \{(x, Ux) \mid x \in V^+\}$, where $U : V^+ \to V^-$ is some unitary matrix, and the converse is also true. This shows that the (complex) Lagrangian Grassmannian $\mathrm{Lag}(\mathbb{C}^{2n})$ is homeomorphic (isomorphic) to the unitary group $U(n)$; we denote this map by $F : \mathrm{Lag}(\mathbb{C}^{2n}) \to U(n)$. For $V_1, V_2 \in \mathrm{Lag}(\mathbb{C}^{2n})$, it is obvious that $\dim V_1 \cap V_2 = \dim \ker(F(V_2)^{-1}F(V_1) - I_n)$.

For any fixed $U \in U(n)$, let $\Sigma_U = \{U_0 \in U(n) \mid \det(U^{-1}U_0 - I_n) = 0\}$ be the singular cycle of $U$. For any $U_0 \in \Sigma_U$, the path $e^{it}U_0$ ($|t| < \varepsilon$) is transversal to $\Sigma_U$. Let $U(t)$ ($t \in [a, b]$) be any path in $U(n)$. For $\varepsilon > 0$ small enough, $e^{-\varepsilon i}U(a)$ and $e^{-\varepsilon i}U(b)$ are not in the singular cycle of $U$, and the intersection number $[e^{-\varepsilon i}U(t) : \Sigma_U]$ is well defined. For a path of complex Lagrangian subspaces $V(t)$ and $\Lambda \in \mathrm{Lag}(\mathbb{C}^{2n})$, we define the Maslov index to be

$$\mu(\Lambda, V(t)) := [e^{-\varepsilon i}F(V(t)) : \Sigma_{F(\Lambda)}]. \tag{2.16}$$

For a real Lagrangian space $V$, $V \otimes \mathbb{C} \in \mathrm{Lag}(\mathbb{C}^{2n})$. Let $\mathrm{Lag}_{\mathbb{R}}(\mathbb{C}^{2n}) := \{V = \bar V \mid V \in \mathrm{Lag}(\mathbb{C}^{2n})\}$, then $\mathrm{Lag}_{\mathbb{R}}(\mathbb{C}^{2n})$ is isomorphic to $\mathrm{Lag}(2n) \otimes \mathbb{C}$. Since for $V \in \mathrm{Lag}(\mathbb{C}^{2n})$, $F(e^{-\varepsilon J}V) = e^{-2\varepsilon i}F(V)$, the definition (2.16) is the same as (2.8) when the path consists of real Lagrangian subspaces. So the Maslov index of a path of real Lagrangian subspaces is the same as when it is considered as a complex one. As for the crossing form in the complex case, it is also exactly (2.13–2.14), and the Maslov index in the complex case can also be expressed by (2.15). The complex Maslov index also has Properties I–IV above.

On the other hand, the operators $-J\frac{d}{dt}$ and $-J\frac{d}{dt} - B_s(t)$ are also self-adjoint operators on $W^{1,2}([0,T], \mathbb{C}^{2n})$ with boundary condition $\Lambda \in \mathrm{Lag}(\mathbb{C}^{2n} \oplus \mathbb{C}^{2n}, -\omega \oplus \omega)$. The spectral flow of $-J\frac{d}{dt} - B_s(t)$ in the complex Hilbert space is the same as in the real one. We will use this complex Maslov index and the operator in the complex Hilbert space when we need it.

For the fundamental solution $\gamma(t)$, $t \in [0, T]$, of the linearized Hamiltonian system along the solution $z(t)$,

$$\frac{d}{dt}\gamma(t) = J B(t)\gamma(t), \tag{2.17}$$

$$\gamma(0) = I_{2n}, \tag{2.18}$$

with $B(t) = H''(t, z(t))$, the Maslov index of $z$ is defined to be

$$\mu(z) = \mu(\Lambda, Gr(\gamma(t))), \tag{2.19}$$

where $\Lambda \in \mathrm{Lag}(4n)$ is the boundary condition. We will show that, for each orbit of the Hamiltonian system, its relative Morse index is equal to its Maslov index under the Lagrangian boundary conditions.

As mentioned before, we can choose a path $-J\frac{d}{dt} - B_s(t)$, $s \in [0, 1]$, such that it only has regular crossings. Direct computation shows that $z_s \in \ker((-J\frac{d}{dt} - B_s(t))|_{D(T,\Lambda)})$ if and only if $z_s$ is a solution to the linear equation

$$\frac{dz_s(t)}{dt} = J B_s(t) z_s(t) \tag{2.20}$$

such that $(z_s(0), z_s(T)) \in \Lambda_1(s) \cap \Lambda$. Here $\Lambda_1(s) = Gr(\gamma_s(T)) \in \mathrm{Lag}(4n)$, with $\gamma_s(t)$ the fundamental solution of (2.20). For a Fredholm operator $A$, we denote by $m^0(A)$ the dimension of its kernel. The next lemma is obvious.

Lemma 2.3. $s$ is a crossing of $-J\frac{d}{dt} - B_s(t)$ if and only if it is a crossing of $\Lambda_1(s)$ with respect to $\Lambda$. Moreover,

$$m^0\Big(-J\frac{d}{dt} - B_s(t)\Big) = \dim(\Lambda_1(s) \cap \Lambda). \tag{2.21}$$

In particular, for $s = 1$, we have

$$m^0\Big(-J\frac{d}{dt} - B(t)\Big) = \dim(Gr(\gamma(T)) \cap \Lambda). \tag{2.22}$$

The following lemma is essentially from [RS2], and we give a proof here for the readers' convenience.

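Lemma 2.4. The signature of the crossing operator of the spectral flow is equal to the negative of the signature of the crossing form of the path of Lagrangian subspaces.

Proof. Let $B_s(t)$ be a symmetric matrix, where $t \in [0, T]$ and $s \in [0, 1]$, such that $B_0(t) = 0$, $B_1(t) = B(t)$. Consider the matrix equation

$$\frac{d}{dt}\gamma_s(t) = J B_s(t)\gamma_s(t), \tag{2.23}$$

$$\gamma_s(0) = I_{2n}. \tag{2.24}$$

Differentiating (2.23) with respect to $s$ and multiplying by $-\gamma_s^T(t)J$, we obtain

$$-\gamma_s^T(t) J \frac{\partial^2 \gamma_s(t)}{\partial s\,\partial t} = \gamma_s^T(t)\frac{\partial B_s(t)}{\partial s}\gamma_s(t) + \gamma_s^T(t)B_s(t)\frac{\partial \gamma_s(t)}{\partial s}.$$

Hence by direct calculations via (2.23),

$$-\int_0^T \gamma_s^T(t)\frac{\partial B_s(t)}{\partial s}\gamma_s(t)\,dt = \gamma_s^T(t) J \frac{\partial \gamma_s(t)}{\partial s}\Big|_0^T - \int_0^T \gamma_s^T(t)B_s(t)\frac{\partial \gamma_s(t)}{\partial s}\,dt + \int_0^T \gamma_s^T(t)B_s(t)\frac{\partial \gamma_s(t)}{\partial s}\,dt = \gamma_s^T(T) J \frac{\partial \gamma_s(T)}{\partial s}. \tag{2.25}$$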


Note that the left-hand side of (2.25) corresponds to the crossing operator of $-J\frac{d}{dt} - B_s(t)$ and the right-hand side is the negative of the crossing form defining the Maslov index for the path $\Lambda_1(s) \in \mathrm{Lag}(4n)$, so the proof is complete.
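The identity behind Lemma 2.4 can be checked by hand in the simplest case. The sketch below assumes the illustrative family $B_s(t) = s\,I_{2n}$ (our choice, not taken from the paper) and uses only $J^2 = -I_{2n}$ and $J^T = -J$.

```latex
% Illustrative check of the identity behind Lemma 2.4 with B_s(t) = s I_{2n}.
\begin{align*}
\gamma_s(t) &= e^{stJ}, \qquad \gamma_s^T(t)\gamma_s(t) = I_{2n}, \qquad
\frac{\partial \gamma_s(T)}{\partial s} = T J \gamma_s(T), \\[4pt]
-\int_0^T \gamma_s^T(t)\,\frac{\partial B_s(t)}{\partial s}\,\gamma_s(t)\,dt
  &= -\int_0^T \gamma_s^T(t)\gamma_s(t)\,dt = -T\, I_{2n}, \\[4pt]
\gamma_s^T(T)\, J\, \frac{\partial \gamma_s(T)}{\partial s}
  &= T\, \gamma_s^T(T) J^2 \gamma_s(T) = -T\, I_{2n}.
\end{align*}
% Both sides agree. The crossing form of the path \gamma_s(T) is
% -\gamma_s^T(T) J \partial_s \gamma_s(T) = +T I_{2n}, i.e. the negative
% of the common value above, as the lemma asserts.
```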

Theorem 2.5. Under the situation above,

$$\mu(z) = m^-(z). \tag{2.26}$$

Proof. From Lemmata 2.3 and 2.4, (2.3) and (2.15), we have

$$m^-(z) = \mu(\Lambda, \Lambda_1(s)), \quad s \in [0, 1]. \tag{2.27}$$

Let $\gamma_s(t)$ be the fundamental solution of (2.23), and set $\Lambda(s, t) = Gr(\gamma_s(t))$, which is a rectangle in the space of Lagrangian subspaces. Since $\Lambda_1(s)$ and $Gr(\gamma_1(t)) = Gr(\gamma(t))$ are two of its boundaries and the other two are the constant path $\Delta$, this shows that $\Lambda_1(s)$ and $Gr(\gamma(t))$ are paths homotopic to each other with the same end points. The result then follows from the homotopy invariance of the Maslov index.

Remark 2.6. 1. For the periodic solutions of Hamiltonian systems, this theorem has been established by many authors with different methods (see [Ab,L3] and references therein for the details).

2. Let $\gamma(t)$ be a path of symplectic matrices, $\Lambda = \Lambda_1 \oplus \Lambda_2 \in \mathrm{Lag}(4n)$, where $\Lambda_i \in \mathrm{Lag}(2n)$ for $i = 1, 2$, then following [RS1], by computing the crossing form, we have

$$\mu(\Lambda_1 \oplus \Lambda_2, Gr(\gamma(t))) = \mu(\Lambda_2, \gamma(t)\Lambda_1). \tag{2.28}$$

2.2. Index of orbits with symmetry. Let $E$ be a Hilbert space and $g$ be a unitary operator on $E$ which satisfies $g^m = \mathrm{id}_E$. Then $G = \{g^k, k = 1, \ldots, m\}$ is isomorphic to the group $\mathbb{Z}_m$. Let $E_i = \ker(g - \omega_i)$, $i = 1, \ldots, m$, where $\omega_i = \exp(2\pi\sqrt{-1}\,\frac{i}{m})$. Obviously $E_m$ is the $G$-invariant subspace. From the spectral theory of normal operators, we know that the $E_i$ are mutually orthogonal and $E = E_1 \oplus \cdots \oplus E_m$. Suppose $A$ is a closed linear operator on $E$ which commutes with $g$, i.e. $gA = Ag$, then $E_i$ is an invariant subspace of $A$ for all $i$. Set $A_i = A|_{E_i}$, then we have the decomposition $A = A_1 + \cdots + A_m$. If $A$ is a self-adjoint Fredholm operator commuting with $g$, then

$$m^0(A) = m^0(A_1) + \cdots + m^0(A_m); \tag{2.29}$$

if its Morse index is finite, then

$$m^-(A) = m^-(A_1) + \cdots + m^-(A_m). \tag{2.30}$$

Moreover, if $A(t)$, $t \in [0, 1]$, is a path of self-adjoint Fredholm operators such that $A(t)g = gA(t)$ for any $t$, and we set $A_i(t) = A(t)|_{E_i}$, then

$$Sf(A(t)) = Sf(A_1(t)) + \cdots + Sf(A_m(t)). \tag{2.31}$$

Suppose $f$ is a $G$-invariant functional on $E$, that is, $f(gx) = f(x)$ for any $x \in E$, $g \in G$, then $f_m = f|_{E_m}$ is the restriction of $f$ to the $G$-invariant space. The well-known Palais principle tells us that if $x$ is a critical point of $f_m$ on $E_m$, then it is also a critical point of $f$ on $E$. This can be seen as follows: since $f(x) = f(gx)$, we have $f'(x) = g^* f'(gx)$ and hence $g f'(x) = f'(gx)$. This shows that if $x \in E_m$, then $f'(x) \in E_m$, which implies the result.
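The decomposition $E = E_1 \oplus \cdots \oplus E_m$ can be made concrete in finite dimensions. The sketch below is only an illustration under our own assumptions: it takes $g$ to be the cyclic shift on $\mathbb{C}^3$ (so $m = 3$) and builds the projections onto $E_i = \ker(g - \omega_i)$ with the standard averaging formula $P_i = \frac{1}{m}\sum_{k=0}^{m-1} \omega_i^{-k} g^k$. The helper names are hypothetical.

```python
import cmath


def identity(n):
    """n x n identity matrix with complex entries."""
    return [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def eigen_projection(g, m, i):
    """Projection onto ker(g - w_i), w_i = exp(2*pi*sqrt(-1)*i/m), assuming g^m = id."""
    n = len(g)
    omega = cmath.exp(2j * cmath.pi * i / m)
    proj = [[0j] * n for _ in range(n)]
    power = identity(n)  # holds g^k, starting from g^0
    for k in range(m):
        coeff = omega ** (-k) / m
        for r in range(n):
            for c in range(n):
                proj[r][c] += coeff * power[r][c]
        power = mat_mul(power, g)
    return proj


# Illustrative choice: the cyclic shift on C^3, a unitary operator of order 3.
shift = [[0, 0, 1],
         [1, 0, 0],
         [0, 1, 0]]
m = 3
projections = [eigen_projection(shift, m, i) for i in range(1, m + 1)]

# The dimension of each E_i is the trace of the corresponding projection.
dims = [round(sum(p[k][k] for k in range(3)).real) for p in projections]
print(dims)
```

Any operator commuting with `shift` maps each range of `projections[i]` into itself, which is the finite-dimensional mechanism behind the additivity formulas (2.29)–(2.31).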


Moreover, for $x \in E_m$, one can also show that $f''(x) = g^* f''(x) g$, which means $g f''(x) = f''(x) g$, $\forall g \in G$. Since $f''(x)$ is a self-adjoint operator, we can decompose its Morse index or relative Morse index by (2.30) or (2.31). For our Hamiltonian system (2.5–2.6), the orbits are the critical points of the functional

$$f(z) = \int_0^T \Big[\Big(-J\frac{dz(t)}{dt}, z(t)\Big) - H(t, z(t))\Big]\,dt$$

on the space $\overline{D(T, \Lambda)}$ (where the closure is under the $H^{1/2}$-norm). Suppose $f(gz) = f(z)$ and $g(-J\frac{d}{dt}) = (-J\frac{d}{dt})g$ for any $g \in G$, then for any critical point on the $G$-invariant subspace of $D(T, \Lambda)$, $f''(z) = -J\frac{d}{dt} - H''(t, z(t))$ and $H''(t, z(t))$ commutes with $g$. Let $A(s) = -J\frac{d}{dt} - sH''(t, z(t))$, then $gA(s) = A(s)g$. From (2.31), we have

Theorem 2.7. Suppose $g(-J\frac{d}{dt}) = (-J\frac{d}{dt})g$ for any $g \in G$, and $z \in D(T, \Lambda)_m$ is a solution of the Hamiltonian system (2.5–2.6). If the functional $f$ is $G$-invariant, then the relative Morse index of $-J\frac{d}{dt} - H''(t, z(t))$ with respect to $-J\frac{d}{dt}$ is

$$I\Big(-J\frac{d}{dt},\, -J\frac{d}{dt} - H''(t, z(t))\Big) = \sum_{i=1}^{m} I\Big(-J\frac{d}{dt}\Big|_{D(T,\Lambda)_i},\, \Big(-J\frac{d}{dt} - H''(t, z(t))\Big)\Big|_{D(T,\Lambda)_i}\Big), \tag{2.32}$$

and

$$m^0\Big(-J\frac{d}{dt} - H''(t, z(t))\Big) = \sum_{i=1}^{m} m^0\Big(\Big(-J\frac{d}{dt} - H''(t, z(t))\Big)\Big|_{D(T,\Lambda)_i}\Big). \tag{2.33}$$

We call this theorem the generalized Bott-type iteration formula; the reason for the name comes from the Type $I_H$ group action. We know that if there is a group action on a function space, then there is a decomposition of the space such that the relative Morse index is equal to the sum of the relative indices on each subspace. If the relative Morse index on each subspace is equal to the Maslov index with the corresponding boundary condition, then we get a decomposition of the Maslov index.

Proof of Theorem 1.1. For the Type $I_H$ group action, if $z$ is a critical point of the action functional $f$ on the $G$-invariant subspace $E_m$, then we can decompose the relative Morse index and Maslov index by (2.31). Note that $E_i = \{z \in E \mid Sz(t + \frac{T}{m}) = \exp(\frac{i}{m}2\pi\sqrt{-1})z(t)\}$ for $i = 1, \ldots, m$. From Theorem 2.5 above,

$$I\Big(-J\frac{d}{dt}\Big|_{E_i},\, \Big(-J\frac{d}{dt} - H''(t, z(t))\Big)\Big|_{E_i}\Big) = \mu\Big(Gr\Big(\exp\Big(\frac{i}{m}2\pi\sqrt{-1}\Big)S^T\Big),\, Gr(\gamma(t));\ t \in [0, T/m]\Big). \tag{2.34}$$

So (1.6) follows from Theorem 2.7. For the Type $II_H$ group action, let $z$ be the symmetric solution of the Hamiltonian system with the boundary condition $z(0) = Sz(T)$, that is, it satisfies $Nz(T - t) = z(t)$. Then $z$ is a critical point of the functional $f$ on the $\mathbb{Z}_2$-invariant subspace of $E$.

According to the group action, the whole space $E$ decomposes into the orthogonal direct sum of

$$E_1 = \{z \in E \mid gz(t) = z(t)\} = \{z \in W^{1,2}([0, \tfrac{T}{2}], \mathbb{R}^{2n}) \mid z(0) \in V^+(SN),\ z(\tfrac{T}{2}) \in V^+(N)\},$$

and

$$E_2 = \{z \in E \mid gz(t) = -z(t)\} = \{z \in W^{1,2}([0, \tfrac{T}{2}], \mathbb{R}^{2n}) \mid z(0) \in V^-(SN),\ z(\tfrac{T}{2}) \in V^-(N)\}.$$

Now, (1.7) follows from (2.28), Theorem 2.5 and Theorem 2.7.

Remark 2.8. 1. In 1956, Bott obtained his celebrated iteration formula for the Morse index of closed geodesics [B], and it was generalized by [BTZ,CD,E]. The precise iteration formula for general Hamiltonian systems was established by Long [L2]. In fact, the iteration can be regarded as a special group action. In our theorem, we generalize this to general group actions, which also include the brake symmetry studied in [LZZ]. We note that the iteration formula for brake symmetry is studied in [Liu] in a different way.

2. Our result is motivated by the recent progress in n-body problems, where symmetric solutions are found by minimizing the Lagrangian action functional on some symmetric loop space. It is interesting to note that the famous figure-eight solution has mixed symmetry of Type $I_H$ and Type $II_H$ ([R] and Sect. 5).

3. Morse Index of Lagrangian System

Hamiltonian systems encountered in applications often come from Lagrangian systems. In this section we switch to this kind of system, and obtain relations between the Morse index and the relative Morse index of orbits of the Lagrangian system. Duistermaat [D] has shown that the intersection theory of curves of Lagrangian subspaces is a very flexible tool in studying the Morse index of Lagrangian systems. Based on this, An and Long (see [L3] for further details) and Viterbo [V] give a clear relation between the Morse index and the Maslov-type index under the periodic boundary condition. Zhu [Z] studies the higher derivative case.

Our purpose here is to give the relation between the Morse index and the Maslov index which covers the applications to the n-body problem. For $x \in W^{1,2}([0, T], \mathbb{R}^n)$, set

$$F(x) = \int_0^T L(t, x, \dot x)\,dt, \tag{3.1}$$

where $L \in C^2([0, T] \times \mathbb{R}^{2n}, \mathbb{R})$ and it satisfies the Legendre convexity condition:

$$\Big(\frac{\partial^2 L}{\partial v^2}(t, u, v)w, w\Big) > 0 \quad \text{for } t \in [0, T],\ w \in \mathbb{R}^n \setminus \{0\},\ (u, v) \in \mathbb{R}^n \times \mathbb{R}^n. \tag{3.2}$$

Let $V$ be a subspace of $\mathbb{R}^n \oplus \mathbb{R}^n$. The solution to the Euler-Lagrange equation with the boundary condition $(x(0), x(T)) \in V$ corresponding to the Lagrangian $L$ is the critical point of $F$ in the space

$$\bar E_V = \{x \in W^{1,2}([0, T], \mathbb{R}^n) \mid (x(0), x(T)) \in V\}.$$

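Moreover, $x$ is a stationary curve for the boundary condition $V$ if and only if

$$\frac{d}{dt}\frac{\partial L}{\partial v}(t, x, \dot x) - \frac{\partial L}{\partial u}(t, x, \dot x) = 0, \tag{3.3}$$

$$(x(0), x(T)) \in V, \quad \Big(\frac{\partial L}{\partial v}(0, x(0), \dot x(0)),\ -\frac{\partial L}{\partial v}(T, x(T), \dot x(T))\Big) \in V^\perp, \tag{3.4}$$

where $V^\perp$ is the orthogonal complement of $V$ in $\mathbb{R}^n \oplus \mathbb{R}^n$.

Using the Legendre transformation $y = \frac{\partial L}{\partial v}(t, x, \dot x)$ and $H(t, y, x) = y \cdot \dot x - L(t, x, \dot x)$, (3.3) can be converted into

$$\dot z = J H'(t, z) \tag{3.5}$$

with $z(t) = (y(t), x(t)) = (\frac{\partial L}{\partial v}(t, x(t), \dot x(t)), x(t))$.

For the phase space $(\mathbb{R}^{2n}, \omega)$, the boundary conditions of the Hamiltonian system can be expressed by a Lagrangian subspace of $(\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}, -\omega \oplus \omega)$. Here $\mathcal{J} := (-J) \oplus J$ is the standard symplectic matrix, and $\mathbb{R}^n \oplus \mathbb{R}^n$ is a Lagrangian subspace of $(\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}, -\omega \oplus \omega)$. Note that $V, V^\perp \subset \mathbb{R}^n \oplus \mathbb{R}^n$, and let

$$\Lambda(V) = \mathcal{J}V^\perp \oplus V. \tag{3.6}$$

It is obvious that $\dim(\Lambda(V)) = 2n$ and $(-\omega \oplus \omega)|_{\Lambda(V)} = 0$, so $\Lambda(V)$ is a Lagrangian subspace of $(\mathbb{R}^{2n} \oplus \mathbb{R}^{2n}, -\omega \oplus \omega)$. Now (3.3–3.4) is equivalent to (3.5) with the boundary condition

$$(z(0), z(T)) \in \Lambda(V). \tag{3.7}$$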

Suppose $x$ is an extremal of $F$ in $\bar E_V$. The index form of $x$ is given by

$$\mathcal{I}(\xi, \eta) = \int_0^T \{(P\dot\xi + Q\xi) \cdot \dot\eta + Q^T\dot\xi \cdot \eta + R\xi \cdot \eta\}\,dt, \quad \xi, \eta \in \bar E_V, \tag{3.8}$$

where $P(t) = \frac{\partial^2 L}{\partial v^2}(t, x(t), \dot x(t))$, $Q(t) = \frac{\partial^2 L}{\partial u \partial v}(t, x(t), \dot x(t))$ and $R(t) = \frac{\partial^2 L}{\partial u^2}(t, x(t), \dot x(t))$. The Hessian of $F$ at $x$ is given by

$$\mathcal{I}(\xi, \eta) = \langle F''(x)\xi, \eta \rangle.$$

The linearization of (3.3) at $x$ is given by

$$-\frac{d}{dt}(P(t)\dot y + Q(t)y) + Q^T(t)\dot y + R(t)y = 0, \tag{3.9}$$

and $y$ is a solution of (3.9) if and only if $y \in \ker(\mathcal{I})$. This linear Sturm system (3.9) corresponds to the linear Hamiltonian system

$$\dot z = J B(t) z, \quad z \in \mathbb{R}^{2n}, \tag{3.10}$$

where

$$B(t) = \begin{pmatrix} P^{-1}(t) & -P^{-1}(t)Q(t) \\ -Q^T(t)P^{-1}(t) & Q^T(t)P^{-1}(t)Q(t) - R(t) \end{pmatrix}. \tag{3.11}$$

Note that $y$ is a solution to (3.9) if and only if $z \equiv z(t) = (P(t)\dot y(t) + Q(t)y(t), y(t))$ is a solution of (3.10). We have
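The correspondence between the Sturm system (3.9) and the Hamiltonian system (3.10) with $B(t)$ as in (3.11) can be verified in two lines. The sketch below assumes the convention $J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}$ and writes $z = (p, y)$, where the symbol $p$ is introduced here only for illustration.

```latex
% Sketch: (3.10) with B as in (3.11) reproduces the Sturm system (3.9).
\[
JB = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}
     \begin{pmatrix} P^{-1} & -P^{-1}Q \\ -Q^{T}P^{-1} & Q^{T}P^{-1}Q - R \end{pmatrix}
   = \begin{pmatrix} Q^{T}P^{-1} & -(Q^{T}P^{-1}Q - R) \\ P^{-1} & -P^{-1}Q \end{pmatrix}.
\]
\begin{align*}
\dot y &= P^{-1}p - P^{-1}Qy
  &&\Longleftrightarrow\quad p = P\dot y + Qy, \\
\dot p &= Q^{T}P^{-1}p - (Q^{T}P^{-1}Q - R)\,y
  &&= Q^{T}(\dot y + P^{-1}Qy) - Q^{T}P^{-1}Qy + Ry
   = Q^{T}\dot y + Ry .
\end{align*}
% Substituting p = P\dot y + Qy into the last line gives
% -\frac{d}{dt}(P\dot y + Qy) + Q^{T}\dot y + Ry = 0, which is (3.9).
```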


Lemma 3.1. Under the above settings,

$$\dim \ker(\mathcal{I}) = \dim(Gr(\gamma(T)) \cap \Lambda(V)), \tag{3.12}$$

where $\gamma$ is the fundamental solution of (3.10).

Lemma 3.2. If $\begin{pmatrix} P(t) & Q(t) \\ Q^T(t) & R(t) \end{pmatrix} > 0$ for any $t$, then the linear Hamiltonian system (3.10) has no nontrivial solution under the boundary condition $\Lambda(V)$.

Proof. Define the Lagrangian density by

$$L(t, x(t), \dot x(t)) = \tfrac{1}{2}\big((P\dot x + Qx) \cdot \dot x + Q^T\dot x \cdot x + Rx \cdot x\big), \tag{3.13}$$

then $L(t, x(t), \dot x(t)) \ge 0$ and it is equal to 0 only at the point $(t, 0, 0)$. The Euler-Lagrange equation of $F(x) = \int_0^T L(t, x(t), \dot x(t))\,dt$ is (3.9). Suppose $x$ is a solution of (3.9), then

$$2F(x) = \int_0^T \Big(-\frac{d}{dt}(P(t)\dot x + Q(t)x) + Q^T(t)\dot x + R(t)x\Big) \cdot x\,dt = 0. \tag{3.14}$$

This shows that $x(t) \equiv 0$. Changing the Lagrangian system into a Hamiltonian system gives exactly (3.10) and (3.11), so there is no nontrivial solution to it.

Definition 3.3. We define $C(\Lambda)$ as

$$C(\Lambda) := I\Big(-J\frac{d}{dt},\, -J\frac{d}{dt} - C\Big), \tag{3.15}$$

where the operators on the right-hand side are taken with respect to the domain $D(T, \Lambda)$, and $C = \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix}$.

Theorem 3.4. For a Lagrangian system which satisfies the Legendre condition, under the boundary condition $\Lambda(V)$, the Morse index $m^-(F''(x))$ of $x$ and the relative Morse index $m^-(z)$ of $z$ satisfy

$$m^-(F''(x)) + C(\Lambda) = m^-(z). \tag{3.16}$$

Proof. First note that

$$m^-(F''(x)) = Sf(\{F''(x) + sI_n,\ s \in [0, s_0]\}), \tag{3.17}$$

where $s_0$ is chosen large enough such that $\begin{pmatrix} P(t) & Q(t) \\ Q^T(t) & R(t) + s_0 I_n \end{pmatrix} > 0$ under condition (3.2). On the other hand, passing from the linear Sturm system (3.9) to the linear Hamiltonian system (3.10), we get the operator $-J\frac{d}{dt} - B_s(t)$, where

$$B_s(t) = \begin{pmatrix} P^{-1}(t) & -P^{-1}(t)Q(t) \\ -Q^T(t)P^{-1}(t) & Q^T(t)P^{-1}(t)Q(t) - R(t) - sI_n \end{pmatrix}, \quad s \in [0, s_0].$$

By computing the crossing operator, it is easy to see that

$$Sf\Big(\Big\{-J\frac{d}{dt} - B_s(t),\ s \in [0, s_0]\Big\}\Big) = Sf(\{F''(x) + sI_n,\ s \in [0, s_0]\}). \tag{3.18}$$

So we have

$$I\Big(-J\frac{d}{dt}, -J\frac{d}{dt} - B(t)\Big) = I\Big(-J\frac{d}{dt}, -J\frac{d}{dt} - C\Big) + I\Big(-J\frac{d}{dt} - C, -J\frac{d}{dt} - B_{s_0}(t)\Big) + I\Big(-J\frac{d}{dt} - B_{s_0}(t), -J\frac{d}{dt} - B(t)\Big), \tag{3.19}$$

where the left-hand side is the relative Morse index $m^-(z)$ and

$$I\Big(-J\frac{d}{dt} - B_{s_0}(t), -J\frac{d}{dt} - B(t)\Big) = Sf\Big(\Big\{-J\frac{d}{dt} - B_s(t),\ s \in [0, s_0]\Big\}\Big), \tag{3.20}$$

hence it is the Morse index $m^-(F''(x))$. Let

$$K^\nu = \begin{pmatrix} P^\nu(t) & Q^\nu(t) \\ Q^{T,\nu}(t) & R^\nu(t) \end{pmatrix} > 0$$

for $\nu \in [0, 1]$, $t \in [0, T]$, and such that $K^0 = I_{2n}$, $K^1 = \begin{pmatrix} P(t) & Q(t) \\ Q^T(t) & R(t) + s_0 I_n \end{pmatrix}$. Let $\gamma^\nu$ be the fundamental solution of the corresponding Hamiltonian system, that is, $\dot\gamma^\nu(t) = J B^\nu(t)\gamma^\nu(t)$, with $B^0(t) \equiv C$ and $B^1(t) = B_{s_0}(t)$. Then from Lemma 3.2, $Gr(\gamma^\nu(T)) \cap \Lambda(V) = 0$, or equivalently

$$\ker\Big(-J\frac{d}{dt} - B^\nu(t)\Big) = 0, \quad \nu \in [0, 1].$$

This shows that

$$I\Big(-J\frac{d}{dt} - C, -J\frac{d}{dt} - B_{s_0}(t)\Big) = -Sf\Big(\Big\{-J\frac{d}{dt} - B^\nu(t),\ \nu \in [0, 1]\Big\}\Big) = 0, \tag{3.21}$$

which completes the proof.

Remark 3.5. In fact, in the proof of Theorem 3.4, we have obtained

$$m^-(\mathcal{I}) = I\Big(-J\frac{d}{dt}, -J\frac{d}{dt} - B(t)\Big) - I\Big(-J\frac{d}{dt}, -J\frac{d}{dt} - C\Big), \tag{3.22}$$

where $\mathcal{I}$ is the index form on $\bar E_V = \{x \in W^{1,2}([0, T], \mathbb{R}^n) \mid (x(0), x(T)) \in V\}$ with $V$ a subspace of $\mathbb{R}^n \oplus \mathbb{R}^n$, and the operators appearing on the right-hand side of (3.22) are self-adjoint on $L^2([0, T], \mathbb{R}^{2n})$ with domain $D(T, \Lambda(V)) = \{z \in W^{1,2}([0, T], \mathbb{R}^{2n}) \mid (z(0), z(T)) \in \Lambda(V)\}$. Note that (3.22) is valid for any subspace $V \subset \mathbb{R}^n \oplus \mathbb{R}^n$; it need not be the boundary condition that is used to find the critical point.

Moreover, both $\mathcal{I}$ and the operators can be extended to the complex case. $\mathcal{I}$ can be considered as a symmetric form on

$$\tilde E_V = \{x \in W^{1,2}([0, T], \mathbb{C}^n) \mid (x(0), x(T)) \in V\},$$

where $V$ is a subspace of $\mathbb{C}^n \oplus \mathbb{C}^n$. On the other hand, the operators appearing on the right-hand side of (3.22) are self-adjoint on $L^2([0, T], \mathbb{C}^{2n})$ with domain $\tilde D(T, \Lambda(V)) = \{z \in W^{1,2}([0, T], \mathbb{C}^{2n}) \mid (z(0), z(T)) \in \Lambda(V)\}$, where $\Lambda(V) = \mathcal{J}V^\perp \oplus V$ is a complex Lagrangian subspace of $(\mathbb{C}^{2n} \oplus \mathbb{C}^{2n}, -\omega \oplus \omega)$. Equation (3.22) is correct in this complex case.

Fix an $m \in \mathbb{N}$, and let $\bar G$ be the orthogonal group acting on $W^{1,2}([0, T], \mathbb{R}^n)$, which satisfies $\bar g^m = \mathrm{id}$ for any $\bar g \in \bar G$. Suppose $F$ is $\bar G$-invariant; it is often useful to find critical points of $F$ on the $\bar G$-invariant subspaces. Corresponding to the Type $I_H$ and Type $II_H$ group actions, we have two kinds of $\bar G$-actions as follows:

Type $I_L$. For the generator $\bar g$ of $\bar G$, set

$$\bar g : \bar E \to \bar E, \quad x \mapsto \bar g \circ x \equiv \bar S x(t + \tfrac{T}{m}),$$

where $\bar S$ is an orthogonal matrix such that $\bar S^m = \mathrm{id}$, hence $\bar g^m = \mathrm{id}$. Suppose the Lagrangian satisfies

$$L(t, u, v) = L(t - T/m, \bar S u, \bar S v). \tag{3.23}$$

Then the action functional $F = \int_0^T L(t, x, \dot x)\,dt$ is $\bar G$-invariant. The $\bar G$-invariant subspace is isomorphic to

$$\bar E_m = \{x \in W^{1,2}(\mathbb{R}/T\mathbb{Z}, \mathbb{R}^n) \mid x(t) = \bar S x(t + T/m)\}.$$

Type $II_L$. Let $\bar N$ be an involutive orthogonal matrix on $\mathbb{R}^n$ with $\bar N = \bar N^T$, $\bar N^2 = I_n$, and $\bar S$ an orthogonal matrix on $\mathbb{R}^n$, which satisfy

$$\bar N \bar S^T = \bar S \bar N. \tag{3.24}$$

Let $(\bar g \circ x)(t) = \bar N x(T - t)$ generate the group action, and suppose the Lagrangian is such that

$$L(t, u, v) = L(T - t, \bar N u, \bar N v), \tag{3.25}$$

then the action functional is $\bar G$-invariant on $\bar E$ with boundary condition $x(0) = \bar S x(T)$. Let $V_1 = V^+(\bar S \bar N)$ and $V_2 = V^+(\bar N)$, then the $\bar G$-invariant subspace is isomorphic to

$$\{x \in W^{1,2}([0, T/2], \mathbb{R}^n) \mid x(0) \in V_1,\ x(T/2) \in V_2\}.$$

Motivated by the last two examples, we compute $C(\Lambda_1)$ and $C(\Lambda_2)$ as defined in Definition 3.3 for Type $I_L$ and $II_L$ boundary conditions correspondingly, with $\Lambda_1$ and $\Lambda_2$ given by (1.9) and (1.10). Forgetting the group action for a moment, let $\bar S$ be any orthogonal matrix on $\mathbb{R}^n$, and $V_1$, $V_2$ be arbitrary subspaces of $\mathbb{R}^n$; we also call these boundary Type $I_L$ ($\Lambda_1$) and Type $II_L$ ($\Lambda_2$).

Proof of Theorem 1.2. From Theorem 2.5 and formula (2.28), we only need to compute $C(\Lambda_1)$ and $C(\Lambda_2)$.

Boundary Type $I_L$. Recall that in this case, in coordinates $z(t) = (y(t), x(t))$, $z(0) = Sz(T)$ with $S = \begin{pmatrix} \bar S & 0 \\ 0 & \bar S \end{pmatrix}$ and $\bar S$ being an orthogonal matrix on $\mathbb{R}^n$. Note that in this case, we can let $\gamma(t) = \exp(JCt)$, where $C = \begin{pmatrix} I_n & 0 \\ 0 & -I_n \end{pmatrix}$. Direct computation shows that

$$\gamma(t) = \begin{pmatrix} \cosh(t)I_n & \sinh(t)I_n \\ \sinh(t)I_n & \cosh(t)I_n \end{pmatrix}, \quad t \in [0, T].$$

The eigenvalues of $\gamma(t)$ are $e^t$ and $e^{-t}$. Note that $S\gamma(t) = \gamma(t)S$. Since $\bar S$ is orthogonal, the crossing occurs at the eigenvectors of the possible eigenvalue 1 of $\bar S$. The crossing form is positive definite on the $y$-space, and negative definite on the $x$-space. We let $\nu_1(\bar S)$ be the dimension of the eigenspace of $\bar S$ for the eigenvalue 1. So we have

$$C(\Lambda_1) = \nu_1(\bar S). \tag{3.26}$$

Boundary Type $II_L$. For fixed $t$, suppose $(y, x) \in \gamma(t)\bar\Lambda_i \cap \bar\Lambda_f$ with $\bar\Lambda_i = JV_1^\perp \oplus V_1$ and $\bar\Lambda_f = JV_2^\perp \oplus V_2$, then there exists $(y_1, x_1) \in \bar\Lambda_i$ such that $\gamma(t)(y_1, x_1) = (y, x)$. Note that $\langle y_1, Jx_1 \rangle = 0$ and $\langle y, Jx \rangle = 0$. The crossing occurs only when $t = 0$, and the intersection space is

$$\bar\Lambda_i \cap \bar\Lambda_f = (JV_1^\perp \cap JV_2^\perp) \oplus (V_1 \cap V_2). \tag{3.27}$$

For the same reason as above, the crossing form is positive definite on the $y$-space, and negative definite on the $x$-space. We have

$$C(\Lambda_2) = \dim(V_1^\perp \cap V_2^\perp). \tag{3.28}$$

Remark 3.6. 1. Equations (3.26) and (3.28) are also useful in the complex case where $\bar S$ is unitary on $\mathbb{C}^n$. In particular, let $\bar S = \omega I_n$, $\omega \in \mathbb{U}$. The case $\omega = 1$ is the periodic boundary condition with $\bar S = I_n$, so

$$C(\Lambda) = n, \quad \omega = 1, \tag{3.29}$$

$$C(\Lambda) = 0, \quad \omega \neq 1. \tag{3.30}$$

2. Consider the second order system

$$L(t, x, \dot x) = \tfrac{1}{2}|\dot x(t)|^2 - V(t, x), \tag{3.31}$$

where $V \in C^2(\mathbb{R} \times \mathbb{R}^n, \mathbb{R})$. For the Dirichlet boundary condition $\Lambda_D$, the solution satisfies $x(0) = x(T) = 0$, that is, $V_1 = V_2 = \{0\}$, so we get

$$C(\Lambda_D) = n. \tag{3.32}$$

For the Neumann boundary condition $\Lambda_N$, $\dot x(0) = \dot x(T) = 0$, so $x(0), x(T) \in \mathbb{R}^n$ by (3.4), that is, $V_1 = V_2 = \mathbb{R}^n$, and then we have

$$C(\Lambda_N) = 0. \tag{3.33}$$


4. Criteria to Stability

4.1. A brief review on an index theory for symplectic paths. In this subsection, we briefly recall the index theory for symplectic paths in the form which will be used later. All the details can be found in [L3]. At the end we give its relation to the Maslov index defined in [CLM].

As usual, the topology of the symplectic group $\mathrm{Sp}(2n)$ is induced from that of $\mathbb{R}^{4n^2}$. For $T > 0$, we are interested in paths in $\mathrm{Sp}(2n)$:

$$\mathcal{P}_T(2n) = \{\gamma \in C([0, T], \mathrm{Sp}(2n)) \mid \gamma(0) = I_{2n}\},$$

which is equipped with the topology induced from that of $\mathrm{Sp}(2n)$. The following real function was introduced in [L1]:

$$D_\omega(M) = (-1)^{n-1}\,\bar\omega^n \det(M - \omega I_{2n}), \quad \forall \omega \in \mathbb{U},\ M \in \mathrm{Sp}(2n).$$

Then for any $\omega \in \mathbb{U}$, the following codimension one hypersurface in $\mathrm{Sp}(2n)$ is defined [L1]:

$$\mathrm{Sp}(2n)^0_\omega = \{M \in \mathrm{Sp}(2n) \mid D_\omega(M) = 0\}.$$

For any $M \in \mathrm{Sp}(2n)^0_\omega$, we define a co-orientation of $\mathrm{Sp}(2n)^0_\omega$ at $M$ by the positive direction $\frac{d}{dt}Me^{tJ}|_{t=0}$ of the path $Me^{tJ}$ with $t \ge 0$ sufficiently small. Let

$$\mathrm{Sp}(2n)^*_\omega = \mathrm{Sp}(2n) \setminus \mathrm{Sp}(2n)^0_\omega.$$

For any two continuous arcs $\xi$ and $\eta : [0, T] \to \mathrm{Sp}(2n)$ with $\xi(T) = \eta(0)$, their concatenation is defined as usual, namely:

$$\eta * \xi(t) = \begin{cases} \xi(2t), & \text{if } 0 \le t \le T/2, \\ \eta(2t - T), & \text{if } T/2 \le t \le T. \end{cases}$$

Given any two $2m_k \times 2m_k$ matrices of square block form $M_k = \begin{pmatrix} A_k & B_k \\ C_k & D_k \end{pmatrix}$ with $k = 1, 2$, the $\diamond$-product of $M_1$ and $M_2$ is defined by the following $2(m_1 + m_2) \times 2(m_1 + m_2)$ matrix $M_1 \diamond M_2$:

$$M_1 \diamond M_2 = \begin{pmatrix} A_1 & 0 & B_1 & 0 \\ 0 & A_2 & 0 & B_2 \\ C_1 & 0 & D_1 & 0 \\ 0 & C_2 & 0 & D_2 \end{pmatrix}.$$

Denote by $M^{\diamond k}$ the $k$-fold $\diamond$-product $M \diamond \cdots \diamond M$. Note that the $\diamond$-product of any two symplectic matrices is symplectic. For any two paths $\gamma_j \in \mathcal{P}_T(2n_j)$ with $j = 0$ and $1$, let $\gamma_0 \diamond \gamma_1(t) = \gamma_0(t) \diamond \gamma_1(t)$ for all $t \in [0, T]$. A special path $\xi_n \in \mathcal{P}_T(2n)$ is defined to be

$$\xi_n(t) = \begin{pmatrix} 2 - \frac{t}{T} & 0 \\ 0 & (2 - \frac{t}{T})^{-1} \end{pmatrix}^{\diamond n} \quad \text{for } 0 \le t \le T. \tag{4.1}$$

Let $D(a) = \mathrm{diag}[a, a^{-1}]$, $M_n^+ = D(2)^{\diamond n}$, $M_n^- = D(-2) \diamond D(2)^{\diamond (n-1)}$.
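The $\diamond$-product and the function $D_\omega$ can be explored with a few lines of code. The sketch below is only an illustration under our own assumptions: it uses $J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}$, reads the prefactor in $D_\omega$ as $\bar\omega^n$, and the helper names (`diamond`, `d_omega`, `is_symplectic`) are hypothetical.

```python
import cmath
import math


def diamond(m1, m2):
    """Diamond product of two matrices given in 2x2 block form (lists of rows)."""
    k1, k2 = len(m1) // 2, len(m2) // 2
    n = k1 + k2
    out = [[0.0] * (2 * n) for _ in range(2 * n)]
    for m, k, off in ((m1, k1, 0), (m2, k2, k1)):
        for r in range(k):
            for c in range(k):
                out[off + r][off + c] = m[r][c]                  # block A_k
                out[off + r][n + off + c] = m[r][k + c]          # block B_k
                out[n + off + r][off + c] = m[k + r][c]          # block C_k
                out[n + off + r][n + off + c] = m[k + r][k + c]  # block D_k
    return out


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in matrix]
    n, result = len(a), 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def d_omega(m, omega):
    """D_omega(M) = (-1)^(n-1) * conj(omega)^n * det(M - omega I_2n)."""
    n = len(m) // 2
    shifted = [[m[r][c] - (omega if r == c else 0) for c in range(2 * n)]
               for r in range(2 * n)]
    return (-1) ** (n - 1) * omega.conjugate() ** n * det(shifted)


def is_symplectic(m, tol=1e-12):
    """Check M^T J M = J for J = [[0, -I], [I, 0]] (assumed convention)."""
    n = len(m) // 2
    size = 2 * n
    j = [[0.0] * size for _ in range(size)]
    for k in range(n):
        j[k][n + k], j[n + k][k] = -1.0, 1.0
    jm = [[sum(j[r][k] * m[k][c] for k in range(size)) for c in range(size)]
          for r in range(size)]
    mtjm = [[sum(m[k][r] * jm[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]
    return all(abs(mtjm[r][c] - j[r][c]) < tol for r in range(size) for c in range(size))


def d_matrix(a):
    return [[a, 0.0], [0.0, 1.0 / a]]


def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]


m_plus = diamond(d_matrix(2.0), d_matrix(2.0))    # M_2^+ = D(2) <> D(2)
m_minus = diamond(d_matrix(-2.0), d_matrix(2.0))  # M_2^- = D(-2) <> D(2)
mixed = diamond(rotation(0.3), d_matrix(2.0))     # arbitrary illustrative matrix

print(d_omega(m_plus, 1 + 0j), d_omega(m_minus, 1 + 0j))
print(is_symplectic(mixed), d_omega(mixed, cmath.exp(0.7j)))
```

For $n = 2$ and $\omega = 1$ the two values come out as $-0.25$ and $2.25$, so $M_2^+$ and $M_2^-$ lie on opposite sides of the hypersurface $D_1 = 0$. The last line illustrates that the $\diamond$-product of symplectic matrices is symplectic and that $D_\omega$ is real up to rounding for $\omega$ on the unit circle.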


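Lemma 4.1 [L3] (pp. 58–59). For $\omega \in \mathbb{U}$, the set $\mathrm{Sp}(2n)^*_\omega$ possesses precisely two path connected components $\mathrm{Sp}(2n)^+_\omega$ and $\mathrm{Sp}(2n)^-_\omega$, which are simply connected in $\mathrm{Sp}(2n)$. Moreover, $M_n^+ \in \mathrm{Sp}(2n)^+_\omega$ and $M_n^- \in \mathrm{Sp}(2n)^-_\omega$.

Definition 4.2. For any $\omega \in \mathbb{U}$ and $\gamma \in \mathcal{P}_T(2n)$, we define

$$\nu_\omega(\gamma) = \dim_{\mathbb{C}} \ker_{\mathbb{C}}(\gamma(T) - \omega I_{2n}), \tag{4.2}$$

and

$$i_\omega(\gamma) = [e^{-\varepsilon J}\gamma * \xi_n : \mathrm{Sp}(2n)^0_\omega], \tag{4.3}$$

the intersection number, with $\varepsilon$ a small enough positive number.

Definition 4.3. For any $M \in \mathrm{Sp}(2n)$ and $\omega \in \mathbb{U}$, the splitting numbers $S_M^\pm(\omega)$ of $M$ at $\omega$ are defined by

$$S_M^\pm(\omega) = \lim_{\epsilon \to 0^+} i_{\omega\exp(\pm\sqrt{-1}\,\epsilon)}(\gamma) - i_\omega(\gamma), \tag{4.4}$$

for any path $\gamma \in \mathcal{P}_T(2n)$ satisfying $\gamma(T) = M$.

The normal form of a symplectic matrix is its Jordan block form under symplectic transformations. The following special symplectic matrices are very important as $2 \times 2$ normal forms in this index theory:

$$D(\lambda) = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix}, \quad \lambda = \pm 2, \tag{4.5}$$

$$N_1(\lambda, b) = \begin{pmatrix} \lambda & b \\ 0 & \lambda \end{pmatrix}, \quad \lambda = \pm 1,\ b = \pm 1, 0, \tag{4.6}$$

$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \theta \in (0, \pi) \cup (\pi, 2\pi). \tag{4.7}$$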

Splitting numbers have the following properties:

Lemma 4.4 [L3] (pp. 191). Splitting numbers $S_M^\pm(\omega)$ are well defined, i.e., they are independent of the choice of the path $\gamma \in \mathcal{P}_T(2n)$ satisfying $\gamma(T) = M$. For $\omega \in \mathbb{U}$ and $M \in \mathrm{Sp}(2n)$, splitting numbers $S_N^\pm(\omega)$ are constant for all $N = P^{-1}MP$, with $P \in \mathrm{Sp}(2n)$.

Lemma 4.5 [L3] (pp. 198–199). For $M \in \mathrm{Sp}(2n)$ and $\omega \in \mathbb{U}$, $\theta \in (0, \pi)$, there hold

$$S_M^\pm(\omega) = 0, \quad \text{if } \omega \notin \sigma(M), \tag{4.8}$$

$$S_M^\pm(\omega) = S_M^\mp(\bar\omega), \tag{4.9}$$

$$0 \le S_M^\pm(\omega) \le \dim \ker(M - \omega I), \tag{4.10}$$

$$S_M^-(\omega) + S_M^+(\omega) \le \dim \ker(M - \omega I)^{2n}, \quad \omega \in \sigma(M), \tag{4.11}$$

$$(S^+_{N_1(1,b)}(1), S^-_{N_1(1,b)}(1)) = \begin{cases} (1, 1), & \text{if } b = 0, 1, \\ (0, 0), & \text{if } b = -1, \end{cases} \tag{4.12}$$

$$(S^+_{N_1(-1,a)}(-1), S^-_{N_1(-1,a)}(-1)) = \begin{cases} (1, 1), & \text{if } a = 0, -1, \\ (0, 0), & \text{if } a = 1, \end{cases} \tag{4.13}$$

$$(S^+_{R(\theta)}(e^{\sqrt{-1}\theta}), S^-_{R(\theta)}(e^{\sqrt{-1}\theta})) = (0, 1), \tag{4.14}$$

$$(S^+_{R(2\pi-\theta)}(e^{\sqrt{-1}\theta}), S^-_{R(2\pi-\theta)}(e^{\sqrt{-1}\theta})) = (1, 0). \tag{4.15}$$

For any $M_i \in \mathrm{Sp}(2n_i)$ with $i = 0$ and $1$, there holds

$$S^\pm_{M_0 \diamond M_1}(\omega) = S^\pm_{M_0}(\omega) + S^\pm_{M_1}(\omega), \quad \forall \omega \in \mathbb{U}. \tag{4.16}$$

From the definition and property of splitting numbers, for any γ ∈ PT (2n) with γ (T ) = M, we have − − + + i ω (γ ) = i 1 (γ ) + S M (1) + (S M (ω0 ) − S M (ω0 )) − S M (ω), (4.17) ω0

where the sum runs over all the eigenvalues ω0 of M belonging to the part of U+ or U− strictly located between 1 and ω. The next lemma is from [LZ2] (Cor. 2.1). Lemma 4.6. For any γ ∈ PT (2n), we have i 1 (γ ) + n = µ( , Gr (γ (t))),

(4.18)

i ω (γ ) = µ(Gr (ω), Gr (γ (t))), ω ∈ U\{1},

(4.19)

and

where is the diagonal Gr (I2n ) and Gr (ω) = Gr (ωI2n ). Note that µ(Gr (ωS T ), Gr (γ (t))) = µ(Gr (ω), Gr (Sγ (t)))

(4.20)

for any symplectic orthogonal matrix $S$. We choose any path $\xi$ in $\mathrm{Sp}(2n)$ connecting $I_{2n}$ to $S$. Let $\tilde\gamma$ be the path $S\gamma(t)$ with $t \in [0, T]$, and let $\tilde\gamma * \xi$ be the concatenation of $\xi$ and $\tilde\gamma$. We have, for any $\omega \in \mathbf{U}$,

$$\mu(\mathrm{Gr}(\omega), \mathrm{Gr}(\tilde\gamma)) = i_\omega(\tilde\gamma * \xi) - i_\omega(\xi), \tag{4.21}$$

which plays an important role in the stability analysis later on. This can be deduced from the additivity of the Maslov index with respect to concatenation of symplectic paths, together with (4.18) and (4.19). From [L3] (p. 120) and the definition of the Maslov-type index, for any $\omega \in \mathbf{U}$ and $\gamma \in \mathcal{P}_T(2n)$ we have

$$i_\omega(\gamma) = i_{\bar\omega}(\gamma). \tag{4.22}$$

From (4.20)–(4.22), we have

$$\mu(\mathrm{Gr}(\omega S^T), \mathrm{Gr}(\gamma(t))) = \mu(\mathrm{Gr}(\bar\omega S^T), \mathrm{Gr}(\gamma(t))) \tag{4.23}$$

for any $\omega \in \mathbf{U}$ and $\gamma \in \mathcal{P}_T(2n)$.


X. Hu, S. Sun

For the case $Q = S^m = I$, we can express the general Bott-type formula (1.6) by the Maslov-type index of symplectic paths as

$$i_1(x) + n = \sum_{i=1}^{m} \Bigl( i_{\exp(2\pi i \sqrt{-1}/m)}(\tilde\gamma * \xi) - i_{\exp(2\pi i \sqrt{-1}/m)}(\xi) \Bigr), \tag{4.24}$$

with $\xi$ chosen as above.

4.2. Stability and index. In this section and the following, we give some statements on the stability and instability of periodic solutions of Hamiltonian systems via indices of the orbits. Recall that $M \in \mathrm{Sp}(2n)$ is linearly stable if $\|M^j\|$ is bounded for all $j \in \mathbf{N}$. Note that this implies that $M$ is diagonalizable and that the eigenvalues of $M$ are all on the unit circle $\mathbf{U}$ of the complex plane. We call $M$ spectrally stable if all its eigenvalues are on the unit circle. We denote by $e(M)$ the total algebraic multiplicity of all eigenvalues of $M$ on $\mathbf{U}$.

Definition 4.7. Given a $T$-periodic solution $z(t)$ of a first order Hamiltonian system with fundamental solution $\gamma(t)$, we say $z$ is spectrally stable (linearly stable) if $\gamma(T)$ is spectrally stable (linearly stable, respectively).

In the literature there are many papers concerning the stability of periodic solutions of Hamiltonian systems using the Maslov-type index [E,L2,L3,LZ1]. The complete iteration formula developed by Long and his collaborators is a very effective tool for this purpose. On the other hand, there are papers [DDE] on the stability of $\mathbf{Z}_2$-symmetric orbits. Motivated by these results and the problem at hand, i.e., the stability of the figure-eight solution, we give some criteria for the stability or instability of symmetric periodic orbits via the Maslov-type index. Let $\gamma_z$ be the fundamental solution of a solution $z$ which has Type I H symmetry. It is easy to see that (compare [R])

$$\gamma_z\Bigl(t + \frac{T}{m}\Bigr) = S^T \gamma_z(t)\, S \gamma_z\Bigl(\frac{T}{m}\Bigr), \tag{4.25}$$

so we have

$$\gamma_z\Bigl(\frac{kT}{m}\Bigr) = (S^k)^T \Bigl(S\gamma_z\Bigl(\frac{T}{m}\Bigr)\Bigr)^k. \tag{4.26}$$

Since $S^m = I_{2n}$, we get

$$\gamma_z(T) = \Bigl(S\gamma_z\Bigl(\frac{T}{m}\Bigr)\Bigr)^m. \tag{4.27}$$

So the linear or spectral stability of $\gamma_z(T)$ is the same as that of $S\gamma_z(T/m)$. For $M \in \mathrm{Sp}(2n)$, choose any path $\gamma$ of symplectic matrices connecting $I_{2n}$ to $M$; then the difference of the $\omega$-indices for $\omega$ in $\mathbf{U}$ gives a lower bound for $e(M)$. Recall that var is the total variation of a function, defined in (1.14). Note that $i_\omega(\gamma)$ is a function of $\omega \in \mathbf{U}$. We have the following lemma, which is a consequence of Lemma 4.4 and Lemma 4.5:


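Lemma 4.8. With the notations above,

$$e(M)/2 \ge \mathrm{var}\bigl(i_{e^{\theta\sqrt{-1}}}(\gamma),\ \theta \in [0, \pi]\bigr), \tag{4.28}$$

where $\gamma$ is an arbitrary path in $\mathrm{Sp}(2n)$ connecting $I_{2n}$ to $M$.

Proof. From (4.17),

$$\mathrm{var}\bigl(i_{e^{\theta\sqrt{-1}}}(\gamma),\ \theta \in [0, \pi]\bigr) = S_M^{+}(1) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) + S_M^{-}(\omega_0)\bigr) + S_M^{-}(-1), \tag{4.29}$$

where the sum runs over all the eigenvalues $\omega_0$ of $M$ belonging to the part of $\mathbf{U}^{+}$ strictly located between $1$ and $-1$. From (4.9), $S_M^{+}(1) = S_M^{-}(1)$ and $S_M^{+}(-1) = S_M^{-}(-1)$; then (4.11) implies that

$$S_M^{+}(1) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) + S_M^{-}(\omega_0)\bigr) + S_M^{-}(-1) \le \tfrac{1}{2}\bigl(\dim\ker(M - I_{2n})^{2n} + \dim\ker(M + I_{2n})^{2n}\bigr) + \sum_{\omega_0} \dim\ker(M - \omega_0 I_{2n}) \le e(M)/2. \tag{4.30}$$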
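The notions used in Lemma 4.8 can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the helper names `count_unit_circle_eigs` and `powers_stay_bounded` are hypothetical, the boundedness test is only a finite heuristic (a fixed number of powers and an arbitrary threshold), and the ordinary block-diagonal sum is used in place of the symplectic sum because it has the same spectrum. It shows that $N_1(1,1)$ is spectrally but not linearly stable, that a rotation is linearly stable, and that $e(M) = 4$ for the sum of the two.

```python
import numpy as np

def rotation(theta):
    """R(theta) as in (4.7)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def n1(lam, b):
    """N_1(lam, b) = [[lam, b], [0, lam]] as in (4.6)."""
    return np.array([[lam, b], [0.0, lam]], dtype=float)

def count_unit_circle_eigs(M, tol=1e-6):
    """e(M): number of eigenvalues on the unit circle, with algebraic multiplicity."""
    moduli = np.abs(np.linalg.eigvals(M))
    return int(np.sum(np.abs(moduli - 1.0) < tol))

def powers_stay_bounded(M, n_iter=200, bound=50.0):
    """Heuristic test of linear stability: do the first n_iter powers stay below bound?"""
    P = np.eye(M.shape[0])
    for _ in range(n_iter):
        P = P @ M
        if np.linalg.norm(P) > bound:
            return False
    return True

jordan = n1(1.0, 1.0)           # eigenvalue 1 with a nontrivial Jordan block
elliptic = rotation(1.0)        # eigenvalues exp(+-i) on the unit circle
hyperbolic = np.diag([2.0, 0.5])

zeros = np.zeros((2, 2))
combined = np.block([[jordan, zeros], [zeros, elliptic]])

for name, M in [("N1(1,1)", jordan), ("R(1)", elliptic),
                ("diag(2,1/2)", hyperbolic), ("N1(1,1)+R(1)", combined)]:
    print(name, count_unit_circle_eigs(M), powers_stay_bounded(M))
```

Spectral stability only looks at the moduli of the eigenvalues, while linear stability also rules out nontrivial Jordan blocks, which is why $N_1(1,1)$ passes the first test and fails the second.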

Definition 4.9. Given $M \in \mathrm{Sp}(2n)$, for any symplectic path $\eta$ from $I_{2n}$ to $M$, we define $D_\omega(M)$ for $\omega \in \mathbf{U}$ by

$$D_\omega(M) = i_\omega(\eta) - i_1(\eta). \tag{4.31}$$

By (4.17),

$$D_\omega(M) = i_\omega(\eta) - i_1(\eta) = S_M^{+}(1) + \sum_{i=1}^{k} \bigl(S_M^{+}(\omega_i) - S_M^{-}(\omega_i)\bigr) - S_M^{-}(\omega), \tag{4.32}$$

where the sum runs over all the eigenvalues $\omega_i$, $i = 1, \ldots, k$, of $M$ belonging to the part of $\mathbf{U}^{+}$ or $\mathbf{U}^{-}$ strictly located between $1$ and $\omega$. So the definition is independent of the choice of $\eta$, i.e., it only depends on its end point $M$ and on $\omega$.

Proof of Theorem 1.3. Suppose $\gamma_z(t)$, $t \in [0, T/m]$, is the fundamental solution of the Hamiltonian system with Type I H symmetry, and set $\tilde\gamma(t) = S\gamma_z(t)$, $t \in [0, T/m]$. For a symplectic path $\xi$ in $\mathrm{Sp}(2n)$ connecting $I_{2n}$ to $S$, as considered at the end of the last subsection, we have by (4.20), (4.21) and (4.31),

$$i_\omega(\tilde\gamma * \xi) - i_1(\xi) = i_\omega(\tilde\gamma * \xi) - i_\omega(\xi) + i_\omega(\xi) - i_1(\xi) = \mu(\mathrm{Gr}(\omega), \mathrm{Gr}(\tilde\gamma)) + D_\omega(S) = \mu(\mathrm{Gr}(\omega S^T), \mathrm{Gr}(\gamma_z)) + D_\omega(S), \tag{4.33}$$

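and by Lemma 4.8 and (4.19),

$$e\bigl(S\gamma_z(T/m)\bigr)/2 \ge \mathrm{var}\bigl(i_\omega(\tilde\gamma * \xi)\bigr) = \mathrm{var}\bigl(i_\omega(\tilde\gamma * \xi) - i_1(\xi)\bigr) = \mathrm{var}\bigl(\mu(\mathrm{Gr}(e^{\theta\sqrt{-1}} S^T), \mathrm{Gr}(\gamma_z(t))) + D_{e^{\theta\sqrt{-1}}}(S),\ \theta \in [0, \pi]\bigr). \tag{4.34}$$

Now, the result follows from (4.27).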


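Remark 4.10. 1. Set $M = S\gamma_z(T/m)$; by (4.32) and (4.33),

$$D_\omega(M) = i_\omega(\tilde\gamma * \xi) - i_1(\xi) = \mu(\mathrm{Gr}(\omega S^T), \mathrm{Gr}(\gamma_z)) + D_\omega(S). \tag{4.35}$$

For $0 \le \theta_1 \le \theta_2 \le \pi$, by (4.33) again,

$$\mathrm{var}\bigl(\mu(\mathrm{Gr}(e^{\theta\sqrt{-1}} S^T), \mathrm{Gr}(\gamma_z)) + D_{e^{\theta\sqrt{-1}}}(S),\ \theta \in [\theta_1, \theta_2]\bigr) = \mathrm{var}\bigl(D_{e^{\theta\sqrt{-1}}}(M),\ \theta \in [\theta_1, \theta_2]\bigr) = S_M^{+}(e^{\theta_1\sqrt{-1}}) + \sum_{i=1}^{k} \bigl(S_M^{+}(\omega_i) + S_M^{-}(\omega_i)\bigr) + S_M^{-}(e^{\theta_2\sqrt{-1}}), \tag{4.36}$$

where the sum is taken over all the eigenvalues of $M$ belonging to $\mathbf{U}^{+}$ strictly located between $e^{\theta_1\sqrt{-1}}$ and $e^{\theta_2\sqrt{-1}}$.

2. In certain cases, we can simply replace $\mathrm{var}(\gamma)$ by $|i_1(\gamma) - i_{-1}(\gamma)|$. This is used to study orbits on symmetric compact convex hypersurfaces [DDE].

4.3. Instability and index. If $M \in \mathrm{Sp}(2n)$ is linearly stable as defined in Sect. 4.2, then there exists $P \in \mathrm{Sp}(2n)$ such that ([BTZ], p. 223, Remark (c))

$$P^{-1} M P = I_2^{\diamond j} \diamond M_{j+1} \diamond \cdots \diamond M_n, \tag{4.37}$$

where $M_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$, $\theta_i \in (0, 2\pi)$. Moreover, $\det(M_i - I_2) > 0$, which means that $\det(e^{-\varepsilon J} P^{-1} M P - I_{2n}) > 0$ for real $\varepsilon > 0$ small enough. In fact, we can prove the following stronger property.

Lemma 4.11. If $M \in \mathrm{Sp}(2n)$ is linearly stable, then

$$\det(e^{-\varepsilon J} M - I_{2n}) > 0. \tag{4.38}$$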

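Proof. Let $\gamma(\varepsilon, s) = e^{-\varepsilon J} P_s^{-1} M P_s$, $P_s \in \mathrm{Sp}(2n)$, $(\varepsilon, s) \in [0, \varepsilon_0] \times [0, 1]$, where $\varepsilon_0 > 0$ is small enough, $P_0 = I_{2n}$ and $P_1 = P$ is the symplectic matrix in (4.37). We compute the crossing form at $\varepsilon = 0$:

$$-J \Bigl(\frac{d}{d\varepsilon}\gamma(\varepsilon, s)\Bigr) \gamma^{-1}(\varepsilon, s)\Big|_{\varepsilon=0} = -J \Bigl[\frac{d}{d\varepsilon}\bigl(e^{-\varepsilon J} P_s^{-1} M P_s\bigr)\Bigr] P_s^{-1} M^{-1} P_s e^{\varepsilon J}\Big|_{\varepsilon=0} = -I_{2n}. \tag{4.39}$$

This shows that $\varepsilon = 0$ is a regular crossing. So $\gamma(\varepsilon, s) \in \mathrm{Sp}(2n)^{*}$ for $\varepsilon > 0$ small enough and $s \in [0, 1]$; this implies that $\gamma(\varepsilon, 0)$ and $\gamma(\varepsilon, 1)$ are in the same connected component of $\mathrm{Sp}(2n)^{*}$. The result follows from $\det(e^{-\varepsilon J} P^{-1} M P - I_{2n}) > 0$.

Let $\tilde\gamma$ be a symplectic path connecting $S$ to $M$ as before, where $S$ is a symplectic orthogonal matrix. Then, by the definition (2.16) of the Maslov index, we have

$$\mu(\Delta, \mathrm{Gr}(\tilde\gamma)) = \mu(\Delta, \mathrm{Gr}(e^{-\varepsilon J}\tilde\gamma)). \tag{4.40}$$

Recall that the Maslov index is defined by the intersection number of the path with the singular cycle. We have the following observation.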


Proposition 4.12. Let $\tilde\gamma$ be a symplectic path from $S$ to $M$ as above. If $\mu(\Delta, \mathrm{Gr}(\tilde\gamma))$ is odd, then $M$ is linearly unstable.

Proof. Note that for $\varepsilon > 0$ small enough, by the definition (2.16),

$$\mu(\Delta, \mathrm{Gr}(\tilde\gamma)) = [e^{-\varepsilon J}\tilde\gamma : \mathrm{Sp}(2n)_1^0], \tag{4.41}$$

where the right-hand side is the intersection number as explained before. Since $S$ is a symplectic orthogonal matrix, $e^{-\varepsilon J} S$ is also a symplectic orthogonal matrix, so it is linearly stable and must satisfy $\det(e^{-\varepsilon J} S - I_{2n}) > 0$. If $M$ is linearly stable, then $\det(e^{-\varepsilon J} M - I_{2n}) > 0$, so the intersection number $[\mathrm{Sp}(2n)_1^0 : e^{-\varepsilon J}\tilde\gamma]$ is even, which completes the proof.

Proof of Theorem 1.4. In fact, if $\mu(\mathrm{Gr}(S^T), \mathrm{Gr}(\gamma_z(t)), t \in [0, T/m]) = \mu(\Delta, \mathrm{Gr}(S\gamma_z(t)), t \in [0, T/m])$ is odd, then $S\gamma_z(T/m)$ is linearly unstable by Proposition 4.12. The result follows from $\gamma_z(T) = (S\gamma_z(T/m))^m$.

In Hamiltonian systems, it is well known that each first integral yields multipliers $1$ of the linearized Poincaré map. Each first integral gives a $2 \times 2$ matrix $N_1(1, b)$ with $b = -1, 0, 1$. In this case, it is natural to define the stability of orbits in the reduced phase space. More precisely, if $P^{-1} M P = M_1 \diamond M_2$, where $M_1$ is the part coming from the symmetry, we call $M$ linearly stable (unstable) if $M_2$ is linearly stable (unstable, respectively). Now we can state our main theorem on linear instability.

Theorem 4.13. Let $\tilde\gamma$ be a symplectic path from $S$ to $M$ as before. If $\det(e^{-\varepsilon J} M_1 - I_{2k}) > 0$ (or $\det(e^{-\varepsilon J} M_1 - I_{2k}) < 0$) and $\mu(\Delta, \mathrm{Gr}(\tilde\gamma))$ is odd (even, respectively), then $M$ is linearly unstable.

Proof. Suppose $\det(e^{-\varepsilon J} M_1 - I_{2k}) > 0$. If $M_2$ is linearly stable, then $\det(e^{-\varepsilon J} P^{-1} M P - I_{2n}) = \det(e^{-\varepsilon J} M_1 - I_{2k}) \cdot \det(e^{-\varepsilon J} M_2 - I_{2(n-k)}) > 0$. So $\det(e^{-\varepsilon J} M - I_{2n}) > 0$ by the proof of Lemma 4.11. This implies that $\mu(\Delta, \mathrm{Gr}(\tilde\gamma))$ must be even by Proposition 4.12, a contradiction. The proof for $\det(e^{-\varepsilon J} M_1 - I_{2k}) < 0$ is similar, and we omit it.

Suppose

$$M_1 = N_1(1, 1)^{\diamond i} \diamond N_1(1, -1)^{\diamond j} \diamond I_2^{\diamond l}, \tag{4.42}$$

then

$$\det(e^{-\varepsilon J} N_1(1, 1) - I_2) > 0, \tag{4.43}$$

$$\det(e^{-\varepsilon J} N_1(1, -1) - I_2) < 0. \tag{4.44}$$
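The signs in (4.43) and (4.44) can be checked numerically in dimension 2. This is a minimal sketch under the assumption that $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (the convention under which $\dot z = J H'(z)$ with $z = (y, x)$ reproduces Newton's equations); with the opposite sign of $J$ the two inequalities would swap. The helper `det_shifted` is a hypothetical name. Expanding the determinant by hand gives $2 - 2\cos\varepsilon + b\sin\varepsilon$ for $N_1(1, b)$, which the code also compares against.

```python
import numpy as np

# Assumed convention for the standard 2x2 symplectic matrix J.
J2 = np.array([[0.0, -1.0], [1.0, 0.0]])

def exp_minus_eps_J(eps):
    """exp(-eps*J) = cos(eps) I - sin(eps) J, since J^2 = -I."""
    return np.cos(eps) * np.eye(2) - np.sin(eps) * J2

def n1(lam, b):
    return np.array([[lam, b], [0.0, lam]], dtype=float)

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def det_shifted(M, eps):
    """det(exp(-eps*J) M - I_2)."""
    return np.linalg.det(exp_minus_eps_J(eps) @ M - np.eye(2))

eps = 1e-3
for b in (1.0, -1.0, 0.0):
    closed_form = 2.0 - 2.0 * np.cos(eps) + b * np.sin(eps)
    print("N1(1,%+d):" % b, det_shifted(n1(1.0, b), eps), closed_form)

# A rotation block, as in (4.37), always gives a positive determinant.
print("R(1.0):", det_shifted(rotation(1.0), eps))
```

For small $\varepsilon > 0$ the term $b\sin\varepsilon$ dominates $2 - 2\cos\varepsilon \approx \varepsilon^2$, so only the blocks $N_1(1, -1)$ contribute a negative sign, which is the parity count behind Corollary 4.14.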

So, we have

Corollary 4.14. $M$ is linearly unstable if $\mu(\Delta, \mathrm{Gr}(\tilde\gamma)) - j$ is odd.

Remark 4.15. As mentioned in the Introduction, a famous result of Poincaré states that a closed minimizing geodesic on a Riemann surface is unstable. It is easy to see this from our theory as follows. Since the geodesic is a minimizer, its Morse index is 0. On the other hand, for a closed orientable geodesic regarded as a periodic orbit on a surface, $C(\Lambda) = 2$. By Theorem 3.4, the Maslov index is 2. Moreover, in this case $j$ in (4.42) is 1, and the instability follows immediately from the last corollary. The details will be given elsewhere.


5. Stability Analysis of the Figure-Eight Orbit

5.1. Introduction to the figure-eight orbit. In this section, we study the relation between the index theory developed in the previous sections and the stability of the celebrated figure-eight orbit discovered by Chenciner and Montgomery [CM] in the planar three-body problem with equal masses. Without loss of generality, we set all the masses to be 1. In this paper we only consider the problem with fixed center of mass, i.e., modulo Galilean symmetry. One of the main goals of the planar three-body problem is to solve the Newtonian system of equations

$$\frac{d^2 x_i}{dt^2} = \frac{\partial U}{\partial x_i}(x_1, x_2, x_3), \quad x_i \in \mathbf{R}^2, \quad i = 1, 2, 3, \tag{5.1}$$

where

$$U(x_1, x_2, x_3) = \sum_{i<j} \frac{1}{\|x_i - x_j\|}$$

is the potential or force function, with $\|\cdot\|$ the standard norm in $\mathbf{R}^2$. So the configuration space is $\hat X = X \setminus \Delta$, where $X = \{x = (x_1, x_2, x_3) \in (\mathbf{R}^2)^3 \mid x_1 + x_2 + x_3 = 0\}$, and $\Delta$ is the collision set $\Delta = \{x \in X \mid \exists\, i \ne j,\ x_i = x_j\}$. The phase space is the cotangent bundle of $\hat X$. We denote the loop space by $W^{1,2}(\mathbf{R}/T\mathbf{Z}, \hat X)$ for some fixed period $T$. The system (5.1) is the Euler-Lagrange equation of the action functional

$$\mathcal{A} : W^{1,2}(\mathbf{R}/T\mathbf{Z}, \hat X) \to \mathbf{R}, \tag{5.2}$$

$$x \mapsto \mathcal{A}(x) = \int_0^T L(x(t), \dot x(t))\, dt, \tag{5.3}$$

where $L(x(t), \dot x(t)) = \frac{1}{2}\|\dot x(t)\|^2 + U(x(t))$ is the Lagrangian. The Hamiltonian system corresponding to the Newtonian system (5.1) is

$$\dot z(t) = J H'(z(t)) \tag{5.4}$$

with Hamiltonian function $H(z) = H(y, x) = \frac{1}{2}\|y\|^2 - U(x)$ defined on the phase space $T^{*}(\hat X)$.

Let us first recall precisely the figure-eight orbit of the planar three-body problem with three equal masses [CM]. For the fixed period $T$, the Klein group $\mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$ with generators $\sigma$ and $\tau$ acts on $\mathbf{R}/T\mathbf{Z}$ and on $\mathbf{R}^2$ as follows:

$$\sigma \cdot t = t + \frac{T}{2}, \quad \tau \cdot t = -t + \frac{T}{2}, \quad \sigma \cdot (x, y) = (-x, y), \quad \tau \cdot (x, y) = (x, -y).$$


Theorem 5.1 (Chenciner and Montgomery [CM]). There exists an “eight”-shaped planar loop $q : (\mathbf{R}/T\mathbf{Z}, 0) \to (\mathbf{R}^2, 0)$ with the following properties:
(i) for each $t$, $q(t) + q(t + T/3) + q(t + 2T/3) = 0$;
(ii) $q(t)$ is equivariant with respect to the actions of $\mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$ on $\mathbf{R}/T\mathbf{Z}$ and $\mathbf{R}^2$ above: $q(\sigma \cdot t) = \sigma \cdot q(t)$ and $q(\tau \cdot t) = \tau \cdot q(t)$;
(iii) the loop $x : \mathbf{R}/T\mathbf{Z} \to \hat X$ defined by $x(t) = (q(t + 2T/3), q(t + T/3), q(t))$ is a zero angular momentum $T$-periodic solution of the planar three-body problem with equal masses.

The figure-eight orbit can also be characterized as the minimizer of the action functional on the dihedral group $D_6$-invariant loop space [C1]. In terms of generators and relations, $D_6$ is

$$D_6 = \langle g_1, g_2 \mid g_1^6 = I_6,\ g_2^2 = I_6,\ g_1 g_2 = g_2 g_1^{-1} \rangle.$$

The generators $g_1$ and $g_2$ act on $W^{1,2}(\mathbf{R}/T\mathbf{Z}, (\mathbf{R}^2)^3)$ as follows: for $x(t) = (x_1(t), x_2(t), x_3(t))^T \in W^{1,2}(\mathbf{R}/T\mathbf{Z}, (\mathbf{R}^2)^3)$,

$$(g_1 \circ x)(t) = -RSx(t + T/6), \tag{5.5}$$

$$(g_2 \circ x)(t) = RNx(T/6 - t). \tag{5.6}$$

Here

$$S = \begin{pmatrix} 0 & 0 & I_2 \\ I_2 & 0 & 0 \\ 0 & I_2 & 0 \end{pmatrix}, \quad N = \begin{pmatrix} I_2 & 0 & 0 \\ 0 & 0 & I_2 \\ 0 & I_2 & 0 \end{pmatrix},$$

and $R$ is the diagonal action of the reflection with respect to the $x$-axis in $\mathbf{R}^2$, that is

$$R = \begin{pmatrix} R_2 & 0 & 0 \\ 0 & R_2 & 0 \\ 0 & 0 & R_2 \end{pmatrix} \quad \text{with} \quad R_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Here our notation $g_2$ is different from that in [C1]. Let $D_3$ denote the subgroup generated by $g_1^2$ and $g_2 g_1$, and let $\mathbf{Z}/6\mathbf{Z}$ denote the subgroup generated by $g_1$. It is shown in [C2] that a minimizer of the action in the $D_3$- or $\mathbf{Z}/6\mathbf{Z}$-invariant loop space exists and is collision-free. In [C3], Chenciner posed the question: are all these three minimizers the same? To our knowledge, this problem is still open. For the closely related uniqueness problem, it is known that the eight is locally unique [KS] in the $D_6$-invariant subspace. However, global uniqueness is still an intriguing open problem.

For the readers' convenience, we first review the beautiful shape sphere coordinates for the planar three-body problem (see for example [CM]). Here we follow Roberts [R]. In the following, all vectors are column vectors. We introduce the Jacobian coordinates


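by the canonical transformation

$$\begin{pmatrix} v \\ u \end{pmatrix} = \begin{pmatrix} K^{-T} & 0 \\ 0 & K \end{pmatrix} \begin{pmatrix} y \\ x \end{pmatrix}$$

in $T^{*}(\mathbf{R}^2)^3$, where $K^{-T}$ is the transpose of $K^{-1}$, $u = (u_1, u_2, u_3)^T$, $v = (v_1, v_2, v_3)^T \in (\mathbf{R}^2)^3$, and

$$K = \begin{pmatrix}
0 & 0 & -1/\sqrt{2} & 0 & 1/\sqrt{2} & 0 \\
0 & 0 & 0 & -1/\sqrt{2} & 0 & 1/\sqrt{2} \\
\sqrt{2}/\sqrt{3} & 0 & -1/\sqrt{6} & 0 & -1/\sqrt{6} & 0 \\
0 & \sqrt{2}/\sqrt{3} & 0 & -1/\sqrt{6} & 0 & -1/\sqrt{6} \\
1/3 & 0 & 1/3 & 0 & 1/3 & 0 \\
0 & 1/3 & 0 & 1/3 & 0 & 1/3
\end{pmatrix}.$$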

In these new coordinates, the Lagrangian is

$$\tilde L(u_1, u_2, \dot u_1, \dot u_2) = L(K^{-1} u, K^T \dot u) = \frac{1}{2}\bigl(\|\dot u_1\|^2 + \|\dot u_2\|^2\bigr) + \tilde U(u_1, u_2). \tag{5.7}$$

So the Lagrangian does not depend on $u_3$ and $v_3$, which correspond to the conservation of the center of mass and of the linear momentum, respectively. Since for the eight orbit we fix the center of mass at the origin, we set $u_3 = v_3 = 0$. This is essentially the symplectic reduction with respect to the translation symmetry of the Newtonian system. The reduced phase space is isomorphic to $T^{*}\hat X$. Now the moment of inertia is $I = \|x_1\|^2 + \|x_2\|^2 + \|x_3\|^2 = \|u_1\|^2 + \|u_2\|^2$. We denote by $\tilde g_1$, $\tilde g_2$ the corresponding generators of the group defined above acting on the loop space in the new coordinates $u = (u_1, u_2)$. Then

$$(\tilde g_1 \circ u)(t) = \tilde S u(t + T/6), \tag{5.8}$$

$$(\tilde g_2 \circ u)(t) = \tilde N u(T/6 - t), \tag{5.9}$$

where

$$\tilde S = \begin{pmatrix}
1/2 & 0 & \sqrt{3}/2 & 0 \\
0 & -1/2 & 0 & -\sqrt{3}/2 \\
-\sqrt{3}/2 & 0 & 1/2 & 0 \\
0 & \sqrt{3}/2 & 0 & -1/2
\end{pmatrix}, \tag{5.10}$$

and

$$\tilde N = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}. \tag{5.11}$$
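The passage from (5.5)–(5.6) to (5.8)–(5.11) can be reproduced numerically. The sketch below is an illustration only and rests on two assumptions: that the Jacobian coordinates are $u = Kx$ with $K$ the Kronecker product of the $3 \times 3$ matrix `K3` below with $I_2$, and that $\tilde S$ and $\tilde N$ are the upper-left $4 \times 4$ blocks of $K(-RS)K^{-1}$ and $K(RN)K^{-1}$. All variable names are hypothetical. It also checks the kernel dimensions of $\tilde S^2 - I_4$ and $\tilde S^3 - I_4$ used in Sect. 5.3 and the relation $\tilde S^2 \tilde N \tilde S = \tilde S \tilde N$ used in Sect. 5.4.

```python
import numpy as np

I2 = np.eye(2)
s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)

# Assumed Jacobi matrix acting on the three bodies; K = K3 (x) I_2.
K3 = np.array([[0.0,     -1 / s2,  1 / s2],
               [s2 / s3, -1 / s6, -1 / s6],
               [1 / 3,    1 / 3,   1 / 3]])
K = np.kron(K3, I2)
K_inv = np.linalg.inv(K)

# Permutation matrices S, N and the reflection R from (5.5)-(5.6).
S = np.kron(np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=float), I2)
N = np.kron(np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float), I2)
R = np.kron(np.eye(3), np.diag([1.0, -1.0]))

# Conjugate the generators into Jacobian coordinates.
g1_jacobi = K @ (-R @ S) @ K_inv
g2_jacobi = K @ (R @ N) @ K_inv
S_tilde = g1_jacobi[:4, :4]   # action on (u1, u2)
N_tilde = g2_jacobi[:4, :4]

I4 = np.eye(4)
dim_ker_S2 = 4 - np.linalg.matrix_rank(S_tilde @ S_tilde - I4)
dim_ker_S3 = 4 - np.linalg.matrix_rank(np.linalg.matrix_power(S_tilde, 3) - I4)

print(np.round(S_tilde, 4))
print(np.round(N_tilde, 4))
print("dim ker(S~^2 - I) =", dim_ker_S2, " dim ker(S~^3 - I) =", dim_ker_S3)
```

The off-diagonal blocks of the conjugated matrices vanish because $S$, $N$ and $R$ preserve the center of mass, so restricting to $(u_1, u_2)$ loses nothing.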


5.2. The symplectic Jordan block decomposition of the Poincaré map. As is well known, for the planar three-body problem there are two first integrals, namely the energy and the angular momentum, when the Newtonian system is restricted to the space $\hat X$. With the center of mass fixed at the origin, the dimension of the eigenspace associated with the eigenvalue 1 is at least 4. We call a solution of the Newtonian system nondegenerate if the dimension of the eigenspace associated with the eigenvalue 1 of the Poincaré map is exactly 4. This is the case for the eight, which is established in [KS] by computer-assisted methods:

Fact. The Figure-eight orbit is nondegenerate.

For an orthogonal matrix $D$ on $\mathbf{R}^n$, we define $D_d = \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}$; it is easy to show that $D_d J = J D_d$ and $D_d \in \mathrm{Sp}(2n)$. Let $\gamma(t)$ be the fundamental solution of the eight in Jacobian coordinates, and set $M = \tilde S_d \gamma(T/6)$; from the $\mathbf{Z}/6\mathbf{Z}$-invariance property, we have

$$\gamma(T) = M^6.$$

So the stabilities of $\gamma(T)$ and $M$ are the same. Now we analyze the eigenvalues of $M$. Note that if $x(t)$ is a period $T$ solution of the Newton system, then $h^{-2/3} x(ht)$ is also a solution, with period $T/h$. Let $z_h(t) = (h^{1/3} \dot x(ht), h^{-2/3} x(ht))^T$; then $z_h(t)$ is a solution of the corresponding Hamiltonian system. In Jacobian coordinates we denote the figure-eight by $z_h(t) = (h^{1/3} \dot u(ht), h^{-2/3} u(ht))^T$; it satisfies

$$\tilde S_d z_h(T_h/6) = z_h(0), \tag{5.12}$$

where $T_h = T/h$ is the period of the orbit $z_h(t)$, and

$$H(z_h) = h^{2/3} H(z_1). \tag{5.13}$$
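The scaling statements above can be verified by a short computation, sketched here using only the facts that $U$ is homogeneous of degree $-1$ and hence $\nabla U$ is homogeneous of degree $-2$. Writing $x_h(t) = h^{-2/3} x(ht)$,

$$\ddot x_h(t) = h^{-2/3} h^{2}\, \ddot x(ht) = h^{4/3}\, \nabla U(x(ht)) = \nabla U\bigl(h^{-2/3} x(ht)\bigr) = \nabla U(x_h(t)),$$

so $x_h$ solves (5.1). Since $\dot x_h(t) = h^{1/3} \dot x(ht)$ and $U(h^{-2/3} x) = h^{2/3} U(x)$, the energy scales as

$$H(z_h) = \tfrac{1}{2} h^{2/3} \|\dot x\|^2 - h^{2/3} U(x) = h^{2/3} H(z_1),$$

which is (5.13). In particular $\frac{d}{dh} H(z_h)\big|_{h=1} = \frac{2}{3} H(z_1)$, which is negative when the energy is negative.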

Note that for the eight, as a bounded orbit in the planar three-body problem, the energy is negative, which is an important fact for calculating the normal form of $M$. The next lemma is motivated by [E] (pp. 70–71) and [L3] (p. 327) in their studies of closed characteristics on convex hypersurfaces for Hamiltonian systems in $\mathbf{R}^{2n}$.

Lemma 5.2. With the notations above and $\gamma(t)$, $t \in [0, T/6]$, the fundamental solution path of the system (5.4) in Jacobian coordinates, the matrix $M = \tilde S_d \gamma(T/6)$ satisfies

$$M \dot z(0) = \dot z(0), \tag{5.14}$$

$$-\frac{T}{6} \dot z(0) + M \frac{d}{dh} z_h(0)\Big|_{h=1} = \frac{d}{dh} z_h(0)\Big|_{h=1}. \tag{5.15}$$

Proof. Since

$$\tilde S_d z(t + T/6) = z(t), \tag{5.16}$$

differentiating this formula with respect to $t$, we get (5.14) by setting $t = 0$. To prove (5.15), from (5.12), we get

$$\gamma(T_h/6) \frac{d}{dh} z_h(0) = \frac{d}{dh} z_h(T_h/6). \tag{5.17}$$


Differentiating (5.12) with respect to $h$ yields

$$\frac{d(T_h/6)}{dh} \tilde S_d \dot z(T_h/6) + \tilde S_d \frac{d}{dh} z_h(T_h/6) = \frac{d}{dh} z_h(0). \tag{5.18}$$

Plug (5.17) into (5.18), and let $h = 1$; we get (5.15).

Lemma 5.3. Under the same setting as the last lemma, there exist $P \in \mathrm{Sp}(8)$ and $M_1 \in \mathrm{Sp}(6)$ such that $M = P^{-1}(N_1(1, 1) \diamond M_1)P$.

Proof. Let $\xi_1 = \frac{T}{6} \dot z(0)$ and $\xi_2 = \frac{d}{dh} z_h(0)\big|_{h=1}$. Direct computations show that

$$\omega(\xi_1, \xi_2) = \Bigl\langle J \cdot T J H'(z(0)),\ \frac{d}{dh} z_h(0)\Big|_{h=1} \Bigr\rangle = -T \frac{d}{dh} H(z_h) > 0. \tag{5.19}$$

So the space spanned by $\xi_1$, $\xi_2$ is an invariant symplectic subspace of $M$, and furthermore they form the symplectic basis of this subspace. When $M$ is restricted to this subspace, it is $N_1(1, 1)$. Since $M$ is a symplectic matrix, we have the result.

Remark 5.4. Our above analysis for the eight works for any orbit with cyclic symmetry. In fact, for a periodic solution of the $n$-body problem with Type I H symmetry, $Q = S^m = I$, the Jordan block of $S_d \gamma(T/m)$ corresponding to the translation invariance is $N_1(1, 1)$.

In the following, we compute the normal form corresponding to the angular momentum. Let $A = \begin{pmatrix} \tilde A & 0 \\ 0 & \tilde A \end{pmatrix}$ with $\tilde A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Then $\Phi_s = \exp(s A_d)$ is the rotation flow; it commutes with the Hamiltonian flow $\phi_t$ of the figure-eight, that is

$$\phi_t \Phi_s(z_0) = \Phi_s \phi_t(z_0). \tag{5.20}$$

Choose $z_0$ to be a point of the eight, set $t = T/6$, and differentiate (5.20) with respect to $s$ at $s = 0$; we get

$$\gamma(T/6) A_d z_0 = A_d \tilde S_d^T z_0. \tag{5.21}$$

Since $A_d \tilde S_d^T = -\tilde S_d^T A_d$, we have

$$\tilde S_d \gamma(T/6) A_d z_0 = -A_d z_0. \tag{5.22}$$

This implies that $-1$ is an eigenvalue of $M$. From the nondegeneracy fact, we have proved

Lemma 5.5. There exist $P \in \mathrm{Sp}(8)$, $M_2 \in \mathrm{Sp}(4)$ such that

$$M = P^{-1}(N_1(1, 1) \diamond N_1(-1, b) \diamond M_2)P, \tag{5.23}$$

where $b = 1$, $0$ or $-1$.

Remark 5.6. 1. From this lemma, $M_2$ is a $4 \times 4$ symplectic matrix, which is the essential part for the stability problem.


2. $N_1(-1, b)$ is the Jordan block corresponding to the angular momentum, where $b = 1$, $0$ or $-1$. Since $\gamma(T) = M^6$, the Jordan block of $M$ corresponding to the angular momentum is of the form $N_1(-1, b)$ if and only if the Jordan block of $\gamma(T)$ corresponding to the angular momentum is $N_1(1, -b)$. We have numerically checked that $b = -1$, so

$$M = P^{-1}(N_1(1, 1) \diamond N_1(-1, -1) \diamond M_2)P. \tag{5.24}$$

It was also pointed out earlier in [CFM] that the Jordan block corresponding to the angular momentum can be computed by the bifurcation family of the eight. Unfortunately, they also depend on the numerical results. A mathematical proof of this statement seems to be still missing.

5.3. Stability of the figure-eight orbit. Let $\mathbf{Z}/2\mathbf{Z}$, $\mathbf{Z}/3\mathbf{Z}$ denote the groups generated by $g_1^3$ and $g_1^2$ respectively, and let $m_1$, $m_2$, $m_3$ be the Morse indices of the eight as a critical point of the action functional corresponding to $\tilde L$ in the total loop space $W^{1,2}(\mathbf{R}/\mathbf{Z}, X)$ and in its $\mathbf{Z}/2\mathbf{Z}$- and $\mathbf{Z}/3\mathbf{Z}$-invariant loop subspaces, respectively. In the following subsections, we show the relation between the stability of the eight and $m_1$, $m_2$, $m_3$ under some assumptions. In this subsection, we give some criteria for the stability of the eight in terms of the Morse index. This is based on Theorem 1.1 proved in Sect. 2.2. Since $\dim\ker(\tilde S^2 - I_4) = 0$ and $\dim\ker(\tilde S^3 - I_4) = 2$, from Theorem 1.2, we have the next lemma.

Lemma 5.7. With notations as above, the fundamental solution $\gamma$ of the eight satisfies

$$\mu(\mathrm{Gr}((\tilde S^2)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/3]) = m_3, \tag{5.25}$$

$$\mu(\mathrm{Gr}((\tilde S^3)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/2]) = m_2 + 2, \tag{5.26}$$

$$\mu(\Delta, \mathrm{Gr}(\gamma(t)), t \in [0, T]) = m_1 + 4. \tag{5.27}$$

To simplify notation, we define

$$\mu(\omega) := \mu(\mathrm{Gr}(\omega \tilde S_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/6]), \quad \forall\, \omega \in \mathbf{U}. \tag{5.28}$$

From Theorem 1.1, the generalized Bott-type iteration formula, we have

$$\mu(\mathrm{Gr}((\tilde S^2)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/3]) = \mu(1) + \mu(-1); \tag{5.29}$$

$$\mu(\mathrm{Gr}((\tilde S^3)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/2]) = \sum_{j=1}^{3} \mu(\exp(2\pi j \sqrt{-1}/3)); \tag{5.30}$$

$$\mu(\Delta, \mathrm{Gr}(\gamma(t)), t \in [0, T]) = \sum_{j=1}^{6} \mu(\exp(2\pi j \sqrt{-1}/6)). \tag{5.31}$$

Let $\mathcal{I}$ be the index form of the Figure-Eight orbit in the $\mathbf{Z}/6\mathbf{Z}$ loop space. By Theorem 1.2 and Remark 3.5,

$$\mu(\omega) = C(\Lambda) + m^{-}(\mathcal{I}), \tag{5.32}$$


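where $\Lambda$ is the boundary condition $x(t) = \bar\omega \tilde S x(t + T/6)$. From (3.26) and Remark 3.16,

$$C(\Lambda) = \dim\ker(\bar\omega \tilde S - I_n) \ge 0. \tag{5.33}$$

This shows that

$$\mu(\omega) \ge 0. \tag{5.34}$$

By (4.23),

$$\mu(\omega) = \mu(\bar\omega), \quad \omega \in \mathbf{U}. \tag{5.35}$$

Now (5.25)–(5.31) imply that

$$\mu(\exp(\pi\sqrt{-1}/3)) + \mu(\exp(2\pi\sqrt{-1}/3)) = 2 + \frac{1}{2}(m_1 - m_3), \tag{5.36}$$

and

$$\mu(\exp(2\pi\sqrt{-1}/3)) = 1 + \frac{1}{2}(m_2 - \mu(1)). \tag{5.37}$$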
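The relations (5.36) and (5.37) follow from (5.25)–(5.31) by grouping conjugate roots of unity with (5.35). The following sketch carries out that bookkeeping; the function name `mu_values` and the dictionary keys are hypothetical, and the sample input $m_1 = 2$, $m_2 = m_3 = 0$, $\mu(1) = 0$ uses the numerically obtained Morse indices reported in Remark 5.12 together with (5.54), so it is an illustration rather than a proof.

```python
from fractions import Fraction

def mu_values(m1, m2, m3, mu_one):
    """Solve (5.25)-(5.31) for mu at the sixth roots of unity, given mu(1)."""
    mu_minus_one = m3 - mu_one                            # (5.25) with (5.29)
    mu_2pi3 = 1 + Fraction(m2 - mu_one, 2)                # (5.37)
    mu_pi3 = 2 + Fraction(m1 - m3, 2) - mu_2pi3           # (5.36)
    return {"1": mu_one, "-1": mu_minus_one,
            "exp(i*pi/3)": mu_pi3, "exp(2i*pi/3)": mu_2pi3}

def check_bott_sums(values, m1, m2):
    """Re-derive the sums (5.30) and (5.31) using mu(w) = mu(conj(w))."""
    sum_cube_roots = values["1"] + 2 * values["exp(2i*pi/3)"]
    sum_sixth_roots = (values["1"] + values["-1"]
                       + 2 * values["exp(i*pi/3)"] + 2 * values["exp(2i*pi/3)"])
    return sum_cube_roots == m2 + 2 and sum_sixth_roots == m1 + 4

values = mu_values(m1=2, m2=0, m3=0, mu_one=0)
print(values)
print(check_bott_sums(values, m1=2, m2=0))
```

The consistency check holds for any input, because (5.36) and (5.37) are exactly the sums (5.30) and (5.31) solved for the two non-real values.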

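In the following, we compute $D_\omega(\tilde S_d)$ of Definition 4.9. We set $\mathbf{U}^{+} = \{\omega = \exp(\sqrt{-1}\theta) \in \mathbf{U} \mid \theta \in [0, \pi]\}$.

Lemma 5.8.

$$D_\omega(\tilde S_d) = 0, \quad \omega \in \mathbf{U}^{+} \setminus \{\exp(\pi\sqrt{-1}/3), \exp(2\pi\sqrt{-1}/3)\},$$

and

$$D_{\exp(\pi\sqrt{-1}/3)}(\tilde S_d) = D_{\exp(2\pi\sqrt{-1}/3)}(\tilde S_d) = -1. \tag{5.38}$$

Proof. Note that $\tilde S_d = A_1 \diamond A_2$ with $A_1 = R(\frac{5\pi}{3})_d$ and $A_2 = R(\frac{2\pi}{3})_d$. For the notation $R(\theta)$, see (4.7). Let

$$P = \begin{pmatrix}
0 & 0 & -\sqrt{2}/2 & -\sqrt{2}/2 \\
\sqrt{2}/2 & -\sqrt{2}/2 & 0 & 0 \\
\sqrt{2}/2 & \sqrt{2}/2 & 0 & 0 \\
0 & 0 & \sqrt{2}/2 & -\sqrt{2}/2
\end{pmatrix},$$

then

$$P^{-1} A_1 P = R\Bigl(\frac{\pi}{3}\Bigr) \diamond R\Bigl(\frac{5\pi}{3}\Bigr). \tag{5.39}$$

From (4.14)–(4.16), we have

$$\bigl(S^{+}_{A_1}(e^{\frac{\pi}{3}\sqrt{-1}}),\, S^{-}_{A_1}(e^{\frac{\pi}{3}\sqrt{-1}})\bigr) = \bigl(S^{+}_{R(\frac{5\pi}{3})}(e^{\frac{\pi}{3}\sqrt{-1}}),\, S^{-}_{R(\frac{5\pi}{3})}(e^{\frac{\pi}{3}\sqrt{-1}})\bigr) + \bigl(S^{+}_{R(\frac{\pi}{3})}(e^{\frac{\pi}{3}\sqrt{-1}}),\, S^{-}_{R(\frac{\pi}{3})}(e^{\frac{\pi}{3}\sqrt{-1}})\bigr) = (1, 0) + (0, 1) = (1, 1). \tag{5.40}$$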
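The conjugation used in this proof can be checked numerically. The sketch below assumes $J = \begin{pmatrix} 0 & -I_2 \\ I_2 & 0 \end{pmatrix}$ and the usual convention for the symplectic sum of two $2 \times 2$ blocks (the first block acting on coordinates 1 and 3, the second on coordinates 2 and 4); the helper names `diamond` and `doubled` are hypothetical. Under these assumptions the matrix $P$ is symplectic, and $P^{-1} A_1 P$ comes out with the two blocks $R(5\pi/3)$ and $R(\pi/3)$ in that order, that is, the same pair of blocks as in (5.39) listed in the other order, which does not affect the splitting numbers by (4.16).

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def diamond(M1, M2):
    """Symplectic sum of two 2x2 matrices (assumed convention, see lead-in)."""
    a1, b1, c1, d1 = M1.ravel()
    a2, b2, c2, d2 = M2.ravel()
    return np.array([[a1, 0, b1, 0],
                     [0, a2, 0, b2],
                     [c1, 0, d1, 0],
                     [0, c2, 0, d2]])

def doubled(D):
    """D_d = diag(D, D) for a 2x2 orthogonal matrix D."""
    Z = np.zeros((2, 2))
    return np.block([[D, Z], [Z, D]])

I2, Z2 = np.eye(2), np.zeros((2, 2))
J = np.block([[Z2, -I2], [I2, Z2]])

a = np.sqrt(2.0) / 2
P = np.array([[0, 0, -a, -a],
              [a, -a, 0, 0],
              [a,  a, 0, 0],
              [0, 0,  a, -a]])

A1 = doubled(rotation(5 * np.pi / 3))
A2 = doubled(rotation(2 * np.pi / 3))
P_inv = np.linalg.inv(P)

print("P symplectic:", np.allclose(P.T @ J @ P, J))
print(np.round(P_inv @ A1 @ P, 4))
print(np.round(P_inv @ A2 @ P, 4))
```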

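Since the eigenvalues of $A_1$ are $e^{\pm\frac{\pi}{3}\sqrt{-1}}$,

$$(S^{+}_{A_1}, S^{-}_{A_1})(\omega) = (0, 0), \quad \omega \ne e^{\frac{\pi}{3}\sqrt{-1}},\ \omega \in \mathbf{U}. \tag{5.41}$$

Similarly, for $A_2$,

$$P^{-1} A_2 P = R\Bigl(\frac{4\pi}{3}\Bigr) \diamond R\Bigl(\frac{2\pi}{3}\Bigr). \tag{5.42}$$

So

$$\bigl(S^{+}_{A_2}(e^{\frac{2\pi}{3}\sqrt{-1}}),\, S^{-}_{A_2}(e^{\frac{2\pi}{3}\sqrt{-1}})\bigr) = (1, 1), \tag{5.43}$$

$$(S^{+}_{A_2}, S^{-}_{A_2})(\omega) = (0, 0), \quad \omega \ne e^{\frac{2\pi}{3}\sqrt{-1}},\ \omega \in \mathbf{U}. \tag{5.44}$$

The lemma follows from (4.16) and (4.32).

From (5.23), we know $M = P^{-1}(N_1(1, 1) \diamond N_1(-1, b) \diamond M_2)P$ for some $P \in \mathrm{Sp}(8)$. So

$$\frac{e(M)}{2} = \frac{1}{2}\bigl(e(N_1(1, 1)) + e(N_1(-1, b)) + e(M_2)\bigr) = 2 + \frac{e(M_2)}{2}. \tag{5.45}$$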

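From the fact that the Figure-Eight is nondegenerate, we know $\ker(M_2^6 - I_4) = 0$, so $\pm 1$ are not eigenvalues of $M_2$. Suppose $\omega_i$, $i = 1, \ldots, k$, are all the eigenvalues of $M_2$ (the same as those of $M$) belonging to $\mathbf{U}^{+}$ strictly located between $1$ and $-1$. By (4.30),

$$\frac{e(M_2)}{2} \ge S^{+}_{M_2}(1) + \sum_{i=1}^{k} \bigl(S^{+}_{M_2}(\omega_i) + S^{-}_{M_2}(\omega_i)\bigr) + S^{-}_{M_2}(-1) = \sum_{i=1}^{k} \bigl(S^{+}_{M_2}(\omega_i) + S^{-}_{M_2}(\omega_i)\bigr) = \sum_{i=1}^{k} \bigl(S^{+}_{M}(\omega_i) + S^{-}_{M}(\omega_i)\bigr). \tag{5.46}$$

For $\varepsilon$ small enough, from (4.36),

$$\mathrm{var}\bigl(\mu(\omega) + D_\omega(\tilde S_d),\ \omega = e^{\sqrt{-1}\theta},\ \theta \in [\varepsilon, \pi - \varepsilon]\bigr) = \sum_{i=1}^{k} \bigl(S^{+}_{M}(\omega_i) + S^{-}_{M}(\omega_i)\bigr). \tag{5.47}$$

So we have

$$\frac{e(M)}{2} - 2 \ge \mathrm{var}\bigl(\mu(\omega) + D_\omega(\tilde S_d),\ \omega = \exp(\sqrt{-1}\theta),\ \theta \in [\varepsilon, \pi - \varepsilon]\bigr). \tag{5.48}$$

We choose any path $\xi$ in $\mathrm{Sp}(2n)$ connecting $I_{2n}$ to $\tilde S_d$. Let $\tilde\gamma$ be the path $\tilde S_d \gamma(t)$ with $t \in [0, T/6]$, where $\gamma(t)$ is the fundamental solution of the Figure-Eight orbit. Let $\tilde\gamma * \xi$ be the concatenation of $\xi$ and $\tilde\gamma$. From (4.21), for any $\omega \in \mathbf{U}$,

$$\mu(\omega) = \mu(\mathrm{Gr}(\omega), \mathrm{Gr}(\tilde\gamma)) = i_\omega(\tilde\gamma * \xi) - i_\omega(\xi). \tag{5.49}$$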


So

$$\mu(\omega) - \mu(1) = i_\omega(\tilde\gamma * \xi) - i_1(\tilde\gamma * \xi) - \bigl(i_\omega(\xi) - i_1(\xi)\bigr) = i_\omega(\tilde\gamma * \xi) - i_1(\tilde\gamma * \xi) - D_\omega(\tilde S_d). \tag{5.50}$$

From (4.17), for $\omega \in \mathbf{U}^{+}$,

$$\mu(\omega) = \mu(1) + S_M^{+}(1) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) - S_M^{-}(\omega) - D_\omega(\tilde S_d), \tag{5.51}$$

where the sum runs over all the eigenvalues $\omega_0$ of $M$ belonging to the part of $\mathbf{U}^{+}$ strictly located between $1$ and $\omega$.

Proposition 5.9. If $b = -1$ in (5.23) and $m_3 = 0$, then the eight is either spectrally stable or hyperbolic.

Proof. By (5.25),

$$\mu(\mathrm{Gr}((\tilde S^2)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/3]) = m_3 = 0, \tag{5.52}$$

which implies

$$\mu(1) + \mu(-1) = 0 \tag{5.53}$$

by (5.29). Since $\mu(\omega) \ge 0$ by (5.34), $m_3 = 0$ implies

$$\mu(1) = \mu(-1) = 0. \tag{5.54}$$

If $b = -1$, then $M = P^{-1}(N_1(1, 1) \diamond N_1(-1, -1) \diamond M_2)P$ by (5.23). By Lemma 4.5, (4.12) and (4.13),

$$S_M^{+}(1) = S^{+}_{N_1(1,1)}(1) = 1, \tag{5.55}$$

$$S_M^{-}(-1) = S^{-}_{N_1(-1,-1)}(-1) = 1. \tag{5.56}$$

By Lemma 5.8, $D_{-1}(\tilde S_d) = 0$. Applying (5.51) to $\omega = -1$, we have

$$0 = \mu(-1) = \mu(1) + S_M^{+}(1) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) - S_M^{-}(-1) - D_{-1}(\tilde S_d) = \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr). \tag{5.57}$$

If $M$ has only one eigenvalue $\omega_0 \in \mathbf{U}^{+}$, then $|S_M^{+}(\omega_0) - S_M^{-}(\omega_0)| = 1$ by (4.14)–(4.15) of Lemma 4.5. This is a contradiction to (5.57). So either $M$ has no eigenvalues in $\mathbf{U}^{+}$, which is the hyperbolic case, or $M$ has exactly two eigenvalues in $\mathbf{U}^{+}$, i.e., it is spectrally stable.


Proof of Theorem 1.5. By Remark 5.6, the condition $a = 1$ in Theorem 1.5 is equivalent to $b = -1$. If $m_2 = m_3 = 0$, then by (5.37) and (5.54), $\mu(\exp(2\pi\sqrt{-1}/3)) = 1$. Moreover, $D_{\exp(2\pi\sqrt{-1}/3)}(\tilde S_d) = -1$ by (5.38). Now (5.51) reads

$$1 = \mu(\exp(2\pi\sqrt{-1}/3)) = \mu(1) + S_M^{+}(1) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) - S_M^{-}(\exp(2\pi\sqrt{-1}/3)) - D_{\exp(2\pi\sqrt{-1}/3)}(\tilde S_d) = \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) + 2, \tag{5.58}$$

where the sum runs over all the eigenvalues $\omega_0$ of $M$ belonging to the part of $\mathbf{U}^{+}$ strictly located between $1$ and $\exp(2\pi\sqrt{-1}/3)$. The second equality follows from the facts that $\mu(1) = 0$, $S_M^{+}(1) = 1$ and $D_{\exp(2\pi\sqrt{-1}/3)}(\tilde S_d) = -1$. We have

$$\sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) = -1.$$

Since

$$\sum_{\omega_0} \bigl(S_M^{+}(\omega_0) + S_M^{-}(\omega_0)\bigr) \le 2,$$

the only possibility is that there exists only one $\omega_0 = \exp(\sqrt{-1}\theta_0) \in \sigma(M)$ with $\theta_0 \in (0, 2\pi/3)$ such that

$$\bigl(S_M^{+}(\omega_0), S_M^{-}(\omega_0)\bigr) = (0, 1). \tag{5.59}$$

On the other hand, similarly to (5.51), we have

$$0 = \mu(-1) = \mu(\exp(2\pi\sqrt{-1}/3)) + S_M^{+}(\exp(2\pi\sqrt{-1}/3)) + \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) - S_M^{-}(-1) - D_{-1}(\tilde S_d) + D_{\exp(2\pi\sqrt{-1}/3)}(\tilde S_d) = \sum_{\omega_0} \bigl(S_M^{+}(\omega_0) - S_M^{-}(\omega_0)\bigr) - 1, \tag{5.60}$$

where the sum runs over all the eigenvalues $\omega_0$ of $M$ belonging to the part of $\mathbf{U}^{+}$ strictly located between $\exp(2\pi\sqrt{-1}/3)$ and $-1$. So there exists $\omega_1 = \exp(\sqrt{-1}\theta_1)$ with $\theta_1 \in (2\pi/3, \pi)$ such that

$$\bigl(S_M^{+}(\omega_1), S_M^{-}(\omega_1)\bigr) = (1, 0). \tag{5.61}$$

So $M$ must be linearly stable.

Based on (5.60)–(5.61) in the proof of this theorem, we can get the normal form for $M$ from (4.14)–(4.15) of Lemma 4.5 on the properties of the splitting numbers.

Corollary 5.10. If $b = -1$ and $m_2 = m_3 = 0$, then there exist $P \in \mathrm{Sp}(8)$, $\theta_1 \in (0, 2\pi/3)$ and $\theta_2 \in (2\pi/3, \pi)$ such that

$$M = P^{-1}\bigl(N_1(1, 1) \diamond N_1(-1, -1) \diamond R(\theta_1) \diamond R(2\pi - \theta_2)\bigr)P. \tag{5.62}$$


Theorem 5.11. The eight is linearly stable if $b = -1$, $m_3 = 0$ and $m_1 < 4$.

Proof. If $m_1 < 4$ and $m_3 = 0$, by (5.36) either $\mu(\exp(2\pi\sqrt{-1}/3)) = 1$ or $\mu(\exp(\pi\sqrt{-1}/3)) = 1$. The stability follows from the same idea as in the proof of Theorem 1.5.

Remark 5.12. 1. Using Matlab, we get the Morse indices in the various loop spaces defined above, and they are $m_1 = 2$, $m_2 = 0$ and $m_3 = 0$. Numerically, according to Simó, the eight is the minimizer of the action functional on the $\mathbf{Z}/3\mathbf{Z}$-invariant subspace of loops in the class $(0, 0, 0)$, which we know from Chenciner. This shows that the eight is a local minimizer in the $\mathbf{Z}/3\mathbf{Z}$-invariant subspace of loops, i.e., $m_3 = 0$. It would be very interesting to establish these statements rigorously.

2. We also check that the 8 eigenvalues of $M$ are $1, 1, -1, -1, \omega_1, \bar\omega_1, \omega_2, \bar\omega_2$ with $\omega_1 = 0.2099 + 0.9777i$ and $\omega_2 = -0.5076 + 0.8616i$, where $\omega_1$ and $\omega_2$ are well in accord with the numerical results of [R].

3. We further get $(S_M^{+}(\omega_1), S_M^{-}(\omega_1)) = (0, 1)$ and $(S_M^{+}(\omega_2), S_M^{-}(\omega_2)) = (1, 0)$; hence their symplectic normal forms are as in Corollary 5.10. Similarly, the symplectic normal forms for the eigenvalues $1$ and $-1$ are checked to be $N_1(1, 1)$ and $N_1(-1, -1)$, which match quite well with our theory.

5.4. Further analysis on the index. In this subsection, we further analyze the Maslov index via the brake symmetry. Set

$$E_{i,j} = \{x \in W^{1,2}([0, T/12], \mathbf{R}^4) \mid x(0) \in V^{i}(\tilde S \tilde N),\ x(T/12) \in V^{j}(\tilde N)\}, \quad i, j = \pm 1; \tag{5.63}$$

here $W^{1,2}([0, T/12], \mathbf{R}^4)$ is considered to be the tangent space of $W^{1,2}([0, T/12], \hat X)$ at the eight. Note that the $D_6$-invariant subspace is $E_{1,1}$. The $\mathbf{Z}/6\mathbf{Z}$-invariant loop space is

$$E_1 = \{x \in W^{1,2}([0, T/6], \mathbf{R}^4) \mid x(0) = \tilde S x(T/6)\}.$$

Since $\tilde S^2 \tilde N \tilde S = \tilde S \tilde N$, the $D_3$-invariant subspace is

$$E_2 = \{x \in W^{1,2}([0, T/6], \mathbf{R}^4) \mid x(0) \in V^{+}(\tilde S \tilde N),\ x(T/6) \in V^{+}(\tilde N \tilde S)\}.$$

Moreover, it is easy to see that $V^{+}(\tilde S \tilde N) = \{q_3 = 0,\ q_1 = -q_2\}$, $V^{+}(\tilde N) = \{Rq_1 = q_1,\ Rq_2 = -q_3,\ q_1 + q_2 + q_3 = 0\}$, and $V^{+}(\tilde N \tilde S) = \{q_2 = 0,\ q_1 = -q_3\}$. It is clear that $V^{+}(\tilde S \tilde N)$ is the Euler configuration space with $x_3$ in the middle and $V^{+}(\tilde N)$ is the isosceles configuration space (i.e. $r_{12} = r_{13}$), which are used in [CM] to construct the eight. $V^{+}(\tilde N \tilde S)$ is the Euler configuration space with $x_2$ the middle point. We denote by $m_{i,j}$ the Morse index of the action functional at the eight restricted to the space $E_{i,j}$, for $i, j = \pm 1$, and by $m_6$ the Morse index on $E_1$.


From the decomposition of the Morse index, we have

$$m_6 = m_{1,1} + m_{-1,-1}, \tag{5.64}$$

and

$$m_3 = m_{1,1} + m_{-1,-1} + m_{1,-1} + m_{-1,1}. \tag{5.65}$$

Moreover, we have

Lemma 5.13. If the figure eight is a local minimizer on $E_1$ and $E_2$, then

$$m_{1,1} = m_{-1,-1} = m_{1,-1} = 0. \tag{5.66}$$

Proof. $m_{1,1} = m_{-1,-1} = 0$ follows from (5.64). To prove $m_{1,-1} = 0$, note that $E_2$ has a $\mathbf{Z}_2$ group action generated by $g_2 g_1$, so $E_2 = E_{1,1} \oplus E_{1,-1}$. By assumption the eight is a $D_3$-invariant minimizer, that is, a minimizer on $E_2$. The result follows from the decomposition of the Morse index, $m_{1,1} + m_{1,-1} = 0$.

For $V \subset \mathbf{R}^n$, $\Lambda(V) = J V^{\perp} \oplus V$ are Lagrangian subspaces in $(\mathbf{R}^{2n}, \omega)$. Set

$$\mu_{i,j} = \mu\bigl(\Lambda(V^{i}(\tilde N)),\ \gamma(t)\Lambda(V^{j}(\tilde S \tilde N)),\ t \in [0, T/12]\bigr), \quad i, j = \pm 1. \tag{5.67}$$

From the generalized Bott-type formula, we have

$$\mu(1) = \mu_{1,1} + \mu_{-1,-1}, \tag{5.68}$$

$$\mu(-1) = \mu_{1,-1} + \mu_{-1,1}, \tag{5.69}$$

$$\mu\bigl(\Lambda(V^{+}(\tilde N \tilde S)),\ \gamma(t)\Lambda(V^{+}(\tilde S \tilde N)),\ t \in [0, T/6]\bigr) = \mu_{1,1} + \mu_{1,-1}, \tag{5.70}$$

$$\mu\bigl(\Lambda(V^{-}(\tilde N \tilde S)),\ \gamma(t)\Lambda(V^{-}(\tilde S \tilde N)),\ t \in [0, T/6]\bigr) = \mu_{-1,-1} + \mu_{-1,1}, \tag{5.71}$$

$$\mu(\mathrm{Gr}((\tilde S^2)_d^T), \mathrm{Gr}(\gamma(t)), t \in [0, T/3]) = \mu_{1,1} + \mu_{1,-1} + \mu_{-1,1} + \mu_{-1,-1}. \tag{5.72}$$

From Lemma 5.13 and Theorem 1.2, if the eight is a local minimizer on $E_1$ and $E_2$, we get

$$\mu_{1,1} = \mu_{-1,-1} = \mu_{1,-1} = 0; \tag{5.73}$$

in this case

$$m_3 = m_{-1,1} = \mu_{-1,1}. \tag{5.74}$$

If $m_3 = 0$ can be proved, then $\mu_{i,j} = 0$ for $i, j = \pm 1$.

Acknowledgements. The authors sincerely thank Professors Alain Albouy, Kuo-Chang Chen, Alain Chenciner, Jacques Féjoz, Yiming Long and Carles Simó for their precious help and useful discussions. The referees' suggestions on the improvement of the presentation were quite valuable for polishing the paper. Part of this work was done while X. Hu was visiting the University of Michigan and Georgia Institute of Technology; he sincerely thanks Professors Yongbin Ruan and Chongchun Zeng for their invitations and discussions. S. Sun thanks the IMCCE of L'Observatoire de Paris for its hospitality.


References

[Ab] Abbondandolo, A.: Morse Theory for Hamiltonian Systems. Chapman Hall/CRC Research Notes in Mathematics 425, 2001
[Ar] Arnold, V.I.: On a characteristic class entering into conditions of quantization. Funct. Anal. Appl. 1, 1–8 (1967)
[APS] Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and riemannian geometry. Math. Proc. Camb. Philos. Soc. 79, 79–99 (1976)
[B] Bott, R.: On the iteration of closed geodesics and the sturm intersection theory. Comm. Pure Appl. Math. 9, 171–206 (1956)
[BTZ] Ballmann, W., Thorbergsson, G., Ziller, W.: Closed geodesics on positively curved manifolds. Ann. of Math. (2) 116(2), 213–247 (1982)
[CLM] Cappell, S.E., Lee, R., Miller, E.Y.: On the maslov index. Comm. Pure Appl. Math. 47(2), 121–186 (1994)
[CH] Chen, K.: Existence and minimizing properties of retrograd orbits in three-body problem with various choice of mass. Ann. of Math. (2) 167(2), 325–348 (2008)
[CHH] Chen, C., Hu, X.: Maslov index for homoclinic orbits of hamiltonian systems. Ann. IHP. Anal. Non Linéaire 24, 589–603 (2007)
[C1] Chenciner, A.: Action minimizing periodic solutions of the n-body problem. In: Celestial Mechanics, dedicated to Donald Saari for his 60th Birthday, Chenciner, A., Cushman, R., Robinson, C., Xia, Z.J. eds, Contemporary Mathematics 292, Providence, RI: Amer. Math. Soc., 2002, pp. 71–90
[C2] Chenciner, A.: Action minimizing solutions of the n-body problem: from homology to symmetry. Proceedings ICM Beijing 2002, Vol. III, Beijing: Higher Ed. Press, 2002, pp. 279–294
[C3] Chenciner, A.: Some facts and more questions about the Eight, Topological Methods. In: Variational Methods, ed. Brezis, H. et al., Singapore: World Scientific, 2003, pp. 77–88
[CFM] Chenciner, A., Féjoz, J., Montgomery, R.: Rotating eights. I. the three i families. Nonlinearity 18(3), 1407–1424 (2005)
[CM] Chenciner, A., Montgomery, R.: A remarkable periodic solution of the three body problem in the case of equal masses. Ann. Math. 152, 881–901 (2000)
[CZ] Conley, C., Zehnder, E.: Morse-type index theory for flows and periodic solutions for hamiltonian equations. Comm. Pure Appl. Math. 37, 207–253 (1984)
[CD] Cushman, R., Duistermaat, J.J.: The behavior of the index of a periodic linear hamiltonian system under iteration. Adv. Math. 23(1), 1–21 (1977)
[D] Duistermaat, J.J.: On the morse index in variational calculus. Adv. Math. 21(2), 173–195 (1976)
[DDE] Dell'Antonio, G., D'Onofrio, B., Ekeland, I.: Les systéms hamiloniens convexes et pairs ne sont pas ergodiques en general. C. R. Acad. Sci. Paris. Série I 315, 1413–1415 (1992)
[E] Ekeland, I.: Convexity Methods in Hamiltonian Mechanics. Berlin: Springer-Verlag, 1990
[EH] Ekeland, I., Hofer, H.: Convex hamiltonian energy surfaces and their periodic trajectories. Commun. Math. Phys. 113(3), 419–469 (1987)
[FT] Ferrario, D.L., Terracini, S.: On the existence of collisionless equivariant minimizers for the classical n-body problem. Invent. Math. 155(2), 305–362 (2004)
[G] Gordon, W.B.: A minimizing property of kepler orbits. Amer. J. Math. 99, 961–971 (1977)
[HS] Hu, X., Sun, S.: Morse Index and Stability of Elliptic Lagrangian Solutions in the Planar 3-Body Problem. Submitted, 2008
[HWZ] Hofer, H., Wysocki, K., Zehnder, E.: The dynamics on three-dimensional strictly convex energy surfaces. Ann. Math. (2) 148(1), 197–289 (1998)
[KS] Kapela, T., Simó, C.: Computer assisted proofs for nonsymmetric planar choreographies and for stability of the eight. Nonlinearity 20, 1241–1255 (2007)
[Ka] Kato, T.: Perturbation Theory for Linear Operators. Classics in Mathematics, Berlin-Heidelberg-New York: Springer-Verlag, 1995
[Liu] Liu, C.: Iteration theory for Maslov-type index theory with Lagrangian boundary conditions, minimal period brake problems for nonlinear Hamiltonian systems. Preprint, 2007
[L1] Long, Y.: Bott formula of the maslov-type index theory. Pac. J. Math. 187, 113–149 (1999)
[L2] Long, Y.: Precise iteration formulae of the maslov-type index theory and ellipticity of closed characteristics. Adv. Math. 154, 76–131 (2000)
[L3] Long, Y.: Index Theory for Symplectic Paths with Applications. Progress in Math. 207, Basel: Birkhäuser, 2002
[LZZ] Long, Y., Zhang, D., Zhu, C.: Multiple brake orbits in bounded convex symmetric domains. Adv. Math. 203(2), 568–635 (2006)
[LZ1] Long, Y., Zhu, C.: Closed characteristics on compact convex hypersurfaces in R2n . Ann. Math. 155, 317–368 (2002)

Stability of Symmetric Periodic Solutions in Hamiltonian Systems

[LZ2] [R] [RS1] [RS2] [S] [V] [ZL] [Z]

777

Long, Y., Zhu, C.: Maslov-type index theory for symplectic paths and spectral flow. II. Chinese Ann. Math., Ser. B 21(1), 89–108 (2000) Roberts, G.E.: Linear stability analysis of the figure-eight orbit in the three body problem. Erg. Th. Dyn. Sys. 27(6), 1947–1963 (2007) Robbin, J., Salamon, D.: The maslov index for paths. Topology 32, 827–844 (1993) Robbin, J., Salamon, D.: The spectral flow and maslov index. Bull. London Math. Soc. 27, 1–33 (1995) Simó, C.: Dynamical properties of the figure eight solution of the three-body problem. In: Celestial Mechanics (Evanston, IL, 1999), Contemp. Math. 209 Providence, RI: Amer. Math. Soc., 2002, pp. 209–228 Viterbo, C.: A new obstruction to embedding lagrangian tori. Invent. Math. 100(2), 301–320 (1990) Zhu, C., Long, Y.: Maslov-type index theory for symplectic paths and spectral flow I. Chin. Ann. of Math. 20B(4), 413–424 (1999) Zhu, C.: A generalized Morse index theorem. In: Analysis, Geometry and Topology of Elliptic Operators, Hackensack, NJ: World Sci. Publ., 2006, pp. 493–540

Communicated by G. Gallavotti

Commun. Math. Phys. 290, 779–788 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0830-4


A Penrose-Like Inequality for General Initial Data Sets

Marcus A. Khuri

Department of Mathematics, Stony Brook University, Stony Brook, NY 11794, USA. E-mail: [email protected]

Received: 8 September 2008 / Accepted: 6 March 2009
Published online: 12 May 2009 – © Springer-Verlag 2009

Abstract: We establish a Penrose-Like Inequality for general (not necessarily time symmetric) initial data sets of the Einstein equations which satisfy the dominant energy condition. More precisely, it is shown that the ADM energy is bounded below by an expression which is proportional to the square root of the area of the outermost future (or past) apparent horizon.

1. Introduction

In an attempt to find a counterexample for his Cosmic Censorship Conjecture, R. Penrose [12] proposed a necessary condition for its validity, in the form of an inequality relating the ADM mass and area of any event horizon in an asymptotically flat spacetime:

\[ \text{Mass} \ge \sqrt{\frac{\text{Area}}{16\pi}}. \tag{1.1} \]

Unfortunately this Penrose Inequality can only be proven with knowledge of the full spacetime development, as otherwise it would not be possible to locate the event horizon in a given spacelike slice. Thus it is customary to reformulate (1.1) so that the quantities involved may be calculated solely from local information, namely initial data sets for the Einstein equations. By an initial data set we are referring to a triple $(M, g, k)$, consisting of a Riemannian 3-manifold $M$ with metric $g$ and a symmetric 2-tensor $k$ representing the extrinsic curvature of a spacelike slice. These data are required to satisfy the constraint equations

\[ 16\pi\mu = R + (\mathrm{Tr}_g k)^2 - |k|^2, \qquad 8\pi J_i = \nabla^j\bigl(k_{ij} - (\mathrm{Tr}_g k)\,g_{ij}\bigr), \]

The author is partially supported by NSF Grant DMS-0707086 and a Sloan Research Fellowship.


where $R$ is the scalar curvature and $\mu$, $J$ are respectively the energy and momentum densities for the matter fields. If all measured energy densities are nonnegative then $\mu \ge |J|$, which will be referred to as the dominant energy condition. Moreover the initial data set will be taken to be asymptotically flat (with one end), so that at spatial infinity the metric and extrinsic curvature satisfy the following fall-off conditions:

\[ |\partial^l (g_{ij} - \delta_{ij})| = O(r^{-l-1}), \qquad |\partial^l k_{ij}| = O(r^{-l-2}), \qquad l = 0, 1, 2, \quad \text{as } r \to \infty. \]

The ADM energy and momentum are then well defined by

\[ E = \lim_{r\to\infty} \frac{1}{16\pi} \int_{S_r} (\partial_i g_{ij} - \partial_j g_{ii})\,\nu^j, \qquad P_i = \lim_{r\to\infty} \frac{1}{8\pi} \int_{S_r} \bigl(k_{ij} - (\mathrm{Tr}_g k)\,g_{ij}\bigr)\nu^j, \]

where $S_r$ are coordinate spheres in the asymptotic end with unit outward normal $\nu$.

The strength of the gravitational field in the vicinity of a 2-surface $\Sigma \subset M$ may be measured by the null expansions $\theta_\pm := H \pm \mathrm{Tr}_\Sigma k$, where $H$ is the mean curvature with respect to the unit outward normal (pointing towards spatial infinity). The null expansions measure the rate of change of area for a shell of light emitted by the surface in the outward future direction ($\theta_+$), and outward past direction ($\theta_-$). Thus the gravitational field is interpreted as being strong near $\Sigma$ if $\theta_+ < 0$ or $\theta_- < 0$, in which case $\Sigma$ is referred to as a future (past) trapped surface. Future (past) apparent horizons arise as boundaries of future (past) trapped regions and satisfy the equation $\theta_+ = 0$ ($\theta_- = 0$). In the setting of the initial data set formulation of the Penrose Inequality, apparent horizons take the place of event horizons, in that the area of the event horizon is replaced by the area of the outermost apparent horizon (or in some formulations by the least area required to enclose an apparent horizon).

The Penrose Inequality has been established by Huisken & Ilmanen [8] and by Bray [2] in the time symmetric case, that is when $k = 0$. At the present time the conjecture for arbitrary initial data sets remains open; however, recently Bray and the author [3] have succeeded in reducing this problem to the question of existence for a canonical system of partial differential equations. The purpose here is to establish a Penrose-Like Inequality for arbitrary initial data satisfying the dominant energy condition. This new inequality will generalize the following one obtained by Herzlich [7] in the time symmetric case.

Theorem 1.1. Let $(M, g)$ be a 3-dimensional asymptotically flat Riemannian manifold with nonnegative scalar curvature, and boundary consisting of a minimal 2-sphere with area $|\partial M|$. Then the ADM energy satisfies

\[ E(g) \ge \frac{\sigma}{2(1+\sigma)} \sqrt{\frac{|\partial M|}{\pi}}, \]

where

\[ \sigma = \sqrt{\frac{|\partial M|}{\pi}}\; \inf_{\substack{v \in C_c^\infty \\ v \not\equiv 0}} \frac{\|\nabla v\|^2_{L^2(M)}}{\|v\|^2_{L^2(\partial M)}}. \]

Furthermore equality holds if and only if $(M, g)$ is a portion of the $t = 0$ slice of the Schwarzschild spacetime with mass $\sqrt{|\partial M|/16\pi}$.
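As an illustrative check (not part of the original argument), one can take the Schwarzschild value of the horizon area, $|\partial M| = 16\pi m^2$, as an assumption. The computation below sketches that (1.1) is then saturated and that, read at face value, equality in Theorem 1.1 corresponds to $\sigma = 1$.

```latex
% Assumption: the horizon of the t = 0 Schwarzschild slice of mass m has area 16*pi*m^2.
% Requires amsmath for align*.
\begin{align*}
\sqrt{\frac{|\partial M|}{16\pi}} &= \sqrt{\frac{16\pi m^2}{16\pi}} = m, \\
\frac{\sigma}{2(1+\sigma)}\sqrt{\frac{|\partial M|}{\pi}}
  &= \frac{\sigma}{2(1+\sigma)}\cdot 4m = \frac{2\sigma}{1+\sigma}\,m, \\
\frac{2\sigma}{1+\sigma}\,m = m \quad &\Longleftrightarrow \quad \sigma = 1 .
\end{align*}
```

Since $\sigma/(2(1+\sigma)) < 1/2$ for every finite $\sigma$, the lower bound of Theorem 1.1 always stays below $2\sqrt{|\partial M|/16\pi}$.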


A useful device for extending results in the time symmetric case to the general case is Jang's deformation [9] of the initial data, which was successfully employed by Schoen and Yau [13] in their proof of the Positive Energy Theorem. In their application, special solutions of Jang's equation which exhibit blow-up behavior at apparent horizons played an integral role, and for some time it has been suggested that these solutions may be helpful in studying the Penrose Inequality (see [10] for some problems that can occur with this approach). For this to be the case, solutions which blow up at a given apparent horizon must always be shown to exist. In fact such a theorem has recently been established by Metzger in [11]. More precisely Metzger has shown that given an initial data set containing an outermost future (or past) apparent horizon, there exists a smooth solution of Jang's equation outside of the outermost apparent horizon which blows up to $+\infty$ ($-\infty$) in the form of a cylinder over the horizon, and vanishes at spatial infinity. Here an outermost future (past) apparent horizon refers to a future (past) apparent horizon outside of which there is no other apparent horizon; such a horizon may have several components. We will denote the Jang surface associated with the given blow-up solution of Jang's equation by $(\overline{M}, \overline{g})$, and its connection by $\overline{\nabla}$. We will show

Theorem 1.2. Let $(M, g, k)$ be an asymptotically flat initial data set for the Einstein equations satisfying the dominant energy condition $\mu \ge |J|$. If the boundary consists of an outermost future (past) apparent horizon with components of area $|\partial_i M|$, $i = 1, \ldots, n$, then the ADM energy satisfies

\[ E(g) \ge \frac{\sigma}{2(1+\sigma)} \sum_{i=1}^n \sqrt{\frac{|\partial_i M|}{\pi}}, \]

where

\[ \sigma = \Bigl( \sum_{i=1}^n \sqrt{4\pi |\partial_i M|} \Bigr)^{-1} \inf \|\overline{\nabla} v\|^2_{L^2(\overline{M})}, \]

with the infimum taken over all $v \in C^\infty(\overline{M})$ such that $v(x) \to 0$ as $x \to \partial M$ and $v(x) \to 1$ as $|x| \to \infty$.

Remark 1.3. Although the hypotheses require a boundary consisting entirely of future or entirely of past apparent horizons, our proof gives a bit more. Namely when both types are present the same result holds, where $\{\partial_i M\}_{i=1}^n$ consists entirely of future or entirely of past apparent horizons.

An important point to note concerning Theorem 1.2 is that the case of equality is not considered. The reasons for this are the following. First the Jang equation is designed to embed the initial data into Minkowski space if equality were to occur (as is done in the Positive Energy Theorem), and so there is no chance of obtaining an embedding into the Schwarzschild spacetime in this situation, as the Penrose Inequality demands. Moreover, it will in fact be shown that the case of equality can never be achieved. This implies that the current result is not optimal (unlike Theorem 1.1), and suggests that there may be a better choice of boundary conditions for the Jang equation which does yield an optimal result. Another point to note is that the constant $\sigma$ is dimensionless, and so is actually independent of the area of the boundary $\partial M$. Furthermore $\sigma$ never vanishes, and therefore Theorem 1.2 does give a positive lower bound for the ADM mass in terms of the area


of the apparent horizon, which is consistent with the spirit of the Penrose Inequality. Moreover the theorem may be generalized to the setting of initial data containing a trapped surface, to give a positive lower bound for the ADM mass in terms of the least area required to enclose the trapped surface. To see this recall that Andersson and Metzger [1], and Eichmair [4], have shown that the existence of a future (past) trapped surface in an asymptotically flat initial data set implies the existence of an outermost future (past) apparent horizon. One may then apply Theorem 1.2 to obtain the desired result.

The proof of Theorem 1.2 closely follows that of Theorem 1.1. The main difference, or new idea, is to employ blow-up solutions for Jang's equation in an appropriate way. However the argument still relies on the following version of the Positive Energy Theorem due to Herzlich.

Theorem 1.4 [7]. Let $(M, g)$ be a 3-dimensional asymptotically flat Riemannian manifold with nonnegative scalar curvature. If the boundary $\partial M$ consists of $n$ components having spherical topology and mean curvature (calculated with respect to the normal pointing inside $M$) satisfying $H_{\partial_i M} \le \sqrt{16\pi/|\partial_i M|}$, $1 \le i \le n$, then $E(g) \ge 0$ and when equality occurs $g$ is flat.

Remark 1.5. The statement of Herzlich's original theorem only allowed the boundary $\partial M$ to have one component. However the same spinor proof may easily be extended to allow for finitely many components as in Theorem 1.4.

In continuing with the outline of proof for Theorem 1.2, there are three primary steps. The first is to deform the initial data by constructing a blow-up solution of the Jang equation, which as mentioned above has already been established. This deformation yields a positivity property for the scalar curvature of the Jang metric $\overline{g}$. The next step entails cutting off the cylindrical ends of the blown-up Jang surface at a height $T$ to obtain a manifold with boundary $\overline{M}_T$, and then making a conformal deformation $(\overline{M}_T, \widetilde{g}_T := u_T^4\,\overline{g})$ to obtain a manifold with zero scalar curvature and with each boundary component satisfying $\widetilde{H}_{\partial_i \overline{M}_T} = \sqrt{16\pi/|\partial_i \overline{M}_T|_{\widetilde{g}_T}}$, $1 \le i \le n$. Existence of a conformal factor $u_T$ satisfying these properties will be established by a variational argument, which heavily depends on the positivity property for the scalar curvature of the Jang metric as well as the blow-up behavior of the Jang surface at the horizon. One may then undertake the last step, which consists of applying Theorem 1.4 to obtain $E(\widetilde{g}_T) \ge 0$. The desired lower bound for $E(g) = E(\overline{g})$ is then produced by estimating the difference $E(\overline{g}) - E(\widetilde{g}_T)$ and letting $T \to \infty$.

2. The Jang Surface

The goal of this section is to give a precise description of the blow-up solution to the Jang equation, as well as to record certain qualitative properties of the resulting Jang surface. Let us first recall some basic facts. The Jang surface $\overline{M}$ is given by a graph $t = f(x)$ in the product manifold $(M \times \mathbb{R}, g + dt^2)$, and so has induced metric $\overline{g} = g + df^2$. The function $f$ is required to satisfy the Jang equation:

\[ \overline{g}^{ij} \left( \frac{\nabla_{ij} f}{\sqrt{1 + |\nabla_g f|^2}} - k_{ij} \right) = 0. \tag{2.1} \]


Here $\nabla_{ij}$ denote second covariant derivatives with respect to $g$ and

\[ \overline{g}^{ij} = g^{ij} - \frac{f^i f^j}{1 + |\nabla_g f|^2} \]

is the inverse matrix for $\overline{g}_{ij}$ with $f^i = g^{ij}\nabla_j f$, and therefore Jang's equation simply asserts that the mean curvature of the graph is equal to the trace of $k$ over the graph (assuming that the tensor $k$ has been extended trivially to all of $M \times \mathbb{R}$). The motivation for solving Jang's equation is to obtain a positivity property for the scalar curvature of the Jang surface. In particular, if $f$ satisfies Eq. (2.1) then the scalar curvature of $\overline{g}$ has the following expression (see [13]):

\[ \overline{R} = 16\pi(\mu - J(w)) + |h - k|^2_{\overline{g}} + 2|q|^2_{\overline{g}} - 2\,\mathrm{div}_{\overline{g}}(q), \tag{2.2} \]

where

\[ w_i = \frac{\nabla_i f}{\sqrt{1 + |\nabla_g f|^2}}, \qquad q_i = \frac{f^j}{\sqrt{1 + |\nabla_g f|^2}}\,(h_{ij} - k_{ij}), \]

and $h$ is the second fundamental form of $\overline{M}$. In addition to the positivity property for the scalar curvature, we will require the Jang surface to exhibit blow-up behavior at $\partial M$ in order to construct the conformal factor described in the introduction. It turns out that such a solution always exists as long as $\partial M$ is an outermost horizon.

Theorem 2.1 [11]. Suppose that $\partial M$ is an outermost future (past) apparent horizon. Then there exists an open set $\Omega \subset M$ (with $(M - \Omega) \cap \partial M = \emptyset$) and a smooth function $f : \Omega \to \mathbb{R}$ satisfying (2.1), such that $\partial\Omega - \partial M$ consists of past (future) apparent horizons, $\overline{M} = \mathrm{graph}(f)$ is asymptotic to the cylinders $\partial M \times \mathbb{R}_+$ ($\partial M \times \mathbb{R}_-$) and $\partial\Omega \times \mathbb{R}_-$ ($\partial\Omega \times \mathbb{R}_+$), and $f(x) \to 0$ as $|x| \to \infty$.

This theorem yields the desired blow-up behavior at $\partial M$ with the added feature that blow-up may occur elsewhere at $\partial\Omega - \partial M$ as well, if $M$ contains apparent horizons of the other type (with respect to $\partial M$). However the hypotheses of Theorem 1.2 do not allow for such extra horizons, so that in fact $\Omega = M$. We remark that the sole reason for prohibiting extra horizons in the initial data is to ensure that each component of $\partial\Omega$ has spherical topology, which is needed when applying the Positive Energy Theorem, Theorem 1.4. Thus one could allow apparent horizons of the other type, if they have spherical topology.

The other goal of this section is to record the decay rate for certain geometric quantities associated with the Jang surface. Since the solution of Jang's equation blows up at $\partial M$ in the form of a cylinder, in a neighborhood of each boundary component the Jang surface may be foliated by the level sets $t = f(x)$, which we denote by $\overline{\Sigma}_t$. Similarly in a neighborhood of each boundary component, $M$ may be foliated by the projection of the level sets $\overline{\Sigma}_t$ onto $M$, which we denote by $\Sigma_t$. We can then introduce coordinates $(r, \xi^2, \xi^3)$ in such a neighborhood of each component, where $r = |t|^{-1}$ and $\xi^2$, $\xi^3$ are coordinates on a 2-sphere. Note that as $r \to 0$ the projections $\Sigma_r$ converge to their associated component of $\partial M$. Furthermore the $r$-coordinate may be chosen orthogonal to its level sets, so that the initial data metric takes the form

\[ g = g_{11}\,dr^2 + \sum_{i,j=2}^{3} g_{ij}\,d\xi^i\,d\xi^j. \]
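As a small consistency check on (2.1) and (2.2), consider the time symmetric situation. The sketch below assumes $k = 0$ and uses the trivial solution $f \equiv 0$ of (2.1); this is not the blow-up solution of Theorem 2.1, only an illustration of how the scalar curvature formula reduces to the constraint equation.

```latex
% Assumptions (illustrative only): k = 0 and f \equiv 0, which solves (2.1) trivially.
% Requires amsmath for align*.
\begin{align*}
f \equiv 0,\ k = 0 \;&\Longrightarrow\;
  \overline{g} = g + df^2 = g,\quad h = 0,\quad w = 0,\quad q = 0, \\
\text{(2.2)} \;&\Longrightarrow\; \overline{R} = 16\pi\mu, \\
\text{Hamiltonian constraint with } k = 0 \;&\Longrightarrow\; 16\pi\mu = R .
\end{align*}
```

Both lines give $\overline{R} = R$, as they must when the Jang surface coincides with $M$.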


Lemma 2.2. Consider the level sets $\overline{\Sigma}_r$ of the blown-up Jang surface near a component of $\partial M$. If $\overline{H}_{\overline{\Sigma}_r}$ denotes the mean curvature of $\overline{\Sigma}_r$ with respect to the inward pointing (towards spatial infinity) normal $\overline{N}$, then $\overline{H}_{\overline{\Sigma}_r} - q(\overline{N}) \to 0$ as $r \to 0$.

Proof. A calculation in [14] (p. 10) shows that

\[ \overline{H}_{\overline{\Sigma}_r} - q(\overline{N}) = \sqrt{1 + |\nabla_g f|^2}\,\bigl(H_{\Sigma_r} \pm \mathrm{Tr}_{\Sigma_r} k\bigr) \mp \frac{\mathrm{Tr}_{\Sigma_r} k}{|\nabla_g f| + \sqrt{1 + |\nabla_g f|^2}}, \]

where $H_{\Sigma_r}$ is the mean curvature of $\Sigma_r$, $\mathrm{Tr}_{\Sigma_r} k$ is the trace over $\Sigma_r$, and $+$ ($-$) is chosen depending on whether the particular component of $\partial M$ in question is a future (past) horizon respectively. From this expression we see that it is enough to show that the first term on the right-hand side approaches zero as $r \to 0$. Fortunately this same expression appears in the Jang equation, and yields the desired result. To see this, write the Jang equation in the coordinates $(r, \xi^2, \xi^3)$ to obtain

\[ \frac{g^{11}}{1 + g^{11} f_{,r}^2}\bigl(f_{,rr} - \Gamma^1_{11} f_{,r}\bigr) - \sum_{i,j=2}^{3} g^{ij}\,\Gamma^1_{ij}\, f_{,r} = \sqrt{1 + g^{11} f_{,r}^2} \left( \frac{g^{11}}{1 + g^{11} f_{,r}^2}\,k_{11} + \sum_{i,j=2}^{3} g^{ij} k_{ij} \right), \]

where $f_{,r}$, $f_{,rr}$ are partial derivatives and $\Gamma^1_{ij}$ are Christoffel symbols for $g$ given by

\[ \Gamma^1_{11} = \frac{1}{2}\, g^{11}\,\partial_r g_{11}, \qquad \Gamma^1_{ij} = -\sqrt{g^{11}}\; h_{ij}, \qquad 2 \le i, j \le 3, \]

with $h_{ij}$ denoting the second fundamental form of $\Sigma_r$. It follows that

\[ \sqrt{1 + |\nabla_g f|^2}\,\bigl(H_{\Sigma_r} \pm \mathrm{Tr}_{\Sigma_r} k\bigr) = \pm \frac{g^{11} f_{,rr}}{1 + g^{11} f_{,r}^2} + O\!\left( \frac{1}{\sqrt{1 + |\nabla_g f|^2}} \right). \]

Lastly we observe that by definition of the coordinate $r$, $f(r) = \pm r^{-1}$, and therefore

\[ \sqrt{1 + |\nabla_g f|^2}\,\bigl(H_{\Sigma_r} \pm \mathrm{Tr}_{\Sigma_r} k\bigr) = O(r) \quad \text{as } r \to 0. \]

Lemma 2.3. The solution of Jang's equation satisfies the following fall-off condition at spatial infinity:

\[ |\nabla^l f|(x) = O\bigl(|x|^{-\frac{1}{2} - l}\bigr) \quad \text{as } |x| \to \infty, \qquad l = 0, 1, 2. \]

In particular, the energy of the Jang metric $\overline{g}$ equals the energy of $g$.

Proof. See Schoen and Yau [13].
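The last step in the proof of Lemma 2.2 can be made explicit as follows. This is a sketch under the assumption that $g^{11}$ stays bounded above and below by positive constants near the boundary component (a hypothesis introduced here for illustration), with $f = \pm r^{-1}$ as in the text.

```latex
% Assumption: 0 < c_1 \le g^{11} \le c_2 near the boundary component; f(r) = \pm r^{-1}.
% Requires amsmath for align*.
\begin{align*}
f_{,r} &= \mp r^{-2}, \qquad f_{,rr} = \pm 2 r^{-3}, \qquad
  |\nabla_g f|^2 = g^{11} f_{,r}^2 = g^{11} r^{-4}, \\
\frac{g^{11} f_{,rr}}{1 + g^{11} f_{,r}^2}
  &= \pm\frac{2 g^{11} r^{-3}}{1 + g^{11} r^{-4}}
   = \pm\frac{2 g^{11}\, r}{r^{4} + g^{11}} = O(r), \\
\frac{1}{\sqrt{1 + |\nabla_g f|^2}}
  &= \frac{r^{2}}{\sqrt{r^{4} + g^{11}}} = O(r^{2}).
\end{align*}
```

Both contributions therefore vanish as $r \to 0$, the first one at the rate $O(r)$ quoted in the proof.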


3. The Conformal Factor

In this section we will complete the last preliminary step before application of the Positive Energy Theorem. Namely we will conformally deform the Jang metric to zero scalar curvature on a portion of the Jang surface, while at the same time prescribing the mean curvature of its boundary. The region to be considered consists of the portion of the Jang surface lying between the horizontal planes $t = \pm T$, and will be denoted by $\overline{M}_T$. We then search for a conformal factor $u_T$ satisfying the following boundary value problem:

\[
\begin{aligned}
&\overline{\Delta} u_T - \tfrac{1}{8}\,\overline{R}\,u_T = 0 \quad \text{on } \overline{M}_T, \\
&\partial_{\overline{\nu}} u_T + \tfrac{1}{4}\,\overline{H}_{\partial_i \overline{M}_T}\,u_T = \tfrac{1}{4}\sqrt{\frac{16\pi}{|\partial_i \overline{M}_T|_{\widetilde{g}_T}}}\;u_T^3 \quad \text{on each } i\text{th component of } \partial\overline{M}_T, \\
&u_T = 1 + \frac{A_T}{|x|} + O(|x|^{-2}) \quad \text{as } |x| \to \infty,
\end{aligned}
\tag{3.1}
\]

where $A_T$ is a constant and $\overline{H}_{\partial_i \overline{M}_T}$ denotes mean curvature with respect to the unit inward normal $\overline{\nu}$ (pointing inside $\overline{M}_T$). This ensures that $(\overline{M}_T, \widetilde{g}_T := u_T^4\,\overline{g})$ has zero scalar curvature $\widetilde{R}_{\widetilde{g}_T} \equiv 0$ and mean curvature on each $i$th component of $\partial\overline{M}_T$ given by $\widetilde{H}_{\partial_i \overline{M}_T} = \sqrt{16\pi/|\partial_i \overline{M}_T|_{\widetilde{g}_T}}$. These two properties, combined with the fact that each component of $\partial M$ must have spherical topology ([5,6]), then guarantee that Theorem 1.4 is applicable.

We shall use a variational argument, just as in [7], to construct $u_T := 1 + v_T$. In this regard observe that boundary value problem (3.1) arises as the Euler-Lagrange equation for the functional

\[ Q(v) = \frac{1}{2}\int_{\overline{M}_T} \Bigl( |\overline{\nabla} v|^2 + \frac{1}{8}\,\overline{R}\,(1+v)^2 \Bigr) + \frac{\sqrt{\pi}}{2} \Bigl( \int_{\partial\overline{M}_T} (1+v)^4 \Bigr)^{1/2} - \frac{1}{8}\int_{\partial\overline{M}_T} \overline{H}_{\partial\overline{M}_T}\,(1+v)^2. \]

More precisely, we will search for a global minimum over the weighted Sobolev space

\[ W^{1,2}_{-1}(\overline{M}_T) = \bigl\{ v \in W^{1,2}_{loc}(\overline{M}_T) \;\big|\; |x|^{l-1}\,\overline{\nabla}^l v \in L^2(\overline{M}_T),\ l = 0, 1 \bigr\}. \]

Theorem 3.1. Given $T > 0$ sufficiently large, there exists a function $v_T \in W^{1,2}_{-1}(\overline{M}_T) \cap C^\infty(\overline{M}_T)$ at which $Q$ attains a global minimum. Moreover $u_T = 1 + v_T$ never vanishes and satisfies the asymptotic behavior in (3.1).
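The claim that (3.1) produces zero scalar curvature and the prescribed boundary mean curvature can be checked against the standard conformal transformation laws in dimension three. The sketch below assumes $u > 0$ and the convention that mean curvature is the trace of the second fundamental form taken with respect to the same normal $\overline{\nu}$ that appears in (3.1); with another sign convention the second formula changes sign accordingly.

```latex
% Assumptions: dimension 3, \widetilde{g} = u^4 \overline{g}, u > 0,
% mean curvature = divergence of the unit normal \overline{\nu} along the boundary.
% Requires amsmath for align*.
\begin{align*}
\widetilde{R} &= u^{-5}\bigl(\overline{R}\,u - 8\,\overline{\Delta}u\bigr)
  = -8\,u^{-5}\Bigl(\overline{\Delta}u - \tfrac18\,\overline{R}\,u\Bigr), \\
\widetilde{H} &= u^{-3}\bigl(\overline{H}\,u + 4\,\partial_{\overline{\nu}}u\bigr)
  = 4\,u^{-3}\Bigl(\partial_{\overline{\nu}}u + \tfrac14\,\overline{H}\,u\Bigr).
\end{align*}
% Inserting the first two lines of (3.1):
%   \widetilde{R} = 0   and   \widetilde{H} = \sqrt{16\pi/|\partial_i\overline{M}_T|_{\widetilde{g}_T}} .
```

The cubic power $u_T^3$ on the right-hand side of the boundary condition is exactly what cancels the factor $u^{-3}$ in the transformation law for the mean curvature.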

Proof. In order to establish the existence (as well as the regularity and asymptotic behavior) portion of this theorem it is enough, by the arguments of [7], to show that for $T$ sufficiently large the functional $Q$ is nonnegative. To see this use formula (2.2) and integrate the divergence term by parts to find that for any $v \in W^{1,2}_{-1}(\overline{M}_T)$,

\[ Q(v) \ge \int_{\overline{M}_T} \Bigl( \frac{3}{8}\,|\overline{\nabla} v|^2 + \pi(\mu - |J|)(1+v)^2 \Bigr) + \frac{\sqrt{\pi}}{2} \Bigl( \int_{\partial\overline{M}_T} (1+v)^4 \Bigr)^{1/2} - \frac{1}{8}\int_{\partial\overline{M}_T} \bigl(\overline{H}_{\partial\overline{M}_T} - q(\overline{N})\bigr)(1+v)^2. \tag{3.2} \]
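The coefficient $3/8$ in (3.2) comes from absorbing the cross term produced by the integration by parts. The following is a sketch of that step, assuming $u = 1 + v$, that the flux of $q$ through large coordinate spheres vanishes in the limit, that $|w|_g < 1$, and that $\overline{N}$ is the unit normal of $\partial\overline{M}_T$ pointing into $\overline{M}_T$.

```latex
% Sketch under the assumptions stated in the lead-in; q(\cdot) denotes the pairing of the
% 1-form q with a vector field. Requires amsmath for align*.
\begin{align*}
\tfrac{1}{16}\,\overline{R}\,u^2
  &= \pi(\mu - J(w))\,u^2 + \tfrac{1}{16}|h-k|^2_{\overline{g}}\,u^2
     + \tfrac18\,|q|^2_{\overline{g}}\,u^2 - \tfrac18\,\operatorname{div}_{\overline{g}}(q)\,u^2, \\
-\tfrac18\int_{\overline{M}_T}\operatorname{div}_{\overline{g}}(q)\,u^2
  &= \tfrac14\int_{\overline{M}_T} u\,q(\overline{\nabla}v)
     + \tfrac18\int_{\partial\overline{M}_T} q(\overline{N})\,u^2, \\
\tfrac18\,|q|^2_{\overline{g}}\,u^2 + \tfrac14\,u\,q(\overline{\nabla}v)
  + \tfrac18\,|\overline{\nabla}v|^2
  &= \tfrac18\,\bigl|\,u\,q + \overline{\nabla}v\,\bigr|^2_{\overline{g}} \;\ge\; 0 .
\end{align*}
% Hence the q-terms cost at most \tfrac18|\overline{\nabla}v|^2, leaving
% (\tfrac12 - \tfrac18)|\overline{\nabla}v|^2 = \tfrac38|\overline{\nabla}v|^2,
% while J(w) \le |J| and |h-k|^2 \ge 0 give the term \pi(\mu - |J|)(1+v)^2.
```

The boundary contribution $\tfrac18\int q(\overline{N})u^2$ combines with the mean curvature term of $Q$ into the last integral of (3.2).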


By Lemma 2.2, $\overline{H}_{\partial\overline{M}_T} - q(\overline{N}) = O(T^{-1})$, and a calculation shows that the area of $\partial\overline{M}_T$ agrees with the area of $\Sigma_T \subset M$, which remains bounded as $T \to \infty$. It then follows from Jensen's Inequality,

\[ \Bigl( \int_{\partial\overline{M}_T} (1+v)^2 \Bigr)^2 \le |\partial\overline{M}_T| \int_{\partial\overline{M}_T} (1+v)^4, \]

that for $T$ sufficiently large $Q$ is nonnegative.

It remains to show that $u_T = 1 + v_T$ is strictly positive. So suppose that $u_T$ is not positive and let $D_-$ be the domain on which $u_T < 0$. Since $u_T \to 1$ as $|x| \to \infty$, the closure of $D_-$ must be compact. Now multiply Eq. (3.1) through by $u_T$ and integrate by parts to obtain

\[ \int_{D_-} |\overline{\nabla} u_T|^2 \le 0. \]

Note that if $D_- \cap \partial\overline{M}_T \ne \emptyset$, then the same arguments used above to show that $Q$ is nonnegative must be employed. It follows that $u_T \ge 0$. To show that $u_T > 0$, one need only apply Hopf's Maximum Principle (the boundary condition of (3.1) must be used to obtain this conclusion at $\partial\overline{M}_T$).

4. Proof of Theorem 1.2

Here we shall carry out the last step in the proof of Theorem 1.2, namely to apply the Positive Energy Theorem and to compare the two energies $E(\overline{g})$ and $E(\widetilde{g}_T)$. Observe that all the hypotheses of Theorem 1.4 are satisfied by $(\overline{M}_T, \widetilde{g}_T)$ so that $E(\widetilde{g}_T) \ge 0$. Therefore a straightforward calculation yields

\[ E(g) \ge E(\overline{g}) - E(\widetilde{g}_T) = \frac{1}{2\pi} \lim_{r\to\infty} \int_{|x|=r} \partial_\nu u_T. \tag{4.1} \]

Furthermore upon integrating by parts and using boundary value problem (3.1) we obtain

\[ \lim_{r\to\infty} \int_{|x|=r} \partial_\nu u_T = \lim_{r\to\infty} \int_{|x|=r} u_T\,\partial_\nu u_T = 2Q(v_T). \tag{4.2} \]
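The "straightforward calculation" behind (4.1) can be illustrated with the asymptotic expansion in (3.1). The sketch below assumes that on large coordinate spheres $\nu$ is the radial direction $\partial_r$ and that the expansion of $u_T$ may be differentiated term by term; both are assumptions made here for illustration.

```latex
% Assumptions: \nu = \partial_r on |x| = r, and the O(|x|^{-2}) remainder has O(r^{-3}) derivative.
% Requires amsmath for align*.
\begin{align*}
u_T &= 1 + \frac{A_T}{|x|} + O(|x|^{-2})
  \;\Longrightarrow\; \partial_r u_T = -\frac{A_T}{r^{2}} + O(r^{-3}), \\
\int_{|x|=r} \partial_r u_T
  &= 4\pi r^{2}\Bigl(-\frac{A_T}{r^{2}}\Bigr) + O(r^{-1})
  \;\longrightarrow\; -4\pi A_T \quad (r \to \infty), \\
E(\overline{g}) - E(\widetilde{g}_T) &= \frac{1}{2\pi}\,(-4\pi A_T) = -2A_T .
\end{align*}
% Normalization check: u = 1 + m/(2|x|) on flat R^3 (energy 0) gives the Schwarzschild
% metric u^4 \delta of energy m, i.e. the energy changes by +2A with A = m/2.
```

Under these conventions (4.2) says $-4\pi A_T = 2Q(v_T) \ge 0$, so $A_T \le 0$, in line with a conformal factor that lies below 1 near the horizon.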

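Now suppose that $Q(v_T) \le \eta \sum_{i=1}^n \sqrt{\pi\,|\partial_i \overline{M}_T|}$ for some positive constant $\eta$, where $n$ denotes the number of components comprising $\partial M$. Then integrating by parts, and using arguments such as those found in the proof of Theorem 3.1, shows that there exists a constant $C > 0$ independent of $T$ such that

\[ \frac{3}{8}\int_{\overline{M}_T} |\overline{\nabla} v_T|^2 + \frac{1 - CT^{-1}}{2} \sum_{i=1}^n \sqrt{\frac{\pi}{|\partial_i \overline{M}_T|}} \int_{\partial_i \overline{M}_T} (1+v_T)^2 \le \eta \sum_{i=1}^n \sqrt{\pi\,|\partial_i \overline{M}_T|}. \]

However by Young's Inequality,

\[ (1+v_T)^2 \ge 1 - \frac{1}{\delta} + (1-\delta)\,v_T^2 \]

for any $\delta > 0$, and therefore

\[ \frac{3}{8}\int_{\overline{M}_T} |\overline{\nabla} v_T|^2 + (1-\delta)\,\frac{1 - CT^{-1}}{2} \sum_{i=1}^n \sqrt{\frac{\pi}{|\partial_i \overline{M}_T|}} \int_{\partial_i \overline{M}_T} v_T^2 \le \Bigl( \eta - \frac{1}{2}(1-\delta^{-1})(1 - CT^{-1}) \Bigr) \sum_{i=1}^n \sqrt{\pi\,|\partial_i \overline{M}_T|}. \]

It follows that the left-hand side is nonnegative if $\delta - 1 \le \sigma_T$, where

\[ \sigma_T = \frac{\int_{\overline{M}_T} |\overline{\nabla} v_T|^2}{2(1 - CT^{-1}) \sum_{i=1}^n \sqrt{\dfrac{\pi}{|\partial_i \overline{M}_T|}} \displaystyle\int_{\partial_i \overline{M}_T} v_T^2}, \]

so that $\eta \ge \delta^{-1}(\delta - 1)(1 - CT^{-1})/2$ for all such $\delta$. In particular by choosing $\delta = 1 + \sigma_T$ we conclude that

\[ Q(v_T) \ge \frac{\sigma_T (1 - CT^{-1})}{2(1+\sigma_T)} \sum_{i=1}^n \sqrt{\pi\,|\partial_i \overline{M}_T|}. \tag{4.3} \]

Furthermore combining (4.1), (4.2), and (4.3) produces

\[ E(g) \ge \frac{\sigma_T (1 - CT^{-1})}{2(1+\sigma_T)} \sum_{i=1}^n \sqrt{\frac{|\partial_i \overline{M}_T|}{\pi}}. \tag{4.4} \]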
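Two elementary steps in the passage from the assumed upper bound on $Q(v_T)$ to (4.3) and (4.4) can be spelled out. The sketch below uses only a completed square and the choice $\delta = 1 + \sigma_T$ made in the text; the symbol $v$ stands for the value of $v_T$ at a point.

```latex
% Young's inequality in the form used above, then the choice \delta = 1 + \sigma_T.
% Requires amsmath for align*.
\begin{align*}
0 \le \Bigl(\sqrt{\delta}\,v + \tfrac{1}{\sqrt{\delta}}\Bigr)^2
  = \delta v^2 + 2v + \tfrac{1}{\delta}
  \;&\Longrightarrow\; 2v \ge -\delta v^2 - \tfrac{1}{\delta}, \\
(1+v)^2 = 1 + 2v + v^2 \;&\ge\; 1 - \tfrac{1}{\delta} + (1-\delta)\,v^2, \\
\delta = 1 + \sigma_T \;&\Longrightarrow\;
  \frac{\delta^{-1}(\delta-1)}{2} = \frac{\sigma_T}{2(1+\sigma_T)}, \\
\frac{1}{2\pi}\cdot 2\,Q(v_T) = \frac{Q(v_T)}{\pi}, \qquad
  \frac{\sqrt{\pi\,|\partial_i\overline{M}_T|}}{\pi}
  &= \sqrt{\frac{|\partial_i\overline{M}_T|}{\pi}} .
\end{align*}
```

The last line is the bookkeeping that turns the factor $\sqrt{\pi|\partial_i\overline{M}_T|}$ of (4.3) into the factor $\sqrt{|\partial_i\overline{M}_T|/\pi}$ of (4.4).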

The desired inequality of Theorem 1.2 may be obtained from (4.4) by letting $T \to \infty$. To see this we observe that (4.1), (4.2), and (3.2) together show that the sequence of functions $\{u_T\}$ is uniformly bounded in $W^{1,2}_{loc}(\overline{M})$. Thus with the help of elliptic estimates and Sobolev embeddings, this sequence converges on compact subsets to a smooth uniformly bounded solution $u_\infty$ of

\[ \overline{\Delta} u_\infty - \frac{1}{8}\,\overline{R}\,u_\infty = 0 \quad \text{on } \overline{M}, \qquad u_\infty = 1 + \frac{A_\infty}{|x|} + O(|x|^{-2}) \quad \text{as } |x| \to \infty. \]

However since $\overline{M}$ approximates a cylinder on regions where it blows up, comparison with a bounded solution of the same equation on the cylinder (as is done in [13]) shows that $u_\infty(x) \to 0$ as $x \to \partial M$; in fact the decay rate is of exponential strength. Therefore (with a bit more effort) $\sigma_T \to \sigma_\infty \ge \sigma$ and $|\partial_i \overline{M}_T| \to |\partial_i M|$, $1 \le i \le n$, as $T \to \infty$. This completes the proof of Theorem 1.2.

Lastly we analyze what happens when equality occurs in Theorem 1.2. By slightly modifying the arguments of this section in this special case, we find that

\[ \int_{\overline{M}} |\overline{\nabla} u_\infty|^2 = 0, \]

and therefore $u_\infty$ must be constant. However this is impossible since $u_\infty(x) \to 1$ as $|x| \to \infty$ and $u_\infty(x) \to 0$ as $x \to \partial M$. We conclude that the case of equality cannot occur.


References

1. Andersson, L., Metzger, J.: The area of horizons and the trapped region. http://arxiv.org/abs/0708.4252v3 [gr-qc], 2007
2. Bray, H.: Proof of the Riemannian Penrose conjecture using the positive mass theorem. J. Diff. Geom. 59, 177–267 (2001)
3. Bray, H., Khuri, M.: PDE's which imply the Penrose conjecture. In preparation, 2008
4. Eichmair, M.: The plateau problem for apparent horizons. http://arxiv.org/abs/0711.4139v2 [math.DG], 2007
5. Galloway, G.: Rigidity of marginally trapped surfaces and the topology of black holes. Commun. Anal. Geom. 16, 217–229 (2008)
6. Galloway, G., Schoen, R.: A generalization of Hawking's black hole topology theorem to higher dimensions. Commun. Math. Phys. 266(2), 571–576 (2006)
7. Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1997)
8. Huisken, G., Ilmanen, T.: The inverse mean curvature flow and the Riemannian Penrose inequality. J. Diff. Geom. 59, 353–437 (2001)
9. Jang, P.-S.: On the positivity of energy in general relativity. J. Math. Phys. 19, 1152–1155 (1978)
10. Malec, E., Ó Murchadha, N.: The Jang equation, apparent horizons, and the Penrose inequality. Class. Q. Grav. 21, 5777–5787 (2004)
11. Metzger, J.: Blowup of Jang's equation at outermost marginally trapped surfaces. http://arxiv.org/abs/0711.4753v1 [gr-qc], 2007
12. Penrose, R.: Naked singularities. Ann. N. Y. Acad. Sci. 224, 125–134 (1973)
13. Schoen, R., Yau, S.-T.: Proof of the positive mass theorem II. Commun. Math. Phys. 79(2), 231–260 (1981)
14. Yau, S.-T.: Geometry of three manifolds and existence of black hole due to boundary effect. Adv. Theor. Math. Phys. 5(4), 755–767 (2001)

Communicated by P. T. Chruściel

Commun. Math. Phys. 290, 789–800 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0759-7


A Simple Proof of Hardy-Lieb-Thirring Inequalities

Rupert L. Frank

Department of Mathematics, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]

Received: 19 September 2008 / Accepted: 18 November 2008
Published online: 13 March 2009 – © The Author 2009

Abstract: We give a short and unified proof of Hardy-Lieb-Thirring inequalities for moments of eigenvalues of fractional Schrödinger operators. The proof covers the optimal parameter range. It is based on a recent inequality by Solovej, Sørensen, and Spitzer. Moreover, we prove that any non-magnetic Lieb-Thirring inequality implies a magnetic Lieb-Thirring inequality (with possibly a larger constant).

1. Introduction and main result

This paper is concerned with estimates on moments of negative eigenvalues of Schrödinger operators $(-\Delta)^s - C_{s,d}|x|^{-2s} - V$ in $L^2(\mathbb{R}^d)$ in terms of integrals of the potential $V$. Here

\[ C_{s,d} := 2^{2s}\,\frac{\Gamma((d+2s)/4)^2}{\Gamma((d-2s)/4)^2} \tag{1.1} \]

is the sharp constant in the Hardy inequality

\[ \int_{\mathbb{R}^d} |p|^{2s}\,|\hat{u}(p)|^2\,dp \ge C_{s,d} \int_{\mathbb{R}^d} |x|^{-2s}\,|u(x)|^2\,dx, \qquad u \in C_0^\infty(\mathbb{R}^d), \tag{1.2} \]

which is valid for $0 < s < d/2$ [He] and we write $\hat{u}(p) := (2\pi)^{-d/2} \int_{\mathbb{R}^d} u(x)\,e^{-ip\cdot x}\,dx$ for the Fourier transform of $u$. In [FrLiSe1] we have shown that for any $\gamma > 0$, $0 < s \le 1$ and $0 < s < d/2$ one has

\[ \mathrm{tr}\,\bigl( (-\Delta)^s - C_{s,d}|x|^{-2s} - V \bigr)_-^{\gamma} \le L^{\mathrm{HLT}}_{\gamma,d,s} \int_{\mathbb{R}^d} V(x)_+^{\gamma + d/2s}\,dx \tag{1.3} \]

© 2009 by the author. This paper may be reproduced, in its entirety, for non-commercial purposes.


with a constant L HLT γ ,d,s independent of V . Here and in the following, t± := max{±t, 0} denote the positve and negative parts of a real number or a self-adjoint operator t. The case s = 1 in (1.3) has been shown earlier in [EkFr]. We refer to (1.3) as HardyLieb-Thirring inequality since it is (up to the value of the constant) an improvement of the Lieb-Thirring inequality [LiTh] γ γ +d/2s tr (−)s − V − ≤ L γ ,d,s V (x)+ dx . (1.4) Rd

It should be pointed out that if $0 < s < d/2$, then (1.4) is valid even for $\gamma = 0$ (as first shown by Cwikel, Lieb, and Rozenblum) while (1.3) is not. We refer to the surveys [LaWe,Hu] for background and references concerning (1.4). The original motivation for (1.4) came from the problem of stability of nonrelativistic matter (see [LiSe] for a textbook presentation). Likewise, our motivation for (1.3) was stability of relativistic matter in magnetic fields. For this problem it is crucial that (1.3) continues to hold if $(-\Delta)^s$ is replaced by $|D - A|^{2s}$ with a magnetic vector potential $A \in L_{2,\mathrm{loc}}(\mathbb{R}^d, \mathbb{R}^d)$, and that the constant can be chosen independently of $A$. Here, as usual, $D = -i\nabla$ and the operator $|D - A|^{2s} := ((D - A)^2)^s$ is defined by means of the spectral theorem. Using the magnetic version of (1.3) we could prove stability of relativistic matter in magnetic fields up to and including the critical value of the nuclear charge $\alpha Z = 2/\pi = C_{1/2,3}$; see [FrLiSe1] and also [FrLiSe2].

The purpose of this paper is fourfold.

(1) We will give a new, much simpler proof of (1.3). While the method in [FrLiSe1] relied on rather involved relations between Sobolev inequalities and decay estimates on heat kernels, the present proof uses nothing more than (1.4) (with $\gamma = 0$ and with $s$ replaced by some $t < s$) and the generalization of a powerful (though elementary to prove) new inequality by Solovej, Sørensen and Spitzer [SoSøSp].

(2) We will extend (1.3) to its optimal parameter range $0 < s < d/2$. For $d \ge 3$ and $1 < s < d/2$ this is a new result, even for integer values of $s$ when the operator is local. This result can not be attained with the method of [FrLiSe1], since positivity properties of the heat kernel break down for $s > 1$.

(3) Though our new proof of (1.3) does not work in the presence of a magnetic field, we shall prove a new operator-theoretic result, which says that any non-magnetic Lieb-Thirring inequality implies a magnetic Lieb-Thirring inequality (with possibly a different constant). This recovers, in particular, that (1.3) holds if $(-\Delta)^s$ is replaced by $|D - A|^{2s}$ and $0 < s \le 1$. (The reason for the restriction $s \le 1$ at this point is that we need a diamagnetic inequality.) Another application of this result concerns the recent inequality in [KoVuWe] corresponding to the endpoint $\gamma = 0$ of (1.4) with $s = 1$, $d = 2$.

(4) We show that an analog of inequality (1.3) for $s = 1/2$, $d = 3$ holds in a model for pseudo-relativistic electrons that includes spin. The difficulty here is that the potential energy is non-local. This new estimate simplifies some of the arguments in [FrSiWa] and will be, we believe, a crucial ingredient in the proof of stability of matter in this model.

Here is the precise statement of our result.

Theorem 1.1. Let $d \ge 1$, $0 < s < d/2$ and $\gamma > 0$. Then there is a constant $L^{\mathrm{HLT}}_{\gamma,d,s}$ such that
$$\operatorname{tr}\left((-\Delta)^s - C_{s,d}|x|^{-2s} - V\right)_-^{\gamma} \le L^{\mathrm{HLT}}_{\gamma,d,s} \int_{\mathbb{R}^d} V(x)_+^{\gamma + d/2s}\,dx. \tag{1.5}$$


If $d \ge 2$, $0 < s \le 1$ and $(-\Delta)^s$ is replaced by $|D - A|^{2s}$ for some $A \in L_{2,\mathrm{loc}}(\mathbb{R}^d, \mathbb{R}^d)$, then (1.5) remains valid if $L^{\mathrm{HLT}}_{\gamma,d,s}$ is replaced by $L^{\mathrm{HLT}}_{\gamma,d,s}\,(e/p)^p\,\Gamma(p+1)$ with $p = \gamma + d/2s$.

The crucial ingredient in our proof of (1.5) is the following lower bound for the quadratic form
$$h_s[u] := \int_{\mathbb{R}^d} |p|^{2s}\,|\hat u(p)|^2\,dp - C_{s,d}\int_{\mathbb{R}^d} |x|^{-2s}\,|u(x)|^2\,dx$$
of the operator $(-\Delta)^s - C_{s,d}|x|^{-2s}$.

Theorem 1.2. Let $0 < t < s < d/2$. Then there exists a constant $\kappa_{d,s,t} > 0$ such that for all $u \in C_0^\infty(\mathbb{R}^d)$ one has
$$h_s[u]^{\theta}\,\|u\|^{2(1-\theta)} \ge \kappa_{d,s,t}\,\|(-\Delta)^{t/2}u\|^2, \qquad \theta := t/s. \tag{1.6}$$

In the special case $d = 3$ and $s = 1/2$ this is a recent result by Solovej, Sørensen and Spitzer [SoSøSp, Thm. 11]. The results reported here are motivated by their work. Below we shall show that their proof extends to arbitrary $0 < s < d/2$. Our original proof of (1.5) in [FrLiSe1] for $0 < s \le 1$ relied on the Gagliardo-Nirenberg-type inequality
$$h_s[u]^{\theta}\,\|u\|^{2(1-\theta)} \ge \sigma_{d,s,q}\,\|u\|_q^2, \qquad \theta := \frac{d}{s}\left(\frac12 - \frac1q\right), \tag{1.7}$$
for $2 < q < 2d/(d - 2s)$. This is weaker than (1.6) in view of the Sobolev inequality [LiLo, Thms. 4.3 and 8.3]
$$\|(-\Delta)^{t/2}u\|^2 \ge S_{d,t}\,\|u\|_q^2, \qquad q = \frac{2d}{d - 2t}.$$

What makes (1.6) much easier to prove than (1.7) is that it is a linear inequality, that is, all norms are taken in $L_2(\mathbb{R}^d)$. Indeed, (1.6) is easily seen to be equivalent to the operator inequality
$$(-\Delta)^s - C_{s,d}|x|^{-2s} \ge K_{d,s,t}\, l^{-2(s-t)}(-\Delta)^t - l^{-2s}, \qquad l > 0, \tag{1.8}$$
where $K_{d,s,t} = \left(s^{s}\,t^{-t}\,(s-t)^{-(s-t)}\right)^{1/s}\kappa_{d,s,t}$, and this is the way we shall prove it in the next section.

2. Proof of Theorem 1.2

Throughout this section we assume that $0 < s < d/2$. Recall that for $0 < \alpha < d$ the Fourier transform of $|x|^{-d+\alpha}$ is given by
$$b_{d-\alpha}\left(|\cdot|^{-d+\alpha}\right)^{\wedge}(p) = b_\alpha |p|^{-\alpha}, \qquad b_\alpha := 2^{\alpha/2}\,\Gamma(\alpha/2); \tag{2.1}$$
see, e.g., [LiLo, Thm. 5.9], where another convention for the Fourier transform is used, however. This implies that for $2s < \alpha < d$ one has
$$\int_{\mathbb{R}^d} \frac{1}{|p - q|^{d-2s}}\,\frac{1}{|q|^{\alpha}}\,dq = \Phi_{s,d}(\alpha)\,\frac{1}{|p|^{\alpha-2s}}, \tag{2.2}$$


where
$$\Phi_{s,d}(\alpha) := (2\pi)^{d/2}\,\frac{b_{2s}\,b_{\alpha-2s}\,b_{d-\alpha}}{b_{d-2s}\,b_{d-\alpha+2s}\,b_{\alpha}} = \pi^{d/2}\,\frac{\Gamma(s)\,\Gamma((\alpha-2s)/2)\,\Gamma((d-\alpha)/2)}{\Gamma((d-2s)/2)\,\Gamma((d-\alpha+2s)/2)\,\Gamma(\alpha/2)}.$$

We shall need the following facts about $\Phi_{s,d}(\alpha)$ as a function of $\alpha \in (2s, d)$.

Lemma 2.1. $\Phi_{s,d}$ is an even function with respect to $\alpha = (d + 2s)/2$ and one has
$$\Phi_{s,d}((d + 2s)/2) = (2\pi)^{d/2}\,\frac{b_{2s}}{b_{d-2s}}\,C_{s,d}^{-1} \tag{2.3}$$
with $C_{s,d}$ from (1.1). Moreover, $\Phi_{s,d}$ is strictly decreasing on $(2s, (d+2s)/2)$ and strictly increasing on $((d+2s)/2, d)$.

This is Lemma 3.2 from [FrLiSe1] in disguise.

Proof of Lemma 2.1. $\Phi_{s,d}(\alpha)$ is obviously invariant under replacing $\alpha$ by $d + 2s - \alpha$, and its value at $\alpha = (d + 2s)/2$ follows immediately from definition (1.1). To prove the monotonicity we write
$$\Phi_{s,d}(\alpha) = \frac{\pi^{d/2}\,\Gamma(s)}{\Gamma((d-2s)/2)}\,\frac{f(t)}{f(s+t)}, \qquad t = (\alpha - 2s)/2,$$
where $T := (d - 2s)/2$ and $f(t) := \Gamma(t)/\Gamma(T + s - t)$. We need to show that $\log(f(t)/f(s+t))$ is strictly decreasing in $t \in (0, T/2)$. Noting that $f'(t)/f(t) = \psi(t) + \psi(T + s - t)$ with $\psi := \Gamma'/\Gamma$ the Digamma function, we have
$$\frac{d}{dt}\log\frac{f(t)}{f(t+s)} = \psi(t) + \psi(T + s - t) - \psi(t + s) - \psi(T - t) = -\int_t^{t+s} h(\tau)\,d\tau$$
with $h(\tau) := \psi'(\tau) - \psi'(T + s - \tau)$ for $0 < \tau < T + s$. Since $\psi'$ is strictly decreasing [AbSt, (6.4.1)], $h$ is an odd function with respect to $\tau = (T + s)/2$ which is strictly positive for $\tau < (T + s)/2$. Since the midpoint of the interval $(t, t + s)$ lies to the left of $(T + s)/2$, the integral of $h$ over this interval is strictly positive, which proves the claim.
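The statements of Lemma 2.1 are easy to sanity-check numerically. The following is only a minimal sketch: the function names are hypothetical, the symbol used for the function in (2.2) is a notational assumption, and the closed form used for $C_{s,d}$ (which is defined in (1.1), outside this section) is assumed to be $2^{2s}\,\Gamma((d+2s)/4)^2/\Gamma((d-2s)/4)^2$, consistent with $C_{1/2,3} = 2/\pi$.

```python
import math


def phi(s, d, alpha):
    """Gamma-function quotient defining the constant in (2.2); needs 2s < alpha < d."""
    g = math.gamma
    num = math.pi ** (d / 2) * g(s) * g((alpha - 2 * s) / 2) * g((d - alpha) / 2)
    den = g((d - 2 * s) / 2) * g((d - alpha + 2 * s) / 2) * g(alpha / 2)
    return num / den


def b(alpha):
    """b_alpha = 2^(alpha/2) * Gamma(alpha/2), as in (2.1)."""
    return 2 ** (alpha / 2) * math.gamma(alpha / 2)


def hardy_constant(s, d):
    # Assumption: closed form of C_{s,d} from (1.1), not restated in this section.
    return 4 ** s * (math.gamma((d + 2 * s) / 4) / math.gamma((d - 2 * s) / 4)) ** 2


def check_lemma(s, d, n=200):
    """Return (symmetric, decreasing on the left half, value (2.3) matches)."""
    mid = (d + 2 * s) / 2
    # Grid strictly inside (2s, mid).
    alphas = [2 * s + (mid - 2 * s) * k / (n + 1) for k in range(1, n + 1)]
    left = [phi(s, d, a) for a in alphas]
    symmetric = all(
        math.isclose(phi(s, d, a), phi(s, d, d + 2 * s - a), rel_tol=1e-10)
        for a in alphas
    )
    decreasing = all(x > y for x, y in zip(left, left[1:])) and left[-1] > phi(s, d, mid)
    rhs = (2 * math.pi) ** (d / 2) * b(2 * s) / b(d - 2 * s) / hardy_constant(s, d)
    return symmetric, decreasing, math.isclose(phi(s, d, mid), rhs, rel_tol=1e-10)


if __name__ == "__main__":
    print(check_lemma(0.5, 3))
    print(phi(0.5, 3, 2.0), math.pi ** 3)
```

For $d = 3$, $s = 1/2$ the midpoint is $\alpha = 2$, where all Gamma factors reduce to $\Gamma(1/2)$ or $\Gamma(1)$ and the value is $\pi^3$.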

Now we prove (1.8), following the strategy of Solovej, Sørensen and Spitzer [SoSøSp] in the special case $d = 3$, $s = 1/2$; see also [LiYa, Thm. 11] for a related argument.

Proof of Theorem 1.2. For technical reasons we prove the theorem only for $2s/3 \le t < s$. In view of the inequality
$$(-\Delta)^t \le \frac{3t}{2s}\,l^{2(2s/3 - t)}\,(-\Delta)^{2s/3} + \left(1 - \frac{3t}{2s}\right) l^{-2t}$$
for $t < 2s/3$ and $l > 0$ this implies the result for all $0 < t < s$. By a well-known argument (going back at least to Abel and, in the present context, to [KoPeSe]) based on the Cauchy-Schwarz inequality, one has for any positive measurable function $h$ on $\mathbb{R}^d$
$$(2\pi)^{d/2}\,\frac{b_{2s}}{b_{d-2s}}\int_{\mathbb{R}^d}\frac{|u|^2}{|x|^{2s}}\,dx = \iint_{\mathbb{R}^d\times\mathbb{R}^d}\frac{\overline{\hat u(p)}\,\hat u(q)}{|p - q|^{d-2s}}\,dp\,dq \le \int_{\mathbb{R}^d} t_h(p)\,|\hat u(p)|^2\,dp,$$


where
$$t_h(p) := h(p)^{-1}\int_{\mathbb{R}^d}\frac{h(q)}{|p - q|^{d-2s}}\,dq.$$
Below we shall choose $h$ (depending on $l > 0$) in such a way that for some positive constants $A$ and $B$ (depending on $d$, $s$ and $t$, but not on $l$) one has
$$t_h(p) \le \Phi_{s,d}((d + 2s)/2)\,|p|^{2s} - A\,l^{-2(s-t)}|p|^{2t} + B\,l^{-2s}. \tag{2.4}$$
(By scaling it would be enough to prove this for $l = 1$, but we prefer to keep $l$ free.) Because of (2.3) this estimate proves (1.8).

We show that (2.4) holds with
$$h(p) = \left(|p|^{(d+2s)/2} + l^{\beta-(d+2s)/2}\,|p|^{\beta}\right)^{-1},$$
where $\beta$ is a parameter depending on $t$ that will be fixed later. (Indeed, we shall choose $\beta = 2t + (d - 2s)/2$.) Since the derivatives of the function $r \mapsto r^{-1}$ have alternating signs one has $(a + b)^{-1} \le a^{-1} - a^{-2}b + a^{-3}b^2$ and therefore
$$\int_{\mathbb{R}^d}\frac{h(q)}{|p - q|^{d-2s}}\,dq \le \int_{\mathbb{R}^d}\frac{1}{|p - q|^{d-2s}}\left(\frac{1}{|q|^{(d+2s)/2}} - \frac{l^{\beta-(d+2s)/2}}{|q|^{d+2s-\beta}} + \frac{l^{2\beta-d-2s}}{|q|^{3(d+2s)/2-2\beta}}\right)dq.$$
If we assume that $(d + 6s)/4 < \beta < (3d + 2s)/4$ then the right side is finite and, using notation (2.2) with $\Phi$ instead of $\Phi_{s,d}$, equal to
$$\Phi\!\left(\frac{d+2s}{2}\right)\frac{1}{|p|^{(d-2s)/2}} - \Phi(d + 2s - \beta)\,\frac{l^{\beta-(d+2s)/2}}{|p|^{d-\beta}} + \Phi\!\left(\frac{3(d+2s)}{2} - 2\beta\right)\frac{l^{2\beta-d-2s}}{|p|^{3d/2-2\beta+s}}.$$
Thus
$$\begin{aligned}
t_h(p) \le{} & \Phi\!\left(\frac{d+2s}{2}\right)|p|^{2s} - \left(\Phi(d + 2s - \beta) - \Phi\!\left(\frac{d+2s}{2}\right)\right) l^{\beta-(d+2s)/2}\,|p|^{\beta-(d-2s)/2} \\
& + \left(\Phi\!\left(\frac{3(d+2s)}{2} - 2\beta\right) - \Phi(d + 2s - \beta)\right) l^{2\beta-d-2s}\,|p|^{2\beta-d} \\
& + \Phi\!\left(\frac{3(d+2s)}{2} - 2\beta\right) l^{3\beta-3d/2-3s}\,|p|^{3\beta-3d/2-s}.
\end{aligned}$$
If we assume that $\beta \le (d + 2s)/2$, then the exponents of $|p|$ on the right side satisfy $2s \ge \beta - (d - 2s)/2 \ge 2\beta - d \ge 3\beta - 3d/2 - s$, and if $\beta \ge (3d + 2s)/6$ then the last exponent is non-negative. Now we choose $\beta = 2t + (d - 2s)/2$, so that the exponent of the second term is $2t$ and the condition $\beta \ge (3d + 2s)/6$ is satisfied, since we are assuming that $t \ge 2s/3$. Moreover, according to Lemma 2.1, the coefficient of the second term is negative. Finally, we use that there are constants $C_1$ and $C_2$ such that for any $\varepsilon > 0$ one has
$$|p|^{2\beta-d} \le \varepsilon\,|p|^{\beta-(d-2s)/2} + C_1\,\varepsilon^{-\frac{2(2\beta-d)}{d+2s-2\beta}},$$
$$|p|^{3\beta-3d/2-s} \le \varepsilon\,|p|^{\beta-(d-2s)/2} + C_2\,\varepsilon^{-\frac{6\beta-3d-2s}{2(d+2s-2\beta)}}.$$
This concludes the proof of (2.4).
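The two elementary inequalities used in the proof of (2.4) can be checked numerically. The sketch below is illustrative only and all function names are hypothetical. It tests the bound $(a+b)^{-1} \le a^{-1} - a^{-2}b + a^{-3}b^2$ on a grid, and the Young-type bound $x^{a} \le \varepsilon x^{b} + C\varepsilon^{-a/(b-a)}$ for $0 < a < b$, using the constant $C = (a/b)^{a/(b-a)}(1 - a/b)$ obtained by maximizing $x^a - \varepsilon x^b$ over $x > 0$ (a standard calculus fact, not a formula taken from the paper).

```python
def reciprocal_bound_holds(a, b):
    """Check 1/(a+b) <= 1/a - b/a^2 + b^2/a^3 for a, b > 0."""
    return 1.0 / (a + b) <= 1.0 / a - b / a**2 + b**2 / a**3


def young_constant(a, b):
    """Smallest C with x^a <= eps*x^b + C*eps^(-a/(b-a)) for all x, eps > 0 (0 < a < b)."""
    return (a / b) ** (a / (b - a)) * (1.0 - a / b)


def young_extremal_x(eps, a, b):
    """Point where x^a - eps*x^b is maximal, so the bound is attained there."""
    return (a / (eps * b)) ** (1.0 / (b - a))


def young_gap(x, eps, a, b):
    """Right side minus left side of the Young-type bound; should be >= 0."""
    return eps * x**b + young_constant(a, b) * eps ** (-a / (b - a)) - x**a


if __name__ == "__main__":
    grid = [0.1, 0.5, 1.0, 3.0, 10.0]
    print(all(reciprocal_bound_holds(a, b) for a in grid for b in grid))
    print(young_gap(young_extremal_x(0.1, 1.0, 2.5), 0.1, 1.0, 2.5))
```

With $a = 2\beta - d$ and $b = \beta - (d-2s)/2$ one has $b - a = (d + 2s - 2\beta)/2$, so the exponent $a/(b-a)$ equals $2(2\beta - d)/(d + 2s - 2\beta)$.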


3. Proof of Theorem 1.1

We fix $0 < s < d/2$ and $\gamma > 0$ and write
$$\operatorname{tr}\left((-\Delta)^s - C_{s,d}|x|^{-2s} - V\right)_-^{\gamma} = \gamma\int_0^\infty N\!\left(-\tau, (-\Delta)^s - C_{s,d}|x|^{-2s} - V\right)\tau^{\gamma-1}\,d\tau,$$
where $N(-\tau, H)$ denotes the number of eigenvalues less than $-\tau$, counting multiplicities, of a self-adjoint operator $H$. We shall use (1.8) with $l^{-2s} = \sigma\tau$ and some $0 < t < s$ and $0 < \sigma < 1$ to be specified below. Abbreviating $K_t = K_{d,s,t}$ we find that
$$\begin{aligned}
N\!\left(-\tau, (-\Delta)^s - C_{s,d}|x|^{-2s} - V\right) &\le N\!\left(0, K_t(\sigma\tau)^{(s-t)/s}(-\Delta)^t - V + (1-\sigma)\tau\right) \\
&= N\!\left(0, (-\Delta)^t - K_t^{-1}(\sigma\tau)^{-(s-t)/s}\,(V - (1-\sigma)\tau)\right).
\end{aligned}$$
Now we use (1.4) with $\gamma = 0$ and $s$ replaced by $t$ (see [Da] for $t \le 1$ and [Cw] for $t < d/2$). Abbreviating $L_t = L_{0,d,t}$ we have
$$N\!\left(-\tau, (-\Delta)^s - C_{s,d}|x|^{-2s} - V\right) \le L_t\,K_t^{-d/2t}\,(\sigma\tau)^{-d(s-t)/2st}\int_{\mathbb{R}^d}(V - (1-\sigma)\tau)_+^{d/2t}\,dx$$
and
$$\begin{aligned}
\operatorname{tr}\left((-\Delta)^s - C_{s,d}|x|^{-2s} - V\right)_-^{\gamma} &\le \gamma\,L_t\,K_t^{-d/2t}\,\sigma^{-d(s-t)/2st}\int_{\mathbb{R}^d}dx\int_0^\infty d\tau\;\tau^{\gamma-1-d(s-t)/2st}\,(V - (1-\sigma)\tau)_+^{d/2t} \\
&= \gamma\,L_t\,K_t^{-d/2t}\,\sigma^{-\frac{d(s-t)}{2st}}\,(1-\sigma)^{-\gamma+\frac{d(s-t)}{2st}}\;\frac{\Gamma\!\left(\gamma - \frac{d(s-t)}{2st}\right)\Gamma\!\left(\frac{d}{2t} + 1\right)}{\Gamma\!\left(\gamma + \frac{d}{2s} + 1\right)}\int_{\mathbb{R}^d}V_+^{\gamma+d/2s}\,dx.
\end{aligned}$$
Here we assumed that $t > ds/(2\gamma s + d)$ so that the $\tau$ integral is finite. Finally, we optimize over $0 < \sigma < 1$ by choosing $\sigma = d(s - t)/2\gamma st$ and over $ds/(2\gamma s + d) < t < s$ to complete the proof of (1.5). The statement about the inclusion of $A$ follows from Example 4.1 and Theorem 4.2 in the following section.

4. Magnetic Lieb-Thirring Inequalities

In this section we discuss Lieb-Thirring inequalities for magnetic Schrödinger operators, that is, (1.4) (and its generalizations) with $(-\Delta)^s$ replaced by $|D - A|^{2s}$ for some vector field $A \in L_{2,\mathrm{loc}}(\mathbb{R}^d, \mathbb{R}^d)$. It is a remarkable fact that all presently known proofs of Lieb-Thirring inequalities, which allow for the inclusion of a magnetic field, yield the same constants in the magnetic case as in the non-magnetic case. It is unknown whether this is also true for the unknown sharp constants. Note that the diamagnetic inequality implies that the lowest eigenvalue does not decrease when a magnetic field is added, but there is no such result for, e.g., the number or the sum of eigenvalues; see [AvHeSi,Li]. Rozenblum [Ro] discovered, however, that any power-like bound on the number of eigenvalues in the non-magnetic case implies a similar bound in the magnetic case, with possibly a worse constant. Here we show the same phenomenon for moments of eigenvalues.


We work in the following abstract setting. Let $(X, \mu)$ be a sigma-finite measure space and let $H$ and $M$ be non-negative operators in $L_2(X, \mu)$ such that for any $u \in L_2(X, \mu)$ and any $t > 0$,
$$|\exp(-tM)u(x)| \le (\exp(-tH)|u|)(x) \qquad \mu\text{-a.e. } x \in X. \tag{4.1}$$
Note that this implies that $\exp(-tH)$ is positivity preserving. We think of $H$ as a non-magnetic operator, $M$ a magnetic operator and (4.1) as a diamagnetic inequality. It might be useful to keep the following example in mind.

Example 4.1. Let $X = \mathbb{R}^d$ with Lebesgue measure, $H = (-\Delta)^s$, and $M = |D - A|^{2s}$ for some $0 < s \le 1$ and $A \in L_{2,\mathrm{loc}}(\mathbb{R}^d)$. The diamagnetic inequality (4.1) in the case $s = 1$ was shown in [Si1], and in the case $0 < s < 1$ it follows from the $s = 1$ result since the function $\lambda \mapsto \exp(-\lambda^s)$ is completely monotone and hence by Bernstein's theorem [Do] the Laplace transform of a positive measure. More generally, (4.1) holds for $H = (-\Delta)^s + W$ and $M = |D - A|^{2s} + W$ with $s$ and $A$ as before and a, say, bounded function $W$. This can be seen using Trotter's product formula. By an approximation argument the inequality holds also for $W(x) = -C_{s,d}|x|^{-2s}$; see [FrLiSe1].

The main result in this section is

Theorem 4.2. Let $H$ and $M$ be as above and assume that there exist some constants $L > 0$, $\gamma \ge 0$, $p > 0$ and a non-negative function $w$ on $X$ such that for all $V \in L_p(X, w\,d\mu)$ one has
$$\operatorname{tr}(H - V)_-^{\gamma} \le L\int_X V_+^{p}\,w\,d\mu. \tag{4.2}$$
Then one also has
$$\operatorname{tr}(M - V)_-^{\gamma} \le L\left(\frac{e}{p}\right)^{p}\Gamma(p + 1)\int_X V_+^{p}\,w\,d\mu. \tag{4.3}$$

We do not know whether the factor (e/ p) p ( p + 1) in (4.3) can be omitted. Results from [FrLoWe] about the eigenvalues of the Landau Hamiltonian in a domain (but without potential) seem to indicate that a factor > 1 is necessary. Our proof of Theorem 4.2 uses some ideas from [Ro] where the case γ = 0 was treated; see also [FrLiSe2] for a result about operators with discrete spectrum. Remark 4.3. With the same proof one can deduce estimates on tr f (M) from estimates on tr f (H ) for more general functions f . For example, let d = 2 and f (t) := | ln |t||−1 if −e−1 < t < 0, f (t) := 1 if t ≤ −e−1 , and f (t) := 0 if t ≥ 0. Then there exists a constant L and for any q > 1 a constant L q such that for all l > 0 and A ∈ L 2,loc (R2 , R2 ), |x| 2 2 dx tr f l ((D − A) − V ) ≤ L V (x)+ log l |x|
S

Indeed, this follows by Lemma 4.4 via integration from the A ≡ 0 result of [KoVuWe].


The key ingredient in the proof of Theorem 4.2 is a bound on the negative eigenvalues of $M - V$ by those of $H - \alpha V$, averaged over all coupling constants $\alpha$. As before, we denote by $N(-\tau, A)$ the number of eigenvalues less than $-\tau$, counting multiplicities, of a self-adjoint operator $A$.

Lemma 4.4. Let $H$ and $M$ be non-negative self-adjoint operators satisfying (4.1) and let $V \ge 0$. Then for any $\tau \ge 0$ and $t > 0$ one has
$$N(-\tau, M - V) \le t\,e^{t}\int_0^\infty N(-\tau, H - \alpha V)\,e^{-\alpha t}\,d\alpha. \tag{4.4}$$

Proof. Since (4.1) remains valid with $H + \tau$ and $M + \tau$ in place of $H$ and $M$ we need only consider $\tau = 0$. Moreover, by an approximation argument we may assume that $V > 0$ a.e. We define $h := V^{-1/2}HV^{-1/2}$ and $m := V^{-1/2}MV^{-1/2}$ via quadratic forms and claim that (4.1) holds with $h$ and $m$ in place of $H$ and $M$. Since this fact is proved in [Ro, Thm. 3] we only sketch the main idea. Indeed, for any $\sigma > 0$,
$$(m + \sigma)^{-1} = V^{1/2}(M + \sigma V)^{-1}V^{1/2} = \int_0^\infty V^{1/2}\exp(-s(M + \sigma V))\,V^{1/2}\,ds,$$
and by (4.1) and Trotter's product formula $|\exp(-s(M + \sigma V))V^{1/2}u| \le \exp(-s(H + \sigma V))V^{1/2}|u|$ a.e. Hence $|(m + \sigma)^{-1}u| \le (h + \sigma)^{-1}|u|$ a.e. Iterating this inequality and recalling that $(1 + tm/n)^{-n} \to \exp(-tm)$ strongly as $n \to \infty$, we obtain (4.1) for $h$ and $m$. By [Si2, Thm. 2.13] this analog of (4.1) implies that
$$\operatorname{tr}\exp(-tm) = \|\exp(-tm/2)\|_2^2 \le \|\exp(-th/2)\|_2^2 = \operatorname{tr}\exp(-th)$$
with $\|\cdot\|_2$ the Hilbert-Schmidt norm, and hence by the Birman-Schwinger principle,
$$N(M - V) = N(1, m) \le e^{t}\operatorname{tr}\exp(-tm) \le e^{t}\operatorname{tr}\exp(-th).$$
Using the Birman-Schwinger principle once more, we find
$$\operatorname{tr}\exp(-th) = t\int_0^\infty N(\alpha, h)\,e^{-t\alpha}\,d\alpha = t\int_0^\infty N(H - \alpha V)\,e^{-t\alpha}\,d\alpha,$$
proving (4.4).

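Proof of Theorem 4.2. By the variational principle we may assume that $V \ge 0$. By Lemma 4.4 one has for any $t > 0$,
$$\begin{aligned}
\operatorname{tr}(M - V)_-^{\gamma} &= \gamma\int_0^\infty N(-\tau, M - V)\,\tau^{\gamma-1}\,d\tau \\
&\le \gamma\,t\,e^{t}\int_0^\infty\left(\int_0^\infty N(-\tau, H - \alpha V)\,\tau^{\gamma-1}\,d\tau\right)e^{-\alpha t}\,d\alpha \\
&= t\,e^{t}\int_0^\infty \operatorname{tr}(H - \alpha V)_-^{\gamma}\,e^{-\alpha t}\,d\alpha,
\end{aligned}$$
and by assumption (4.2) the right-hand side can be bounded from above by
$$L\,t\,e^{t}\int_0^\infty \alpha^{p}e^{-\alpha t}\,d\alpha\int_X V^{p}\,w\,d\mu = L\,t^{-p}e^{t}\,\Gamma(p + 1)\int_X V^{p}\,w\,d\mu.$$
Now the assertion follows by choosing $t = p$.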
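The choice $t = p$ at the end of the proof of Theorem 4.2 can be illustrated numerically. The following minimal sketch (function names are hypothetical) evaluates the factor $t^{-p}e^{t}\,\Gamma(p+1)$ on a grid of $t$ and compares it with its value $(e/p)^p\,\Gamma(p+1)$ at $t = p$. Since the logarithm of the factor is $-p\log t + t$ plus a constant, the minimum over $t > 0$ is attained exactly at $t = p$.

```python
import math


def magnetic_factor(p, t):
    """Factor t^(-p) * e^t * Gamma(p+1) multiplying L before optimizing in t."""
    return t ** (-p) * math.exp(t) * math.gamma(p + 1)


def best_factor(p):
    """Value (e/p)^p * Gamma(p+1) obtained with the choice t = p, as in (4.3)."""
    return (math.e / p) ** p * math.gamma(p + 1)


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.5, 4.0):
        grid_min = min(magnetic_factor(p, 0.05 * k) for k in range(1, 400))
        print(p, best_factor(p), grid_min)
```

For the sampled values of $p$ the optimized factor is larger than $1$, so (4.3) carries a worse constant than (4.2) in these cases.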


5. A Pseudo-Relativistic Model Including Spin Throughout this section we assume that d = 3. The helicity operator h on L 2 (R3 , C2 ) is defined as the Fourier multiplier corresponding to the matrix-valued function p → σ · p/| p|, where σ = (σ1 , σ2 , σ3 ) denotes the triple of Pauli matrices. The properties of these matrices imply that h is a unitary and self-adjoint involution. The analog of the Hardy (or Kato) inequality (1.2) is |u(x)|2 + |(hu)(x)|2 dx , u ∈ C0∞ (R3 , C2 ) , |ξ ||u(ξ ˆ )|2 dξ ≥ C˜ (5.1) 2 |x| R3 R3 with the sharp constant C˜ =

2 ; 2/π + π/2

see [EvPeSi]. Note that this constant is strictly larger than C := C1/2,3 = 2/π , which is the constant one would get if hu were replaced by u on the right side of (5.1). For a function V on R3 taking values in the Hermitian 4 × 4 matrices we introduce the non-local potential 1 1 L 2 (R3 ,C2 ) ∗ 1 L 2 (R3 ,C2 ) V , (V ) := h h 2 1 3 2 where L 2 (R ,C ) is considered as an operator from L 2 (R3 , C2 ) to L 2 (R3 , C4 ). The h √ operator − − (V ) in L 2 (R3 , C2 ) has been suggested by Brown and Ravenhall as the Hamiltonian of a massless, relativistic spin-1/2 particle in a potential −V . It results from projecting onto the positive spectral√ subspace of the Dirac operator. One of the advantages of this operator over the simpler − − V is that it is well-defined for ˜ which includes all known elements. We refer to [LiSe] for more nuclear charges α Z ≤ C, background about this model. Despite the efforts in [LiSiSo,BaEv,HoSi] the problem of stability of matter for the corresponding many-particle system is not yet completely understood and the following result, we believe, might be useful in this respect. Theorem 5.1. Let d = 3 and γ > 0. Then there is a constant L˜ HLT such that γ γ √ γ +3 −1 ˜ − − C(|x| ) − (V ) ≤ L˜ HLT tr C4 V (x)+ d x . tr γ −

R3

(5.2)

For the proof of this some facts about the partial wave decompo√ theorem we need −1 ) from [EvPeSi]. This operator commutes with ˜ sition of the operator − − C(|x| the total angular momentum operator J = L + 21 σ , where L = −i∇ × x, as well as with the operator L2 . The subspace corresponding to total angular momentum j = 1/2 is of the form H1/2,0 ⊕ H1/2,1 , where the subspaces H1/2,l correspond to the eigenvalues l(l + 1) of L2 . The next result, essentially contained in [FrSiWa], says that on the space H1/2,0 ⊕ √ √ −1 ) is controlled by the operator ˜ H1/2,1 the operator − − C(|x| − − C|x|−1 with the smaller coupling constant C. (Strictly speaking, the latter operator should be tensored with 1C2 , but we suppress this if there is no danger of confusion.)


Lemma 5.2. If 0 ≡ ψ ∈ H1/2,0 ∩ C0∞ (R3 , C2 ), then √ −1 ) ψ ˜ ψ, − − C(|x| 2 1 ≥ ≥ √ . 2 −1 1 + (2/π ) 1 + (2/π )2 ψ, − − C|x| ψ √ If 0 ≡ ψ ∈ H1/2,1 ∩ C0∞ (R3 , C2 ), this bound is true provided ψ, − − C|x|−1 ψ √ is replaced by hψ, − − C|x|−1 hψ . Proof of Lemma 5.2. We prove the assertion only for l = 1 since the lower bound for l = 0 is contained in [FrSiWa, Lemma 2.7] and the upper bound is proved as below. By orthogonality we may assume that the Fourier transform of ψ is of the ˆ ) = |ξ |−2 g(|ξ |)1/2,1,m ( ξ ), where m ∈ {−1/2, 1/2} and 1/2,1,m are form ψ(ξ |ξ | )= explicit functions in L 2 (S2 , C2 ). By the properties of these functions one has hψ(ξ ξ −2 −|ξ | g(|ξ |)1/2,0,m ( |ξ | ). The ground state representation from [FrSiWa, Lemma 2.6] reads √ dp dq C˜ ∞ ∞ −1 ˜ ψ, , − − C(|x| ) ψ = |g( p)−g(q)|2 k˜ 21 qp + qp 2π 0 p q 0 √ C ∞ ∞ dp dq hψ, , − − C|x|−1 hψ = |g( p) − g(q)|2 k 21 qp + qp 2π 0 p q 0 ˜ where k(t) = 21 (Q 0 (t) + Q 1 (t)), k(t) = Q 0 (t), and Q l are the Legendre functions of the second kind [AbSt, 8.4]. The assertion now follows from the fact that Q 0 ≥ Q 1 ≥ 0.

Proof of Theorem 5.1. We first claim that for any 0 < t < 1/2 there is a K˜ t > 0 such that √ −1 ˜ − − C(|x| ) ≥ K˜ t l −1+2t (−)t − l −1 , l > 0 . (5.3) Indeed, it follows from Lemma 5.2 and (1.8) that on H1/2,0 ⊕ H1/2,1 one has for any 0 < t < 1/2, −1 √ −1 ˜ K t l −1+2t (−)t − l −1 , l > 0 . − − C(|x| ) ≥ 1 + (2/π )2 On the other hand, the arguments of [EvPeSi] show that there exists a constant C˜ > C˜ ⊥ √ such that − ≥ C˜ (|x|−1 ) on H1/2,0 ⊕ H1/2,1 . Hence on that space √

C˜ − C˜ √ −1 ˜ − − C(|x| )≥ − C˜ C˜ − C˜ 1 −1+2t 1 − 2t −1 , l > 0. l l (−)t − ≥ 2t 2t C˜

This proves (5.3). Given (5.3), the proof of (5.2) is similar to that of (1.5). We may assume that V (x) = v(x)IC4 for a non-negative, scalar function v (otherwise, replace V (x) by v(x)IC4 , where v(x) is the operator norm of the 4 × 4 matrix V (x)+ ). For a given l > 0 and


0 < t < 1/2 we introduce the operator H := K˜ t l −1+2t (−)t − v − l −1 in L 2 (R3 , C). Then according to (5.3) one has √ −1 ˜ N (−τ, − − C(|x| ) − (V )) ≤ N (−τ, 21 (H ⊗ 1C2 + h(H ⊗ 1C2 )h)) ≤ 4N (−τ, H ) . In the last inequality we used that N (−τ, 21 (A + B)) ≤ N (−τ, A) + N (−τ, B) for any self-adjoint, lower semi-bounded operators A and B, which follows from the variational principle. Now one can proceed in the same way as in the proof of (1.5).

Acknowledgement. The author would like to thank E. Lieb and R. Seiringer for very fruitful discussions, as well as J. P. Solovej, T. Østergaard Sørensen and W. Spitzer for useful correspondence. Support through DAAD grant D/06/49117 and U.S. National Science Foundation grant PHY 06 52854 is gratefully acknowledged.

References

[AbSt] Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions with formulas, graphs, and mathematical tables. Reprint of the 1972 edition. New York: Dover Publications, 1992
[AvHeSi] Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45(4), 847–883 (1978)
[BaEv] Balinsky, A.A., Evans, W.D.: Stability of one-electron molecules in the Brown-Ravenhall model. Commun. Math. Phys. 202(2), 481–500 (1999)
[Cw] Cwikel, M.: Weak type estimates for singular values and the number of bound states of Schrödinger operators. Ann. Math. 106, 93–102 (1977)
[Da] Daubechies, I.: An uncertainty principle for fermions with a generalized kinetic energy. Commun. Math. Phys. 90, 511–520 (1983)
[Do] Donoghue, W.F.: Monotone matrix functions and analytic continuation. New York-Heidelberg: Springer, 1974
[EkFr] Ekholm, T., Frank, R.L.: On Lieb-Thirring inequalities for Schrödinger operators with virtual level. Commun. Math. Phys. 264(3), 725–740 (2006)
[EvPeSi] Evans, W.D., Perry, P., Siedentop, H.: The spectrum of relativistic one-electron atoms according to Bethe and Salpeter. Commun. Math. Phys. 178(3), 733–746 (1996)
[FrLiSe1] Frank, R.L., Lieb, E.H., Seiringer, R.: Hardy-Lieb-Thirring inequalities for fractional Schrödinger operators. J. Amer. Math. Soc. 21(4), 925–950 (2008)
[FrLiSe2] Frank, R.L., Lieb, E.H., Seiringer, R.: Stability of relativistic matter with magnetic fields for nuclear charges up to the critical value. Commun. Math. Phys. 275(2), 479–489 (2007)
[FrLoWe] Frank, R.L., Loss, M., Weidl, T.: Pólya's conjecture in the presence of a constant magnetic field. J. Eur. Math. Soc., to appear
[FrSiWa] Frank, R.L., Siedentop, H., Warzel, S.: The energy of heavy atoms according to Brown and Ravenhall: The Scott correction. http://arxiv.org/abs/0805.4441v2[math-ph], 2008
[He] Herbst, I.W.: Spectral theory of the operator $(p^2 + m^2)^{1/2} - ze^2/r$. Commun. Math. Phys. 53(3), 285–294 (1977)
[HoSi] Hoever, G., Siedentop, H.: Stability of the Brown-Ravenhall operator. Math. Phys. Electron. J. 5, 11 pp (1999)
[Hu] Hundertmark, D.: Some bound state problems in quantum mechanics. In: Spectral theory and mathematical physics: a Festschrift in honor of Barry Simon's 60th birthday, Proc. Sympos. Pure Math. 76, Part 1, Providence, RI: Amer. Math. Soc., 2007, pp. 463–496
[KoPeSe] Kovalenko, V.F., Perel'muter, M.A., Semenov, Ya.A.: Schrödinger operators with $L_w^{l/2}(\mathbb{R}^l)$-potentials. J. Math. Phys. 22(5), 1033–1044 (1981)
[KoVuWe] Kovařík, H., Vugalter, S., Weidl, T.: Spectral estimates for two-dimensional Schrödinger operators with application to quantum layers. Commun. Math. Phys. 275(3), 827–838 (2007)
[LaWe] Laptev, A., Weidl, T.: Recent results on Lieb-Thirring inequalities. Journées 'Équations aux Dérivées Partielles' (La Chapelle sur Erdre, 2000), Exp. No. XX, Nantes: Univ. Nantes, 2000
[Li] Lieb, E.H.: Flux phase of the half-filled band. Phys. Rev. Lett. 73, 2158–2161 (1994)
[LiLo] Lieb, E.H., Loss, M.: Analysis. Second edition. Graduate Studies in Mathematics 14, Providence, RI: Amer. Math. Soc., 2001
[LiSe] Lieb, E.H., Seiringer, R.: The stability of matter. In preparation
[LiSiSo] Lieb, E.H., Siedentop, H., Solovej, J.P.: Stability and instability of relativistic electrons in classical electromagnetic fields. J. Stat. Phys. 89(1-2), 37–59 (1997)
[LiTh] Lieb, E.H., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. Studies in Mathematical Physics, Princeton, NJ: Princeton University Press, 1976, pp. 269–303
[LiYa] Lieb, E.H., Yau, H.-T.: The stability and instability of relativistic matter. Commun. Math. Phys. 118(2), 177–213 (1988)
[Ro] Rozenblum, G.: Domination of semigroups and estimates for eigenvalues. St. Petersburg Math. J. 12(5), 831–845 (2001)
[Si1] Simon, B.: Maximal and minimal Schrödinger forms. J. Oper. Th. 1(1), 37–47 (1979)
[Si2] Simon, B.: Trace ideals and their applications. Second edition, Mathematical Surveys and Monographs 120, Providence, RI: Amer. Math. Soc., 2005
[SoSøSp] Solovej, J.P., Sørensen, Ø.T., Spitzer, W.: The relativistic Scott correction for atoms and molecules. http://arxiv.org/abs/0808.2163v1[math-ph], 2008

Communicated by B. Simon

Commun. Math. Phys. 290, 801–812 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0756-x

Communications in

Mathematical Physics

Regularity Criteria for the Dissipative Quasi-Geostrophic Equations in Hölder Spaces

Hongjie Dong1, Nataša Pavlović2

1 The Division of Applied Mathematics, Brown University, 182 George Street, Box F, Providence, RI 02912, USA. E-mail: [email protected]

2 Department of Mathematics, University of Texas at Austin, 1 University Station, C1200, Austin, TX 78712, USA. E-mail: [email protected]

Received: 11 August 2008 / Accepted: 5 November 2008
Published online: 26 February 2009 – © Springer-Verlag 2009

Partially supported by a start-up funding from the Division of Applied Mathematics of Brown University and NSF grant number DMS 0800129.

Partially supported by a start-up funding from the College of Natural Sciences of the University of Texas at Austin, NSF grant number DMS 0758247 and an Alfred P. Sloan Research Fellowship.

Abstract: We study regularity criteria for weak solutions of the dissipative quasi-geostrophic equation (with dissipation $(-\Delta)^{\gamma/2}$, $0 < \gamma \le 1$). We show in this paper that if $\theta \in C((0, T); C^{1-\gamma})$, or $\theta \in L^r((0, T); C^{\alpha})$ with $\alpha = 1 - \gamma + \gamma/r$, is a weak solution of the 2D quasi-geostrophic equation, then $\theta$ is a classical solution in $(0, T] \times \mathbb{R}^2$. This result improves our previous result in [18].

1. Introduction

In this paper we present two regularity results for weak solutions of the 2D dissipative quasi-geostrophic equation, that extend our previous work [18]. We consider the following initial value problem:
$$\begin{cases}\theta_t + u\cdot\nabla\theta + (-\Delta)^{\gamma/2}\theta = 0, & x \in \mathbb{R}^2,\ t \in (0, \infty),\\ \theta(0, x) = \theta_0(x),\end{cases} \tag{1.1}$$
where $\gamma \in (0, 2]$ is a fixed parameter and the velocity $u = (u_1, u_2)$ is divergence free and determined by the Riesz transforms of the potential temperature $\theta$:
$$u = (-R_2\theta, R_1\theta) = \left(-\partial_{x_2}(-\Delta)^{-1/2}\theta,\ \partial_{x_1}(-\Delta)^{-1/2}\theta\right).$$
The 2D dissipative quasi-geostrophic equation appears in geophysical studies of strongly rotating fluids (see, for example, Pedlosky [24]).

The central mathematical question related to the initial value problem (1.1) is whether there exists a global in time smooth solution to (1.1) evolving from any given smooth initial data. In order to recall known results to this question, we note that cases $\gamma > 1$, $\gamma = 1$ and $\gamma < 1$ are called subcritical, critical and supercritical, respectively. Resnick [25] established existence of a global weak solution in both dissipative and non-dissipative cases. The existence of solutions is fully understood in the subcritical case: Constantin and Wu [10] proved that all sufficiently smooth initial data give rise to a unique global smooth solution. In the critical case, $\gamma = 1$, Constantin, Cordoba and Wu [8] established existence of a unique global classical solution corresponding to any initial data that are small in $L^\infty$. The hypothesis requiring smallness in $L^\infty$ has been removed recently in two elegant papers [1 and 20]. More precisely, Kiselev, Nazarov and Volberg [20] proved persistence of a global solution in $C^\infty$ corresponding to any $C^\infty$ periodic initial data. Their proof is based on a maximum principle for the modulus of continuity. In [17] Dong and Du adapted the method of [20] to obtain global well-posedness for the critical 2D dissipative quasi-geostrophic equations with $H^1$ initial data in the whole space. On the other hand, Caffarelli and Vasseur [1] used harmonic extension to establish regularity of the Leray-Hopf weak solution. More precisely, their approach consists of establishing the following three claims:

(1) Every Leray-Hopf weak solution corresponding to initial data $\theta_0 \in L^2$ is in $L^\infty_{\mathrm{loc}}(\mathbb{R}^2 \times (0, \infty))$.
(2) The $L^\infty$ solutions are Hölder regular, i.e. they are in $C^\gamma$ for some $\gamma > 0$.
(3) Every Hölder regular solution is a classical solution in $C^{1,\beta}$.

However the question addressing global in time existence of a solution still remains open in the supercritical case, $\gamma < 1$. We note that, in this case Chae and Lee [4], Wu [26,29], Chen, Miao and Zhang [5] and Hmidi and Keraani [19] established existence of a global solution in Besov spaces evolving from small initial data (see also [21,23]). Also recently, Constantin and Wu in [11] implemented the approach of [1] in the supercritical case.
They proved that the claim (1) is valid in the super-critical case. Towards addressing the claim (2), Constantin and Wu in [11] proved that $L^\infty$ solutions are Hölder continuous under the additional assumption that the velocity $u \in C^{1-\gamma}$. The claim (3) is considered by Constantin and Wu in a separate paper [12] where they obtained a conditional regularity result of the type: if a Leray-Hopf solution is in the sub-critical space $L^\infty((t_0, t_1); C^\delta(\mathbb{R}^2))$ for some $\delta > 1 - \gamma$ on the time interval $[t_0, t_1]$, then such a solution is a classical solution on $(t_0, t_1]$. In [18] we extended the result of Constantin and Wu [12] to scaling invariant mixed time-space Besov spaces. More precisely, in [18] we proved that if
$$\theta \in L^r((0, T); B^{\alpha}_{p,\infty}(\mathbb{R}^2)), \tag{1.2}$$
for any $\gamma \in (0, 1]$, $p \in [2, \infty)$, $T \in (0, \infty)$, $r \in [2, \infty)$ with $\alpha = \frac{2}{p} + 1 - \gamma + \frac{\gamma}{r}$, is a weak solution of the 2D quasi-geostrophic equation (1.1), then $\theta$ is a classical solution of (1.1) in $(0, T] \times \mathbb{R}^2$. The significance of this space is that it is scaling invariant under the scaling transformation $\theta_\lambda = \lambda^{\gamma-1}\theta(\lambda x, \lambda^\gamma t)$. It is natural to ask whether the result of [18] can be extended to include the case $r = \infty$, $p = \infty$ in (1.2). In this paper we explore that question and prove that if
$$\theta \in C((0, T); C^{1-\gamma}(\mathbb{R}^2))$$


with $\gamma \in (0, 1)$ is a weak solution of (1.1), then $\theta$ is a classical solution of (1.1) in the region $(0, T] \times \mathbb{R}^2$. Since $\dot B^{\delta}_{\infty,\infty} \cap L^\infty = C^\delta$, this regularity result extends our previous result [18] to include the case $p = \infty$ and not quite $r = \infty$ (since we require continuity in time). The importance of the space $C^{1-\gamma}(\mathbb{R}^2)$ is in the fact that it is the largest scaling invariant space for the 2D quasi-geostrophic equations (1.1). We note that this new regularity result is inspired by the analogous conditional regularity result for the Navier-Stokes equations that was recently obtained by Cheskidov and Shvydkoy [6]. For the precise statement of our result, see Theorem 3.4. We remark that, as in [6], from the proof it is clear that we allow small jump discontinuities of $\theta(t, \cdot)$ in the $C^{1-\gamma}$ norm.

The proof of Theorem 3.4 relies on a regularity criterion, stated in Lemma 4.1, which exploits a certain cancellation property of the bilinear term. We identify such a cancellation property by means of Bony's paraproduct formula for Littlewood-Paley operators and use of a certain commutator estimate involving Littlewood-Paley operators. The approach that we use to identify the cancellation property differs on a technical level from the approach employed in [6], where the authors followed [7] in order to identify the cancellation property. Thanks to the above mentioned cancellation property of the nonlinear term, we present another conditional regularity result too (see Theorem 3.5), which extends our previous result [18] to include the case $p = \infty$, $r \in [1, \infty)$ in (1.2).

Organization of the paper. The paper is organized as follows. In Sect. 2 we introduce the notation and we review known estimates that shall be used throughout the paper. In Sect. 3 we state the main results of the paper. In Sect. 4 we present a proof of the crucial regularity criterion (Lemma 4.1) which is based on the cancellation property. Also in Sect. 4 we give a proof of Theorem 3.4. Then in Sect. 5 we give a proof of Theorem 3.5.

2. Notation and Preliminaries

2.1. Notation and spaces. We recall that for any $\beta \in \mathbb{R}$ the fractional Laplacian $(-\Delta)^\beta$ is defined via its Fourier transform:
$$\widehat{(-\Delta)^\beta f}(\xi) = |\xi|^{2\beta}\hat f(\xi).$$
We note that by a weak solution to (1.1) we mean $\theta(t, x)$ in $(0, \infty) \times \mathbb{R}^2$ such that for any smooth function $\phi(t, x)$ satisfying $\phi(t, \cdot) \in \mathcal{S}$ for each $t$, the identity
$$\int_{\mathbb{R}^2}\theta(T, \cdot)\phi(T, \cdot)\,dx - \int_{\mathbb{R}^2}\theta(0, \cdot)\phi(0, \cdot)\,dx - \int_0^T\!\!\int_{\mathbb{R}^2}\theta\,\phi_t\,dx\,dt - \int_0^T\!\!\int_{\mathbb{R}^2}u\,\theta\cdot\nabla\phi\,dx\,dt + \int_0^T\!\!\int_{\mathbb{R}^2}\theta\,\Lambda^\gamma\phi\,dx\,dt = 0$$
holds for any $T > 0$.

Before we give the definition of the spaces that will be used throughout the paper, we shall review the Littlewood-Paley decomposition. For any integer $j$, define $\Delta_j$ to be the Littlewood-Paley projection operator with $\Delta_j v = \phi_j * v$, where
$$\hat\phi_j(\xi) = \hat\phi(2^{-j}\xi), \quad \hat\phi \in C_0^\infty(\mathbb{R}^2\setminus\{0\}), \quad \hat\phi \ge 0,$$
$$\operatorname{supp}\hat\phi \subset \{\xi \in \mathbb{R}^2 \mid 1/2 \le |\xi| \le 2\}, \qquad \sum_{j\in\mathbb{Z}}\hat\phi_j(\xi) = 1 \ \text{ for } \xi \ne 0.$$

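Formally, we have the Littlewood-Paley decomposition
$$v(\cdot, t) = \sum_{j\in\mathbb{Z}}\Delta_j v(\cdot, t).$$
Also denote
$$\Lambda = (-\Delta)^{1/2}, \qquad \bar\Delta_{-1} = \sum_{k<0}\Delta_k, \qquad \Delta_{\le j} = \sum_{k\le j}\Delta_k,$$
$$v_j = \Delta_j v, \qquad v_{\le j} = \sum_{k=-\infty}^{j}v_k, \qquad v_{\ge j} = \sum_{k=j}^{\infty}v_k, \qquad v_{i\le\cdot\le j} = \sum_{k=i}^{j}v_k.$$
For any $p, q \in [1, \infty]$ and $s \in \mathbb{R}$, we denote by $\dot B^s_{p,q}$ and $B^s_{p,q}$, respectively the homogeneous and inhomogeneous Besov spaces equipped with norms
$$\|v\|_{\dot B^s_{p,q}} := \begin{cases}\left(\sum_{j\in\mathbb{Z}}2^{jsq}\,\|\Delta_j v\|_{L^p}^q\right)^{1/q}, & \text{for } q < \infty,\\ \sup_{j\in\mathbb{Z}}2^{js}\,\|\Delta_j v\|_{L^p}, & \text{for } q = \infty,\end{cases}$$
$$\|v\|_{B^s_{p,q}} := \begin{cases}\left(\sum_{j\ge 0}2^{jsq}\,\|\Delta_j v\|_{L^p}^q\right)^{1/q} + \|\bar\Delta_{-1}v\|_{L^p}, & \text{for } q < \infty,\\ \sup_{j\ge 0}2^{js}\,\|\Delta_j v\|_{L^p} + \|\bar\Delta_{-1}v\|_{L^p}, & \text{for } q = \infty.\end{cases}$$
If $s > 0$, we have
$$B^s_{p,q} = \dot B^s_{p,q}\cap L^p, \qquad \|v\|_{B^s_{p,q}} \sim \|v\|_{\dot B^s_{p,q}} + \|v\|_{L^p}, \qquad C^s = B^s_{\infty,\infty}.$$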

2.2. Preliminaries. The following Bernstein's inequality is well-known.

Lemma 2.1.
i) Let $p \in [1,\infty]$ and $s \in \mathbb{R}$. Then for any $j \in \mathbb{Z}$, we have
$$\lambda 2^{js}\|\Delta_j v\|_{L^p} \le \|\Lambda^{s}\Delta_j v\|_{L^p} \le \lambda' 2^{js}\|\Delta_j v\|_{L^p} \tag{2.1}$$
with some constants $\lambda$ and $\lambda'$ depending only on $p$ and $s$.
ii) Moreover, for $1 \le p \le q \le \infty$, there exists a positive constant $C$ depending only on $p$ and $q$ such that
$$\|\Delta_j v\|_{L^q} \le C 2^{(1/p-1/q)dj}\|\Delta_j v\|_{L^p}. \tag{2.2}$$

Now we recall the generalized Bernstein's inequality and a lower bound for an integral involving a fractional Laplacian which will be used in the paper. They can be found in [21,28] and [5].

Lemma 2.2.
i) Let $p \in [2,\infty)$ and $\gamma \in [0,2]$. Then for any $j \in \mathbb{Z}$, we have
$$\lambda 2^{\gamma j/p}\|\Delta_j v\|_{L^p} \le \|\Lambda^{\gamma/2}(|\Delta_j v|^{p/2})\|_{L^2}^{2/p} \le \lambda' 2^{\gamma j/p}\|\Delta_j v\|_{L^p}, \tag{2.3}$$
with some positive constants $\lambda$ and $\lambda'$ depending only on $p$ and $\gamma$.


ii) Moreover, we have
$$\int_{\mathbb{R}^2}(\Lambda^{\gamma}v)|v|^{p-2}v \ge c\,\|\Lambda^{\gamma/2}|v|^{p/2}\|_{L^2}^{2}, \tag{2.4}$$
and
$$\int_{\mathbb{R}^2}(\Lambda^{\gamma}\Delta_j v)|\Delta_j v|^{p-2}\Delta_j v \ge c\,2^{\gamma j}\|\Delta_j v\|_{L^p}^{p}, \tag{2.5}$$
with some positive constant $c$ depending only on $p$ and $\gamma$.

Also we will use the following commutator estimate on the Littlewood-Paley projection operator.

Lemma 2.3. Let $d \ge 1$ be an integer and let $r, r_1, r_2 \in [1,\infty]$ satisfy $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2} \le 1$. Then for any $j \in \mathbb{Z}$ we have
$$\|[u,\Delta_j]v\|_{L^{r}(\mathbb{R}^d)} \le C 2^{-j}\|\nabla u\|_{L^{r_1}(\mathbb{R}^d)}\|v\|_{L^{r_2}(\mathbb{R}^d)} \tag{2.6}$$
as long as the right-hand side is finite. Here $C$ is a positive constant independent of $j$, and $[u,\Delta_j]v = u\Delta_j(v) - \Delta_j(uv)$.

Proof. This follows easily from the integral representation of the Littlewood-Paley projection, Minkowski's inequality and Hölder's inequality.

Finally we recall the following regularity criterion for (1.1), which is the main result of [18].

Theorem 2.4. Let $\gamma \in (0,1]$, $p \in [2,\infty)$, $T \in (0,\infty)$ and $r \in [2,\infty)$. Denote $\alpha = \frac{2}{p} + 1 - \gamma + \frac{\gamma}{r}$. If $\theta \in L^{r}((0,T); B^{\alpha}_{p,\infty}(\mathbb{R}^2))$ is a weak solution of (1.1), then $\theta$ is in $C^\infty((0,T]\times\mathbb{R}^2)$, and thus it is a classical solution of (1.1) in the region $(0,T]\times\mathbb{R}^2$.

3. Formulation of Results

Assumption 3.1. In the sequel, we always assume that $\theta$ is regular at time $t = 0$. More precisely, we assume $\theta(0,\cdot)$ is in $B^{\delta_0}_{p_0,q_0}$ for some $p_0 \in [1,\infty)$, $q_0 \in [1,\infty]$ and $\delta_0 > 1 - \gamma + \frac{2}{p_0}$.

Remark 3.2. Assumption 3.1 seems quite natural due to the local smoothing effect of (1.1) (see, for instance, [15,16]).

Remark 3.3. Because of the well-known embedding relations of Besov spaces, we have $\theta(0,\cdot) \in B^{\delta}_{p,q}$ for any $p \in [p_0,\infty]$, $q \in [1,\infty]$ and some $\delta > 1 - \gamma + \frac{2}{p}$. Moreover, by the $L^p$ maximum principle for (1.1), it holds that $\theta \in L^\infty([0,\infty); L^p)$ for any $p \in [p_0,\infty]$.
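The threshold $1 - \gamma + 2/p$ in Assumption 3.1 and Remark 3.3, and the exponent $\alpha$ in Theorem 2.4, can be read off from a scaling heuristic. The sketch below assumes that (1.1) has the form $\partial_t\theta + u\cdot\nabla\theta + \Lambda^{\gamma}\theta = 0$ with $u$ obtained from $\theta$ by Riesz transforms (an operator homogeneous of degree zero); this form is not restated in the present section. The rescaled function $\theta_\lambda$ is notation introduced only for this illustration, and homogeneous norms are used for simplicity.

```latex
% Scaling heuristic (illustration only; \theta_\lambda is not notation from the paper).
% Assumed form of (1.1): \partial_t\theta + u\cdot\nabla\theta + \Lambda^\gamma\theta = 0,
% with u given by Riesz transforms of \theta.
\[
  \theta_\lambda(t,x) := \lambda^{\gamma-1}\,\theta(\lambda^{\gamma}t,\lambda x),
  \qquad \lambda>0 .
\]
% Each term of the equation picks up the same factor \lambda^{2\gamma-1}:
\[
  \partial_t\theta_\lambda = \lambda^{2\gamma-1}(\partial_t\theta)(\lambda^{\gamma}t,\lambda x),\quad
  u_\lambda\cdot\nabla\theta_\lambda = \lambda^{2\gamma-1}(u\cdot\nabla\theta)(\lambda^{\gamma}t,\lambda x),\quad
  \Lambda^{\gamma}\theta_\lambda = \lambda^{2\gamma-1}(\Lambda^{\gamma}\theta)(\lambda^{\gamma}t,\lambda x).
\]
% Homogeneous Besov norms in two space dimensions scale as
% (with equality for dyadic \lambda, up to equivalence of norms otherwise)
\[
  \|\theta_\lambda\|_{L^{r}_t\dot B^{\alpha}_{p,\infty}}
  \simeq \lambda^{(\gamma-1)+\alpha-\frac{2}{p}-\frac{\gamma}{r}}\,
    \|\theta\|_{L^{r}_t\dot B^{\alpha}_{p,\infty}},
\]
% so the norm is scale invariant exactly when
\[
  \alpha = \frac{2}{p} + 1 - \gamma + \frac{\gamma}{r}.
\]
% For p = r = \infty this gives \alpha = 1-\gamma, the critical Hölder exponent.
```

Under these assumptions, the condition $\delta_0 > 1 - \gamma + 2/p_0$ says that the initial datum is slightly better than critical in the $B^{\delta_0}_{p_0,q_0}$ scale.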


Now we state the main results of the paper. The first theorem states that weak solutions in certain critical Hölder spaces are regular.

Theorem 3.4. Let $\gamma \in (0,1)$ and $T \in (0,\infty)$. If $\theta \in C((0,T); C^{1-\gamma}(\mathbb{R}^2))$ is a weak solution of (1.1), then $\theta$ is in $C^\infty((0,T]\times\mathbb{R}^2)$, and thus it is a classical solution of (1.1) in the region $(0,T]\times\mathbb{R}^2$.

The second theorem extends Theorem 2.4 to the limiting case $p = \infty$.

Theorem 3.5. Let $\gamma \in (0,1)$, $T \in (0,\infty)$, $r \in [1,\infty)$ and $\alpha = 1 - \gamma + \frac{\gamma}{r}$. If $\theta \in L^{r}((0,T); C^{\alpha}(\mathbb{R}^2))$ is a weak solution of (1.1), then $\theta$ is in $C^\infty((0,T]\times\mathbb{R}^2)$, and thus it is a classical solution of (1.1) in the region $(0,T]\times\mathbb{R}^2$.

Remark 3.6. It is not clear to us if the result of Theorem 3.5 still holds true when $r = \infty$. In some sense, Theorem 3.4 gives a partial answer to this open problem (see also Lemma 4.2). On the other hand, for the critical dissipative quasi-geostrophic equation, i.e. $\gamma = 1$, Caffarelli and Vasseur [1] established that any weak solution in $L^\infty_{loc}((0,\infty)\times\mathbb{R}^2)$ is regular.

4. Proof of Theorem 3.4

As we mentioned in the introduction, Theorem 3.4 is inspired by the analogous theorem for the Navier-Stokes equations presented in [6]. The proof of Theorem 3.4 relies on a regularity criterion stated in Lemma 4.2. On the other hand, the proof of Lemma 4.2 is based on the regularity criterion formulated in Lemma 4.1, which exploits a certain cancellation property of the nonlinear term. We identify such a cancellation property by using Bony's paraproduct formula for Littlewood-Paley operators and a commutator estimate involving Littlewood-Paley operators. Hence on a technical level our approach differs from the approach employed in [6].

Lemma 4.1. Let $\theta$ be a weak solution of (1.1) in $[0,T]\times\mathbb{R}^2$. Then there exists a positive constant $\varepsilon_0$ such that if $\theta$ satisfies
$$\limsup_{j\to\infty}\,\sup_{t\in(0,T)} 2^{j(1-\gamma)}\|\theta_j(t,\cdot)\|_{L^\infty_x} < \varepsilon_0, \tag{4.1}$$
then $\theta(t,x)$ is regular in $[0,T]\times\mathbb{R}^2$.

Proof. We prove the lemma by contradiction. Suppose $\theta$ blows up in $(0,T]$. Without loss of generality, one may assume $T$ is the first blow-up time. Let us start by applying the operator $\Delta_j$, $j > 0$, to the first equation in (1.1) and use the divergence-free property of $u$ to obtain
$$\partial_t\theta_j + \nabla\cdot\Delta_j(u\theta) + \Lambda^{\gamma}\theta_j = 0. \tag{4.2}$$


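We multiply (4.2) by $|\theta_j|^{p-2}\theta_j$, where $p$ is an even number to be specified later, and integrate in $x$ to obtain
$$\frac{1}{p}\frac{d}{dt}\|\theta_j\|_{L^p}^{p} + \int_{\mathbb{R}^2}(\Lambda^{\gamma}\theta_j)\,|\theta_j|^{p-2}\theta_j\,dx = -\int_{\mathbb{R}^2}\nabla\cdot\Delta_j(u\theta)\,|\theta_j|^{p-2}\theta_j\,dx. \tag{4.3}$$
Fix an integer $N \ge 10$ and fix an $\varepsilon \in (0,1)$. In order to simplify the notation we set
$$\beta = 2 + p(1-\gamma). \tag{4.4}$$
Now we use Lemma 2.2 to obtain a lower bound on the second term on the left-hand side of (4.3) to derive
$$\frac{1}{p}\frac{d}{dt}\|\theta_j\|_{L^p}^{p} + \lambda 2^{\gamma j}\|\theta_j\|_{L^p}^{p} \le -\int_{\mathbb{R}^2}\nabla\cdot\Delta_j(u\theta)\,|\theta_j|^{p-2}\theta_j\,dx,$$
which after being multiplied by $2^{j(\beta+\varepsilon)}$ and summed over $j \ge N$ gives:
$$\frac{1}{p}\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\frac{d}{dt}\|\theta_j\|_{L^p}^{p} + \lambda\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+\gamma)}\|\theta_j\|_{L^p}^{p} \le -\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot\Delta_j(u\theta)\,|\theta_j|^{p-2}\theta_j\,dx. \tag{4.5}$$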

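In order to bound the term on the right-hand side of (4.5) we split the nonlinear term $\Delta_j(u\theta)$ by applying Bony's decomposition and the localization properties of the Littlewood-Paley operators as follows:
$$\Delta_j(u\theta) = N_{j,lh} + N_{j,hl} + N_{j,hh},$$
where
$$N_{j,lh} = \Delta_j(u_{\le j+4}\,\theta_{j-2\le\cdot\le j+2}), \quad N_{j,hl} = \Delta_j(u_{j-2\le\cdot\le j+2}\,\theta_{\le j-3}), \quad N_{j,hh} = \sum_{k=j+3}^{\infty}\Delta_j(u_{k-2\le\cdot\le k+2}\,\theta_k).$$
Hence we can write
$$\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot\Delta_j(u\theta)\,|\theta_j|^{p-2}\theta_j\,dx = I_1 + I_2 + I_3,$$
where
$$I_1 = \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot N_{j,lh}\,|\theta_j|^{p-2}\theta_j\,dx,$$
$$I_2 = \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot N_{j,hl}\,|\theta_j|^{p-2}\theta_j\,dx,$$
$$I_3 = \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot N_{j,hh}\,|\theta_j|^{p-2}\theta_j\,dx.$$
In what follows we denote $f_{j-2\le\cdot\le j+2}$ by $\tilde f_j$.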


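We start by estimating $I_1$. We use localization properties of Littlewood-Paley operators and the divergence-free property of $u$ to notice that
$$\begin{aligned} I_1 &= \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot\Delta_j(u_{\le j+4}\tilde\theta_j)\,|\theta_j|^{p-2}\theta_j\,dx\\ &= \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot\Delta_{\le j+6}\bigl(\Delta_j(u_{\le j+4}\tilde\theta_j) - u_{\le j+4}\theta_j\bigr)\,|\theta_j|^{p-2}\theta_j\,dx\\ &= \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\int_{\mathbb{R}^2}\nabla\cdot\Delta_{\le j+6}[\Delta_j,u_{\le j+4}]\tilde\theta_j\,|\theta_j|^{p-2}\theta_j\,dx,\end{aligned}$$
thanks to which we can use Hölder's inequality to get
$$|I_1| \lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\bigl\|\nabla\cdot\Delta_{\le j+6}[\Delta_j,u_{\le j+4}]\tilde\theta_j\bigr\|_{L^{\frac{p+1}{2}}}\|\theta_j\|_{L^{p+1}}^{p-1}. \tag{4.6}$$
We then apply the commutator estimate stated in Lemma 2.3 to obtain
$$\bigl\|\nabla\cdot\Delta_{\le j+6}[\Delta_j,u_{\le j+4}]\tilde\theta_j\bigr\|_{L^{\frac{p+1}{2}}} \lesssim \|\nabla u_{\le j+4}\|_{L^{p+1}}\|\tilde\theta_j\|_{L^{p+1}} \lesssim \sum_{k\le j+4}2^{k}\|u_k\|_{L^{p+1}}\|\tilde\theta_j\|_{L^{p+1}}. \tag{4.7}$$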

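Now we combine (4.6) and (4.7) and use the properties of Riesz transforms as follows:
$$\begin{aligned} |I_1| &\lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^{p+1}}^{p-1}\sum_{k\le j+4}2^{k}\|u_k\|_{L^{p+1}}\|\tilde\theta_j\|_{L^{p+1}}\\ &\lesssim \sum_{j=N-2}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^{p+1}}^{p}\sum_{k\le j+4}2^{k}\|\theta_k\|_{L^{p+1}}\\ &= \sum_{j=N-2}^{\infty}2^{j(\beta+\varepsilon+1)\frac{p}{p+1}}\|\theta_j\|_{L^{p+1}}^{p}\sum_{k\le j+4}\Bigl(\frac{2^{k}}{2^{j}}\Bigr)^{1-\frac{\beta+\varepsilon+1}{p+1}}2^{k(\beta+\varepsilon+1)\frac{1}{p+1}}\|\theta_k\|_{L^{p+1}}\\ &\lesssim \sum_{j=N-2}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1} + \sum_{j=N-2}^{\infty}\sum_{k\le j+4}\Bigl(\frac{2^{k}}{2^{j}}\Bigr)^{\frac{p-\beta-\varepsilon}{2}}2^{k(\beta+\varepsilon+1)}\|\theta_k\|_{L^{p+1}}^{p+1}\end{aligned} \tag{4.8}$$
$$\lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1} + R(N), \tag{4.9}$$
where
$$R(N) = \sup_{t\in(0,T)}\sum_{l=-1}^{N-1}2^{l(\beta+\varepsilon+1)}\|\theta_l\|_{L^{p+1}}^{p+1}. \tag{4.10}$$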


We note that in order to obtain (4.8) we use Young's inequality and Hölder's inequality, and we require that $p$ satisfies $p - \beta - \varepsilon > 0$. Hence we choose $p$ such that
$$p > \frac{2+\varepsilon}{\gamma}. \tag{4.11}$$
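The equivalence between the requirement $p - \beta - \varepsilon > 0$ and the choice (4.11) follows by substituting the definition (4.4) of $\beta$. The short computation is recorded here for convenience.

```latex
% Substituting \beta = 2 + p(1-\gamma) from (4.4):
\[
  p-\beta-\varepsilon
  = p-\bigl(2+p(1-\gamma)\bigr)-\varepsilon
  = \gamma p-2-\varepsilon ,
\]
% and, since \gamma > 0,
\[
  p-\beta-\varepsilon>0
  \iff \gamma p>2+\varepsilon
  \iff p>\frac{2+\varepsilon}{\gamma}.
\]
```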

In an analogous way we obtain the following upper bound on $I_2$:
$$|I_2| \lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1} + R(N). \tag{4.12}$$

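Now we obtain an upper bound on $I_3$. We start by applying the Hölder inequality:
$$|I_3| \lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\nabla\cdot N_{j,hh}\|_{L^{\frac{p+1}{2}}}\|\theta_j\|_{L^{p+1}}^{p-1}. \tag{4.13}$$
To estimate $\|\nabla\cdot N_{j,hh}\|_{L^{\frac{p+1}{2}}}$ we apply the Hölder inequality and properties of the Riesz transform to obtain
$$\|\nabla\cdot N_{j,hh}\|_{L^{\frac{p+1}{2}}} = \Bigl\|\nabla\cdot\sum_{k=j+3}^{\infty}\Delta_j(u_{k-2\le\cdot\le k+2}\,\theta_k)\Bigr\|_{L^{\frac{p+1}{2}}} \lesssim 2^{j}\sum_{k=j+3}^{\infty}\|\tilde\theta_k\|_{L^{p+1}}\|\theta_k\|_{L^{p+1}},$$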

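which combined with (4.13) gives
$$\begin{aligned} |I_3| &\lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p-1}\sum_{k=j+3}^{\infty}\|\tilde\theta_k\|_{L^{p+1}}\|\theta_k\|_{L^{p+1}}\\ &\le \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)\frac{p-1}{p+1}}\|\theta_j\|_{L^{p+1}}^{p-1}\sum_{k=j+3}^{\infty}\Bigl(\frac{2^{j}}{2^{k}}\Bigr)^{\frac{2(\beta+\varepsilon+1)}{p+1}}2^{k(\beta+\varepsilon+1)\frac{2}{p+1}}\|\tilde\theta_k\|_{L^{p+1}}^{2}\\ &\lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1} + \sum_{j=N}^{\infty}\sum_{k=j+3}^{\infty}\Bigl(\frac{2^{j}}{2^{k}}\Bigr)^{\frac{\beta+\varepsilon+1}{2}}2^{k(\beta+\varepsilon+1)}\|\tilde\theta_k\|_{L^{p+1}}^{p+1}\end{aligned} \tag{4.14}$$
$$\lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1}, \tag{4.15}$$
where to obtain (4.14) we used Young's inequality. Now we combine (4.5), (4.9), (4.12) and (4.15) to obtain
$$\frac{1}{p}\frac{d}{dt}\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p}^{p} + \lambda\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+\gamma)}\|\theta_j\|_{L^p}^{p} \lesssim \sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^{p+1}}^{p+1} + R(N),$$
which thanks to the following interpolation inequality
$$\|\theta_j\|_{L^{p+1}}^{p+1} \le \|\theta_j\|_{L^p}^{p}\|\theta_j\|_{L^\infty},$$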


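implies
$$\frac{1}{p}\frac{d}{dt}\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p}^{p} + \lambda\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+\gamma)}\|\theta_j\|_{L^p}^{p} \le C_1\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon+\gamma)}\|\theta_j\|_{L^p}^{p}\,2^{j(1-\gamma)}\|\theta_j\|_{L^\infty} + C_1R(N), \tag{4.16}$$
where $C_1$ is some universal constant. We choose $N$ such that $2^{j(1-\gamma)}\|\theta_j\|_{L^\infty} < \frac{\lambda}{C_1}$ for all $j \ge N$ and all $t \in (0,T)$. Hence (4.16) implies that for any $t \in (0,T)$,
$$\frac{1}{p}\frac{d}{dt}\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p}^{p} \le C_1R(N).$$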
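The passage from (4.16) to the last inequality is an absorption argument. As a sketch, write $a_j(t) := 2^{j(\beta+\varepsilon+\gamma)}\|\theta_j\|_{L^p}^{p} \ge 0$ and $x_j(t) := 2^{j(1-\gamma)}\|\theta_j\|_{L^\infty}$. These abbreviations are introduced here only for illustration, and the sums are assumed to be finite so that terms may be subtracted.

```latex
% (4.16) in the abbreviated notation a_j, x_j (notation of this sketch only):
\[
  \frac1p\frac{d}{dt}\sum_{j\ge N}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p}^p
  + \lambda\sum_{j\ge N}a_j
  \;\le\; C_1\sum_{j\ge N}a_j\,x_j + C_1R(N).
\]
% If x_j < \lambda/C_1 for all j \ge N and all t \in (0,T), then
\[
  C_1\sum_{j\ge N}a_j\,x_j \;\le\; \lambda\sum_{j\ge N}a_j ,
\]
% and subtracting \lambda\sum_{j\ge N}a_j from both sides leaves
\[
  \frac1p\frac{d}{dt}\sum_{j\ge N}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p}^p \;\le\; C_1R(N).
\]
```

This is where the smallness hypothesis (4.1) enters: it guarantees that such an $N$ exists once $\varepsilon_0$ is taken no larger than $\lambda/C_1$.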

Since $\beta$ is given via (4.4) and $R(N) < \infty$, we can use Theorem 2.4 to conclude that $\theta(t)$ is regular on $(0,T]$, which gives a contradiction.

Lemma 4.2. Let $\theta$ be a weak solution of (1.1) in $[0,T]\times\mathbb{R}^2$. There exists a positive constant $\varepsilon_1$ such that if $\theta$ satisfies
$$\sup_{t\in(0,T]}\,\limsup_{s\to t^-}\|\theta(t,\cdot) - \theta(s,\cdot)\|_{C^{1-\gamma}} < \varepsilon_1, \tag{4.17}$$
then $\theta(t,x)$ is regular in $[0,T]\times\mathbb{R}^2$.

Proof. We prove the lemma by contradiction and without loss of generality assume $T$ to be the first blow-up time of $\theta$. Because of (4.17), there exists $t_1 \in (0,T)$ such that
$$\|\theta(T,\cdot) - \theta(t_1,\cdot)\|_{C^{1-\gamma}} < \varepsilon_1.$$
Since $\theta(t_1,\cdot)$ is regular, we have
$$\limsup_{j\to\infty}2^{j(1-\gamma)}\|\theta_j(t_1,\cdot)\|_{L^\infty_x} = 0,$$
and therefore,
$$\limsup_{j\to\infty}2^{j(1-\gamma)}\|\theta_j(T,\cdot)\|_{L^\infty_x} < \varepsilon_1.$$
This and (4.17) imply that for some $t_2 \in (0,T)$,
$$\limsup_{j\to\infty}\,\sup_{s\in(t_2,T)}2^{j(1-\gamma)}\|\theta_j(s,\cdot)\|_{L^\infty_x} < 2\varepsilon_1.$$
To get a contradiction, it suffices to set $\varepsilon_1 = \varepsilon_0/2$ and apply Lemma 4.1.

Proof of Theorem 3.4. It follows directly from Lemma 4.2.
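In the proof of Lemma 4.2, the step from the bound on $\theta(T,\cdot) - \theta(t_1,\cdot)$ in $C^{1-\gamma}$ to the bound on the high-frequency blocks at time $T$ rests on the triangle inequality. The following sketch assumes that the $C^{1-\gamma}$ norm is taken to be the $B^{1-\gamma}_{\infty,\infty}$ norm of Sect. 2.1.

```latex
% For every j \ge 0, linearity of \Delta_j and the definition of the
% B^{1-\gamma}_{\infty,\infty} norm give
\[
  2^{j(1-\gamma)}\|\theta_j(T,\cdot)\|_{L^\infty_x}
  \le 2^{j(1-\gamma)}\|\theta_j(t_1,\cdot)\|_{L^\infty_x}
   + 2^{j(1-\gamma)}\|\Delta_j(\theta(T,\cdot)-\theta(t_1,\cdot))\|_{L^\infty_x}
  \le 2^{j(1-\gamma)}\|\theta_j(t_1,\cdot)\|_{L^\infty_x}
   + \|\theta(T,\cdot)-\theta(t_1,\cdot)\|_{C^{1-\gamma}} .
\]
% Taking \limsup_{j\to\infty}, the first term on the right vanishes, so
\[
  \limsup_{j\to\infty}2^{j(1-\gamma)}\|\theta_j(T,\cdot)\|_{L^\infty_x}
  \le \|\theta(T,\cdot)-\theta(t_1,\cdot)\|_{C^{1-\gamma}} < \varepsilon_1 .
\]
```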


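5. Proof of Theorem 3.5

By applying Young's inequality, we get
$$2^{j(\beta+\varepsilon+1)}\|\theta_j\|_{L^\infty_x} \le C_2\,2^{j(\beta+\varepsilon+\gamma+r(1-\gamma))}\|\theta_j\|_{L^\infty_x}^{r} + \frac{\lambda}{2C_1}2^{j(\beta+\gamma+\varepsilon)},$$
for some constant $C_2 > 0$ depending only on $\lambda$, $p$ and $r$. This together with (4.16) yields
$$\frac{1}{p}\frac{d}{dt}\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p_x}^{p} + \frac{\lambda}{2}\sum_{j=N}^{\infty}2^{j(\beta+\gamma+\varepsilon)}\|\theta_j\|_{L^p_x}^{p} \le C_2\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p_x}^{p}\Bigl(2^{j(1-\gamma+\frac{\gamma}{r})}\|\theta_j\|_{L^\infty_x}\Bigr)^{r} + R(N). \tag{5.1}$$
Due to the definition of Besov spaces, the right-hand side of (5.1) is less than or equal to
$$C_2\sum_{j=N}^{\infty}2^{j(\beta+\varepsilon)}\|\theta_j\|_{L^p_x}^{p}\,\|\theta\|_{C^{\alpha}}^{r} + R(N).$$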
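The Gronwall step can be sketched as follows. Set $Y(t) := \sum_{j\ge N}2^{j(\beta+\varepsilon)}\|\theta_j(t)\|_{L^p_x}^{p}$ and $A(t) := pC_2\int_0^t\|\theta(s)\|_{C^{\alpha}}^{r}\,ds$. Both are notation introduced here for illustration only; $Y(0)$ is assumed to be finite (cf. Assumption 3.1 and Remark 3.3), and the nonnegative dissipative term on the left-hand side of (5.1) is simply dropped.

```latex
% From (5.1) and the Besov bound on its right-hand side
% (Y and A are notation of this sketch, not of the paper):
\[
  Y'(t) \le pC_2\,\|\theta(t)\|_{C^\alpha}^{r}\,Y(t) + pR(N) = A'(t)\,Y(t) + pR(N).
\]
% Multiplying by e^{-A(t)} and integrating over (0,t); R(N) does not depend on t:
\[
  Y(t) \le e^{A(t)}Y(0) + pR(N)\int_0^t e^{A(t)-A(s)}\,ds
       \le e^{A(t)}\bigl(Y(0) + pR(N)\,t\bigr),
\]
% which stays finite on [0,T] as soon as \theta \in L^r((0,T);C^\alpha) and R(N) < \infty.
```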

To finish the proof of Theorem 3.5, it suffices to use Gronwall’s inequality and Theorem 2.4 keeping in mind that β is given via (4.4). Acknowledgement. The authors are grateful to the referee for very helpful comments.

References
1. Caffarelli, L., Vasseur, A.: Drift diffusion equations with fractional diffusion and the quasi-geostrophic equation. Ann. Math. (to appear)
2. Chae, D.: The quasi-geostrophic equation in the Triebel-Lizorkin spaces. Nonlinearity 16(2), 479–495 (2003)
3. Chae, D.: On the regularity conditions for the dissipative quasi-geostrophic equations. SIAM J. Math. Anal. 37(5), 1649–1656 (2006)
4. Chae, D., Lee, J.: Global well-posedness in the super-critical dissipative quasi-geostrophic equations. Commun. Math. Phys. 233, 297–311 (2003)
5. Chen, Q., Miao, C., Zhang, Z.: A new Bernstein's Inequality and the 2D Dissipative Quasi-Geostrophic Equation. Commun. Math. Phys. 271(3), 821–838 (2007)
6. Cheskidov, A., Shvydkoy, R.: On the regularity of weak solutions of the 3D Navier-Stokes equations in $B^{-1}_{\infty,\infty}$. http://arXiv.org/abs/math.AP/0708.3067v2[math.AP], 2007
7. Cheskidov, A., Constantin, P., Friedlander, S., Shvydkoy, R.: Energy conservation and Onsager's conjecture for the Euler equations. Nonlinearity 21(6), 1233–1252 (2008)
8. Constantin, P., Cordoba, D., Wu, J.: On the critical dissipative quasi-geostrophic equation. Indiana Univ. Math. J. 50, 97–107 (2001)
9. Constantin, P., Majda, A.J., Tabak, E.: Formation of strong fronts in the 2-D quasigeostrophic thermal active scalar. Nonlinearity 7(6), 1495–1533 (1994)
10. Constantin, P., Wu, J.: Behavior of solutions of 2D quasi-geostrophic equations. SIAM J. Math. Anal. 30, 937–948 (1999)
11. Constantin, P., Wu, J.: Hölder continuity of solutions of super-critical dissipative hydrodynamic transport equations. Ann. Inst. H. Poincaré Anal. Non Linéaire 26(1), 159–180 (2009)
12. Constantin, P., Wu, J.: Regularity of Hölder continuous solutions of the supercritical quasi-geostrophic equation. Ann. Inst. H. Poincaré Anal. Non Linéaire 25(6), 1103–1110 (2008)
13. Córdoba, A., Córdoba, D.: A maximum principle applied to quasi-geostrophic equations. Commun. Math. Phys. 249(3), 511–528 (2004)


14. Dong, B.-Q., Chen, Z.-M.: A remark on regularity criterion for the dissipative quasi-geostrophic equations. J. Math. Anal. Appl. 329(2), 1212–1217 (2007)
15. Dong, H.: Higher regularity for the critical and super-critical dissipative quasi-geostrophic equations. http://arXiv.org/abs/math/0701826v1[math.AP], 2007
16. Dong, H., Li, D.: On the 2D critical and supercritical dissipative quasi-geostrophic equation in Besov spaces. Preprint 2007
17. Dong, H., Du, D.: Global well-posedness and a decay estimate for the critical dissipative quasi-geostrophic equation in the whole space. Discrete Contin. Dyn. Syst. 21(4), 1095–1101 (2008)
18. Dong, H., Pavlović, N.: A regularity criterion for the dissipative quasi-geostrophic equations. To appear in Annales de l'Institut Henri Poincare - Non Linear Analysis, DOI:10.1016/j.anihpe.2008.08.001, 2008
19. Hmidi, T., Keraani, S.: Global solutions of the super-critical 2D quasi-geostrophic equation in Besov spaces. Adv. Math. 214(2), 618–638 (2007)
20. Kiselev, A., Nazarov, F., Volberg, A.: Global well-posedness for the critical 2D dissipative quasi-geostrophic equation. Invent. Math. 167(3), 445–453 (2007)
21. Ju, N.: The maximum principle and the global attractor for the dissipative 2D quasi-geostrophic equations. Commun. Math. Phys. 255(1), 161–181 (2005)
22. Ju, N.: Dissipative quasi-geostrophic equation: local well-posedness, global regularity and similarity solutions. Indiana Univ. Math. J. 56(1), 187–206 (2007)
23. Miura, H.: Dissipative quasi-geostrophic equation for large initial data in the critical Sobolev space. Commun. Math. Phys. 267(1), 141–157 (2006)
24. Pedlosky, J.: Geophysical fluid dynamics. New York: Springer, 1987
25. Resnick, S.: Dynamical problems in nonlinear advective partial differential equations. Ph.D. Thesis, University of Chicago, 1995
26. Wu, J.: Global solutions of the 2D dissipative quasi-geostrophic equations in Besov spaces. SIAM J. Math. Anal. 36(3), 1014–1030 (2004/05) (electronic)
27. Wu, J.: Solutions of the 2D quasi-geostrophic equation in Hölder spaces. Nonlinear Anal. 62(4), 579–594 (2005)
28. Wu, J.: Lower bounds for an integral involving fractional Laplacians and the generalized Navier-Stokes equations in Besov spaces. Commun. Math. Phys. 263(3), 803–831 (2006)
29. Wu, J.: Existence and uniqueness results for the 2-D dissipative quasi-geostrophic equation. Nonlinear Analysis 67, 3013–3036 (2007)

Communicated by P. Constantin

Commun. Math. Phys. 290, 813–860 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0862-9

Communications in

Mathematical Physics

Analytic Torsion of a Bounded Generalized Cone Boris Vertman Department of Mathematics, University of Bonn, Endenicher Allee 60, 53115 Bonn, Germany. E-mail: [email protected] Received: 18 August 2008 / Accepted: 7 March 2009 Published online: 7 July 2009 – © Springer-Verlag 2009

Abstract: We compute the analytic torsion of a bounded generalized cone by generalizing the computational methods of M. Spreafico and using the symmetry in the de Rham complex, as established by M. Lesch. We evaluate our result in lower dimensions and further provide a separate computation of analytic torsion of a bounded generalized cone over S 1 , since the standard cone over the sphere is simply a flat disc. 1. Introduction Torsion invariants for manifolds, which are not simply connected, were introduced by K. Reidemeister in [Re1,Re2] and generalized to higher dimensions by W. Franz in [Fr]. Using these torsion invariants the authors obtained a full PL-classification of lens spaces. The Reidemeister-Franz torsion, in short − the Reidemeister torsion, was the first invariant of manifolds which was not a homotopy invariant. The Reidemeister-Franz definition of torsion invariants was extended later to smooth manifolds by J. H. Whitehead in [Wh] and G. de Rham in [Rh]. With their construction G. de Rham further proved that a spherical Clifford-Klein manifold is determined up to isometry by its fundamental group and its Reidemeister torsion. The Reidemeister-Franz torsion is a combinatorial invariant and can be constructed using a cell-decomposition or a triangulation of the underlying manifold. The combinatorial invariance under subdivisions was established by J. Milnor in [Mi], see also [RS]. It is therefore a topological invariant of M, however not a homotopy invariant. There is a series of results relating combinatorial and analytic objects, among them the Atiyah-Singer Index Theorem. In view of these results it is natural to ask for the analytic counterpart of the combinatorial Reidemeister torsion. Such an analytic torsion was introduced by D. B. Ray and I. M. Singer in [RS] in the form of a weighted product of zeta-regularized determinants of Laplace operators on differential forms. 
The zeta-regularized determinant of a Laplace Operator is a spectral invariant which very quickly became an object of interest on its own in differential and conformal


geometry, studied in particular as a function of metrics for appropriate geometric operators. Further it plays a role in mathematical physics where it gives a regularization procedure of functional path integrals (partition function), see [H]. In their work D.B. Ray and I. M. Singer provided some motivation why the analytic torsion should equal the combinatorial invariant. The celebrated Cheeger-Müller Theorem, established independently by J. Cheeger in [Ch] and W. Müller in [Mu1], proved equality between the analytic Ray-Singer torsion and the combinatorial Reidemeister torsion for any smooth closed manifold with an orthogonal representation of its fundamental group. The proofs of J. Cheeger and W. Müller use different approaches. The first author in principle studied the behaviour of the Ray-Singer torsion under surgery. The second author used combinatorial parametrices and approximation theory of Dodziuk [Do] to reduce the problem to trivial representations, treating this problem then by surgeries. Note a different approach of Burghelea-Friedlander-Kappeler in [BFK] and Bismut-Zhang in [BZ1], who obtained a new proof of the result by J. Cheeger and W. Müller using Witten deformation of the de Rham complex via a Morse function. The study of the analytic torsion of Ray and Singer has taken the following natural steps. The setup of a closed Riemannian manifold with its marking point − the Cheeger Müller Theorem, was followed by the discussion of compact manifolds with smooth boundary. In the context of smooth compact manifolds with boundary a Cheeger-Müller type result was established in the work of W. Lück [Lü] and S. Vishik [V]. While the first author reduced the discussion to the known Cheeger-Müller Theorem on closed manifolds via the closed double construction, the second author gave an independent proof of the Cheeger-Müller Theorem on smooth compact manifolds with and without boundary by establishing the gluing property of the analytic torsion. 
Both proofs work under the assumption of product metric structures near the boundary. However, with the anomaly formulas in [BZ1,BM] the assumption on product structures can be relaxed. The next natural step in the study of analytic torsion is the treatment of Riemannian manifolds with singularities. We are interested in the simplest case, the conical singularity. The analysis and the geometry of spaces with conical singularities were developed in the classical works of J. Cheeger in [Ch1,Ch2]. This setup is modelled by a bounded generalized cone $M = (0,1]\times N$ over a closed Riemannian manifold $(N, g^N)$ with the Riemannian metric
$$g^M = dx^2 \oplus x^2 g^N.$$
The analytic Ray-Singer torsion is shown in [Dar] to exist on a bounded generalized cone, and the natural question arises whether one can establish a Cheeger-Müller type theorem in the singular setup as well. Following [L, Problem 5.3], the idea is to reduce, via the gluing formula of Vishik [V], the comparison of Ray-Singer and $\bar p$-Reidemeister torsion (intersection torsion, cf. [Dar]) on compact manifolds with conical singularities to a comparison on a bounded generalized cone. The presented computation of analytic torsion on a bounded generalized cone solves the problem posed in [L, Problem 5.3]. We have provided the general answer to the question in Theorems 8.1 and 8.2 and obtained as an example explicit results in two and in three dimensions in Corollaries 8.3 and 8.4. After the computation of the analytic torsion of a bounded generalized cone one faces the problem of comparing it to the intersection torsion in the "right" perversity $\bar p$. However the complex form of the result for the analytic torsion at least complicates the comparison with the topological counterpart.


In the actual computation of the analytic torsion of a bounded generalized cone, we use the approach of M. Spreafico [S], combined with elements of [BKD], together with an observation of symmetry in the de Rham complex by M. Lesch in [L3]. The methods of [S] are generalized and refined by Spreafico in [S1,S2]. In fact J.S. Dowker and K. Kirsten provided in [DK] some explicit results for a bounded generalized cone $M = (0,1]\times N$, giving formulas which relate the zeta-determinants of form-valued Laplacians, essentially self-adjoint at the cone singularity and with Dirichlet or generalized Neumann conditions at the cone base, to the spectral information on the base manifold $N$. So, in the manner of [Ch2], they reduced analysis on the cone to that on its base. Theoretically these results can be composed directly into a formula for the analytic torsion. However this approach would disregard the subtle symmetry of the de Rham complex of a bounded generalized cone, which was derived by M. Lesch in [L3]. Furthermore the formulas obtained this way turn out to be rather ineffective. We present here an approach that does make use of the symmetry of the de Rham complex and leads to expressions that are easier to evaluate. The calculations are performed for any dimension $\ge 2$ with an overall general result for the analytic torsion of a bounded generalized cone. Further calculations are possible only by specifying the base manifold $N$. In Sect. 8 we provide explicit results in three and in two dimensions. For a bounded generalized cone of dimension two, over a one-dimensional sphere, one needs to introduce an additional parameter in the Riemannian cone metric in order to deal with a bounded generalized cone and not simply with a flat disc $D^1$. There is no need to evaluate the symmetry of the de Rham complex in this case. The calculations of [S] can be generalized to this setup in a straightforward way, which is done in Sect. 9.
It was brought to the attention of the author that the recent results of Hartmann-Spreafico in [HS] discuss analytic torsion of cones over spheres in dimensions two, three and four.

2. The de Rham Laplacian on a Bounded Generalized Cone

Consider a bounded generalized cone $M = (0,R]\times N$ over a closed oriented Riemannian manifold $(N, g^N)$ of dimension $\dim N = n$, with the Riemannian metric on $M$ given by a warped product
$$g^M = dx^2 \oplus x^2 g^N.$$
The volume forms on $M$ and $N$, associated to the Riemannian metrics $g^M$ and $g^N$, are related as follows:
$$\mathrm{vol}(g^M) = x^{n}\,dx\wedge\mathrm{vol}(g^N).$$
Consider as in [BS, (5.2)] the following separation of variables map, which is linear and bijective:
$$\Psi_k : C_0^\infty((0,R), \Omega^{k-1}(N)\oplus\Omega^{k}(N)) \to \Omega_0^{k}(M), \qquad (\phi_{k-1},\phi_k) \mapsto x^{k-1-n/2}\phi_{k-1}\wedge dx + x^{k-n/2}\phi_k, \tag{2.1}$$
where $\phi_k, \phi_{k-1}$ are identified with their pullback to $M$ under the natural projection $\pi : (0,R]\times N \to N$ onto the second factor, and $x$ is the canonical coordinate on $(0,R]$. Here $\Omega_0^{k}(M)$ denotes differential forms of degree $k = 0,\dots,n+1$ with compact support in the interior of $M$. The separation of variables map $\Psi_k$ extends to an isometry with respect to the $L^2$-scalar products induced by the volume forms $\mathrm{vol}(g^M)$ and $\mathrm{vol}(g^N)$.
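The weights $x^{k-1-n/2}$ and $x^{k-n/2}$ in (2.1) are what make the separation of variables map, written $\Psi_k$ here, compatible with the $L^2$ structures. The sketch below gives the pointwise computation. It assumes, as is standard for the warped product metric $dx^2\oplus x^2 g^N$, that a form of degree $k$ pulled back from $N$ has pointwise norm $x^{-k}$ times its norm on $N$, and that $dx$ has unit length and is orthogonal to forms pulled back from $N$.

```latex
% Pointwise norm of \Psi_k(\phi_{k-1},\phi_k) with respect to g^M = dx^2 \oplus x^2 g^N
% (assumption: |\omega|_{g^M} = x^{-k}|\omega|_{g^N} for a k-form \omega pulled back from N):
\[
  \bigl|x^{k-1-n/2}\phi_{k-1}\wedge dx + x^{k-n/2}\phi_k\bigr|^2_{g^M}
  = x^{2(k-1)-n}\,x^{-2(k-1)}|\phi_{k-1}|^2_{g^N} + x^{2k-n}\,x^{-2k}|\phi_k|^2_{g^N}
  = x^{-n}\bigl(|\phi_{k-1}|^2_{g^N}+|\phi_k|^2_{g^N}\bigr).
\]
% Integrating against vol(g^M) = x^n dx \wedge vol(g^N), the powers of x cancel:
\[
  \int_M \bigl|\Psi_k(\phi_{k-1},\phi_k)\bigr|^2_{g^M}\,\mathrm{vol}(g^M)
  = \int_0^R\!\!\int_N \bigl(|\phi_{k-1}|^2_{g^N}+|\phi_k|^2_{g^N}\bigr)\,\mathrm{vol}(g^N)\,dx .
\]
```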


Proposition 2.1. The separation of variables map (2.1) extends to an isometric identification of $L^2$-Hilbert spaces
$$\Psi_k : L^2([0,R], L^2(\wedge^{k-1}T^*N\oplus\wedge^{k}T^*N, \mathrm{vol}(g^N)), dx) \to L^2(\wedge^{k}T^*M, \mathrm{vol}(g^M)).$$
Under this identification we obtain for the exterior derivative, as in [BS, (5.5)],
$$\Psi_{k+1}^{-1}\,d_k\,\Psi_k = \begin{pmatrix} 0 & (-1)^{k}\partial_x\\ 0 & 0\end{pmatrix} + \frac{1}{x}\begin{pmatrix} d_{k-1,N} & c_k\\ 0 & d_{k,N}\end{pmatrix}, \tag{2.2}$$
where $c_k = (-1)^{k}(k - n/2)$ and $d_{k,N}$ denotes the exterior derivative on differential forms over $N$ of degree $k$. Taking adjoints we find
$$\Psi_{k}^{-1}\,d_k^{t}\,\Psi_{k+1} = \begin{pmatrix} 0 & 0\\ (-1)^{k+1}\partial_x & 0\end{pmatrix} + \frac{1}{x}\begin{pmatrix} d_{k-1,N}^{t} & 0\\ c_k & d_{k,N}^{t}\end{pmatrix}. \tag{2.3}$$
Consider now the Gauss-Bonnet operator $D_{GB}^{+}$ mapping forms of even degree to forms of odd degree. The Gauss-Bonnet operator acting on forms of odd degree is simply the formal adjoint $D_{GB}^{-} = (D_{GB}^{+})^{t}$. With respect to $\Psi_+ := \oplus\Psi_{2k}$ and $\Psi_- := \oplus\Psi_{2k+1}$ the relevant operators take the following form:
$$\Psi_-^{-1}D_{GB}^{+}\Psi_+ = \frac{d}{dx} + \frac{1}{x}S_0, \qquad \Psi_+^{-1}D_{GB}^{-}\Psi_- = -\frac{d}{dx} + \frac{1}{x}S_0, \tag{2.4}$$
$$\begin{aligned}\Psi_+^{-1}\Delta^{+}\Psi_+ &= \Psi_+^{-1}(D_{GB}^{+})^{t}D_{GB}^{+}\Psi_+ = -\frac{d^2}{dx^2} + \frac{1}{x^2}S_0(S_0+1),\\ \Psi_-^{-1}\Delta^{-}\Psi_- &= \Psi_-^{-1}(D_{GB}^{-})^{t}D_{GB}^{-}\Psi_- = -\frac{d^2}{dx^2} + \frac{1}{x^2}S_0(S_0-1),\end{aligned} \tag{2.5}$$
where $S_0$ is a first order elliptic differential operator on $\Omega^{*}(N)$. The transformed Gauss-Bonnet operators in (2.4) are regular singular in the sense of [BS] and [Br, Sect. 3]. Moreover, the Laplace operator on $k$-forms over $M$ transforms to
$$\Psi_k^{-1}\Delta_k\Psi_k = -\frac{d^2}{dx^2} + \frac{1}{x^2}A_k, \tag{2.6}$$
where the operator $A_k$ is simply the restriction of $S_0(S_0 + (-1)^{k})$ to $\Omega^{k-1}(N)\oplus\Omega^{k}(N)$. Note that under the isometric identification $\Psi_*$ the previous non-product situation of the bounded generalized cone $M$ is now incorporated in the $x$-dependence of the tangential parts of the geometric Gauss-Bonnet and Laplace operators.

Next we take boundary conditions into account and consider their behaviour under the isometric identification $\Psi_*$. More precisely, consider the exterior derivatives and their formal adjoints on differential forms with compact support in the interior of $M$:
$$d_k : \Omega_0^{k}(M) \to \Omega_0^{k+1}(M), \qquad d_k^{t} : \Omega_0^{k+1}(M) \to \Omega_0^{k}(M).$$
Define the minimal closed extensions $d_{k,\min}$ and $d_{k,\min}^{t}$ as the graph closures in $L^2(\wedge^{*}T^*M, \mathrm{vol}(g^M))$ of the differential operators $d_k$ and $d_k^{t}$ respectively. The operators $d_{k,\min}$ and $d_{k,\min}^{t}$ are closed and densely defined, so we form the adjoint operators and set for the maximal extensions:
$$d_{k,\max} := (d_{k,\min}^{t})^{*}, \qquad d_{k,\max}^{t} := (d_{k,\min})^{*}.$$


The following result is an easy consequence of the definitions of the minimal and maximal extensions and of Proposition 2.1.

Proposition 2.2.
$$\Psi_k^{-1}(\mathcal{D}(d_{k,\min})) = \mathcal{D}([\Psi_{k+1}^{-1}d_k\Psi_k]_{\min}), \qquad \Psi_k^{-1}(\mathcal{D}(d_{k,\max})) = \mathcal{D}([\Psi_{k+1}^{-1}d_k\Psi_k]_{\max}).$$
Similar statements hold for the minimal and maximal extensions of the formal adjoint operators $d_k^{t}$. The minimal and the maximal extensions of the exterior derivative give rise to self-adjoint extensions of the associated Laplace operator
$$\Delta_k = d_k^{t}d_k + d_{k-1}d_{k-1}^{t}.$$
It is important to note that there are self-adjoint extensions of $\Delta_k$ which do not come from closed extensions of $d_k$ and $d_{k-1}$; compare the notion of "ideal boundary conditions" in [BL1]. However the most relevant self-adjoint extensions of the Laplacian indeed seem to come from closed extensions of the exterior derivatives. In the present discussion we are interested in the "relative" self-adjoint extension of $\Delta_k$, arising from the minimal extension of the exterior derivative and defined as follows:
$$\Delta_k^{rel} := d_{k,\min}^{*}d_{k,\min} + d_{k-1,\min}d_{k-1,\min}^{*} = d_{k,\max}^{t}d_{k,\min} + d_{k-1,\min}d_{k-1,\max}^{t}. \tag{2.7}$$

As a direct consequence of the previous proposition and Proposition 2.1 we obtain the following for the relative self-adjoint extension.

Corollary 2.3. Consider the following two complexes:
$$(\Omega_0^{*}(M), d_k), \qquad (C_0^\infty((0,R), C^\infty(\wedge^{k-1}T^*N\oplus\wedge^{k}T^*N)), \ \tilde d_k := \Psi_{k+1}^{-1}d_k\Psi_k).$$
Then the relative self-adjoint extensions of the associated Laplacians
$$\Delta_k^{rel} = d_{k,\min}^{*}d_{k,\min} + d_{k-1,\min}d_{k-1,\min}^{*}, \qquad \tilde\Delta_k^{rel} = \tilde d_{k,\min}^{*}\tilde d_{k,\min} + \tilde d_{k-1,\min}\tilde d_{k-1,\min}^{*}$$
are spectrally equivalent, with $\Psi_k^{-1}(\mathcal{D}(\Delta_k^{rel})) = \mathcal{D}(\tilde\Delta_k^{rel})$ and
$$\tilde\Delta_k^{rel} = \Psi_k^{-1}\Delta_k^{rel}\Psi_k.$$
As a consequence of Corollary 2.3 we can deal with the minimal extension of the unitarily transformed exterior differential $\Psi_{k+1}^{-1}d_k\Psi_k$ and the relative extension of the unitarily transformed Laplacian $\Psi_k^{-1}\Delta_k\Psi_k$ without loss of generality for the spectral analysis. By a small abuse of notation we denote the operators again by $d_{k,\min}$ and $\Delta_k^{rel}$, in order to keep the notation simple.

Finally, we discuss the explicit form of the boundary conditions of $\Delta_k^{rel}$ at the regular boundary $\{x = R\}\times N$. They are derived from the following trace theorem of L. Paquet.


Theorem 2.4 [P, Theorem 1.9]. Let $K$ be a compact oriented Riemannian manifold with boundary $\partial K$ and let $\iota : \partial K \hookrightarrow K$ be the natural inclusion. Then the pullback $\iota^{*} : \Omega^{k}(K) \to \Omega^{k}(\partial K)$, with $\Omega^{k}(\partial K) = \{0\}$ for $k = \dim K$, extends continuously to the following linear surjective map:
$$\iota^{*} : \mathcal{D}(d_{k,\max}) \to \mathcal{D}(d_{k,\partial K}^{-1/2}),$$
where $d_{k,\partial K}^{-1/2}$ is the closure of the exterior derivative on $\partial K$ in the Sobolev space $H^{-1/2}(\wedge^{*}T^*\partial K)$ and $d_{k,\max}$ the maximal extension of the exterior derivative on $K$. The domains $\mathcal{D}(d_{k,\max})$ and $\mathcal{D}(d_{k,\partial K}^{-1/2})$ are Hilbert spaces with respect to the graph norms of the corresponding operators.

By Theorem 2.4 and regularity properties of elements in $\mathcal{D}(\Delta_{\max})$ at the regular boundary $x = R$ we can state the following result on relative boundary conditions.

Proposition 2.5. Let $\gamma \in C^\infty[0,R]$ be a smooth cut-off function, vanishing identically at $x = 0$ and being identically one at $x = R$. Then
$$\gamma\mathcal{D}(\Delta_k^{rel}) = \Bigl\{\Psi_k(\phi_{k-1},\phi_k) \in \gamma\mathcal{D}(\Delta_{k,\max}) \ \Big|\ \phi_k(R) = 0, \ \phi_{k-1}'(R) - \frac{(k-1-n/2)}{R}\phi_{k-1}(R) = 0\Bigr\}.$$

Proof. Let $r \in (0,R)$ be fixed and consider the associated natural inclusions
$$\chi : [r,R]\times N =: M_r \hookrightarrow M, \qquad \iota : \{R\}\times N \equiv N \hookrightarrow M, \qquad \iota_r : \{R\}\times N \equiv N \hookrightarrow M_r.$$
We obviously have $\iota = \chi\circ\iota_r$. The inclusions above induce pullbacks of differential forms. The pullback map $\chi^{*} : \Omega^{k}(M) \to \Omega^{k}(M_r)$ is simply a restriction and extends to a continuous linear map
$$\chi^{*} : \mathcal{D}(d_{k,\max}) \to \mathcal{D}(d_{k,\max}^{r}),$$
where $d_k^{r}$ is the $k$-th exterior derivative on $M_r \subset M$ and the domains are endowed with the graph norms of the corresponding operators. Applying Theorem 2.4 to the compact manifold $M_r$, we deduce that $\iota^{*} = \iota_r^{*}\circ\chi^{*}$ extends to a continuous linear map
$$\iota^{*} : \mathcal{D}(d_{k,\max}) \to \mathcal{D}(d_{k,N}^{-1/2}). \tag{2.8}$$
Now, continuity of $\iota^{*}$ together with the definition of the minimal domain $\mathcal{D}(d_{k,\min})$ implies
$$\gamma\mathcal{D}(d_{k,\min}) \subseteq \{\phi \in \gamma\mathcal{D}(d_{k,\max}) \mid \iota^{*}\phi = 0\}.$$
Equality in the relation above follows from the Lagrange identity for $d_k$. We obtain for the relative boundary conditions at the cone base:
$$\gamma\mathcal{D}(\Delta_k^{rel}) = \{\phi \in \gamma\mathcal{D}(\Delta_{k,\max}) \mid \iota^{*}\phi = 0, \ \iota^{*}(d_{k-1}^{t}\phi) = 0\}.$$
Now the statement of the proposition follows from the explicit action of $d_{k-1}^{t}$ under the isometric identification $\Psi_*$ and the fact that for $\Psi_k(\phi_{k-1},\phi_k) \in \mathcal{D}(\Delta_{k,\max})$ we have $\iota^{*}(\Psi_k(\phi_{k-1},\phi_k)) = R^{k-n/2}\phi_k(R)$.
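The last step of the proof can be made explicit. The following formal sketch applies (2.3), with $k$ replaced by $k-1$, to a pair $(\phi_{k-1},\phi_k)$ that is assumed smooth up to $x = R$. The symbol $\Psi_k$ denotes the separation of variables map of (2.1), and $c_{k-1} = (-1)^{k-1}(k-1-n/2)$ as in (2.2).

```latex
% (2.3) with k replaced by k-1, applied to the column vector (\phi_{k-1},\phi_k):
\[
  \Psi_{k-1}^{-1}\,d_{k-1}^{t}\,\Psi_k
  \begin{pmatrix}\phi_{k-1}\\ \phi_k\end{pmatrix}
  =
  \begin{pmatrix}
    \dfrac{1}{x}\,d_{k-2,N}^{t}\phi_{k-1}\\[8pt]
    (-1)^{k}\partial_x\phi_{k-1}
      + \dfrac{1}{x}\bigl(c_{k-1}\phi_{k-1} + d_{k-1,N}^{t}\phi_k\bigr)
  \end{pmatrix}.
\]
% The pullback \iota^* sees only the second component at x = R
% (up to the nonzero factor R^{(k-1)-n/2}), so \iota^*(d_{k-1}^t\phi) = 0 reads
\[
  (-1)^{k}\phi_{k-1}'(R)
  + \frac{1}{R}\bigl(c_{k-1}\phi_{k-1}(R) + d_{k-1,N}^{t}\phi_k(R)\bigr) = 0 .
\]
% With \phi_k(R) = 0 (hence d_{k-1,N}^t\phi_k(R) = 0, since d_N^t acts only along N)
% and c_{k-1} = (-1)^{k-1}(k-1-n/2), this becomes
\[
  \phi_{k-1}'(R) - \frac{k-1-n/2}{R}\,\phi_{k-1}(R) = 0 .
\]
```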


3. Decomposition of the de Rham Complex and its Inner Symmetry

In this section we discuss a decomposition of the de Rham complex of a bounded generalized cone, which ultimately leads to an explicit computation of the analytic torsion. This is essentially the content of [BV, Sect. 4]. We continue in the setup and notation of the previous section.

3.1. Decomposition of the de Rham Laplacian. By the convenient structure (2.6) of the Laplacian $\Delta_k$ one is tempted to write
$$\Delta_k = \bigoplus_{\lambda\in\mathrm{Spec}(A_k)}\Bigl(-\frac{d^2}{dx^2} + \frac{\lambda}{x^2}\Bigr),$$
and study the boundary conditions induced on each one-dimensional component. However this decomposition might be incompatible with the boundary conditions, so the discussion of the corresponding self-adjoint realization might not reduce to simple one-dimensional problems. Compatibility of a decomposition means explicitly the following in the context of our presentation.

Definition 3.1. Let $D$ be a closed operator in a Hilbert space $H$. Let $H_1$ be a closed subspace of $H$ and $H_2 := H_1^{\perp}$. We say the decomposition $H = H_1\oplus H_2$ is compatible with $D$ if $D(H_j\cap\mathcal{D}(D)) \subset H_j$, $j = 1,2$, and for any $\phi_1\oplus\phi_2 \in \mathcal{D}(D)$ we get $\phi_1, \phi_2 \in \mathcal{D}(D)$.

This definition corresponds to [W2, Exercise 5.39], where the subspaces $H_j$, $j = 1,2$, are called the "reducing subspaces of $D$". We have the following result:

Proposition 3.2 [W2, Theorem 7.28]. Let $D$ be a self-adjoint operator in a Hilbert space $H$. Let $H_1$ be a closed subspace of $H$ and $H_2 := H_1^{\perp}$. Let the decomposition $H = H_1\oplus H_2$ be compatible with $D$. Then each operator $D_i := D|_{H_i}$, $i = 1,2$, with domain $\mathcal{D}(D_i) := \mathcal{D}(D)\cap H_i$, $i = 1,2$, is a self-adjoint operator in $H_i$. In other words, the induced decomposition $D = D_1\oplus D_2$ is an orthogonal decomposition of $D$ into a sum of two self-adjoint operators.

Definition 3.3. In the setup of Proposition 3.2 we say $D_i$, $i = 1,2$, is a self-adjoint operator "induced" by $D$.

In order to simplify notation, put:
$$H^{k} := L^2([0,R], L^2(\wedge^{k-1}T^*N\oplus\wedge^{k}T^*N, \mathrm{vol}(g^N)), dx), \qquad H^{*} := \bigoplus_{k\ge 0}H^{k},$$
where the $H^{k}$ are mutually orthogonal in $H^{*}$. The following result follows straightforwardly from the definition of $\Delta_k^{rel}$ and gives a practical condition for compatibility of a decomposition of $H^{k}$ with the self-adjoint realization $\Delta_k^{rel}$.

820

B. Vertman

Proposition 3.4. Let $H^k = H_1 \oplus H_2$, $H_2 := H_1^\perp$ be an orthogonal decomposition into closed subspaces, such that $\Delta_k^{\mathrm{rel}}(H_j \cap \mathcal{D}(\Delta_k^{\mathrm{rel}})) \subset H_j$, $j = 1, 2$. Assume that for $D \in \{d_k, d_k^t, d_k^t d_k, d_{k-1} d_{k-1}^t\}$ the images $D_{\max}(H_j \cap \mathcal{D}(D_{\max}))$, $j = 1, 2$ are mutually orthogonal in $H^*$. Then the decomposition $H^k = H_1 \oplus H_2$ is compatible with the relative extension $\Delta_k^{\mathrm{rel}}$.

Now we can present, following [L3], a decomposition of $H^*$, compatible with $\Delta_*^{\mathrm{rel}}$. This decomposition will be one of the essential aspects in the computation of analytic torsion of a bounded generalized cone. To describe the decomposition in convenient terms, we denote by $\Delta_{k,ccl,N}$ the Laplace operator on coclosed $k$-forms on $N$ and introduce some notation:
\[
\begin{aligned}
V_k &:= \{\eta \in \mathrm{Spec}\,\Delta_{k,ccl,N}\} \setminus \{0\}, \\
E^k_\eta &:= \{\omega \in \Omega^k(N) \mid \Delta_{k,N}\,\omega = \eta\omega,\ d_N^t \omega = 0\}, \quad \eta \in V_k, \\
\widetilde{E}^k_\eta &:= E^k_\eta \oplus d_N E^k_\eta, \\
\mathcal{H}^k(N) &:= E^k_0.
\end{aligned}
\]
Here $k = 0, \ldots, n = \dim N$ and the eigenvalues of $\Delta_{k,ccl,N}$ in $V_k$ are counted with their multiplicities, so that each single $E^k_\eta$ is a one-dimensional subspace. The eigenvectors for an $\eta \in V_k$, $k = 0, \ldots, n$ with multiplicity bigger than 1 are chosen to be mutually orthogonal with respect to the $L^2$-inner product on $N$. Further for each $\mathcal{H}^k(N)$ choose an orthonormal basis of harmonic forms $\{u_i^k\}$ with $i = 1, \ldots, \dim \mathcal{H}^k(N)$. Then by the Hodge decomposition on $N$ we obtain for any fixed degree $k = 0, \ldots, n + 1$ (put $\Omega^{n+1}(N) = \Omega^{-1}(N) = \{0\}$)
\[
\begin{aligned}
\Omega^{k-1}(N) \oplus \Omega^k(N) ={}& \Bigg[ \bigoplus_{\eta \in V_{k-1}} \widetilde{E}^{k-1}_\eta \Bigg] \oplus \Bigg[ \bigoplus_{i=1}^{\dim \mathcal{H}^{k-1}(N)} \langle u_i^{k-1} \rangle \oplus \bigoplus_{i=1}^{\dim \mathcal{H}^k(N)} \langle u_i^k \rangle \Bigg] \\
&\oplus \Bigg[ \bigoplus_{\eta \in V_{k-2}} d_N E^{k-2}_\eta \Bigg] \oplus \Bigg[ \bigoplus_{\eta \in V_k} E^k_\eta \Bigg].
\end{aligned}
\tag{3.1}
\]

Theorem 3.5. The decomposition (3.1) induces an orthogonal decomposition of $H^k$, compatible with the relative extension $\Delta_k^{\mathrm{rel}}$.

Proof. The decomposition of $H^k$ induced by (3.1) is orthogonal, since the decomposition (3.1) is orthogonal with respect to the $L^2$-inner product on $N$. Applying now $d_k$, $d_k^t d_k$ and $d_{k-1}^t$, $d_{k-1} d_{k-1}^t$ to each of the orthogonal components we find that the images remain mutually orthogonal, so we obtain with Proposition 3.4 the desired statement.

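3.2. Decomposition of the de Rham complex. Following [L3], we decompose the de Rham complex of $M$ into a direct sum of subcomplexes of two types. The first type of the subcomplexes is given as follows. Let $\psi \in E^k_\eta$, $\eta \in V_k$, $k = 1, \ldots, n$ be a fixed non-zero generator of $E^k_\eta$. Put
\[
\begin{aligned}
\xi_1 &:= (0, \psi) \in \Omega^{k-1}(N) \oplus \Omega^k(N), \\
\xi_2 &:= (\psi, 0) \in \Omega^k(N) \oplus \Omega^{k+1}(N), \\
\xi_3 &:= \Big(0, \tfrac{1}{\sqrt{\eta}}\, d_N \psi\Big) \in \Omega^k(N) \oplus \Omega^{k+1}(N), \\
\xi_4 &:= \Big(\tfrac{1}{\sqrt{\eta}}\, d_N \psi, 0\Big) \in \Omega^{k+1}(N) \oplus \Omega^{k+2}(N).
\end{aligned}
\]
Then $C_0^\infty((0, R), \langle \xi_1, \xi_2, \xi_3, \xi_4 \rangle)$ is invariant under $d, d^t$ and we obtain a subcomplex of the de Rham complex:
\[
0 \to C_0^\infty((0, R), \langle \xi_1 \rangle) \xrightarrow{\ d_0^\psi\ } C_0^\infty((0, R), \langle \xi_2, \xi_3 \rangle) \xrightarrow{\ d_1^\psi\ } C_0^\infty((0, R), \langle \xi_4 \rangle) \to 0, \tag{3.2}
\]
where $d_0^\psi, d_1^\psi$ take the following form with respect to the chosen basis:
\[
d_0^\psi = \begin{pmatrix} (-1)^k \partial_x + \dfrac{c_k}{x} \\[2mm] x^{-1}\sqrt{\eta} \end{pmatrix}, \qquad
d_1^\psi = \Big( x^{-1}\sqrt{\eta},\ (-1)^{k+1}\partial_x + \frac{c_{k+1}}{x} \Big).
\]
By Theorem 3.5 separating out the subcomplex above is compatible with the boundary conditions of $\Delta_k^{\mathrm{rel}}, \Delta_{k+1}^{\mathrm{rel}}, \Delta_{k+2}^{\mathrm{rel}}$. In other words we have a decomposition into reducing subspaces of the Laplacians, see [W2] for further details. Hence the relative boundary conditions induce self-adjoint extensions of the Laplacians of the subcomplex:
\[
\mathcal{D}(\Delta_k^{\mathrm{rel}}) \cap L^2((0, R), E^k_\eta) = \mathcal{D}\big((d_0^\psi)^t_{\max}\, d^\psi_{0,\min}\big) =: \mathcal{D}(\Delta^\psi_{0,\mathrm{rel}}), \tag{3.3}
\]
\[
\mathcal{D}(\Delta_{k+2}^{\mathrm{rel}}) \cap L^2((0, R), d_N E^k_\eta) = \mathcal{D}\big(d^\psi_{1,\min}\, (d_1^\psi)^t_{\max}\big) =: \mathcal{D}(\Delta^\psi_{2,\mathrm{rel}}). \tag{3.4}
\]
The associated Laplacians are given explicitly by the following regular-singular expression:
\[
\Delta_0^\psi := (d_0^\psi)^t d_0^\psi = -\partial_x^2 + \frac{1}{x^2}\left[ \eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \right] = d_1^\psi (d_1^\psi)^t =: \Delta_2^\psi \tag{3.5}
\]
under the identification of any $\phi = f \cdot \xi_i \in C_0^\infty((0, R), \langle \xi_i \rangle)$, $i = 1, 4$ with its scalar part $f \in C_0^\infty(0, R)$. We continue under this identification from here on.

The second type of the subcomplexes comes from the harmonics on the base manifold $N$ and is given as follows. Consider $\mathcal{H}^k(N)$ and fix an orthonormal basis $\{u_i^k\}$, $i = 1, \ldots, \dim \mathcal{H}^k(N)$ of $\mathcal{H}^k(N)$. Observe that for any $i$ the subspace $C_0^\infty((0, R), \langle 0 \oplus u_i^k, u_i^k \oplus 0 \rangle)$ is invariant under $d, d^t$ and we obtain a subcomplex of the de Rham complex
\[
0 \to C_0^\infty((0, R), \langle 0 \oplus u_i^k \rangle) \xrightarrow{\ d\ } C_0^\infty((0, R), \langle u_i^k \oplus 0 \rangle) \to 0, \qquad d = (-1)^k \partial_x + \frac{c_k}{x}, \tag{3.6}
\]
where the action of $d$ is identified with its scalar action. We continue under this identification. As for the subcomplex (3.2), separating out the subcomplex above is compatible with the relative boundary conditions by Theorem 3.5. Hence we obtain for the induced self-adjoint extensions
\[
\mathcal{D}(\Delta_k^{\mathrm{rel}}) \cap L^2((0, R), \langle 0 \oplus u_i^k \rangle) = \mathcal{D}(d^t_{\max} d_{\min}) = \mathcal{D}\left( \Big[ (-1)^{k+1}\partial_x + \frac{c_k}{x} \Big]_{\max} \Big[ (-1)^k \partial_x + \frac{c_k}{x} \Big]_{\min} \right), \tag{3.7}
\]
\[
\mathcal{D}(\Delta_{k+1}^{\mathrm{rel}}) \cap L^2((0, R), \langle u_i^k \oplus 0 \rangle) = \mathcal{D}(d_{\min} d^t_{\max}) = \mathcal{D}\left( \Big[ (-1)^k \partial_x + \frac{c_k}{x} \Big]_{\min} \Big[ (-1)^{k+1}\partial_x + \frac{c_k}{x} \Big]_{\max} \right). \tag{3.8}
\]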
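The scalar identities behind the two types of subcomplexes can be sanity-checked with exact rational arithmetic. The sketch below assumes the convention $c_j := (-1)^j (j - n/2)$, which the text recalls in the proof of Corollary 6.3; the helper names `c`, `potential` and `check` are illustrative only. It verifies that $c_k + c_{k+1} + (-1)^k = 0$, which is what $d_1^\psi d_0^\psi = 0$ reduces to, and that $c_k^2 + (-1)^k c_k$ and $c_{k+1}^2 + (-1)^k c_{k+1}$ both equal $(k + 1/2 - n/2)^2 - 1/4$, the coefficient accompanying $\eta$ in (3.5).

```python
from fractions import Fraction


def c(j, n):
    """Assumed convention c_j = (-1)^j (j - n/2)."""
    return (-1) ** j * (Fraction(j) - Fraction(n, 2))


def potential(k, n):
    """Coefficient (k + 1/2 - n/2)^2 - 1/4 appearing next to eta in (3.5)."""
    return (Fraction(k) + Fraction(1, 2) - Fraction(n, 2)) ** 2 - Fraction(1, 4)


def check(n_max=12):
    """Verify the three identities for all 1 <= n <= n_max and 0 <= k <= n."""
    for n in range(1, n_max + 1):
        for k in range(0, n + 1):
            sign = (-1) ** k
            # d_1 d_0 = 0 reduces to c_k + c_{k+1} + (-1)^k = 0
            assert c(k, n) + c(k + 1, n) + sign == 0
            # (d_0)^t d_0 = -d^2/dx^2 + (eta + c_k^2 + (-1)^k c_k) / x^2
            assert c(k, n) ** 2 + sign * c(k, n) == potential(k, n)
            # d_1 (d_1)^t = -d^2/dx^2 + (eta + c_{k+1}^2 + (-1)^k c_{k+1}) / x^2
            assert c(k + 1, n) ** 2 + sign * c(k + 1, n) == potential(k, n)
    return True


print(check())
```

Exact fractions are used instead of floats so that the half-integer shifts $n/2$ and $1/2$ introduce no rounding.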

By the Hodge decomposition on the base manifold $N$ the de Rham complex $(\Omega_0^*(M), d)$ decomposes completely into subcomplexes of the two types above. This decomposition gives in each degree $k$ a compatible decomposition for $\Delta_k^{\mathrm{rel}}$, as observed in Theorem 3.5. In the language of [W2] we have a decomposition into reducing subspaces of the Laplacians. Hence the Laplacians $\Delta_k^{\mathrm{rel}}$ induce self-adjoint relative extensions of the Laplacians of the subcomplexes.

3.3. Symmetry in the decomposition. In this section we present a symmetry of the de Rham complex on a model cone, as elaborated by M. Lesch in [L3]. For simplicity we continue the discussion with $R = 1$. Consider the subcomplexes (3.2) of the first type for a fixed non-zero $\psi \in E^k_\eta$, $\eta \in V_k$ in any degree $k = 1, \ldots, n$:
\[
0 \to C_0^\infty((0, 1), \langle \xi_1 \rangle) \xrightarrow{\ d_0^\psi\ } C_0^\infty((0, 1), \langle \xi_2, \xi_3 \rangle) \xrightarrow{\ d_1^\psi\ } C_0^\infty((0, 1), \langle \xi_4 \rangle) \to 0
\]
with the associated Laplacians (identified with their scalar action)
\[
\Delta_0^\psi := (d_0^\psi)^t d_0^\psi = -\partial_x^2 + \frac{1}{x^2}\left[ \eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \right] = d_1^\psi (d_1^\psi)^t =: \Delta_2^\psi. \tag{3.9}
\]
By Theorem 3.5 separating out the subcomplex above provides a decomposition into reducing subspaces of $\Delta_k^{\mathrm{rel}}, \Delta_{k+1}^{\mathrm{rel}}, \Delta_{k+2}^{\mathrm{rel}}$. Hence the relative boundary conditions induce self-adjoint extensions of the Laplacians $\Delta_0^\psi, \Delta_2^\psi$,
\[
\Delta^\psi_{0,\mathrm{rel}} = (d_0^\psi)^t_{\max}\, d^\psi_{0,\min}, \qquad \Delta^\psi_{2,\mathrm{rel}} = d^\psi_{1,\min}\, (d_1^\psi)^t_{\max}.
\]
Now we discuss the relative boundary conditions for $\Delta_0^\psi, \Delta_2^\psi$. In case
\[
\eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \geq \frac{3}{4},
\]
the Laplacians $\Delta_0^\psi$ and $\Delta_2^\psi$ are in the limit point case at $x = 0$ and hence all their self-adjoint extensions in $L^2(0, 1)$ coincide at $x = 0$. Then we only need to consider the relative boundary conditions at $x = 1$. With Proposition 2.5, which is essentially the trace theorem of L. Paquet in [P], we obtain
\[
\begin{aligned}
\mathcal{D}(\Delta^\psi_{0,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{0,\max}) \mid f(1) = 0 \}, \\
\mathcal{D}(\Delta^\psi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{2,\max}) \mid (-1)^k f'(1) + c_{k+1} f(1) = 0 \}.
\end{aligned}
\]


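The values $f(1), f'(1)$ are well-defined since $\mathcal{D}(\Delta^\psi_{0,2,\max}) \subset H^2_{loc}(0, 1]$. If however
\[
\eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} < \frac{3}{4},
\]
then the Laplacians $\Delta_0^\psi$ and $\Delta_2^\psi$ are in the limit circle case at $x = 0$ and the boundary conditions at $x = 0$ need to be considered as well. Consider any $f \in \mathcal{D}(\Delta_0^\psi)$, which in particular lies in $\mathcal{D}(d^\psi_{0,\min}) \subset \mathcal{D}(d^\psi_{0,\max})$. From the explicit form of $d_0^\psi$ we infer that $f \in \mathcal{D}((1/x)_{\max})$, where $1/x$ is the obvious multiplication operator. This implies the asymptotic behaviour $f(x) = O(\sqrt{x})$ as $x \to 0$. Similar argumentation holds for any element of $\mathcal{D}(\Delta_2^\psi)$ and we obtain the following result.

Proposition 3.6. The self-adjoint realizations $\Delta_0^\psi$ and $\Delta_2^\psi$ act on the following domains:
\[
\begin{aligned}
\mathcal{D}(\Delta^\psi_{0,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{0,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f(1) = 0 \}, \\
\mathcal{D}(\Delta^\psi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ (-1)^k f'(1) + c_{k+1} f(1) = 0 \},
\end{aligned}
\]
where the values $f(x), f'(x)$ are well-defined for $x \in (0, 1]$ and the condition on the asymptotic behaviour as $x \to 0$ is redundant for
\[
\eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \geq \frac{3}{4}.
\]

Next we consider the twin-subcomplex, associated to the subcomplex discussed above. Let $\phi := *_N \psi \in \Omega^{n-k}(N)$. Put
\[
\begin{aligned}
\xi_1 &:= \Big(0, \tfrac{1}{\sqrt{\eta}}\, d_N^t \phi\Big) \in \Omega^{n-k-2}(N) \oplus \Omega^{n-k-1}(N), \\
\xi_2 &:= \Big(\tfrac{1}{\sqrt{\eta}}\, d_N^t \phi, 0\Big) \in \Omega^{n-k-1}(N) \oplus \Omega^{n-k}(N), \\
\xi_3 &:= (0, \phi) \in \Omega^{n-k-1}(N) \oplus \Omega^{n-k}(N), \\
\xi_4 &:= (\phi, 0) \in \Omega^{n-k}(N) \oplus \Omega^{n-k+1}(N).
\end{aligned}
\]
Again the subspace $C_0^\infty((0, 1), \langle \xi_1, \xi_2, \xi_3, \xi_4 \rangle)$ is invariant under the action of $d$ and $d^t$, and in fact we obtain a complex
\[
0 \to C_0^\infty((0, 1), \langle \xi_1 \rangle) \xrightarrow{\ d_0^\phi\ } C_0^\infty((0, 1), \langle \xi_2, \xi_3 \rangle) \xrightarrow{\ d_1^\phi\ } C_0^\infty((0, 1), \langle \xi_4 \rangle) \to 0.
\]
By computing explicitly the action of the exterior derivative (2.2) on the basis elements $\xi_i$ we obtain
\[
d_0^\phi = \begin{pmatrix} (-1)^{n-k-1}\partial_x + \dfrac{c_{n-k-1}}{x} \\[2mm] x^{-1}\sqrt{\eta} \end{pmatrix}, \qquad
d_1^\phi = \Big( x^{-1}\sqrt{\eta},\ (-1)^{n-k}\partial_x + \frac{c_{n-k}}{x} \Big).
\]
As for the first subcomplex we compute the relevant Laplacians:
\[
\Delta_0^\psi = \Delta_2^\psi = -\partial_x^2 + \frac{1}{x^2}\left[ \eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \right] = \Delta_0^\phi = \Delta_2^\phi, \tag{3.10}
\]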


where the operators are identified with their scalar actions. As before, separating out the subcomplex above, we decompose $\Delta^{\mathrm{rel}}_{n-k\pm 1}, \Delta^{\mathrm{rel}}_{n-k}$ compatibly. Hence the relative boundary conditions induce self-adjoint extensions
\[
\Delta^\phi_{0,\mathrm{rel}} = (d_0^\phi)^t_{\max}\, d^\phi_{0,\min}, \qquad \Delta^\phi_{2,\mathrm{rel}} = d^\phi_{1,\min}\, (d_1^\phi)^t_{\max}
\]
of the Laplacians $\Delta_0^\phi, \Delta_2^\phi$ respectively. By similar arguments as above, the relative boundary conditions for this pair of operators are given as follows.

Proposition 3.7. The self-adjoint realizations $\Delta_0^\phi$ and $\Delta_2^\phi$ act on the following domains:
\[
\begin{aligned}
\mathcal{D}(\Delta^\phi_{0,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\phi_{0,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f(1) = 0 \}, \\
\mathcal{D}(\Delta^\phi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\phi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ (-1)^{n-k+1} f'(1) + c_{n-k} f(1) = 0 \},
\end{aligned}
\]
where the values $f(x), f'(x)$ are well-defined for $x \in (0, 1]$ and the condition on the asymptotic behaviour as $x \to 0$ is redundant for
\[
\eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \geq \frac{3}{4}.
\]

4. The Scalar Analytic Torsion

We continue our discussion of the relative self-adjoint extension $\Delta_k^{\mathrm{rel}}$ of the Laplace operator on differential $k$-forms over a bounded generalized cone $M = (0, 1] \times N$ over a closed oriented Riemannian manifold $(N, g^N)$ of dimension $\dim N = n$. The notation is fixed in Sect. 2. We will only need the following well-known result, which is a direct application of [L1, Prop. 1.4.7].

Theorem 4.1. The self-adjoint operator $\Delta_k^{\mathrm{rel}}$ is discrete with the zeta-function
\[
\zeta(s, \Delta_k^{\mathrm{rel}}) = \sum_{\lambda \in \mathrm{Sp}(\Delta_k^{\mathrm{rel}}) \setminus \{0\}} \lambda^{-s}, \qquad \mathrm{Re}(s) > m/2,
\]
being holomorphic for $\mathrm{Re}(s) > m/2$.

The meromorphic continuation of zeta-functions for general self-adjoint extensions of regular-singular operators is discussed in a series of sources, notably [L1, Theorem 2.4.1], [Ch2, Theorem 4.1] and [LMP, Theorem 5.7]. For a compact oriented Riemannian manifold $X^m$ the scalar analytic torsion ([RS]) is defined by
\[
\log T(X) := \frac{1}{2} \sum_{k=0}^{m} (-1)^k \cdot k \cdot \zeta_k'(0),
\]
where $\zeta_k(s)$ denotes the zeta-function of the Laplacian on $k$-forms of $X$, with the relative or the absolute boundary conditions posed at $\partial X$. On compact Riemannian manifolds the zeta-functions $\zeta_k(s)$ extend meromorphically to $\mathbb{C}$ with $s = 0$ being regular, so the definition makes sense. On the bounded generalized cone $M$ the zeta-functions $\zeta(s, \Delta_k^{\mathrm{rel}})$ possibly have a simple pole at $s = 0$. However we have the following result of A. Dar:


Theorem 4.2 [Dar]. The meromorphic function
\[
T(M, s) := \frac{1}{2} \sum_{k=0}^{m} (-1)^k \cdot k \cdot \zeta(s, \Delta_k^{\mathrm{rel}}) \tag{4.1}
\]
is regular at $s = 0$. Thus the analytic torsion $T^{\mathrm{rel}}(M) := \exp(T'(M, s = 0))$ of a bounded generalized cone exists.

Thus even though the zeta-functions $\zeta(s, \Delta_k^{\mathrm{rel}})$ need not be regular at $s = 0$, their residues at $s = 0$ cancel in the alternating weighted sum $T(M, s)$. The statement extends to general compact manifolds with isolated conical singularities. A compact manifold with a conical singularity is a Riemannian manifold $(M_1 \cup_N U, g)$ partitioned by a compact hypersurface $N$, such that $M_1$ is a compact manifold with boundary $N$ and $U$ is isometric to $(0, \epsilon] \times N$ with the metric over $U$ being of the following form:
\[
g|_U = dx^2 \oplus x^2 g|_N.
\]
In this article we compute for the bounded generalized cone $M$ the analytic continuation of $\log T(M, s)$ to $s = 0$ by means of a decomposition of the de Rham complex. In particular this leads to a proof of the statement in Theorem 4.2.

4.1. Contribution of subcomplexes to analytic torsion. By the Hodge decomposition on the base manifold $N$ the de Rham complex $(\Omega_0^*(M), d)$ decomposes completely into subcomplexes of the two types (3.2) and (3.6) from the previous section. This decomposition gives in each degree $k$ a compatible decomposition for $\Delta_k^{\mathrm{rel}}$, as observed in Theorem 3.5. Hence the Laplacians $\Delta_k^{\mathrm{rel}}$ induce self-adjoint relative extensions of the Laplacians of the subcomplexes. In particular each subcomplex contributes to the function in (4.1) as follows.

The relative boundary conditions turn the complex (3.2) of the first type into a Hilbert complex (see [BL1]) of the following general form:
\[
0 \to H_k \xrightarrow{\ D\ } H_{k+1} \xrightarrow{\ D\ } H_{k+2} \to 0.
\]
By the specific form of the subcomplex we have the following relation between the zeta-functions corresponding to the Laplacians of the subcomplex:
\[
\zeta(s, D^*D + DD^*) = \zeta(s, D^*D) + \zeta(s, DD^*). \tag{4.2}
\]
From the spectral relation (4.2) we deduce that the contribution of the subcomplex $H$ to the function $T(M, s)$ amounts to
\[
\begin{aligned}
&\frac{(-1)^k}{2} \Big( k\,\zeta(s, D^*D) - (k + 1)\,\zeta(s, D^*D + DD^*) + (k + 2)\,\zeta(s, DD^*) \Big) \\
&\qquad = \frac{(-1)^k}{2} \big( \zeta(s, DD^*) - \zeta(s, D^*D) \big).
\end{aligned}
\tag{4.3}
\]
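The passage from the first to the second line of (4.3) uses only the additivity (4.2). A minimal sketch, treating the two zeta values as arbitrary placeholders `a` for $\zeta(s, D^*D)$ and `b` for $\zeta(s, DD^*)$ (the function names are chosen here for illustration):

```python
from fractions import Fraction


def contribution(k, a, b):
    """Weighted alternating sum over degrees k, k+1, k+2 of a three-term complex.

    a stands for zeta(s, D*D) in degree k, b for zeta(s, DD*) in degree k+2;
    by (4.2) the middle degree k+1 carries a + b.
    """
    sign = (-1) ** k
    return Fraction(sign, 2) * (k * a - (k + 1) * (a + b) + (k + 2) * b)


def simplified(k, a, b):
    """Second line of (4.3): (-1)^k / 2 * (zeta(s, DD*) - zeta(s, D*D))."""
    return Fraction((-1) ** k, 2) * (b - a)


# Compare both expressions on a few sample values.
for k in range(0, 8):
    for a, b in [(Fraction(3, 7), Fraction(-2, 5)), (Fraction(1), Fraction(0))]:
        assert contribution(k, a, b) == simplified(k, a, b)
print("(4.3) verified on sample values")
```

The degree weights $k$, $k+1$, $k+2$ telescope, which is why only the difference of the two outer zeta-functions survives.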

Since there are in fact infinitely many subcomplexes of the first type, we first have to add up the contributions for Re(s) large and then continue the sum analytically to s = 0. Then the derivative at zero gives the contribution to T (M).


For the contribution of the subcomplexes (3.6) of the second kind to the analytic torsion, note that the relative boundary conditions turn the complex of second type into a Hilbert complex of the following general form:
\[
0 \to H_k \xrightarrow{\ D\ } H_{k+1} \to 0.
\]
There are only finitely many such subcomplexes, since $\dim \mathcal{H}^*(N) < \infty$. Hence we obtain directly for the contribution to $\log T(M)$ from each of such subcomplexes
\[
\frac{(-1)^{k+1}}{2}\, \zeta'(D^*D, s = 0). \tag{4.4}
\]
The major part of the computation deals with the contribution of subcomplexes of first type to the analytic torsion. Here one faces the challenge of summing over an infinite number of subcomplexes, whereas for the finitely many subcomplexes of second type one can use already known results on zeta-determinants of regular singular Sturm-Liouville operators by M. Lesch in [L].

Recall the explicit form of the symmetric pairs of subcomplexes of first type. Let $\psi \in E^k_\eta$, $\eta \in V_k$, $k = 1, \ldots, n$ be a fixed non-zero generator of $E^k_\eta$. Put
\[
\begin{aligned}
\xi_1 &:= (0, \psi) \in \Omega^{k-1}(N) \oplus \Omega^k(N), \\
\xi_2 &:= (\psi, 0) \in \Omega^k(N) \oplus \Omega^{k+1}(N), \\
\xi_3 &:= \Big(0, \tfrac{1}{\sqrt{\eta}}\, d_N \psi\Big) \in \Omega^k(N) \oplus \Omega^{k+1}(N), \\
\xi_4 &:= \Big(\tfrac{1}{\sqrt{\eta}}\, d_N \psi, 0\Big) \in \Omega^{k+1}(N) \oplus \Omega^{k+2}(N).
\end{aligned}
\]
The associated subcomplex is given as follows:
\[
0 \to C_0^\infty((0, 1), \langle \xi_1 \rangle) \xrightarrow{\ d_0^\psi\ } C_0^\infty((0, 1), \langle \xi_2, \xi_3 \rangle) \xrightarrow{\ d_1^\psi\ } C_0^\infty((0, 1), \langle \xi_4 \rangle) \to 0
\]
with the associated Laplacians (identified with their scalar action),
\[
\Delta_0^\psi := (d_0^\psi)^t d_0^\psi = -\partial_x^2 + \frac{1}{x^2}\left[ \eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \right] = d_1^\psi (d_1^\psi)^t =: \Delta_2^\psi. \tag{4.5}
\]
We have by Proposition 3.6,
\[
\begin{aligned}
\mathcal{D}(\Delta^\psi_{0,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{0,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f(1) = 0 \}, \\
\mathcal{D}(\Delta^\psi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ (-1)^k f'(1) + c_{k+1} f(1) = 0 \},
\end{aligned}
\]
where the condition on the asymptotic behaviour at $x = 0$ is redundant for $\eta \in V_k$ large enough. The twin-subcomplex to the subcomplex from above is given as follows. Let $\phi := *_N \psi \in \Omega^{n-k}(N)$. Put
\[
\begin{aligned}
\xi_1 &:= \Big(0, \tfrac{1}{\sqrt{\eta}}\, d_N^t \phi\Big) \in \Omega^{n-k-2}(N) \oplus \Omega^{n-k-1}(N), \\
\xi_2 &:= \Big(\tfrac{1}{\sqrt{\eta}}\, d_N^t \phi, 0\Big) \in \Omega^{n-k-1}(N) \oplus \Omega^{n-k}(N), \\
\xi_3 &:= (0, \phi) \in \Omega^{n-k-1}(N) \oplus \Omega^{n-k}(N), \\
\xi_4 &:= (\phi, 0) \in \Omega^{n-k}(N) \oplus \Omega^{n-k+1}(N).
\end{aligned}
\]


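The associated subcomplex is given as follows:
\[
0 \to C_0^\infty((0, 1), \langle \xi_1 \rangle) \xrightarrow{\ d_0^\phi\ } C_0^\infty((0, 1), \langle \xi_2, \xi_3 \rangle) \xrightarrow{\ d_1^\phi\ } C_0^\infty((0, 1), \langle \xi_4 \rangle) \to 0
\]
with the associated Laplacians (identified with their scalar action) as before,
\[
\Delta_0^\psi = \Delta_2^\psi = -\partial_x^2 + \frac{1}{x^2}\left[ \eta + \Big(k + \frac{1}{2} - \frac{n}{2}\Big)^2 - \frac{1}{4} \right] = \Delta_0^\phi = \Delta_2^\phi.
\]
By Proposition 3.7 we obtain:
\[
\begin{aligned}
\mathcal{D}(\Delta^\phi_{0,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\phi_{0,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f(1) = 0 \}, \\
\mathcal{D}(\Delta^\phi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\phi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ (-1)^{n-k+1} f'(1) + c_{n-k} f(1) = 0 \}.
\end{aligned}
\]
So in total we obtain four self-adjoint operators, which differ only by their boundary conditions at $x = 1$. Unfortunately the differences in the domains do not allow us to cancel the contribution of the two twin-subcomplexes to the analytic torsion. However the symmetry still allows us to perform explicit computations.

Recall that $\psi$ was chosen to be a normalized coclosed $\eta$-eigenform on $N$ of degree $k$ and $\phi = *_N \psi$. Denote the dependence of the generating forms $\psi$ and $\phi$ on the eigenvalue $\eta$ by $\psi(\eta)$ and $\phi(\eta)$. Introduce further the notation
\[
\begin{aligned}
D(k) &:= \{\lambda \in \mathrm{Spec}\,\Delta^{\psi(\eta)}_{0,\mathrm{rel}} \mid \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\} = \{\lambda \in \mathrm{Spec}\,\Delta^{\phi(\eta)}_{0,\mathrm{rel}} \mid \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\}, \\
N_1(k) &:= \{\lambda \in \mathrm{Spec}\,\Delta^{\psi(\eta)}_{2,\mathrm{rel}} \mid \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\}, \\
N_2(k) &:= \{\lambda \in \mathrm{Spec}\,\Delta^{\phi(\eta)}_{2,\mathrm{rel}} \mid \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\},
\end{aligned}
\]
where all eigenvalues are counted according to their multiplicities. Using this notation we now introduce the following zeta-functions for $\mathrm{Re}(s) \gg 0$:
\[
\zeta_D^k(s) := \sum_{\lambda \in D(k)} \lambda^{-s}, \qquad \zeta_{N_1}^k(s) := \sum_{\lambda \in N_1(k)} \lambda^{-s}, \qquad \zeta_{N_2}^k(s) := \sum_{\lambda \in N_2(k)} \lambda^{-s}, \qquad \mathrm{Re}(s) \gg 0.
\]
The $D$-subscript indicates that the zeta-functions in the sum are associated to Laplacians with Dirichlet boundary conditions at $x = 1$. Similarly the $N$-subscripts indicate the generalized Neumann boundary conditions at $x = 1$, which are however different for $\Delta^\psi_{2,\mathrm{rel}}$ and $\Delta^\phi_{2,\mathrm{rel}}$. The zeta-functions $\zeta_D^k(s)$, $\zeta_{N_1}^k(s)$ and $\zeta_{N_2}^k(s)$ are by Theorem 4.1 holomorphic for $\mathrm{Re}(s)$ sufficiently large, since they sum over eigenvalues of $\Delta_*^{\mathrm{rel}}$ but with lower multiplicities. In view of (4.3), which describes the contribution to analytic torsion from the subcomplexes, we set for $\mathrm{Re}(s)$ large:

Definition 4.3. $\zeta_k(s) := \big(\zeta_{N_1}^k(s) - \zeta_D^k(s)\big) + (-1)^{n-1} \big(\zeta_{N_2}^k(s) - \zeta_D^k(s)\big)$.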
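The sign $(-1)^{n-1}$ in Definition 4.3 decides whether the Dirichlet zeta-function survives in $\zeta_k(s)$. A small sketch (the function name `zeta_k_coefficients` is hypothetical) tabulates the coefficients with which $\zeta_{N_1}^k$, $\zeta_{N_2}^k$ and $\zeta_D^k$ enter, as a function of $n = \dim N$:

```python
def zeta_k_coefficients(n):
    """Coefficients of (zeta_N1, zeta_N2, zeta_D) in Definition 4.3 for dim N = n."""
    sign = (-1) ** (n - 1)
    # zeta_k = (zeta_N1 - zeta_D) + sign * (zeta_N2 - zeta_D)
    return (1, sign, -1 - sign)


for n in range(1, 7):
    parity = "odd" if (n + 1) % 2 else "even"
    print(f"n = {n}, dim M = {n + 1} ({parity}):", zeta_k_coefficients(n))
```

For even $n$ (odd $\dim M = n + 1$) the Dirichlet coefficient vanishes and $\zeta_k = \zeta_{N_1}^k - \zeta_{N_2}^k$; for odd $n$ one gets $\zeta_k = \zeta_{N_1}^k + \zeta_{N_2}^k - 2\zeta_D^k$.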


Remark 4.4. Note that $\zeta_D^k(s)$ in the definition of $\zeta_k(s)$ cancels for $m = \dim M$ odd, simplifying the expression for $\zeta_k(s)$ considerably. Further simplifications (notably Proposition 6.8) take place throughout the discussion, so that an effective result can be obtained in the end.

Below we provide the analytic continuation of $\zeta_k(s)$ to $s = 0$ for any fixed degree $k < \dim N - 1$ and compute $(-1)^k \zeta_k'(0)$. The contribution coming from the subcomplexes of the second type (3.6), induced by the harmonic forms on the base $N$, is not included in $\zeta_k(s)$ and will be determined explicitly in a separate discussion.

Remark 4.5. The total contribution of subcomplexes (3.2) of first type to the logarithmic scalar analytic torsion $\log T(M)$ of the odd-dimensional bounded generalized cone $M$ is given by
\[
\frac{1}{2} \sum_{k=0}^{n/2 - 1} (-1)^k \cdot \zeta_k'(0).
\]
For an even-dimensional cone $M$ the zeta-function $\zeta_k(s)$ counts in the degree $k = (n - 1)/2$ each subcomplex of type (3.2) twice. Thus the total contribution of subcomplexes of first type to $\log T(M)$ is given by
\[
\frac{1}{2} \left( \sum_{k=0}^{(n-3)/2} (-1)^k \cdot \zeta_k'(0) + \frac{(-1)^{\frac{n-1}{2}}}{2} \cdot \zeta'_{\frac{n-1}{2}}(0) \right),
\]
where the first sum is set to zero for $\dim M = n + 1 = 2$.

5. Some Auxiliary Analysis

Fix a positive real number $\nu > 0$ and consider the following differential operator:
\[
l_\nu := -\frac{d^2}{dx^2} + \frac{\nu^2 - 1/4}{x^2} : C_0^\infty(0, 1) \to C_0^\infty(0, 1).
\]
Define for any $\alpha \in \mathbb{R}^*$ a self-adjoint extension of $l_\nu$ as follows:
\[
\mathcal{D}(L_\nu(\alpha)) := \{ f \in \mathcal{D}(l_{\nu,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ (\alpha - 1/2)^{-1} f'(1) + f(1) = 0 \},
\]
where $\alpha = \infty$ defines the Dirichlet boundary conditions at $x = 1$ and $\alpha = 1/2$ the pure Neumann boundary conditions at $x = 1$. The condition on the asymptotic behaviour at $x = 0$ is redundant in case $\nu > 1$. In this case the operator $l_\nu$ is in the limit point case at $x = 0$ and then only boundary conditions at $x = 1$ are required to define a self-adjoint extension of $l_\nu$.

Proposition 5.1. The self-adjoint operator $L_\nu(\alpha)$, $\alpha \in \mathbb{R}^*$ is discrete and bounded from below. For $\alpha^2 < \nu^2$ or $\alpha = \infty$ the operator $L_\nu(\alpha)$ is positive.


Proof. The discreteness of $L_\nu(\alpha)$ is asserted in [BS2], see also [L, Theorem 1.1] where this result is restated. For semi-boundedness of $L_\nu(\alpha)$ consider any $f \in \mathcal{D}(L_\nu(\alpha))$, $f \not\equiv 0$. Recall $f \in \mathcal{D}(l_{\nu,\max}) \subset H^2_{loc}(0, 1]$, which implies that $f$ is continuously differentiable on $(0, 1)$ and $f, f'$ extend continuously to $x = 1$. We compute via integration by parts for any $\beta \in \mathbb{R}$ with $\beta^2 < \nu^2$:
\[
\begin{aligned}
\langle l_\nu f, f \rangle_{L^2(0,1)} &= \int_0^1 \left[ \Big( -\frac{d}{dx} + \frac{\beta - 1/2}{x} \Big) \Big( \frac{d}{dx} + \frac{\beta - 1/2}{x} \Big) f(x) + \frac{\nu^2 - \beta^2}{x^2} f(x) \right] \overline{f(x)}\, dx \\
&= -\left[ \Big( f'(x) + \frac{\beta - 1/2}{x} f(x) \Big) \overline{f(x)} \right]_{x \to 0+}^{x = 1} + \left\| f' + \frac{\beta - 1/2}{x} f \right\|^2_{L^2(0,1)} + \left\| \frac{\sqrt{\nu^2 - \beta^2}}{x} f \right\|^2_{L^2(0,1)}.
\end{aligned}
\]
The standard asymptotic behaviour of elements in $\mathcal{D}(l_{\nu,\max})$, with $\nu > 0$, compare [BV, Prop. 2.10 (ii) and (iii)], implies together with $f \not\equiv 0$:
\[
\langle l_\nu f, f \rangle_{L^2(0,1)} > -\overline{f(1)} \cdot \big( f'(1) + (\beta - 1/2) f(1) \big). \tag{5.1}
\]
Evaluation of the boundary conditions at $x = 1$ for $f \in \mathcal{D}(L_\nu(\alpha))$ with $\alpha = \infty$ or $\alpha^2 < \nu^2$ proves the statement.

Corollary 5.2. Let $J_\nu(z)$ denote the Bessel function of first kind and put for any fixed $\alpha \in \mathbb{R}^*$,
\[
J_\nu^\alpha(z) := \alpha J_\nu(z) + z J_\nu'(z),
\]
where for $\alpha = \infty$ we put $J_\nu^\alpha(z) := J_\nu(z)$. Then for $\nu > 1$ and $\alpha = \infty$ or $\alpha^2 < \nu^2$, the zeros of $J_\nu^\alpha(z)$ are real, discrete and symmetric about the origin. The eigenvalues of the positive operator $L_\nu(\alpha)$ are simple and given by squares of positive zeros of $J_\nu^\alpha(z)$, i.e.
\[
\mathrm{Spec}\, L_\nu(\alpha) = \{\mu^2 \mid J_\nu^\alpha(\mu) = 0,\ \mu > 0\}.
\]

Proof. The general solution to $l_\nu f = \mu^2 f$, $\mu \neq 0$ is of the following form:
\[
f(x) = c_1 \sqrt{x}\, J_\nu(\mu x) + c_2 \sqrt{x}\, Y_\nu(\mu x),
\]
where $c_1, c_2$ are constants and $J_\nu, Y_\nu$ denote Bessel functions of first and second kind, respectively. The asymptotic behaviour at $x = 0$ of $f \in \mathcal{D}(L_\nu(\alpha))$ is given by $f(x) = O(\sqrt{x})$, $x \to 0$. Hence a solution to $l_\nu f = \mu^2 f$, $\mu \neq 0$ with $f \in \mathcal{D}(L_\nu(\alpha))$ must be of the form
\[
f(x) = c_1 \sqrt{x}\, J_\nu(\mu x).
\]
Taking into account the boundary conditions at $x = 1$ for $L_\nu(\alpha)$ with $\alpha = \infty$ or $\alpha^2 < \nu^2$, we deduce the correspondence between zeros of $J_\nu^\alpha(z)$ and eigenvalues of $L_\nu(\alpha)$. Hence by Proposition 5.1 we deduce the statements about the zeros of $J_\nu^\alpha(z)$, up to the statement on the symmetry of zeros, which follows simply from the standard infinite series representation of Bessel functions. Furthermore, $J_\nu(-\mu x) = (-1)^\nu J_\nu(\mu x)$, $\mu \neq 0$ and hence each eigenvalue $\mu^2$ of $L_\nu(\alpha)$ is simple with the unique (up to a multiplicative constant) eigenfunction $f(x) = \sqrt{x}\, J_\nu(\mu x)$, $\mu > 0$. This completes the proof.
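The correspondence in Corollary 5.2 can be illustrated numerically. The sketch below is not part of the argument: it picks the sample values $\nu = 3/2$, $\alpha = 1$ (an assumption satisfying $\nu > 1$ and $\alpha^2 < \nu^2$), evaluates $J_\nu$ by its standard power series, locates the first positive zero $\mu$ of $J_\nu^\alpha(z) = \alpha J_\nu(z) + z J_\nu'(z)$ by bisection, and then checks by finite differences that $f(x) = \sqrt{x}\, J_\nu(\mu x)$ satisfies $l_\nu f = \mu^2 f$ and the boundary condition $f'(1) + (\alpha - 1/2) f(1) = 0$. All function names are illustrative.

```python
import math


def bessel_j(nu, z, terms=60):
    """Power series J_nu(z) = sum_m (-1)^m / (m! Gamma(m+nu+1)) (z/2)^(2m+nu), z > 0."""
    return sum((-1) ** m / (math.factorial(m) * math.gamma(m + nu + 1))
               * (z / 2.0) ** (2 * m + nu) for m in range(terms))


def bessel_j_prime(nu, z):
    """Recurrence J_nu'(z) = J_{nu-1}(z) - (nu / z) J_nu(z)."""
    return bessel_j(nu - 1, z) - nu / z * bessel_j(nu, z)


def j_alpha(nu, alpha, z):
    """J_nu^alpha(z) = alpha J_nu(z) + z J_nu'(z)."""
    return alpha * bessel_j(nu, z) + z * bessel_j_prime(nu, z)


def first_zero(nu, alpha, lo=0.1, hi=8.0, steps=400):
    """First sign change of J_nu^alpha on [lo, hi], refined by bisection."""
    h = (hi - lo) / steps
    a = lo
    for _ in range(steps):
        b = a + h
        if j_alpha(nu, alpha, a) * j_alpha(nu, alpha, b) < 0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                if j_alpha(nu, alpha, a) * j_alpha(nu, alpha, mid) <= 0:
                    b = mid
                else:
                    a = mid
            return 0.5 * (a + b)
        a = b
    raise ValueError("no zero found in the search interval")


nu, alpha = 1.5, 1.0          # sample values with alpha^2 < nu^2 (assumption)
mu = first_zero(nu, alpha)


def f(x):
    """Candidate eigenfunction sqrt(x) J_nu(mu x)."""
    return math.sqrt(x) * bessel_j(nu, mu * x)


# Residual of l_nu f = mu^2 f at an interior point (central second difference).
x0, h = 0.5, 1e-4
second = (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / h ** 2
ode_residual = -second + (nu ** 2 - 0.25) / x0 ** 2 * f(x0) - mu ** 2 * f(x0)

# Residual of the boundary condition f'(1) + (alpha - 1/2) f(1) = 0.
h1 = 1e-5
bc_residual = (f(1.0 + h1) - f(1.0 - h1)) / (2.0 * h1) + (alpha - 0.5) * f(1.0)

print(mu, ode_residual, bc_residual)
```

Both residuals should be small up to discretization error, and $\mu^2$ is then the lowest eigenvalue of $L_\nu(\alpha)$ for these sample parameters.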


Similar statements can be deduced for more general values of $\alpha \in \mathbb{R}^*$, but are not relevant in the present discussion. Finally note as a direct application of Proposition 5.1 that the Laplacians $\Delta^\psi_{0,2,\mathrm{rel}}$ and $\Delta^\phi_{0,2,\mathrm{rel}}$, introduced in the previous section, are positive.

Corollary 5.3. For all degrees $k = 0, \ldots, \dim N$ we have $D(k) \subset \mathbb{R}^+$, $N_i(k) \subset \mathbb{R}^+$, $i = 1, 2$.

Next consider the zeta-function $\zeta(s, L_\nu(\alpha))$, $\alpha \in \mathbb{R}^*$ associated to the self-adjoint realization $L_\nu(\alpha)$ of $l_\nu$. It is well-known, see [L, Theorem 1.1], that the zeta-function extends meromorphically to $\mathbb{C}$ with the analytic representation given by the Mellin transform of the heat trace:
\[
\zeta(s, L_\nu(\alpha)) = \frac{1}{\Gamma(s)} \int_0^\infty t^{s-1}\, \mathrm{Tr}_{L^2}\big(e^{-t L_\nu(\alpha)} P\big)\, dt,
\]
where $P$ is the projection on the orthogonal complement of the null space of $L_\nu(\alpha)$. The heat operator $\exp(-t L_\nu(\alpha))$ is defined by the spectral theorem and is a bounded smoothing operator with finite trace $\mathrm{Tr}_{L^2}(e^{-t L_\nu(\alpha)} P)$ of standard polylogarithmic asymptotics as $t \to 0+$, see [Ch, Theorem 2.1]. We can write for $t > 0$,
\[
\mathrm{Tr}\big(e^{-t L_\nu(\alpha)}\big) = \frac{1}{2\pi i} \int_{\wedge} e^{-\lambda t}\, \mathrm{Tr}(\lambda - L_\nu(\alpha))^{-1}\, d\lambda,
\]
where the contour $\wedge$ shall encircle all non-zero eigenvalues of the semi-bounded $L_\nu(\alpha)$, $\alpha \in \mathbb{R}^*$ and be counter-clockwise oriented, in analogy to Fig. 1 below. Now, following [S], we obtain an integral representation for $\zeta(s, L_\nu(\alpha))$ in a computationally convenient form. Introduce a numbering $(\lambda_n)$ of the eigenvalues of $L_\nu(\alpha)$ and observe
\[
\mathrm{Tr}(\lambda - L_\nu(\alpha))^{-1} = \sum_n \frac{1}{\lambda - \lambda_n} = \frac{d}{d\lambda} \sum_n \log\Big(1 - \frac{\lambda}{\lambda_n}\Big),
\]
where we fix henceforth the branch of logarithm in $\mathbb{C} \setminus \mathbb{R}^+$ with $0 \leq \mathrm{Im} \log z < 2\pi$. We continue with this branch of logarithm throughout the section. Integrating now by parts first in $\lambda$, then in $t$ we obtain
\[
\zeta(s, L_\nu(\alpha)) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge} \frac{e^{-\lambda t}}{-\lambda} \left[ -\sum_n \log\Big(1 - \frac{\lambda}{\lambda_n}\Big) \right] d\lambda\, dt. \tag{5.2}
\]

6. Contribution from the Subcomplexes I

We continue in the setting and in the notation of Sect. 3.3 and fix any degree $k \leq \dim N - 1$. We define the following contour:
\[
\wedge_c := \{\lambda \in \mathbb{C} \mid |\arg(\lambda - c)| = \pi/4\} \tag{6.1}
\]
oriented counter-clockwise, with $c > 0$ a fixed positive number, smaller than the lowest non-zero eigenvalue of $\Delta_*^{\mathrm{rel}}$. The contour is visualized in Fig. 1 below, with ×'s representing the eigenvalues of $\Delta_*^{\mathrm{rel}}$. In analogy to the constructions of [S] we obtain for the zeta-functions $\zeta_D^k(s)$, $\zeta_{N_1}^k(s)$, $\zeta_{N_2}^k(s)$ the following results.


Fig. 1. Integration contour $\wedge_c$

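Proposition 6.1. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be a bounded generalized cone. Denote by $\Delta_{k,ccl,N}$ the Laplace operator on coclosed $k$-forms on $N$. Let
\[
F_k := \{\xi \in \mathbb{R}^+ \mid \xi^2 = \eta + (k + 1/2 - n/2)^2,\ \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\}.
\]
Then we obtain with $(j_{\nu,i})_{i \in \mathbb{N}}$ being the positive zeros of the Bessel function $J_\nu(z)$,
\[
\zeta_D^k(s) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge_c} \frac{e^{-\lambda t}}{-\lambda}\, T_D^k(s, \lambda)\, d\lambda\, dt, \tag{6.2}
\]
\[
T_D^k(s, \lambda) = \sum_{\nu \in F_k} t_\nu^{D,k}(\lambda)\, \nu^{-2s}, \qquad t_\nu^{D,k}(\lambda) = -\sum_{i=1}^\infty \log\Big(1 - \frac{\nu^2 \lambda}{j_{\nu,i}^2}\Big). \tag{6.3}
\]

Proof. Consider for $\eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}$ the operators $\Delta_0^{\psi(\eta)}$ and $\Delta_0^{\phi(\eta)}$, defined in (3.5) and (3.10). Under the identification with their scalar parts we have
\[
\Delta_0^{\psi(\eta)} = \Delta_0^{\phi(\eta)} = -\partial_x^2 + \frac{1}{x^2}\Big(\nu^2 - \frac{1}{4}\Big),
\]
where $\nu := \sqrt{\eta + (k + 1/2 - n/2)^2}$. By Corollary 5.2 we obtain:
\[
\zeta_D^k(s) = \sum_{\nu \in F_k} \sum_{i=1}^\infty j_{\nu,i}^{-2s} = \sum_{\nu \in F_k} \nu^{-2s} \sum_{i=1}^\infty \Big(\frac{j_{\nu,i}}{\nu}\Big)^{-2s}, \qquad \mathrm{Re}(s) \gg 0,
\]
where $j_{\nu,i}$ are the positive zeros of $J_\nu(z)$. This series is well-defined for $\mathrm{Re}(s)$ large by Theorem 4.1, since $\Delta^{\psi(\eta)}_{0,\mathrm{rel}}\ (\equiv \Delta^{\phi(\eta)}_{0,\mathrm{rel}})$ as direct sum components of $\Delta_*^{\mathrm{rel}}$ have the same spectrum as $\Delta_*^{\mathrm{rel}}$, but with lower multiplicities in general. Due to the uniform convergence of integrals and series we obtain with similar computations as for (5.2) an integral representation for this sum:
\[
\zeta_D^k(s) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge_c} \frac{e^{-\lambda t}}{-\lambda}\, T_D^k(s, \lambda)\, d\lambda\, dt, \tag{6.4}
\]
\[
T_D^k(s, \lambda) = \sum_{\nu \in F_k} t_\nu^{D,k}(\lambda)\, \nu^{-2s}, \qquad t_\nu^{D,k}(\lambda) = -\sum_{i=1}^\infty \log\Big(1 - \frac{\nu^2 \lambda}{j_{\nu,i}^2}\Big). \tag{6.5}
\]
Note that the contour $\wedge_c$, defined in (6.1), encircles all eigenvalues of $\Delta^{\psi(\eta)}_{0,\mathrm{rel}} \equiv \Delta^{\phi(\eta)}_{0,\mathrm{rel}}$ by construction, since the operators are positive by Corollary 5.3.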
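For intuition on the double series for $\zeta_D^k(s)$, consider the model value $\nu = 1/2$ (an illustrative assumption, not claimed to lie in $F_k$). Since $J_{1/2}(z) = \sqrt{2/(\pi z)}\, \sin z$, its positive zeros are $j_{1/2,i} = i\pi$, so the inner sum $\sum_i j_{1/2,i}^{-2s}$ equals $\pi^{-2s} \zeta_R(2s)$ with $\zeta_R$ the Riemann zeta-function; at $s = 1$ this is $1/6$ and at $s = 2$ it is $1/90$. The sketch below checks these values with a truncated sum; the function name and the truncation length are choices made here.

```python
import math


def inner_sum(s, terms=200_000):
    """Truncated sum over the zeros j_{1/2,i} = i*pi of J_{1/2}: sum_i (i*pi)^(-2s)."""
    return sum((i * math.pi) ** (-2.0 * s) for i in range(1, terms + 1))


approx_s1 = inner_sum(1.0)
approx_s2 = inner_sum(2.0, terms=1000)
print(approx_s1, 1.0 / 6.0)
print(approx_s2, 1.0 / 90.0)
```

The slow $O(1/N)$ convergence at $s = 1$ reflects the fact that such series only converge for $\mathrm{Re}(s)$ large enough, which is why the contour-integral representation is used for the continuation to $s = 0$.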


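Proposition 6.2. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be a bounded generalized cone. Denote by $\Delta_{k,ccl,N}$ the Laplace operator on coclosed $k$-forms on $N$. Let
\[
F_k := \{\xi \in \mathbb{R}^+ \mid \xi^2 = \eta + (k + 1/2 - n/2)^2,\ \eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}\}.
\]
Then we obtain for $l = 1, 2$,
\[
\zeta_{N_l}^k(s) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge_c} \frac{e^{-\lambda t}}{-\lambda}\, T_{N_l}^k(s, \lambda)\, d\lambda\, dt, \tag{6.6}
\]
\[
T_{N_l}^k(s, \lambda) = \sum_{\nu \in F_k} t_\nu^{N_l,k}(\lambda)\, \nu^{-2s}, \qquad t_\nu^{N_l,k}(\lambda) = -\sum_{i=1}^\infty \log\Big(1 - \frac{\nu^2 \lambda}{j_{\nu,l,i}^2}\Big), \tag{6.7}
\]
where $(j_{\nu,l,i})_{i \in \mathbb{N}}$ are the positive zeros of $J_\nu^{N_l,k}(z)$ for $l = 1, 2$. The functions $J_\nu^{N_l,k}(z)$ are defined as follows:
\[
\begin{aligned}
J_\nu^{N_1,k}(z) &:= \Big( \frac{1}{2} + (-1)^k c_{k+1} \Big) J_\nu(z) + z J_\nu'(z), \\
J_\nu^{N_2,k}(z) &:= \Big( \frac{1}{2} + (-1)^k c_k \Big) J_\nu(z) + z J_\nu'(z).
\end{aligned}
\]

Proof. Consider for $\eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}$ the operators $\Delta_2^{\psi(\eta)}$ and $\Delta_2^{\phi(\eta)}$, defined in (3.5) and (3.10), which contribute to the zeta-functions $\zeta_{N_1}^k(s)$ and $\zeta_{N_2}^k(s)$ correspondingly. Under the identification with their scalar parts we have
\[
\Delta_2^{\psi(\eta)} = \Delta_2^{\phi(\eta)} = -\partial_x^2 + \frac{1}{x^2}\Big(\nu^2 - \frac{1}{4}\Big),
\]
where $\nu := \sqrt{\eta + (k + 1/2 - n/2)^2}$. Recall
\[
\begin{aligned}
\mathcal{D}(\Delta^\psi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\psi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f'(1) + (-1)^k c_{k+1} f(1) = 0 \}, \\
\mathcal{D}(\Delta^\phi_{2,\mathrm{rel}}) &= \{ f \in \mathcal{D}(\Delta^\phi_{2,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f'(1) + (-1)^{n-k+1} c_{n-k} f(1) = 0 \}.
\end{aligned}
\]
Observe $(-1)^{n-k+1} c_{n-k} = (-1)^k c_k$ and put
\[
\begin{aligned}
J_\nu^{N_1,k}(\mu) &:= \Big( \frac{1}{2} + (-1)^k c_{k+1} \Big) J_\nu(\mu) + \mu J_\nu'(\mu), \\
J_\nu^{N_2,k}(\mu) &:= \Big( \frac{1}{2} + (-1)^k c_k \Big) J_\nu(\mu) + \mu J_\nu'(\mu).
\end{aligned}
\]
Note for any degree $k$ and any $\nu \in F_k$,
\[
\Big| \frac{1}{2} + (-1)^k c_{k+1} \Big| = \Big| \frac{1}{2} + (-1)^k c_k \Big| = \Big| \frac{n}{2} - \frac{1}{2} - k \Big| < \nu.
\]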


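Hence by Corollary 5.2 we obtain for $l = 1, 2$:
\[
\zeta_{N_l}^k(s) = \sum_{\nu \in F_k} \sum_{i=1}^\infty j_{\nu,l,i}^{-2s} = \sum_{\nu \in F_k} \nu^{-2s} \sum_{i=1}^\infty \Big(\frac{j_{\nu,l,i}}{\nu}\Big)^{-2s}, \qquad \mathrm{Re}(s) \gg 0,
\]
where $j_{\nu,l,i}$ are the positive zeros of $J_\nu^{N_l,k}(z)$ for $l = 1, 2$. This series is well-defined for $\mathrm{Re}(s)$ large by Theorem 4.1, since $\Delta^{\psi(\eta)}_{2,\mathrm{rel}}, \Delta^{\phi(\eta)}_{2,\mathrm{rel}}$ as direct sum components of $\Delta_*^{\mathrm{rel}}$ have the same spectrum as $\Delta_*^{\mathrm{rel}}$, but with lower multiplicities in general. Due to the uniform convergence of integrals and series we obtain with similar computations as for (5.2) an integral representation for this sum:
\[
\zeta_{N_l}^k(s) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge_c} \frac{e^{-\lambda t}}{-\lambda}\, T_{N_l}^k(s, \lambda)\, d\lambda\, dt, \tag{6.8}
\]
\[
T_{N_l}^k(s, \lambda) = \sum_{\nu \in F_k} t_\nu^{N_l,k}(\lambda)\, \nu^{-2s}, \qquad t_\nu^{N_l,k}(\lambda) = -\sum_{i=1}^\infty \log\Big(1 - \frac{\nu^2 \lambda}{j_{\nu,l,i}^2}\Big). \tag{6.9}
\]
Note that the contour $\wedge_c$ encircles all the possible eigenvalues of $\Delta^{\psi(\eta)}_{2,\mathrm{rel}}, \Delta^{\phi(\eta)}_{2,\mathrm{rel}}$ by construction, since the operators are positive by Corollary 5.3.

Corollary 6.3. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be a bounded generalized cone. Then we obtain with Definition 4.3 in the notation of Propositions 6.1 and 6.2,
\[
\zeta_k(s) = \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\wedge_c} \frac{e^{-\lambda t}}{-\lambda}\, T^k(s, \lambda)\, d\lambda\, dt,
\]
\[
T^k(s, \lambda) := \sum_{\nu \in F_k} t_\nu^k(\lambda)\, \nu^{-2s}, \qquad
t_\nu^k(\lambda) := \big( t_\nu^{N_1,k}(\lambda) - t_\nu^{D,k}(\lambda) \big) + (-1)^{n-1} \big( t_\nu^{N_2,k}(\lambda) - t_\nu^{D,k}(\lambda) \big).
\]
If $\dim M$ is odd we obtain with $z := \sqrt{-\lambda}$ and $\alpha_k := n/2 - 1/2 - k$,
\[
t_\nu^k(\lambda) = -\log\big( \alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z) \big) + \log\Big(1 + \frac{\alpha_k}{\nu}\Big) + \log\big( -\alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z) \big) - \log\Big(1 - \frac{\alpha_k}{\nu}\Big).
\]
For $\dim M$ even we compute with $z := \sqrt{-\lambda}$,
\[
\begin{aligned}
t_\nu^k(\lambda) ={}& -\log\big( \alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z) \big) + \log\Big(1 + \frac{\alpha_k}{\nu}\Big) \\
& - \log\big( -\alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z) \big) + \log\Big(1 - \frac{\alpha_k}{\nu}\Big) + 2 \log\big(I_\nu(\nu z)\big) + 2 \log \nu.
\end{aligned}
\tag{6.10}
\]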
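The constants entering Proposition 6.2 and Corollary 6.3 are related by simple identities that can be checked exactly. The sketch below uses the convention $c_j := (-1)^j (j - n/2)$ recalled in the text and $\alpha_k := n/2 - 1/2 - k$; the helper names are illustrative. It verifies that the boundary constant for $N_1$ equals $\alpha_k$, the one for $N_2$ equals $-\alpha_k$, and that $(-1)^{n-k+1} c_{n-k} = (-1)^k c_k$.

```python
from fractions import Fraction


def c(j, n):
    """Convention recalled in the text: c_j = (-1)^j (j - n/2)."""
    return (-1) ** j * (Fraction(j) - Fraction(n, 2))


def alpha(k, n):
    """alpha_k = n/2 - 1/2 - k."""
    return Fraction(n, 2) - Fraction(1, 2) - k


def check(n_max=12):
    """Verify the identities for all 1 <= n <= n_max and 0 <= k <= n - 1."""
    for n in range(1, n_max + 1):
        for k in range(0, n):
            sign = (-1) ** k
            # Boundary constant of J_nu^{N_1,k}
            assert Fraction(1, 2) + sign * c(k + 1, n) == alpha(k, n)
            # Boundary constant of J_nu^{N_2,k}
            assert Fraction(1, 2) + sign * c(k, n) == -alpha(k, n)
            # Twin-subcomplex constant rewritten in degree k
            assert (-1) ** (n - k + 1) * c(n - k, n) == sign * c(k, n)
    return True


print(check())
```

This makes explicit why the two generalized Neumann conditions differ only by the sign of $\alpha_k$, which is what produces the paired terms $\pm\alpha_k$ in the formulas for $t_\nu^k(\lambda)$.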


Proof. Recall for convenience the definition of $\zeta_k(s)$ in Definition 4.3,
\[
\zeta_k(s) := \big(\zeta_{N_1}^k(s) - \zeta_D^k(s)\big) + (-1)^{n-1} \big(\zeta_{N_2}^k(s) - \zeta_D^k(s)\big).
\]
The integral representation and the definition of $t_\nu^k(\lambda)$ are then a direct consequence of Propositions 6.1 and 6.2. It remains to present $t_\nu^k(\lambda)$ in terms of special functions. In order to simplify notation we put (recall $c_j := (-1)^j (j - n/2)$)
\[
\alpha_k := \frac{1}{2} + (-1)^k c_{k+1} = \frac{n}{2} - \frac{1}{2} - k = -\Big( \frac{1}{2} + (-1)^k c_k \Big).
\]
Now we present $t_\nu^{D,k}(\lambda)$ and $t_\nu^{N_l,k}(\lambda)$, $l = 1, 2$ in terms of special functions. This can be done by referring to tables of Bessel functions in [GRA] or [AS]. However in the context of the paper it is more appropriate to derive the presentation from results on zeta-regularized determinants. Here we follow the approach of [L, Sect. 4.2] in a slightly different setting. The original setting of [L, (4.22)] provides an infinite product representation for $I_\nu(z)$. We apply its approach in order to derive the corresponding result for
\[
I_\nu^N(z) := \alpha I_\nu(z) + z I_\nu'(z),
\]
with $\alpha \in \{\pm\alpha_k\}$ and $\nu \in F_k$. Consider now the following regular-singular Sturm-Liouville operator and its self-adjoint extension with $\alpha \in \{\pm\alpha_k\}$ and $\nu \in F_k$,
\[
l_\nu := -\frac{d^2}{dx^2} + \frac{1}{x^2}\Big(\nu^2 - \frac{1}{4}\Big) : C_0^\infty(0, 1) \to C_0^\infty(0, 1),
\]
\[
\mathcal{D}(L_\nu(\alpha)) := \{ f \in \mathcal{D}(l_{\nu,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f'(1) + (\alpha - 1/2) f(1) = 0 \}.
\]
Note we have $\alpha^2 < \nu^2$ by construction and in particular $\alpha \neq -\nu$. Thus we find by Proposition 5.1 that $\ker L_\nu(\alpha) = \{0\}$ and
\[
\det{}_\zeta (L_\nu(\alpha)) = \sqrt{2\pi}\, \frac{\alpha + \nu}{2^\nu\, \Gamma(\nu + 1)}. \tag{6.11}
\]
Denote by $\phi(x, z), \psi(x, z)$ the solutions of $(l_\nu + z^2) f = 0$, normalized in the sense of [L, (1.38a), (1.38b)] at $x = 0$ and $x = 1$, respectively. The general solution to $(l_\nu + z^2) f = 0$ is of the following form:
\[
f(x) = c_1 \sqrt{x}\, I_\nu(zx) + c_2 \sqrt{x}\, K_\nu(zx).
\]
Applying the normalizing conditions of [L, (1.38a), (1.38b)] we obtain straightforwardly
\[
\begin{aligned}
\psi(1, z) &= 1, \qquad \psi'(1, z) = 1/2 - \alpha, \\
\phi(1, z) &= 2^\nu \Gamma(\nu + 1)\, z^{-\nu} I_\nu(z) \quad \text{with } \phi(1, 0) = 1, \\
\phi'(1, z) &= 2^\nu \Gamma(\nu + 1)\, z^{-\nu} \big( I_\nu(z) \cdot 1/2 + z I_\nu'(z) \big) \quad \text{with } \phi'(1, 0) = \nu + 1/2.
\end{aligned}
\]
Finally by [L, Prop. 4.6] we obtain with $\{\lambda_n\}_{n \in \mathbb{N}}$ being a counting of the eigenvalues of $L_\nu(\alpha)$:
\[
\det{}_\zeta (L_\nu(\alpha) + z^2) = \det{}_\zeta (L_\nu(\alpha)) \cdot \prod_{n=1}^\infty \Big(1 + \frac{z^2}{\lambda_n}\Big). \tag{6.12}
\]


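Since $\ker L_\nu(\alpha) = \{0\}$, for all $n \in \mathbb{N}$ we have $\lambda_n \neq 0$. Denote the positive zeros of $J_\nu^N(z) := \alpha J_\nu(z) + z J_\nu'(z)$ by $(\tilde{j}_{\nu,i})_{i \in \mathbb{N}}$. Note in the notation of Proposition 6.2 that for $\alpha = \alpha_k$, $J_\nu^N(z) = J_\nu^{N_1,k}(z)$ and for $\alpha = -\alpha_k$, $J_\nu^N(z) = J_\nu^{N_2,k}(z)$. Observe by Corollary 5.2:
\[ \mathrm{Spec}(L_\nu(\alpha)) = \{ \tilde{j}_{\nu,i}^{\,2} \mid i \in \mathbb{N} \}. \]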

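Using the product formula (6.12) and [L, Theorem 1.2] applied to $L_\nu(\alpha) + z^2$, we compute in view of (6.11),
\[ \frac{W(\phi(\cdot, z); \psi(\cdot, z))}{\alpha + \nu} = \frac{2^\nu \Gamma(\nu)}{z^\nu (1 + \alpha/\nu)} \left(\alpha I_\nu(z) + z I_\nu'(z)\right) = \prod_{i=1}^\infty \left(1 + \frac{z^2}{\tilde{j}_{\nu,i}^{\,2}}\right) \]
\[ \Rightarrow \quad I_\nu^N(z) \equiv \alpha I_\nu(z) + z I_\nu'(z) = \frac{z^\nu}{2^\nu \Gamma(\nu)} \left(1 + \frac{\alpha}{\nu}\right) \prod_{i=1}^\infty \left(1 + \frac{z^2}{\tilde{j}_{\nu,i}^{\,2}}\right). \tag{6.13} \]
The original computations of [L, (4.22)] provide an analogous result for $I_\nu(z)$,
\[ I_\nu(z) = \frac{z^\nu}{2^\nu \Gamma(\nu + 1)} \prod_{i=1}^\infty \left(1 + \frac{z^2}{j_{\nu,i}^2}\right), \]
where $j_{\nu,i}$ are the positive zeros of $J_\nu(z)$. Finally in view of the series representations for $t_\nu^{D,k}(\lambda)$ and $t_\nu^{N_l,k}(\lambda)$, $l = 1, 2$ derived in Propositions 6.1 and 6.2 we obtain with $z = \sqrt{-\lambda}$,
\[ t_\nu^{D,k}(\lambda) = -\log I_\nu(\nu z) + \log\left( \frac{(\nu z)^\nu}{2^\nu \Gamma(\nu + 1)} \right), \tag{6.14} \]
\[ t_\nu^{N_l,k}(\lambda) = -\log\left(\alpha_l I_\nu(\nu z) + \nu z I_\nu'(\nu z)\right) + \log\left( \frac{(\nu z)^\nu}{2^\nu \Gamma(\nu)} \left(1 + \frac{\alpha_l}{\nu}\right) \right), \tag{6.15} \]
where $\alpha_l = \alpha_k$ if $l = 1$ and $\alpha_l = -\alpha_k$ if $l = 2$. Putting together these two results we obtain with Definition 4.3 the statement of the corollary.

Now we turn to the discussion of $T^k(s, \lambda)$. For this we introduce the following zeta-function for $\mathrm{Re}(s)$ large:
\[ \zeta_{k,N}(s) := \sum_{\nu \in F_k} \nu^{-s} = \sum_{\nu \in F_k} (\nu^2)^{-s/2}, \]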

where $\nu \in F_k$ are counted with their multiplicities and the second equality is clear, since $\nu \in F_k$ are positive. Recall that $\nu \in F_k$ solves $\nu^2 = \eta + (k + 1/2 - n/2)^2$, $\eta \in \mathrm{Spec}\,\Delta_{k,ccl,N} \setminus \{0\}$, and hence $\zeta_{k,N}(2s)$ is simply the zeta-function of $\Delta_{k,ccl,N} + (k + 1/2 - n/2)^2$. By standard theory $\zeta_{k,N}(2s)$ extends (note that $\zeta_{k,N}(2s)$ can be presented by an alternating sum of zeta functions of $\Delta_{j,N} + (k + 1/2 - n/2)^2$, $j = 0, \dots, k$) to a meromorphic function with possible simple poles at the usual locations $\{(n - p)/2 \mid p \in \mathbb{N}\}$ and $s = 0$ being


a regular point. Thus the $1/\nu^r$ dependence in $t_\nu^k(\lambda)$ causes a non-analytic behaviour of $T^k(s, \lambda)$ at $s = 0$ for $r = 1, \dots, n$, since
\[ \sum_{\nu \in F_k} \nu^{-2s}\, \frac{1}{\nu^r} = \zeta_{k,N}(2s + r) \]
possesses possibly a pole at $s = 0$. Therefore the first $n = \dim N$ leading terms in the asymptotic expansion of $t_\nu^k(\lambda)$ for large orders $\nu$ are to be removed. We put
\[ t_\nu^k(\lambda) =: p_\nu^k(\lambda) + \sum_{r=1}^n \frac{1}{\nu^r} f_r^k(\lambda), \qquad P^k(s, \lambda) := \sum_{\nu \in F_k} p_\nu^k(\lambda)\, \nu^{-2s}. \tag{6.16} \]

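In order to get explicit expressions for $f_r^k(\lambda)$ we need the following expansions of Bessel functions for large order $\nu$, see [O, Sect. 9]:
\[ I_\nu(\nu z) \sim \frac{1}{\sqrt{2\pi\nu}}\, \frac{e^{\nu\eta}}{(1 + z^2)^{1/4}} \left[ 1 + \sum_{r=1}^\infty \frac{u_r(t)}{\nu^r} \right], \]
\[ I_\nu'(\nu z) \sim \frac{1}{\sqrt{2\pi\nu}}\, \frac{e^{\nu\eta}}{z (1 + z^2)^{-1/4}} \left[ 1 + \sum_{r=1}^\infty \frac{v_r(t)}{\nu^r} \right], \]
where we put $z := \sqrt{-\lambda}$, $t := (1 + z^2)^{-1/2}$ and $\eta := 1/t + \log(z/(1 + 1/t))$. Recall that $\lambda \in \Lambda_c$, defined in (6.1). The induced $z = \sqrt{-\lambda}$ is contained in $\{z \in \mathbb{C} \mid |\arg(z)| < \pi/2\} \cup \{ix \mid x \in (-1, 1)\}$. This is precisely the region of validity for these asymptotic expansions, determined in [O, (7.18)]. The same expansions are quoted in [BKD, Sect. 3]. In particular we have as in [BKD, (3.15)] the following expansions in terms of orders:
\[ \log\left[ 1 + \sum_{r=1}^\infty \frac{u_r(t)}{\nu^r} \right] \sim \sum_{r=1}^\infty \frac{D_r(t)}{\nu^r}, \tag{6.17} \]
\[ \log\left[ \left(1 + \sum_{r=1}^\infty \frac{v_r(t)}{\nu^r}\right) \pm \frac{\alpha_k}{\nu}\, t \left(1 + \sum_{r=1}^\infty \frac{u_r(t)}{\nu^r}\right) \right] \sim \sum_{r=1}^\infty \frac{M_r(t, \pm\alpha_k)}{\nu^r}, \tag{6.18} \]
where $D_r(t)$ and $M_r(t, \pm\alpha_k)$ are polynomial in $t$. Using these series representations we prove the following result.

Lemma 6.4. For $\dim M$ being odd we have with $z := \sqrt{-\lambda}$, $t := (1 + z^2)^{-1/2} = 1/\sqrt{1 - \lambda}$ and $\alpha_k = n/2 - 1/2 - k$,
\[ f_r^k(\lambda) = M_r(t, -\alpha_k) - M_r(t, +\alpha_k) + (-1)^{r+1}\, \frac{\alpha_k^r - (-\alpha_k)^r}{r}. \]
For $\dim M$ being even we have in the same notation
\[ f_r^k(\lambda) = -M_r(t, -\alpha_k) - M_r(t, +\alpha_k) + 2 D_r(t) + (-1)^{r+1}\, \frac{\alpha_k^r + (-\alpha_k)^r}{r}. \]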


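Proof. We get by the series representations (6.17) and (6.18) the following expansions for large orders $\nu$:
\[ \log\left(\pm\alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z)\right) \sim \log\left( \frac{\nu\, e^{\nu\eta}}{\sqrt{2\pi\nu}\, z (1 + z^2)^{-1/4}} \right) + \sum_{r=1}^\infty \frac{M_r(t, \pm\alpha_k)}{\nu^r}, \]
\[ \log\left(I_\nu(\nu z)\right) \sim \log\left( \frac{1}{\sqrt{2\pi\nu}}\, \frac{e^{\nu\eta}}{(1 + z^2)^{1/4}} \right) + \sum_{r=1}^\infty \frac{D_r(t)}{\nu^r}. \]
Furthermore, with $\nu > |\alpha_k|$ for $\nu \in F_k$ we obtain
\[ \log\left(1 \pm \frac{\alpha_k}{\nu}\right) = \sum_{r=1}^\infty (-1)^{r+1}\, \frac{(\pm\alpha_k)^r}{r\, \nu^r}. \]
Hence in total we obtain an expansion for $t_\nu^k(\lambda)$ in terms of orders $\nu$:
\[ t_\nu^k(\lambda) \sim \sum_{r=1}^\infty \frac{1}{\nu^r} \left( M_r(t, -\alpha_k) - M_r(t, +\alpha_k) + (-1)^{r+1}\, \frac{\alpha_k^r - (-\alpha_k)^r}{r} \right), \quad \text{for } \dim M \text{ odd}, \]
\[ t_\nu^k(\lambda) \sim \sum_{r=1}^\infty \frac{1}{\nu^r} \left( 2 D_r(t) - M_r(t, -\alpha_k) - M_r(t, +\alpha_k) + (-1)^{r+1}\, \frac{\alpha_k^r + (-\alpha_k)^r}{r} \right) + \log\left(\frac{\lambda}{\lambda - 1}\right), \quad \text{for } \dim M \text{ even}. \]
From here the explicit result for $f_r^k(\lambda)$ follows by its definition.

From the integral representation (6.10) we find that the singular behaviour enters the zeta-function in form of
\[ \sum_{r=1}^n \zeta_{k,N}(2s + r)\, \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, f_r^k(\lambda)\, d\lambda\, dt. \]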

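We compute explicitly this contribution coming from $f_r^k(\lambda)$ in terms of the polynomial structure of $M_r$ and $D_r$. It can be derived from (6.17) and (6.18), see also [BKD, (3.7), (3.16)], that the polynomial structure of $M_r$ and $D_r$ is given by
\[ D_r(t) = \sum_{b=0}^r x_{r,b}\, t^{r+2b}, \qquad M_r(t, \pm\alpha_k) = \sum_{b=0}^r z_{r,b}(\pm\alpha_k)\, t^{r+2b}. \]

Lemma 6.5. For $\dim M$ odd we obtain
\[ \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, f_r^k(\lambda)\, d\lambda\, dt = \sum_{b=0}^r \left(z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) \frac{\Gamma(s + b + r/2)}{s\, \Gamma(b + r/2)}. \]
For $\dim M$ even we obtain
\[ \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, f_r^k(\lambda)\, d\lambda\, dt = \sum_{b=0}^r \left(2 x_{r,b} - z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) \frac{\Gamma(s + b + r/2)}{s\, \Gamma(b + r/2)}. \]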


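Proof. Observe from [GRA, 8.353.3] by substituting the new variable $x = \lambda - 1$, with $a > 0$:
\[ \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, \frac{1}{(1 - \lambda)^a}\, d\lambda = -\frac{1}{2\pi i}\, e^{-t} \int_{\Lambda_c - 1} \frac{e^{-xt}}{x + 1}\, \frac{1}{(-x)^a}\, dx = \frac{1}{\pi} \sin(\pi a)\, \Gamma(1 - a)\, \Gamma(a, t). \]
Using now the relation between the incomplete Gamma function and the probability integral
\[ \int_0^\infty t^{s-1}\, \Gamma(a, t)\, dt = \frac{\Gamma(s + a)}{s}, \]
we obtain
\[ \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, \frac{1}{(1 - \lambda)^a}\, d\lambda\, dt = \frac{1}{\pi} \sin(\pi a)\, \Gamma(1 - a)\, \frac{\Gamma(s + a)}{s} = \frac{\Gamma(s + a)}{s\, \Gamma(a)}. \]
Further note for $t > 0$,
\[ \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, d\lambda = 0, \]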

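since the contour $\Lambda_c$ does not encircle the pole $\lambda = 0$ of the integrand. Hence the $\lambda$-independent part of $f_r^k(\lambda)$ vanishes after integration. The statement is now a direct consequence of Lemma 6.4.

Next we derive asymptotics of $p_\nu^k(\lambda) := t_\nu^k(\lambda) - \sum_{r=1}^n \frac{1}{\nu^r} f_r^k(\lambda)$ for large arguments $\lambda$ and fixed order $\nu$.

Proposition 6.6. For large arguments $\lambda$ and fixed order $\nu$ we have the following asymptotics:
\[ p_\nu^k(\lambda) = a_\nu^k \log(-\lambda) + b_\nu^k + O\left((-\lambda)^{-1/2}\right), \]
where for $\dim M$ odd
\[ a_\nu^k = 0, \qquad b_\nu^k = \log\left(1 + \frac{\alpha_k}{\nu}\right) - \log\left(1 - \frac{\alpha_k}{\nu}\right) - \sum_{r=1}^n (-1)^{r+1}\, \frac{\alpha_k^r - (-\alpha_k)^r}{r\, \nu^r}, \]
and for $\dim M$ even
\[ a_\nu^k = -1, \qquad b_\nu^k = \log\left(1 + \frac{\alpha_k}{\nu}\right) + \log\left(1 - \frac{\alpha_k}{\nu}\right) - \sum_{r=1}^n (-1)^{r+1}\, \frac{\alpha_k^r + (-\alpha_k)^r}{r\, \nu^r}. \]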


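Proof. For large argument $\lambda$ we obtain
\[ t = \frac{1}{\sqrt{1 + z^2}} = \frac{1}{\sqrt{1 - \lambda}} = O\left((-\lambda)^{-1/2}\right). \]
Therefore the polynomials $M_r(t, \pm\alpha_k)$ and $D_r(t)$, having no constant terms, are of asymptotics $O\left((-\lambda)^{-1/2}\right)$ for large $\lambda$. Hence directly from Lemma 6.4 we obtain in odd dimensions for large $\lambda$,
\[ \frac{f_r^k(\lambda)}{\nu^r} \sim (-1)^{r+1}\, \frac{(\alpha_k)^r - (-\alpha_k)^r}{r\, \nu^r} + O\left((-\lambda)^{-1/2}\right). \tag{6.19} \]
In even dimensions we get
\[ \frac{f_r^k(\lambda)}{\nu^r} \sim (-1)^{r+1}\, \frac{(\alpha_k)^r + (-\alpha_k)^r}{r\, \nu^r} + O\left((-\lambda)^{-1/2}\right). \tag{6.20} \]
It remains to identify explicitly the asymptotics of $t_\nu^k(\lambda)$. Note by [AS, p. 377] the following expansions for large arguments and fixed order:
\[ I_\nu(z) = \frac{e^z}{\sqrt{2\pi z}} \left(1 + O\left(\frac{1}{z}\right)\right), \qquad I_\nu'(z) = \frac{e^z}{\sqrt{2\pi z}} \left(1 + O\left(\frac{1}{z}\right)\right). \]
These expansions hold for $|\arg(z)| < \pi/2$ and in particular for $z = \sqrt{-\lambda}$ with $\lambda \in \Lambda_c$ large, where $\Lambda_c$ is defined in (6.1). Further observe for such $z = \sqrt{-\lambda}$, $\lambda \in \Lambda_c$ large:
\[ \log\left(1 + O\left(\frac{1}{z}\right)\right) = O\left((-\lambda)^{-1/2}\right), \]
\[ \Rightarrow \quad \log(\pm\alpha_k + \nu z) = \log z + \log \nu + \log\left(1 \pm \frac{\alpha_k}{\nu z}\right) = \log z + \log \nu + O\left((-\lambda)^{-1/2}\right). \]
Together with the expansions of the Bessel functions we obtain for $t_\nu^k(\lambda)$ defined in Corollary 6.3,
\[ t_\nu^k(\lambda) = \log\left(1 + \frac{\alpha_k}{\nu}\right) - \log\left(1 - \frac{\alpha_k}{\nu}\right) + O\left((-\lambda)^{-1/2}\right), \quad \text{for } \dim M \text{ odd}, \]
\[ t_\nu^k(\lambda) = -\log(-\lambda) + \log\left(1 + \frac{\alpha_k}{\nu}\right) + \log\left(1 - \frac{\alpha_k}{\nu}\right) + O\left((-\lambda)^{-1/2}\right), \quad \text{for } \dim M \text{ even}. \]
Recall the definition of $p_\nu^k(\lambda)$ in (6.16). Combining this with (6.19) and (6.20) we obtain the desired result.

Definition 6.7. With the coefficients $a_\nu^k$ and $b_\nu^k$ defined in Proposition 6.6, we set for $\mathrm{Re}(s) \gg 0$,
\[ A^k(s) := \sum_{\nu \in F_k} a_\nu^k\, \nu^{-2s}, \qquad B^k(s) := \sum_{\nu \in F_k} b_\nu^k\, \nu^{-2s}. \]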


Now the last step towards the evaluation of the zeta-function of Corollary 6.3 is the discussion of
\[ P^k(s, \lambda) := \sum_{\nu \in F_k} p_\nu^k(\lambda)\, \nu^{-2s}, \qquad \mathrm{Re}(s) \gg 0. \]
At this point the advantage of taking into account the symmetry of the de Rham complex is particularly visible:

Proposition 6.8. $P^k(s, 0) = 0$.

Proof. As $\lambda \to 0$ we find that $t = (1 - \lambda)^{-1/2}$ tends to 1. Since as in [BGKE, (4.24)]
\[ M_r(1, \pm\alpha_k) = D_r(1) + (-1)^{r+1}\, \frac{(\pm\alpha_k)^r}{r} \tag{6.21} \]
we find with Lemma 6.4 that in both the even- and odd-dimensional case $f_r^k(\lambda) \to 0$ as $\lambda \to 0$. Thus we simply need to study the behaviour of $t_\nu^k(\lambda)$ defined in Corollary 6.3 for small arguments. The results follow from the asymptotic behaviour of the modified Bessel functions for small arguments, which holds without further restrictions on $z$,
\[ I_\nu(z) \sim \frac{1}{\Gamma(\nu + 1)} \left(\frac{z}{2}\right)^\nu, \qquad |z| \to 0. \]
Using the relation $I_\nu'(z) = \frac{1}{2}\left(I_{\nu+1}(z) + I_{\nu-1}(z)\right)$ we compute as $|z| \to 0$,
\[ \pm\alpha_k I_\nu(\nu z) + \nu z I_\nu'(\nu z) \sim \frac{\nu}{\Gamma(\nu + 1)} \left(\frac{\nu z}{2}\right)^\nu \left(1 \pm \frac{\alpha_k}{\nu} + \frac{\nu z^2}{4(\nu + 1)}\right), \]
\[ \nu I_\nu(\nu z) \sim \frac{\nu}{\Gamma(\nu + 1)} \left(\frac{\nu z}{2}\right)^\nu. \]
The result now follows from the explicit form of $t_\nu^k(\lambda)$.
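The vanishing of $t_\nu^k(\lambda)$ as $\lambda \to 0$ can also be observed numerically. The following is a minimal sketch and not part of the original argument: it evaluates the right-hand sides of (6.14) and (6.15) for small $z = \sqrt{-\lambda}$, using the ascending power series of $I_\nu$ and the relation for $I_\nu'$ quoted in the proof. The function names, the truncation order `terms` and the sample values $\nu = 1.5$, $\alpha = \pm 0.5$ are illustrative assumptions.

```python
import math


def bessel_i(nu, x, terms=40):
    """Modified Bessel function I_nu(x), x > 0, from its ascending power series."""
    half = x / 2.0
    return sum(
        half ** (2 * m + nu) / (math.factorial(m) * math.gamma(m + nu + 1))
        for m in range(terms)
    )


def bessel_i_prime(nu, x):
    """I_nu'(x) via the relation I_nu' = (I_{nu+1} + I_{nu-1}) / 2."""
    return 0.5 * (bessel_i(nu + 1, x) + bessel_i(nu - 1, x))


def t_dirichlet(nu, z):
    """Right-hand side of (6.14), written as a function of z = sqrt(-lambda)."""
    x = nu * z
    return -math.log(bessel_i(nu, x)) + nu * math.log(x / 2.0) - math.lgamma(nu + 1)


def t_neumann(nu, alpha, z):
    """Right-hand side of (6.15); assumes nu > |alpha| so all logarithms are real."""
    x = nu * z
    mixed = alpha * bessel_i(nu, x) + x * bessel_i_prime(nu, x)
    return (-math.log(mixed) + nu * math.log(x / 2.0)
            - math.lgamma(nu) + math.log(1 + alpha / nu))


# Hypothetical sample values: all three quantities shrink towards 0 with z.
for z in (1e-1, 1e-2, 1e-3):
    print(z, t_dirichlet(1.5, z), t_neumann(1.5, 0.5, z), t_neumann(1.5, -0.5, z))
```

Since every summand of $t_\nu^k$ is a signed combination of these three quantities, their decay is consistent with $t_\nu^k(0) = 0$ for the sampled order.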

Remark 6.9. The statement of Proposition 6.8 shows an obvious advantage of taking into account the symmetry of the de Rham complex.

Now we have all the ingredients together, since by analogous arguments as in [S, Sect. 4.1] the total zeta-function of Corollary 6.3 is given as follows:
\[ \zeta_k(s) = \frac{s}{\Gamma(s + 1)} \left[ \gamma A^k(s) - B^k(s) - \frac{1}{s} A^k(s) + P^k(s, 0) \right] + \sum_{r=1}^n \zeta_{k,N}(2s + r)\, \frac{s^2}{\Gamma(s + 1)} \int_0^\infty t^{s-1}\, \frac{1}{2\pi i} \int_{\Lambda_c} \frac{e^{-\lambda t}}{-\lambda}\, f_r^k(\lambda)\, d\lambda\, dt + \frac{s^2}{\Gamma(s + 1)}\, h(s), \]
where the last term vanishes with its derivative at $s = 0$. Simply by inserting the results of Lemma 6.5, Proposition 6.6, Proposition 6.8 together with Definition 6.7 into the above expression we obtain the following proposition:


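Proposition 6.10. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be a bounded generalized cone. Up to a term of the form $s^2 h(s)/\Gamma(s + 1)$, which vanishes with its derivative at $s = 0$, the zeta-function $\zeta_k(s)$ from Definition 4.3 is given for $\dim M$ odd by
\[ \frac{s}{\Gamma(s + 1)} \left[ \sum_{\nu \in F_k} \nu^{-2s} \log\left(1 - \frac{\alpha_k}{\nu}\right) - \sum_{\nu \in F_k} \nu^{-2s} \log\left(1 + \frac{\alpha_k}{\nu}\right) + \sum_{r=1}^n \zeta_{k,N}(2s + r)\, (-1)^{r+1}\, \frac{\alpha_k^r - (-\alpha_k)^r}{r} \right] \]
\[ + \sum_{r=1}^n \zeta_{k,N}(2s + r)\, \frac{s}{\Gamma(s + 1)} \sum_{b=0}^r \left(z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) \frac{\Gamma(s + b + r/2)}{\Gamma(b + r/2)}. \]
For $\dim M$ even we obtain
\[ \frac{s}{\Gamma(s + 1)} \left[ -\sum_{\nu \in F_k} \nu^{-2s} \log\left(1 - \frac{\alpha_k}{\nu}\right) - \sum_{\nu \in F_k} \nu^{-2s} \log\left(1 + \frac{\alpha_k}{\nu}\right) + \sum_{r=1}^n \zeta_{k,N}(2s + r)\, (-1)^{r+1}\, \frac{\alpha_k^r + (-\alpha_k)^r}{r} + \left(\frac{1}{s} - \gamma\right) \sum_{\nu \in F_k} \nu^{-2s} \right] \]
\[ + \sum_{r=1}^n \zeta_{k,N}(2s + r)\, \frac{s}{\Gamma(s + 1)} \sum_{b=0}^r \left(2 x_{r,b} - z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) \frac{\Gamma(s + b + r/2)}{\Gamma(b + r/2)}. \]

Corollary 6.11. With $\zeta_{k,N}(s, a) := \sum_{\nu \in F_k} (\nu + a)^{-s}$ we deduce for odd dimensions
\[ \zeta_k'(0) = \zeta_{k,N}'(0, \alpha_k) - \zeta_{k,N}'(0, -\alpha_k) + \sum_{i=1}^n (-1)^{i+1}\, \frac{\alpha_k^i - (-\alpha_k)^i}{i}\, \mathrm{Res}\, \zeta_{k,N}(i) \left\{ \frac{\gamma}{2} + \frac{\Gamma'(i)}{\Gamma(i)} \right\} \]
\[ + \sum_{i=1}^n \frac{1}{2}\, \mathrm{Res}\, \zeta_{k,N}(i) \sum_{b=0}^i \left(z_{i,b}(-\alpha_k) - z_{i,b}(\alpha_k)\right) \frac{\Gamma'(b + i/2)}{\Gamma(b + i/2)}, \]
and for even dimensions
\[ \zeta_k'(0) = \zeta_{k,N}'(0, \alpha_k) + \zeta_{k,N}'(0, -\alpha_k) + \sum_{i=1}^n (-1)^{i+1}\, \frac{\alpha_k^i + (-\alpha_k)^i}{i}\, \mathrm{Res}\, \zeta_{k,N}(i) \left\{ \frac{\gamma}{2} + \frac{\Gamma'(i)}{\Gamma(i)} \right\} \]
\[ + \sum_{i=1}^n \frac{1}{2}\, \mathrm{Res}\, \zeta_{k,N}(i) \sum_{b=0}^i \left(2 x_{i,b} - z_{i,b}(-\alpha_k) - z_{i,b}(\alpha_k)\right) \frac{\Gamma'(b + i/2)}{\Gamma(b + i/2)}. \]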


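Proof. First we consider a major building brick of the expressions in Proposition 6.10. Here we follow the approach of [BKD, Sect. 11]. Put for $\alpha \in \{\pm\alpha_k\}$,
\[ K(s) := \sum_{\nu \in F_k} \nu^{-2s} \left[ -\log\left(1 + \frac{\alpha}{\nu}\right) + \sum_{r=1}^n (-1)^{r+1}\, \frac{1}{r} \left(\frac{\alpha}{\nu}\right)^r \right]. \]
Since the zeta-function $\zeta_{k,N}(s) = \sum_{\nu \in F_k} \nu^{-s}$ converges absolutely for $\mathrm{Re}(s) \geq n + 1$, $n = \dim N$, the sum above converges for $s = 0$. In order to evaluate $K(0)$, introduce a regularization parameter $z$ as follows:
\[ K_0(z) := \sum_{\nu \in F_k} \int_0^\infty t^{z-1} e^{-\nu t} \left( e^{-\alpha t} + \sum_{i=0}^n (-1)^{i+1}\, \frac{\alpha^i t^i}{i!} \right) dt = \Gamma(z) \cdot \zeta_{k,N}(z, \alpha) + \sum_{i=0}^n (-1)^{i+1}\, \frac{\alpha^i}{i!}\, \Gamma(z + i)\, \zeta_{k,N}(z + i), \]
where we have introduced
\[ \zeta_{k,N}(z, \alpha) := \frac{1}{\Gamma(z)} \sum_{\nu \in F_k} \int_0^\infty t^{z-1} e^{-(\nu + \alpha) t}\, dt. \]
For $\mathrm{Re}(z)$ large enough $\zeta_{k,N}(z, \alpha) = \sum_{\nu \in F_k} (\nu + \alpha)^{-z}$ is holomorphic and extends meromorphically to $\mathbb{C}$. Note that for $\alpha \in \{\pm\alpha_k\}$ and $\nu \in F_k$ we have $\alpha \neq -\nu$, so no zero mode appears in the zeta function $\zeta_{k,N}(z, \alpha)$. In particular $K_0(z)$ is meromorphic in $z \in \mathbb{C}$ and by construction $K_0(0) = K(0)$. With the same arguments as in [BKD, Sect. 11] we arrive at
\[ K(0) = \zeta_{k,N}'(0, \alpha) - \zeta_{k,N}'(0) + \sum_{i=1}^n (-1)^{i+1}\, \frac{\alpha^i}{i} \left\{ \mathrm{Res}\, \zeta_{k,N}(i) \left[ \gamma + \frac{\Gamma'(i)}{\Gamma(i)} \right] + \mathrm{PP}\, \zeta_{k,N}(i) \right\}, \]
where $\mathrm{PP}\, \zeta_{k,N}(r)$ denotes the constant term in the asymptotics of $\zeta_{k,N}(s)$ near the pole singularity $s = r$. This result corresponds to the result obtained in [BKD, p. 388], where the factors $1/2$ in front of $\zeta_{k,N}'(0)$ and $2$ in front of $\mathrm{Res}\, \zeta_{k,N}(i)$, as present in [BKD], do not appear here because of a different notation: here we have set $\zeta_{k,N}(s) = \sum \nu^{-s}$ instead of $\sum \nu^{-2s}$.

In fact $K(0)$ enters the calculations twice: with $\alpha = \alpha_k$ and $\alpha = -\alpha_k$. In the odd-dimensional case both expressions are subtracted from each other, in the even-dimensional case they are added up. Furthermore we compute straightforwardly
\[ \left. \frac{d}{ds} \right|_{s=0} \frac{s}{\Gamma(s + 1)}\, \zeta_{k,N}(2s + r)\, \frac{\Gamma(s + b + r/2)}{\Gamma(b + r/2)} = \frac{1}{2}\, \mathrm{Res}\, \zeta_{k,N}(r) \left[ \frac{\Gamma'(b + r/2)}{\Gamma(b + r/2)} + \gamma \right] + \mathrm{PP}\, \zeta_{k,N}(r). \]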


We infer from (6.21),
\[ \sum_{b=0}^r \left(z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) = (-1)^r\, \frac{\alpha_k^r - (-\alpha_k)^r}{r}, \]
\[ \sum_{b=0}^r \left(2 x_{r,b} - z_{r,b}(-\alpha_k) - z_{r,b}(\alpha_k)\right) = (-1)^r\, \frac{\alpha_k^r + (-\alpha_k)^r}{r}. \]
This leads after several cancellations to the desired result in odd dimensions. In even dimensions the result follows by a straightforward evaluation of the derivative at zero for the remaining component:
\[ \left. \frac{d}{ds} \right|_{s=0} \frac{s}{\Gamma(s + 1)}\, \zeta_{k,N}(2s) \left(\frac{1}{s} - \gamma\right) = 2\, \zeta_{k,N}'(0). \]

7. Contribution from the Subcomplexes II

It remains to identify the contribution to the analytic torsion coming from the subcomplexes (3.6) of second type, induced by the harmonics on the base manifold $N$. The calculations in Sect. 7.2 are provided in [L3] and are repeated here for completeness.

7.1. First order regular-singular model operators. We consider the following regular-singular model operator:
\[ d_p := \frac{d}{dx} + \frac{p}{x} : C_0^\infty(0, R) \to C_0^\infty(0, R), \qquad p \in \mathbb{R}. \]
Any element of the maximal domain $D(d_{p,\max})$ is square-integrable with its weak derivative in $L^2_{loc}(0, R]$, due to regularity of the coefficients of $d_p$ at $x = R$. So we have (compare [W, Theorem 3.2])
\[ D(d_{p,\max}) \subset H^1_{loc}(0, R]. \]
Consequently elements of the maximal domain $D(d_{p,\max})$ are continuous at any $x \in (0, R]$. Further we derive by solving the inhomogeneous differential equation $d_p f = g \in L^2(0, R)$ via the variation of constants method (the solution to the homogeneous equation $d_p u = 0$ is simply $u(x) = c \cdot x^{-p}$), that elements of the maximal domain $f \in D(d_{p,\max})$ are of the following form:
\[ f(x) = c \cdot x^{-p} - x^{-p} \cdot \int_x^R y^p\, (d_p f)(y)\, dy. \tag{7.1} \]

We now analyze the expression above in order to determine the asymptotic behaviour at $x = 0$ of elements in the maximal domain of $d_p$ for different values of $p \in \mathbb{R}$.

Proposition 7.1. Let $O(\sqrt{x})$ and $O(\sqrt{x |\log(x)|})$ refer to the asymptotic behaviour as $x \to 0$. Then the maximal domain of $d_p$ is characterized explicitly as follows:


(i) For $p < -1/2$ we have
\[ D(d_{p,\max}) = \{ f \in H^1_{loc}(0, R] \mid f(x) = O(\sqrt{x}),\ d_p f \in L^2(0, R) \}. \]
(ii) For $p = -1/2$ we have
\[ D(d_{p,\max}) = \{ f \in H^1_{loc}(0, R] \mid f(x) = O(\sqrt{x |\log x|}),\ d_p f \in L^2(0, R) \}. \]
(iii) For $p \in (-1/2, 1/2)$ we have
\[ D(d_{p,\max}) = \{ f \in H^1_{loc}(0, R] \mid f(x) = c_f\, x^{-p} + O(\sqrt{x}),\ d_p f \in L^2(0, R) \}, \]
where the constants $c_f$ depend only on $f$.
(iv) For $p \geq 1/2$ we have
\[ D(d_{p,\max}) = \{ f \in H^1_{loc}(0, R] \mid f(x) = O(\sqrt{x}),\ d_p f \in L^2(0, R) \}. \]

Proof. Due to similarity of arguments we prove the first statement only, in order to avoid repetition. Let $p < -1/2$ and consider any $f \in D(d_{p,\max})$. By (7.1) this element can be expressed by
\[ f(x) = c \cdot x^{-p} - x^{-p} \cdot \int_x^R y^p g(y)\, dy, \]
where $g = d_p f$. By the Cauchy-Schwarz inequality we obtain for the second term in the expression
\[ \left| x^{-p} \int_x^R y^p g(y)\, dy \right| \leq x^{-p} \sqrt{\int_x^R y^{2p}\, dy} \cdot \sqrt{\int_x^R |g|^2} \leq c \cdot x^{-p} \sqrt{x^{2p+1} - R^{2p+1}}\, \|g\|_{L^2} = c \cdot \sqrt{x}\, \sqrt{1 - R^{2p+1} x^{-2p-1}}\, \|g\|_{L^2}, \]
where $c = 1/\sqrt{-2p - 1}$. Since $(-2p - 1) > 0$ we obtain for the asymptotics as $x \to 0$,
\[ x^{-p} \int_x^R y^p g(y)\, dy = O(\sqrt{x}). \]
Observe further that for $p < -1/2$ we also have $x^{-p} = O(\sqrt{x})$. This shows the inclusion $\subseteq$ in the statement. To see the converse inclusion observe
\[ \{ f \in H^1_{loc}(0, R] \mid f(x) = O(\sqrt{x}), \text{ as } x \to 0 \} \subset L^2(0, R). \]
This proves the statement.

In order to analyze the minimal closed extension $d_{p,\min}$ of $d_p$, we need to derive an identity relating $d_p$ to its formal adjoint $d_p^t$, the so-called Lagrange identity. With the notation of Proposition 7.1 we obtain the following result.

Lemma 7.2 (Lagrange-Identity). For any $f \in D(d_{p,\max})$ and $g \in D(d^t_{p,\max})$,
\[ \langle d_p f, g \rangle - \langle f, d_p^t g \rangle = f(R) g(R) - c_f c_g, \quad \text{for } |p| < 1/2, \]
\[ \langle d_p f, g \rangle - \langle f, d_p^t g \rangle = f(R) g(R), \quad \text{for } |p| \geq 1/2. \]


Proof.
\[ \langle d_p f, g \rangle - \langle f, d_p^t g \rangle = f(R) g(R) - f(x) \cdot g(x) \big|_{x \to 0}. \]
Applying Proposition 7.1 to $f \in D(d_{p,\max})$ and $g \in D(d^t_{p,\max}) = D(d_{-p,\max})$ we obtain:
\[ f(x) \cdot g(x) \big|_{x \to 0} = c_f c_g, \quad \text{for } |p| < 1/2, \]
\[ f(x) \cdot g(x) \big|_{x \to 0} = 0, \quad \text{for } |p| \geq 1/2. \]
This proves the statement of the lemma.

Proposition 7.3.
\[ D(d_{p,\min}) = \{ f \in D(d_{p,\max}) \mid c_f = 0,\ f(R) = 0 \}, \quad \text{for } |p| < 1/2, \]
\[ D(d_{p,\min}) = \{ f \in D(d_{p,\max}) \mid f(R) = 0 \}, \quad \text{for } |p| \geq 1/2, \]
where the coefficient $c_f$ refers to the notation in Proposition 7.1 (iii).

Proof. Fix some $f \in D(d_{p,\min})$. Then for any $g \in D(d^t_{p,\max})$ we obtain using $d_{p,\min} = (d^t_{p,\max})^*$ the following relation:
\[ \langle d_{p,\min} f, g \rangle - \langle f, d^t_{p,\max} g \rangle = 0. \]
Together with the Lagrange identity, established in Lemma 7.2, we find
\[ f(R) g(R) - c_f c_g = 0, \quad \text{for } |p| < 1/2, \tag{7.2} \]
\[ f(R) g(R) = 0, \quad \text{for } |p| \geq 1/2. \tag{7.3} \]
Let now $|p| < 1/2$. Then for any $c, b \in \mathbb{C}$ there exists $g \in D(d^t_{p,\max})$ such that $c_g = c$ and $g(R) = b$. By arbitrariness of $c, b \in \mathbb{C}$ we conclude from (7.2),
\[ c_f = 0, \qquad f(R) = 0. \]
For $|p| \geq 1/2$ similar arguments hold, so we get $f(R) = 0$. This proves the inclusion $\subseteq$ in the statements. For the converse inclusion consider some $f \in D(d_{p,\max})$ with $c_f = 0$ (for $|p| < 1/2$) and $f(R) = 0$. Now for any $g \in D(d^t_{p,\max})$ we infer from Lemma 7.2,
\[ \langle d_{p,\max} f, g \rangle - \langle f, d^t_{p,\max} g \rangle = 0. \]
Thus $f$ is automatically an element of $D((d^t_{p,\max})^*) = D(d_{p,\min})$. This proves the converse inclusion.

Remark 7.4. The calculations and results of this subsection are the one-dimensional analogue of the discussion in [BS].
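The representation (7.1), on which the domain descriptions of this subsection rest, can be checked numerically. The sketch below is illustrative only: for a chosen right-hand side $g$ it builds $f$ from (7.1) with a composite Simpson rule and then verifies $f' + p f / x = g$ by a central difference. The helper names, the quadrature resolution and the sample parameters are assumptions made for this example.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def f_from_formula(x, p, g, c, R):
    """Evaluate f(x) = c x^{-p} - x^{-p} * int_x^R y^p g(y) dy, as in (7.1)."""
    integral = simpson(lambda y: y ** p * g(y), x, R)
    return c * x ** (-p) - x ** (-p) * integral


def residual(x, p, g, c, R, h=1e-4):
    """Central-difference value of (d_p f)(x) - g(x); should be close to 0."""
    df = (f_from_formula(x + h, p, g, c, R) - f_from_formula(x - h, p, g, c, R)) / (2 * h)
    return df + p * f_from_formula(x, p, g, c, R) / x - g(x)


# Hypothetical sample: p = 1/4, g = cos, c = 1, R = 2, evaluated at x = 0.7.
print(residual(0.7, 0.25, math.cos, 1.0, 2.0))
```

For $p = -1$, $g \equiv 1$, $c = 2$ and $R = 1$ the formula gives $f(x) = 2x + x \log x$ in closed form, which offers an independent check of the quadrature.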


7.2. Computation of the zeta-determinants. Recall the explicit form of the subcomplexes of second type,
\[ 0 \to C_0^\infty((0, 1), \langle 0 \oplus u_i \rangle) \xrightarrow{\ d\ } C_0^\infty((0, 1), \langle u_i \oplus 0 \rangle) \to 0, \tag{7.4} \]
where $\{u_i\}$ is an orthonormal basis of $H^k(N)$. With respect to the generators $0 \oplus u_i$ and $u_i \oplus 0$ we obtain for the action of the exterior derivative,
\[ d = (-1)^k \partial_x + \frac{c_k}{x}, \qquad c_k = (-1)^k (k - n/2). \]
By compatibility of the induced decomposition we have (cf. (3.7))
\[ D(\Delta_k^{rel}) \cap L^2((0, R), \langle 0 \oplus u_i \rangle) = D(d^t_{\max} d_{\min}) = D\left( \left[(-1)^{k+1} \partial_x + \frac{c_k}{x}\right]_{\max} \left[(-1)^k \partial_x + \frac{c_k}{x}\right]_{\min} \right). \]
Consider, in the notation of Sect. 5, for any $\nu \in \mathbb{R}$ and $\alpha \in \mathbb{R} \cup \{\infty\}$ the operator $l_\nu = -\partial_x^2 + x^{-2}(\nu^2 - 1/4)$ with the following self-adjoint extension:
\[ D(L_\nu(\alpha)) = \{ f \in D(l_{\nu,\max}) \mid (\alpha - 1/2)^{-1} f'(1) + f(1) = 0,\ f(x) = O(\sqrt{x}),\ x \to 0 \}. \]
Here $L_\nu(\alpha = 1/2)$ denotes the self-adjoint extension of $l_\nu$ with pure Neumann boundary conditions at $x = 1$. Furthermore $L_\nu(\infty)$ is the extension with Dirichlet boundary conditions at $x = 1$. Proposition 7.3 determines the asymptotics of elements in $D(d^t_{\max} d_{\min})$ at $x = 0$. The boundary conditions at $x = 1$ follow from Proposition 2.5. As a consequence we have
\[ \left[(-1)^{k+1} \partial_x + \frac{c_k}{x}\right]_{\max} \left[(-1)^k \partial_x + \frac{c_k}{x}\right]_{\min} = L_{|k - (n-1)/2|}(\infty). \]
It is well-known, see also [L, Theorem 1.1] and [L, (1.37)], that the zeta-function of $L_\nu(\alpha)$ extends meromorphically to $\mathbb{C}$ and is regular at the origin. We abbreviate
\[ T(L_\nu(\alpha)) := \log \det L_\nu(\alpha) = -\zeta'(s = 0, L_\nu(\alpha)). \]
Put $b_k := \dim H^k(N)$. Then the contribution to the analytic torsion coming from harmonics on the base manifold is given due to the formula (4.4) as follows:
\[ \frac{1}{2} \sum_{k=0}^{\dim M} (-1)^k\, b_k\, T(L_{|k - (n-1)/2|}(\infty)). \tag{7.5} \]

Proposition 7.5. For $\nu \geq 0$ we have
\[ \mathrm{Spec}(L_\nu(\infty)) \cup \{0\} = \mathrm{Spec}(L_{\nu+1}(\nu + 1)) \cup \{0\}. \]

Proof. Put $d_p := \partial_x + x^{-1} p$. We get $l_{p+1/2} = d_p^t d_p$, $l_{p-1/2} = d_p d_p^t$. By a combination of Propositions 7.1 and 7.3, which determine the maximal and the minimal domains of $d_p$, and by Proposition 2.5, we obtain for $\nu \geq 0$,
\[ D(d_{\nu+1/2,\max}\, d^t_{\nu+1/2,\min}) = \{ f \in D(l_{\nu,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f(1) = 0 \}, \]
\[ D(d^t_{\nu+1/2,\min}\, d_{\nu+1/2,\max}) = \{ f \in D(l_{\nu+1,\max}) \mid f(x) = O(\sqrt{x}),\ x \to 0,\ f'(1) + (\nu + 1/2) f(1) = 0 \}. \]


Hence we find
\[ L_\nu(\infty) = d_{\nu+1/2,\max}\, d^t_{\nu+1/2,\min} = d_{\nu+1/2,\max}\, (d_{\nu+1/2,\max})^*, \]
\[ L_{\nu+1}(\nu + 1) = d^t_{\nu+1/2,\min}\, d_{\nu+1/2,\max} = (d_{\nu+1/2,\max})^*\, d_{\nu+1/2,\max}. \]
Comparing both operators we deduce the statement on the spectrum, since all non-zero eigenvalues of the operators are simple by similar arguments as in Corollary 5.2.

Proposition 7.6. Let $\alpha + \nu \neq 0$. Then
\[ T(L_\nu(\infty)) = T(L_\nu(\alpha)) - \log(\alpha + \nu). \]

Proof. The assumption $\alpha + \nu \neq 0$ implies with [L, Theorem 1.2]
\[ \det\nolimits_\zeta(L_\nu(\alpha)) = \sqrt{2\pi}\, \frac{\alpha + \nu}{2^\nu\, \Gamma(1 + \nu)}. \]
Moreover we have again by [L, Theorem 1.2],
\[ \det\nolimits_\zeta(L_\nu(\infty)) = \frac{\sqrt{2\pi}}{\Gamma(1 + \nu)\, 2^\nu}. \]
In order to apply [L, Theorem 1.2] explicitly, one needs to identify the so-called "normalized solutions" of $L_\nu(\alpha)$ and $L_\nu(\infty)$. This is done in detail in [BV, Cor. 3.11 and 3.12]. Consequently we obtain for $\alpha + \nu \neq 0$,
\[ \frac{\det_\zeta(L_\nu(\infty))}{\det_\zeta(L_\nu(\alpha))} = \frac{1}{\alpha + \nu}. \]
Taking logarithms we get the result.

Proposition 7.7.
\[ T(L_{k+1/2}(\infty)) = \log 2 - \sum_{l=0}^k \log(2l + 1). \]

Proof. Apply Proposition 7.6 to $L_{\nu+1}(\nu + 1)$, $\nu \geq 0$. We obtain
\[ T(L_{\nu+1}(\infty)) = T(L_{\nu+1}(\nu + 1)) - \log(2\nu + 2) = T(L_\nu(\infty)) - \log(2\nu + 2), \]
where for the second equality we used Proposition 7.5. We iterate the equality with $\nu = k - 1/2$ and obtain
\[ T(L_{k+1/2}(\infty)) = T(L_{1/2}(\infty)) - \sum_{l=0}^k \log(2l + 1). \]
The operator $L_{1/2}(\infty)$ is simply $-\partial_x^2$ on $[0, 1]$ with Dirichlet boundary conditions. Its spectrum is given by $(n^2 \pi^2)_{n \in \mathbb{N}}$. Thus we obtain with $\zeta_R(0) = -1/2$ and $\zeta_R'(0) = -\frac{1}{2} \log 2\pi$,
\[ \zeta_{L_{1/2}(\infty)}(s) = \sum_{n=1}^\infty \pi^{-2s} n^{-2s} \]
\[ \Rightarrow \quad \zeta_{L_{1/2}(\infty)}'(0) = -2 (\log \pi)\, \zeta_R(0) + 2 \zeta_R'(0) = -\log 2. \]
Now we finally compute the contribution from harmonics on the base:


Theorem 7.8. Let $M$ be a bounded generalized cone of length one over a closed oriented Riemannian manifold $N$ of dimension $n$. Let $\chi(N)$ denote the Euler characteristic of $N$ and $b_k := \dim H^k(N)$ be the Betti numbers. Then the contribution to the analytic torsion coming from harmonics on the base manifold is given as follows. For $\dim M$ odd the contribution amounts to
\[ \frac{\log 2}{2}\, \chi(N) - \sum_{k=0}^{n/2-1} (-1)^k\, b_k \sum_{l=0}^{n/2-k-1} \log(2l + 1) - \frac{1}{2} \sum_{k=0}^{n/2-1} (-1)^k\, b_k \log(n - 2k + 1). \]
For $\dim M$ even the contribution amounts to
\[ \frac{1}{2} \sum_{k=0}^{(n-1)/2} (-1)^k\, b_k \log(n - 2k + 1). \]

Proof. We infer from (7.5) for the contribution of the harmonics on the base manifold,
\[ \frac{1}{2} \sum_{k=0}^{\dim M} (-1)^k\, b_k\, T(L_{|k - (n-1)/2|}(\infty)). \]
We obtain by Poincaré duality on the base manifold $N$, for $\dim M = n + 1$ odd:
\[ \frac{1}{2} \sum_{k=0}^{\dim M} (-1)^k\, b_k\, T(L_{|k - (n-1)/2|}(\infty)) = \frac{1}{2} \sum_{k=0}^{n/2-1} (-1)^k\, b_k \left( T(L_{n/2-k-1/2}(\infty)) + T(L_{n/2-k+1/2}(\infty)) \right), \]
for $\dim M = n + 1$ even:
\[ \frac{1}{2} \sum_{k=0}^{\dim M} (-1)^k\, b_k\, T(L_{|k - (n-1)/2|}(\infty)) = \frac{1}{2} \sum_{k=0}^{(n-1)/2} (-1)^k\, b_k \left( T(L_{n/2-k-1/2}(\infty)) - T(L_{n/2-k+1/2}(\infty)) \right). \]
Inserting the result of Proposition 7.7 into the expressions above, we obtain the statement.

8. Total Result and Formulas in Lower Dimensions

Patching together the results of both preceding sections we can now provide a complete formula for the analytic torsion of a bounded generalized cone. In fact we simply have to add up the results of Theorem 7.8 and Corollary 6.11. In even dimensions one has to be careful in the middle degree, as explained in Remark 4.5.
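The zeta-determinants entering Theorem 7.8 are elementary enough to be cross-checked numerically before they are added to the other contributions. The sketch below is a sanity check under stated assumptions, not part of the original derivation: it compares the closed form of Proposition 7.7 with the logarithm of the determinant $\sqrt{2\pi} / (\Gamma(1 + \nu)\, 2^\nu)$ quoted in the proof of Proposition 7.6, and confirms that consecutive orders differ by $\log(2\nu + 2)$, which is the origin of the terms $\log(n - 2k + 1)$. The function names are hypothetical.

```python
import math


def torsion_closed_form(k):
    """T(L_{k+1/2}(infinity)) according to Proposition 7.7."""
    return math.log(2) - sum(math.log(2 * l + 1) for l in range(k + 1))


def torsion_from_determinant(nu):
    """log of sqrt(2*pi) / (Gamma(1 + nu) * 2**nu), the Dirichlet determinant."""
    return 0.5 * math.log(2 * math.pi) - math.lgamma(1 + nu) - nu * math.log(2)


# Half-integer orders: both expressions should agree.
for k in range(6):
    print(k, torsion_closed_form(k), torsion_from_determinant(k + 0.5))

# Difference of consecutive orders, with nu = n/2 - k - 1/2 (sample n = 5, k = 1).
n, k = 5, 1
nu = n / 2 - k - 0.5
print(torsion_from_determinant(nu) - torsion_from_determinant(nu + 1),
      math.log(n - 2 * k + 1))
```

The second check uses only the recursion $T(L_{\nu+1}(\infty)) = T(L_\nu(\infty)) - \log(2\nu + 2)$, so it applies to integer orders as well, which is the case relevant for even $\dim M$.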


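Theorem 8.1. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be an odd-dimensional bounded generalized cone over a closed oriented Riemannian manifold $(N, g^N)$. Introduce the notation $n = \dim N$, $\alpha_k = (n - 1)/2 - k$ and $b_k = \dim H^k(N)$. Put
\[ F_k := \{ \xi \in \mathbb{R}^+ \mid \xi^2 = \eta + (k + 1/2 - n/2)^2,\ \eta \in \mathrm{Spec}\, \Delta_{k,ccl,N} \setminus \{0\} \}, \]
\[ \zeta_{k,N}(s) := \sum_{\nu \in F_k} \nu^{-s}, \qquad \zeta_{k,N}(s, \alpha) := \sum_{\nu \in F_k} (\nu + \alpha)^{-s}, \qquad \mathrm{Re}(s) \gg 0. \]
Then the logarithm of the scalar analytic torsion of $M$ is given by
\[ \begin{aligned} \log T(M) ={}& \frac{\log 2}{2}\, \chi(N) - \sum_{k=0}^{n/2-1} (-1)^k\, b_k \sum_{l=0}^{n/2-k-1} \log(2l + 1) - \frac{1}{2} \sum_{k=0}^{n/2-1} (-1)^k\, b_k \log(n - 2k + 1) \\ &+ \sum_{k=0}^{n/2-1} \frac{(-1)^k}{2} \left( \zeta_{k,N}'(0, \alpha_k) - \zeta_{k,N}'(0, -\alpha_k) \right) \\ &+ \sum_{k=0}^{n/2-1} \frac{(-1)^k}{2} \sum_{i=1}^n (-1)^{i+1}\, \frac{\alpha_k^i - (-\alpha_k)^i}{i}\, \mathrm{Res}\, \zeta_{k,N}(i) \left\{ \frac{\gamma}{2} + \frac{\Gamma'(i)}{\Gamma(i)} \right\} \\ &+ \sum_{k=0}^{n/2-1} \frac{(-1)^k}{2} \sum_{i=1}^n \frac{1}{2}\, \mathrm{Res}\, \zeta_{k,N}(i) \sum_{b=0}^i \left( z_{i,b}(-\alpha_k) - z_{i,b}(\alpha_k) \right) \frac{\Gamma'(b + i/2)}{\Gamma(b + i/2)}. \end{aligned} \]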

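Theorem 8.2. Let $M = (0, 1] \times N$, $g^M = dx^2 \oplus x^2 g^N$ be an even-dimensional bounded generalized cone over a closed oriented Riemannian manifold $(N, g^N)$. Introduce the notation $n = \dim N$, $\alpha_k = (n - 1)/2 - k$ and $b_k = \dim H^k(N)$. Put
\[ F_k := \{ \xi \in \mathbb{R}^+ \mid \xi^2 = \eta + (k + 1/2 - n/2)^2,\ \eta \in \mathrm{Spec}\, \Delta_{k,ccl,N} \setminus \{0\} \}, \]
\[ \zeta_{k,N}(s) := \sum_{\nu \in F_k} \nu^{-s}, \qquad \zeta_{k,N}(s, \alpha) := \sum_{\nu \in F_k} (\nu + \alpha)^{-s}, \qquad \mathrm{Re}(s) \gg 0, \]
\[ \delta_k := \begin{cases} 1/2 & \text{if } k = (n - 1)/2, \\ 1 & \text{otherwise.} \end{cases} \]
Then the logarithm of the scalar analytic torsion of $M$ is given by
\[ \begin{aligned} \log T(M) ={}& \sum_{k=0}^{(n-1)/2} \frac{(-1)^k}{2} \left( b_k \log(n - 2k + 1) + \delta_k\, \zeta_{k,N}'(0, \alpha_k) + \delta_k\, \zeta_{k,N}'(0, -\alpha_k) \right) \\ &+ \sum_{k=0}^{(n-1)/2} \frac{(-1)^k}{2}\, \delta_k \sum_{i=1}^n (-1)^{i+1}\, \frac{\alpha_k^i + (-\alpha_k)^i}{i}\, \mathrm{Res}\, \zeta_{k,N}(i) \left\{ \frac{\gamma}{2} + \frac{\Gamma'(i)}{\Gamma(i)} \right\} \\ &+ \sum_{k=0}^{(n-1)/2} \frac{(-1)^k}{2}\, \delta_k \sum_{i=1}^n \frac{1}{2}\, \mathrm{Res}\, \zeta_{k,N}(i) \sum_{b=0}^i \left( 2 x_{i,b} - z_{i,b}(-\alpha_k) - z_{i,b}(\alpha_k) \right) \frac{\Gamma'(b + i/2)}{\Gamma(b + i/2)}. \end{aligned} \]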


The formula could not be made further explicit due to presence of coefficients $x_{r,b}$ and $z_{r,b}(\pm\alpha_k)$, arising from the polynomials
\[ D_r(t) = \sum_{b=0}^r x_{r,b}\, t^{r+2b}, \qquad M_r(t, \pm\alpha) = \sum_{b=0}^r z_{r,b}(\pm\alpha)\, t^{r+2b}, \]
which were introduced in the expansions (6.17) and (6.18). These polynomials can be computed explicitly for any given order $r \in \mathbb{N}$. To point out the applicability of the general results we pursue explicit computations in dimension two and three. We continue in the notation of the theorems above.

Corollary 8.3. Let $M$ be a two-dimensional bounded generalized cone of length one over a closed oriented manifold $N$. Then the analytic torsion of $M$ is given by
\[ \log T(M) = \frac{1}{2} \dim H^0(N) \log 2 + \frac{1}{2}\, \zeta_{0,N}'(0) - \frac{1}{4}\, \mathrm{Res}\, \zeta_{0,N}(s = 1). \]
In the special case of $N = S^1$ we obtain
\[ \log T(M) = \frac{1}{2} \left( -\log \pi - 1 \right). \]

Proof. In the two-dimensional case the general formula of Theorem 8.2 reduces to the following expression:
\[ \log T(M) = \frac{1}{2} \dim H^0(N) \log 2 + \frac{1}{4}\, \zeta_{0,N}'(0, \alpha_0) + \frac{1}{4}\, \zeta_{0,N}'(0, -\alpha_0) + \frac{1}{8}\, \mathrm{Res}\, \zeta_{0,N}(1) \sum_{b=0}^1 \left( 2 x_{1,b} - z_{1,b}(-\alpha_0) - z_{1,b}(\alpha_0) \right) \frac{\Gamma'(b + 1/2)}{\Gamma(b + 1/2)}. \]
Now we evaluate the combinatorial factor of $\mathrm{Res}\, \zeta_{0,N}(1)$ by considering the following formulas, encountered in [BGKE, Sect. 2–3]:
\[ D_1(t) = \sum_{b=0}^1 x_{1,b}\, t^{1+2b} = \frac{1}{8}\, t - \frac{5}{24}\, t^3, \qquad M_1(t, \alpha) = \sum_{b=0}^1 z_{1,b}(\alpha)\, t^{1+2b} = \left( -\frac{3}{8} + \alpha \right) t + \frac{7}{24}\, t^3. \tag{8.1} \]
Further one needs the following values (calculated from the known properties of Gamma functions):
\[ \frac{\Gamma'(1/2)}{\Gamma(1/2)} = -(\gamma + 2 \log 2), \qquad \frac{\Gamma'(3/2)}{\Gamma(3/2)} = 2 - (\gamma + 2 \log 2). \]
Finally one observes $\alpha_0 = 0$ in this setting. This easily leads to the first formula in the statement of the corollary. The second formula follows from the first by
\[ \zeta_{0,N}(s) = 2\, \zeta_R(s), \]


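where the factor 2 comes from the fact that the eigenvalues $n^2$ of the Laplacian $\Delta_{k=0,S^1}$ are of multiplicity two for $n \neq 0$. The Riemann zeta function has the following special values:
\[ \zeta_R'(0) = -\frac{1}{2} \log 2\pi, \qquad \mathrm{Res}\, \zeta_R(1) = 1, \]
which gives the second formula.

Corollary 8.4. Let $M$ be a three-dimensional bounded generalized cone of length one over a closed oriented manifold $N$. Then the analytic torsion of $M$ is given by
\[ \log T(M) = \frac{\log 2}{2}\, \chi(N) - \frac{\log 3}{2} \dim H^0(N) + \frac{1}{2}\, \zeta_{0,N}'(0, 1/2) - \frac{1}{2}\, \zeta_{0,N}'(0, -1/2) + \frac{\log 2}{2}\, \mathrm{Res}\, \zeta_{0,N}(1) + \frac{1}{16}\, \mathrm{Res}\, \zeta_{0,N}(2). \]

Proof. In the three-dimensional case the general formula of Theorem 8.1 reduces to the following expression:
\[ \begin{aligned} \log T(M) ={}& \frac{\log 2}{2}\, \chi(N) - \frac{\log 3}{2} \dim H^0(N) + \frac{1}{2} \left( \zeta_{0,N}'(0, \alpha_0) - \zeta_{0,N}'(0, -\alpha_0) \right) + \alpha_0\, \mathrm{Res}\, \zeta_{0,N}(1) \left\{ \frac{\gamma}{2} + \frac{\Gamma'(1)}{\Gamma(1)} \right\} \\ &+ \frac{1}{4} \sum_{i=1}^2 \mathrm{Res}\, \zeta_{0,N}(i) \sum_{b=0}^i \left( z_{i,b}(-\alpha_0) - z_{i,b}(\alpha_0) \right) \frac{\Gamma'(b + i/2)}{\Gamma(b + i/2)}. \end{aligned} \]
Now we simply evaluate the last combinatorial sum by considering formulas from [BGKE, (3.6), (3.7)]:
\[ M_1(t, \alpha) = \sum_{b=0}^1 z_{1,b}(\alpha)\, t^{1+2b} = \left( -\frac{3}{8} + \alpha \right) t + \frac{7}{24}\, t^3, \]
\[ M_2(t, \alpha) = \sum_{b=0}^2 z_{2,b}(\alpha)\, t^{2+2b} = \left( -\frac{3}{16} + \frac{\alpha}{2} - \frac{\alpha^2}{2} \right) t^2 + \left( \frac{5}{8} - \frac{\alpha}{2} \right) t^4 - \frac{7}{16}\, t^6. \]
We further need the values
\[ \frac{\Gamma'(1)}{\Gamma(1)} = -\gamma, \qquad \frac{\Gamma'(1/2)}{\Gamma(1/2)} = -(\gamma + 2 \log 2), \qquad \frac{\Gamma'(2)}{\Gamma(2)} = 1 - \gamma, \qquad \frac{\Gamma'(3/2)}{\Gamma(3/2)} = 2 - (\gamma + 2 \log 2). \]
This leads together with $\alpha_0 = 1/2$ in the three-dimensional case to the following formula
\[ \begin{aligned} \log T(M) ={}& \frac{\log 2}{2}\, \chi(N) - \frac{\log 3}{2} \dim H^0(N) + \frac{1}{2} \left( \zeta_{0,N}'(0, 1/2) - \zeta_{0,N}'(0, -1/2) \right) - \frac{\gamma}{4}\, \mathrm{Res}\, \zeta_{0,N}(1) \\ &+ \frac{1}{4} \left( \mathrm{Res}\, \zeta_{0,N}(1) [\gamma + 2 \log 2] + \frac{1}{4}\, \mathrm{Res}\, \zeta_{0,N}(2) \right). \end{aligned} \tag{8.2} \]
Obvious cancellations in the formula above prove the result.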
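Several of the special values used in the proofs of Corollaries 8.3 and 8.4 can be confirmed with a few lines of arithmetic. The following sketch is a hedged numerical check rather than a derivation: it approximates $\Gamma'/\Gamma$ by a central difference of `math.lgamma`, evaluates the combinatorial factor of $\mathrm{Res}\, \zeta_{0,N}(1)$ from the coefficients in (8.1), and reproduces the value for $N = S^1$ together with the coefficient $\frac{\log 2}{2}$ of $\mathrm{Res}\, \zeta_{0,N}(1)$ in Corollary 8.4. The helper names and the step size are assumptions; the coefficient of $\mathrm{Res}\, \zeta_{0,N}(2)$ is deliberately not checked here.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

X1 = (1 / 8, -5 / 24)  # coefficients x_{1,b} of D_1(t), from (8.1)


def z1(alpha):
    """Coefficients z_{1,b}(alpha) of M_1(t, alpha), from (8.1)."""
    return (-3 / 8 + alpha, 7 / 24)


def digamma(x, h=1e-5):
    """Gamma'(x)/Gamma(x) via a central difference of log Gamma (approximation)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)


def factor_even(alpha):
    """sum_b (2 x_{1,b} - z_{1,b}(-alpha) - z_{1,b}(alpha)) * digamma(b + 1/2)."""
    return sum((2 * X1[b] - z1(-alpha)[b] - z1(alpha)[b]) * digamma(b + 0.5)
               for b in range(2))


def factor_odd(alpha):
    """sum_b (z_{1,b}(-alpha) - z_{1,b}(alpha)) * digamma(b + 1/2)."""
    return sum((z1(-alpha)[b] - z1(alpha)[b]) * digamma(b + 0.5) for b in range(2))


# Corollary 8.3 with N = S^1: zeta_{0,N} = 2 zeta_R and dim H^0 = 1.
zeta_prime_0 = 2 * (-0.5 * math.log(2 * math.pi))
residue_1 = 2 * 1.0
log_torsion_disc = 0.5 * math.log(2) + 0.5 * zeta_prime_0 - 0.25 * residue_1

# Corollary 8.4: total coefficient of Res zeta_{0,N}(1) with alpha_0 = 1/2.
coeff_res_1 = 0.5 * (EULER_GAMMA / 2 + digamma(1.0)) + 0.25 * factor_odd(0.5)

print(factor_even(0.0))                       # close to -2, so (1/8) * (-2) = -1/4
print(log_torsion_disc, 0.5 * (-math.log(math.pi) - 1))
print(coeff_res_1, math.log(2) / 2)
```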


9. Analytic Torsion of a Cone over $S^1$

The preceding computations reduce in the two-dimensional case simply to the computation of the analytic torsion of a disc. In order to deal with a generalized bounded cone in two dimensions, which is not simply a flat disc, we need to introduce an additional parameter in the Riemannian metric. So in two dimensions the setup is as follows: Let $M := (0, R] \times S^1$ with
\[ g^M = dx^2 \oplus \nu^{-2} x^2 g^{S^1} \]
be a bounded generalized cone over $S^1$ of angle $\mathrm{arcsec}(\nu)$ and length $R$, with a fixed orientation and with fixed parameter $\nu \geq 1$, as in Fig. 2 below.

Fig. 2. A bounded cone of angle $\mathrm{arcsec}(\nu)$, $\nu \geq 1$ and length $R$

The main result of our discussion in this part of the presentation is then the following theorem:

Theorem 9.1. The analytic torsion $T(M)$ of a bounded generalized cone $M$ of length $R$ and angle $\mathrm{arcsec}\, \nu > 0$ over $S^1$ is given by
\[ 2 \log T(M) = -\log(\pi R^2) + \log \nu - \frac{1}{\nu}. \]
This result corresponds precisely to the result obtained in Corollary 8.3 for the special case $\nu = 1$ (for $R = 1$). In fact this result can also be derived from [BGKE, Sect. 5]. This setup was considered by Spreafico in [S]. However [S] deals only with Dirichlet boundary conditions at the cone base. So we extend his approach to the Neumann boundary conditions in order to obtain an overall result for the analytic torsion of this specific cone manifold. The author was made aware of the work by Hartmann–Spreafico in [HS], where this setup, and the analytic torsion of cones over spheres in general, is discussed and computed.

We proceed as follows. Denote forms with compact support in the interior of $M$ by $\Omega_0^*(M)$. The associated de Rham complex is given by
\[ 0 \to \Omega_0^0(M) \xrightarrow{\ d_0\ } \Omega_0^1(M) \xrightarrow{\ d_1\ } \Omega_0^2(M) \to 0. \]
Consider the following maps:
\[ \Psi_0 : C_0^\infty((0, R), \Omega^0(S^1)) \to \Omega_0^0(M), \quad \phi \mapsto x^{-1/2} \phi, \]
\[ \Psi_2 : C_0^\infty((0, R), \Omega^1(S^1)) \to \Omega_0^2(M), \quad \phi \mapsto x^{1/2} \phi \wedge dx, \]


where φ is identified with its pullback to M under the natural projection π : (0, R] × N → N onto the second factor, and x is the canonical coordinate on (0, R]. We find d2 1 1 −1 t 0 2 2 := 0 d0 d0 0 = − 2 + 2 −ν ∂θ − on C0∞ ((0, R), 0 (S 1 )), dx x 4 d2 1 1 on C0∞ ((0, R), 1 (S 1 )), 2 := 2−1 d1 d1t 2 = − 2 + 2 −ν 2 ∂θ2 − dx x 4 where θ is the local variable on the one-dimensional sphere. In fact both maps 0 and 2 extend to isometries on the L 2 −completion of the spaces, by similar arguments as behind Proposition 2.1. Now consider the minimal extensions Dk := dk,min of the boundary operators dk in the de Rham complex (∗0 (M), d). This defines by [BL1, Lemma 3.1] a Hilbert complex (D, D), with Dk := D(Dk ). Put 0rel := 0−1 D0∗ D0 0 , 2rel := 2−1 D1 D1∗ 2 . The Laplacians 0rel , 2rel are spectrally equivalent to D0∗ D0 , D1 D1∗ , respectively. The boundary conditions for 0rel and 2rel at the cone base {1} × S 1 are determined in Proposition 2.5. In order to identify the boundary conditions for 0rel and 2rel at the cone singularity, observe that by [BL2, Theorem 3.7] the ideal boundary conditions for the de Rham complex are uniquely determined at the cone singularity. Further [BL2, Lemma 3.1] shows that the corresponding extension coincides with Friedrich’s extension at the cone singularity. We infer from [BS3, Theorem√6.1] that the elements in the domain of Friedrich’s extension are of the asymptotics O( x) as x → 0. Hence we find D(0rel )

√ 2 = {φ ∈ Hloc ((0, R] × S 1 )|φ(R) = 0, φ(x) = O( x) as x → 0}, D(2rel ) 2 = {φ ∈ Hloc ((0, R] × S 1 )|φ (R) −

√ 1 φ(R) = 0, φ(x) = O( x) as x → 0}. 2R

The first operator with Dirichlet boundary conditions at the cone base is already elaborated in [S]. We adapt their approach to deal with the second operator with generalized Neumann boundary conditions at the cone base. The scalar analytic torsion of the bounded generalized cone is then given in terms of both results
\[
2\log T(M) = \zeta'_{\Delta_2^{\mathrm{rel}}}(0) - \zeta'_{\Delta_0^{\mathrm{rel}}}(0).
\]

Note that the Laplacian $(-\partial_\theta^2)$ on $S^1$ has a discrete spectrum $n^2$, $n \in \mathbb{Z}$, where the eigenvalues $n^2$ are of multiplicity two, up to the eigenvalue $n^2 = 0$ of multiplicity one. Consider now a µ-eigenform φ of $\Delta_2^{\mathrm{rel}}$. Since eigenforms of $(-\partial_\theta^2)$ on $S^1$ are smooth, the projection of φ for any fixed x ∈ (0, R] onto some $n^2$-eigenspace of $(-\partial_\theta^2)$ maps again to $H^2_{\mathrm{loc}}((0,R]\times S^1)$, still satisfies the boundary conditions for $\mathcal{D}(\Delta_2^{\mathrm{rel}})$ and hence gives again an eigenform of $\Delta_2^{\mathrm{rel}}$. Hence for the purpose of spectrum computation we can assume without loss of generality the µ-eigenform φ to lie in an $n^2$-eigenspace of $(-\partial_\theta^2)$ for any fixed x ∈ (0, R]. This element φ, identified with its scalar part, is a solution to
\[
-\frac{d^2}{dx^2}\phi(x) + \frac{1}{x^2}\left(\nu^2 n^2 - \frac{1}{4}\right)\phi(x) = \mu^2\phi(x),
\]

subject to the relative boundary conditions. The general solution to the equation above is
\[
\phi(x) = c_1\sqrt{x}\,J_{\nu n}(\mu x) + c_2\sqrt{x}\,Y_{\nu n}(\mu x),
\]
where $J_{\nu n}(z)$ and $Y_{\nu n}(z)$ denote the Bessel functions of first and second kind. The boundary conditions at x = 0 are given by $\phi(x) = O(\sqrt{x})$ as x → 0 and consequently $c_2 = 0$. The boundary conditions at the cone base give
\[
\phi'(R) - \frac{1}{2R}\phi(R) = c_1\,\mu\sqrt{R}\,J'_{\nu n}(\mu R) = 0.
\]
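The identity behind the boundary condition at the cone base can be checked numerically. The following is only an illustrative sketch: the sample values of the order, µ and R are arbitrary assumptions, and the Bessel function is evaluated by a truncated power series written for this example, not taken from the paper.

```python
import math

def bessel_j(order, x, terms=40):
    """Truncated power series for the Bessel function J_order(x), x > 0."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.gamma(m + order + 1))
                  * (x / 2.0) ** (2 * m + order))
    return total

def bessel_j_prime(order, x):
    """Derivative via the recurrence J'_v = (J_{v-1} - J_{v+1}) / 2."""
    return 0.5 * (bessel_j(order - 1, x) - bessel_j(order + 1, x))

def phi(x, order, mu):
    """Regular solution phi(x) = sqrt(x) J_order(mu x), i.e. c1 = 1, c2 = 0."""
    return math.sqrt(x) * bessel_j(order, mu * x)

def boundary_expression(order, mu, R, h=1e-5):
    """Return (phi'(R) - phi(R)/(2R), mu sqrt(R) J'_order(mu R)).

    phi'(R) is approximated by a central difference with step h.
    """
    dphi = (phi(R + h, order, mu) - phi(R - h, order, mu)) / (2 * h)
    lhs = dphi - phi(R, order, mu) / (2 * R)
    rhs = mu * math.sqrt(R) * bessel_j_prime(order, mu * R)
    return lhs, rhs

# Hypothetical sample values: order plays the role of nu*n.
lhs, rhs = boundary_expression(order=1.5, mu=2.0, R=1.3)
print(lhs, rhs)
```

The two printed numbers agree up to the finite-difference error, which reflects the cancellation of the $\phi(R)/(2R)$ term against the derivative of the factor $\sqrt{x}$.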

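Since we are not interested in zero-eigenvalues, the relevant eigenvalues are by Corollary 5.2 given as follows:
\[
\lambda_{n,k} = \left(\frac{j_{\nu n,k}}{R}\right)^2
\]
with $j_{\nu n,k}$ being the positive zeros of $J'_{\nu n}(z)$. We obtain in view of the multiplicities of the $n^2$-eigenvalues of $(-\partial_\theta^2)$ on $S^1$ for the zeta-function
\[
\zeta_{\Delta_2^{\mathrm{rel}}}(s) = \sum_{k=1}^{\infty}\lambda_{0,k}^{-s} + 2\sum_{n,k=1}^{\infty}\lambda_{n,k}^{-s} = \sum_{k=1}^{\infty}\left(\frac{j_{0,k}}{R}\right)^{-2s} + 2R^{2s}\sum_{n,k=1}^{\infty} j_{\nu n,k}^{-2s}.
\]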

The derivative at zero for the first summand follows by a direct application of [S, Sect. 3]:

Lemma 9.2.
\[
K := \frac{d}{ds}\Big|_{s=0}\sum_{k=1}^{\infty}\left(\frac{j_{0,k}}{R}\right)^{-2s} = -\frac{1}{2}\log 2\pi - \frac{3}{2}\log R + \log 2.
\]

Proof. The values $j_{0,k}$ are zeros of $J'_0(z)$. Since $J'_0(z) = -J_1(z)$ they are also zeros of $J_1(z)$. Using [S, Lemma 1 (b)] and its application on [S, p.361] we obtain in the notation therein
\[
\frac{d}{ds}\Big|_{s=0}\sum_{k=1}^{\infty}\left(\frac{j_{0,k}}{R}\right)^{-2s} = -B(1) + T(0,1) = -\frac{1}{2}\log 2\pi - \frac{3}{2}\log R + \log 2.
\]

Now we turn to the discussion of the second summand. We put $z(s) = \sum_{n,k=1}^{\infty} j_{\nu n,k}^{-2s}$ for $\mathrm{Re}(s) \gg 0$. This series is well-defined for Re(s) sufficiently large by the general result in Theorem 4.1. Due to uniform convergence of integrals and series we obtain with computations similar to (5.2) the following integral representation:
\[
z(s) = \frac{s^2}{\Gamma(s+1)}\int_0^{\infty} t^{s-1}\,\frac{1}{2\pi i}\int_{\Lambda_c}\frac{e^{-\lambda t}}{-\lambda}\,T(s,\lambda)\,d\lambda\,dt, \tag{9.1}
\]
\[
T(s,\lambda) = \sum_{n=1}^{\infty}(\nu n)^{-2s}\,t_n(\lambda), \qquad t_n(\lambda) = -\sum_{k=1}^{\infty}\log\left(1 - \frac{(\nu n)^2\lambda}{j_{\nu n,k}^2}\right), \tag{9.2}
\]
where $\Lambda_c := \{\lambda \in \mathbb{C} \mid |\arg(\lambda - c)| = \pi/4\}$ with c > 0 being any fixed positive number, smaller than the lowest non-zero eigenvalue of $\Delta_2^{\mathrm{rel}}$. We proceed with explicit calculations by presenting $t_n(\lambda)$ in terms of special functions. Using the infinite product expansion (6.13) we obtain the following result for the derivative of the modified Bessel function of first kind:
\[
I'_{\nu n}(\nu n z) = \frac{(\nu n z)^{\nu n - 1}}{2^{\nu n}\,\Gamma(\nu n)}\prod_{k=1}^{\infty}\left(1 + \frac{(\nu n z)^2}{j_{\nu n,k}^2}\right),
\]
where $j_{\nu n,k}$ denotes the positive zeros of $J'_{\nu n}(z)$. Putting $z = \sqrt{-\lambda}$ we get
\[
t_n(\lambda) = -\sum_{k=1}^{\infty}\log\left(1 - \frac{(\nu n)^2\lambda}{j_{\nu n,k}^2}\right) = -\sum_{k=1}^{\infty}\log\left(1 + \frac{(\nu n z)^2}{j_{\nu n,k}^2}\right) = -\log I'_{\nu n}(\nu n z) + \log(\nu n z)^{\nu n - 1} - \log 2^{\nu n}\Gamma(\nu n). \tag{9.3}
\]

The associated function T(s, λ) from (9.2) is however not analytic at s = 0. The 1/νn-dependence in $t_n(\lambda)$ causes non-analytic behaviour. We put
\[
t_n(\lambda) =: p_n(\lambda) + \frac{1}{\nu n} f(\lambda), \qquad P(s,\lambda) = \sum_{n=1}^{\infty}(\nu n)^{-2s}\,p_n(\lambda). \tag{9.4}
\]
To get explicit expressions for P(s, λ) and f(λ) we use the asymptotic expansion of the Bessel-functions for large order from [O], in analogy to Lemma 6.4. We obtain in the notation of (6.18) with $z = \sqrt{-\lambda}$ and $t = 1/\sqrt{1-\lambda}$:
\[
f(\lambda) = -M_1(t,0) = \frac{3}{8}\,t - \frac{7}{24}\,t^3,
\]
where we inferred the explicit form of $M_1(t,0)$ from (8.1). We obtain for $p_n(\lambda)$
\[
p_n(\lambda) = -\log I'_{\nu n}(\nu n z) + \log(\nu n z)^{\nu n - 1} - \log 2^{\nu n}\Gamma(\nu n) - \frac{1}{\nu n}\left(\frac{3}{8}\,t - \frac{7}{24}\,t^3\right).
\]

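As in Lemma 6.5 we compute the contribution coming from f(λ).

Lemma 9.3.
\[
\int_0^{\infty} t^{s-1}\,\frac{1}{2\pi i}\int_{\Lambda_c}\frac{e^{-\lambda t}}{-\lambda}\,f(\lambda)\,d\lambda\,dt = \frac{1}{12\sqrt{\pi}}\,\Gamma\!\left(s + \frac{1}{2}\right)\left(\frac{1}{s} - 7\right). \tag{9.5}
\]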


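Proof. Observe from [GRA, 8.353.3] by substituting the new variable x = λ − 1,
\[
\frac{1}{2\pi i}\int_{\Lambda_c}\frac{e^{-\lambda t}}{-\lambda}\,\frac{1}{(1-\lambda)^a}\,d\lambda = -\frac{1}{2\pi i}\,e^{-t}\int_{\Lambda_{c-1}}\frac{e^{-xt}}{x+1}\,\frac{1}{(-x)^a}\,dx = \frac{1}{\pi}\sin(\pi a)\,\Gamma(1-a)\,\Gamma(a,t).
\]
Using now the relation between the incomplete Gamma function and the probability integral
\[
\int_0^{\infty} t^{s-1}\,\Gamma(a,t)\,dt = \frac{\Gamma(s+a)}{s},
\]
we finally obtain
\begin{align*}
\int_0^{\infty} t^{s-1}\,\frac{1}{2\pi i}\int_{\Lambda_c}\frac{e^{-\lambda t}}{-\lambda}\,f(\lambda)\,d\lambda\,dt
&= \frac{3}{8\pi}\sin\!\left(\frac{\pi}{2}\right)\Gamma\!\left(1 - \frac{1}{2}\right)\frac{\Gamma(s+1/2)}{s} - \frac{7}{24\pi}\sin\!\left(\frac{3\pi}{2}\right)\Gamma\!\left(1 - \frac{3}{2}\right)\frac{\Gamma(s+3/2)}{s} \\
&= \frac{3}{8\sqrt{\pi}}\,\frac{\Gamma(s+1/2)}{s} - \frac{7}{12\sqrt{\pi}}\,\frac{\Gamma(s+3/2)}{s} \\
&= \frac{1}{\sqrt{\pi}}\,\Gamma\!\left(s + \frac{1}{2}\right)\left[\frac{3}{8s} - \frac{7}{12s}\left(s + \frac{1}{2}\right)\right] \\
&= \frac{1}{12\sqrt{\pi}}\,\Gamma\!\left(s + \frac{1}{2}\right)\left(\frac{1}{s} - 7\right).
\end{align*}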
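The last two steps of this computation only use $\Gamma(s + 3/2) = (s + 1/2)\,\Gamma(s + 1/2)$ and elementary algebra. A minimal numerical sanity check, assuming nothing beyond the two Gamma-function expressions displayed in the proof (the function names and sample values of s are hypothetical), could look as follows.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def two_term_form(s):
    """3 Gamma(s+1/2) / (8 sqrt(pi) s) - 7 Gamma(s+3/2) / (12 sqrt(pi) s)."""
    return (3 * math.gamma(s + 0.5) / (8 * SQRT_PI * s)
            - 7 * math.gamma(s + 1.5) / (12 * SQRT_PI * s))

def closed_form(s):
    """Gamma(s+1/2) (1/s - 7) / (12 sqrt(pi)), the right-hand side of Lemma 9.3."""
    return math.gamma(s + 0.5) * (1.0 / s - 7.0) / (12 * SQRT_PI)

# Sample real values of s > 0, chosen arbitrarily for illustration.
for s in (0.3, 1.0, 2.5):
    print(s, two_term_form(s), closed_form(s))
```

For s = 1 both expressions evaluate to −1/4, since $\Gamma(3/2) = \sqrt{\pi}/2$.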

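By classical asymptotics of Bessel functions for large arguments and fixed order
\[
I'_{\nu n}(\nu n z) = \frac{e^{\nu n z}}{\sqrt{2\pi\nu n z}}\left(1 + O\!\left(\frac{1}{z}\right)\right),
\]
where the region of validity is preserved (see the discussion in the higher-dimensional case in Proposition 6.6), we obtain for $p_n(\lambda)$ from (9.5),
\[
p_n(\lambda) = -\nu n\sqrt{-\lambda} + \left(\frac{1}{4} + \frac{1}{2}(\nu n - 1)\right)\log(-\lambda) + \frac{1}{2}\log 2\pi\nu n + (\nu n - 1)\log\nu n - \log\big(2^{\nu n}\Gamma(\nu n)\big) + O\big((-\lambda)^{-1/2}\big).
\]
Following [S, Sect. 4.2] we reorder the summands in the above expression to get
\[
p_n(\lambda) = -\nu n\sqrt{-\lambda} + a_n\log(-\lambda) + b_n + O\big((-\lambda)^{-1/2}\big),
\]
where the interesting terms are clear from above. We set
\begin{align*}
A(s) &:= \sum_{n=1}^{\infty}(\nu n)^{-2s}a_n = \frac{1}{2}\,\nu^{-2s+1}\zeta_R(2s-1) - \frac{1}{4}\,\nu^{-2s}\zeta_R(2s), \\
B(s) &:= \sum_{n=1}^{\infty}(\nu n)^{-2s}b_n = \frac{1}{2}\,\nu^{-2s}\log\!\left(\frac{2\pi}{\nu}\right)\zeta_R(2s) + \nu^{-2s+1}\log\!\left(\frac{\nu}{2}\right)\zeta_R(2s-1) \\
&\qquad - \nu^{-2s+1}\zeta'_R(2s-1) + \frac{1}{2}\,\nu^{-2s}\zeta'_R(2s) - \sum_{n=1}^{\infty}(\nu n)^{-2s}\log\Gamma(\nu n).
\end{align*}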

Following the approach of M. Spreafico it remains to evaluate P(s, 0) defined in (9.4) in order to obtain a closed expression for the function z(s).

Lemma 9.4.
\[
P(s,0) = -\frac{1}{12}\,\nu^{-2s-1}\zeta_R(2s+1).
\]

Proof. Recall the asymptotic behaviour of Bessel functions of second order for small arguments,
\[
I_{\nu n}(x) \sim \frac{1}{\Gamma(\nu n + 1)}\left(\frac{x}{2}\right)^{\nu n} \ \Rightarrow\ I'_{\nu n}(x) \sim \frac{\nu n}{2\,\Gamma(\nu n + 1)}\left(\frac{x}{2}\right)^{\nu n - 1}.
\]
Further observe that as λ → 0 we obtain with $z = \sqrt{-\lambda}$ and $t = 1/\sqrt{1 + z^2}$,
\[
M_1(t,0) = -\frac{3}{8}\,t + \frac{7}{24}\,t^3 \xrightarrow{\ \lambda \to 0\ } -\frac{3}{8} + \frac{7}{24} = -\frac{1}{12}.
\]
Using these two facts we obtain from (9.5) for $p_n(0)$,
\[
p_n(0) = -\log\nu n + \log\Gamma(\nu n + 1) - \log\Gamma(\nu n) - \frac{1}{12\nu n} = -\frac{1}{12\nu n}
\]
\[
\Rightarrow\ P(s,0) = \sum_{n=1}^{\infty}(\nu n)^{-2s}\,p_n(0) = -\frac{1}{12}\,\nu^{-2s-1}\zeta_R(2s+1).
\]

Now we have all the ingredients together, since by [S, p. 366] and Lemma 9.3 the function z(s) is given as follows:
\[
z(s) = \frac{s}{\Gamma(s+1)}\left[\gamma A(s) - B(s) - \frac{1}{s}A(s) + P(s,0)\right] + \frac{s^2}{\Gamma(s+1)}\,\nu^{-2s-1}\zeta_R(2s+1)\,\frac{1}{12\sqrt{\pi}}\,\Gamma\!\left(s + \frac{1}{2}\right)\left(\frac{1}{s} - 7\right) + \frac{s^2}{\Gamma(s+1)}\,h(s),
\]
where the last term vanishes with its derivative at s = 0. We are interested in the value of the function itself z(0) and its derivative z'(0). In order to compute the value of z(0) recall the fact that close to 1 the Riemann zeta function behaves as follows:
\[
\zeta_R(2s+1) = \frac{1}{2s} + \gamma + O(s), \quad s \to 0.
\]
This implies
\[
\frac{s^2}{\Gamma(s+1)}\,\nu^{-2s-1}\zeta_R(2s+1)\,\frac{1}{12\sqrt{\pi}}\,\Gamma\!\left(s + \frac{1}{2}\right)\left(\frac{1}{s} - 7\right) \to \frac{1}{24\nu}, \quad s \to 0.
\]
Furthermore note that the function
\[
\eta(s,\nu) := \sum_{n=1}^{\infty}(\nu n)^{-2s}\log\Gamma(\nu n + 1) - \frac{1}{12}\,\nu^{-2s-1}\zeta_R(2s+1),
\]
introduced in [S, p.366] is regular at s = 0, cf. [S, Sect. 4.3]. Hence γA(s) − B(s) + P(s, 0) is regular at s = 0 and we obtain straightforwardly:
\[
z(0) = -A(0) + \frac{1}{24\nu} = -\frac{1}{2}\,\nu\,\zeta_R(-1) + \frac{1}{4}\,\zeta_R(0) + \frac{1}{24\nu}.
\]
In view of the explicit values $\zeta_R(-1) = -\frac{1}{12}$ and $\zeta_R(0) = -\frac{1}{2}$ we find
\[
z(0) = \frac{\nu}{24} + \frac{1}{24\nu} - \frac{1}{8}. \tag{9.6}
\]
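The arithmetic leading to (9.6) can be reproduced with exact rational numbers. The sketch below is only an illustration: it hard-codes the two zeta values quoted in the text and the expression of A(0) in terms of them, and the function names are hypothetical.

```python
from fractions import Fraction

# Values of the Riemann zeta function quoted in the text.
ZETA_AT_MINUS_ONE = Fraction(-1, 12)
ZETA_AT_ZERO = Fraction(-1, 2)

def a_at_zero(nu):
    """A(0) = (1/2) nu zeta_R(-1) - (1/4) zeta_R(0)."""
    return Fraction(1, 2) * nu * ZETA_AT_MINUS_ONE - Fraction(1, 4) * ZETA_AT_ZERO

def z_at_zero(nu):
    """z(0) = -A(0) + 1/(24 nu)."""
    return -a_at_zero(nu) + Fraction(1, 24) / nu

def closed_form(nu):
    """Right-hand side of (9.6): nu/24 + 1/(24 nu) - 1/8."""
    return nu / 24 + 1 / (24 * nu) - Fraction(1, 8)

# Sample rational values of nu, chosen arbitrarily.
for nu in (Fraction(1), Fraction(3, 2), Fraction(5)):
    print(nu, z_at_zero(nu), closed_form(nu))
```

For ν = 1 both expressions give −1/24.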

Lemma 9.5.
\[
z'(0) = \eta(0,\nu) + \frac{1}{2}\log\nu - \frac{1}{4}\log 2\pi - \frac{1}{12}\,\nu\log 2 + \frac{1}{12\nu}\left(\gamma - \log 2\nu - \frac{7}{2}\right),
\]
where $\eta(s,\nu) = \sum_{n=1}^{\infty}(\nu n)^{-2s}\log\Gamma(\nu n + 1) - \frac{1}{12}\,\nu^{-2s-1}\zeta_R(2s+1)$.

Proof. We compute z'(0) from the above expression for z(s), using $\Gamma'(1/2) = -\sqrt{\pi}\,(\gamma + 2\log 2)$. Straightforward computations lead to:
\[
z'(0) = P(0,0) - A'(0) - B(0) + \frac{1}{12\nu}\left(\gamma - \log 2\nu - \frac{7}{2}\right). \tag{9.7}
\]
The statement follows with η(s, ν) being defined precisely as in [S, Sect. 4.2].

Now we are able to provide a result for the derivative of the zeta function $\zeta'_{\Delta_2^{\mathrm{rel}}}(0)$. Recall
\[
\zeta_{\Delta_2^{\mathrm{rel}}}(s) = \sum_{k=1}^{\infty}\left(\frac{j_{0,k}}{R}\right)^{-2s} + 2R^{2s}\sum_{n,k=1}^{\infty} j_{\nu n,k}^{-2s}.
\]
With K defined in Lemma 9.2 and $z(s) = \sum_{n,k=1}^{\infty} j_{\nu n,k}^{-2s}$ we get
\[
\zeta'_{\Delta_2^{\mathrm{rel}}}(0) = K + 4z(0)\log R + 2z'(0).
\]
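For completeness, the last formula is just the product rule applied to the second summand of the zeta function recalled above. A short derivation sketch, using only the definitions of K and z(s) from the text:

```latex
\begin{align*}
\zeta_{\Delta_2^{\mathrm{rel}}}(s)
  &= \sum_{k=1}^{\infty}\Big(\frac{j_{0,k}}{R}\Big)^{-2s} + 2R^{2s}z(s),\\
\frac{d}{ds}\Big|_{s=0}\big(2R^{2s}z(s)\big)
  &= 2\big(2\log R\cdot R^{2s}z(s) + R^{2s}z'(s)\big)\Big|_{s=0}
   = 4z(0)\log R + 2z'(0),\\
\zeta'_{\Delta_2^{\mathrm{rel}}}(0)
  &= K + 4z(0)\log R + 2z'(0).
\end{align*}
```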

It remains to compare each summand to the corresponding results for $\zeta'_{\Delta_0^{\mathrm{rel}}}(0)$ obtained in [S]. Using Lemma 9.2, (9.6) and (9.5) we finally arrive after several cancellations at Theorem 9.1,
\[
2\log T(M) = \zeta'_{\Delta_2^{\mathrm{rel}}}(0) - \zeta'_{\Delta_0^{\mathrm{rel}}}(0) = -\log(\pi R^2) + \log\nu - \frac{1}{\nu}.
\]

Acknowledgements. The results of this article were obtained during the author’s Ph.D. studies at Bonn University, Germany. The author would like to thank his thesis advisor Prof. Matthias Lesch for his support and useful discussions. The author was supported by the German Research Foundation as a scholar of the Graduiertenkolleg 1269 “Global Structures in Geometry and Analysis”.

References

[AS] Abramowitz, M., Stegun, I.A. (eds.): Handbook of math. Functions. New York: Dover, 1965
[BGKE] Bordag, M., Geyer, B., Kirsten, K., Elizalde, E.: Zeta function determinant of the laplace operator on the d-dimensional ball. Commun. Math. Phys. 179(1), 215–234 (1996)
[BKD] Bordag, M., Kirsten, K., Dowker, J.S.: Heat-kernels and functional determinants on the generalized cone. Commun. Math. Phys. 182, 371–394 (1996)
[BL1] Brüning, J., Lesch, M.: Hilbert complexes. J. Funct. Anal. 108, 88–132 (1992)
[BL2] Brüning, J., Lesch, M.: Kähler-hodge theory for conformal complex cones. Geom. Funct. Anal. 3, 439–473 (1993)
[BM] Brüning, J., Ma, X.: An anomaly-formula for ray-singer metrics on manifolds with boundary. Geom. Funct. An. 16(4), 767–837 (2006)
[Br] Brüning, J.: L 2 -index theorems for certain complete manifolds. J. Diff. Geom. 32, 491–532 (1990)
[BS] Brüning, J., Seeley, R.: An index theorem for first order regular singular operators. Amer. J. Math 110, 659–714 (1988)
[BS2] Brüning, J., Seeley, R.: Regular singular asymptotics. Adv. Math. 58, 133–148 (1985)
[BS3] Brüning, J., Seeley, R.: The resolvent expansion for second order regular singular operators. J. Funct. Anal. 73, 369–429 (1987)
[BZ] Bismut, J.-M., Zhang, W.: Milnor and ray-singer metrics on the equivariant determinant of a flat vector bundle. Geom. Funct. Anal. 4(2), 136–212 (1994)
[BZ1] Bismut, J.-M., Zhang, W.: An extension of a Theorem by Cheeger and Müller. Asterisque 205, Paris: Soc. Math. Francaise, 1992
[C] Callias, C.: The resolvent and the heat kernel for some singular boundary problems. Comm. Part. Diff. Eq. 13(9), 1113–1155 (1988)
[Ch] Cheeger, J.: Analytic torsion and reidemeister torsion. Proc. Nat. Acad. Sci. USA 74, 2651–2654 (1977)
[Ch1] Cheeger, J.: On the spectral geometry of spaces with conical singularities. Proc. Nat. Acad. Sci. 76, 2103–2106 (1979)
[Ch2] Cheeger, J.: Spectral geometry of singular riemannian spaces. J. Diff. Geom. 18, 575–657 (1983)
[Dar] Dar, A.: Intersection r-torsion and analytic torsion for pseudo-manifolds. Math. Z. 194, 193–216 (1987)
[DK] Dowker, J.S., Kirsten, K.: Spinors and forms on the ball and the generalized cone. Comm. Anal. Geom. 7(3), 641–679 (1999)
[Do] Dodziuk, J.: Finite-difference approach to the hodge theory of harmonic forms. Amer. J. Math. 98(1), 79–104 (1976)
[Fr] Franz, W.: Über die Torsion einer Überdeckung. J. reine angew. Math. 173, 245–254, (1935)
[GRA] Gradsteyn, I.S., Ryzhik, I.M., Alan Jeffrey: Table of integrals, Series and Products. 5th edition, London-NewYork: Academic Press, Inc., 1994
[H] Hawking, S.W.: Zeta-function regularization of path integrals in curved space time. Commun. Math. Phys. 55, 133–148 (1977)
[HS] Hartmann, L., Spreafico, M.: The Analytic torsion of a cone over a sphere. http://arxiv.org/abs/0902.3887v1[math.DG], 2009
[L] Lesch, M.: Determinants of regular singular sturm-liouville operators. Math. Nachr. 194, 139–170 (1998)
[L1] Lesch, M.: Operators of Fuchs Type, Conical Singularities, and Asymptotic Methods. Berlin: Band 136, Teubner, 1997
[L3] Lesch, M.: The Analytic Torsion of the Model Cone. Unpublished notes, Columbus University, 1994
[LR] Lott, J., Rothenberg, M.: Analytic torsion for group actions. J. Diff. Geom. 34, 431–481 (1991)
[Lü] Lück, W.: Analytic and topological torsion for manifolds with boundary and symmetry. J. Diff. Geom. 37, 263–322 (1993)
[M] Mooers, E.: Heat kernel asymptotics on manifolds with conic singularities. J. Anal. Math. 78, 1–36 (1999)
[Mi] Milnor, J.: Whitehead torsion. Bull. Ams. 72, 358–426 (1966)
[Mu] Müller, W.: Analytic torsion and r-torsion for unimodular representations. J. Amer. Math. Soc. 6(3), 721–753 (1993)
[Mu1] Müller, W.: Analytic torsion and r-torsion of Riemannian manifolds. Adv. Math. 28, 233–305 (1978)
[Nic] Nicolaescu, L.I.: The Reidemeister Torsion of 3-Manifolds. de Gruyter Studies in Mathematics, Vol. 30, Berlin: de Gruter, 2003
[O] Olver, F.W.: Asymptotics and Special Functions. AKP Classics, Wellestey, MA: A.K. Peters, Ltd, 1997
[P] Paquet, L.: Problemes mixtes pour le systeme de maxwell. Ann. Fac. Sci. Toulouse IV, 103–141 (1982)
[Re1] Reidemeister, K.: Die klassifikation der linsenräume. Abhandl. Math. Sem. Hamburg 11, 102–109 (1935)
[Re2] Reidemeister, K.: Überdeckungen von komplexen. J. Reine Angew. Math. 173, 164–173 (1935)
[Rh] de Rham, G.: Complexes a automorphismes et homeomorphie differentiable. Ann. Inst. Fourier 2, 51–67 (1950)
[RS] Ray, D.B., Singer, I.M.: R-torsion and the laplacian on riemannian manifolds. Adv. Math. 7, 145–210 (1971)
[S] Spreafico, M.: Zeta function and regularized determinant on a disc and on a cone. J. Geom. Phys. 54, 355–371 (2005)
[S1] Spreafico, M.: Zeta invariants for Dirichlet series. Pac. Math. J. 224(1), 185–200 (2006)
[S2] Spreafico, M.: The determinant of the laplacian on forms and a torsion-type invariant for cone on the circle. FJMS 29(2), 353–368 (2008)
[BV] Vertman, B.: Functional determinants for regular-Singular laplace-type operators. accepted for J. Math. Phys. http://arxiv.org/abs/0808.0443v2[math-ph], 2008
[V] Vishik, S.: Generalized ray-singer conjecture i. a manifold with smooth boundary. Commun. Math. Phys. 167, 1–102 (1995)
[W] Weidmann, J.: Spectral Theory of Ordinary Differential Equations. Lecture Notes in Math. 1258, Berlin-Heidelberg: Springer-Verlag, 1987
[W2] Weidmann, J.: Linear Operators in Hilbert Spaces. New York: Springer-Verlag, 1980
[Wh] Whitehead, J.H.: Simple homotopy types. Amer. J. Math. 72, 1–57 (1950)
[WT] Watson, G.N.: A Treatise on the Theory of Bessel Functions. Cambridge: Camb. Univ. Press, 1922

Communicated by S. Zelditch

Commun. Math. Phys. 290, 861–870 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0726-8

Communications in

Mathematical Physics

Localization and Geometric Depletion of Vortex-Stretching in the 3D NSE Zoran Grujić Department of Mathematics, University of Virginia, Charlottesville, VA 22904, USA. E-mail: [email protected]; [email protected] Received: 12 September 2008 / Accepted: 16 October 2008 Published online: 13 January 2009 – © Springer-Verlag 2009

Abstract: Vortex-stretching and, consequently, the evolution of the vorticity is localized on an arbitrarily small space-time cylinder. This yields a complete localization of the geometric condition(s) for the regularity involving coherence of the vorticity direction. In particular, it implies the regularity of any geometrically constrained Leray solution independently of the type of the spatial domain or the boundary conditions.

1. Introduction

Numerical simulations and experiments reveal that the regions of intense vorticity self-organize in quasi-low-dimensional sparse coherent structures, e.g., vortex sheets and vortex tubes. An intriguing and challenging problem in mathematical analysis of the incompressible flows is to explain the influence of geometry of the regions of high vorticity magnitude on smoothness of the flow. The geometric approach to studying smoothness/avoiding singularity formation in the 3D incompressible flows was pioneered in [Co] where Constantin derived a singular integral representation of the stretching factor in the evolution of the vorticity magnitude. The representation formula involves an explicit geometric kernel which is depleted by both local alignment and anti-alignment of the vorticity direction. Hence, local coherence of the vorticity direction, a purely geometric property, depletes the nonlinearity. This type of geometric depletion of the nonlinearity was subsequently exploited in [CoFe]; the main result states that as long as the vorticity direction (in the region of intense vorticity) is Lipschitz-coherent, no blow-up can occur and the flow remains smooth. Following [CoFe], it was shown in [daVeigaBe1] that $\frac{1}{2}$-Hölder coherence suffices. In a related work [GrRu], a class of hybrid geometric-analytic conditions for avoiding singularity formation was obtained containing a purely geometric $\frac{1}{2}$-Hölder coherence and a purely analytic Beale-Kato-Majda condition (time-integrability of the $L^\infty$-norm of the vorticity) as the endpoint cases.


It is possible to detect more structure, and, in particular, more cancelation properties in the integral form of the vortex-stretching term $\int_{\mathbb{R}^3}(\omega\cdot\nabla)u\cdot\omega = \int_{\mathbb{R}^3}\alpha|\omega|^2$ induced by Constantin’s representation of the vortex stretching factor α. This was realized in [RuGr] and two types of results followed. One stating that concentration of the vorticity on small, sparsely populated vortex structures depletes the nonlinearity preventing a blow-up, and the other stating that a certain isotropy condition on curl ω = −Δu induces enough cancelations in the integral form of the vortex-stretching term to prevent singularity formation. It is worth noting that the Laplacian is a rotationally invariant operator; hence, this is essentially a condition on isotropy of the velocity field. A different method of taking into account sparseness/thinness of the regions of intense vorticity is based on utilizing local-in-time spatial analyticity properties of the solutions to the 3D NSE via a plurisubharmonic measure maximum principle in $\mathbb{C}^3$ [Gr]. The result states that local existence of a thin direction (on a scale comparable to a localized vorticity version of the Kolmogorov dissipation scale) in the region of high vorticity magnitude suffices to control the $L^\infty$-norm of the vorticity preventing the blow-up. In all of the aforementioned results, the geometric conditions for preventing singularity formation, although being local in nature, were assumed uniformly on a time interval and uniformly in the region of intense vorticity throughout the spatial domain $\mathbb{R}^3$. In a recent work [GrZh], it was shown that it is possible to localize the conditions on coherence of the vorticity direction derived in [GrRu], and in particular, the purely geometric $\frac{1}{2}$-Hölder coherence, to an arbitrarily small space-time cylinder in $\mathbb{R}^3 \times (0, T)$. At this point, a natural question arises: Is it possible to obtain analogous results in the cases when the spatial domain is not the whole space $\mathbb{R}^3$, e.g., in the case of standard nonslip (Dirichlet) boundary conditions on bounded and unbounded domains?
In the case of the half-space $\mathbb{R}^3_+$ with the slip boundary conditions, $u_3 = 0$, $\frac{\partial u_j}{\partial x_3} = 0$, $1 \le j \le 2$, it was shown in [daVeiga1] that $\frac{1}{2}$-Hölder coherence of the vorticity direction suffices to prevent the singularity formation. In the case of the non-slip boundary conditions on smooth bounded domains, it was shown in [daVeiga2] that an analogous result is possible under a certain assumption on the control of the normal derivative of the vorticity magnitude at the boundary. A key step in the proofs was to obtain a version of the Biot-Savart law, i.e., to express the velocity in terms of the vorticity, for the type of the domain/boundary conditions in question; this led to a suitable representation of the vortex-stretching term and ultimately to bounds on the enstrophy (the $L^2$-norm of the vorticity). In a very recent work [daVeigaBe2], the authors showed that $\frac{1}{2}$-Hölder coherence of the vorticity direction is a sufficient condition for the regularity in the case of the free boundary-type boundary conditions, u · n = 0, ω × n = 0 (n is the exterior unit normal), on a bounded smooth domain. (In the case of the half-space, this type of boundary conditions reduces to the slip boundary conditions; hence, this result can be viewed as an extension of the work in [daVeiga1] to an arbitrary bounded smooth domain.) In this paper, a localized representation formula for the vortex-stretching term is obtained leading to a control of the localized enstrophy for any Leray solution on an arbitrarily small space-time cylinder $Q_\delta(x_0, t_0) = B_\delta(x_0) \times (t_0 - \delta^2, t_0)$. This then yields a localization of the vorticity direction coherence conditions for the regularity to $Q_\delta(x_0, t_0)$. The proof merges the localization of the transport of the vorticity by the velocity previously obtained in [GrZh] with the newly obtained localization of


vortex-stretching. (The localization of vortex-stretching presented in [GrZh] utilized the singular integral representation formula for the vortex-stretching factor α over the whole space $\mathbb{R}^3$ which was then split into small and large scales.) This implies the regularity of any Leray solution with a $\frac{1}{2}$-Hölder coherent vorticity direction field independently of the type of the spatial domain or the boundary conditions.

2. Notation

The Navier-Stokes equations modeling the flow of a viscous incompressible fluid on a space-time domain Ω × (0, T) (Ω an open subset of $\mathbb{R}^3$) read
\[
u_t - \nu\Delta u + (u\cdot\nabla)u + \nabla p = 0, \tag{1}
\]
supplemented with the incompressibility condition div u = 0, where the vector field u = u(x, t) is the velocity of the fluid, the scalar field p = p(x, t) is the pressure and the positive constant ν is the viscosity. (In what follows, the viscosity will be set to 1; the results in the general case can be recovered by scaling.) Taking the curl of (1) yields
\[
\omega_t - \Delta\omega + (u\cdot\nabla)\omega = (\omega\cdot\nabla)u, \tag{2}
\]
where ω = curl u is the vorticity. The right-hand side of (2), (ω · ∇)u, is the vortex-stretching term which holds the key to understanding the phenomenon (or the lack thereof) of the singularity formation in the flow. (The other component of the nonlinearity, (u · ∇)ω, is the part of the material derivative of the vorticity, i.e., of the transport of the vorticity by the velocity, and can be controlled.) Let ξ denote the vorticity direction field, $\xi = \frac{\omega}{|\omega|}$, and S the symmetric part of the velocity gradient, $S = \frac{1}{2}\big(\nabla u + (\nabla u)^t\big)$. Define a key quantity α by α = Sξ · ξ. Then, a direct computation yields
\[
(\omega\cdot\nabla)u\cdot\omega = S\omega\cdot\omega = \alpha|\omega|^2
\]
and
\[
(\partial_t + u\cdot\nabla - \Delta)|\omega|^2 + |\nabla\omega|^2 = \alpha|\omega|^2,
\]
i.e., α represents the stretching factor in the evolution of the vorticity magnitude. Constantin (cf. [Co]) derived the following representation formula for α,
\[
\alpha(x) = \frac{3}{4\pi}\,\mathrm{P.V.}\int D\big(\hat{y}, \xi(x+y), \xi(x)\big)\,|\omega(x+y)|\,\frac{1}{|y|^3}\,dy,
\]
where $\hat{y}$ is the unit vector in the y-direction and the geometric kernel D is defined by
\[
D(e_1, e_2, e_3) = (e_1\cdot e_3)\,\big(e_1\cdot(e_2\times e_3)\big)
\]
for arbitrary unit vectors $e_1$, $e_2$ and $e_3$. It is easily seen that
\[
\big|D\big(\hat{y}, \xi(x+y), \xi(x)\big)\big| \le \big|\sin\varphi\big(\xi(x), \xi(x+y)\big)\big|;
\]
hence, coherence of the vorticity direction field softens up the singularity and depletes the nonlinearity (the vortex-stretching term). For a point $(x_0, t_0) \in \Omega \times (0, T)$, denote by $Q_\delta(x_0, t_0)$ an open parabolic cylinder $B(x_0, \delta) \times (t_0 - \delta^2, t_0)$ contained in Ω × (0, T).
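The bound on the geometric kernel D by the sine of the angle between its last two arguments follows from $|e_1\cdot e_3| \le 1$ and $|e_1\cdot(e_2\times e_3)| \le |e_2\times e_3|$. The following sketch checks this on random unit vectors; the helper names and the sampling scheme are assumptions made for this illustration only.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def random_unit(rng):
    """Random unit vector in R^3 obtained by normalising a Gaussian sample."""
    v = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def kernel_D(e1, e2, e3):
    """Geometric kernel D(e1, e2, e3) = (e1 . e3) (e1 . (e2 x e3))."""
    return dot(e1, e3) * dot(e1, cross(e2, e3))

def sin_angle(a, b):
    """|sin| of the angle between two unit vectors, equal to |a x b|."""
    c = cross(a, b)
    return math.sqrt(dot(c, c))

rng = random.Random(0)
worst = 0.0
for _ in range(1000):
    e1, e2, e3 = (random_unit(rng) for _ in range(3))
    worst = max(worst, abs(kernel_D(e1, e2, e3)) - sin_angle(e2, e3))
print("largest value of |D| - |sin| over the sample:", worst)
```

The kernel also vanishes when the second and third arguments are parallel or anti-parallel, which is the depletion by alignment and anti-alignment mentioned in the introduction.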


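3. Localization of Vortex-Stretching

As in [GrZh], let ψ(x, t) = φ(x)η(t) be a smooth cut-off function on $Q_{2r}(x_0, t_0)$ satisfying
\[
\operatorname{supp}\phi \subset B(x_0, 2r), \quad \phi = 1 \text{ on } B(x_0, r), \quad \frac{|\nabla\phi|}{\phi^{\rho}} \le \frac{c}{r} \text{ for some } \rho \in (0, 1), \quad 0 \le \phi \le 1
\]
and
\[
\operatorname{supp}\eta \subset (t_0 - (2r)^2, t_0], \quad \eta = 1 \text{ on } [t_0 - r^2, t_0], \quad |\eta'| \le \frac{c}{r^2}, \quad 0 \le \eta \le 1.
\]
It was shown in [GrZh]–Sect. 2 that choosing any ρ, $\frac{1}{2} \le \rho < 1$, leads to a suitable bound on the localized transport term $(u\cdot\nabla)\omega\cdot\psi^2\omega$. An additional restriction on ρ will transpire here in order to control the lower order terms in the localization of the vortex-stretching term. The goal of this section is to obtain an explicit localization formula for the vortex stretching term (ω · ∇)u · ω on $Q_{2r}(x_0, t_0)$. The computation will be uniform in time; thus the time variable t will be omitted. Let x be in $B(x_0, 2r)$, and consider
\[
\phi^2(x)\,(\omega\cdot\nabla)u\cdot\omega\,(x) = \phi(x)\,\frac{\partial}{\partial x_i}u_j(x)\,\phi(x)\,\omega_i(x)\,\omega_j(x).
\]
First write $\phi u_j$ as
\begin{align*}
\phi(x)u_j(x) &= c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\Delta(\phi u_j)\,dy \\
&= c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\phi\,\Delta u_j\,dy + c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\big(2\nabla\phi\cdot\nabla u_j + \Delta\phi\,u_j\big)\,dy \\
&= c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\phi\,(\operatorname{curl}\omega)_j\,dy + c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\big(2\nabla\phi\cdot\nabla u_j + \Delta\phi\,u_j\big)\,dy \\
&= I_1 + I_2,
\end{align*}
and note that the terms in $I_2$ are the lower order terms with respect to $I_1$. Using the Levi-Civita symbol $\epsilon_{jkl}$, $I_1$ can be written in the following way,
\begin{align*}
I_1 &= c\int_{B(x_0,2r)}\frac{1}{|x-y|}\,\phi\,\epsilon_{jkl}\,\frac{\partial}{\partial y_k}\omega_l\,dy \\
&= -c\int_{B(x_0,2r)}\epsilon_{jkl}\,\frac{\partial}{\partial y_k}\frac{1}{|x-y|}\,\phi\,\omega_l\,dy + \left(-c\int_{B(x_0,2r)}\epsilon_{jkl}\,\frac{1}{|x-y|}\,\frac{\partial}{\partial y_k}\phi\,\omega_l\,dy\right) \\
&= J_1 + J_2.
\end{align*}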

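Hence,
\[
\phi u_j = J_1 + J_2 + I_2, \tag{3}
\]
where $J_2$ and $I_2$ are the lower order terms with respect to $J_1$. Differentiating the localized Biot-Savart law (3) yields
\[
\frac{\partial}{\partial x_i}(\phi u_j)(x) = -c\,\mathrm{P.V.}\int_{B(x_0,2r)}\epsilon_{jkl}\,\frac{\partial^2}{\partial x_i\,\partial y_k}\frac{1}{|x-y|}\,\phi\,\omega_l\,dy + \frac{\partial}{\partial x_i}J_2 + \frac{\partial}{\partial x_i}I_2.
\]
Writing $\phi\,\frac{\partial}{\partial x_i}u_j = \frac{\partial}{\partial x_i}(\phi u_j) - \frac{\partial}{\partial x_i}\phi\;u_j$, this implies our localization formula,
\begin{align*}
\phi^2(x)\,(\omega\cdot\nabla)u\cdot\omega\,(x) &= \phi(x)\,\frac{\partial}{\partial x_i}u_j(x)\,\phi(x)\,\omega_i(x)\,\omega_j(x) \\
&= -c\,\mathrm{P.V.}\int_{B(x_0,2r)}\epsilon_{jkl}\,\frac{\partial^2}{\partial x_i\,\partial y_k}\frac{1}{|x-y|}\,\phi\,\omega_l\,dy\;\phi(x)\,\omega_i(x)\,\omega_j(x) + \mathrm{LOT} \\
&= -c\,\mathrm{P.V.}\int_{B(x_0,2r)}\big(\omega(x)\times\omega(y)\big)\cdot G_\omega(x,y)\,\phi(y)\,\phi(x)\,dy + \mathrm{LOT} \\
&= \mathrm{VST}_{\mathrm{loc}} + \mathrm{LOT}, \tag{4}
\end{align*}
where
\[
\big(G_\omega(x,y)\big)_k = \frac{\partial^2}{\partial x_i\,\partial y_k}\frac{1}{|x-y|}\,\omega_i(x)
\]
and LOT denotes the terms that are the lower order terms with respect to $\mathrm{VST}_{\mathrm{loc}}$ in the sense they are either lower order for at least one order of the differentiation or/and less singular for at least one power of |x − y|.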

4. Control of the Localized Enstrophy

Let u be a Leray solution on Ω × (0, T), i.e., a weak (distributional) solution satisfying $u \in L^\infty(0, T; L^2) \cap L^2(0, T; H^1)$. Fix a point $(x_0, t_0)$ and let R > 0 be such that $Q_{2R}(x_0, t_0) \subset \Omega \times (0, T)$. For simplicity of the exposition, we will assume that u is smooth on an open parabolic cylinder $Q_{2R}(x_0, t_0)$ and obtain bounds on the enstrophy localized to $B(x_0, R)$ uniformly in t in $(t_0 - R^2, t_0)$. Let r ≤ R. A direct calculation shows (cf. [GrZh] – Sect. 2) that multiplying the vorticity equations (2) by $\psi^2\omega$ and integrating over $Q^s_{2r} = B(x_0, 2r) \times (t_0 - (2r)^2, s)$, for a fixed s in $(t_0 - (2r)^2, t_0)$, yields
\begin{align*}
\frac{1}{2}\int_{B(x_0,2r)}\phi^2(x)\,|\omega|^2(x,s)\,dx + \int_{Q^s_{2r}}|\nabla(\psi\omega)|^2\,dx\,dt
&\le \int_{Q_{2r}}\big(|\eta||\partial_t\eta| + |\nabla\psi|^2\big)|\omega|^2\,dx\,dt \\
&\quad + \left|\int_{Q^s_{2r}}(u\cdot\nabla)\omega\cdot\psi^2\omega\,dx\,dt\right| + \left|\int_{Q^s_{2r}}(\omega\cdot\nabla)u\cdot\psi^2\omega\,dx\,dt\right| \\
&= T_1 + T_2 + T_3.
\end{align*}
It is plain that
\[
T_1 \le c(r)\int_{Q_{2r}}|\omega|^2\,dx\,dt, \tag{5}
\]

and the following bound on the localized transport term $T_2$ was derived in [GrZh] – Sect. 2,
\[
T_2 \le \frac{1}{2}\int_{Q_{2r}}|\nabla(\psi\omega)|^2\,dx\,dt + c(r)\int_{Q_{2r}}|\omega|^2\,dx\,dt. \tag{6}
\]
To estimate the localized vortex-stretching term $T_3$, we will utilize the representation formula (4). (As mentioned in the introduction, the estimate in [GrZh]–Sect. 3 made use of the singular integral representation formula of the vortex-stretching factor α over the whole space $\mathbb{R}^3$ which was then separated in small and large scales.) Bringing back to life the time variable t and multiplying both sides of (4) by $\eta^2$ implies
\[
T_3 \le c\int_{Q^s_{2r}}\mathrm{P.V.}\int_{B(x_0,2r)}\frac{1}{|x-y|^3}\,\psi(y,t)\,|\omega|(y,t)\,dy\;\psi(x,t)\,|\omega|^2(x,t)\,dx\,dt + \text{the lower order terms}, \tag{7}
\]
where each lower order term is at least for one order of differentiation or/and at least one power of |x − y| less singular than the principal term. This bound is insufficient to close the estimate on the localized enstrophy. A geometric structure of the leading term in (4) will be exploited in the following section to show that $\frac{1}{2}$-Hölder coherence of the vorticity direction depletes the nonlinearity preventing a possible singularity formation.

5. Localization of the Coherence of the Vorticity Direction Field Condition for the Regularity

Theorem 1. Let $\Omega \subseteq \mathbb{R}^3$ be open, and u a Leray solution on the space-time domain Ω × (0, T) for some T > 0. Fix a point $(x_0, t_0)$ in Ω × (0, T), and let 0 < R < 1 be such that the open parabolic cylinder $Q_{2R}(x_0, t_0) = B(x_0, 2R) \times (t_0 - (2R)^2, t_0)$ is contained in Ω × (0, T). Suppose that u is smooth on $Q_{2R}(x_0, t_0)$ and that there exist two positive constants K, M such that the following coherence condition holds,
\[
\big|\sin\varphi\big(\xi(x,t), \xi(y,t)\big)\big| \le K\,|x-y|^{\frac{1}{2}}
\]
for all (x, t), (y, t) in $Q_{2R} \cap \{|\omega| > M\}$. Then the localized enstrophy remains uniformly bounded up to $t = t_0$, i.e.,
\[
\sup_{t\in(t_0-R^2,\,t_0)}\int_{B(x_0,R)}|\omega|^2(x,t)\,dx < \infty.
\]

Proof. It remains to estimate $T_3$. First, separate the contribution of the region of the low vorticity magnitude from the contribution of the region of the high vorticity magnitude,
\[
T_3 \le \left|\int_{Q^s_{2r}\cap\{|\omega|<M\}}(\omega\cdot\nabla)u\cdot\psi^2\omega\,dx\,dt\right| + \left|\int_{Q^s_{2r}\cap\{|\omega|>M\}}(\omega\cdot\nabla)u\cdot\psi^2\omega\,dx\,dt\right| = T_3^l + T_3^h.
\]
The first term, $T_3^l$, is bounded by $c\int_{Q_{2r}}|\nabla u|^2\,dx\,dt$. Utilizing $\frac{1}{2}$-Hölder coherence of the vorticity direction and (4), the following estimate transpires,
\[
T_3^h \le c\int_{Q^s_{2r}}\int_{B(x_0,2r)}\frac{1}{|x-y|^{\frac{5}{2}}}\,\psi(y,t)\,|\omega|(y,t)\,dy\;\psi(x,t)\,|\omega|(x,t)\,|\omega|(x,t)\,dx\,dt + \text{the lower order terms} = I + I_{LOT}. \tag{8}
\]

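For the leading term in the above estimate, I, we apply Hölder (in x) with the exponents 4, 4 and 2, and then the weak Young to the first factor, leading to
\begin{align*}
I &\le \int_{t_0-(2r)^2}^{s}\|\psi\omega(t)\|_{L^{\frac{12}{5}}(\mathbb{R}^3)}\,\|\psi\omega(t)\|_{L^4(\mathbb{R}^3)}\,\|\omega(t)\|_{L^2(B(x_0,2r))}\,dt \\
&\le \int_{t_0-(2r)^2}^{t_0}\|\psi\omega(t)\|_{L^{\frac{12}{5}}(\mathbb{R}^3)}\,\|\psi\omega(t)\|_{L^4(\mathbb{R}^3)}\,\|\omega(t)\|_{L^2(B(x_0,2r))}\,dt.
\end{align*}
This followed by interpolating the first two $L^p$-norms yields
\begin{align*}
I &\le \int_{t_0-(2r)^2}^{t_0}\|\nabla(\psi\omega)(t)\|_{L^2(\mathbb{R}^3)}\,\|\psi\omega(t)\|_{L^2(\mathbb{R}^3)}\,\|\omega(t)\|_{L^2(B(x_0,2r))}\,dt \\
&= \int_{t_0-(2r)^2}^{t_0}\|\nabla(\psi\omega)(t)\|_{L^2(B(x_0,2r))}\,\|\psi\omega(t)\|_{L^2(B(x_0,2r))}\,\|\omega(t)\|_{L^2(B(x_0,2r))}\,dt \\
&\le \sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|_{L^2(B(x_0,2r))}\,\|\nabla(\psi\omega)\|_{L^2(Q_{2r})}\,\|\omega\|_{L^2(Q_{2r})} \\
&\le \|\omega\|_{L^2(Q_{2r})}\left(\frac{1}{2}\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})}\right).
\end{align*}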
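The exponents used above can be double-checked by simple bookkeeping. The sketch below assumes the standard form of the weak Young inequality in $\mathbb{R}^3$ (the kernel $|x|^{-a}$ lies in weak $L^{3/a}$, and $1 + 1/p_{\mathrm{out}} = a/3 + 1/p_{\mathrm{in}}$) and of the Gagliardo-Nirenberg interpolation $\|f\|_{L^p} \le c\,\|f\|_{L^2}^{1-\theta}\|\nabla f\|_{L^2}^{\theta}$ with $\theta = 3(1/2 - 1/p)$; the function names are hypothetical.

```python
from fractions import Fraction

DIM = 3  # space dimension

def holder_sum(exponents):
    """Sum of reciprocals of Hoelder exponents; must equal 1."""
    return sum(Fraction(1) / Fraction(p) for p in exponents)

def weak_young_input_exponent(kernel_power, target_p, dim=DIM):
    """Input exponent p_in with 1 + 1/target_p = kernel_power/dim + 1/p_in."""
    inv_input = 1 + Fraction(1) / Fraction(target_p) - Fraction(kernel_power) / dim
    return 1 / inv_input

def gn_theta(p, dim=DIM):
    """Power of the gradient norm when interpolating L^p between L^2 and H^1."""
    return dim * (Fraction(1, 2) - Fraction(1) / Fraction(p))

# Leading term I: kernel |x-y|^(-5/2), Hoelder exponents 4, 4, 2.
p_in = weak_young_input_exponent(Fraction(5, 2), 4)
print("Hoelder check:", holder_sum([4, 4, 2]))
print("weak Young maps L^%s into L^4" % p_in)
print("gradient powers:", gn_theta(p_in), "+", gn_theta(4))
```

The two gradient powers add up to 1, which is why the product of the $L^{12/5}$ and $L^4$ norms is controlled by $\|\nabla(\psi\omega)\|_{L^2}\|\psi\omega\|_{L^2}$.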

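Collecting the estimates on $T_1$, $T_2$, $T_3^l$ and $T_3^h$, we arrive at the following bound,
\begin{align*}
\frac{1}{2}\int_{B(x_0,2r)}\phi^2(x)\,|\omega|^2(x,s)\,dx + \int_{Q^s_{2r}}|\nabla(\psi\omega)|^2\,dx\,dt
&\le c(r)\int_{Q_{2r}}|\omega|^2\,dx\,dt \\
&\quad + \frac{1}{2}\int_{Q_{2r}}|\nabla(\psi\omega)|^2\,dx\,dt + c(r)\int_{Q_{2r}}|\omega|^2\,dx\,dt \\
&\quad + c\int_{Q_{2r}}|\nabla u|^2\,dx\,dt \\
&\quad + \|\omega\|_{L^2(Q_{2r})}\left(\frac{1}{2}\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})}\right) \\
&\quad + I_{LOT}
\end{align*}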

for all s in $(t_0 - (2r)^2, t_0)$. Hence,
\begin{align*}
\frac{1}{2}\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})}
&\le c(r)\int_{Q_{2r}}|\nabla u|^2\,dx\,dt \\
&\quad + 2\,\|\nabla u\|_{L^2(Q_{2r})}\left(\frac{1}{2}\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})}\right) \\
&\quad + I_{LOT}.
\end{align*}
Since u is a Leray solution, the first term in the estimate is finite. Moreover, since $\|\nabla u\|_{L^2(Q_{2r})} \to 0$, r → 0, there exist δ > 0 such that $\|\nabla u\|_{L^2(Q_{2\rho})} \le \frac{1}{4}$ for all ρ ≤ δ. If r ≤ δ, the second term is absorbed. If r > δ, a desired bound is obtained by covering $B_r(x_0, t_0)$ with finitely many balls $B_\delta(z_0, t_0)$ and redoing the proof on each cylinder $Q_{2\delta}(z_0, t_0)$. At the end, let us briefly address the lower order terms, $I_{LOT}$. Consider a highest order lower order term $I^h_{LOT}$,
\[
I^h_{LOT} = \int_{Q_{2r}}\int_{B(x_0,2r)}\frac{1}{|x-y|^2}\,|\nabla\psi|(y,t)\,|\nabla u|(y,t)\,dy\;\psi(x,t)\,|\omega|^2(x,t)\,dx\,dt.
\]
Let $\frac{1}{2} \le \rho < 1$. By construction of the cut-off φ, $|\nabla\phi(y)| \le c\,\frac{1}{r}\,\phi^{\rho}(y)$, i.e., we can estimate the gradient of the cut-off by the ρ-power of the cut-off. A problem here is that the cut-off is in the wrong variable (there is no use localizing |∇u|). To switch localization from y to x use the Mean Value Theorem to write
\[
\phi^{\rho}(y) = \phi^{\rho}(x) + \rho\,\frac{\nabla\phi(z)}{\phi^{1-\rho}(z)}\cdot(y-x).
\]
Since $\rho \ge \frac{1}{2}$,
\[
|\nabla\phi(y)| \le c_1(\rho)\,\frac{1}{r}\,\phi^{\rho}(x) + c_2(\rho)\,\frac{1}{r^2}\,|x-y|.
\]
This leads to the following bound,
\begin{align*}
I^h_{LOT} &\le c_1(\rho)\,\frac{1}{r}\int_{Q_{2r}}\int_{B(x_0,2r)}\frac{1}{|x-y|^2}\,|\nabla u|(y,t)\,dy\;|\omega|^{1-\rho}(x,t)\,|\psi\omega|^{1+\rho}(x,t)\,dx\,dt \\
&\quad + c_2(\rho)\,\frac{1}{r^2}\int_{Q_{2r}}\int_{B(x_0,2r)}\frac{1}{|x-y|}\,|\nabla u|(y,t)\,dy\;|\omega|(x,t)\,|\psi\omega|(x,t)\,dx\,dt \\
&= A + B.
\end{align*}

For the first term, notice that were the limit case ρ = 1 possible, the critical kernel would be $\frac{1}{|x-y|^{\frac{5}{2}}}$. Namely, Hölder (in x) with the exponents 3 and $\frac{3}{2}$ followed by the weak Young would yield the bound
\[
\int_{t_0-(2r)^2}^{t_0}\|\nabla u(t)\|_{L^2(B(x_0,2r))}\,\|\psi\omega(t)\|^2_{L^3(B(x_0,2r))}\,dt;
\]
interpolating the second norm, this is easily bounded by
\[
\|\nabla u\|_{L^2(Q_{2r})}\left(\frac{1}{2}\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})}\right).
\]
Since there is a gap between our kernel and the critical kernel, it is possible to choose ρ sufficiently close to 1 in order to separate out the $\frac{1}{r}|\omega|^{1-\rho}$ factor and bound its contribution by a term depending only on r and $\|\nabla u\|_{L^2(Q_{2r})}$. For the second term, a crude estimate (Hölder (in x) with the exponents ∞, 2 and 2 followed by Cauchy-Schwarz in y) yields
\[
B \le c(\rho, r)\int_{t_0-(2r)^2}^{t_0}\|\nabla u(t)\|_{L^2(B(x_0,2r))}\,\|\omega(t)\|_{L^2(B(x_0,2r))}\,\|\psi\omega(t)\|_{L^2(B(x_0,2r))}\,dt,
\]
and this is bounded by
\[
\epsilon\sup_{t\in(t_0-(2r)^2,\,t_0)}\|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + c(\rho, r, \epsilon)\,\|\nabla u\|^4_{L^2(Q_{2r})}
\]
for any ε > 0.

Remark 1. The proof can be modified leading to localization of the hybrid geometric-analytic conditions for the regularity derived in [GrRu].

Remark 2. The theorem implies interior regularity of any Leray solution possessing $\frac12$-Hölder coherent vorticity direction field independently of the type of the domain or the boundary conditions. This, in particular, provides a partially positive answer to a question raised in [daVeiga2] whether in the case of the non-slip boundary conditions on a bounded domain, the coherence of the vorticity direction field alone, without any extra assumptions, implies the regularity.

References

[Co] Constantin, P.: Geometric statistics in turbulence. SIAM Rev. 36(1), 73–98 (1994)
[CoFe] Constantin, P., Fefferman, C.: Direction of vorticity and the problem of global regularity for the Navier-Stokes equations. Indiana Univ. Math. J. 42, 775–789 (1993)
[daVeiga1] Beirão da Veiga, H.: Vorticity and regularity for flows under the Navier boundary conditions. Comm. Pure App. Anal. 5, 483–494 (2006)
[daVeiga2] Beirão da Veiga, H.: Vorticity and regularity for viscous incompressible flows under the Dirichlet boundary condition. Results and Related Open Problems. J. Math. Fluid Mech. 9, 506–516 (2007)
[daVeigaBe1] Beirão da Veiga, H., Berselli, L.C.: On the regularizing effect of the vorticity direction in incompressible viscous flows. Diff. Int. Eqs. 15(3), 345–356 (2002)
[daVeigaBe2] Beirão da Veiga, H., Berselli, L.C.: Navier-Stokes equations: Green's matrices, vorticity direction, and regularity up to the boundary. J. Diff. Eqs. 246(2), 597–628 (2008)
[Gr] Grujić, Z.: The geometric structure of the super-level sets and regularity for 3D Navier-Stokes equations. Indiana Univ. Math. J. 50, 1309–1317 (2001)
[GrRu] Grujić, Z., Ruzmaikina, A.: Interpolation between algebraic and geometric conditions for smoothness of the vorticity in the 3D NSE. Indiana Univ. Math. J. 53, 1073–1080 (2004)
[GrZh] Grujić, Z., Zhang, Q.S.: Space-time localization of a class of geometric criteria for preventing blow-up in the 3D NSE. Commun. Math. Phys. 262, 555–564 (2006)
[RuGr] Ruzmaikina, A., Grujić, Z.: On depletion of the vortex-stretching term in the 3D Navier-Stokes equations. Commun. Math. Phys. 247, 601–611 (2004)

Communicated by P. Constantin

Commun. Math. Phys. 290, 871–902 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0760-1


On the Lie-Algebraic Origin of Metric 3-Algebras

Paul de Medeiros (1), José Figueroa-O'Farrill (1,2), Elena Méndez-Escobar (1), Patricia Ritter (1)

(1) School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, James Clerk Maxwell Building, King's Buildings, Edinburgh EH9 3JZ, UK. E-mail: [email protected]; [email protected]; [email protected]; [email protected]
(2) Department de Física Teòrica & IFIC (CSIC-UVEG), Universitat de València, 46100 Burjassot, Spain

Received: 12 September 2008 / Accepted: 27 November 2008
Published online: 26 February 2009 – © Springer-Verlag 2009

Abstract: Since the pioneering work of Bagger–Lambert and Gustavsson, there has been a proliferation of three-dimensional superconformal Chern–Simons theories whose main ingredient is a metric 3-algebra. On the other hand, many of these theories have been shown to allow for a reformulation in terms of standard gauge theory coupled to matter, where the 3-algebra does not appear explicitly. In this paper we reconcile these two sets of results by pointing out the Lie-algebraic origin of some metric 3-algebras, including those which have already appeared in three-dimensional superconformal Chern–Simons theories. More precisely, we show that the real 3-algebras of Cherkis–Sämann, which include the metric Lie 3-algebras as a special case, and the hermitian 3-algebras of Bagger–Lambert can be constructed from pairs consisting of a metric real Lie algebra and a faithful (real or complex, respectively) unitary representation. This construction generalises and we will see how to construct many kinds of metric 3-algebras from pairs consisting of a real metric Lie algebra and a faithful (real, complex or quaternionic) unitary representation. In the real case, these 3-algebras are precisely the Cherkis–Sämann algebras, which are then completely characterised in terms of this data. In the complex and quaternionic cases, they constitute generalisations of the Bagger–Lambert hermitian 3-algebras and anti-Lie triple systems, respectively, which underlie N = 6 and N = 5 superconformal Chern–Simons theories, respectively. In the process we rederive the relation between certain types of complex 3-algebras and metric Lie superalgebras.

Contents

1. Contextualisation and Introduction
2. The Generalised Metric Lie 3-Algebras of Cherkis–Sämann
   2.1 Deconstruction
   2.2 Reconstruction
   2.3 Recognition
3. The Hermitian 3-Algebras of Bagger–Lambert
   3.1 Deconstruction
   3.2 Reconstruction
   3.3 Recognition
4. A General Algebraic Construction
   4.1 The Faulkner construction
   4.2 Embedding Lie (super)algebras
   4.3 Unitary representations
      4.3.1 Real orthogonal representations
      4.3.2 Complex unitary representations
      4.3.3 Quaternionic unitary representations
5. Conclusions and Open Problems
Acknowledgments
References

1. Contextualisation and Introduction The foundational work of Bagger and Lambert [1] and Gustavsson [2], culminating in the theory proposed in [3], has prompted a great deal of progress recently [4–18] in our ability to construct new superconformal field theories in three dimensions that are thought to describe the low-energy dynamics of configurations of multiple coincident M2-branes in M-theory. These new superconformal field theories all involve a non-dynamical gauge field, described by a Chern–Simons-like term in the lagrangian, which is coupled to matter fields parametrising the degrees of freedom transverse to the worldvolume of the M2-branes. The order of the matter couplings in these theories is dictated by superconformal invariance and indeed several other features are as expected (see [19,20] and references therein) following analyses based on existing N = 1 and N = 2 superfield techniques in three dimensions. The novelty in the proposal of [1–3] for a theory of this kind with maximal N = 8 supersymmetry is that the interactions involve an algebraic object known as a Lie 3-algebra, instead of the Lie algebras commonly used to describe conventional gauge couplings. A lagrangian description of this theory requires the Lie 3-algebra to admit an “invariant” inner product. (See, for example, [21, Remark 8].) Demanding the signature of this inner product to be positive-definite ensures the unitarity of the resulting quantum theory, but restricts [22–24] one for all practical purposes to the unique indecomposable example already considered in [3]. To some extent, the Bagger-Lambert theory in [3] has now been subsumed into an N = 6 superconformal Chern–Simons-matter theory proposed by Aharony, Bergman, Jafferis and Maldacena [6] (see also [7]). 
This N = 6 theory involves ordinary gauge couplings for matter fields valued in the bifundamental representation of the Lie algebra u(n) ⊕ u(n) (or su(n) ⊕ su(n) via a certain truncation) and coincides with [3] for the special case of n = 2, where supersymmetry is enhanced to N = 8. Generically it is thought to describe n coincident M2-branes on the orbifold R8 /Zk , where k denotes the integer level of the Chern–Simons term. Despite the conventional appearance of the gauge symmetry in this theory, Bagger and Lambert [10] subsequently showed that it too can be recast in terms of a metric 3-algebra, albeit of a different kind from the metric Lie 3-algebras of [3]. Another type of superconformal field theory which, like [6,10], has an SU(4) × U(1) global symmetry but is generically only N = 2 supersymmetric has been proposed by


Cherkis and Sämann [11]. This theory is also based on a metric 3-algebra, albeit different from the metric Lie 3-algebras of [3], which they generalise, and from the metric 3-algebras of [10]. Generalisations of the model in [6] have been considered in [9,12,13,18], using a plethora of different techniques, though all result in conventional Chern–Simons theories coupled to matter fields valued in the bifundamental or similar tensor-product representation of a variety of possible Lie algebras. With generic N = 6 supersymmetry, it has been found [9,12,18] that this Lie algebra can be either su(m) ⊕ su(n) ⊕ u(1) or sp(m) ⊕ u(1) (up to additional factors of u(1)). (In a revision to [10], Bagger and Lambert have also written the 3-algebra that corresponds to the u(m) ⊕ u(n) case.) With generic N = 5 supersymmetry, the Lie algebra can be either so(m) ⊕ sp(n) [9], so(7) ⊕ sp(1), g2 ⊕ sp(1) or so(4) ⊕ sp(1) [18]. For the regular cases where m ≠ n, these theories have been interpreted [13] in terms of |m − n| fractional M2-branes probing the singularity of the M-theory background $\mathbb{R}^8/\mathbb{Z}_k$ in the unitary case and $\mathbb{R}^8/\hat{D}_k$ in the orthogonal and symplectic cases ($\hat{D}_k$ being the binary dihedral group of order 4k). Consistent N = 4 truncations of all these new theories exist [18] from which one can recover some previously known classes of N = 4 superconformal Chern–Simons-matter theories [4,5]. There have also been new proposals for the holographic duals of some smooth M-theory backgrounds of the form $\mathrm{AdS}_4 \times X^7$ which are less than half-BPS, and again they involve conventional superconformal Chern–Simons-matter theories with quiver gauge groups. The N = 3 superconformal theories dual to backgrounds for which $X^7$ is a 3-Sasaki manifold arising from a particular quotient construction have been described in [15]. An N = 1 truncation of the model in [6], obtained by breaking the SU(4) R-symmetry to Sp(2), has been given in [14] which corresponds to $X^7$ being a squashed $S^7$.
All this prompts one to question whether the 3-algebras appearing in the constructions [1–3,10,11] play a fundamental rôle in M-theory or, at least insofar as the effective field theory is concerned, are largely superfluous. The equivalence of [10 and 6] and the abundance of new theories (dual to known M-theory backgrounds) which seem not to involve a 3-algebra might suggest the latter. Nonetheless, given our lack of understanding of how to incorporate in Lie-algebraic terms the expected properties of M-theoretic degrees of freedom, like the entropy scaling laws for M2- and M5-brane condensates, it may be useful to understand the precise relation between the 3-algebras appearing in the recent literature on superconformal Chern–Simons theory and Lie algebras. In this paper we will do precisely this. We will show how certain types of metric 3-algebras, which include the ones which have appeared in the recent literature, can be deconstructed into and reconstructed from pairs consisting of a metric real Lie algebra and a faithful unitary representation. In a way this reduces the problem of constructing and even characterising such metric 3-algebras to the problem of classifying metric Lie subalgebras of the orthogonal, unitary and unitary symplectic Lie algebras. Our results take their inspiration from a general algebraic construction of pairs due to Faulkner [25], which we will review below; although for pedagogical reasons we have decided not to present the results in this paper as a series of specialisations of Faulkner’s construction, but rather to present this construction as an a posteriori unifying theme. In a companion paper we will present the construction of superconformal Chern–Simons theories from the Lie-algebraic datum—one principal aim being to translate physical properties of the field theory into Lie representation theory. This paper is organised as follows. In Sect. 
2 we review the definition of the generalised metric Lie 3-algebras of [11] and extract from every such 3-algebra a pair (g, V ) consisting of a metric real Lie algebra g and a faithful real orthogonal representation V .


We illustrate this with several examples: the unique nonabelian indecomposable euclidean Lie 3-algebra, the nonsimple nonabelian indecomposable lorentzian Lie 3-algebras, the $C_{2d}$ 3-algebras of [11] and a mild generalisation we denote $C_{m+n}$. We then show how starting from such a pair (g, V) one can reconstruct a metric Lie 3-algebra in the class discussed in [11] and in this way establish a one-to-one correspondence between isomorphism classes of generalised metric Lie 3-algebras and isomorphism classes of pairs (g, V), for natural notions of isomorphism which will be defined. This reduces the classification of generalised metric Lie 3-algebras to that of metric Lie subalgebras of so(V). In Sect. 3 we review the definition of the hermitian 3-algebras of [10] and extract from every such 3-algebra a pair (g, V) consisting again of a metric real Lie algebra g and now a faithful complex unitary representation V. We illustrate this with the examples of the hermitian 3-algebras in (the revised version of) [10]. Conversely, starting from such a pair (g, V) we show how to reconstruct a hermitian 3-algebra which in general will belong to a class which includes the 3-algebras of [10] as special cases. This then poses the problem of recognising the 3-algebras of [10] within this more general class. We solve this problem by showing that these are precisely those hermitian 3-algebras which can be embedded in a complex metric Lie superalgebra with even subalgebra the complexification $g_{\mathbb{C}}$ of g and odd subspace $V \oplus V^*$. This complex Lie superalgebra is itself the complexification of a real Lie superalgebra with underlying vector superspace g ⊕ [[V]], where [[V]] is the real representation of g whose complexification is $V \oplus V^*$.
This result may be thought of as a 3-algebraic rederivation of the relation between N = 6 Chern–Simons theories and Lie superalgebras [9,12] based on earlier work on N = 4 theories [4,5] and sheds some light on the relationship between that work and [10]. In Sect. 4 we introduce Faulkner's construction of pairs from a metric Lie algebra and a faithful representation and we show how the constructions in previous sections correspond to the special cases where the representation is real orthogonal and complex unitary, respectively. The former construction appears already in [25], whereas the latter appears to be new. For completeness, we also present the construction of 3-algebras from quaternionic unitary representations, that being the only other type of inner product which admits a notion of positive-definiteness. The quaternionic case contains the metric 3-algebras which can be used to reformulate N = 5 superconformal Chern–Simons theories. The construction can be repeated for other types of inner products and some of them have been considered already in [25], but we restrict ourselves in this paper to those which are likely to play a rôle in the construction of superconformal Chern–Simons theories. Finally in Sect. 5 we summarise the main results of this paper and comment on future directions of research that these results suggest.

2. The Generalised Metric Lie 3-Algebras of Cherkis–Sämann

In [11] Cherkis and Sämann constructed three-dimensional N = 2 superconformal Chern–Simons theories from a 3-algebra defined as follows [11, §4.1].

Definition 1. A generalised metric Lie 3-algebra consists of a real inner product space $(V, \langle -,-\rangle)$ and a trilinear bracket $\Phi : V \times V \times V \to V$, denoted $(x,y,z) \mapsto [x,y,z]$, satisfying the following axioms for all $x, y, z, v, w \in V$:

(CS1) the unitarity condition
\[
\langle [x,y,z], w\rangle = -\langle z, [x,y,w]\rangle; \tag{1}
\]
(CS2) the symmetry condition
\[
\langle [x,y,z], w\rangle = \langle [z,w,x], y\rangle; \tag{2}
\]
(CS3) and the fundamental identity
\[
[x,y,[v,w,z]] - [v,w,[x,y,z]] = [[x,y,v],w,z] + [v,[x,y,w],z]. \tag{3}
\]
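The three axioms are multilinear, so for a bracket given by structure constants they can be checked mechanically on a basis. The following is a minimal sketch, not part of the original paper: it assumes as a test case the totally skewsymmetric bracket on $\mathbb{R}^4$ with structure constants $\varepsilon_{ijk\ell}$ (the algebra the text later calls $A_4$), and the helper names `sign`, `bracket` and `fundamental_identity` are hypothetical.

```python
from itertools import permutations, product

N = 4  # dimension of the assumed test space R^4 (indices run 0..3 here)

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

# The 24 nonzero components eps_{ijkl} of the Levi-Civita symbol.
EPS = {p: sign(p) for p in permutations(range(N))}

def bracket(x, y, z):
    """Trilinear extension of [e_i, e_j, e_k] = sum_l eps_{ijkl} e_l."""
    out = [0] * N
    for (i, j, k, l), s in EPS.items():
        out[l] += s * x[i] * y[j] * z[k]
    return out

def dot(x, y):
    """Standard euclidean inner product on R^4."""
    return sum(a * b for a, b in zip(x, y))

basis = [[int(a == b) for b in range(N)] for a in range(N)]

# (CS1) unitarity and (CS2) symmetry, checked on all basis 4-tuples.
cs1 = all(dot(bracket(x, y, z), w) == -dot(z, bracket(x, y, w))
          for x, y, z, w in product(basis, repeat=4))
cs2 = all(dot(bracket(x, y, z), w) == dot(bracket(z, w, x), y)
          for x, y, z, w in product(basis, repeat=4))

def fundamental_identity(x, y, v, w, z):
    """(CS3): both sides of the fundamental identity (3) as vectors."""
    lhs = [a - b for a, b in zip(bracket(x, y, bracket(v, w, z)),
                                 bracket(v, w, bracket(x, y, z)))]
    rhs = [a + b for a, b in zip(bracket(bracket(x, y, v), w, z),
                                 bracket(v, bracket(x, y, w), z))]
    return lhs == rhs

cs3 = all(fundamental_identity(*args) for args in product(basis, repeat=5))
print(cs1, cs2, cs3)
```

Because all arithmetic is on integers, the comparisons are exact; swapping in other structure constants only requires replacing the `EPS` table.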

We remark that axioms (1) and (2) together imply that $[x,y,z] = -[y,x,z]$, whence the bracket defines a linear map $\Phi : \Lambda^2 V \otimes V \to V$. Decomposing $\Lambda^2 V \otimes V$ into
\[
\Lambda^2 V \otimes V = \Lambda^3 V \oplus V^{(2,1)},
\]
where $V^{(2,1)}$ (the summand of mixed, hook-type symmetry) contains the contraction with the inner product and is therefore not irreducible, $\Phi$ breaks up in turn into two components:

(1) $\Phi^F : \Lambda^3 V \to V$, which is totally skewsymmetric; and
(2) $\Phi^L : V^{(2,1)} \to V$, which is such that $\Phi^L(x,y,z) + \Phi^L(z,x,y) + \Phi^L(y,z,x) = 0$.

There are thus two extremal cases of these 3-algebras:

• (metric) Filippov or Lie 3-algebras, where $\Phi = \Phi^F$, and
• (metric) Lie triple systems, where $\Phi = \Phi^L$.

The former case coincides with the metric Lie 3-algebras of the N = 8 theories [1–3]. The general case, however, is a mixture of these two. We will now show that generalised metric Lie 3-algebras are in one-to-one correspondence with pairs $(\mathfrak{g}, V)$ consisting of a (real) metric Lie algebra $\mathfrak{g}$ and a faithful orthogonal representation $V$, or equivalently, with metric subalgebras $\mathfrak{g} < \mathfrak{so}(V)$. This is proved in two steps. Given such a 3-algebra $V$ we will associate a metric Lie algebra $\mathfrak{g}$, and conversely we will reconstruct such a 3-algebra from a pair $(\mathfrak{g}, V)$.

2.1. Deconstruction. Let $V$ be a generalised metric Lie 3-algebra and let us define a bilinear map $D : V \times V \to \operatorname{End} V$ by $D(x,y)z = [x,y,z]$. We have seen before that $D(x,y) = -D(y,x)$. Moreover the unitarity axiom (1), when written in terms of $D$, becomes
\[
\langle D(x,y)z, w\rangle = -\langle z, D(x,y)w\rangle,
\]
whence $D(x,y) \in \mathfrak{so}(V)$ is skewsymmetric. Let $\mathfrak{g}$ denote the span of the $D(x,y)$ in $\operatorname{End} V$. As we now show, it is closed under commutators.

Proposition 2. Let $\mathfrak{g} = \operatorname{im} D$. Then $\mathfrak{g} < \mathfrak{so}(V)$ is a Lie subalgebra.

Proof. This is a consequence of the fundamental identity (3). Indeed, written in terms of $D$, the fundamental identity reads
\[
D(x,y)D(v,w)z - D(v,w)D(x,y)z = D(D(x,y)v, w)z + D(v, D(x,y)w)z,
\]
which upon abstracting $z$ can be rewritten as
\[
[D(x,y), D(v,w)] = D(D(x,y)v, w) + D(v, D(x,y)w), \tag{4}
\]
which shows that $[\mathfrak{g}, \mathfrak{g}] \subset \mathfrak{g}$.
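Equation (4) can be tested numerically once the endomorphisms $D(x,y)$ are written as matrices. The sketch below is illustrative only: it again assumes the $\varepsilon_{ijk\ell}$ bracket on $\mathbb{R}^4$ as a test case, and the helpers `D`, `matmul`, `matvec`, `madd` and `closes` are hypothetical names rather than anything defined in the paper.

```python
from itertools import permutations, product

N = 4  # assumed test case: the epsilon bracket on R^4

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

EPS = {p: sign(p) for p in permutations(range(N))}

def D(x, y):
    """Matrix of z -> [x, y, z]: entry [l][k] is sum_ij eps_{ijkl} x_i y_j."""
    M = [[0] * N for _ in range(N)]
    for (i, j, k, l), s in EPS.items():
        M[l][k] += s * x[i] * y[j]
    return M

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(N)) for c in range(N)]
            for r in range(N)]

def matvec(A, x):
    return [sum(A[r][c] * x[c] for c in range(N)) for r in range(N)]

def madd(A, B, s=1):
    """Entrywise A + s*B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

basis = [[int(a == b) for b in range(N)] for a in range(N)]

# D(x, y) lies in so(V): every matrix D(x, y) is skewsymmetric.
skew = all(D(x, y)[r][c] == -D(x, y)[c][r]
           for x, y in product(basis, repeat=2)
           for r in range(N) for c in range(N))

def closes(x, y, v, w):
    """Equation (4): [D(x,y), D(v,w)] = D(D(x,y)v, w) + D(v, D(x,y)w)."""
    Dxy, Dvw = D(x, y), D(v, w)
    commutator = madd(matmul(Dxy, Dvw), matmul(Dvw, Dxy), -1)
    return commutator == madd(D(matvec(Dxy, v), w), D(v, matvec(Dxy, w)))

closed = all(closes(*args) for args in product(basis, repeat=4))
print(skew, closed)
```

Checking on basis vectors suffices because both sides of (4) are multilinear in $x, y, v, w$.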


Moreover we claim that $\mathfrak{g}$ is a metric Lie algebra, so that it possesses an ad-invariant inner product denoted by $(-,-)$. This inner product is defined by extending
\[
(D(x,y), D(z,w)) = \langle D(x,y)z, w\rangle \tag{5}
\]
bilinearly to all of $\mathfrak{g}$.

Proposition 3. The bilinear form on $\mathfrak{g}$ defined by (5) is symmetric, nondegenerate and ad-invariant.

Proof. The symmetry of the bilinear form (5) is precisely the symmetry axiom (2). Indeed, the symmetry axiom says that
\[
\langle [x,y,z], w\rangle = \langle D(x,y)z, w\rangle = (D(x,y), D(z,w))
\]
is equal to
\[
\langle [z,w,x], y\rangle = \langle D(z,w)x, y\rangle = (D(z,w), D(x,y)).
\]
To prove nondegeneracy, let $\delta = \sum_i D(x_i, y_i)$ be such that $(\delta, D(u,v)) = 0$ for all $u, v \in V$. This means that
\[
\langle \delta u, v\rangle = 0 \quad \forall u, v \in V.
\]
Since the inner product on $V$ is nondegenerate, this means that $\delta u = 0$ for all $u \in V$, whence the endomorphism $\delta = 0$. This means that (5) does define an inner product. Finally, we prove the ad-invariance of the inner product:
\begin{align*}
(D(u,v), [D(w,x), D(y,z)]) &= (D(u,v), D(D(w,x)y, z) + D(y, D(w,x)z)) && \text{by (4)}\\
&= \langle D(u,v)D(w,x)y, z\rangle + \langle D(u,v)y, D(w,x)z\rangle && \text{by (5)}\\
&= \langle D(u,v)D(w,x)y, z\rangle - \langle D(w,x)D(u,v)y, z\rangle && \text{by (1)}\\
&= \langle [D(u,v), D(w,x)]y, z\rangle\\
&= ([D(u,v), D(w,x)], D(y,z)) && \text{again by (5).}
\end{align*}
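As a concrete instance of (5), one can tabulate the Gram matrix of the form on a set of generators. The sketch below assumes, purely as a test case, the $\varepsilon_{ijk\ell}$ bracket on $\mathbb{R}^4$, for which $(D_{ij}, D_{k\ell}) = \varepsilon_{ijk\ell}$ on the six generators $D_{ij} = D(e_i, e_j)$, $i < j$; the names `eps`, `gram`, `form` and `rotated` are hypothetical. It checks symmetry, confirms that each generator is null, and exhibits a rotated basis in which the form is diagonal with three positive and three negative entries.

```python
from itertools import combinations

def eps(*idx):
    """Levi-Civita symbol: permutation sign, or 0 on a repeated index."""
    if len(set(idx)) < len(idx):
        return 0
    inversions = sum(1 for a in range(len(idx)) for b in range(a + 1, len(idx))
                     if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

# Generators D_ij with i < j (indices 0..3), and the Gram matrix of (5):
# (D_ij, D_kl) = <[e_i, e_j, e_k], e_l> = eps_{ijkl} in this assumed example.
pairs = list(combinations(range(4), 2))
gram = [[eps(*p, *q) for q in pairs] for p in pairs]

symmetric = all(gram[m][n] == gram[n][m] for m in range(6) for n in range(6))
all_null = all(gram[m][m] == 0 for m in range(6))

def form(a, b):
    """Bilinear extension of (5) to coordinate vectors over the D_ij."""
    return sum(a[m] * gram[m][n] * b[n] for m in range(6) for n in range(6))

# Rotated basis: D_p + D_q and D_p - D_q for each complementary pair (p, q).
rotated = []
for m, p in enumerate(pairs):
    n = pairs.index(tuple(sorted(set(range(4)) - set(p))))
    if m < n:
        for s in (1, -1):
            v = [0] * 6
            v[m], v[n] = 1, s
            rotated.append(v)

norms = sorted(form(v, v) for v in rotated)
orthogonal = all(form(rotated[a], rotated[b]) == 0
                 for a in range(6) for b in range(6) if a != b)
print(symmetric, all_null, orthogonal, norms)
```

Three norms of each sign in an orthogonal basis is what split signature means for a six-dimensional space, so in this example the form cannot be a multiple of a definite form such as the Killing form of a compact algebra.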

It is worth stressing that the bilinear form $(-,-)$ in (5) depends on the inner product $\langle -,-\rangle$ on $V$ and is distinct from the Killing form on $\mathfrak{g}$. For the case of metric Lie 3-algebras, the distinction between these two objects has been noted already by Gustavsson in [26]. It is the bilinear form in (5) which appears in the Chern–Simons term in the superconformal field theories associated with these 3-algebras, rather than the Killing form on $\mathfrak{g}$ that is sometimes assumed. Let us illustrate the aforementioned distinction with some examples.

Example 4. First of all we consider the original euclidean Lie 3-algebra in [3], denoted $A_4$. The underlying vector space is $\mathbb{R}^4$ with the standard euclidean structure. Relative to an orthonormal basis $(e_1, \ldots, e_4)$, the 3-bracket is given by
\[
[e_i, e_j, e_k] = \sum_{\ell=1}^{4} \varepsilon_{ijk\ell}\, e_\ell. \tag{6}
\]


The Lie algebra $\mathfrak{g} = \operatorname{im} D$ is spanned by generators $D_{ij} := D(e_i, e_j)$, for $i < j$. It is not hard to see that these six generators are linearly independent, whence they must span all of $\mathfrak{so}(4)$, which is the gauge algebra of the original Bagger–Lambert model. Unlike the Killing form on $\mathfrak{so}(4)$, the inner product $(-,-)$ induced by the 3-algebra structure
\[
(D_{ij}, D_{k\ell}) = \langle [e_i, e_j, e_k], e_\ell\rangle \tag{7}
\]
has indefinite signature. Indeed, for the six generators $D_{ij}$, the only nonzero inner products are
\[
(D_{12}, D_{34}) = (D_{14}, D_{23}) = (D_{13}, D_{42}) = 1, \tag{8}
\]
whence the signature is split. This manifests itself in the fact that the levels of the Chern–Simons terms coming from the different $\mathfrak{su}(2)$ factors in $\mathfrak{so}(4)$ have opposite signs, as seen, for example, in [27].

The fact that the induced inner product on $\mathfrak{g}$ is not positive-definite is actually always the case for metric Lie 3-algebras. Indeed, we have the following

Proposition 5. The $D(x,y) \in \mathfrak{g}$ have zero norm for all $x, y \in V$ if and only if the 3-bracket is totally skewsymmetric.

Proof. Skewsymmetry of the bracket says that for all $x, y \in V$,
\[
(D(x,y), D(x,y)) = \langle [x,y,x], y\rangle = 0,
\]
whence $D(x,y)$ has zero norm. (This property has been noted in Eq. (28) of [26].) Conversely, suppose that for all $x, y \in V$, $D(x,y)$ is null. Polarising,
\[
0 = (D(x,y+z), D(x,y+z)) = (D(x,y) + D(x,z), D(x,y) + D(x,z)) = 2\,(D(x,y), D(x,z)).
\]
This says that $(D(x,y), D(z,w)) = \langle [x,y,z], w\rangle$ is skewsymmetric under $x \leftrightarrow z$. Since it is already skewsymmetric in $x \leftrightarrow y$ and $z \leftrightarrow w$, it follows that it is totally skewsymmetric.

This means that in the case of a metric Lie 3-algebra, $\mathfrak{g}$ has a basis consisting of vectors of zero norm. This means that the inner product on $\mathfrak{g}$ has split signature and in particular that $\mathfrak{g}$ is even-dimensional. We continue with the lorentzian metric Lie 3-algebras of [21,28–30].

Example 6. Let $\mathfrak{s}$ be a semisimple Lie algebra with a choice of ad-invariant inner product. We let $W(\mathfrak{s})$ denote the lorentzian Lie 3-algebra with underlying vector space $V = \mathfrak{s} \oplus \mathbb{R}u \oplus \mathbb{R}v$ with inner product extending the one on $\mathfrak{s}$ by declaring $u, v \perp \mathfrak{s}$ and $\langle u, v\rangle = 1$ and $\langle v, v\rangle = 0 = \langle u, u\rangle$, the last condition being a choice. The nonzero 3-brackets are given by
\[
[u, x, y] = [x, y] \quad\text{and}\quad [x, y, z] = -\langle [x,y], z\rangle\, v,
\]
for all $x, y, z \in \mathfrak{s}$. The Lie algebra $\mathfrak{g}$ is the Lie algebra of inner derivations of the Lie 3-algebra, which was termed $\operatorname{ad} V$ in [21]. As shown for example in that paper, this algebra is isomorphic to $\mathfrak{s} \ltimes \mathfrak{s}_{\mathrm{ab}}$ with generators $A_x = D(x_1, x_2)$ and $B_x = D(u, x)$ for $x \in \mathfrak{s}$ and where in the definition of $A_x$, we have $x = [x_1, x_2]$. Since $\mathfrak{s}$ is semisimple, $[\mathfrak{s}, \mathfrak{s}] = \mathfrak{s}$ and hence such $x_1, x_2$ can always be found and moreover $A_x$ is independent of the precise choice of $x_1, x_2$ solving $x = [x_1, x_2]$. The nonzero Lie brackets of $\mathfrak{g}$ are given by
\[
[A_x, A_y] = A_{[x,y]} \quad\text{and}\quad [A_x, B_y] = B_{[x,y]}.
\]

The inner product is then given by
\begin{align*}
(A_x, A_y) &= \langle [x_1, x_2, y_1], y_2\rangle = -\langle x, y_1\rangle \langle v, y_2\rangle = 0,\\
(A_x, B_y) &= \langle [x_1, x_2, u], y\rangle = \langle x, y\rangle,\\
(B_x, B_y) &= \langle [u, x, u], y\rangle = 0,
\end{align*}
whence we see that the inner product is again split, as expected. Next, we consider the $C_{2d}$ 3-algebras in [11]. These are not metric Lie 3-algebras.

Example 7. The underlying vector space $V$ of $C_{2d}$ is the real vector space of off-diagonal hermitian $2d \times 2d$ matrices:
\[
V = \left\{ \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix} \,\middle|\, A \in \operatorname{Mat}_d(\mathbb{C}) \right\},
\]
with $A^*$ the hermitian adjoint and with scalar product given by the trace of the product, which agrees with (twice) the real part of the natural hermitian inner product on the space of complex $d \times d$ matrices:
\[
\left\langle \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}, \begin{pmatrix} 0 & B \\ B^* & 0 \end{pmatrix} \right\rangle = \operatorname{tr}(AB^* + A^*B) = 2 \operatorname{Re} \operatorname{tr} A^*B.
\]
The 3-bracket is defined by $[x,y,z] := [[x,y]\tau, z]$, where $\tau = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, for all $x, y, z \in V$, and where $[-,-]$ is the matrix commutator. Letting $x = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$, $y = \begin{pmatrix} 0 & B \\ B^* & 0 \end{pmatrix}$ and $z = \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}$, a calculation reveals that
\[
[x,y,z] = \begin{pmatrix} 0 & (AB^* - BA^*)C + C(A^*B - B^*A) \\ (B^*A - A^*B)C^* + C^*(BA^* - AB^*) & 0 \end{pmatrix},
\]
whence the action of $D(x,y)$ on $z$ is induced from the action of $D(x,y)$ on $C$, which is a linear combination of left multiplication by the skewhermitian $d \times d$ matrix $AB^* - BA^*$ and right multiplication by the skewhermitian $d \times d$ matrix $A^*B - B^*A$. Let us decompose them into traceless + scalar matrices as follows
\[
AB^* - BA^* = S(A,B) + \tfrac{i}{d}\, \alpha(A,B)\, 1_d \quad\text{and}\quad A^*B - B^*A = S(A^*,B^*) - \tfrac{i}{d}\, \alpha(A,B)\, 1_d,
\]
where $S(A,B)$ and $S(A^*,B^*)$ are traceless, whence in $\mathfrak{su}(d)$, and $\alpha(A,B) = 2 \operatorname{Im} \operatorname{tr} AB^* \in \mathbb{R}$. Notice as well that
\[
(AB^* - BA^*)C + C(A^*B - B^*A) = S(A,B)C + C S(A^*,B^*),
\]
whence the Lie algebra $\mathfrak{g}$ is isomorphic to $\mathfrak{su}(d) \oplus \mathfrak{su}(d)$, acting on $V$ as the underlying real representation of $(d, \bar d) \oplus (\bar d, d)$, which we denote $[\![(d, \bar d)]\!]$. In other words, to the generalised metric Lie 3-algebra $C_{2d}$ one can associate the pair $(\mathfrak{su}(d) \oplus \mathfrak{su}(d), [\![(d, \bar d)]\!])$. Notice finally that the invariant inner product on $\mathfrak{su}(d) \oplus \mathfrak{su}(d)$ has split signature, being given by
\[
(X_L \oplus X_R, Y_L \oplus Y_R) = \operatorname{tr}(X_L Y_L - X_R Y_R),
\]
for $X_L, Y_L, X_R, Y_R \in \mathfrak{su}(d)$.

It is possible to modify this example in order to construct 3-algebras denoted $C_{m+n}$, where the underlying vector space is the space of hermitian $(m+n) \times (m+n)$ matrices of the form
\[
\begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix},
\]
where $A$ is a complex $m \times n$ matrix and its hermitian adjoint $A^*$ is therefore a complex $n \times m$ matrix. Mutatis mutandis the same construction as above gives rise to a generalised metric Lie 3-algebra to which one may associate the pair $(\mathfrak{su}(m) \oplus \mathfrak{su}(n) \oplus \mathfrak{u}(1), [\![(m, \bar n)]\!])$. The inner product on $\mathfrak{su}(m) \oplus \mathfrak{su}(n) \oplus \mathfrak{u}(1)$ is again indefinite, given by $(X_L \oplus X_R, Y_L \oplus Y_R) = \operatorname{tr}(X_L Y_L - X_R Y_R)$, for $X_L, Y_L \in \mathfrak{su}(m)$ and $X_R, Y_R \in \mathfrak{su}(n)$, whereas the inner product on the $\mathfrak{u}(1)$ factor is either positive-definite or negative-definite depending on whether $m < n$ or $m > n$, respectively. Of course, if $m = n$ we are back in the original case, in which the $\mathfrak{u}(1)$ factor is absent since it acts trivially on the space of matrices.

Finally, a further explicit example of a generalised metric Lie 3-algebra appears in [31] and is built out of the octonions. It was shown in [32] that it satisfies the axioms (CS1)-(CS3). This example may be deconstructed into the pair $(\mathfrak{g}_2, \mathbb{O})$, where $\mathbb{O}$ denotes the octonions, which is a reducible, faithful, orthogonal representation of $\mathfrak{g}_2$. The resulting 3-algebra has a nondegenerate centre spanned by $1 \in \mathbb{O}$. Quotienting by the centre gives another generalised metric Lie 3-algebra associated with the pair $(\mathfrak{g}_2, \operatorname{Im}\mathbb{O})$, with $\operatorname{Im}\mathbb{O}$ the 7-dimensional representation of imaginary octonions.

2.2. Reconstruction. Now let $\mathfrak{g}$ be a metric real Lie algebra and let $V$ be a faithful real orthogonal representation of $\mathfrak{g}$. We will let $(-,-)$ and $\langle -,-\rangle$ denote the inner products on $\mathfrak{g}$ and $V$, respectively. They are both invariant under the corresponding actions of $\mathfrak{g}$. We will define a 3-bracket on $V$ as follows. We start by defining a bilinear map $D : V \times V \to \mathfrak{g}$, by transposing the $\mathfrak{g}$-action. Indeed, given $x, y \in V$, let $D(x,y) \in \mathfrak{g}$ be defined by
\[
(X, D(x,y)) = \langle X \cdot x, y\rangle \quad \forall X \in \mathfrak{g}, \tag{9}
\]
where $\cdot$ stands for the action of $\mathfrak{g}$ on $V$.

Remark 8. This construction is reminiscent of the definition of the Dirac current associated to two spinors, which plays such an important rôle in the construction of the Killing superalgebra of a supersymmetric supergravity background. (See, e.g., [33].) In that case, the rôle of $V$ is played by spinors, whereas that of $\mathfrak{g}$ by vectors, with $\cdot$ being the Clifford action.

Lemma 9. $D$ is surjective onto $\mathfrak{g}$.


Proof. We will first show that the image of D is an ideal. We do this because this result is slightly more general, as it does not use the faithfulness of the action of g on V , and results in a useful explicit formula. The fact that the image of D is an ideal follows by an explicit computation. Let X ∈ g and v, w ∈ V . Then for all Y ∈ g we have since g is metric ([D(v, w), X ], Y ) = (D(v, w), [X, Y ]) = [X, Y ] · v, w by (9) X Y = · Y · v, w − · X · v, w since V is a g-module = − Y · v, X · w − Y · X · v, w since X ∈ so(V ) = − (D(v, X · w), Y ) − (D(X · v, w), Y ) again by (9), whence abstracting Y , [X, D(v, w)] = D(X · v, w) + D(v, X · w).

(10)

Of course, this just says that D is g-equivariant. Now let X ∈ g be perpendicular to the image of D. This means that (X, D(v, w)) = 0 for all v, w ∈ V , or equivalently that X · v, w = 0 for all v, w ∈ V . Nondegeneracy of the inner product on V implies that X · v = 0 for all v ∈ V , which in turn implies that X = 0 since the action of g on V is faithful. Finally, nondegeneracy of the inner product on g says that (im D)⊥ = 0 implies surjectivity. Now let us define a trilinear map V × V × V → V by (x, y, z) → [x, y, z] := D(x, y) · z. We claim that this trilinear map turns V into a generalised metric Lie 3-algebra. Proposition 10. The 3-bracket [x, y, z] = D(x, y) · z on V obeys the axioms (1), (2) and (3) of Definition 1. Proof. The first axiom (1) follows from the fact that g preserves the inner product on V . The second axiom is the symmetry of the inner product on g. Indeed, according to (9), (D(x, y), D(z, w)) = D(x, y) · z, w = [x, y, z], w, whereas (D(z, w), D(x, y)) = D(z, w) · x, y = [z, w, x], y. The equality of both of these expressions is precisely axiom (2). Finally, we prove the fundamental identity (3) by substituting X = D(x, y) into Eq. (10) in the proof of Lemma 9 and applying both sides of the equation to z. An isomorphism ϕ : V → W of generalised metric Lie 3-algebras is a linear map such that for all x, y, z ∈ V , [ϕx, ϕy, ϕz]W = ϕ[x, y, z]V

and

ϕx, ϕyW = x, yV .

This last condition induces a Lie algebra isomorphism so(V ) → so(W ) by X → ϕ ◦ X ◦ ϕ −1 which relates the Lie algebras gV and gW by gW = ϕ ◦ gV ◦ ϕ −1 . We say that pairs (gV , V ) and (gW , W ) are isomorphic if there is an isometry ϕ : V → W

On the Lie-Algebraic Origin of Metric 3-Algebras

881

which relates gV < so(V) and gW < so(W) by gW = ϕ ◦ gV ◦ ϕ⁻¹ and such that this is an isometry of metric Lie algebras. It then follows that isomorphic 3-algebras give rise to isomorphic pairs, and conversely. It is clear that deconstruction and reconstruction are mutually inverse procedures, whence we have proved the following

Theorem 11. There is a one-to-one correspondence between isomorphism classes of generalised metric Lie 3-algebras V and isomorphism classes of pairs (g, V), where g is a metric real Lie algebra and V is a faithful orthogonal g-module.

This reduces the classification problem of generalised metric Lie 3-algebras to that of (conjugacy classes of) metric Lie subalgebras g < so(V); in other words, of pairs (g, b), where g < so(V) and b is an ad-invariant inner product on g. It is important to remark that the inner product on g is an essential part of the data and that, in fact, it need not be induced from an inner product on so(V). To illustrate this, consider the following example.

Example 12. Consider the pair (so(4), R⁴) of Example 4, but where the inner product on so(4) is given by the Killing form, so that it is negative-definite instead of split. It is easy to see that relative to an orthogonal basis (e_1, . . . , e_4) for R⁴, the 3-bracket is now given by [e_i, e_j, e_k] = δ_jk e_i − δ_ik e_j, which, since [x, y, z] + cyclic = 0, defines on R⁴ the structure of a Lie triple system.

2.3. Recognition. We saw above that there are two extreme cases of generalised metric Lie 3-algebras: the metric Lie 3-algebras of [34,35] and the (metric) Lie triple systems [36–38]. It is possible to characterise the Lie triple systems among these 3-algebras as those for which on g ⊕ V one can define the structure of a Z2-graded metric Lie algebra as follows. We declare g to be the even subspace and V to be the odd subspace, with the following Lie brackets extending those of g:

$$[D(x, y), z] = [x, y, z] \qquad\text{and}\qquad [x, y] = D(x, y).$$

Notice that since D(x, y) = −D(y, x), the odd-odd bracket is skewsymmetric and hence this construction leads to a Lie algebra and not a Lie superalgebra. Being Z2-graded, there are four components to the Jacobi identity:
• the even-even-even (or 000) component is the Jacobi identity for g;
• the even-even-odd (or 001) component is simply the fact that V is a g-module;
• the even-odd-odd (or 011) component is the g-invariance of D, which is the content of Eq. (10); and
• the odd-odd-odd (or 111) component is

$$[[u, v], w] + \text{cyclic} = [D(u, v), w] + \text{cyclic} = [u, v, w] + \text{cyclic} = 0,$$

which singles out the case where V is a Lie triple system with g ⊕ V its embedding Lie algebra, as explained for example in [37].
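As a quick sanity check of the 111 component for the bracket of Example 12, the following minimal sketch extends [e_i, e_j, e_k] = δ_jk e_i − δ_ik e_j trilinearly to [x, y, z] = ⟨y, z⟩ x − ⟨x, z⟩ y and evaluates the cyclic sum on random integer vectors. The function names `bracket` and `cyclic_sum` are ours and purely illustrative; the orthogonal basis is assumed orthonormal.

```python
import random

def bracket(x, y, z):
    """3-bracket of Example 12, extended trilinearly: [x, y, z] = <y, z> x - <x, z> y."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    yz, xz = dot(y, z), dot(x, z)
    return [yz * xi - xz * yi for xi, yi in zip(x, y)]

def cyclic_sum(x, y, z):
    """[x, y, z] + [y, z, x] + [z, x, y], the 111 component of the Jacobi identity."""
    terms = (bracket(x, y, z), bracket(y, z, x), bracket(z, x, y))
    return [sum(c) for c in zip(*terms)]

random.seed(0)
# Three random integer vectors in R^4 (integers keep the arithmetic exact).
x, y, z = ([random.randint(-5, 5) for _ in range(4)] for _ in range(3))
print(cyclic_sum(x, y, z))  # → [0, 0, 0, 0]
```

The three pairs of terms cancel identically, so the result does not depend on the random seed.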


P. de Medeiros, J. Figueroa-O’Farrill, E. Méndez-Escobar, P. Ritter

We define an inner product on g ⊕ V by using the g-invariant inner products on g and V, and declaring that g ⊥ V. One checks that this inner product on g ⊕ V is ad-invariant, turning g ⊕ V into a metric Lie algebra.

Example 13. For the Lie triple system of Example 12, the embedding Lie algebra is isomorphic to so(5), whence the Lie triple system corresponds to the symmetric space SO(5)/SO(4), which is the round 4-sphere.

3. The Hermitian 3-Algebras of Bagger–Lambert

In [10], Bagger and Lambert constructed the three-dimensional N = 6 superconformal field theory of [6] from a hermitian vector space (V, h) with a 3-bracket [x, y; z] defined as follows. Let (V, h) be a finite-dimensional complex hermitian vector space, where h is a nondegenerate sesquilinear map h : V × V → C obeying h(v, w) = $\overline{h(w, v)}$ and h(ζv, w) = ζ h(v, w), hence h(v, ζw) = $\bar\zeta$ h(v, w), for all v, w ∈ V and ζ ∈ C. A hermitian structure defines a complex antilinear map V → V∗ by w ↦ w♭ := h(−, w), which is nevertheless an isomorphism of the underlying real vector spaces. The 3-bracket [x, y; z] is complex linear in the first two entries and complex antilinear in the third and obeys the following properties for all u, v, w, x, y, z ∈ V:

(BL1) the skewsymmetry axiom:

$$[x, y; z] = -[y, x; z]; \tag{11}$$

(BL2) the symmetry axiom:

$$h([x, y; z], w) = h(y, [z, w; x]); \tag{12}$$

(BL3) and a fundamental identity:

$$[x, [z, u; w]; v] - [x, [z, u; v]; w] + [x, z; [w, v; u]] - [x, u; [w, v; z]] = 0. \tag{13}$$

Mutatis mutandis, axioms (11) and (12) are precisely Eq. (35) in [10], whereas the fundamental identity (13) is precisely Eq. (36) in [10]. Notice that axioms (11) and (12) together imply

$$h([x, y; z], w) = -h([x, y; w], z). \tag{14}$$

We will find it convenient to rewrite the fundamental identity.

Lemma 14. The fundamental identity (13) is equivalent to

$$[[z, v; w], x; y] - [[z, x; y], v; w] - [z, [v, x; y]; w] + [z, v; [w, y; x]] = 0. \tag{15}$$

Proof. First we take the hermitian inner product of Eq. (15) with u to obtain

$$h([[z, v; w], x; y], u) - h([[z, x; y], v; w], u) - h([z, [v, x; y]; w], u) + h([z, v; [w, y; x]], u) = 0,$$

and then bring this expression to a standard form without nested brackets. To this end we use axiom (11) on the first two terms, axiom (12) on all but the last term of the resulting expression and then relation (14) on the last term to obtain

$$h([z, v; w], [u, y; x]) - h([z, x; y], [u, w; v]) - h([v, x; y], [w, u; z]) + h([v, z; u], [w, y; x]) = 0. \tag{16}$$

We now bring Eq. (13) to a similar normal form, by taking the hermitian inner product


with y to obtain

$$h([x, [z, u; w]; v], y) - h([x, [z, u; v]; w], y) + h([x, z; [w, v; u]], y) - h([x, u; [w, v; z]], y) = 0,$$

and use axiom (12) on the first two terms and (14) on the last two terms to obtain

$$h([z, u; w], [v, y; x]) - h([z, u; v], [w, y; x]) - h([x, z; y], [w, v; u]) + h([x, u; y], [w, v; z]) = 0.$$

Comparing with (16), we see that they are the same expression with u, v interchanged after rearranging and using (11).

Just as we did for the generalised metric Lie 3-algebras of Sect. 2, we will associate a pair (g, V) consisting of a metric real Lie algebra g of which V is a complex unitary representation. In trying to reconstruct the 3-algebra from this data, we will actually arrive at a more general class of hermitian 3-algebras. The hermitian 3-algebras of [10] will arise out of (g, V) precisely when g ⊕ [[V]] admits the structure of a Lie superalgebra into whose complexification V embeds in a manner similar to the case of Lie triple systems in Sect. 2.3.

3.1. Deconstruction. We start by defining a sesquilinear map D : V × V → End V from the 3-bracket by D(v, w)z := [z, v; w].

Lemma 15. The image of D is a Lie subalgebra of gl(V).

Proof. By definition,

$$\begin{aligned}
[D(x, y), D(v, w)]z &= D(x, y)[z, v; w] - D(v, w)[z, x; y] \\
&= [[z, v; w], x; y] - [[z, x; y], v; w] \\
&= [z, [v, x; y]; w] - [z, v; [w, y; x]] && \text{by (15).}
\end{aligned}$$

In terms of D, this equation becomes

$$[D(x, y), D(v, w)]z = D([v, x; y], w)z - D(v, [w, y; x])z. \tag{17}$$

Finally, abstracting z we see that the image of D closes under the commutator and is hence a Lie subalgebra of gl(V).

We let g denote the span of the D(x, y) in End V. It is a complex Lie algebra. Conditions (11) and (12) together imply that

$$h(D(x, z)y, w) - h(y, D(z, x)w) = 0,$$

which we would like to interpret as a unitarity condition. We have to be careful, though, since g is a complex Lie algebra and hence cannot preserve a hermitian inner product. The notion of unitarity for complex Lie algebras says that

$$h(X y, w) + h(y, c(X) w) = 0,$$


where c is a conjugation on g; that is, c is a complex antilinear, involutive automorphism of g. These conditions on c guarantee that its fixed subspace gc := {X ∈ g | c(X) = X} < u(V) is a real Lie algebra, said to be a real form of g, which then does leave h invariant. This suggests defining c : g → g by cD(x, y) = −D(y, x). From D(ix, y) = i D(x, y) and D(x, iy) = −i D(x, y) for all x, y ∈ V, it follows that c(D(ix, y)) = c(i D(x, y)), whereas −D(y, ix) = i D(y, x) = −i c(D(x, y)), which shows that c can be extended to a complex antilinear map on all of g. It is clear moreover that c is involutive. Finally, it remains to show that c is an automorphism. To do this we compute

$$\begin{aligned}
[cD(x, y), cD(u, v)] &= [-D(y, x), -D(v, u)] \\
&= [D(y, x), D(v, u)] \\
&= D(D(y, x)v, u) - D(v, D(x, y)u) \\
&= c\bigl(D(D(x, y)u, v) - D(u, D(y, x)v)\bigr) \\
&= c[D(x, y), D(u, v)].
\end{aligned}$$

The real form gc is then spanned by E(x, y) := D(x, y) − D(y, x) for all x, y ∈ V. For example, if (e_a) is a complex basis for V, then gc is spanned by D(e_a, e_b) − D(e_b, e_a) and i(D(e_a, e_b) + D(e_b, e_a)). We will now show that gc is a metric Lie algebra.

Proposition 16. The bilinear form on gc defined by (E(x, y), E(u, v)) := Re h(E(x, y)u, v) is symmetric, nondegenerate and ad-invariant.

Proof. We prove each property in turn. To prove symmetry we simply calculate:

$$\begin{aligned}
h(E(x, y)u, v) &= h(D(x, y)u, v) - h(D(y, x)u, v) \\
&= h([u, x; y], v) - h([u, y; x], v) \\
&= h(x, [y, v; u]) - h(y, [x, v; u]) && \text{using (12)} \\
&= h(x, D(v, u)y) - h(y, D(v, u)x) \\
&= h(D(u, v)x, y) - h(D(v, u)x, y),
\end{aligned}$$

whence taking real parts we find Re h(E(x, y)u, v) = Re h(E(u, v)x, y). To prove nondegeneracy, let us assume that some linear combination X = Σ_i E(x_i, y_i) is orthogonal to all E(u, v) for u, v ∈ V:

$$\operatorname{Re} h(X u, v) = 0.$$

Now, since h is nondegenerate, so is Re h because Im h(x, y) = Re h(−ix, y), whence this means that Xu = 0 for all u, showing that the endomorphism X = 0. Finally, we show that it is ad-invariant. A simple calculation using Eq. (17) shows that

$$[E(x, y), E(u, v)] = E(E(x, y)u, v) + E(u, E(x, y)v),$$


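whence

$$\begin{aligned}
(E(z, w), [E(x, y), E(u, v)]) &= (E(z, w), E(E(x, y)u, v) + E(u, E(x, y)v)) \\
&= \operatorname{Re} h(E(z, w)E(x, y)u, v) + \operatorname{Re} h(E(z, w)u, E(x, y)v) \\
&= \operatorname{Re} h(E(z, w)E(x, y)u, v) - \operatorname{Re} h(E(x, y)E(z, w)u, v) \\
&= \operatorname{Re} h([E(z, w), E(x, y)]u, v) \\
&= ([E(z, w), E(x, y)], E(u, v)).
\end{aligned}$$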

In summary, we have extracted a metric real Lie algebra gc and a unitary representation (V, h) from the hermitian 3-algebra in [10]. The Lie algebra gc is the gauge algebra in the N = 6 theory. We shall illustrate this construction with the explicit example in [10], which is very similar in spirit to Example 7.

Example 17. Let V denote the space of m × n complex matrices. For all x, y, z ∈ V, define [x, y; z] := yz∗x − xz∗y, where z∗ denotes the hermitian adjoint of z. The generators of gc are E(x, y) = D(x, y) − D(y, x), where D(y, z)x = [x, y; z]. The action of E(y, z) on V is given by

$$E(y, z)x = D(y, z)x - D(z, y)x = yz^*x - xz^*y - zy^*x + xy^*z = x(y^*z - z^*y) + (yz^* - zy^*)x,$$

whence it consists of a linear combination of left multiplication by the skewhermitian m × m matrix yz∗ − zy∗ and right multiplication by the skewhermitian n × n matrix y∗z − z∗y. Let us decompose them into traceless + scalar matrices as follows:

$$yz^* - zy^* = A(y, z) + \tfrac{i}{m}\,\alpha(y, z)\,1_m \qquad\text{and}\qquad y^*z - z^*y = B(y, z) - \tfrac{i}{n}\,\alpha(y, z)\,1_n,$$

where A(y, z) and B(y, z) are traceless, whence in su(m) and su(n), respectively, and α(y, z) = 2 Im tr yz∗. Substituting these into the action of E(y, z) on x, we find

$$E(y, z)x = A(y, z)x + xB(y, z) + i\alpha(y, z)\left(\tfrac{1}{m} - \tfrac{1}{n}\right)x.$$

The hermitian inner product on V is given by h(x, y) = 2 tr xy∗, where the factor of 2 is for later convenience. The invariant inner product on gc in Proposition 16 is given by polarizing

$$(E(x, y), E(x, y)) = \operatorname{tr} A(x, y)^2 - \operatorname{tr} B(x, y)^2 - \alpha(x, y)^2\left(\tfrac{1}{m} - \tfrac{1}{n}\right).$$

Therefore we see that if m = n, then gc ≅ su(n) ⊕ su(n) acting on V ≅ Cⁿ ⊗ (Cⁿ)∗, which is the bifundamental (n, n), and the inner product is given by

$$\langle X_L \oplus X_R, Y_L \oplus Y_R \rangle = \operatorname{tr}(X_L Y_L) - \operatorname{tr}(X_R Y_R),$$

for X_L, Y_L, X_R, Y_R ∈ su(n), with the traces in the fundamental. If m ≠ n, then gc ≅ su(m) ⊕ su(n) ⊕ u(1), which is the quotient of u(m) ⊕ u(n) by the kernel of its action on V. The inner product on the semisimple part is the same as in the case m = n, whereas the inner product on the centre is positive-definite (respectively, negative-definite) according to whether m < n (respectively, m > n).
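The claims of Example 17 lend themselves to a direct numerical check. The sketch below, written with plain Python lists so that it needs no external library, tests axioms (11), (12) and the fundamental identity (13) for the matrix 3-bracket, as well as the formula for E(y, z)x, on random 2 × 3 matrices with small Gaussian-integer entries. All helper names (`mul`, `dag`, `br`, `size`, `rand`) are our own illustrative choices.

```python
import random

def mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dag(a):
    """Hermitian adjoint (conjugate transpose)."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def add(a, b):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(a, b)]

def br(x, y, z):
    """3-bracket of Example 17: [x, y; z] = y z* x - x z* y."""
    return sub(mul(mul(y, dag(z)), x), mul(mul(x, dag(z)), y))

def h(x, y):
    """Hermitian inner product h(x, y) = 2 tr(x y*)."""
    p = mul(x, dag(y))
    return 2 * sum(p[i][i] for i in range(len(p)))

def size(a):
    """Largest absolute value of an entry of the matrix a."""
    return max(abs(e) for row in a for e in row)

def rand(m, n):
    """Random m x n matrix with small Gaussian-integer entries."""
    return [[complex(random.randint(-3, 3), random.randint(-3, 3))
             for _ in range(n)] for _ in range(m)]

random.seed(1)
m, n = 2, 3
x, y, z, u, v, w = (rand(m, n) for _ in range(6))

# (11): skewsymmetry in the first two entries.
ax11 = size(add(br(x, y, z), br(y, x, z)))
# (12): h([x, y; z], w) = h(y, [z, w; x]).
ax12 = abs(h(br(x, y, z), w) - h(y, br(z, w, x)))
# (13): the fundamental identity, term by term.
fi = sub(add(sub(br(x, br(z, u, w), v), br(x, br(z, u, v), w)),
             br(x, z, br(w, v, u))), br(x, u, br(w, v, z)))
# E(y, z) x = (y z* - z y*) x + x (y* z - z* y).
lhs = sub(br(x, y, z), br(x, z, y))
rhs = add(mul(sub(mul(y, dag(z)), mul(z, dag(y))), x),
          mul(x, sub(mul(dag(y), z), mul(dag(z), y))))
print(ax11, ax12, size(fi), size(sub(lhs, rhs)))
```

All four printed residuals should vanish up to rounding. Such a check is of course no substitute for the algebraic argument, but it guards against sign errors when transcribing (13).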


3.2. Reconstruction. We would like to reconstruct a hermitian 3-algebra out of a pair (k, V), where k is a (real) metric Lie algebra and (V, h) is a faithful (complex) unitary representation, as we did for the generalised metric Lie 3-algebras in Sect. 2.2. In the attempt, we will find that we obtain a more general class of algebras than the hermitian 3-algebras of [10]. This prompts us to understand the conditions on (k, V) which will guarantee that we do obtain the class of hermitian 3-algebras of [10]. We will see in Sect. 3.3 that this condition says that k ⊕ [[V]] admits the structure of a Lie superalgebra satisfying some natural properties.

Let k be a real metric Lie algebra with invariant inner product (−, −) and let (V, h) be a complex hermitian vector space admitting a faithful unitary action of k:

$$h(X \cdot v, w) + h(v, X \cdot w) = 0 \tag{18}$$

for all X ∈ k and v, w ∈ V.

Let kC = C ⊗R k denote the complexification of k, turned into a complex Lie algebra by extending the Lie bracket on k complex bilinearly. We also extend the inner product complex bilinearly. As it remains nondegenerate, it turns kC into a complex metric Lie algebra. Furthermore we extend the action of k on V to kC by (X + iY) · v = X · v + iY · v, for all X, Y ∈ k and v ∈ V. As mentioned above, a complex Lie algebra cannot leave a hermitian inner product invariant. Instead we have that

$$h(\mathbb{X} \cdot v, w) + h(v, \overline{\mathbb{X}} \cdot w) = 0 \tag{19}$$

for all 𝕏 ∈ kC and v, w ∈ V, where for 𝕏 = X + iY with X, Y ∈ k we write $\overline{\mathbb{X}}$ = X − iY. Indeed,

$$\begin{aligned}
h(\mathbb{X} \cdot v, w) &= h((X + iY) \cdot v, w) \\
&= h(X \cdot v, w) + i\,h(Y \cdot v, w) \\
&= -h(v, X \cdot w) - i\,h(v, Y \cdot w) && \text{by equation (18)} \\
&= -h(v, X \cdot w) + h(v, iY \cdot w) && \text{by sesquilinearity of } h \\
&= -h(v, (X - iY) \cdot w) \\
&= -h(v, \overline{\mathbb{X}} \cdot w).
\end{aligned}$$
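To see concretely why the conjugate appears in Eq. (19), here is a minimal sketch taking k = u(2) acting on V = C² with the standard hermitian product; this choice of pair, the sample matrices and the helper names are our own illustrative assumptions. For skew-hermitian X and Y, the element X + iY satisfies (19) only when the second slot carries X − iY.

```python
def matvec(a, v):
    """Matrix times column vector."""
    return [sum(p * q for p, q in zip(row, v)) for row in a]

def h(v, w):
    """Standard hermitian product on C^2, linear in v and antilinear in w."""
    return sum(p * q.conjugate() for p, q in zip(v, w))

def comb(a, c, b):
    """Entrywise a + c * b for matrices a, b and a scalar c."""
    return [[p + c * q for p, q in zip(r, s)] for r, s in zip(a, b)]

# Two skew-hermitian matrices, i.e. elements of u(2) (illustrative values).
X = [[1j, 2 + 1j], [-2 + 1j, -3j]]
Y = [[2j, 1 - 1j], [-1 - 1j, 1j]]
v, w = [1 + 2j, 3 - 1j], [2 - 1j, -1 + 1j]

XX = comb(X, 1j, Y)      # X + iY in the complexification
XXbar = comb(X, -1j, Y)  # its conjugate X - iY

eq18 = h(matvec(X, v), w) + h(v, matvec(X, w))          # Eq. (18)
eq19 = h(matvec(XX, v), w) + h(v, matvec(XXbar, w))     # Eq. (19)
naive = h(matvec(XX, v), w) + h(v, matvec(XX, w))       # (18) applied naively
print(eq18, eq19, naive)
```

The first two quantities vanish, whereas the third equals 2i h(Y · v, w) and is nonzero for generic data. Note that in this matrix picture the conjugate X − iY is minus the hermitian adjoint of X + iY, not its entrywise complex conjugate.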

Lemma 18. V remains a faithful representation of kC.

Proof. Let 𝕏 = X + iY ∈ kC be such that 𝕏 · v = 0 for all v ∈ V. We want to show that 𝕏 = 0. Taking the complex conjugate of that equation says that $\overline{\mathbb{X}}$ · v = 0 for all v ∈ V. Therefore the real and imaginary parts of 𝕏 satisfy the same equation: X · v = 0 and Y · v = 0 for all v ∈ V. Since X, Y ∈ k and k acts faithfully on V, X = Y = 0 and hence 𝕏 = 0.

Let us define a sesquilinear map D : V × V → kC as follows. Let x, y ∈ V and define D(x, y) by

$$(D(x, y), \mathbb{X}) = h(\mathbb{X} \cdot x, y) \tag{20}$$

for all 𝕏 ∈ kC.

Lemma 19. D is surjective onto kC.


Proof. The proof follows along the same lines as that of Lemma 9. We prove that the image of D is an ideal by explicit computation. Let 𝕏 ∈ kC and v, w ∈ V. Then for all 𝕐 ∈ kC we have

$$\begin{aligned}
([D(v, w), \mathbb{X}], \mathbb{Y}) &= (D(v, w), [\mathbb{X}, \mathbb{Y}]) && \text{since } k_{\mathbb{C}} \text{ is metric} \\
&= h([\mathbb{X}, \mathbb{Y}] \cdot v, w) && \text{by (20)} \\
&= h(\mathbb{X} \cdot \mathbb{Y} \cdot v, w) - h(\mathbb{Y} \cdot \mathbb{X} \cdot v, w) && \text{since } V \text{ is a } k_{\mathbb{C}}\text{-module} \\
&= -h(\mathbb{Y} \cdot v, \overline{\mathbb{X}} \cdot w) - h(\mathbb{Y} \cdot \mathbb{X} \cdot v, w) && \text{by (19)} \\
&= -(D(v, \overline{\mathbb{X}} \cdot w), \mathbb{Y}) - (D(\mathbb{X} \cdot v, w), \mathbb{Y}) && \text{again by (20),}
\end{aligned}$$

whence abstracting 𝕐,

$$[\mathbb{X}, D(v, w)] = D(\mathbb{X} \cdot v, w) + D(v, \overline{\mathbb{X}} \cdot w). \tag{21}$$

Now let 𝕏 ∈ kC be perpendicular to the image of D. This means that (𝕏, D(v, w)) = 0 for all v, w ∈ V, or equivalently that h(𝕏 · v, w) = 0 for all v, w ∈ V. Nondegeneracy of h implies that 𝕏 · v = 0 for all v ∈ V, which in turn implies that 𝕏 = 0 since the action of kC on V is still faithful. Hence (im D)⊥ = 0, and nondegeneracy of the inner product on kC then implies that D is surjective.

We now define a ternary product V × V × V → V, complex linear in the first and third entries and complex antilinear in the second, by

$$(x, y, z) \mapsto [z, x; y] := D(x, y) \cdot z, \tag{22}$$

where the notation has been chosen to ease the comparison with the 3-bracket in [10].

Lemma 20. In terms of the 3-bracket, the unitarity condition (19) reads

$$h([x, v; w], y) = h(x, [y, w; v]). \tag{23}$$

Proof. From Eq. (19), we have that

$$h(D(v, w) \cdot x, y) = -h(x, \overline{D(v, w)} \cdot y). \tag{24}$$

Complex conjugating the definition (20) of D(v, w), we find on the one hand

$$\overline{(D(v, w), \mathbb{X})} = (\overline{D(v, w)}, \overline{\mathbb{X}}),$$

and on the other

$$\overline{(D(v, w), \mathbb{X})} = \overline{h(\mathbb{X} \cdot v, w)} = h(w, \mathbb{X} \cdot v) = -h(\overline{\mathbb{X}} \cdot w, v) = -(D(w, v), \overline{\mathbb{X}}), \qquad \text{using (19)}$$

whence

$$\overline{D(v, w)} = -D(w, v). \tag{25}$$

Substituting this into Eq. (24), h(D(v, w) · x, y) = h(x, D(w, v) · y), which is precisely Eq. (23) when written in terms of the 3-bracket.


The symmetry of the complex inner product on kC is (D(x, y), D(v, w)) = (D(v, w), D(x, y)), which becomes h(D(v, w) · x, y) = h(D(x, y) · v, w), or equivalently

$$h([x, v; w], y) = h([v, x; y], w) \tag{26}$$

in terms of the 3-bracket. Substituting 𝕏 = D(x, y) in Eq. (21) and using Eq. (25) leads to the fundamental identity

$$[D(x, y), D(v, w)] = D(D(x, y) \cdot v, w) - D(v, D(y, x) \cdot w). \tag{27}$$

Applying this equation to z and writing everything in terms of the 3-bracket, we arrive at the fundamental identity in (15).

In summary, the most general 3-algebra constructed from a metric Lie algebra and a faithful complex unitary representation V is defined by a 3-bracket [x, y; z], complex linear in the first two entries and complex antilinear in the third, satisfying Eqs. (23) and (26), which are linear in the 3-bracket, and the fundamental identity (15), which is quadratic. The class of such 3-algebras is strictly a superset of the hermitian 3-algebras in [10]. Indeed, it is easy to see that the antisymmetry condition (11) does not follow from (23) and (26): starting from h([x, y; z], w) and applying (23) and (26) in turn, one arrives at the following symmetry properties:

$$h([x, y; z], w) \overset{(26)}{=} h([y, x; w], z) \overset{(23)}{=} h(y, [z, w; x]) \overset{(26)}{=} h(x, [w, z; y])$$

and no others. In particular, as there are no signs involved, one can never derive a skewsymmetry condition such as (11). Nevertheless we can easily characterise the hermitian 3-algebras in [10] in terms of Lie superalgebras, in agreement with an observation in [12] based on [4,9]. This can be thought of as a purely algebraic explanation of this fact.

3.3. Recognition. Consider the hermitian 3-algebra obtained above from a pair (k, V) consisting of a (real) metric Lie algebra and a (complex) unitary faithful representation (V, h). Recall that it satisfies the axioms (23), (26) and (15). We attempt to construct a Lie (super)algebra on the Z2-graded vector space h = kC ⊕ (V ⊕ V∗), where the even subspace is the span of the D(x, y) and the odd subspace is V ⊕ V∗. The nonzero brackets are given by the following natural maps:

$$\begin{aligned}
[D(x, y), z] &= D(x, y) \cdot z = [z, x; y], \\
[D(x, y), z^\flat] &= -(D(y, x) \cdot z)^\flat = -[z, y; x]^\flat, \\
[x, y^\flat] &= D(x, y),
\end{aligned} \tag{28}$$

where we choose to parametrise V∗ by elements of V: z ↦ z♭ = h(−, z). As before, there are four components to the Jacobi identity. The 000-component is satisfied by virtue of the fact that kC is a Lie algebra, as shown by the fundamental identity (15). Similarly,


the 001-component is satisfied because V ⊕ V∗ is a kC-module. Since the map D is kC-equivariant (again a consequence of the fundamental identity), the 011-component is also satisfied. It remains, as usual, to check the 111-component. There are two such Jacobi identities:

$$[x, [y, z^\flat]] = [[x, y], z^\flat] \pm [y, [x, z^\flat]], \qquad [x^\flat, [y^\flat, z]] = [[x^\flat, y^\flat], z] \pm [y^\flat, [x^\flat, z]],$$

where the top sign is for the case of a Z2-graded Lie algebra and the bottom sign for a Lie superalgebra. It is not hard to show that both identities are equivalent, so we need only investigate the first, say. Using that [x, y] = 0 and the definition of the other brackets, we obtain

$$[x, [y, z^\flat]] = [x, D(y, z)] = -D(y, z) \cdot x = -[x, y; z]$$

and

$$\pm[y, [x, z^\flat]] = \pm[y, D(x, z)] = \mp D(x, z) \cdot y = \mp[y, x; z],$$

whence the condition is precisely [x, y; z] = ±[y, x; z], where the top sign is for a Z2-graded Lie algebra, whereas the bottom is for a Lie superalgebra. This latter case is precisely condition (11) in the 3-algebra of [10], whence we conclude that these 3-algebras are characterised as those hermitian 3-algebras constructed from (k, V) for which kC ⊕ V ⊕ V∗ becomes a (complex) Lie superalgebra.

Remark 21. Alternatively, as discussed in [39], one could try to embed the hermitian 3-algebra of [10] in a graded Lie algebra, but even for the case of Example 17, this graded Lie algebra is seemingly infinite-dimensional and not of Kac–Moody type.

The complex Lie superalgebra kC ⊕ V ⊕ V∗ is the complexification of a real Lie superalgebra s = s0 ⊕ s1 defined as follows. Let s1 denote the real subspace of the complex vector space V ⊕ V∗ of the form

$$s_1 = \{\, x + i x^\flat \mid x \in V \,\}.$$

Computing the Lie bracket of two such vectors we find

$$[x + i x^\flat, y + i y^\flat] = i[x, y^\flat] + i[x^\flat, y] = iD(x, y) + iD(y, x) = D(ix, y) - D(y, ix) = E(ix, y).$$

In other words, [s1, s1] = k, the real form of kC leaving the hermitian inner product invariant. We therefore let s0 = k. Computing the Lie bracket of k and s1 we find

$$[E(x, y), z + i z^\flat] = [E(x, y), z] + i[E(x, y), z]^\flat \in s_1, \tag{29}$$

whence s1 is a k-module. It is in fact isomorphic to the underlying real representation [[V]] of V ⊕ V∗. We conclude that s is a real form of the complex Lie superalgebra kC ⊕ (V ⊕ V∗) defined above.


Let us now show that s is a metric Lie superalgebra. The inner product is obtained by extending the inner product on k given by (see Proposition 16) (E(x, y), E(u, v)) = Re h(E(x, y) · u, v) to all of s by ad-invariance. Since the inner product has even parity, s0 ⊥ s1. The inner product on s1 is then defined as follows. Let us introduce the notation [[z]] = z + i z♭, for z ∈ V. Then

$$\begin{aligned}
([[\,[E(x, y), z]\,]], [[w]]) &= ([E(x, y), [[z]]], [[w]]) && \text{by (29)} \\
&= (E(x, y), [[[z]], [[w]]]) && \text{by ad-invariance} \\
&= (E(x, y), E(iz, w)) && \text{by the above calculation} \\
&= \operatorname{Re} h([E(x, y), iz], w) && \text{by definition} \\
&= \operatorname{Re}\, i\,h([E(x, y), z], w) && \text{by sesquilinearity of } h \\
&= -\operatorname{Im} h([E(x, y), z], w),
\end{aligned}$$

where Im h denotes the imaginary part of the hermitian structure, which is skewsymmetric. Thus we define the inner product on s1 to be

$$([[x]], [[y]]) := -\operatorname{Im} h(x, y),$$

which is symplectic, as we would expect for a ‘super-symmetric’ bilinear form restricted to the odd subspace of a vector superspace. The resulting inner product on s is ad-invariant by construction, making s into a metric Lie superalgebra.

Conversely, we will show that a real metric Lie superalgebra s = s0 ⊕ s1 satisfying the following additional properties:
(1) s1 ≅ [[V]] = [V ⊕ V∗], as an s0-module, where (V, h) is a complex unitary representation of s0, and
(2) the Lie brackets [V, V] and [V∗, V∗] vanish,
defines a 3-algebra obeying the axioms in [10]. To see this, we will reconstruct the 3-bracket by first complexifying the Lie superalgebra to sC = (s0)C ⊕ (s1)C, where (s1)C = V ⊕ V∗, and extending the Lie bracket and the inner product complex-bilinearly. This makes sC into a complex metric Lie superalgebra. As before, we parametrise V∗ in terms of V, via the complex antilinear map v ↦ v♭ = h(−, v). This allows us to define a 3-bracket on V by nesting the Lie brackets [−, −] of sC as follows:

$$[x, y; z] := [[y, z^\flat], x],$$

for all x, y, z ∈ V. As the notation suggests, the 3-bracket is complex linear in the first two entries and antilinear in the third. We will now show that this 3-bracket obeys conditions (11), (12) and (13) at the start of this section. Condition (11) is a simple consequence of the Jacobi identity:

$$[x, [y, z^\flat]] = [[x, y], z^\flat] - [y, [x, z^\flat]],$$

for all x, y, z ∈ V, using that [x, y] = 0. The left-hand side is −[x, y; z], whereas the right-hand side is [y, x; z]. Their equality is precisely (11).


Condition (12) is essentially a consequence of unitarity. Let 𝕏 ∈ kC and v, w ∈ V. Then on the one hand

$$\langle \mathbb{X}, [v, w^\flat] \rangle = \langle [\mathbb{X}, v], w^\flat \rangle = h([\mathbb{X}, v], w) = -h(v, [\overline{\mathbb{X}}, w]),$$

and taking the complex conjugate

$$\overline{\langle \mathbb{X}, [v, w^\flat] \rangle} = -h([\overline{\mathbb{X}}, w], v) = -\langle [\overline{\mathbb{X}}, w], v^\flat \rangle = -\langle \overline{\mathbb{X}}, [w, v^\flat] \rangle,$$

whence

$$\overline{[v, w^\flat]} = -[w, v^\flat]. \tag{30}$$

Using this we now calculate

$$\begin{aligned}
h([x, y; z], w) &= -h([y, x; z], w) && \text{by (11), already shown to hold} \\
&= -h([[x, z^\flat], y], w) && \text{by definition of the 3-bracket} \\
&= h(y, [\overline{[x, z^\flat]}, w]) && \text{by “unitarity”} \\
&= -h(y, [[z, x^\flat], w]) && \text{by (30)} \\
&= -h(y, [w, z; x]) && \text{by definition of the 3-bracket} \\
&= h(y, [z, w; x]) && \text{again by (11),}
\end{aligned}$$

which is precisely condition (12). Finally, the fundamental identity (13) is equivalent, as shown in Lemma 14, to Eq. (15), which in turn is equivalent to Eq. (27). It is therefore sufficient to show that Eq. (27) follows from the Jacobi identity of the superalgebra defined in (28). Starting with the left-hand side of Eq. (27) and expanding, we obtain

$$\begin{aligned}
[D(x, y), D(v, w)] &= [[x, y^\flat], [v, w^\flat]] && \text{by (28)} \\
&= [[[x, y^\flat], v], w^\flat] + [v, [[x, y^\flat], w^\flat]] && \text{by Jacobi} \\
&= [[[x, y^\flat], v], w^\flat] - [v, [[y, x^\flat], w]^\flat] && \text{by (28)} \\
&= D(D(x, y) \cdot v, w) - D(v, D(y, x) \cdot w) && \text{again by (28).}
\end{aligned}$$

In summary, we have proved the following

Theorem 22. There is a one-to-one correspondence between the hermitian 3-algebras of [10] and metric Lie superalgebras (into whose complexification one embeds the 3-algebra) satisfying that the odd subspace is [[V]] ≅ [V ⊕ V∗] for (V, h) a complex unitary faithful representation of the even subalgebra, where [V, V] = 0 = [V∗, V∗].

This is morally in agreement with a statement in [12] based on [4,9] and clarifies the relationship between these works and [10]. In the explicit example of [10], corresponding to Example 17 here, this Lie superalgebra is a “compact” real form of the classical Lie superalgebra A(m − 1, n − 1). Among the classical superalgebras, also C(n + 1) defines a hermitian 3-algebra of the type considered in [10]. For the compact real form of C(n + 1), the even subalgebra is usp(2n) ⊕ u(1) and the odd subspace is the representation [[(2n, 1)]].


4. A General Algebraic Construction

In this section we show that the above reconstructions of 3-algebras are special cases of a general algebraic construction due to Faulkner [25].

4.1. The Faulkner construction. There are two ingredients in this construction: a finite-dimensional metric Lie algebra g and a finite-dimensional representation V. The Lie algebra g, being metric, admits an ad-invariant nondegenerate symmetric bilinear form (−, −). We will assume that g is either real or complex and so is the representation V. The dual representation will be denoted V∗. We will let ⟨−, −⟩ denote the dual pairing between V and V∗. This data allows us to define a linear map D : V ⊗ V∗ → g as follows. If v ∈ V and α ∈ V∗, then D(v, α) ∈ g is defined by

$$(X, D(v, \alpha)) = \langle X \cdot v, \alpha \rangle \quad \text{for all } X \in g, \tag{31}$$

where the · indicates the action of g on a module, here V.

Lemma 23. The image of D is an ideal of g, which is all of g if V is a faithful representation.

Proof. The fact that the image of D is an ideal follows by an explicit computation. Let X ∈ g, v ∈ V and α ∈ V∗. Then for all Y ∈ g we have

$$\begin{aligned}
([D(v, \alpha), X], Y) &= (D(v, \alpha), [X, Y]) && \text{since } g \text{ is metric} \\
&= \langle [X, Y] \cdot v, \alpha \rangle && \text{by (31)} \\
&= \langle X \cdot Y \cdot v, \alpha \rangle - \langle Y \cdot X \cdot v, \alpha \rangle \\
&= -\langle Y \cdot v, X \cdot \alpha \rangle - \langle Y \cdot X \cdot v, \alpha \rangle \\
&= -(D(v, X \cdot \alpha), Y) - (D(X \cdot v, \alpha), Y) && \text{by (31),}
\end{aligned}$$

whence abstracting Y,

$$[X, D(v, \alpha)] = D(X \cdot v, \alpha) + D(v, X \cdot \alpha). \tag{32}$$

Of course, this just says that D is g-equivariant. Now let us assume that V is a faithful representation and let X ∈ g be perpendicular to im D. Then for all v ∈ V and α ∈ V∗,

$$(X, D(v, \alpha)) = \langle X \cdot v, \alpha \rangle = 0.$$

This implies that X · v = 0 for all v ∈ V, which implies in turn that X = 0 because the representation of g on V is faithful. Therefore (im D)⊥ = 0, whence D is surjective.

The map D in (31) defines in turn a trilinear product

$$V \times V^* \times V \to V, \qquad (v, \alpha, w) \mapsto D(v, \alpha) \cdot w. \tag{33}$$

Let Ω ∈ (V ⊗ V∗)^{⊗2} be the tensor defined by

$$\Omega(v, \alpha, w, \beta) := \langle D(v, \alpha) \cdot w, \beta \rangle. \tag{34}$$


Lemma 24. For all α, β ∈ V∗ and v, w ∈ V, Ω(v, α, w, β) = Ω(w, β, v, α). In other words, Ω ∈ S²(V ⊗ V∗).

Proof. This follows from the observation that Ω(v, α, w, β) = (D(v, α), D(w, β)), which is manifestly symmetric.

There is a converse to this result.

Proposition 25. Let V be a finite-dimensional vector space with dual space V∗ and let V × V∗ × V → V be a trilinear product, written (v, α, w) ↦ D(v, α) · w and thereby defining a bilinear map D : V × V∗ → End V. If

$$[D(v, \alpha), D(w, \beta)] = D(D(v, \alpha) \cdot w, \beta) + D(w, D(v, \alpha) \cdot \beta), \tag{35}$$

where

$$\langle w, D(v, \alpha) \cdot \beta \rangle = -\langle D(v, \alpha) \cdot w, \beta \rangle, \tag{36}$$

and

$$\langle D(v, \alpha) \cdot w, \beta \rangle = \langle D(w, \beta) \cdot v, \alpha \rangle, \tag{37}$$

then D is obtained from (g, V) by the above construction, where g < End V is the image of D.

Proof. Equation (35) says that the image g, say, of D in End V is a Lie subalgebra making V into a g-module. Equation (36) says that V∗ is the dual module. Define a bilinear form on g by extending

$$(D(v, \alpha), D(w, \beta)) = \langle D(v, \alpha) \cdot w, \beta \rangle$$

bilinearly to all of g. Equation (37) says that it is symmetric. If X = Σ_i D(v_i, α_i) is such that (X, D(w, β)) = 0 for all w ∈ V and β ∈ V∗, then it follows as above that X = 0, whence the inner product on g is nondegenerate. Finally we must show that it is g-invariant, namely, for all v, w, x ∈ V and α, β, γ ∈ V∗,

$$([D(v, \alpha), D(w, \beta)], D(x, \gamma)) = (D(v, \alpha), [D(w, \beta), D(x, \gamma)]). \tag{38}$$

Using Eq. (35) on the right-hand side, we find

$$\begin{aligned}
(D(v, \alpha), [D(w, \beta), D(x, \gamma)]) &= (D(v, \alpha), D(D(w, \beta) \cdot x, \gamma) + D(x, D(w, \beta) \cdot \gamma)) \\
&= \langle D(v, \alpha) \cdot D(w, \beta) \cdot x, \gamma \rangle + \langle D(v, \alpha) \cdot x, D(w, \beta) \cdot \gamma \rangle \\
&= \langle D(v, \alpha) \cdot D(w, \beta) \cdot x, \gamma \rangle - \langle D(w, \beta) \cdot D(v, \alpha) \cdot x, \gamma \rangle \\
&= \langle [D(v, \alpha), D(w, \beta)] \cdot x, \gamma \rangle \\
&= ([D(v, \alpha), D(w, \beta)], D(x, \gamma)),
\end{aligned}$$

where in the third equality we have used Eq. (36).

In [25] it is shown that many 3-algebras and triple systems can be obtained via this construction, among them (metric) Lie and Jordan triple systems. The results in this paper indicate that we can add the hermitian 3-algebras of [10] and their generalisations to them.
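The construction of Sect. 4.1 can be made concrete in a very small case. The sketch below assumes, purely for illustration and not as an example taken from the text, the pair g = gl(3, R) with the trace form (X, Y) = tr(XY) and V = R³ the defining representation, with V∗ identified with R³ via the dot product. In this case the defining property (31) forces D(v, α) to be the rank-one matrix v αᵀ, and one can check the equivariance (32) and the symmetry of Lemma 24 directly. All function names are ours.

```python
import random

n = 3

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(X, v):
    return [dot(row, v) for row in X]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def form(X, Y):
    """Trace form (X, Y) = tr(XY): ad-invariant and nondegenerate on gl(n)."""
    return sum(dot(X[i], [Y[k][i] for k in range(n)]) for i in range(n))

def dual(X, a):
    """Dual action on V*: <w, X.a> = -<X.w, a>, that is X.a = -X^T a."""
    return [-sum(X[i][j] * a[i] for i in range(n)) for j in range(n)]

def D(v, a):
    """Faulkner map for this pair: the rank-one matrix v a^T."""
    return [[v[i] * a[j] for j in range(n)] for i in range(n)]

def Omega(v, a, w, b):
    """The tensor of Eq. (34): <D(v, a).w, b>."""
    return dot(matvec(D(v, a), w), b)

random.seed(2)
vec = lambda: [random.randint(-4, 4) for _ in range(n)]
X = [vec() for _ in range(n)]
v, w, a, b = vec(), vec(), vec(), vec()

# Commutator [X, D(v, a)] and the right-hand side of Eq. (32).
comm = [[p - q for p, q in zip(r, s)]
        for r, s in zip(matmul(X, D(v, a)), matmul(D(v, a), X))]
equiv = [[p + q for p, q in zip(r, s)]
         for r, s in zip(D(matvec(X, v), a), D(v, dual(X, a)))]

print(form(X, D(v, a)) == dot(matvec(X, v), a))   # defining property (31)
print(comm == equiv)                              # equivariance (32)
print(Omega(v, a, w, b) == Omega(w, b, v, a))     # symmetry of Lemma 24
```

Here Ω(v, α, w, β) = ⟨α, w⟩⟨β, v⟩, which makes the symmetry under (v, α) ↔ (w, β) visible by inspection; integer entries keep all comparisons exact.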


4.2. Embedding Lie (super)algebras. It is a natural question to ask, given Remark 8, under which conditions we have a Lie (super)algebra structure on g ⊕ V ⊕ V∗ defined by the natural actions of g on V and V∗ and by the map D : V × V∗ → g, understood as a Lie bracket. In this section we will begin to answer this question and set the stage for future applications of this idea below.

Let us define a Z2-graded vector space k = k0 ⊕ k1 with k0 = g and k1 = V ⊕ V∗, and let us extend the Lie bracket on k0 to a graded bracket

$$[X, v] = X \cdot v, \qquad [X, \alpha] = X \cdot \alpha, \qquad\text{and}\qquad [v, \alpha] = D(v, \alpha), \tag{39}$$

for all X ∈ g, v ∈ V and α ∈ V∗. The bracket [α, v] is related to [v, α] by a sign which depends on whether we are trying to obtain a Lie superalgebra or a Z2-graded Lie algebra.

Remark 26. In fact, the brackets (39) respect the 3-gradation V∗ ⊕ g ⊕ V, which refines the 2-gradation and can be useful to check the Jacobi identities.

We would like to understand the conditions guaranteeing that the brackets (39) define a Lie (super)algebra on k. This amounts to imposing the relevant Jacobi identities. Being Z2-graded, there are four components to the Jacobi identity. The 000-Jacobi is satisfied by virtue of g being a Lie algebra, whereas the 001-Jacobi follows from the fact that V and V∗ are g-modules. The 011-Jacobi is tantamount to the invariance of D under g, which is precisely what Eq. (32) says. We are left, as usual in these situations, with checking the 111-Jacobi. There are two such Jacobi identities, each with a choice of sign depending on whether we are after a Lie superalgebra or a Z2-graded Lie algebra:

$$D(v, \alpha) \cdot w = \pm D(w, \alpha) \cdot v, \qquad D(v, \alpha) \cdot \beta = \pm D(v, \beta) \cdot \alpha, \tag{40}$$

for all v, w ∈ V and α, β ∈ V∗, where the upper sign is for the case of a Z2-graded Lie algebra and the bottom sign for the case of a Lie superalgebra. Finally, it is not hard to see that both identities are equivalent. Indeed the first equation in (40) says that Ω(v, α, w, β) := ⟨D(v, α) · w, β⟩ defined in (34) is (skew)symmetric in (v, w), whereas the second equation says that it is (skew)symmetric in (α, β). However Lemma 24 says that these two conditions are equivalent.

4.3. Unitary representations. We will now briefly consider a special case of the above construction, in which V possesses an inner product invariant under g and in particular under the D(v, α). We will consider only those kinds of inner products which have a notion of signature, since in order to build lagrangians for manifestly unitary theories, we will want to consider the positive-definite ones. There are three such inner products: real symmetric, complex hermitian and quaternionic hermitian. In all cases, the inner product allows us to identify V and the dual V ∗ ; although in the complex hermitian case this identification is complex antilinear and hence V and V ∗ are inequivalent as representations of g. This means that Eq. (36), once V and V ∗ have been identified, becomes one more condition on D(v, α).


4.3.1. Real orthogonal representations. Here we will take V to be a faithful real orthogonal representation of g. The inner product on V defines in particular musical isomorphisms ♭ : V → V∗ and ♯ : V∗ → V, which are inverses of each other. The bilinear map D : V × V∗ → g of Eq. (31) gives rise to a bilinear map D : V × V → g, defined by (u, v) ↦ D(u, v) = D(u, v♭) and vice versa with D(v, α) = D(v, α♯). Under this dictionary, the construction in Sect. 4.1 coincides with the construction in Sect. 2.2 and hence the resulting 3-algebras are the generalised metric Lie 3-algebras of Definition 1; although they are contained already in Example II of [25].

4.3.2. Complex unitary representations. Here we take (V, h) to be a faithful complex unitary representation of g. The musical map and its inverse,

\[ \flat : V \to V^*, \quad v \mapsto v^\flat \qquad\text{and}\qquad \sharp : V^* \to V, \quad \alpha \mapsto \alpha^\sharp, \]

where v♭(w) = h(w, v), are now complex antilinear, hence they do not provide an isomorphism of the representations V and V∗ but rather of V∗ with the complex-conjugate representation $\overline{V}$. Furthermore, since V is a complex vector space, the bilinear map D of Eq. (31) now maps to the complexification gC = C ⊗R g of g. Using the inner product, this map D : V × V∗ → gC gives rise to a sesquilinear map D : V × V → gC, defined by (u, v) ↦ D(u, v) = D(u, v♭) and vice versa with D(v, α) = D(v, α♯). Under this dictionary, the construction in Sect. 4.1 coincides with the construction in Sect. 3.2 and hence the resulting 3-algebras are the hermitian 3-algebras obtained in that section.

4.3.3. Quaternionic unitary representations. Finally we consider the case where V is a quaternionic unitary representation of g, by which we will mean a complex unitary representation with a compatible g-invariant quaternionic structure. More precisely, let (V, h) be an (even-dimensional) complex hermitian vector space and let J : V → V be a complex antilinear map satisfying

\[ J^2 = -\mathrm{id} \qquad\text{and}\qquad h(Jx, Jy) = h(y, x). \tag{41} \]

Moreover both h and J are invariant under g. Using h and J we can construct a g-invariant complex symplectic structure

\[ \omega(x, y) := h(x, Jy). \tag{42} \]

It is an easy exercise to show that ω is indeed complex bilinear and that ω(x, y) = −ω(y, x). Since ω is clearly g-invariant, it provides a g-equivariant symplectic musical isomorphism ♭ : V → V∗ with inverse ♯ : V∗ → V. In particular, as representations of g, V and V∗ are equivalent. The bilinear map D of Eq. (31) again maps to the complexification gC = C ⊗R g of g. We will make gC into a complex metric Lie algebra by extending the bracket and inner product complex bilinearly. As before we will extend also the action of g on V to an action of gC. Since ω is complex bilinear, it remains invariant under gC. Using ω, the map D : V × V∗ → gC gives rise to a complex bilinear map Dω : V × V → gC, defined by (u, v) ↦ Dω(u, v) = D(u, v♭) and vice versa with D(v, α) = Dω(v, α♯). Because (V, h) is a complex hermitian vector space on which g acts unitarily, the discussion in Sect. 4.3.2 applies. In particular, the hermitian structure h defines a sesquilinear map, which we now denote Dh : V × V → gC in order to distinguish it from Dω, by Dh(u, v) = D(u, v♭), where ♭ here stands for the antilinear musical map defined by h rather than the one defined by ω. The two maps Dω and Dh are related as follows.


Lemma 27. For all u, v ∈ V, Dω(u, v) = Dh(u, J v).

Proof. Given u, v ∈ V and for all X ∈ gC we have

\[ (D_\omega(u, v), X) = \omega(X \cdot u, v). \tag{43} \]

Using Eq. (42), we can rewrite ω(X · u, v) = h(X · u, J v), whence using Eq. (20) for Dh, we obtain (Dω(u, v), X) = h(X · u, J v) = (Dh(u, J v), X). Since this is true for all X, the lemma follows.

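The first thing we notice is that in contrast with the complex hermitian case, Dω(u, v) is now symmetric in its arguments. Indeed,

\begin{align*}
(D_\omega(u, v), X) &= \omega(X \cdot u, v) && \text{by (43)} \\
&= -\omega(u, X \cdot v) && \text{since } \mathfrak{g}_{\mathbb{C}} \text{ preserves } \omega \\
&= \omega(X \cdot v, u) && \text{since } \omega \text{ is skewsymmetric} \\
&= (D_\omega(v, u), X) && \text{again by (43).}
\end{align*}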

In other words, Dω : S²V → gC, and hence defines a complex linear 3-bracket S²V ⊗ V → V by (x, y, z) ↦ [x, y, z] := Dω(x, y) · z, which is symmetric in the first two arguments:

\[ [x, y, z] = [y, x, z] \qquad\text{for all } x, y, z \in V. \tag{44} \]

We may decompose S 2 V ⊗ V as S2 V ⊗ V = S3 V ⊕ V , where V contains the symplectic trace and is therefore not irreducible. Analogous to the case of generalised metric Lie 3-algebras, there are two extreme cases of these algebras: namely those for which the 3-bracket is totally symmetric and those for which it obeys the condition [x, y, z] + [y, z, x] + [z, x, y] = 0.

(45)

These latter triple systems are called anti-Lie triple systems and are discussed in [40, §5]. Just like a Lie triple system can be embedded in a Lie algebra, an anti-Lie triple system can be embedded in a Lie superalgebra. Indeed on the Z2-graded vector space gC ⊕ V, where gC has even parity and V odd parity, we can define the following brackets extending those of gC:

\[ [D_\omega(x, y), z] = [x, y, z] \qquad\text{and}\qquad [x, y] = D_\omega(x, y). \]

Since Dω (x, y) = Dω (y, x) we see that the bracket on V is indeed symmetric. As usual there are four components to the Jacobi identity, of which only the component in S 3 V → V needs to be checked. This component of the Jacobi identity requires [x, [x, x]] = 0 for all x ∈ V , which is equivalent to [x, x, x] = 0, which in turn is equivalent (using the symmetry of Dω ) to the defining condition (28). Furthermore the complex Lie superalgebra gC ⊕ V is metric. We simply use the inner product on gC and the gC -invariant symplectic form ω on V , while declaring gC ⊥ V


as befits an even inner product. Both are gC -invariant, hence the only compatibility condition is precisely Eq. (43). Notice however that V is not a real representation of g, whence this complex Lie superalgebra is not the complexification of a real Lie superalgebra. Three remarks are in order. The first remark is that the emergence of a Lie superalgebra here is a different phenomenon than that of a Lie superalgebra in the hermitian 3-algebras of [10] discussed in Sect. 3.3. First of all, the underlying space of the Lie superalgebra is different, but moreover the condition for the existence of an embedding superalgebra in that case was Eq. (11), which in the notation of this section becomes Dh (x, y) · z = −Dh (z, y) · x. In terms of Dω , this condition says that the 3-bracket [x, y, z] = Dω (x, y) · z obeys [x, y, z] = −[x, z, y].

(46)

However this is inconsistent with the symmetry condition [x, y, z] = [y, x, z] unless the 3-bracket and hence D vanishes identically; indeed,

\[ [x, y, z] \overset{(44)}{=} [y, x, z] \overset{(46)}{=} -[y, z, x] \overset{(44)}{=} -[z, y, x] \overset{(46)}{=} [z, x, y] \overset{(44)}{=} [x, z, y] \overset{(46)}{=} -[x, y, z]. \]
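The chain of equalities above can be checked mechanically. The following is only an illustrative sketch, not part of the original argument: it represents a bracket as a sign together with an ordered triple of symbols, applies the symmetry (44) and the antisymmetry (46) alternately, and records each intermediate term. The function names are hypothetical.

```python
def swap_first_two(term):
    """Apply (44): [a, b, c] = [b, a, c]; the sign is unchanged."""
    sign, (a, b, c) = term
    return sign, (b, a, c)


def swap_last_two(term):
    """Apply (46): [a, b, c] = -[a, c, b]; the sign flips."""
    sign, (a, b, c) = term
    return -sign, (a, c, b)


def run_chain(start=("x", "y", "z")):
    """Alternate (44) and (46) six times, recording every intermediate term."""
    term = (1, start)
    history = [term]
    for step in range(6):
        rule = swap_first_two if step % 2 == 0 else swap_last_two
        term = rule(term)
        history.append(term)
    return history


history = run_chain()
for sign, args in history:
    print(("+" if sign > 0 else "-") + "[" + ", ".join(args) + "]")
```

After six steps the arguments return to their original order with the opposite sign, which is the statement [x, y, z] = −[x, y, z] used in the text.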

The second remark is that the link between Lie superalgebras and triple systems is itself not surprising. Any 2-graded algebra (e.g., a Lie superalgebra) defines a triple system on the elements of odd parity, simply by nesting the product on the algebra: (x, y, z) → [[x y]z], for all x, y, z odd elements and where [x y] denotes the multiplication in the 2-graded algebra. Which type of triple system depends on which type of 2-graded algebra we employ. For example, we have seen that we can obtain Lie triple systems from 2-graded Lie algebras and hermitian 3-algebras and anti-Lie triple systems starting from certain kinds of Lie superalgebras. This is done explicitly for exceptional Lie superalgebras by Okubo [41] in terms of the so-called (−1, −1)-balanced Freudenthal– Kantor triple system. The third and last remark is that, of course, not every anti-Lie triple system is of this form, since remember that V possesses a g-invariant quaternionic structure J , which we will now exploit more fully. The existence of the quaternionic structure is equivalent to the fact that this 3-algebra is a special case of the hermitian 3-algebras of Sect. 4.3.2. This allows us to deduce some new identities by relating the properties of Dω and of Dh via the dictionary in Lemma 27. In particular, since V is a faithful representation, Dω is surjective onto gC since so is Dh . Since J is g-invariant but complex antilinear, it follows that Dω (x, y) ◦ J = J ◦ Dω (x, y).

(47)

Similarly, the identity (25) satisfied by Dh becomes

\[ D_\omega(x, Jy) = -D_\omega(y, Jx), \tag{48} \]

which may be rewritten as

\[ J[x, y, z] = [Jx, Jy, Jz] \tag{49} \]

with the help of Eq. (47).


Using the two conditions (47) and (48), it is not hard to show that the identities (23) and (26), when written in terms of Dω , are automatically satisfied by virtue of the symmetry properties satisfied by the fourth-rank tensor (x, y, z, w) := ω([x, y, z], w). These properties, which follow at once from (44) and from the fact that (x, y, z, w) = (Dω (x, y), Dω (z, w)), are seen to be following: (x, y, z, w) = (y, x, z, w) = (z, w, x, y) = (x, y, w, z),

(50)

whence ∈ S 2 (S 2 V ∗ ). Finally, we come to the fundamental identity. This can be done in any number of ways. One way is to use Lemma 27 to relate the 3-brackets to those of the underlying hermitian 3-algebra as follows: [x, y, z] = Dω (x, y) · z = Dh (x, J y) · z = [z, x; J y], and then rewrite the fundamental identity (15) of the general hermitian 3-algebra in terms of the 3-brackets [x, y, z]. Alternatively, we can simply apply Eq. (35) to z ∈ V , say, and express the result in terms of the 3-brackets [x, y, z]. Either way, we arrive at the following fundamental identity: [v, x, [w, y, z]] − [w, y, [v, x, z]] − [[v, x, w], y, z] − [w, [v, x, y], z] = 0. (51) In summary, the most general 3-algebra constructed out of a metric Lie algebra g and a faithful quaternionic representation (V, h, J ) is defined by a complex trilinear 3-bracket V × V × V → V , satisfying the following minimal set of axioms: (44), (49), (51) and ω([x, y, z], w) = ω([z, w, x], y),

(52)

which is one of the relations in Eq. (50) when written explicitly in terms of the 3-bracket. The simple classical Lie superalgebras of types B(m, n) and D(m, n) can be seen to arise in this way, as the following example shows. This can be thought of as the quaternionic analogue of Example 17. Example 28. Let g = so(m) ⊕ usp(2n). We let U be the complexification of the fundamental representation of so(m); that is, U ∼ = Cm with an invariant real structure or, equivalently, an invariant complex symmetric inner product which we will denote g. Let W be the fundamental representation of usp(2n), so W ∼ = C2n with an invariant quaternionic structure or, equivalently, an invariant complex symplectic structure which we will denote ε. We let V = U ⊗ W be the bifundamental representation. Our aim is to define an anti-Lie triple system on V embedding into the complex simple Lie superalgebra of B or D types, depending on the parity of m: odd or even, respectively. Using ε we can (and will) identify V with the space of complex linear maps W → U : u ⊗ w ∈ V is identified with the linear map (u ⊗ w)(x) := ε(w, x)u. The action of g on V is the following: (X, Y ) · ϕ = X ◦ ϕ − ϕ ◦ Y,

for all (X, Y ) ∈ g and ϕ : W → U .


We extend this action to one of the complexification gC of g. Given ϕ : W → U we can define its transpose ϕ ∗ by g(ϕw, u) = ε(w, ϕ ∗ u),

(53)

for all w ∈ W and u ∈ U . This allows us to define a gC -invariant complex symplectic structure on V by ω(ϕ, ψ) := Tr W (ψ ∗ ◦ ϕ) = TrU (ϕ ◦ ψ ∗ ).

(54)

Now given ϕ, ψ ∈ V we define D(ϕ, ψ) ∈ gC by (D(ϕ, ψ), (X, Y)) = ω ((X, Y) · ϕ, ψ) = ω (X ◦ ϕ − ϕ ◦ Y, ψ) = TrU (X ◦ ϕ ◦ ψ ∗ ) − Tr W (ψ ∗ ◦ ϕ ◦ Y), where the ad-invariant inner product on gC is defined by ((X1 , Y1 ), (X2 , Y2 )) = TrU (X1 ◦ X2 ) − Tr W (Y1 ◦ Y2 ). Writing D(ϕ, ψ) = (D1 , D2 ), with D1 ∈ so(m)C and D2 ∈ usp(2n)C , we see that TrU (D1 ◦ X) − Tr W (D2 ◦ Y) = TrU (X ◦ ϕ ◦ ψ ∗ ) − Tr W (ψ ∗ ◦ ϕ ◦ Y), whence D1 = pr 1 (ϕ ◦ ψ ∗ ) and D2 = pr 2 (ψ ∗ ◦ ϕ), where pr 1 and pr 2 are the orthogonal projections pr 1 : gl(m, C) → so(m)C

and

pr 2 : gl(2n, C) → usp(2n)C .

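A simple calculation using Eq. (53) shows that the transposes of ϕ ◦ ψ∗ and ψ∗ ◦ ϕ relative to the inner products on U and W, respectively, are given by

\[ (\phi \circ \psi^*)^t = -\psi \circ \phi^* \qquad\text{and}\qquad (\psi^* \circ \phi)^t = -\phi^* \circ \psi, \]

where the signs are due to the skewsymmetry of ε. This means that

\[ 2D_1 = \phi \circ \psi^* + \psi \circ \phi^* \qquad\text{and}\qquad 2D_2 = \psi^* \circ \phi + \phi^* \circ \psi, \]

whence

\[ 2D(\phi, \psi) = \left( \phi \circ \psi^* + \psi \circ \phi^*,\ \psi^* \circ \phi + \phi^* \circ \psi \right). \]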

We may (and will) get rid of the annoying factor of 2 by retroactively rescaling the inner product on gC. The 3-bracket on V is then given by

\[ [\phi, \psi, \chi] = D(\phi, \psi) \cdot \chi = (\phi \circ \psi^* + \psi \circ \phi^*) \circ \chi - \chi \circ (\psi^* \circ \phi + \phi^* \circ \psi). \]

It obeys all the identities coming from the Faulkner construction applied to a quaternionic unitary representation and, in addition,

\[ [\phi, \phi, \phi] = 0 \qquad\text{for all } \phi \in V, \]

whence it is an anti-Lie triple system. This means that it embeds in a complex Lie superalgebra on the superspace gC ⊕ V. There is precisely one Lie superalgebra with this data, namely the complex orthosymplectic Lie superalgebra, which is simple and of type B((m − 1)/2, n) if m is odd, and D(m/2, n) if m is even.
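The identities claimed for the 3-bracket of Example 28 can be tested on concrete matrices. The sketch below is an illustration under stated assumptions rather than the authors' computation: it takes the symmetric form on U to be the identity matrix and ε(w, x) = wᵀΩx with Ω = [[0, I], [−I, 0]], so that Eq. (53) gives ϕ∗ = Ω⁻¹ϕᵀ. It works with integer matrices in pure Python so that all checks are exact, and all helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def omega_inverse(n):
    """Inverse of the assumed symplectic matrix Omega = [[0, I], [-I, 0]]."""
    inv = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        inv[i][n + i] = -1
        inv[n + i][i] = 1
    return inv


def star(phi, n):
    """Transpose phi* : U -> W of Eq. (53) for the assumed forms g and eps."""
    return matmul(omega_inverse(n), transpose(phi))


def bracket(phi, psi, chi, n):
    """[phi, psi, chi] = (phi psi* + psi phi*) chi - chi (psi* phi + phi* psi)."""
    d1 = add(matmul(phi, star(psi, n)), matmul(psi, star(phi, n)))
    d2 = add(matmul(star(psi, n), phi), matmul(star(phi, n), psi))
    return add(matmul(d1, chi), matmul(chi, d2), sign=-1)


def is_zero(a):
    return all(entry == 0 for row in a for entry in row)


def random_map(m, n, rng):
    """Random integer m x 2n matrix, i.e. a linear map W -> U."""
    return [[rng.randint(-3, 3) for _ in range(2 * n)] for _ in range(m)]


m, n = 3, 1
rng = random.Random(0)
v, w, x, y, z = (random_map(m, n, rng) for _ in range(5))

# Symmetry (44) in the first two arguments.
symmetric = is_zero(add(bracket(x, y, z, n), bracket(y, x, z, n), sign=-1))
# Anti-Lie condition [phi, phi, phi] = 0.
anti_lie = is_zero(bracket(x, x, x, n))
# Fundamental identity (51).
lhs = bracket(v, x, bracket(w, y, z, n), n)
for term in (bracket(w, y, bracket(v, x, z, n), n),
             bracket(bracket(v, x, w, n), y, z, n),
             bracket(w, bracket(v, x, y, n), z, n)):
    lhs = add(lhs, term, sign=-1)
fundamental = is_zero(lhs)
print(symmetric, anti_lie, fundamental)
```

Integer entries are used only to avoid floating-point tolerances; the same identities hold for complex entries, since they follow from the equivariance of D.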


These Lie superalgebras were found in [9] to play a rôle in N = 5 superconformal Chern–Simons theories analogous to the rôle played by the A(m, n) and C(n + 1) superalgebras in N = 6 theories. In addition, in [18] further N = 5 theories were constructed by taking a limit of superconformal gaugings of supergravity theories, associated to the exceptional Lie superalgebras F(4) and G(3), as well as to the family of simple Lie superalgebras D(2, 1; α). These Lie superalgebras too can be constructed along the lines of the previous example. We will not give the details, but point the reader to the paper [42] by Kamiya and Okubo, where the construction is made manifest. In that paper it is shown that these Lie superalgebras arise from anti-Lie triple systems which can be constructed from Faulkner data (g, V) with V a quaternionic unitary representation of the real metric Lie algebra g. To be precise, for F(4), g = so(7) ⊕ usp(2) and V = S ⊗ W, with W the complex two-dimensional representation of usp(2) with its invariant quaternionic structure and S the complexification of the 8-dimensional irreducible spinor representation of so(7), and for G(3), g = g2 ⊕ usp(2) and V = E ⊗ W, with W as before and E the complexification of the “fundamental” seven-dimensional irreducible representation of g2. In both of these cases so(7) and g2 are compact real forms and the metric structure is provided by (a multiple of) the Killing form. Finally, for D(2, 1; α), g = so(4) ⊕ usp(2), and V = F ⊗ W, with W as before and F the complexification of the fundamental representation of so(4). The parameter α is reflected in the choice of inner product for so(4), which, not being simple, has a two-parameter family of invariant inner products. Up to an overall rescaling of the inner product we are left with a pencil of such inner products, parametrised rationally by α.
From these examples we conclude that all the embedding superalgebras involved in the new N = 5 superconformal Chern-Simons-matter theories appearing in the recent literature [9,18] can also be realised in the quaternionic hermitian case of the Faulkner construction. 5. Conclusions and Open Problems In this paper we have hopefully shed some light on the Lie-algebraic origin of the metric 3-algebras which lurk behind three-dimensional superconformal Chern–Simons theories. In particular, we have proved that the metric 3-algebras of [11], appearing in N = 2 theories, correspond to pairs (g, V ) consisting of a metric Lie algebra g and a real faithful orthogonal representation V . We have also proved that a class of hermitian 3-algebras generalising those in [10] correspond to pairs (g, V ), where g is again a metric Lie algebra and V is a faithful complex unitary representation. The class of hermitian 3-algebras in [10], which appear in the N = 6 theories, constitute a subclass which are in one-to-one correspondence with a class of metric Lie superalgebras into whose complexification they embed. We have shown that these constructions are special cases of a general construction of pairs due to Faulkner. Applying this to the case of pairs (g, V ), where V is now a quaternionic unitary representation of the real metric Lie algebra g, we obtain a number of complex 3-algebras generalising those which arise in N = 5 superconformal Chern–Simons theories [9,18]. These results begin to explain, in simple algebraic terms, the fact that superconformal Chern–Simons theories which were originally formulated in terms of 3-algebras, can be rewritten using only Lie algebraic data, as in standard gauge theories (albeit with a twist). They similarly explain the relation between Lie superalgebras and N ≥ 4 superconformal Chern–Simons theories. The repackaging of 3-algebraic data in Lie algebraic terms is very compact and has the advantage of being able to construct examples very easily.


On the other hand, the very efficiency of this repackaging makes it hard to rephrase properties of the 3-algebra in terms of Lie algebras. Along these lines, there are a number of open problems and directions for future research, some of which we are busy exploring and will be reported on elsewhere, which are suggested by the results presented here. The dictionary between metric 3-algebras and Lie-algebraic data presented here points at the possibility of rephrasing properties of the 3-algebra (and of the associated superconformal Chern–Simons theory) directly in terms of the pairs (g, V ). The structure theory of metric 3-algebras, which for the case of metric Lie 3-algebras was exploited in [21,43] in order to classify those algebras of index < 3 and rephrase properties of the Bagger–Lambert model in 3-algebraic terms, ought to be encoded in Liealgebraic terms. An important open problem in this regard is how to recognise the type of metric 3-algebra from the Lie-algebraic data. In particular, which (g, V ) give rise to metric Lie 3-algebras? The construction of superconformal Chern–Simons theories directly from (g, V ) is an interesting problem on which we will report in a forthcoming paper. A better understanding of the dictionary between 3-algebras underlying the N = 3 theories dual to 3-Sasaki 7-manifolds (and the N = 1 theories associated to their squashings) and the geometry of these manifolds would also be desirable. In particular, how does the geometry of the 3-Sasaki manifold (or its squashing) manifest itself in its 3-algebra or in its Lie-algebraic data? Acknowledgments. JMF would like to extend his gratitude to José de Azcárraga and to the Departament de Física Teòrica of the Universitat de València for hospitality and support under the research grant FIS200801980. PdM is supported by a Seggie-Brown Postdoctoral Fellowship of the School of Mathematics of the University of Edinburgh.

References 1. Bagger, J., Lambert, N.: Modeling multiple M2’s. Phys. Rev. D 75, 045020 (2007) 2. Gustavsson, A.: Algebraic structures on parallel M2-branes. http://arxiv.org/abs/0709.1260v5[hep-th], 2008 3. Bagger, J., Lambert, N.: Gauge symmetry and supersymmetry of multiple M2-branes. Phys. Rev. D 77, 065008 (2008) 4. Gaiotto, D., Witten, E.: Janus Configurations, Chern-Simons Couplings, And The Theta-Angle in N = 4 Super Yang-Mills Theory. http://arxiv.org/abs/0804.2907v1[hep-th], 2008 5. Hosomichi, K., Lee, K.-M., Lee, S., Lee, S., Park, J.: N = 4 Superconformal Chern-Simons Theories with Hyper and Twisted Hyper Multiplets. JHEP 07, 091 (2008) 6. Aharony, O., Bergman, O., Jafferis, D.L., Maldacena, J.: N = 6 superconformal Chern-Simons-matter theories, M2-branes and their gravity duals. JHEP 10, 091 (2008) 7. Benna, M., Klebanov, I., Klose, T., Smedbäck, M.: Superconformal Chern–Simons theories and AdS4 /CFT3 correspondence. JHEP 0809, 072 (2008) 8. Mauri, A., Petkou, A.C.: An N = 1 Superfield Action for M2 branes. Phys. Lett. B 666, 527–532 (2008) 9. Hosomichi, K., Lee, K.-M., Lee, S., Lee, S., Park, J.: N = 5,6 Superconformal Chern-Simons Theories and M2-branes on Orbifolds. JHEP 09, 002 (2008) 10. Bagger, J., Lambert, N.: Three-algebras and N= 6 Chern–Simons gauge theories. Phys. Rev. D 79, 025002 (2009) 11. Cherkis, S., Sämann, C.: Multiple M2-branes and generalized 3-Lie algebras. Phys. Rev. D 78, 066019 (2008) 12. Schnabl, M., Tachikawa, Y.: Classification of N = 6 superconformal theories of ABJM type. http://arxiv. org/abs/0807.1102v1[hep-th], 2008 13. Aharony, O., Bergman, O., Jafferis, D.L.: Fractional M2-branes. JHEP 0811, 043 (2008) 14. Ooguri, H., Park, C.-S.: Superconformal Chern-Simons Theories and the Squashed Seven Sphere. JHEP 0811, 082 (2008) 15. Jafferis, D.L., Tomasiello, A.: A simple class of N = 3 gauge/gravity duals. JHEP 0810, 101 (2008)


16. Bergshoeff, E.A., de Roo, M., Hohm, O.: Multiple M2-branes and the embedding tensor. Class. Quant. Grav. 25, 142001 (2008) 17. Bergshoeff, E.A., de Roo, M., Hohm, O., Roest, D.: Multiple Membranes from Gauged Supergravity. JHEP 0808, 091 (2008) 18. Bergshoeff, E.A., Hohm, O., Roest, D., Samtleben, H., Sezgin, E.: The Superconformal Gaugings in Three Dimensions. JHEP 07, 1111 (2008) 19. Schwarz, J.H.: Superconformal Chern-Simons theories. JHEP 11, 078 (2004) 20. Gaiotto, D., Yin, X.: Notes on superconformal Chern-Simons-matter theories. JHEP 08, 056 (2007) 21. de Medeiros, P., Figueroa-O’Farrill, J., Méndez-Escobar, E.: Lorentzian Lie 3-algebras and their Bagger–Lambert moduli space. JHEP 07, 111 (2008) 22. Nagy, P.-A.: Prolongations of Lie algebras and applications. http://arxiv.org/abs/0712.1398v2[math. DG], 2008 23. Papadopoulos, G.: M2-branes, 3-Lie Algebras and Plucker relations. JHEP 05, 054 (2008) 24. Gauntlett, J.P., Gutowski, J.B.: Constraining maximally supersymmetric membrane actions. http://arxiv. org/abs/0804.3078v3[hep-th], 2008; to appear JHEP 25. Faulkner, J.R.: On the geometry of inner ideals. J. Algebra 26, 1–9 (1973) 26. Gustavsson, A.: One-loop corrections to Bagger-Lambert theory. Nucl. Phys. B 807, 315–333 (2009) 27. Van Raamsdonk, M.: Comments on the Bagger-Lambert theory and multiple M2- branes. JHEP 0805, 105 (2008) 28. Gomis, J., Milanesi, G., Russo, J.G.: Bagger-Lambert Theory for General Lie Algebras. JHEP 06, 075 (2008) 29. Benvenuti, S., Rodríguez-Gómez, D., Tonni, E., Verlinde, H.: N = 8 superconformal gauge theories and M2 branes. http://arxiv.org/abs/0805.1087v1[hep-th], 2008 30. Ho, P.-M., Imamura, Y., Matsuo, Y.: M2 to D2 revisited. JHEP 07, 003 (2008) 31. Nambu, Y.: Generalized Hamiltonian dynamics. Phys. Rev. D 7, 2405–2414 (1973) 32. Yamazaki, M.: Octonions, G 2 and generalized Lie 3-algebras. Phys. Lett. B 670, 215–219 (2008) 33. 
Figueroa-O’Farrill, J.M., Meessen, P., Philip, S.: Supersymmetry and homogeneity of M-theory backgrounds. Class. Quant. Grav. 22, 207–226 (2005) 34. Filippov, V.: n-Lie algebras. Sibirsk. Mat. Zh. 26(6), 126–140, 191 (1985) 35. Figueroa-O’Farrill, J.M., Papadopoulos, G.: Plücker-type relations for orthogonal planes. J. Geom. Phys. 49, 294–331 (2004) 36. Jacobson, N.: General representation theory of Jordan algebras. Trans. Amer. Math. Soc. 70, 509–530 (1951) 37. Lister, W.G.: A structure theory of Lie triple systems. Trans. Am. Math. Soc. 72(2), 217–242 (1952) 38. Yamaguti, K.: On algebras of totally geodesic spaces (Lie triple systems). J. Sci. Hiroshima Univ. Ser. A 21, 107–113 (1957/1958) 39. Nilsson, B.E.W., Palmkvist, J.: Superconformal M2-branes and generalized Jordan triple systems. http:// arxiv.org/abs/0807.5134v2[hep-th], 2008 40. Faulkner, J.R., Ferrar, J.C.: Simple anti-Jordan pairs. Comm. Algebra 8(11), 993–1013 (1980) 41. Okubo, S.: Construction of Lie superalgebras from triple product systems. AIP Conf. Proc. 687, 33–40 (2003) 42. Kamiya, N., Okubo, S.: Construction of Lie superalgebras D(2, 1; α), G(3) and F(4) from some triple systems. Proc. Edinb. Math. Soc. (2) 46(1), 87–98 (2003) 43. de Medeiros, P., Figueroa-O’Farrill, J., Méndez-Escobar, E.: Metric Lie 3-algebras in Bagger–Lambert theory. JHEP 08, 045 (2008) Communicated by G.W. Gibbons

Commun. Math. Phys. 290, 903–934 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0792-6

Communications in

Mathematical Physics

Localization Bounds for Multiparticle Systems Michael Aizenman, Simone Warzel Departments of Mathematics and Physics, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected] Received: 21 September 2008 / Accepted: 23 December 2008 Published online: 21 April 2009 – © The Authors 2009

Abstract: We consider the spectral and dynamical properties of quantum systems of n particles on the lattice Zd, of arbitrary dimension, with a Hamiltonian which in addition to the kinetic term includes a random potential with iid values at the lattice sites and a finite-range interaction. Two basic parameters of the model are the strength of the disorder and the strength of the interparticle interaction. It is established here that for all n there are regimes of high disorder, and/or weak enough interactions, for which the system exhibits spectral and dynamical localization. The localization is expressed through bounds on the transition amplitudes, which are uniform in time and decay exponentially in the Hausdorff distance in the configuration space. The results are derived through the analysis of fractional moments of the n-particle Green function, and related bounds on the eigenfunction correlators.

Contents
1. Introduction
   1.1 On localization in the presence of interactions
   1.2 Statement of the main result
   1.3 Remarks on the rate of exponential decay
   1.4 Comments on the proof
2. Finiteness of the Green Function’s Fractional Moments
3. Localization Domains in the Parameter Space
4. Multiparticle Eigenfunction Correlators and the Green Function
   4.1 Eigenfunction correlators
   4.2 Lower bound in terms of Green function’s fractional moments
   4.3 Upper bound in terms of Green function’s fractional moments
5. Implications of Localization in Subsystems
   5.1 Localization for non-interacting systems
   5.2 Decay away from clustered configurations
6. Proof of the Main Result
   6.1 Analyzing clustered configurations
   6.2 The inductive proof
   6.3 Proof of the rescaling inequality
Appendix
A. Some Distances and Separation Lemmata
   A.1 Splitting width
   A.2 Distances in the configuration space
B. From Eigenfunction Correlators to Dynamical and Spectral Information
C. An Averaging Principle
D. On the Wegner Estimate
References

Present address: Zentrum Mathematik, TU München, München, Germany
© 2009 The Authors. Reproduction of this article for non-commercial purposes by any means is permitted.

1. Introduction

1.1. On localization in the presence of interactions. In the context of non-interacting particle systems, or equivalently one-particle theory, Anderson localization is a well studied phenomenon, which for various regimes of the parameter space can be established even at the level of rigorous mathematical analysis (e.g. [CL90,PF92,St01,AS+01,GK06,Ki08b] and references therein). The picture is far less complete when it comes to systems of interacting particles subject to a random external potential, which generally may be expected to produce localization. Particularly perplexing is the situation where there are n fermions in a region of volume |Λ|, with |Λ| → ∞ and n/|Λ| → ρ > 0. It was proposed, through reasoning presented in [BAA06], that if the interactions are weak and the mean particle separation is significantly below the localization length of the non-interacting system, then the interaction would not affect by much the dynamical properties of the system. In particular, such reasoning has led to the suggestion that if the system is started in a configuration for which the density of particles in one part of the region is higher than in another then the uneven situation will persist indefinitely, assuming the Hamiltonian is time independent. While that would be in violation of the equipartition principle, it would be in line with the dynamical behavior of the non-interacting system in the regime of complete Anderson localization for the one-particle Hamiltonian ([Ai94,AS+01,GK01]). Rigorous methods are still far from allowing one to decide whether complete localization will persist in the presence of interactions, as claimed in [BAA06]. Furthermore, the analysis of even a fixed number of particles with short range interactions, and |Λ| → ∞, has presented difficulties.
An important step was recently made by Chulaevsky and Suhov [CS07,CS08b] who proved the existence of spectral localization for systems of two interacting particles which are subject to highly disordered external potential. The authors expect that the analysis of the n = 2 case, which is based on the multiscale approach of [FS83,DK89], could be extended to any finite n. In this work we approach the question using somewhat different tools, and address also the issue of dynamical localization. We establish the existence of localization regimes for any finite n, with decay rates which are uniform in the volume. Curiously, as is indicated in the figure below, the bounds which are established here carry a qualitatively somewhat stronger implication for n = 2 than for higher values of n.


1.2. Statement of the main result. Our goal here is to present a basic proof of localization for an arbitrary number of particles moving on a lattice of arbitrary dimension, which for convenience is taken to be Zd, in regimes of high disorder or sufficiently weak interactions. There are a number of ways to formulate a quantum system of particles on a lattice. We find the following convenient, but the method discussed here can also be adapted to other formulations. The Hilbert space of n particles on Zd is the direct product $\mathcal{H}^{(n)} = \ell^2(\mathbb{Z}^d)^n$. We take the Hamiltonian to be an operator of the form:

\[ H^{(n)}(\omega) = \sum_{j=1}^{n} \left[ -\Delta_j + \lambda V(x_j; \omega) \right] + U(x; \alpha), \tag{1.1} \]

with: Δ the discrete Laplacian (second difference operator) in Zd, V(x, ω) a random potential (described below), and U a finite range interaction which is given in terms of functions of the occupation numbers

\[ U(x; \alpha) = \sum_{k=2}^{p} \alpha_k \sum_{\substack{A \subset \mathbb{Z}^d :\ |A| = k \\ \operatorname{diam} A \le \ell_U}} U_A(N_A(x)), \tag{1.2} \]

where

\[ N_A(x) = \{ N_u(x) \}_{u \in A}, \tag{1.3} \]

with $N_u(x) = \sum_{j=1}^{n} \delta_{x_j, u}$ the number of particles the configuration x ∈ (Zd)n has at u ∈ Zd. It is to be understood that $U_A(N_A(x)) = 0$ unless $\prod_{u \in A} N_u(x) \neq 0$. The family of Hamiltonians is parametrized by λ ∈ R+, which controls the strength of the disorder, and α := (α2, . . . , αp) ∈ R^{p−1}, which is the strength of the interaction. Obviously, the value of αk is of relevance for H^{(n)}(ω) only for n ≥ k. It will be assumed throughout the paper that:

A1 The random potential is given in terms of a collection of iid random variables, {V(x; ω)}x∈Zd, with

\[ \mathbb{E}\left[ \exp(t\,|V(0)|) \right] < \infty \qquad\text{for all } t \in \mathbb{R}, \tag{1.4} \]

whose probability distribution is of bounded density, i.e.,

\[ \mathbb{P}(V(x) \in d\xi) = \varrho(\xi)\, d\xi \qquad\text{with } \varrho \in L^\infty, \tag{1.5} \]

satisfying

\[ \varrho(v) \le K \int_{|u| \le E_0} \varrho(v - u)\, du \qquad\text{for all } v \in \mathbb{R}, \tag{1.6} \]

at some E0 < ∞ and K < ∞.

A2 The interaction terms are bounded, with |U_A(n)| ≤ 1 for all A ⊂ Zd and all n ∈ (Zd)^{|A|}, and translation invariant, i.e., U_A = U_{A'} if A' is a translate of A.
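The occupation numbers N_u(x) that enter the interaction (1.2) through (1.3) are simple counting functions of the configuration. As a small illustration only, with hypothetical function names and lattice sites represented as integer tuples, they can be computed as follows:

```python
from collections import Counter


def occupation_numbers(configuration):
    """Return N_u(x) for every occupied site u of the configuration x."""
    return Counter(configuration)


def restricted_occupation(configuration, sites):
    """Return N_A(x) = {N_u(x)}_{u in A} as a dict over the sites of A."""
    counts = occupation_numbers(configuration)
    return {u: counts[u] for u in sites}


# Three particles on Z^2, two of them sharing the site (0, 0).
x = ((0, 0), (0, 0), (1, 0))
N = occupation_numbers(x)
N_A = restricted_occupation(x, [(0, 0), (1, 0), (2, 0)])
print(N_A)
```

The occupation numbers always sum to the particle number n, and they do not depend on the order in which the particles are listed.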


The above assumptions could be relaxed. In particular, translation invariance can be replaced by suitable translation invariant bounds, and, as in the case of one particle localization theory, the absolute continuity of the measure and (1.6) can be replaced by a local power-law concentration bound such as the following condition: P(V (x) ∈ [v + ε, v − ε]) ≤ ετ K τ P(V (x) ∈ [v + E 0 , v − E 0 ]),

(1.7)

for some $\tau \in (0, 1]$ and all $0 < \varepsilon \le 1$. Under such a reduced assumption, for which the case $\tau = 1$ corresponds to (1.6), the fractional moment bounds presented below would be limited to $0 < s < \tau$ (and a minor adjustment will be required in the argument, cf. [AS+01]), but that would not adversely affect the main results. Our main result is naturally stated in terms of the eigenfunction correlator which is introduced in Sect. 4. However, the statement can also be presented as follows.

Theorem 1.1. Under the assumptions A1 - A2, for each $p$ and $n \in \mathbb{N}$ there is an open set in the parameter space, $\Gamma_n^{(p)} \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$, such that:

1. For all $(\lambda, \alpha) \in \Gamma_n^{(p)}$ and up to $n$ particles, i.e., $k \in \{1, \ldots, n\}$, each operator $H^{(k)}(\omega)$ has almost surely only pure point spectrum, with the corresponding eigenfunctions being exponentially localized in the sense of $\mathrm{dist}_H(\mathbf{x}, \mathbf{x}_0)$, as is explained below.

2. Furthermore, for all $(\lambda, \alpha) \in \Gamma_n^{(p)}$, all $k \in \{1, \ldots, n\}$, and all $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^k$:

\[ \mathbb{E}\Bigl[ \sup_{\|f\|_\infty \le 1} \bigl| \langle \delta_{\mathbf{x}}^{(k)}, f(H^{(k)})\, \delta_{\mathbf{y}}^{(k)} \rangle \bigr| \Bigr] \le A\, e^{-\mathrm{dist}_H(\mathbf{x},\mathbf{y})/\xi}, \tag{1.8} \]

where

\[ \mathrm{dist}_H(\mathbf{x},\mathbf{y}) := \max\Bigl\{ \max_{1\le i\le k} \mathrm{dist}(\{x_i\}, Y),\; \max_{1\le i\le k} \mathrm{dist}(\{y_i\}, X) \Bigr\}, \tag{1.9} \]

with $\delta_{\mathbf{x}}^{(k)}$ [$\delta_{\mathbf{y}}^{(k)}$] the $k$-particle position eigenstates corresponding to $\mathbf{x}$ [$\mathbf{y}$], and the constants $A$ and $\xi$ depending on $n$, $p$ but not on $k$ ($\le n$).

3. The localization region $\Gamma_n^{(p)}$ includes regimes of strong disorder and of weak interactions, i.e.,
(a) for each $\alpha \in \mathbb{R}^{p-1}$ there is $\lambda(\alpha)$ such that $\Gamma_n^{(p)} \supset (\lambda(\alpha), \infty) \times \{\alpha\}$,
(b) for any $\lambda \in \Gamma_1^{(1)}$, i.e. one for which the one-particle Hamiltonian exhibits complete localization, there are $\alpha(\lambda)_j > 0$, $j \in \{1, \ldots, p\}$, such that $\Gamma_n^{(p)}$ includes all $(\lambda, \alpha')$ for which $|\alpha'| \le |\alpha(\lambda)|$ componentwise.

The bound (1.8), applied to $f(H) = e^{-itH}$, implies dynamical localization, and through that also the spectral assertion which is made in Theorem 1.1. The latter is explained more explicitly in Appendix B. One may note that the distance between configurations, $\mathrm{dist}_H(\mathbf{x},\mathbf{y})$, which appears above corresponds to the Hausdorff distance between the sets $X = \bigcup_{i=1}^{n} \{x_i\}$ and $Y = \bigcup_{i=1}^{n} \{y_i\}$, seen as subsets of $\mathbb{Z}^d$ with its Euclidean metric. The exponential bound presented above deserves a number of further comments.
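To make the distance (1.9) concrete, the following is a minimal sketch that evaluates it for small configurations. The function names `dist_to_set` and `hausdorff_distance` are illustrative choices and not notation from the paper; the Euclidean metric on $\mathbb{Z}^d$ is used, as stated above.

```python
from math import dist  # Euclidean distance between two points (Python >= 3.8)


def dist_to_set(site, sites):
    """Euclidean distance from one lattice site to a finite set of sites."""
    return min(dist(site, other) for other in sites)


def hausdorff_distance(x, y):
    """dist_H(x, y) of Eq. (1.9); x and y are lists of lattice sites (tuples)."""
    X, Y = set(x), set(y)  # occupied sites; multiplicities are forgotten
    return max(
        max(dist_to_set(xi, Y) for xi in x),
        max(dist_to_set(yi, X) for yi in y),
    )


# Hypothetical example configurations of three particles in Z^2.
x = [(0, 0), (1, 0), (7, 0)]
y = [(0, 0), (7, 0), (7, 1)]
print(hausdorff_distance(x, y))

# Same occupied sites, different occupation numbers: the distance vanishes.
a = [(0, 0), (0, 0), (5, 0)]
b = [(5, 0), (0, 0), (5, 0)]
print(hausdorff_distance(a, b))
```

The second pair illustrates that $\mathrm{dist}_H$ depends only on the sets of occupied sites: it does not distinguish configurations which differ by a relabeling of particles or by moving a particle between sites that remain occupied.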

Localization Bounds for Multiparticle Systems


1.3. Remarks on the rate of exponential decay. For systems of non-interacting particles, i.e., the case $\alpha = 0$, the one-particle localization theory allows one to prove that in regimes of localization a bound like (1.8) holds with a stronger decay rate. For the stronger bound (which can be established for regimes of strong enough disorder, or extremal energies, and in one dimension, the full range of energies) the relevant distance is not $\mathrm{dist}_H(\mathbf{x},\mathbf{y})$ but:

\[ \mathrm{dist}(\mathbf{x},\mathbf{y}) := \sum_{j=1}^{n} |x_j - y_j|. \tag{1.10} \]

One could ask whether exponential decay in $\mathrm{dist}(\mathbf{x},\mathbf{y})$ persists also in the presence of interactions. Upon reflection, the general answer to this should be negative: The $n$-particle Hamiltonian (1.1) clearly commutes with elements of the permutation group $S_n$. In the non-interacting case (i.e., $\alpha = 0$) its spectrum is degenerate, $H^{(n)}(\omega)$ being the sum of $n$ commuting unitarily equivalent operators, each affecting only one particle. However, since generically the interactions couple the different permutation-related degenerate eigenstates of the non-interacting systems, it is natural to expect that for $\alpha \not\equiv 0$, the operator $H^{(n)}(\omega)$ will have no eigenstates in which the probability amplitude is essentially restricted to the vicinity of a particular $n$-particle configuration. Instead, localization may still be manifested in the existence of eigenstates which decay in the sense of the symmetrized configuration distance

\[ \mathrm{dist}_S(\mathbf{x},\mathbf{y}) := \min_{\pi\in S_n} \sum_{j=1}^{n} |x_j - y_{\pi j}|, \tag{1.11} \]

with $S_n$ the permutation group of the $n$ elements $\{1, \ldots, n\}$. The dynamical version of this eigenfunction picture is that for very large $t$ a state of the form $e^{-itH^{(n)}} \delta_{\mathbf{x}}^{(n)}$, which has evolved from the initially localized state at $\mathbf{x}$, would have non-negligible amplitude not only in the vicinity of $\mathbf{x}$ but also in the vicinity of the permuted configurations $\pi\mathbf{x}$. The above considerations are of course superfluous in case one is interested only in the fully symmetric or antisymmetric sector, where the initial states cannot be localized in the stronger sense and where only decay rates which are symmetric under permutations are of relevance. However, the decay rate $\exp(-\mathrm{dist}_H(\mathbf{x},\mathbf{y})/\xi)$ is still qualitatively weaker than $\exp(-\mathrm{dist}_S(\mathbf{x},\mathbf{y})/\xi)$. In particular, for configurations which include some tight subclusters with multiple occupancy our bounds do not rule out the possibility, which we do not expect to be realized, of the excess charges being able to hop freely between the different subclusters, as depicted in Fig. 1. Nevertheless, the bounds allow one to conclude the main features of localization. Some further explicit comparisons between different distances are given in Appendix A.

1.4. Comments on the proof. It would be natural to ask why there is a need for a separate proof of localization for the $n$-particle system, since the configurations of the system can also be viewed as describing a single particle with $nd$-dimensional position vector, $\mathbf{x} = (x_1, \ldots, x_n)$. Regarded from this perspective, the Hamiltonian (1.1) may at first appear to have the usual structure for which localization is well understood, consisting of the usual kinetic term, a potential function $U(\mathbf{x})$, and a random potential $\sum_j V(x_j; \omega)$. The answer is that the values which the random potential $V(\mathbf{x}; \omega)$ assumes at different


Fig. 1. A schematic depiction of our localization bounds: Starting from the configuration depicted on the left, at any later time except for events of very small probability the collection of locations which are near an occupied site does not change by much. However, the Hausdorff metric bounds allow for the possibility that if the initial configuration had two or more particles in sufficient proximity [within the localization distance] then such ‘excess’ may transfer among the occupied regions

positions in the $nd$-dimensional space are not independent. Instead, they are correlated over arbitrary distances, and the number of its independent degrees of freedom ($L^d$), for the systems in a box of linear size $L$, scales as only a fractional power of the number of its configurations ($L^{nd}$). From this perspective, the randomness is much more limited than what is found in the well understood one-particle situation. The proof of Theorem 1.1 is organized as induction on $n$, and is guided by the following picture: once localization is established for fewer than $n$ particles, one may expect that throughout most of the volume the time evolution of $n$ particles does not disperse, except possibly when the particles are all close to each other and move as some $n$-particle cloud. This hypothetical mode, with the $n$ particles forming a quasi-particle, is ruled out using one-particle techniques, which are modified to show that such a possibility does not occur at weak enough interactions. Technically, our proof makes use of the Green function fractional-moment techniques, and in particular the finite-volume criteria of [AS+01]. However, in addition to adapting a number of “off the shelf” one-particle arguments we need to show that exponential decay of the fractional moment of the Green function for lower numbers of particles implies: i. exponential decay for systems composed of non-interacting subsystems, and ii. for the interactive system - exponential decay in a distance defined relative to the set of clustered $n$-particle configurations. These terms are explained more explicitly in the following sections. In Sects. 2–5 we present a number of relations which are utilized in the derivation of Theorem 1.1. These are then strung to a proof in Sect. 6.

2. Finiteness of the Green Function’s Fractional Moments

The proof of localization proceeds through exponentially decaying estimates for the Green function $G_\Omega(\mathbf{x}, \mathbf{y}; z)$ of finite volume versions of $H^{(n)}(\omega)$ evaluated at energies $z$ within the spectrum of the infinite volume operator. Our first step is to establish that for $s < 1$ each $|G_\Omega(\mathbf{x}, \mathbf{y}; z)|^s$ is of finite conditional expectation value, regardless of $z \in \mathbb{C}$, when averaged over one or two potential variables, provided each of the configurations ($\mathbf{x}$ and $\mathbf{y}$) includes at least one averaged site. The basic strategy is familiar from the theory of one-particle localization. However, the proofs need to be revisited here since we are now dealing with random potentials whose values for different configurations are no longer independent, and in certain ways are highly correlated.


It may be noted that the celebrated Wegner estimate is not being explicitly used in the Fractional Moment Analysis, its role being taken by the finiteness of the Green function’s fractional moments. However, in view of Wegner’s estimate’s intrinsic interest and conceptual appeal, we comment on it in Appendix D. We shall use the following notation: The $n$-particle Green function associated with some region $\Omega \subseteq \mathbb{Z}^d$ and $z \in \mathbb{C}^+$ is

\[ G_\Omega(\mathbf{x}, \mathbf{y}; z) \equiv G_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; z) := \bigl\langle \delta_{\mathbf{x}}^{(n)}, \bigl(H_\Omega^{(n)} - z\bigr)^{-1} \delta_{\mathbf{y}}^{(n)} \bigr\rangle, \tag{2.1} \]

where $H_\Omega^{(n)}(\omega)$ is the (finite-volume) operator obtained by restricting (1.1) to the Hilbert space $\mathcal{H}_\Omega^{(n)} := \ell^2(\Omega)^n$ (with the default choice of Dirichlet boundary conditions). The vectors $\delta_{\mathbf{x}}^{(n)}, \delta_{\mathbf{y}}^{(n)} \in \mathcal{H}_\Omega^{(n)}$ correspond to localized states, i.e. $\langle \delta_{\mathbf{x}}^{(n)}, \psi \rangle = \psi(\mathbf{x})$, and are parametrized by configurations $\mathbf{x} = (x_1, \ldots, x_n)$ of $n$ particles. When clear from the context we shall drop the superscript $(n)$ at our convenience. The set of configurations with all particles in $\Omega$ is denoted by $\mathcal{C}^{(n)}(\Omega) := \Omega^n$. Also:

• For a given set $S \subset \Omega$, we denote by $\mathcal{C}^{(n)}(\Omega; S)$ the set of $n$-particle configurations which have at least one particle in $S$. In case $S = \{x\}$ the set will also be denoted as $\mathcal{C}^{(n)}(\Omega; x)$.
• We denote by $\mathcal{C}_r^{(n)}(\Omega) := \{\mathbf{x} \in \Omega^n \mid \mathrm{diam}(\mathbf{x}) \le r\}$ the set of configurations with diameter less than or equal to $r$, the diameter of a configuration being defined as $\mathrm{diam}(\mathbf{x}) := \max_{j,k} |x_j - x_k|$.

Theorem 2.1. For any $s \in (0, 1)$ there exists $C_s < \infty$ such that for any $\Omega \subseteq \mathbb{Z}^d$, any two (not necessarily distinct) sites $u_1, u_2 \in \Omega$ and any pair of configurations $\mathbf{x} \in \mathcal{C}^{(n)}(\Omega; u_1)$ and $\mathbf{y} \in \mathcal{C}^{(n)}(\Omega; u_2)$, the following bound holds:

\[ \mathbb{E}\Bigl[ \bigl|G_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; z)\bigr|^s \,\Big|\, \{V(v)\}_{v \notin \{u_1, u_2\}} \Bigr] \le C_s\, \frac{(K E_0)^{\#}}{(|\lambda| E_0)^s} \tag{2.2} \]

for all $z \in \mathbb{C}^+$ and $\lambda \neq 0$, with $\# = 2$ in case $u_1 \neq u_2$, and $\# = 1$ otherwise. For the proof, let us note that in its dependence on the single-site random variables $V(x)$ the Hamiltonian is of the form

\[ H_\Omega(\omega) = A + \lambda \sum_{u \in \Omega} V(u; \omega)\, N_u, \tag{2.3} \]

where $N_u$ is the number operator, $(N_u \psi)(\mathbf{x}) := \sum_{k=1}^{n} \delta_{x_k, u}\, \psi(\mathbf{x})$, which counts the number of particles on the site $u \in \mathbb{Z}^d$. In analyzing averages over the potential variables we shall employ the following double sampling bound, which is the dual form of the regularity assumption A1, Eq. (1.6). Under that assumption, for any non-negative function $h$ of one of the single potential parameters $V \equiv V(u)$, for some $u \in \mathbb{Z}^d$:

\[ \int_{\mathbb{R}} h(V)\, \rho(V)\, dV \le K E_0 \int_{\mathbb{R}} \int_{|\widehat{V}| \le E_0} h(V + \widehat{V})\, \frac{d\widehat{V}}{E_0}\, \rho(V)\, dV. \tag{2.4} \]


Proof of Theorem 2.1. We shall estimate the conditional expectation in (2.2) with the help of the double sampling bound (2.4), applied to the pair of variables $V(u_j)$ (or a single one in case they coincide). According to that, it suffices to estimate just the integral over the variable(s) $\widehat{V}(u_j)$ of the fractional moment of

\[ \widehat{G}_\Omega(\mathbf{x}, \mathbf{y}; z) := \Bigl\langle \delta_{\mathbf{x}}, \Bigl( H_\Omega + \sum_{w \in \{u_1, u_2\}} \widehat{V}(w)\, N_w - z \Bigr)^{-1} \delta_{\mathbf{y}} \Bigr\rangle. \tag{2.5} \]

At this point, a useful tool is the following weak $L^1$-estimate which forms a straightforward extension of [AE+06, Prop. 3.2]: For any pair of normalized vectors $\phi$, $\psi$ in some Hilbert space, any pair of self-adjoint operators with $N, M \ge 0$, and a maximally dissipative operator $K$:

\[ \int_{[-1,1]^2} 1\Bigl[ \bigl| \langle \phi, \sqrt{N}\, (\xi N + \eta M - K)^{-1} \sqrt{M}\, \psi \rangle \bigr| > t \Bigr]\, d\xi\, d\eta \le \frac{C}{t}, \tag{2.6} \]

for all $t > 0$ with some (universal) constant $C < \infty$, where $1[\ldots]$ denotes the indicator function. A similar bound holds for the one-variable version of (2.6). Applying (2.6) to the expression in (2.5), and noting that $\delta_{\mathbf{x}}$ and $\delta_{\mathbf{y}}$ are eigenvectors of $N_{u_1}$ and $N_{u_2}$ with eigenvalues greater than or equal to one, one gets:

\[ W(t) := E_0^{-2}\, \Bigl| \Bigl\{ \bigl(\widehat{V}(u_1), \widehat{V}(u_2)\bigr) \in [-E_0, E_0]^2 : \bigl|\widehat{G}_\Omega(\mathbf{x}, \mathbf{y}; z)\bigr| \ge t \Bigr\} \Bigr| \le \min\Bigl\{ 4, \frac{C}{|\lambda| E_0\, t} \Bigr\}, \tag{2.7} \]

where $W(t)$ is introduced just for the next formula. To estimate the corresponding integral of the kernel’s fractional moment, one may use the Stieltjes integral expression:

\[ \int_{[-E_0, E_0]^2} \bigl|\widehat{G}_\Omega(\mathbf{x}, \mathbf{y}; z)\bigr|^s\, \frac{d\widehat{V}(u_1)}{E_0}\, \frac{d\widehat{V}(u_2)}{E_0} = \int_0^\infty W(t)\, d(t^s) \le \frac{4^{1-s}\, C^s}{(1-s)\, (|\lambda| E_0)^s}. \tag{2.8} \]
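The last inequality in (2.8) is an elementary computation, which may be sketched as follows. Here $a := C/(|\lambda| E_0)$ and $t_0 := a/4$ are shorthands introduced only for this remark; $t_0$ is the point at which the two entries of the minimum in (2.7) coincide.

```latex
\begin{align*}
\int_0^\infty \min\{4,\, a/t\}\; d(t^s)
  &= 4 \int_0^{t_0} s\, t^{s-1}\, dt \;+\; a \int_{t_0}^\infty s\, t^{s-2}\, dt \\
  &= 4\, t_0^{\,s} \;+\; \frac{s}{1-s}\; a\, t_0^{\,s-1} \\
  &= 4^{1-s} a^{s} \Bigl( 1 + \frac{s}{1-s} \Bigr)
   \;=\; \frac{4^{1-s}\, C^{s}}{(1-s)\, (|\lambda| E_0)^{s}} .
\end{align*}
```

The second integral converges at infinity precisely because $s < 1$, which is where the restriction to fractional moments enters.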

A similar bound holds in case $u_1 = u_2$ for the average over a single variable. The bound (2.2) is implied now through a simple application of (2.4).

3. Localization Domains in the Parameter Space

The following notions are useful in describing localization bounds which persist when the strength of the disorder is driven up, and also to present the localization regimes which are discussed in this work. For their formulation we denote by $\lambda_1 \in \mathbb{R}_+$ the critical coupling above which the one-particle Hamiltonian $H^{(1)}$ exhibits uniform 1-particle localization in the sense of Definition 3.3 below. Its existence was established in [AM93].

Definition 3.1. A robust domain in the parameter space is a non-empty open set $\Gamma \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ for some $p \in \mathbb{N}$ such that:
1. if $(\lambda, \alpha) \in \Gamma$, then for all $\lambda' > \lambda$ also $(\lambda', \alpha) \in \Gamma$,
2. for every $\alpha \in \mathbb{R}^{p-1}$ there exists $\lambda(\alpha) \in \mathbb{R}_+$ such that $(\lambda(\alpha), \alpha) \in \Gamma$,
3. $\Gamma$ includes the half-line $(\lambda_1, \infty) \times \{0\}$.


Definition 3.2. A subset of the parameter space $\Gamma \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ is called sub-conical if there is some $c < \infty$ such that for all $(\lambda, \alpha) \in \Gamma$,

\[ \sum_{k=2}^{p} \frac{(2\ell_U)^{dk}}{k!}\, |\alpha_k| \le c\, |\lambda|. \tag{3.1} \]

In this context it is worth noting that under Assumption A2 the interaction obeys the bound:

\[ \|U(\alpha)\|_\infty := \sup_{\mathbf{x} \in (\mathbb{Z}^d)^n} |U(\mathbf{x}; \alpha)| \le n \sum_{k=2}^{p} |\alpha_k|\, \frac{(2\ell_U)^{dk}}{k!}. \tag{3.2} \]

A useful criterion of localization is expressed in terms of the fractional moments of the Green function, with the average being carried out over both the disorder and the energy within an interval $I \subset \mathbb{R}$. For this purpose we denote:

\[ \widehat{\mathbb{E}}_I[\,\cdot\,] := |I|^{-1} \int_I \mathbb{E}[\,\cdot\,]\, dE. \tag{3.3} \]

Definition 3.3. A robust subset of the parameter space, $\Gamma \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$, is said to be a domain of uniform $n$-particle localization if for some $s \in (0, 1)$ there exist $\xi = \xi(s, n, p) < \infty$ and $A = A(s, n, p) < \infty$ such that for all $(\lambda, \alpha) \in \Gamma$, all $k \in \{1, \ldots, n\}$, and all $\mathbf{x}, \mathbf{y} \in \mathcal{C}^{(k)}$:

\[ \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}}\; \sup_{\Omega \subset \mathbb{Z}^d}\; \widehat{\mathbb{E}}_I\Bigl[ \bigl|G_\Omega^{(k)}(\mathbf{x}, \mathbf{y})\bigr|^s \Bigr] \le A\, e^{-\mathrm{dist}_H(\mathbf{x}, \mathbf{y})/\xi}, \tag{3.4} \]

where the energy variable on which $G_\Omega$ depends was averaged over the intervals $I$. In the above definition we have incorporated the specific choice of the distance, $\mathrm{dist}_H$, only for convenience. As was explained in the Introduction, it seems natural to expect exponential decay also in terms of the symmetrized distance, $\mathrm{dist}_S$, but that is not proven here. Let us also add that localization in the sense of (3.4) implies various other, more intuitive and physically relevant, expressions of the phenomenon; in particular dynamical (Sect. 4) as well as spectral localization (Appendix B). In this work we shall focus on the proof of existence of robust regimes of localization for any finite $n$ and $p$, without monitoring closely the values of the localization length $\xi$ and amplitude $A$. In particular, the subsequent proof yields a localization length which degrades heavily when the number of particles $n$ increases. Concerning the value of $s$ in the above definition, it is helpful to notice

Lemma 3.1. If $|\lambda|$ is bounded away from zero and the condition (3.4) is satisfied for some $s \in (0, 1)$, then it holds for all other $s \in (0, 1)$ at adjusted values of $\xi < \infty$ and $A < \infty$.

Proof. Jensen’s and Hölder’s inequalities imply that for all $r \le s \le t < 1$,

\[ \Bigl( \widehat{\mathbb{E}}_I\bigl[ |G_\Omega(\mathbf{x}, \mathbf{y})|^r \bigr] \Bigr)^{s/r} \le \widehat{\mathbb{E}}_I\bigl[ |G_\Omega(\mathbf{x}, \mathbf{y})|^s \bigr] \le \Bigl( \widehat{\mathbb{E}}_I\bigl[ |G_\Omega(\mathbf{x}, \mathbf{y})|^t \bigr] \Bigr)^{\frac{s-r}{t-r}} \Bigl( \widehat{\mathbb{E}}_I\bigl[ |G_\Omega(\mathbf{x}, \mathbf{y})|^r \bigr] \Bigr)^{\frac{t-s}{t-r}}. \tag{3.5} \]

The first term in the last line is bounded, $\widehat{\mathbb{E}}_I\bigl[ |G_\Omega(\mathbf{x}, \mathbf{y})|^t \bigr] \le C(t)\, |\lambda|^{-t}$, thanks to (2.2).


4. Multiparticle Eigenfunction Correlators and the Green Function

4.1. Eigenfunction correlators. A convenient expression of localization, and also a convenient tool for the analysis, is provided by the eigenfunction correlators. By this term we refer to the family of kernels, for $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^n$:

\[ Q_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; I; s) := \sum_{E \in \sigma(H_\Omega^{(n)}) \cap I} \bigl\langle \delta_{\mathbf{x}}^{(n)}, P_{\{E\}}(H_\Omega^{(n)})\, \delta_{\mathbf{x}}^{(n)} \bigr\rangle^{1-s}\, \bigl| \bigl\langle \delta_{\mathbf{x}}^{(n)}, P_{\{E\}}(H_\Omega^{(n)})\, \delta_{\mathbf{y}}^{(n)} \bigr\rangle \bigr|^{s}, \tag{4.1} \]

where $I \subset \mathbb{R}$ is a subset of the energy range, $\Omega \subset \mathbb{Z}^d$ is a finite subset, $P_{\{E\}}(H_\Omega^{(n)})$ is the spectral projection on the eigenspace corresponding to the eigenvalue $E$, and $s \in [0, 1]$ is an interpolation parameter. This definition extends naturally also to unbounded $\Omega \subset \mathbb{Z}^d$, provided $H_\Omega^{(n)}$ has only pure point spectrum within $I$. The notation used here differs from that of [Ai94] by allowing for degeneracies in the spectrum. As mentioned in the Introduction, while the spectrum of a one-particle Hamiltonian with random potential is almost surely non-degenerate, degeneracies do occur in the non-interacting multiparticle case. When the domain $\Omega$ and the value of $n$ are clear from the context, or of no particular importance, the sub/super-scripts on $Q$ may be suppressed. When $s$ is omitted, it is understood to take the value $s = 1$, which for many purposes is the most relevant one. An essential property of the kernel is the bound (at $s = 1$):

\[ \sup_{\|f\|_\infty \le 1} \bigl| \langle \delta_{\mathbf{x}}, f(H)\, \delta_{\mathbf{y}} \rangle \bigr| \le |Q(\mathbf{x}, \mathbf{y}; I)|. \tag{4.2} \]

In its dependence on the parameter $s$ the kernel is log-convex, i.e., for any $\lambda \in [0, 1]$,

\[ Q(\mathbf{x}, \mathbf{y}; I; (1-\lambda) s_0 + \lambda s_1) \le Q(\mathbf{x}, \mathbf{y}; I; s_0)^{1-\lambda}\, Q(\mathbf{x}, \mathbf{y}; I; s_1)^{\lambda}. \tag{4.3} \]

Moreover:

\[ Q(\mathbf{x}, \mathbf{y}; I; 0) = \sum_{E \in \sigma(H) \cap I} \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{x}} \rangle \le 1, \qquad Q(\mathbf{x}, \mathbf{y}; I; 1) \le \sum_{E \in \sigma(H) \cap I} \bigl| \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle \bigr| \le 1, \tag{4.4} \]

where the latter is by the Cauchy-Schwarz inequality. A useful implication of Eqs. (4.4) and (4.3) is that for any $0 < s < t \le 1$,

\[ Q(\mathbf{x}, \mathbf{y}; I; t) \le Q(\mathbf{x}, \mathbf{y}; I; s)^{\frac{1-t}{1-s}}. \tag{4.5} \]
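As a sanity check of (4.2), (4.4) and (4.5), the relations can be evaluated in closed form for a two-level system. The following is only an illustrative sketch: it assumes a real symmetric $2 \times 2$ matrix with non-degenerate eigenvalues `e1`, `e2` and eigenvectors $(\cos\theta, \sin\theta)$, $(-\sin\theta, \cos\theta)$, with $\mathbf{x}$ and $\mathbf{y}$ the two basis sites and $I = \mathbb{R}$. The function names are hypothetical.

```python
import cmath
import math


def two_level_correlator(theta, s):
    """Q(x, y; R; s) of Eq. (4.1) for the assumed two-level operator."""
    c2 = math.cos(theta) ** 2  # <delta_x, P_1 delta_x>
    s2 = math.sin(theta) ** 2  # <delta_x, P_2 delta_x>
    off = abs(math.cos(theta) * math.sin(theta))  # |<delta_x, P_E delta_y>|
    return c2 ** (1 - s) * off ** s + s2 ** (1 - s) * off ** s


def matrix_element(theta, e1, e2, t):
    """<delta_x, exp(-i t H) delta_y> for the same operator."""
    cs = math.cos(theta) * math.sin(theta)
    return cs * (cmath.exp(-1j * t * e1) - cmath.exp(-1j * t * e2))


theta, e1, e2 = 0.3, -1.0, 2.5  # arbitrary illustrative values
q0 = two_level_correlator(theta, 0.0)
q1 = two_level_correlator(theta, 1.0)
print(q0, q1)

# (4.5) with s = 0.5 and t = 0.8: exponent (1 - t) / (1 - s) = 0.4.
lhs = two_level_correlator(theta, 0.8)
rhs = two_level_correlator(theta, 0.5) ** 0.4
print(lhs <= rhs)

# (4.2) applied to f(H) = exp(-i t H) at a few times.
print(all(abs(matrix_element(theta, e1, e2, t)) <= q1 + 1e-12
          for t in (0.1, 1.0, 10.0)))
```

In this example $Q(\cdot\,; 1) = |\sin 2\theta|$, so the correlator is small exactly when each eigenvector is concentrated on a single site.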

The relations (4.4) and (4.3) played a role in the strategy which was used in [Ai94] for the deduction of dynamical localization through Green function fractional moment bounds. As we shall see next, the method can be extended to many particle systems. Most of our analysis will be done in finite volumes. A minor subtlety concerning the passage to the infinite volume limit, is that we do not have an a-priori statement of convergence in this limit of the eigenfunctions, nor of the eigenfunction correlators. Nevertheless, one has the following statement.


Theorem 4.1. Suppose that the following bound holds for a sequence of finite domains $\Omega$ which converge to $\mathbb{Z}^d$, and a fixed interval $I \subset \mathbb{R}$,

\[ \mathbb{E}\bigl[ Q_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; I) \bigr] \le A\, e^{-K(\mathbf{x}, \mathbf{y})}, \tag{4.6} \]

with $K(\cdot, \cdot)$ some kernel, i.e., a two point function defined over the space of pairs of $n$-particle configurations, and some $A = A(n) < \infty$. Then, within the $n$-particle sector, the infinite volume operator $H(\omega)$ satisfies:

\[ \mathbb{E}\Bigl[ \sup_{\|f\|_\infty \le 1} \bigl| \langle \delta_{\mathbf{x}}^{(n)}, f(H^{(n)})\, \delta_{\mathbf{y}}^{(n)} \rangle \bigr| \Bigr] \le A\, e^{-K(\mathbf{x}, \mathbf{y})}. \tag{4.7} \]

Furthermore, if (4.6) holds with $K(\mathbf{x}, \mathbf{y}) = 2\,\mathrm{dist}_H(\mathbf{x}, \mathbf{y})/\xi$, then one may also conclude that the $n$-particle spectral projection on $I$ is almost surely given by a sum over a collection of rank-one projections on eigenstates which decay exponentially, each satisfying a bound of the form:

\[ |\psi(\mathbf{x}; \omega)|^2 \le A(\omega; n)\, \bigl(1 + |\mathbf{x}_\psi|\bigr)^{2nd+2}\, e^{-\mathrm{dist}_H(\mathbf{x}, \mathbf{x}_\psi)/\xi}, \tag{4.8} \]

where $A(\omega; n)$ is an amplitude of finite mean, and the decay is from a configuration $\mathbf{x}_\psi$ at which the wave function is non-negligible in the sense that

\[ |\psi(\mathbf{x}_\psi; \omega)|^2 \ge \frac{\bigl(1 + |\mathbf{x}_\psi|\bigr)^{-(nd+1)}}{\sum_{\mathbf{y} \in (\mathbb{Z}^d)^n} (1 + |\mathbf{y}|)^{-(nd+1)}}. \tag{4.9} \]

With the natural modification, the last statement is valid also in case $K(\mathbf{x}, \mathbf{y})$ is given in terms of any of the other distances which were mentioned in the Introduction, i.e., $\mathrm{dist}(\mathbf{x}, \mathbf{y})$ or $\mathrm{dist}_S(\mathbf{x}, \mathbf{y})$. Except for a minor reformulation of a known bound, this relation is in essence well familiar from the theory of one-particle localization (it was used already in [Ai94]). We therefore relegate its proof to Appendix B. As it turns out, averages over the disorder of the eigenfunction correlator are closely related to the Green function’s fractional moments. The rest of this section is devoted to the relations between the two quantities.

4.2. Lower bound in terms of Green function’s fractional moments. The following (deterministic) estimate allows one to bound fractional moments of Green functions in terms of eigenfunction correlators.

Theorem 4.2. Let $\Omega \subset \mathbb{Z}^d$. For any $s \in (0, 1)$ and any interval $I \subset \mathbb{R}$,

\[ \int_I \bigl| G_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; E) \bigr|^s\, dE \le \frac{2\, |I|^{1-s}}{1-s}\, Q_\Omega^{(n)}(\mathbf{x}, \mathbf{y}, \mathbb{R})^s. \tag{4.10} \]

One may note that this bound is useful only in case of complete localization of all eigenfunctions, but that suffices for our purpose. The bound may be improved with a restriction of the eigenfunction correlator to a finite, slightly enlarged, interval $I' \supset I$; the contribution to the Green function from eigenfunctions outside $I'$ being handled with the help of quasi-analytic cutoff in the sense of Helffer-Sjöstrand, and the Combes-Thomas Green function estimate.


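Proof of Theorem 4.2. We split the contribution to the Green function into two terms depending on whether $\langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle \ge 0$ or $\langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle < 0$,

\[ G^{\pm}(\mathbf{x}, \mathbf{y}; z) := \sum_{\substack{E \in \sigma(H) \\ \mathrm{sign}\, \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle = \pm}} \frac{\langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle}{E - z}. \tag{4.11} \]

(Note that the eigenfunctions of (1.1) may be taken to be real. In the complex case one would have four terms instead.) Using $|a + b|^s \le |a|^s + |b|^s$ we thus get

\[ \int_I |G(\mathbf{x}, \mathbf{y}; E)|^s\, dE \le \sum_{\# = \pm} \int_I \bigl| G^{\#}(\mathbf{x}, \mathbf{y}; E) \bigr|^s\, dE = s \sum_{\# = \pm} \int_0^\infty \Bigl| \bigl\{ E \in I : \bigl| G^{\#}(\mathbf{x}, \mathbf{y}; E) \bigr| > t \bigr\} \Bigr|\, \frac{dt}{t^{1-s}}. \tag{4.12} \]

Boole’s remarkable formula, which states that $\bigl| \bigl\{ x \in \mathbb{R} : \bigl| \sum_n p_n (x_n - x)^{-1} \bigr| > t \bigr\} \bigr| = 2 \sum_n p_n\, t^{-1}$ for all $x_n \in \mathbb{R}$, $p_n, t > 0$, [Bo57], implies that

\[ \Bigl| \bigl\{ E \in \mathbb{R} : \bigl| G^{\#}(\mathbf{x}, \mathbf{y}; E) \bigr| > t \bigr\} \Bigr| = \frac{2}{t} \sum_{\substack{E \in \sigma(H) \\ \mathrm{sign}\, \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle = \#}} \bigl| \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle \bigr| =: \frac{2}{t}\, Q^{\#}(\mathbf{x}, \mathbf{y}, \mathbb{R}). \tag{4.13} \]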
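Boole's formula is easy to test numerically. The sketch below is an illustration under stated assumptions (poles sorted in increasing order and distinct, positive weights, $t > 0$); the function name `level_set_measure` is hypothetical. It uses the fact that $x \mapsto \sum_n p_n (x_n - x)^{-1}$ increases from $-\infty$ to $+\infty$ between consecutive poles, so that on each such interval the set where the sum exceeds $t$ is of the form (root, pole), and the root can be located by bisection.

```python
def level_set_measure(poles, weights, t, iterations=200):
    """Lebesgue measure of {x : sum_n p_n / (x_n - x) > t}.

    Assumes: poles sorted increasingly and distinct, weights > 0, t > 0.
    """
    def f(x):
        return sum(p / (xn - x) for xn, p in zip(poles, weights))

    total = sum(weights)
    measure = 0.0
    # Left of the first pole f(x) <= total / (x_1 - x), hence f < t here:
    left = poles[0] - total / t - 1.0
    for pole in poles:
        lo, hi = left, pole  # invariant: f < t near lo, f > t near hi
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:  # interval collapsed to adjacent floats
                break
            if f(mid) > t:
                hi = mid
            else:
                lo = mid
        measure += pole - hi  # length of (root, pole)
        left = pole
    return measure


poles = [-1.0, 0.5, 2.0]    # illustrative values only
weights = [0.3, 1.2, 0.5]
t = 0.7

one_sided = level_set_measure(poles, weights, t)
# {f < -t} is the mirror image of a level set of the reflected function.
mirrored = level_set_measure([-x for x in reversed(poles)],
                             list(reversed(weights)), t)
print(one_sided + mirrored, 2 * sum(weights) / t)
```

The two printed numbers should agree up to the bisection accuracy. Notably, the measure does not depend on the positions of the poles, only on the total weight.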

Substituting in the integral (4.12) the minimum of (4.13) and the length of the interval, $|I|$, one arrives at

\[ \int_I |G(\mathbf{x}, \mathbf{y}; E)|^s\, dE \le \frac{2^s\, |I|^{1-s}}{1-s} \Bigl[ Q^{+}(\mathbf{x}, \mathbf{y}, \mathbb{R})^s + Q^{-}(\mathbf{x}, \mathbf{y}, \mathbb{R})^s \Bigr] \le \frac{2\, |I|^{1-s}}{1-s}\, Q(\mathbf{x}, \mathbf{y}, \mathbb{R})^s. \tag{4.14} \]

4.3. Upper bound in terms of Green function’s fractional moments. For the proof of our main result we need also a converse bound to (4.10). In the one-particle case there is a simple passage from exponential decay of Green function fractional moments to similar bounds on the mean value of the eigenfunction correlators, and thus to dynamical localization [Ai94]. In effect, it is based on the following relation, which is a somewhat more explicit statement than what is found in [Ai94].

Lemma 4.3. For any finite domain $\Omega \subset \mathbb{Z}^d$, $x \in \Omega$, $s \in (0, 1)$ and Borel set $I \subset \mathbb{R}$,

\[ \int_{\mathbb{R}} Q_\Omega^{(1)}(x, y; I, s) \Big|_{V_x \to V_x + v}\, \frac{dv}{|v|^s} = |\lambda|^{s-1} \int_I \bigl| \bigl\langle \delta_x, (H_\Omega^{(1)} - E)^{-1} \delta_y \bigr\rangle \bigr|^s\, dE, \tag{4.15} \]

where the left side involves the eigenfunction correlator for the one-parameter family of operators $H_\Omega^{(1)}(v) := H_\Omega^{(1)} + \lambda\, v\, P_x$ acting in $\ell^2(\Omega)$.


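The combination of (4.15) and the ‘double sampling bound’ (2.4) produces the desired upper bound on the expectation of the eigenfunction correlator in the one-particle case. In the multiparticle case, an extension is needed of the averaging principle which is expressed in Lemma 4.3. Following is a suitable generalization.

Lemma 4.4. Let $s \in [0, 1)$ and $\Omega \subset \mathbb{Z}^d$ be a finite set, and $u$ a point in $\Omega$. Then for all $\mathbf{x} \in \mathcal{C}^{(n)}(\Omega; u)$ and $\mathbf{y} \in \mathcal{C}^{(n)}(\Omega)$:

\[ N_u(\mathbf{x})^{1 + \frac{s}{2}} \int_{\mathbb{R}} Q_\Omega(\mathbf{x}, \mathbf{y}; I, s) \Big|_{V(u) \to V(u) + v}\, \frac{dv}{|v|^s} = \frac{1}{|\lambda|^{1-s}} \int_I \sum_{\kappa \in \sigma(K_u(E))} \langle \delta_{\mathbf{x}}, \Pi_\kappa(E)\, \delta_{\mathbf{x}} \rangle^{1-s} \times \Bigl| \bigl\langle \delta_{\mathbf{x}}, \Pi_\kappa(E) \sqrt{N_u}\, (H_\Omega - E)^{-1} \delta_{\mathbf{y}} \bigr\rangle \Bigr|^s\, dE, \tag{4.16} \]

where $\Pi_\kappa(E) \equiv P_{\{\kappa\}}(K_u(E))$ is the spectral projection on the eigenspace at eigenvalue $\kappa$ for the $E$-dependent operator:

\[ K_u(E) := \sqrt{N_u}\, (H_\Omega - E)^{-1} \sqrt{N_u}, \tag{4.17} \]

which we take as acting within the range of $N_u$ in $\mathcal{H}_\Omega^{(n)}$.

Since the proof takes one on a technical detour, we have placed it in Appendix C. Using this averaging principle, we get:

Theorem 4.5. Let $\Omega \subset \mathbb{Z}^d$ be a finite subset, and $u \in \Omega$. Suppose $\mathbf{x}, \mathbf{y} \in \mathcal{C}^{(n)}(\Omega)$ is a pair of configurations such that the number $N_u(\mathbf{x})$ of particles of $\mathbf{x}$ at $u$ is at least one. Then for any $s \in (0, 1)$ and any interval $I \subset \mathbb{R}$,

\[ N_u(\mathbf{x})\, \mathbb{E}\bigl[ Q_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; I, s) \bigr] \le \frac{K\, |E_0|^s}{|\lambda|^{1-s}} \sum_{\mathbf{w} \in \mathcal{C}^{(n)}(\Omega; u)} \Bigl( \frac{N_u(\mathbf{w})}{N_u(\mathbf{x})} \Bigr)^{s/2} \int_I \mathbb{E}\Bigl[ \bigl| G_\Omega^{(n)}(\mathbf{y}, \mathbf{w}; E) \bigr|^s \Bigr]\, dE. \tag{4.18} \]

Proof. It follows from (2.4) that the conditional expectation of any non-negative function $f$ of the random variables $\{V(x)\}_{x \in \Omega}$, conditioned on the values of $V$ at sites other than $u$, satisfies:

\[ \mathbb{E}\bigl[ f(V(u)) \,\big|\, \{V(x)\}_{x \neq u} \bigr] \le K\, |E_0|^s\, \mathbb{E}\Bigl[ \int f(V(u) + v)\, \frac{dv}{|v|^s} \,\Big|\, \{V(x)\}_{x \neq u} \Bigr]. \tag{4.19} \]

We apply this relation to $f$ the eigenfunction correlator. The quantity which one then finds on the right side of (4.19) can be rewritten with the help of Lemma 4.4. The claimed bound then easily follows using the fact that $|a + b|^s \le |a|^s + |b|^s$ (for $0 < s < 1$), and $\sum_\kappa \langle \delta_{\mathbf{x}}, \Pi_\kappa(E)\, \delta_{\mathbf{x}} \rangle^{1-s}\, |\langle \delta_{\mathbf{x}}, \Pi_\kappa(E)\, \delta_{\mathbf{w}} \rangle|^s \le 1$, in (4.16).

Applications of the above result are restricted to bounded intervals $I$. In order to control the eigenfunction correlator associated with the tails of the spectrum we also use: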


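Lemma 4.6. Let $\Omega \subseteq \mathbb{Z}^d$ and $E \ge 0$. Then for every $\mathbf{x} \in \mathcal{C}^{(n)}(\Omega)$:

\[ \mathbb{E}\Bigl[ \bigl\langle \delta_{\mathbf{x}}^{(n)}, P_{\mathbb{R} \setminus (-E, E)}\bigl(H_\Omega^{(n)}\bigr)\, \delta_{\mathbf{x}}^{(n)} \bigr\rangle \Bigr] \le \mathbb{E}\bigl[ e^{|V(0)|} \bigr] \exp\Bigl( \min\{1, (n|\lambda|)^{-1}\}\, \bigl( 2dn + \|U(\alpha)\|_\infty - E \bigr) \Bigr). \tag{4.20} \]

Proof. The Chebyshev-type inequality, $1_{\mathbb{R} \setminus (-E, E)}(x) \le e^{-tE} \bigl( e^{tx} + e^{-tx} \bigr)$, reduces the bound to one on the semigroup, for which we employ the Feynman-Kac representation (cf. [CL90, Prop. II.3.12]) to show that for any $t > 0$:

\[ \mathbb{E}\bigl[ \langle \delta_{\mathbf{x}}, e^{tH_\Omega} \delta_{\mathbf{x}} \rangle \bigr] \le \mathbb{E}\Bigl[ \int \exp\Bigl\{ \int_0^t \Bigl( \sum_{u \in \Omega} \lambda V(u)\, N_u(\mathbf{y}(s)) + U(\mathbf{y}(s); \alpha) \Bigr) ds \Bigr\}\, \nu_\Omega^{(\mathbf{x}; t)}(d\mathbf{y}) \Bigr] \le \mathbb{E}\bigl[ e^{nt|\lambda||V(0)|} \bigr] \exp\bigl\{ t\, (2dn + \|U(\alpha)\|_\infty) \bigr\}\, \bigl\langle \delta_{\mathbf{x}}, e^{t \sum_j \Delta_{\Omega, j}} \delta_{\mathbf{x}} \bigr\rangle, \tag{4.21} \]

where $\nu_\Omega^{(\mathbf{x}; t)}$ is the measure generated by $\sum_j \Delta_{\Omega, j}$ on paths $\{\mathbf{y}(s)\}_{0 \le s \le t}$, starting in and returning to $\mathbf{x}$ in time $t$. The last inequality is a version of Jensen applied to the average $(nt)^{-1} \sum_{u \in \Omega} \int_0^t (\cdot)\, N_u(\mathbf{y}(s))\, ds$ in the exponential. A similar bound holds for $t < 0$. The proof is completed using the operator bound $\langle \delta_{\mathbf{x}}, e^{t\Delta} \delta_{\mathbf{x}} \rangle \le e^{2dn|t|}$ for the free semigroup and the choice $t = \min\{1, (n|\lambda|)^{-1}\}$.

5. Implications of Localization in Subsystems

In the induction step of the proof of the main result, we shall be considering for a system of $n$ particles the consequences of localization bounds which are already established for subsystems. In this section we present some results which will be useful for that purpose; first considering the case when the subsystems are combined without interaction, and then the more involved situation where the two subsystems are coupled via short range interaction. For a partition of the index set $\{1, \ldots, n\}$ into disjoint subsets $J$ and $K$, we denote the coordinates of the two subsystems as $\mathbf{x}_J := \{x_j\}_{j \in J}$ and correspondingly $\mathbf{x}_K$.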

5.1. Localization for non-interacting systems. When two subsystems are put together with no interaction, the Hamiltonian is, in natural notation,

\[ H_\Omega^{(J,K)} := H_\Omega^{(J)} \oplus H_\Omega^{(K)}, \tag{5.1} \]

acting in $\ell^2(\Omega)^{|J|} \otimes \ell^2(\Omega)^{|K|}$. A complete set of eigenfunctions of the operator sum can be obtained by taking products of the subsystems’ eigenfunctions. Clearly, if the subsystems exhibited spectral localization, that property will be inherited by the composite system. The question of localization properties of the corresponding Green function, which is an important tool for our analysis, is a bit less immediate: $G_\Omega^{(J,K)}$ is a convolution, with


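respect to the energy, of the subsystems’ Green functions, i.e., for any $\mathbf{x} = (\mathbf{x}_J, \mathbf{x}_K)$, $\mathbf{y} = (\mathbf{y}_J, \mathbf{y}_K)$, and $z \in \mathbb{C} \setminus \mathbb{R}$:

\[ G_\Omega^{(J,K)}(\mathbf{x}, \mathbf{y}; z) = \oint_{\mathcal{C}} G_\Omega^{(J)}(\mathbf{x}_J, \mathbf{y}_J; z - \zeta)\, G_\Omega^{(K)}(\mathbf{x}_K, \mathbf{y}_K; \zeta)\, \frac{d\zeta}{2\pi i}, \tag{5.2} \]

where $\mathcal{C}$ is any closed contour in $\mathbb{C}$ which encloses the spectrum of $H_\Omega^{(K)}$ but none of $H_\Omega^{(J)} - z$. Given the singular nature of the $E$-dependence of the Green function, localization in the sense of Definition 3.3 is not immediately obvious. To establish it, we take a detour via eigenfunction correlators. These are less singular in $E$, but share the convolution structure, which in this case can be written in the form:

\[ Q_\Omega^{(J,K)}(\mathbf{x}, \mathbf{y}; I) = \int Q_\Omega^{(J)}(\mathbf{x}_J, \mathbf{y}_J; I - E)\, Q_\Omega^{(K)}(\mathbf{x}_K, \mathbf{y}_K; dE) \le Q_\Omega^{(J)}(\mathbf{x}_J, \mathbf{y}_J; \mathbb{R})\, Q_\Omega^{(K)}(\mathbf{x}_K, \mathbf{y}_K; \mathbb{R}). \tag{5.3} \]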

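The following result will allow us to apply this relation.

Lemma 5.1. Let $\Gamma \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ be a sub-conical domain of uniform $n$-particle localization. Then there exist $A, \xi \in (0, \infty)$ such that the eigenfunction correlator corresponding to up to $n$ particles, i.e., $k \in \{1, \ldots, n\}$, is exponentially bounded for all $(\lambda, \alpha) \in \Gamma$, and all $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^k$:

\[ \sup_{\Omega \subset \mathbb{Z}^d} \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{x}, \mathbf{y}; \mathbb{R}) \bigr] \le A\, e^{-\mathrm{dist}_H(\mathbf{x}, \mathbf{y})/\xi}. \tag{5.4} \]

Proof. As an immediate consequence of (4.5), Theorem 4.5, and Lemma A.3, we know that there are $A, \xi \in (0, \infty)$ such that

\[ \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{x}, \mathbf{y}; [-E, E]) \bigr] \le A\, \frac{E}{|\lambda|}\, e^{-\mathrm{dist}_H(\mathbf{x}, \mathbf{y})/\xi}, \tag{5.5} \]

for any $E > 0$ and any $(\lambda, \alpha) \in \Gamma$. For a bound which is uniform in $E$ we combine this with Lemma 4.6, which with the help of the Cauchy-Schwarz inequality implies:

\[ \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{x}, \mathbf{y}; \mathbb{R}) \bigr] - \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{x}, \mathbf{y}; [-E, E]) \bigr] \le \Bigl( \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{x}, \mathbf{x}; \mathbb{R} \setminus [-E, E]) \bigr]\, \mathbb{E}\bigl[ Q_\Omega^{(k)}(\mathbf{y}, \mathbf{y}; \mathbb{R} \setminus [-E, E]) \bigr] \Bigr)^{1/2} \le \mathbb{E}\bigl[ e^{|V(0)|} \bigr] \exp\Bigl( \min\{1, (k|\lambda|)^{-1}\}\, \bigl( 2dk + \|U(\alpha)\|_\infty - E \bigr) \Bigr). \tag{5.6} \]

Choosing the cutoff at $E = 2dk + \|U(\alpha)\|_\infty + \max\{1, k|\lambda|\}\, \mathrm{dist}_H(\mathbf{x}, \mathbf{y})/\xi$ one obtains the claimed exponential bound. Using the fact that $\Gamma$ is sub-conical in the sense of Definition 3.2, the above argument yields a $(\lambda, \alpha)$-independent amplitude “$A$” in (5.4).

The two-way relation between the eigenfunction correlators and fractional moments of the Green function, and the factorization property (5.3), allow us now to establish: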


Theorem 5.2. Let $\Gamma \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ be a sub-conical domain of uniform $n$-particle localization. For any $s \in (0, 1)$ there are $\xi, A \in (0, \infty)$ such that the Green function of the composition (5.1) of any pair $(J, K)$ of systems of at most $n$ particles (i.e., $\max\{|J|, |K|\} \le n$) is bounded for all $(\lambda, \alpha) \in \Gamma$ and all $\mathbf{x} = (\mathbf{x}_J, \mathbf{x}_K)$, $\mathbf{y} = (\mathbf{y}_J, \mathbf{y}_K)$:

\[ \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}}\; \sup_{\Omega \subset \mathbb{Z}^d}\; \widehat{\mathbb{E}}_I\Bigl[ \bigl| G_\Omega^{(J,K)}(\mathbf{x}, \mathbf{y}) \bigr|^s \Bigr] \le A\, e^{-\mathrm{dist}_H^{(J,K)}(\mathbf{x}, \mathbf{y})/\xi}, \tag{5.7} \]

with $\mathrm{dist}_H^{(J,K)}(\mathbf{x}, \mathbf{y}) := \max\{\mathrm{dist}_H(\mathbf{x}_J, \mathbf{y}_J), \mathrm{dist}_H(\mathbf{x}_K, \mathbf{y}_K)\}$.

Proof. By Theorem 4.2, and Jensen’s inequality, for any $s \in (0, 1)$ there is a constant $C = C(s) < \infty$ such that

\[ \widehat{\mathbb{E}}_I\Bigl[ \bigl| G_\Omega^{(J,K)}(\mathbf{x}, \mathbf{y}) \bigr|^s \Bigr] \le \frac{C}{|I|^s} \Bigl( \mathbb{E}\bigl[ Q_\Omega^{(J,K)}(\mathbf{x}, \mathbf{y}; I) \bigr] \Bigr)^s. \tag{5.8} \]

The claim follows by combining: i) the product formula (5.3), ii) the uniform bound $Q \le 1$, and iii) the bound of Lemma 5.1, applied to the factor with the greater separation.

5.2. Decay away from clustered configurations. We now turn to the more involved situation, where a system consists of subsystems which separately exhibit localization, but which are put together with an interaction. Intuitively, the decay of the fractional moments for the subsystems should imply smallness of the corresponding kernel for the composite system for pairs of configurations where at least one of the pair can be split into two well separated parts. To express this idea in a bound, we shall use the notion of the splitting width of a configuration:

\[ \ell(\mathbf{x}) := \max_{\substack{J, K \\ J \,\dot\cup\, K = \{1, \ldots, n\}}}\; \min_{j \in J,\, k \in K} |x_j - x_k|, \tag{5.9} \]

where the maximum runs over all the two-set partitions of the index set $\{1, \ldots, n\}$. It is easy to see that $\mathrm{diam}(\mathbf{x})/(n - 1) \le \ell(\mathbf{x}) \le \mathrm{diam}(\mathbf{x})$ (cf. Appendix A).
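The splitting width (5.9) and the two-sided comparison with the diameter can be checked by brute force on small configurations. The sketch below is illustrative only: the function names are hypothetical, the Euclidean norm is assumed for $|x_j - x_k|$, and the enumeration over all two-set partitions is exponential in $n$, so it is meant for a handful of particles.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python >= 3.8)


def diameter(x):
    """diam(x) = max over pairs of |x_j - x_k| (Euclidean norm assumed)."""
    return max(dist(a, b) for a in x for b in x)


def splitting_width(x):
    """Brute-force evaluation of Eq. (5.9); requires at least two particles."""
    n = len(x)
    indices = range(n)
    best = 0.0
    # Subsets J of size up to n // 2 cover every unordered partition {J, K}.
    for size in range(1, n // 2 + 1):
        for J in combinations(indices, size):
            K = [k for k in indices if k not in J]
            gap = min(dist(x[j], x[k]) for j in J for k in K)
            best = max(best, gap)
    return best


# Hypothetical example: a tight pair and one distant particle in Z^2.
x = [(0, 0), (1, 0), (5, 0)]
n = len(x)
print(splitting_width(x), diameter(x))
print(diameter(x) / (n - 1) <= splitting_width(x) <= diameter(x))
```

For this configuration the optimal split separates the distant particle from the pair, which is the kind of decomposition used in the proof of Theorem 5.3.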

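Theorem 5.3. Let $\Gamma_{n-1}^{(p)} \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ be a sub-conical domain of uniform $(n-1)$-particle localization, with $n \ge 2$. Then there exist some $s \in (0, 1)$ and $A, \xi < \infty$ such that for all $(\lambda, \alpha) \in \Gamma_{n-1}^{(p)}$ and all $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^n$:

\[ \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}}\; \sup_{\Omega \subseteq \mathbb{Z}^d}\; \widehat{\mathbb{E}}_I\bigl[ |G_\Omega^{(n)}(\mathbf{x}, \mathbf{y})|^s \bigr] \le A \exp\Bigl( -\frac{1}{\xi} \min\bigl\{ \mathrm{dist}_H(\mathbf{x}, \mathbf{y}),\, \max\{\ell(\mathbf{x}), \ell(\mathbf{y})\} \bigr\} \Bigr). \tag{5.10} \]

Proof. We fix $\mathbf{x}, \mathbf{y} \in \mathcal{C}^{(n)}(\Omega)$ and assume without loss of generality that $\ell(\mathbf{x}) \ge \ell(\mathbf{y})$. We then split $\mathbf{x}$ into two clusters, $\mathbf{x}_J$, $\mathbf{x}_K$, such that

\[ \ell(\mathbf{x}) = \min_{j \in J,\, k \in K} |x_j - x_k|. \tag{5.11} \]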


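Between the two clusters, $\mathbf{x}_J$, $\mathbf{x}_K$, we remove all interactions, leaving the intra-cluster interaction untouched. The resulting operator is a direct sum of two non-interacting subsystems, $H_\Omega^{(J,K)} := H_\Omega^{(J)} \oplus H_\Omega^{(K)}$ acting in $\ell^2(\Omega)^{|J|} \otimes \ell^2(\Omega)^{|K|}$, where
$$H_\Omega^{(J)} := \sum_{j=1}^{|J|} \left[ -\Delta_j + \lambda V_\omega(x_j) \right] + \sum_{k=2}^{|J|} \alpha_k \sum_{\substack{A \subset \mathbb{Z}^d:\ |A| = k \\ \operatorname{diam} A \le \ell_U}} U_A(N_A(\mathbf{x}_J)) \tag{5.12}$$
and similarly for $H_\Omega^{(K)}$. Denoting the Green function corresponding to $H_\Omega^{(J,K)}$ by $G_\Omega^{(J,K)}$ and using $|a + b|^s \le |a|^s + |b|^s$, we thus have
$$\mathbb{E}_I\left[|G_\Omega(\mathbf{x},\mathbf{y})|^s\right] \le \mathbb{E}_I\left[|G_\Omega^{(J,K)}(\mathbf{x},\mathbf{y})|^s\right] + \mathbb{E}_I\left[|G_\Omega^{(J,K)}(\mathbf{x},\mathbf{y}) - G_\Omega(\mathbf{x},\mathbf{y})|^s\right]. \tag{5.13}$$
Since $G_\Omega^{(J,K)}$ is a Green function of a composite system, whose parts are assumed to exhibit uniform $(n-1)$-particle localization, Lemma 5.2 guarantees the existence of $s \in (0,1)$ and $A, \xi \in (0,\infty)$ such that for all $(\lambda,\alpha) \in \Xi_{n-1}^{(p)}$ and all $\mathbf{x}, \mathbf{y}$:
$$\mathbb{E}_I\left[|G_\Omega^{(J,K)}(\mathbf{x},\mathbf{y})|^s\right] \le A\, e^{-\operatorname{dist}_H^{(J,K)}(\mathbf{x},\mathbf{y})/\xi} \le A\, e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi}, \tag{5.14}$$
where the last step is by the general relation $\operatorname{dist}_H^{(J,K)}(\mathbf{x},\mathbf{y}) \ge \operatorname{dist}_H(\mathbf{x},\mathbf{y})$. To bound the second term we use the resolvent identity
$$\Delta := G_\Omega^{(J,K)}(\mathbf{x},\mathbf{y};z) - G_\Omega(\mathbf{x},\mathbf{y};z) = \sum_{\mathbf{w} \in C^{(n)}(\Omega)} G_\Omega^{(J,K)}(\mathbf{x},\mathbf{w};z)\, U_{J,K}(\mathbf{w})\, G_\Omega(\mathbf{w},\mathbf{y};z), \tag{5.15}$$
where $U_{J,K} = H - H^{(J,K)}$. In order to be able to apply the Cauchy-Schwarz inequality we first decrease the exponent with the help of Hölder's inequality,
$$\mathbb{E}_I\left[|\Delta|^s\right] \le \mathbb{E}_I\left[|\Delta|^{s/2}\right]^{\frac{1-s}{2+s}}\, \mathbb{E}_I\left[|\Delta|^{\frac{3s(1+s)}{2(1+2s)}}\right]^{\frac{1+2s}{2+s}} \le c\, |\lambda|^{-\frac{3s(1+s)}{2(2+s)}}\, \mathbb{E}_I\left[|\Delta|^{s/2}\right]^{\frac{1-s}{2+s}}, \tag{5.16}$$
where the last inequality is due to (2.2). Inserting (5.15) and using the Cauchy-Schwarz inequality together with (2.2) we thus obtain
$$\mathbb{E}_I\left[|\Delta|^s\right] \le \frac{c}{|\lambda|^s} \sum_{\mathbf{w} \in C^{(n)}(\Omega)} |U_{J,K}(\mathbf{w})|^{s\beta} \left( \mathbb{E}_I\left[|G_\Omega^{(J,K)}(\mathbf{x},\mathbf{w};z)|^s\right] \right)^{\beta}, \tag{5.17}$$
where we abbreviated $\beta := \frac{1-s}{2(2+s)}$.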

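To estimate the right side note that
$$\sup_{\mathbf{x}} |U_{J,K}(\mathbf{x})| = \sup_{\mathbf{x}} \Bigg| \sum_{k=2}^{|J|} \alpha_k \sum_{\substack{A \subset \mathbb{Z}^d:\ |A| = k \\ \operatorname{diam} A \le \ell_U}} U_A(N_A(\mathbf{x})) \times 1\big[\text{there is } j \in J,\ k \in K \text{ s.t. } |x_j - x_k| \le \ell_U\big] \Bigg| \le \|U(\alpha)\|_\infty. \tag{5.18}$$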


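Moreover, the distance of $\mathbf{x}$ to the support of $U_{J,K}$ is bounded from below,
$$\inf_{\mathbf{w} \in \operatorname{supp} U_{J,K}} \operatorname{dist}_H^{(J,K)}(\mathbf{x},\mathbf{w}) \ge \ell(\mathbf{x}) - \ell_U, \tag{5.19}$$
by the triangle inequality and (5.11). As a consequence, (5.14) and Lemma A.3 yield
$$\left( \frac{|\lambda|}{\|U(\alpha)\|_\infty^{\beta}} \right)^{s} \exp\left( \frac{\beta\, \ell(\mathbf{x})}{2\xi} \right) \mathbb{E}_I\left[|\Delta|^s\right] \le A \sum_{\mathbf{w} \in \mathbb{Z}^{nd}} \exp\left( -\frac{\beta}{2\xi} \operatorname{dist}_H^{(J,K)}(\mathbf{x},\mathbf{w}) \right) < \infty. \tag{5.20}$$
The proof is concluded by noting that $\|U(\alpha)\|_\infty^{\beta} \le C |\lambda|$ for some $C < \infty$ for all $(\lambda,\alpha) \in \Xi_{n-1}^{(p)}$ since the latter is sub-conical.

The above result will allow us to insert in sums over n-particle configurations a restriction to ones of a limited diameter. The following bound will be useful for estimates of the remainder.

Corollary 5.4. Under the hypothesis of Theorem 5.3, there exist $s \in (0,1)$ and $A, \xi < \infty$ such that for all $0 < r' \le r$, the quantity
$$k_s(\Omega, r, r') := \sup_{|x-y| \ge 2r} \sum_{\mathbf{y} \in C_r^{(n)}(\Omega;y)} \ \sum_{\mathbf{x} \in C^{(n)}(\Omega;x) \setminus C_{r'}^{(n)}(\Omega;x)} \mathbb{E}\left[|G_\Omega(\mathbf{x},\mathbf{y};z)|^s\right] \tag{5.21}$$
satisfies for all $(\lambda,\alpha) \in \Xi_{n-1}^{(p)}$:
$$k_s(\Omega, r, r') \le A\, r^{d(n-1)}\, |\Omega|^{n-1} \exp\left( -\frac{r'}{(n-1)\xi} \right).$$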

Proof. By Lemma A.2, for all $\mathbf{x}, \mathbf{y}$ in the sum in (5.21): $\operatorname{dist}_H(\mathbf{x},\mathbf{y}) \ge |x - y| - r \ge r \ge r'/(n-1)$. Theorem 5.3 hence guarantees that for some $s \in (0,1)$ and $A, \xi < \infty$:
$$\mathbb{E}\left[|G_\Omega(\mathbf{x},\mathbf{y};z)|^s\right] \le A \exp\left( -\frac{r'}{(n-1)\xi} \right). \tag{5.22}$$
The proof is completed by bounding the number of configurations in $C_r^{(n)}(\Omega;y)$ and $C^{(n)}(\Omega;x)$ by $|C_r(\Omega;y)| \le n\,(4r)^{d(n-1)}$ and $|C(\Omega;x)| \le n\,|\Omega|^{n-1}$, respectively.

6. Proof of the Main Result

We now turn to the proof of Theorem 1.1. That is, we shall prove that there exists a decreasing sequence of non-empty sub-conical domains in the parameter space,
$$\Xi_1^{(p)} \supseteq \cdots \supseteq \Xi_n^{(p)} \supseteq \cdots, \tag{6.1}$$


such that for each $n$, the Hamiltonian (1.1) with parameters in $\Xi_n^{(p)}$ exhibits uniform n-particle localization in the sense of Definition 3.3; each of the domains includes regimes of large disorder and of weak interaction. The proof will proceed by induction on $n$, the induction step consisting of a constructive restriction of the domain. Establishing a sub-conical domain $\Xi_n^{(p)}$ of uniform n-particle localization is sufficient for our purpose, since Lemma 5.1 then implies that there are $A, \xi \in (0,\infty)$ such that for all $(\lambda,\alpha) \in \Xi_n^{(p)}$, all $k \in \{1,\dots,n\}$, and all $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^k$:
$$\sup_{\Omega \subset \mathbb{Z}^d} \mathbb{E}\left[ Q_\Omega^{(k)}(\mathbf{x},\mathbf{y};\mathbb{R}) \right] \le A\, e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi}. \tag{6.2}$$

Spectral and dynamical localization for H (k) , as claimed in 1 and 2 of Theorem 1.1, follow using Theorem 4.1. Assertion 3 on the shape of the nested decreasing domains of uniform localization will be verified in the course of the inductive construction.

6.1. Analyzing clustered configurations. An essential component of the proof is to show the exponential decay of the finite volume quantities:
$$B_s^{(n)}(L) := \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}} \; \sup_{\Omega \subseteq \Lambda_L} |\partial\Lambda_L| \sum_{y \in \partial\Lambda_L} \ \sum_{\mathbf{x} \in C_{r_L}^{(n)}(\Omega;0)} \ \sum_{\mathbf{y} \in C_{r_L}^{(n)}(\Omega;y)} \mathbb{E}_I\left[ |G_\Omega^{(n)}(\mathbf{x},\mathbf{y})|^s \right], \tag{6.3}$$
where $\Lambda_L := [-L, L]^d \cap \mathbb{Z}^d$, and $s \in (0,1)$. One may note that the sum is restricted to configurations in the form of separate clouds of particles with diameter less than $r_L := L/2$, which are guaranteed to include at least a pair of sites at distance $L \in \mathbb{N}$ apart. By the Wegner estimate (2.2), for all $s \in (0,1)$ and $L \in \mathbb{N}$:
$$B_s^{(n)}(L) \le \frac{C\, n^2}{|\lambda|^s}\, L^{2d(n-1)}, \tag{6.4}$$
where the constant $C = C(s,d) < \infty$ is independent of $(\lambda,\alpha) \in \mathbb{R}^n$. The following rescaling principle will be used to show that $B_s^{(n)}(L)$ decays exponentially provided that there is some $L_0$ for which it is sufficiently small. In its formulation, we consider length scales which grow as
$$L_{k+1} := 2(L_k + 1), \tag{6.5}$$
i.e., $L_k = 2^k (L_0 + 2) - 2$ for all $k \in \mathbb{N}_0$.
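As a quick sanity check of the closed form stated with (6.5), the following minimal Python sketch iterates the recursion and compares it with $2^k(L_0+2)-2$. The function names and the starting scale are illustrative assumptions, not part of the paper.

```python
def length_scales(L0, kmax):
    """Iterate L_{k+1} = 2 (L_k + 1) starting from L_0, as in (6.5)."""
    scales = [L0]
    for _ in range(kmax):
        scales.append(2 * (scales[-1] + 1))
    return scales


def closed_form(L0, k):
    """Closed-form expression L_k = 2^k (L_0 + 2) - 2."""
    return 2**k * (L0 + 2) - 2


if __name__ == "__main__":
    L0 = 3  # arbitrary illustrative starting scale
    for k, Lk in enumerate(length_scales(L0, 5)):
        print(k, Lk, closed_form(L0, k))
```

The shifted scales $L_k + 2$ simply double at each step, which is why they are the natural argument for the rescaling lemma that follows.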

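Theorem 6.1. Let $\Xi_{n-1}^{(p)} \subset \mathbb{R}_+ \times \mathbb{R}^{p-1}$ be a sub-conical domain of uniform $(n-1)$-particle localization, with $n \ge 2$. Then there exist $s \in (0,1)$, $a, A, p < \infty$, and $\nu > 0$ such that
$$B_s^{(n)}(L_{k+1}) \le \frac{a}{|\lambda|^s}\, B_s^{(n)}(L_k)^2 + A\, L_{k+1}^{2p}\, e^{-2\nu L_k} \tag{6.6}$$
for all $(\lambda,\alpha) \in \Xi_{n-1}^{(p)}$ and all $k \in \mathbb{N}_0$.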


In order to keep the flow of the main argument clear, we postpone the proof of this assertion to Subsect. 6.3. Quantities satisfying rescaling inequalities as in (6.6) are exponentially decreasing provided that they are small on some scale. This is the content of the following lemma. For its application we note that
$$S(\tilde L_k) := B_s^{(n)}(L_k), \qquad \tilde L_k := 2^k (L_0 + 2), \tag{6.7}$$
satisfy (6.8).

Lemma 6.2. Let $S(L)$ be a non-negative sequence satisfying:
$$S(2L) \le a\, S(L)^2 + b\, L^{2p}\, e^{-2\nu L} \tag{6.8}$$
for some $a, b, p \in [0,\infty)$ and $\nu > 0$. If for some $L_0 > 0$ there exists $\eta < \infty$ such that
1. $\eta^2 \ge ab + \eta\, 2^p / L_0^p$,
2. $1 > a\, S(L_0) + \eta\, L_0^p\, e^{-\nu L_0} =: e^{-\mu L_0}$,
then for all $k \in \mathbb{N}_0$:
$$S(2^k L_0) \le a^{-1} \exp\left( -\mu\, 2^k L_0 \right). \tag{6.9}$$

Proof. From (6.8) it follows that the quantity $R(L) := a S(L) + \eta L^p e^{-\nu L}$ satisfies, for all $L \ge L_0$,
$$R(2L) \le (a S(L))^2 + \left( ab + \eta\, \frac{2^p}{L_0^p} \right) L^{2p} e^{-2\nu L} \le (a S(L))^2 + \eta^2 L^{2p} e^{-2\nu L} \le R(L)^2. \tag{6.10}$$
The claimed bound (6.9) follows by iteration, using $R(L_0) = e^{-\mu L_0}$.
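The mechanism of Lemma 6.2 can be illustrated numerically. The sketch below, a minimal illustration rather than anything taken from the paper, saturates the rescaling inequality (6.8) at every doubling step (the worst case the lemma allows) and compares the resulting values with the right-hand side of (6.9). All parameter values and the function name `rescaling_check` are assumptions chosen only so that the two conditions of the lemma hold.

```python
import math


def rescaling_check(a, b, p, nu, L0, S0, eta, steps=6):
    """Iterate the worst case of (6.8) and compare with the bound (6.9).

    All numerical inputs are illustrative assumptions; they are not taken
    from the paper. Returns mu and a list of (k, L, S, bound) tuples.
    """
    # Condition 1 of Lemma 6.2.
    if eta**2 < a * b + eta * 2**p / L0**p:
        raise ValueError("condition 1 of Lemma 6.2 fails")
    # Condition 2 of Lemma 6.2; R(L_0) = exp(-mu L_0) defines mu.
    R0 = a * S0 + eta * L0**p * math.exp(-nu * L0)
    if not 0 < R0 < 1:
        raise ValueError("condition 2 of Lemma 6.2 fails")
    mu = -math.log(R0) / L0

    rows = []
    S, L = S0, L0
    for k in range(steps + 1):
        bound = math.exp(-mu * L) / a  # right-hand side of (6.9) at L = 2^k L_0
        rows.append((k, L, S, bound))
        # Saturate (6.8): the largest value S(2L) is allowed to take.
        S = a * S * S + b * L ** (2 * p) * math.exp(-2 * nu * L)
        L *= 2
    return mu, rows


if __name__ == "__main__":
    mu, rows = rescaling_check(a=2.0, b=1.0, p=1, nu=1.0, L0=4, S0=0.1, eta=2.0)
    print("mu =", mu)
    for k, L, S, bound in rows:
        print(k, L, S, bound, S <= bound)
```

Because $R(2L) \le R(L)^2$, the quantity $R$ is squared at each doubling of the scale, which is exactly exponential decay in $L$.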

6.2. The inductive proof.

Proof of Theorem 1.1. As explained above in this section, it suffices to establish sub-conical domains of uniform n-particle localization (in the sense of Definition 3.3). As an induction anchor we use the fact [AM93] that there is some $\lambda_1 \in (0,\infty)$ such that $\Xi_1^{(p)} := (\lambda_1, \infty) \times \mathbb{R}^{p-1}$ serves as a domain of uniform localization for the one-particle Hamiltonian $H^{(1)}$. (Note that the last $(p-1)$ components of $\Xi_1^{(p)}$ are irrelevant for $H^{(1)}$.) In the induction step ($n-1 \to n$), we will first construct a robust, sub-conical domain $\Xi_n^{(p)} \subseteq \Xi_{n-1}^{(p)}$ and pick $L_0 \in \mathbb{N}$ sufficiently large such that at some inverse localization length $\mu > 0$:
$$B_s^{(n)}(L_k) \le \frac{|\lambda|^s}{a}\, e^{-\mu (L_k + 2)} \le 2\, B_s^{(n)}(L_0)\, e^{-\mu (L_k - 2 L_0)}, \tag{6.11}$$
for all $k \in \mathbb{N}_0$ and all $(\lambda,\alpha) \in \Xi_n^{(p)}$. Based on the induction hypothesis, Theorem 6.1 and Lemma 6.2, this will be done separately in the two regimes of interest:


Strong disorder regime. Here we choose $L_0 \in \mathbb{N}$ by the condition $2^p e^{-\nu L_0} < 1/4$. Based on that we pick $\eta$ in the range
$$\frac{2^{p+1}}{L_0^p} < \eta < \frac{e^{\nu L_0}}{2 L_0^p}, \tag{6.12}$$
which is non-empty by our previous choice of $L_0$. Notice that for this choice $\eta\, 2^p / L_0^p \le \eta^2/2$. We now restrict the domain $\Xi_{n-1}^{(p)}$ by choosing $\lambda_n \ge \lambda_{n-1}$ large enough such that for all $\lambda \ge \lambda_n$:
$$\frac{aA}{\lambda^s} < \frac{\eta^2}{2} \qquad \text{and} \qquad \frac{a}{\lambda^s}\, B_s^{(n)}(L_0) < \frac{1}{2}. \tag{6.13}$$
This is possible by (6.4) and the fact that $A$ is independent of $\lambda$. Lemma 6.2 hence guarantees that
$$\mu = -\frac{\log\left( a \lambda_n^{-s} B_s^{(n)}(L_0) + 1/2 \right)}{L_0} \ge -\frac{\log\left( 2 a \lambda_n^{-s} B_s^{(n)}(L_0) \right)}{2 L_0} > 0 \tag{6.14}$$
serves as an inverse localization length in (6.11), which is valid for all $(\lambda,\alpha)$ in the following regime of strong disorder:
$$D_n^{(p)} := \left\{ (\lambda,\alpha) \,\middle|\, \lambda > \lambda_n \right\} \cap \Xi_{n-1}^{(p)}. \tag{6.15}$$

Weak interaction regime. The localization bounds in [AM93] and arguments as in Theorem 5.2 ensure that $B_s^{(n)}(L) \to 0$ as $L \to \infty$ for $\alpha = 0$ and all $\lambda > \lambda_1$. We then pick $L_0 \in \mathbb{N}$ and $\eta$ such that for $\alpha = 0$:
$$\frac{a}{\lambda_1^s}\, B_s^{(n)}(L_0) + \eta L_0^p e^{-\nu L_0} < 1, \tag{6.16}$$
$$\frac{aA}{\lambda_1^s} + \eta\, \frac{2^p}{L_0^p} \le \eta^2. \tag{6.17}$$
Since the finite-volume quantity $B_s^{(n)}(L_0)$ is continuous in $\alpha$ at 0, for every $\lambda > \lambda_1$ there exist $\alpha(\lambda)_j > 0$, $j \in \{1,\dots,p\}$, such that (6.16) and (6.17) are maintained for all $(\lambda,\alpha)$ in the following regime of weak interaction:
$$I_n^{(p)} := \left\{ (\lambda,\alpha) \,\middle|\, \lambda > \lambda_1 \text{ and } |\alpha_j| < \alpha(\lambda)_j \text{ for all } j \in \{1,\dots,p\} \right\} \cap \Xi_{n-1}^{(p)}. \tag{6.18}$$
By Lemma 6.2, $\mu$ as in (6.14) with $\lambda_n$ replaced by $\lambda_1$ hence serves as an inverse localization length in (6.11), which holds for all $(\lambda,\alpha) \in I_n^{(p)}$.


Summarizing, we have thus established (6.11) for all $(\lambda,\alpha) \in D_n^{(p)} \cup I_n^{(p)}$ and a suitably large $L_0 \in \mathbb{N}$. By further restriction, we may thus find a robust, but sub-conical domain
$$(\emptyset \ne)\ \Xi_n^{(p)} \subseteq D_n^{(p)} \cup I_n^{(p)} \tag{6.19}$$

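for which (6.11) holds. To complete the induction step, it is required to establish the exponential decay (3.4) of the n-particle Green function for all $(\lambda,\alpha) \in \Xi_n^{(p)}$. For this purpose, we select $x \in \cup_k \{x_k\}$ and $y \in \cup_k \{y_k\}$ such that $\operatorname{dist}_H(\mathbf{x},\mathbf{y}) = |x - y|$. In case $|x - y| > L_0$, which we may assume without loss of generality (wlog), there exists a unique $k \in \mathbb{N}_0$ such that the box $\Lambda_{L_k}(x)$ which is centered at $x$ satisfies:
• $y \notin \Lambda_{L_k}(x)$ and $y \in \Lambda_{L_{k+1}}(x)$,
• $L_k \le |x - y| \le c\, L_k$ for some $c \in (0,\infty)$.
We may furthermore assume wlog that $\operatorname{diam} \mathbf{x} < L_k/2$ is sufficiently small such that $\mathbf{x} \in C^{(n)}(\Omega \cap \Lambda_{L_k}(x))$, since otherwise (3.4) (with $k = n$) follows from Theorem 5.3. The resolvent identity, in which we remove all the terms in the Laplacian which connect $\Omega \cap \Lambda_{L_k}(x)$ and its complement, then implies
$$\mathbb{E}_I\left[|G_\Omega(\mathbf{x},\mathbf{y})|^s\right] \le \sum_{u \in \partial\Lambda_{L_k}(x)} \sum_{\substack{\mathbf{w} \in C^{(n)}(\Omega \cap \Lambda_{L_k};u) \\ \mathbf{w}' \in C^{(n)}(\Omega;\Omega \setminus \Lambda_{L_k}(x))}} |H_{\mathbf{w},\mathbf{w}'}|^s\, \mathbb{E}_I\left[ |G_{\Omega \cap \Lambda_{L_k}(x)}(\mathbf{x},\mathbf{w})|^s\, |G_\Omega(\mathbf{w}',\mathbf{y})|^s \right] \le \frac{C}{|\lambda|^s} \sup_{\widehat\Omega:\ \widehat\Omega \subset \Lambda_{L_k}(x)} \sum_{u \in \partial\Lambda_{L_k}(x)} \sum_{\mathbf{w} \in C^{(n)}(\widehat\Omega;u)} \mathbb{E}_I\left[ |G_{\widehat\Omega}(\mathbf{x},\mathbf{w})|^s \right], \tag{6.20}$$
where $H_{\mathbf{w},\mathbf{w}'}$ are the matrix elements of the Hamiltonian between $\delta_{\mathbf{w}}$ and $\delta_{\mathbf{w}'}$. Inequality (6.20) follows from (2.2) by first conditioning on all random variables apart from those associated with sites $y$ and $u'$ which are both outside $\Lambda_{L_k}(x)$. The sum in (6.20) is split into two parts, depending on the diameter of the configuration $\mathbf{w}$:
$$\mathbb{E}_I\left[|G_\Omega^{(n)}(\mathbf{x},\mathbf{y})|^s\right] \le \frac{C}{|\lambda|^s}\, B_s^{(n)}(L_k) + \frac{C}{|\lambda|^s} \sup_{\widehat\Omega:\ \widehat\Omega \subset \Lambda_{L_k}(x)} \sum_{u \in \partial\Lambda_{L_k}(x)} \sum_{\substack{\mathbf{w} \in C^{(n)}(\widehat\Omega;u) \\ \operatorname{diam}(\mathbf{w}) \ge r_{L_k}}} \mathbb{E}_I\left[ |G_{\widehat\Omega}^{(n)}(\mathbf{x},\mathbf{w})|^s \right]. \tag{6.21}$$
By (6.11), both in the strong disorder regime and the regime of weak interactions, the first term on the right side is exponentially decaying in $L_k$. To bound the second term we employ Theorem 5.3 again,
$$\mathbb{E}_I\left[ |G_{\widehat\Omega}^{(n)}(\mathbf{x},\mathbf{w})|^s \right] \le A \exp\left( -\frac{L_k}{2(n-1)\xi} \right), \tag{6.22}$$
since $\operatorname{dist}_H(\mathbf{x},\mathbf{w}) \ge |x - u| - \operatorname{diam}(\mathbf{x}) \ge L_k/2$ and $\ell(\mathbf{w}) \ge L_k/(2(n-1))$. The sums in the second term have at most $|\partial\Lambda_{L_k}(x)|$ and $n\,|\Lambda_{L_k}(x)|^{n-1}$ terms, respectively. Hence, this second term is also exponentially decaying in $L_k$ and hence in the Hausdorff distance, $\operatorname{dist}_H(\mathbf{x},\mathbf{y})$. This completes the proof of the induction step.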


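6.3. Proof of the rescaling inequality. In the above proof we have postponed the derivation of Theorem 6.1, which provides an essential step for establishing regimes of exponential decay. We shall do that now. The construction and the analysis are inspired by arguments familiar from the one-particle setup. The following lemma essentially extends arguments in [AS+01].

Lemma 6.3. Let $\Omega \subset \mathbb{Z}^d$ and $V, W \subset \Omega$ with $\operatorname{dist}(V, W) \ge 2$. Then for all $\mathbf{x} \in C^{(n)}(V)$ and $\mathbf{y} \in C^{(n)}(W)$,
$$\mathbb{E}\left[ |G_\Omega^{(n)}(\mathbf{x},\mathbf{y};z)|^s \right] \le \frac{C}{|\lambda|^s} \sum_{u \in \partial V \setminus \partial\Omega} \ \sum_{\mathbf{w} \in C^{(n)}(V;u)} \mathbb{E}\left[ |G_V^{(n)}(\mathbf{x},\mathbf{w};z)|^s \right] \times \sum_{v \in \partial W \setminus \partial\Omega} \ \sum_{\mathbf{v} \in C^{(n)}(W;v)} \mathbb{E}\left[ |G_W^{(n)}(\mathbf{v},\mathbf{y};z)|^s \right], \tag{6.23}$$
where the constant $C = C(s,d) < \infty$ is independent of $(\lambda,\alpha) \in \mathbb{R}^p$.

Proof. Let $H_{\mathbf{w},\mathbf{w}'}$ denote the matrix element of $H$ between $\delta_{\mathbf{w}}$ and $\delta_{\mathbf{w}'}$. A twofold application of the resolvent identity, in which we remove all terms in the Laplacian entering $H$ which connect sites in $V$ with $\Omega \setminus V$ and $W$ with $\Omega \setminus W$, yields
$$G_\Omega(\mathbf{x},\mathbf{y};z) = \sum_{\substack{\mathbf{w} \in C^{(n)}(V),\ \mathbf{w}' \in C^{(n)}(\Omega;\Omega \setminus V) \\ \mathbf{v} \in C^{(n)}(W),\ \mathbf{v}' \in C^{(n)}(\Omega;\Omega \setminus W)}} G_V(\mathbf{x},\mathbf{w};z)\, H_{\mathbf{w},\mathbf{w}'}\, G_\Omega(\mathbf{w}',\mathbf{v}';z)\, H_{\mathbf{v}',\mathbf{v}}\, G_W(\mathbf{v},\mathbf{y};z). \tag{6.24}$$
Using $|a + b|^s \le |a|^s + |b|^s$ for any $s \in (0,1)$, we proceed by establishing the following implication of Theorem 2.1,
$$\mathbb{E}\left[ |G_V(\mathbf{x},\mathbf{w};z)|^s\, |G_\Omega(\mathbf{w}',\mathbf{v}';z)|^s\, |G_W(\mathbf{v},\mathbf{y};z)|^s \right] \le \frac{C}{|\lambda|^s}\, \mathbb{E}\left[ |G_V(\mathbf{x},\mathbf{w};z)|^s \right] \mathbb{E}\left[ |G_W(\mathbf{v},\mathbf{y};z)|^s \right]. \tag{6.25}$$
For its proof we note that there are two sites $u', v' \notin V \cup W$ for which the configurations $\mathbf{w}'$ and $\mathbf{v}'$ in the sum in (6.24) have a particle at $u'$ respectively $v'$. We may therefore first condition on all random variables $\{V(x)\}_{x \notin \{u',v'\}}$ and use (2.2) to estimate the factor in the middle. The other factors are independent of each other and of $V(u')$, $V(v')$, such that the expectation value factorizes. The proof of (6.23) is completed by noting that for any fixed $\mathbf{w}$ the number of $\mathbf{w}' \in C^{(n)}(\Omega;\Omega \setminus V)$ with $H_{\mathbf{w},\mathbf{w}'} \ne 0$ is at most $2d$.

We are now ready to give a

Proof of Theorem 6.1. We first restrict the sum in the definition of $B_s^{(n)}(L_{k+1})$ to configurations $\mathbf{x}, \mathbf{y}$ which have a diameter less than $r_{L_k}$,
$$\widehat B_s^{(n)}(L_{k+1}) := |\partial\Lambda_{L_{k+1}}| \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}} \ \sup_{\Omega \subseteq \Lambda_{L_{k+1}}} \sum_{y \in \partial\Lambda_{L_{k+1}}} \ \sum_{\mathbf{x} \in C_{r_{L_k}}^{(n)}(\Omega;0)} \ \sum_{\mathbf{y} \in C_{r_{L_k}}^{(n)}(\Omega;y)} \mathbb{E}_I\left[ |G_\Omega^{(n)}(\mathbf{x},\mathbf{y})|^s \right]. \tag{6.26}$$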


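The error is controlled with the help of Corollary 5.4: since $y \in \partial\Lambda_{L_{k+1}}$ we have $|y| \ge L_{k+1}$ and hence
$$B_s^{(n)}(L_{k+1}) - \widehat B_s^{(n)}(L_{k+1}) \le 2\, |\partial\Lambda_{L_{k+1}}|^2\, k_s(\Lambda_{L_{k+1}}, r_{L_{k+1}}, r_{L_k}) \le A\, L_{k+1}^{2dn-2} \exp\left( -\frac{L_k}{2(n-1)\xi} \right). \tag{6.27}$$
The sum $\widehat B_s^{(n)}(L_{k+1})$ is now estimated with the help of Lemma 6.3, in which we pick $V = \Lambda_{L_k} \cap \Omega$ and $W = \Lambda_{L_k}(y) \cap \Omega$, where the last box is centered at $y \in \partial\Lambda_{L_{k+1}}$. Since $C_{r_{L_k}}^{(n)}(V;0) \subseteq C_{r_{L_k}}^{(n)}(\Omega;0)$ and $C_{r_{L_k}}^{(n)}(W;y) \subseteq C_{r_{L_k}}^{(n)}(\Omega;y)$ we thus obtain
$$\widehat B_s^{(n)}(L_{k+1}) \le \frac{C}{|\lambda|^s}\, |\partial\Lambda_{L_{k+1}}|^2 \left\{ \sup_{\substack{I \subset \mathbb{R} \\ |I| \ge 1}} \ \sup_{\Omega \subseteq \Lambda_{L_k}} \sum_{u \in \partial\Lambda_{L_k}} \ \sum_{\mathbf{x} \in C_{r_{L_k}}^{(n)}(\Omega;0)} \ \sum_{\mathbf{w} \in C^{(n)}(\Omega;u)} \mathbb{E}_I\left[ |G_\Omega^{(n)}(\mathbf{x},\mathbf{w})|^s \right] \right\} \times \sup_{y \in \partial\Lambda_{L_{k+1}}} \left\{ \sup_{\Omega \subseteq \Lambda_{L_k}(y)} \sum_{v \in \partial\Lambda_{L_k}(y)} \ \sum_{\mathbf{y} \in C_{r_{L_k}}^{(n)}(\Omega;y)} \ \sum_{\mathbf{v} \in C^{(n)}(\Omega;v)} \mathbb{E}_I\left[ |G_\Omega^{(n)}(\mathbf{v},\mathbf{y})|^s \right] \right\}. \tag{6.28}$$
Thanks to translation invariance we may shift $y$ to the origin in the last line. Moreover, $|\partial\Lambda_{L_{k+1}}| \le 4^{d-1} |\partial\Lambda_{L_k}|$ and we may again restrict the summation to configurations with a smaller diameter using Corollary 5.4,
$$\widehat B_s^{(n)}(L_{k+1}) \le 4^{2d-2}\, \frac{C}{|\lambda|^s} \left[ B_s^{(n)}(L_k) + |\partial\Lambda_{L_k}|^2\, k_s(\Lambda_{L_k}, r_{L_k}, r_{L_k}) \right]^2 \le 2^{4d-3}\, \frac{C}{|\lambda|^s} \left[ B_s^{(n)}(L_k)^2 + A^2 L_k^{2dn-2} \exp\left( -\frac{L_k}{(n-1)\xi} \right) \right]. \tag{6.29}$$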

Appendix A. Some Distances and Separation Lemmata

Following are some natural lengths associated with n-particle configurations, and some elementary geometric estimates which are of use within this work. In general, for a configuration which is denoted by a bold lower-case letter, we shall use the corresponding capital letter to denote its footprint in $\mathbb{Z}^d$, which is the subset which it covers; e.g., for $\mathbf{x} = \{x_1, \dots, x_n\} \subset (\mathbb{Z}^d)^n$, we let $X = \cup_{i=1}^n \{x_i\} \subset \mathbb{Z}^d$. Also, for subsets of the index set, $J \subset \{1, \dots, n\}$, we let $X_J := \cup_{i \in J} \{x_i\} \subset \mathbb{Z}^d$.


A.1. Splitting width. Two convenient measures of the spread of a configuration $\mathbf{x} = \{x_1, \dots, x_n\}$ are:
1. the diameter, $\operatorname{diam}(\mathbf{x}) := \max_{j,k \in \{1,\dots,n\}} |x_j - x_k|$,
2. the (maximal) splitting width, which we define as the supremum over $r$ for which there exists a partition of the set $X$ into two subsets at distance $r$ apart, or:
$$\ell(\mathbf{x}) := \max_{J, K:\ J \cup K = \{1,\dots,n\}} \operatorname{dist}(X_J, X_K), \tag{A.1}$$
where $\operatorname{dist}(A, B) = \min_{u \in A,\, v \in B} |u - v|$, with $|u - v|$ the Euclidean distance. Since both quantities depend only on the footprint of $\mathbf{x}$, in a harmless abuse of notation we may also refer to $\operatorname{diam}(\mathbf{x})$ as $\operatorname{diam}(X)$ and to $\ell(\mathbf{x})$ as $\ell(X)$.

Lemma A.1. For any configuration $\mathbf{x} \in (\mathbb{Z}^d)^n$:
$$\frac{1}{n-1} \operatorname{diam}(\mathbf{x}) \le \ell(\mathbf{x}) \le \operatorname{diam}(\mathbf{x}). \tag{A.2}$$

Proof. The upper bound is totally elementary. To prove the lower bound on $\ell(\mathbf{x})$ consider the one-parameter family of sets
$$X_r := \bigcup_{j=1}^{n} \{ y \in \mathbb{Z}^d : |y - x_j| \le r \}. \tag{A.3}$$
For any $r > 0$ such that $X_r$ is connected one clearly has
$$\max_{j,k \in \{1,\dots,n\}} |x_j - x_k| \le 2r(n-1). \tag{A.4}$$

It follows that for any $r$ such that $2r < \operatorname{diam}(\mathbf{x})/(n-1)$ the set $X_r$ is not connected, and hence there is a partition of the configuration $\mathbf{x}$ into two subsets whose points are at distances greater than $2r$. This implies that also $2r \le \ell(\mathbf{x})$. Optimizing over such $r$ we find that $\operatorname{diam}(\mathbf{x})/(n-1) \le \ell(\mathbf{x})$.

A.2. Distances in the configuration space. In addition to the regular distance between subsets of $\mathbb{Z}^d$ which is mentioned above, there exists also the notion of the Hausdorff distance, which is defined as:
$$\operatorname{dist}_H(X, Y) := \max\left\{ \max_{u \in X} \operatorname{dist}(\{u\}, Y),\ \max_{v \in Y} \operatorname{dist}(\{v\}, X) \right\}, \tag{A.5}$$
for any $X, Y \subset \mathbb{Z}^d$. In a slight abuse of notation we shall employ the symbol $\operatorname{dist}_H$ also for the induced Hausdorff distance between configurations (i.e., $\mathbf{x}, \mathbf{y} \in (\mathbb{Z}^d)^n$):
$$\operatorname{dist}_H(\mathbf{x}, \mathbf{y}) := \operatorname{dist}_H(X, Y). \tag{A.6}$$
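The lengths introduced in Sects. A.1 and A.2 are easy to evaluate for small configurations. The following Python sketch computes the diameter, the splitting width and the Hausdorff distance, and can be used to check Lemma A.1 on examples. The function names are illustrative and not from the paper, and the splitting width is found by brute force over bipartitions of the footprint, which is an assumption about how to read (A.1) when several particles share a site.

```python
import itertools
import math


def dist(A, B):
    """Euclidean distance between two finite point sets."""
    return min(math.dist(u, v) for u in A for v in B)


def diam(x):
    """Diameter of a configuration: the largest pairwise distance."""
    return max(math.dist(u, v) for u in x for v in x)


def splitting_width(x):
    """Largest distance between the two parts of a bipartition of the footprint."""
    X = sorted(set(x))
    if len(X) < 2:
        return 0.0
    first, rest = X[0], X[1:]
    best = 0.0
    # Every bipartition has exactly one part containing X[0]; enumerate that
    # part, never letting it exhaust the footprint so the other part is non-empty.
    for r in range(len(rest)):
        for extra in itertools.combinations(rest, r):
            J = {first, *extra}
            K = [u for u in X if u not in J]
            best = max(best, dist(J, K))
    return best


def dist_H(x, y):
    """Hausdorff distance between the footprints of x and y, as in (A.5)."""
    X, Y = set(x), set(y)
    return max(max(dist([u], Y) for u in X), max(dist([v], X) for v in Y))


if __name__ == "__main__":
    x = [(0, 0), (1, 0), (5, 0)]
    y = [(0, 0), (1, 0), (1, 1)]
    n = len(x)
    print(diam(x) / (n - 1), splitting_width(x), diam(x))  # the chain (A.2)
    print(dist_H(x, y))
```

The brute-force search is exponential in the number of particles, which is harmless for the small $n$ used in such checks.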

This distance is clearly most sensitive to the outliers. Another notion is the symmetrized distance:
$$\operatorname{dist}_S(\mathbf{x}, \mathbf{y}) := \min_{\pi \in S_n} \sum_{j=1}^{n} |x_j - y_{\pi j}|, \tag{A.7}$$
with $S_n$ the permutation group of the $n$ elements $\{1, \dots, n\}$. The following is an elementary consequence of the definitions.


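Lemma A.2. Let $\Omega \subseteq \mathbb{Z}^d$ and $u, v \in \Omega$. For any two configurations $\mathbf{x} \in C^{(n)}(\Omega; u)$ and $\mathbf{y} \in C^{(n)}(\Omega; v)$, which have a particle at $u$ and, respectively, $v$:
$$\operatorname{dist}_H(\mathbf{x}, \mathbf{y}) \ge \max\{ \operatorname{dist}(\{u\}, Y),\ \operatorname{dist}(\{v\}, X) \} \ge |u - v| - \min\{ \operatorname{diam}(\mathbf{x}), \operatorname{diam}(\mathbf{y}) \}. \tag{A.8}$$
For convenience let us also place here the bound:

Lemma A.3. Let $\Omega \subseteq \mathbb{Z}^d$ and $\mathbf{x} \in (\mathbb{Z}^d)^n$.
1. For any site $u \in \Omega$ with $\operatorname{dist}(\{u\}, X) = L$, and any $\xi \ge 0$:
$$\sum_{\mathbf{y} \in C^{(n)}(\Omega;u)} e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi} \le C \max\{L, \xi\}^{d(n-1)}\, e^{-L/\xi}, \tag{A.9}$$
with a constant $C = C(n) < \infty$.
2. For any $\xi \ge 0$:
$$\sum_{\mathbf{y} \in C^{(n)}(\Omega)} e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi} \le C\, \xi^{nd}, \tag{A.10}$$
for some $C = C(n, d) < \infty$.

Proof. It is convenient to use the equality:
$$\sum_{\mathbf{y} \in C^{(n)}(\Omega;u)} e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi} = \int_0^\infty \frac{dr}{\xi}\, e^{-r/\xi} \sum_{\mathbf{y} \in C^{(n)}(\Omega;u)} 1[\operatorname{dist}_H(\mathbf{x}, \mathbf{y}) \le r]. \tag{A.11}$$
To estimate the sum on the right, we note that the configuration $\mathbf{y}$ needs to have one of its points at $u$, and the remaining $(n-1)$ points are all within the distance $r$ from $X$. A simple estimate yields:
$$\sum_{\mathbf{y} \in C^{(n)}(\Omega;u)} 1[\operatorname{dist}_H(\mathbf{x}, \mathbf{y}) \le r] \le \begin{cases} 0, & r \le L, \\ n^n (2r)^{d(n-1)}, & r \ge L, \end{cases} \tag{A.12}$$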

where 2 could also be replaced by $b_d$, which is the maximal value of $b$ such that any sphere in $\mathbb{R}^d$ of radius $r$ includes not more than $b r^d$ lattice points (of $\mathbb{Z}^d$). Substituting (A.12) in (A.11) one readily obtains the claimed bound (A.9). The second claim follows from the first by summation over $u$.

B. From Eigenfunction Correlators to Dynamical and Spectral Information

In Theorem 4.1 we presented a known method [Ai94] for the derivation of information on the dynamical and spectral properties of an infinite-volume operator from bounds on the eigenfunction correlators of its restrictions to finite domains. For convenience, following is an outline of a proof of this result. We recall that the assumption is that for a sequence of finite domains which converge to $\mathbb{Z}^d$, and a fixed interval $I$:
$$\mathbb{E}\left[ Q_\Omega^{(n)}(\mathbf{x}, \mathbf{y}; I) \right] \le A\, e^{-K(\mathbf{x},\mathbf{y})}, \tag{B.1}$$
with some kernel $K(\mathbf{x}, \mathbf{y})$ (e.g., $K(\mathbf{x}, \mathbf{y}) = \operatorname{dist}(\mathbf{x}, \mathbf{y})/\xi$).


Proof of Theorem 4.1. i) Through a trivial extension of the finite volume operators, they can be naturally viewed as acting in the same space $\ell^2(\mathbb{Z}^{nd})$ (acting as 0 on functions supported outside $\Omega^n$). Using the Combes-Thomas estimate on the Green function, one may see that the operator $H(\omega)$ is the limit, in the strong resolvent sense, of any sequence of $H_\Omega(\omega)$, as $\Omega \to \mathbb{Z}^d$ (allowing in the process also arbitrary self-adjoint boundary conditions at the receding boundary). Thus, a bound like (4.7) but modified through a restriction of $f$ to continuous functions can be deduced from Eq. (4.2) and the general properties of strong resolvent convergence (cf. [RS79, Thm. VIII.20]). Since, by the Wegner estimate, the mean density of states is a continuous measure, Lusin's approximation theorem allows one to extend the resulting bound to all measurable and bounded functions, thus yielding (4.7). Of particular interest is the implied dynamical localization bound:
$$\mathbb{E}\left[ \sup_{t \in \mathbb{R}} \left| \langle \delta_{\mathbf{x}}, P_I(H)\, e^{-itH} \delta_{\mathbf{y}} \rangle \right| \right] \le A\, e^{-K(\mathbf{x},\mathbf{y})}. \tag{B.2}$$

ii) By the RAGE criterion (see, e.g. [RS79]) the projection on the continuous spectrum in the interval $I$ satisfies:
$$\mathbb{E}\left[ \| P_{I;\mathrm{cont}}(H)\, \delta_{\mathbf{x}} \|^2 \right] = \mathbb{E}\left[ \lim_{R \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \sum_{\mathbf{y}:\ \operatorname{dist}(\mathbf{x},\mathbf{y}) \ge R} \left| \langle \delta_{\mathbf{x}}, P_I(H)\, e^{-itH} \delta_{\mathbf{y}} \rangle \right|^2 dt \right] \le \lim_{R \to \infty} \sum_{\mathbf{y}:\ \operatorname{dist}(\mathbf{x},\mathbf{y}) \ge R} \mathbb{E}\left[ \sup_{t \in \mathbb{R}} \left| \langle \delta_{\mathbf{x}}, P_I(H)\, e^{-itH} \delta_{\mathbf{y}} \rangle \right| \right], \tag{B.3}$$
where the last inequality is by Fatou's lemma and the natural bound (1) on the summed quantity. Thus, under the assumption (4.6),
$$\mathbb{E}\left[ \| P_{I;\mathrm{cont}}(H)\, \delta_{\mathbf{x}} \|^2 \right] \le \lim_{R \to \infty} \sum_{\mathbf{y}:\ \operatorname{dist}(\mathbf{x},\mathbf{y}) \ge R} A\, e^{-K(\mathbf{x},\mathbf{y})}. \tag{B.4}$$
In case $K(\mathbf{x}, \mathbf{y}) = 2 \operatorname{dist}_H(\mathbf{x}, \mathbf{y})/\xi$, or any of the other distances discussed here (which are only larger), the above limit vanishes. Since $\{\delta_{\mathbf{x}}\}_{\mathbf{x} \in (\mathbb{Z}^d)^n}$ is a spanning collection of vectors, one may conclude that, under the assumption (4.6), within the n-particle sector $H(\omega)$ has almost surely no continuous spectrum in the interval $I$.

iii) Under the above assumption, we will construct a complete set of exponentially bounded eigenfunctions which form a subset of $\{ P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \mid E \in \sigma(H(\omega)),\ \mathbf{y} \in (\mathbb{Z}^d)^n \}$. Clearly functions in this collection are either zero or eigenfunctions of $H(\omega)$. For the complete set we choose functions corresponding to configurations $\mathbf{y} \in (\mathbb{Z}^d)^n$ which are E-representative in the sense that
$$\langle \delta_{\mathbf{y}}, P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \rangle \ge \frac{(1 + |\mathbf{y}|)^{-(nd+1)}}{\sum_{\mathbf{x} \in (\mathbb{Z}^d)^n} (1 + |\mathbf{x}|)^{-(nd+1)}}. \tag{B.5}$$
Note that for any $E \in \sigma(H(\omega))$ there is at least one E-representative configuration, since $\sum_{\mathbf{y}} \langle \delta_{\mathbf{y}}, P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \rangle \ge 1$. We claim that the corresponding collection of


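normalized eigenfunctions:
$$\psi_{E,\mathbf{y}} := \frac{P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}}}{\| P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \|}, \quad \text{with } E \in \sigma(H(\omega)) \cap I,\ \mathbf{y} \in (\mathbb{Z}^d)^n \text{ E-representative}, \tag{B.6}$$
spans the full subspace of eigenfunctions of $H(\omega)$ corresponding to eigenvalues in $I$. For if not then there exists $E \in \sigma(H(\omega)) \cap I$ and a normalized function satisfying $\phi = P_{\{E\}}(H(\omega))\phi$, which is within the orthogonal complement of the subspace spanned by (B.6). This would imply the following contradiction:
$$1 = \langle \phi, \phi \rangle = \sum_{\substack{\mathbf{y} \in (\mathbb{Z}^d)^n \\ \text{not E-representative}}} \left| \langle \phi, P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \rangle \right|^2 \le \sum_{\substack{\mathbf{y} \in (\mathbb{Z}^d)^n \\ \text{not E-representative}}} \langle \delta_{\mathbf{y}}, P_{\{E\}}(H(\omega))\, \delta_{\mathbf{y}} \rangle < 1. \tag{B.7}$$
For bounds on the eigenfunctions in (B.6), one may apply the Wiener criterion which (combined with the Fatou lemma) yields
$$\mathbb{E}\left[ \sum_{E \in \sigma(H) \cap I} \left| \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle \right|^2 \right] \le \lim_{T \to \infty} \frac{1}{T} \int_0^T \mathbb{E}\left[ \left| \langle \delta_{\mathbf{x}}, P_I(H)\, e^{-itH} \delta_{\mathbf{y}} \rangle \right|^2 \right] dt \le A\, e^{-2 \operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi}, \tag{B.8}$$
as a consequence of (B.2) in case $K(\mathbf{x}, \mathbf{y}) = 2 \operatorname{dist}_H(\mathbf{x}, \mathbf{y})/\xi$. Summing the resulting bound we get:
$$\mathbb{E}\left[ \sum_{E \in \sigma(H) \cap I} \ \sum_{\mathbf{x},\mathbf{y} \in (\mathbb{Z}^d)^n} \frac{e^{\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi}}{(1 + |\mathbf{y}|)^{nd+1}} \left| \langle \delta_{\mathbf{x}}, P_{\{E\}}(H)\, \delta_{\mathbf{y}} \rangle \right|^2 \right] < \infty. \tag{B.9}$$
Thus, using the Chebyshev principle, there exists a positive function $A(\omega; n)$ of finite mean such that for all $E \in \sigma(H(\omega)) \cap I$ and $\mathbf{x} \in (\mathbb{Z}^d)^n$:
$$|\psi_{E,\mathbf{y}}(\mathbf{x})|^2\, 1[\mathbf{y} \in (\mathbb{Z}^d)^n \text{ is E-representative}] \le A(\omega; n)\, (1 + |\mathbf{y}|)^{2nd+2}\, e^{-\operatorname{dist}_H(\mathbf{x},\mathbf{y})/\xi}. \tag{B.10}$$
Since for any $E \in \sigma(H(\omega))$ and any $\mathbf{y}$ in the above collection the function $\psi_{E,\mathbf{y}}$ is non-negligible at $\mathbf{y}$ in the sense of (4.9), this proves the last claim which is made in Theorem 4.1.

C. An Averaging Principle

In the proof of Theorem 4.5 we made use of an averaging principle, which is useful for conditional averages where the value of the potential at a single site, $u \in \mathbb{Z}^d$, is redrawn at fixed values of the other (random) parameters. This provides a generalization of Lemma 4.3 which is suitable for multiparticle systems. Following is its derivation.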


In the statement of the result, use is made of the holomorphic family of operators
$$K_u(z) = \sqrt{N_u}\, (H_\Omega - z)^{-1} \sqrt{N_u}, \tag{C.1}$$
which we take as acting in the range of $N_u$ within $\mathcal{H}_\Omega^{(n)}$. The operator valued function is analytic in $\mathbb{C} \setminus \sigma(H_\Omega)$. For real $z = E \in \mathbb{R}$ the operators are self adjoint, and for convenience of the (local) argument which follows, we employ an auxiliary index $\nu$ to label the eigenvalues $(\kappa_\nu(E))$, counted without multiplicity, and the corresponding projection operators $(\Pi_{\kappa_\nu}(E))$. Questions of order do not matter here since we shall always be summing over $\nu$. Along $\mathbb{R} \setminus \sigma(H_\Omega)$ analyticity implies that for all but possibly finitely many exceptional values of $E$, at which level-crossings occur, both eigenvalues and projections may be analytically continued to a small neighborhood of $E$ on which the spectral representation
$$K_u(z) = \sum_\nu \kappa_\nu(z)\, \Pi_{\kappa_\nu}(z), \tag{C.2}$$

holds. The operators κν (z) are projections, satisfying κν (z) κν (z) = κν (z)δν,ν , though they are orthogonal projections only for real z; cf. [Ka95, Ch. II]. Lemma C.1. ( = Lemma 4.4 ) Let s ∈ [0, 1) and ⊂ Zd be a finite set, and u a point in . Then for all x ∈ C (n) (; u) and y ∈ C (n) (): s dv Q (x, y; I, s) Nu (x)1+ 2 s V (x)→V (x)+v |v| R s 1 1−s −1

δ =

δ , (E) δ , (E) N (H − E) δ d E. x κ x x κ u y ν ν |λ|1−s I ν (C.3) (v) := H + λv Nu with the extra Proof. Let us consider the family of operators H parameter v, which in effect modifies the value of the potential Vu of H . A standard resolvent identity leads to the Krein formula: (v) − z −1 = (1 + λv K u (z))−1 Nu (H − z)−1 . Nu H (C.4) √ (v) − z −1 is an analytic in z ∈ For each v, the family of operators Nu H √ (v)), with residues given by Nu P {E} are the projection oper{E} , where P C\σ ( H (v)). The operator valued ators on the eigenspaces of H (v), at eigenvalues E ∈ σ ( H function on the right-hand side of (C.4), is singular if and only if either: − (λv)−1 ∈ σ (K u (E))

(C.5)

E ∈ σ (H ),

(C.6)

or

and we argue next that the singularities at σ (H ) are removable for almost every value of v ∈ R. (v) is monotone non-decreasing in v, its spectral projecSince the spectrum of H {E} = P (mon) + P (fix) , for E ∈ σ ( H ω (v))) into a part for tions can be decomposed ( P {E} {E} which the corresponding spectrum is strictly monotone and another corresponding to


spectrum which does not move with v. (Monotonicity plays here only an auxiliary role, and could also be replaced by analyticity in v or just smoothness.) At Lebesgue almost (v) corresponding to P (mon) is disjoint from σ (H ), in every v ∈ R the spectrum of H {E} (v) − z)−1 may have at σ (H ) are only due to which case the singularities which ( H √ (v) − z)−1 the fixed part of the spectrum, with the corresponding residues of Nu ( H √ (fix) (fix) are being given by Nu P{E} . However, functions in the range of the projections P √ √ (v) − z)−1 has only simple (fix) = 0. Since Nu ( H annihilated by Nu , and thus Nu P {E} pole singularities, the vanishing of the residue at E implies that the singularity of the expression on the right-hand side of (C.4) is removable there. In conclusion,√for almost every v ∈ R, even if there is an overlap in the spectra of (v) and H , Nu P {E} = 0 for all E ∈ σ (H ). H We now consider E ∈ σ (H ) at which (C.5) holds. Such energies will not coincide with exceptional points of level-crossing for K u (E) for almost all v ∈ R. Therefore a simple residue calculation based on (C.2) yields: {E} = κν (E) κν (z) Nu (H − E)−1 , Nu P κν (E)

(C.7)

where κν (E) is that eigenvalue of K u (E) at which −(λv)−1 = κν (E) holds, and κν (E) is the derivative of the eigenvalue with respect to E evaluated at that particular point. In particular, $ # 2 {E} Nu = κν (E) κν (E) = − d κν (E)−1 κν (E). Nu P (C.8) κν (E) dE Using the relation δ( g(E) ) g (E) = u∈g −1 ({0}) δ(E − u), we find that for any f ): which is continuous on a neighborhood of σ ( H {E} Nu δx f (E)

δx , Nu P )∩I E∈σ ( H

=

ν

I

δ(λv + κν (E)−1 ) δx , κν (E) δx f (E) d E.

The eigenfunction correlator, which is defined as: s Nu (x)1+ 2 Q (x, y; I, s) V (x)→V (x)+v {E} Nu δx 1−s δx , =

δx , Nu P

s {E} δy , Nu P

(C.9)

(C.10)

)∩I E∈σ ( H

can be presented in the form of (C.9), with f (E) the function which is defined in the neighborhood of the zeros of λv + κν (E)−1 as 1 δ , (E)√ N (H − E)−1 δ s x κν u y f (E) = κν (E)

δx , κν (E) δx √ s δx , Nu P{E} δy . = (C.11) √ √

δx , Nu P{E} Nu δx


The claim follows now upon integration over v, of the expression which is obtained by substituting the above function in (C.9).

D. On the Wegner Estimate

As noted in Sect. 2, the Wegner estimate is not being explicitly used in the Fractional Moment Analysis. However the statement is of intrinsic interest, and it provides also a useful tool for various other purposes. The basic bound has already been extended to multiparticle systems [CS08a,Ki08a]. Our purpose here is to comment on the subject from the perspective of the approach used in Sect. 2. A local version of the bound is the following statement of finiteness of the conditional mean of the density of the spectral measure associated with the vector $\delta_{\mathbf{x}}$ for an arbitrary configuration $\mathbf{x} \in C^{(n)}(\Omega)$, obtained by averaging over one of the potential variables associated with the occupied sites, $V(x_j; \omega)$, $j \in \{1, \dots, n\}$. In the following, for a self adjoint operator $H$ we denote by $P_I(H)$ the spectral projection associated with a Borel set $I$ whose Lebesgue measure is denoted by $|I|$.

Theorem D.1. Let $u$ be a site in $\Omega \subseteq \mathbb{Z}^d$, $\mathbf{x} \in C^{(n)}(\Omega; u)$ a configuration with at least one particle at $u$, and
$$\mu_{\mathbf{x}}^{(n)}(I; \omega) := \langle \delta_{\mathbf{x}}^{(n)}, P_I(H_\Omega^{(n)}(\omega))\, \delta_{\mathbf{x}}^{(n)} \rangle, \tag{D.1}$$
the spectral measure of the operator $H_\Omega^{(n)}(\omega)$ which is associated with the vector $\delta_{\mathbf{x}}^{(n)}$ and a Borel set $I \subseteq \mathbb{R}$. Then the average of $\mu_{\mathbf{x}}(I; \omega)$ over the values of the potential at $u$ satisfies:
$$\int_{\mathbb{R}} \mu_{\mathbf{x}}(I; \omega)\, \varrho(V_u)\, dV_u \equiv \mathbb{E}\left[ \mu_{\mathbf{x}}(I, \omega) \,\middle|\, \{V(v)\}_{v \ne u} \right] \le \frac{\|\varrho\|_\infty}{|\lambda|\, N_u(\mathbf{x})}\, |I|. \tag{D.2}$$

Proof. Bearing in mind the dependence of the Hamiltonian on $V_u$ as expressed in (2.3), one may apply the averaging principle (cf. [Ko86,SW86,CH94]) which states that for any self-adjoint operator $A$ and a bounded operator $N \ge B^\dagger B$ on a Hilbert space:
$$\int_{\mathbb{R}} \left\| B^\dagger P_I(A + \tau N)\, B \right\| \varrho(\tau)\, d\tau \le \|\varrho\|_\infty\, |I|. \tag{D.3}$$
The claim, (D.2), follows from (D.3) by observing that for $\mathbf{x} \in C^{(n)}(\Omega; u)$ one has: $N_u(\mathbf{x})\, \langle \delta_{\mathbf{x}}, P_I(H)\, \delta_{\mathbf{x}} \rangle = \langle \delta_{\mathbf{x}}, N_u^{1/2} P_I(H) N_u^{1/2}\, \delta_{\mathbf{x}} \rangle$.

Acknowledgements. With pleasure we thank Yuri Suhov for useful discussions of multiparticle localization, and Shmuel Fishman and Uzy Smilansky for stimulating discussions of related topics on a visit to the Center for Complex Systems at Weizmann Inst. of Science. Support for the latter was received from the BSF grant 710021. The work was supported in parts by the NSF grants DMS-0602360 (MA), DMS-0701181 (SW) and a Sloan Fellowship (SW).


References

[AM93] Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: an elementary derivation. Commun. Math. Phys. 157, 245–278 (1993)
[Ai94] Aizenman, M.: Localization at weak disorder: some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
[AS+01] Aizenman, M., Schenker, J.H., Friedrich, R.M., Hundertmark, D.: Finite-volume criteria for Anderson localization. Commun. Math. Phys. 224, 219–253 (2001)
[AE+06] Aizenman, M., Elgart, A., Naboko, S., Schenker, J.H., Stolz, G.: Moment analysis for localization in random Schrödinger operators. Invent. Math. 163, 343–413 (2006)
[BAA06] Basko, D.M., Aleiner, I.L., Altshuler, B.L.: Metal-insulator transition in a weakly interacting many-electron system with localized single-particle states. Ann. Phys. 321, 1126–1205 (2006)
[Bo57] Boole, G.: On the comparison of transcendents, with certain applications to the theory of definite integrals. Philos. Trans. Royal. Soc. 147, 780 (1857)
[CL90] Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
[CS07] Chulaevsky, V., Suhov, Y.M.: Anderson localisation for an interacting two-particle quantum system on Z. Preprint, http://arxiv.org/abs/0705.0657v1[math-ph], 2007
[CS08a] Chulaevsky, V., Suhov, Y.M.: Wegner bounds for a two-particle tight binding model. Commun. Math. Phys. 283, 479–489 (2008)
[CS08b] Chulaevsky, V., Suhov, Y.M.: Eigenfunctions in a two-particle Anderson tight-binding model. To appear in Commun. Math. Phys. DOI:10.1007/s00220-008-0721-0, 2009
[CH94] Combes, J.-M., Hislop, P.D.: Localization for some continuous, random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994)
[DK89] von Dreifus, H., Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989)
[FS83] Fröhlich, J., Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
[GK01] Germinet, F., Klein, A.: Bootstrap multiscale analysis and localization in random media. Commun. Math. Phys. 222, 415–448 (2001)
[GK06] Germinet, F., Klein, A.: New characterizations of the region of complete localization for random Schrödinger operators. J. Stat. Phys. 122, 73–94 (2006)
[Ka95] Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer, 1995
[Ki08a] Kirsch, W.: A Wegner estimate for multi-particle random Hamiltonians. J. Math. Phys. Anal. Geom. 4, 121–127 (2008)
[Ki08b] Kirsch, W.: An invitation to random Schrödinger operators (with appendix by F. Klopp). In: Random Schrödinger Operators. Disertori, M., Kirsch, W., Klein, A., Klopp, F., Rivasseau, V. (eds.), Panoramas et Synthèses 25, Paris: Soc. Math. de France, 2008, pp. 1–119
[Ko86] Kotani, S.: Lyaponov exponents and spectra for one-dimensional random Schrödinger operators. In: Proc. Conf. on Random Matrices and their Appl., Contemp. Math. 50, Providence, RI: Amer. Math. Soc., 1986
[PF92] Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer, 1992
[RS79] Reed, M., Simon, B.: Methods of Modern Mathematical Physics I+III: Functional Analysis (Revised and enlarged edition)+Scattering Theory. New York: Academic Press, 1979/80
[SW86] Simon, B., Wolff, T.: Singular continuous spectrum under rank one perturbations and localization for random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986)
[St01] Stollmann, P.: Caught by Disorder: Bound States in Random Media. Boston: Birkhäuser, 2001
[We81] Wegner, F.: Bounds on the density of states in disordered systems. Z. Physik B 44, 9–15 (1981)

Communicated by B. Simon

Commun. Math. Phys. 290, 935–939 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0724-x


A Family of Schrödinger Operators Whose Spectrum is an Interval

Helge Krüger

Department of Mathematics, Rice University, Houston, TX 77005, USA. E-mail: [email protected]

Received: 22 September 2008 / Accepted: 21 October 2008
Published online: 8 January 2009 – © Springer-Verlag 2008

Abstract: By approximation, I show that the spectrum of the Schrödinger operator with potential $V(n) = f(n^{\rho} \pmod 1)$ for $f$ continuous and $\rho > 0$, $\rho \notin \mathbb{N}$ is an interval.

1. Introduction

In this short note, I wish to describe a family of Schrödinger operators on $\ell^2(\mathbb{N})$ whose spectrum is an interval. To set the stage, introduce for a bounded sequence $V : \mathbb{N} \to \mathbb{R}$ and $u \in \ell^2(\mathbb{N})$ the Schrödinger operator $H_V$ defined by

$$(H_V u)(n) = u(n+1) + u(n-1) + V(n)u(n), \quad n \ge 2, \qquad (H_V u)(1) = u(2) + V(1)u(1). \tag{1.1}$$

We will denote by $\sigma(V)$ the spectrum of the operator $H_V$. It is well known that if $V(n)$ is a sequence of independent identically distributed random variables with distribution $\mu$ satisfying $\operatorname{supp}(\mu) = [a, b]$, then for almost every $V$ the spectrum $\sigma(V)$ is $[a-2, b+2]$. For the Almost–Mathieu Operator with potential

$$V_{\lambda,\alpha,\theta}(n) = 2\lambda \cos(2\pi(n\alpha + \theta)), \tag{1.2}$$

where $\lambda > 0$, $\alpha \notin \mathbb{Q}$, the set $\sigma(V_{\lambda,\alpha,\theta})$ contains no interval [2]. Bourgain conjectured in [4] that if one considers the potential

$$V_{\lambda,\alpha,x,y}(n) = \lambda \cos\Big(\frac{n(n-1)}{2}\alpha + nx + y\Big) \tag{1.3}$$

with $\lambda > 0$, $\alpha \notin \mathbb{Q}$, the spectrum $\sigma(V_{\lambda,\alpha,x,y})$ is an interval. Denote by $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ the circle. I will prove the following result.

H. K. was supported by NSF grant DMS–0800100.


Theorem 1.1. For any continuous function $f : \mathbb{T} \to \mathbb{R}$, any $\alpha \neq 0$, $\theta$, and $\rho > 0$ not an integer, introduce the potential

$$V(n) = f(\alpha n^{\rho} + \theta). \tag{1.4}$$

Then we have that

$$\sigma(V) = [\min(f) - 2, \max(f) + 2]. \tag{1.5}$$
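The inclusion of the spectrum in $[\min(f) - 2, \max(f) + 2]$ can be checked on a finite truncation of $H_V$. The following is a minimal sketch, not part of the original note: the choice $f(t) = \cos(2\pi t)$, the parameters, the truncation size and all function names are assumptions made for illustration. It counts eigenvalues of the truncated tridiagonal matrix with a Sturm sequence. A finite section can only confirm that no eigenvalue leaves the interval; it cannot show that the interval is filled, which is the actual content of Theorem 1.1.

```python
import math


def cosine_profile(t):
    """Illustrative choice f(t) = cos(2*pi*t), a continuous function on T = R/Z."""
    return math.cos(2.0 * math.pi * t)


def potential_values(f, alpha, rho, theta, size):
    """Return [V(1), ..., V(size)] with V(n) = f(alpha * n**rho + theta), as in (1.4)."""
    return [f(alpha * n ** rho + theta) for n in range(1, size + 1)]


def count_eigenvalues_below(diagonal, x):
    """Sturm count: number of eigenvalues < x of the symmetric tridiagonal
    matrix with the given diagonal and all off-diagonal entries equal to 1
    (the finite truncation of H_V from (1.1))."""
    count = 0
    q = 1.0
    for i, d in enumerate(diagonal):
        q = d - x if i == 0 else d - x - 1.0 / q
        if q == 0.0:
            q = 1e-300  # standard guard against division by zero
        if q < 0.0:
            count += 1
    return count


size = 400  # hypothetical truncation size
V = potential_values(cosine_profile, alpha=1.0, rho=1.5, theta=0.0, size=size)

# Here min(f) - 2 = -3 and max(f) + 2 = 3.
below_lower_edge = count_eigenvalues_below(V, -3.0)
below_upper_edge = count_eigenvalues_below(V, 3.0)
below_zero = count_eigenvalues_below(V, 0.0)
```

Calling `count_eigenvalues_below(V, x)` on a grid of values `x` gives a rough picture of how the eigenvalues of the truncation spread over the interval.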

Potentials of the type (1.4) were already discussed in Bourgain [3], Griniasty–Fishman [6], Last–Simon [7], and Stolz [8]. In particular, the case $0 < \rho < 1$ is due to Stolz [8] under an additional regularity assumption on $f$. The proof of this theorem depends essentially on the following lemma on the distribution of $n^{\rho}$, which is a consequence of a result of Boshernitzan [5].

Lemma 1.2. Let $r \ge 0$ be an integer and $r < \rho < r + 1$. Given any $\alpha \neq 0$, $\theta$, $K \ge 1$, $\varepsilon > 0$, $a_0, \dots, a_r$, there exists an integer $n \ge 1$ such that

$$\sup_{|k| \le K} \Big\| \alpha(n+k)^{\rho} + \theta - \sum_{j=0}^{r} a_j k^j \Big\| \le \varepsilon, \tag{1.6}$$

where $\|x\| = \operatorname{dist}(x, \mathbb{Z})$ denotes the distance to the closest integer.

We will prove this lemma in the next section. $\|\cdot\|$ is not a norm, but it obeys the triangle inequality

$$\|x + y\| \le \|x\| + \|y\| \tag{1.7}$$

for any $x, y \in \mathbb{R}$. In particular, for any integer $N$ we have that $\|N x\| \le |N| \, \|x\|$.

Proof of Theorem 1.1. By Lemma 1.2 (applied with $r$ such that $r < \rho < r+1$, $a_0 = x$ and $a_j = 0$ for $j \ge 1$), we can find for any $x \in [0, 1)$ a sequence $n_l$ such that

$$\sup_{|k| \le l} \| \alpha(n_l + k)^{\rho} + \theta - x \| \le 1/l.$$

Hence, the sequence $V_l(n) = V(n + n_l)$ converges pointwise to $f(x)$. The claim now follows from a Weyl-sequence argument.

It is remarkable that, combined with the Last–Simon semicontinuity of the absolutely continuous spectrum [7], one also obtains the following result.

Theorem 1.3. For $r \ge 0$ an integer, $r < \rho < r + 1$, and $f$ a continuous function on $\mathbb{T}$, introduce the set $B_r(f)$ as

$$B_r(f) = \bigcap_{a_0, \dots, a_r} \sigma_{\mathrm{ac}}\Big( f\Big( \sum_{j=0}^{r} a_j n^j \Big) \Big). \tag{1.8}$$

Then for $\alpha \neq 0$ and any $\theta$,

$$\sigma_{\mathrm{ac}}( f(\alpha n^{\rho} + \theta)) \subseteq B_r(f). \tag{1.9}$$

Here $\sigma_{\mathrm{ac}}(V)$ denotes the absolutely continuous spectrum of $H_V$.
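Both theorems rest on Lemma 1.2, which can be explored by brute force for small parameters. The sketch below is an illustration only: the function names, the search bound `n_max` and the sample parameters ($\rho = 1/2$, so $r = 0$, with target $a_0 = 0.3$) are assumptions, not taken from the note. The search starts at $n = K + 1$ so that $n + k$ stays positive.

```python
def dist_to_integers(x):
    """||x|| = dist(x, Z), the distance from x to the closest integer."""
    return abs(x - round(x))


def window_error(alpha, rho, theta, coeffs, K, n):
    """sup over |k| <= K of ||alpha*(n+k)**rho + theta - sum_j a_j k**j||,
    the left-hand side of (1.6); coeffs = [a_0, ..., a_r]."""
    return max(
        dist_to_integers(
            alpha * (n + k) ** rho + theta
            - sum(a * k ** j for j, a in enumerate(coeffs))
        )
        for k in range(-K, K + 1)
    )


def find_n(alpha, rho, theta, coeffs, K, eps, n_max):
    """Smallest n in [K+1, n_max] satisfying (1.6), or None if the search fails."""
    for n in range(K + 1, n_max + 1):
        if window_error(alpha, rho, theta, coeffs, K, n) <= eps:
            return n
    return None


# Hypothetical sample: rho = 0.5 (r = 0), target phase a_0 = 0.3, window K = 2.
n_found = find_n(alpha=1.0, rho=0.5, theta=0.0, coeffs=[0.3], K=2, eps=0.05, n_max=5000)
```

For larger $r$ or smaller $\varepsilon$ the first admissible $n$ can be much larger, so `n_max` would have to be raised; the lemma guarantees existence but gives no bound on $n$.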


We note that for $r = 0$ the potentials in (1.8) are constants, so that

$$B_0(f) = [-2 + \max(f), \, 2 + \min(f)]. \tag{1.10}$$

Under additional regularity assumptions on $f$ and $r = 0$, Stolz has shown in [8] that we have equality in (1.9). Furthermore, note that

$$B_{r+1}(f) \subseteq B_r(f). \tag{1.11}$$

In [7], Last and Simon have stated the following conjecture:

$$B_1(2\lambda \cos(2\pi \, \cdot \,)) = \emptyset \tag{1.12}$$

for $\lambda > 0$. They phrased this in poetic terms as 'Does Hofstadter's Butterfly have wings?'. The best positive result in this direction, as far as I know, is by Bourgain [3], showing

$$|B_1(2\lambda \cos(2\pi \, \cdot \,))| \to 0, \quad \lambda \to 0. \tag{1.13}$$

It is an interesting question whether Theorem 1.1 holds for $\rho \ge 2$ an integer. In the particular case of $f(x) = 2\lambda \cos(2\pi x)$, $\rho = 2$, this would follow from Bourgain's conjecture. However, there is also the following negative evidence. Consider the skew-shift $T : \mathbb{T}^2 \to \mathbb{T}^2$ given by

$$T(x, y) = (x + \alpha, x + y), \qquad T^n(x, y) = \Big(x + n\alpha, \ \frac{n(n-1)}{2}\alpha + nx + y\Big), \tag{1.14}$$

where $\alpha \notin \mathbb{Q}$. Then Avila, Bochi, and Damanik [1] have shown that for generic continuous $f : \mathbb{T}^2 \to \mathbb{R}$, the spectrum $\sigma(f(T^n(x, y)))$ contains no interval. So, it is not clear what to expect in this case.

As a final remark, let me comment on a slight extension. If one replaces $V$ by the following family of potentials:

$$V(n) = f\Big(\alpha n^{\rho} + \sum_{k=1}^{K} \alpha_k n^{\beta_k}\Big),$$

where $\beta_k < \rho$ and $\alpha_k$ are any numbers, then Theorems 1.1 and 1.3 remain valid.

2. Proof of Lemma 1.2

Let in the following $r$ be an integer such that $r < \rho < r + 1$. By Taylor expansion, we have that

$$\alpha(n+k)^{\rho} = \sum_{j=0}^{r} x_j(n) k^j + \alpha \frac{\rho \cdots (\rho - r)}{(r+1)!} (n + \xi)^{\rho - r - 1} k^{r+1} \tag{2.1}$$

for some $|\xi| \le |k|$ and

$$x_j(n) = \alpha \frac{\rho \cdots (\rho - j + 1)}{j!} n^{\rho - j}. \tag{2.2}$$

We now first note the following lemma.


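Lemma 2.1. For any $K \ge 1$ and $\varepsilon > 0$, there exists an $N_0(K, \varepsilon)$ such that

$$\Big| \alpha(n+k)^{\rho} - \sum_{j=0}^{r} x_j(n) k^j \Big| \le \varepsilon \tag{2.3}$$

for $|k| \le K$ and $n \ge N_0(K, \varepsilon)$.

Proof. This follows from (2.1) and the fact that $\rho - r - 1 < 0$.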
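The decay in Lemma 2.1 is easy to observe numerically. The following sketch evaluates the coefficients $x_j(n)$ from (2.2) and the worst-case error in (2.3) over the window $|k| \le K$; the function names and the sample values $\alpha = 1$, $\rho = 3/2$, $K = 3$ are illustrative assumptions. For these values the remainder in (2.1) is of size roughly $\tfrac{3}{8} K^2 n^{-1/2}$.

```python
import math


def taylor_coefficient(alpha, rho, j, n):
    """x_j(n) = alpha * rho*(rho-1)*...*(rho-j+1) / j! * n**(rho-j), as in (2.2)."""
    falling = 1.0
    for i in range(j):
        falling *= rho - i
    return alpha * falling / math.factorial(j) * n ** (rho - j)


def taylor_error(alpha, rho, n, K):
    """max over |k| <= K of |alpha*(n+k)**rho - sum_{j=0}^{r} x_j(n) k**j|,
    where r = floor(rho) for non-integer rho (requires n > K)."""
    r = math.floor(rho)
    worst = 0.0
    for k in range(-K, K + 1):
        approx = sum(taylor_coefficient(alpha, rho, j, n) * k ** j for j in range(r + 1))
        worst = max(worst, abs(alpha * (n + k) ** rho - approx))
    return worst


# Hypothetical sample values: the error shrinks as n grows because rho - r - 1 < 0.
errors = [taylor_error(1.0, 1.5, n, 3) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
```

Since the computation subtracts two numbers of size about $n^{\rho}$, floating-point cancellation limits how far $n$ can be pushed before the printed error stops being meaningful.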

A sequence $x(n)$ is called uniformly distributed in $\mathbb{T}^{r+1}$ if for any $0 \le a_j < b_j \le 1$, $j = 0, \dots, r$, we have that

$$\lim_{n \to \infty} \frac{1}{n} \#\{1 \le k \le n : x_j(k) \in (a_j, b_j), \ j = 0, \dots, r\} = \prod_{j=0}^{r} (b_j - a_j). \tag{2.4}$$

If $x(n)$ is a sequence in $\mathbb{R}^{r+1}$, we can view it as a sequence in $\mathbb{T}^{r+1}$ by considering $x(n) \pmod 1$, and call it uniformly distributed if $x(n) \pmod 1$ is. We need the following consequence of Theorem 1.8 in [5].

Theorem 2.2 (Boshernitzan). Let $(f_1, \dots, f_s)$ be functions $\mathbb{R} \to \mathbb{R}$ of subpolynomial growth, that is, there is an integer $N$ such that

$$\lim_{x \to \infty} f_j(x) x^{-N} = 0, \quad 1 \le j \le s. \tag{2.5}$$

Then the following two conditions are equivalent:

(i) The sequence $\{f_1(n), \dots, f_s(n)\}_{n \ge 1}$ is uniformly distributed in $\mathbb{T}^s$.

(ii) For any $(m_1, \dots, m_s) \in \mathbb{Z}^s \setminus \{0\}$, and for every polynomial $p(x)$ with rational coefficients, we have that

$$\lim_{x \to \infty} \frac{\sum_{j=1}^{s} m_j f_j(x) - p(x)}{\log(x)} = \pm\infty. \tag{2.6}$$

We will use the following consequence of this theorem.

Lemma 2.3. The sequence

$$x(n) = (x_0(n), \dots, x_r(n)) \tag{2.7}$$

is uniformly distributed in $\mathbb{T}^{r+1}$.

Proof. This follows from the fact that for any polynomial $p(n)$ and any nonzero integer vector $(m_0, \dots, m_r)$ we have that

$$\Big| \sum_{j=0}^{r} m_j x_j(n) - p(n) \Big|$$

grows at least like $n^{\rho - r}$, which grows faster than $\log(n)$.

Now we come to the proof of Lemma 1.2.


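Proof of Lemma 1.2. By Lemma 2.1, there exists an $N_0$ such that

$$\Big\| \alpha(n+k)^{\rho} - \sum_{j=0}^{r} x_j(n) k^j \Big\| \le \frac{\varepsilon}{2}$$

for any $n \ge N_0$ and $|k| \le K$. By Lemma 2.3, we can now find $n \ge N_0$ such that

$$\| x_l(n) - \tilde{a}_l \| \le \frac{\varepsilon}{2(r+1)K^l}, \quad l = 0, \dots, r,$$

where $\tilde{a}_0 = a_0 - \theta$ and $\tilde{a}_l = a_l$, $l \ge 1$. For $|k| \le K$, we now have, using (1.7), that

\begin{align}
\Big\| \alpha(n+k)^{\rho} + \theta - \sum_{j=0}^{r} a_j k^j \Big\|
&\le \Big\| \alpha(n+k)^{\rho} - \sum_{j=0}^{r} x_j(n) k^j \Big\| + \Big\| \sum_{j=0}^{r} x_j(n) k^j - \sum_{j=0}^{r} a_j k^j + \theta \Big\| \notag \\
&\le \frac{\varepsilon}{2} + \Big\| \sum_{j=0}^{r} (x_j(n) - \tilde{a}_j) k^j \Big\| \tag{2.8} \\
&\le \frac{\varepsilon}{2} + \sum_{j=0}^{r} |k^j| \, \| x_j(n) - \tilde{a}_j \| \tag{2.9} \\
&\le \frac{\varepsilon}{2} + \sum_{j=0}^{r} K^j \frac{\varepsilon}{2(r+1)K^j} = \varepsilon, \notag
\end{align}

where we used the definition of $\tilde{a}_l$ in (2.8), and that $k$ is an integer in (2.9). This finishes the proof.

Acknowledgements. I am indebted to J. Chaika, D. Damanik, and A. Metelkina for useful discussions, and the referees for many useful suggestions on how to improve the presentation.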

References

1. Avila, A., Bochi, J., Damanik, D.: Cantor spectrum for Schrödinger operators with potentials arising from generalized skew-shifts. Duke. Math. J. (to appear), available at http://arxiv.org/abs/07092667v2[math. DS], 2007
2. Avila, A., Jitomirskaya, S.: The Ten Martini Problem. Ann. Math. (to appear), available at http://annals. math.princeton.edu/issues/2006/FinalFiles/AvilaJitomirskayaFinal.pdf
3. Bourgain, J.: Positive Lyapounov exponents for most energies. In: Geometric aspects of functional analysis, Lecture Notes in Math. 1745, Berlin: Springer, 2000, pp. 37–66
4. Bourgain, J.: Green's function estimates for lattice Schrödinger operators and applications. Annals of Mathematics Studies, 158. Princeton, NJ: Princeton University Press, 2005, x+173 pp
5. Boshernitzan, M.: Uniform distribution and Hardy fields. J. Anal. Math. 62, 225–240 (1994)
6. Griniasty, M., Fishman, S.: Localization by pseudorandom potentials in one dimension. Phys. Rev. Lett. 60, 1334–1337 (1988)
7. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Invent. Math. 125, 329–268 (1999)
8. Stolz, G.: Spectral theory for slowly oscillating potentials. I. Jacobi matrices. Manus. Math. 84(3–4), 245–260 (1994)

Communicated by B. Simon

Commun. Math. Phys. 290, 941–972 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0723-y


The Area of Horizons and the Trapped Region

Lars Andersson (1, 2), Jan Metzger (1, 3)

1 Albert-Einstein-Institut, Am Mühlenberg 1, 14476 Potsdam, Germany. E-mail: [email protected]; [email protected]
2 Department of Mathematics, University of Miami, Coral Gables, FL 33124, USA
3 Stanford University, Mathematics, 450 Serra Mall, Stanford, CA 94305, USA

Received: 23 September 2008 / Accepted: 16 October 2008
Published online: 24 January 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: This paper considers some fundamental questions concerning marginally trapped surfaces, or apparent horizons, in Cauchy data sets for the Einstein equation. An area estimate for outermost marginally trapped surfaces is proved. The proof makes use of an existence result for marginal surfaces, in the presence of barriers, curvature estimates, together with a novel surgery construction for marginal surfaces. These results are applied to characterize the boundary of the trapped region.

Supported in part by the NSF, under contract no. DMS 0407732 with the University of Miami.
Supported in part by a Feodor-Lynen Fellowship of the Humboldt Foundation.

1. Introduction

Trapped and marginally trapped surfaces play a central role in the analysis of spacetime geometry. By the singularity theorems of Hawking and Penrose [HE73], a spacetime which satisfies suitable energy and causality conditions, and which in addition contains a trapped surface, must contain a black hole. Marginally trapped surfaces, or apparent horizons, serve as the quasi-local version of black hole boundary. In numerical general relativity, they are used as excision surfaces for the evolution of black hole initial data, and approximations to physical characteristics of a black hole such as linear and angular momentum [KLZ07,CLZ+07] can be calculated in terms of data induced on the apparent horizon.

We briefly recall some basic facts. A two dimensional spacelike surface $\Sigma$ in a 4-dimensional Lorentzian spacetime has, up to normalization, two future pointing null normals. We designate one of these, $\ell^+$, the outward pointing, and the other, $\ell^-$, the inward pointing null normal. Corresponding to $\ell^{\pm}$ we have the null mean curvatures or null expansions $\theta^{\pm}$. Let $(M, g, K)$ be a Cauchy data set containing $\Sigma$. Then $\theta^{\pm}$ is given by

$$\theta^{\pm} = P \pm H,$$

where $H$ is the mean curvature of $\Sigma$ in $M$ with respect to the outward pointing normal, and $P = \operatorname{tr}_{\Sigma} K$, the trace of the projection of $K$ to $\Sigma$. The surface $\Sigma$ is said to be (future) trapped if $\theta^{\pm} < 0$, and (future) marginally trapped if $\theta^- < 0$, while $\theta^+ = 0$. If $\theta^+ < 0$ or $\theta^+ > 0$, with no condition imposed on $\theta^-$, then $\Sigma$ is called outer trapped or outer untrapped, respectively. Finally, if the condition $\theta^+ = 0$ holds, with no further condition on $\theta^-$, then $\Sigma$ is called a marginally outer trapped surface, or MOTS. We will explicitly review notation and further conditions needed on $(M, g, K)$ in Sect. 2.

From a mathematical point of view, MOTS are the natural generalization of minimal surfaces to a Lorentzian setting, see the discussion in [AM05]. In particular, in the case of time-symmetric Cauchy data, where $K \equiv 0$, a MOTS is a minimal surface. However, a fundamental difference between minimal surfaces and MOTS is that MOTS are not stationary with respect to an elliptic functional. In spite of this, there is a notion of stability for MOTS analogous to the notion of stability for minimal surfaces, cf. [AMS05,AMS07]. Although the stability operator in the case of MOTS fails to be self-adjoint, many of the results and ideas generalize from the case of stable minimal surfaces to the case of stable MOTS. In particular, a curvature estimate, generalizing the classical result of [SSY75], was proved in [AM05] for the case of stable MOTS.

The so-called Jang's equation [Jan78] is closely related to the equation $\theta^+ = 0$. Both are prescribed mean curvature equations, where the right hand side depends on the normal. A careful study of Jang's equation is a crucial ingredient in the positive mass proof of Schoen and Yau [SY81]. Among other things, their argument makes use of the fact that the boundary of the blowup set for Jang's equation consists of marginal surfaces.
This means that the question of existence of MOTS may be approached by studying the existence of blowup solutions to Jang’s equation. This observation was used by Yau [Yau01] to give a criterion for a Cauchy data set to contain a marginal surface. A consequence of the fact that MOTS are not critical points for a variational principle is that the familiar barrier arguments for the existence of minimal surfaces do not generalize to MOTS. However, as was pointed out by Schoen in a talk given at the Miami Waves conference in 2004 [Sch04], the fact that blowup surfaces for Jang’s equation are marginal surfaces actually provides a result which replaces the above mentioned barrier arguments. Theorem 1.1. Let (M, g, K ) be a Cauchy data set. Assume that M is compact with two boundary components, an inner and an outer boundary and assume that the inner boundary is outer trapped and the outer boundary is outer untrapped. Then M contains a stable MOTS. This theorem follows from Schoen’s original result, stated as Theorem 3.1 and a closer analysis of the blow-up surface, cf. Theorem 4.1. Unfortunately, a proof of Theorem 3.1 has not been published. In Sect. 3 we therefore prove this result in detail, since it is crucial for the results in this paper. We wish to remark here that if the ambient manifold is asymptotically flat with appropriate fall-off conditions, then spheres near infinity will be untrapped and can serve as outer barriers in Theorem 1.1. Starting from the curvature estimates for MOTS mentioned above, it is easy to show that the set of all stable marginally trapped surfaces in a compact region is compact, given a uniform estimate for the area. However, such an estimate cannot be expected to hold in general. Examples due to Colding-Minicozzi and others [CM99,Dea03] show that for each genus g ≥ 1 there is an example of a compact three dimensional manifold containing a sequence of stable minimal surfaces of genus g with unbounded area.


Recalling that minimal surfaces are MOTS in the special case $K = 0$, this shows that an a priori area estimate for MOTS requires further conditions. If we consider surfaces minimizing area in a given homology class, on the other hand, there is no need to prove an area bound to obtain compactness, as one can assume that the area is bounded by the area of any comparison surface. For the case of MOTS, the appropriate analogue of a minimizing surface is an outermost MOTS. We say that a MOTS $\Sigma$ is outermost in $M$ if there is no other MOTS in the complement of the region which $\Sigma$ bounds with a, possibly empty, inner boundary. In this respect, the main result of this paper, cf. Theorem 6.5, is an area estimate for the outermost MOTS.

Theorem 1.2. There exists a constant $C$ which is an increasing function of $\|{}^{M}\mathrm{Rm}\|_{C^0(M)}$, $\|K\|_{C^1(M)}$, $\operatorname{inj}_{\rho}(M, g, K; \partial M)^{-1}$, and $\operatorname{Vol}(M)$ such that the area of an outermost MOTS $\Sigma$ satisfies the estimate $|\Sigma| \le C$.

The quantity $\operatorname{inj}_{\rho}(M, g, K; \partial M)$ is explained in Definition 2.8. This result does not require the MOTS to be connected. Thus, in combination with the curvature estimate for stable MOTS we infer an estimate for the number of components of the outermost MOTS. Note that even for outward minimizing surfaces the above bound does not actually follow from the variational principle, as it does not refer to the area of a comparison surface. In this respect our area estimate is related to the area estimate in [NR06] for minimizing minimal surfaces in terms of volume and the homological filling functions of the ambient manifold, which must have simple enough homology.

To put Theorem 1.2 into perspective, recall that the Penrose inequality is a conjectured relation between the ADM mass and the area of the horizon. For a general Cauchy data set, the exact statement of the Penrose inequality is a subtle issue. Although the area estimate stated in Theorem 1.2 holds for outermost MOTS, a counterexample due to Ben-Dov [BD04] shows that an inequality between the area of the outermost MOTS and the ADM mass does not hold in general.

One of the main steps in the proof of Theorem 1.2 is a surgery argument, which is given in Sect. 6. This argument constructs, given a stable MOTS $\Sigma$ with sufficiently large area and an outer barrier surface, another stable MOTS outside $\Sigma$. The two main steps in the argument are to show, using the curvature estimate, that given a stable MOTS $\Sigma$ with sufficiently large area, it is possible to glue in a neck with negative $\theta^+$, and thereby to construct a surface $\Sigma'$ outside $\Sigma$ with $\theta^+ \le 0$. Together with Theorem 1.1 this yields a contradiction to the assumption that $\Sigma$ is outermost.

The surgery argument may also be used to give a replacement for the strong maximum principle for outermost MOTS. It should be noted that for general MOTS the strong maximum principle does not always apply; in particular, it cannot be used to rule out that a surface touches itself in points where the normals of the two touching pieces point into opposite directions. This is exactly the situation which we can address with the surgery argument.

Combining the above area estimate for outermost MOTS and the curvature estimate of [AM05] yields, as already mentioned, a compactness result for the class of outermost MOTS in a compact region. Using this fact in combination with the surgery technique discussed above enables us to give a characterization of the boundary of the trapped region.


The outer trapped region is the union of all domains bounded by a weakly outer trapped surface and the, possibly empty, interior boundary of the initial data set. It has been proposed by several authors that the boundary of the outer trapped region is a smooth MOTS. However, the arguments put forth to prove this, see for example [HE73,KH97], relied on strong extra assumptions such as a piecewise smoothness of the boundary. Using the techniques developed in this paper we are able to settle this problem completely. Theorem 1.3. The boundary of the outer trapped region is a smooth outermost MOTS. Furthermore, it is the unique outermost MOTS. The boundary of the outer trapped region is defined and examined in Sect. 7, where Theorem 7.3 is proved, a more precise version of Theorem 1.3. The main idea here is that barrier constructions using a smoothing result from Kriele-Hayward [KH97], cf. Lemma 2.14, and Theorem 5.1 can be used to prove a replacement for the maximum principle for outermost MOTS. Together with the compactness properties for stable MOTS, and the area estimate for outermost MOTS, this gives the result. Although the presentation here is restricted to the n = 3 dimensional case, most of the techniques proposed generalize to higher dimensions. The points which need to be addressed in the higher dimensional case are regularity issues for Jang’s equation, cf. Remark 3.2, and the a priori curvature estimates for stable MOTS used in the surgery procedure of Sect. 6. See [Eic07] for a treatment of these issues in the higher dimensional case. 2. Preliminaries An initial data set for the Einstein equations is a 3-dimensional Riemannian manifold (M, g) together with a symmetric two-tensor K representing the second fundamental form of M viewed as a Cauchy hypersurface in a four dimensional spacetime. 
In this paper we will not make further use of the spacetime geometry; in particular, energy conditions or constraint equations on $(g, K)$ are not needed. A surface $\Sigma$ in $M$ is called two-sided if its normal bundle is orientable, i.e. if it is possible to choose a globally defined normal. As there are two such choices we will assume that there is one distinguished direction which we call the outer normal. We will denote this outer normal vector field by $\nu$. Given a two-sided surface $\Sigma$ in $M$, we denote its second fundamental form, defined with respect to its outer normal $\nu$, by $A$. Further, we denote by $H$, $P$ the mean curvature, $H = \operatorname{div}_{\Sigma} \nu$, and the trace of $K^{\Sigma} = K|_{T\Sigma}$ along $\Sigma$, $P = \operatorname{tr} K^{\Sigma}$, respectively. The outward null expansion of $\Sigma$ is the quantity $\theta^+ = P + H$ and the inward null expansion is $\theta^- = P - H$. The null expansions $\theta^{\pm}$ are the traces of the null second fundamental forms $\chi^{\pm} = K^{\Sigma} \pm A$.

Definition 2.1. A smooth, embedded, compact, two-sided surface $\Sigma$ is a marginally outer trapped surface (MOTS) if $\theta^+ = 0$ on $\Sigma$.

Unless otherwise stated, we shall consider data sets $(M, g, K)$ with the following properties. We assume $M$ is a compact manifold with boundary $\partial M$ such that $\partial M = \partial^- M \cup \partial^+ M$ is the disjoint union of a possibly empty inner boundary $\partial^- M$, which we endow with the normal vector field pointing into $M$, and the non-empty outer boundary $\partial^+ M$, which we endow with the normal vector field pointing out of $M$. We assume the outer boundary is a barrier, i.e. $\theta^+[\partial^+ M] > 0$. All fields are assumed to be smooth up to boundary.
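A standard worked example, not taken from this paper, may help to fix the sign conventions. Assume the time-symmetric Schwarzschild slice in isotropic coordinates, $g = (1 + \tfrac{m}{2r})^4 \delta$ with $K = 0$, so that $P = 0$ and $\theta^+ = H$. For the coordinate sphere of radius $r$, with outer normal pointing towards increasing $r$, the conformal change formula for mean curvature gives $H = \tfrac{2}{r}\,(1 - \tfrac{m}{2r})\,(1 + \tfrac{m}{2r})^{-3}$. The sketch below evaluates this and classifies the sphere; the function names and the tolerance are illustrative assumptions.

```python
def theta_plus_schwarzschild(r, m):
    """Outward null expansion theta^+ = P + H of the coordinate sphere of
    isotropic radius r in the time-symmetric Schwarzschild slice
    g = (1 + m/(2r))**4 * delta, K = 0 (so P = 0 and theta^+ = H)."""
    psi = 1.0 + m / (2.0 * r)
    mean_curvature = (2.0 / r) * (1.0 - m / (2.0 * r)) / psi ** 3
    return mean_curvature


def classify(theta_plus, tol=1e-12):
    """Classify a surface with constant theta^+ following the terminology above."""
    if theta_plus > tol:
        return "outer untrapped"
    if theta_plus < -tol:
        return "outer trapped"
    return "MOTS"


mass = 1.0  # hypothetical mass parameter
horizon = classify(theta_plus_schwarzschild(mass / 2.0, mass))
far_sphere = classify(theta_plus_schwarzschild(10.0, mass))
inner_sphere = classify(theta_plus_schwarzschild(0.1, mass))
```

In this example the sphere $r = m/2$ is a minimal surface and hence a MOTS, large spheres are outer untrapped and can play the role of the outer barrier $\partial^+ M$, and spheres with $r < m/2$ have $\theta^+ < 0$ with respect to the same choice of normal.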


Definition 2.2. We say that $\Sigma$ bounds a region $\Omega \subset M$ with respect to $\partial^+ M$, if the boundary $\partial\Omega$ is the disjoint union $\partial\Omega = \Sigma \cup \partial^+ M$. In this case, the normal pointing into $\Omega$ will be used as the outer normal for $\Sigma$.

Note that if $\Sigma$ bounds $\Omega$ with respect to $\partial^+ M$, then $\Sigma$ is homologous to $\partial^+ M$. For the existence results, Theorems 3.1 and 5.1, we need a non-empty $\partial^- M$ with $\theta^+[\partial^- M] < 0$ as inner barrier surface. On the other hand, for the area bound, Theorem 6.5, and Theorem 7.3, which shows regularity of the trapped region, we allow $\partial^- M$ to be empty, and assume that $\partial^- M$ is a weak barrier, $\theta^+[\partial^- M] \le 0$, if nonempty.

Definition 2.3. If $(M, g, K)$ is as before, with $\partial^- M$ possibly empty, then an outermost MOTS is a MOTS $\Sigma$ which bounds a region $\Omega$ with respect to $\partial^+ M$ as in Definition 2.2 with the following property. If $\Sigma'$ is a MOTS bounding a set $\Omega'$ with respect to $\partial^+ M$ with $\Omega' \subset \Omega$, then $\Sigma' = \Sigma$.

We recall the strong maximum principle for MOTS. Note that it is only valid if two surfaces touch with the normals pointing in the same direction, as the surfaces have to be oriented the same way to use the maximum principle for quasilinear elliptic equations of second order [AG05,GT98].

Proposition 2.4. Let $(M, g, K)$ be an initial data set and let $\Sigma_i \subset M$, $i = 1, 2$, be two connected $C^2$-surfaces touching at one point $p$, such that the outer normals of $\Sigma_i$ agree at $p$. Assume furthermore that $\Sigma_2$ lies to the outside of $\Sigma_1$, that is in the direction of its outer normal near $p$, and that

$$\sup_{\Sigma_1} \theta^+[\Sigma_1] \le \inf_{\Sigma_2} \theta^+[\Sigma_2].$$

Then $\Sigma_1 = \Sigma_2$.

If $\theta^+[\partial^- M] < 0$ and $\theta^+[\partial^+ M] > 0$ then by continuity the parallel surfaces to $\partial^{\pm} M$, i.e. the level sets of the distance $\operatorname{dist}(\cdot, \partial^{\pm} M)$, will satisfy the same inequality if the distance is sufficiently small. For later use we formalize this in the following definition.

Definition 2.5. Assume $\theta^+[\partial^- M] < 0$ and $\theta^+[\partial^+ M] > 0$. Denote by $\Sigma_s^{\pm}$ the parallel surface to $\partial^{\pm} M$ at distance $s$. Let

$$\rho^+(M, g, K; \partial^+ M) := \sup\{ s : \Sigma_s^+ \text{ is smooth, embedded and } \theta^+[\Sigma_s^+] > 0 \}$$

and

$$\rho^-(M, g, K; \partial^- M) := \sup\{ s : \Sigma_s^- \text{ is smooth, embedded and } \theta^+[\Sigma_s^-] < 0 \},$$

where we set $\rho^-(M, g, K; \partial^- M) = \infty$ if $\partial^- M = \emptyset$. Let

$$\rho(M, g, K; \partial M) := \min\{ \rho^+(M, g, K; \partial^+ M), \ \rho^-(M, g, K; \partial^- M) \}.$$

Note that $\rho(M, g, K; \partial M)$ only depends on the geometry of $(M, g, K)$. In fact we have

Lemma 2.6. Assume $\theta^+[\partial^- M] < 0$ and $\theta^+[\partial^+ M] > 0$. Let $\|A\|_{C^0(\partial M)}$ be the norm of the second fundamental form of the boundary. There is a constant $C$ depending only on $\inf_{\partial M} |\theta^+[\partial M]|$, $\|K\|_{C^1(M)}$, $\|{}^{M}\mathrm{Rm}\|_{C^0(M)}$, and $\|A\|_{C^0(\partial M)}$ such that $\rho(M, g, K; \partial M)^{-1} \le C$.


The significance of Definition 2.5 lies in the following lemma, which is an immediate consequence of the strong maximum principle.

Lemma 2.7. If $(M, g, K)$ is as before, with $\partial^- M$ possibly empty, and $\Sigma \subset M$ is a smooth MOTS homologous to $\partial^+ M$, then $\operatorname{dist}(\Sigma, \partial M) \ge \rho(M, g, K; \partial M)$.

Later, we will need the injectivity radius of $(M, g)$, restricted to MOTS. By the previous lemma these surfaces cannot enter a collar neighborhood of $\partial M$ if $\partial M$ is a barrier, and thus we only need to consider the injectivity radius of points at least distance $\rho(M, g, K; \partial M)$ away from $\partial M$.

Definition 2.8. For $p \in M$ let $\operatorname{inj}(M, g; p)$ be the injectivity radius of $(M, g)$ at $p$. Then denote

$$\operatorname{inj}_{\rho}(M, g, K; \partial M) := \inf\{ \operatorname{inj}(M, g; p) : \operatorname{dist}(p, \partial M) \ge \rho(M, g, K; \partial M) \}.$$

Let $\Sigma$ be a MOTS and let $F : \Sigma \times (-\varepsilon, \varepsilon) \to M$ be a normal variation of $\Sigma$, that is $F(\cdot, 0) = \mathrm{id}$ and $\frac{\partial F}{\partial s}\big|_{s=0} = f\nu$ for a function $f \in C^{\infty}(\Sigma)$. Then the variation of $\theta^+$ at $\Sigma$ is given by the operator

$$\frac{\partial \theta^+[F(\Sigma, s)]}{\partial s}\Big|_{s=0} = L_M f = -\Delta f + 2S(\nabla f) + f\Big( \operatorname{div} S - \tfrac{1}{2}|\chi^+|^2 - |S|^2 + \tfrac{1}{2}\,{}^{\Sigma}\mathrm{Sc} - \mu - J(\nu) \Big).$$

Here $\Delta$, $\nabla$ and $\operatorname{div}$ are the Laplace-Beltrami operator, the tangential gradient and the divergence along $\Sigma$. Furthermore $S(\cdot) = K(\nu, \cdot)^T$, where $(\cdot)^T$ denotes orthogonal projection to $T\Sigma$. ${}^{\Sigma}\mathrm{Sc}$ is the scalar curvature of $\Sigma$, $\mu = \tfrac{1}{2}\big({}^{M}\mathrm{Sc} - |K|^2 + (\operatorname{tr} K)^2\big)$, and $J = {}^{M}\operatorname{div} K - d(\operatorname{tr} K)$. This operator is not self-adjoint. However, the general theory for elliptic operators of second order implies that $L_M$ has a unique eigenvalue $\lambda$ with minimal real part. This eigenvalue is real, and the corresponding eigenfunction does not change sign. It is called the principal eigenvalue of $L_M$. In [AMS05,AMS07] the following notion was introduced:

Definition 2.9. A MOTS $\Sigma$ is called stable if the principal eigenvalue of $L_M$ is non-negative.

A strictly stable MOTS, that is one with $\lambda > 0$, can be deformed in the direction of the outer normal such that $\theta^+ > 0$ on the deformed surfaces. To see this, simply use the principal eigenfunction with the positive sign as the lapse of a normal deformation. Analogously, unstable surfaces can be deformed in the direction of the outer normal such that $\theta^+ < 0$ on the deformed surface. For a further discussion on stability see [AMS05,AMS07,AM05]. We shall need Theorem 1.2 from [AM05].

Theorem 2.10. Suppose $\Sigma$ is a stable MOTS in $(M, g, K)$ homologous to $\partial^+ M$. Then the second fundamental form $A$ satisfies the inequality

$$\|A\|_{\infty} \le C\big( \|K\|_{C^1(M)}, \ \|{}^{M}\mathrm{Rm}\|_{C^0(M)}, \ \operatorname{inj}_{\rho}(M, g, K; \partial M)^{-1} \big).$$


Note that in the reference [AM05] this theorem is proven for $M$ without boundary. The same method gives the estimate where the dependency on $\operatorname{inj}(M, g)$ in the original statement is replaced by $\operatorname{inj}_{\rho}(M, g, K; \partial M)$, as this is the quantity which needs to be controlled to apply the Hoffman-Spruck Sobolev inequality.

Subsequently we denote by $B_r^M(O)$ the open ball in $M$ with radius $r$ around $O$, and by $B_r^{\Sigma}(p)$ the intrinsic open ball in $\Sigma$. Let $M$ be as above and let $\Sigma \subset M$ be a compact smooth embedded two-sided surface, and let $G_{\Sigma}$ be the normal exponential map of $\Sigma$:

$$G_{\Sigma} : \Sigma \times (-\operatorname{dist}(\Sigma, \partial M), \operatorname{dist}(\Sigma, \partial M)) \to M : (p, r) \mapsto \exp_p^M(r\nu), \tag{2.1}$$

where $\exp_p^M : T_p M \to M$ is the exponential map of $M$ at $p$. Locally $G_{\Sigma}$ is injective and well behaved; this is the content of the following well-known lemma. We shall focus on the local outer injectivity in the following sense. We denote by $\operatorname{inj}(M, g; \Sigma)$ the injectivity radius of $(M, g)$ restricted to $\Sigma$.

Lemma 2.11. If $\Sigma \subset M$ is as above with bounded curvature, there exists $0 < i_0^+(\Sigma) < \operatorname{inj}(M, g; \Sigma)$, depending only on $\operatorname{inj}(M, g; \Sigma)$, $\|{}^{M}\mathrm{Rm}\|_{C^0}$, and $\sup_{\Sigma} |A|$, such that for all $x \in \Sigma$ the map

$$G_{\Sigma}\big|_{B^{\Sigma}_{i_0^+(\Sigma)}(x) \times [0, i_0^+(\Sigma))} : B^{\Sigma}_{i_0^+(\Sigma)}(x) \times [0, i_0^+(\Sigma)) \to M$$

is a diffeomorphism on its image, and such that the sheets

$$\Sigma^s_{x, i_0^+(\Sigma)} := G_{\Sigma}\big( B^{\Sigma}_{i_0^+(\Sigma)}(x), s \big)$$

are discs with bounded curvature $\sup_{\Sigma^s} |A| \le 2 \sup_{\Sigma} |A|$, for $s \in [0, i_0^+(\Sigma))$.

This lemma reflects the local well-behavedness of the distance surfaces to $\Sigma$, in particular including the curvature bound. In contrast, the next definition aims at the global behavior. Again, we only focus on the outward injectivity.

Definition 2.12. The outer injectivity radius of $\Sigma$ is

$$i^+(\Sigma) := \sup\{ \delta : G_{\Sigma}|_{\Sigma \times [0, \delta)} \to M \text{ is injective} \}.$$

It is intuitively clear that if $i^+(\Sigma)$ is smaller than $i_0^+(\Sigma)$, then the surface nearly meets itself on the outside. A precise formulation is given by the following lemma.

Lemma 2.13. Let $\Sigma$ be a compact, embedded and two-sided surface with $i^+(\Sigma) < \tfrac{1}{2} i_0^+(\Sigma)$. Then there exist two points $p, q \in \Sigma$ with $\operatorname{dist}^M(p, q) = 2 i^+(\Sigma)$ but $\operatorname{dist}^{\Sigma}(p, q) \ge i_0^+(\Sigma) > 2 i^+(\Sigma)$. The points $p$ and $q$ can be joined by a geodesic segment $\gamma$ in $M$, which is orthogonal to $\Sigma$ at $p$ and $q$, and as a set

$$\gamma = G_{\Sigma}\big|_{B^{\Sigma}_{i_0^+(\Sigma)}(p) \times [0, i_0^+(\Sigma))}\big(p, [0, 2i^+]\big) = G_{\Sigma}\big|_{B^{\Sigma}_{i_0^+(\Sigma)}(q) \times [0, i_0^+(\Sigma))}\big(q, [0, 2i^+]\big).$$

Proof. From the definition of $i^+$ we know that $G_{\Sigma}(\cdot, i^+(\Sigma)) : \Sigma \to M$ is not injective. Thus there exist two points $p, q \in \Sigma$ which map to the same point $O \in M$. By Lemma 2.11, $\operatorname{dist}^{\Sigma}(p, q) \ge i_0^+(\Sigma)$. Furthermore $O$ has distance $i^+(\Sigma)$ to $\Sigma$ and to $p, q$, so $\operatorname{dist}(O, \Sigma) = \operatorname{dist}(O, p)$ and hence the geodesic segment $\gamma_p$ joining $O$ to $p$ is perpendicular to $\Sigma$. Similarly the geodesic segment $\gamma_q$ joining $O$ and $q$ is perpendicular to $\Sigma$. Thus $\operatorname{dist}^M(p, q) \le 2 i^+(\Sigma)$. If $\operatorname{dist}^M(p, q) < 2 i^+(\Sigma)$ then there would be a parallel surface to $\Sigma$ at distance $d < i^+(\Sigma)$ which intersects itself, which is not possible as $G_{\Sigma}(\cdot, d)$ is injective. Thus $\operatorname{dist}^M(p, q) = 2 i^+(\Sigma)$, and $\gamma_p$ and $\gamma_q$ must form a smooth geodesic, as otherwise the angle at $O$ could be smoothed out to yield a shorter curve joining $p$ and $q$.

Fig. 1. A surface that nearly meets itself

Figure 1 shows the situation in the lemma. It follows from the definition of $i^+(\Sigma)$ that the points $p, q$ minimize the distance between the sheets $B^{\Sigma}_{i_0^+(\Sigma)}(p)$ and $B^{\Sigma}_{i_0^+(\Sigma)}(q)$, and hence $\gamma$ is orthogonal to $\Sigma$ at $p$ and $q$. In addition $\gamma$ does not intersect $\Sigma$ in any other points except $p$ and $q$. If we parameterize $\gamma$ by arc length as a curve joining $p$ to $q$, the tangent to $\gamma$ at $p$ coincides with the normal $\nu$ to $\Sigma$. Similarly, with $\gamma$ arc length parameterized as a curve joining $q$ to $p$, the tangent to $\gamma$ at $q$ coincides with the normal $\nu$ to $\Sigma$ at $q$. This means that $\gamma$ lies completely on the outside of $\Sigma$.

For later reference, we need the following smoothing result from [KH97, Lemma 6].

Lemma 2.14. Let $\Sigma_1, \Sigma_2 \subset M$ be smooth two-sided surfaces which intersect transversely in a smooth curve $\gamma$. Let $\nu_i$ be the outer normals of $\Sigma_i$, $i = 1, 2$. Choose one connected component $\Sigma^{\pm}$ of each set $\Sigma_i \setminus \gamma$ such that in a neighborhood of $\gamma$ the piece $\Sigma^-$ lies in the outside of $\Sigma_1$ and the piece $\Sigma^+$ in the outside of $\Sigma_2$. Then for any neighborhood $U$ of $\gamma$ there exists a smooth surface $\Sigma$ and a continuous and piecewise smooth bijection $\Phi : \Sigma^+ \cup \Sigma^- \cup \gamma \to \Sigma$ such that

(1) $\Phi(x) = x$ for all $x \in (\Sigma^+ \cup \Sigma^-) \setminus U$,
(2) $(\Sigma^+ \cup \Sigma^-) \setminus U = \Sigma \setminus U$, and
(3) $\theta^+[\Sigma](\Phi(x)) \le \theta^+[\Sigma^+](x)$ for $x \in \Sigma^+$ and $\theta^+[\Sigma](\Phi(x)) \le \theta^+[\Sigma^-](x)$ for $x \in \Sigma^-$.

Moreover $\Sigma$ lies in the connected component of $U \setminus (\Sigma^+ \cup \Sigma^- \cup \gamma)$ into which the outer normals $\nu^{\pm}$ of $\Sigma^{\pm}$ point.

Briefly stated, this procedure works by replacing the inward corner near $\gamma$ by a smooth patch with $\theta^+$ very negative.
The reason why this procedure works is that the corner is a concentration of negative mean curvature, that is negative θ + .

The Area of Horizons and the Trapped Region


3. Existence of MOTS

This section is devoted to a proof of Schoen’s existence theorem for MOTS [Sch04] in the presence of barrier surfaces.

Theorem 3.1. Let $(M, g, K)$ be a smooth, compact initial data set with $\partial M$ the disjoint union $\partial M = \partial^- M \cup \partial^+ M$ such that $\partial^\pm M$ are non-empty, smooth, compact surfaces without boundary and $\theta^+[\partial^- M] < 0$ with respect to the normal pointing into $M$ and $\theta^+[\partial^+ M] > 0$ with respect to the normal pointing out of $M$. Then there exists a nonempty, smooth, embedded MOTS homologous to $\partial^+ M$.

Remark 3.2. The proof presented here readily carries over to $n$-dimensional $M$ with $3 \le n \le 5$. The dimensional restriction is due to the method used for the curvature estimates in Proposition 3.3 in [SY81]. Higher dimensional replacements for this proposition are accessible via methods from geometric measure theory, cf. [Eic07].

3.1. Setup and outline. Consider $\bar M := M \times \mathbb{R}$ equipped with the metric $\bar g = g + dz^2$, and define $\bar K$ on $\bar M$ as the pull-back of $K$ under the projection $\pi : M \times \mathbb{R} \to M : (p, z) \mapsto p$. For a function $f$ on $M$ we consider $N = \operatorname{graph} f := \{(p, f(p)) : p \in M\}$, with induced metric $\bar g$, which is of the form
\[
\bar g_{ij} = g_{ij} + \nabla_i f \nabla_j f, \qquad \bar g^{ij} = g^{ij} - \frac{\nabla^i f \nabla^j f}{1 + |\nabla f|^2}.
\]
The mean curvature of $N$ with respect to the downward normal is
\[
H[f] = \operatorname{div}\left( \frac{\nabla f}{\sqrt{1 + |\nabla f|^2}} \right).
\]
Furthermore let $P[f] = \operatorname{tr}_N \bar K$ be the trace of $\bar K$ taken along $N$. Now we can write Jang’s equation as
\[
J[f] = H[f] - P[f] = 0. \tag{3.1}
\]
We shall consider the Dirichlet problem for this equation with boundary values $f|_{\partial^\pm M} = \mp Z$, for constants $Z > 0$. Equation (3.1) is a quasilinear elliptic equation of divergence form. In particular, it is a prescribed mean curvature equation with gradient dependent lower order term. For such equations the strong maximum principle does not apply directly to give upper and lower bounds for the solution, without assuming extra conditions for example on the size of the domain. Further, the boundary gradient estimates needed for the proof of existence of classical solutions typically require restrictions on the geometry of the boundary. Therefore we cannot prove existence of solutions to the Dirichlet problem directly for Eq. (3.1). In general it is to be expected that solutions to the Dirichlet problem blow up in the interior.

We follow the approach of [SY81] and regularize Jang’s equation by adding a capillarity term. Thus we consider instead of (3.1), the equation
\[
J_\tau[f] = J[f] - \tau f = 0 \tag{3.2}
\]


L. Andersson, J. Metzger

for $\tau > 0$. After suitably modifying the data, we are able to apply Leray-Schauder theory [GT98] to prove existence of solutions to the Dirichlet problem. Letting $\tau \to 0$ gives a sequence of solutions which by uniform curvature estimates for $\operatorname{graph} f_\tau$ has a subsequence which converges to a solution of Jang’s equation (which in general may have blowups).

The goal is in fact to prove existence of MOTS by constructing a blowup solution to Jang’s equation. For this purpose, we set $Z = \delta/\tau$ for a suitable $\delta$ and let $\tau \to 0$. A key observation of [SY81] is that solutions to (3.2) satisfy interior estimates for the second fundamental form, uniformly in $\tau$. These estimates allow us to pick out a subsequence of solutions which converges to a blowup solution of Jang’s equation. After applying a sequence of renormalizations using the fact that Jang’s equation is translation invariant, we get a vertical solution, which projects to a MOTS on $M$. The last part of the argument proceeds exactly as in [SY81], and therefore the only thing which needs to be discussed here is the Dirichlet problem.

3.2. Preparing the data. We will assume that $(M, g, K)$ is embedded into a four-dimensional Lorentz manifold $(L, h)$ such that $g$ and $K$ are the first and second fundamental forms of $M$ induced by $h$. As we do not require the dominant energy condition to hold, it is rather simple to produce an extension $(L, h)$ of $(M, g, K)$. To this end extend $g$ to $M \times \mathbb{R}$ by setting $g_t = g + tK$ on the slice $M \times \{t\}$. As $K$ is symmetric, so is $g_t$ and there exists $t_0 > 0$ such that $g_t$ is positive definite for $t \in (-t_0, t_0)$. Then define $h$ on $L := M \times (-t_0, t_0)$ to be $h = -dt^2 + g_t$. This is a Lorentz metric and obviously induces $g$ as the first fundamental form on the slice $M_0 = M \times \{0\}$. That $K$ is the second fundamental form follows from the second variation formula, which implies that the second fundamental form of $M_0$ is given by
\[
\left. \frac{\partial g_t}{\partial t} \right|_{t=0} = K.
\]
Let $t$ be a time function on $L$ with $M = \{t = 0\}$ and $s^+(x) := \operatorname{dist}(x, \partial^+ M)$ the distance function to $\partial^+ M$. For small $s, t$, let $\Sigma^+_{s,t}$ be the surface given by the intersection of the level sets of $s^+$ and $t$. Let $n$ be the timelike normal of the $t$-level sets and let $\nu$ be the spacelike normal of the $s^+$-level sets, inside the $t$-levels, extending the outward pointing normal on $\partial^+ M$. This defines normal fields $n, \nu$ at the surfaces $\Sigma^+_{s,t}$ as well as the corresponding null normals $l^\pm = n \pm \nu$. For small $s, t$, we have $\theta^+[\Sigma^+_{s,t}] > 0$. Now perform a Lorentz rotation of the normals $n, \nu$ to get
\[
\tilde\nu = \cosh\alpha\, \nu + \sinh\alpha\, n, \qquad \tilde n = \sinh\alpha\, \nu + \cosh\alpha\, n.
\]
Let $II^\mu_{ab}$ be the second fundamental form of the surfaces $\Sigma^+_{s,t}$ so that $H = h^{ab} \langle II_{ab}, \nu \rangle$ and $P = h^{ab} \langle II_{ab}, n \rangle$, where $h_{ab}$ is the metric on $\Sigma^+_{s,t}$. Then with respect to the normals $\tilde\nu, \tilde n$ we have
\[
\tilde H = \cosh\alpha\, H + \sinh\alpha\, P, \qquad \tilde P = \sinh\alpha\, H + \cosh\alpha\, P
\]
and the corresponding null expansions $\tilde\theta^\pm = \tilde P \pm \tilde H$


are given by $\tilde\theta^\pm = e^{\pm\alpha} \theta^\pm$. Further we note
\[
\tilde H = \tfrac12 e^{\alpha} \theta^+ - \tfrac12 e^{-\alpha} \theta^-, \qquad \tilde P = \tfrac12 e^{\alpha} \theta^+ + \tfrac12 e^{-\alpha} \theta^-.
\]
Deform $M$ to $\tilde M$ by bending up along the outgoing future light cone at $\partial^+ M$. By doing so, we get the spacelike and timelike normals to agree with $\tilde\nu, \tilde n$ for any $\alpha$. As the deformed $\tilde M$ approaches the light cone, we have $\alpha \to \infty$. Therefore there is an $\alpha$ such that $\tilde H, \tilde P$ are arbitrarily close to $\tfrac12 e^{\alpha} \theta^+$. In particular, if $\theta^+ > 0$, we can achieve that both $\tilde H$ and $\tilde P$ are positive near the outer boundary of $\tilde M$.

We can proceed similarly at the inner boundary $\partial^- M$, where $\theta^+ < 0$ with respect to the inward pointing normal. This means that $\theta^- < 0$ with respect to the outward pointing normal. Then we can proceed as above, bending along the past inward lightcone. This will result in $\tilde H > 0$, $\tilde P < 0$ (where now $\tilde H$ is defined with respect to the outward normal of $M$ as usual).

This constructs a deformed Cauchy data set $(\tilde M, \tilde g, \tilde K)$. Let $\partial \tilde M$ be the boundary of $\tilde M$ constructed by bending as above. Clearly the boundary $\partial \tilde M$ is the union $\partial \tilde M = \partial^- \tilde M \cup \partial^+ \tilde M$, with $\tilde H > 0$ on $\partial \tilde M$ and $\tilde P > 0$ on $\partial^+ \tilde M$, $\tilde P < 0$ on $\partial^- \tilde M$. Let
\[
\Sigma_s^\pm := \{ x \in \tilde M : \operatorname{dist}(x, \partial^\pm \tilde M) = s \}
\]
be the parallel surfaces to $\partial^\pm \tilde M$ and
\[
U_s^\pm := \{ x \in \tilde M : \operatorname{dist}(x, \partial^\pm \tilde M) < s \}
\]
be the respective tubular neighborhoods. Given $\varepsilon > 0$, there exists $\delta > 0$ such that we can ensure the following properties:
\[
\begin{aligned}
&\theta^+[\Sigma_s^-] < 0 \text{ and } \theta^+[\Sigma_s^+] > 0 \text{ for } s \in [0, 4\varepsilon],\\
&H[\Sigma_s^-] > \delta \text{ and } H[\Sigma_s^+] > \delta \text{ for } s \in [0, 2\varepsilon],\\
&P[\Sigma_s^-] \le 0 \text{ and } P[\Sigma_s^+] \ge 0 \text{ for } s \in [0, 2\varepsilon],\\
&\text{the data is unchanged in } M_{3\varepsilon}.
\end{aligned}
\tag{3.3}
\]
We abuse notation here by computing $H$ with respect to the outward pointing normal for $\partial \tilde M$, but compute $\theta^+$ still with respect to the inward pointing normal near $\partial^- \tilde M$, which makes $\theta^+ = P - H$ near $\partial^- \tilde M$.

Fix such an $\varepsilon > 0$ and let $\zeta(s)$ be a non-negative cutoff function on $s \ge 0$, such that $\zeta(s) = 0$ for $s \in [0, \varepsilon]$, $\zeta(s) > 0$ for $s > \varepsilon$, and $\zeta(s) = 1$ for $s \ge 2\varepsilon$. Now define $\zeta(x) = \zeta(d(x, \partial \tilde M))$, and consider the data set $(\tilde g, \zeta \tilde K)$. From now on we denote this data set by $(M, g, K)$. The important point to note here is that this final cut-off does not affect the first property of (3.3), so that we still retain the barrier effect of the boundary.
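The effect of the Lorentz rotation on $H$, $P$ and the null expansions in Sect. 3.2 can be checked numerically. The following is a minimal sketch, assuming only the conventions $\theta^\pm = P \pm H$ stated above; the function names `boost`, `null_expansions` and `check_boost` are hypothetical and chosen for illustration.

```python
import math

def boost(H, P, alpha):
    """Rotate the pair (H, P) by the hyperbolic angle alpha."""
    H_t = math.cosh(alpha) * H + math.sinh(alpha) * P
    P_t = math.sinh(alpha) * H + math.cosh(alpha) * P
    return H_t, P_t

def null_expansions(H, P):
    """Return (theta_plus, theta_minus) = (P + H, P - H)."""
    return P + H, P - H

def check_boost(H, P, alpha):
    """True if the rotated expansions equal exp(+-alpha) times the original ones."""
    tp, tm = null_expansions(H, P)
    tp_t, tm_t = null_expansions(*boost(H, P, alpha))
    ok_plus = math.isclose(tp_t, math.exp(alpha) * tp, rel_tol=1e-12, abs_tol=1e-12)
    ok_minus = math.isclose(tm_t, math.exp(-alpha) * tm, rel_tol=1e-12, abs_tol=1e-12)
    return ok_plus and ok_minus
```

For sample values with $\theta^+ > 0$, a large angle makes both rotated quantities positive, which is the property used at the outer boundary.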


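We find that with respect to the cut-off data we have the following properties near the boundary:
\[
\begin{aligned}
&\theta^+[\Sigma_s^-] < 0 \text{ and } \theta^+[\Sigma_s^+] > 0 \text{ for } s \in [0, 4\varepsilon],\\
&H[\Sigma_s^-] > \delta \text{ and } H[\Sigma_s^+] > \delta \text{ for } s \in [0, 2\varepsilon],\\
&K \equiv 0 \text{ in } U_\varepsilon, \text{ and}\\
&\text{the data is unchanged in } M_{3\varepsilon}.
\end{aligned}
\tag{3.4}
\]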

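3.3. Existence proof. In order to construct solutions to the Dirichlet problem for (3.2), we consider, following [SY81], the family of equations
\[
\begin{cases}
H[f] - \sigma P[f] = \tau f, \\
f|_{\partial M} = \sigma \phi
\end{cases}
\tag{3.5}
\]
for $\sigma \in [0, 1]$ and $\tau \in [0, 1]$. We need the following estimates.

Proposition 3.3. Let $N$ be the graph of a function $f$ satisfying the equation
\[
H[f] - \sigma P[f] = F \quad \text{in } M
\]
with $F \in C^1(\bar M)$. Then the second fundamental form $A$ of $N$ satisfies the estimate
\[
|A|(p, f(p)) \le C\big( \|{}^M\mathrm{Rm}\|_{C^0}, \|K\|_{C^1}, \operatorname{dist}(p, \partial M)^{-1}, \operatorname{inj}(M, g, p)^{-1}, \|F\|_{C^1} \big).
\]
In fact, if we extend the normal $\bar\nu$ of $N$ to $M \times \mathbb{R}$, then
\[
|\bar\nabla \bar\nu|(p, t) \le C\big( \|{}^M\mathrm{Rm}\|_{C^0}, \|K\|_{C^1}, \operatorname{dist}_M(p, \partial M)^{-1}, \operatorname{inj}(M, g, p)^{-1}, \|F\|_{C^1} \big).
\]
Proof. This is analogous to [SY81, Prop. 1 and Prop. 2].

Proposition 3.4. Let $f_{\sigma,\tau}$ be a solution to (3.5) with parameters $\sigma$ and $\tau$. Then $f_{\sigma,\tau}$ satisfies the estimates
\[
\sup_M |f_{\sigma,\tau}| \le \max\Big\{ 3\|K\|_{C^0}/\tau, \ \sup_{\partial M} |\phi| \Big\},
\]
and
\[
\sup_M |\nabla f_{\sigma,\tau}| \le \max\Big\{ c\big( \|{}^M\mathrm{Rm}\|_{C^0} + \|\nabla K\|_{C^0} \big)/\tau, \ \sup_{\partial M} |\nabla f_{\sigma,\tau}| \Big\}.
\]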

Proof. This follows from the maximum principle, as in [SY81, Sect. 4].

Hence we can estimate the gradient once we have a boundary gradient estimate.

Proposition 3.5. Let $(M, g, K)$ be a data set such that there are $\varepsilon > 0$, $\delta > 0$, such that for $s \in [0, \varepsilon]$ the surfaces $\Sigma_s := \{ p \in M : \operatorname{dist}(p, \partial M) = s \}$ satisfy $H > \delta$. Further, assume that $K \equiv 0$ in $\{ p : \operatorname{dist}(p, \partial M) < \varepsilon \}$. Let $f_{\tau,\sigma}$ be a solution of
\[
J_{\tau,\sigma}[f_{\tau,\sigma}] = H[f_{\tau,\sigma}] - \sigma P[f_{\tau,\sigma}] - \tau f_{\tau,\sigma} = 0,
\]
such that $f_{\tau,\sigma}$ is constant on each component of $\partial M$. Suppose that
\[
\sup_M |f_{\tau,\sigma}| = m < \infty \quad \text{and} \quad \sup_{\partial M} |f_{\tau,\sigma}| \le \frac{\delta}{2\tau}.
\]
Then
\[
\sup_{\partial M} |\nabla f_{\tau,\sigma}| \le \max\Big\{ \tfrac{1}{\sqrt 3}, \ 2\varepsilon^{-1} m \Big\}.
\]

Proof. We proceed by constructing a barrier near $\partial^- M$. Consider functions $w$ of the form
\[
w = \psi(s), \qquad s = \operatorname{dist}(\cdot, \partial^- M),
\]
where $\psi : [0, \varepsilon] \to \mathbb{R}$ is a scalar function. For functions of this form we have
\[
J_{\tau,\sigma}[w] = -\frac{\psi'}{(1 + (\psi')^2)^{1/2}} H[\Sigma_s] + \frac{\psi''}{(1 + (\psi')^2)^{3/2}} - \tau \psi \tag{3.6}
\]
in the neighborhood where $K \equiv 0$. To construct an upper barrier near one component of $\partial^- M$, set $w^+ := \psi^+(s)$ with $\psi^+(s) = a + bs$, where $a$ is the value of $f_{\tau,\sigma}$ on that component. We can then pick $b$ so large that $\frac{b}{(1 + b^2)^{1/2}} \ge \frac12$, that is $b \ge \frac{1}{\sqrt 3}$. Then (3.6) yields that
\[
J_{\tau,\sigma}[w^+] \le -\tfrac{\delta}{2} + \tau |a| - \tau b s \le -\tfrac{\delta}{2} + \tau \sup_{\partial M} |f| - \tau b s \le -\tau b s \le 0.
\]
We can then choose $b$ so large that $a + b\varepsilon \ge m$, that is $b \ge 2\varepsilon^{-1} m$. Thus we have constructed an upper barrier; the construction of the lower barrier is analogous.

The barrier near $\partial^+ M$ can be constructed analogously, using the expression
\[
J_{\tau,\sigma}[w] = \frac{\psi'}{(1 + (\psi')^2)^{1/2}} H[\Sigma_s] + \frac{\psi''}{(1 + (\psi')^2)^{3/2}} - \tau \psi \tag{3.7}
\]
for $J_{\tau,\sigma}$ near $\partial^+ M$.

As a corollary, we find that given suitable boundary data, Eq. (3.5) is uniformly elliptic, where the ellipticity constant does not depend on $\sigma \in [0, 1]$. Thus we conclude that there exists a solution to (3.5) with $\sigma = 1$ and $\tau > 0$ for such data by applying Leray-Schauder theory.

Corollary 3.6. Let $(M, g, K)$ and $\phi \in C^\infty(\partial M)$ be as in Proposition 3.5. Then the equation
\[
\begin{cases}
H[f_\tau] - P[f_\tau] = \tau f_\tau, \\
f_\tau|_{\partial M} = \phi
\end{cases}
\tag{3.8}
\]
has a solution $f_\tau$ in $C^{2,\alpha}(\bar M)$ with
\[
\|f_\tau\|_{C^{2,\alpha}(\bar M)} \le C/\tau,
\]
where the constant $C = C\big( \|{}^M\mathrm{Rm}\|_{C^{0,\alpha}}, \|K\|_{C^{1,\alpha}}, \varepsilon^{-1} \big)$.
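The two conditions on the slope $b$ of the linear barrier $\psi^+(s) = a + bs$ in the proof of Proposition 3.5 can be illustrated with a short numerical sketch. It assumes the form (3.6) of the operator for a linear profile (so the $\psi''$ term vanishes); the helper names `slope_factor`, `barrier_slope` and `linear_barrier_operator` are hypothetical.

```python
import math

def slope_factor(b):
    """The factor b / sqrt(1 + b^2) multiplying H in (3.6) for psi(s) = a + b*s."""
    return b / math.sqrt(1.0 + b * b)

def barrier_slope(eps, m):
    """A slope satisfying both b >= 1/sqrt(3) and b >= 2*m/eps."""
    return max(1.0 / math.sqrt(3.0), 2.0 * m / eps)

def linear_barrier_operator(b, s, H, tau, a):
    """Value of (3.6) for psi(s) = a + b*s where K vanishes (psi'' = 0)."""
    return -slope_factor(b) * H - tau * (a + b * s)
```

With sample values $H = 0.5 > \delta = 0.4$, $\tau = 0.1$ and $|a| \le \delta/(2\tau) = 2$, the operator is non-positive on $[0, \varepsilon]$, in line with the chain of inequalities in the proof.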


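Proof. This is analogous to [SY81, Lemma 3].

We now specify the precise data on $\partial M$. Set
\[
\phi = \begin{cases} \dfrac{\delta}{2\tau} & \text{on } \partial^- M, \\[1ex] -\dfrac{\delta}{2\tau} & \text{on } \partial^+ M, \end{cases}
\]
where $\delta$ is as in Proposition 3.5. We then solve (3.8) with this data to obtain a family of functions $f_\tau$. Note that the gradient estimate forces $f_\tau$ to be uniformly large near the boundary. Denote $M_\varepsilon = \{ p \in M : \operatorname{dist}(p, \partial M) > \varepsilon \}$.

Lemma 3.7. There exists an $\varepsilon > 0$ such that the functions $f_\tau$ satisfy
\[
|f_\tau| \ge \frac{\delta}{4\tau} \quad \text{in } M \setminus M_\varepsilon.
\]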

As in [SY81, Sect. 4] we can now use the curvature estimate from Proposition 3.3 to obtain a limit for $\operatorname{graph} f_\tau$ as $\tau \to 0$. By the previous lemma we can restrict ourselves to $M_\varepsilon$ away from the boundary, as $|f_\tau| \to \infty$ uniformly on $M \setminus M_\varepsilon$. This gives the following result.

Proposition 3.8. There exists a sequence $\tau_i \to 0$ such that $\operatorname{graph} f_{\tau_i}$ in $M_\varepsilon$ converges to a smooth manifold $N_0$ satisfying $H - P = 0$. $N_0$ consists of a disjoint collection of components, which are either graphs or cylinders over compact surfaces. Let $\Omega_\pm := \{ p : f_{\tau_i}(p) \to \pm\infty \}$ and $\Omega_0 := \{ p : \sup_{i \ge 1} |f_{\tau_i}(p)| < \infty \}$. Then $M$ is a disjoint union $M = \Omega_0 \cup \Omega_+ \cup \Omega_-$. The set $\Sigma := \partial \Omega_- \setminus \partial^+ M$ consists of marginally trapped surfaces with $\theta^+ = 0$ with respect to the normal pointing into $\Omega_-$.

The fact that $\Sigma$ satisfies $\theta^+ = 0$ can be seen as follows. Since the $f_{\tau_i}$ converge to $-\infty$ in $\Omega_-$ and are bounded below outside of $\Omega_-$, there are just two possibilities for the convergence of $N_{\tau_i} = \operatorname{graph} f_{\tau_i}$ to $N_0$ near each component of $\Sigma$.

The first possibility is that $\Sigma$ is the interface between $\Omega_+$ and $\Omega_-$. Then $N_0$ has a cylindrical component $\Sigma \times \mathbb{R}$, and the convergence is such that the downward normal $\bar\nu_{\tau_i}$ of $N_{\tau_i}$ converges to the normal of $\Sigma$ pointing out of $\Omega_-$. As $N_0$ satisfies $H[N_0] - P[N_0] = 0$ with respect to the limit of $\bar\nu_{\tau_i}$, this implies that $H - P = 0$ on $\Sigma$ with respect to the outward pointing normal, and hence $\theta^+ = P + H = 0$ with respect to the inward pointing normal as claimed.

The second possibility is that $\Sigma$ is an interface between $\Omega_0$ and $\Omega_-$. Then near $\Sigma$, $N_0$ is a graph over $\Omega_0$ which asymptotes to $\Sigma \times \mathbb{R}$, and since $f_{\tau_i} \to -\infty$ in $\Omega_-$, this graph goes to $-\infty$ near $\Sigma$ as well. Again we can conclude that $\bar\nu_{\tau_i}$ converges to the normal of $N_0$ pointing out of $\Omega_-$. Furthermore, $H - P = 0$ on $\Sigma \times \mathbb{R}$ with respect to this normal, as it is the limit of $N_0$, which satisfies $H - P = 0$. Hence we again conclude that $\theta^+[\Sigma] = 0$.

From Lemma 3.7 we know that $\Omega_+$ contains a neighborhood of $\partial^- M$ and $\Omega_-$ contains a neighborhood of $\partial^+ M$, so neither one of them is trivial. In particular $\partial \Omega_-$ is the disjoint union $\partial \Omega_- = \Sigma \cup \partial^+ M$, where $\Sigma \subset M$ is contained in the interior of $M$.

Recall that we had to modify the data for the existence proof. We now show that $\Sigma$ can not enter the region where we modified the data. To see this, note that a neighborhood of $\partial^- M$ is foliated by surfaces $\Sigma_s^-$ with $\theta^+[\Sigma_s^-] < 0$. If $\Sigma$ enters this region there is a minimal $s$ with $\Sigma_s^- \cap \Sigma \ne \emptyset$. This surface touches $\Sigma$ with their outward normals pointing in the same direction. Thus, by the strong maximum principle, $\Sigma = \Sigma_s^-$, a contradiction. Furthermore, there is a neighborhood of $\partial^+ M$ foliated by surfaces $\Sigma_s^+$ with $\theta^+[\Sigma_s^+] > 0$. We can then proceed analogously to get a contradiction to $\Sigma$ entering


this neighborhood. As data set is modified only in the neighborhoods discussed above, we find that lies entirely in the region where the data is unchanged. We thus conclude the proof of Theorem 3.1 by finding our solution in the unmodified region of (M, g, K ). It is an interesting possibility that the existence theory developed here for the Dirichlet problem for Jang’s equation can be used to generalize Yau’s result in [Yau01, Theorem 5.2] to more general boundary geometries. This possibility will be investigated by the authors in future work. 4. Blowup Surfaces are Stable While not actually necessary for the main result of the paper, we present an extension of the results of Sect. 3. From the arguments in [SY81] it is clear that has only components which are symmetrized stable, where symmetrized stable refers to non-negativity of the operator (cf. [GS06]) L˜ M f = − f + f 21 Sc − 21 |χ |2 − µ − J (ν) . Here we want to show that they are in fact stable in the sense of MOTS. Theorem 4.1. The surface constructed in the proof of Theorem 3.1 is a stable MOTS. Remark 4.2. By the same argument we can prove that any blow-up surface obtained by the capillarity term regularization of Jang’s equation is a stable surface, in particular those in [SY81]. Note that all of these surfaces are MOTS provided one chooses the right orientation of the normal. Proof. The stability of will follow from a barrier argument. Assume that is an unstable component of . We will show that in this case the functions f τi are bounded below +∞ in a neighborhood of . Hence lies in the interior of + ∪ 0 and can not be part of ∂ − , which contradicts the assumption that is a component of . If is unstable, let φ > 0 be a suitably scaled eigenfunction to the principal eigenvalue. We can extend the vector field φν to a neighborhood of , and flow by this vector field. This yields a map F : × [−1, 1] → M and constant > 0 with the following properties. We will denote s = F(, s). 1. 0 = . 2. 
s ⊂ + if s ∈ [−1, 0) and s ∩ + = ∅ if s ∈ (0, 1]. 3. ∂∂sF = βν, where ν is the normal to s extending the outward pointing normal ν on , and β satisfies the estimates ∂β −1 ≤ . ≤ β ≤ , and ∂s 4. Outside of + we have θ + [s ] < 0 and inside θ + [s ] > 0 and −1 s ≤ |θ + [s ]| ≤ s 5. We can assume that K C 0 (M) ≤ .

for all

s ∈ [−1, 1].


For an interval $(s_1, s_2) \subset [-1, 1]$ we denote by $A(s_1, s_2) = \{ F(x, s) : s_1 < s < s_2 \}$ the annular region which is foliated by the $\Sigma_s$ for $s \in (s_1, s_2)$ and has boundary $\partial A(s_1, s_2) = \Sigma_{s_1} \cup \Sigma_{s_2}$. We will construct a subsolution $w$ of Jang’s equation, satisfying $J[w] \ge \eta > 0$. The function $w$ will be constant on the $\Sigma_s$, that is $w = \phi(s)$. We will later use the positivity of $\eta$ to infer that $w + m_\tau$ are in fact subsolutions for $J_\tau$, where $m_\tau$ is a suitably chosen constant.

Lemma 4.3. For $w = \phi(s)$ we can compute Jang’s operator to be the following expression:
\[
J[w] = \frac{\phi'}{\beta\sigma} \theta^+ - \Big( 1 + \frac{\phi'}{\beta\sigma} \Big) P - \sigma^{-2} K(\nu, \nu) + \frac{\phi''}{\beta^2 \sigma^3} - \frac{\phi'}{\beta^3 \sigma^3} \frac{\partial\beta}{\partial s}. \tag{4.1}
\]
Here $\sigma^2 = 1 + \beta^{-2} (\phi')^2$.

To construct $w$ we will proceed in three steps, which amount to constructing $w$ on the annuli $A_1 := A(-\delta, 0)$, $A_2 := A(0, \varepsilon)$, and $A_3 := A(\varepsilon, 2\varepsilon)$, where $\delta$ and $\varepsilon$ will be fixed during the construction.

We start with the construction of $\phi$ in $A_2 = A(0, \varepsilon)$, which will fix $\varepsilon$, but not quite $\phi$. In this region all we know is that $\theta^+[\Sigma_s] \le 0$, so we make the assumption $\phi' \le -\mu < 0$, where we will fix $\mu$ in the course of the argument. This renders the first term in (4.1) non-negative. We can thus estimate that
\[
J[w] \ge -\frac{c_1}{\mu^2} + c_2 \frac{\phi''}{|\phi'|^3}, \tag{4.2}
\]
for constants $c_1, c_2 > 0$ depending only on the constant from properties 3 to 5 above, which we write as $\Lambda$, provided we choose $\mu \ge \Lambda$. To see this, note that $\sigma$ is comparable to $|\phi'|$ provided the latter is bounded away from zero. The fact that the term containing $P$ in (4.1) is of the form $c_1/\mu^2$ follows from the Taylor expansion of the square root. To get that the right hand side of (4.2) is positive we must satisfy
\[
\frac{\phi''}{|\phi'|^3} \ge \frac{c_0}{\mu^2}, \tag{4.3}
\]
where $c_0 = \frac{c_1 + 1}{c_2} + 1$ is a positive constant depending only on $\Lambda$. We will later use $c_0 > 1$ and $c_0 c_2 > 1$. We make the following ansatz for $\phi$ in $[0, \varepsilon]$:
\[
\phi_2(s) = a_2 \Big( 1 + \frac{s}{\varepsilon} \Big)^{2/3} + b_2 \tag{4.4}
\]
for constants $a_2, b_2$ to be determined. We compute that
\[
\phi_2'(s) = \frac{2 a_2}{3\varepsilon} \Big( 1 + \frac{s}{\varepsilon} \Big)^{-1/3}, \tag{4.5}
\]
\[
\phi_2''(s) = -\frac{2 a_2}{9\varepsilon^2} \Big( 1 + \frac{s}{\varepsilon} \Big)^{-4/3} = -\frac{9\varepsilon^2}{8 a_2^3} \phi_2'(s)^4. \tag{4.6}
\]
As we want to have $\phi_2' < 0$, we must choose $a_2 < 0$, which renders $\phi_2''(s) > 0$. So in order to get $\phi_2'(s) \le -\mu$ it is sufficient to take
\[
-\mu = \phi_2'(\varepsilon) = \frac{a_2}{3\varepsilon} 2^{2/3},
\]
as $\phi_2'$ is increasing. This implies
\[
a_2^2 = 2^{-4/3} \, 9 \varepsilon^2 \mu^2. \tag{4.7}
\]

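To satisfy (4.3), we require that
\[
\frac{c_0}{\mu^2} \le \frac{\phi_2''(\varepsilon)}{|\phi_2'(\varepsilon)|^3} = \frac{9\varepsilon^2}{8 a_2^3} \phi_2'(\varepsilon) = \frac{3\varepsilon}{a_2^2} 2^{-7/3}.
\]
This is equivalent to
\[
a_2^2 \le \frac{3\varepsilon\mu^2}{c_0} 2^{-7/3}. \tag{4.8}
\]
Combining with (4.7) we find the condition
\[
9\varepsilon^2\mu^2 2^{-4/3} \le \frac{3\varepsilon\mu^2}{c_0} 2^{-7/3} \tag{4.9}
\]
or
\[
\varepsilon \le \frac{1}{6 c_0}.
\]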
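The algebra leading from the ansatz (4.4) to the bound on $\varepsilon$ can be sanity-checked numerically. The sketch below assumes the borderline choice $\varepsilon = 1/(6c_0)$ and the value of $a_2$ determined by (4.7) with $a_2 < 0$; the function names and the sample values of $c_0$ and $\mu$ are hypothetical.

```python
import math

def phi2_coefficients(c0, mu):
    """Return (eps, a2) with eps = 1/(6*c0) and a2 < 0 solving (4.7)."""
    eps = 1.0 / (6.0 * c0)
    a2 = -3.0 * eps * mu * 2.0 ** (-2.0 / 3.0)
    return eps, a2

def dphi2(s, eps, a2):
    """First derivative of phi_2, see (4.5)."""
    return (2.0 * a2 / (3.0 * eps)) * (1.0 + s / eps) ** (-1.0 / 3.0)

def d2phi2(s, eps, a2):
    """Second derivative of phi_2, see (4.6)."""
    return -(2.0 * a2 / (9.0 * eps ** 2)) * (1.0 + s / eps) ** (-4.0 / 3.0)

def curvature_ratio2(s, eps, a2):
    """The quantity phi_2'' / |phi_2'|^3 appearing in (4.3)."""
    return d2phi2(s, eps, a2) / abs(dphi2(s, eps, a2)) ** 3
```

On a grid in $[0, \varepsilon]$ one can then compare `curvature_ratio2` with $c_0/\mu^2$ and `dphi2` with $-\mu$; with these choices equality in (4.3) is attained at $s = \varepsilon$.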

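Thus we choose $\varepsilon = \frac{1}{6 c_0}$. Note that since $c_0 > 1$, $\varepsilon < \frac16 < \frac12$. Modulo fixing $\mu$ and the vertical shift, we are done with $\phi$ on $(0, \varepsilon)$. Note that $\varepsilon$ does not depend on $\mu$, which is important in view of the fact that we will later choose $\mu$ as a function of $\varepsilon$. Note further that $J[w] \ge \frac{1}{\mu^2}$ on $A_2$ by construction.

For $A_3 := A(\varepsilon, 2\varepsilon)$ we will make the ansatz $w = \phi_3(s)$, with $s \in [\varepsilon, 2\varepsilon)$. As we are in the region $s > \varepsilon$, where $\varepsilon$ has been fixed by the construction in $A_2$, we have $\theta^+ \le -\Lambda^{-1}\varepsilon$, where $\Lambda$ denotes the constant from properties 3 to 5 of the foliation, and thus the first term in (4.1) is estimated by $\kappa := \frac{\varepsilon}{\sqrt 2 \Lambda} > 0$ from below. We can estimate the whole expression as follows:
\[
J[w] \ge \kappa - \frac{c_1}{\mu^2} - c_2 \frac{|\phi_3''(s)|}{|\phi_3'(s)|^3}, \tag{4.10}
\]
where we again assumed $|\phi'(s)| \ge \mu \ge \Lambda$, and $c_1$ and $c_2$ are constants depending only on $\Lambda$. We can ensure that the second term is small, that is
\[
\frac{c_1}{\mu^2} \le \frac{\kappa}{4},
\]
provided
\[
\mu^2 \ge \frac{4 c_1}{\kappa}. \tag{4.11}
\]
It remains to find a function which allows us to choose $\mu$ large while keeping the term
\[
c_2 \frac{|\phi_3''(s)|}{|\phi_3'(s)|^3} < \frac{\kappa}{4}. \tag{4.12}
\]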

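We make the ansatz
\[
\phi_3(s) = a_3 \log\Big( 1 - \frac{s - \varepsilon}{\varepsilon} \Big) + b_3 \tag{4.13}
\]
and compute
\[
\phi_3'(s) = -\frac{a_3}{\varepsilon} \Big( 1 - \frac{s - \varepsilon}{\varepsilon} \Big)^{-1}, \qquad
\phi_3''(s) = -\frac{a_3}{\varepsilon^2} \Big( 1 - \frac{s - \varepsilon}{\varepsilon} \Big)^{-2}.
\]
As we need $\phi_3'(\varepsilon) = -\mu$, to be able to fit $\phi_3$ to $\phi_2$, we compute $-\mu = \phi_3'(\varepsilon) = -\frac{a_3}{\varepsilon}$ or $a_3 = \varepsilon\mu > 0$. Hence $\phi_3''(s) < 0$ and $\phi_3'(s) \le -\mu$ for $s \in (\varepsilon, 2\varepsilon)$, as desired. We still have to fix $\mu$. The goal is to simultaneously satisfy (4.11) and (4.12). Compute
\[
\frac{|\phi_3''(s)|}{|\phi_3'(s)|^3} = \frac{1}{\mu^2 \varepsilon} \Big( 1 - \frac{s - \varepsilon}{\varepsilon} \Big) \le \frac{1}{\mu^2 \varepsilon}.
\]
Thus we can ensure (4.12) provided $\mu^2 \ge \frac{4 c_2}{\varepsilon\kappa}$. With $\Lambda$ denoting the constant from properties 3 to 5 of the foliation, we choose
\[
\mu = \max\Big\{ \Lambda, \ \sqrt{\frac{4 c_1}{\kappa}}, \ \sqrt{\frac{4 c_2}{\varepsilon\kappa}} \Big\},
\]
and are done constructing $\phi_3$ up to fixing $b_3$ in such a way to ensure $\phi_2(\varepsilon) = \phi_3(\varepsilon)$. Note that we have that $\phi_3(s) \to -\infty$ as $s \to 2\varepsilon$, which is the desired behavior. Furthermore we have $J[w] \ge \frac{\kappa}{2} > 0$ in $A_3$.

In the region $A_1 = A(-\delta, 0)$, where $0 < \delta < 1$ will be chosen later, we set $w(s) = \phi_1(s)$. Then we estimate from (4.1) that
\[
J[w] \ge -c_3 + c_4 \frac{\phi_1''(s)}{|\phi_1'(s)|^3}, \tag{4.14}
\]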

where $c_3$ and $c_4 > 0$ are again constants depending only on the constant $\Lambda$ from properties 3 to 5 of the foliation. Here we assumed that $|\phi_1'(s)| \ge \Lambda$ as before. The only chance to get the right hand side of this expression positive is to take $\phi_1(s)$ to be a function with
\[
\frac{\phi_1''(s)}{|\phi_1'(s)|^3} \ge \frac{c_3 + 1}{c_4} =: c_5.
\]
We make the ansatz
\[
\phi_1(s) = a_1 \Big( 1 + \frac{s}{2\delta} \Big)^{1/2} + b_1,
\]
and compute
\[
\phi_1'(s) = \frac{a_1}{4\delta} \Big( 1 + \frac{s}{2\delta} \Big)^{-1/2}, \qquad
\phi_1''(s) = -\frac{a_1}{16\delta^2} \Big( 1 + \frac{s}{2\delta} \Big)^{-3/2}.
\]
We fix $b_1$ such that $\phi_1(-\delta) = 0$. This then fixes $b_2$ and $b_3$ by the requirement that $w$ is continuous on $A(-\delta, 2\varepsilon)$. From the requirement $\phi_1'(0) = \phi_2'(0) =: -\mu'$, we infer that
\[
a_1 = -4\mu'\delta. \tag{4.15}
\]
Recall that $-\mu'$ is fixed and can not be chosen freely. From $\phi_1''(s) > 0$ we find that $|\phi_1'(s)| \ge |\phi_1'(0)| = \mu' = 2^{1/3}\mu \ge \mu = |\phi_2'(\varepsilon)| \ge \Lambda$, so $\phi_1'$ is automatically large enough to justify (4.14). To get positivity of the right hand side of (4.14) we need that
\[
c_5 \le \frac{\phi_1''(s)}{|\phi_1'(s)|^3} = \frac{4\delta}{a_1^2}.
\]
Solving for $a_1^2$ yields the condition
\[
a_1^2 \le \frac{4\delta}{c_5}. \tag{4.16}
\]
As we already fixed $a_1$ in (4.15), we infer the condition
\[
\delta \le \frac{1}{4 c_5 \mu'^2}.
\]
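For the square-root ansatz in $A_1$ the quotient $\phi_1''/|\phi_1'|^3$ is independent of $s$, which is what makes the condition on $\delta$ so simple. A minimal numerical sketch, assuming the borderline choice $\delta = 1/(4 c_5 \mu'^2)$ together with (4.15); the function names and sample values are hypothetical.

```python
import math

def phi1_coefficients(c5, mu_prime):
    """Return (delta, a1) with delta = 1/(4*c5*mu'^2) and a1 = -4*mu'*delta."""
    delta = 1.0 / (4.0 * c5 * mu_prime ** 2)
    a1 = -4.0 * mu_prime * delta
    return delta, a1

def dphi1(s, delta, a1):
    """First derivative of phi_1(s) = a1*(1 + s/(2*delta))**0.5 + b1."""
    return (a1 / (4.0 * delta)) * (1.0 + s / (2.0 * delta)) ** (-0.5)

def d2phi1(s, delta, a1):
    """Second derivative of phi_1."""
    return -(a1 / (16.0 * delta ** 2)) * (1.0 + s / (2.0 * delta)) ** (-1.5)

def curvature_ratio1(s, delta, a1):
    """The quantity phi_1'' / |phi_1'|^3, expected to equal 4*delta/a1**2."""
    return d2phi1(s, delta, a1) / abs(dphi1(s, delta, a1)) ** 3
```

Evaluating `curvature_ratio1` at several points of $[-\delta, 0]$ returns the same value $4\delta/a_1^2$, which equals $c_5$ for this choice of $\delta$.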

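So we fix $\delta = \frac{1}{4 c_5 \mu'^2}$ and are done. Note that $J[w] \ge 1$ by construction.

To summarize, we have constructed a function $w$ on $A(-\delta, 2\varepsilon)$ with the following properties:
(i) $w$ is $C^{1,1}$ up to the boundary in every $A(-\delta, s)$ with $s \in (-\delta, 2\varepsilon)$. Hence $w \in W^{2,\infty} \cap C^{1,1}$ away from $\Sigma_{2\varepsilon}$,
(ii) there exists $\eta > 0$ such that $J[w] \ge \eta$,
(iii) $w \equiv 0$ on $\Sigma_{-\delta}$, $w \le 0$ on $A(-\delta, 2\varepsilon)$,
(iv) there exists $C_1 < \infty$ such that $0 \ge w \ge -C_1$ in $A(-\delta, \varepsilon)$, and
(v) $w|_{\Sigma_s} \to -\infty$ as $s \to 2\varepsilon$.
Here $\eta$ and $C_1$ are constants that only depend on the constant from properties 3 to 5 of the foliation, as do $\delta$ and $\varepsilon$.

With this subsolution $w$, we can get a lower bound of the functions $f_\tau$ solving $J[f_\tau] = \tau f_\tau$ near the unstable component as follows. Set
\[
m := \min\Big\{ \inf_{\Sigma_{-\delta}} f_\tau, \ \frac{\eta}{\tau} \Big\},
\]
and consider the function
\[
w_m := w + m. \tag{4.17}
\]
The goal is to apply the comparison principle for the quasilinear operator $J$ to show that $w_m \le f_\tau$ in $A(-\delta, 2\varepsilon)$. To this end let $U$ be the region where $f_\tau \le m$. From the equation we conclude that
\[
J[f_\tau] = \tau f_\tau \le \tau m \le \eta
\]
in $U$, and moreover $f_\tau = m$ on $\partial U$. As $f_\tau \ge -\frac{C}{\tau}$ is bounded below as in Proposition 3.4, we can choose $\bar s \in (\varepsilon, 2\varepsilon)$ such that $w_m|_{\Sigma_{\bar s}} \le \inf_M f_\tau$.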


Set $V := U \cap A(-\delta, \bar s)$. Then, as $\partial V \subset \partial U \cup \Sigma_{-\delta} \cup \Sigma_{\bar s}$, we find that $w_m \le f_\tau$ on $\partial V$. An application of the comparison principle [GT98, Chap. 10] allows us to conclude that $w_m \le f_\tau$ in $V$ and thus $w_m \le f_\tau$ in $A(-\delta, 2\varepsilon)$. By construction, there is a constant $C_1$ such that $w + C_1 \ge 0$ in $A(-\delta, \varepsilon)$ and hence $m - C_1 \le w_m$ in $A(-\delta, \varepsilon)$. Thus we infer the estimate
\[
f_\tau \ge \min\Big\{ \inf_{\Sigma_{-\delta}} f_\tau, \ \frac{\eta}{\tau} \Big\} - C_1 \quad \text{in } A(-\delta, \varepsilon). \tag{4.18}
\]
We can now conclude the argument. Take the sequence $\tau_i$ and the functions $f_{\tau_i}$ from Proposition 3.8. By construction $f_{\tau_i}$ is uniformly bounded below on $\Sigma_{-\delta}$ as $\Sigma_{-\delta}$ is compactly contained in $\Omega_+ \cup \Omega_0$, hence the term on the right hand side of (4.18) is bounded below as $\tau_i \to 0$. Thus $A(-\delta, \varepsilon) \subset \Omega_+ \cup \Omega_0$, which is a contradiction, since we assumed that the unstable component, which lies in $A(-\delta, \varepsilon)$, was a boundary component of $\partial \Omega_-$. This concludes the proof of Theorem 4.1.

5. Weak Barriers

In this section we will slightly improve Theorem 3.1 to allow interior boundaries where we just have the weak inequality $\theta^+[\partial^- M] \le 0$, instead of the strict inequality assumed in Theorem 3.1.

Theorem 5.1. Let $(M, g, K)$ be a smooth, compact initial data set with $\partial M$ the disjoint union $\partial M = \partial^- M \cup \partial^+ M$ such that $\partial^\pm M$ are non-empty, smooth, compact surfaces without boundary and $\theta^+[\partial^- M] \le 0$ with respect to the normal pointing into $M$ and $\theta^+[\partial^+ M] > 0$ with respect to the normal pointing out of $M$. Then there exists a smooth, embedded, stable MOTS $\Sigma \subset M$ homologous to $\partial^+ M$.

$\Sigma$ may have components which agree with components of $\partial^- M$ that satisfy $\theta^+ = 0$. In this case we can not use the strong maximum principle to exclude that $\Sigma$ touches $\partial^- M$ as in Lemma 2.7. For the proof of Theorem 5.1 we shall need the following lemma.

Lemma 5.2. Let $\Sigma$ be a connected, two-sided, compact, embedded surface with $\theta^+ \le 0$ and $\theta^+ \not\equiv 0$. Then for every $\varepsilon > 0$ there exists a smooth, embedded surface in the $\varepsilon$-neighborhood of $\Sigma$, which lies to the outside of $\Sigma$ but does not touch $\Sigma$, is a graph over $\Sigma$, and satisfies $\theta^+ < 0$.

Proof. Consider the following equation for a function $F : \Sigma \times [0, \bar s) \to M$:
\[
\begin{cases}
\dfrac{dF}{ds} = -\theta^+ \nu, \\[1ex]
F(\cdot, 0) = \mathrm{id}.
\end{cases}
\tag{5.1}
\]
Here, $\nu$ is the outer normal as usual. This is a weakly parabolic equation for $F$, in fact it is a generalization of the mean curvature flow. To see this, recall that
\[
\theta^+ = H + P,
\]


where $H$ is the mean curvature, and $P = \operatorname{tr}^M K - K(\nu, \nu)$ is a term only depending on first derivatives of $F$. Thus the flow in Eq. (5.1) is
\[
\frac{dF}{ds} = -H\nu - \text{lower order}.
\]
Hence it has the same symbol as the mean curvature flow and thus is a quasilinear parabolic equation. The theory of parabolic equations guarantees the existence of a solution for a small time interval $[0, \bar s)$, see for example [HP99, Sect. 7]. Furthermore, any surface $\Sigma_s = F(\Sigma, s)$ for $s \in (0, \bar s)$ is smooth.

From a standard argument using the strong maximum principle we conclude that $\theta^+ < 0$ instantly. To see this, recall that the evolution equation for $\theta^+$ has the form
\[
\frac{\partial\theta^+}{\partial s} = -L_s \theta^+ = \Delta\theta^+ - 2S(\nabla\theta^+) - \theta^+ Q,
\]
where $L_s$ is the linearization of $\theta^+$ along $\Sigma_s$, with
\[
Q = \operatorname{div} S - \tfrac12 |\chi^+|^2 - |S|^2 + \tfrac12 \mathrm{Sc} - \mu + J(\nu) - \tfrac12 (\theta^+)^2 + \theta^+ \operatorname{tr} K,
\]
where all geometric quantities are computed on $\Sigma_s$. Note that $L_s$ equals $L_M$ on MOTS. By smoothness we have that $Q$ is bounded for a short time, whence we can choose
\[
a > \max_{s \in [0, \bar s/2], \, x \in \Sigma_s} |Q(x, s)|.
\]
Let $u = e^{-as}\theta^+$ and compute
\[
\Big( \frac{\partial}{\partial s} - \Delta \Big) u = -2S(\nabla u) - (Q + a) u.
\]
The coefficient of the zeroth order term is negative. Hence the strong maximum principle from [Lie96] is applicable to $u$ and implies that $u$ instantly becomes negative, implying that $\theta^+$ instantly becomes negative.

If $s$ is small enough, $\Sigma_s$ will also be embedded. As $\theta^+ \le 0$, the flow (5.1) moves the surface in the direction of $\nu$ everywhere, and hence outward, in particular $\Sigma_s \cap \Sigma = \emptyset$. As the initial speed is given by $|\theta^+|$, which is bounded, the surfaces $\Sigma_s$ will be arbitrarily close to $\Sigma$, as long as $s > 0$ is small enough. Hence we can choose the desired surface to be one of the $\Sigma_s$.

Proof of Theorem 5.1. The main difficulty here is that $\partial^- M$ may have multiple connected components $\partial^- M = \Sigma_1 \cup \ldots \cup \Sigma_N$ where some of the $\Sigma_k$ satisfy $\theta^+ = 0$, to which we can not apply Lemma 5.2 directly. Lemma 5.2 allows us to flow the boundary components $\Sigma_k$ with $\theta^+ \le 0$ and $\theta^+ \not\equiv 0$ in the direction of their outer normal $\nu$, that is into $M$, to replace $M$ by a manifold $M_1$ which is such that $\partial^- M_1$ is still embedded and each component of $\partial^- M_1$ either has $\theta^+ < 0$ or $\theta^+ = 0$.

As the boundary components with $\theta^+ = 0$ do not allow the application of Theorem 3.1, we have to tweak them a little. Pick one such component $\Sigma$ of $\partial^- M$ with $\theta^+[\Sigma] = 0$, then there are three cases. Either, as a MOTS, $\Sigma$ is not stable, $\Sigma$ is stable but not strictly stable, or $\Sigma$ is strictly stable.

When $\Sigma$ is not stable, let $\phi > 0$ be an eigenfunction for the principal eigenvalue $\lambda < 0$ for the operator $L_M$ on $\Sigma$. Extend the vector field $\phi\nu$ to a neighborhood of $\Sigma$ and


flow for a short time interval along this vector field. This yields a foliation {s }s∈[0,ε) of a neighborhood of , such that 0 = and s lies inside of M and has θ + < 0 when s > 0. Hence, we push a little inward and obtain a strictly trapped surface. In the other two cases we need to flow the components with respect to the vector field −φν, where φ > 0 is again the principal eigenfunction of L M on . So we have to assume that there is an extension (M , g , K ) of (M, g, K ) with M ⊂ M , g = g | M and K = K | M such that ∂ − M lies in the interior of M . Such an extension can be constructed by simply gluing [0, 1] × ∂ − M to M along ∂ − M and smoothly extending g and K to the added piece. Keeping this in mind, we can now move the other boundary components inwards in the following way. If is strictly stable, then by flowing in the direction −φν, we construct a foliation {s }s∈(−ε,0] of a neighborhood of , such that 0 = and s lies in the direction −ν, that is outside of M and has θ + < 0 if s < 0. We choose one of the s as a new inner boundary. We will later use the fact that the region between the former boundary and the new boundary s is foliated by surfaces with θ + < 0 to ensure that the constructed MOTS does not enter this region. The last case is where is stable but not strictly stable. In this case we also flow in the direction −φν and construct a foliation {s }s∈(−ε,0] of a neighborhood of , such that 0 = and s lies outside of M and ∂ θ + [s ] = 0. (5.2) ∂s s=0 We will change the data K along the surfaces s by replacing K by K˜ = K − 21 ψ(s)h s , where h s is the metric on s and ψ : R → R is a C 1 function with ψ(s) = 0 for s > 0. Note that θ˜ + [s ], which means the quantity θ + computed with respect to the new data (M , g , K˜ ), satisfies θ˜ + [s ] = θ + [s ] − ψ(s). As θ + [s ] vanishes to first order in s at s = 0 by (5.2), we can extend ψ as a C 1,1 function to M˜ such that θ + < 0 on all s , if s < 0 is close enough to zero. 
Hence, this case is similar to the strictly stable case. It is clear that we can choose s in such a way that K˜ C 1 ( M) ˜ ≤ 2K C 1 (M) . ˜ g, In summary, by this construction we have replaced (M, g, K ) by a manifold ( M, ˜ K˜ ) which are both embedded in a data set (M , g , K ). The outer boundaries of M and M˜ ˜ < 0. The data K˜ agree and have θ + > 0, while the inner boundary of M˜ has θ + [∂ − M] ˜ is C 1,1 in M. The set U := M\ M˜ ⊂ M , corresponding to the boundary components we moved ˜ K˜ ). out of M, is foliated by surfaces s with θ + [s ] < 0 with respect to the data (g, ˜ We can now invoke Theorem 3.1 to find a smooth, embedded, stable MOTS in M, ˜ Note that it is only necessary to assume K ∈ C 1,α which bounds with respect to ∂ − M. for some 0 < α ≤ 1 for the theorem to apply. If one of the components of enters U , say the component U of U , then let s¯ := min{s : s ∩k = ∅}, where the s constitute the foliation of U by strictly trapped surfaces, as above. At the point where the minimum is assumed, the outward normals of and s¯ point into the same direction, and hence the strong maximum principle implies that k = s¯ , a contradiction. Thus ∩ U = ∅,


Fig. 2. The δ-standard neck

and $\Sigma \subset M$ is the desired solution. Note that some components of $\Sigma$ might agree with components of $\partial^- M$ which have $\theta^+ = 0$. The assertion that $\Sigma$ is stable then follows from Theorem 4.1.

As an immediate consequence of Theorem 5.1, we infer the following corollary.

Corollary 5.3. Let $(M, g, K)$ be such that $\partial M$ is the disjoint union $\partial M = \partial^- M \cup \partial^+ M$, where $\partial^+ M$ is non-empty with $\theta^+[\partial^+ M] > 0$ and $\partial^- M$ is possibly empty. If $\Sigma$ is an outermost MOTS homologous to $\partial^+ M$, then there do not exist outer trapped surfaces enclosing $\Sigma$. In particular, $\Sigma$ is a stable MOTS.

6. Surgery

In this section we describe a surgery procedure to construct an outer trapped surface outside of a MOTS $\Sigma$ with small $i^+(\Sigma)$ and bounded curvature. In view of the existence part in Theorem 5.1, we infer a lower bound on $i^+(\Sigma)$ for outermost MOTS. This implies an area estimate. Moreover, the surgery procedure guarantees that a fixed amount of the volume outside of $\Sigma$ is consumed. By iterating the surgery procedure and application of Theorem 5.1, we then infer that after a finite number of steps we arrive at a MOTS outside of $\Sigma$ with a lower bound on its $i^+$. As usual, we assume that $\Sigma$ is homologous to $\partial^+ M$ and denote the region bounded by $\Sigma$ and $\partial^+ M$, that is the outside of $\Sigma$, by $\Omega$.

6.1. Neck geometry. The surgery procedure works by inserting necks with negative $\theta^+$. We start by constructing a suitable neck in Euclidean space, and transfer it to the geometry of $M$ in normal coordinates. Let $\delta > 0$ and consider the map
\[
F : [0, 2\pi] \times [-\tfrac{\pi}{2}, \tfrac{\pi}{2}] \to \mathbb{R}^3 : (\phi, \theta) \mapsto
\begin{pmatrix} \delta \sin\phi \, (3 - \cos\theta) \\ \delta \cos\phi \, (3 - \cos\theta) \\ \delta \sin\theta \end{pmatrix}.
\]
The image of $F$ is shown in Fig. 2; we will call it the $\delta$-standard neck. Denote by the interior $I_\delta$ of the neck the points $(x^1, x^2, x^3)$ with $x^3 \in (-\delta, \delta)$, $x^3 = \delta \sin\theta$ and $(x^1)^2 + (x^2)^2 \le \delta^2 (3 - \cos\theta)^2$.


Fig. 3. Selecting the points p and q where a ball Bδ (O) touches 3

Clearly, the open ball $B_\delta^{\mathbb{R}^3}(0)$ is contained in $I_\delta$. The Euclidean mean curvature of the standard neck with respect to the normal pointing out of $I_\delta$ is

$$H^e = -\delta^{-1}\bigl(1 - (3 - \cos\theta)^{-1}\cos\theta\bigr) \le -(2\delta)^{-1}.$$

Thus the Euclidean mean curvature of the $\delta$-standard neck can be arbitrarily negative if $\delta$ is chosen small enough. Let $r_0$ be such that at any point $O \in M$ with $\mathrm{dist}(O, \partial M) \ge \rho(M, g, K; \partial M)/2$ we have geodesic normal coordinates $\{x^i\}$ such that for $r \le r_0$ we have

$$r^{-2}|g_{ij} - \delta_{ij}| + r^{-1}|\partial_k g_{ij}| + |\partial_k \partial_l g_{ij}| \le C,$$

where $r$ is the Euclidean distance in $x$-coordinates. Then, the image of the standard neck in these coordinates will have $H < -(4\delta)^{-1}$ if $\delta < r_0$ is small enough. Thus, choosing $\delta^{-1}$ large compared to $\|K\|_{C^0(M)}$, we can ensure that the $\delta$-standard neck has $\theta^+ < 0$.

6.2. Point selection. The goal is to consume a fixed amount of volume by application of the surgery. To this end, we have to insert a neck with $\delta$ bounded away from zero in terms of the geometry of $M$. Hence, it is not sufficient to do surgery at the points $p, q$ which realize $i^+(\Sigma)$. Instead, we have to find points $p, q \in \Sigma$ such that there is a point $O$ with $\mathrm{dist}(O, \partial M) \ge \rho(M, g, K; \partial M)/2$ such that $B_\delta^M(O)$ touches $\Sigma$ at $p$ and $q$, and the angle of the segments joining $O$ to $p$ and $q$ at $O$ is close to $\pi$. These points $p, q, O$ can be found as follows.

Let $r_0$ be as above. There exist $r_1 < r_0$ and $C > 0$ depending only on $\|{}^M\mathrm{Rm}\|_{C^0}$, such that $\partial B_r^M(O)$ has a second fundamental form $A_r \ge \frac{C}{r}\gamma_r$, where $\gamma_r$ is the induced metric on $\partial B_r^M(O)$ (use the Hessian comparison theorem for the distance function to $O$ [SY94]). Furthermore, there exists $0 < r_2 < r_1/2$, depending additionally on $\sup_\Sigma |A|$, with the following property. If $O$ and $r < r_2$ are such that $\partial B_r^M(O)$ touches $\Sigma$ at $p$, then the $\Sigma$-ball $B^\Sigma_{r_2}(p)$ does not intersect the interior of $B_r^M(O)$. The important point to note is that the radius of the $\Sigma$-ball does not depend on $r$.


Now fix $r < r_2$ and consider the set $U_r \subset \Sigma$ of points which can be touched by a ball which lies completely outside of $\Sigma$, that is,

$$U_r := \{p \in \Sigma : \exists\, O \in \Omega \text{ s.t. } B_r^M(O) \subset \Omega \text{ and } p \in \partial B_r^M(O)\}.$$

Clearly $U_r$ is non-empty if $2r < \mathrm{dist}(\Sigma, \partial^+ M)$, as then the point $p_1 \in \Sigma$ which realizes $\mathrm{dist}(\Sigma, \partial^+ M)$ is in $U_r$. Let $\Sigma_1$ be the component of $\Sigma$ containing $p_1$. If $\Sigma_1 \subset U_r$, then $\mathrm{dist}(\Sigma_1, \Sigma \setminus \Sigma_1) \ge 2r$. We then select $p_2 \in \Sigma \setminus \Sigma_1$ such that $p_2$ realizes the distance $\mathrm{dist}(\Sigma \setminus \Sigma_1, \partial^+ M \cup \Sigma_1)$; clearly $p_2 \in U_r$. We can continue this process until either we found a component $\Sigma_k$ of $\Sigma$ with $\Sigma_k \not\subset U_r$ and $U_r \cap \Sigma_k \ne \emptyset$, or we showed that $\Sigma = U_r$. However, the latter cannot happen if $i^+(\Sigma) < r$, as the points $p, q$ from Lemma 2.13 are not in $U_r$.

Thus, there is a component $\Sigma_k$ of $\Sigma$ which contains a point $p \in \partial U_r$, the boundary of $U_r$ relative to $\Sigma$. As $U_r$ is closed in $\Sigma$, there exists $O \in \Omega$ such that $p \in \partial B_r^M(O)$ and $B_r^M(O) \subset \Omega$. We claim that there exists $q \in \Sigma \cap \partial B_r^M(O)$, $q \ne p$. This $q$ can be constructed as follows. Choose a sequence of points $p_k \in \Sigma \setminus U_r$ with $p_k \to p$. Consider the geodesic normal to $\Sigma$ emanating from $p_k$ outward. Let $O_k$ be the point at distance $r$ from $p_k$ on this geodesic. As $p_k$ is not in $U_r$, the ball $B_r(O_k)$ intersects $\Sigma$ in a point $q_k$ with $\mathrm{dist}(q_k, O_k) < r$ and $\mathrm{dist}_\Sigma(p_k, q_k) \ge r_2$, by our choice of $r$. By compactness we can assume that the $q_k$ converge to $q$ with $\mathrm{dist}(q, O) \le r$ and $\mathrm{dist}_\Sigma(p, q) \ge r_2$. As $p \in U_r$, the open ball $B_r^M(O)$ does not intersect $\Sigma$ and thus $\mathrm{dist}(q, O) = r$.

Thus we find that, if $r < r_2$ and $i^+(\Sigma) < r$, there exist points $p \ne q \in \Sigma$ and $O \in \Omega$ such that $p, q \in \partial B_r(O)$. Denote the geodesic segment joining $O$ and $p$ by $\gamma_p$ and the segment joining $O$ and $q$ by $\gamma_q$. We now want to show that the angle between $\gamma_p$ and $\gamma_q$ at $O$ is close to $\pi$ if $r$ is small enough. Consider geodesic normal coordinates around $O$. Then the segments $\gamma_p$ and $\gamma_q$ are straight lines emanating from $O$. Let $L_p$ be the plane orthogonal to $\gamma_p$ through $p$.
As the curvature of $\Sigma$ is bounded, $B^\Sigma_{r_3}(p)$ is the graph of a function $u_p$ over $L_p$ with

$$r^{-2}|u_p| + r^{-1}|\partial_k u_p| + |\partial_k \partial_l u_p| \le C \tag{6.1}$$

for $r < r_3$, where $r_3 > 0$ and $C < \infty$ depend only on $\mathrm{inj}_\rho(M, g, K; \partial M)^{-1}$, $\|{}^M\mathrm{Rm}\|_{C^0(M)}$ and $\sup_\Sigma |A|$. In particular, $B^\Sigma_{r_3}(p)$ is contained in a small tubular neighborhood of $L_p$. Similarly, $B^\Sigma_{r_3}(q)$ is contained in a neighborhood of $L_q$. Let $\alpha$ be the angle of $\gamma_p$ and $\gamma_q$ at $O$. We claim that for each $\eta > 0$ there exists $r > 0$ such that $|\alpha - \pi| < \eta$. Otherwise, if $\alpha$ is not close to $\pi$, the planes $L_p$ and $L_q$ intersect at distance $d$ with

$$d = \frac{r}{\cos(\alpha/2)} \le \frac{r}{\varepsilon}.$$

Thus, choosing $r$ small enough, we can make $L_p$ and $L_q$ intersect within $d \le r_3/2$. This implies that $B^\Sigma_{r_3}(p)$ and $B^\Sigma_{r_3}(q)$ must also intersect. This is a contradiction, as $\Sigma$ is assumed to be embedded.
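The distance at which the two planes meet can be checked in the flat model. The sketch below is an illustration under the assumption that the normal coordinates around $O$ are treated as exactly Euclidean: it places $p$ and $q$ on a sphere of radius $r$ at angle $\alpha$, solves for the point of $L_p \cap L_q$ closest to $O$, and returns its distance from $O$, which should equal $r/\cos(\alpha/2)$. The function name is hypothetical.

```python
import math

def tangent_plane_intersection_distance(r, alpha):
    """Distance from O to the line L_p ∩ L_q in flat space, for 0 < alpha < pi."""
    # Unit directions of gamma_p and gamma_q inside the 2-plane they span.
    e_p = (1.0, 0.0)
    e_q = (math.cos(alpha), math.sin(alpha))
    # The closest point x = a*e_p + b*e_q of the intersection line lies in that
    # 2-plane and satisfies <x, e_p> = r and <x, e_q> = r, i.e.
    #   a + b*c = r,   a*c + b = r,   with c = cos(alpha).
    c = math.cos(alpha)
    det = 1.0 - c * c
    a = (r - c * r) / det
    b = (r - c * r) / det
    x = (a * e_p[0] + b * e_q[0], a * e_p[1] + b * e_q[1])
    return math.hypot(x[0], x[1])
```

As $\alpha \to \pi$ the planes become parallel and the returned distance grows without bound, while for $\alpha$ bounded away from $\pi$ it stays comparable to $r$, which is the dichotomy used in the argument above.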

6.3. Surgery. With the previous preparations, we can carry out the surgery procedure. We choose r so small that the above considerations apply, giving the following properties:

1. The $(2\delta)$-standard neck in normal coordinates around any point $O \in M$ with $\mathrm{dist}(O, \partial M) > \mathrm{inj}_\rho(M, g, K; \partial M)$ has $\theta^+ < 0$ in $(M, g, K)$.
2. The $M$-ball $B_\delta^M(O)$ is contained in the interior of the image of the $(2\delta)$-standard neck.
3. If $i^+(\Sigma) < \delta$, then there exist points $p, q \in \Sigma$ and $O \in \Omega$ such that $B_\delta(O) \subset \Omega$ and $p, q \in \partial B_\delta(O)$.
4. The angle $\alpha$ of $\gamma_p$ and $\gamma_q$ at $O$ satisfies $|1/\cos\alpha + 6\tan\alpha| \le 3/2$.


Fig. 4. The surgery in geodesic normal coordinates

Now assume that i + () < δ and pick p, q, O as in Condition 3 above, and consider geodesic normal coordinates around O such that γq lies on the negative x 3 -axis. Let N be the image of the (2δ)-neck centered at O with its axis aligned with the x 3 -coordinate axis, as in Fig. 4. Condition 4 on α implies that the plane L p is such that 3 3 L p ∩ {(x 1 )2 + (x 2 )2 ≤ 6δ} ⊂ {− δ ≤ x 3 ≤ δ]}. 2 2 Recall that the component p of ∩ {−2δ ≤ x 3 ≤ 2δ} containing p is the graph over L p of a function u p with r −2 u p + r −1 |∂k u p | + |∂k ∂l u p | ≤ C, where C is as in Eq. (6.1). Thus, we can choose δ, depending only on C so small, that first p ⊂ {−2δ ≤ x 3 ≤ 2δ}, and second p and N intersect transversely (note that the angle of and L p is of order δ, whereas the angle between the neck and L p is uniformly bounded away from zero). We can similarly argue for q , so that we find that Fig. 4 is indeed accurate. The surgery can now be performed as follows. Let p be the component of \N that contains p and q be the component that contains q. Let N be the component of N \ between p and q . Construct a non-smooth surface N by removing p and q and adding N . By construction this surface is homologous to , and hence to ∂ + M. By Condition 1, we find that the inserted neck has θ + < 0. Condition 2 implies that Bδ (O) is indeed contained in the neck we added. Furthermore, at the corner ∩ N , the normals ν N of N and ν of enclose an angle < π . We proceed by using Lemma 2.14 to smooth out this corner, thereby constructing a surface . This lies outside of N , and agrees with N except in an arbitrarily small neighborhood of the corner and has θ + ≤ 0 and θ + ≡ 0. Note that in particular, the component of , which contains part of N has θ + < 0 somewhere. 6.4. Results. By the previous surgery procedure we arrive at the following proposition: Proposition 6.1. Let (M, g, K ) be a data set such that ∂ M is the disjoint union ∂ M = ∂ + M ∪ ∂ − M of smooth compact surfaces without boundary. 
Assume that θ + (∂ + M) > 0 and if ∂ − M is non-empty, that θ + (∂ − M) < 0.


There exists $\delta > 0$ depending only on $\mathrm{inj}_\rho(M, g, K; \partial M)^{-1}$, $\|{}^M\mathrm{Rm}\|_{C^0}$ and $\|K\|_{C^1}$ with the following property. If $\Sigma \subset M$ is a stable MOTS, homologous to $\partial^+ M$, bounding $\Omega$ together with $\partial^+ M$, and $i^+(\Sigma) < \delta$, then there exists a MOTS $\Sigma'$ outside of $\Sigma$, homologous to $\partial^+ M$ and bounding $\Omega'$ together with $\partial^+ M$, such that

$$\mathrm{Vol}(\Omega') \le \mathrm{Vol}(\Omega) - v_0,$$

where $0 < v_0 := \inf\{\mathrm{Vol}\, B_\delta^M(p) : \mathrm{dist}(p, \partial M) \ge \delta\}$.

Proof. The fact that $\Sigma$ is stable yields a curvature bound in view of Theorem 2.10. Then the above surgery procedure can be applied to construct $\Sigma'$.

An immediate corollary of the above proposition is the following.

Corollary 6.2. Let $(M, g, K)$ and $\delta$ be as in Proposition 6.1. If $\Sigma$ is an outermost MOTS in $M$, then $i^+(\Sigma) \ge \delta$.

Proof. If $i^+(\Sigma) < \delta$, then Proposition 6.1 guarantees the existence of a barrier surface outside of $\Sigma$, and Theorem 5.1 implies the existence of a MOTS outside of $\Sigma$. Thus $\Sigma$ is not outermost.

More importantly, as already indicated, the fact that a surgery takes away a uniform amount of volume gives a finiteness result, which allows us to prove the following theorem.

Theorem 6.3. Let $(M, g, K)$ be a data set such that $\partial M$ is the disjoint union $\partial M = \partial^+ M \cup \partial^- M$ of smooth compact surfaces without boundary. Assume that $\theta^+(\partial^+ M) > 0$ and, if $\partial^- M$ is non-empty, that $\theta^+(\partial^- M) < 0$. Let $\delta$ be as in Proposition 6.1. If $\Sigma \subset M$ is a MOTS homologous to $\partial^+ M$, then there exists a stable MOTS $\Sigma'$, with $i^+(\Sigma') \ge \delta$, such that $\Sigma'$ does not intersect the region bounded by $\Sigma$ (and $\partial^- M$ if non-empty).

Proof. If $\Sigma$ is not stable we use Theorem 5.1 with inner boundary $\Sigma$ to construct a stable MOTS $\Sigma_1$ outside of $\Sigma$. If $i^+(\Sigma_1) < \delta$, then Proposition 6.1 applies and yields a barrier outside of $\Sigma_1$ which can be fed into Theorem 5.1 to construct a stable MOTS $\Sigma_2$ outside of $\Sigma_1$. The region bounded by $\Sigma_1$ and $\Sigma_2$ has volume bounded below by $v_0$, where $v_0$ is from Proposition 6.1. If $i^+(\Sigma_2) < \delta$, we can iterate.
As each step consumes at least volume $v_0$ outside of $\Sigma$, this procedure must end after a finite number of steps with a surface $\Sigma_k$ with $i^+(\Sigma_k) \ge \delta$.

A lower bound on $i^+(\Sigma)$ can be used to estimate the area of $\Sigma$. This area estimate is crucial to get the compactness of the class of stable MOTS with $i^+(\Sigma)$ bounded below.

Proposition 6.4. Let $(M, g)$ be a compact Riemannian manifold with boundary, and $\Sigma \subset M$ an embedded, two-sided surface with bounded curvature $|A| \le C$. Let $\delta := \min\{i_0^+(\Sigma), i^+(\Sigma)\}$. Then there exists an absolute constant $c$ such that the following area estimate holds:

$$|\Sigma| \le c\,\bigl(\delta^{-1} + \sup_\Sigma |A|\bigr)\,\mathrm{Vol}(M). \tag{6.2}$$


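Proof. Let $\nu$ be the outward pointing normal to the surfaces $\Sigma_s := G(\Sigma, s)$ for $s \in [0, \delta]$, where $G$ is as in Eq. (2.1). Then ${}^M\mathrm{div}(\nu) = H_s$, where $H_s$ denotes the mean curvature of $\Sigma_s$. As $\delta \le i_0^+(\Sigma)$, the estimate

$$|{}^M\mathrm{div}\,\nu| \le 2 \sup_{\Sigma_s} |A| \le 4 \sup_\Sigma |A|$$

follows from the definition of $i_0^+(\Sigma)$ (which has the bound on $\sup_{\Sigma_s} |A|$ built in). Let $\phi$ be a cut-off function with $\phi(s) = 1$ for $s \le \delta/4$, $\phi = 0$ for $s \ge \delta/2$ and $|\phi'(s)| \le 8\delta^{-1}$. Using the divergence theorem for the vector field $N = -\phi(s)\nu$ in the volume $U := G(\Sigma, [0, \delta))$, whose outward normal along $\Sigma$ is $-\nu$, we infer that

$$|\Sigma| = \int_\Sigma \langle N, -\nu \rangle\, d\mu = \int_U {}^M\mathrm{div}\,N \le \mathrm{Vol}(U)\, \sup_U |{}^M\mathrm{div}\,N|.$$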
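The divergence identity in the preceding display can be sanity-checked in a model case. The sketch below is an illustration only and rests on assumptions that are not in the text: $\Sigma$ is a round sphere of radius $R$ in Euclidean $\mathbb{R}^3$, $\Sigma_s$ is the sphere of radius $R + s$ (so $\operatorname{div}\nu = 2/(R+s)$ and $\operatorname{div} N = -\phi'(s) - \phi(s)\operatorname{div}\nu$), and $\phi$ is replaced by a piecewise-linear cut-off. It integrates $\operatorname{div} N$ over the shell $U$ with a midpoint rule; the result should reproduce $|\Sigma| = 4\pi R^2$.

```python
import math

def cutoff(s, delta):
    """Piecewise-linear stand-in for phi: 1 on [0, delta/4], 0 beyond delta/2."""
    if s <= delta / 4:
        return 1.0
    if s >= delta / 2:
        return 0.0
    return 1.0 - 4.0 * (s - delta / 4) / delta

def cutoff_prime(s, delta):
    """Derivative of the stand-in cut-off away from its two kinks."""
    return -4.0 / delta if delta / 4 < s < delta / 2 else 0.0

def div_N(s, R, delta):
    """div N for N = -phi(s) * nu around a sphere, where div(nu) = 2 / (R + s)."""
    return -cutoff_prime(s, delta) - cutoff(s, delta) * 2.0 / (R + s)

def integral_div_N(R, delta, n=4000):
    """Midpoint rule for the integral of div N over U = {R <= |x| < R + delta}.

    n is a multiple of 4 so the kinks of the cut-off fall on cell boundaries.
    """
    h = delta / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += div_N(s, R, delta) * 4.0 * math.pi * (R + s) ** 2 * h
    return total
```

In this model $|\operatorname{div} N| \le 8\delta^{-1} + 4\sup|A|$ with $|A| = \sqrt{2}/R$, so the right-hand side $\mathrm{Vol}(U)\sup|\operatorname{div} N|$ indeed dominates $4\pi R^2$, consistent with an estimate of the form (6.2).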

This yields the desired area estimate.

As outermost MOTS are stable, and thus have bounded curvature, we can combine this proposition with Corollary 5.3 to infer the following area bound for outermost MOTS.

Theorem 6.5. Let $(M, g, K)$ be a smooth, compact initial data set with $\partial M$ the disjoint union $\partial M = \partial^- M \cup \partial^+ M$, where $\partial^+ M$ is non-empty and has $\theta^+[\partial^+ M] > 0$, and $\theta^+[\partial^- M] < 0$ if $\partial^- M$ is non-empty. Then, if $\Sigma$ is an outermost MOTS, we have the estimate $|\Sigma| \le C$, where $C$ depends only on $\|{}^M\mathrm{Rm}\|_{C^0(M)}$, $\|K\|_{C^1(M)}$, $\mathrm{inj}_\rho(M, g, K; \partial M)^{-1}$, and $\mathrm{Vol}(M)$.

As the proof of the previous theorem does not assume that $\Sigma$ is connected, it also implies a bound on the number of components of an outermost MOTS.

Corollary 6.6. Let $(M, g, K)$ be as above. Then there exists a constant $N$, depending only on $\|{}^M\mathrm{Rm}\|_{C^0(M)}$, $\|K\|_{C^1(M)}$, $\mathrm{inj}_\rho(M, g, K; \partial M)^{-1}$, and $\mathrm{Vol}(M)$ such that any outermost MOTS has at most $N$ components.

Proof. Since outermost MOTS are stable, their curvature is bounded in view of Theorem 2.10. This implies a lower bound on the area of each component. From Theorem 6.5 we thus infer a bound on the number of components.

7. The Trapped Region

In this section we examine the weakly outer trapped region, or more precisely the boundary of the weakly outer trapped region. We make the usual assumptions on $(M, g, K)$, that is, $(M, g, K)$ is a smooth initial data set with $\partial M$ the disjoint union $\partial M = \partial^- M \cup \partial^+ M$, where $\partial^- M$ may be empty, but $\partial^+ M$ is non-empty, such that $\partial^\pm M$ are smooth, compact surfaces without boundary and $\theta^+[\partial^- M] < 0$ with respect to the normal pointing into $M$ and $\theta^+[\partial^+ M] > 0$ with respect to the normal pointing out of $M$. The definition of a trapped set and the trapped region below make sense only if $\theta^+[\partial^- M] < 0$. However, we can circumvent this requirement for the main theorem as discussed in Remark 7.4 below. To define the weakly outer trapped region, we introduce the notion of a weakly outer trapped set.


Definition 7.1. An open set $\Omega \subset M$ with smooth embedded boundary $\partial\Omega$ is called a weakly outer trapped set if $\partial\Omega$ is the disjoint union $\partial\Omega = \partial^- M \cup \partial^+\Omega$, where $\partial^+\Omega$ is a smooth, compact surface without boundary and $\theta^+[\partial^+\Omega] \le 0$ with respect to the normal pointing out of $\Omega$. Note that $\partial^+\Omega$ is homologous to $\partial^+ M$ in this definition.

Definition 7.2. The weakly outer trapped region is the union of all weakly outer trapped sets enclosing $\partial^- M$:

$$T := \bigcup_{\Omega \text{ is outer trapped}} \Omega. \tag{7.1}$$

We will henceforth refer to T simply as the trapped region. If ∂ − M is non-empty, then the trapped region is non-empty as well, but if ∂ − M is empty it might happen that T is empty. In this case the statements below are void. Let ∂ − T := ∂ T ∩ ∂ − M and ∂ + T = ∂ T \∂ − M. The definition of T is analogous to the set out,M in [KH97, Def. 3]. It is known in the literature that provided ∂ + T is smooth, it satisfies θ + = 0 [HE73,KH97]. The most general result about ∂ + T we are aware of is [KH97, Prop. 7], which asserts that if ∂ + T is C 0 and piecewise smooth, then it is smooth and satisfies θ + = 0. In contrast, we do not assume any initial regularity for ∂ + T for the following theorem. Theorem 7.3. Let (M, g, K ) be such that ∂ M is the disjoint union ∂ M = ∂ + M ∪ ∂ − M such that θ + [∂ − M] < 0 if ∂ − M is non-empty, and ∂ + M is non-empty and has θ + [∂ + M] > 0. Let T be the trapped region in M. If T is non-empty, then ∂ T is the disjoint union ∂ T = ∂ − T ∪ ∂ + T of smooth, compact surfaces without boundary, where ∂ − T = ∂ − M and ∂ + T is a smooth stable MOTS homologous to ∂ + M. Remark 7.4. If (M , g , K ) is a data set where ∂ − M is only a weak barrier θ + [∂ − M ] ≤ ˜ g, ˜ K˜ ) such that ∂ − M˜ is a strong barrier 0, then (M , g , K ) can be modified to ( M, + − ˜ θ [∂ M] < 0. This construction was already used in Sect. 5. The trapped region T˜ ⊂ M˜ of this extension is such that ∂ + T˜ ⊂ M , that is, it lies in M , since the region bounded by ∂ − M˜ and ∂ − M is a trapped set. However, it might be possible that ∂ + T˜ ∩∂ − M = ∅. In this case the intersection ∂ + T˜ ∩ ∂ − M is a sub-collection of the components of ∂ − M which are stable MOTS. Remark 7.5. If the dominant energy condition holds, then ∂ + T is a collection of spheres or tori [HE73,AK03,GS06]. The proof is along the lines of [HI01, Sect. 4]. 
Before we begin the proof of the theorem we prove some lemmas, which essentially replace the maximum principle, which is not as powerful for MOTS, as it is for minimal surfaces. Lemma 7.6. Let (M, g, K ) be an initial data set as in Theorem 7.3. Let 1 ⊂ M and 2 ⊂ M be open sets such that ∂ i is the disjoint union ∂ i = ∂ − M ∪ ∂ + i , where i j i is the union of disjoint, stable, con∂ + i is smooth, embedded, and ∂ + i = Nj=1 nected MOTS i , i = 1, 2. Then for any δ > 0, there exists 1 ⊂ 1 and data K on M with the following properties: j

1. ∂ 1 = ∂ − M ∪ ∂ + 1 , 2. ∂ + 1 and ∂ + 2 intersect transversally,


3. dist(∂ + 1 , ∂ + 1 ) < δ, 4. K ∈ C 1,1 (M) and K = K on M\ 1 , 5. θ + on ∂ + 2 ∩ M\ 1 computed with respect to K is at most its value with respect to K , and 6. there exists a foliation s , s ∈ (−ε, 0] of 1 \ 1 such that 0 = ∂ + 1 and θ + [s ] < 0 with respect to the data K . Proof. By pushing the components of ∂ + 1 into 1 , as in the proof of Theorem 5.1, while changing the data K to K near components of ∂ 1 which are stable but not strictly stable, we can construct K and a foliation s near ∂ 1 such that each s has θ + [s ] < 0, thus satisfying Properties 1, 4 and 6. By Sard’s theorem, s and ∂ + 2 intersect transversally for almost every s ∈ (−ε, 0). Hence we can pick one such s, for which also Properties 2 and 3 are satisfied. Property 5 follows by construction, as we were subtracting a non-negative definite tensor from K to obtain K . Subsequently, for two sets 1 , 2 we denote by 1 2 the symmetric difference, defined by 1 2 = ( 1 \ 2 ) ∪ ( 2 \ 1 ). Lemma 7.7. Let (M, g, K ), 1 and 2 be as in the previous lemma. Assume furthermore that 1 2 = ∅. Then there exists ⊃ 1 ∪ 2 , such that ∂ is the disjoint union ∂ = ∂ − M ∪ ∂ + , where ∂ + is an embedded stable MOTS. Any connected component of ∂ + 1 which intersects 2 , lies in the interior of . Proof. There is nothing to prove if ∂( 1 ∪ 2 ) is a smooth embedded manifold. Thus we can assume that ∂ + 1 and ∂ + 2 intersect. Fix δ > 0 to be the distance at which we can apply Proposition 6.1 in (M, g, K ). We use Lemma 7.6, to deform 1 and K to 1 and K with the stated properties for this choice of δ. As ∂ + 1 and ∂ + 2 intersect transversally, Lemma 2.14 allows us to smooth out the corner of ∂( 1 ∪ 2 ) in the outward direction. Furthermore, all stable components of ∂ + 1 which were touching ∂ 2 but not intersecting 2 give rise to components of ∂ + 1 , which are disjoint of ∂ + 2 and at a distance at most δ to ∂ + 2 . Thus we can apply the surgery procedure of Proposition 6.1 to join these components to ∂ 2 . 
This yields an open set with ⊃ 1 ∪ 2 and ∂ is the disjoint union ∂ = ∂ − ∪ ∂ + , where ∂ − = ∂ − M and ∂ + is C 1,1 and has θ + [∂ + ] ≤ 0 and θ + [∂ + ] ≡ 0, as θ + ≡ 0 on the components of ∂ + which were created from joining a component of ∂ + 1 to a component of ∂ + 2 . We can then use the flow from Lemma 5.2 to smooth out the boundary of , yielding ⊃ ⊃ 1 ∪ 2 with an open set. Note, by construction all components of ∂ + 1 and all components of ∂ + 2 which were joined with components from ∂ + 1 are contained in the interior of . Now an application of Theorem 5.1 to the data (M\ , g, K ), with inner boundary ∂ − (M\ ) = ∂ + , and outer boundary ∂ + M yields a set ⊃ with boundary ∂ the disjoint union ∂ = ∂ − M ∪ ∂ + , where ∂ + is a smooth, stable MOTS. By construction all components of ∂ + 1 and ∂ + 2 are in the interior of . Furthermore, an application of the strong maximum principle as in the proof of Theorem 5.1 implies that ∂ + can not penetrate the region 1 \ 1 as this is foliated by trapped surfaces. In particular all components of ∂ + 1 which meet ∂ + 2 are contained in the interior of . Remark 7.8. The preceding lemma implies the uniqueness of outermost MOTS. Proof of Theorem 7.3. Subsequently we assume that T is non-empty, and therefore (M, g, K ) contains trapped regions, as otherwise there is nothing to prove. We first


show that we can define ∂ + T by a collection of sets with much more well-behaved boundaries. We define T to be the collection of all outer trapped sets , such that the outer boundary ∂ + satisfies the following four assumptions: 1. θ + [∂ + ] = 0; 2. every component of ∂ + is stable, and thus satisfies sup |A| ≤ C, where C is the constant from Theorem 2.10, and depends only on M RmC 0 (M) , K C 1 (M) and injρ (M, g, K ; ∂ M); 3. i + (∂ + ) ≥ δ where δ depending on the same data as C above is the δ from Theorem 6.3; 4. |∂ + | ≤ C, where C is the area resulting from Proposition 6.4 applied to ∂ + with i + (∂ + ) ≥ δ for the above δ. This C also depends only on injρ (M, g, K ; ∂ M), M RmC 0 (M) and K C 1 (M) . To this end, assume that is an outer trapped set, which does not lie in T . Then we construct a set ⊃ which lies in T by applying Theorem 6.3 and using Proposition 6.4 to prove the area estimate. We thus see that T = . ∈T

The first claim is that for each point p ∈ ∂ + T there exists ∈ T such that p ∈ ∂ + . Clearly, for every n there exists n such that dist(n , p) < n1 , where n = ∂ + n . We can now appeal to the compactness theorem [AM05, Theorem 1.3] for stable MOTS with bounded curvature and bounded area, which, after passing to a sub-sequence, yields a limit of ∂ + n in C 1,α . This is a smooth stable MOTS with bounded curvature and bounded area. Furthermore, is the outer boundary of a set , as the ∂ + n can eventually be represented as graphs over . However, is not necessarily embedded, as the limit of embedded surfaces might meet itself. As i + (∂ + ) ≥ δ, the only thing that prevents from being embedded are points where touches itself from the inside. To remedy this, we can replace the sequence of the n by a sequence n which is increasing in the sense that n ⊂ n+1 for all n. We proceed inductively and let 1 := 1 . Assume that we have constructed 1 ⊂ 2 ⊂ · · · ⊂ n−1 with k ∈ C T for k = 1, . . . , n − 1. Consider the set n ∪ n−1 . Either this set has a smooth embedded boundary, in which case we can use Theorem 5.1 to ensure the existence of n ⊃ n ∪ n−1 or n ∪ n−1 does not have a smooth boundary. Then Lemma 7.7 yields a barrier for Theorem 5.1 and allows us to construct n ⊃ n ∪ n−1 . By eventually applying Theorem 6.3, we can assume that n ∈ T . We will now relabel n := n and n := n . As explained above, there is a subsequence of the n such that the n converge in C 1,α to a stable MOTS which is the outer boundary of a set and has i + () ≥ δ, thus can not touch itself on the outside. Since the n are increasing, can not touch itself on the inside either. This follows from the fact that the n converge as graphs from the inside to . Thus if touches itself on the inside, so would the n . But each n is embedded, and hence is embedded and ∈ T . Next we show that ∂ + T consists of a smooth collection of MOTS. 
To this end assume first that 1 and 2 are such that the outer boundaries ∂ + k meet ∂ + T for k = 1, 2.


Let k be a component of ∂ + k that meets ∂ T . From Lemma 7.7 we infer that either 1 = 2 or dist(1 , 2 ) > 0. It follows that ∂ T is a collection of disjoint stable MOTS. Acknowledgements. The authors wish to thank Walter Simon, Marc Mars, Greg Galloway, Rick Schoen and Gerhard Huisken for helpful conversations. The second author would also like to thank Michael Eichmair and Leon Simon for their comments. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References [AG05]

Ashtekar, A., Galloway, G.J.: Some uniqueness results for dynamical horizons. Adv. Theor. Math. Phys. 9(1), 1–30 (2005) [AK03] Ashtekar, A., Krishnan, B.: Dynamical horizons and their properties. Phys. Rev. D 68(10), 104030 (2003) [AM05] Andersson, L., Metzger, J.L.: Curvature estimates for stable marginally trapped surfaces. http:// arXiv:org/abs/gr-qc/0512106, 2005 [AMS05] Andersson, L., Mars, M., Simon, W.: Local existence of dynamical and trapping horizons. Phys. Rev. Lett. 95, 111102 (2005) [AMS07] Andersson, L., Mars, M., Simon, W.: Stability of marginally outer trapped surfaces and existence of marginally outer trapped tubes. http://arXiv:org/abs/0704.2889v2[gr-qc], 2007 [BD04] Ben-Dov, I.: Penrose inequality and apparent horizons. Phys. Rev. D 70(12), 124031 (2004) + [CLZ 07] Campanelli, M., Lousto, C.O., Zlochower, Y., Krishnan, B., Merritt, D.: Spin flips and precession in black-hole-binary mergers. Phys. Rev. D 75, 064030 (2007) [CM99] Colding, T.H., Minicozzi, W.P. II.: Examples of embedded minimal tori without area bounds, internat. Math. Res. Notices 99(20), 1097–1100 (1999) [Dea03] Dean, B.: Compact embedded minimal surfaces of positive genus without area bounds. Geom. Dedicata 102, 45–52 (2003) [Eic07] Eichmair, M.: The plateau problem for apparent horizons. http://arXiv:org/abs/0711.4139, 2007 [GS06] Galloway, G.J., Schoen, R.: A generalization of hawking’s black hole topology theorem to higher dimensions. Commun. Math. Phys. 266(2), 571–576 (2006) [GT98] Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Rev. 3. printing. Second ed., Berlin-Heidelberg-New York: Springer-Verlag, 1998 [HE73] Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge Monographs on Mathematical Physics, No. 1. London: Cambridge University Press, 1973 [HI01] Huisken, G., Ilmanen, T.: The inverse mean curvature flow and the riemannian penrose inequality. J. Differ. Geom. 
59(3), 353–437 (2001) [HP99] Huisken, G., Polden, A.: Geometric evolution equations for hypersurfaces. In: Calculus of variations and geometric evolution problems (Cetraro, 1996), Lecture Notes in Math., vol. 1713, Berlin: Springer, 1999, pp. 45–84 [Jan78] Jang, P.S.: On the positivity of energy in general relativity. J. Math. Phys. 19, 1152–1155 (1978) [KH97] Kriele, M., Hayward, S.A.: Outer trapped surfaces and their apparent horizon. J. Math. Phys. 38(3), 1593–1604 (1997) [KLZ07] Krishnan, B., Lousto, C.O., Zlochower, Y.: Quasi-local linear momentum in black-hole binaries. Phys. Rev. D 76, 081501 (2007) [Lie96] Lieberman, G.M.: Second order parabolic differential equations. River Edge, NJ: World Scientific Publishing Co. Inc., 1996 [NR06] Nabutovsky, A., Rotman, R.: Curvature-free upper bounds for the smallest area of a minimal surface. Geom. Funct. Anal. 16(2), 453–475 (2006) [Sch04] Schoen, R.: Talk given at the Miami Waves conference. January 2004 [SSY75] Schoen, R., Simon, L., Yau, S.T.: Curvature estimates for minimal hypersurfaces. Acta Math. 134(3–4), 275–288 (1975) [SY81] Schoen, R., Yau, S.-T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79(2), 231–260 (1981) [SY94] Schoen, R., Yau, S.-T.: Lectures on differential geometry. Conference Proceedings and Lecture Notes in Geometry and Topology, Boston: International Press, 1994 [Yau01] Yau, S.-T.: Geometry of three manifolds and existence of black hole due to boundary effect. Adv. Theor. Math. Phys. 5(4), 755–767 (2001) Communicated by G.W. Gibbons

Commun. Math. Phys. 290, 973–996 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0796-2


Standing Ring Blow up Solutions to the N-Dimensional Quintic Nonlinear Schrödinger Equation

Pierre Raphaël1, Jérémie Szeftel2,3

1 Institut de Mathématiques, Université Paul Sabatier, Toulouse, France
2 Institut de Mathématiques, Université Bordeaux 1, Bordeaux, France
3 Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544-1000, USA. E-mail: [email protected]

Received: 23 September 2008 / Accepted: 18 January 2009
Published online: 1 April 2009 – © Springer-Verlag 2009

Abstract: We consider the quintic nonlinear Schrödinger equation $i\partial_t u = -\Delta u - |u|^4 u$ in dimension $N \ge 3$. This problem is energy critical in dimension $N = 3$ and energy supercritical for $N \ge 4$. We prove the existence of a radially symmetric blow up mechanism with $L^2$ concentration along the unit sphere of $\mathbb{R}^N$. This singularity formation is moreover stable by smooth and radially symmetric perturbation of the initial data. This result extends the result obtained for $N = 2$ in [29] and is the first description of a singularity formation in the energy supercritical class for (NLS) type problems. Our main tool is the proof of the propagation of regularity outside the blow up sphere in the presence of a so-called log-log type singularity.

1. Introduction

1.1. Setting of the problem. We consider in this paper the quintic N dimensional focusing nonlinear Schrödinger equation

$$(NLS) \quad \begin{cases} i\partial_t u = -\Delta u - |u|^4 u, & (t, x) \in [0, T) \times \mathbb{R}^N, \\ u(0, x) = u_0(x), & u_0 : \mathbb{R}^N \to \mathbb{C}, \end{cases} \tag{1}$$

with $u_0 \in H^N(\mathbb{R}^N)$. It is a special case of the more general system:

$$i u_t = -\Delta u - |u|^{p-1} u. \tag{2}$$

Let us recall that for $p \ge 3$ an odd integer, the smoothness of the nonlinearity ensures the existence of a local flow for smooth enough initial data, see Ginibre and Velo [12,13] or the monograph [5]. In particular for $u_0 \in H^N$, there exists $0 < T \le +\infty$ such that $u(t) \in C([0, T), H^N)$ and either $T = +\infty$, we say the solution is global, or $T < +\infty$ and then $\limsup_{t \uparrow T} \|u(t)\|_{H^N} = +\infty$, we say the solution blows up in finite time.


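Smooth solutions satisfy the following three conservation laws:

$L^2$ norm: $\int |u(t, x)|^2\,dx = \int |u_0(x)|^2\,dx$;
Energy: $E(u(t)) = \frac{1}{2}\int |\nabla u(t, x)|^2\,dx - \frac{1}{p+1}\int |u(t, x)|^{p+1}\,dx = E(u_0)$;
Momentum: $\mathrm{Im}\left(\int \nabla u\,\overline{u}(t, x)\,dx\right) = \mathrm{Im}\left(\int \nabla u_0\,\overline{u_0}(x)\,dx\right)$.

Moreover, there holds the scaling symmetry: if $u(t, x)$ solves (2), then for all $\lambda > 0$, so does

$$u_\lambda(t, x) = \lambda^{\frac{2}{p-1}} u(\lambda^2 t, \lambda x).$$

The condition $\|u_\lambda(t)\|_{\dot H^{s_c}_x} = \|u(\lambda^2 t)\|_{\dot H^{s_c}_x}$ computes the critical Sobolev exponent of (2), explicitly:

$$s_c = \frac{N}{2} - \frac{2}{p-1}.$$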
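The critical exponent is simple enough to tabulate exactly. The following sketch, with hypothetical function names and regime labels chosen for illustration, computes $s_c = \frac{N}{2} - \frac{2}{p-1}$ in rational arithmetic, the power of $\lambda$ picked up by the $\dot H^s$ norm under the scaling $u \mapsto u_\lambda$, and the resulting criticality regime.

```python
from fractions import Fraction

def critical_exponent(p, N):
    """s_c = N/2 - 2/(p-1) for i u_t = -Laplace(u) - |u|^(p-1) u on R^N."""
    return Fraction(N, 2) - Fraction(2) / (Fraction(p) - 1)

def scaling_power(p, N, s):
    """Exponent a with ||u_lambda||_{Hdot^s} = lambda^a * ||u||_{Hdot^s}.

    The prefactor contributes 2/(p-1), s derivatives contribute s, and the
    change of variables y = lambda * x in the L^2 norm contributes -N/2.
    """
    return Fraction(2) / (Fraction(p) - 1) + Fraction(s) - Fraction(N, 2)

def regime(p, N):
    """Label of the problem according to the sign of s_c and s_c - 1."""
    s_c = critical_exponent(p, N)
    if s_c < 0:
        return "L2 subcritical"
    if s_c == 0:
        return "L2 critical"
    if s_c < 1:
        return "energy subcritical"
    if s_c == 1:
        return "energy critical"
    return "energy supercritical"
```

For instance, `critical_exponent(5, N)` reduces to $(N-1)/2$, and `scaling_power(p, N, critical_exponent(p, N))` vanishes identically, which is exactly the invariance that defines $s_c$.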

It is well known that for $s_c \le 1$, the local Cauchy theory may be extended to the energy space $H^1$, see Ginibre and Velo [11]. Moreover, for $p < 1 + \frac{4}{N}$, i.e. $s_c < 0$, the problem is $L^2$ subcritical and all $H^1$ solutions are global and bounded in $H^1$. On the other hand, for $s_c \ge 0$, finite time blow up is known to possibly happen from the celebrated virial argument, see [32] and [14].

While the virial argument ensures the existence of finite time blow up solutions for all the $L^2$ critical and supercritical problems $s_c \ge 0$, very little is known in general on the description of the blow up dynamics. In fact, for (NLS) type problems, only two explicit blow up dynamics are known, and only for the so-called $L^2$ critical case $p = 1 + \frac{4}{N}$, i.e. $s_c = 0$, in dimension $N \le 5$, see [4,22]. In particular, the series of works [9,22–26,28] provide for the $L^2$ critical case a precise description of the so-called log-log blow up regime, which is a finite time blow up dynamic, stable in $H^1$, with universal speed given by a double log correction to the self similar law. More precisely, a blow up solution of log-log type looks near blow up time like

$$u(t, x) = \frac{1}{\lambda(t)^{\frac{N}{2}}}\, Q\!\left(\frac{x - x(T)}{\lambda(t)}\right) e^{i\gamma(t)} + \tilde u(t, x), \quad x(T) \in \mathbb{R}^N, \tag{3}$$

for some phase parameter $\gamma(t)$ and a universal blow up speed given by the log-log law:

$$|\nabla u(t)|_{L^2} \sim \frac{C}{\lambda(t)} \sim C \sqrt{\frac{\log|\log(T - t)|}{T - t}}.$$

Here the universal blow up profile $Q$ is given by the ground state solution to (2) with $p = 1 + \frac{4}{N}$, which is the unique nonnegative radially symmetric solution in $H^1$ to

$$\Delta Q - Q + Q^{1 + \frac{4}{N}} = 0, \tag{4}$$

see [2,10,18]. Moreover, while the singular part in the decomposition (3) focuses a Dirac mass in $L^2$ at the singular point $x(T) \in \mathbb{R}^N$:

$$\left| \frac{1}{\lambda^{\frac{N}{2}}(t)}\, Q\!\left(\frac{x - x(T)}{\lambda(t)}\right) e^{i\gamma(t)} \right|^2 \rightharpoonup \left(\int Q^2\right) \delta_{x(T)} \quad \text{as } t \to T$$


in the weak sense of measures, the regular part $\tilde u$ remains $L^2$ smooth at blow up:

$$\tilde u \to u^* \text{ in } L^2 \text{ as } t \to T \quad \text{and} \quad \int |u^*|^2 = \int |u_0|^2 - \int Q^2. \tag{5}$$

The log-log regime corresponds to a mechanism of ejection of mass outside the singularity which stabilizes the blow up dynamic but induces a strong coupling between the regular and the singular part of the solution. This coupling forces a striking universal singular behavior of the limiting profile $u^*$ given by (5). Indeed, $u^* \in L^2$ from (5) but:

$$\forall R > 0,\ \forall p > 2, \quad u^* \notin L^p(|x - x(T)| < R), \tag{6}$$

see [26].

1.2. Statement of the result. Let us now focus on the case $p = 5$, i.e.

$$s_c = \frac{N - 1}{2}.$$

Then (1) is $L^2$ critical, $s_c = 0$, in dimension $N = 1$, energy subcritical, $s_c = \frac12$, for $N = 2$, energy critical, $s_c = 1$, in dimension $N = 3$ and energy supercritical, $s_c > 1$, for $N \ge 4$. In [29], Raphaël observed that the one dimensional log-log type dynamics (3) may be used to exhibit spherically symmetric blow up dynamics on the unit sphere for the two dimensional problem. A central issue in the analysis, as will become clear from the strategy of the proof below, is the regularity of the excess of mass $\tilde u$ in the decomposition (3) outside the blow up point, in order to gain with respect to the pathological rough global regularity (6). In fact, while a gain of only 1/2 a derivative was enough to close the analysis in $N = 2$, the uniform control of at least $s_c = \frac{N-1}{2}$ derivatives is required in dimension $N$. Our main claim is that arbitrarily many derivatives can in fact be propagated outside a log-log singularity from smooth enough radially symmetric initial data. This allows us to carry over the program of reduction of the $N$ dimensional problem to a leading order one dimensional blow up problem in any dimension $N \ge 3$ despite the energy supercritical nature of the problem.

Theorem 1 (Existence and stability of a solution blowing up on a sphere in $\mathbb{R}^N$). Let $Q$ be the one dimensional ground state solution to (4) with $p = 5$, explicitly

$$Q(x) = \left(\frac{3}{\mathrm{ch}^2(2x)}\right)^{\frac14}.$$

There exists an open subset $\mathcal{P} \subset H^N_{rad}(\mathbb{R}^N)$ such that the following holds true. Let $u_0 \in \mathcal{P}$, then the corresponding solution $u(t)$ to (1) blows up in finite time $0 < T < +\infty$ according to the following dynamics. There exist $\lambda(t) > 0$, $r(t) > 0$ and $\gamma(t) \in \mathbb{R}$ such that

$$u(t, r) - \frac{1}{\lambda(t)^{\frac12}} Q\left(\frac{r - r(t)}{\lambda(t)}\right) e^{i\gamma(t)} \to u^*(r) \ \text{in } L^2 \text{ as } t \to T. \tag{7}$$

Here the radius of the singular circle converges

$$r(t) \to r(T) > 0 \ \text{as } t \to T \tag{8}$$

and

$$\lambda(t) \left(\frac{\log|\log(T - t)|}{T - t}\right)^{\frac12} \to \frac{\sqrt{2\pi}}{|Q|_{L^2}} \ \text{as } t \to T. \tag{9}$$

Moreover, $\frac{N-1}{2}$ derivatives propagate outside the singularity:

$$\forall R > 0, \quad u^* \in H^{\frac{N-1}{2}}(|r - r(T)| > R). \tag{10}$$
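As a quick sanity check on the profile appearing in Theorem 1, the following minimal sketch verifies numerically that $Q(x) = (3/\mathrm{ch}^2(2x))^{1/4}$ solves $Q'' - Q + Q^5 = 0$ and computes the mass $\int Q^2 = \frac{\sqrt3\,\pi}{2}$ which enters the constant in (9). We assume here that (4) with $p = 5$ is the standard one dimensional ground state equation, as suggested by the $Q_b$ equation of Sect. 2.1 at $b = 0$; the function names, the finite difference step and the quadrature grid are illustrative choices only.

```python
import math

def Q(x: float) -> float:
    """Candidate ground state Q(x) = (3 / cosh(2x)^2)^(1/4)."""
    return (3.0 / math.cosh(2.0 * x) ** 2) ** 0.25

def residual(x: float, h: float = 1e-3) -> float:
    """Central-difference residual of Q'' - Q + Q^5 at the point x."""
    second = (Q(x + h) - 2.0 * Q(x) + Q(x - h)) / h ** 2
    return second - Q(x) + Q(x) ** 5

def mass(L: float = 20.0, n: int = 200_000) -> float:
    """Trapezoidal approximation of the integral of Q^2 over [-L, L]."""
    dx = 2.0 * L / n
    total = 0.5 * (Q(-L) ** 2 + Q(L) ** 2)
    for i in range(1, n):
        total += Q(-L + i * dx) ** 2
    return total * dx

# Largest residual of the ODE on a grid of points in [-5, 5].
max_residual = max(abs(residual(0.1 * j)) for j in range(-50, 51))

# Numerical mass versus the closed form sqrt(3) * pi / 2.
mass_Q = mass()
exact_mass = math.sqrt(3.0) * math.pi / 2.0

# Limiting constant sqrt(2 pi) / |Q|_{L^2} on the right-hand side of (9).
limit_constant = math.sqrt(2.0 * math.pi) / math.sqrt(exact_mass)
```

The residual is only a finite difference approximation, so it is small rather than exactly zero.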

Comments on Theorem 1.

1. Regularity outside the blow up point: In addition to the log-log analysis, the core of our argument relies on propagating regularity outside the blow up sphere. More generally, we expect our strategy to be a starting point to prove some propagation of regularity outside the blow up point for the $L^2$ critical problem, in particular in the physically relevant case $N = 2$. Note that we do not claim any sharpness in the loss of derivatives from $H^N$ initial regularity to $H^{\frac{N-1}{2}}$ boundedness of $\tilde u$ outside the singularity. Here the question of optimal regularity in the energy space is open: do $H^1$ log-log blow up data produce an asymptotic profile $u^* \in H^1$ outside the blow up point or is there always a loss of regularity? Note that the other known blow up dynamic for the $L^2$ critical regime (which is expected to be the unstable one, see [4]) does produce $H^1$ profiles, see [26] for a further discussion of this issue. More generally, the question of which profiles $u^*$ can be obtained as exterior profiles of a log-log dynamics is open. This corresponds to understanding in some sense the range of the nonlinear wave operator associated to the log-log blow up dynamics. We expect the gain of regularity to be a useful tool towards the understanding of this issue.

2. More general problems: We are focused in this work on the radial dynamics on the circle which reduces to the one dimensional log-log dynamic. However, this approach formally extends to a larger class of problems and the strategy of reducing the leading order part of the $N$ dimensional flow to a lower dimensional dynamic using suitable symmetries seems very general. We expect our strategy of gain of regularity to be robust enough to open up the way towards this kind of generalized result. Note that in general, the question of which sets are admissible blow up sets for a supercritical (NLS) problem is open. Moreover, contrary to the proof in [29], the propagation of regularity does not rely anymore on the smoothing effect for the linear Schrödinger flow but rather on interior regularity estimates which follow from energy techniques. In particular the combination of the techniques developed in [27] and in this paper would allow one to obtain Theorem 1 for (1) posed in the interior of a ball with Dirichlet boundary condition.

3. More ring solutions: Let us mention that in the numerical and formal breakthrough work [8], radially symmetric collapsing ring solutions to (2) in the range $1 + \frac4N < p < 5$, $N = 2, 3$, are numerically observed. These are solutions which focus on spheres but with a time dependent radius collapsing to zero at blow up time. These solutions, like our standing ring solutions, seem to generically generate stable singular dynamics in the radial class. The rigorous derivation of such collapsing ring solutions and more generally the role of these solutions in a more global theory of supercritical blow up are important open problems.

4. Energy critical and supercritical problems: In dimension $N = 3$, Kenig and Merle derived in [16] a sharp criterion for the global existence of solutions to (1) related to the exact size of the corresponding ground state. The question of the structure of the blow


up solutions above the minimal energy is open. Theorem 1 provides an explicit example of so-called type I blow up solutions, that is solutions for which $|\nabla u(t)|_{L^2} \to \infty$ as $t \to T$. At the energy critical level, one expects the existence of type II blow up solutions for which $|\nabla u(t)|_{L^2} < +\infty$ as $t \to T$ but the solution forms a Dirac mass in $\dot H^1$ as $t \to T$. See Rodnianski, Sterbenz [30], Krieger, Schlag, Tataru [17] for the construction of explicit such solutions for energy critical semilinear wave equations. Let us also say that for energy supercritical problems, very little is known on the long time dynamics or the singularity formation. In particular, shock waves type solutions are expected. Theorem 1 shows that such dynamics may be ruled out for an open set of large $H^N_{rad}$ initial data, but of course this has to do with the very specific regime we are considering.

5. $H^N$ energies: The proof of propagation of regularity relies partly on the use of $H^N$ almost energies for (NLS), see the strategy of the proof below. Similar quantities were used for KdV problems, see Kato [15], Martel [21]. Clearly the use of these modified energies would allow one to derive polynomial bounds for high Sobolev norms for the defocusing (NLS), at least for $N = 1$, as first obtained with different tools by Bourgain [3] and Staffilani [31].

1.3. Strategy of the proof. The basic heuristics is that in radial variables, the problem becomes:

$$i\partial_t u + \partial_r^2 u + \frac{N-1}{r}\partial_r u + u|u|^4 = 0,$$

and hence one can expect that if the singularity formation happens along the circle $r = 1$, then $\left|\frac{\partial_r u}{r}\right| \sim |\partial_r u| \ll |\partial_r^2 u|$ and the leading order blow up dynamics should be given by the one dimensional quintic (NLS) for which a stable log-log dynamics is known. The strategy is then to consider solutions to (1) of the form:

$$u(t, r) = \frac{1}{\lambda(t)^{\frac12}} \tilde Q_{b(t)}\left(\frac{r - r(t)}{\lambda(t)}\right) e^{i\gamma(t)} + \tilde u(t, r). \tag{11}$$

Here the profiles $\tilde Q_b$ are suitable self similar deformations of the ground state $Q$ and $\tilde u$ is chosen to be initially small in $H^N$. The log-log regime corresponds to a regime with the following behavior of the geometrical parameters describing the singular part of the solution:

$$b(t) \sim \frac{C}{\log|\log(T - t)|}, \quad \lambda(t) \sim e^{-e^{\frac{C}{b(t)}}} \sim \sqrt{\frac{T - t}{\log|\log(T - t)|}}, \quad r(t) \sim 1. \tag{12}$$

The derivation of these controls corresponds to the study of the blow up problem on the singular sphere $r \sim 1$. However, a fundamental issue in the log-log proof is that it requires the fact that globally in space, the excess of mass $\tilde u$ is such that its potential energy is negligible with respect to its kinetic energy:

$$\int_{\mathbb{R}^N} |\tilde u|^6 \ll \int_{\mathbb{R}^N} |\nabla \tilde u|^2. \tag{13}$$


It is a consequence of the fact that the log-log analysis in [22,25] uses in a dynamical way the conservation of the energy, which is a global information in space. In order to prove (13), we use the radiality and split the space in two regions. For $r \ge \frac12$, we may use the one dimensional Sobolev embedding to conclude:

$$\int_{r \ge \frac12} |\tilde u|^6 \lesssim |\tilde u|^4_{L^2} \int_{\mathbb{R}^N} |\nabla \tilde u|^2,$$

and it thus suffices to ensure $|\tilde u|_{L^2} \ll 1$, which follows from the $L^2$ conservation law. Near the origin, which is the supercritical zone, the Sobolev embedding is much worse:

$$\int_{r \le \frac12} |\tilde u|^6 \lesssim |\tilde u|^4_{H^{\frac{N-1}{2}}(r \le \frac12)} |\nabla \tilde u|^2_{L^2}.$$

We thus face a problem of propagation of the regularity outside the blow up sphere and need to prove:

$$\forall t \in [0, T), \quad |\tilde u(t)|_{H^{\frac{N-1}{2}}(r \le \frac12)} \ll 1. \tag{14}$$

A specific feature of the log-log analysis, [25], is that the leading order term of $\tilde u$ is essentially explicit and given by the so-called radiation:

$$\tilde u(t, r) \sim \frac{1}{\lambda(t)^{\frac12}} \zeta_{b(t)}\left(\frac{r - r(t)}{\lambda(t)}\right) e^{i\gamma(t)}$$

for some smooth profile with a very slow decay rate:

$$|\zeta_{b(t)}|_{H^N} \sim \frac{1}{|\log(T - t)|^C} \ \text{as } t \to T. \tag{15}$$

This anomalously slow decay is eventually responsible for the rough behavior (6) of $u^*$ on the blow up point, see [26]. Indeed, observe that it implies that the best global bound on higher derivatives we may hope for $\tilde u$ is:

$$\forall 1 \le p \le N, \quad |\tilde u(t)|_{H^p} \sim \frac{|\zeta_{b(t)}|_{H^p}}{\lambda^p(t)} \sim \frac{1}{|\log(T - t)|^C \lambda^p(t)}. \tag{16}$$

Note that from (12), the extra logarithmic decay provided by (15) is very small compared to $\frac1\lambda$, and hence (16) is very far from (14). However, the dispersive estimate (15) is enough to gain the integrability:

$$\int_0^T |\nabla \tilde u(t)|^2_{L^2}\, dt < +\infty. \tag{17}$$

This estimate is fundamental for our analysis as it shows that $\tilde u$ behaves in a smoother way than the singular part of the solution, which does not satisfy (17). It is the starting point to gain regularity outside the blow up set. Indeed, after localization in space and using interior regularity type of estimates for the linear Schrödinger flow, (17) implies:

$$|\tilde u(t)|_{H^{\frac12}(\frac14 \le r \le \frac12)} \ll 1, \quad \forall t \in [0, T). \tag{18}$$


In dimension $N = 2$, this corresponds to the gain of $\frac12 = \frac{N-1}{2}$ derivatives (14) and hence the possibility of a pure $H^1$ theory, [29]. For $N \ge 3$, we need to propagate $\frac{N-1}{2}$ derivatives and argue as follows. The first step is to prove a global $H^N$ bound on the full solution

$$|u(t)|_{H^N} \lesssim \frac{C}{\lambda^{N+\delta}(t)}, \quad \forall t \in [0, T), \tag{19}$$

for some small loss $0 < \delta \ll 1$. Equivalently, this means an $H^N$ control of the small excess of mass in rescaled variables. This estimate is derived using the observation that (1) admits $H^N$ almost energies. These would correspond to the exact sequence of conservation laws for the integrable (NLS), $p = 3$ in dimension $N = 1$. Of course such exact energies no longer exist in our setting for $N \ge 2$ but approximate energies $E_N$ can be exhibited which provide enough cancellation to ensure a structure of the form:

$$|u|^2_{H^N} \lesssim E_N(u) \quad \text{and} \quad \frac{d}{dt} E_N(u) \lesssim |u|^{2-\alpha}_{H^N} \quad \text{for some } 0 < \alpha < 2.$$

The gain $2 - \alpha < 2$ is the key to bootstrap the global rough bound (19). The second step is to use iteratively space localization outside the singular circle and interior regularity estimates for the linear Schrödinger flow. We claim that the initial gain of 1/2 a derivative (18) together with the global bound (19) are in fact enough to propagate (14). In both steps of the proof, we use in a crucial way the fact that the log-log singularity is an almost self similar blow up regime, which allows us to gain from various time integrations.

Notations. For an integer $l$, we define the differential operator:

$$\nabla^l = \begin{cases} \Delta^p & \text{for } l = 2p, \\ \nabla \Delta^p & \text{for } l = 2p + 1. \end{cases} \tag{20}$$

Otherwise, for $s > 0$ non integer, $\nabla^s$ will denote the Fourier integral operator $\widehat{\nabla^s f}(\xi) = |\xi|^s \hat f(\xi)$. Moreover, we let $[x]_+ = \max\{0, x\}$.

2. Propagation of Regularity in the Log-Log Regime

This section is devoted to the proof of Theorem 1. We start by explicitly describing the set of initial data which lead to the standing ring log-log blow up scenario. The core of the argument is a bootstrap on both the log-log dynamics and the propagation of regularity near the origin. We then focus on the proof of propagation of regularity which follows in four steps: global $H^N$ bound using modified $H^N$ energies, initial gain of 1/2 a derivative in annuli away from the origin using the extra integrability of $\tilde u$ (17), refined bound on annuli and eventually control of $\frac{N-1}{2}$ derivatives near the origin using interior regularity estimates for the Schrödinger flow.


2.1. Setting of the bootstrap. We set up in this subsection the main bootstrap argument. For the sake of completeness, we explicitly exhibit the open set $\mathcal{P}$ of initial data leading to the standing ring blow up dynamics described by Theorem 1, as well as the complete set of estimates needed to close the blow up analysis. Here we follow closely [29] and refer to it for further explanations. Let us first recall from [23] the existence of a one parameter family of localized self similar profiles which are a refinement of the $Q$ blow up profile, see Proposition 1 in [29]. Given $b, \eta > 0$ small enough, there exists a unique even solution $Q_b$ to

$$\begin{cases} \partial_y^2 Q_b - Q_b + ib\left(\frac12 Q_b + y\partial_y Q_b\right) + Q_b|Q_b|^4 = 0, \\ P_b = Q_b e^{i\frac{b|y|^2}{4}} > 0 \ \text{for } y \in \left[0, \frac2b(1-\eta)\right), \\ Q_b(0) \in (Q(0) - \eta, Q(0) + \eta), \quad Q_b\left(\frac2b(1-\eta)\right) = 0, \end{cases}$$

and $Q_b$ is a smooth function of $b$ with $Q_{b=0} = Q$. We now define

$$\tilde Q_b = Q_b \, \chi\left(\frac{b\,y}{2(1-\eta)}\right)$$

for some smooth radial cut off function $\chi(r) = 1$ for $r \le \frac12$, $\chi(r) = 0$ for $r \ge 1$. We also let

$$\Gamma_b = e^{-\frac{\pi}{b}}$$

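which is the leading order size of the radiative parts of the log-log solutions in rescaled variables. Given two parameters $\lambda, r > 0$, we denote

$$\mu_{\lambda,r}(y) = (\lambda y + r)^{N-1} \mathbf{1}_{\lambda y + r \ge 0}$$

the weight of the Lebesgue measure in rescaled variables. We now describe the set $\mathcal{P}$ of admissible initial data:

Definition 1 (Geometrical description of the set $\mathcal{P}$). Let $0 < \alpha^*$ denote some small enough universal number. We let $\mathcal{P}$ be the set of radial distributions $u_0 \in H^N_r$ of the form

$$u_0(r) = \frac{1}{\lambda_0^{\frac12}} (\tilde Q_{b_0} + \varepsilon_0)\left(\frac{r - r_0}{\lambda_0}\right) e^{i\gamma_0} = \frac{1}{\lambda_0^{\frac12}} \tilde Q_{b_0}\left(\frac{r - r_0}{\lambda_0}\right) e^{i\gamma_0} + \tilde u_0(r) \tag{21}$$

with the following controls:

(i) Localization of the singular circle:

$$|r_0 - 1| < \alpha^*; \tag{22}$$

(ii) Closeness to $Q$ on the singular circle: $0 < b_0 + |\tilde u_0|_{L^2} < \alpha^*$, and $\varepsilon_0(y)$ satisfies the orthogonality conditions

$$\mathrm{Re}\left(\varepsilon_0, |y|^2 \tilde Q_{b_0}\right) = \mathrm{Re}\left(\varepsilon_0, y \tilde Q_{b_0}\right) = -\mathrm{Im}\left(\varepsilon_0, \Lambda^2 \tilde Q_{b_0}\right) = -\mathrm{Im}\left(\varepsilon_0, \Lambda \tilde Q_{b_0}\right) = 0, \tag{23}$$

and the smallness estimate

$$\int \left(|\partial_y \varepsilon_0(y)|^2 + |\partial_y^N \varepsilon_0(y)|^2\right) \mu_{\lambda_0, r_0}(y)\, dy + \int_{|y| \le \frac{10}{b_0}} |\varepsilon_0(y)|^2 e^{-|y|}\, dy < \Gamma_{b_0}^{\frac67}; \tag{24}$$

(iii) Normalization of the energy and the localized momentum:

$$\lambda_0^2 |E_0| + \lambda_0 \left|\mathrm{Im} \int_{|r - 1| \le \frac12} \partial_r u_0\, \overline{u_0}\right| < \Gamma_{b_0}^{10}; \tag{25}$$

(iv) $u_0$ is in the log-log regime:

$$e^{-e^{\frac{2\pi}{b_0}}} < \lambda_0 < e^{-e^{\frac{\pi}{2b_0}}}; \tag{26}$$

(v) $H^N$ smallness outside the singular circle:

$$|\tilde u_0|_{H^N(|r - 1| \ge \frac{1}{10})} < \alpha^*. \tag{27}$$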

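Remark 1. Note that a standard application of the implicit function theorem ensures that $\mathcal{P}$ is an open non empty set of $H^N_{rad}$, see [29].

Let now $u_0 \in \mathcal{P}$ and $u(t)$ be the corresponding solution to (1) with $[0, T)$, $0 < T \le +\infty$, its maximum time interval of existence in $H^N$. Then the continuity $u \in C([0, T), H^N_r)$ and standard arguments from modulation theory ensure the existence of $t_1 \in [0, T]$ such that for all $t \in [0, t_1)$, $u(t) \in \mathcal{P}$ and admits a geometrical decomposition

$$u(t, r) = \frac{1}{\lambda(t)^{\frac12}} (\tilde Q_{b(t)} + \varepsilon)\left(\frac{r - r(t)}{\lambda(t)}\right) e^{i\gamma(t)} = \frac{1}{\lambda(t)^{\frac12}} \tilde Q_{b(t)}\left(\frac{r - r(t)}{\lambda(t)}\right) e^{i\gamma(t)} + \tilde u(t, r), \tag{28}$$

where the uniqueness of the decomposition is ensured through the set of orthogonality conditions:

$$\mathrm{Re}\left(\varepsilon(t), |y|^2 \tilde Q_{b(t)}\right) = \mathrm{Re}\left(\varepsilon(t), y \tilde Q_{b(t)}\right) = -\mathrm{Im}\left(\varepsilon(t), \Lambda^2 \tilde Q_{b(t)}\right) = -\mathrm{Im}\left(\varepsilon(t), \Lambda \tilde Q_{b(t)}\right) = 0$$

for some continuous functions of time $(\lambda(t), r(t), \gamma(t))$. We may thus assume: $\forall t \in [0, t_1)$,

$$|r(t) - 1| < \sqrt{\alpha^*}, \quad 0 < b(t) < (\alpha^*)^{\frac{1}{10}}, \quad |\tilde u(t)|_{L^2} < (\alpha^*)^{\frac{1}{10}},$$

$$\int |\partial_y \varepsilon(t, y)|^2 \mu_{\lambda(t), r(t)}(y)\, dy + \int_{|y| \le \frac{10}{b(t)}} |\varepsilon(t, y)|^2 e^{-|y|}\, dy < \Gamma_{b(t)}^{\frac34},$$

$$\lambda^2(t) |E_0| < \Gamma_{b(t)}^2, \quad \lambda(t) \left|\mathrm{Im} \int \nabla\psi \cdot \nabla u(t)\, \overline{u(t)}\right| < \Gamma_{b(t)}^2.$$

Moreover, let the rescaled time

$$s = \int_0^t \frac{d\tau}{\lambda^2(\tau)} + s_0, \quad s_0 = e^{\frac{3\pi}{4b_0}}, \quad s_1 = s(t_1), \tag{29}$$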


then $\forall s \in [s_0, s_1)$,

$$\frac{\pi}{10 \log s} < b(s) < \frac{10\pi}{\log s}, \quad e^{-e^{\frac{10\pi}{b(s)}}} < \lambda(s) < e^{-e^{\frac{\pi}{10 b(s)}}} \tag{30}$$

and the scaling parameter is almost nonincreasing:

$$\forall s_0 \le s_2 \le s_3 \le s_1, \quad \lambda(s_3) \le 3\lambda(s_2). \tag{31}$$

These estimates correspond to a control of the log-log dynamic on the singular circle. In order to control the propagation of regularity, we bootstrap a rough global $H^N$ bound:

$$|u(t)|_{H^N} \le \frac{1}{[\lambda(t)]^{N+\delta}}, \tag{32}$$

and better bounds after excision of the singularity and localization near the origin:

$$\forall 1 \le k \le N, \quad |u(t)|_{H^{N - \frac k2}(r \le \frac12)} \le \frac{1}{[\lambda(t)]^{N - k + (1+k)\delta}}, \tag{33}$$

$$|u(t)|_{H^{\frac{N-1}{2}}(r \le \frac12)} \le (\alpha^*)^{\frac{1}{10}}. \tag{34}$$

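We then claim that provided that $\delta = \delta(N)$, $\alpha^* > 0$ have been chosen small enough, we may bootstrap these estimates according to the following proposition which is the core of the proof:

Proposition 1 (Bootstrap on the log-log regime). For all $t \in [0, t_1)$, we may bootstrap the log-log controls:

$$|r(t) - 1| < (\alpha^*)^{\frac23}, \quad 0 < b(t) < (\alpha^*)^{\frac15}, \quad |\tilde u(t)|_{L^2} < (\alpha^*)^{\frac15},$$

$$\int |\partial_y \varepsilon(t, y)|^2 \mu_{\lambda(t), r(t)}(y)\, dy + \int_{|y| \le \frac{10}{b(t)}} |\varepsilon(t, y)|^2 e^{-|y|}\, dy < \Gamma_{b(t)}^{\frac45},$$

$$\lambda^2(t) |E_0| < \Gamma_{b(t)}^4, \quad \lambda(t) \left|\mathrm{Im} \int \nabla\psi \cdot \nabla u(t)\, \overline{u(t)}\right| < \Gamma_{b(t)}^4,$$

$$\frac{\pi}{5 \log s} < b(s) < \frac{5\pi}{\log s}, \quad e^{-e^{\frac{5\pi}{b(t)}}} < \lambda(t) < e^{-e^{\frac{\pi}{5 b(t)}}},$$

$$\forall s_0 \le s_2 \le s_3 \le s_1, \quad \lambda(s_3) \le 2\lambda(s_2).$$

Moreover, there holds the improved regularity estimates:

$$|u(t)|_{H^N} \le \frac{1}{2[\lambda(t)]^{N+\delta}}, \tag{35}$$

$$\forall 1 \le k \le N, \quad |u(t)|_{H^{N - \frac k2}(r \le \frac12)} \le \frac{1}{2[\lambda(t)]^{N - k + (1+k)\delta}}, \tag{36}$$

$$|u(t)|_{H^{\frac{N-1}{2}}(r \le \frac12)} \le (\alpha^*)^{\frac15}. \tag{37}$$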

The rest of the paper is now devoted to the proof of the improved regularity bounds (35), (36), (37). The proof of the improved log-log bounds then follows as in [29], and then Proposition 1 implies Theorem 1 as in [29] again. This is left to the reader.


2.2. Modified $H^N$ energies and proof of (35). In this section, we prove (35) using the bootstrap assumptions (32), (33). The proof relies on the specific structure of the nonlinearity and the derivation of modified $H^N$ energies. The error term in the corresponding pseudo-conservation law is controlled using the a priori knowledge of the blow up speed. Here in particular the fact that the log-log regime is almost self similar is crucial and allows one not to lose in the process, see also Colliander, Raphaël [7] for a related strategy. Even though the proof is very different, the estimate (35) is reminiscent of the $H^2$ type estimates derived in the context of $H^1$ critical wave maps by Rodnianski, Sterbenz [30].

Step 1. Computation of the $H^N$ energy. We claim the following pseudo-conservation law of the $H^N$ energy where we use the notation (20):

Lemma 1 ($H^N$ pseudo energy). Let the $H^N$ energy:

$$E_N(u) = \int |\nabla^N u|^2 - 3 \int |\nabla^{N-1} u|^2 |u|^4 - 2\,\mathrm{Re} \int (\nabla^{N-1} \bar u)^2 u^2 |u|^2, \tag{38}$$

then:

$$\left|\frac{d}{dt} E_N(u)\right| \lesssim \int \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| + \int |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2, \tag{39}$$

where $l_1, \ldots, l_5$ and $j_1, \ldots, j_5$ satisfy:

$$l_1 + \cdots + l_5 = N, \quad l_m \le N - 1 \ \text{for } m = 1, \ldots, 5, \quad \text{and} \quad j_1 + \cdots + j_5 = N - 2. \tag{40}$$

Remark 2. To ease notations, we have omitted in (39) the summation symbol over $l_k, j_k \ge 0$ satisfying (40) and we will use this convention all along the paper.

Proof of Lemma 1. First compute from (1):

$$\frac12 \frac{d}{dt} \int |\nabla^N u|^2 = \mathrm{Re} \int \nabla^N(\partial_t \bar u) \nabla^N u = \mathrm{Re} \int \nabla^N(\partial_t \bar u) \nabla^{N-2}\left[-i\partial_t u - u|u|^4\right] = \mathrm{Re} \int \nabla^{N-1}(\partial_t \bar u) \nabla^{N-1}(u|u|^4). \tag{41}$$

We now exploit a cancellation in the above RHS which is of a similar nature to the one leading to the conservation of the $H^1$ Hamiltonian. Indeed, we expand from Leibniz rule:

$$\nabla^{N-1}(u|u|^4) = \nabla^{N-1}(u^3 \bar u^2) = 3|u|^4 \nabla^{N-1} u + 2u^2|u|^2 \nabla^{N-1} \bar u + \sum_{l_1 + \cdots + l_5 = N-1,\ l_j \le N-2} C_{l_1 \cdots l_5} \nabla^{l_1} u \cdots \nabla^{l_5} u.$$

Hence:

$$\mathrm{Re}\left(\nabla^{N-1}(\partial_t \bar u) \nabla^{N-1}(u|u|^4)\right) = \mathrm{Re}\left(\nabla^{N-1}(\partial_t \bar u)\left[3|u|^4 \nabla^{N-1} u + 2u^2|u|^2 \nabla^{N-1} \bar u + \sum_{l_1 + \cdots + l_5 = N-1,\ l_j \le N-2} C_{l_1 \cdots l_5} \nabla^{l_1} u \cdots \nabla^{l_5} u\right]\right)$$

$$= \frac32 |u|^4 \partial_t |\nabla^{N-1} u|^2 + \mathrm{Re}\left(u^2 |u|^2 \partial_t (\nabla^{N-1} \bar u)^2\right) + \text{terms of type } \nabla^{N-1}(\partial_t \bar u) \nabla^{l_1} u \cdots \nabla^{l_5} u \ \text{with } l_1 + \cdots + l_5 = N - 1,\ l_j \le N - 2.$$


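We now integrate by parts in time. Let

$$A = A_1 + A_2 = 3 \int |\nabla^{N-1} u|^2 \partial_t |u|^4 + 2\,\mathrm{Re} \int (\nabla^{N-1} \bar u)^2 \partial_t (u^2 |u|^2), \tag{42}$$

then we get:

$$\left|\frac{d}{dt} E_N(u)\right| \le |A| + C \int |\nabla^{N-2}(\partial_t u)| |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| \le |A| + C \int \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u|,$$

where $(l_m)_{1 \le m \le 5}$, $(j_m)_{1 \le m \le 5}$ satisfy (40). It remains to treat $A_1$, $A_2$ in (42). For $A_1$, we use (1) and keep integrating by parts to get:

$$|A_1| = 12 \left|\int |\nabla^{N-1} u|^2 |u|^2 \mathrm{Re}(\bar u \partial_t u)\right| = 12 \left|\int |\nabla^{N-1} u|^2 |u|^2 \mathrm{Im}\left(\bar u (\Delta u + u|u|^4)\right)\right|$$

$$\le C \int |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2 + C \int \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u|.$$

$A_2$ is treated similarly and (39) follows. This concludes the proof of Lemma 1.

Remark 3. We have gotten rid of the term containing a priori $|\nabla^N u|^2$ in the RHS of (41) which would have been uncontrollable. Note that we may assume without loss of generality that the indices in (40) satisfy:

$$l_1 \ge \cdots \ge l_5, \quad \text{and} \quad j_1 \ge \cdots \ge j_5. \tag{43}$$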

We now split the integral in the RHS of (39) in two zones: $r \le \frac14$, $r \ge \frac14$. In the next two steps, we evaluate the RHS in each zone.

Step 2. Control of the RHS of (39) for $r \ge \frac14$. In this zone, we use the radial assumption which ensures a better control of the nonlinearity using the one dimensional Sobolev embeddings. One should have in mind that the estimates in this zone include the singular part of the solution and hence the control we derive is sharp up to small $\delta$ losses, and relies on the fact that the blow up rate is of almost self similar nature. We claim:

$$\int_{r \ge \frac14} \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| + \int_{r \ge \frac14} |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2 \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{4N-5}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N}. \tag{44}$$

Remark 4. The key here is that in (44):

$$\frac{4N - 5}{2(N - 1)} < 2.$$

This will allow us to strictly improve the constant in front of $\delta$ in (32). This is in particular a consequence of the gain pointed out in Remark 3.
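The inequality of Remark 4 is elementary, but the size of the margin is what matters for the bootstrap. The short sketch below, in exact rational arithmetic, tabulates the power of $1/\lambda^\delta$ in (44) and the margin $2 - \frac{4N-5}{2(N-1)} = \frac{1}{2(N-1)}$; the function names are ours and purely illustrative.

```python
from fractions import Fraction

def delta_exponent(N: int) -> Fraction:
    """Power of 1/lambda^delta on the right-hand side of (44)."""
    return Fraction(4 * N - 5, 2 * (N - 1))

def gain(N: int) -> Fraction:
    """Margin 2 - (4N-5)/(2(N-1)) left to improve the constant in front of delta in (32)."""
    return 2 - delta_exponent(N)

# Tabulate the exponent and the margin for a few dimensions.
for N in range(3, 9):
    print(N, delta_exponent(N), gain(N))
```

The margin is positive for every $N \ge 2$ but shrinks like $1/N$, which is consistent with $\delta = \delta(N)$ being chosen depending on the dimension.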


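Proof of (44). First estimate from Hölder:

$$\int_{r \ge \frac14} \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| + \int_{r \ge \frac14} |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2$$

$$\le |\nabla^N u|_{L^2} |\nabla^{l_1} u|_{L^2} |\nabla^{l_2} u|_{L^\infty(r \ge \frac14)} \cdots |\nabla^{l_5} u|_{L^\infty(r \ge \frac14)}$$

$$+ |\nabla^{j_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{j_m} u|_{L^\infty(r \ge \frac14)}\, |\nabla^{l_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{l_m} u|_{L^\infty(r \ge \frac14)}$$

$$+ |\nabla^{N-1} u|^2_{L^2} \left(|\nabla u|^2_{L^\infty(r \ge \frac14)} + |u|^6_{L^\infty(r \ge \frac14)}\right) |u|^2_{L^\infty(r \ge \frac14)}. \tag{45}$$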

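We now claim the following lemma:

Lemma 2. Let $n_1 \ge \cdots \ge n_k \ge 1$, $n_{k+1} = \cdots = n_5 = 0$ such that $n_1 + \cdots + n_k = N - p$, with $1 \le k \le 5$ and $0 \le p \le N - 1$. Then:

$$|\nabla^{n_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{n_m} u|_{L^\infty(r \ge \frac14)} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2(N-p-1)-(k-1)}{2(N-1)}} \left(\frac1\lambda\right)^{2+N-p}.$$

Proof of Lemma 2. Away from the origin, $u$ is radial so we may use Sobolev as if we were in dimension 1, see for example [20]. This yields:

$$|\nabla^{n_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{n_m} u|_{L^\infty(r \ge \frac14)} \lesssim |\nabla^N u|_{L^2}^{\frac{n_1 - 1}{N-1}} |\nabla u|_{L^2}^{1 - \frac{n_1 - 1}{N-1}} \prod_{m=2}^k |\nabla^N u|_{L^2}^{\frac{n_m - 1/2}{N-1}} |\nabla u|_{L^2}^{1 - \frac{n_m - 1/2}{N-1}} \; |\nabla u|_{L^2}^{\frac{5-k}{2}} |u|_{L^2}^{\frac{5-k}{2}}$$

$$\le |\nabla^N u|_{L^2}^{\frac{N-p-1}{N-1} - \frac{k-1}{2(N-1)}} |\nabla u|_{L^2}^{\frac{p}{N-1} + (k-1)\left(1 + \frac{1}{2(N-1)}\right) + \frac{5-k}{2}} |u|_{L^2}^{\frac{5-k}{2}}.$$

Now we use the fact that as long as (34) holds, the log-log analysis ensures the control

$$|\nabla u|_{L^2} \le \frac{C}{\lambda} \tag{46}$$

with constant $C$ universal. We thus get from the $L^2$ conservation and (32):

$$|\nabla^{n_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{n_m} u|_{L^\infty(r \ge \frac14)} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2(N-p-1)-(k-1)}{2(N-1)}} \left(\frac1\lambda\right)^{2+N-p}.$$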

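This concludes the proof of Lemma 2.

We now may estimate the terms in the RHS of (45). For the first term, which is the most dangerous one, we use Lemma 2 with $p = 0$ and $k \ge 2$ from (40). This yields using also (32):

$$|\nabla^N u|_{L^2} |\nabla^{l_1} u|_{L^2} |\nabla^{l_2} u|_{L^\infty(r \ge \frac14)} \cdots |\nabla^{l_5} u|_{L^\infty(r \ge \frac14)} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{1 + \frac{2(N-1)-(2-1)}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N} = \left(\frac{1}{\lambda^\delta}\right)^{\frac{4N-5}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N}.$$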
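The exponent bookkeeping in Lemma 2 and in the estimate of the first term is easy to get wrong by hand. The sketch below replays it in exact arithmetic: it collects the powers of $|\nabla^N u|_{L^2}$ and $|\nabla u|_{L^2}$ produced by the one dimensional interpolation, inserts the bounds (32) and (46), and compares with the powers of $1/\lambda^\delta$ and $1/\lambda$ stated in Lemma 2. All names are illustrative, and the $L^2$ norm of $u$ is treated as a constant, as the $L^2$ conservation law allows.

```python
from fractions import Fraction

def lemma2_exponents(N: int, p: int, k: int):
    """Powers (a, b) of |grad^N u|_{L^2} and |grad u|_{L^2} after interpolation."""
    a = Fraction(N - p - 1, N - 1) - Fraction(k - 1, 2 * (N - 1))
    b = (Fraction(p, N - 1)
         + (k - 1) * (1 + Fraction(1, 2 * (N - 1)))
         + Fraction(5 - k, 2))
    return a, b

def lambda_powers(N: int, p: int, k: int):
    """Insert |grad^N u| <= lambda^-(N+delta) and |grad u| <= C/lambda.

    Returns (power of 1/lambda^delta, power of 1/lambda)."""
    a, b = lemma2_exponents(N, p, k)
    return a, N * a + b

def check_lemma2(N_max: int = 12) -> bool:
    """Compare with the exponents stated in Lemma 2 for all admissible (N, p, k)."""
    for N in range(3, N_max + 1):
        for p in range(0, N):
            for k in range(1, 6):
                if k > N - p:
                    continue  # n_1 + ... + n_k = N - p needs each n_m >= 1
                d_pow, plain = lambda_powers(N, p, k)
                if d_pow != Fraction(2 * (N - p - 1) - (k - 1), 2 * (N - 1)):
                    return False
                if plain != 2 + N - p:
                    return False
    return True

def first_term_delta_power(N: int) -> Fraction:
    """Lemma 2 with p = 0, k = 2, times |grad^N u|_{L^2} <= lambda^-(N+delta)."""
    return 1 + lambda_powers(N, 0, 2)[0]
```

For instance, `first_term_delta_power(N)` reproduces the exponent $\frac{4N-5}{2(N-1)}$ of (44).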


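Similarly, we estimate for the second term:

$$|\nabla^{j_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{j_m} u|_{L^\infty(r \ge \frac14)}\, |\nabla^{l_1} u|_{L^2} \prod_{m=2}^5 |\nabla^{l_m} u|_{L^\infty(r \ge \frac14)}$$

$$\lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2(N-2-1)-(1-1)}{2(N-1)}} \left(\frac1\lambda\right)^{2+N-2} \left(\frac{1}{\lambda^\delta}\right)^{\frac{2(N-0-1)-(2-1)}{2(N-1)}} \left(\frac1\lambda\right)^{2+N} = \left(\frac{1}{\lambda^\delta}\right)^{\frac{4N-9}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N}.$$

For the third term:

$$|\nabla^{N-1} u|^2_{L^2} |\nabla u|^2_{L^\infty(r \ge \frac14)} |u|^2_{L^\infty(r \ge \frac14)} \lesssim \left(|\nabla^N u|_{L^2}^{\frac{N-2}{N-1}} |\nabla u|_{L^2}^{\frac{1}{N-1}}\right)^2 \left(|\nabla^N u|_{L^2}^{\frac{1}{2(N-1)}} |\nabla u|_{L^2}^{1 - \frac{1}{2(N-1)}}\right)^2 |\nabla u|_{L^2} |u|_{L^2}$$

$$\lesssim |\nabla^N u|_{L^2}^{\frac{2N-3}{N-1}} |\nabla u|_{L^2}^{\frac{3N-2}{N-1}} |u|_{L^2} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2N-3}{N-1}} \left(\frac1\lambda\right)^{2+2N}.$$

For the last term:

$$|\nabla^{N-1} u|^2_{L^2} |u|^8_{L^\infty(r \ge \frac14)} \lesssim \left(|\nabla^N u|_{L^2}^{\frac{N-2}{N-1}} |\nabla u|_{L^2}^{\frac{1}{N-1}}\right)^2 |\nabla u|^4_{L^2} |u|^4_{L^2} \lesssim |\nabla^N u|_{L^2}^{\frac{2N-4}{N-1}} |\nabla u|_{L^2}^{\frac{4N-2}{N-1}} |u|^4_{L^2} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2N-4}{N-1}} \left(\frac1\lambda\right)^{2+2N}.$$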

Injecting these estimates into (45) concludes the proof of (44).

Step 3. Control of the RHS of (39) for $r \le \frac14$. In this zone, Sobolev is much worse, but the singularity is absent. So the key is to reboot the much better estimates on $u$ (33)-(34). We claim:

$$\int_{r \le \frac14} \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| + \int_{r \le \frac14} |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2 \le \left(\frac{1}{\lambda^\delta}\right)^{C(N)} \left(\frac1\lambda\right)^{2N}. \tag{47}$$

Remark 5. Contrary to the estimate (44) and Remark 4, the constant $C(N)$ in front of $\delta$ in (47) is here irrelevant in the sense that a strict gain of almost $\lambda^2$ is obtained from (47) compared to (44) as soon as $\delta > 0$ is chosen small enough. This reflects the fact that the singular part of the solution is absent from this zone.

Proof of (47). First remark that the interpolation of (32) and (33) with $k = N$ yields:

$$|u|_{H^s(r \le \frac14)} \le \frac{1}{\lambda^{2s - N + (1 + 2N - 2s)\delta}}, \quad \text{for all } s \in \mathbb{R} \text{ such that } \frac N2 \le s \le N. \tag{48}$$
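The exponent in (48) can be recovered by linear interpolation of the exponents of (32) (regularity $N$) and of (33) with $k = N$ (regularity $N/2$). The following sketch checks this in exact arithmetic; it only tracks exponents of $1/\lambda$ and ignores the localization in $r$ and any multiplicative constants. The function names are illustrative.

```python
from fractions import Fraction

def interpolated_exponent(N: int, s: Fraction, delta: Fraction) -> Fraction:
    """Exponent of 1/lambda obtained by interpolating H^N and H^{N/2} at regularity s."""
    theta = (2 * s - N) / N        # s = theta * N + (1 - theta) * N / 2
    top = N + delta                # exponent in (32)
    bottom = (1 + N) * delta       # exponent in (33) with k = N
    return theta * top + (1 - theta) * bottom

def claimed_exponent(N: int, s: Fraction, delta: Fraction) -> Fraction:
    """Exponent of 1/lambda stated in (48)."""
    return 2 * s - N + (1 + 2 * N - 2 * s) * delta

def check_48(delta: Fraction = Fraction(1, 100)) -> bool:
    """Compare both exponents on a grid of N/2 <= s <= N for several dimensions."""
    for N in range(3, 9):
        s = Fraction(N, 2)
        while s <= N:
            if interpolated_exponent(N, s, delta) != claimed_exponent(N, s, delta):
                return False
            s += Fraction(1, 8)
    return True
```

At the endpoints $s = N$ and $s = N/2$ the formula returns $N + \delta$ and $(1+N)\delta$, that is the exponents of (32) and of (33) with $k = N$.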


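We define $(p_m, q_m, r_m)_{1 \le m \le 5}$ by:

$$p_m = \frac{2N}{l_m}, \quad q_m = \frac{2N^2}{(N+1) l_m}, \quad r_m = \frac{2N(N-2)}{(N-1) j_m}, \quad \text{for } m = 1, \ldots, 5.$$

Observe from (40) that

$$p_m, q_m, r_m \ge 2, \ m = 1, \ldots, 5, \qquad \frac{1}{p_1} + \cdots + \frac{1}{p_5} = \frac12, \qquad \frac{1}{q_1} + \cdots + \frac{1}{q_5} + \frac{1}{r_1} + \cdots + \frac{1}{r_5} = 1.$$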
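The Hölder relations just stated follow from (40) by direct summation. As an illustration, the sketch below enumerates all admissible index tuples for small $N$ and checks both sums, as well as $p_m, q_m, r_m \ge 2$, in exact arithmetic. We work with the inverse exponents so that a vanishing index (infinite Lebesgue exponent) needs no special treatment; the helper names are ours.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def index_tuples(total: int, cap: int):
    """All non-increasing 5-tuples of integers in [0, cap] summing to total."""
    return [t[::-1] for t in combinations_with_replacement(range(cap + 1), 5)
            if sum(t) == total]

def inverse_exponents(N: int, l, j):
    """Lists (1/p_m), (1/q_m), (1/r_m) for index tuples l and j."""
    inv_p = [Fraction(lm, 2 * N) for lm in l]
    inv_q = [Fraction((N + 1) * lm, 2 * N * N) for lm in l]
    inv_r = [Fraction((N - 1) * jm, 2 * N * (N - 2)) for jm in j]
    return inv_p, inv_q, inv_r

def holder_relations_hold(N: int) -> bool:
    """Check both Hölder sums and p_m, q_m, r_m >= 2 for every admissible (l, j)."""
    half = Fraction(1, 2)
    for l in index_tuples(N, N - 1):          # l_1 + ... + l_5 = N, l_m <= N - 1
        for j in index_tuples(N - 2, N - 2):  # j_1 + ... + j_5 = N - 2
            inv_p, inv_q, inv_r = inverse_exponents(N, l, j)
            if sum(inv_p) != half or sum(inv_q) + sum(inv_r) != 1:
                return False
            if any(x > half for x in inv_p + inv_q + inv_r):
                return False
    return True
```

The check is only meaningful for $N \ge 3$, since $r_m$ involves the factor $N - 2$.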

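This yields using the Sobolev embedding $H^{\frac N2 + \delta}(r \le \frac12) \to L^\infty(r \le \frac12)$:

$$\int_{r \le \frac14} \left(|\nabla^N u| + |\nabla^{j_1} u| \cdots |\nabla^{j_5} u|\right) |\nabla^{l_1} u| \cdots |\nabla^{l_5} u| + \int_{r \le \frac14} |\nabla^{N-1} u|^2 (|\nabla u|^2 + |u|^6) |u|^2$$

$$\le |\nabla^N u|_{L^2} |\nabla^{l_1} u|_{L^{p_1}(r \le \frac14)} \cdots |\nabla^{l_5} u|_{L^{p_5}(r \le \frac14)}$$

$$+ |\nabla^{j_1} u|_{L^{r_1}(r \le \frac14)} \cdots |\nabla^{j_5} u|_{L^{r_5}(r \le \frac14)} |\nabla^{l_1} u|_{L^{q_1}(r \le \frac14)} \cdots |\nabla^{l_5} u|_{L^{q_5}(r \le \frac14)}$$

$$+ |\nabla^{N-1} u|^2_{L^2(r \le \frac14)} \left(|u|^2_{H^{\frac N2 + 1 + \delta}(r \le \frac14)} + |u|^6_{H^{\frac N2 + \delta}(r \le \frac14)}\right) |u|^2_{H^{\frac N2 + \delta}(r \le \frac14)}. \tag{49}$$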

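We now estimate the four terms in the RHS of (49). For the first term, we observe from the Sobolev embedding $H^{\frac{N - l_m}{2} + \delta}(r \le \frac14) \to L^{p_m}(r \le \frac14)$ together with (48) that:

$$|\nabla^N u|_{L^2} |\nabla^{l_1} u|_{L^{p_1}(r \le \frac14)} \cdots |\nabla^{l_5} u|_{L^{p_5}(r \le \frac14)} \lesssim |u|_{H^N} \prod_{m=1}^5 |u|_{H^{\frac{N + l_m}{2} + \delta}(r \le \frac14)} \lesssim \frac{1}{\lambda^{2N + C(N)\delta}}.$$

For the second term, we have from the Sobolev embeddings $H^{\frac{N - j_m}{2} - \frac{j_m}{2(N-2)} + \delta}(r \le \frac14) \to L^{r_m}(r \le \frac14)$ and $H^{\frac{N - l_m}{2} - \frac{l_m}{2N} + \delta}(r \le \frac14) \to L^{q_m}(r \le \frac14)$ together with (48) that:

$$|\nabla^{j_1} u|_{L^{r_1}(r \le \frac14)} \cdots |\nabla^{j_5} u|_{L^{r_5}(r \le \frac14)} |\nabla^{l_1} u|_{L^{q_1}(r \le \frac14)} \cdots |\nabla^{l_5} u|_{L^{q_5}(r \le \frac14)}$$

$$\lesssim \prod_{m=1}^5 |u|_{H^{\frac{N + j_m}{2} - \frac{j_m}{2(N-2)} + \delta}(r \le \frac14)} \prod_{m=1}^5 |u|_{H^{\frac{N + l_m}{2} - \frac{l_m}{2N} + \delta}(r \le \frac14)} \lesssim \frac{1}{\lambda^{2N - 4 + C(N)\delta}}.$$

For the third term, we have by (48):

$$|\nabla^{N-1} u|^2_{L^2(r \le \frac14)} |u|^2_{H^{\frac N2 + 1 + \delta}(r \le \frac14)} |u|^2_{H^{\frac N2 + \delta}(r \le \frac14)} \lesssim \frac{1}{\lambda^{2N + C(N)\delta}}.$$

Eventually, the last term is controlled by:

$$|\nabla^{N-1} u|^2_{L^2(r \le \frac14)} |u|^8_{H^{\frac N2 + \delta}(r \le \frac14)} \lesssim \frac{1}{\lambda^{2N - 4 + C(N)\delta}}.$$

Injecting these estimates into (49) concludes the proof of (47).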


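Step 4. Time integration using the almost self similar blow up speed. Injecting (44) and (47) into (39) yields:

$$\left|\frac{d}{dt} E_N(u)\right| \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{4N-5}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N} + \left(\frac{1}{\lambda^\delta}\right)^{C(N)} \left(\frac1\lambda\right)^{2N} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{4N-5}{2(N-1)}} \left(\frac1\lambda\right)^{2+2N}$$

for $\delta < \delta^*(N)$ small enough. We integrate on $[0, t]$:

$$|E_N(u(t))| \lesssim |E_N(u(0))| + \int_0^t \frac{d\tau}{[\lambda(\tau)]^{2 + 2N + \frac{(4N-5)\delta}{2(N-1)}}}. \tag{50}$$

We now estimate the time integral in the RHS of (50) thanks to the knowledge of the blow up speed. We claim:

$$\int_0^t \frac{d\tau}{\lambda(\tau)^\mu} \le \begin{cases} C(\mu) & \text{for } \mu < 2, \\ \dfrac{|\log(\lambda(t))|^{101}}{\lambda(t)^{\mu - 2}} & \text{for } \mu \ge 2. \end{cases} \tag{51}$$

Indeed, from (29):

$$\int_0^t \frac{d\tau}{\lambda(\tau)^\mu} = \int_{s_0}^s \lambda(\sigma)^{2 - \mu}\, d\sigma.$$

Equation (30) implies $s \le |\log(\lambda)|^{100}$ and $\lambda(s) \le e^{-s^{\frac{1}{100}}}$, and hence (51) follows for $\mu < 2$. For $\mu \ge 2$, we also use the almost monotonicity (31) to derive:

$$\int_{s_0}^s \lambda(\sigma)^{2 - \mu}\, d\sigma \le C \frac{s - s_0}{\lambda(s)^{\mu - 2}} \le \frac{|\log(\lambda(t))|^{101}}{\lambda(t)^{\mu - 2}},$$

and (51) follows. We now inject (51) with $\mu = 2 + 2N + \frac{(4N-5)\delta}{2(N-1)}$ into (50) to derive the bound:

$$|E_N(u(t))| \lesssim |E_N(u(0))| + \frac{|\log(\lambda(t))|^{101}}{[\lambda(t)]^{2N + \frac{(4N-5)\delta}{2(N-1)}}}. \tag{52}$$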

Step 5. Subcriticality of the $H^N$ energy and reboot of (35). We are now in position to conclude the proof of (35) which relies on the fact that the $H^N$ energy is in some sense $H^N$ subcritical. Indeed, recall the definition of $E_N$ (38) and let us split the space in two regions again. For $r \ge \frac14$,

$$\int_{r \ge \frac14} |\nabla^{N-1} u|^2 |u|^4 \lesssim |\nabla^{N-1} u|^2_{L^2} |u|^4_{L^\infty(r \ge \frac14)} \lesssim |\nabla^{N-1} u|^2_{L^2} |\nabla u|^2_{L^2} |u|^2_{L^2} \lesssim |\nabla^N u|_{L^2}^{\frac{2(N-2)}{N-1}} |\nabla u|_{L^2}^{\frac{2N}{N-1}} |u|^2_{L^2} \lesssim \left(\frac{1}{\lambda^\delta}\right)^{\frac{2N-4}{N-1}} \frac{1}{\lambda^{2N}}.$$


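For $r \le \frac14$,

$$\int_{r \le \frac14} |\nabla^{N-1} u|^2 |u|^4 \lesssim |u|^4_{H^{\frac N2 + \delta}(r \le \frac14)} |\nabla^{N-1} u|^2_{L^2(r \le \frac14)} \lesssim \frac{1}{\lambda^{2N - 4 + C(N)\delta}}.$$

Hence from (38):

$$\int |\nabla^N u(t)|^2 \lesssim |E_N(u(t))| + \left(\frac{1}{\lambda(t)}\right)^{2N + \frac{(2N-4)\delta}{N-1}}. \tag{53}$$

This also implies at time $t = 0$ from (21), (24), (27) and the almost monotonicity (31):

$$|E_N(u(0))| \lesssim \int |\nabla^N u(0)|^2 + \left(\frac{1}{\lambda(0)}\right)^{2N + \frac{(2N-4)\delta}{N-1}} \le \left(\frac{1}{\lambda(0)}\right)^{2N} + \left(\frac{1}{\lambda(0)}\right)^{2N + \frac{(2N-4)\delta}{N-1}} \lesssim \left(\frac{1}{\lambda(t)}\right)^{2N + \frac{(2N-4)\delta}{N-1}}.$$

Injecting this together with (53) into (52) yields:

$$\int |\nabla^N u(t)|^2 \lesssim \left(\frac{1}{\lambda(t)}\right)^{2N + \frac{(2N-4)\delta}{N-1}} + \frac{|\log(\lambda(t))|^{101}}{[\lambda(t)]^{2N + \frac{(4N-5)\delta}{2(N-1)}}} \le \frac{1}{2\lambda^{2N + 2\delta}},$$

for $\alpha^*$ and $\delta$ small enough. This concludes the proof of (35).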
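The last inequality closes because both powers of $1/\lambda^\delta$ on the left are strictly below 2, so that the logarithmic loss $|\log\lambda|^{101}$ is absorbed once $\lambda$ is small enough. The sketch below quantifies this in logarithmic variables, with $L = |\log\lambda|$, to avoid underflow. It ignores the implicit constants in $\lesssim$, so the thresholds it suggests are purely indicative; the names are illustrative.

```python
import math
from fractions import Fraction

def slack(N: int) -> Fraction:
    """Coefficient of delta gained on the dominant term: 2 - (4N-5)/(2(N-1))."""
    return 2 - Fraction(4 * N - 5, 2 * (N - 1))

def log_ratio(N: int, delta: float, L: float) -> float:
    """Logarithm of the ratio

        |log lam|^101 * lam^-(2N + (4N-5) delta / (2(N-1)))
        ---------------------------------------------------
                  (1/2) * lam^-(2N + 2 delta)

    expressed through L = |log lam|. A negative value means the
    logarithmic loss is absorbed by the gain in the power of delta.
    """
    return 101.0 * math.log(L) - float(slack(N)) * delta * L + math.log(2.0)

# Example: N = 3, delta = 0.01. The loss is absorbed for L = 1e6 but not for L = 1e5.
absorbed_small_lambda = log_ratio(3, 0.01, 1e6) < 0
absorbed_larger_lambda = log_ratio(3, 0.01, 1e5) < 0
```

In the log-log regime $|\log\lambda| \ge e^{\frac{\pi}{10 b}}$ by (30), so the required size of $L$ is reached by taking $b$, hence $\alpha^*$, small enough depending on $\delta$ and $N$.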

2.3. A first improved bound away from the origin. Observe that the interpolation between (46) and (32) provides a first global bound:

$$|\nabla^l u|_{L^2} \le \frac{1}{\lambda^{l + \frac{l-1}{N-1}\delta}}, \quad 1 \le l \le N. \tag{54}$$

Our aim in this section is to show that (54), the knowledge of the blow up speed and the initial smallness (27) imply in fact better bounds in annuli away from the origin, corresponding to a gain of 1/2 a derivative away from the singularity (and the origin for the moment). The key to the proof of this initial gain of regularity outside the singularity is first the log-log dispersive estimate on the excess of mass (17), and then interior regularity type of estimates for the linear Schrödinger flow.

Lemma 3 (Improved bounds away from the origin and the singularity). There holds: $\forall\, 1 \le k \le 2N - 1$,

$$|u|_{H^{N - \frac k2}(\frac18 \le r \le \frac78)} \le \frac{1}{\lambda^{N - \frac k2 - \frac12 + \frac32 \delta}}. \tag{55}$$


Proof of Lemma 3. Step 1. The interior regularity estimate. Let us start with the following interior regularity estimate which follows from standard energy techniques. Let $I = [0, t]$. Let $c_2 < c_1 < c_1' < c_2'$ and a cut off function $\chi$ which is one on $[c_1, c_1']$ and zero outside of $[c_2, c_2']$. We claim: $\forall s \ge \frac12$,

$$|u|_{L^\infty_I H^s(c_1 \le r \le c_1')} \lesssim |u(0)|_{H^s(c_2 \le r \le c_2')} + |u|_{L^2_I H^{s + \frac12}(c_2 \le r \le c_2')} + |\chi u |u|^4|_{L^1_I H^s}. \tag{56}$$

Proof of (56). Let

$$v = \chi u, \tag{57}$$

then:

$$i\partial_t v + \Delta v = u \Delta\chi + 2\nabla\chi \cdot \nabla u - \chi u |u|^4. \tag{58}$$

We compute from (58):

$$\frac12 \frac{d}{dt} |\nabla^s v|^2_{L^2} = \mathrm{Im} \int \nabla^s\left(u \Delta\chi + 2\nabla\chi \cdot \nabla u - \chi u |u|^4\right) \overline{\nabla^s v}. \tag{59}$$

The first term is of lower order and left to the reader. For the second term, we integrate by parts and use standard commutation estimates and (57) to derive:

$$\left|\int \nabla^s(\nabla\chi \cdot \nabla u) \overline{\nabla^s v}\right| \lesssim |\nabla^{s + \frac12} v|_{L^2} |\nabla^{s - \frac12}(\nabla\chi \cdot \nabla u)|_{L^2} \lesssim |u|^2_{H^{s + \frac12}(c_2, c_2')}.$$

The nonlinear term is estimated through the standard energy estimate:

$$\left|\int \nabla^s(\chi u |u|^4) \overline{\nabla^s v}\right| \lesssim |\nabla^s(\chi u |u|^4)|_{L^2} |\nabla^s v|_{L^2}.$$

The time integration of (59) now easily yields (56).

Step 2. Case $k = 2N - 1$: the initial gain of 1/2 a derivative. Let us prove (55) for $k = 2N - 1$, that is the initial gain of 1/2 a derivative away from the singular circle:

$$|u|_{L^\infty_I H^{\frac12}(\frac19 \le r \le \frac89)} \le \frac{1}{\lambda^{\frac32 \delta}}. \tag{60}$$

Such an estimate was first derived in [29] and relies on the regularity of the excess of mass $\tilde u$, which is a non trivial consequence of the log-log analysis. Indeed, we claim:

$$\int_0^{t_1} |\nabla \tilde u(t)|^2\, dt \le \theta(\alpha^*) \tag{61}$$

with $\theta(\alpha^*) \to 0$ as $\alpha^* \to 0$, and refer to [25,29] for a proof. Note that the integrability (61) is clearly false for the full solution $u$ from (9). This estimate is the key to separate between the singular and the regular part of the solution, see [25,26,29]. It is important to notice that the constant in the RHS of (61) does not depend on the bootstrap constants


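(32), (33), (34). The bootstrap assumption is used in this step only to ensure the global smallness of the critical $H^{\frac{N-1}{2}}$ norm in order to ensure the key control (13). Let now

$$\nu = \frac12 - \frac{\delta}{2(1 - \delta)} \quad \text{so that} \quad \frac{1 - 2\nu}{2(1 - \nu)} = \delta; \tag{62}$$

we claim

$$|u|_{L^\infty_I H^\nu(\frac19 \le r \le \frac89)} \le 1. \tag{63}$$

Assume (63). Then the interpolation between (63) and (46) yields:

$$|u(t)|_{H^{\frac12}(\frac19 \le r \le \frac89)} \lesssim |u(t)|^{\frac{1}{2(1-\nu)}}_{H^\nu(\frac19 \le r \le \frac89)} |u(t)|^{\frac{1 - 2\nu}{2(1-\nu)}}_{H^1} \lesssim \frac{1}{\lambda(t)^\delta},$$

where we used (62) in the last step, and (60) follows.

Remark 6. Interpolating between (60) and (46) yields:

$$|u(t)|_{H^{\frac12 + \delta}(\frac19 \le r \le \frac89)} \lesssim |u(t)|^{1 - 2\delta}_{H^{\frac12}(\frac19 \le r \le \frac89)} |u(t)|^{2\delta}_{H^1} \lesssim \left(\frac{1}{\lambda(t)}\right)^{\frac72 \delta - 3\delta^2},$$

which together with the one dimensional Sobolev embedding $H^{\frac12 + \delta} \to L^\infty$ yields:

$$|u|_{L^\infty(\frac19 \le r \le \frac89)} \le \frac{1}{\lambda^{\frac72 \delta}}. \tag{64}$$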
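The choice of $\nu$ in (62) and the exponent of Remark 6 are both instances of linear interpolation between Sobolev exponents. The following sketch checks them in exact arithmetic; it tracks only the powers of $1/\lambda$ coming from (46) and (60) and uses names of our own choosing.

```python
from fractions import Fraction

def nu(delta: Fraction) -> Fraction:
    """Regularity index of (62)."""
    return Fraction(1, 2) - delta / (2 * (1 - delta))

def h1_weight(nu_value: Fraction) -> Fraction:
    """Power of |u|_{H^1} when H^{1/2} is interpolated between H^nu and H^1."""
    return (1 - 2 * nu_value) / (2 * (1 - nu_value))

def remark6_exponent(delta: Fraction) -> Fraction:
    """Power of 1/lambda in Remark 6.

    H^{1/2 + delta} sits between H^{1/2} (weight 1 - 2 delta, bounded by
    lambda^{-3 delta / 2} from (60)) and H^1 (weight 2 delta, bounded by
    C / lambda from (46)).
    """
    return (1 - 2 * delta) * Fraction(3, 2) * delta + 2 * delta

deltas = [Fraction(1, n) for n in (10, 50, 100, 1000)]
```

Since $\frac72\delta - 3\delta^2 < \frac72\delta$ and $\lambda < 1$, the bound of Remark 6 is indeed controlled by the right-hand side of (64).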

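Proof of (63). We briefly recall the proof from [29]. We take a cut off function $\chi$ which is one on $\left[\frac19, \frac89\right]$ and zero outside of $\left[\frac{1}{10}, \frac{9}{10}\right]$. We then define $v$ as in (57) which satisfies (58). Because $\nu < \frac12$, we argue slightly differently with respect to the proof of (56). Recall (59). We estimate:

$$\left|\int \nabla^\nu(\nabla\chi \cdot \nabla u) \overline{\nabla^\nu v}\right| \lesssim |\nabla^{2\nu} v|_{L^2} |u|_{H^1} \lesssim |u|^2_{H^1}. \tag{65}$$

For the nonlinear term, we use the definition of $v$ and the energy estimate:

$$\left|\int \nabla^\nu(\chi u |u|^4) \overline{\nabla^\nu v}\right| \lesssim |\nabla^\nu(v |u|^4)|_{L^2} |\nabla^\nu v|_{L^2}.$$

We now let

$$q = \frac1\nu > 2 \quad \text{and} \quad \frac1p = \frac12 - \frac1q$$

and estimate from standard commutation estimates, [6], and the one dimensional Sobolev embeddings $H^\nu \to L^p$, $\nu < \frac12$, $H^{\frac12} \to W^{\nu, q}$:

$$|\nabla^\nu(v |u|^4)|_{L^2} \lesssim |\nabla^\nu v|_{L^2} |u|^4_{L^\infty(\frac{1}{10} \le r \le \frac{9}{10})} + |v|_{L^p} |u|^3_{L^\infty(\frac{1}{10} \le r \le \frac{9}{10})} |u|_{W^{\nu, q}(\frac{1}{10} \le r \le \frac{9}{10})}$$

$$\lesssim |\nabla^\nu v|_{L^2} |\tilde u|^2_{H^1} + |\nabla^\nu v|_{L^2} |\tilde u|^{\frac32}_{H^1} |\tilde u|_{H^{\frac12}(\frac{1}{10} \le r \le \frac{9}{10})} \lesssim |\nabla^\nu v|_{L^2} (1 + |\tilde u|^2_{H^1}).$$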


Injecting this and (65) into (59) and using the $L^2$ conservation law yields the pointwise differential inequality:
$$\frac{d}{dt}|\nabla^\nu v|^2_{L^2} \lesssim |\tilde u|^2_{H^1}\left(1+|\nabla^\nu v|^2_{L^2}\right).$$
Integrating this on $[0,t]$ and using Gronwall's Lemma yields:
$$|\nabla^\nu v|^2_{L^\infty_I L^2} \lesssim \left(|\tilde u(0)|^2_{H^1(\frac1{10}\le r\le\frac9{10})} + \int_0^{t_1}|\nabla\tilde u(t)|_{L^2}^2\,dt\right) e^{\int_0^{t_1}|\nabla\tilde u(t)|_{L^2}^2\,dt} \le 1$$
from (27) and (61). This concludes the proof of (63).

Step 3. The case $1\le k\le 2N-2$. We now conclude the proof of (55) by propagating the initial gain of 1/2 a derivative (60) using the interior regularity estimate (56). Let $\chi$ be a cut off function which is one on $[\frac18,\frac78]$ and zero outside of $[\frac19,\frac89]$. Equation (56) with $s=N-\frac k2$ and (27) yield:
$$|\nabla^{N-\frac k2}u|_{L^\infty_I L^2(\frac18\le r\le\frac78)} \lesssim 1 + |\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2} + |\nabla^{N-\frac k2}(\chi u|u|^4)|_{L^1_I L^2}. \tag{66}$$

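From (54) and (51):
$$|\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2} \lesssim \left(\int_0^t \frac{d\tau}{[\lambda(\tau)]^{2N-k+1+2\left(1-\frac{k-1}{2(N-1)}\right)\delta}}\right)^{\frac12} \lesssim \frac{|\log(\lambda)|^{51}}{\lambda^{N-\frac k2-\frac12+\left(1-\frac{k-1}{2(N-1)}\right)\delta}} \le \frac{1}{\lambda^{N-\frac k2-\frac12+\frac54\delta}}. \tag{67}$$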

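For the nonlinear term, we use Moser's tame product estimate in Sobolev spaces (see for example [1]), (54) and (64):
$$\left|u|u|^4\right|_{H^{N-\frac k2}(\frac19\le r\le\frac89)} \lesssim |u|_{H^{N-\frac k2}(\frac19\le r\le\frac89)}\,|u|^4_{L^\infty(\frac19\le r\le\frac89)} \lesssim \frac{1}{\lambda^{N-\frac k2+C(N)\delta}}$$
and thus from (51):
$$|\nabla^{N-\frac k2}(\chi u|u|^4)|_{L^1_I L^2} \lesssim \frac{|\log(\lambda)|^{101}}{\lambda^{[N-\frac k2+C(N)\delta-2]_+}} \le \frac{1}{\lambda^{N-\frac k2-\frac12+\frac54\delta}}.$$
Injecting this together with (67) into (66) yields (55). This concludes the proof of Lemma 3.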


2.4. Proof of (36), (37). We are now in position to conclude the proof of (36), (37) by iterating the gain of 1/2 a derivative (55). Let us insist on the fact that the initial gain (55) is not restricted to the region in space defined by the bootstrap assumptions (36), (37) but actually lives on a strictly larger domain.

Step 1. Iteration of (55) away from the origin. Let us start by proving (36), (37) in the annulus $\frac14\le r\le\frac12$ by iterating (55):

Lemma 4 (Iterative gain of regularity in annuli). Let $n\ge0$, and let
$$\frac18=c_1^0<c_1^1<\dots<c_1^n<\frac14<\frac12<c_2^n<\dots<c_2^1<c_2^0=\frac78.$$
Let the sequence
$$\rho_{n+1}=\frac{2N-2}{2N-1}\,\rho_n+\frac{9}{10(2N-1)}\,\delta,\qquad \rho_0=\frac{2N-3+4\delta}{2(2N-1)}, \tag{68}$$
then there exists $C_n$ such that:
$$\forall n\ge0,\ \forall 1\le k\le 2N-2,\quad |\nabla^{N-\frac k2}u|_{L^2(c_1^n\le r\le c_2^n)} \le \frac{C_n\alpha^*}{\lambda^{[N-k+(1+k)\rho_n]_+}}. \tag{69}$$
Note that $\rho_n\to\frac9{10}\delta$ as $n\to+\infty$ and hence (36), (37) in $\frac14\le r\le\frac12$ follow by choosing $n=n(\delta)$ large enough.
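Two algebraic facts about the recursion (68) are used in this step: its limit, and the inequality that closes the induction in the proof of Lemma 4. A short verification, offered as a sketch (the shorthand $a$ and $\rho_\infty$ is introduced here and is not the authors' notation):

```latex
% Shorthand (not in the original): a = (2N-2)/(2N-1) in (0,1), \rho_\infty = (9/10)\delta.
\begin{align*}
\rho=\tfrac{2N-2}{2N-1}\,\rho+\tfrac{9}{10(2N-1)}\,\delta
 &\iff \tfrac{1}{2N-1}\,\rho=\tfrac{9}{10(2N-1)}\,\delta
 \iff \rho=\rho_\infty,\\
\rho_{n+1}-\rho_\infty = a\,(\rho_n-\rho_\infty)
 &\;\Longrightarrow\; \rho_n=\rho_\infty+a^{\,n}(\rho_0-\rho_\infty)
   \xrightarrow[n\to\infty]{}\rho_\infty .
\end{align*}
% Induction step: compare the exponent k\rho_n + (9/10)\delta with (1+k)\rho_{n+1}.
\begin{align*}
(1+k)\rho_{n+1}-k\rho_n-\tfrac{9}{10}\delta
 &=\Bigl(\tfrac{(1+k)(2N-2)}{2N-1}-k\Bigr)\rho_n
   +\Bigl(\tfrac{1+k}{2N-1}-1\Bigr)\tfrac{9}{10}\delta\\
 &=\frac{2N-2-k}{2N-1}\,\bigl(\rho_n-\rho_\infty\bigr).
\end{align*}
```

The right-hand side is nonnegative for $1\le k\le 2N-2$ as soon as $\rho_n\ge\rho_\infty$, with equality at $k=2N-2$. By the explicit formula, $\rho_n\ge\rho_\infty$ holds for every $n$ provided $\rho_0\ge\rho_\infty$, which is the case for $\delta$ small and $N\ge2$.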

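Proof of Lemma 4. First observe that the definition (68) of $\rho_0$ implies:
$$\forall 1\le k\le 2N-2,\quad 0\le N-\frac k2-\frac12+2\delta\le N-k+(1+k)\rho_0,$$
and hence (69) for $n=0$ follows from (55). We now assume (69) for $n$ and prove it for $n+1$. Let $1\le k\le 2N-2$. If $k=1$, then from (55):
$$|u|_{H^{N-\frac12}(\frac18\le r\le\frac78)} \le \frac{1}{\lambda^{N-\frac12-\frac12+\frac32\delta}} \le \frac{\alpha^*}{\lambda^{N-1+2\rho_n}}$$
from $2\rho_n\ge\frac95\delta>\frac32\delta$. For $2\le k\le 2N-2$, we take a cut off function $\chi$ which is one on $[c_1^{n+1},c_2^{n+1}]$ and zero outside of $[c_1^n,c_2^n]$. Then $v=\chi u$ satisfies (58). From (56):
$$|\nabla^{N-\frac k2}u|_{L^2(c_1^{n+1}\le r\le c_2^{n+1})} \le \alpha^* + |\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2(c_1^n\le r\le c_2^n)} + |\nabla^{N-\frac k2}(|u|^4u)|_{L^1_I L^2(c_1^n\le r\le c_2^n)}.$$
We now use the iteration assumption (69) for $n$, which applies with $k-1$ in place of $k$ since $k\ge2$, together with (51):
$$\begin{aligned}
|\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2(c_1^n\le r\le c_2^n)} &\le \left(\int_0^t\frac{d\tau}{[\lambda(\tau)]^{2[N-(k-1)+k\rho_n]_+}}\right)^{\frac12}\\
&\le \begin{cases}\dfrac{|\log(\lambda)|^{51}}{\lambda^{N-k+k\rho_n}} & \text{for } N-k+k\rho_n\ge0,\\[2mm] C_{n+1}\alpha^* & \text{for } N-k+k\rho_n<0,\end{cases}\\
&\le \frac{C_{n+1}\alpha^*}{\lambda^{[N-k+k\rho_n+\frac9{10}\delta]_+}}.
\end{aligned} \tag{70}$$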


For the nonlinear term, we first estimate:
$$|\nabla^{N-\frac k2}(|u|^4u)|_{L^2(c_1^n\le r\le c_2^n)} \le |u|_{H^{N-\frac k2}(c_1^n\le r\le c_2^n)}\,|u|^4_{L^\infty(c_1^n\le r\le c_2^n)} \lesssim \frac{1}{\lambda^{[N-k+(1+k)\rho_n]_++C(N)\delta}},$$
where we have used (64). Hence, from (51) again:
$$|\nabla^{N-\frac k2}(|u|^4u)|_{L^1_I L^2(c_1^n\le r\le c_2^n)} \le \frac{C_{n+1}\alpha^*}{\lambda^{[N-k+k\rho_n+\frac9{10}\delta]_+}}.$$
Injecting these estimates into (70) yields:
$$|\nabla^{N-\frac k2}u|_{L^2(c_1^{n+1}\le r\le c_2^{n+1})} \le \frac{C_{n+1}\alpha^*}{\lambda^{[N-k+k\rho_n+\frac9{10}\delta]_+}}.$$
Equation (69) for $n+1$ now follows from the observation that:
$$\forall 1\le k\le 2N-2,\quad N-k+k\rho_n+\frac9{10}\delta\le N-k+(1+k)\rho_{n+1}.$$

This concludes the proof of Lemma 4.

Step 2. Proof of (36), (37) near the origin. First observe that (69) implies (36), (37) for $\frac14\le r\le\frac12$. So we only have to control (36), (37) for $r\le\frac14$. Let $1\le k\le N+1$. Take a cut off function $\chi$ which is one on $[0,\frac14]$ and zero after $\frac12$, and define $v$ as $\chi u$. Then, from (56):
$$|\nabla^{N-\frac k2}u|_{L^2(r\le\frac14)} \le \alpha^* + |\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2(r\le\frac12)} + |\nabla^{N-\frac k2}(|u|^4u)|_{L^1_I L^2(r\le\frac12)}.$$
We now estimate from (33), (34) and (51):
$$\begin{aligned}
|\nabla^{N-\frac{k-1}{2}}u|_{L^2_I L^2(r\le\frac12)} &\le \left(\int_0^t\frac{d\tau}{\lambda(\tau)^{2[N-(k-1)+k\delta]_+}}\right)^{\frac12}\\
&\le \begin{cases}\dfrac{|\log(\lambda)|^{51}}{\lambda^{N-k+k\delta}} & \text{for } N-k+k\delta\ge0,\\[2mm] \alpha^* & \text{for } N-k+k\delta<0,\end{cases}\\
&\le \frac{\alpha^*}{\lambda^{[N-k+(1+k)\delta]_+}}.
\end{aligned}$$
For the nonlinear term, we first estimate from the $N$-dimensional Sobolev embedding $H^{\frac N2+\delta}\hookrightarrow L^\infty$ and the bootstrap estimate (33):
$$|u|_{L^\infty(r\le\frac12)} \le \frac{1}{\lambda^{C(N)\delta}}.$$
We thus estimate from (33), (34):
$$|\nabla^{N-\frac k2}(|u|^4u)|_{L^2(r\le\frac12)} \le |u|_{H^{N-\frac k2}(r\le\frac12)}\,|u|^4_{L^\infty(r\le\frac12)} \lesssim \frac{1}{\lambda^{[N-k]_++C(N)\delta}}$$
and hence from (51):
$$|\nabla^{N-\frac k2}(|u|^4u)|_{L^1_I L^2(r\le\frac12)} \le \frac{\alpha^*}{\lambda^{[N-k-1+C(N)\delta]_+}}.$$
These estimates conclude the proof of (36), (37) near the origin.


This concludes the proof of the bootstrap Proposition 1. Acknowledgements. Both authors are supported by the French Agence Nationale de la Recherche, ANR jeune chercheur SWAP.

References

1. Alinhac, S., Gérard, P.: Opérateurs pseudo-différentiels et théorème de Nash-Moser, Savoirs Actuels. [Current Scholarship], Paris: InterEditions, 1991
2. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations. I. Existence of a ground state. Arch. Rat. Mech. Anal. 82(4), 313–345 (1983)
3. Bourgain, J.: On growth in time of Sobolev norms of smooth solutions of nonlinear Schrödinger equations in $R^D$. J. Anal. Math. 72, 299–310 (1997)
4. Bourgain, J., Wang, W.: Construction of blowup solutions for the nonlinear Schrödinger equation with critical nonlinearity. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 25(1–2), 197–215 (1997)
5. Cazenave, Th.: Semilinear Schrödinger equations, Courant Lecture Notes in Mathematics 10, New York Univ./Courant Institute, Providence, RI: Amer. Math. Soc., 2003
6. Christ, M.: Lectures on singular integral operators. In: CBMS Regional Conference Series in Mathematics 77, Published for the Conference Board of the Mathematical Sciences, Washington, DC, Providence, RI: Amer. Math. Soc., 1990
7. Colliander, J., Raphaël, P.: Rough blow up solutions to the $L^2$ critical NLS. Preprint
8. Fibich, G., Gavish, N., Wang, X.P.: Singular ring solutions of critical and supercritical nonlinear Schrödinger equations. Physica D: Nonlinear Phenomena 231(1), 55–86 (2007)
9. Fibich, G., Merle, F., Raphaël, P.: Numerical proof of a spectral property related to the singularity formation for the $L^2$ critical nonlinear Schrödinger equation. Phys. D 220(1), 1–13 (2006)
10. Gidas, B., Ni, W.M., Nirenberg, L.: Symmetry and related properties via the maximum principle. Commun. Math. Phys. 68, 209–243 (1979)
11. Ginibre, J., Velo, G.: On a class of nonlinear Schrödinger equations. I. The Cauchy problem, general case. J. Funct. Anal. 32(1), 1–32 (1979)
12. Ginibre, J., Velo, G.: On the global Cauchy problem for some nonlinear Schrödinger equations. Ann. Inst. H. Poincaré Anal. Non Linéaire 1, 309–323 (1984)
13. Ginibre, J., Velo, G.: The global Cauchy problem for the nonlinear Schrödinger equation revisited. Ann. Inst. H. Poincaré Anal. Non Linéaire 2(4), 309–327 (1985)
14. Glassey, R.T.: On the blowing up of solutions to the Cauchy problem for nonlinear Schrödinger equations. J. Math. Phys. 18, 1794–1797 (1977)
15. Kato, T.: On the Cauchy problem for the (generalized) Korteweg de Vries equation. In: Studies in Applied Mathematics, Adv. Math. Suppl. Stud. 8, New York: Academic Press, 1983, pp. 93–128
16. Kenig, C.E., Merle, F.: Global well-posedness, scattering and blow-up for the energy-critical, focusing, non-linear Schrödinger equation in the radial case. Invent. Math. 166(3), 645–675 (2006)
17. Krieger, J., Schlag, W., Tataru, D.: Renormalization and blow up for charge one equivariant critical wave maps. Invent. Math. 171(3), 543–615 (2008)
18. Kwong, M.K.: Uniqueness of positive solutions of $\Delta u - u + u^p = 0$ in $R^n$. Arch. Rat. Mech. Anal. 105(3), 243–266 (1989)
19. Landman, M.J., Papanicolaou, G.C., Sulem, C., Sulem, P.-L.: Rate of blowup for solutions of the nonlinear Schrödinger equation at critical dimension. Phys. Rev. A (3) 38(8), 3837–3843 (1988)
20. Lions, P.-L.: Symétrie et compacité dans les espaces de Sobolev. J. Funct. Anal. 49, 315–334 (1982)
21. Martel, Y.: Asymptotic N-soliton-like solutions of the subcritical and critical generalized Korteweg-de Vries equations. Amer. J. Math. 127(5), 1103–1140 (2005)
22. Merle, F., Raphaël, P.: Blow up dynamic and upper bound on the blow up rate for critical nonlinear Schrödinger equation. Ann. Math. 161(1), 157–222 (2005)
23. Merle, F., Raphaël, P.: Sharp upper bound on the blow up rate for critical nonlinear Schrödinger equation. Geom. Funct. Anal. 13, 591–642 (2003)
24. Merle, F., Raphaël, P.: On Universality of Blow up Profile for $L^2$ critical nonlinear Schrödinger equation. Invent. Math. 156, 565–672 (2004)
25. Merle, F., Raphaël, P.: On a sharp lower bound on the blow-up rate for the $L^2$ critical nonlinear Schrödinger equation. J. Amer. Math. Soc. 19(1), 37–90 (2006)
26. Merle, F., Raphaël, P.: Profiles and quantization of the blow up mass for critical non linear Schrödinger equation. Commun. Math. Phys. 253(3), 675–704 (2004)
27. Planchon, F., Raphaël, P.: Existence and stability of the log-log blow-up dynamics for the $L^2$-critical nonlinear Schrödinger equation in a domain. Ann. Henri Poincaré 8(6), 1177–1219 (2007)
28. Raphaël, P.: Stability of the log-log bound for blow up solutions to the critical nonlinear Schrödinger equation. Math. Ann. 331, 577–609 (2005)
29. Raphaël, P.: Existence and stability of a solution blowing up on a sphere for a $L^2$ supercritical nonlinear Schrödinger equation. Duke Math. J. 134(2), 199–258 (2006)
30. Rodnianski, I., Sterbenz, J.: On the formation of singularities in the critical O(3) σ-model, preprint 2008
31. Staffilani, G.: On the growth of high Sobolev norms of solutions for KdV and Schrödinger equations. Duke Math. J. 86(1), 109–142 (1997)
32. Zakharov, V.E., Shabat, A.B.: Exact theory of two-dimensional self-focusing and one-dimensional self-modulation of waves in non-linear media. Sov. Phys. JETP 34, 62–69 (1972)

Communicated by P. Constantin

Commun. Math. Phys. 290, 997–1024 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0861-x

Communications in

Mathematical Physics

Strominger–Yau–Zaslow Geometry, Affine Spheres and Painlevé III Maciej Dunajski, Prim Plansangkate Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK. E-mail: [email protected]; [email protected] Received: 29 September 2008 / Accepted: 23 April 2009 Published online: 23 June 2009 – © Springer-Verlag 2009

Abstract: We give a gauge invariant characterisation of the elliptic affine sphere equation and the closely related Tzitzéica equation as reductions of real forms of $SL(3,\mathbb C)$ anti–self–dual Yang–Mills equations by two translations, or equivalently as a special case of the Hitchin equation. We use the Loftin–Yau–Zaslow construction to give an explicit expression for a six–real dimensional semi–flat Calabi–Yau metric in terms of a solution to the affine sphere equation and show how a subclass of such metrics arises from 3rd Painlevé transcendents.

1. Introduction

Let $X$ be a six real dimensional Calabi–Yau (CY) manifold, that is a complex Kähler three-fold with covariantly constant holomorphic three-form $\Omega$. Any such manifold admits a Ricci flat Kähler metric with holonomy contained in $SU(3)$. We shall consider a subclass of CY manifolds which are fibred over a real three dimensional manifold $B$, and the fibres are special Lagrangian tori $T^3$. This means that there exists a projection $\pi: X\to B$ such that the restrictions of the Kähler form $\omega$ and the real part of the holomorphic three-form $\mathrm{Re}(\Omega)$ vanish on any fibre $\pi^{-1}(p)\cong T^3$ over a point $p\in B$. The corresponding CY metric is called semi–flat if it is flat along the fibres. Consider the Kähler form $\omega=i\partial\bar\partial\phi$, where $\phi$ is the Kähler potential. A natural class of semi–flat CY manifolds are the $T^3$ invariant manifolds. In this case the potential $\phi$ can be chosen not to depend on the coordinates of the fibres of $\pi$. The Ricci–flat condition $\det\left(\frac{\partial^2\phi}{\partial z^j\partial\bar z^k}\right)=1$ then reduces to the real Monge–Ampère equation
$$\det\left(\frac{\partial^2\phi}{\partial x^j\partial x^k}\right)=1, \tag{1.1}$$


where $x^j$, $j=1,2,3$, are local coordinates on $B$. The work of Cheng and Yau [6] shows that semi–flat CY metrics on compact complex three-folds are flat, so in what follows we allow CY manifolds to be non–compact, and some fibres of $\pi$ to be singular.

The conjecture of Strominger, Yau and Zaslow (SYZ) [28] states that near the large complex structure limit both $X$ and its mirror should be fibrations over the moduli space of special Lagrangian tori. More precisely, SYZ consider the moduli space of special Lagrangian submanifolds admitting a unitary flat connection. They write down a metric on $X$ and compute the metric on the moduli space. In the tree level contribution this metric is derived from the Born–Infeld action for the brane, assuming that the moduli parameters slowly vary in time and expanding the action up to second order in time derivatives. The metric on the moduli space $Y$ arises from the kinetic term in the Born–Infeld action. This method is based on Manton's moduli space approximation [21] and was originally used by SYZ. The metric resulting on $Y$ admits the $T^3$ action even if the original metric on $X$ does not. The full agreement between $Y$ and the mirror of $X$ is therefore expected when instanton contributions from minimal area holomorphic discs whose boundaries wrap the tori are taken into account. These corrections are suppressed in the large complex structure limit.

One approach to a proof of the Strominger–Yau–Zaslow conjecture [28] would be to describe Ricci-flat metrics on Calabi–Yau manifolds near large complex structure limits. It is expected that in the large complex structure limit the base of the fibration $\pi: X\to B$ admits an affine structure and a special metric of Hessian form. To test this conjecture Loftin, Yau and Zaslow (LYZ) [20] aimed to prove the existence of the metric of Hessian form$^1$
$$g_B=\frac{\partial^2\phi}{\partial x^j\partial x^k}\,dx^j\otimes dx^k, \tag{1.2}$$
where $\phi$ is homogeneous of degree 2 in $x^j$ and satisfies (1.1). Given such a Hessian metric on $B$, the semi–flat Calabi–Yau metric $g$ on $TB$ and the corresponding Kähler form are given by
$$g=\phi_{jk}\,(dx^j\otimes dx^k+dy^j\otimes dy^k),\qquad \omega=\frac i2\,\phi_{jk}\,dz^j\wedge d\bar z^k, \tag{1.3}$$
where $y^j$ are coordinates on the fibres of $TB$ and $z^j=x^j+iy^j$. LYZ constructed a candidate for such a metric as a cone over the elliptic affine sphere metric with three singular points. One consequence of the Mirror Conjecture is that the base metric $g_B$ should have singularities in codimension two, and LYZ were interested in a local metric model near the trivalent vertex of a Y-shaped singularity. The monodromy of the resulting affine structure has not been calculated, so it is not yet clear that the metric coincides with the one predicted by Gross–Siebert [10] and Haase–Zharkov [12].

$^1$ It follows from the work of Hitchin [13] that the natural Weil–Petersson metric on the space of special Lagrangian submanifolds has this form. More precisely, it is shown in [13] that the Kähler potentials of $X$ and its mirror $Y$ both satisfy the Monge–Ampère equation (1.1) and are related by a Legendre transform on the base. The fibres of the special Lagrangian fibration of $Y$ are dual (by a Fourier transform) tori to the fibres of $\pi: X\to B$.

The LYZ construction of the metric comes down to looking for solutions of the definite affine sphere equation [27]
$$\psi_{z\bar z}+\frac12e^{\psi}+|U|^2e^{-2\psi}=0,\qquad U_{\bar z}=0, \tag{1.4}$$


where $\psi$ and $U$ are real and complex functions respectively on an open set in $\mathbb C$. LYZ set $U=z^{-2}$ to account for the singularity of the metric they considered. They then proved the existence of the radially symmetric solution $\psi$ of (1.4) with a prescribed behaviour near the singularity $z=0$, and established the existence of the global solution to the coordinate-independent version of (1.4) on $S^2$ minus three points.

In this paper, we study the integrability of Eq. (1.4). We show that the affine sphere equation and a closely related equation called the Tzitzéica equation arise as reductions of the anti–self–dual Yang–Mills (ASDYM) system by two translations, and hence it admits a twistor interpretation. Moreover, the ODE characterising its radial solutions gives rise to an isomonodromy problem described by the Painlevé III ODE. The two-dimensional group of translations reduces the Euclidean ASDYM equations to the Hitchin equations [14] and Theorem 1.1 below gives an invariant characterisation of (1.4) as a special case of the $SU(2,1)$ Hitchin equations.

Let $A$ be an $su(2,1)$ valued connection on a rank 3 complex vector bundle $E\to\mathbb C$ with the curvature $F_A=dA+A\wedge A$ and let $\Phi$ be a one-form with values in $\mathrm{adj}(E)$. Choose a local trivialisation of $E$ and set
$$A=A_z\,dz+(A_z)^*\,d\bar z,\qquad \Phi=Q\,d\bar z,\qquad D=d+A,$$
where $m^*:=-\eta^{-1}\bar m^t\eta$ with $\eta=\mathrm{diag}(1,1,-1)$, so that $\Phi^*=Q^*dz$.

Theorem 1.1. The Hitchin equations
$$F_A-\Phi\wedge\Phi^*-\Phi^*\wedge\Phi=0,\qquad D\Phi=0 \tag{1.5}$$
hold with
$$A_z=\begin{pmatrix}0&\frac{1}{\sqrt2}e^{\frac\psi2}&0\\ 0&-\frac12\psi_z&-Ue^{-\psi}\\ 0&0&\frac12\psi_z\end{pmatrix},\qquad Q=\begin{pmatrix}0&0&\frac{1}{\sqrt2}e^{\frac\psi2}\\ 0&0&0\\ 0&0&0\end{pmatrix} \tag{1.6}$$
if the functions $(\psi,U)$ satisfy the affine sphere equation (1.4). Conversely, any solution to the $SU(2,1)$ Hitchin equations such that
1. $Q$ has minimal polynomial $t^2$ and $\mathrm{Tr}(QQ^*)\neq0$,
2. $\mathrm{Tr}(D_zQ^*)^2=0$, $\mathrm{Tr}\left[(D_zQ^*)^2(D_{\bar z}Q)^2\right]\neq0$,
3. $\mathrm{Tr}\left[(QQ^*)^4-(Q^*Q)^2(D_zQ^*)(D_{\bar z}Q)+Q^*Q(D_zQ^*)QQ^*(D_{\bar z}Q)\right]=0$
is equivalent to (1.6) by gauge and coordinate transformations.

The connection between solutions to the affine sphere equation (1.4) and the Calabi–Yau metric (1.3) in six dimensions has not been made explicit in [20]. The Lax representation of (1.4) will be used to prove the following

Proposition 1.2. Given a semi-flat Calabi–Yau metric (1.3), where $\phi(x)$ satisfies the Monge–Ampère equation (1.1), and $\phi(cx)=c^2\phi(x)$, where $c$ is a non–zero constant, there exist complex coordinates $\{z,w,\xi\}$ such that the metric $g$ and the Kähler form $\omega$ can be written as
$$g=e_1\bar e_1+e_2\bar e_2+e_3\bar e_3,\qquad \omega=\frac i2\left(e_1\wedge\bar e_1+e_2\wedge\bar e_2+e_3\wedge\bar e_3\right), \tag{1.7}$$


where
$$\begin{aligned}
e_1&=dw-\frac i2\,e^{\psi}(\bar\xi\,dz+\xi\,d\bar z),\\
e_2&=\frac{e^{\psi/2}}{\sqrt2}\left[(w+i\xi\psi_z)\,dz+i\,(d\xi+e^{-\psi}\bar U\bar\xi\,d\bar z)\right],\\
e_3&=\frac{e^{\psi/2}}{\sqrt2}\left[i\,(d\bar\xi+e^{-\psi}U\xi\,dz)+(w+i\bar\xi\psi_{\bar z})\,d\bar z\right],
\end{aligned} \tag{1.8}$$
and $\psi(z,\bar z)$, $U(z)$ are real and complex functions respectively defined on an open set in $\mathbb C$ which satisfy the affine sphere equation (1.4).

The Hitchin equations (1.5) are integrable as they arise from ASDYM and their solutions can be described by holomorphic twistor data. Therefore any ODE arising as a reduction of (1.4) by another symmetry must be of Painlevé type, in agreement with an integrable dogma [1,8,22]. If $U=z^n$, $n\in\mathbb Z$, Eq. (1.4) admits the rotational symmetry
$$z\to e^{ic}z,\qquad c\in\mathbb R. \tag{1.9}$$
Therefore one can consider the group invariant solutions $\psi$ and look for the ODE characterising such a reduction. For concreteness, let us consider $U=z^{-2}$ following LYZ.

Proposition 1.3. Solutions to (1.4) with $U=z^{-2}$ invariant under a group of rotations (1.9) are of the form
$$\psi(z,\bar z)=\log H(s)-3\log(s),\qquad s=|z|^{1/2},$$
where $H$ satisfies
$$H_{ss}=\frac{(H_s)^2}{H}-\frac{H_s}{s}-\frac{8H^2}{s}-\frac{16}{H},$$
which is the Painlevé III equation with parameters $(-8,0,0,-16)$.

In the next section we follow Leung [18] and review the semi–flat Calabi–Yau manifolds. Then, in Sect. 3 we summarise the results about affine spheres which are used in the LYZ construction [20]. In Sect. 4 we prove Theorem 1.1 and give a gauge invariant characterisation of the definite affine sphere equation and the closely related Tzitzéica equation as symmetry reductions of the anti–self–dual Yang–Mills equations. As a byproduct, in Sect. 5 we shall obtain a characterisation of a reduction of the Hitchin equations to the $\mathbb Z_3$ two dimensional Toda chain. In Sect. 6 we discuss other possible gauge inequivalent reductions of the ASDYM equations to the affine sphere equation and the Tzitzéica equation. In Sect. 7 we give a proof of Proposition 1.2 and recover the toric Calabi–Yau metric in terms of the solutions of the affine sphere equation. Finally in Sect. 8 we establish Proposition 1.3 and demonstrate that the existence theorem for Hessian metrics with prescribed monodromy comes down to the study of the Painlevé III equation with special values of parameters, and obtain the corresponding 3 × 3 isomonodromic Lax pair.
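The reduction stated in Proposition 1.3 can be checked directly. The sketch below assumes (1.4) in the form $\psi_{z\bar z}+\frac12e^{\psi}+|U|^2e^{-2\psi}=0$ and uses that, for a function of $r=|z|$ alone, $\partial_z\partial_{\bar z}=\frac14(\partial_r^2+r^{-1}\partial_r)$; the variable $r$ is introduced only for this computation.

```latex
% Radial reduction of (1.4) with U = z^{-2}, \psi = \log H(s) - 3\log s, s = r^{1/2}, r = |z|.
\begin{align*}
\psi_{z\bar z} &= \tfrac14\bigl(\psi_{rr}+\tfrac1r\psi_r\bigr)
   = \frac{1}{16 s^{2}}\Bigl(\psi_{ss}+\frac1s\psi_s\Bigr)
   && (r=s^{2},\ \partial_r=\tfrac{1}{2s}\partial_s),\\
\psi_{ss}+\frac1s\psi_s &= \frac{H_{ss}}{H}-\frac{H_s^{2}}{H^{2}}+\frac{H_s}{sH}
   && (\text{the } 3/s^{2} \text{ terms cancel}),\\
\tfrac12 e^{\psi} &= \frac{H}{2s^{3}},\qquad
|U|^{2}e^{-2\psi} = s^{-8}\cdot\frac{s^{6}}{H^{2}}=\frac{1}{s^{2}H^{2}} .
\end{align*}
% Insert into (1.4) and multiply by 16 s^2 H:
\begin{align*}
H_{ss}-\frac{H_s^{2}}{H}+\frac{H_s}{s}+\frac{8H^{2}}{s}+\frac{16}{H}=0
\;\iff\;
H_{ss}=\frac{H_s^{2}}{H}-\frac{H_s}{s}-\frac{8H^{2}}{s}-\frac{16}{H}.
\end{align*}
```

Comparing with the standard form $w''=\frac{(w')^2}{w}-\frac{w'}{s}+\frac{\alpha w^2+\beta}{s}+\gamma w^3+\frac{\delta}{w}$ of Painlevé III gives $(\alpha,\beta,\gamma,\delta)=(-8,0,0,-16)$, in agreement with the parameters quoted in the proposition.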


2. Semi–Flat Calabi–Yau Manifolds and the SYZ Conjecture

Let $z^j=x^j+iy^j$ be holomorphic coordinates on a Calabi–Yau three-fold $X$, and let $\phi(z^j,\bar z^j)$ be the Kähler potential such that $\omega=i\partial\bar\partial\phi$. The Ricci–flat condition for the corresponding Riemannian metric is $\Omega\wedge\bar\Omega=\omega^3$, where $\Omega=dz^1\wedge dz^2\wedge dz^3$ is the holomorphic three-form on $X$. Now let us consider the $T^3$ invariant case. Assume that the potential $\phi$ is invariant under translations in the imaginary directions $y^j$. In this case the Riemannian metric and the Kähler form are given by (1.3) where
$$\phi_{jk}:=\frac{\partial^2\phi}{\partial x^j\partial x^k}$$
and the Ricci–flat condition reduces to the real Monge–Ampère equation (1.1) for $\phi=\phi(x^1,x^2,x^3)$. We shall regard the $x^j$ as local coordinates in an open set $B\subset\mathbb R^3$. The freedom in choosing the coordinates $x^j$ without changing Eq. (1.1) is given by affine transformations $x\to Mx+b$, where $M\in SL(3,\mathbb R)$, and $b$ is a vector. The affine transformations induce the change in the potential $\phi\to(\det M)^2\phi$, thus $\phi$ should be regarded as a section of the second power of the real determinant line bundle over $B$. Conversely, given a three real dimensional affine manifold $B$ with a metric of Hessian type (1.2), where $\phi$ satisfies the Hessian condition (1.1), one can construct the Calabi–Yau metric on $X=TB$ by (1.3). We then compactify the fibres by quotienting them by a lattice, thus producing a $T^3$ invariant Calabi–Yau structure on the total space of a toric fibration $\pi: X\to B$.

We are now ready to formulate the SYZ conjecture. If $X$, $Y$ are mirror Calabi–Yau manifolds (see [11] for a discussion of what it means) then there exists a compact real three-manifold $B$ such that
• $\pi: X\to B$, $\rho: Y\to B$ are special Lagrangian fibrations by tori (the fibres can be singular at some points of $B$).
• The fibres of $\pi$ and $\rho$ are dual tori.

The second condition only makes sense for flat tori, therefore the conjecture holds in the large complex structure limit, where the volume of the fibres is small in comparison to the volume of the base space and the metric on the fibres is approximately flat. To understand the large complex structure limit consider a one parameter family of complex structures $J(t)$ given by the holomorphic coordinates $z^j(t)=t^{-1}x^j+iy^j$, and the corresponding Calabi–Yau metrics rescaled by $t^2$:
$$g(t)=\phi_{jk}\,(dx^jdx^k+t^2\,dy^jdy^k).$$
Thus we get a one parameter family of special Lagrangian fibrations. In the limit $t\to0$ the Gromov–Hausdorff limit of the metric $g(t)$ is the Hessian metric (1.2) on $B$, and the size of the fibres shrinks to zero. The SYZ conjecture predicts that such a limit exists for any Calabi–Yau metric on a (not necessarily $T^3$ symmetric) toric special Lagrangian fibration.


3. Affine Geometry and Hessian Metrics

The Hessian equation (1.1) is known not to be integrable, at least in the sense of the hydrodynamic reductions [9]. Its homogeneous solutions are however characterised by an integrable PDE. We shall carry over the homogeneity analysis for a general Hessian metric in $(n+1)$ dimensions, and then restrict our attention to $n=2$ where there is a direct connection with the semi–flat CY manifolds on one side and integrability on the other. The following proposition follows from combining results of Calabi [5] and Baues–Cortés [2] about parabolic and elliptic affine spheres. Here, we give a direct elementary proof not based on affine differential geometry. It has certain advantages as it exhibits explicit coordinate transformations between solutions to various forms of homogeneous Hessian equations.

Proposition 3.1. Let $\phi=\phi(x^i)$ be a solution to the Hessian equation (1.1) on an open ball $B\subset\mathbb R^{n+1}$ such that $\phi(cx)=c^2\phi(x)$ for any non-zero constant $c$. Then there exists a local coordinate system $(p_1,\dots,p_n,r)$ on $B$ such that the metric (1.2) is
$$g_B=dr^2+r^2\,\frac1w\,\frac{\partial^2w}{\partial p_\alpha\partial p_\beta}\,dp_\alpha dp_\beta,\qquad \alpha,\beta=1,\dots,n, \tag{3.1}$$
where $w=w(p_\alpha)$ satisfies
$$\det\left(\frac{\partial^2w}{\partial p_\alpha\partial p_\beta}\right)=\frac{1}{w^{n+2}}. \tag{3.2}$$

Proof. Consider the Hessian metric (1.2) with $\phi$ homogeneous of degree 2. Therefore $V=x^i\,\partial/\partial x^i$ is a homothety with $\mathcal L_Vg_B=2g_B$. Locally there exists a function $r: B\to\mathbb R$ such that $V=r\,\partial/\partial r$ and
$$g_B=\gamma\,(dr+r\alpha)^2+r^2h,$$
where $h$, $\alpha$, $\gamma$ are a metric, a one–form and a function respectively on the space of orbits of $V$. The relation $\partial_i(x^j\phi_j)=2\phi_i$ gives $g_B(V,\dots)=x^i\phi_{ij}\,dx^j=d\phi$. Thus $d(\gamma(dr+r\alpha))=0$ and we can redefine $r$ to set $\alpha=0$ and $\gamma=1$. We also note that $|V|^2=x^ix^j\phi_{ij}=2\phi$, and recognise $g_B$ as a cone over $h$,
$$g_B=dr^2+r^2h,\qquad \phi=\frac{r^2}{2}. \tag{3.3}$$
Now let us consider the surface $r=1$ given by a graph in $\mathbb R^{n+1}$, $(\tilde x^1,\dots,\tilde x^n)\to(\tilde x^1,\dots,\tilde x^n,v(\tilde x^\alpha))$, where $\tilde x^\alpha$, $\alpha=1,\dots,n$, parametrise the surface. We shall show that its induced metric $h$ is given by
$$h=\frac{\partial_\alpha\partial_\beta v}{\tilde x^\gamma\partial_\gamma v-v}\,d\tilde x^\alpha d\tilde x^\beta, \tag{3.4}$$


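where $\partial_\alpha:=\partial/\partial\tilde x^\alpha$. To prove it, restrict the function $\phi$ to the surface $r=1$. This gives an identity $\phi(\tilde x^\alpha,v(\tilde x^\alpha))=1/2$. We differentiate this identity implicitly with respect to $\tilde x^\alpha$ and express the first and second derivatives of $\phi$ in terms of the derivatives of $v$,
$$\begin{aligned}
0&=\partial_\alpha\phi+\partial_{n+1}\phi\,\partial_\alpha v,\\
0&=\partial_\alpha\partial_\beta\phi+\partial_\alpha\partial_{n+1}\phi\,\partial_\beta v+\partial_\beta\partial_{n+1}\phi\,\partial_\alpha v+\partial^2_{n+1}\phi\,\partial_\alpha v\,\partial_\beta v+\partial_{n+1}\phi\,\partial_\alpha\partial_\beta v,\\
2\phi&=\tilde x^\alpha\partial_\alpha\phi+v\,\partial_{n+1}\phi=1,
\end{aligned}$$
where the last relation is just the homogeneity condition restricted to the hypersurface $\phi=1/2$. Substituting all that to $g_B$ gives (3.4).

Now if the function $\phi$ in the Hessian metric $g_B$ satisfies the Hessian condition (1.1) then $v$ satisfies
$$\det\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}=(\tilde x^\alpha\partial_\alpha v-v)^{n+2}. \tag{3.5}$$
To see it, let us write the coordinates $x^i$ on $\mathbb R^{n+1}$ as
$$(x^1,\dots,x^n,x^{n+1})=(r\tilde x^1,\dots,r\tilde x^n,r\,v(\tilde x^\alpha)),$$
that is, regard $\mathbb R^{n+1}$ as the cone over the $r=1$ surface. Now consider the invariant volume element
$$\sqrt{|g_B|}\;dx^1\wedge\dots\wedge dx^n\wedge dx^{n+1}=\sqrt{|\tilde g_B|}\;d\tilde x^1\wedge\dots\wedge d\tilde x^n\wedge dr, \tag{3.6}$$
where $|g_B|$ is the absolute value of the determinant of the Hessian metric (1.2) written in the coordinates $x^i$ and $\tilde g_B$ is the same metric expressed in the basis $\{d\tilde x^\alpha,dr\}$. We contract both sides of (3.6) with $V$. On the LHS of (3.6) we use the form $V=x^i\,\partial/\partial x^i$ and on the RHS use $V=r\,\partial/\partial r$. We now set $r=1$ and impose the Hessian equation (1.1), $\det g_B=\det\phi_{ij}=1$. This yields
$$v-\tilde x^\alpha\partial_\alpha v=\sqrt{|\tilde g_B|}.$$
On the surface $r=1$, one has $\det\tilde g_B=\det h$, where $h$ is given by (3.4). Substituting this in the above formula and taking squares of both sides yields (3.5). Note$^2$ that we have taken $\det h>0$ from the assumption that $\det g_B=\det\phi_{jk}=1$.

To obtain the statement in the proposition, perform a Legendre transform
$$p_\alpha=\frac{\partial v}{\partial\tilde x^\alpha},\qquad w(p_\alpha)=\tilde x^\alpha\frac{\partial v}{\partial\tilde x^\alpha}-v,\qquad \tilde x^\alpha=\frac{\partial w}{\partial p_\alpha}.$$
Using $dp_\alpha=\partial_\alpha\partial_\beta v\,d\tilde x^\beta$ yields
$$h=\frac1w\,\frac{\partial^2w}{\partial p_\alpha\partial p_\beta}\,dp_\alpha dp_\beta \tag{3.7}$$
and
$$\frac{\partial^2w}{\partial p_\alpha\partial p_\beta}=\left(\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}\right)^{-1},$$
which implies (3.1) and (3.2).

$^2$ If we started with $\det\phi_{ij}=-1$, which implies $\det h<0$, the analogous argument would lead to
$$\det\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}=-(\tilde x^\alpha\partial_\alpha v-v)^{n+2}.$$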
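The substitution behind (3.4) and the Legendre-transform step in the proof of Proposition 3.1 can be written out explicitly. The lines below are a sketch of that computation using only the three identities obtained by differentiating $\phi(\tilde x^\alpha,v(\tilde x^\alpha))=1/2$; the shorthand $v_\alpha=\partial_\alpha v$, $v_{\alpha\beta}=\partial_\alpha\partial_\beta v$, $\phi_{ab}=\partial_a\partial_b\phi$ is introduced here for brevity.

```latex
% Induced metric on r = 1: set dx^{n+1} = v_\alpha d\tilde x^\alpha in g_B = \phi_{ij} dx^i dx^j.
\begin{align*}
h_{\alpha\beta}
 &= \phi_{\alpha\beta}+\phi_{\alpha\,n+1}v_\beta+\phi_{\beta\,n+1}v_\alpha
    +\phi_{n+1\,n+1}v_\alpha v_\beta
  = -\,\partial_{n+1}\phi\; v_{\alpha\beta}
  && \text{(second identity)},\\
1 &= \tilde x^\alpha\partial_\alpha\phi+v\,\partial_{n+1}\phi
   = \partial_{n+1}\phi\,\bigl(v-\tilde x^\alpha v_\alpha\bigr)
  && \text{(first and third identities)},\\
h_{\alpha\beta} &= \frac{v_{\alpha\beta}}{\tilde x^\gamma v_\gamma-v}.
\end{align*}
% Legendre transform: p_\alpha = v_\alpha, w = \tilde x^\alpha v_\alpha - v, \tilde x^\alpha = \partial w/\partial p_\alpha.
\begin{align*}
dp_\alpha = v_{\alpha\beta}\,d\tilde x^\beta,\quad
d\tilde x^\alpha=\frac{\partial^2 w}{\partial p_\alpha\partial p_\beta}\,dp_\beta
 &\;\Longrightarrow\;
 h=\frac1w\,dp_\alpha\,d\tilde x^\alpha
  =\frac1w\frac{\partial^2 w}{\partial p_\alpha\partial p_\beta}\,dp_\alpha dp_\beta,\\
\det\Bigl(\frac{\partial^2 w}{\partial p_\alpha\partial p_\beta}\Bigr)
 &=\frac{1}{\det(v_{\alpha\beta})}=\frac{1}{w^{\,n+2}}
 \quad\text{by (3.5)}.
\end{align*}
```

The last line is (3.2), and inserting the expression for $h$ into the cone form (3.3) gives (3.1).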


Now, let us consider a hypersurface $\Sigma$ immersed in $\mathbb R^{n+1}$ with the flat metric $\delta_{jk}\,dx^jdx^k$, given by a graph
$$\mathbf r=(\tilde x^1,\dots,\tilde x^n,v(\tilde x^1,\dots,\tilde x^n)). \tag{3.8}$$
The first and second fundamental forms on $\Sigma$ are given by
$$\begin{aligned}
h_I&=d\mathbf r\cdot d\mathbf r=(\delta_{\alpha\beta}+\partial_\alpha v\,\partial_\beta v)\,d\tilde x^\alpha d\tilde x^\beta,\\
h_{II}&=-d\mathbf r\cdot d\mathbf n=\frac{1}{\sqrt{1+(\partial_1v)^2+\dots+(\partial_nv)^2}}\,\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}\,d\tilde x^\alpha d\tilde x^\beta,
\end{aligned}$$
where $\mathbf n$ is the unit normal to $\Sigma$. Tzitzéica [29,30] has studied surfaces in $\mathbb R^3$ for which the ratio of the Gaussian curvature $K$ to the fourth power of a distance from a tangent plane to some fixed point is a constant. If $K\neq0$, we can always rescale the coordinates to set this constant to $+1$ or $-1$ depending on the sign of the Gaussian curvature. We shall call this the Tzitzéica condition. The generalisation of the Tzitzéica condition to hypersurfaces in $\mathbb R^{n+1}$ is given by
$$K=\pm D^{n+2},$$
where $D=\mathbf r\cdot\mathbf n$ is the same as the distance up to sign. In the adapted coordinates, $D$ and the Gaussian curvature $K$ are given by
$$D=\frac{v-\tilde x^\alpha\partial_\alpha v}{\sqrt{1+(\partial_1v)^2+\dots+(\partial_nv)^2}},\qquad K=\frac{1}{\left(\sqrt{1+(\partial_1v)^2+\dots+(\partial_nv)^2}\right)^{n+2}}\,\det\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}.$$
It follows that the Tzitzéica condition holds if and only if $v$ satisfies
$$\det\frac{\partial^2v}{\partial\tilde x^\alpha\partial\tilde x^\beta}=\pm(v-\tilde x^\alpha\partial_\alpha v)^{n+2}, \tag{3.9}$$
where plus and minus signs correspond to positive and negative Gaussian curvature respectively. It is well known in affine differential geometry that an immersed hypersurface in $\mathbb R^{n+1}$ is an affine hypersphere with the origin as its centre if and only if the Tzitzéica condition (3.9) holds [25].

It turns out that the metric (3.4), with $v$ satisfying (3.5), is the same as the Blaschke metric (or affine metric) of a proper affine hypersphere. The Blaschke metric is conformally related to the second fundamental form, and is defined as follows. Let $\mathbf N$ denote the transversal vector field of the surface such that the unit normal $\mathbf n$ is given by $\mathbf n=\frac{\mathbf N}{|\mathbf N|}$, i.e. $\mathbf N=\nabla(\tilde x^{n+1}-v(\tilde x^1,\dots,\tilde x^n))$. Consider the bilinear form $\hat h=-d\mathbf r\cdot d\mathbf N=|\mathbf N|\,h_{II}$. The Blaschke metric is then given by
$$h:=|\det\hat h|^{-\frac{1}{n+2}}\,\hat h. \tag{3.10}$$


Therefore, for the surface given by the graph (3.8), we have − 1 ∂ 2 v n+2 ∂ 2 v d x˜ α d x˜ β , h = det α β ∂ x˜ ∂ x˜ ∂ x˜ α ∂ x˜ β which coincides with the metric (3.4) if Eq. (3.5) holds. In affine differential geometry, it is also known [5] that a Hessian metric (1.2) which satisfies det φi j = 1 is a parabolic (improper) affine hypersphere metric. We have demonstrated that Hessian equation (1.1) on φ implies (3.5) on v. Therefore, this is in agreement with a result of Baues and Cortéz [2] that a parabolic affine hypersphere metric which admits a homothety LV g B = 2g B is the metric cone over a proper affine hypersphere. Let us now restrict our attention to n = 2, and consider the metric h (3.4). For n = 2, det h > 0 implies that h is a definite metric. In the context of the Calabi–Yau manifolds, the metric g B is Riemannian, hence one is interested in positive–definite h. Baues and Cortés [2] have shown that in such case h is the Blaschke metric of a definite elliptic affine sphere, with affine mean curvature 1. Since h is positive definite we can adopt isothermal coordinates for the affine metric (which are asymptotic coordinates for the second fundamental form h I I ) and write it as h = eψ dzdz,

(3.11)

for some real valued function $\psi = \psi(z, \bar z)$. In this form, Simon and Wang [27] proved that the structure equations (see footnote 3) of a definite affine sphere imply that $\psi$ necessarily satisfies Eq. (1.4),

$$\psi_{z\bar z} + \tfrac{1}{2} e^{\psi} + |U|^2 e^{-2\psi} = 0, \qquad U_{\bar z} = 0,$$

where $U\,dz^3$ is the holomorphic cubic differential. Conversely, given a solution of (1.4) one can construct an affine sphere with $h = e^{\psi} dz\, d\bar z$ as its Blaschke metric.

Footnote 3. The usual affine immersion in $\mathbb{R}^{n+1}$ only assumes a flat connection $D$ and a parallel volume element on $\mathbb{R}^{n+1}$, but not an ambient metric. In particular, the structure equations of a Blaschke hypersurface immersion $f : (\Sigma, \nabla) \to (\mathbb{R}^{n+1}, D)$ are given by

$$D_X f_*(Y) = f_*(\nabla_X Y) + h(X, Y)\,\xi, \tag{3.12}$$

$$D_X \xi = -f_*(S X), \tag{3.13}$$

where $\nabla$ is an affine connection on $\Sigma$, $X, Y \in T\Sigma$, $\xi$ is a transversal vector field chosen uniquely up to sign to satisfy certain properties, called the affine normal field, and $h$ is the Blaschke metric defined by (3.12). This definition turns out to be equivalent to (3.10) if one were to use the Euclidean metric on $\mathbb{R}^{n+1}$. The operator $S : T\Sigma \to T\Sigma$ is called the affine shape operator and $H = \frac{1}{n}\mathrm{Tr}(S)$ the affine mean curvature. A proper affine sphere is defined to be a Blaschke hypersurface with $S = H I$, $I$ being the identity. Another affine invariant quantity is a totally symmetric tensor called the cubic form $\hat C$, defined by $\hat C(X, Y, Z) = h(C(X, Y), Z)$, where $C$ is the difference tensor $C = \hat\nabla - \nabla$ and $\hat\nabla$ is the Levi-Civita connection of $h$. Consider $h$ as in (3.11) and let $C^i_{jk}$, $i, j, k \in \{1, \bar 1\}$, be the components of $C$ in the basis $e^1 = dz$, $e^{\bar 1} = d\bar z$. Then it can be shown that the only nonvanishing components of $C$ are $C^{\bar 1}_{11}$ and $C^{1}_{\bar 1 \bar 1} = \overline{C^{\bar 1}_{11}}$, and the function $U$ in (1.4) is defined by $U = C^{\bar 1}_{11} e^{\psi}$. It follows that the cubic form is $\hat C = U\,dz^3 + \bar U\,d\bar z^3$. See [5,19,25,27] for details.

We should note here that if the holomorphic cubic differential $U(z)\,dz^3$ is non-zero, we can choose the isothermal coordinates such that $U = 1$. For example, defining $\xi = \xi(z)$ by $d\xi = 2^{-1/3} U^{1/3} dz$ transforms (1.4) into

$$\hat\psi_{\xi\bar\xi} + e^{\hat\psi} + e^{-2\hat\psi} = 0, \tag{3.14}$$

where

$$\hat\psi = \psi - \tfrac{1}{3}\log U - \tfrac{1}{3}\log \bar U - \tfrac{1}{3}\log 2.$$
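As a quick consistency check of this rescaling (a sketch, assuming $U$ is holomorphic and non-vanishing on the domain and that a branch of $U^{1/3}$ has been fixed), each term of (1.4) picks up the same overall factor:

```latex
% With d\xi = 2^{-1/3} U^{1/3} dz and \psi = \hat\psi + \tfrac13 \log(2|U|^2):
\begin{align*}
\psi_{z\bar z} &= \Big|\frac{d\xi}{dz}\Big|^2 \hat\psi_{\xi\bar\xi}
               = 2^{-2/3}\,|U|^{2/3}\,\hat\psi_{\xi\bar\xi},\\
\tfrac12 e^{\psi} &= \tfrac12\,2^{1/3}\,|U|^{2/3}\, e^{\hat\psi}
               = 2^{-2/3}\,|U|^{2/3}\, e^{\hat\psi},\\
|U|^2 e^{-2\psi} &= |U|^2\, 2^{-2/3}\,|U|^{-4/3}\, e^{-2\hat\psi}
               = 2^{-2/3}\,|U|^{2/3}\, e^{-2\hat\psi}.
\end{align*}
% Dividing (1.4) by the common factor 2^{-2/3}|U|^{2/3} leaves (3.14).
```

The first line uses that $\log U$ is holomorphic, so it does not contribute to the mixed derivative.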

We will make use of such a coordinate transformation in Sect. 4 (see footnote 4). Loftin, Yau and Zaslow [20] proved the existence of a semi-flat Calabi–Yau metric (1.3) with the base metric $g_B$ as the metric cone over an elliptic affine sphere,

$$g_B = \phi_{ij}\, dx^i dx^j = dr^2 + r^2 e^{\psi}\, dz\, d\bar z, \tag{3.15}$$

with the prescribed singularity, by proving the existence of a radially symmetric solution $\psi$ of (1.4) for $U(z) = z^{-2}$ and the corresponding global solution on $S^2$ minus three points. Motivated by this work, we are interested in the integrability of the definite affine sphere equation (1.4). The affine sphere equation is closely related to a well known integrable equation, namely the Tzitzéica equation

$$u_{xy} = e^{u} - e^{-2u}. \tag{3.16}$$

In the context of affine spheres, the Tzitzéica equation arises if $\det h < 0$. By writing the metric in isothermal coordinates as $h = 2e^{u} dx\, dy$ and considering the structure equations, Simon and Wang [27] also show that $h$ is the Blaschke metric of the indefinite affine sphere (with negative affine mean curvature) if and only if $u$ satisfies $u_{xy} = e^{u} - r(x) b(y) e^{-2u}$, where $r(x)$, $b(y)$ are arbitrary non-vanishing functions of one variable, which can be normalised by rescaling the isothermal coordinates. Thus, we obtain

$$u_{xy} = e^{u} - \epsilon\, e^{-2u}, \tag{3.17}$$

where $\epsilon = \pm 1$. The equation with $\epsilon = 1$, (3.16), was first derived in [29,30] for the Tzitzéica surface in $\mathbb{R}^3$ with negative Gaussian curvature $K = -D^4$, where the indefinite second fundamental form is written in asymptotic coordinates as $h_{II} = 2 e^{u} D\, dx\, dy$.

The difference between the two equations (3.16) and (1.4) lies in the relative sign of the two exponential terms on the RHS. For the Tzitzéica equation $u = 0$ is a solution and other solutions may be constructed using Darboux and Bäcklund transformations, for example see [4]. The definite affine sphere equation does not seem to have such obvious solutions. However, Calabi [5] has shown that an elliptic affine hypersphere with complete Blaschke metric is an ellipsoid. This is in agreement with the fact that (1.4) admits solutions in terms of elliptic functions, which can be found by making an ansatz $\psi(z, \bar z) = f(z + \bar z)$ in (3.14).

Footnote 4. We note that the analytic continuation

$$\hat\psi_{\xi\bar\xi} + e^{\hat\psi} - e^{-2\hat\psi} = 0$$

of Eq. (3.14) was used by McIntosh [23] to describe minimal Lagrangian immersions in $\mathbb{CP}^2$ and special Lagrangian cones in $\mathbb{C}^3$.

4. Reduction of ASDYM

It was shown in [7] that the Tzitzéica equation (3.16) can be obtained from a special ansatz to the anti-self-dual Yang–Mills equations in $\mathbb{R}^{2,2}$ with gauge group $SL(3, \mathbb{R})$. In this section, we shall give a gauge and coordinate invariant characterisation of the Tzitzéica equation and the definite affine sphere equation as different real forms of a reduction of ASDYM on $\mathbb{C}^4$ with gauge group $SL(3, \mathbb{C})$, via the holomorphic Hitchin equations on $\mathbb{C}^2$.

4.1. Holomorphic Tzitzéica equation. Consider a holomorphic metric and volume element on $\mathbb{C}^4$,

$$ds^2 = 2(dz\, d\tilde z - dw\, d\tilde w), \qquad \nu = dw \wedge d\tilde w \wedge dz \wedge d\tilde z.$$

Let $A = A_z dz + A_w dw + A_{\tilde z} d\tilde z + A_{\tilde w} d\tilde w$ be a Lie algebra valued connection on a vector bundle $E \to \mathbb{C}^4$. The anti-self-dual Yang–Mills equations are given by

$$F_{zw} = 0, \qquad F_{z\tilde z} - F_{w\tilde w} = 0, \qquad F_{\tilde z\tilde w} = 0.$$

These equations arise from a Lax pair

$$[D_z + \lambda D_{\tilde w},\; D_w + \lambda D_{\tilde z}] = 0, \tag{4.1}$$

where $D_z = \partial_z + A_z$, etc., are covariant derivatives, $F_{z\tilde z} = [D_z, D_{\tilde z}]$, and (4.1) is required to hold for any value of the spectral parameter $\lambda$. Choose the gauge group to be $SL(3, \mathbb{C})$ and assume that $A$ is invariant under the action of a two-dimensional group of translations $\mathbb{C}^2$ such that the metric restricted to the planes spanned by the generators of the group is non-degenerate. Let $X_1$, $X_2$ be the generators of the group; then the Higgs fields

$$P = X_1 \lrcorner A, \qquad Q = X_2 \lrcorner A$$

belong to the adjoint representation. We can always choose the coordinates so that the group is generated by the two null vectors $X_1 = \partial/\partial\tilde w$ and $X_2 = \partial/\partial w$. The ASDYM system reduces to the holomorphic form of the Hitchin equations [14]

$$D_z Q = 0, \tag{4.2a}$$

$$D_{\tilde z} P = 0, \tag{4.2b}$$

$$F_{z\tilde z} + [P, Q] = 0, \tag{4.2c}$$

where $F_{z\tilde z} = \partial_z A_{\tilde z} - \partial_{\tilde z} A_z + [A_z, A_{\tilde z}]$ is the curvature of a holomorphic connection $A = A_z dz + A_{\tilde z} d\tilde z$ on $\mathbb{C}^2$. The Hitchin equations are invariant under the gauge transformations

$$A \to g^{-1} A g + g^{-1} dg, \qquad P \to g^{-1} P g, \qquad Q \to g^{-1} Q g, \tag{4.3}$$

and later we shall also make use of the following coordinate freedom:

$$z \to \hat z(z), \qquad \tilde z \to \hat{\tilde z}(\tilde z). \tag{4.4}$$

The Lax pair (4.1) for the ASDYM reduces to the following Lax pair for the holomorphic Hitchin equations:

$$[D_z + \lambda P,\; Q + \lambda D_{\tilde z}] = 0. \tag{4.5}$$
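To see how (4.5) encodes (4.2a, b, c), one can expand the commutator in powers of $\lambda$. The following is a short worked expansion, using only the definitions $D_z Q = \partial_z Q + [A_z, Q]$, $D_{\tilde z} P = \partial_{\tilde z} P + [A_{\tilde z}, P]$ and $F_{z\tilde z} = [D_z, D_{\tilde z}]$:

```latex
\begin{align*}
[D_z + \lambda P,\; Q + \lambda D_{\tilde z}]
  &= [D_z, Q] + \lambda\big([D_z, D_{\tilde z}] + [P, Q]\big) + \lambda^2 [P, D_{\tilde z}] \\
  &= D_z Q + \lambda\big(F_{z\tilde z} + [P, Q]\big) - \lambda^2 D_{\tilde z} P .
\end{align*}
% Requiring this to vanish for every \lambda gives (4.2a), (4.2c) and (4.2b)
% from the coefficients of \lambda^0, \lambda^1 and \lambda^2 respectively.
```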

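There are several gauge inequivalent ways to embed the Tzitzéica equation (3.16) as a special case of the Hitchin equations. The gauge used in [7] is

$$A_{\tilde w} = P = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad A_w = Q = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ e^{u} & 0 & 0 \end{pmatrix}, \tag{4.6}$$

$$A_z = \begin{pmatrix} u_z & 0 & 0 \\ 1 & -u_z & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad A_{\tilde z} = \begin{pmatrix} 0 & e^{-2u} & 0 \\ 0 & 0 & e^{u} \\ 0 & 0 & 0 \end{pmatrix}, \tag{4.7}$$

where $u(z, \tilde z)$ is a complex valued function holomorphic in $(z, \tilde z)$. With this ansatz the Hitchin equations yield the holomorphic Tzitzéica equation

$$u_{z\tilde z} = e^{u} - e^{-2u}. \tag{4.8}$$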
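The reduction of the ansatz (4.6), (4.7) to (4.8) can be checked numerically at a single point. The sketch below is not part of the paper: it treats the values of $u$, $u_z$ and $u_{z\tilde z}$ at a point as independent real numbers (an assumption made for illustration), uses hand-computed derivatives of the matrix entries, and the helper names `hitchin_residuals` and `max_abs` are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def lincomb(A, B, c):
    """Entrywise A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(3)] for i in range(3)]

def comm(A, B):
    """Matrix commutator [A, B]."""
    return lincomb(matmul(A, B), matmul(B, A), -1.0)

def max_abs(M):
    """Largest absolute value among the entries of M."""
    return max(abs(x) for row in M for x in row)

def hitchin_residuals(u, uz, uzzt):
    """Left-hand sides of (4.2a), (4.2b), (4.2c) for the ansatz (4.6), (4.7).

    u, uz, uzzt are the values of u, u_z and u_{z ztilde} at one point.
    """
    eu, em2u = math.exp(u), math.exp(-2.0 * u)
    P = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
    Q = [[0, 0, 0], [0, 0, 0], [eu, 0, 0]]
    Az = [[uz, 0, 0], [1, -uz, 0], [0, 1, 0]]
    Azt = [[0, em2u, 0], [0, 0, eu], [0, 0, 0]]
    # Derivatives of the matrix entries, obtained by hand via the chain rule.
    dz_Q = [[0, 0, 0], [0, 0, 0], [uz * eu, 0, 0]]
    dz_Azt = [[0, -2.0 * uz * em2u, 0], [0, 0, uz * eu], [0, 0, 0]]
    dzt_Az = [[uzzt, 0, 0], [0, -uzzt, 0], [0, 0, 0]]
    DzQ = lincomb(dz_Q, comm(Az, Q), 1.0)        # (4.2a)
    DztP = comm(Azt, P)                          # (4.2b); P is constant
    F = lincomb(lincomb(dz_Azt, dzt_Az, -1.0), comm(Az, Azt), 1.0)
    C = lincomb(F, comm(P, Q), 1.0)              # (4.2c)
    return DzQ, DztP, C
```

Equations (4.2a) and (4.2b) hold identically for this ansatz, while the residual of (4.2c) is proportional to $\mathrm{diag}(1, -1, 0)$ with coefficient $e^{u} - e^{-2u} - u_{z\tilde z}$, so it vanishes exactly when (4.8) holds.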

Choosing the real form $SL(3, \mathbb{R})$ of $SL(3, \mathbb{C})$ and regarding $u = u(x, y)$ as a real function of real coordinates $z = x$, $\tilde z = y$ reduces (4.8) to (3.16). On the other hand, performing the coordinate transformation

$$d\hat z = \left(\frac{U(z)}{2}\right)^{-\frac{1}{3}} dz, \qquad d\hat{\tilde z} = \left(\frac{\tilde U(\tilde z)}{2}\right)^{-\frac{1}{3}} d\tilde z$$

and setting

$$u = \psi(z, \tilde z) - \frac{1}{3}\log\frac{U}{2} - \frac{1}{3}\log\frac{\tilde U}{2} + \log\left(-\frac{1}{2}\right)$$

for any branch of $\log(-\frac{1}{2})$ puts (4.8) in the form

$$\psi_{z\tilde z} + \tfrac{1}{2} e^{\psi} + U(z)\tilde U(\tilde z)\, e^{-2\psi} = 0, \tag{4.9}$$

where we have dropped the hats of the new variables. Equation (4.9) then reduces to the affine sphere equation (1.4) under the Euclidean reality conditions $\tilde z = \bar z$ and reducing the gauge group to $SU(2, 1)$, which implies the constraint $\tilde U = \bar U$.

Now we shall establish a gauge invariant characterisation of the ansatz (4.6), (4.7) in terms of the gauge and Higgs fields of the Hitchin equations. We will make use of the following lemma.

Lemma 4.1. Consider 3 by 3 complex matrices $P$, $Q$ such that

$$P^2 = Q^2 = 0, \qquad \mathrm{Tr}(PQ) = \omega \neq 0. \tag{4.10}$$

There exists a gauge transformation such that $P$, $Q$ are in the form (4.6) for some $u$.

Proof. The conditions (4.10) are invariant under the gauge transformations

$$P \to g^{-1} P g, \qquad Q \to g^{-1} Q g.$$

These conditions imply that the nullities (dimensions of the kernels of the associated linear maps) satisfy $n(QP) < 3$ and $n(P) = 2$. Thus $\mathrm{Ker}(QP) = \mathrm{Ker}(P)$. Also $\mathrm{rank}(QP) = 1$ and $\mathrm{Im}(QP)$ is contained in the one-dimensional image of $Q$, therefore

$$\mathrm{Im}(QP) = \mathrm{Im}(Q). \tag{4.11}$$

Choose a Jordan basis $(v, u, w)$ of $\mathbb{C}^3$ such that

$$P(w) = v, \qquad P(v) = 0, \qquad P(u) = 0. \tag{4.12}$$

From (4.11) $\mathrm{Im}(Q) = \mathrm{span}(Q(v))$, thus $Q(u) = a\,Q(v)$, $Q(w) = b\,Q(v)$ for some $a$, $b$, so that $\mathrm{Ker}(Q) = \mathrm{span}(u - av,\, w - bv)$. Use the freedom in the basis (4.12) to set $w' = w - bv$, $u' = u - av$, $v' = v$. Now

$$P(w') = v', \quad P(v') = 0, \quad P(u') = 0, \quad Q(w') = 0, \quad Q(u') = 0, \quad Q(v') = c\,u' + \omega\, w',$$

where $\omega \neq 0$ as $\mathrm{Tr}(PQ) = \omega \neq 0$. There is still freedom in (4.12): $v'' = v'$, $u'' = u'$, $w'' = w' + (c/\omega)u'$, so that, dropping primes,

$$P(w) = v, \quad P(v) = 0, \quad P(u) = 0, \qquad Q(w) = 0, \quad Q(u) = 0, \quad Q(v) = \omega w.$$

Ordering the basis $(v, u, w)$ yields the matrices in the desired form, i.e. $P_{13} = 1$, $Q_{31} = \omega$, and all other components vanish. The residual gauge freedom is $w \to \alpha w$, $v \to \alpha v$, $u \to \beta u$, and the change of basis matrix gives the residual $GL(3, \mathbb{C})$ gauge transformation. In the $SL(3, \mathbb{C})$ case we set $\beta = \alpha^{-2}$. The statement of the lemma now follows by setting $\omega = e^{u}$.

We shall now give a set of necessary and sufficient conditions allowing solutions of the Hitchin equations (4.2a, b, c) to be transformed into (4.6), (4.7) by gauge and coordinate symmetries.

Proposition 4.2. Let $(Q, P, A = A_z dz + A_{\tilde z} d\tilde z)$ be a solution of the holomorphic Hitchin equations (4.2a, b, c), with gauge group $SL(3, \mathbb{C})$. Then $(Q, P, A_z, A_{\tilde z})$ can be transformed into (4.6), (4.7) by gauge symmetry and coordinate symmetry (4.4) if and only if the following conditions hold:

(i) $P$ and $Q$ have minimal polynomial $t^2$, with $\mathrm{Tr}(PQ) \neq 0$.
(ii) $\mathrm{Tr}\,(D_z P)^2 = 0 = \mathrm{Tr}\,(D_{\tilde z} Q)^2$ and $\mathrm{Tr}\big((D_z P)^2 (D_{\tilde z} Q)^2\big) \neq 0$.
(iii) $\mathrm{Tr}\, M = 0$, where

$$M = (PQ)^4 + (PQ)^2 (D_z P)(D_{\tilde z} Q) - PQ\,(D_z P)\,QP\,(D_{\tilde z} Q).$$

Proof. The proof of the necessary conditions is straightforward. It can be shown by direct calculation that (4.6), (4.7) satisfy conditions (i), (ii), (iii). The three conditions are gauge invariant by the cyclic property of the trace. Under the coordinate transformation (4.4), the connection $(A_z, A_{\tilde z})$ and the Higgs fields $(P, Q)$ transform as

$$\hat A_{\hat z} = \left(\frac{d\hat z}{dz}\right)^{-1} A_z, \qquad \hat A_{\hat{\tilde z}} = \left(\frac{d\hat{\tilde z}}{d\tilde z}\right)^{-1} A_{\tilde z}, \qquad \hat Q = \left(\frac{d\hat{\tilde z}}{d\tilde z}\right)^{-1} Q, \qquad \hat P = \left(\frac{d\hat z}{dz}\right)^{-1} P.$$

Thus, using condition (i), the square of the covariant derivative is given by

$$(\hat D_{\hat z} \hat P)^2 = \left(\frac{d\hat z}{dz}\right)^{-4} (D_z P)^2,$$

and similarly for $(D_{\tilde z} Q)^2$. Therefore, conditions (i) and (ii) are invariant under the coordinate transformation. A similar calculation shows that (iii) is also invariant under (4.4).

Conversely, we shall now show that any solution to (4.2a, b, c) such that all the conditions in Proposition 4.2 hold can be gauge and coordinate transformed into the form (4.6), (4.7). Firstly, by Lemma 4.1, condition (i) implies that we can use gauge symmetry to put the Higgs fields $(Q, P)$ in the form (4.6). Equations (4.2a) and (4.2b) imply that $A_z$, $A_{\tilde z}$ are of the form

$$A_z = \begin{pmatrix} n & 0 & 0 \\ r & u_z - 2n & 0 \\ m & t & n - u_z \end{pmatrix}, \qquad A_{\tilde z} = \begin{pmatrix} p & s & h \\ 0 & -2p & k \\ 0 & 0 & p \end{pmatrix}, \tag{4.13}$$

where $n, r, m, t, p, s, h, k$ are some functions of $(z, \tilde z)$. Note that we have also used the assumption that the fields are $sl(3, \mathbb{C})$ valued, hence traceless.

Next, to set the diagonal elements of $(A_z, A_{\tilde z})$ to be as in (4.7), we consider the residual gauge freedom. Lemma 4.1 implies that the gauges preserving $(Q, P)$ are given by

$$g(z, \tilde z) = \begin{pmatrix} a & 0 & 0 \\ 0 & \frac{1}{a^2} & 0 \\ 0 & 0 & a \end{pmatrix} \tag{4.14}$$

for an arbitrary function $a(z, \tilde z) \neq 0$. Thus, using (4.3), we have

$$A_z \to \begin{pmatrix} n + \frac{a_z}{a} & 0 & 0 \\ r a^3 & u_z - 2n - 2\frac{a_z}{a} & 0 \\ m & \frac{t}{a^3} & n - u_z + \frac{a_z}{a} \end{pmatrix}, \qquad A_{\tilde z} \to \begin{pmatrix} p + \frac{a_{\tilde z}}{a} & \frac{s}{a^3} & h \\ 0 & -2p - 2\frac{a_{\tilde z}}{a} & k a^3 \\ 0 & 0 & p + \frac{a_{\tilde z}}{a} \end{pmatrix}.$$
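The action of the residual gauge transformation (4.14) on (4.13) can be read off from a general entrywise rule for diagonal gauge matrices. The following is a brief sketch of that rule (the notation $g_i$ for the diagonal entries is introduced here only for illustration):

```latex
% For a diagonal gauge matrix g = diag(g_1, g_2, g_3), the rule (4.3) acts entrywise:
\begin{equation*}
\big(g^{-1} A\, g + g^{-1}\partial g\big)_{ij}
   = \frac{g_j}{g_i}\, A_{ij} + \delta_{ij}\, \partial \log g_i .
\end{equation*}
% With (g_1, g_2, g_3) = (a, a^{-2}, a) the off-diagonal entries are rescaled by
%   (2,1): a^{3},   (3,2): a^{-3},   (3,1): 1,
%   (1,2): a^{-3},  (2,3): a^{3},    (1,3): 1,
% and the diagonal is shifted by (\partial a / a, \; -2\,\partial a / a, \; \partial a / a).
```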

We choose $a(z, \tilde z)$ such that $(\ln a)_z = u_z - n$ and $(\ln a)_{\tilde z} = -p$. This is allowed because the compatibility condition

$$\partial_z p + \partial_z \partial_{\tilde z} u - \partial_{\tilde z} n = 0 \tag{4.15}$$

holds automatically as a consequence of condition (iii). To see this, note that Eq. (4.2c) implies $\partial_z p + \partial_z \partial_{\tilde z} u - \partial_{\tilde z} n + mh + tk = e^{u}$. Hence, condition (4.15) is equivalent to $mh + tk = e^{u}$, which holds by (iii). Note that at this point the elements of $(A_z, A_{\tilde z})$ will be transformed; however, for convenience we will label them with the same letters as in (4.13). Thus we have set $n = u_z$ and $p = 0$.

We now proceed to deal with $r, m, t, s, h, k$. The condition $\mathrm{Tr}\big((D_z P)^2 (D_{\tilde z} Q)^2\big) \neq 0$ in (ii) implies that $r, t, s, k \neq 0$, and $\mathrm{Tr}\,(D_z P)^2 = 0 = \mathrm{Tr}\,(D_{\tilde z} Q)^2$ gives $m = 0 = h$. Hence (4.2c) becomes

$$\begin{aligned} u_{z\tilde z} + rs &= e^{u}, \\ s_z + 2 s u_z &= 0, \\ r_{\tilde z} &= 0, \\ k_z - k u_z &= 0, \\ t_{\tilde z} &= 0, \\ tk &= e^{u}. \end{aligned}$$

Since $r, t, s, k \neq 0$, we can solve the above equations. The last three equations imply that $t$ is a constant, and thus can be set to 1 by a constant gauge transformation of the form (4.14) with $a = t^{-1/3}$, and $s$ is determined to be of the form $b(\tilde z) e^{-2u}$. This results in

$$P = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ e^{u} & 0 & 0 \end{pmatrix},$$

$$A_z = \begin{pmatrix} u_z & 0 & 0 \\ r(z) & -u_z & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad A_{\tilde z} = \begin{pmatrix} 0 & b(\tilde z) e^{-2u} & 0 \\ 0 & 0 & e^{u} \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.16}$$

Note that the gauge is now fixed. To get to the ansatz (4.6), (4.7), we will now use the coordinate symmetry. Define $\hat z$, $\hat{\tilde z}$ such that $d\hat z = e^{j(z)} dz$, $d\hat{\tilde z} = e^{l(\tilde z)} d\tilde z$, and set $\hat u := u - j(z) - l(\tilde z)$. By choosing $j(z)$, $l(\tilde z)$ such that $e^{3j(z)} = r(z)$ and $e^{3l(\tilde z)} = b(\tilde z)$, (4.16) becomes gauge equivalent to (4.6), (4.7) in the new variables $(\hat z, \hat{\tilde z}, \hat u)$. The gauge transformation we need in the final step is given by (4.3) with

$$g(\hat z, \hat{\tilde z}) = \begin{pmatrix} e^{-j(z(\hat z))} & 0 & 0 \\ 0 & e^{j(z(\hat z))} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

We note that substituting (4.16) into the Hitchin equations yields

$$u_{z\tilde z} = e^{u} - r(z) b(\tilde z) e^{-2u}. \tag{4.17}$$

Therefore, the change of coordinates can, roughly speaking, be regarded as setting $r(z)$ and $b(\tilde z)$ to constants such that $r(z) b(\tilde z) = 1$.

We shall now choose the Euclidean reality condition and select the real form $SU(2, 1)$ of $SL(3, \mathbb{C})$ to deduce Theorem 1.1 from the last proposition.

Proof of Theorem 1.1. Consider the ansatz (4.16) and Eq. (4.17). By changing the dependent variable from $u$ to

$$\psi = u - \log\left(-\frac{1}{2}\right)$$

for any branch of $\log(-\frac{1}{2})$, Eq. (4.17) becomes

$$\psi_{z\tilde z} + \tfrac{1}{2} e^{\psi} + U(z)\tilde U(\tilde z)\, e^{-2\psi} = 0, \tag{4.18}$$

where $U(z) = 2r(z)$, $\tilde U(\tilde z) = 2b(\tilde z)$. Then, after an $SL(3, \mathbb{C})$ gauge transformation with

$$g(z, \tilde z) = \begin{pmatrix} 0 & 0 & -\sqrt{2}\, e^{-\frac{\psi}{2}} \\ 0 & \frac{1}{\sqrt{2}}\, e^{\frac{\psi}{2}} & 0 \\ 1 & 0 & 0 \end{pmatrix},$$

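the ansatz (4.16) becomes

$$A_w = Q = \begin{pmatrix} 0 & 0 & \frac{1}{\sqrt{2}} e^{\frac{\psi}{2}} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad A_{\tilde w} = P = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -\frac{1}{\sqrt{2}} e^{\frac{\psi}{2}} & 0 & 0 \end{pmatrix},$$

$$A_z = \begin{pmatrix} 0 & \frac{1}{\sqrt{2}} e^{\frac{\psi}{2}} & 0 \\ 0 & -\frac{1}{2}\psi_z & -U(z) e^{-\psi} \\ 0 & 0 & \frac{1}{2}\psi_z \end{pmatrix}, \qquad A_{\tilde z} = \begin{pmatrix} 0 & 0 & 0 \\ -\frac{1}{\sqrt{2}} e^{\frac{\psi}{2}} & \frac{1}{2}\psi_{\tilde z} & 0 \\ 0 & -\tilde U(\tilde z) e^{-\psi} & -\frac{1}{2}\psi_{\tilde z} \end{pmatrix}. \tag{4.19}$$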

Impose the Euclidean reality conditions $\tilde z = \bar z$, $\tilde w = -\bar w$, resulting in a positive-definite metric on $\mathbb{R}^4$. The ASDYM equations with these reality conditions are

$$F_{zw} = 0, \tag{4.20}$$

$$F_{z\bar z} + F_{w\bar w} = 0. \tag{4.21}$$

Take the gauge group to be $SU(2, 1)$. A matrix $M$ is in the Lie algebra $su(2, 1)$ if it is trace-free and satisfies

$$\bar M^{t} = -\eta M \eta^{-1}, \tag{4.22}$$

where $\eta = \eta^{-1} = \mathrm{diag}(1, 1, -1)$. Let $z = p + iq$, $w = r + is$, so $(p, q, r, s)$ are standard flat coordinates on $\mathbb{R}^4$. The gauge fields $A_p, A_q, A_r, A_s$ are $su(2, 1)$ valued. The relations $A_z = (A_p - iA_q)/2$, $A_{\bar z} = (A_p + iA_q)/2$ together with (4.22) imply that

$$\bar A_z^{\,t} = -\eta A_{\bar z} \eta^{-1},$$

with a similar relation between $A_w$ and $A_{\bar w}$. Concretely, this means that

$$A_{\bar z} = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & k \end{pmatrix}, \qquad A_z = \begin{pmatrix} -\bar a & -\bar d & \bar g \\ -\bar b & -\bar e & \bar h \\ \bar c & \bar f & -\bar k \end{pmatrix},$$

where $a + e + k = 0$ (and of course $A_w$ and $A_{\bar w}$ are related in the same way). Choosing the real form $SU(2, 1)$ of $SL(3, \mathbb{C})$ on restriction to the Euclidean slice imposes the constraint $\tilde U = \bar U$ and yields the affine sphere equation (1.4).

To sum up, one could achieve the characterisation of the ansatz (4.19), with $\tilde z = \bar z$, $\tilde U = \bar U$, analogous to Proposition 4.2. Let us again choose the double null coordinates such that the generators of the symmetry group of the ASDYM are given by $\partial_{\tilde w}$, $\partial_w$.

With the chosen reality condition the ASDYM equations reduce to the $SU(2, 1)$ Hitchin equations

$$D_z A_w = 0, \tag{4.23}$$

$$F_{z\bar z} + [A_w, A_{\bar w}] = 0, \tag{4.24}$$

where

$$A_{\bar z} = -\eta^{-1} \bar A_z^{\,t}\, \eta \quad \text{and} \quad A_{\bar w} = -\eta^{-1} \bar A_w^{\,t}\, \eta. \tag{4.25}$$

We now consider the reduction of the system (4.23), (4.24). Theorem 1.1 arises as a corollary of Proposition 4.2.

4.2. Tzitzéica equation. The Tzitzéica equation (3.16) is a different real form of (4.8). It arises from the ASDYM with the gauge group $SL(3, \mathbb{R})$ on restriction to the ultrahyperbolic real slice $\mathbb{R}^{2,2}$ in $\mathbb{C}^4$ with $(w, \tilde w, x = z, y = \tilde z)$ real. The Higgs fields are given by $P = A_{\tilde w}$, $Q = A_w$ and the metric on the space of orbits of $X_1 = \partial_{\tilde w}$ and $X_2 = \partial_w$ has signature (1, 1).

The real version of the ansatz (4.6), (4.7) can be characterised analogously to the holomorphic case treated in Proposition 4.2. However, one needs to take care of the fact that $e^{u(x,y)} > 0$ for a real valued function $u(x, y)$. There are two places where this needs to be considered. The first is where we use condition (i) in Proposition 4.2 to put $(Q, P)$ in the form (4.6), (4.7). To write $\mathrm{Tr}(PQ) = e^{u(x,y)}$, we require that $\mathrm{Tr}(PQ) > 0$. Assume that this can be done at a point $(x_0, y_0)$ (if not then change coordinates $y \to -y$) and restrict the domain of $u$ to a neighbourhood of this point where the positivity still holds.

The second place where the problem of the sign arises is when we use the coordinate symmetry to transform $u_{xy} = e^{u} - r(x) b(y) e^{-2u}$ to the Tzitzéica equation (3.16). This can only be done for $r(x) b(y) > 0$. The sign of $r(x) b(y)$ is governed by the quantity $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big)$ in condition (ii). To see this, note that in the notation of (4.16), $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big) = (sktr)\, e^{2u}$. After we set $t = 1$, condition (iii) implies that $k = e^{u} > 0$. Hence, the sign of $sr$, and thus the sign of $r(x) b(y)$, is the same as the sign of $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big)$. However, this cannot be changed by a real coordinate transformation $x \to \hat x(x)$, $y \to \hat y(y)$, because

$$\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big) \to \left(\frac{d\hat x}{dx}\right)^{-4} \left(\frac{d\hat y}{dy}\right)^{-4} \mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big),$$

where we have used $Q^2 = 0 = P^2$. Therefore, condition (ii) in Proposition 4.2 needs to be replaced by $\mathrm{Tr}\,(D_z P)^2 = 0 = \mathrm{Tr}\,(D_{\tilde z} Q)^2$ and $\mathrm{Tr}\big((D_z P)^2 (D_{\tilde z} Q)^2\big) > 0$ in the domain of $u$.

We remark that $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big) < 0$ corresponds to the equation $u_{xy} = e^{u} + e^{-2u}$, whereas $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big) = 0$ yields the Liouville equation $u_{xy} = e^{u}$. Therefore, the sign of $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big)$ corresponds to the sign of $\epsilon$ in (3.17).

5. $\mathbb{Z}_3$ Two Dimensional Toda Chain

As a byproduct of the proof of Proposition 4.2, we find that, dropping condition (iii) in this proposition, the Hitchin equations can be reduced to a coupled system which includes the $\mathbb{Z}_3$ two dimensional Toda chain [24] as a special case. Recall that a two dimensional Toda chain is given by

$$(u_\alpha)_{xy} - e^{(u_{\alpha+1} - u_\alpha)} + e^{(u_\alpha - u_{\alpha-1})} = 0, \tag{5.1}$$

where $\alpha \in \mathbb{Z}$. In this paper (5.1) is called the $\mathbb{Z}_3$ two dimensional Toda chain when i) $\alpha \in \mathbb{Z}/\mathbb{Z}_3$ and ii) $u_1 + u_2 + u_3 = 0$. We summarise the result in the following proposition.

Proposition 5.1. Let $u_1$, $u_2$ be functions of $(x, y)$. The coupled system of equations

$$\begin{aligned} (u_1)_{xy} - \epsilon_1 e^{(u_2 - u_1)} + e^{2u_1 + u_2} &= 0, \\ (u_2)_{xy} + \epsilon_1 e^{(u_2 - u_1)} - \epsilon_2 e^{-2u_2 - u_1} &= 0, \end{aligned} \tag{5.2}$$

where $\epsilon_1, \epsilon_2 = \pm 1$, is gauge equivalent to the $SL(3, \mathbb{R})$ Hitchin equations (4.2a, b, c) with $z = x$, $\tilde z = y$ real, and

(i) the Higgs fields $P$ and $Q$ have minimal polynomial $t^2$, with $\mathrm{Tr}(PQ) \neq 0$,
(ii) $\mathrm{Tr}\,(D_x P)^2 = 0 = \mathrm{Tr}\,(D_y Q)^2$ and $\mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big) \neq 0$.

Proof. These conditions are the first two conditions in Proposition 4.2. Following the proof and assuming condition (i) gives (4.13). However, now it is not possible to use gauge symmetry to set the diagonal elements of both $A_x$ and $A_y$ to be the same as in (4.7) without the compatibility condition. Instead, let us use only the gauge transformation (4.14) to eliminate the diagonal elements of $A_y$, by choosing $(\ln a)_y = -p$. As before, condition (ii) implies that $m = h = 0$ and $sktr \neq 0$. The Hitchin equations (4.2a, b, c) imply that $t$ is a function of $x$ only. Hence, we can use the residual gauge freedom (4.14) with $a = a(x)$ to set $t = 1$. Equation (4.2c) then gives

$$n_y + r(x) s = e^{u}, \tag{5.3}$$

$$2 n_y - u_{xy} + r(x) s - k = 0, \tag{5.4}$$

$$s_x + 3ns - s u_x = 0, \tag{5.5}$$

$$k_x + 2k u_x - 3kn = 0. \tag{5.6}$$

Equations (5.5) and (5.6) imply that $sk = c(y) e^{-u}$, where $c(y)$ is some arbitrary function which arises from the integration. Now, since $s \neq 0$, let us write

$$k = \frac{c(y)}{s}\, e^{-u} \quad \text{and} \quad n = \alpha_x, \quad s = \pm e^{\beta},$$

for some functions $\alpha(x, y)$ and $\beta(x, y)$. Then, (5.5) becomes $e^{\beta}(\beta_x + 3\alpha_x - u_x) = 0$, which can be integrated to give $s = b(y) e^{u - 3\alpha}$ and $n = \alpha_x$ for some $b = b(y) \neq 0$. Finally, (5.3) and (5.4) give a coupled system

$$\begin{aligned} \alpha_{xy} + r(x) b(y) e^{u - 3\alpha} - e^{u} &= 0, \\ 2\alpha_{xy} - u_{xy} + r(x) b(y) e^{u - 3\alpha} - c(y) b^{-1}(y) e^{-2u + 3\alpha} &= 0. \end{aligned} \tag{5.7}$$

Set $u_1 = \alpha$, $u_2 = -2\alpha + u$, and change the coordinate $y \to -y$. The system (5.7) becomes

$$\begin{aligned} (u_1)_{xy} - r(x) b(y) e^{u_2 - u_1} + e^{2u_1 + u_2} &= 0, \\ (u_2)_{xy} + r(x) b(y) e^{u_2 - u_1} - c(y) b^{-1}(y) e^{-2u_2 - u_1} &= 0, \end{aligned}$$

which can be transformed into (5.2) by a change of dependent variables and coordinates. There are four distinct cases depending on the signs of $\epsilon_1$, $\epsilon_2$. Since the coordinates are real, the signs of $\epsilon_1$, $\epsilon_2$ are the same as those of $r(x) b(y)$ and $c(y) b^{-1}(y)$, respectively.

Similarly to the real version of Proposition 4.2 for the Tzitzéica equation, $r(x) b(y)$ and $c(y) b^{-1}(y)$ can be related to some gauge invariant quantities. It can be shown that at a given point $(x_0, y_0)$ the signs of $r(x) b(y)$ and $c(y) b^{-1}(y)$ are determined by the signs of

$$(a) := \mathrm{Tr}\big((D_x P)^2 (D_y Q)^2\big), \qquad (b) := \mathrm{Tr}\big((PQ)^2 (D_x P)(D_y Q) - PQ\,(D_x P)\,QP\,(D_y Q)\big).$$

We shall analyse these signs and then restrict the domains of $(u_1, u_2)$ to a neighbourhood of $(x_0, y_0)$ where the signs remain constant. If $(a) > 0$, setting $t = 1$ gives $skr > 0$, which gives $r(x) c(y) > 0$. This implies that $r(x) b(y)$ and $c(y) b^{-1}(y)$ have the same sign. Now if $(b) > 0$, then $k > 0$, meaning $c(y) b^{-1}(y) > 0$, hence $r(x) b(y) > 0$. Similarly, if $(b) < 0$ then $c(y) b^{-1}(y) < 0$ and $r(x) b(y) < 0$. On the other hand, $(a) < 0$ implies that $r(x) b(y)$ and $c(y) b^{-1}(y)$ have opposite signs. Then, the sign of $(b)$ determines the sign of $c(y) b^{-1}(y)$. The important point is that the signs of $(a)$ and $(b)$ cannot be changed by real coordinate transformations. This completes the proof.

6. Other Gauges

There are several gauge inequivalent ways to reduce the ASDYM equations to the Tzitzéica equation or to the definite affine sphere equation. The reductions are relatively easy to obtain, but their gauge invariant characterisation requires much more work. Here we shall mention one other possibility which is not gauge equivalent to (4.6), (4.7). It can be shown that the holomorphic Tzitzéica equation (4.8) also arises from the Hitchin equations with

$$P = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} 0 & e^{-2u} & 0 \\ 0 & 0 & e^{u} \\ e^{u} & 0 & 0 \end{pmatrix}, \qquad A_z = \begin{pmatrix} u_z & 0 & 0 \\ 0 & -u_z & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad A_{\tilde z} = 0. \tag{6.1}$$

The real version of this ansatz was implicitly used by E. Wang [31]. Let us comment on how this formulation is related to (4.6), (4.7). First note that the Lax pairs (4.5) with (4.6), (4.7) and (6.1) are equal for $\lambda = 1$. Now consider the ansatz (4.6), (4.7) and set $\lambda = 1$ in the Lax pair (4.5). Introduce the new spectral parameter by exploiting the Lorentz symmetry and rescaling the coordinates

$$(z, \tilde z) \to (\hat\lambda z,\; \hat\lambda^{-1} \tilde z)$$

and read off new $A_z$, $A_{\tilde z}$, $P$, $Q$ from (4.5) with $\lambda$ replaced by $\hat\lambda$. This yields the ansatz (6.1).

Choosing the Euclidean reality conditions and reducing the gauge group to $SU(2, 1)$ we find another reduction of ASDYM to the affine sphere equation. Take the following ansatz, in which the gauge fields are independent of $w$ and $\bar w$, $\psi = \psi(z, \bar z)$ is a real function, and $U(z, \bar z)$ is a complex function:

$$A_w = \begin{pmatrix} 0 & 0 & \frac{1}{\sqrt{2}} e^{\psi/2} \\ \bar U e^{-\psi} & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} e^{\psi/2} & 0 \end{pmatrix}, \qquad A_{\bar w} = \begin{pmatrix} 0 & -U e^{-\psi} & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} e^{\psi/2} \\ \frac{1}{\sqrt{2}} e^{\psi/2} & 0 & 0 \end{pmatrix},$$

$$A_z = \begin{pmatrix} -\frac{1}{2}\psi_z & 0 & 0 \\ 0 & \frac{1}{2}\psi_z & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad A_{\bar z} = \begin{pmatrix} \frac{1}{2}\psi_{\bar z} & 0 & 0 \\ 0 & -\frac{1}{2}\psi_{\bar z} & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{6.2}$$

Recall that $A_w = Q$ and $A_{\bar w} = -P$. The equation $F_{zw} = 0$ is satisfied provided that $U_{\bar z} = 0$, i.e. $U$ must be holomorphic. The second ASDYM equation $F_{z\bar z} + F_{w\bar w} = 0$ is satisfied if and only if (1.4) holds.
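For the holomorphic ansatz (6.1), the reduction to (4.8) takes only a few lines. The following is a sketch of that calculation, reading $P$ and $Q$ in (6.1) as the cyclic matrices with non-zero entries $P_{13} = P_{21} = P_{32} = 1$ and $Q_{12} = e^{-2u}$, $Q_{23} = Q_{31} = e^{u}$, and writing $A_z = \mathrm{diag}(a_1, a_2, a_3)$ with $(a_1, a_2, a_3) = (u_z, -u_z, 0)$:

```latex
\begin{align*}
(D_z Q)_{ij} &= \partial_z Q_{ij} + (a_i - a_j)\,Q_{ij} = 0
   \quad \text{for } (i,j) = (1,2),\ (2,3),\ (3,1),\\
D_{\tilde z} P &= \partial_{\tilde z} P + [A_{\tilde z}, P] = 0
   \quad \text{since } P \text{ is constant and } A_{\tilde z} = 0,\\
PQ &= \mathrm{diag}(e^{u},\, e^{-2u},\, e^{u}), \qquad
QP = \mathrm{diag}(e^{-2u},\, e^{u},\, e^{u}),\\
[P, Q] &= (e^{u} - e^{-2u})\,\mathrm{diag}(1, -1, 0), \qquad
F_{z\tilde z} = -\partial_{\tilde z} A_z = -u_{z\tilde z}\,\mathrm{diag}(1, -1, 0).
\end{align*}
% Hence F_{z\tilde z} + [P, Q] = 0 is equivalent to u_{z\tilde z} = e^{u} - e^{-2u}, i.e. (4.8).
```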

7. Semi-Flat Calabi–Yau Metric

In this section we consider the semi-flat Calabi–Yau metric constructed by Loftin, Yau and Zaslow, and obtain the local expression of the metric explicitly in terms of a solution of the definite affine sphere equation. Let us first recall the Simon–Wang approach to affine spheres [27]. Consider the parametrisation of an elliptic affine sphere $(z, \bar z) \to f = (f^1(z, \bar z), f^2(z, \bar z), f^3(z, \bar z)) \in \mathbb{R}^3$. The structure equations (see footnote 5) defining the affine sphere can be written as a linear first order system of PDEs in $f$, $f_z$ and $f_{\bar z}$,

$$\frac{\partial}{\partial z} \begin{pmatrix} f \\ f_z \\ f_{\bar z} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & \psi_z & U e^{-\psi} \\ -\frac{1}{2} e^{\psi} & 0 & 0 \end{pmatrix} \begin{pmatrix} f \\ f_z \\ f_{\bar z} \end{pmatrix}, \qquad \frac{\partial}{\partial \bar z} \begin{pmatrix} f \\ f_z \\ f_{\bar z} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ -\frac{1}{2} e^{\psi} & 0 & 0 \\ 0 & \bar U e^{-\psi} & \psi_{\bar z} \end{pmatrix} \begin{pmatrix} f \\ f_z \\ f_{\bar z} \end{pmatrix}, \tag{7.1}$$

where we have set the affine mean curvature to 1. The compatibility condition for this over-determined system is the affine sphere equation (1.4). Therefore, given a solution $\psi$, one can find $f$ and hence the cone over the sphere

$$(z, \bar z, r) \to \big(x^1 = r f^1(z, \bar z),\; x^2 = r f^2(z, \bar z),\; x^3 = r f^3(z, \bar z)\big). \tag{7.2}$$
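The compatibility claim for (7.1) can be verified by comparing the two ways of computing $f_{zz\bar z}$. The sketch below assumes that $f$, $f_z$, $f_{\bar z}$ are pointwise linearly independent, so that coefficients can be compared:

```latex
% From (7.1): f_{zz} = \psi_z f_z + U e^{-\psi} f_{\bar z}, \quad
%             f_{z\bar z} = -\tfrac12 e^{\psi} f, \quad
%             f_{\bar z\bar z} = \bar U e^{-\psi} f_z + \psi_{\bar z} f_{\bar z}.
\begin{align*}
(f_{zz})_{\bar z} &= \big(\psi_{z\bar z} + |U|^2 e^{-2\psi}\big) f_z
                     - \tfrac12\, \psi_z e^{\psi} f
                     + U_{\bar z}\, e^{-\psi} f_{\bar z},\\
(f_{z\bar z})_{z} &= -\tfrac12\, \psi_z e^{\psi} f - \tfrac12\, e^{\psi} f_z .
\end{align*}
% Equating the coefficients of f_z and f_{\bar z} gives
%   \psi_{z\bar z} + \tfrac12 e^{\psi} + |U|^2 e^{-2\psi} = 0  and  U_{\bar z} = 0,
% which is (1.4).
```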

This expression can be inverted locally to give $r = r(x)$.

Proof of Proposition 1.2. The metric cone over an elliptic affine sphere is given by (3.15) with $\phi(x) = r^2/2$ and the corresponding semi-flat metric (1.3). The matrix $\phi_{jk}$ in (1.3) can be obtained by contracting the metric (3.15) with $\partial/\partial x^j$, $\partial/\partial x^k$. Given a solution $\psi$ of the affine sphere equation, we know $g_B$ in the basis $(dr, dz, d\bar z)$, thus we want to express $\partial/\partial x^j$ in terms of $\partial/\partial r$, $\partial/\partial z$, $\partial/\partial \bar z$. Now, from (7.2), we have that

$$\begin{pmatrix} \partial/\partial x^1 \\ \partial/\partial x^2 \\ \partial/\partial x^3 \end{pmatrix} = N^{-1} \begin{pmatrix} \partial/\partial r \\ r^{-1}\partial/\partial z \\ r^{-1}\partial/\partial \bar z \end{pmatrix}, \quad \text{where} \quad N = \begin{pmatrix} f^1 & f^2 & f^3 \\ f^1_z & f^2_z & f^3_z \\ f^1_{\bar z} & f^2_{\bar z} & f^3_{\bar z} \end{pmatrix}.$$

Moreover, $N$ is the matrix solution of the linear system (7.1), whose existence and the existence of its inverse $N^{-1}$ are guaranteed by the affine sphere equation. Writing

$$N^{-1} = \begin{pmatrix} p_1 & q_1 & \bar q_1 \\ p_2 & q_2 & \bar q_2 \\ p_3 & q_3 & \bar q_3 \end{pmatrix},$$

one calculates $\phi_{jk}$ and thus the metric on the fibre to be

$$\phi_{jk}\, dy^j dy^k = (p_j p_k + e^{\psi} q_j \bar q_k)\, dy^j dy^k.$$

Footnote 5. For the elliptic affine sphere with affine mean curvature set to 1, the shape operator is $S = I$. Now, with the affine metric (3.11), the affine normal chosen to point inward from the surface is given by minus the position vector $-f$, and the structure equations (3.12) and (3.13) become

$$D_X f_*(Y) = f_*(\nabla_X Y) + h(X, Y)(-f), \qquad D_X(-f) = -f_*(X).$$

Note that we have abused the notation so that $f$ also denotes the immersion.

Now, let us introduce new coordinates $\tau := p_i y^i$, $\xi := q_i y^i$, $\bar\xi := \bar q_i y^i$ and write $p_i\, dy^i = d\tau - y^i dp_i$ etc. Denote the two matrices of coefficients in the linear system (7.1) by $-A_{(z)}$ and $-A_{(\bar z)}$ respectively, so that (7.1) is

$$\partial_z N + A_{(z)} N = 0, \qquad \partial_{\bar z} N + A_{(\bar z)} N = 0.$$

Then, by considering the corresponding equation for $N^{-1}$, the one-forms $y^i dp_i$, $y^i dq_i$, $y^i d\bar q_i$ can be written in terms of the coordinates $\tau, \xi, \bar\xi$ and the components of $A_{(z)}$ and $A_{(\bar z)}$, which are known in terms of $\psi$. Finally, we can write the metric (1.3) as

$$g = dr^2 + r^2 e^{\psi} |dz|^2 + |d\tau + \alpha|^2 + e^{\psi} |d\xi + \beta|^2,$$

where

$$\alpha = -\tfrac{1}{2} e^{\psi} (\bar\xi\, dz + \xi\, d\bar z), \qquad \beta = (\tau + \xi \psi_z)\, dz + e^{-\psi} \bar U \bar\xi\, d\bar z.$$

By a similar calculation, the Kähler form can be written as

$$\omega = dr \wedge (d\tau + \alpha) + \frac{r}{2}\, e^{\psi} \big(d\bar z \wedge (d\xi + \beta) + dz \wedge (d\bar\xi + \bar\beta)\big).$$

Using the relation between the metric, the Kähler form and the complex structure, we find the holomorphic basis $\{e^1, e^2, e^3\}$ (1.8) and write $g$ and $\omega$ as in Proposition 1.2, where we have introduced a complex coordinate $w = r + i\tau$.

Remark 1. The Ricci flat condition for the metric (1.7) reduces to the affine sphere equation (1.4) for $\psi(z, \bar z)$ and $U(z)$. Equation (1.4) is invariant under the transformations $\partial/\partial z \to \partial/\partial \hat z$, $\psi \to \hat\psi$, $U \to \hat U$, where

$$\partial/\partial \hat z = e^{-j(z)}\, \partial/\partial z, \qquad \hat\psi = \psi - j(z) - \overline{j(z)}, \qquad \hat U = e^{-3j(z)}\, U.$$

This can be understood geometrically, as $e^{\psi} dz\, d\bar z$ and $U\, dz^3$ are the affine metric and the cubic differential respectively of the affine sphere. The metric (1.7) is invariant under the above transformations, together with $\xi \to \hat\xi = e^{j(z)} \xi$.

Remark 2. One expects the linear system associated with the structure equations of affine spheres (7.1) to be equivalent to the Hitchin Lax pair (4.5) giving rise to the affine sphere equation. The matrices $A_{(z)}$ and $A_{(\bar z)}$ in (7.1) are unique up to gauge transformations

$$A_{(z)} \to g^{-1} A_{(z)} g + g^{-1} \partial_z g, \qquad A_{(\bar z)} \to g^{-1} A_{(\bar z)} g + g^{-1} \partial_{\bar z} g.$$

If we write

$$A_{(z)} = (A_z + \lambda P), \qquad A_{(\bar z)} = (A_{\bar z} + \lambda^{-1} Q) \tag{7.3}$$

for some value of λ, then it follows that (A z , A z¯ , Q, P) will satisfy the Hitchin equations (4.2a, b, c), with reality condition z˜ = z¯ . Conversely, given a solution (A z , A z¯ , Q, P) to the Hitchin equations, we should be able to find a value of spectral parameter λ such that (A z + λP) and (A z¯ + λ−1 Q) can be gauge transformed to A(z) and A(¯z ) respectively. For example, we can obtain A(z) and A(¯z ) in (7.1) from the ansatz (4.19), with z˜ = z¯ and U˜ = U¯ , by gauge transformation with ⎛ ⎞ 1 0 √ 0 −ψ/2 ⎠ g = ⎝ 0 − 2e √ 0 −ψ/2 , 0 0 − 2e and choosing the value of spectral parameter in (7.3) to be λ = 1. Note that we need det g = 1, since A(z) and A(¯z ) are not traceless. 8. Painlevé III One of the main results of Loftin, Yau and Zaslow [20] is the existence of radially symmetric solutions of the affine sphere equation (1.4) for U (z) = z −2 , with prescribed behaviour near the singularity z = 0. In this section we shall show that the radially symmetric solutions of (1.4) are Painlevé III transcendents. Proof of Proposition 1.3. Set U = z −2 , and look for solutions of (1.4) of the form ψ = ψ(ρ), where ρ = |z|. Making a substitution ψ(ρ) = log (ρ −3/2 H (ρ)) and introducing a new independent variable by ρ = s 2 yields the following ODE for H = H (s): Hs 8H 2 16 (Hs )2 − − − . H s s H This is the celebrated Painlevé III equation [15] Hss =

(8.1)

Hs α H 2 + β (Hs )2 δ − + + γ H3 + H s s H with special values of parameters Hss =

(α, β, γ , δ) = (−8, 0, 0, −16). In the classification of Okamoto [26] it falls in the type D7. Remarks. • One can consider the radial symmetry reduction of the affine sphere equation (1.4) with U = z −n for general n ∈ Z. n = 3. Changing the independent variable to s = (z z¯ ) and usin the ansatz

3−n 4

1+n − ψ = log s 3−n H (s)k

with k = ±1 reduces III equation with (1.4) to the Painlevé parame−8 −16 8 ters (α, β, γ , δ) = (3−n)2 , 0, 0, (3−n)2 and (α, β, γ , δ) = 0, (3−n) 2, 16 , 0 for k = 1 and k = −1, respectively. In both cases, the Painlevé (3−n)2 III equations are of type D7 in Okamoto’s classification.

Strominger–Yau–Zaslow Geometry, Affine Spheres and Painlevé III

1021

n = 3. Setting ψ = ψ(s), where $s = (z\bar z)^{1/2}$, in Eq. (1.4) yields
\[
\psi_{ss} + \frac{\psi_s}{s} + \frac{4e^{-2\psi}}{s^6} + 2e^{\psi} = 0, \tag{8.2}
\]
which, under multiplication by $\frac{\psi_s}{2} + \frac{1}{s}$, gives a first-order ODE
\[
\frac{\psi_s^2}{4} + \frac{\psi_s}{s} + e^{\psi} - \frac{e^{-2\psi}}{s^6} + \frac{c}{s^2} = 0, \tag{8.3}
\]
where c is a constant of integration. Hence any solution to (8.3) such that $s\psi_s \neq -2$ (that is, for which the multiplier $\frac{\psi_s}{2} + \frac{1}{s}$ does not vanish) gives rise to a solution to (8.2), and conversely all solutions to (8.2) arise from (8.3). Equation (8.3) is integrable by quadratures in terms of elliptic functions.

• In general, a Painlevé III equation may have two types of special (i.e. non-transcendental) solutions: a finite number of rational solutions and a one-parameter family of Riccati type solutions expressible by special functions [15]. For the values of parameters in (8.1) the Riccati solutions do not exist, and there exists a unique algebraic solution $H = -(2s)^{1/3}$. This corresponds to
\[
\psi = \frac{1}{3}\log(2) - \frac{4}{3}\log(|z|) + \log(-1),
\]
which is not real. There are Bäcklund transformations leading to new solutions, but they change the value of the parameters. This shows that the desired radial solution to the affine sphere equation (1.4) is transcendental. In [3,17] it has been shown that the radial solutions of the Tzitzéica equation (3.16) also satisfy Painlevé III of type $D_7$.
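The claim that $H = -(2s)^{1/3}$ solves (8.1) can be checked by direct substitution. The following is a minimal sketch, not part of the paper: the function names are illustrative, and the derivatives of H are entered in closed form rather than computed symbolically.

```python
import math

def piii_rhs(s, H, Hs, alpha=-8.0, beta=0.0, gamma=0.0, delta=-16.0):
    """Right-hand side of Painleve III:
    H'' = H'^2/H - H'/s + (alpha*H^2 + beta)/s + gamma*H^3 + delta/H.
    The defaults are the parameter values quoted for (8.1)."""
    return (Hs ** 2 / H - Hs / s + (alpha * H ** 2 + beta) / s
            + gamma * H ** 3 + delta / H)

def algebraic_solution(s):
    """H = -(2s)^(1/3) together with its first two derivatives (s > 0)."""
    a = 2.0 ** (1.0 / 3.0)
    H = -a * s ** (1.0 / 3.0)
    Hs = -(a / 3.0) * s ** (-2.0 / 3.0)
    Hss = (2.0 * a / 9.0) * s ** (-5.0 / 3.0)
    return H, Hs, Hss

def residual(s):
    """Left-hand side minus right-hand side of (8.1) on the algebraic solution."""
    H, Hs, Hss = algebraic_solution(s)
    return Hss - piii_rhs(s, H, Hs)

if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0, 7.3):
        print(f"s = {s:4.1f}   residual = {residual(s): .2e}")
```

The residual vanishes up to rounding because the $s^{-5/3}$ terms cancel identically, while the $s^{-1/3}$ terms cancel exactly when the cube of the prefactor equals 2.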

8.1. Lax pair for Painlevé III. The standard isomonodromic approach to Painlevé III identifies this equation with an SL(2, C) isomonodromic problem with two double poles. The connection with affine differential geometry and its underlying isospectral Lax pair suggests that there is an alternative isomonodromic Lax pair for PIII given in terms of 3 by 3 matrices, as opposed to the standard Lax pair with 2 by 2 matrices [16]. (See also [22] where SL(2, C) ASDYM has been reduced to PIII.) Let us now return to the holomorphic setting, and consider the Lax pair for ASDYM in $\mathbb{C}^4$ with gauge group SL(3, C),
\[
(D_w + \lambda D_{\tilde z})\Psi = 0, \qquad (D_z + \lambda D_{\tilde w})\Psi = 0,
\]
where Ψ is a vector-valued function of $w, \tilde w, z, \tilde z$ and λ. We require that the connection is invariant under the 3-dimensional subgroup of the conformal group PGL(4, C) generated by
\[
\{\partial_w,\ \partial_{\tilde w},\ z\partial_z - \tilde z\partial_{\tilde z}\}, \tag{8.4}
\]


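and introduce coordinates $(\rho, \theta) \in \mathbb{C}^2$ such that $z = \rho e^{i\theta}$, $\tilde z = \rho e^{-i\theta}$, and $z\partial_z - \tilde z\partial_{\tilde z} = -i\,\partial/\partial\theta$. Then the ASDYM Lax pair becomes
\[
\left( -\zeta\partial_\rho + \rho^{-1}\zeta^2\partial_\zeta + 2(A_w - \zeta e^{-i\theta}A_{\tilde z}) \right)\Psi = 0, \qquad
\left( \partial_\rho + \rho^{-1}\zeta\partial_\zeta + 2(e^{i\theta}A_z - \zeta A_{\tilde w}) \right)\Psi = 0,
\]
where the gauge fields are in an invariant gauge; $(A_w, A_{\tilde w}, e^{i\theta}A_z, e^{-i\theta}A_{\tilde z})$ are functions of ρ only, and $\zeta = -\lambda e^{i\theta}$ is an invariant spectral parameter⁶. Taking linear combinations of these two linear PDEs gives a Lax pair of the form
\[
\frac{\partial\Psi}{\partial\zeta} = \hat L\,\Psi, \qquad \frac{\partial\Psi}{\partial\rho} = \hat M\,\Psi, \tag{8.5}
\]
where
\[
\hat L = \rho\zeta^{-2}\left( \zeta^2 A_{\tilde w} - A_w + \zeta\,(e^{-i\theta}A_{\tilde z} - e^{i\theta}A_z) \right), \qquad
\hat M = \zeta^{-1}\left( A_w + \zeta^2 A_{\tilde w} - \zeta\,(e^{i\theta}A_z + e^{-i\theta}A_{\tilde z}) \right).
\]

The calculation leading to Painlevé III (8.1) implies that if we gauge transform ansatz (4.19) with $U(z) = z^{-2}$, $\tilde U(\tilde z) = \tilde z^{-2}$ into an invariant gauge and substitute it into (8.5), then in the new coordinate $s = \rho^{1/2}$ the system (8.5) becomes a Lax pair of the Painlevé III with special values of parameters (8.1). We shall now present this calculation: An invariant gauge of (4.19) can be obtained using the gauge transformation with
\[
g = \begin{pmatrix} e^{i\theta/3} & 0 & 0 \\ 0 & e^{-2i\theta/3} & 0 \\ 0 & 0 & e^{i\theta/3} \end{pmatrix},
\]
which does not change $A_w$ and $A_{\tilde w}$, but gives
\[
e^{i\theta}A_z = \begin{pmatrix}
\frac{1}{6\rho} & \frac{1}{\sqrt 2}e^{\psi/2} & 0 \\
0 & -\left(\frac{1}{4}\psi_\rho + \frac{1}{3\rho}\right) & -\frac{1}{\rho^2}e^{-\psi} \\
0 & 0 & \frac{1}{4}\psi_\rho + \frac{1}{6\rho}
\end{pmatrix},
\qquad
e^{-i\theta}A_{\tilde z} = \begin{pmatrix}
-\frac{1}{6\rho} & 0 & 0 \\
-\frac{1}{\sqrt 2}e^{\psi/2} & \frac{1}{4}\psi_\rho + \frac{1}{3\rho} & 0 \\
0 & -\frac{1}{\rho^2}e^{-\psi} & -\left(\frac{1}{4}\psi_\rho + \frac{1}{6\rho}\right)
\end{pmatrix}.
\]
Then, in terms of $s = \rho^{1/2}$ and $H(s) = s^3 e^{\psi}$, the system (8.5) gives a Lax pair for the Painlevé III equation (8.1) as
\[
\frac{\partial\Psi}{\partial\zeta} = L\,\Psi, \qquad \frac{\partial\Psi}{\partial s} = M\,\Psi, \tag{8.6}
\]

6 The spectral parameter λ is not constant along the lift of the generators (8.4) to $\mathbb{C}^4 \times \mathbb{CP}^1 \ni (w, \tilde w, z, \tilde z, \lambda)$ where Ψ is defined. However, the invariant spectral parameter ζ is constant along the lift, and hence we are allowed to express Ψ as a function of ρ and ζ only.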

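where
\[
L = -\frac{1}{\zeta^2}\begin{pmatrix}
\frac{\zeta}{3} & \frac{1}{\sqrt 2}\,\zeta\,(sH)^{1/2} & \frac{1}{\sqrt 2}\,(sH)^{1/2} \\
\frac{1}{\sqrt 2}\,\zeta\,(sH)^{1/2} & \zeta\left(\frac{1}{12} - \frac{sH_s}{4H}\right) & -\zeta\,\frac{s}{H} \\
\frac{1}{\sqrt 2}\,\zeta^2\,(sH)^{1/2} & \zeta\,\frac{s}{H} & \zeta\left(\frac{sH_s}{4H} - \frac{5}{12}\right)
\end{pmatrix},
\]
\[
M = \sqrt{2}\left(\frac{H}{s}\right)^{1/2}\begin{pmatrix}
0 & -1 & \frac{1}{\zeta} \\
1 & 0 & \sqrt{2}\left(\frac{s}{H^3}\right)^{1/2} \\
-\zeta & \sqrt{2}\left(\frac{s}{H^3}\right)^{1/2} & 0
\end{pmatrix}.
\]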
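The statement that (8.6) is a Lax pair for (8.1) means that the compatibility condition $\partial_s L - \partial_\zeta M + [L, M] = 0$ holds exactly when H satisfies (8.1). The sketch below tests this numerically at one sample point. It rests on assumptions that are not in the paper: the matrix entries are typed in as read from the displayed L and M, the sample values of $s$, $H$, $H_s$, ζ are arbitrary positive numbers, and derivatives are taken by central differences along a local quadratic model of H.

```python
import math

def piii_rhs(s, H, Hs):
    """H'' prescribed by (8.1)."""
    return Hs ** 2 / H - Hs / s - 8.0 * H ** 2 / s - 16.0 / H

def lax_L(s, H, Hs, zeta):
    """Matrix L of (8.6); entries as read from the display (assumes s, H > 0)."""
    a = math.sqrt(s * H) / math.sqrt(2.0)
    q = s * Hs / (4.0 * H)
    f = -1.0 / zeta ** 2
    return [
        [f * zeta / 3.0, f * zeta * a, f * a],
        [f * zeta * a, f * zeta * (1.0 / 12.0 - q), -f * zeta * s / H],
        [f * zeta ** 2 * a, f * zeta * s / H, f * zeta * (q - 5.0 / 12.0)],
    ]

def lax_M(s, H, zeta):
    """Matrix M of (8.6); entries as read from the display (assumes s, H > 0)."""
    c = math.sqrt(2.0 * H / s)        # sqrt(2) * (H/s)^(1/2)
    e = math.sqrt(2.0 * s / H ** 3)   # sqrt(2) * (s/H^3)^(1/2)
    return [
        [0.0, -c, c / zeta],
        [c, 0.0, c * e],
        [-c * zeta, c * e, 0.0],
    ]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def zero_curvature_residual(s0, H0, H1, zeta, H2=None, h=1e-5):
    """Largest entry of dL/ds - dM/dzeta + [L, M] at (s0, zeta).

    H is modelled locally by H0 + H1*t + H2*t^2/2; by default H2 is the
    second derivative dictated by (8.1)."""
    if H2 is None:
        H2 = piii_rhs(s0, H0, H1)

    def L_at(s):
        t = s - s0
        return lax_L(s, H0 + H1 * t + 0.5 * H2 * t * t, H1 + H2 * t, zeta)

    Lp, Lm, L0 = L_at(s0 + h), L_at(s0 - h), L_at(s0)
    Mp, Mm = lax_M(s0, H0, zeta + h), lax_M(s0, H0, zeta - h)
    M0 = lax_M(s0, H0, zeta)
    LM, ML = matmul(L0, M0), matmul(M0, L0)
    return max(
        abs((Lp[i][j] - Lm[i][j]) / (2 * h)
            - (Mp[i][j] - Mm[i][j]) / (2 * h)
            + LM[i][j] - ML[i][j])
        for i in range(3) for j in range(3)
    )

if __name__ == "__main__":
    print("with (8.1):   ", zero_curvature_residual(1.3, 0.8, 0.4, 0.7))
    print("with H'' = 0: ", zero_curvature_residual(1.3, 0.8, 0.4, 0.7, H2=0.0))
```

With the second derivative taken from (8.1) the residual is at the level of the finite-difference error, whereas an arbitrary value such as $H_{ss} = 0$ leaves a residual of order one in the diagonal entries.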

The matrix L has two double poles as expected for Painlevé III [16], at ζ = 0 and ζ = ∞. We note here that a different (i.e. gauge inequivalent) 3 × 3 isomonodromic Lax pair for Painlevé III of type D7 was used by Kitaev in [17]. The Lax pair can also be derived from the ASDYM Lax pair, from a solution to Hitchin equations which is gauge equivalent to (6.1). Acknowledgements. We wish to thank Philip Boalch, Robert Conte, Eugene Ferapontov, Nigel Hitchin, John Loftin, Ian McIntosh, Yousuke Ohyama and Wolfgang Schief for valuable comments. Prim Plansangkate is grateful to the Royal Thai Government for funding her research.

References

1. Ablowitz, M.J., Ramani, A., Segur, H.: A connection between nonlinear evolution equations and ordinary differential equations of P-type. I, II. J. Math. Phys. 21, 715–721 and 1006–1015 (1980)
2. Baues, O., Cortés, V.: Proper Affine Hyperspheres which fiber over Projective Special Kähler Manifolds. Asian J. Math. 7, 115–132 (2003)
3. Bobenko, A.I., Eitner, U.: Painlevé Equations in the Differential Geometry of Surfaces. Lecture Notes in Mathematics 1753, Springer-Verlag, Berlin, 2000
4. Boldin, A.Yu., Safin, S.S., Sharipov, R.A.: On an old article of Tzitzéica and the inverse scattering method. J. Math. Phys. 34, 5801–5809 (1993)
5. Calabi, E.: Complete affine hyperspheres I. Symposia Mathematica, Vol. X, pp. 19–38. London, Academic Press, 1972
6. Cheng, S.-Y., Yau, S.-T.: Complete affine hyperspheres. Part I. The completeness of affine metrics. Commun. Pure Appl. Math. 39, 839–866 (1986)
7. Dunajski, M.: Hyper complex four manifolds from the Tzitzéica equation. J. Math. Phys. 43, 651–658 (2002)
8. Dunajski, M.: Solitons, Instantons and Twistors. Oxford Graduate Texts in Mathematics, Oxford University Press (ISBN 9780198570622), in press, 2009
9. Ferapontov, E.V., Khusnutdinova, K.R.: Hydrodynamic reductions of multi-dimensional dispersionless PDEs: the test for integrability. J. Math. Phys. 45, 2365–2377 (2004)
10. Gross, M., Siebert, B.: Affine manifolds, log structures, and mirror symmetry. Turkish J. Math. 27, 33–60 (2003); Mirror Symmetry via Logarithmic Degeneration Data I. J. Differ. Geom. 72, 169–338 (2006)
11. Gross, M., Huybrechts, D., Joyce, D.: Calabi–Yau Manifolds and Related Geometries. Springer-Verlag, Berlin, 2003
12. Haase, C., Zharkov, I.: Integral affine structures on spheres and torus fibrations of Calabi-Yau toric hypersurfaces II. http://arxiv.org/abs/math/0301222v1 [math.AG], 2003
13. Hitchin, N.J.: The Moduli space of special Lagrangian submanifolds. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 25, 503–515 (1997)
14. Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. Lond. Math. Soc. 55, 59–126 (1987)
15. Ince, E.L.: Ordinary Differential Equations. New York: Dover, 1956
16. Jimbo, M., Miwa, T.: Monodromy preserving deformation of linear ordinary differential equations with rational coefficients, II and III. Physica 2D, 407–448 and 4D, 26–46 (1981)
17. Kitaev, A.V.: The Method of isomonodromic deformations for the ‘degenerate’ third Painlevé equation. J. Sov. Math. 46, 2077–2083 (1989)
18. Leung, N.C.: Mirror symmetry without corrections. Comm. Anal. Geom. 13, 287–331 (2005)
19. Loftin, J.: Survey of Affine Spheres. http://arxiv.org/abs/0809.1186v1 [math.DG], 2008
20. Loftin, J., Yau, S.T., Zaslow, E.: Affine manifolds, SYZ geometry and the “Y” vertex. J. Diff. Geom. 71, 129–158 (2005)
21. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. B 100, 54–56 (1982)
22. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-Duality, and Twistor Theory. Oxford: Clarendon Press, 1996
23. McIntosh, I.: Special Lagrangian cones in C³ and primitive harmonic maps. J. London Math. Soc. 67, 769–789 (2003)
24. Mikhailov, A.V.: The reduction problem and the inverse scattering method. Physica 3D 1&2, 73–117 (1981)
25. Nomizu, K., Sasaki, T.: Affine Differential Geometry: Geometry of Affine Immersions. Cambridge: Cambridge University Press, 1994
26. Okamoto, K.: Studies on the Painlevé equations IV. Third Painlevé equation PIII. Funkcial. Ekvac. 30, 305–332 (1987)
27. Simon, U., Wang, C.P.: Local theory of affine 2-spheres. In: Proceedings of Symposia in Pure Mathematics 54, Providence, RI: Amer. Math. Soc., 1993, pp. 585–598
28. Strominger, A., Yau, S.-T., Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B 479, 243–259 (1996)
29. Tzitzéica, G.: Sur une nouvelle classe de surfaces. Rend. Circolo Mat. Palermo 25, 180–187 (1908)
30. Tzitzéica, G.: Sur une nouvelle classe de surfaces. C. R. Acad. Sci. Paris 150, 955–956 (1910)
31. Wang, E.: Tzitzéica transformation is a dressing action. J. Math. Phys. 47, 053502 (2006)

Communicated by G. W. Gibbons

Commun. Math. Phys. 290, 1025–1031 (2009) Digital Object Identifier (DOI) 10.1007/s00220-008-0722-z


A Negative Mass Theorem for Surfaces of Positive Genus

K. Okikiolu

Department of Mathematics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA. E-mail: [email protected]

Received: 5 October 2008 / Accepted: 13 October 2008
Published online: 22 January 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: Let M be a closed surface. For a metric g on M, denote the Laplace-Beltrami operator by $\Delta = \Delta_g$. We define $\operatorname{trace}\Delta^{-1} = \int_M m(p)\,dA$, where dA is the area element for g and m(p) is the Robin constant at the point p ∈ M, that is the value of the Green function G(p, q) at q = p after the logarithmic singularity has been subtracted off. Since $\operatorname{trace}\Delta^{-1}$ can also be obtained by regularization of the spectral zeta function, it is a spectral invariant. Heuristically it represents the sum of squares of the wavelengths of the surface. We define the Δ-mass of (M, g) to equal $(\operatorname{trace}\Delta_g^{-1} - \operatorname{trace}\Delta_{S^2,A}^{-1})/A$, where $\Delta_{S^2,A}$ is the Laplacian on the round sphere of area A. This is an analog for closed surfaces of the ADM mass from general relativity. We show that if M has positive genus, the minimum of the Δ-mass on each conformal class is negative and attained by a smooth metric. For this minimizing metric, there is a sharp logarithmic Hardy-Littlewood-Sobolev inequality and a Moser-Trudinger-Onofri type inequality.

Section 1. Introduction

Let M be a closed Riemann surface and let g be a metric on M compatible with the complex structure. With respect to complex coordinates z, the metric g is the real part of the Kähler metric $e^u\,dz \otimes d\bar z$, for some smooth real valued function u on M. The area element is
\[
dA = \frac{i e^u}{2}\,dz \wedge d\bar z.
\]
Denote the total area by A. The Laplace Beltrami operator for the metric g is given by
\[
\Delta = \Delta_g = -4e^{-u}\,\partial_z\partial_{\bar z}.
\]

The author would like to acknowledge the support of the Institute for Advanced Study.


(This is sometimes called the geometer’s Laplacian: note the sign.) The Green’s function for the metric g is the smooth real valued function G on $M \times M \setminus \{(p, p) : p \in M\}$ such that
\[
\int_M G(p, q)\,\Delta f(q)\,dA(q) = f(p) - \frac{1}{A}\int_M f\,dA
\]
for smooth functions f on M. It follows that G is symmetric, $\int_M G(p, q)\,dA(q) = 0$, and
\[
\Delta_p G(p, q) = -\frac{1}{A} \quad\text{when } p \neq q. \tag{1.1}
\]
For the smooth function f on M we define
\[
\Delta^{-1} f(p) = \int_M G(p, q)\,f(q)\,dA(q).
\]
If d(p, q) is the geodesic distance between p and q in the metric g then there exists a smooth function m on M such that
\[
G(p, q) = -\frac{1}{2\pi}\log d(p, q) + m(p) + O(d(p, q)) \quad\text{as } d(p, q) \to 0.
\]
The value m(p) is known as the Robin constant at p. We define
\[
\operatorname{trace}\Delta_g^{-1} = \int_M m_g\,dA.
\]
This is a spectral invariant for Δ, since it can be obtained from the spectral zeta function associated to Δ, see [S1,S2,M3], or [Ok1]. Heuristically it represents the sum of squares of the wavelengths of the surface (up to a constant). It is convenient to normalize to get a scale invariant quantity. Indeed, define the Δ-mass to be
\[
\mathcal{M}(g) = \frac{\operatorname{trace}\Delta_g^{-1} - \operatorname{trace}\Delta_{S^2,A}^{-1}}{A},
\]
where $\Delta_{S^2,A}$ is the Laplacian for the round metric on $S^2$ with area A. Then the above results show that $\mathcal{M}(g)$ is always positive when g is a metric on $S^2$. In this paper we show the following:

Theorem 1 (Negative mass theorem for positive genus surfaces). Given a metric g on the closed surface M of positive genus, there exists a conformal metric $e^{\phi}g$ such that $\mathcal{M}(e^{\phi}g) < 0$. In fact $e^{\phi}g$ can be chosen to minimize $\mathcal{M}$ within the conformal class.

When M is a torus, this was proved in [Ok2]. We remark that Theorem 1 fails on the sphere. This follows from the logarithmic Hardy-Littlewood-Sobolev inequality for the sphere [On,CL,B]:

Theorem (Morpurgo [M2]). For a metric g on the sphere $S^2$, the value $\mathcal{M}(g)$ is strictly positive unless g is round.

From [Ok1], we immediately obtain the following corollary to Theorem 1.


Corollary 2 (Analogs of Logarithmic HLS inequality and the Moser-Trudinger-Onofri Inequality for general surfaces). If the metric g minimizes $\mathcal{M}$ within its conformal class, then
\[
\frac{1}{4\pi}\int_M \psi\,e^{\psi}\,dA - \frac{1}{A}\int_M e^{\psi}\,\Delta^{-1}e^{\psi}\,dA \ \geq\ 0
\]
for all functions ψ : M → R with $\int_M e^{\psi}\,dA = A$ such that $\int_M \psi\,e^{\psi}\,dA$ is finite. Here, dA and Δ are associated to g. Moreover, for $\psi \in C^{\infty}(M)$,
\[
\frac{1}{16\pi}\int_M \psi\,\Delta\psi\,dA - \log\left(\frac{1}{A}\int_M e^{\psi}\,dA\right) + \frac{1}{A}\int_M \psi\,dA \ \geq\ 0.
\]
For some related results, see [Ch,CheC,DJLW,LL1,LL2,M2,M3,NT,Ok1,Ok2,OPS,S2].

Remark. The quantity $\mathcal{M}(g)$ can be viewed as an analog of the ADM mass. Indeed, writing K(p) for the Gaussian curvature of g at p, it is shown in [S1,S2] that for any metric g on the 2-sphere, the natural analog of the ADM mass for metrics on the sphere is the constant
\[
m_g(p) - \frac{1}{2\pi}\Delta^{-1}K(p) = \frac{1}{A}\operatorname{trace}\Delta_g^{-1}. \tag{1.2}
\]
Although the left-hand side of (1.2) is not constant in general for surfaces of higher genus, the right-hand side can be thought of as the analog of the mass. (The left-hand side of (1.2) is constant for the canonical metric, a fact we use in the next section.) For a probabilistic interpretation of $\operatorname{trace}\Delta^{-1}$, see [DS1]. There it is shown that $\operatorname{trace}\Delta^{-1}$ is the constant term in an asymptotic expansion in ε of the time it takes a Brownian particle starting at a randomly chosen point on the surface to get ε-close to another randomly chosen point.

Section 2. The Proof

There are two main ingredients in the proof of this result. The first is an identity concerning the Arakelov Green’s function which is used in the construction of the Arakelov metric. The second is a delicate result on the mean field equation on surfaces proved in [DJLW]. That paper gives conditions under which a general mean field equation has a solution. Here we show that for the particular case of the canonical metric on M and the mean field equation arising from $\operatorname{trace}\Delta^{-1}$, the conditions of the [DJLW] theorem are satisfied.

We start by recalling the way that Robin’s constant and the sum of squares of the wavelengths change under a conformal change of the metric.

Proposition 2.1. Conformal change of the Robin constant. If φ is a smooth function on M then
\[
m_{e^{\phi}g}(p) = m_g(p) + \frac{\phi(p)}{4\pi} - \frac{2}{A_{\phi}}\,(\Delta_g^{-1}e^{\phi})(p) + \frac{1}{A_{\phi}^2}\int_M e^{\phi}\,\Delta_g^{-1}e^{\phi}\,dA,
\]
where
\[
A_{\phi} = \int_M e^{\phi}\,dA.
\]
For the proof, see for example [S1,S2,M3] or [Ok1].


Proposition 2.2. Conformal change of $\operatorname{trace}\Delta^{-1}$ (Morpurgo’s Formula). If φ is a smooth function on M, then
\[
\operatorname{trace}\Delta_{e^{\phi}g}^{-1} = \int_M m_g\,e^{\phi}\,dA + \frac{1}{4\pi}\int_M \phi\,e^{\phi}\,dA - \frac{1}{A_{\phi}}\int_M e^{\phi}\,\Delta_g^{-1}e^{\phi}\,dA. \tag{2.1}
\]
Now we discuss the canonical metric and the Arakelov Green’s function. We refer the reader to [W] and [F] for more details on this subject. If M is a Riemann surface of genus H, there exists a metric g on M known as the canonical metric which is compatible with the complex structure. It is defined by taking the Jacobian embedding of the Riemann surface M into a 2H-dimensional torus, and pulling back the flat metric on the torus to M. Indeed, let $\{A_j, B_j\}$ be a symplectic homology basis for $H_1(M, \mathbb{Z})$ satisfying the intersection pairings
\[
\#[A_i, A_j] = 0, \qquad \#[B_i, B_j] = 0, \qquad \#[A_i, B_j] = \delta_{ij}.
\]
Take a basis $\theta_j$ for the space of holomorphic 1-forms satisfying
\[
\int_{A_j} \theta_k = \delta_{jk}.
\]
Then the period matrix $\Omega_{ij}$ given by
\[
\Omega_{ij} = \int_{B_i} \theta_j
\]
has positive definite imaginary part. The Jacobian variety associated to M is $J(M) = \mathbb{C}^H/(\mathbb{Z}^H + \Omega\,\mathbb{Z}^H)$. The Abel map gives an embedding of M into J(M),
\[
I : z \mapsto \int_{z_0}^{z} (\theta_1, \ldots, \theta_H).
\]
The canonical Kähler metric on M is given in terms of local holomorphic coordinates z by
\[
\mu(z)\,dz \otimes d\bar z, \quad\text{where}\quad \mu(z) = \frac{1}{H}\sum_{j,k=1}^{H} (\operatorname{Im}\Omega)^{-1}_{jk}\,\frac{d\theta_j}{dz}\,\frac{d\bar\theta_k}{d\bar z}.
\]
The real part of the Kähler metric is the Riemannian metric g. It can be checked that this metric has unit area. The Green’s function for this metric is known as the Arakelov Green’s function and the following result is well known.

Proposition 2.3. If M is a closed Riemann surface of genus H and if g is the canonical metric on M with unit area, then the Robin constant m(p) for g satisfies
\[
\Delta m(p) = 2H - 2 + \frac{K(p)}{2\pi}.
\]
Negative Mass Theorem for Surfaces of Positive Genus

1029

Proof of Proposition 2.3. The Gaussian curvature K is given by 2∂z ∂z¯ log µ = −µK . The Arakelov Green’s function is given by G(z, w) = −

H 1 1 log |E(z, w)| + (Im )−1 jk Im(Z − W ) j Im(Z − W )k 2π 2 j,k=1

+

log µ(z) m(w) log µ(w) m(z) − + − . 2 8π 2 8π

Here, Z = I (z) and W = I (w) and E(z, w) is the prime form which plays the role of z − w, is holomorphic in z and w, and transforms as a (−1/2, −1/2) form in each variable. We notice that log |E(z, w)| is harmonic, and so from (1.1) we have that for w = z, µ(z) = 4∂z ∂z¯ G(z, w) = H µ(z) + 2∂z ∂z¯ m(z) +

µ(z)K (z) . 4π

Hence we see that 4µ−1 ∂z ∂z¯ m = 2 − 2H −

K . 2π

Theorem 1 is now an application of the following result on the mean field equation which is obtained from Theorem 1.2 of [DJLW] and its proof. Theorem 2.4. [DJLW] Let (M, g) be a closed surface of unit area and let h be a smooth positive function on M. Suppose p0 is a point at which 8π m +2 log h attains its maximum value, and suppose in addition that log h( p0 ) < 8π − 2K ( p0 ). Then the minimum of the functional 1 |∇u|2 d A + u d A − log heu d A J (u) = 16π M M M

(2.2)

over functions u in the Sobolev space H 1 (M) is attained at a smooth function u satisfying u = 8π heu − 8π. Moreover, for this minimum point u we have J (u) < − 1 + log π + max(4π m g ( p) + log h( p)) . p∈M

Proof of Theorem 1. We take g to be the canonical metric on M, and we set h = e−4π m g .

(2.3)

(2.4)


Then by Proposition 2.3, we have
\[
\Delta\log h = -4\pi\,\Delta m_g = 8\pi - 8\pi H - 2K < 8\pi - 2K.
\]
Hence we obtain the conclusion of Theorem 2.4. From (2.3), the function u satisfies
\[
\int_M h e^u\,dA = 1.
\]
From (2.2) and (2.3) we see that
\[
J(u) = \frac{1}{2}\int_M u\,(h e^u + 1)\,dA.
\]
However, writing $\phi = u - 4\pi m_g$, we have
\[
\int_M e^{\phi}\,dA = 1,
\]
and
\[
\Delta_g^{-1}e^{\phi} = \frac{1}{8\pi}\left( u - \int_M u \right). \tag{2.5}
\]
Hence from (2.1),
\[
\operatorname{trace}\Delta_{e^{\phi}g}^{-1} = \frac{1}{8\pi}\int_M u\,(h e^u + 1) = \frac{J(u)}{4\pi}.
\]
Now from (2.4) and the fact that
\[
\operatorname{trace}\Delta_{S^2,1}^{-1} = \frac{-1 - \log\pi}{4\pi},
\]
we see that
\[
\operatorname{trace}\Delta_{e^{\phi}g}^{-1} < \operatorname{trace}\Delta_{S^2,1}^{-1}.
\]
In fact we remark that $e^{\phi}g$ minimizes $\operatorname{trace}\Delta^{-1}$ among unit area metrics in the conformal class of g. Indeed, from [Ok] Theorem 1, we conclude that the minimum of $\operatorname{trace}\Delta_{e^{\psi}g}^{-1}$ among conformal factors ψ with $\int_M e^{\psi}\,dA = 1$ must in fact be attained at a metric $e^{\psi}g$, which must also satisfy the Euler-Lagrange equation, namely that the Robin constant $m_{e^{\psi}g}(p)$ is constant:
\[
\Delta_g^{-1}e^{\psi} = \frac{1}{8\pi}\left( \psi + 4\pi m_g - \int_M (\psi + 4\pi m_g)\,dA \right).
\]
However, setting $v = \psi + 4\pi m_g$, we find that v is a critical point of J, and hence
\[
\operatorname{trace}\Delta_{e^{\psi}g}^{-1} = \frac{J(v)}{4\pi} \ \geq\ \frac{J(u)}{4\pi} = \operatorname{trace}\Delta_{e^{\phi}g}^{-1}.
\]

We also remark that in general the metric $e^{\phi}g$ need not coincide with the canonical metric or the constant curvature metric or the Arakelov metric, see [Ok2]. On a long thin rectangular torus, the minimizer is close to being a round sphere with a short worm hole joining the poles.
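The last step of the proof of Theorem 1 is a short piece of arithmetic: with $h = e^{-4\pi m_g}$ the maximum in (2.4) vanishes, so $J(u) < -(1 + \log\pi)$, and dividing by 4π gives a trace strictly below that of the round unit-area sphere. The sketch below only illustrates this bookkeeping. The function names are made up for the illustration, and the value of J(u) passed in the usage lines is a hypothetical number, not one computed from any surface.

```python
import math

def trace_round_sphere_unit_area():
    """trace of the inverse Laplacian on the round unit-area sphere,
    (-1 - log(pi)) / (4*pi), as quoted in the proof."""
    return (-1.0 - math.log(math.pi)) / (4.0 * math.pi)

def j_upper_bound():
    """Right-hand side of (2.4) when h = exp(-4*pi*m_g), so that
    4*pi*m_g + log(h) is identically zero."""
    return -(1.0 + math.log(math.pi))

def mass_from_j(j_value):
    """Delta-mass of the unit-area metric e^phi g, using the identity
    trace = J(u) / (4*pi) derived from (2.1) and (2.5)."""
    return j_value / (4.0 * math.pi) - trace_round_sphere_unit_area()

if __name__ == "__main__":
    print("sphere trace:", trace_round_sphere_unit_area())
    # Hypothetical J(u) slightly below the bound (2.4): the mass is negative.
    print("mass:", mass_from_j(j_upper_bound() - 0.5))
```

Any value of J(u) strictly below the bound gives a strictly negative mass, and the bound itself corresponds to mass zero, which is the borderline the theorem excludes.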


Acknowledgement. I would like to thank Richard Wentworth for helpful discussions. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

[B] Beckner, W.: Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality. Annals of Math. 138, 213–242 (1993)
[CL] Carlen, E., Loss, M.: Competing symmetries, the logarithmic HLS inequality and Onofri’s inequality on S^n. Geom. Funct. Anal. 2, 90–104 (1992)
[Ch] Chang, S.-Y.A.: Conformal invariants and partial differential equations. Bull. Amer. Math. Soc. 42, 365–393 (2005)
[CheC] Chen, C.-C., Lin, C.-S.: Sharp estimates for solutions of multi-bubbles in compact Riemann surfaces. Comm. Pure Appl. Math. 55(6), 728–771 (2002)
[DJLW] Ding, W., Jost, J., Li, J., Wang, G.: The differential equation Δu = 8π − 8πhe^u on a compact Riemann surface. Asian J. Math. 1, 230–248 (1997)
[DS1] Doyle, P., Steiner, J.: Spectral invariants and playing hide and seek on surfaces. Preprint, available at http://www.cims.nyu.edu/~steiner/hideandseek.pdf, 2005
[DS2] Doyle, P., Steiner, J.: Blowing bubbles on the torus. Preprint, available at http://www.cims.nyu.edu/~steiner/torus.pdf, 2005
[F] Fay, T.: Theta functions on Riemann surfaces. Ann. Math. 119, 387 (1994)
[LL1] Lin, C.-S., Lucia, M.: Uniqueness of solutions for a mean field equation on the torus. J. Diff. Eqs. 229(1), 172–185 (2006)
[LL2] Lin, C.-S., Lucia, M.: One-dimensional symmetry of periodic minimizers for a mean field equation. Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 6(2), 269–290 (2007)
[M1] Morpurgo, C.: The logarithmic Hardy-Littlewood-Sobolev inequality and extremals of zeta functions on S^n. Geom. Funct. Anal. 6, 146–171 (1996)
[M2] Morpurgo, C.: Zeta functions on S^2. Extremal Riemann surfaces (San Francisco, 1995), Contemp. Math., 201, Providence, RI: Amer. Math. Soc., 1997, pp. 213–225
[M3] Morpurgo, C.: Sharp inequalities for functional integrals and traces of conformally invariant operators. Duke Math. J. 114, 477–553 (2002)
[NT] Nolasco, M., Tarantello, G.: On a sharp Sobolev-type inequality on two-dimensional compact manifolds. Arch. Ration. Mech. Anal. 145, 161–195 (1998)
[Ok1] Okikiolu, K.: Extremals for logarithmic HLS inequalities on compact manifolds. GAFA 107(5), 1655–1684 (2008)
[Ok2] Okikiolu, K.: A negative mass theorem for the 2-torus. Commun. Math. Phys. 284(3), 775–802 (2008)
[On] Onofri, E.: On the positivity of the effective action in a theory of random surfaces. Commun. Math. Phys. 86, 321–326 (1982)
[OPS] Osgood, B., Phillips, R., Sarnak, P.: Extremals of determinants of Laplacians. J. Funct. Anal. 80, 148–211 (1988)
[S1] Steiner, J.: Green’s Functions, Spectral Invariants, and a Positive Mass on Spheres. Ph.D. Dissertation, University of California San Diego, June 2003
[S2] Steiner, J.: A geometrical mass and its extremal properties for metrics on S^2. Duke Math. J. 129, 63–86 (2005)
[W] Wentworth, R.: The asymptotics of the Arakelov-Green’s function and Faltings’ delta invariant. Commun. Math. Phys. 137, 427–459 (1991)

Communicated by S. Zelditch

Commun. Math. Phys. 290, 1033–1049 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0793-5


Hamiltonian Systems Admitting a Runge–Lenz Vector and an Optimal Extension of Bertrand’s Theorem to Curved Manifolds

Ángel Ballesteros¹, Alberto Enciso², Francisco J. Herranz¹, Orlando Ragnisco³

1 Depto. de Física, Universidad de Burgos, 09001 Burgos, Spain. E-mail: [email protected], [email protected]
2 Depto. de Física Teórica II, Universidad Complutense, 28040 Madrid, Spain. E-mail: [email protected]
3 Dip. di Fisica, Università di Roma 3, and Istituto Nazionale di Fisica Nucleare, 00146 Rome, Italy. E-mail: [email protected]

Received: 6 October 2008 / Accepted: 9 December 2008
Published online: 8 April 2009 – © Springer-Verlag 2009

Abstract: Bertrand’s theorem asserts that any spherically symmetric natural Hamiltonian system in Euclidean 3-space which possesses stable circular orbits and whose bounded trajectories are all periodic is either a harmonic oscillator or a Kepler system. In this paper we extend this classical result to curved spaces by proving that any Hamiltonian on a spherically symmetric Riemannian 3-manifold which satisfies the same conditions as in Bertrand’s theorem is superintegrable and given by an intrinsic oscillator or Kepler system. As a byproduct we obtain a wide panoply of new superintegrable Hamiltonian systems. The demonstration relies on Perlick’s classification of Bertrand spacetimes and on the construction of a suitable, globally defined generalization of the Runge–Lenz vector.

1. Introduction and Preliminary Definitions

The Kepler problem and the harmonic oscillator are probably the most thoroughly studied systems in classical mechanics. The reasons for this are twofold. First, these potentials play a preponderant role in Physics due to their connection with planetary motion and oscillations around a nondegenerate equilibrium. Second, these potentials are of particular mathematical interest due to the existence of additional (or “hidden”) symmetries yielding additional constants of motion. In fact, both the Kepler and the harmonic oscillator Hamiltonians are (maximally) superintegrable in the sense that they have the maximum number (four) of functionally independent first integrals other than the Hamiltonian.¹ Bertrand’s theorem [6] is a landmark result which characterizes the Kepler and harmonic oscillator Hamiltonians in terms of their qualitative dynamics. A precise statement of this theorem is given below. We recall [18] that the first condition, which is occasionally forgotten, is necessary in order to exclude potentials of the form $V(q) = -K\|q\|^{-s}$, with K > 0 and s = 2, 3, . . . .

1 As usual, by functional independence of the integrals $I_1, \ldots, I_k$ we mean that the (k + 1)-form $dH \wedge dI_1 \wedge \cdots \wedge dI_k$ is nonzero in an open and dense subset of phase space, H being the Hamiltonian function.


Theorem 1 (Bertrand). Let $H = \frac{1}{2}\|p\|^2 + V(q)$ be a natural, spherically symmetric Hamiltonian system in a domain of $\mathbb{R}^3$. Let us suppose that: (i) There exist stable circular orbits. (ii) All the bounded trajectories are closed. Then the potential is either a Kepler ($V(q) = A/\|q\| + B$) or a harmonic oscillator potential ($V(q) = A\|q\|^2 + B$). In particular, H is superintegrable.

Analogues of the Kepler and harmonic oscillator systems in curved spaces have been of interest since the discovery of non-Euclidean geometry. In fact [49], the “intrinsic” Kepler and harmonic oscillator problems on spaces of constant curvature were studied by Lipschitz and Killing already in the 19th century, and later rediscovered by Schrödinger [48] and Higgs [25]. In both cases it was established that these systems are superintegrable and satisfy Properties (i) and (ii) above. A considerably more ambitious development was Perlick’s introduction and classification of Bertrand spacetimes [47], which was based on the following observation. Let (M, g) be a Riemannian 3-manifold and consider the space $\mathcal{M} = M \times \mathbb{R}$ endowed with the warped Lorentzian metric $\eta = g - \frac{1}{V}\,dt^2$, with V a smooth positive function on M. Then the trajectories in $(\mathcal{M}, \eta)$, that is, the projections of inextendible timelike geodesics to a constant time leaf $M \times \{t_0\}$, correspond to integral curves of the Hamiltonian $H = \frac{1}{2}\|p\|_g^2 + V(q)$ in (the cotangent bundle of) M. Thus Perlick introduced the following

Definition 2. A Lorentzian 4-manifold $(M \times \mathbb{R}, \eta)$ is a Bertrand spacetime if: (i) It is spherically symmetric and static in the sense that $\eta = g - \frac{1}{V}\,dt^2$ and M is diffeomorphic to $(r_1, r_2) \times S^2$, where the smooth function V depends only on r and the Riemannian metric g on M takes the form
\[
g = h(r)^2\,dr^2 + r^2\left( d\theta^2 + \sin^2\theta\,d\varphi^2 \right) \tag{1}
\]
in the adapted coordinate system (r, θ, ϕ). Here $r_1, r_2 \in \mathbb{R}^+ \cup \{+\infty\}$. (ii) There is a circular (r = const.) trajectory passing through each point of M.
(iii) The above circular trajectories are stable, that is, any initial condition sufficiently close to that of a circular trajectory gives a periodic trajectory. Perlick’s main result was the classification of all Bertrand spacetimes, recovering the classical Bertrand theorem as a subcase. However, two main related questions remained to be settled. On the one hand, the potentials V in Perlick’s classification lacked any physical interpretation, and this was in strong contrast with the Euclidean case. This drawback was circumvented in Ref. [3], where we showed that the two families of Perlick’s potentials correspond to either the “intrinsic” Kepler or harmonic oscillator potentials in the underlying 3-manifold (M, g). On the other hand, the issue of whether the corresponding Hamiltonian systems were superintegrable in some reasonable sense was left wide open. In fact, Perlick’s only remark in this direction was that, by virtue of a theorem of Hauser and Malhiot [24], only two concrete models among the family of Bertrand spacetimes admitted a quadratic additional integral coming from a second rank Killing tensor. A careful analysis of the literature reveals that many particular cases of Bertrand metrics have been thoroughly analyzed and shown to be superintegrable [21,22,27,28],


and that in many cases they have been shown to admit a generalization of the classical Runge–Lenz vector as an additional first integral. The physical and mathematical interest of these models (and thus of Bertrand spacetimes) is fostered by their connections with the theory of magnetic monopoles, with differential and algebraic geometry, and with low-dimensional manifold theory [9,10,33–35,43,45,46,50]. The relation between Bertrand spaces and monopole motion is not totally incidental. Indeed, an ample subclass of Bertrand spacetimes admitting some kind of generalized Runge–Lenz vectors (the so-called multifold Kepler systems) were introduced by Iwai and Katayama [27,28] as generalizations of the Taub–NUT metric, whose geodesics asymptotically describe the relative motion of two monopoles (see, for instance, [1,7,11,29,39,40]). Interestingly, superintegrable Hamiltonian systems on curved spaces have recently attracted considerable attention also within the integrable systems community, especially in low dimensions (cf. [2,4,5,30–32] and references therein). The main result of this article is that all Bertrand spacetimes are indeed superintegrable, their superintegrability being linked to the existence of a generalized Runge–Lenz vector. This enables us to present an optimal version of Bertrand’s theorem (Theorem 16) on spherically symmetric manifolds which includes the classification of the natural Hamiltonians whose bounded orbits are all periodic [47], the physical interpretation of the corresponding potentials as Kepler or harmonic oscillator potentials, in each case, and the proof of the superintegrability of these models. This settles in a quite satisfactory way a problem with a large body of previous partial results scattered in the literature. It is standard that the superintegrability of the Kepler system stems from the existence of a conserved Runge–Lenz vector, whose geometric significance is described from a modern perspective in [23]. 
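The conservation of the classical Runge–Lenz vector along Kepler trajectories is easy to observe numerically. The following is a minimal sketch for the Euclidean case only, under assumptions that are not in the text: unit mass, the Hamiltonian $H = \frac{1}{2}\|p\|^2 - k/\|q\|$ with k > 0, the vector $p \times (q \times p) - k\,q/\|q\|$, a fixed-step fourth-order Runge–Kutta integrator, and arbitrarily chosen initial data on a bound orbit.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def runge_lenz(q, p, k=1.0):
    """Runge-Lenz vector p x (q x p) - k q/|q| for H = |p|^2/2 - k/|q|."""
    pxl = cross(p, cross(q, p))
    r = norm(q)
    return tuple(pxl[i] - k * q[i] / r for i in range(3))

def kepler_field(q, p, k):
    """Hamilton's equations: dq/dt = p, dp/dt = -k q/|q|^3."""
    r3 = norm(q) ** 3
    return p, tuple(-k * x / r3 for x in q)

def axpy(a, b, c):
    """Return a + c*b componentwise."""
    return tuple(x + c * y for x, y in zip(a, b))

def rk4_step(q, p, dt, k=1.0):
    k1q, k1p = kepler_field(q, p, k)
    k2q, k2p = kepler_field(axpy(q, k1q, dt / 2), axpy(p, k1p, dt / 2), k)
    k3q, k3p = kepler_field(axpy(q, k2q, dt / 2), axpy(p, k2p, dt / 2), k)
    k4q, k4p = kepler_field(axpy(q, k3q, dt), axpy(p, k3p, dt), k)
    q_new = tuple(q[i] + dt / 6 * (k1q[i] + 2 * k2q[i] + 2 * k3q[i] + k4q[i])
                  for i in range(3))
    p_new = tuple(p[i] + dt / 6 * (k1p[i] + 2 * k2p[i] + 2 * k3p[i] + k4p[i])
                  for i in range(3))
    return q_new, p_new

def runge_lenz_drift(q0, p0, steps=20000, dt=1e-3, k=1.0):
    """Norm of the change in the Runge-Lenz vector after integrating the flow."""
    a0 = runge_lenz(q0, p0, k)
    q, p = q0, p0
    for _ in range(steps):
        q, p = rk4_step(q, p, dt, k)
    a1 = runge_lenz(q, p, k)
    return norm(tuple(x - y for x, y in zip(a1, a0)))

if __name__ == "__main__":
    q0, p0 = (1.0, 0.0, 0.0), (0.0, 1.2, 0.3)   # negative energy: bound orbit
    print("|A| at t = 0:", norm(runge_lenz(q0, p0)))
    print("drift of A  :", runge_lenz_drift(q0, p0))
```

For these initial data the vector points along the first axis with length equal to the eccentricity of the orbit (here 0.53, since k = 1), and its drift over roughly one period stays at the level of the integration error.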
On the other hand, the superintegrability of the harmonic oscillator is usually established either using explicit (scalar) first integrals or the conserved rank 2 tensor $C = 2\omega^2\,q \otimes q + p \otimes p$, which is sometimes preferable for algebraic reasons [19]. That the latter approach is closely related to a (multivalued) analogue of the Runge–Lenz vector was firmly established in [26]. Motivated by this connection, we have based our approach to the integrability of the Bertrand systems on the construction of a generalized Runge–Lenz vector, globally defined on a finite cover of M. This construction relies on a detailed analysis of the integral curves of the appropriate Hamiltonians. The literature on generalizations of the Runge–Lenz vector for central potentials on Euclidean space is vast (see the survey [36] and references therein), but unfortunately several interesting papers are severely flawed by the lack of distinction between local, semi-global and global existence.

The article is organized as follows. In Sect. 2 we recall the two families of Bertrand spacetimes entering Perlick’s classification, which are labeled by two coprime positive integers n and m. We also include the characterization of Perlick’s potentials as the intrinsic Kepler or harmonic oscillator potentials of the corresponding Riemannian 3-manifolds (M, g) and briefly discuss several physically relevant examples. In Sect. 3 we consider the associated natural Hamiltonian systems on (M, g) and compute their integral curves in closed form (Proposition 7). Using this result we easily derive that the latter Hamiltonians are geometrically superintegrable (cf. Definition 9 and Proposition 10) in the region of phase space foliated by bounded orbits, as happens with the harmonic oscillator and Kepler potentials in $\mathbb{R}^3$. Our central result is a stronger superintegrability theorem (Theorem 12) that we present in Sect. 4, where we construct a generalized Runge–Lenz vector globally defined on an n-fold cover of M.
As a corollary of this construction we also obtain a global rank n tensor field in M invariant under the flow and a wide panoply of new superintegrable Hamiltonian systems. Lastly, in


Sect. 5 we combine the results established in the previous sections to obtain an optimal extension of Bertrand’s theorem to curved spaces (Theorem 16).

2. Harmonic Oscillators and Kepler Potentials in Bertrand Spacetimes

In this section we shall define the “intrinsic” Kepler and harmonic oscillator potentials in a spherically symmetric 3-manifold and show how Bertrand spacetimes are related to the Kepler and harmonic oscillator potentials of any of their constant time leaves. Most of the material included here is essentially taken from Ref. [3]; for the sake of completeness, let us mention that further information on geometric properties of Green functions can be found e.g. in [13–15,37,38].

We start by letting (M, g) be a Riemannian 3-manifold as in Definition 2. In particular, the metric g takes the form

\[ ds^2 = h(r)^2\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{2} \]

It is standard that if u(r) is a function which depends only on the radial coordinate, then its Laplacian is also radial and reads

\[ \Delta_g u(r) = \frac{1}{r^2 h(r)}\,\frac{d}{dr}\left(\frac{r^2}{h(r)}\,\frac{du}{dr}\right). \]

As the Kepler potential in Euclidean three-dimensional space is simply the radial Green function of the Laplacian and the harmonic oscillator is its inverse square, it is natural to make the following

Definition 3. The (intrinsic) Kepler and the harmonic oscillator potentials in (M, g) are respectively given by the radial functions

\[ V_K(r) = A_1 \int_a^r s^{-2} h(s)\,ds + B_1, \qquad V_H(r) = A_2 \left(\int_a^r s^{-2} h(s)\,ds\right)^{-2} + B_2, \tag{3} \]

where $a$, $A_j$, $B_j$ are constants.

Example 4. Let (M, g) be the simply connected, three-dimensional space form of sectional curvature κ. In this case the metric has the form (2) with

\[ h(r)^2 = \frac{1}{1 - \kappa r^2}. \]

The corresponding Kepler and harmonic oscillator potentials are therefore

\[ V_K = \sqrt{r^{-2} - \kappa}, \qquad V_H = \frac{1}{r^{-2} - \kappa}, \tag{4} \]

up to additive and multiplicative constants. In terms of the distance function $\rho_\kappa$ to the point r = 0 this can be rewritten as

\[ V_K = \sqrt{\kappa}\,\cot\left(\sqrt{\kappa}\,\rho_\kappa\right), \qquad V_H = \frac{\tan^2\left(\sqrt{\kappa}\,\rho_\kappa\right)}{\kappa}, \]

thus reproducing the known prescriptions for the sphere and the hyperbolic space [5,49]. The Euclidean case is recovered by letting κ → 0.
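The agreement between the two forms of the potentials in Example 4 can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes κ > 0 (the sphere) and uses the relation r = sin(√κ ρ)/√κ, which follows from integrating the radial part of the metric. All function names are illustrative.

```python
import math

def potentials_in_r(r, kappa):
    """Eq. (4): (V_K, V_H) as functions of the radial coordinate r."""
    w = r ** -2 - kappa
    return math.sqrt(w), 1.0 / w

def potentials_in_rho(rho, kappa):
    """The same potentials in terms of the geodesic distance rho (kappa > 0)."""
    s = math.sqrt(kappa)
    return s / math.tan(s * rho), math.tan(s * rho) ** 2 / kappa

def max_relative_mismatch(kappa=0.7, samples=40):
    """Compare both forms on the open interval 0 < rho < pi / (2 sqrt(kappa))."""
    s = math.sqrt(kappa)
    worst = 0.0
    for i in range(1, samples):
        rho = i * math.pi / (2 * s * samples)
        r = math.sin(s * rho) / s  # radial coordinate at geodesic distance rho
        pairs = zip(potentials_in_r(r, kappa), potentials_in_rho(rho, kappa))
        for a, b in pairs:
            worst = max(worst, abs(a - b) / abs(b))
    return worst
```

The hyperbolic case κ < 0 would require replacing the trigonometric functions by their hyperbolic counterparts and is not covered by this sketch.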


Now let us consider the spherically symmetric spaces $(M, g_j)$ (j = I, II) defined by the metrics

\[ \text{Type I:}\quad ds^2 = \frac{m^2}{n^2\left(1 + K r^2\right)}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right), \tag{5a} \]

\[ \text{Type II:}\quad ds^2 = \frac{2m^2\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right)}{n^2\left((1 - Dr^2)^2 - K r^4\right)}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right), \tag{5b} \]

where D and K are real constants and m and n are coprime positive integers. The maximal interval $(r_1, r_2)$ can be easily found from these expressions. These Riemannian 3-manifolds, which first appeared in [47] (where the quotient n/m was called β), will be henceforth called Bertrand spaces. A short computation shows that, up to a multiplicative constant, the Kepler potential of a Bertrand space of type I is

\[ V_{\rm I} = \sqrt{r^{-2} + K} + G, \tag{6a} \]

whereas the harmonic oscillator potential of one of type II can be written in the convenient form

\[ V_{\rm II} = G \mp r^2\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right)^{-1}. \tag{6b} \]

Here G is an arbitrary constant. By comparing with Ref. [47], the above digression immediately yields the following

Proposition 5. (M, η) is a Bertrand spacetime if and only if it is isometric to the warped product $\left(M \times \mathbb{R},\; g_j - \frac{dt^2}{V_j}\right)$, with $(M, g_j)$ a Bertrand space of type j (j = I, II, cf. (5)) and $V_j$ given by (6).

In particular, this shows that the fact that Perlick obtained two different kinds of Bertrand spacetimes has a natural interpretation [3]: they are associated to either Kepler (type I) or harmonic oscillator (type II) potentials. The multiplicative constant of the potentials is inessential and can be eliminated by rescaling the time variable.

Example 6. We conclude this section with a brief discussion of a few examples of physically relevant spaces that are Bertrand. This is intended both to serve as motivation and to help the reader gain some insight into Bertrand spaces. A more detailed discussion can be found in [3].

(i) Spaces of constant curvature. The metric of the simply connected Riemannian 3-manifold of constant sectional curvature κ is usually written as

\[ ds^2 = \frac{dr^2}{1 - \kappa r^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \]

We have already seen that the Kepler and harmonic oscillator potentials in these spaces are given by (4), and it is well known that all the bounded integral curves of both systems are periodic. This result is immediately recovered by noticing that the Kepler system is obtained from the type I Bertrand spacetimes when n = m = 1 and K = −κ, whereas the harmonic oscillator is obtained as the type II Bertrand spacetime with n/m = 2, K = 0 and D = κ.


(ii) Darboux space of type III. Consider the metric

\[ ds^2 = \frac{k^2 + 2r^2 + k\sqrt{k^2 + 4r^2}}{2\left(k^2 + 4r^2\right)}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right), \]

whose intrinsic harmonic oscillator potential is given by

\[ V_{\rm II} = \frac{2k^2 r^2}{k^2 + 2r^2 + k\sqrt{k^2 + 4r^2}} \]

up to multiplicative and additive constants. This defines a Bertrand spacetime of type II with parameters n/m = 2, $K = 4/k^4$ and $D = -2/k^2$. Let us introduce coordinates $Q = (Q_1, Q_2, Q_3)$ as

\[ Q = \left(\frac{(k^2 + 4r^2)^{1/2} - k}{2}\right)^{1/2} (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ \sin\theta). \]

In terms of these coordinates, the above metric and potential read

\[ ds^2 = \left(k + \|Q\|^2\right)\|dQ\|^2, \qquad V_{\rm II} = \frac{k^2\,\|Q\|^2}{k + \|Q\|^2}. \]

Thus we recover the three-dimensional Darboux system of type III [32]. The Darboux system of type III is the only quadratically superintegrable natural Hamiltonian system in a surface of nonconstant curvature which is known to admit quadratically superintegrable N-dimensional generalizations [2].

(iii) Multifold Kepler systems. The family of multifold Kepler systems was introduced by Iwai and Katayama [27,28] as Hamiltonian reductions of the geodesic flow in a generalized Taub–NUT metric. These systems are given by the metrics and potentials

\[ ds^2 = \|Q\|^{\frac{n}{m} - 2}\left(a + b\,\|Q\|^{\frac{n}{m}}\right)\|dQ\|^2, \]

\[ V_{\rm II} = \frac{\|Q\|^{2 - \frac{n}{m}}}{a + b\,\|Q\|^{\frac{n}{m}}}\left(\mu^2 \|Q\|^{-2} + \mu^2 c\,\|Q\|^{\frac{n}{m} - 2} + \mu^2 d\,\|Q\|^{\frac{2n}{m} - 2}\right), \]

with $Q = (Q_1, Q_2, Q_3)$, a, b, c, d, µ constants and n, m coprime positive integers. The substitution

\[ Q = \left(\frac{(a^2 + 4br^2)^{1/2} - a}{2b}\right)^{\frac{m}{n}} (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ \sin\theta) \]

shows that the multifold Kepler models are equivalent to the type II Bertrand systems with parameters $K = 4a^{-4}b^2$ and $D = -2b/a^2$. It should be noticed that the Darboux space of type III is a particular case of the multifold Kepler systems.
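The change of variables used for the Darboux space of type III in Example 6(ii) lends itself to a quick numerical sanity check. The sketch below is an illustration added here, not taken from the article; it assumes k > 0 and works with the norm Q = ‖Q‖ > 0, inverting the substitution as r² = Q²(Q² + k). The function name is hypothetical.

```python
import math

def darboux_residuals(k, Q):
    """Residuals of the Darboux III identities for k > 0 and Q = |Q| > 0."""
    r2 = Q**2 * (Q**2 + k)                      # r^2 from inverting the substitution
    root = math.sqrt(k**2 + 4 * r2)
    # Radial metric coefficient h(r)^2 of the Darboux III metric in the r coordinate.
    h2 = (k**2 + 2 * r2 + k * root) / (2 * (k**2 + 4 * r2))
    # Derivative of r = Q * sqrt(Q^2 + k) with respect to Q.
    dr_dQ = (2 * Q**2 + k) / math.sqrt(Q**2 + k)
    conformal = k + Q**2                        # expected conformal factor
    potential_r = 2 * k**2 * r2 / (k**2 + 2 * r2 + k * root)
    potential_Q = k**2 * Q**2 / conformal
    return (
        abs((root - k) / 2 - Q**2),             # substitution is consistent
        abs(h2 * dr_dQ**2 - conformal),         # radial part of the metric
        abs(r2 / Q**2 - conformal),             # angular part of the metric
        abs(potential_r - potential_Q),         # potential
    )
```

All four residuals should vanish up to rounding, which is consistent with the metric being conformally flat with factor k + ‖Q‖².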


3. The Orbit Equation and Geometric Superintegrability

Hereafter we shall analyze the properties of the Hamiltonian systems in $(M, g_j)$ given by

\[ H_j := \frac{1}{2}\,\|p\|_{g_j}^2 + V_j(q), \qquad j = {\rm I}, {\rm II}, \tag{7} \]

where the metric $g_j$ and the potential $V_j$ are respectively defined by (5) and (6). As previously discussed, the orbits of these systems correspond to trajectories of the associated Bertrand spacetimes. It should be noticed that in the adapted coordinate system, these Hamiltonians read

\[ H_{\rm I} = \frac{1}{2}\left[\frac{n^2}{m^2}\left(1 + K r^2\right) p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\varphi^2}{r^2\sin^2\theta}\right] + \sqrt{r^{-2} + K} + G, \tag{8a} \]

\[ H_{\rm II} = \frac{1}{2}\left[\frac{n^2\left((1 - Dr^2)^2 - K r^4\right)}{2m^2\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right)}\,p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\varphi^2}{r^2\sin^2\theta}\right] \mp r^2\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right)^{-1} + G, \tag{8b} \]

where $p_r$ is the momentum conjugate to r and $p_\theta$ and $p_\varphi$ are defined analogously.

In this section we shall derive the simplest superintegrability property of the Hamiltonian systems (7) (cf. Proposition 10), which nonetheless seems to have gone unnoticed so far. The proof of this result relies on the fact that, by definition, the orbits of (7) define an invariant foliation by (topological) circles in an open subset $\Omega \subset T^*M$ of the phase space of the system. E.g., in the classical Kepler problem

\[ \Omega = \left\{(q, p) \in \mathbb{R}^3 \times \mathbb{R}^3 : H(q, p) < 0,\ q \times p \neq 0\right\} \]

is the set of points with negative energy and nonzero angular momentum, whereas for the harmonic oscillator one can take $\Omega = (\mathbb{R}^3 \times \mathbb{R}^3) \setminus \{(0, 0)\}$, i.e., the whole phase space minus the equilibrium. In Proposition 7 below we compute the expression of the orbits in closed form, revealing that the above foliation is actually a locally trivial fibration. This allows us to resort to the geometric theory of superintegrable Hamiltonian systems [12], yielding the first superintegrability result for (7).

Before discussing the precise statement of Proposition 10, let us compute the orbits of the Hamiltonian (7). In fact, the closed expression that we shall derive is not only used in the proof of Proposition 10, but it is also a key element of Theorem 12, where a stronger superintegrability result is presented.

It is convenient to introduce the rectangular coordinates $q = (q^1, q^2, q^3)$ associated to the spherical coordinates (r, θ, ϕ) as

\[ q = (r\cos\theta\cos\varphi,\ r\cos\theta\sin\varphi,\ r\sin\theta). \tag{9} \]

The conjugate momenta will be denoted by $p = (p_1, p_2, p_3)$. Clearly the coordinates (q, p) are globally defined in $T^*M$. We shall use the notation $\cdot$, $\times$ and $\|\cdot\|$ for the Euclidean inner product, cross product and norm in $\mathbb{R}^3$ and call $E = H_j(p, q)$ and $J^2 = \|q \times p\|^2$ the energy and angular momentum of an integral curve (q(t), p(t)) of (7). Obviously E and $J^2$ are constants of motion.


Proposition 7. Let γ be an inextendible orbit of the Hamiltonian system (7) which is contained in the invariant plane θ = π/2. Then γ is given by

\[ \cos\left(\frac{n\varphi}{m} - \varphi_0\right) = \frac{1 + J^2\sqrt{r^{-2} + K}}{\sqrt{1 + 2J^2(E - G) + K J^4}} \tag{10a} \]

if j = I and by

\[ \cos\left(\frac{n\varphi}{m} - \varphi_0\right) = \frac{J^2 r^{-2}\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right) + D J^2 + 2G - 2E}{\sqrt{(2E - 2G - D J^2)^2 \pm 4J^2 - K J^4}} \tag{10b} \]

if j = II. Here $\varphi_0$ is a real constant.

Proof. We begin with the case j = I. The crucial observation is that the orbit equation

\[ \frac{m^2 J^2}{n^2 r^4 \left(1 + K r^2\right)}\left(\frac{dr}{d\varphi}\right)^2 = 2E - 2V_{\rm I} - \frac{J^2}{r^2} \]

simplifies dramatically with the change of variables

\[ u = \sqrt{r^{-2} + K}, \]

in terms of which the potential and the inverse square term read

\[ V_{\rm I} = u + G, \qquad r^{-2} = u^2 - K. \]

The orbit equation is then given by

\[ \left(\frac{mJ}{n}\,\frac{du}{d\varphi}\right)^2 = 2E - 2G + K J^2 - 2u - J^2 u^2, \]

which can be readily integrated to yield

\[ \cos\left(\frac{n\varphi}{m} - \varphi_0\right) = \frac{1 + J^2 u}{\sqrt{1 + 2J^2(E - G) + K J^4}} \]

for some constant $\varphi_0$. When j = II the treatment is analogous. Now the orbit equation reads

\[ \frac{1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}}{r^4\left((1 - Dr^2)^2 - K r^4\right)}\left(\frac{mJ}{n}\,\frac{dr}{d\varphi}\right)^2 = E - V_{\rm II} - \frac{J^2}{2r^2}, \]

and it is convenient to introduce the variable

\[ v = r^{-2}\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right). \]

In terms of this new coordinate the potential is simply $V_{\rm II} = G \mp \frac{1}{v}$, whereas the inverse square term is given by

\[ r^{-2} = \frac{v^2 + 2Dv + K}{2v}. \]


Hence a straightforward computation shows that the orbit equation is

\[ \left(\frac{mJ}{n}\,\frac{dv}{d\varphi}\right)^2 = 4(E - G)v - J^2\left(v^2 + 2Dv + K\right) \pm 4, \]

thereby obtaining

\[ \cos\left(\frac{n\varphi}{m} - \varphi_0\right) = \frac{J^2(v + D) + 2G - 2E}{\sqrt{(2E - 2G - D J^2)^2 \pm 4J^2 - K J^4}}. \]

Here $\varphi_0$ is a real constant.
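As a sanity check on the type I computation, one can verify numerically that the closed form (10a) satisfies the orbit equation in the variable u. The sketch below is an added illustration: the parameter values are arbitrary choices, the derivative is taken by central finite differences, and the check concerns only the differential identity (it does not restrict u to the physically admissible range). Function names are hypothetical.

```python
import math

def u_of_phi(phi, J, E, G, K, m, n, phi0=0.0):
    """u = sqrt(r^-2 + K) along the orbit, solved from Eq. (10a)."""
    amplitude = math.sqrt(1 + 2 * J**2 * (E - G) + K * J**4)
    return (amplitude * math.cos(n * phi / m - phi0) - 1) / J**2

def orbit_residual(phi, J=1.2, E=0.5, G=0.0, K=0.3, m=2, n=3, step=1e-5):
    """|(m J / n * du/dphi)^2 - (2E - 2G + K J^2 - 2u - J^2 u^2)| at angle phi."""
    args = (J, E, G, K, m, n)
    u = u_of_phi(phi, *args)
    # Central finite difference approximation of du/dphi.
    du = (u_of_phi(phi + step, *args) - u_of_phi(phi - step, *args)) / (2 * step)
    lhs = (m * J / n * du) ** 2
    rhs = 2 * E - 2 * G + K * J**2 - 2 * u - J**2 * u**2
    return abs(lhs - rhs)
```

The residual is limited by the finite-difference step rather than by the formula itself; an analogous check could be written for (10b) in the variable v.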

Remark 8. Equations (10) are well defined also when J = 0. Moreover, it is not difficult to check that r can be readily expressed as a function of ϕ by performing some manipulations in the right-hand side of (10).

We shall now specify what is understood by geometric superintegrability. Let $F_0$ be a smooth Hamiltonian defined on a 2d-dimensional symplectic manifold N admitting s ≥ d − 1 functionally independent first integrals $F_1, \ldots, F_s$ other than the Hamiltonian. Let us suppose that $F = (F_0, F_1, \ldots, F_s)$ is a submersion onto its image with compact and connected fibers, which by Ehresmann’s theorem (cf. e.g. [41]) implies that its level sets define a locally trivial fibration $\mathcal{F}$ of N. If s ≥ d, not all the latter first integrals can Poisson-commute: the usual condition to impose is that there exists a matrix-valued function $P : F(N) \to \mathrm{Mat}(s + 1)$ of rank s − d + 1 such that

\[ \{F_i, F_j\} = P_{ij} \circ F, \qquad 0 \le i, j \le s. \tag{11} \]

In particular, when s = d − 1 this yields the usual definition of Liouville integrability. Well known generalizations of the Liouville–Arnold theorem [42,44] show that every fiber of $\mathcal{F}$ is an invariant (2d − s − 1)-torus, and that the motion on each of these tori is conjugate to a linear flow. Moreover, the fibration $\mathcal{F}$ has symplectic local trivializations. Geometrically, the existence of the function P means that $\mathcal{F}$ has a polar foliation [12], i.e., a foliation $\mathcal{F}^\perp$ whose tangent spaces are symplectically orthogonal to those of $\mathcal{F}$. Similarly, the rank condition in Eq. (11) is tantamount to demanding that the invariant (2d − s − 1)-tori of the foliation be isotropic. Thus the crucial element in the geometric characterization of superintegrability is the bifoliation $(\mathcal{F}, \mathcal{F}^\perp)$, which is a type of dual pair as defined in [51]. One is thus led to introduce the following definition (cf. [12] and the survey [17], where slightly different wording is used):

Definition 9. A Hamiltonian system on a symplectic 2d-dimensional manifold is geometrically superintegrable with s ≥ d − 1 semiglobal integrals if the Hamiltonian vector field is tangent to a locally trivial fibration by isotropic (2d − s − 1)-tori which admits a polar foliation. If s takes the maximum value 2d − 2 we shall simply say that the system is geometrically superintegrable.

Of course, generally not all the phase space of a (super)integrable system is fibered by invariant isotropic tori: there can be, e.g., singular points and unbounded orbits. But it is customary and of interest to restrict one’s attention to the region where such a fibration is well defined. In the case when s = d − 1 (Liouville integrability), the invariant tori are Lagrangian and therefore $\mathcal{F}^\perp = \mathcal{F}$, explaining why the bifibration $(\mathcal{F}, \mathcal{F}^\perp)$ is less well known than the fibrations by Lagrangian tori. (However, an advantage of the bifibration is that, under mild technical assumptions, it is uniquely determined (and finer),


whereas for integrable systems with additional integrals there is some arbitrariness in the choice of invariant Lagrangian tori.) It should be noticed that the above structure yields “semiglobal” (i.e., defined in a tubular neighborhood of each torus) first integrals associated to the existence of generalized action-angle coordinates; a detailed account can be found in [8,12,17]. The content of the following proposition is that the Bertrand systems (7) are geometrically superintegrable in the region foliated by periodic orbits.

Proposition 10. Let Ω be the region of $T^*M$ where all the orbits of $H_j$ are periodic. Then $H_j|_\Omega$ is geometrically superintegrable.

Proof. It easily follows from Proposition 7 that the orbits of $H_j$ define a locally trivial fibration by (topological) circles in Ω. The fibers are certainly isotropic, as they are one-dimensional, and the flow of $H_j$ on each fiber is conjugate to the linear one because $H_j$ does not possess any critical points in Ω. Moreover, it stems from Proposition 7 that the function $\Omega \to \mathbb{R}^+$ mapping each point in Ω to the length of the (periodic) orbit passing through it is smooth, which in turn readily implies that the period function is also smooth and nonvanishing in this region. Hence a theorem of Fassò [16] implies that $H_j$ is geometrically superintegrable, proving the proposition.

4. The Generalized Runge–Lenz Vector

In this section we shall prove a stronger superintegrability result for the Bertrand Hamiltonians (7). More precisely, we shall provide a semi-explicit construction of an additional vector first integral which we shall call the generalized Runge–Lenz vector. This vector field is defined on an n-fold cover $\widetilde{M}$ of the original space M, and it is invariant under the flow generated by the lift of the Bertrand Hamiltonian to the covering space $\widetilde{M}$. In M, this vector field induces a global tensor field of rank n which is preserved under the flow of H. As before, n is the positive integer which appears in Eq. (7).
As regards the superintegrability properties of the Hamiltonian systems (7), the spherical symmetry of these systems readily yields three first integrals other than the Hamiltonian, which can be identified with the components of the angular momentum. The idea of looking for generalizations of the Runge–Lenz vector in order to find an additional integral of motion is not new: an updated and rather complete review of the related literature can be found in [36]. Here we shall use our information about the integral curves of (7) and some ideas already present in the work of Fradkin [20] and Holas and March [26]. Let us start by recalling Fradkin’s construction [20] of a local vector first integral for the Hamiltonian system

\[ H_0 = \frac{1}{2}\,\|p\|^2 + U(\|q\|), \]

where U is an arbitrary central potential and $(q, p) \in \mathbb{R}^3 \times \mathbb{R}^3$. The starting point is the following trivial remark. Consider an integral curve q(t) of $H_0$ contained in the plane $\{\theta = \frac{\pi}{2}\} \subset \mathbb{R}^3$, where (r, θ, ϕ) are the usual spherical coordinates. We can assume without loss of generality that we have taken the initial condition ϕ(0) = 0 and use the notation $r = \|q\|$, $J = p_\varphi = r^2\dot\varphi$. A simple computation shows that the derivative along this integral curve of the unit vector field

\[ a = \frac{\cos\varphi}{r}\,q + \frac{\sin\varphi}{rJ}\,q \times (q \times p) \tag{12} \]


is identically zero, as in fact a(t) is the constant vector (1, 0, 0). Fradkin’s observation was that if cos ϕ and $J^{-1}\sin\varphi$ can be expressed in terms of q and p in some domain, then the resulting vector field is a first integral of $H_0$ in that domain. When $H_0$ is the Kepler Hamiltonian, the generalized Runge–Lenz vector field is well defined globally and essentially coincides with the classical Runge–Lenz vector divided by its norm. When $H_0$ is the harmonic oscillator, the generalized Runge–Lenz vector is multivalued (this can be neatly understood by considering the turning points of the orbits), but can be used to recover the conserved tensor field $C = 2\omega^2\, q \otimes q + p \otimes p$ associated to the SU(3) symmetry [26].

Definition 11. Let H be a Hamiltonian system defined on (the cotangent bundle of) a 3-manifold N. We say that H admits a generalized Runge–Lenz vector if there exists a nontrivial momentum-dependent vector field A in N which is constant along the flow of H.

Here the conserved vector A is said to be nontrivial if it is not constant and cannot be written in terms of the energy and the angular momentum integrals, and we recall that a momentum-dependent vector field in N is a map $A : T^*N \to TN$ such that $\pi_{TN} \circ A = \pi_{T^*N}$. A momentum-dependent tensor field is defined similarly. The main problem with Fradkin’s approach is that it is not at all obvious how to obtain sufficient conditions ensuring that these local integrals are well defined globally, while local superintegrability is trivial in a neighborhood of any regular point of the Hamiltonian flow. However, we shall see below that Fradkin’s approach works well for the kind of Hamiltonian systems that we are considering in this paper, and that one can construct a globally defined generalized Runge–Lenz vector (cf. Eq. (16) below) which is roughly analogous to (12).

Theorem 12. Consider a Hamiltonian of the form (7), with m, n coprime positive integers.
Then there exists an n-fold cover $\widetilde{M}$ of M such that the lift of this Hamiltonian to $\widetilde{M}$ admits a generalized Runge–Lenz vector.

Proof. We shall call $H_j$, j = I, II, the Hamiltonian (7). Let γ be an inextendible orbit of $H_j$, which can be assumed to lie in the invariant plane θ = π/2. By Proposition 7, and taking $\varphi_0 = 0$ in Eq. (10) without loss of generality, γ is the self-intersecting curve given by

\[ \cos\frac{n\varphi}{m} = \chi(r^2, J^2, E), \]

where χ is the function

\[ \chi(r^2, J^2, E) = \begin{cases} \dfrac{1 + J^2\sqrt{r^{-2} + K}}{\sqrt{1 + 2J^2(E - G) + K J^4}} & \text{if } j = {\rm I}, \\[3ex] \dfrac{J^2 r^{-2}\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right) + D J^2 + 2G - 2E}{\sqrt{(2E - 2G - D J^2)^2 \pm 4J^2 - K J^4}} & \text{if } j = {\rm II}. \end{cases} \tag{13} \]

Moreover, the chain rule immediately yields

\[ \sin\frac{n\varphi}{m} = -\frac{m}{n}\,\frac{d}{d\varphi}\cos\frac{n\varphi}{m} = -\frac{m}{n}\,\frac{\dot r}{\dot\varphi}\,\frac{\partial}{\partial r}\chi(r^2, J^2, E) = \Xi(r\dot r, r^2, J, E), \tag{14} \]


where

\[ \Xi(r\dot r, r^2, J, E) = -2 r\dot r\,\frac{m r^2}{nJ}\,(D_1\chi)(r^2, J^2, E) \]

and $D_1\chi$ stands for the derivative of the function χ with respect to its first argument. It should be noted that these expressions are well defined also for J = 0. Using the properties of the Chebyshev polynomials it is trivial to express cos nϕ and sin nϕ in terms of r, $\dot r$, J and E as

\[ \cos n\varphi = T_m\left(\cos\frac{n\varphi}{m}\right) = T_m\left(\chi(r^2, J^2, E)\right), \]

\[ \sin n\varphi = \sin\frac{n\varphi}{m}\; U_{m-1}\left(\cos\frac{n\varphi}{m}\right) = \Xi(r\dot r, r^2, J, E)\; U_{m-1}\left(\chi(r^2, J^2, E)\right). \]

Here $T_m$ and $U_m$ respectively stand for the Chebyshev polynomials of the first and second kind and degree m. Setting

\[ S^1 = \{z \in \mathbb{C} : |z| = 1\}, \]

we find it convenient to define the analytic $S^1$-valued map

\[ \mathcal{E}_n(r\dot r, r^2, J, E) = T_m\left(\chi(r^2, J^2, E)\right) + i\,\Xi(r\dot r, r^2, J, E)\; U_{m-1}\left(\chi(r^2, J^2, E)\right), \]

in terms of which the orbit γ is characterized as

\[ e^{in\varphi} = \mathcal{E}_n(r\dot r, r^2, J, E). \tag{15} \]
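The Chebyshev identities behind (15) can be illustrated with a short computation. The following sketch, added for illustration, evaluates the polynomials through their three-term recurrence and rebuilds the unit complex number exp(inϕ) from the cosine and sine of nϕ/m alone; the function names are hypothetical and m is assumed to be a positive integer.

```python
import cmath
import math

def chebyshev_pair(m, x):
    """Return (T_m(x), U_{m-1}(x)) for an integer m >= 1 via the recurrence."""
    t_prev, t = 1.0, x      # T_0, T_1
    u_prev, u = 0.0, 1.0    # U_{-1} (taken as 0), U_0
    for _ in range(m - 1):
        t_prev, t = t, 2 * x * t - t_prev
        u_prev, u = u, 2 * x * u - u_prev
    return t, u

def phase_map(n, m, phi):
    """Rebuild exp(i n phi) from cos(n phi / m) and sin(n phi / m)."""
    alpha = n * phi / m
    t_m, u_m1 = chebyshev_pair(m, math.cos(alpha))
    return complex(t_m, math.sin(alpha) * u_m1)

def phase_error(n, m, phi):
    """Distance between the reconstruction and exp(i n phi)."""
    return abs(phase_map(n, m, phi) - cmath.exp(1j * n * phi))
```

In the text the two inputs are supplied by the orbit data (the function χ and the right-hand side of (14)) rather than by the angle itself, which is what makes (15) useful.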

It stems from Fradkin’s argument that (12) yields a vector first integral of (8) in any region where $e^{i\varphi}$ can be unambiguously expressed in terms of the coordinates (q, p). However, Eq. (15) does not determine the angle ϕ uniquely modulo 2π because the map $z \mapsto z^n$ of the unit circle onto itself has degree n, so that Fradkin’s construction is, a priori, not global. As a matter of fact, it is obvious that Eq. (15) only defines ϕ modulo 2π/n, thus yielding an n-valued additional integral.

The aforementioned problem is a consequence of the fact that the orbit γ has self-intersections. It is standard that this difficulty can be circumvented by means of an appropriate covering space of our initial manifold. The construction which we shall next outline is in fact analogous to that of the Riemann surface of the function $z \mapsto z^n$. We denote by γ(t) the periodic integral curve of (7) defined by the orbit γ ⊂ M and take an n-fold cover $\Pi : \widetilde{M} \to M$ of M such that the lift $\tilde\gamma(t)$ of γ(t) to $\widetilde{M}$ is a smooth path without self-intersections. Notice that $\tilde\gamma(t)$ is actually an integral curve of the lifted Hamiltonian

\[ \widetilde{H}_j = \frac{1}{2}\,\|\tilde p\|_{\Pi^* g_j}^2 + (V_j \circ \Pi)(\tilde q), \qquad j = {\rm I}, {\rm II}, \]

where $(\tilde q, \tilde p) \in T^*\widetilde{M}$. $\widetilde{M}$ is a fiber bundle over M with typical fiber $\mathbb{Z}_n$, and for each $k \in \mathbb{Z}_n$ we denote by $\Pi_k : M \to \widetilde{M}$ the section of $\widetilde{M}$ with fiber value k. Obviously $\Pi_k$ is an injective map, and an isometry from an open and dense subset $M_k \subset M$ onto its image in $(\widetilde{M}, \Pi^* g_j)$. One obviously has that $\Pi \circ \Pi_k = \mathrm{id}$ and

\[ \Pi^{-1}(q) = \bigcup_{k \in \mathbb{Z}_n} \Pi_k(q) \]

for all q ∈ M.


By construction, in each section $\Pi_k(M)$ there exists a determination of the (complex) n-th root which allows one to solve for $e^{i\varphi}$ in terms of $e^{in\varphi}$ uniquely along $\Pi_k(\gamma)$. Therefore, for each $k \in \mathbb{Z}_n$ there exist real functions $S_k$ and $C_k$ (namely, determinations of arcsin and arccos) such that

\[ e^{i\varphi(t)} = C_k\left(\cos n\varphi(t)\right) + i\,S_k\left(\sin n\varphi(t)\right) \]

whenever the point $(r(t), \theta = \frac{\pi}{2}, \varphi(t))$ lies in $\Pi_k(\gamma)$. Moreover, an easy computation shows that the functions

\[ \mathcal{C}_k(r^2, J^2, E) = C_k\left(T_m(\chi(r^2, J^2, E))\right), \]

\[ \mathcal{S}_k(r\dot r, r^2, J^2, E) = J^{-1}\,S_k\left(\Xi(r\dot r, r^2, J, E)\; U_{m-1}(\chi(r^2, J^2, E))\right) \]

are analytic in their domains. In order to express $\mathcal{C}_k(r^2, J^2, E)$ and $\mathcal{S}_k(r\dot r, r^2, J^2, E)$ in a more convenient way, we consider the lift of the coordinates q to each space $\Pi_k(M)$. With a slight abuse of notation, we shall still denote these coordinates by q. An immediate computation shows that $\Pi^* g_j|_{\Pi_k(M)}$ reads

\[ ds^2 = \|dq\|^2 + \left(h(\|q\|)^2 - 1\right)\frac{(q \cdot dq)^2}{\|q\|^2}, \]

where the function h is defined as in Sect. 2, namely,

\[ h(r)^2 = \begin{cases} \dfrac{m^2}{n^2\left(1 + K r^2\right)} & \text{if } j = {\rm I}, \\[3ex] \dfrac{2m^2\left(1 - Dr^2 \pm \sqrt{(1 - Dr^2)^2 - K r^4}\right)}{n^2\left((1 - Dr^2)^2 - K r^4\right)} & \text{if } j = {\rm II}. \end{cases} \]

By differentiation it stems from this formula that the conjugate momentum p to q is given by

\[ p = \dot q + \left(h(\|q\|)^2 - 1\right)\frac{q \cdot \dot q}{\|q\|^2}\,q, \]

yielding $\dot q = v(q, p)$ with

\[ v(q, p) = p + \left(h(\|q\|)^{-2} - 1\right)\frac{q \cdot p}{\|q\|^2}\,q. \]

As $r\dot r = q \cdot \dot q = q \cdot v(q, p)$, we now have all the ingredients to invoke Fradkin’s argument (cf. Eq. (12), with which (16) should be compared) and derive that each component of the momentum-dependent vector field $A_k$ in $T^*\Pi_k(M)$ defined by

\[ A_k = \frac{1}{r}\Big[\mathcal{C}_k\big(\|q\|^2, \|q \times p\|^2, H_j(q, p)\big)\,q + \mathcal{S}_k\big(q \cdot v(q, p), \|q\|^2, \|q \times p\|^2, H_j(q, p)\big)\,q \times (q \times p)\Big] \tag{16} \]

is a constant of motion in $\Pi_k(M)$. By construction, the vector fields $A_k$ (with $k \in \mathbb{Z}_n$) define an analytic global momentum-dependent vector field A in $T^*\widetilde{M}$ whose Lie derivative along the flow of $\widetilde{H}_j$ is zero, thereby obtaining the desired unit Runge–Lenz vector.
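The passage between velocities and momenta used in the proof can be checked directly. The sketch below is an added illustration with hypothetical function names: it builds p from a velocity using the radial metric coefficient h(r)², applies the inverse formula for v(q, p), and compares the result with the original velocity. The type I coefficient is used as an example, with arbitrary parameter values.

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def h2_type_one(r2, K, m, n):
    """Radial metric coefficient h(r)^2 for a type I Bertrand space."""
    return m**2 / (n**2 * (1 + K * r2))

def momentum(q, qdot, h2):
    """p = qdot + (h^2 - 1) (q . qdot) / |q|^2 q."""
    c = (h2 - 1) * dot(q, qdot) / dot(q, q)
    return tuple(v + c * x for v, x in zip(qdot, q))

def velocity(q, p, h2):
    """v(q, p) = p + (h^-2 - 1) (q . p) / |q|^2 q."""
    c = (1 / h2 - 1) * dot(q, p) / dot(q, q)
    return tuple(w + c * x for w, x in zip(p, q))

def round_trip_error(q, qdot, K=0.4, m=2, n=3):
    """Largest component of velocity(q, momentum(q, qdot)) - qdot."""
    h2 = h2_type_one(dot(q, q), K, m, n)
    recovered = velocity(q, momentum(q, qdot, h2), h2)
    return max(abs(a - b) for a, b in zip(recovered, qdot))
```

The round trip works for any positive h², since only the component of the velocity along q is rescaled, by h² in one direction and by its inverse in the other.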


Remark 13. The particular form of the orbits (10) and the fact that $\widetilde{M}$ is a finite cover of M ensure that all the lifted orbits which are bounded are also periodic, and that the lifted orbits do not have any self-intersections. Note that if $\widetilde{M}$ is endowed with the pulled back metric $\tilde g_j = \Pi^* g_j$, $\widetilde{H}_j$ is a natural Hamiltonian system and $\Pi : (\widetilde{M}, \tilde g_j) \to (M, g_j)$ becomes a Riemannian cover.

Corollary 14. Consider a Hamiltonian of the form (7) with n = 1. Then the generalized Runge–Lenz vector is well defined in all M.

Proof. It trivially follows from Theorem 12.

Corollary 15. Consider a Hamiltonian $H_j$ of the form (7), with m, n coprime positive integers. Then there exists a momentum-dependent symmetric tensor field in M of rank n which is invariant under the flow of $H_j$.

Proof. Let us use the same notation as in the proof of Theorem 12. In particular, we consider the integral curve γ(t) and the maps $A_k$ used in the proof of Theorem 12. For each $k \in \mathbb{Z}_n$, let us denote by $A_k(t)$ the restriction of the momentum-dependent vector field $A_k : T^*\Pi_k(M) \to \mathbb{R}^3$ to the projection of the integral curve $\tilde\gamma(t)$ to $T^*\Pi_k(M)$. The only observation we need in order to prove Corollary 15 is that, by the expression for γ found in Proposition 7 and the definitions of the covering space $\widetilde{M}$ and of the momentum-dependent vector fields $A_k$, it easily follows that

\[ A_k\left(t + \frac{\ell}{n}\,T_\gamma\right) = A_{k+\ell}(t) \]

for all $k \in \mathbb{Z}_n$, $\ell \in \mathbb{Z}$, $t \in \mathbb{R}$ such that $\gamma(t) \in M_{k+\ell}$ and $\gamma\left(t + \frac{\ell}{n}\,T_\gamma\right) \in M_k$. Here $T_\gamma$ stands for the period of the integral curve γ(t) and the sum k + ℓ is to be considered modulo n. This periodicity property readily implies that the symmetric tensor product C of $A_1, \ldots, A_n$, with components

\[ C^{i_1, \ldots, i_n}(q, p) = A_1^{(i_1}(q, p) \cdots A_n^{i_n)}(q, p), \]

is a well defined, analytic tensor field in M of rank n. As usual, symmetrization of the superscripts delimited by curved brackets is understood. To complete the proof of the corollary, it suffices to notice that C is trivially invariant under the flow of $H_j$ as each $A_k$ is a (multivalued) first integral.

Some comments may be in order. First, one should observe that the dependence of the additional integrals (16) on the momenta is generally complicated (and in particular not quadratic), which explains why they are usually so hard to spot [30]. Second, it should be noticed that Corollaries 14 and 15 yield the usual Runge–Lenz vector and second rank conserved tensor (up to a normalization constant) when the Bertrand Hamiltonian we consider is the Kepler or harmonic oscillator system in Euclidean space [26,47]. Note, however, that given an arbitrary Bertrand Hamiltonian it is usually hard to compute the conserved tensor C or the Runge–Lenz vector A in closed form. In this direction, it should be mentioned that an additional integral has been explicitly obtained for some of the Bertrand Hamiltonians discussed in Example 6 (cf. e.g. [2,4,21,27] and references therein).


5. Bertrand’s Theorem on Curved Spaces

In the previous sections we have thoroughly analyzed the superintegrability properties of the spherically symmetric natural Hamiltonian systems whose bounded orbits are all periodic. When combined with the discussion of harmonic oscillators and Kepler potentials on Bertrand spacetimes presented in Sect. 2, this gives all the ingredients we need to state a fully satisfactory analogue of Bertrand’s theorem on spherically symmetric spaces:

Theorem 16. Let H be the Hamiltonian function associated to a Bertrand spacetime, i.e., an autonomous, spherically symmetric natural Hamiltonian system on a Riemannian 3-manifold (M, g) satisfying Properties (i) and (ii) in Bertrand’s Theorem 1. Then the following statements hold:
(i) H is of the form (7) for some coprime positive integers n, m.
(ii) The potential V is the intrinsic Kepler or oscillator potential in (M, g).
(iii) H is superintegrable. More precisely,
(a) H is geometrically superintegrable in the region of $T^*M$ foliated by bounded orbits.
(b) There exists an n-fold cover $\widetilde{M}$ of M such that the lift of H to $\widetilde{M}$ admits a generalized Runge–Lenz vector.
(c) There exists a nontrivial momentum-dependent tensor field in M of rank n which is invariant under the flow of H.

As mentioned in the Introduction, this result is of interest both in itself and because of the abundant literature devoted to the study of particular cases of this problem in different contexts and from various points of view.

Acknowledgements. This work was partially supported by the Spanish Ministerio de Educación under grant no. MTM2007-67389 (with EU-FEDER support) (A.B. and F.J.H.), by the Spanish DGI and CAM–Complutense University under grants no. FIS2008-00209 and CCG07-2779 (A.E.), and by the INFN–CICyT (O.R.).

References 1. Atiyah, M.F., Hitchin, N.J.: Low-energy scattering of non-Abelian magnetic monopoles. Phys. Lett. A 107, 21–25 (1985) 2. Ballesteros, A., Enciso, A., Herranz, F.J., Ragnisco, O.: A maximally superintegrable system on an n-dimensional space of nonconstant curvature. Physica D 237, 505–509 (2008) 3. Ballesteros, A., Enciso, A., Herranz, F.J., Ragnisco, O.: Bertrand spacetimes as Kepler/oscillator potentials. Class. Quant. Grav. 25, 165005 (2008) 4. Ballesteros, A., Herranz, F.J., Santander, M., Sanz-Gil, T.: Maximal superintegrability on N -dimensional curved spaces. J. Phys. A: Math. Gen. 36, L93–L99 (2003) 5. Ballesteros, A., Herranz, F.J.: Universal integrals for superintegrable systems on N -dimensional spaces of constant curvature. J. Phys. A: Math. Theor. 40, F51–F59 (2007) 6. Bertrand, J.: Théorème relatif au mouvement d’un point attiré vers un centre fixe. C. R. Math. Acad. Sci. Paris 77, 849–853 (1873) 7. Bini, D., Cherubini, C., Jantzen, R.T., Mashhoon, B.: Gravitomagnetism in the Kerr–Newman–Taub–NUT spacetime. Class. Quant. Grav. 20, 457–468 (2003) 8. Bogoyavlenskij, O.I.: Theory of tensor invariants of integrable Hamiltonian systems I. Commun. Math. Phys. 180, 529–586 (1996); II. Ibid. 184, 301–365 (1997) 9. Braam, P.J.: Magnetic monopoles on three-manifolds. J. Diff. Geom. 30, 425–464 (1989) 10. Cherkis, S.A., Kapustin, A.: Nahm transform for periodic monopoles and N = 2 Super Yang-Mills theory. Commun. Math. Phys. 218, 333–371 (2001)


11. Cordani, B., Fehér, L.G., Horváthy, P.A.: Kepler-type dynamical symmetries of long-range monopole interactions. J. Math. Phys. 31, 202–211 (1990) 12. Dazord, P., Delzant, T.: Le problème général des variables actions-angles. J. Diff. Geom. 26, 223–251 (1987) 13. Enciso, A., Peralta-Salas, D.: Geometrical and topological aspects of Electrostatics on Riemannian manifolds. J. Geom. Phys. 57, 1679–1696 (2007); Addendum, ibid. 58, 1267–1269 (2008) 14. Enciso, A., Peralta-Salas, D.: Critical points and level sets in exterior boundary problems. Indiana Univ. Math. J., in press, 2009 15. Enciso, A., Peralta-Salas, D.: Critical points and generic properties of Green functions on complete manifolds. Preprint, 2009 16. Fassò, F.: Quasi-periodicity of motions and complete integrability of Hamiltonian systems. Ergod. Th. & Dynam. Syst. 18, 1349–1362 (1998) 17. Fassò, F.: Superintegrable Hamiltonian systems: Geometry and perturbations. Acta Appl. Math. 87, 93–121 (2005) 18. Féjoz, J., Kaczmarek, L.: Sur le théorème de Bertrand. Ergod. Th. & Dynam. Sys. 24, 1583–1589 (2004) 19. Fradkin, D.M.: Three-dimensional isotropic harmonic oscillator and SU3 . Amer. J. Phys. 33, 207–211 (1965) 20. Fradkin, D.M.: Existence of the dynamic symmetries O4 and SU3 for all classical central potential problems. Prog. Theor. Phys. 37, 798–812 (1967) 21. Gibbons, G.W., Warnick, C.M.: Hidden symmetry of hyperbolic monopole motion. J. Geom. Phys. 57, 2286–2315 (2007) 22. Gibbons, G.W., Ruback, P.J.: The hidden symmetries of multi-centre metrics. Commun. Math. Phys. 115, 267–300 (1988) 23. Guillemin, V., Sternberg, S.: Variations on a theme by Kepler. Providence, RI: Amer. Math. Soc., 1990 24. Hauser, I., Malhiot, R.: Spherically symmetric static space-times which admit stationary Killing tensors of rank two. J. Math. Phys 15, 816–823 (1974) 25. Higgs, P.W.: Dynamical symmetries in a spherical geometry. J. Phys. A: Math. Gen. 12, 309–323 (1979) 26. 
Holas, A., March, N.H.: A generalization of the Runge–Lenz constant of classical motion in a central potential. J. Phys. A: Math. Gen. 23, 735–749 (1990) 27. Iwai, T., Katayama, N.: Two classes of dynamical systems all of whose bounded trajectories are closed. J. Math. Phys. 35, 2914–2933 (1994) 28. Iwai, T., Katayama, N.: Multifold Kepler systems—Dynamical systems all of whose bounded trajectories are closed. J. Math. Phys. 36, 1790–1811 (1995) 29. Jezierski, J., Lukasik, M.: Conformal Yano–Killing tensors for the Taub–NUT metric. Class. Quant. Grav. 24, 1331–1340 (2007) 30. Kalnins, E.G., Kress, J.M., Miller, W.: Second-order superintegrable systems in conformally flat spaces. IV. The classical 3D Stäckel transform and 3D classification theory. J. Math. Phys. 47, 043514 (2006) 31. Kalnins, E.G., Kress, J.M., Winternitz, P.: Superintegrability in a two-dimensional space of nonconstant curvature. J. Math. Phys. 43, 970–983 (2002) 32. Kalnins, E.G., Kress, J.M., Miller, W., Winternitz, P.: Superintegrable systems in Darboux spaces. J. Math. Phys. 44, 5811–5848 (2003) 33. Krivonos, S., Nersessian, A., Ohanyan, V.: Multicenter McIntosh-Cisneros-Zwanziger-Kepler system, supersymmetry and integrability. Phys. Rev. D 75, 085002 (2007) 34. Kronheimer, P.B., Mrowka, T.S.: Monopoles and contact structures. Invent. Math. 130, 209–255 (1997) 35. Kronheimer, P.B., Mrowka, T.S., Ozsvàth, P., Szabó, Z.: Monopoles and lens space surgeries. Ann. Math. 165, 457–546 (2007) 36. Leach, P.G.L., Flessas, G.P.: Generalisations of the Laplace–Runge–Lenz vector. J. Nonlin. Math. Phys. 10, 340–423 (2003) 37. Li, P., Tam, L.F.: Symmetric Green’s functions on complete manifolds. Amer. J. Math. 109, 1129–1154 (1987) 38. Li, P., Tam, L.F.: Green’s functions, harmonic functions, and volume comparison. J. Diff. Geom. 41, 277–318 (1995) 39. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. B 110, 54–56 (1982) 40. 
McIntosh, H.V., Cisneros, A.: Degeneracy in the presence of a magnetic monopole. J. Math. Phys. 11, 896–916 (1970) 41. Meigniez, G.: Submersions, fibrations and bundles. Trans. Amer. Math. Soc. 354, 3771–3787 (2002) 42. Mischenko, A.S., Fomenko, A.T.: Generalized Liouville method of integration of Hamiltonian systems. Funct. Anal. Appl. 12, 113–121 (1978) 43. Nash, O.: Singular hyperbolic monopoles. Commun. Math. Phys. 277, 161–187 (2008) 44. Nekhoroshev, N.N.: Action-angle variables and their generalizations. Trans. Moskow Math. Soc. 26, 180–198 (1972)


45. Nersessian, A., Yeghikyan, V.: Anisotropic inharmonic Higgs oscillator and related (MICZ-) Kepler-like systems. J. Phys. A: Math. Theor. 41, 155203 (2008) 46. Norbury, P., Romão, N.M.: Spectral curves and the mass of hyperbolic monopoles. Commun. Math. Phys. 270, 295–333 (2007) 47. Perlick, V.: Bertrand spacetimes. Class. Quant. Grav. 9, 1009–1021 (1992) 48. Schrödinger, E.: Eigenvalues and eigenfunctions. Proc. Roy. Irish Acad. Sect. A 46, 9–16 (1940) 49. Shchepetilov, A.V.: Comment on “Central potentials on spaces of constant curvature: The Kepler problem on the two-dimensional sphere S2 and the hyperbolic plane H2 ” [J. Math. Phys. 46, 052702 (2005)]. J. Math. Phys. 46, 114101 (2005) 50. Seiberg, N., Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang-Mills theory. Nucl. Phys. B 426, 19–53 (1994); Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nucl. Phys. B 431, 484–550 (1994) 51. Weinstein, A.: The local structure of Poisson manifolds. J. Diff. Geom. 18, 525–557 (1983) Communicated by G.W. Gibbons

Commun. Math. Phys. 290, 1051–1064 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0790-8

Communications in

Mathematical Physics

Spectral Conditions for Positive Maps
Dariusz Chruściński, Andrzej Kossakowski
Institute of Physics, Nicolaus Copernicus University, Grudziądzka 5/7, 87–100 Toruń, Poland. E-mail: [email protected]
Received: 7 October 2008 / Accepted: 14 December 2008 Published online: 27 March 2009 – © Springer-Verlag 2009

Abstract: We provide a partial classification of positive linear maps in matrix algebras, based on a family of spectral conditions. This construction generalizes the celebrated Choi example of a map which is positive but not completely positive. It is shown how the spectral conditions enable one to construct linear maps on tensor products of matrix algebras which are positive only on a convex subset of separable elements. Such maps provide basic tools to study quantum entanglement in multipartite systems.

1. Introduction

One of the most important problems of quantum information theory [1] is the characterization of mixed states of composed quantum systems. In particular it is of primary importance to test whether a given quantum state exhibits quantum correlation, i.e. whether it is separable or entangled. For low-dimensional systems there exists a simple necessary and sufficient condition for separability. The celebrated Peres-Horodecki criterion [2,3] states that a state of a bipartite system living in $\mathbb{C}^2 \otimes \mathbb{C}^2$ or $\mathbb{C}^2 \otimes \mathbb{C}^3$ is separable iff its partial transpose is positive. Unfortunately, for higher-dimensional systems there is no single universal separability condition. It turns out that the above problem may be reformulated in terms of positive linear maps in operator algebras: a state $\rho$ in $\mathcal{H}_1 \otimes \mathcal{H}_2$ is separable iff $(\mathrm{id} \otimes \varphi)\rho$ is positive for any positive map $\varphi$ which sends positive operators on $\mathcal{H}_2$ into positive operators on $\mathcal{H}_1$. Therefore, a classification of positive linear maps between operator algebras $\mathcal{B}(\mathcal{H}_1)$ and $\mathcal{B}(\mathcal{H}_2)$ is of primary importance. Unfortunately, in spite of considerable effort, the structure of positive maps is rather poorly understood [4–8] (for some recent works see [9–11] and for a review paper see [12]). Positive maps play an important role both in physics and mathematics, providing a generalization of ∗-homomorphisms, Jordan homomorphisms and conditional expectations. Normalized positive maps define affine mappings between sets of states of $C^*$-algebras.


In the present paper we perform a partial classification of positive linear maps which is based on spectral conditions. Actually, the presented method enables one to construct maps with a desired degree of positivity, the so-called k-positive maps with $k = 1, 2, \ldots, d = \min\{\dim \mathcal{H}_1, \dim \mathcal{H}_2\}$. Completely positive (CP) maps correspond to d-positive maps, i.e. maps with the highest degree of positivity. These maps are fully classified due to Stinespring's theorem [13]. Now, any positive map which is not CP can be written as $\varphi = \varphi_+ - \varphi_-$, with $\varphi_\pm$ being CP maps. However, there is no general method to recognize the positivity of $\varphi$ from $\varphi_+ - \varphi_-$. We show that suitable spectral conditions satisfied by the pair $(\varphi_+, \varphi_-)$ guarantee the k-positivity of $\varphi_+ - \varphi_-$. This construction generalizes the celebrated Choi example of a map which is $(d-1)$-positive but not CP [6]. From a physical point of view, our method leads to a partial classification of entanglement witnesses. Recall that an entanglement witness is a Hermitian operator $W \in \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ which is not positive but satisfies $(h_1 \otimes h_2, W\, h_1 \otimes h_2) \ge 0$ for any $h_i \in \mathcal{H}_i$. The main result of our paper consists in Theorem 1: it states that if $\varphi_\pm$ satisfy a certain spectral condition (see formula (4.3)) then $\varphi$ is k-positive but not $(k+1)$-positive. Such maps play an important role in quantum information theory: they provide means to detect quantum entangled states with Schmidt number greater than k. Interestingly, our construction may be easily generalized to the multipartite case, i.e. to constructing entanglement witnesses in $\mathcal{B}(\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_n)$. Translated into the language of linear maps from $\mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n)$ into $\mathcal{B}(\mathcal{H}_1)$, the presented method enables one to construct maps which are not positive but become such when restricted to separable elements in $\mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n)$. The corresponding spectral condition is provided in Theorem 2 (see formula (6.16)).
To the best of our knowledge we provide the first nontrivial example of a linear map $\varphi : \mathcal{B}(\mathcal{H} \otimes \mathcal{H}) \to \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$ which is positive on separable elements but not necessarily positive.

2. Preliminaries

Consider the space $L(\mathcal{H}_1, \mathcal{H}_2)$ of linear operators $a : \mathcal{H}_1 \to \mathcal{H}_2$, or equivalently the space of $d_1 \times d_2$ matrices, where $d_i = \dim \mathcal{H}_i < \infty$. Let us recall that $L(\mathcal{H}_1, \mathcal{H}_2)$ is equipped with a family of Ky Fan k-norms [14]: for any $a \in L(\mathcal{H}_1, \mathcal{H}_2)$ one defines

$$ \| a \|_k := \sum_{i=1}^{k} s_i(a), \tag{2.1} $$

where $s_1(a) \ge \ldots \ge s_d(a)$ ($d = \min\{d_1, d_2\}$) are the singular values of $a$. Clearly, for $k = 1$ one recovers the operator norm $\| a \|_1 = \| a \|$ and if $d_1 = d_2 = d$, then for $k = d$ one reproduces the trace norm $\| a \|_d = \| a \|_{\rm tr}$. The family of k-norms satisfies:

1. $\| a \|_k \le \| a \|_{k+1}$,
2. $\| a \|_k = \| a \|_{k+1}$ if and only if $\operatorname{rank} a \le k$,
3. if $\operatorname{rank} a \ge k + 1$, then $\| a \|_k < \| a \|_{k+1}$.
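As a quick illustration of definition (2.1) and of properties 1–3, the following sketch evaluates the Ky Fan k-norms directly from a list of singular values. The helper name `ky_fan_norm` and the sample singular values are illustrative assumptions, not part of the original text.

```python
def ky_fan_norm(singular_values, k):
    """Ky Fan k-norm of (2.1): the sum of the k largest singular values."""
    s = sorted(singular_values, reverse=True)
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= d")
    return sum(s[:k])


# Hypothetical singular values of a rank-2 operator with d = 3.
singular_values = [3.0, 2.0, 0.0]
norms = [ky_fan_norm(singular_values, k) for k in (1, 2, 3)]

operator_norm = norms[0]   # k = 1
trace_norm = norms[-1]     # k = d
rank = sum(1 for s in singular_values if s > 0)
```

In this example the norms increase strictly up to k = rank and stay constant afterwards, as properties 1–3 state.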

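Note that a family of Ky Fan norms may be equivalently introduced as follows: let us define the following subset of $\mathcal{B}(\mathcal{H})$:

$$ P_k(\mathcal{H}) = \{\, p \in \mathcal{B}(\mathcal{H}) : p = p^* = p^2,\ \operatorname{tr} p = k \,\}. \tag{2.2} $$

Now, for any $p \in P_k(\mathcal{H}_2)$ define the following inner product in $L(\mathcal{H}_1, \mathcal{H}_2)$:

$$ \langle a, b \rangle_p := \operatorname{tr}[(pa)^*(pb)] = \operatorname{tr}(a^* p b) = \operatorname{tr}(p b a^*). \tag{2.3} $$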


It is easy to show that

$$ \| a \|_k^2 = \max_{p \in P_k(\mathcal{H}_2)} \langle a, a \rangle_p = \max_{p \in P_k(\mathcal{H}_2)} \operatorname{tr}(p\, a a^*). \tag{2.4} $$

Throughout the paper we shall consider only finite-dimensional Hilbert spaces. We denote by $M_d$ the space of $d \times d$ complex matrices and by $I_d$ the identity matrix in $M_d$.

Proposition 1. For arbitrary projectors $P$ and $Q$ in $\mathcal{H}$,

$$ \| QPQ \| = \| PQP \|. \tag{2.5} $$

Proof. One obviously has

$$ \| QPQ \| = \| QP(QP)^* \| = \| (QP)^2 \|, \tag{2.6} $$

and

$$ \| PQP \| = \| PQ(PQ)^* \| = \| (PQ)^2 \|. \tag{2.7} $$

Now, for any hermitian $A$ one has $\| A^2 \| = \| A^{*2} \| = \| A \|^2$, and hence

$$ \| (QP)^2 \| = \| (QP)^{*2} \| = \| (PQ)^2 \|, \tag{2.8} $$

which ends the proof.

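Consider now the Hilbert space being the tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$. It is clear that $\mathcal{H}_1 \otimes \mathcal{H}_2$ is isomorphic to the space of linear operators $L(\mathcal{H}_1, \mathcal{H}_2)$, i.e. each $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ corresponds to some linear operator $F : \mathcal{H}_1 \to \mathcal{H}_2$. Indeed, taking orthonormal bases $\{e_i\}$ ($i = 1, \ldots, d_1$) in $\mathcal{H}_1$ and $\{f_\alpha\}$ ($\alpha = 1, \ldots, d_2$) in $\mathcal{H}_2$, one has

$$ \psi = \sum_{i=1}^{d_1} \sum_{\alpha=1}^{d_2} \psi_{i\alpha}\, e_i \otimes f_\alpha, \tag{2.9} $$

where $\psi_{i\alpha}$ are complex coefficients. Defining $F : \mathcal{H}_1 \to \mathcal{H}_2$ by

$$ F e_i := \sum_{\alpha=1}^{d_2} \psi_{i\alpha} f_\alpha, $$

one finds

$$ \psi = \sum_{i=1}^{d_1} e_i \otimes F e_i, \tag{2.10} $$

which establishes the above-mentioned isomorphism. Let us note that the normalization condition

$$ \langle \psi | \psi \rangle = \sum_{i=1}^{d_1} \sum_{\alpha=1}^{d_2} |\psi_{i\alpha}|^2 = 1 \tag{2.11} $$

gives rise to

$$ \operatorname{tr} F F^* = 1 \tag{2.12} $$

for the corresponding operator $F$. It is therefore clear that the rank-1 projector $P := |\psi\rangle\langle\psi|$ may be represented in the following way: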


$$ P = \sum_{i,j=1}^{d_1} e_{ij} \otimes F e_{ij} F^*, \tag{2.13} $$

where $e_{ij} := |e_i\rangle\langle e_j| \in \mathcal{B}(\mathcal{H}_1)$. Now, it is easy to see that

$$ \mathrm{SR}(\psi) = \operatorname{rank} F, \tag{2.14} $$

where $\mathrm{SR}(\psi)$ denotes the Schmidt rank of $\psi$ ($1 \le \mathrm{SR}(\psi) \le d$), i.e. the number of non-vanishing Schmidt coefficients in the Schmidt decomposition of $\psi$. It is clear that $F$ does depend upon the chosen basis $\{e_1, \ldots, e_{d_1}\}$. Note, however, that $F F^*$ is basis-independent and therefore has a physical meaning: it is the reduction of $P$ with respect to the first subsystem,

$$ F F^* = \operatorname{tr}_1 P. \tag{2.15} $$

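Proposition 2. Let $P$ be a projector in $\mathcal{H}_1 \otimes \mathcal{H}_2$ represented as in (2.13) and $Q = I_{d_1} \otimes p$, where $p \in P_k(\mathcal{H}_2)$. Then the following formula holds:

$$ \| (I_{d_1} \otimes p) P (I_{d_1} \otimes p) \| = \operatorname{tr}(p F F^*), \tag{2.16} $$

and hence

$$ \| (I_{d_1} \otimes p) P (I_{d_1} \otimes p) \| \le \| F \|_k^2. \tag{2.17} $$

Proof. Due to Proposition 1 one has

$$ \| (I_{d_1} \otimes p) P (I_{d_1} \otimes p) \| = \| P (I_{d_1} \otimes p) P \|, \tag{2.18} $$

and hence

$$ \| (I_{d_1} \otimes p) P (I_{d_1} \otimes p) \| = \operatorname{tr}[P (I_{d_1} \otimes p)] = \sum_{i=1}^{d_1} \operatorname{tr}(F e_{ii} F^* p) = \operatorname{tr}(F F^* p), \tag{2.19} $$

where we have used $\sum_{i=1}^{d_1} e_{ii} = I_{d_1}$.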

Note that if $F = V/\sqrt{d_1}$, where $V$ is an isometry, $V V^* = I_{d_2}$, then $P$ is the maximally entangled state

$$ P = \frac{1}{d_1} \sum_{i,j=1}^{d_1} e_{ij} \otimes V e_{ij} V^*, \tag{2.20} $$

and one obtains in this case

$$ \| (I_{d_1} \otimes p) P (I_{d_1} \otimes p) \| = \frac{k}{d_1} = \| F \|_k^2. \tag{2.21} $$
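Proposition 2 can be checked numerically. The sketch below assumes NumPy is available and uses hypothetical helper names; it builds $P$ from a random normalized $F$ as in (2.10) and (2.13), compares both sides of (2.16) for a random rank-k projector $p$, and evaluates the maximally entangled case (2.21). The maximum of $\operatorname{tr}(p F F^*)$ over rank-k projectors, which (2.4) denotes $\|F\|_k^2$, is computed as the sum of the k largest eigenvalues of $F F^*$ (Ky Fan's maximum principle).

```python
import numpy as np


def random_operator(d1, d2, rng):
    """Random F : C^{d1} -> C^{d2} (a d2 x d1 matrix) with tr(F F^*) = 1."""
    F = rng.normal(size=(d2, d1)) + 1j * rng.normal(size=(d2, d1))
    return F / np.linalg.norm(F)  # Frobenius normalisation


def projector_from_operator(F):
    """P = |psi><psi| with psi = sum_i e_i (x) F e_i, cf. (2.10) and (2.13)."""
    psi = F.T.reshape(-1)  # stacks the columns F e_1, ..., F e_{d1}
    return np.outer(psi, psi.conj())


def random_rank_k_projector(d, k, rng):
    """Orthogonal projector of trace k onto a random k-dimensional subspace."""
    A = rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k))
    Q, _ = np.linalg.qr(A)  # orthonormal columns
    return Q @ Q.conj().T


def compressed_norm(F, p):
    """Operator norm of (I (x) p) P (I (x) p), the left-hand side of (2.16)."""
    d1 = F.shape[1]
    Q = np.kron(np.eye(d1), p)
    return np.linalg.norm(Q @ projector_from_operator(F) @ Q, 2)


def top_k_weight(F, k):
    """max_p tr(p F F^*) over rank-k projectors p."""
    eigenvalues = np.linalg.eigvalsh(F @ F.conj().T)  # ascending order
    return float(np.sum(eigenvalues[-k:]))


rng = np.random.default_rng(0)
d1, d2, k = 3, 4, 2

F = random_operator(d1, d2, rng)
p = random_rank_k_projector(d2, k, rng)
lhs = compressed_norm(F, p)                  # || (I (x) p) P (I (x) p) ||
rhs = np.trace(p @ F @ F.conj().T).real      # tr(p F F^*)
bound = top_k_weight(F, k)                   # right-hand side of (2.17)

# Maximally entangled case (2.20)-(2.21): F = U / sqrt(d) with U unitary.
d = 3
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
entangled_value = compressed_norm(U / np.sqrt(d), random_rank_k_projector(d, k, rng))
```

Up to rounding, `lhs` and `rhs` agree, `lhs` does not exceed `bound`, and `entangled_value` equals k/d.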


3. Entangled States vs. Positive Maps

Let us recall that the state of a quantum system living in $\mathcal{H}_1 \otimes \mathcal{H}_2$ is separable iff the corresponding density operator $\sigma$ is a convex combination of product states $\sigma_1 \otimes \sigma_2$. For any normalized positive operator $\sigma$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ one may define its Schmidt number

$$ \mathrm{SN}(\sigma) = \min_{\alpha_k, \psi_k} \Big\{ \max_k \mathrm{SR}(\psi_k) \Big\}, \tag{3.1} $$

where the minimum is taken over all possible pure state decompositions,

$$ \sigma = \sum_k \alpha_k |\psi_k\rangle\langle\psi_k|, \tag{3.2} $$

where $\alpha_k \ge 0$, $\sum_k \alpha_k = 1$, and $\psi_k$ are normalized vectors in $\mathcal{H}_1 \otimes \mathcal{H}_2$. This number characterizes the minimum Schmidt rank of the pure states needed to construct $\sigma$. It is evident that $1 \le \mathrm{SN}(\sigma) \le d = \min\{d_1, d_2\}$. Moreover, $\sigma$ is separable iff $\mathrm{SN}(\sigma) = 1$. It was proved [15] that the Schmidt number is non-increasing under local operations and classical communication. Now, the notion of the Schmidt number enables one to introduce a natural family of convex cones in $\mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)^+$ (the set of semi-positive elements in $\mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)$):

$$ V_r = \{\, \sigma \in \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)^+ \mid \mathrm{SN}(\sigma) \le r \,\}. \tag{3.3} $$

One has the following chain of inclusions:

$$ V_1 \subset \ldots \subset V_d = \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)^+. \tag{3.4} $$

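Clearly, $V_1$ is the cone of separable (unnormalized) states and $V_d \setminus V_1$ stands for the set of entangled states. Let $\varphi : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ be a linear map such that $\varphi(a)^* = \varphi(a^*)$. The map $\varphi$ is positive iff $\varphi(a) \ge 0$ for any $a \ge 0$.

Definition 1. A linear map $\varphi$ is k-positive if $\mathrm{id}_k \otimes \varphi : M_k \otimes \mathcal{B}(\mathcal{H}_1) \to M_k \otimes \mathcal{B}(\mathcal{H}_2)$ is positive. A map which is k-positive for $k = 1, \ldots, d = \min\{d_1, d_2\}$ is called completely positive (CP map).

Due to the Choi-Jamiołkowski isomorphism [6,16] any linear adjoint-preserving map $\varphi : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ corresponds to a Hermitian operator $\hat\varphi \in \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)$,

$$ \hat\varphi := \sum_{i,j=1}^{d_1} e_{ij} \otimes \varphi(e_{ij}). \tag{3.5} $$

One easily proves [6] the following properties of the Choi matrix $\hat\varphi$ (see also [15] in the context of quantum information).

Proposition 3. A linear map $\varphi$ is k-positive if and only if

$$ (I_{d_1} \otimes p)\, \hat\varphi\, (I_{d_1} \otimes p) \ge 0 \tag{3.6} $$

for all $p \in P_k(\mathcal{H}_2)$. Equivalently, $\varphi$ is k-positive iff $\operatorname{tr}(\sigma \hat\varphi) \ge 0$ for any $\sigma \in V_k$.

Corollary 1. A linear map $\varphi$ is positive iff $\operatorname{tr}(\sigma \hat\varphi) \ge 0$ for any $\sigma \in V_1$, i.e. for all separable states $\sigma$. Moreover, $\varphi$ is CP iff $\operatorname{tr}(\sigma \hat\varphi) \ge 0$ for any $\sigma \in V_d$, i.e. $\hat\varphi \ge 0$.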


4. Main Result

It is well known that any CP map may be represented in the so-called Kraus form [17],

$$ \varphi_{\rm CP}(a) = \sum_\alpha K_\alpha\, a\, K_\alpha^*, \tag{4.1} $$

where (Kraus operators) $K_\alpha \in L(\mathcal{H}_1, \mathcal{H}_2)$. Any positive map is a difference of two CP maps $\varphi = \varphi_+ - \varphi_-$. However, there is no general method to recognize the positivity of $\varphi$ from $\varphi_+ - \varphi_-$. Consider now a special class when $\hat\varphi_+$ and $\hat\varphi_-$ are orthogonally supported and $\hat\varphi_- = \lambda_1 P_1$, with $P_1$ being a rank-1 projector. Let

$$ \varphi(a) = \sum_{\alpha=2}^{D} \lambda_\alpha F_\alpha\, a\, F_\alpha^* - \lambda_1 F_1\, a\, F_1^*, \tag{4.2} $$

such that

1. all rank-1 projectors $P_\alpha = d_1^{-1} \sum_{i,j=1}^{d_1} e_{ij} \otimes F_\alpha e_{ij} F_\alpha^*$ are mutually orthogonal,
2. $\lambda_\alpha > 0$ for $\alpha = 1, \ldots, D$, with $D := d_1 d_2$.

Our main result consists in the following

Theorem 1. Let $\|F_1\|_{k+1} < 1$. The map (4.2) is k-positive but not $(k+1)$-positive if

$$ \frac{\lambda_1 \|F_1\|_{k+1}^2}{1 - \|F_1\|_{k+1}^2}\, (I_{d_1} \otimes I_{d_2} - P_1) > \hat\varphi_+ \ge \frac{\lambda_1 \|F_1\|_k^2}{1 - \|F_1\|_k^2}\, (I_{d_1} \otimes I_{d_2} - P_1). \tag{4.3} $$

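The above theorem follows from the following two lemmas:

Lemma 1. Let $\| F_1 \|_k < 1$. If

$$ \hat\varphi_+ \ge \frac{\lambda_1 \|F_1\|_k^2}{1 - \|F_1\|_k^2}\, (I_{d_1} \otimes I_{d_2} - P_1), \tag{4.4} $$

then $\varphi$ is k-positive.

Proof. Let $p \in P_k(\mathcal{H}_2)$. Take a unit vector $\xi \in (I_{d_1} \otimes p)\, \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ and set

$$ \mu = \frac{\lambda_1 \|F_1\|_k^2}{1 - \|F_1\|_k^2}. \tag{4.5} $$

One obtains

$$ (\xi, (I_{d_1} \otimes p)\, \hat\varphi\, (I_{d_1} \otimes p)\xi) \ge \mu - (\mu + \lambda_1)\, (\xi, (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p)\xi). \tag{4.6} $$

Now, using Proposition 2 one has

$$ (\xi, (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p)\xi) \le \| (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p) \| \le \|F_1\|_k^2, \tag{4.7} $$

and hence

$$ (\xi, (I_{d_1} \otimes p)\, \hat\varphi\, (I_{d_1} \otimes p)\xi) \ge 0, \tag{4.8} $$

which proves k-positivity of $\varphi$.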


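Lemma 2. Let $\| F_1 \|_k < 1$. If

$$ \hat\varphi_+ < \frac{\lambda_1 \|F_1\|_k^2}{1 - \|F_1\|_k^2}\, (I_{d_1} \otimes I_{d_2} - P_1), \tag{4.9} $$

then $\varphi$ is not k-positive.

Proof. To prove that $\varphi$ is not k-positive we construct a vector $\xi_0 \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ such that

$$ (\xi_0, (I_{d_1} \otimes p_0)\, \hat\varphi\, (I_{d_1} \otimes p_0)\xi_0) < 0 \tag{4.10} $$

for some $p_0 \in P_k(\mathbb{C}^{d_2})$. Now, take any $p \in P_k(\mathbb{C}^{d_2})$ such that

$$ N^2 = \operatorname{tr}(p F_1 F_1^*) \tag{4.11} $$

is nonzero. Define

$$ \xi = N^{-1} \sum_{i=1}^{d_1} e_i \otimes p F_1 e_i. \tag{4.12} $$

Assuming (4.9) one finds

$$ (\xi, (I_{d_1} \otimes p)\, \hat\varphi\, (I_{d_1} \otimes p)\xi) < \mu - (\mu + \lambda_1)\, (\xi, (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p)\xi) = \frac{\mu}{\|F_1\|_k^2} \Big( \|F_1\|_k^2 - (\xi, (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p)\xi) \Big), \tag{4.13} $$

with $\mu$ defined by (4.5). Now, it is easy to show that

$$ (\xi, (I_{d_1} \otimes p) P_1 (I_{d_1} \otimes p)\xi) = \operatorname{tr}(p F_1 F_1^*), \tag{4.14} $$

and therefore

$$ (\xi, (I_{d_1} \otimes p)\, \hat\varphi\, (I_{d_1} \otimes p)\xi) < \frac{\mu}{\|F_1\|_k^2} \Big( \|F_1\|_k^2 - \operatorname{tr}(p F_1 F_1^*) \Big). \tag{4.15} $$

Finally, let us observe that since $P_k(\mathbb{C}^{d_2})$ is compact there exists a point $p_0 \in P_k(\mathbb{C}^{d_2})$ such that

$$ \operatorname{tr}(p_0 F_1 F_1^*) = \|F_1\|_k^2. \tag{4.16} $$

Hence

$$ (\xi_0, (I_{d_1} \otimes p_0)\, \hat\varphi\, (I_{d_1} \otimes p_0)\xi_0) < 0, \tag{4.17} $$

with $\xi_0 = \|F_1\|_k^{-1} \sum_{i=1}^{d_1} e_i \otimes p_0 F_1 e_i$.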

Remark 1. Note that condition (4.3) may be equivalently rewritten as the following condition for the spectrum of $\hat\varphi_+$:

$$ \mu_{k+1} > \lambda_\alpha \ge \mu_k; \quad \alpha = 2, \ldots, D, \tag{4.18} $$

with

$$ \mu_l = \frac{\lambda_1 \|F_1\|_l^2}{1 - \|F_1\|_l^2}, \tag{4.19} $$

for $l = 1, 2, \ldots, d$.


Remark 2. If $d_1 = d_2 = d$ and $P_1$ is the maximally entangled state in $\mathbb{C}^d \otimes \mathbb{C}^d$, i.e. $F = U/\sqrt{d}$ with unitary $U$, then the above theorem reproduces a 25 year old result by Takasaki and Tomiyama [18].

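Remark 3. For $d_1 = d_2 = d$, $k = 1$ and arbitrary $P_1$ the formula $\lambda_\alpha \ge \mu_1$ ($\alpha = 2, \ldots, d^2$) was derived by Benatti et al. [19].

Let us observe that part of the above theorem may be easily generalized to maps where $\operatorname{rank} \hat\varphi_- = m > 1$. Consider

$$ \varphi(a) = \sum_{\alpha=m+1}^{D} \lambda_\alpha F_\alpha\, a\, F_\alpha^* - \sum_{\alpha=1}^{m} \lambda_\alpha F_\alpha\, a\, F_\alpha^*, \tag{4.20} $$

with $\lambda_\alpha > 0$.

Corollary 2. Let $\sum_{\alpha=1}^{m} \| F_\alpha \|_k^2 < 1$. If

$$ \hat\varphi_+ \ge \frac{\sum_{\alpha=1}^{m} \lambda_\alpha \|F_\alpha\|_k^2}{1 - \sum_{\alpha=1}^{m} \|F_\alpha\|_k^2} \Big( I_{d_1} \otimes I_{d_2} - \sum_{\alpha=1}^{m} P_\alpha \Big), \tag{4.21} $$

then $\varphi$ is k-positive.

The proof is analogous to the proof of Lemma 1. Finally, the condition $\lambda_\alpha > 0$ may be easily relaxed. One has the following

Corollary 3. Consider the map (4.20) such that $\lambda_1 = \ldots = \lambda_\ell = 0$ ($\ell < m$) and $\lambda_{\ell+1}, \ldots, \lambda_D > 0$. If

$$ \hat\varphi_+ \ge \frac{\sum_{\alpha=\ell}^{m} \lambda_\alpha \|F_\alpha\|_k^2}{1 - \sum_{\alpha=1}^{m} \|F_\alpha\|_k^2} \Big( I_{d_1} \otimes I_{d_2} - \sum_{\alpha=1}^{m} P_\alpha \Big), \tag{4.22} $$

then $\varphi$ is k-positive.

5. Example: Generalized Choi Maps

Let us consider a family of maps $\varphi_\lambda : M_d \to M_d$, defined as follows:

$$ \varphi_\lambda(a) := I_d \operatorname{tr} a - \lambda F_1 a F_1^*. \tag{5.1} $$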

It generalizes the celebrated Choi map [6] which is $(d-1)$-positive but not CP,

$$ \varphi_{\rm Choi}(a) := I_d \operatorname{tr} a - \frac{1}{d-1}\, a, \tag{5.2} $$

which follows from (5.1) with $F_1 = I_d/\sqrt{d}$ and $\lambda = d/(d-1)$. If $\lambda = d$, then (5.1) reproduces the so-called reduction map,

$$ \varphi_{\rm red}(a) := I_d \operatorname{tr} a - a, \tag{5.3} $$


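which is known to be completely co-positive. One easily finds

$$ \hat\varphi_\lambda = I_d \otimes I_d - \lambda P_1 = \hat\varphi_+ - \hat\varphi_-, \tag{5.4} $$

where

$$ \hat\varphi_+ = I_d \otimes I_d - P_1, \qquad \hat\varphi_- = (\lambda - 1) P_1, $$

are orthogonally supported positive operators and

$$ P_1 = \sum_{i,j=1}^{d} e_{ij} \otimes F_1 e_{ij} F_1^*. \tag{5.5} $$

Let $f_k := \| F_1 \|_k^2$ and assume that $f_{k+1} < 1$. Theorem 1 implies that the map $\varphi_\lambda$ is k-positive but not $(k+1)$-positive iff

$$ \frac{1}{f_k} \ge \lambda > \frac{1}{f_{k+1}}. \tag{5.6} $$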
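The window (5.6) is easy to evaluate by hand for a maximally entangled $P_1$, where $f_k = k/d$. The following sketch (the function name `positivity_degree` is a hypothetical helper, and exact rational arithmetic is used to avoid rounding at the endpoints) recovers the degree of positivity of the Choi map (5.2) and of the reduction map (5.3). The top case $k = d - 1$, where $f_d = 1$, is included in the loop because $\hat\varphi_\lambda = I_d \otimes I_d - \lambda P_1$ has the negative eigenvalue $1 - \lambda$ as soon as $\lambda > 1$.

```python
from fractions import Fraction


def positivity_degree(lam, f):
    """Return k with 1/f_k >= lam > 1/f_{k+1}, cf. (5.6), or None if no such k.

    f[k - 1] stands for f_k = ||F_1||_k^2.
    """
    for k in range(1, len(f)):
        if 1 / f[k - 1] >= lam > 1 / f[k]:
            return k
    return None


d = 4
# Maximally entangled P_1: F_1 = I_d / sqrt(d), so f_k = k / d.
f = [Fraction(k, d) for k in range(1, d + 1)]

choi_degree = positivity_degree(Fraction(d, d - 1), f)    # Choi map (5.2)
reduction_degree = positivity_degree(Fraction(d), f)      # reduction map (5.3)
cp_case = positivity_degree(Fraction(1), f)               # lambda = 1 lies below every window
```

For d = 4 this gives degree 3 for the Choi map and degree 1 for the reduction map, in line with the statements following (5.1).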

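Consider the family of states

$$ \rho_\mu = \frac{1 - \mu}{d^2 - 1}\, (I_d \otimes I_d - P_1) + \mu P_1. \tag{5.7} $$

Computing $\operatorname{tr}(\hat\varphi_\lambda \rho_\mu)$ one finds that $\mathrm{SN}(\rho_\mu) = k$ iff

$$ f_k \ge \mu > f_{k-1}. \tag{5.8} $$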

In particular $\rho_\mu$ is separable iff $\mu \le f_1 = \| F_1 \|^2$. Note that if $P_1$ is the maximally entangled state then $\rho_\mu$ defines a family of isotropic states. In this case $f_k = k/d$ and one recovers the well known result [15]: $\mathrm{SN}(\rho_\mu) = k$ iff $k/d \ge \mu > (k-1)/d$. Consider now the following generalization of (5.1):

$$ \varphi_\lambda(a) := I_d \operatorname{tr} a - \lambda \sum_{\alpha=1}^{m} F_\alpha\, a\, F_\alpha^*, \tag{5.9} $$

and the corresponding operator

$$ \hat\varphi_\lambda = I_d \otimes I_d - \lambda P, \tag{5.10} $$

where $P$ is a rank-m projector given by

$$ P = \sum_{i,j=1}^{d} \sum_{\alpha=1}^{m} e_{ij} \otimes F_\alpha e_{ij} F_\alpha^*. \tag{5.11} $$

The map $\varphi_\lambda$ is k-positive if

$$ \lambda \le \frac{1}{f_k}, \tag{5.12} $$

where now $f_k = \sum_{\alpha=1}^{m} \| F_\alpha \|_k^2$ and we assume that $f_k < 1$. Consider the family of states

$$ \rho_\mu = \frac{1 - m\mu}{d^2 - m}\, (I_d \otimes I_d - P) + \frac{\mu}{m}\, P. \tag{5.13} $$

Computing $\operatorname{tr}(\hat\varphi_\lambda \rho_\mu)$ one finds that $\mathrm{SN}(\rho_\mu) = k$ iff

$$ f_k \ge \mu > f_{k-1}. \tag{5.14} $$

In particular $\rho_\mu$ is separable iff $\mu \le f_1 = \sum_{\alpha=1}^{m} \| F_\alpha \|^2$. Note that if $P$ is the sum of $m$ maximally entangled states then $\rho_\mu$ defines a generalization of the family of isotropic states. In this case $f_k = mk/d$ and one obtains: $\mathrm{SN}(\rho_\mu) = k$ iff $mk/d \ge \mu > m(k-1)/d$.

6. Multipartite Setting

Consider now an n-partite state $\rho$ living in $\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_n$. Recall

Definition 2. A state $\rho$ is separable iff it can be represented as the convex combination of product states $\rho_1 \otimes \ldots \otimes \rho_n$.

One proves [20] the following

Proposition 4. An n-partite state $\rho$ in $\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_n$ is separable iff

$$ (\mathrm{id} \otimes \varphi)\, \rho \ge 0 \tag{6.1} $$

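for all linear maps $\varphi : \mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n) \to \mathcal{B}(\mathcal{H}_1)$ satisfying

$$ \varphi(p_2 \otimes \ldots \otimes p_n) \ge 0, \tag{6.2} $$

where $p_k$ is a rank-1 projector in $\mathcal{H}_k$.

Definition 3 (Generalized Choi-Jamiołkowski isomorphism). For any linear map $\varphi : \mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n) \to \mathcal{B}(\mathcal{H}_1)$, let $\hat\varphi$ be an operator in $\mathcal{B}(\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_n)$ defined by

$$ \hat\varphi := d_1\, (\mathrm{id} \otimes \varphi^{\#})\, P^+, \tag{6.3} $$

where $P^+$ is the canonical maximally entangled state in $\mathcal{H}_1 \otimes \mathcal{H}_1$, and $\varphi^{\#}$ denotes the dual map.

Proposition 5. A linear map $\varphi : \mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n) \to \mathcal{B}(\mathcal{H}_1)$ satisfies (6.2) iff

$$ \operatorname{tr}[(p_1 \otimes \ldots \otimes p_n)\, \hat\varphi] \ge 0 \tag{6.4} $$

for any rank-1 projectors $p_k$.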


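Proof. One has

$$ \operatorname{tr}[(p_1 \otimes \ldots \otimes p_n)\, \hat\varphi] = d_1 \operatorname{tr}[(p_1 \otimes \ldots \otimes p_n)\, (\mathrm{id} \otimes \varphi^{\#}) P^+] = d_1 \operatorname{tr}[P^+ \cdot p_1 \otimes \varphi(p_2 \otimes \ldots \otimes p_n)]. \tag{6.5} $$

Now, using $P^+ = d_1^{-1} \sum_{i,j=1}^{d_1} e_{ij} \otimes e_{ij}$, one obtains

$$ \operatorname{tr}[P^+ \cdot p_1 \otimes \varphi(p_2 \otimes \ldots \otimes p_n)] = d_1^{-1} \sum_{i,j=1}^{d_1} \operatorname{tr}(e_{ij}\, p_1)\, \operatorname{tr}[e_{ij}\, \varphi(p_2 \otimes \ldots \otimes p_n)]. \tag{6.6} $$

Finally, due to $\sum_{i,j} \operatorname{tr}(e_{ij}\, a)\, e_{ij} = a^T$, one finds

$$ \operatorname{tr}[(p_1 \otimes \ldots \otimes p_n)\, \hat\varphi] = \operatorname{tr}[p_1^T\, \varphi(p_2 \otimes \ldots \otimes p_n)], \tag{6.7} $$

from which the proposition immediately follows.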

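Corollary 4. A linear map $\varphi : \mathcal{B}(\mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n) \to \mathcal{B}(\mathcal{H}_1)$ satisfies (6.2) iff

$$ (I \otimes p_2 \otimes \ldots \otimes p_n)\, \hat\varphi\, (I \otimes p_2 \otimes \ldots \otimes p_n) \ge 0 \tag{6.8} $$

for any rank-1 projectors $p_k$.

To construct linear maps which are positive on separable states let us define the following norm: let

$$ P_{\rm sep} = \{\, p_2 \otimes \ldots \otimes p_n : p_k = p_k^* = p_k^2,\ \operatorname{tr} p_k = 1 \,\}, \tag{6.9} $$

and define an inner product in the space of linear operators $L(\mathcal{H}_1, \mathcal{H}_2 \otimes \ldots \otimes \mathcal{H}_n)$,

$$ \langle A, B \rangle_P := \operatorname{tr}[(PA)^*(PB)], \tag{6.10} $$

with $P \in P_{\rm sep}$. Finally, let

$$ \| A \|_{\rm sep}^2 := \max_{P \in P_{\rm sep}} \langle A, A \rangle_P. \tag{6.11} $$

It is clear that

$$ \| A \|_{\rm sep} \le \| A \|. \tag{6.12} $$

Consider now the linear map defined by

$$ \varphi(a) = \sum_{\alpha=2}^{D} \lambda_\alpha F_\alpha\, a\, F_\alpha^* - \lambda_1 F_1\, a\, F_1^*, \tag{6.13} $$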

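where $D = d_1 \ldots d_n$, $\operatorname{tr}(F_\alpha^* F_\beta) = \delta_{\alpha\beta}$ and $\lambda_\alpha > 0$. One finds for the corresponding $\hat\varphi$,

$$ \hat\varphi = \sum_{\alpha=2}^{D} \lambda_\alpha P_\alpha - \lambda_1 P_1, \tag{6.14} $$

where the rank-1 projectors read as follows:

$$ P_\alpha = \sum_{i,j=1}^{d_1} e_{ij} \otimes F_\alpha e_{ij} F_\alpha^*. \tag{6.15} $$

In analogy to Theorem 1 one easily proves

Theorem 2. Let $\| F_1 \|_{\rm sep} < \| F_1 \| < 1$. Then $\varphi$ is positive on separable states but not positive if

$$ \frac{\lambda_1 \|F_1\|^2}{1 - \|F_1\|^2} > \lambda_\alpha \ge \frac{\lambda_1 \|F_1\|_{\rm sep}^2}{1 - \|F_1\|_{\rm sep}^2}, \tag{6.16} $$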

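for $\alpha = 2, \ldots, D$.

Example. Consider the map

$$ \varphi_\lambda : M_d \otimes M_d \to M_{d^2} \equiv M_d \otimes M_d, \tag{6.17} $$

defined by

$$ \varphi_\lambda(a) = \lambda\, (I_d \otimes I_d \operatorname{tr} a - F_0\, a\, F_0) - F_0\, a\, F_0, \tag{6.18} $$

with

$$ F_0 = F_0^* = \frac{1}{\sqrt{2d(d-1)}} \Big[ I_d \otimes I_d - \sum_{i,j=1}^{d} e_{ij} \otimes e_{ij}^* \Big]. \tag{6.19} $$

Note that $\operatorname{tr} F_0^2 = 1$ and $\sqrt{d(d-1)/2} \cdot F_0$ is a projector (see [21] for more details). Hence

$$ \| F_0 \|^2 = \frac{2}{d(d-1)}. \tag{6.20} $$

Now, for any rank-1 projectors $p, q \in M_d$ one has

$$ \operatorname{tr}\big[ (p \otimes q) F_0^2 \big] = \frac{1}{d(d-1)}\, (1 - \operatorname{tr} pq), \tag{6.21} $$

and therefore

$$ \| F_0 \|_{\rm sep}^2 := \max_{p \otimes q \in P_{\rm sep}} \operatorname{tr}\big[ (p \otimes q) F_0^2 \big] = \frac{1}{d(d-1)} < \| F_0 \|^2. \tag{6.22} $$

Corollary 5. Let $d > 2$, i.e. $\| F_0 \|_{\rm sep} < \| F_0 \| < 1$. For

$$ \frac{2}{d(d-1) - 2} > \lambda \ge \frac{1}{d(d-1) - 1}, \tag{6.23} $$

$\varphi_\lambda$ is positive on separable elements in $M_d \otimes M_d$ but it is not a positive map.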
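The quantities in this example can be verified numerically for a small dimension. The sketch below assumes NumPy and uses hypothetical helper names. It reads $\sum_{i,j} e_{ij} \otimes e_{ij}^*$ in (6.19) as the flip operator $\sum_{i,j} e_{ij} \otimes e_{ji}$, checks (6.20) and (6.21) for random rank-1 projectors, and evaluates the two bounds of (6.16) with $\lambda_1 = 1$, the coefficient of the last term in (6.18).

```python
import numpy as np


def swap_operator(d):
    """sum_{i,j} e_ij (x) e_ji, the flip operator on C^d (x) C^d."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[i * d + j, j * d + i] = 1.0  # |e_i (x) e_j><e_j (x) e_i|
    return S


def random_rank_one_projector(d, rng):
    """Projector onto a random unit vector of C^d."""
    x = rng.normal(size=d) + 1j * rng.normal(size=d)
    x /= np.linalg.norm(x)
    return np.outer(x, x.conj())


d = 3
F0 = (np.eye(d * d) - swap_operator(d)) / np.sqrt(2 * d * (d - 1))   # (6.19)
F0_sq = F0 @ F0

trace_F0_sq = np.trace(F0_sq)                      # expected 1
Pi = np.sqrt(d * (d - 1) / 2) * F0                 # expected to be a projector
norm_sq = np.linalg.norm(F0, 2) ** 2               # (6.20)

rng = np.random.default_rng(1)
p = random_rank_one_projector(d, rng)
q = random_rank_one_projector(d, rng)
lhs = np.trace(np.kron(p, q) @ F0_sq).real         # left-hand side of (6.21)
rhs = (1 - np.trace(p @ q).real) / (d * (d - 1))   # right-hand side of (6.21)

sep_norm_sq = 1 / (d * (d - 1))                    # (6.22), attained when tr(pq) = 0
upper = norm_sq / (1 - norm_sq)                    # 2 / (d (d - 1) - 2)
lower = sep_norm_sq / (1 - sep_norm_sq)            # 1 / (d (d - 1) - 1)
```

For d = 3 the window of Corollary 5 evaluates to $1/2 > \lambda \ge 1/5$.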


Remark 4. To the best of our knowledge $\varphi_\lambda$ provides the first nontrivial example of a map which is not positive but is positive on separable states. Of course any linear functional

$$ M_{d_1} \otimes M_{d_2} \ni a \longrightarrow \operatorname{tr}(a \hat\varphi) \in \mathcal{B} = \mathbb{C} $$

takes positive values whenever $a$ is separable, but it is not a positive map when $\varphi$ is a positive map which is not CP. Nontrivial means that our example defines a map $\varphi : M_{d_1} \otimes M_{d_2} \to \mathcal{B} = M_{d_1} \otimes M_{d_2}$, where the algebra $\mathcal{B}$ is noncommutative. Moreover, $\varphi$ is not a tensor product $\varphi_1 \otimes \varphi_2$ of two positive maps $\varphi_k : M_{d_k} \to M_{d_k}$ ($\varphi_1 \otimes \varphi_2$ need not be positive but it is positive on separable elements).

7. Conclusions

We provide a partial classification of positive linear maps based on spectral conditions. The presented method generalizes the celebrated Choi example of a map which is positive but not CP. From the physical point of view our scheme provides a simple method for constructing entanglement witnesses. Moreover, this scheme may be easily generalized to the multipartite setting. The presented method guarantees k-positivity but says nothing about indecomposability and/or optimality. We stress that both the indecomposable and the optimal positive maps are crucial in detecting and classifying quantum entanglement. Therefore, the analysis of positive maps based on spectral properties deserves further study.

Acknowledgement. This work was partially supported by the Polish Ministry of Science and Higher Education Grant No 3004/B/H03/2007/33 and by the Polish Research Network Laboratory of Physical Foundations of Information Processing.

References

1. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge: Cambridge University Press, 2000
2. Peres, A.: Phys. Rev. Lett. 77, 1413 (1996)
3. Horodecki, P.: Phys. Lett. A 232, 333 (1997)
4. Størmer, E.: Acta Math. 110, 233 (1963)
5. Arveson, W.: Acta Math. 123, 141 (1969)
6. Choi, M.-D.: Lin. Alg. Appl. 10, 285 (1975); ibid 12, 95 (1975)
7. Woronowicz, S.L.: Rep. Math. Phys. 10, 165 (1976)
8. Woronowicz, S.L.: Commun. Math. Phys. 51, 243 (1976)
9. Perez-Garcia, D., Wolf, M.M., Petz, D., Ruskai, M.B.: J. Math. Phys. 47, 083506 (2006)
10. Chruściński, D., Kossakowski, A.: J. Phys. A: Math. Theor. 41, 215201 (2008)
11. Størmer, E.: J. Funct. Anal. 254, 2303 (2008); Størmer, E.: Separable states and positive maps II. http://arxiv.org/abs/:0803.4417V1[math.OA], 2008
12. Chruściński, D., Kossakowski, A.: Open Syst. and Inf. Dyn. 14, 275 (2007)
13. Paulsen, V.: Completely Bounded Maps and Operator Algebras. Cambridge: Cambridge University Press, 2003
14. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. New York: Cambridge University Press, 1991
15. Terhal, B., Horodecki, P.: Phys. Rev. A 61, 040301 (2000)
16. Jamiołkowski, A.: Rep. Math. Phys. 3, 275 (1972)
17. Kraus, K.: States, Effects and Operations: Fundamental Notions of Quantum Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1983
18. Takasaki, K., Tomiyama, J.: Math. Zeit. 184, 101–108 (1983)
19. Benatti, F., Floreanini, R., Piani, M.: Open Syst. and Inf. Dyn. 11, 325–338 (2004)
20. Horodecki, M., Horodecki, P., Horodecki, R.: Phys. Lett. A 283, 1 (2001)
21. Chruściński, D., Kossakowski, A.: Phys. Rev. A 73, 062313 (2006); Phys. Rev. A 73, 062314 (2006)

Communicated by M. B. Ruskai

Commun. Math. Phys. 290, 1065–1097 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0798-0

Communications in

Mathematical Physics

Eigenvector Localization for Random Band Matrices with Power Law Band Width Jeffrey Schenker Michigan State University, East Lansing, Michigan 48824, USA. E-mail: [email protected] Received: 8 October 2008 / Accepted: 8 January 2009 Published online: 18 April 2009 – © Springer-Verlag 2009

Abstract: It is shown that certain ensembles of random matrices with entries that vanish outside a band around the diagonal satisfy a localization condition on the resolvent which guarantees that eigenvectors have strong overlap with a vanishing fraction of standard basis vectors, provided the band width W raised to a power µ remains smaller than the matrix size N . For a Gaussian band ensemble, with matrix elements given by i.i.d. centered Gaussians within a band of width W , the estimate µ ≤ 8 holds.

1. Introduction

Random band matrices, with entries that vanish outside a band of width $W$ around the diagonal, have been suggested [7,8] as a model to study the crossover between a strongly disordered “insulating” regime, with localized eigenfunctions and weak eigenvalue correlations, and a weakly disordered “metallic” regime, with extended eigenfunctions and strong eigenvalue repulsion. Such a crossover is believed to occur in the spectra of certain random partial differential (or difference) operators as the spectral parameter (energy) is changed. In this paper, the strong disorder side of the band matrix crossover is analyzed. It is shown here that certain ensembles of random matrices whose entries vanish outside a band of width $W$ around the diagonal satisfy a localization condition in the limit that the size of the matrix $N$ tends to infinity provided $W^8/N \to 0$. This result requires certain assumptions on the distribution of the entries of the matrix, and the proof given here has technical requirements that may not be necessary. Nonetheless, the conditions imposed below (see Sect. 3) allow for a large family of interesting examples. In particular, one may consider a Gaussian distributed band matrix, with distribution

$$ e^{-2W \operatorname{tr} X_{W;N}^2}\, dX_{W;N}, \tag{1.1} $$


where $dX_{W;N}$ is the Lebesgue measure on the vector space of $N \times N$ matrices of band width $W$. That is

$$ X_{W;N} = \frac{1}{\sqrt{W}} \begin{pmatrix} d_{1,1} & a_{1,2} & \cdots & a_{1,W} & & \\ a_{2,1}^* & d_{2,2} & \ddots & & \ddots & \\ \vdots & \ddots & \ddots & \ddots & & \ddots \\ a_{W,1}^* & & \ddots & \ddots & \ddots & \\ & \ddots & & \ddots & \ddots & \ddots \\ & & \ddots & & \ddots & \ddots \end{pmatrix}_{N \times N}, \tag{1.2} $$

with $d_{i,i}$ and $a_{i,j}$ independent families of i.i.d. real and complex unit Gaussian variables, respectively. The main result obtained here is a localization estimate for the eigenvectors of the matrices $X_{W;N}$. This localization result is most conveniently stated in terms of the resolvent $(X_{W;N} - \lambda)^{-1}$, a well defined random matrix for $\lambda \in \mathbb{R}$. (We will see that $\lambda$ is an eigenvalue of $X_{W;N}$ with probability zero.) Let $e_i$ denote the standard basis vectors $e_i(j) = \delta_{i,j}$. Then

Theorem 1. If $X_{W;N}$ has distribution (1.1), or more generally a distribution satisfying assumptions 1, 2 and 3 in Sect. 3 below, then there exist $\mu > 0$ and $\sigma < \infty$ such that given $r > 0$ and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that

$$ \mathbb{E}\Big( \big| \langle e_i, (X_{W;N} - \lambda)^{-1} e_j \rangle \big|^s \Big) \le A_s W^{s\sigma} e^{-\alpha_s \frac{|i-j|}{W^\mu}} \tag{1.3} $$

for all $\lambda \in [-r, r]$ and all $i, j = 1, \ldots, N$. For the Gaussian band ensemble (1.1), $\sigma \le \frac{1}{2}$ and $\mu \le 8$.

Remarks. For the Gaussian Band Ensemble (1.1), the density of states, in the regime $W, N \to \infty$, $W/N \to 0$, is known to be the Wigner semi-circle law (see Sect. 5 below). For $\lambda$ outside the support of the semi-circle law, one could obtain (1.3) with $\mu = 1$ using Lifschitz tail type estimates. This will be dealt with in a separate paper. Theorem 1 estimates the decay of matrix elements of the resolvent away from the diagonal. Using techniques developed in the context of discrete random Schrödinger operators one may obtain from (1.3) estimates on eigenvectors.

Theorem 2 (Eigenvector localization). Let $X_{W;N}$ have distribution (1.1), or more generally a distribution satisfying assumptions 4 and 5 in Sect. 5. (1) With probability one all eigenvalues of $X_{W;N}$ are simple. (2) If (1.3) holds for all $\lambda$ in an interval $[-r, r]$ and if $\lambda_k$, $k = 1, \ldots, N$, are the eigenvalues of $X_{W;N}$ with corresponding eigenvectors $v_k$, $k = 1, \ldots, N$, then there are $B < \infty$, $\tau \ge 0$, and $\beta > 0$ such that

$$ \mathbb{E}\Big( \sup_{\lambda_k \in [-r, r]} |v_k(i)\, v_k(j)| \Big) \le B W^\tau e^{-\beta \frac{|i-j|}{W^\mu}} \tag{1.4} $$

for all $i, j = 1, \ldots, N$.


Remark. For the proof of this theorem, the reader is directed to the corresponding result in the context of random Schrödinger operators. See for example [15] for the nondegeneracy of the eigenvalues and [2, Theorem A.1] for a derivation of (1.4) from Green's function decay (1.3). In both cases, the proof involves only averaging over the coupling of a rank one perturbation and can be applied in the present context.

1.1. Sketch of the Proof. The proof of Theorem 1 is based on two observations, which may be summarized as follows.¹ Let $G_{W;N}(i,j) = \langle e_i, (X_{W;N} - \lambda)^{-1} e_j \rangle$. Then

(1) The random variable $G_{W;N}(i,j)$ is rarely large. This may be expressed through a bound (uniform in $N$) on the tails of the distribution of $G_{W;N}(i,j)$:

Lemma W. If $X_{W;N}$ has distribution (1.1), or more generally a distribution with the properties outlined in Sect. 3 below, then there exist $\kappa > 0$ and $\sigma < \infty$ such that
\[ \mathrm{Prob}\big( |G_{W;N}(i,j)| > t \big) \le \kappa \frac{W^{\sigma}}{t}. \tag{1.5} \]

(2) The fluctuations of $\ln |G_{W;N}(i,j)|$ grow at least linearly with $|i - j|$. One would typically express the growth of fluctuations by an inequality like $\mathrm{Var}(\ln |G_{W;N}(i,j)|) \ge \mathrm{const.}\, |i - j|$, where $\mathrm{Var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$ is the variance of a random variable $X$. However, for present purposes a more convenient quantitative expression of this idea is the following:

Lemma F. If $X_{W;N}$ has distribution (1.1), or more generally a distribution with the properties outlined in Sect. 3 below, then there is $\mu > 0$ such that if $0 < r < s < 1$ and $|i - j| > 3W$ then
\[ \mathbb{E}\big( |G_{W;N}(i,j)|^{r} \big) \le \exp\big( -C_{r,s} W^{-\mu} |i-j| \big)\, \mathbb{E}\big( |G_{W;N}(i,j)|^{s} \big)^{r/s} \tag{1.6} \]
with $C_{r,s} > 0$. For the Gaussian band ensemble (1.1), $\mu \le 8$.

Lemmas W and F together easily imply Theorem 1. Indeed, it suffices to show that the second factor on the right-hand side of (1.6) is uniformly bounded. But it follows from Lemma W that
\[ \mathbb{E}\big( |G_{W;N}(i,j)|^{s} \big) \le \frac{\kappa^{s} W^{s\sigma}}{1-s}. \tag{1.7} \]
This observation, which is the basis of the fractional moment analysis of random Schrödinger operators [1–3], follows easily from (1.5) since
\[ \mathbb{E}\big( |G_{W;N}(i,j)|^{s} \big) = s \int_0^{\infty} \mathrm{Prob}\big( |G_{W;N}(i,j)| > t \big)\, t^{s-1}\, dt, \tag{1.8} \]
and probabilities are bounded by one.

¹ The idea to study localization via these two complementary estimates was suggested in the context of random Schrödinger operators by Michael Aizenman, and is inspired by the Dobrushin-Shlosman proof [10] of the Mermin-Wagner Theorem [17] on the absence of continuous symmetry breaking in classical statistical mechanics of dimension 2.
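The passage from the tail bound (1.5) to the fractional moment bound (1.7) can be checked numerically. The following is a minimal sketch, not part of the original argument: it inserts the bound $\min(1, a/t)$ with $a = \kappa W^{\sigma}$ into the right-hand side of (1.8) and compares the integral with $a^{s}/(1-s)$. The function names and the quadrature parameters are illustrative assumptions.

```python
import math

def tail_integral(a, s, n=100_000, span=60.0):
    """Approximate s * int_0^inf min(1, a/t) * t**(s-1) dt for 0 < s < 1.

    Substituting t = a*exp(u) turns the integral into
    a**s * s * int exp(s*u - max(u, 0)) du, evaluated by the trapezoid rule.
    """
    lo, hi = -span / s, span / (1.0 - s)   # integrand is ~exp(-span) at both ends
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        u = lo + k * h
        f = math.exp(s * u - max(u, 0.0))
        total += f if 0 < k < n else 0.5 * f
    return (a ** s) * s * h * total

def closed_form(a, s):
    """Right-hand side of (1.7) with a = kappa * W**sigma."""
    return a ** s / (1.0 - s)

if __name__ == "__main__":
    for s in (0.25, 0.5, 0.75):
        print(s, tail_integral(3.0, s), closed_form(3.0, s))
```

The two columns should agree to several digits, which reflects the elementary computation $a^{s} + s\,a\int_a^\infty t^{s-2}\,dt = a^{s}/(1-s)$.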


J. Schenker

It may not be immediately clear what Lemma F has to do with large fluctuations. Towards understanding this, let $X = \ln |G_{W;N}(i,j)|$. By the Hölder inequality,
\[ \mathbb{E}(e^{rX}) \le \mathbb{E}(e^{sX})^{r/s}. \tag{1.9} \]
Furthermore, equality holds only if $X$ is non random, that is, if there is $x_0 \in \mathbb{R}$ so that $X = x_0$ almost surely. In other words
\[ \mathbb{E}(e^{rX}) = e^{-h(r,s)}\, \mathbb{E}(e^{sX})^{r/s} \tag{1.10} \]
with $h(r,s) > 0$ unless $X$ is non random. If $X$ were Gaussian with variance $\sigma^2$ (and arbitrary mean), then $h(r,s)$ would be proportional to the variance:
\[ h(r,s) = \frac{r(s-r)}{2}\, \sigma^2. \tag{1.11} \]

For a general random variable $X$, the associated quantity $h(r,s)$ may be taken as a measure of the fluctuations of $X$. In place of (1.11), we have the following identity for $h$ in terms of the variance of $X$ in weighted ensembles:

Proposition 3. Let $X$ be a random variable with $\mathbb{E}(e^{sX}) < \infty$ for some $s > 0$. If $r \in (0, s)$, then $\mathbb{E}(e^{rX}) < \infty$ and
\[ h(r,s) = \frac{1}{s} \int_0^{s} \min(r,q)\, \big(s - \max(r,q)\big)\, \mathrm{Var}_q(X)\, dq, \tag{1.12} \]
where $h(r,s)$ is defined by (1.10) and
\[ \mathrm{Var}_q(X) = \frac{\mathbb{E}(X^2 e^{qX})}{\mathbb{E}(e^{qX})} - \left( \frac{\mathbb{E}(X e^{qX})}{\mathbb{E}(e^{qX})} \right)^{2} \tag{1.13} \]
is the variance of $X$ with respect to the weighted probability measure $\mathrm{Prob}_q(A) = \mathbb{E}(\chi_A e^{qX}) / \mathbb{E}(e^{qX})$.

Proof. Hölder's inequality is the statement that the function $\Phi(r) = \ln \mathbb{E}(e^{rX})$ is convex. In particular, if $s > 0$ then
\[ \Phi(r) \le \frac{r}{s}\, \Phi(s) \tag{1.14} \]
for $r \in (0, s)$, since $\Phi(0) = \ln \mathbb{E}(1) = 0$. If $\mathbb{E}(e^{sX}) < \infty$, it follows that $\Phi$ is bounded on $[0, s]$. The identity (1.12) follows from Taylor's formula with remainder. Indeed, the second derivative of $\Phi$ at $r$ is equal to the weighted variance $\mathrm{Var}_r(X)$. Thus,
\[ \Phi(s) = \Phi(r) + \Phi'(r)(s - r) + \int_r^{s} (s - q)\, \mathrm{Var}_q(X)\, dq, \tag{1.15} \]
and
\[ 0 = \Phi(0) = \Phi(r) - \Phi'(r)\, r + \int_0^{r} q\, \mathrm{Var}_q(X)\, dq. \tag{1.16} \]
Taking a convex combination of these identities, with weights $r/s$ and $(s-r)/s$ chosen so the first order terms cancel, gives
\[ \frac{r}{s}\, \Phi(s) = \Phi(r) + \int_0^{r} \frac{(s-r)\,q}{s}\, \mathrm{Var}_q(X)\, dq + \int_r^{s} \frac{(s-q)\,r}{s}\, \mathrm{Var}_q(X)\, dq = \Phi(r) + \frac{1}{s} \int_0^{s} \min(r,q)\big(s - \max(r,q)\big)\, \mathrm{Var}_q(X)\, dq, \tag{1.17} \]
which is equivalent to (1.12).
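As a sanity check on Proposition 3, the identity (1.12) can be tested on a random variable with finitely many values, for which every expectation is a finite sum. The sketch below is illustrative only; the helper names, the sample distribution, and the midpoint quadrature are assumptions of this example, not part of the paper.

```python
import math

def phi(r, xs, ps):
    """Log moment generating function ln E(exp(r*X)) of a finite-valued X."""
    return math.log(sum(p * math.exp(r * x) for x, p in zip(xs, ps)))

def var_q(q, xs, ps):
    """Variance of X under the tilted measure with weights p_i * exp(q * x_i), as in (1.13)."""
    w = [p * math.exp(q * x) for x, p in zip(xs, ps)]
    z = sum(w)
    m1 = sum(wi * x for wi, x in zip(w, xs)) / z
    m2 = sum(wi * x * x for wi, x in zip(w, xs)) / z
    return m2 - m1 * m1

def h_direct(r, s, xs, ps):
    """h(r, s) read off from its definition (1.10)."""
    return (r / s) * phi(s, xs, ps) - phi(r, xs, ps)

def h_integral(r, s, xs, ps, n=20_000):
    """Right-hand side of (1.12), by the midpoint rule on [0, s]."""
    step = s / n
    total = 0.0
    for k in range(n):
        q = (k + 0.5) * step
        total += min(r, q) * (s - max(r, q)) * var_q(q, xs, ps)
    return total * step / s

if __name__ == "__main__":
    xs, ps = [-1.0, 0.0, 2.0], [0.2, 0.5, 0.3]
    print(h_direct(0.3, 0.8, xs, ps), h_integral(0.3, 0.8, xs, ps))
```

Both numbers should coincide up to quadrature error, and both are strictly positive because the sample distribution is not concentrated at a point.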

Thus Lemma F may be understood as giving a lower bound on the fluctuations of $X = \ln |G_{W;N}(i,j)|$, as measured by the improvement to Hölder's inequality. The proof of this result will be accomplished using a product formula for $G_{W;N}(i,j)$ that expresses this quantity as a matrix element of a product of $O(|i-j|/W)$ matrices of size $W \times W$. Proposition 3 will be applied to factors in this product, with each factor contributing a term of size $1/W^{7}$ to $h(r,s)$. Since there are $O(|i-j|/W)$ terms, this produces the claimed decay.

The strategy taken below in proving Lemmas W and F has two parts. First we identify certain axioms for the distribution of $X_{W;N}$ which lead naturally to the lemmas. Second, we verify that the Gaussian band ensemble (1.1) satisfies these axioms. To motivate the form of the axioms for the distribution of $X_{W;N}$, we begin in Sect. 2 with a self-contained sketch of the argument in the tri-diagonal case $W = 2$. In Sect. 3 we state the assumptions needed to adapt the proof to $W > 2$, state the associated general results and prove Lemma W. In Sect. 4 we get to the heart of the matter and prove Lemma F. In Sect. 5, we discuss examples of ensembles, including the Gaussian band ensemble (1.1), satisfying the axioms of Sect. 3. In an appendix, an elementary probability lemma used below is stated and proved.

1.2. Remarks on the literature and open problems. In [7,8] it was observed, based on numerical evidence, that the localization of eigenfunctions and eigenvalue statistics of the Gaussian band ensemble (1.1) are essentially determined by the parameter $W^2/N$. When $W^2/N \ll 1$ the eigenfunctions are strongly localized and the eigenvalue process is close to a Poisson process. When $W^2/N \gg 1$ the eigenfunctions are extended and the eigenvalue statistics are well described by the Gaussian unitary ensemble (GUE). A theoretical physics explanation of these numerical results was given by Fyodorov and Mirlin [13]. They considered a slightly different ensemble in which a full GUE matrix is modified by multiplying each element by a factor which decays exponentially in the distance from the diagonal. For this model, on the basis of super-symmetric functional integrals, they obtain an effective $\sigma$-model approximation which, at the level of saddle point analysis, shows a localization/delocalization transition at $W \approx \sqrt{N}$.

Theorem 1 is consistent with the above picture. However, [7,8,13] suggest that the proper exponent on the r.h.s. of (1.3) would be $\mu = 2$.

Problem 1. What is the optimal value of $\mu$ in (1.3)? In particular, does this equation hold with $\mu = 2$?

In the physics literature, the nature of eigenvalue processes in the large $N$ limit is generally expected to be related to localization properties of the eigenfunctions, with Poisson statistics corresponding to localized eigenfunctions and Wigner-Dyson statistics


corresponding to extended eigenfunctions. Let us call this idea the “statistics/localization diagnostic.” (In the context of band random matrices, a vector $v$ is a function on the index set $\{1, \dots, N\}$, namely $v(i)$ = $i$-th coordinate of $v$. The statistics/localization diagnostic suggests that the eigenvalues of a random matrix should be approximately uncorrelated if a typical eigenvector is essentially supported on a vanishing fraction of $\{1, \dots, N\}$, and should show strong correlations if it is typically spread over more or less the entire index set.)

The extreme cases $W = 1$ and $W = N$ of the Gaussian band ensemble (1.1) are consistent with the statistics/localization diagnostic. Indeed, with $W = 1$, the matrix is diagonal and the eigenvalues, which are just the diagonal entries $d_{j,j}$, are independent. After suitable rescaling the eigenvalue process converges to a Poisson process in the large $N$ limit. (This is essentially the definition of a Poisson process.) Likewise the eigenfunctions are the elementary basis vectors $e_i(j) = \delta_i(j)$, which are localized on single sites. On the other hand, with $W = N$ the matrix $X_{W;N}$ is sampled from the GUE. In this case, the eigenfunctions together form a uniformly distributed orthonormal frame, so they are completely extended, and a suitable rescaling of the eigenvalue process converges in distribution to an explicit determinantal point process as calculated by Dyson [11,12].

Based on the statistics/localization diagnostic, it is reasonable to conjecture that Poisson statistics hold for local fluctuations of the eigenvalues of $X_{W;N}$ in a limit $N \to \infty$ with $W = W(N) \to \infty$ provided $W(N)^{\mu}/N \to 0$. (One must be a little careful with the diagnostic, as it is easy to concoct random matrices with totally extended eigenfunctions and arbitrary statistics: put $N$ random numbers with any given joint distribution on the diagonal of a matrix and conjugate the result with a random unitary!
Of course, in that ensemble the matrix elements will most likely be highly correlated. Thus, it remains plausible that the statistics/localization diagnostic is correct, at least, for matrices with independent matrix elements.)

For random Schrödinger operators, Minami has derived Poisson statistics for the local correlations of the eigenvalue process from exponential decay of the resolvent [18]. Some aspects of Minami's proof translate to the present context. Most notably, the so-called Minami estimate, which bounds the probability of having two eigenvalues in a small interval,
\[ \frac{1}{N^{2}}\, \mathrm{Prob}\big( \#\{\lambda_j \in I\} \ge 2 \big) \le C_W\, |I|^{2}, \tag{1.18} \]
where $|I|$ is the length of the interval and $\lambda_1 \le \cdots \le \lambda_N$ are the eigenvalues of $X_{W;N}$, holds with
\[ C_W \propto W^{2\sigma}. \tag{1.19} \]
Here $\sigma$ is as in Thm. 1. (The proof of this fact may be accomplished by following Minami's argument or by one of the various alternatives that have appeared recently in the literature [5,9,14].) However, one crucial ingredient is missing: we lack sufficient control on the convergence of the density of states. The density of states of $X_{W;N}$ is the measure $\kappa_{W;N}(\lambda)\,d\lambda$ on the real line giving the density of the eigenvalue process:
\[ \int_I \kappa_{W;N}(\lambda)\, d\lambda = \frac{1}{N}\, \mathbb{E}\big( \#\{\lambda_j \in I\} \big). \tag{1.20} \]


As indicated, $\kappa_{W;N}(\lambda)\,d\lambda$ is absolutely continuous. In fact, it follows from the Wegner estimate, (3.8) below, that
\[ \kappa_{W;N}(\lambda) \lesssim W^{\sigma}, \tag{1.21} \]
so analogous to (1.18) we have
\[ \frac{1}{N}\, \mathrm{Prob}\big( \#\{\lambda_j \in I\} \ge 1 \big) \le \frac{1}{N}\, \mathbb{E}\big( \#\{\lambda_j \in I\} \big) = \int_I \kappa_{W;N}(\lambda)\, d\lambda \le W^{\sigma} |I|. \tag{1.22} \]
(In fact, the Minami estimate is proved in a similar way, by showing that the expected number of eigenvalue pairs in $I$ is bounded by the r.h.s. of (1.18).)

To study local fluctuations of the eigenvalue processes near $\lambda_0 \in \mathbb{R}$, it is natural to consider the re-centered and re-scaled process
\[ \widetilde{\lambda}_j = N(\lambda_j - \lambda_0), \tag{1.23} \]

which has mean spacing $O(1)$. We say that the eigenvalue process has Poisson statistics near $\lambda_0$, in some limit $W = W(N)$ and $N \to \infty$, if the point process $\{\widetilde{\lambda}_j\}$ converges to a Poisson process. The density of this Poisson process would then be given by the limit $\lim_{N \to \infty} \kappa_{W(N);N}(\lambda_0)$. The difficulty is we do not know that this limit exists.

Now, for a fairly general class of matrix ensembles with independent centered entries, e.g., for the Gaussian ensemble (1.1), it is known that the density of states $\kappa_{W;N}$ converges weakly to the semi-circle law, provided $W(N)/N \to 0$ or $1$ (see [19]). That is,
\[ \frac{1}{N}\, \mathbb{E}\, \mathrm{tr}\, f(X_{W(N);N}) = \int_{\mathbb{R}} f(\lambda)\, \kappa_{W;N}(\lambda)\, d\lambda \xrightarrow{N \to \infty} \frac{1}{2\pi} \int_{-2}^{2} f(t) \sqrt{4 - t^{2}}\, dt. \tag{1.24} \]
However, as indicated this is a weak convergence result, and it does not follow that
\[ \kappa_{W(N);N}(\lambda) \xrightarrow{N \to \infty} \frac{1}{2\pi} \sqrt{4 - \lambda^{2}}\; I[|\lambda| \le 2], \tag{1.25} \]
or even that
\[ \int_{(\lambda_0 - \frac{a}{N},\, \lambda_0 + \frac{b}{N})} \kappa_{W;N}(\lambda)\, d\lambda \xrightarrow{N \to \infty} \frac{1}{2\pi} \sqrt{4 - \lambda^{2}}\; I[|\lambda| \le 2], \tag{1.26} \]
which would in fact be sufficient to control the density of the putative limit process. In this regard, let us state a couple of open problems.

Problem 2. Improve the estimate (1.21). In particular, does this bound hold with $\sigma = 0$? (The interpretation of $\kappa_{W;N}(\lambda)/N$ as the mean eigenvalue spacing and the convergence (1.24) suggests that $\kappa$ should be bounded.)

Problem 3. Verify either (1.25) or (1.26).


2. Tridiagonal Matrices

The aim of this section is to motivate the assumptions on the distribution of $X_{W;N}$, spelled out below in Sect. 3, by examining separately the somewhat simpler case $W = 2$. Thus, consider for each $N \in \mathbb{N}$ a random tridiagonal matrix
\[ X_{2;N} = \begin{pmatrix} v_1 & t_1 & & & \\ t_1^{*} & v_2 & t_2 & & \\ & t_2^{*} & v_3 & \ddots & \\ & & \ddots & \ddots & t_{N-1} \\ & & & t_{N-1}^{*} & v_N \end{pmatrix}, \tag{2.1} \]
with $v_1, v_2, \dots$ and $t_1, t_2, \dots$ two given mutually independent sequences of independent random variables, real and complex valued respectively. For such matrices, exponential decay of the Green's function and localization of eigenfunctions can be obtained by the transfer matrix approach, see [6]. Here we use a different method, which is closely related to the technique of Kunz and Souillard [16].

As discussed above, the central technical estimate is a bound on $\mathbb{E}(|\langle e_i, (X_{2;N} - \lambda)^{-1} e_j \rangle|^{s})$, decaying exponentially in the distance $|i - j|$. To obtain this bound, it is convenient to assume that $(v_k)$ are identically distributed and likewise $(t_k)$. (This assumption could be replaced by uniformity in $k$ of certain bounds assumed below. Likewise, strict independence of $(v_k)$ is not really the issue. The argument could easily be adapted to the situation in which $(v_k)$ are generated by a distribution with finite range coupling, such as $\prod_k \rho(v_k - v_{k-1})\, dv_k$.) The distribution of $(t_k)$ can be arbitrary; these variables may even be deterministic, as in the case of random Jacobi matrices.

To facilitate the fluctuation argument proposed above we will suppose the common distribution of $v_k$ has a density $\rho$ with the following property:

Definition 1. We say that a probability density $\rho$ on $\mathbb{R}$ is fluctuation regular if there are constants $\epsilon, \delta > 0$ and a measurable set $\Omega \subset \mathbb{R}$ with $\int_{\Omega} \rho(v)\, dv > 0$ such that
\[ v \in \Omega \;\Rightarrow\; \frac{\rho(v_1)}{\rho(v_2)} \ge \delta \quad \text{for all } v_1, v_2 \in (v - \epsilon, v + \epsilon). \tag{2.2} \]

Remark. A sufficient condition for fluctuation regularity is that $\ln \rho$ is Lipschitz on some open interval. For example a uniform distribution $\rho(x) \propto \chi_{[a,b]}(x)$ is fluctuation regular. So are the Gaussian and Cauchy distributions. However, fluctuation regularity is quite a bit stronger than absolute continuity of the measure $\rho\, dx$, since it implies the existence of an open set on which $\rho$ is strictly positive.

Our goal in this section is to prove the following result:

Theorem 4. Let $(v_k)_{k=1}^{\infty}$ and $(t_k)_{k=1}^{\infty}$ be two sequences of i.i.d. random variables, real and complex valued respectively. Suppose that the common distribution of $v_k$ has a density $\rho$ which is bounded and fluctuation regular. Then for $0 < s < 1$ and $\Lambda > 0$ there are $A_s < \infty$ and $\mu_{s,\Lambda} > 0$ such that for all $\lambda \in [-\Lambda, \Lambda]$,
\[ \mathbb{E}\big( |\langle e_i, (X_{2;N} - \lambda)^{-1} e_j \rangle|^{s} \big) \le A_s\, e^{-\mu_{s,\Lambda} |i - j|}. \tag{2.3} \]
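The condition (2.2) of Definition 1 is easy to probe numerically for a concrete density. The sketch below, offered only as an illustration, takes the standard Gaussian density, a grid of points standing in for the set on which (2.2) is required, and a trial window half-width `eps`; it reports whether a trial constant `delta` satisfies (2.2) on that grid. All names and numerical values are assumptions of this example.

```python
import math

def gaussian_density(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def min_ratio(rho, v, eps, grid=200):
    """Smallest rho(v1)/rho(v2) over a grid of v1, v2 in [v - eps, v + eps]."""
    pts = [v - eps + 2.0 * eps * k / grid for k in range(grid + 1)]
    vals = [rho(p) for p in pts]
    return min(vals) / max(vals)

def satisfies_condition(rho, centers, eps, delta):
    """True if the ratio bound (2.2) holds at every listed center v."""
    return all(min_ratio(rho, v, eps) >= delta for v in centers)

if __name__ == "__main__":
    centers = [-1.0 + 0.1 * k for k in range(21)]   # grid on [-1, 1]
    print(satisfies_condition(gaussian_density, centers, eps=0.5, delta=0.36))
    print(satisfies_condition(gaussian_density, centers, eps=0.5, delta=0.5))
```

For the Gaussian the worst case on $[-1, 1]$ with `eps = 0.5` occurs at $v = \pm 1$, where the ratio is $e^{-1}$, so a trial value slightly below $e^{-1}$ is accepted and a larger one is rejected.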


Remark. We restrict $\lambda$ to a compact set $[-\Lambda, \Lambda]$ to facilitate the fluctuation argument below. In fact, for large $|\lambda|$ the rate of exponential decay will improve, although the mechanism will be somewhat different. One could construct a proof in this context along the lines of [3]. Thus the dependence of the mass of decay $\mu_{s,\Lambda}$ on $\Lambda$ may be dropped.

Let $g_N(i,j;\lambda) = \langle e_i, (X_{2;N} - \lambda)^{-1} e_j \rangle$. Recall that the decay of $\mathbb{E}(|g_N(i,j;\lambda)|^{s})$ was to be established in two steps, the first of those being Lemma W, which gives finiteness of the fractional moments. A preliminary observation is that Lemma W holds for these tridiagonal matrices:

Lemma 2.1 (Lemma W for $X_{2;N}$). Suppose that the distribution of $v_k$, $k = 1, \dots, N$, satisfies
\[ \mathrm{Prob}(v_k \in [a, b]) \le \frac{\kappa}{2\pi}\, |b - a| \tag{2.4} \]
for any interval $[a, b]$, with $\kappa$ a finite constant. Then
\[ \mathrm{Prob}\big( |g_N(i,j;\lambda)| > t \,\big|\, (v_k)_{k \ne i,j}, (t_k) \big) \le \frac{\kappa}{t}, \tag{2.5} \]
so, in particular,
\[ \mathrm{Prob}\big( |g_N(i,j;\lambda)| > t \big) \le \frac{\kappa}{t} \tag{2.6} \]
and
\[ \mathbb{E}\big( |g_N(i,j;\lambda)|^{s} \big) \le \frac{\kappa^{s}}{1 - s} \tag{2.7} \]
for $0 < s < 1$.

Remark. The l.h.s. of (2.5) is the conditional probability of the event $\{|g_N(i,j;\lambda)| > t\}$ at specified values of $(v_k)_{k \ne i,j}$ and $(t_k)$, that is, the probability conditioned on the $\sigma$-algebra generated by these variables. Equation (2.5) is a standard estimate from the fractional moment analysis of discrete random Schrödinger operators, see [1]. The main point of this result is that to bound $\mathbb{E}(|g_N(i,j;\lambda)|^{s})$, it is sufficient to average over $v_i$ and $v_j$.

The second part of the argument is to establish large fluctuations for $g_N(i,j;\lambda)$; this is Lemma F above. In the present context we have

Lemma 2.2 (Lemma F for $X_{2;N}$). Under the hypotheses of Theorem 4, for each $0 < r < s < 1$ and $\Lambda > 0$ there is a constant $0 < C_{r,s;\Lambda} < \infty$ such that
\[ \mathbb{E}\big( |g_N(i,j;\lambda)|^{r} \big) \le \exp\big( -C_{r,s;\Lambda} |i - j| \big)\, \mathbb{E}\big( |g_N(i,j;\lambda)|^{s} \big)^{r/s} \tag{2.8} \]
for $\lambda \in [-\Lambda, \Lambda]$.

Remark. Together Lemmas 2.1 and 2.2 prove Theorem 4.


Proof. Let us fix $\lambda$ for the moment and drop it from the notation: $g_N(i,j) = g_N(i,j;\lambda)$. Suppose without loss of generality that $i < j$. A preliminary observation is that
\[ g_N(i,j) = -g_{j-1}(i, j-1)\, t_{j-1}\, g_N(j,j), \tag{2.9} \]
which may be established using the resolvent identity, writing $X_{2;N}$ as a perturbation of the corresponding matrix with $t_{j-1}$ set equal to zero (which decouples into two distinct blocks). Iteration of this identity gives
\[ g_N(i,j) = (-1)^{j-i} \left[ \prod_{k=i}^{j-1} g_k(k,k)\, t_k \right] g_N(j,j). \tag{2.10} \]
Thus
\[ \ln |g_N(i,j)| = \sum_{k=i}^{j-1} \ln |t_k| + \sum_{k=i}^{j-1} \ln |g_k(k,k)| + \ln |g_N(j,j)|, \tag{2.11} \]
suggesting that if either $\ln |t_k|$ or $\ln |g_k(k,k)|$ were to exhibit fluctuations of order one, then the variance of $\ln |g_N(i,j)|$ would be of order $|i - j|$ and Lemma 2.2 would follow. However, there are substantial correlations between the various terms, making it difficult to proceed directly along this line of argument.

To make a precise analysis, let us consider the random variables
\[ \gamma_k = \frac{1}{g_k(k,k)}, \tag{2.12} \]
which are related by a recursion relation
\[ \gamma_k = v_k - \lambda - \frac{|t_{k-1}|^{2}}{\gamma_{k-1}}, \quad 2 \le k \le N, \tag{2.13} \]
with
\[ \gamma_1 = v_1 - \lambda. \tag{2.14} \]
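The product formula (2.10), combined with the recursion (2.13) and the initial value $\gamma_1 = v_1 - \lambda$ (the value of $1/g_1(1,1)$), can be checked on a small random matrix. The sketch below is an illustration under stated assumptions: indices are 0-based (so `v[k]` is $v_{k+1}$ and `t[k]` is $t_{k+1}$), the entries are Gaussian, and the resolvent entry is obtained from a plain Gaussian elimination written for this example.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (complex entries)."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = acc / m[r][r]
    return x

def resolvent_entry(v, t, lam, i, j):
    """<e_i, (X - lam)^{-1} e_j> for the tridiagonal X of (2.1), 0-based indices."""
    n = len(v)
    a = [[0j] * n for _ in range(n)]
    for k in range(n):
        a[k][k] = complex(v[k] - lam)
    for k in range(n - 1):
        a[k][k + 1] = t[k]                 # superdiagonal: t_k
        a[k + 1][k] = t[k].conjugate()     # subdiagonal: conjugate of t_k
    e_j = [1 + 0j if k == j else 0j for k in range(n)]
    return solve(a, e_j)[i]

def product_formula(v, t, lam, i, j):
    """Right-hand side of (2.10) with g_k(k,k) = 1/gamma_k, valid for i < j."""
    gamma = [complex(v[0] - lam)]
    for k in range(1, j):                  # recursion (2.13)
        gamma.append(v[k] - lam - abs(t[k - 1]) ** 2 / gamma[k - 1])
    prod = 1 + 0j
    for k in range(i, j):
        prod *= t[k] / gamma[k]
    return (-1) ** (j - i) * prod * resolvent_entry(v, t, lam, j, j)

if __name__ == "__main__":
    random.seed(1)
    n = 8
    v = [random.gauss(0, 1) for _ in range(n)]
    t = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n - 1)]
    print(resolvent_entry(v, t, 0.3, 1, 6), product_formula(v, t, 0.3, 1, 6))
```

The two printed numbers should agree up to rounding, which is the content of (2.10) once $g_k(k,k)$ is replaced by $1/\gamma_k$.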

These identities may be established using the Schur-complement formula. In a similar way, the Schur-complement formula may be used to show that
\[ \frac{1}{g_N(j,j)} = v_j - \lambda - \frac{|t_{j-1}|^{2}}{\gamma_{j-1}} - |t_j|^{2}\, \widetilde{G}_{j+1} = \gamma_j - |t_j|^{2}\, \widetilde{G}_{j+1}, \tag{2.15} \]
where $\widetilde{G}_{j+1} = \langle e_{j+1}, (\widetilde{X}_{2;N} - \lambda)^{-1} e_{j+1} \rangle$ with $\widetilde{X}_{2;N}$ the matrix obtained from $X_{2;N}$ by setting $t_j = 0$:
\[ \widetilde{X}_{2;N} = \begin{pmatrix} \ddots & \ddots & & & & \\ \ddots & v_{j-1} & t_{j-1} & & & \\ & t_{j-1}^{*} & v_j & 0 & & \\ & & 0 & v_{j+1} & t_{j+1} & \\ & & & t_{j+1}^{*} & v_{j+2} & \ddots \\ & & & & \ddots & \ddots \end{pmatrix}. \tag{2.16} \]
In particular, $\widetilde{G}_{j+1}$ is a function of the variables $(v_k)_{k=j+1}^{N}$ and $(t_k)_{k=j+1}^{N}$.


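We now make a change of variables $v_k \to \gamma_k$ in our probability space. The Jacobian is triangular with ones on the diagonal and therefore has determinant one. Thus
\[ \text{joint distribution of } (\gamma_k)_{k=1}^{N} \text{ given } (t_k)_{k=1}^{N} = \rho(\gamma_1 + \lambda) \prod_{k=2}^{N} \rho\Big( \gamma_k + \lambda + \frac{|t_{k-1}|^{2}}{\gamma_{k-1}} \Big) \prod_{k=1}^{N} d\gamma_k. \tag{2.17} \]
So $\gamma_k$ is a chain of variables with nearest neighbor couplings; thinking of $k$ as a time parameter, $\{\gamma_k\}$ is a Markov chain. In terms of these variables, we have
\[ g_N(i,j) = (-1)^{|i-j|} \prod_{k=i}^{j-1} \frac{t_k}{\gamma_k} \times \frac{1}{\gamma_j - |t_j|^{2}\, \widetilde{G}_{j+1}}, \tag{2.18} \]
where $\widetilde{G}_{j+1}$ may be written as a function of $(\gamma_k)_{k=j}^{N}$ and $(t_k)_{k=j}^{N}$, since $v_k = \gamma_k + \lambda + |t_{k-1}|^{2}/\gamma_{k-1}$.

A useful trick for analyzing fluctuations in this context, inspired by the Dobrushin-Shlosman analysis of continuous symmetries in 2D classical statistical mechanics [10], is to couple the system to a family of independent identically distributed random variables $\alpha_2, \alpha_5, \dots$, each with absolutely continuous distribution $H(\alpha_k)\, d\alpha_k$. For technical reasons, which will become apparent below, we introduce $\alpha_k$ only for $k \equiv 2 \bmod 3$. Let us define
\[ f_k = e^{\alpha_k} \gamma_k, \tag{2.19} \]
where we take $\alpha_k = 0$ for $k \not\equiv 2 \bmod 3$. The Jacobian determinant of the transformation $(\gamma_k, \alpha_k) \to (f_k, \alpha_k)$ is $\prod_{k = 2, 5, 8, \dots}^{N} e^{-\alpha_k}$, so
\[ \text{joint distribution of } (f_k)_{k=1}^{N} \text{ and } (\alpha_k)_{k=1}^{N}, \text{ given } (t_k)_{k=1}^{N} = \prod_{\substack{1 \le k \le N \\ k \equiv 2 \bmod 3}} \rho\Big( f_{k-1} + \lambda + \frac{|t_{k-2}|^{2}}{f_{k-2}} \Big)\, \rho\Big( e^{-\alpha_k} f_k + \lambda + \frac{|t_{k-1}|^{2}}{f_{k-1}} \Big)\, \rho\Big( f_{k+1} + \lambda + e^{\alpha_k} \frac{|t_k|^{2}}{f_k} \Big)\, H(\alpha_k)\, e^{-\alpha_k}\; df_{k-1}\, df_k\, df_{k+1}\, d\alpha_k, \tag{2.20} \]
with the convention that $t_0 = 0$.

We now fix $(f_k)_{k=1}^{N}$, and consider the conditional distribution of $(\alpha_k)_{k=1}^{N}$, which carries some information on the distribution of $(\gamma_k)_{k=1}^{N}$. A key point is that the variables $\alpha_k$ remain independent after conditioning. They are, however, no longer identically distributed. Instead,
\[ \text{distribution of } \alpha_k \text{ given } (t_\ell)_{\ell=1}^{N} \text{ and } (f_\ell)_{\ell=1}^{N} = \frac{1}{Z_k}\, \rho\Big( e^{-\alpha_k} f_k + \lambda + \frac{|t_{k-1}|^{2}}{f_{k-1}} \Big)\, \rho\Big( f_{k+1} + \lambda + e^{\alpha_k} \frac{|t_k|^{2}}{f_k} \Big)\, H(\alpha_k)\, e^{-\alpha_k}\, d\alpha_k \tag{2.21} \]
with
\[ Z_k = \int \rho\Big( e^{-\alpha} f_k + \lambda + \frac{|t_{k-1}|^{2}}{f_{k-1}} \Big)\, \rho\Big( f_{k+1} + \lambda + e^{\alpha} \frac{|t_k|^{2}}{f_k} \Big)\, H(\alpha)\, e^{-\alpha}\, d\alpha. \tag{2.22} \]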


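We now express $g_N(i,j)$ in terms of the variables $(t_\ell, f_\ell, \alpha_\ell)_{\ell=1}^{N}$,
\[ g_N(i,j) = (-1)^{j-i} \Bigg[ \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} e^{\alpha_k} \Bigg] \Bigg[ \prod_{k=i}^{j-1} \frac{t_k}{f_k} \Bigg] \widetilde{H}_{j+1}, \tag{2.23} \]
where
\[ \widetilde{H}_{j+1} = \frac{1}{e^{-\alpha_j} f_j - |t_j|^{2}\, \widetilde{G}_{j+1}} \tag{2.24} \]
is a function of $(t_\ell, f_\ell, \alpha_\ell)_{\ell=j}^{N}$. By the conditional independence of $(\alpha_k)$ we find that
\[ \mathbb{E}\Big( |g_N(i,j)|^{r} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N}, (\alpha_\ell)_{\ell=j}^{N} \Big) = \Bigg( \prod_{k=i}^{j-1} \frac{|t_k|^{r}}{|f_k|^{r}} \Bigg) \big| \widetilde{H}_{j+1} \big|^{r} \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} \mathbb{E}\Big( e^{r \alpha_k} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N} \Big). \tag{2.25} \]
Applying Proposition 3 to each factor $\mathbb{E}\big( e^{r\alpha_k} \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big)$ on the right hand side, we find that
\[ \mathbb{E}\Big( |g_N(i,j)|^{r} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N}, (\alpha_\ell)_{\ell=j}^{N} \Big) = \Bigg( \prod_{k=i}^{j-1} \frac{|t_k|^{r}}{|f_k|^{r}} \Bigg) \big| \widetilde{H}_{j+1} \big|^{r} \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} e^{-h_k(r,s)}\, \mathbb{E}\Big( e^{s \alpha_k} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N} \Big)^{r/s}, \tag{2.26} \]
with
\[ h_k(r,s) = \frac{1}{s} \int_0^{s} \min(r,q)\, \big(s - \max(r,q)\big)\, \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big)\, dq, \tag{2.27} \]
and
\[ \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big) = \inf_{m \in \mathbb{R}} \frac{\mathbb{E}\big( (\alpha_k - m)^{2} e^{q \alpha_k} \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big)}{\mathbb{E}\big( e^{q \alpha_k} \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big)}. \tag{2.28} \]
Using the conditional independence of $(\alpha_k)$ once again to reassemble $g_N(i,j)$ inside the expectation on the r.h.s., we find that
\[ \mathbb{E}\Big( |g_N(i,j)|^{r} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N}, (\alpha_\ell)_{\ell=j}^{N} \Big) = e^{-\sum_{k=i}^{j-1} h_k(r,s)}\, \mathbb{E}\Big( |g_N(i,j)|^{s} \,\Big|\, (t_\ell, f_\ell)_{\ell=1}^{N}, (\alpha_\ell)_{\ell=j}^{N} \Big)^{r/s}, \tag{2.29} \]
where we have set $h_k(r,s) = 0$ for $k \not\equiv 2 \bmod 3$. After averaging and applying the Hölder inequality, we conclude that
\[ \mathbb{E}\big( |g_N(i,j)|^{r} \big) \le \mathbb{E}\Big( e^{-\frac{s}{s-r} \sum_{k=i}^{j-1} h_k(r,s)} \Big)^{\frac{s-r}{s}}\, \mathbb{E}\big( |g_N(i,j)|^{s} \big)^{r/s}. \tag{2.30} \]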


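Equation (2.30) is the key result. The exponent in the first factor is a sum of $O(N)$ non-negative terms, each presumably $O(1)$ and positive with positive probability. It will not be so surprising to find that the term itself is $O(N)$ with good probability. The rest is estimates.

To proceed with the estimates, let us take the a priori distribution of $\alpha_k$, before coupling and conditioning, to be uniform in an interval $[-\eta, \eta]$ centered at the origin:
\[ H(\alpha) = \frac{1}{2\eta}\, I[|\alpha| < \eta], \tag{2.31} \]
with $\eta$ to be chosen below. Although $\mathrm{Var}_q(\alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N})$ is defined as a function of $(t_\ell, f_\ell)_{\ell=1}^{N}$, it is useful to express it in terms of the variables $(t_\ell, \gamma_\ell, \alpha_\ell)_{\ell=1}^{N}$:
\[ \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big) = \inf_{m \in \mathbb{R}} \frac{\int_{-\eta}^{\eta} (\alpha - m)^{2}\, e^{(q-1)\alpha}\, \nu_k(\alpha)\, d\alpha}{\int_{-\eta}^{\eta} e^{(q-1)\alpha}\, \nu_k(\alpha)\, d\alpha} \tag{2.32} \]
with
\[ \nu_k(\alpha) = \rho\Big( e^{\alpha_k - \alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^{2}}{\gamma_{k-1}} \Big)\, \rho\Big( \gamma_{k+1} + \lambda + e^{\alpha - \alpha_k} \frac{|t_k|^{2}}{\gamma_k} \Big). \tag{2.33} \]
A lower bound for $\mathrm{Var}_q(\alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N})$, sufficient for our purposes, is
\[ \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big) \ge \frac{1}{3}\, e^{-2|q-1|\eta}\, \eta^{2}\, \frac{\inf_{-\eta < \alpha < \eta} \nu_k(\alpha)}{\sup_{-\eta < \alpha < \eta} \nu_k(\alpha)}. \tag{2.34} \]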
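The bound (2.34) compares the weighted variance on $[-\eta, \eta]$ with the variance $\eta^{2}/3$ of the uniform distribution, losing the ratio of the smallest to the largest value of the weight. A rough numerical illustration follows; the function `nu` is a hypothetical stand-in for $\nu_k$, and the quadrature and parameter values are assumptions of this sketch rather than anything taken from the proof.

```python
import math

def weighted_variance(q, eta, nu=lambda a: 1.0, n=4000):
    """Variance of alpha on [-eta, eta] under the weight exp((q-1)*alpha) * nu(alpha).

    The infimum over m in (2.32) is attained at the weighted mean.
    """
    h = 2.0 * eta / n
    pts = [-eta + (k + 0.5) * h for k in range(n)]          # midpoint rule
    w = [math.exp((q - 1.0) * a) * nu(a) for a in pts]
    z = sum(w)
    mean = sum(wi * a for wi, a in zip(w, pts)) / z
    return sum(wi * (a - mean) ** 2 for wi, a in zip(w, pts)) / z

def lower_bound(q, eta, nu=lambda a: 1.0, n=4000):
    """Right-hand side of (2.34), with inf and sup of nu taken over the same grid."""
    h = 2.0 * eta / n
    vals = [nu(-eta + (k + 0.5) * h) for k in range(n)]
    ratio = min(vals) / max(vals)
    return math.exp(-2.0 * abs(q - 1.0) * eta) * eta ** 2 / 3.0 * ratio

if __name__ == "__main__":
    for q in (0.2, 0.5, 0.8):
        for eta in (0.1, 0.5):
            print(q, eta, weighted_variance(q, eta), lower_bound(q, eta))
```

In each row the first number should not be smaller than the second; the gap closes as $\eta \to 0$, where both approach $\eta^{2}/3$.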

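The r.h.s. still carries some dependence on $\alpha_k$, through the density $\nu_k$. We may eliminate the dependence on $\alpha_k$ entirely by bounding the right-hand side from below:
\[ \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big) \ge \frac{1}{3}\, \eta^{2}\, e^{-2|q-1|\eta} \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\Big( e^{-\alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^{2}}{\gamma_{k-1}} \Big)\, \rho\Big( \gamma_{k+1} + \lambda + e^{\alpha} \frac{|t_k|^{2}}{\gamma_k} \Big)}{\rho\Big( e^{-\beta} \gamma_k + \lambda + \frac{|t_{k-1}|^{2}}{\gamma_{k-1}} \Big)\, \rho\Big( \gamma_{k+1} + \lambda + e^{\beta} \frac{|t_k|^{2}}{\gamma_k} \Big)}. \tag{2.35} \]
It is useful to write
\[ e^{-\alpha} \gamma_k + \lambda + \frac{|t_{k-1}|^{2}}{\gamma_{k-1}} = (e^{-\alpha} - 1)\gamma_k + v_k, \]
and similarly for the term in the denominator and the term with index $k + 1$. Finally, the r.h.s. is no larger if we factor the infimum on the right hand side,
\[ \mathrm{Var}_q\big( \alpha_k \,|\, (t_\ell, f_\ell)_{\ell=1}^{N} \big) \ge \frac{1}{3}\, \eta^{2}\, e^{-2|q-1|\eta} \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_k + (e^{-\alpha} - 1)\gamma_k \big)}{\rho\big( v_k + (e^{-\beta} - 1)\gamma_k \big)} \times \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\Big( v_{k+1} + (e^{\alpha} - 1) \frac{|t_k|^{2}}{\gamma_k} \Big)}{\rho\Big( v_{k+1} + (e^{\beta} - 1) \frac{|t_k|^{2}}{\gamma_k} \Big)}. \tag{2.36} \]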


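On the r.h.s., the only dependence on $q$ is in the exponential term. In the integral (2.27), there is not much loss in replacing this exponential by the (smaller) $e^{-2|s-1|\eta}$, so that
\[ \frac{s}{s-r}\, h_k(r,s) \ge \frac{rs}{6}\, \eta^{2}\, e^{-2|s-1|\eta}\, U_k(\eta), \tag{2.37} \]
with
\[ U_k(\eta) = \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\big( v_k + (e^{-\alpha} - 1)\gamma_k \big)}{\rho\big( v_k + (e^{-\beta} - 1)\gamma_k \big)}\; \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{\rho\Big( v_{k+1} + (e^{\alpha} - 1) \frac{|t_k|^{2}}{\gamma_k} \Big)}{\rho\Big( v_{k+1} + (e^{\beta} - 1) \frac{|t_k|^{2}}{\gamma_k} \Big)}. \tag{2.38} \]
Here we have used $\frac{1}{s} \int_0^{s} \min(r,q)(s - \max(r,q))\, dq = \frac{r(s-r)}{2}$. Plugging this estimate into Eq. (2.30), we obtain
\[ \mathbb{E}\big( |g_N(i,j)|^{r} \big) \le \mathbb{E}\Big( e^{-\frac{rs}{6} \eta^{2} e^{-2|s-1|\eta} \sum_{k=i}^{j-1} U_k(\eta)} \Big)^{\frac{s-r}{s}}\, \mathbb{E}\big( |g_N(i,j)|^{s} \big)^{r/s}. \tag{2.39} \]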

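Since $\rho$ is fluctuation regular, there are $\delta, \epsilon > 0$ and a set $\Omega \subset \mathbb{R}$ with $\int_{\Omega} \rho\, dx = q_0 > 0$ such that $U_k(\eta) \ge \delta^{2} I[A_k]$, where $I[A_k]$ is the indicator function of the event
\[ A_k = \Big\{ v_k, v_{k+1} \in \Omega,\; |\gamma_k| \le \frac{\epsilon}{e^{2\eta} - 1},\; \text{and } \frac{|t_k|^{2}}{|\gamma_k|} \le \frac{\epsilon}{e^{2\eta} - 1} \Big\}. \tag{2.40} \]
In turn, since $\gamma_k = v_k - \lambda - |t_{k-1}|^{2}/\gamma_{k-1}$ and $|\lambda| \le \Lambda$ (by assumption), we see that
\[ A_k \supset \{ v_k \in \Omega, |v_k| \le L \} \cap \{ v_{k+1} \in \Omega \} \cap \{ |t_{k-1}|, |t_k| \le \tau \} \cap \Big\{ \frac{1}{|\gamma_{k-1}|} \le \frac{1}{\tau^{2}} \Big( \frac{\epsilon}{e^{2\eta} - 1} - L - \Lambda \Big) \Big\} \cap \Big\{ \frac{1}{|\gamma_k|} \le \frac{1}{\tau^{2}} \frac{\epsilon}{e^{2\eta} - 1} \Big\}, \tag{2.41} \]
with $\tau$ and $L$ any positive numbers. We estimate the probability of $A_k$ from below by integrating Eq. (2.41) over $v_{k+1}$, $v_k$, $v_{k-1}$, $t_k$, and $t_{k-1}$ in that order. (The need to integrate over three consecutive $v$ variables is the reason we introduced $\alpha_k$ only for $k \equiv 2 \bmod 3$.) To begin,
\[ \mathrm{Prob}\big( v_{k+1} \in \Omega \,\big|\, (v_l)_{l \ne k+1}, (t_l) \big) = \int_{\Omega} \rho(v)\, dv = q_0. \tag{2.42} \]
Looking now at $v_k$, since $\gamma_k = v_k - \lambda - |t_{k-1}|^{2}/\gamma_{k-1}$, we see that
\[ \{ v_k \in \Omega, |v_k| \le L \} \cap \Big\{ \frac{1}{|\gamma_k|} \le \frac{1}{\tau^{2}} \frac{\epsilon}{e^{2\eta} - 1} \Big\} = \{ v_k \in \Omega \} \cap \{ |v_k| \le L \} \cap \Big\{ v_k \notin \Big[ a - \tau^{2} \frac{e^{2\eta} - 1}{\epsilon},\; a + \tau^{2} \frac{e^{2\eta} - 1}{\epsilon} \Big] \Big\} \tag{2.43} \]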


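with $a = \lambda + |t_{k-1}|^{2}/\gamma_{k-1}$. Since the density $\rho$ is bounded, it follows that
\[ \mathrm{Prob}\Big( v_k \in \Omega,\; |v_k| \le L,\; \frac{1}{|\gamma_k|} \le \frac{1}{\tau^{2}} \frac{\epsilon}{e^{2\eta} - 1} \,\Big|\, (v_l)_{l \ne k, k+1}, (t_l) \Big) \ge q_0 - \mathrm{Prob}(|v_k| > L) - 2 \|\rho\|_{\infty}\, \tau^{2}\, \frac{e^{2\eta} - 1}{\epsilon}. \tag{2.44} \]
Similarly
\[ \mathrm{Prob}\Big( \frac{1}{|\gamma_{k-1}|} \le \frac{1}{\tau^{2}} \Big( \frac{\epsilon}{e^{2\eta} - 1} - L - \Lambda \Big) \,\Big|\, (v_l)_{l \ne k-1, k, k+1}, (t_l) \Big) \ge 1 - 2 \|\rho\|_{\infty}\, \tau^{2}\, \frac{1}{\frac{\epsilon}{e^{2\eta} - 1} - L - \Lambda}. \tag{2.45} \]
Combining these estimates with Eq. (2.41) and integrating over the identically distributed variables $t_k$ and $t_{k-1}$, we find
\[ \mathrm{Prob}\big( A_k \,\big|\, (v_l)_{l \ne k-1, k, k+1}, (t_l)_{l \ne k, k-1} \big) \ge q_0 \Big( q_0 - \mathrm{Prob}(|v_k| > L) - 2 \|\rho\|_{\infty}\, \tau^{2}\, \frac{e^{2\eta} - 1}{\epsilon} \Big) \times \Big( 1 - 2 \|\rho\|_{\infty}\, \tau^{2}\, \frac{1}{\frac{\epsilon}{e^{2\eta} - 1} - L - \Lambda} \Big)\, \mathrm{Prob}(|t_k| \le \tau)^{2}. \tag{2.46} \]
The key thing to observe is that the r.h.s. of Eq. (2.46) is independent of $k$ and can be made arbitrarily close to $q_0^{2}$ by suitable choice of large $L$, $\tau$ and small $\eta$. So, for sufficiently small $\eta$ we have $\mathrm{Prob}\big( A_k \,|\, (v_l)_{l \ne k-1, k, k+1}, (t_l)_{l \ne k, k-1} \big) \ge \frac{1}{2} q_0^{2}$, say. Since $U_k(\eta) \ge \delta^{2} I[A_k]$, we find that
\[ \mathbb{E}\Big( e^{-\frac{rs}{6} \eta^{2} e^{-2|s-1|\eta} \sum_{k=i}^{j-1} U_k(\eta)} \Big)^{\frac{s-r}{s}} \le \exp\Big( -\frac{s-r}{2s}\, q_0^{2}\, \Big( 1 - e^{-\delta^{2} \eta^{2} \frac{rs}{6} e^{-2|s-1|\eta}} \Big) \Big\lfloor \frac{|i - j|}{3} \Big\rfloor \Big), \tag{2.47} \]
by integrating successively over $v_k$, $t_k$ from $k = i, \dots, j - 1$ (see Lemma A.1 below). Combined with (2.39) this completes the proof of Lemma 2.2.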

3. Band Matrices

To translate the argument of the previous section to the context of band matrices, we replace each of the variables $v_j$ and $t_j$ by $W \times W$ matrices. Given $W \in \mathbb{N}$, consider a sequence, $V_j$, $j = 1, \dots$, of independent identically distributed hermitian $W \times W$ matrices together with a sequence, $T_j$, $j = 1, \dots$, of independent identically distributed $W \times W$ matrices (not necessarily hermitian). With these matrix variables, we form an


infinite random hermitian band matrix
\[ X_W = \begin{pmatrix} V_1 & T_1 & 0 & \\ T_1^{\dagger} & V_2 & T_2 & \ddots \\ 0 & T_2^{\dagger} & V_3 & \ddots \\ & \ddots & \ddots & \ddots \end{pmatrix}, \tag{3.1} \]
a random operator on $\ell^{2}(\mathbb{N})$, and for each $N$ the random matrix
\[ X_{W;N} = Q_N X_W Q_N \tag{3.2} \]
with $Q_N$ the projection onto $\ell^{2}(\{1, \dots, N\})$. For simplicity, let us consider only $N$ a multiple of $W$: $N = nW$. Thus,
\[ X_{W;N} = X_{W;nW} = \begin{pmatrix} V_1 & T_1 & 0 & & \\ T_1^{\dagger} & V_2 & T_2 & \ddots & \\ 0 & T_2^{\dagger} & V_3 & \ddots & 0 \\ & \ddots & \ddots & \ddots & T_{n-1} \\ & & 0 & T_{n-1}^{\dagger} & V_n \end{pmatrix}. \tag{3.3} \]
Let $P_j$ denote the projection onto the $j$-th block, $\ell^{2}(\{(j-1)W + 1, \dots, jW\})$, so $V_j = P_j X_W P_j$ and $T_j = P_j X_W P_{j+1}$. Band matrix ensembles such as the Gaussian band ensemble (1.1) are of this form, with $T_j$ lower triangular matrices. However, for the argument presented below it is not necessary that $T_j$ be lower triangular. (Also, neither strict independence nor identicality of distribution are needed. Nonetheless, to keep things simple, let us stick to the i.i.d. case.)

In adapting the arguments from the scalar case to the matrix variables $V_j$, $T_j$, we must account for the non-commutativity of the matrix product. The basis of the argument is a change of variables $V_j \to e^{\alpha_j} \Gamma_j$ with $\alpha_j$ a scalar random variable and $\Gamma_j$ a $W \times W$ matrix obtained from the resolvent of $X_{W;jW}$. In the end we will need to estimate the ratio
\[ \frac{\rho\big( V_j + (e^{-\alpha} - 1)\Gamma_j \big)}{\rho\big( V_j + (e^{-\beta} - 1)\Gamma_j \big)} \]
for small $\alpha, \beta$, where $\rho$ is the density of the distribution of $V_j$ (assumed to be absolutely continuous with respect to Lebesgue measure on some vector space of matrices). In the scalar case, this change of variables was useful for all fluctuation regular densities. In the matrix case, an additional complication arises. Unless $\Gamma_j$ falls in the vector space supporting the distribution of $V_j$ there will be constraints on the matrix elements of $\Gamma_j$ which manifest themselves as $\delta$ functions after the change of variables. However, $\Gamma_j$ is formed from $\{V_k\}$ and $\{T_k\}$ via non-linear operations, so there is no reason to expect


it to fall in this vector space. (For example when $V_j$ are diagonal, $\Gamma_j$ will in general have off-diagonal components.) To guarantee closure under non-linear operations we suppose that the vector space supporting the distribution of $V_j$ is a matrix algebra:

Definition 2. A $*$-algebra over $\mathbb{R}$ of $W \times W$ matrices is a set $\mathcal{A}$ of $W \times W$ matrices that is a vector space over $\mathbb{R}$, under the usual addition and scalar multiplication, and such that $V_1, V_2 \in \mathcal{A} \Rightarrow V_1 V_2 \in \mathcal{A}$ and $V_1^{\dagger} \in \mathcal{A}$.

We will use

Proposition 5. If $\mathcal{A}$ is a matrix $*$-algebra over $\mathbb{R}$ and $V \in \mathcal{A}$ is invertible then $V^{-1} \in \mathcal{A}$.

Proof. This is a standard result for $C^{*}$-algebras. In that context, the algebra is usually assumed to be a vector space over $\mathbb{C}$, but that is not necessary. Here is the proof. If $V \in \mathcal{A}$ is self-adjoint and invertible, by the Weierstrass theorem we can approximate $V^{-1}$ (in the operator norm, say) by polynomials in $V$ with real coefficients. That is, we can approximate $V^{-1}$ by elements of $\mathcal{A}$. Since a finite dimensional vector space is complete, $V^{-1} \in \mathcal{A}$. For general invertible $V \in \mathcal{A}$, we have $V^{-1} = (V^{\dagger} V)^{-1} V^{\dagger} \in \mathcal{A}$, since $V^{\dagger} V \in \mathcal{A}$ is self-adjoint.

Assumption 1. Let $S$ be an increasing sequence of integers and fix, for each $W \in S$, a $*$-algebra over $\mathbb{R}$ of $W \times W$ matrices $\mathcal{A}_W$, and the set $\mathcal{T}_W$ of matrices which preserve $\mathcal{A}_W$ under conjugations,
\[ \mathcal{T}_W = \big\{ T : T^{\dagger} \mathcal{A}_W T \subset \mathcal{A}_W \big\}. \tag{3.4} \]
Let $\mathcal{A}_W^{H} = \{ V \in \mathcal{A}_W : V = V^{\dagger} \}$, the set of hermitian elements of $\mathcal{A}_W$. We require that $T_j \in \mathcal{T}_W$ and $V_j \in \mathcal{A}_W^{H}$, $j = 1, \dots$.

Remark. Note that $\mathcal{T}_W$ is closed under conjugation: $T \in \mathcal{T}_W \Rightarrow T^{\dagger} \in \mathcal{T}_W$. There is a good deal of flexibility in the choice of algebras. Of course, we may take $\mathcal{A}_W = \mathcal{T}_W$ = all $W \times W$ complex matrices, so $X_{W;N}$ is complex Hermitian. On the other hand, we could restrict $\mathcal{A}_W$ to be the set of matrices with real entries, so $X_{W;N}$ is real symmetric. In this case $\mathcal{A}_W$ is not a complex vector space. Similarly we could take $\mathcal{A}_W$ to be the set of matrices with quaternion entries, where the quaternion units are represented by $2 \times 2$ matrices, so $X_{W;N}$ would be Hermitian but anti-symmetric under transposition, $X_{W;N}^{T} = -X_{W;N}$. In this last case, $S$ would be the set of even integers.

An important consequence of assuming that $T_j \in \mathcal{T}_W$ and $V_j \in \mathcal{A}_W^{H}$ is that we have some a priori information on the block matrices making up the resolvent of $X_{W;nW}$.

Lemma 3.1. Suppose $Y$ is an $nW \times nW$ matrix that is block tri-diagonal, $P_i Y P_j = 0$ if $|i - j| \ge 2$, and satisfies
\[ V_j = P_j Y P_j \in \mathcal{A}_W, \quad j = 1, \dots, n \]


and
\[ T_j = P_j Y P_{j+1} = (P_{j+1} Y P_j)^{\dagger} \in \mathcal{T}_W, \quad j = 1, \dots, n - 1. \]
If $Y$ is invertible then
\[ P_j Y^{-1} P_j \in \mathcal{A}_W, \quad j = 1, \dots, n \tag{3.5} \]
and
\[ P_i Y^{-1} P_j \in \mathcal{AT}_W, \quad i, j = 1, \dots, n, \tag{3.6} \]
where $\mathcal{AT}_W$ is the algebra generated by $\mathcal{A}_W$ and $\mathcal{T}_W$.

Remark. The off diagonal blocks of $Y^{-1}$ need not be in $\mathcal{A}_W$. This is apparent already for $n = 2$, where, by the Schur complement formula,
\[ P_1 Y^{-1} P_2 = -(V_1 - T_1 V_2^{-1} T_1^{\dagger})^{-1}\, T_1 V_2^{-1} = -V_1^{-1} T_1\, (V_2 - T_1^{\dagger} V_1^{-1} T_1)^{-1}. \]
In each expression on the right, the first and last factors are in $\mathcal{A}_W$ but the middle factor, $T_1$, is not.

Proof. The proof is by induction on $n$. The result is clear for $n = 1$. So, suppose we know that it holds if $Y$ is a tridiagonal block matrix of size no larger than $(n-1)W \times (n-1)W$. First consider (3.5). By the Schur complement formula,
\[ P_j Y^{-1} P_j = \big( V_j - T_j P_{j+1} Y_{+}^{-1} P_{j+1} T_j^{\dagger} - T_{j-1}^{\dagger} P_{j-1} Y_{-}^{-1} P_{j-1} T_{j-1} \big)^{-1}, \]
where
\[ Y_{+} = \begin{pmatrix} V_{j+1} & T_{j+1} & & \\ T_{j+1}^{\dagger} & \ddots & \ddots & \\ & \ddots & \ddots & T_{n-1} \\ & & T_{n-1}^{\dagger} & V_n \end{pmatrix}, \qquad Y_{-} = \begin{pmatrix} V_1 & T_1 & & \\ T_1^{\dagger} & \ddots & \ddots & \\ & \ddots & \ddots & T_{j-2} \\ & & T_{j-2}^{\dagger} & V_{j-1} \end{pmatrix}. \]
As $Y_{+}$ and $Y_{-}$ are of size no larger than $(n-1)W \times (n-1)W$ and $T_{j-1}, T_j \in \mathcal{T}_W$, it follows that
\[ T_j P_{j+1} Y_{+}^{-1} P_{j+1} T_j^{\dagger},\quad T_{j-1}^{\dagger} P_{j-1} Y_{-}^{-1} P_{j-1} T_{j-1} \in \mathcal{A}_W. \]
By Prop. 5, $P_j Y^{-1} P_j \in \mathcal{A}_W$.

Now consider (3.6). Suppose $i < j$ (the other case is similar). Let
\[ \widetilde{Y} = \begin{pmatrix} V_1 & T_1 & & \\ T_1^{\dagger} & \ddots & \ddots & \\ & \ddots & \ddots & T_{j-2} \\ & & T_{j-2}^{\dagger} & V_{j-1} \end{pmatrix}. \]
By the resolvent identity, one has
\[ P_i Y^{-1} P_j = -P_i \widetilde{Y}^{-1} P_{j-1}\, T_{j-1}\, P_j Y^{-1} P_j. \]
But $P_i \widetilde{Y}^{-1} P_{j-1} \in \mathcal{AT}_W$ by the induction hypothesis and $P_j Y^{-1} P_j \in \mathcal{A}_W$ as we have just shown. It follows that the r.h.s. is in $\mathcal{AT}_W$.


We now consider the properties required of the distribution of $V_j$, denoted $P_W$. Let $\|\cdot\|$ denote the operator norm of a matrix,

\[ \|A\| = \sup_{\|v\| = 1} \|Av\|, \tag{3.7} \]

and let $\sigma(A)$ denote the set of eigenvalues of a matrix. Recall, if $A$ is self-adjoint, that

\[ \|A\| = \max |\sigma(A)|, \qquad \|A^{-1}\| = \frac{1}{\min |\sigma(A)|}. \]

Assumption 2. Let $(P_W)_{W \in S}$ be a family of probability measures such that

• Absolute continuity: Each measure $P_W$ is supported on $\mathcal{A}_W^H$ and absolutely continuous with respect to Lebesgue measure on that space. Let $\rho_W(V)$ denote the density of $P_W$ with respect to Lebesgue measure.

• Wegner-type estimates: There are $\kappa > 0$ and $\sigma \ge 0$ such that for all $A \in \mathcal{A}_W^H$, $W \in S$,

\[ P_W\left\{ V : \left\| (V - A)^{-1} \right\| > t W^{1+\sigma} \right\} \le \frac{\kappa}{t}; \tag{3.8} \]

and for all $A, B \in \mathcal{A}_W^H$ and $C \in \mathcal{AT}_W$, $W \in S$,

\[ P_W \otimes P_W \left\{ (V_1, V_2) : \left\| \begin{pmatrix} V_1 - A & C \\ C^\dagger & V_2 - B \end{pmatrix}^{-1} \right\| > t W^{1+\sigma} \right\} \le \frac{2\kappa}{t}. \tag{3.9} \]

• Fluctuation regularity with bounded tails: There are constants $p_0, \delta, \epsilon > 0$ and $L, a, \zeta \ge 0$ such that, for each $W \in S$, there is $\Omega_W \subset \mathcal{A}_W^H$ with $P_W(\Omega_W) \ge p_0$ and if $V \in \Omega_W$, then

\[ \|V\| \le L W^a \tag{3.10} \]

and

\[ \frac{\rho_W(V_1)}{\rho_W(V_2)} \ge \delta \quad \text{for all } V_1, V_2 \in \mathcal{A}_W^H \text{ with } \|V_j - V\| \le \epsilon W^{-\zeta}, \ j = 1, 2. \tag{3.11} \]

Remarks. (1) Since

\[ \left\| (V - \lambda I)^{-1} \right\| = \frac{1}{\operatorname{dist}(\lambda, \sigma(V))}, \]

the Wegner-type estimate (3.8) implies

\[ P_W\left\{ V : \operatorname{dist}(\lambda, \sigma(V)) \le \frac{\epsilon}{W^{1+\sigma}} \right\} \le \kappa \epsilon. \tag{3.12} \]

If $V$ is suitably scaled so as to have mean eigenvalue spacing of order $1/W$, this suggests that we should be able to take $\sigma = 0$. That has not been proved, however, for the random matrix ensembles studied here. For Wigner type matrices, in particular for the Gaussian band ensemble (1.1), we will obtain the estimates (3.8), (3.9) with $\sigma = \frac{1}{2}$ in Sect. 5.


(2) The parameters $\sigma$ and $a$ are not independent. If we rescale via $V \to W^\gamma V$ this results in a shift $\sigma \to \sigma - \gamma$ and $a \to a + \gamma$. Nonetheless it is convenient to keep both parameters since the natural scaling of $V$ is to choose the eigenvalue spacing to be of order $1/W$. This typically leads to $a = 0$, but if the entries of $V$ have heavy tails then one may have $a > 0$.

We require very little from the distribution of $T_j$, denoted $Q_W$, essentially just a uniform (in $W$) bound on the tails:

Assumption 3. Let $(Q_W)_{W=2}^\infty$ be a family of probability measures, with $Q_W$ supported on $\mathcal{T}_W$. Suppose that there are $q_0, \tau > 0$ and $b \ge 0$ such that

\[ Q_W\left\{ T : \|T\| \le \tau W^b \right\} \ge q_0. \tag{3.13} \]

Remark. $Q_W$ could be supported on a single point, in which case $T_j$ would be a constant sequence. For instance, we could take $T_j = I$.

Lemma W for $X_{W;nW}$ follows easily from part (2) of Assumption 2.

Lemma 3.2 (Lemma W for $X_{W;nW}$). Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent sequences of independent random $W \times W$ matrices. Suppose each $V_j$ has distribution $P_W$ and each $T_j$ has distribution $Q_W$. Then, for each $\lambda \in \mathbb{R}$,

\[ \operatorname{Prob}\left\{ \lambda \text{ is an eigenvalue of } X_{W;nW} \right\} = 0 \]

and

\[ \operatorname{Prob}\left\{ \left\| P_i (X_{W;nW} - \lambda I)^{-1} P_j \right\| > t W^{1+\sigma} \,\Big|\, \{T_k\}_{k=1}^{n-1} \text{ and } \{V_k\}_{k \ne i,j} \right\} \le \frac{2\kappa}{t} \tag{3.14} \]

for any $1 \le i, j \le n$.

Proof. Let us first consider the case $i = j$. The Schur complement formula shows that

\[ P_i (X_{W;nW} - \lambda I)^{-1} P_i = (V_i - \lambda I - K)^{-1} \]

with

\[ K = T_{i-1}^\dagger (X_- - \lambda I)^{-1} T_{i-1} + T_i (X_+ - \lambda I)^{-1} T_i^\dagger, \]

with $X_-$ and $X_+$ the restrictions of $X$ to the blocks above and below $i$. By Lemma 3.1, $K \in \mathcal{A}_W^H$. (Note that it is self adjoint.) It follows from (3.8) that $\lambda$ is an eigenvalue of $X_{W;nW}$ with probability 0 and that (3.14) holds for $i = j$.

The argument for $i \ne j$ is similar. In this case, we first estimate

\[ \left\| P_i (X_{W;nW} - \lambda I)^{-1} P_j \right\| \le \left\| (P_i + P_j)(X_{W;nW} - \lambda I)^{-1} (P_i + P_j) \right\|. \]

As above, we have

\[ (P_i + P_j)(X_{W;nW} - \lambda I)^{-1} (P_i + P_j) = \left[ \begin{pmatrix} V_i - \lambda I & 0 \\ 0 & V_j - \lambda I \end{pmatrix} + \begin{pmatrix} A & C \\ C^\dagger & B \end{pmatrix} \right]^{-1}, \]

where $A$, $B$ and $C$ are formed from blocks of the resolvents of restrictions of $X_{W;nW}$. One may verify that $A, B \in \mathcal{A}_W^H$ and $C \in \mathcal{AT}_W$. Thus the result follows from (3.9).
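As a sketch of how a tail bound of the form (3.14) turns into a fractional-moment bound, the standard layer-cake computation can be written out for a generic non-negative random variable. The symbols $Z$ and $c$ below are introduced only for this illustration and are not part of the original text; the assumption is that $\operatorname{Prob}(Z > u) \le c/u$ for all $u > 0$, and $0 < s < 1$.

```latex
\begin{align*}
\mathbb{E}\left(Z^s\right)
  &= \int_0^\infty s\,u^{s-1}\,\operatorname{Prob}(Z > u)\,du \\
  &\le \int_0^{c} s\,u^{s-1}\,du + c\int_{c}^\infty s\,u^{s-2}\,du \\
  &= c^s + \frac{s}{1-s}\,c^s = \frac{c^s}{1-s}.
\end{align*}
% With Z the norm of P_i (X_{W;nW} - \lambda I)^{-1} P_j and c = 2\kappa W^{1+\sigma},
% this gives (2\kappa)^s W^{(1+\sigma)s} / (1-s).
```

The split point $u = c$ is where the trivial bound $\operatorname{Prob}(Z > u) \le 1$ and the tail bound $c/u$ cross, which is why no further optimisation is needed.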

It follows that

\[ \mathbb{E}\left( \left\| P_i (X_{W;nW} - \lambda I)^{-1} P_j \right\|^s \right) \le \frac{2^s \kappa^s}{1-s} W^{(1+\sigma)s}, \tag{3.15} \]

and so

\[ \mathbb{E}\left( \left| \left\langle v, P_i (X_{W;nW} - \lambda I)^{-1} P_j w \right\rangle \right|^s \right) \le \frac{2^s \kappa^s}{1-s} W^{(1+\sigma)s} \tag{3.16} \]

for any two vectors $v$, $w$. (See (1.8).) Lemma F in this context is as follows:

Lemma 3.3 (Lemma F for $X_{W;nW}$). Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent sequences of independent random $W \times W$ matrices. Suppose each $V_j$ has distribution $P_W$ and each $T_j$ has distribution $Q_W$. Let $D_W$ denote the real dimension of $\mathcal{A}_W^H$. Fix a positive number $\nu$ large enough that $\sup_W D_W W^{-\nu} < \infty$ and suppose also that $\nu \ge \zeta + \max(a, 1 + \sigma + 2b)$, with $\sigma, a, \zeta$ as in Assumption 2 and $b$ as in Assumption 3. Then for each $0 < r < s < 1$ and $0 < \Lambda < \infty$ there is $C_{r,s} > 0$ such that if $|i - j| \ge 3$ then

\[ \mathbb{E}\left( \left[ \Phi\left( P_i (X_{W;nW} - \lambda I)^{-1} P_j \right) \right]^r \right) \le \exp\left( -C_{r,s} W^{-2\nu} |i - j| \right) \left[ \mathbb{E}\left( \left[ \Phi\left( P_i (X_{W;nW} - \lambda I)^{-1} P_j \right) \right]^s \right) \right]^{r/s} \tag{3.17} \]

for any $\lambda \in [-\Lambda, \Lambda]$ and any non-negative, positive-homogeneous function $\Phi : \mathcal{AT}_W \to \mathbb{R}$, i.e., $\Phi(Y) \ge 0$ and $\Phi(\alpha Y) = \alpha \Phi(Y)$ for $\alpha \ge 0$.

Remarks. (1) Below we will apply the result with $\Phi(Y)$ a semi-norm such as the absolute value of a matrix element $|\langle v, Y w \rangle|$ or the norm $\|Y\|$. However the proof does not make use of the triangle inequality, so the result also applies, for example, to $\Phi(Y) =$ spectral radius of $Y$ or $\Phi(Y) =$ smallest singular value of $Y$.

(2) Under rescaling of the matrix elements $X_{W;nW} \to W^\gamma X_{W;nW}$ the localization length $1/(C_{r,s} W^{-2\nu})$ should not change. That this is indeed so follows since $\zeta \to \zeta - \gamma$, $a \to a + \gamma$, $\sigma \to \sigma - \gamma$ and $b \to b + \gamma$, so the combination $\zeta + \max(a, 1 + \sigma + 2b)$ is invariant under rescaling.

Combining Lemma 3.3 and (3.16) we have

Theorem 6. Let $V_j$, $T_j$, $j = 1, \ldots$, be mutually independent sequences of independent random $W \times W$ matrices. Suppose each $V_j$ has distribution $P_W$ and each $T_j$ has distribution $Q_W$. Let $D_W$ denote the real dimension of $\mathcal{A}_W^H$. Fix a positive number $\nu$ large enough that $\sup_W D_W W^{-\nu} < \infty$ and suppose also that $\nu \ge \zeta + \max(a, 1 + \sigma + 2b)$, with $\sigma, a, \zeta$ as in Assumption 2 and $b$ as in Assumption 3. For $0 < t < 1$ let

\[ M(W, t) = \max_{1 \le x, y \le nW} \mathbb{E}\left( \left| \left\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \right\rangle \right|^t \right), \tag{3.18} \]

where $e_x$ and $e_y$ denote elementary basis vectors. Then

\[ M(W, t) \le \frac{2^t \kappa^t}{1-t} W^{(1+\sigma)t} \tag{3.19} \]

and given $0 < s < t$ there are constants $C$, $\mu$ such that for any $1 \le x, y \le nW$,

\[ \mathbb{E}\left( \left| \left\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \right\rangle \right|^s \right) \le C\, M(W, t)^{s/t}\, e^{-\mu W^{-2\nu-1} |x - y|}. \tag{3.20} \]


Proof. This amounts to special cases of (3.16) and Lemma 3.3. The exponent $2\nu + 1$ appears in (3.20) because the difference $|i - j|$ of the blocks to which $x$ and $y$ belong is estimated by $|x - y|/W$. The constant $C$ compensates for the exponential factor $e^{-\mu W^{-2\nu-1} |x - y|}$ when $|x - y|$ is smaller than $3W$, in which case the estimate of Lemma 3.3 does not hold.
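To make the exponent bookkeeping concrete, the following sketch evaluates the lower bound $\nu = \zeta + \max(a, 1 + \sigma + 2b)$ and the resulting power $2\nu + 1$ of $W$ in the localization length. The function name `localization_exponent` is hypothetical and introduced only for this illustration. The sketch assumes $\nu$ is taken equal to its lower bound, which is compatible with $\sup_W D_W W^{-\nu} < \infty$ whenever $D_W$ is comparable to $W^2$ and $\nu \ge 2$. The parameter values are the ones stated for the ensembles of Sect. 5.

```python
from fractions import Fraction


def localization_exponent(sigma, zeta, a, b):
    """Return (nu, 2*nu + 1) with nu = zeta + max(a, 1 + sigma + 2*b).

    Hypothetical helper: nu is taken at the lower bound allowed in Theorem 6.
    """
    nu = zeta + max(a, 1 + sigma + 2 * b)
    return nu, 2 * nu + 1


HALF = Fraction(1, 2)

# Gaussian blocks (Theorem 11): sigma = 1/2, zeta = 2, a = 0, and b = 0.
gaussian = localization_exponent(HALF, Fraction(2), 0, 0)

# Uniform densities (Theorem 14): sigma = 1/2, zeta = 5/2, a = 0, and b = 0.
uniform = localization_exponent(HALF, Fraction(5, 2), 0, 0)


def holder_exponent(alpha):
    """Hoelder-continuous log-densities (Theorem 13): zeta = 2/alpha + 1/2."""
    zeta = Fraction(2) / Fraction(alpha) + HALF
    return localization_exponent(HALF, zeta, 0, 0)
```

With these inputs the power of $W$ comes out as 8 for Gaussian blocks, 9 for uniform densities, and $5 + 4/\alpha$ in the Hölder case, matching the exponents quoted in Theorem 15.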

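Remark. Putting (3.20) and (3.19) together we have

\[ \mathbb{E}\left( \left| \left\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \right\rangle \right|^s \right) \le \text{const.}\, W^{(1+\sigma)s}\, e^{-\mu W^{-2\nu-1} |x - y|}. \tag{3.21} \]

If the diagonal blocks $V_j$ are Wigner matrices, as in Assumption 4 in Sect. 5 below, one may obtain the estimate

\[ M(W, t) \le \frac{2^t \kappa^t}{1-t} W^{t/2}, \tag{3.22} \]

resulting in a very slight improvement on the estimate on the r.h.s. of (3.21),

\[ \mathbb{E}\left( \left| \left\langle e_x, (X_{W;N} - \lambda)^{-1} e_y \right\rangle \right|^s \right) \le \text{const.}\, W^{s/2}\, e^{-\mu W^{-2\nu-1} |x - y|}. \tag{3.23} \]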

This improvement is not very significant, as the main point here is the exponential factor, which dominates any power of $W$ as long as $|x - y| \gg W^{2\nu+1}$.

4. Fluctuations

We now prove Lemma 3.3. Following the proof of Lemma 2.2, let us fix $\lambda$ and set $G_n(i, j) = P_i (X_{W;nW} - \lambda I)^{-1} P_j$. Since $G_n(i, j)^\dagger = G_n(j, i)$, in estimating $G_n(i, j)$ we may assume without loss that $i \le j$. We have, by the resolvent identity,

\[ G_n(i, j) = -G_{j-1}(i, j-1)\, T_{j-1}\, G_n(j, j). \tag{4.1} \]

Iteration gives

\[ G_n(i, j) = (-1)^{j-i}\, G_i(i, i)\, T_i\, G_{i+1}(i+1, i+1)\, T_{i+1} \cdots G_{j-1}(j-1, j-1)\, T_{j-1}\, G_n(j, j). \tag{4.2} \]

Let us define $W \times W$ random matrices

\[ \Gamma_k = G_k(k, k)^{-1}, \tag{4.3} \]

related by a recursion relation

\[ \Gamma_k = V_k - \lambda I - T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}. \tag{4.4} \]

As in the $W = 2$ case, these identities may be established using the Schur-complement formula; compare with (2.9) and (2.13). Similarly,

\[ G_n(j, j)^{-1} = V_j - \lambda I - T_{j-1}^\dagger \Gamma_{j-1}^{-1} T_{j-1} - T_j \widetilde{G}_{j+1} T_j^\dagger = \Gamma_j - T_j \widetilde{G}_{j+1} T_j^\dagger, \tag{4.5} \]


where $\widetilde{G}_{j+1} = P_{j+1} (\widetilde{X}_{W;nW} - \lambda)^{-1} P_{j+1}$ with $\widetilde{X}_{W;nW}$ the matrix obtained from $X_{W;nW}$ by setting $T_j = 0$. Thus $\widetilde{G}_{j+1}$ is a function of the matrix variables $(V_k)_{k=j+1}^{n}$ and $(T_k)_{k=j+1}^{n}$.

We now make the change of variables $V_k \to \Gamma_k$ in our probability space. By Lem. 3.1 and Prop. 5, $\Gamma_k \in \mathcal{A}_W^H$. As in the tri-diagonal case, the Jacobian determinant is 1, so

\[ \text{Joint distribution of } (\Gamma_k)_{k=1}^n \text{ given } (T_k)_{k=1}^{n-1} = \rho(\Gamma_1 + \lambda I) \prod_{k=2}^{n} \rho\left( \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1} \right) \prod_{k=1}^{n} d\Gamma_k, \tag{4.6} \]

where $d\Gamma_k$ denotes Lebesgue measure on $\mathcal{A}_W^H$. In terms of the matrices $\Gamma_k$, we have

\[ G_n(i, j) = (-1)^{|i-j|}\, \Gamma_i^{-1} T_i\, \Gamma_{i+1}^{-1} T_{i+1} \cdots \Gamma_{j-1}^{-1} T_{j-1} \cdot \left( \Gamma_j - T_j \widetilde{G}_{j+1} T_j^\dagger \right)^{-1}, \tag{4.7} \]

where $\widetilde{G}_{j+1}$ is a function of $(\Gamma_k)_{k=j}^{n}$ and $(T_k)_{k=j}^{n}$ (since $V_k = \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$).

The matrix product in (4.7) is non-commutative, so it is not clear if the heuristic analysis that the "log of $G$ is a sum of terms with only local correlations" is valid. Nonetheless, we may use the trick employed above of coupling the system to a family of independent identically distributed scalar variables $\alpha_2, \alpha_5, \ldots$, each with absolutely continuous distribution

\[ H(\alpha_k)\, d\alpha_k = \frac{1}{2\eta} I[|\alpha_k| \le \eta]\, d\alpha_k, \tag{4.8} \]

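with $\eta > 0$ to be chosen below. We define

\[ F_k = e^{\alpha_k} \Gamma_k, \tag{4.9} \]

where we take $\alpha_k = 0$ for $k \not\equiv 2 \bmod 3$. The Jacobian of the transformation $(\Gamma_k, \alpha_k) \to (F_k, \alpha_k)$ is $\prod_{k=2,5,8,\ldots} e^{-D_W \alpha_k}$, where $D_W = \dim \mathcal{A}_W^H$ is the dimension of $\mathcal{A}_W^H$. Thus

\[ \begin{aligned} &\text{joint distribution of } (F_k)_{k=1}^{n} \text{ and } (\alpha_k)_{k=1}^{n}, \text{ given } (T_k)_{k=1}^{n-1} \\ &\quad = \prod_{k \equiv 2 \bmod 3} \rho\left( F_{k-1} + \lambda I + T_{k-2}^\dagger F_{k-2}^{-1} T_{k-2} \right) \rho\left( e^{-\alpha_k} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1} \right) \\ &\qquad \times \rho\left( F_{k+1} + \lambda I + e^{\alpha_k} T_k^\dagger F_k^{-1} T_k \right) H(\alpha_k)\, e^{-D_W \alpha_k}\, dF_{k-1}\, dF_k\, dF_{k+1}\, d\alpha_k, \end{aligned} \tag{4.10} \]

with the convention that $T_0 = 0$.

As in the tri-diagonal case, the variables $\alpha_k$ remain independent after conditioning on $(F_k)_{k=1}^{n}$. Also, the

\[ \begin{aligned} &\text{distribution of } \alpha_k \text{ given } (T_\ell)_{\ell=1}^{n-1} \text{ and } (F_\ell)_{\ell=1}^{n} \\ &\quad = \frac{ \rho\left( e^{-\alpha_k} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1} \right) \rho\left( F_{k+1} + \lambda I + e^{\alpha_k} T_k^\dagger F_k^{-1} T_k \right) H(\alpha_k)\, e^{-D_W \alpha_k} }{ Z_k }\, d\alpha_k \end{aligned} \tag{4.11} \]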


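with

\[ Z_k = \frac{1}{2\eta} \int_{-\eta}^{\eta} \rho\left( e^{-\alpha} F_k + \lambda I + T_{k-1}^\dagger F_{k-1}^{-1} T_{k-1} \right) \rho\left( F_{k+1} + \lambda I + e^{\alpha} T_k^\dagger F_k^{-1} T_k \right) e^{-D_W \alpha}\, d\alpha. \tag{4.12} \]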

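Now fix a non-negative positive homogeneous $\Phi$ as in the statement of the lemma. Replacing $\Gamma_j$ in (4.7) by $e^{-\alpha_j} F_j$, we find that

\[ \Phi(G_n(i, j)) = \Bigg( \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} e^{\alpha_k} \Bigg)\, \Phi\Bigg( (-1)^{|i-j|} \Bigg( \prod_{k=i}^{j-1} F_k^{-1} T_k \Bigg) \widetilde{H}_{j+1} \Bigg), \tag{4.13} \]

where

\[ \widetilde{H}_{j+1} = \left( \Gamma_j - T_j \widetilde{G}_{j+1} T_j^\dagger \right)^{-1} \tag{4.14} \]

is a function of $(T_k, F_k, \alpha_k)_{k=j}^{n}$. Since $(\alpha_k)$ are conditionally independent, it follows that

\[ \mathbb{E}\left( [\Phi(G_n(i, j))]^r \,\Big|\, (T_\ell, F_\ell)_{\ell=1}^{n}, (\alpha_k)_{k=j}^{n} \right) = \Bigg[ \Phi\Bigg( (-1)^{|i-j|} \Bigg( \prod_{k=i}^{j-1} F_k^{-1} T_k \Bigg) \widetilde{H}_{j+1} \Bigg) \Bigg]^r \prod_{\substack{k \equiv 2 \bmod 3 \\ i \le k \le j-1}} \mathbb{E}\left( e^{r \alpha_k} \,\Big|\, (T_k, F_k)_{k=1}^{n} \right). \tag{4.15} \]

By Proposition 3 and the Hölder inequality, we conclude that (compare with (2.30)):

\[ \mathbb{E}\left( [\Phi(G_n(i, j))]^r \right) \le \left[ \mathbb{E}\left( e^{-\frac{s}{s-r} \sum_{k=i}^{j-1} h_k(r, s)} \right) \right]^{\frac{s-r}{s}} \left[ \mathbb{E}\left( [\Phi(G_n(i, j))]^s \right) \right]^{r/s}, \tag{4.16} \]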

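where for $k \equiv 2 \bmod 3$,

\[ h_k(r, s) = \frac{1}{s} \int_0^s \min(r, q)\,(s - \max(r, q))\, \operatorname{Var}_q\left( \alpha_k \,\big|\, (T_\ell, F_\ell)_{\ell=1}^{n} \right) dq, \tag{4.17} \]

with $\operatorname{Var}_q$ as in (2.28), and we have set $h_k(r, s) = 0$ for $k \not\equiv 2 \bmod 3$.

Let us express $\operatorname{Var}_q(\alpha_k | (T_\ell, F_\ell)_{\ell=1}^{n})$ in terms of $(T_\ell, \Gamma_\ell, \alpha_\ell)_{\ell=1}^{n}$:

\[ \operatorname{Var}_q\left( \alpha_k \,\big|\, (T_\ell, F_\ell)_{\ell=1}^{n} \right) = \inf_{m \in \mathbb{R}} \frac{ \int_{-\eta}^{\eta} (\alpha - m)^2 e^{(q - D_W)\alpha} \nu_k(\alpha)\, d\alpha }{ \int_{-\eta}^{\eta} e^{(q - D_W)\alpha} \nu_k(\alpha)\, d\alpha } \tag{4.18} \]

with

\[ \nu_k(\alpha) = \rho\left( e^{\alpha_k - \alpha} \Gamma_k + \lambda I + T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1} \right) \rho\left( \Gamma_{k+1} + \lambda I + e^{\alpha - \alpha_k} T_k^\dagger \Gamma_k^{-1} T_k \right). \tag{4.19} \]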


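Thus (compare with (2.36)),

\[ \operatorname{Var}_q\left( \alpha_k \,\big|\, (T_\ell, F_\ell)_{\ell=1}^{n} \right) \ge \frac{1}{3} \eta^2 e^{-2q\eta} e^{-2 D_W \eta} \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{ \rho\left( V_k + (e^{-\alpha} - 1) \Gamma_k \right) }{ \rho\left( V_k + (e^{-\beta} - 1) \Gamma_k \right) } \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{ \rho\left( V_{k+1} + (e^{\alpha} - 1) T_k^\dagger \Gamma_k^{-1} T_k \right) }{ \rho\left( V_{k+1} + (e^{\beta} - 1) T_k^\dagger \Gamma_k^{-1} T_k \right) }. \tag{4.20} \]

With (4.16) this implies

\[ \mathbb{E}\left( [\Phi(G_n(i, j))]^r \right) \le \left[ \mathbb{E}\left( e^{-\frac{r s}{6} \eta^2 e^{-2s\eta} e^{-2 D_W \eta} \sum_{k=i}^{j-1} U_k(\eta)} \right) \right]^{\frac{s-r}{s}} \left[ \mathbb{E}\left( [\Phi(G_n(i, j))]^s \right) \right]^{r/s}, \tag{4.21} \]

where

\[ U_k(\eta) = \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{ \rho\left( V_k + (e^{-\alpha} - 1) \Gamma_k \right) }{ \rho\left( V_k + (e^{-\beta} - 1) \Gamma_k \right) } \times \inf_{-2\eta < \alpha, \beta < 2\eta} \frac{ \rho\left( V_{k+1} + (e^{\alpha} - 1) T_k^\dagger \Gamma_k^{-1} T_k \right) }{ \rho\left( V_{k+1} + (e^{\beta} - 1) T_k^\dagger \Gamma_k^{-1} T_k \right) }. \tag{4.22} \]

By fluctuation regularity of $P_W$, we have $U_k(\eta) \ge \delta^2 I[A_k]$, where $I[A_k]$ is the indicator function of the event:

\[ A_k = \left\{ V_k, V_{k+1} \in \Omega_W, \ \|\Gamma_k\| \le \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta}, \ \text{and} \ \left\| T_k^\dagger \Gamma_k^{-1} T_k \right\| \le \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta} \right\}, \tag{4.23} \]

with $\delta, \epsilon > 0$, $\zeta \ge 0$ and $\Omega_W$ as in Assumption 2. In turn, since $\Gamma_k = V_k - \lambda I - T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$, we see that

\[ \begin{aligned} A_k \supset{} & \{ V_{k+1} \in \Omega_W \} \cap \{ V_k \in \Omega_W \} \cap \left\{ \|T_{k-1}\|, \|T_k\| \le \tau W^b \right\} \\ & \cap \left\{ \left\| \Gamma_{k-1}^{-1} \right\| \le \frac{1}{\tau^2} \left( \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta} - L W^a - |\lambda| \right) W^{-2b} \right\} \\ & \cap \left\{ \left\| \Gamma_k^{-1} \right\| \le \frac{\epsilon}{\tau^2 (e^{2\eta} - 1)} W^{-2b-\zeta} \right\}, \end{aligned} \tag{4.24} \]

with $L, a \ge 0$ as in Assumption 2 and $\tau, b \ge 0$ as in Assumption 3.

This allows us to estimate the probability of $A_k$ from below by successively integrating over $V_{k+1}$, $V_k$, $V_{k-1}$, $T_k$, and $T_{k-1}$ in that order. To begin, by Assumption 2,

\[ \operatorname{Prob}\left( V_{k+1} \in \Omega_W \,\big|\, (V_l)_{l \ne k+1}, (T_l) \right) = P_W(\Omega_W) \ge p_0 > 0. \tag{4.25} \]

Since $\Gamma_k = V_k - \lambda I - T_{k-1}^\dagger \Gamma_{k-1}^{-1} T_{k-1}$, we see from the Wegner estimate (3.8) that

\[ \operatorname{Prob}\left( V_k \in \Omega_W, \ \left\| \Gamma_k^{-1} \right\| \le \frac{\epsilon}{\tau^2 (e^{2\eta} - 1)} W^{-2b-\zeta} \,\Big|\, (V_l)_{l \ne k, k+1}, (T_l) \right) \ge p_0 - \kappa \tau^2 \frac{e^{2\eta} - 1}{\epsilon} W^{1+\sigma+\zeta+2b}. \tag{4.26} \]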


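Similarly

\[ \operatorname{Prob}\left( \left\| \Gamma_{k-1}^{-1} \right\| \le \frac{1}{\tau^2} \left( \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta} - L W^a - |\lambda| \right) W^{-2b} \,\Big|\, (V_l)_{l \ne k-1, k, k+1}, (T_l) \right) \ge 1 - \kappa \tau^2 \frac{ W^{1+\sigma+2b} }{ \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta} - L W^a - |\lambda| }. \tag{4.27} \]

Combining these estimates and using Assumption 3 to integrate over $T_k$ and $T_{k-1}$, we find

\[ \operatorname{Prob}\left( A_k \,\big|\, (V_l)_{l \ne k-1, k, k+1}, (T_l)_{l \ne k, k-1} \right) \ge q_0^2\, p_0 \left( p_0 - \kappa \tau^2 \frac{e^{2\eta} - 1}{\epsilon} W^{1+\sigma+2b+\zeta} \right) \times \left( 1 - \kappa \tau^2 \frac{ W^{1+\sigma+2b} }{ \frac{\epsilon}{e^{2\eta} - 1} W^{-\zeta} - L W^a - |\lambda| } \right). \tag{4.28} \]

Taking $\eta = c W^{-\nu}$ with $\nu \ge \max(a, 2b + \sigma + 1) + \zeta$, we may choose $c$ sufficiently small to make the r.h.s. larger than $\frac{1}{2} q_0^2 p_0^2$, say. Since $U_k(\eta) \ge \delta^2 I[A_k]$ we find, integrating successively over $V_k$, $T_k$ from $k = i, \ldots, j-1$ (see Lemma A.1), that

\[ \left[ \mathbb{E}\left( e^{-\frac{r s}{6} \eta^2 e^{-2s\eta} e^{-2 D_W \eta} \sum_{k=i}^{j-1} U_k(\eta)} \right) \right]^{\frac{s-r}{s}} \le \exp\left( -\frac{s - r}{2s}\, q_0^2 p_0^2 \left( 1 - e^{-c^2 \delta^2 \frac{r s}{6} W^{-2\nu} e^{-2 c s W^{-\nu}} e^{-2 c D_W W^{-\nu}}} \right) \left\lfloor \frac{|i - j|}{3} \right\rfloor \right). \tag{4.29} \]

Increasing $\nu$, if necessary, so that $\sup_W D_W W^{-\nu} < \infty$ completes the proof.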
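The identities (4.2) to (4.4) can be checked numerically in the simplest setting. The sketch below is an illustration only, under the assumption of scalar blocks ($W = 1$, so $X$ is an ordinary real symmetric tridiagonal matrix) with real off-diagonal entries. All function names (`solve`, `resolvent_entry`, `gammas`, `product_formula`) are hypothetical and not taken from the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def resolvent_entry(v, t, lam, size, i, j):
    """Entry (i, j), 1-based, of (X_size - lam)^{-1}.

    X_size is the top-left size x size block of the tridiagonal matrix with
    diagonal v and off-diagonal t (t[k-1] couples sites k and k+1).
    """
    a = [[0.0] * size for _ in range(size)]
    for k in range(size):
        a[k][k] = v[k] - lam
        if k + 1 < size:
            a[k][k + 1] = a[k + 1][k] = t[k]
    e_j = [1.0 if k == j - 1 else 0.0 for k in range(size)]
    return solve(a, e_j)[i - 1]


def gammas(v, t, lam):
    """Scalar version of the recursion (4.4); the entry at index k-1 is Gamma_k."""
    g = [v[0] - lam]
    for k in range(1, len(v)):
        g.append(v[k] - lam - t[k - 1] ** 2 / g[k - 1])
    return g


def product_formula(v, t, lam, i, j):
    """Right-hand side of (4.2) for i <= j, using G_k(k, k) = 1 / Gamma_k."""
    n = len(v)
    g = gammas(v, t, lam)
    prod = 1.0
    for k in range(i, j):
        prod *= t[k - 1] / g[k - 1]
    return (-1) ** (j - i) * prod * resolvent_entry(v, t, lam, n, j, j)
```

For any choice of `v`, `t` and `lam` with all $\Gamma_k \ne 0$, `product_formula(v, t, lam, i, j)` should agree with `resolvent_entry(v, t, lam, len(v), i, j)` up to rounding. In the matrix-valued case the same structure holds, but the order of the factors matters.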

5. Ensembles

In this section, we consider several examples of band matrix ensembles satisfying Assumptions 1, 2, and 3 of Sect. 3. Assumption 1 is simply the choice of an algebra $\mathcal{A}_W$ to support the distribution of the diagonal blocks, and the corresponding set $\mathcal{T}_W$ for the off-diagonal blocks. In this regard, we will consider two cases: (R) $\mathcal{A}_W = W \times W$ matrices with real entries, or (C) $\mathcal{A}_W = W \times W$ matrices with complex entries. In each case the dimension of the algebra $D_W$ is comparable to $W^2$ and $\mathcal{T}_W = \mathcal{A}_W$.

5.1. Wigner-matrix blocks and the Wegner estimate. We shall suppose that the diagonal blocks $V_j$ of $X_{W;N}$ are Wigner matrices:

Assumption 4. The distribution of the diagonal blocks, $dP_W(V)$, written in terms of the matrix elements

\[ V = \frac{1}{\sqrt{W}} \begin{pmatrix} d_1 & a_{1,2} & \cdots & \cdots & a_{1,W} \\ a_{1,2}^* & d_2 & \ddots & & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & a_{W-1,W} \\ a_{1,W}^* & \cdots & \cdots & a_{W-1,W}^* & d_W \end{pmatrix}, \tag{5.1} \]

has the form

\[ dP_W(V) = \prod_{j=1}^{W} h(d_j)\, dd_j \prod_{1 \le i < j \le W} g(a_{i,j})\, da_{i,j}, \tag{5.2} \]

where $dd_j$ is Lebesgue measure on the real line, $h \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ is non-negative with $\int h = 1$, and either (R) $da_{i,j}$ is Lebesgue measure on $\mathbb{R}$ and $g \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ is non-negative with $\int_{\mathbb{R}} g = 1$, or (C) $da_{i,j}$ is Lebesgue measure on $\mathbb{C}$ and $g \in L^\infty(\mathbb{C}) \cap L^1(\mathbb{C})$ is non-negative with $\int_{\mathbb{C}} g = 1$. Furthermore, we require

\[ \int_{\mathbb{R}} \lambda^2 h(\lambda)\, d\lambda < \infty, \tag{5.3} \]

\[ \int |a|^4 g(a)\, da < \infty, \quad \text{and} \quad \int a\, g(a)\, da = 0. \tag{5.4} \]

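Clearly the measure $dP_W$ is absolutely continuous with respect to Lebesgue measure on $\mathcal{A}_W^H$; this is part 1 of Assumption 2. Regarding the Wegner estimates, part 2 of Assumption 2, we then have the following

Theorem 7 (Wegner estimate). Under Assumption 4, the Wegner estimates (3.8) and (3.9) hold with $\sigma = \frac{1}{2}$ and $\kappa = 2\pi \operatorname{ess\,sup}_\lambda h(\lambda)$.

Proof. This result, which is obtained by averaging over the diagonal variables $\{d_j\}$ only, is a standard estimate from the theory of random Schrödinger operators, first obtained by Wegner [20]. For completeness, we sketch the proof. Note that $\|(V - A)^{-1}\| > t$ if and only if $V - A$ has an eigenvalue in the interval $(-\frac{1}{t}, \frac{1}{t})$. It follows that

\[ \operatorname{Prob}\left\{ \left\| (V - A)^{-1} \right\| > t \right\} \le \frac{2}{t^2}\, \mathbb{E}\left( \operatorname{tr} \left[ (V - A)^2 + \frac{1}{t^2} \right]^{-1} \right) = \frac{2}{t}\, \mathbb{E}\left( \operatorname{Im} \operatorname{tr} \left( V - A - \frac{i}{t} I \right)^{-1} \right) = \frac{2}{t} \sum_{i=1}^{W} \mathbb{E}\left( \operatorname{Im} \left\langle e_i, \left( V - A - \frac{i}{t} I \right)^{-1} e_i \right\rangle \right). \tag{5.5} \]

By the Schur complement formula,

\[ \left\langle e_i, \left( V - A - \frac{i}{t} I \right)^{-1} e_i \right\rangle = \frac{1}{ \frac{1}{\sqrt{W}} d_i - \frac{1}{t} i - \gamma }, \tag{5.6} \]

where $\gamma$ is a function of all matrix elements of $V$ except $d_i$. Thus $\gamma$ is a random variable independent of $d_i$, so

\[ \mathbb{E}\left( \operatorname{Im} \left\langle e_i, \left( V - A - \frac{i}{t} I \right)^{-1} e_i \right\rangle \right) = \mathbb{E}\left( \frac{ \frac{1}{t} + \operatorname{Im} \gamma }{ \left( \frac{1}{\sqrt{W}} d_i - \operatorname{Re} \gamma \right)^2 + \left( \frac{1}{t} + \operatorname{Im} \gamma \right)^2 } \right) \le \|h\|_\infty\, \pi \sqrt{W}, \tag{5.7} \]

where the inequality follows from replacing the average $\int \bullet\, h(d_i)\, dd_i$ by the upper bound $\|h\|_\infty \int_{\mathbb{R}} \bullet\, dd_i$. Summing over $i$ gives the result.

The proof of (3.9) is analogous. However in that case the trace is over a $2W$ dimensional space, resulting in the additional factor of 2 on the r.h.s. of that equation.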
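The key step in (5.7) is that a Lorentzian in $d_i/\sqrt{W}$ integrates to $\pi\sqrt{W}$ over the whole line, whatever the values of $\operatorname{Re}\gamma$ and $\frac{1}{t} + \operatorname{Im}\gamma$. As a minimal sketch, assume for illustration that $h$ is the uniform density on $[-D, D]$, so that $\|h\|_\infty = 1/(2D)$. Write $a$ for $\operatorname{Re}\gamma$ and $b > 0$ for $\frac{1}{t} + \operatorname{Im}\gamma$. The function names are hypothetical.

```python
import math


def lorentz_average_uniform(D, W, a, b):
    """Exact average of b / ((d/sqrt(W) - a)^2 + b^2) for d uniform on [-D, D].

    Uses the antiderivative sqrt(W) * atan((d/sqrt(W) - a) / b), valid for b > 0.
    """
    rt = math.sqrt(W)
    hi = math.atan((D / rt - a) / b)
    lo = math.atan((-D / rt - a) / b)
    return rt * (hi - lo) / (2 * D)


def lorentz_average_midpoint(D, W, a, b, steps=20000):
    """Same average computed by a midpoint Riemann sum, as a cross-check."""
    rt = math.sqrt(W)
    width = 2 * D / steps
    total = 0.0
    for k in range(steps):
        d = -D + (k + 0.5) * width
        total += b / ((d / rt - a) ** 2 + b ** 2)
    return total * width / (2 * D)


def wegner_bound(h_sup, W):
    """Right-hand side of (5.7): sup(h) * pi * sqrt(W)."""
    return h_sup * math.pi * math.sqrt(W)
```

Because the difference of two arctangents never exceeds $\pi$, `lorentz_average_uniform(D, W, a, b)` stays below `wegner_bound(1 / (2 * D), W)` for every $a$ and every $b > 0$, including very small $b$ (large $t$), which is the regime the Wegner estimate is used in.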

The scaling factor $\sqrt{W}$ that appears in (5.1) is natural, as with this scaling the matrix $V$ has a finite density of states in the large $W$ limit [19]:

\[ \lim_{W \to \infty} \frac{1}{W} \int_{\mathcal{A}_W^H} \operatorname{tr} f(V)\, dP_W(V) = \frac{1}{2\sigma^2 \pi} \int_{-2\sigma}^{2\sigma} f(\lambda) \sqrt{4\sigma^2 - \lambda^2}\, d\lambda, \tag{5.8} \]

with $\sigma^2 = \int |a|^2 g(a)\, da$. A key fact below is the following related result

Theorem 8. (Bai and Yin [4]) Let $V$ be a $W \times W$ random matrix of the form (5.1), with $\{d_i\}$ and $\{a_{i,j}\}$ mutually independent sets of independent random variables. If $\mathbb{E}(|d_i|) < \infty$, $\mathbb{E}(|a_{i,j}|^4) < \infty$, $\mathbb{E}(a_{i,j}) = 0$, and $\sigma^2 = \mathbb{E}(|a_{i,j}|^2)$, then

\[ \lim_{W \to \infty} \operatorname{Prob}\left[ \|V\| > 2\sigma + \eta \right] = 0. \tag{5.9} \]

Remark. This follows from Theorem A of ref. [4], which gives the convergence of $\lambda_1$, the largest eigenvalue of $V$, to $2\sigma$ with probability one. Symmetrizing the assumptions of Theorem A and applying the result also to show that $\lambda_W$, the smallest eigenvalue of $V$, converges to $-2\sigma$, this result follows. (The proof in [4] is written out in the real symmetric case, but carries over to the complex hermitian case with only very minor modifications.)

Corollary 9. Under Assumption 4, we may find $p_0, L > 0$ such that

\[ \operatorname{Prob}\left[ \|V\| \le L \right] \ge p_0. \tag{5.10} \]

We require very little of the off diagonal blocks $T_j$. They need only satisfy the estimate (3.13) analogous to (3.10) and (5.10). In particular, they could be deterministic, say $T_j = I$ for all $j$ or $T_j$ given by a Toeplitz matrix. In this section we consider a few examples of random off-diagonal blocks modeled on the blocks for the Gaussian band ensemble (1.1). In that case, the off-diagonal blocks $T_j$ are lower triangular matrices with Gaussian entries. More generally we may suppose


Assumption 5. The distribution of the off-diagonal blocks, $dQ_W(T)$, written in terms of the matrix elements

\[ T = \frac{1}{\sqrt{W}} \begin{pmatrix} 0 & \cdots & \cdots & \cdots & 0 \\ t_{2,1} & 0 & & & \vdots \\ \vdots & \ddots & \ddots & & \vdots \\ \vdots & & \ddots & 0 & \vdots \\ t_{W,1} & \cdots & \cdots & t_{W,W-1} & 0 \end{pmatrix}, \tag{5.11} \]

has the form

\[ dQ_W(T) = \prod_{1 \le j < i \le W} d\mu(t_{i,j}), \tag{5.12} \]

where either (R) $\mu$ is a probability measure on $\mathbb{R}$ or (C) $\mu$ is a probability measure on $\mathbb{C}$, and

\[ \int |t|^4\, d\mu(t) < \infty, \quad \text{and} \quad \int t\, d\mu(t) = 0. \tag{5.13} \]

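Theorem 10. Under Assumption 5, we may find $q_0, \tau > 0$ such that Assumption 3 holds with $b = 0$, i.e.,

\[ \operatorname{Prob}\left[ \|T\| \le \tau \right] \ge q_0. \tag{5.14} \]

Proof. It follows from [4, Theorem A] that, with $\sigma^2 = \int |t|^2\, d\mu(t)$,

\[ \lim_{W \to \infty} \operatorname{Prob}\left[ \left\| T + T^\dagger \right\| > \sigma + \eta \right] = 0, \qquad \lim_{W \to \infty} \operatorname{Prob}\left[ \left\| i(T - T^\dagger) \right\| > \sigma + \eta \right] = 0, \tag{5.15} \]

for any $\eta > 0$. Since

\[ T = \frac{1}{2}(T + T^\dagger) + \frac{1}{2i}\, i(T - T^\dagger), \tag{5.16} \]

it follows that

\[ \lim_{W \to \infty} \operatorname{Prob}\left[ \|T\| > \sigma + \eta \right] = 0. \tag{5.17} \]

Thus (5.14) holds.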


5.2. Fluctuation regularity. Particular examples of distributions satisfying Assumption 4 are the Gaussian Orthogonal Ensemble (GOE), corresponding to case (R), and the Gaussian Unitary Ensemble (GUE), corresponding to case (C). In these cases, the measure $P$ is of the form

\[ dP(V) \propto e^{-\beta W \operatorname{tr} V^2}\, dV, \quad V \in \mathcal{A}_W^H, \tag{5.18} \]

with $\beta = 1$ (R) or $\beta = 2$ (C). That is,

\[ h(d) = \sqrt{\frac{\beta}{\pi}}\, e^{-\beta d^2}, \qquad g(a) = \left( \frac{2\beta}{\pi} \right)^{\beta/2} e^{-2\beta |a|^2}. \tag{5.19} \]

Theorem 11. If $V$ is a GUE or GOE matrix of size $W$ then Assumption 2 of Sect. 3 holds with $\sigma = \frac{1}{2}$, $\zeta = 2$ and $a = 0$.

Corollary 12. Assumptions 1, 2, and 3 hold for the Gaussian band ensemble (1.1).

Proof. We have already derived the Wegner estimates (Thm. 7). It remains only to show the fluctuation regularity. For the Gaussian ensembles, we have

\[ \frac{\rho(V_1)}{\rho(V_2)} = e^{-\beta W \operatorname{tr}(V_1^2 - V_2^2)} = e^{-\beta W \operatorname{tr}(V_1 - V_2)(V_1 + V_2)} \ge e^{-\beta W^2 \|V_1 - V_2\| \|V_1 + V_2\|}. \tag{5.20} \]

If $\|V_1 - V\|, \|V_2 - V\| \le \epsilon W^{-2}$ we have

\[ \frac{\rho(V_1)}{\rho(V_2)} \ge e^{-2\beta\epsilon \left( \|V\| + \epsilon W^{-2} \right)}. \tag{5.21} \]

Letting $p_0$ and $L$ be as in Cor. 9, we set $\Omega_W := \{ \|V\| \le L \}$. Then $\operatorname{Prob}(\Omega_W) \ge p_0 > 0$ and if $V \in \Omega_W$, we have

\[ \frac{\rho(V_1)}{\rho(V_2)} \ge e^{-2\beta\epsilon (L + \epsilon)} =: \delta, \tag{5.22} \]

whenever $\|V_1 - V\|, \|V_2 - V\| \le \epsilon W^{-2}$.

To obtain fluctuation regularity (3.11) for general Wigner matrices we require additional assumptions on $h$ and $g$. For instance, we have the following

Theorem 13. If $V$ satisfies Assumption 4 with $\ln h$ and $\ln g$ uniformly Hölder continuous with exponent $\alpha$, then Assumption 2 of Sect. 3 holds with $\sigma = \frac{1}{2}$, $\zeta = \frac{2}{\alpha} + \frac{1}{2}$ and $a = 0$.

Remark. For example $h(\lambda) = g(\lambda) = c_\alpha e^{-|\lambda|^\alpha}$ with $0 < \alpha \le 1$ satisfies the hypotheses of the theorem.

Proof. We have

\[ \begin{aligned} \frac{\rho(V_1)}{\rho(V_2)} &= \exp\Bigg( \sum_i \left[ \ln h(d_{i;1}) - \ln h(d_{i;2}) \right] + \sum_{i,j} \left[ \ln g(a_{i,j;1}) - \ln g(a_{i,j;2}) \right] \Bigg) \\ &\ge \exp\Bigg[ -C \Bigg( \sum_i |d_{i;1} - d_{i;2}|^\alpha + \sum_{i,j} |a_{i,j;1} - a_{i,j;2}|^\alpha \Bigg) \Bigg]. \end{aligned} \tag{5.23} \]

If $\|V_1 - V\|, \|V_2 - V\| \le \epsilon W^{-\zeta}$, then

\[ |d_{i;1} - d_{i;2}|, \ |a_{i,j;1} - a_{i,j;2}| \le \sqrt{W} \|V_1 - V\| + \sqrt{W} \|V_2 - V\| \le 2\epsilon W^{\frac{1}{2} - \zeta}. \]

It follows that

\[ \frac{\rho(V_1)}{\rho(V_2)} \ge \exp\left[ -C W^2 \left( 2\epsilon W^{\frac{1}{2} - \zeta} \right)^\alpha \right] = e^{-C (2\epsilon)^\alpha} =: \delta. \tag{5.24} \]

This estimate holds for every $V$, so in particular for all $V$ in $\Omega_W = \{ \|V\| \le L \}$.

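Theorem 13 cannot apply if $h$ or $g$ has compact support. Nonetheless compactly supported densities can be handled. A general result of this type would be somewhat involved to state, so let us simply note that Assumption 2 holds if $h$ and $g$ are characteristic functions of open neighborhoods of the origin.

Theorem 14. Suppose that $V$ satisfies Assumption 4 and that

\[ h(d) = \frac{1}{2D} I[|d| < D], \qquad g(a) = \frac{1}{c_\beta A^\beta} I[|a| < A], \]

with $c_1 = 2$ and $c_2 = \pi$. Then Assumption 2 of Sect. 3 holds with $\sigma = \frac{1}{2}$, $\zeta = \frac{5}{2}$ and $a = 0$.

Proof. Clearly the moment conditions of Assumption 4 hold. Thus by Cor. 9 we can find $p_0$ and $L$ so that (5.10) holds.

Now suppose $\|V_1 - V\|, \|V_2 - V\| \le \epsilon W^{-\frac{5}{2}}$. Suppose also that the matrix elements of $V$ satisfy $W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \epsilon W^{-\frac{5}{2}}$ and $W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \epsilon W^{-\frac{5}{2}}$ for all $i, j$. Then

\[ \frac{\rho(V_1)}{\rho(V_2)} = 1. \tag{5.25} \]

But

\[ \begin{aligned} &\operatorname{Prob}\left( W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \epsilon W^{-\frac{5}{2}}, \ W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \epsilon W^{-\frac{5}{2}} \right) \\ &\quad \ge 1 - \sum_i \operatorname{Prob}\left( |d_i| > D - \epsilon W^{-2} \right) - \sum_{i,j} \operatorname{Prob}\left( |a_{i,j}| > A - \epsilon W^{-2} \right) \\ &\quad \ge 1 - C\epsilon. \end{aligned} \tag{5.26} \]

Now let

\[ \Omega_W = \left\{ \|V\| \le L, \ W^{-\frac{1}{2}} |d_i| \le W^{-\frac{1}{2}} D - \epsilon W^{-\frac{5}{2}}, \ \text{and} \ W^{-\frac{1}{2}} |a_{i,j}| \le W^{-\frac{1}{2}} A - \epsilon W^{-\frac{5}{2}} \right\}, \tag{5.27} \]

with $\epsilon$ sufficiently small that

\[ \operatorname{Prob}(\Omega_W) \ge p_0 - C\epsilon > 0. \tag{5.28} \]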


5.3. Summary. Putting the results of this section together with Thm. 6 we have:

Theorem 15. Let $\mathcal{A}_W = \mathcal{T}_W =$ set of $W \times W$ matrices with real or complex entries and suppose $P$ and $Q$ satisfy Assumptions 4 and 5.

(1) If $P$ is either the Gaussian orthogonal or Gaussian unitary ensemble, then given $r > 0$ and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that

\[ \mathbb{E}\left( \left| \left\langle e_i, (X_{W;N} - \lambda)^{-1} e_j \right\rangle \right|^s \right) \le A_s W^{\frac{s}{2}} e^{-\alpha_s \frac{|i-j|}{W^8}}, \quad \lambda \in [-r, r]. \tag{5.29} \]

In particular, (5.29) holds for the Gaussian band ensemble (1.1).

(2) If $\ln h$ and $\ln g$ are uniformly Hölder continuous with exponent $\alpha$, then given $r > 0$ and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that

\[ \mathbb{E}\left( \left| \left\langle e_i, (X_{W;N} - \lambda)^{-1} e_j \right\rangle \right|^s \right) \le A_s W^{\frac{s}{2}} e^{-\alpha_s \frac{|i-j|}{W^\mu}}, \quad \lambda \in [-r, r], \tag{5.30} \]

with $\mu = 5 + \frac{4}{\alpha}$.

(3) If $h$ and $g$ are proportional to characteristic functions of open neighborhoods of the origin, then given $r > 0$ and $s \in (0, 1)$ there are $A_s < \infty$ and $\alpha_s > 0$ such that (5.30) holds with $\mu = 9$.

Appendix A. A Lemma on Conditional Averages

In the proofs of the various versions of Lemma F above, a key step was to estimate averages of the form

\[ \mathbb{E}\left( e^{-\sum_{j=1}^{n} U_j} \right) \tag{A.1} \]

in which $U_j$ are non-negative, strictly positive with good probability, but not independent. The following lemma gives the relevant estimate, which can be seen as a simple version of stochastic domination. As the proof shows, under appropriate assumptions, we can estimate (A.1) in terms of the same expression with $U_j$ replaced by i.i.d. non-negative Bernoulli variables taking 0 with probability less than 1.

Lemma A.1. Let $\Sigma_j$ be a sequence of $\sigma$-algebras of events on a probability space and let $U_j$ be a sequence of non-negative random variables with $U_j$ measurable with respect to $\Sigma_k$ for $k \ne j$. If for some $\delta > 0$, $\operatorname{Prob}(U_j \ge \delta \,|\, \Sigma_j) \ge p_0$ for each $j$, then

\[ \mathbb{E}\left( e^{-\sum_{j=1}^{n} U_j} \right) \le e^{-(1 - e^{-\delta}) p_0 n}. \]

Proof. This follows by induction, since

\[ \mathbb{E}\left( e^{-\sum_{j=1}^{n} U_j} \,\Big|\, \Sigma_n \right) = e^{-\sum_{j=1}^{n-1} U_j}\, \mathbb{E}\left( e^{-U_n} \,\big|\, \Sigma_n \right) \le \left[ (1 - p_0) + e^{-\delta} p_0 \right] e^{-\sum_{j=1}^{n-1} U_j} \]

and $(1 - p_0) + e^{-\delta} p_0 \le e^{-(1 - e^{-\delta}) p_0}$.
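The last inequality in the proof is the elementary bound $1 - x \le e^{-x}$ applied to $x = (1 - e^{-\delta}) p_0$. As a minimal numerical sketch, the functions below compare the one-step factor with the exponential bound and evaluate the independent two-point case, in which each $U_j$ equals $\delta$ with probability $p_0$ and $0$ otherwise. The function names are hypothetical and the independent case is only an illustration; the lemma itself does not assume independence.

```python
import math


def one_step_factor(p0, delta):
    """The factor (1 - p0) + exp(-delta) * p0 gained at each induction step."""
    return (1 - p0) + math.exp(-delta) * p0


def lemma_bound(p0, delta, n):
    """Right-hand side of Lemma A.1: exp(-(1 - exp(-delta)) * p0 * n)."""
    return math.exp(-(1 - math.exp(-delta)) * p0 * n)


def iid_two_point_expectation(p0, delta, n):
    """E exp(-sum U_j) for independent U_j in {0, delta} with P(U_j = delta) = p0."""
    return one_step_factor(p0, delta) ** n
```

In this independent case the expectation is exactly the $n$-th power of the one-step factor, so the comparison with `lemma_bound` reduces to the scalar inequality used in the proof.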


Acknowledgments. I would like to thank Tom Spencer and Michael Aizenman for many interesting discussions related to this and other works, and to express my gratitude for the hospitality extended to me by the Institute for Advanced Study, where I was a member when this project started, and more recently by the Isaac Newton Institute during my stay associated with the program Mathematics and Physics of Anderson Localization: 50 Years After.

References

1. Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: An elementary derivation. Commun. Math. Phys. 157(2), 245–278 (1993)
2. Aizenman, M., Schenker, J.H., Friedrich, R.M., Hundertmark, D.: Finite-volume fractional-moment criteria for Anderson localization. Commun. Math. Phys. 224(1), 219–253 (2001)
3. Aizenman, M.: Localization at weak disorder: some elementary bounds. Rev. Math. Phys. 6(5A), 1163–1182 (1994). Special issue dedicated to Elliott H. Lieb
4. Bai, Z.D., Yin, Y.Q.: Necessary and sufficient conditions for almost sure convergence of the largest eigenvalue of a Wigner matrix. Ann. Probab. 16(4), 1729–1741 (1988)
5. Bellissard, J.V., Hislop, P.D., Stolz, G.: Correlation estimates in the Anderson model. J. Stat. Phys. 129(4), 649–662 (2007)
6. Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Probability and its Applications. Boston, MA: Birkhäuser Boston Inc., 1990
7. Casati, G., Molinari, L., Izrailev, F.: Scaling properties of band random matrices. Phys. Rev. Lett. 64(16), 1851–1854 (1990)
8. Casati, G., Chirikov, B.V., Guarneri, I., Izrailev, F.M.: Band-random-matrix model for quantum localization in conservative systems. Phys. Rev. E 48(3), R1613 (1993)
9. Combes, J.-M., Germinet, F., Klein, A.: Generalized eigenvalue-counting estimates for the Anderson model. http://arxiv.org/abs/0804.3202V2[math-ph], 2008
10. Dobrushin, R.L., Shlosman, S.B.: Absence of breakdown of continuous symmetry in two-dimensional models of statistical physics. Commun. Math. Phys. 42(1), 31–40 (1975)
11. Dyson, F.J.: Statistical theory of the energy levels of complex systems. I. J. Math. Phys. 3(1), 140 (1962)
12. Dyson, F.J.: Statistical theory of the energy levels of complex systems. II. J. Math. Phys. 3(1), 157 (1962)
13. Fyodorov, Y.V., Mirlin, A.D.: Scaling properties of localization in random band matrices: A σ-model approach. Phys. Rev. Lett. 67(18), 2405–2409 (1991)
14. Graf, G.-M., Vaghi, A.: A remark on the estimate of a determinant by Minami. Lett. Math. Phys. 79(1), 17–22 (2007)
15. Klein, A., Molchanov, S.: Simplicity of eigenvalues in the Anderson model. J. Stat. Phys. 122(1), 95–99 (2006)
16. Kunz, H., Souillard, B.: Sur le spectre des opérateurs aux différences finies aléatoires. Commun. Math. Phys. 78, 201–246 (1980)
17. Mermin, N.D., Wagner, H.: Absence of ferromagnetism or antiferromagnetism in one- or two-dimensional isotropic Heisenberg models. Phys. Rev. Lett. 17(22), 1133–1136 (1966)
18. Minami, N.: Local fluctuation of the spectrum of a multidimensional Anderson tight binding model. Commun. Math. Phys. 177(3), 709–725 (1996)
19. Molchanov, S.A., Pastur, L.A., Khorunzhii, A.M.: Limiting eigenvalue distribution for band random matrices. Theor. Math. Phys. 90(2), 108–118 (1992)
20. Wegner, F.: Bounds on the density of states in disordered systems. Zeit. Phys. B 44(1–2), 9–15 (1981)

Communicated by B. Simon
