Communications in Mathematical Physics - Volume 302


Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 302, 1–51 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1175-8

Communications in Mathematical Physics

Transition to Longitudinal Instability of Detonation Waves is Generically Associated with Hopf Bifurcation to Time-Periodic Galloping Solutions

Benjamin Texier¹, Kevin Zumbrun²

¹ Université Paris Diderot (Paris 7), Institut de Mathématiques de Jussieu, UMR CNRS 7586, 75205 Paris Cedex 13, France. E-mail: [email protected]

² Indiana University, Bloomington, IN 47405, USA. E-mail: [email protected]

Received: 17 December 2008 / Revised: 25 June 2010 / Accepted: 4 October 2010 Published online: 9 January 2011 – © Springer-Verlag 2011

Abstract: We show that transition to longitudinal instability of strong detonation solutions of reactive compressible Navier–Stokes equations is generically associated with Hopf bifurcation to nearby time-periodic “galloping”, or “pulsating”, solutions, in agreement with physical and numerical observation. In the process, we determine readily numerically verifiable stability and bifurcation conditions in terms of an associated Evans function, and obtain the first complete nonlinear stability result for strong detonations of the reacting Navier–Stokes equations, in the limit as amplitude (hence also heat release) goes to zero. The analysis is by pointwise semigroup techniques introduced by the authors and collaborators in previous works.

Research of B.T. was partially supported under NSF grant number DMS-0505780.

Research of K.Z. was partially supported under NSF grants no. DMS-0300487 and DMS-0801745.

Contents

1. Introduction
  1.1 The reacting Navier-Stokes equations
  1.2 Assumptions
  1.3 Coordinatizations
  1.4 Strong detonations
  1.5 Structure of the equations and the profiles
  1.6 The Evans function
  1.7 Results
    1.7.1 Stability
    1.7.2 Transition from stability to instability
    1.7.3 Nonlinear instability
  1.8 Verification of stability/bifurcation conditions
  1.9 Discussion and open problems
2. Strong Detonations
3. Resolvent Kernel and Green Function Bounds
  3.1 Laplace transform
    3.1.1 The limiting, constant-coefficient equations
    3.1.2 Low-frequency behaviour of the normal modes
    3.1.3 Description of the essential spectrum
    3.1.4 Gap Lemma and dual basis
    3.1.5 Duality relation and forward basis
    3.1.6 The resolvent kernel
    3.1.7 The Evans function
  3.2 Inverse Laplace transform
    3.2.1 Pointwise Green function bounds
    3.2.2 Convolution bounds
4. Stability: Proof of Theorem 1.14
  4.1 Linearized stability criterion
  4.2 Auxiliary energy estimate
  4.3 Nonlinear stability
5. Bifurcation: Proof of Theorem 1.18
  5.1 The perturbation equations
  5.2 Coordinatization
  5.3 Poincaré return map
  5.4 Lyapunov-Schmidt reduction
    5.4.1 Pointwise cancellation estimate
    5.4.2 Reduction
    5.4.3 Bifurcation
6. Nonlinear Instability: Proof of Theorem 1.19

1. Introduction

Motivated by physical and numerical observations of time-oscillatory “galloping” or “pulsating” instabilities of detonation waves [MT,BMR,FW,MT,AlT,AT,F1,F2,KS], we study stability and Hopf bifurcation of viscous detonation waves, or traveling-wave solutions of the reactive compressible Navier–Stokes equations. This extends a larger program begun in [Zl,LyZ1,LyZ2,JLW,LRTZ] toward the dynamical study of viscous combustion waves using Evans function/inverse Laplace transform techniques introduced in the context of viscous shock waves [GZ,ZH,ZS,Zl,MaZ3], continuing the line of investigation initiated in [TZ1,TZ2,SS,TZ3] on bifurcation/transition to instability.

It has long been observed that transition to instability of detonation waves occurs in certain predictable ways, with the archetypal behavior in the case of longitudinal, or one-dimensional instability being transition from a steady planar progressing wave $U(x,t) = \bar U(x_1 - st)$ to a galloping, or time-periodic planar progressing wave $\tilde U(x_1 - st, t)$, where $\tilde U$ is periodic in the second coordinate, and in the case of transverse, or multi-dimensional instability, transition to more complicated “spinning” or “cellular” behavior; see [KS,TZ1,TZ2], and references therein.

The purpose of this paper is, restricting to the one-dimensional case, to establish this principle rigorously, arguing from first principles from the physical equations that transition to longitudinal instability of detonation waves is generically associated with Hopf bifurcation to time-periodic galloping solutions, not only at the spectral but also at the full nonlinear level. In the process, we establish the first full nonlinear stability results for


strong detonations of the reacting Navier–Stokes equations, extending the sole previous result obtained by Tan–Tesei [TT] for the special class of initial perturbations with zero integral.

1.1. The reacting Navier-Stokes equations. The single-species reactive compressible Navier–Stokes equations, in Lagrangian coordinates, appear as [Ch]

$$
\left\{
\begin{aligned}
&\partial_t \tau - \partial_x u = 0,\\
&\partial_t u + \partial_x p = \partial_x(\nu\tau^{-1}\partial_x u),\\
&\partial_t E + \partial_x(pu) = \partial_x\big(q d\tau^{-2}\partial_x z + \kappa\tau^{-1}\partial_x T + \nu\tau^{-1}u\,\partial_x u\big),\\
&\partial_t z + k\phi(T)z = \partial_x(d\tau^{-2}\partial_x z),
\end{aligned}
\right.
\tag{1.1}
$$

where $\tau > 0$ denotes specific volume, $u$ velocity, $E > 0$ total specific energy, and $0 \le z \le 1$ mass fraction of the reactant. The variable $U := (\tau, u, E, z) \in \mathbb{R}^4$ depends on time $t \in \mathbb{R}_+$, position $x \in \mathbb{R}$, and parameters $\nu, \kappa, d, k, q$, where $\nu > 0$ is a viscosity coefficient, $\kappa > 0$ and $d > 0$ are respectively coefficients of heat conduction and species diffusion, $k > 0$ represents the rate of the reaction, and $q$ is the heat release parameter, with $q > 0$ corresponding to an exothermic reaction and $q < 0$ to an endothermic reaction. In (1.1), $T = T(\tau, e, z) > 0$ represents temperature, $p = p(\tau, e, z)$ pressure, where the internal energy $e > 0$ is defined through the relation

$$E = e + \tfrac{1}{2}u^2 + qz.$$

In (1.1), we assume a simple one-step, one-reactant, one-product reaction

$$A \xrightarrow{\;k\phi(T)\;} B, \qquad z := [A], \qquad [A] + [B] = 1,$$

where $\phi$ is an ignition function. More realistic reaction models are described in [GS2].

In the variable $U$, after the shift $x \to x - st$, $s \in \mathbb{R}$, the system (1.1) takes the form of a system of differential equations

$$\partial_t U + \partial_x(F(U)) = \partial_x(B(U)\partial_x U) + G(U), \tag{1.2}$$

where

$$
F := \begin{pmatrix} -u \\ p \\ pu \\ 0 \end{pmatrix} - s(\varepsilon)U,
\qquad
G := \begin{pmatrix} 0 \\ 0 \\ 0 \\ -k\phi(T)z \end{pmatrix},
$$

and

$$
B := \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & \nu\tau^{-1} & 0 & 0 \\
\kappa\tau^{-1}\partial_\tau T & -\kappa u\tau^{-1}\partial_e T + \nu\tau^{-1}u & \kappa\tau^{-1}\partial_e T & \kappa\tau^{-1}(\partial_z T - q\,\partial_e T) + q d\tau^{-2} \\
0 & 0 & 0 & d\tau^{-2}
\end{pmatrix}.
$$


The characteristic speeds of the first-order part of (1.1), i.e., the eigenvalues of $\partial_U F(U)$, are

$$\{\underbrace{-s-\sigma,\ -s,\ -s+\sigma}_{\text{fluid eigenvalues}},\ \underbrace{-s}_{\text{reactive eigenvalue}}\}, \tag{1.3}$$

where $\sigma$, the sound speed of the gas, is

$$\sigma := (p\,\partial_e p - \partial_\tau p)^{1/2} = \tau^{-1}(\Gamma(\Gamma+1)e)^{1/2}.$$

1.2. Assumptions. We make the following assumptions:

Assumption 1.1. We assume a reaction-independent ideal gas equation of state,

$$p = \Gamma\tau^{-1}e, \qquad T = c^{-1}e, \tag{1.4}$$

where $c > 0$ is the specific heat constant and $\Gamma$ is the Gruneisen constant.

Assumption 1.2. The ignition function $\phi$ is smooth; it vanishes identically for $T \le T_i$, and is strictly positive for $T > T_i$.

Remark 1.3. A typical ignition function is given by the modified Arrhenius law

$$\phi(T) = C e^{-\mathcal{E}/(T - T_i)}, \tag{1.5}$$

where $\mathcal{E}$ is activation energy.

Remark 1.4. The specific choice (1.4) is made for concreteness/clarity of exposition. Our results remain valid for any reaction-independent equation of state with $p_\tau < 0$, $p_e > 0$, and $T_e > 0$.¹ With further effort, reaction-dependence should be treatable as well.

¹ An obvious exception is Lemma 1.6, which depends on specific structure.

1.3. Coordinatizations. We let

$$w := (u, E, z) \in \mathbb{R}^3, \qquad v := (\tau, u, E) \in \mathbb{R}^3.$$

Then we have the coordinatizations $U = (v, z) = (\tau, w)$. In particular, Assumption 1.1 implies that in the $(\tau, w)$ coordinatization, $B$ takes the block-diagonal form

$$B = \begin{pmatrix} 0 & 0 \\ 0 & b \end{pmatrix},$$

where $b$ is full rank for all values of the parameters and $U$; the system (1.2) in $(\tau, w)$ coordinates is

$$
\begin{aligned}
\partial_t \tau - s\,\partial_x \tau - J\,\partial_x w &= 0,\\
\partial_t w + \partial_x f(\tau, w) &= \partial_x(b(\tau, w)\partial_x w) + g(w),
\end{aligned}
$$


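with the notation

$$
J := \begin{pmatrix} 1 & 0 & 0 \end{pmatrix},
\qquad
f := \begin{pmatrix} p \\ pu \\ 0 \end{pmatrix} - sw,
\qquad
g := \begin{pmatrix} 0 \\ 0 \\ -k\phi(T)z \end{pmatrix}.
\tag{1.6}
$$

In the $(v, z)$ coordinatization, the system (1.2) takes the form

$$
\begin{aligned}
\partial_t v + \partial_x f(v, z) &= \partial_x(b_1(v)\partial_x v + b_2(v)\partial_x z),\\
\partial_t z - s\,\partial_x z + k\phi(T)z &= \partial_x(d\tau^{-2}\partial_x z),
\end{aligned}
$$

where the flux is $f = (-u - s\tau,\ p - su,\ pu - sE)$, and, under Assumption 1.1, the diffusion matrices are

$$
b_1 = \begin{pmatrix}
0 & 0 & 0 \\
0 & \nu\tau^{-1} & 0 \\
0 & \tau^{-1}(\nu - \kappa c^{-1})u & \kappa\tau^{-1}c^{-1}
\end{pmatrix},
\qquad
b_2 = \begin{pmatrix} 0 \\ 0 \\ q(d\tau^{-2} - \kappa\tau^{-1}c^{-1}) \end{pmatrix}.
$$

Note that, in the $(v, z)$ coordinatization, the first component is a conservative variable, in the sense that $\partial_t v$ is a perfect derivative, hence

$$\int_{\mathbb{R}} (v(x,t) - v(x,0))\,dx \equiv 0, \tag{1.7}$$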

for $v(t) - v(0) \in W^{2,1}(\mathbb{R})$.

1.4. Strong detonations. We prove in this article stability and bifurcation results for viscous strong detonations of (1.1), defined as follows:

Definition 1.5. A one-parameter, right-going family of viscous strong detonations is a family $\{\bar U^\varepsilon\}_{\varepsilon \in \mathbb{R}}$ of smooth stationary solutions of (1.2), associated with speeds $s(\varepsilon)$, $s(\varepsilon) > 0$, model parameters $(\nu, \kappa, d, k, q)(\varepsilon)$ and ignition function $\phi^\varepsilon$, with $\bar U^\varepsilon$, $\phi^\varepsilon$, $(s, \nu, \kappa, d, k, q)(\varepsilon)$ depending smoothly on $\varepsilon$ in $L^\infty \times L^\infty \times \mathbb{R}^6$, satisfying

$$\bar U^\varepsilon(x, t) = \bar U^\varepsilon(x), \qquad \lim_{x \to \pm\infty} \bar U^\varepsilon(x) = U_\pm^\varepsilon, \tag{1.8}$$

connecting a burned state on the left to an unburned state on the right,

$$z_-^\varepsilon \equiv 0, \qquad z_+^\varepsilon \equiv 1, \tag{1.9}$$

with a temperature on the burned side above ignition temperature

$$T_-^\varepsilon > T_i, \tag{1.10}$$

and satisfying the Lax characteristic conditions

$$\sigma_- := \sigma(U_-^\varepsilon) > s > \sigma_+ := \sigma(U_+^\varepsilon), \quad \text{uniformly in } \varepsilon. \tag{1.11}$$


Fig. 1. Characteristic speeds for strong detonations

Consider a standing wave (1.8), $U = (\tau, u, E, z)$, solution of (1.2), with endstates $U_\pm = (\tau_\pm, u_\pm, E_\pm, z_\pm)$. It satisfies the linear constraint

$$-s(\tau - \tau_-) = u - u_-,$$

the system of ordinary differential equations (Fig. 1)

$$
\left\{
\begin{aligned}
&\nu\tau^{-1}u' = p - su - (p - su)_-,\\
&\kappa\tau^{-1}c^{-1}E' + \tau^{-1}(\nu - \kappa c^{-1})uu' = pu - sE - (pu - sE)_- + (\kappa\tau^{-1}c^{-1} - d\tau^{-2})qy,\\
&z' = y,\\
&d\tau^{-2}y' = -sy + k\phi(T)z,
\end{aligned}
\right.
\tag{1.12}
$$

and the Rankine-Hugoniot relations

$$
\left\{
\begin{aligned}
-s(\tau_+ - \tau_-) &= u_+ - u_-,\\
(p - su)_+ &= (p - su)_-,\\
(pu - sE)_+ &= (pu - sE)_-,\\
y_\pm &= 0,\\
\phi(T_\pm)z_\pm &= 0,
\end{aligned}
\right.
\tag{1.13}
$$

expressing the fact that $(u_\pm, E_\pm, 0, z_\pm)$ are rest points of (1.12). From (1.11) and (1.13), we note that the right endstate of a strong detonation satisfies

$$\phi(T_+^\varepsilon) = 0, \tag{1.14}$$

which, by Assumption 1.2, implies also

$$\phi'(T_+^\varepsilon) = 0. \tag{1.15}$$

Lemma 1.6. Under Assumptions 1.1, 1.2, if $q > 0$ and $s$ is large enough with respect to $q$, then for any $z_+ \in (0, 1]$, there exists an open subset $\mathcal{O}_-$ in $\mathbb{R}^3$, such that any left endstate $U_- = (v_-, 0)$ with $v_- \in \mathcal{O}_-$ satisfies (1.10) and (1.11), and is associated with a right endstate $U_+ = (v_+, z_+)$ satisfying $T_+ < T_i$, (1.11) and (1.13).
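As a small numerical illustration of (1.14)–(1.15), the following Python sketch evaluates an ignition function of modified Arrhenius type, assumed here to be $\phi(T) = Ce^{-\mathcal{E}/(T - T_i)}$ for $T > T_i$ and zero otherwise, together with a centered difference quotient for its derivative. The function names and the parameter values ($C = \mathcal{E} = T_i = 1$) are illustrative assumptions and are not taken from the paper.

```python
import math


def phi(T, T_i=1.0, C=1.0, E_act=1.0):
    """Ignition function: zero for T <= T_i, C*exp(-E_act/(T - T_i)) above.

    The constants are hypothetical; only the cutoff structure of
    Assumption 1.2 matters for this illustration.
    """
    if T <= T_i:
        return 0.0
    return C * math.exp(-E_act / (T - T_i))


def dphi(T, h=1e-6):
    """Centered finite-difference approximation of phi'(T)."""
    return (phi(T + h) - phi(T - h)) / (2.0 * h)


# An unburned state with temperature below ignition: phi and phi' both vanish.
T_plus = 0.5
print(phi(T_plus), dphi(T_plus))

# A burned state above ignition: the reaction is active.
T_minus = 2.0
print(phi(T_minus) > 0.0)
```

Because the assumed $\phi$ vanishes identically on $T \le T_i$, both the value and the difference quotient are exactly zero at any $T_+ < T_i$, which is the content of (1.14)–(1.15) for such endstates.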


The existence of strong detonations was proved by Gasser and Szmolyan [GS1] for small dissipation coefficients $\nu$, $\kappa$ and $d$. We restrict throughout the article to strong detonations with left endstates as in the above lemma.

Remark 1.7. In the small-heat-release limit $q \to 0$, the equations in $(y, z)$ (in system (1.12)) are decoupled from the fluid equations; in particular, strong detonations converge to ordinary nonreacting gas-dynamical shocks of standard Lax type, the existence of which has been established by Gilbarg [G].

A consequence of Lemma 1.6 is that strong detonations converge exponentially to their endstates, a key fact of the subsequent stability and bifurcation analysis.

Corollary 1.8. Under Assumptions 1.1, 1.2, let $\{\bar U^\varepsilon\}_\varepsilon$ be a family of viscous strong detonations. There exist $C, \eta_0 > 0$, such that, for $k \ge 0$ and $j \in \{0, 1\}$,

$$
\begin{aligned}
|\partial_\varepsilon^j \partial_x^k(\bar U^\varepsilon - U_-^\varepsilon)(x)| &\le Ce^{-\eta_0|x|}, && x < 0,\\
|\partial_\varepsilon^j \partial_x^k(\bar U^\varepsilon - U_+^\varepsilon)(x)| &\le Ce^{-\eta_0|x|}, && x > 0.
\end{aligned}
\tag{1.16}
$$

In particular, $|(\bar U^\varepsilon)'(x)| \le Ce^{-\eta_0|x|}$, for all $x$.

Remark 1.9. In the ZND limit, strong detonations are transverse orbits of (1.12), a result proved in Sect. 3.6 of [LyZ2], following [GS1].

Lemma 1.6 and Corollary 1.8 are proved in Sect. 2.

1.5. Structure of the equations and the profiles. System (1.2), seen as a system in $\tau, w$, satisfies

(A1) the convection terms in the equation in $\tau$ are linear in $(\tau, w)$;

(A2) the diffusion matrix $b$ is positive definite.

For strong detonation waves, the convection terms in (1.1) satisfy

(H1) The convection coefficient $s(\varepsilon)$ in the evolution equation in $\tau$ is nonzero, uniformly in $\varepsilon$.

(H2) The spectrum of $\partial_U F$, given in (1.3), is real, simple, and nonzero, uniformly in $\varepsilon$.

System (1.2) satisfies the Kawashima dissipativity condition

(H3) For all $\varepsilon$, for all $\xi \in \mathbb{R}$,

$$\Re\,\sigma\big(i\xi\,\partial_U F(U_\pm^\varepsilon) - \xi^2 B(U_\pm^\varepsilon) + \partial_U G(U_\pm^\varepsilon)\big) \le -\frac{\theta\xi^2}{1 + \xi^2},$$

at the endstates $U_\pm^\varepsilon$ of a family of strong detonations. In (H3), $\sigma$ denotes spectrum of a matrix, and $\theta > 0$ is independent of $\xi$ and $\varepsilon$. To verify (H3), it suffices, by a classical result of [ShK], to check that (1.2) has a symmetrizable hyperbolic-parabolic structure, and that the genuine coupling condition holds. These conditions are coordinate-independent, and easily checked in $(\tau, u, e)$ coordinates.

Finally, the assumption

(H4) Considered as connecting orbits of (1.12), $\bar U^\varepsilon$ lie in a smooth one-dimensional manifold of solutions of (1.12), obtained as a transversal intersection of the unstable manifold at $U_-^\varepsilon$ and the stable manifold at $U_+^\varepsilon$,


holds in the ZND limit, as stated in Remark 1.9. Under (H4), in a vicinity of $\bar U^\varepsilon$, the set of stationary solutions of (1.2) with limits $U_\pm^\varepsilon$ at $\pm\infty$ is a smooth one-dimensional manifold, given by $\{\bar U^\varepsilon(\cdot - c),\ c \in \mathbb{R}\}$, and the associated speed $\varepsilon \to s(\varepsilon)$ is smooth.

Conditions (A1)–(A2), (H0)–(H4) are the assumptions of [TZ3] (where $G \equiv 0$), themselves a strengthened version of the assumptions of [MaZ3].

1.6. The Evans function. A central object in the study of stability of traveling waves is the Evans function $D(\varepsilon, \cdot)$ (precisely defined in Sect. 3.1.7), a Wronskian of solutions of the eigenvalue equation $(L(\varepsilon) - \lambda)U = 0$ decaying at plus or minus spatial infinity [AGJ],² where the linearized operator $L$ is defined as

$$L(\varepsilon) := -\partial_x(A\,\cdot) + \partial_x(B(\bar U^\varepsilon)\partial_x\,\cdot) + \partial_U G(\bar U^\varepsilon), \tag{1.17}$$

with the notation

$$A := -\partial_U F(\bar U^\varepsilon) + (\partial_U B(\bar U^\varepsilon)\,\cdot)(\bar U^\varepsilon)'. \tag{1.18}$$

Recall the important result of [LyZ2]:

Proposition 1.10 ([LyZ2], Theorem 4). Under Assumptions 1.1 and 1.2, let $\{\bar U^\varepsilon\}_\varepsilon$ be a one-parameter family of viscous strong detonation waves satisfying (H4). For all $\varepsilon$, the associated Evans function has a zero of multiplicity one at $\lambda = 0$:

$$D(\varepsilon, 0) = 0, \quad \text{and} \quad D'(\varepsilon, 0) \ne 0.$$

Proof. By translational invariance, $D(\varepsilon, 0) = 0$, for all $\varepsilon$. Generalizing similar results known for shock waves [GZ,ZS], there was established in [Zl,LyZ1,LyZ2] the fundamental relation

$$D'(\varepsilon, 0) = \gamma\delta. \tag{1.19}$$

In (1.19), $\gamma$ is a coefficient given as a Wronskian of solutions of the linearized traveling-wave ODE about $\bar U$; transversality corresponds to $\gamma \ne 0$. In (1.19), $\delta$ is the Lopatinski determinant

$$\delta := \det\begin{pmatrix} r_1^- & r_2^- & r_4^- & (\tau_+ - \tau_-,\ u_+ - u_-,\ E_+ - E_-)^{tr} \end{pmatrix}$$

(where $r_j^-$ denote the eigenvectors of $\partial_U F(U_-^\varepsilon)$ associated with outgoing eigenvalues, $F$ as in (1.2), and $tr$ denotes the transpose of a matrix³) determining hyperbolic stability of the Chapman–Jouguet (square wave) approximation modeling the detonation as a shock discontinuity. Hyperbolic stability corresponds to $\delta \ne 0$. See [Zl,LyZ1,LyZ2,JLW] for further discussion. By (H4), $\gamma \ne 0$, while $\delta \ne 0$ by direct calculation comparing to the nonreactive (shock-wave) case.

Remark 1.11. The vectors $r_1^-$, $r_2^-$ and $r_4^-$ correspond to outgoing modes to the left of $x = 0$, see Sect. 3.1.2 and Fig. 4. (The fluid modes $r_j^-$, $1 \le j \le 3$, are ordered as usual by increasing characteristic speeds: $-s - \sigma_- < -s < 0 < -s + \sigma_-$, so that $r_3^-$ is incoming.)

² For applications of the Evans function to stability of viscous shock and detonation waves, see, e.g., [AGJ,GZ,ZS,Zl,LyZ1,LyZ2,LRTZ].

³ This notation will be used throughout the article.
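The ordering of modes in Remark 1.11 can be checked on a concrete state. The Python sketch below assumes the ideal gas law $p = \Gamma\tau^{-1}e$ of Assumption 1.1 and the sound speed formula following (1.3); the numerical values of $\tau$, $e$, $\Gamma$, $s$ are arbitrary illustrative choices satisfying $\sigma_- > s$, and the helper names are hypothetical.

```python
import math


def sound_speed(tau, e, Gamma):
    """sigma = tau^{-1} (Gamma (Gamma + 1) e)^{1/2} for p = Gamma e / tau."""
    return math.sqrt(Gamma * (Gamma + 1.0) * e) / tau


def characteristic_speeds(tau, e, Gamma, s):
    """Speeds in (1.3): three fluid eigenvalues followed by the reactive one."""
    sigma = sound_speed(tau, e, Gamma)
    return [-s - sigma, -s, -s + sigma, -s]


def classify_left(speeds):
    """To the left of the front, a negative speed carries signals away from it."""
    return ["outgoing" if a < 0.0 else "incoming" for a in speeds]


# Illustrative burned (left) state with sigma_- > s, as in (1.11).
tau_m, e_m, Gamma, s = 1.0, 10.0, 0.4, 1.0
speeds = characteristic_speeds(tau_m, e_m, Gamma, s)
print(sound_speed(tau_m, e_m, Gamma) > s)
print(classify_left(speeds))
```

With these values the first, second and fourth modes are outgoing and the third is incoming, matching the roles of $r_1^-$, $r_2^-$, $r_4^-$ and $r_3^-$ described above.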


1.7. Results. Let $X$ and $Y$ be two Banach spaces, and consider a traveling wave $\bar U$ solution of a general evolution equation.

Definition 1.12. A traveling wave $\bar U$ is said to be $X \to Y$ linearly orbitally stable if, for any solution $\tilde U$ of the linearized equations about $\bar U$ with initial data in $X$, there exists a phase shift $\delta$, such that $|\tilde U(\cdot, t) - \delta(t)\bar U'(\cdot)|_Y$ is bounded for $0 \le t \le \infty$. It is said to be $X \to Y$ linearly asymptotically orbitally stable if it is $X \to Y$ linearly orbitally stable and if moreover $|\tilde U(\cdot, t) - \delta(t)\bar U'(\cdot)|_Y \to 0$ as $t \to \infty$.

Definition 1.13. A traveling wave $\bar U$ is said to be $X \to Y$ nonlinearly orbitally stable if, for each $\delta > 0$, for any solution $\tilde U$ of the nonlinear equations with $|\tilde U(\cdot, 0) - \bar U|_X$ sufficiently small, there exists a phase shift $\delta$, such that $|\tilde U(\cdot, t) - \bar U(\cdot - \delta(t), t)|_Y \le \delta$ for $0 \le t \le \infty$. It is said to be $X \to Y$ nonlinearly asymptotically orbitally stable if it is $X \to Y$ nonlinearly orbitally stable and if moreover $|\tilde U(\cdot, t) - \bar U(\cdot - \delta(t), t)|_Y \to 0$ as $t \to \infty$.

1.7.1. Stability. Our first result, generalizing that of [LRTZ] in the artificial viscosity case, is a characterization of linearized stability and a sufficient condition for nonlinear stability, in terms of an Evans function condition.

Theorem 1.14. Under Assumptions 1.1, 1.2, let $\{\bar U^\varepsilon\}_\varepsilon$ be a one-parameter family of viscous strong detonation waves. For all $\varepsilon$, $\bar U^\varepsilon$ is $L^1 \cap L^p \to L^p$ linearly orbitally stable if and only if, for all $\varepsilon$,

$$\text{the only zero of } D(\varepsilon, \cdot) \text{ in } \Re\lambda \ge 0 \text{ is a simple zero at the origin.} \tag{1.20}$$

If (1.20) holds, $\bar U^\varepsilon$ is $L^1 \cap H^3 \to L^1 \cap H^3$ linearly and nonlinearly orbitally stable, and $L^1 \cap H^3 \to L^p \cap H^3$ asymptotically orbitally stable, for $p > 1$, with

$$|\tilde U^\varepsilon(\cdot, t) - \bar U^\varepsilon(\cdot - \delta(t))|_{L^p} \le C|\tilde U_0^\varepsilon - \bar U^\varepsilon|_{L^1 \cap H^3}(1 + t)^{-\frac{1}{2}(1 - \frac{1}{p})}, \tag{1.21}$$

where $\tilde U^\varepsilon$ is the solution of (1.2) issued from $\tilde U_0^\varepsilon$, for some $\delta(\cdot)$ satisfying

$$
\begin{aligned}
|\delta(t)| &\le C|\tilde U_0^\varepsilon - \bar U^\varepsilon|_{L^1 \cap H^3},\\
|\dot\delta(t)| &\le C|\tilde U_0^\varepsilon - \bar U^\varepsilon|_{L^1 \cap H^3}(1 + t)^{-\frac{1}{2}}.
\end{aligned}
$$

Remark 1.15. It is shown in [LyZ2] that in the small heat-release limit $q \to 0$, strong detonations are Evans stable if and only if the limiting gas-dynamical profile (see Remark 1.7) is Evans stable: in particular, for shock (or equivalently detonation) amplitude sufficiently small [HuZ2].

Corollary 1.16. Under Assumptions 1.1, 1.2, strong detonation profiles are linearly and nonlinearly orbitally stable (in the strong sense of (1.21)) in the limit as amplitude $|U_+ - U_-|$ (hence also heat release $q$) goes to zero, with $U_-$ (or $U_+$) held fixed.

Corollary 1.16 is notable as the first complete nonlinear stability result for strong detonations of the reacting Navier–Stokes equations. The only previous result on this topic, a partial stability result applying to zero mass (i.e., total integral) perturbations, was obtained by Tan and Tesei under similar, but more restrictive assumptions (in particular, for nonphysical Heaviside-type ignition function) in 1997.
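To make the decay rate in (1.21) concrete, the following Python sketch tabulates the exponent $\frac{1}{2}(1 - \frac{1}{p})$ and the resulting envelope $(1 + t)^{-\frac{1}{2}(1 - \frac{1}{p})}$. The function names are hypothetical, and the constant $C$ and the size of the initial perturbation are set to 1 purely for illustration; the theorem does not provide their values.

```python
def lp_decay_exponent(p):
    """Exponent (1/2)(1 - 1/p) appearing in the L^p bound (1.21)."""
    if p < 1.0:
        raise ValueError("p must satisfy p >= 1")
    return 0.5 * (1.0 - 1.0 / p)


def decay_envelope(t, p, C=1.0, data_norm=1.0):
    """Right-hand side of (1.21) with illustrative values of C and the data norm."""
    return C * data_norm * (1.0 + t) ** (-lp_decay_exponent(p))


for p in (1.0, 2.0, 4.0, float("inf")):
    print(p, lp_decay_exponent(p), decay_envelope(99.0, p))
```

The exponent vanishes at $p = 1$, where (1.21) gives boundedness only, and increases to $\frac{1}{2}$ as $p \to \infty$, which is why asymptotic orbital stability is stated for $p > 1$.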


1.7.2. Transition from stability to instability.

Theorem 1.17. Under Assumptions 1.1, 1.2, let $\{\bar U^\varepsilon\}_\varepsilon$ be a one-parameter family of viscous strong detonation waves satisfying (H4). Assume that the family of Eqs. (1.2) and profiles $\bar U^\varepsilon$ undergoes transition to instability at $\varepsilon = 0$ in the sense that $\bar U^\varepsilon$ is linearly stable for $\varepsilon < 0$ and linearly unstable for $\varepsilon > 0$. Then, one or more pairs of nonzero complex conjugate eigenvalues of $L(\varepsilon)$ move from the stable (negative real part) to the neutral or unstable (nonnegative real part) half-plane as $\varepsilon$ passes from negative to positive through $\varepsilon = 0$, while $\lambda = 0$ remains a simple root of $D(\varepsilon, \cdot)$ for all $\varepsilon$.

That is, transition to instability is associated with a Hopf-type bifurcation in the spectral configuration of the linearized operator about the wave.

Proof of Theorem 1.17. By Theorem 1.14, transition from stability to instability must occur through the passage of a root of the Evans function from the stable half-plane to the neutral or unstable half-plane. However, Proposition 1.10 implies that $D$ has a zero of multiplicity one at the origin, for all $\varepsilon$, and so no root can pass through the origin. It follows that transition to instability, if it occurs, must occur through the passage of one or more nonzero complex conjugate pairs $\lambda = \gamma \pm i\tau$, $\tau \ne 0$, from the stable half-plane ($\gamma < 0$ for $\varepsilon < 0$) to the neutral or unstable half-plane ($\gamma \ge 0$ for $\varepsilon \ge 0$).

Our third result and the main object of this paper is to establish, under appropriate nondegeneracy conditions, that the spectral Hopf bifurcation configuration described in Theorem 1.17 is realized at the nonlinear level as a genuine bifurcation to time-periodic solutions. Given $k \in \mathbb{N}$ and a weight function $\omega > 0$, define the Sobolev space and associated norm

$$H^k_\omega := \{f \in \mathcal{S}'(\mathbb{R}),\ \omega^{\frac{1}{2}}f \in H^k(\mathbb{R})\}, \qquad \|f\|_{H^k_\omega} := \|\omega^{\frac{1}{2}}f\|_{H^k}. \tag{1.22}$$

Let $\omega \in C^2$ be a growing weight function such that, for some $\theta_0 > 0$, $C > 0$, for all $x, y$,

$$
\left\{
\begin{aligned}
&1 \le \omega(x) \le e^{\theta_0(1 + |x|^2)^{\frac{1}{2}}},\\
&|\omega'(x)| + |\omega''(x)| \le C\omega(x),\\
&\omega(x) \le C\omega(x - y)\omega(y).
\end{aligned}
\right.
\tag{1.23}
$$

Theorem 1.18. Under Assumptions 1.1, 1.2, let $\{\bar U^\varepsilon\}_\varepsilon$ be a family of viscous strong detonation waves satisfying (H4). Assume that the family of Eqs. (1.2) and profiles $\bar U^\varepsilon$ undergo transition from linear stability to linear instability at $\varepsilon = 0$. Moreover, assume that this transition is associated with passage of a single complex conjugate pair of eigenvalues of $L(\varepsilon)$, $\lambda_\pm(\varepsilon) = \gamma(\varepsilon) \pm i\tau(\varepsilon)$ through the imaginary axis, satisfying

$$\gamma(0) = 0, \qquad \tau(0) \ne 0, \qquad d\gamma/d\varepsilon(0) \ne 0. \tag{1.24}$$

Then, given a growing weight $\omega$ satisfying (1.23) with $\theta_0$ sufficiently small, for $r \ge 0$ sufficiently small and $C > 0$ sufficiently large, there are $C^1$ functions $r \to \varepsilon(r)$, $r \to T(r)$, with $\varepsilon(0) = 0$, $T(0) = 2\pi/\tau(0)$, and a $C^1$ family of time-periodic solutions $\tilde U^r(x, t)$ of (1.2) with $\varepsilon = \varepsilon(r)$, of period $T(r)$, with

$$C^{-1}r \le \|\tilde U^r - \bar U^\varepsilon\|_{H^2_\omega} \le Cr. \tag{1.25}$$

Up to translation in $x$, $t$, these are the only time-periodic solutions nearby in $\|\cdot\|_{H^2_\omega}$ with period $T \in [T_0, T_1]$ for any fixed $0 < T_0 < T_1 < +\infty$.

That is, transition to linear instability of viscous strong detonation waves is “generically” (in the sense of (1.24)) associated with Hopf bifurcation to time-periodic galloping solutions, as asserted in the title of this paper.

The choices $\omega \equiv 1$ and $\omega = e^{\theta_0(1 + |x|^2)^{\frac{1}{2}}}$ are allowed in (1.23), as well as $\omega = (1 + |x|^2)^p$, for any real $p > 0$. In Theorem 1.18, we need, in particular, $\theta_0 < \eta_0$, where $\eta_0$ is as in Corollary 1.8, so that the spatial localization given by (1.25) is less precise than the spatial localization of the background profile $\bar U^\varepsilon$. The smallness condition on $\theta_0$ is described in Remark 5.9.
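The last inequality in (1.23), $\omega(x) \le C\omega(x - y)\omega(y)$, can be probed numerically for the example weights just mentioned. The Python sketch below samples random pairs $(x, y)$ and records the largest observed ratio $\omega(x)/(\omega(x - y)\omega(y))$. It checks only this one inequality, not the other two lines of (1.23); the parameter values $\theta_0 = 0.1$ and $p = 1.5$ and all function names are illustrative assumptions.

```python
import math
import random


def bracket(x):
    """(1 + |x|^2)^{1/2}."""
    return math.sqrt(1.0 + x * x)


def omega_exp(x, theta0=0.1):
    """Exponential weight exp(theta0 (1 + |x|^2)^{1/2})."""
    return math.exp(theta0 * bracket(x))


def omega_poly(x, p=1.5):
    """Polynomial weight (1 + |x|^2)^p."""
    return (1.0 + x * x) ** p


def max_submult_ratio(omega, samples=20000, box=50.0, seed=0):
    """Largest sampled value of omega(x) / (omega(x - y) omega(y))."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = rng.uniform(-box, box)
        y = rng.uniform(-box, box)
        worst = max(worst, omega(x) / (omega(x - y) * omega(y)))
    return worst


print(max_submult_ratio(omega_exp))   # stays below 1
print(max_submult_ratio(omega_poly))  # stays below 2**1.5
```

The observed bounds reflect two elementary facts: $(1 + |x|^2)^{1/2} \le (1 + |x - y|^2)^{1/2} + (1 + |y|^2)^{1/2}$, which gives $C = 1$ for the exponential weight, and $1 + |x|^2 \le 2(1 + |x - y|^2)(1 + |y|^2)$, which gives $C = 2^p$ for the polynomial weight.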

1.7.3. Nonlinear instability. We complete our discussion with the following straightforward result verifying that the exchange of linear stability described in Theorem 1.18, as expected, corresponds to an exchange of nonlinear stability as well, the new assertion being nonlinear instability for $\varepsilon > 0$.

Theorem 1.19. Under the assumptions of Theorem 1.18, the viscous strong detonation waves $\bar U^\varepsilon$ undergo a transition at $\varepsilon = 0$ from nonlinear orbital stability to instability; that is, $\bar U^\varepsilon$ is nonlinearly orbitally stable for $\varepsilon < 0$ and unstable for $\varepsilon > 0$.

1.8. Verification of stability/bifurcation conditions. The above theory not only describes the nature of possible bifurcation/exchange of stability but characterizes its occurrence in terms of corresponding spectral conditions involving zeros of the Evans function of the linearized operator about the wave. These may readily and efficiently be computed numerically [HuZ1,BHRZ], answering in a practical sense the question of whether or not such transitions actually occur as parameters are varied in any given compact region.

Much more can be said in certain interesting limiting cases. It is shown in [LyZ2] that in the small heat-release limit $q \to 0$, strong detonations are Evans stable if and only if the limiting gas-dynamical profile (see Remark 1.7) is Evans stable. As noted in Corollary 1.16, this implies in particular that strong detonations are stable in the small-amplitude limit as the distance between endstates goes to zero with one endstate held fixed (forcing $q \to 0$ as well). For an ideal gas law (1.4), stability of large-amplitude detonations in the small heat-release limit is strongly suggested by the recent asymptotic and numerical studies of [HLZ,HLyZ] indicating that viscous ideal gas shocks are stable for arbitrary amplitudes.

A more interesting limit from the viewpoint of stability transitions is the small-viscosity, or ZND limit as $\nu, \kappa, d$ go to zero. Recall, [GS1,GS2], that in this limit, the viscous detonation profile approaches an inviscid profile composed of a smooth reaction zone preceded by a shock discontinuity. In [Z4], it has recently been shown that strong detonations are stable in the ZND limit if and only if both the limiting ZND profile and the viscous shock profile associated with its component shock discontinuity satisfy spectral Evans stability conditions like those developed here for viscous detonations.


Since viscous shocks for ideal gas law (1.4) as just mentioned are uniformly stable, this means that Evans stability of rNS profiles reduces in the small viscosity limit to Evans stability of the limiting ZND profile. For ZND profiles, there is a wealth of numerical [Er1,Er2,FW,S2,KS,BMR,BM,KS] and asymptotic [F1,FD,B,BN,S1,Er4] literature indicating that stability transitions do, and do often, occur. Indeed, a classic benchmark problem of Fickett and Woods [FW] tests numerical code for parameters $\Gamma = 1.2$, $\mathcal{E} = 50$, $q = 50$ for which transition to stability is known to occur as overdrive is varied as a bifurcation parameter [BMR]. In multidimensions, a theorem of Erpenbeck [Er3] gives a rigorous proof of instability for certain detonation types, occurring through high-frequency transverse modes (the only such proof to our knowledge). In short, the evidence is overwhelming that spectral bifurcation occurs in the ZND context, whence (by the results of [Z4]) also for (1.1) for $\nu, \kappa, d$ sufficiently small. Together with these observations, the results of this paper answer definitively and positively the fundamental question whether the reacting Navier–Stokes equations are adequate to capture the bifurcation phenomena observed for more than half a century in physical experiments [FD,Er1]. A very interesting problem would be to establish in one dimension a rigorous spectral instability result for ZND analogous to that of Erpenbeck for multi-d, thus completing an entirely mathematical proof; in this regard, we mention that the analyses of [BN,S1] appear to come very close.
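The remark in Sect. 1.8 that unstable zeros of the Evans function can be counted numerically rests on the argument principle: the number of zeros of an analytic function inside a closed contour equals the winding number of its image around the origin. The Python sketch below illustrates only this general idea on a toy polynomial with a simple zero at the origin and a complex conjugate pair $\gamma \pm i\tau$, mimicking the spectral picture of Theorem 1.17. The function `toy_D` is a hypothetical stand-in and is not the Evans function of (1.2); the contour and all names are assumptions, and this is not the method of [HuZ1,BHRZ].

```python
import cmath
import math


def winding_number(D, center, radius, n=4000):
    """Count zeros of an analytic D inside |lam - center| < radius.

    Accumulates the increments of arg D along the circle; assumes D has
    no zero on the contour itself.
    """
    total = 0.0
    prev = D(center + radius)
    for k in range(1, n + 1):
        lam = center + radius * cmath.exp(2j * math.pi * k / n)
        cur = D(lam)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))


def toy_D(lam, gamma, tau=0.3):
    """Toy function with zeros at 0 and gamma +/- i tau (hypothetical)."""
    return lam * ((lam - gamma) ** 2 + tau ** 2)


def unstable_count(gamma):
    """Zeros of toy_D in a disk lying strictly inside the right half-plane."""
    return winding_number(lambda lam: toy_D(lam, gamma), center=1.0, radius=0.9)


print(unstable_count(-0.2))  # pair still in the stable half-plane
print(unstable_count(0.4))   # pair has crossed into the unstable half-plane
```

The disk is chosen to exclude the origin, so the persistent simple zero at $\lambda = 0$ is never counted; only the conjugate pair registers once its real part becomes positive.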

1.9. Discussion and open problems. This analysis in large part concludes the one-dimensional program set out in [TZ2]. However, a very interesting remaining open problem is to determine linearized and nonlinear stability of the bifurcating time-periodic solutions, in the spirit of Sect. 4.3. For a treatment in the shock wave case with semilinear viscosity, see [BeSZ]. Likewise, it would be very interesting to carry out a numerical investigation of the spectrum of the linearized operator about detonation waves with varying physical parameters, as done in [LS,KS] in the inviscid ZND setting, but using the viscous methods of [Br1,Br2,BrZ,BDG,HuZ1] to treat the full reacting Navier–Stokes equations, in order to determine the physical bifurcation boundaries. Other interesting open problems are the extension to multi-dimensional (spinning or cellular) bifurcations, as carried out for artificial viscosity systems in [TZ2], and to the case of weak detonations (analogous to the case of undercompressive viscous shocks; see [HZ,RZ,LRTZ]). The strong detonation structure considerably simplifies both stability and bifurcation arguments over what was done in [LRTZ]. We remark that, at the expense of further complication, nonlinear stability of general (time-independent) combustion waves, including also weak detonations and strong or weak deflagrations, may be treated by a combination of the pointwise arguments of [LRTZ] and [RZ]. We remark finally that the restriction to a scalar reaction variable is for simplicity only. Indeed, the results of this article (as well as the results of the article by Lyng and Zumbrun [LyZ2] from which it draws) are independent of the dimension of the reactive equation, so long as the reaction satisfies an assumption of exponential decay of space-independent states (with temperature at −∞ above the ignition temperature). Plan of the paper. Lemma 1.6 and Corollary 1.8 are proved in Sect. 2. 
We give a detailed description of the low-frequency behavior of the resolvent kernel for the linearized equations in Sect. 3, following [MaZ3]. In Sect. 4, we prove Theorem 1.14, while Sect. 5 is devoted to the proof of Theorem 1.18. Finally, in Sect. 6, we prove Theorem 1.19.


2. Strong Detonations

Proof of Lemma 1.6. Let $U_-$ be a given left endstate, with $z_- = 0$, satisfying (1.10) and (1.11). We look for a right endstate $U_+$, with $z_+ \in (0, 1]$, that satisfies (1.13), (1.11), and $T_+ < T_i$. We note that (1.13)(i) determines $u_+$ and that $T_+ < T_i$ entails (1.13)(v). The Rankine-Hugoniot relations in the $(\tau_+, p_+)$ plane are

$$p = -s^2 \tau + c_1 \tag{R}$$

$$p = \bigl(c_0 - s\tau(1 + \Gamma^{-1})\bigr)^{-1}\bigl(c_2 + sqz_+ + \tfrac{1}{2} s^3 \tau^2 - s^2 c_0 \tau\bigr) \tag{H}$$

where (R) is the Rayleigh line, corresponding to (1.13)(ii), (H) the Hugoniot curve, corresponding to (1.13)(iii), and where

$$c_0 := u_- + s\tau_-, \qquad c_1 := p_- + s^2 \tau_-, \qquad c_2 := (p_- u_- - s E_-) + \tfrac{1}{2} c_0^2 s$$

depend on parameters $U_-$ and $s$. The temperature and Lax constraints for both endstates are

$$\tau_+ p_+ < c\,\Gamma\,T_i < \tau_- p_- \tag{$\mathrm{T}_\pm$}$$

$$\tau_+^{-1} p_+ < (\Gamma + 1)^{-1} s^2 < \tau_-^{-1} p_- \tag{$\mathrm{L}_\pm$}$$
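The intersection of the Rayleigh line (R) with the Hugoniot curve (H) can be explored numerically. The following is a minimal sketch, not part of the proof: substituting (R) into (H) gives a quadratic in τ, which the code solves directly. It assumes the ideal gas relations e = τp/Γ and E = e + u²/2 (taken here as an assumption, since they are not restated in this section); the function names and the sample parameter values are hypothetical and are not chosen to satisfy the constraints (T) and (L).

```python
import math

def rh_constants(gamma, s, tau_m, p_m, u_m):
    """Constants c0, c1, c2 of (R) and (H) for a left endstate (tau_m, p_m, u_m)."""
    # Assumption: ideal gas, e = tau * p / Gamma, total energy E = e + u^2 / 2.
    E_m = tau_m * p_m / gamma + 0.5 * u_m ** 2
    c0 = u_m + s * tau_m
    c1 = p_m + s ** 2 * tau_m
    c2 = (p_m * u_m - s * E_m) + 0.5 * c0 ** 2 * s
    return c0, c1, c2

def p_rayleigh(tau, s, c1):
    """Pressure on the Rayleigh line (R)."""
    return -s ** 2 * tau + c1

def p_hugoniot(tau, gamma, s, q, z_plus, c0, c2):
    """Pressure on the Hugoniot curve (H); singular at the pole of (H)."""
    num = c2 + s * q * z_plus + 0.5 * s ** 3 * tau ** 2 - s ** 2 * c0 * tau
    return num / (c0 - s * tau * (1.0 + 1.0 / gamma))

def intersections(gamma, s, q, z_plus, tau_m, p_m, u_m):
    """Specific volumes tau at which (R) and (H) intersect (sorted, possibly empty)."""
    c0, c1, c2 = rh_constants(gamma, s, tau_m, p_m, u_m)
    # Inserting p = -s^2 tau + c1 into (H) and clearing the denominator
    # yields a * tau^2 + b * tau + c = 0 with:
    a = s ** 3 * (0.5 + 1.0 / gamma)
    b = -s * c1 * (1.0 + 1.0 / gamma)
    c = c0 * c1 - c2 - s * q * z_plus
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)])
```

With zero heat release (q = 0) the left state itself is one of the two roots, which gives a quick consistency check on the constants c0, c1, c2; for q > 0 one can verify that both curves give the same pressure at each returned root.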

We restrict to left endstates satisfying, in the large $s$ regime,

$$p_- = 2 s^2 \Gamma^{-1} \tau_- + \tilde p_-, \qquad \tau_- = O(1), \qquad u_- = s \tilde u_-, \tag{2.1}$$

with u˜ − = O(1) and p˜ − = O(1). Under (2.1), conditions (T)− and (L)− are satisfied as soon as s is large enough. The Hugoniot curve takes the form −1 pH = u˜ − +τ− −(1+ −1 )τ

1 3 s (τ −(1+2 −1 )τ− )(τ −(1−2 −1 )τ− −2u˜ − )+sqz + . 2

Assume that u˜ − is such that τ− < (1 − 2 −1 )τ− + 2u˜ − < (1 + 2 −1 )τ− . 1 + −1

(2.2)

For any such u˜ − , any given τ− and any q > 0, if s is large enough then, for any z + ∈ (0, 1], the Hugoniot curve has two zeros τ < τ , with asymptotic expansions τ = (1 − 2 −1 )τ− + 2u˜ − + O(s −2 ), p˜ − u˜ − + qz + τ = (1 + 2 −1 )τ− − s −2 −1 + O(s −3 ). 2 τ− − u˜ −

(2.3) (2.4)

If s is large, by (2.2), τ0 < τ < τ , where τ0 := c0 s −1 (1 + −1 )−1 is the pole of (H).


The Rayleigh line and the Hugoniot curve have at least one intersection point to the right of τ0 if pR (τ ) < 0 < pR (τ ). Under (2.2), the inequality 0 < pR (τ ) holds, and pR (τ ) < 0 holds as well if in addition p˜ − < −

p˜ − u˜ − + qz + . 2 −1 τ− − u˜ −

(2.5)

Let τ+ be an intersection point of (R) and (H) to the right of τ0 . Condition (T)+ is satisfied if τ+ = (1 + 2 −1 )τ− + s −2 τ˜+ + O(s −3 ),

(2.6)

(1 + 2 −1 )τ− ( p˜ − − τ˜+ ) < c Ti .

(2.7)

with

Condition (L)+ is satisfied if (1 + 2 −1 )τ− < (1 + (1 + )−1 )τ+ ,

(2.8)

which holds under (2.6), if s is large. We plug the ansatz (2.6) in the equation pH = pR , to find

p˜ −

qz + −1 −1 (1 +

)(1 + 2

) − 1 + . (2.9) τ˜+ = −1 (1 + 2 )τ− (1 + 2 −1 )τ− The intersection point τ+ is an admissible right specific volume if pH (τ+ ) > 0 and pR (τ+ ) > 0. These inequalities hold if τ < τ+ < (α + 1)τ− + s −2 p˜ − .

(2.10)

The inequalities (2.5), (2.7) and (2.10) are constraints on τ− , p˜ − , and u˜ − . The lower bound on τ+ in (2.10) is satisfied in the regime (2.1) if s is large. If we let p˜ − =

−2 qz + + O(s −1 ), τ−

then (2.5) holds. Finally, if τ− satisfies 1<

cTi 1 3 + 2 −1 − (1 + 2 −1 )τ− < 1 + , 4τ− qz +

then the upper bounds in (2.10) and (2.7) hold as well. The Rayleigh line (R), the Hugoniot curve (H), and the temperature (T) and Lax (L) constraints are pictured in Fig. 2. The black dots represent the intersection points of (R) and (H). Note that (L) and (R) imply τ− < τ+ for a strong detonation, so that only the intersection point to the right of τ− is admissible. (The other intersection point corresponds to a deflagration; see for instance [LyZ2], Sect. 1.4.)


Fig. 2. The Rankine-Hugoniot, Lax and temperature conditions

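Proof of Corollary 1.8. Rewrite (1.12) as $U' = F(\varepsilon, U)$. Let $U_\pm^\varepsilon$ be the endstates of a family of strong detonations. The linearized equations at $U_\pm^\varepsilon$ are governed by matrices

$$\partial_U F(\varepsilon, \bar U_\pm^\varepsilon) = \begin{pmatrix} a_\pm^f & * \\ 0 & a_\pm^r \end{pmatrix}. \tag{2.11}$$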

The block triangular structure is a consequence of Assumption 1.1, 1.9, 1.14, and 1.15. f Under Assumption 1.1, the eigenvalues λ of a± , f

a± :=

ντ −1 0 −1 τ (ν − κc−1 )u κc−1 τ −1

∂u p − s − s −1 ∂τ p ∂e p u(∂u p − s −1 ∂τ p) + p u∂e p − s

,

satisfy

$$\lambda^2 + \bigl(s\kappa c^{-1}\tau_\pm^{-1} + s^{-1}\nu\tau_\pm^{-1}(s^2 + (\partial_\tau p)_\pm)\bigr)\lambda + \kappa c^{-1}\nu\tau_\pm^{-2}(s^2 - \sigma_\pm^2) = 0. \tag{2.12}$$

The Lax condition (1.11) implies that the center subspace on both sides is trivial, that the eigenvalues of $a_-^f$ have opposite signs, and that the eigenvalues of $a_+^f$ are negative. The eigenvalues $\lambda$ of

$$a_\pm^r := \begin{pmatrix} -s d^{-1} & k d^{-1} \phi(T_\pm) \\ 1 & 0 \end{pmatrix}$$

satisfy $d\lambda^2 + s\lambda - k\phi(T_\pm) = 0$. They are non-zero and have opposite signs on the −∞ side. On the +∞ side, there is one negative eigenvalue, and a one-dimensional kernel. In particular, $U_-^\varepsilon$ is a hyperbolic rest point of the linearized traveling-wave ordinary differential equation, which implies (1.16)(i) with j = 0, by standard ODE estimates. However, the linearized traveling-wave equations at $U_+^\varepsilon$ have a one-dimensional center subspace, which a priori precludes exponential decay (1.16). From Lemma 1.6, if $U_- \in O$, then the system (1.12) has a line of equilibria that goes through $U_+^\varepsilon$. Any center manifold of (1.12) at $U_+^\varepsilon$ contains all equilibria, so by dimension count it must consist of equilibria. Therefore, the 4-dimensional center-stable manifold at $U_+^\varepsilon$ consists (again by dimension count) of the union of the stable manifolds of all equilibria. Since solutions off the center-stable manifold do not remain for all time in a small neighborhood of the center manifold, any traveling-wave orbit must lie on the center-stable manifold, and so lies on the stable manifold of some equilibrium. Exponential decay, (1.16)(ii) with j = 0, now follows by the stable manifold theorem.

To prove (1.16) with j = 1, consider now the traveling-wave ODE in $(U, \partial_\varepsilon U)$. The rest points satisfy

$$F(\varepsilon, U) = 0, \qquad \partial_\varepsilon F(\varepsilon, U) + \partial_U F(\varepsilon, U)\,\partial_\varepsilon U = 0. \tag{2.13}$$

The kernel of $\partial_U F(\varepsilon, U_+^\varepsilon)$ being one-dimensional, (2.13) has a two-dimensional manifold of solutions. Let $(U_+^\varepsilon, V_+^\varepsilon)$ be such a rest point. The linearized equations at $(U_+^\varepsilon, V_+^\varepsilon)$ are governed by matrices

$$\begin{pmatrix} \partial_U F(\varepsilon, U_+^\varepsilon) & 0 \\ * & \partial_U F(\varepsilon, U_+^\varepsilon) \end{pmatrix},$$

where the bottom left entry depends on second derivatives of $F$. In particular, the linearized equations have a two-dimensional center subspace. We can thus argue as above that any center manifold consists entirely of equilibria, and that (1.16)(ii) holds with j = 1. The proof of (1.16)(i) with j = 1 is similar.

3. Resolvent Kernel and Green Function Bounds

The linearized equations about a traveling wave $\bar U^\varepsilon$ solution of (1.2) are

$$\partial_t U = L(\varepsilon) U, \tag{3.1}$$

where $L(\varepsilon)$ is defined in (1.17). The coefficients of $L(\varepsilon)$ are asymptotically constant at ±∞. Let $L_\pm(\varepsilon)$ be the associated constant-coefficient, limiting operators:

$$L_\pm(\varepsilon) := -A_\pm \partial_x + B_\pm \partial_x^2 + G_\pm,$$

with the notation $A_\pm := \partial_U F(U_\pm^\varepsilon)$, $B_\pm := B(U_\pm^\varepsilon)$, $G_\pm := \partial_U G(U_\pm^\varepsilon)$. Let $L(\varepsilon)^*$ denote the dual operator of $L(\varepsilon)$. Its associated constant-coefficient, limiting operators are

$$L_\pm(\varepsilon)^* = A_\pm^* \partial_x + B_\pm^* \partial_x^2 + G_\pm^*.$$


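3.1. Laplace transform. Consider the Laplace transform of the linearized equations,

$$(L(\varepsilon) - \lambda) U = 0, \qquad \lambda \in \mathbb{C},\ x \in \mathbb{R},\ U(\varepsilon, x, \lambda) \in \mathbb{C}^4. \tag{3.2}$$

Eq. (3.2) can be cast as a first-order ordinary differential system of dimension 7,

$$W' = A(\varepsilon, \lambda) W, \qquad \lambda \in \mathbb{C},\ x \in \mathbb{R},\ W(\varepsilon, x, \lambda) \in \mathbb{C}^7, \tag{3.3}$$

where the limits $A_\pm$ of $A$ at ±∞ are given by

$$A_\pm := \begin{pmatrix} s^{-1}\lambda & 0 & -s^{-1} J b_\pm^{-1} \\ 0 & 0 & b_\pm^{-1} \\ s^{-1}\lambda\,\partial_\tau f|_\pm & \lambda - \partial_w g|_\pm & (\partial_w f|_\pm - s^{-1}\partial_\tau f|_\pm J)\, b_\pm^{-1} \end{pmatrix}, \tag{3.4}$$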

where $|_\pm$ denotes evaluation at $U_\pm^\varepsilon$, $b_\pm := b(U_\pm^\varepsilon)$, and $J$ is defined in (1.6). Considered as an operator in $L^2(\mathbb{R}; \mathbb{C}^4)$, $L$ is closed, with domain $H^2$ dense in $L^2$. Similarly, for all $\lambda$, the operator

$$\frac{d}{dx} - A(\lambda) : H^1(\mathbb{R}; \mathbb{C}^7) \subset L^2(\mathbb{R}; \mathbb{C}^7) \to L^2(\mathbb{R}; \mathbb{C}^7)$$

is closed and densely defined. The following straightforward lemma gives a correspondence between (3.2) and (3.3).

Lemma 3.1. Let $\lambda \in \mathbb{C}$ and $f = (f_1, f_2) \in L^2(\mathbb{R}; \mathbb{C}^1 \times \mathbb{C}^3)$. If the equation

$$(L - \lambda) U = f \tag{3.5}$$

has a solution $U =: (\tau, w) \in H^2(\mathbb{R}; \mathbb{C}^1 \times \mathbb{C}^3)$, then $W := (\tau, w, b w') \in H^1(\mathbb{R}; \mathbb{C}^7)$ satisfies

$$W' = A(\lambda) W + F, \tag{3.6}$$

with $F = (f_1, 0, f_2) \in L^2(\mathbb{R}; \mathbb{C}^7)$. Conversely, let $F = (f_1, 0, f_2) \in L^2(\mathbb{R}; \mathbb{C}^7)$ and $\lambda \in \mathbb{C}$. If $W = (w_1, w_2) \in H^1(\mathbb{R}; \mathbb{C}^7)$ satisfies (3.6), then a solution in $H^2(\mathbb{R}; \mathbb{C}^4)$ to (3.5) with $f = (f_1, f_2)$ is given by $U = w_1$.

Similarly, the dual eigenvalue equation $(L(\varepsilon)^* - \lambda)\tilde U = 0$,

λ ∈ C,

y ∈ R, U˜ (ε, y, λ) ∈ C4 ,

(3.7)

can be cast as ˜ λ)W˜ , W˜ = A(ε,

λ ∈ C,

y ∈ R, W˜ (ε, y, λ) ∈ C7 ,

˜ ± of A ˜ at ±∞ are given by where the limits A ⎞ ⎛ tr btr−1 0 s −1 ∂τ f |± −s −1 λ ± ⎟ ⎜ tr−1 ⎟. ˜ ± := ⎜ 0 0 b± A ⎠ ⎝ tr −(∂ f tr + s −1 J tr ∂ f tr )btr−1 s −1 λJ tr λ − ∂w g|± w |± τ |± ± A correspondence between (3.7) and (3.8) holds, as in Lemma 3.1.

(3.8)

(3.9)


3.1.1. The limiting, constant-coefficient equations. Associated with (3.2) and (3.7) are the limiting, constant-coefficient eigenvalue equations

$$(L_\pm(\varepsilon) - \lambda) U = 0, \tag{3.10}$$

and

$$(L_\pm(\varepsilon)^* - \lambda) \tilde U = 0. \tag{3.11}$$

Definition 3.2 (Normal modes). We call normal modes the solutions $(\lambda, U)$ of Eqs. (3.10) and dual normal modes the solutions $(\lambda, \tilde U)$ of Eqs. (3.11).

Associated with (3.3) and (3.8) are the limiting, constant-coefficient differential equations

$$W' = A_\pm(\varepsilon, \lambda) W, \tag{3.12}$$

and

$$\tilde W' = \tilde A_\pm(\varepsilon, \lambda) \tilde W, \tag{3.13}$$

where $A_\pm$ and $\tilde A_\pm$ are defined in (3.4) and (3.9). There is a correspondence between solutions of (3.10) and solutions of (3.12):

Lemma 3.3. If $(\lambda_0, U)$, $U =: (\tau, w)$, is a normal mode, then $W := (\tau, w, b w')$ solves (3.12) at $\lambda = \lambda_0$. Conversely, if $W = (w_1, w_2) \in \mathbb{C}^4 \times \mathbb{C}^3$ solves (3.12) at $\lambda = \lambda_0$, then $(\lambda_0, w_1)$ is a normal mode. In particular,

(i) Eigenvalues $\mu$ of $A_\pm(\lambda)$ satisfy

$$\det(-\mu A_\pm + \mu^2 B_\pm + G_\pm - \lambda) = 0, \tag{3.14}$$

and associated eigenvectors, satisfying $A_\pm(\lambda) W = \mu W$, have the form $W = (U, w_2) \in \mathbb{C}^4 \times \mathbb{C}^3$, with

$$U \in \ker(-\mu A_\pm + \mu^2 B_\pm + G_\pm - \lambda), \qquad U =: (\tau, w), \qquad w_2 := \mu b_\pm w. \tag{3.15}$$

(ii) Normal modes $(\lambda, U)$ satisfy

$$U = \sum_j e^{x \mu_j^\pm(\lambda)}\, U_j^\pm(x, \lambda), \tag{3.16}$$

where the μ±j are eigenvalues of A± , and the U ± j are polynomials in x. The correspondence between (3.11) and (3.13) is similar. In particular, eigenvalues ˜ ± satisfy μ˜ of A ∗ det(μA ˜ ∗± + μ˜ 2 B± + G ∗± − λ) = 0,

(3.17)

˜ ± (λ)W˜ = μ˜ W˜ , have the form W˜ = (U˜ , w˜ 2 ) ∈ associated eigenvectors, satisfying A C4 × C3 , with ∗ tr + G ∗± − λ), U˜ =: (τ˜ , w), ˜ w˜ 2 := μb± w, ˜ U˜ ∈ ker(μA ˜ ∗± + μ˜ 2 B±

(3.18)


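and dual normal modes satisfy

$$\tilde U = \sum_j e^{y \tilde\mu_j^\pm(\lambda)}\, \tilde U_j^\pm(y, \lambda), \tag{3.19}$$

where the $\tilde\mu_j^\pm$ are eigenvalues of $\tilde A_\pm$, and the $\tilde U_j^\pm$ are polynomials in $y$.

If $\tilde\mu(\lambda)$ is an eigenvalue of $\tilde A_\pm(\lambda)$, then $\tilde\mu(\lambda) = -\overline{\mu(\bar\lambda)}$, where $\mu(\bar\lambda)$ is some eigenvalue of $A_\pm(\bar\lambda)$. The matrices $A_\pm$, $B_\pm$ and $G_\pm$ having real coefficients, the complex conjugate of $\mu(\bar\lambda)$ is an eigenvalue of $A_\pm(\lambda)$. We can thus relate the solutions of (3.14) and (3.17) by $\tilde\mu(\lambda) = -\mu(\lambda)$.

Note that $z_+^\varepsilon = 1$, $\phi(T_+^\varepsilon) = 0$, and $\phi'(T_+^\varepsilon) = 0$ imply that the $v$ derivative of the coupling reaction term $k\phi(T)z$ vanishes when evaluated at $U_\pm^\varepsilon$. In particular, in $(v, z)$ coordinates,

$$A_\pm = \begin{pmatrix} \partial_v f|_\pm & \partial_z f|_\pm \\ 0 & -s \end{pmatrix}, \qquad B_\pm = \begin{pmatrix} b_{1|\pm} & b_{2|\pm} \\ 0 & d \end{pmatrix}, \qquad G_\pm = \begin{pmatrix} 0 & 0 \\ 0 & -k\phi_\pm \end{pmatrix},$$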

with the notation of Sect. 1.3, $|_\pm$ denoting evaluation at $U_\pm^\varepsilon$, and $\phi_\pm := \phi(T_\pm^\varepsilon)$, so that $\phi_+ = 0$, while by (1.10) and Assumption 1.2, $\phi_- > 0$. This triangular structure of the matrix $-\mu A_\pm + \mu^2 B_\pm + G_\pm$ allows a simple description of the solutions of (3.14). Indeed, (3.14), a polynomial, degree four equation in $\lambda$, splits into the linear equation

$$\mu s + \mu^2 d - k\phi_\pm - \lambda = 0, \tag{3.20}$$

and the degree three equation

$$\det(-\mu\,\partial_v f|_\pm + \mu^2 b_{1|\pm} - \lambda) = 0. \tag{3.21}$$

By inspection, (3.20) is quadratic in $\mu$, while (3.21) is degree five in $\mu$. Thus, the four solutions $\lambda(\mu)$ of (3.14) correspond to seven eigenvalues $\mu(\lambda)$ of $A(\lambda)$.

3.1.2. Low-frequency behaviour of the normal modes. We describe here the behaviour of the normal modes in a small ball $B(0, r) := \{\lambda \in \mathbb{C},\ |\lambda| < r\}$.

Definition 3.4 (Slow modes, fast modes). We call slow mode at ±∞ any family of normal modes $\{(\lambda, U(\lambda))\}_{\lambda \in B(0,r)}$, for some $r > 0$, such that, in (3.16), $\mu_j^\pm(0) = 0$, for all $j$. Normal modes which are not slow are called fast modes. We define similarly slow dual modes and fast dual modes, using (3.19).

The solutions of (3.20) are

$$\mu_4^\pm = \frac{1}{2d}\Bigl(-s + \bigl(s^2 + 4d(\lambda + k\phi_\pm)\bigr)^{1/2}\Bigr), \tag{3.22}$$

$$\mu_5^\pm = -\frac{1}{2d}\Bigl(s + \bigl(s^2 + 4d(\lambda + k\phi_\pm)\bigr)^{1/2}\Bigr); \tag{3.23}$$


they depend analytically on $\lambda$ (in the case of $\mu_4^+$ and $\mu_5^+$, this is ensured by $s > 0$, assumed in Definition 1.5), and satisfy, for $\lambda$ in a neighborhood of the origin,

$$\mu_4^+ = s^{-1}\lambda - s^{-3} d \lambda^2 + O(\lambda^3), \qquad \mu_4^- > 0, \qquad \mu_5^\pm < 0. \tag{3.24}$$
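The reactive modes (3.22)–(3.23) and the expansion in (3.24) are easy to check numerically. The sketch below is only an illustration: it evaluates the two roots of (3.20) with the principal square root and compares $\mu_4^+$ (the case $\phi_+ = 0$) with its second-order Taylor polynomial. The function names and the sample values of s, d, and kφ are hypothetical.

```python
import cmath

def reactive_modes(lam, s, d, k_phi):
    """Roots mu4, mu5 of mu*s + mu^2*d - k_phi - lam = 0, as in (3.22)-(3.23)."""
    root = cmath.sqrt(s * s + 4.0 * d * (lam + k_phi))  # principal branch
    mu4 = (-s + root) / (2.0 * d)
    mu5 = -(s + root) / (2.0 * d)
    return mu4, mu5

def mu4_plus_expansion(lam, s, d):
    """Second-order expansion of mu4 on the +infinity side (k_phi = 0), cf. (3.24)."""
    return lam / s - d * lam ** 2 / s ** 3

def residual(mu, lam, s, d, k_phi):
    """Left-hand side of (3.20); vanishes at an exact root."""
    return mu * s + mu * mu * d - k_phi - lam
```

For small |λ| the difference between `reactive_modes(lam, s, d, 0.0)[0]` and `mu4_plus_expansion(lam, s, d)` should scale like |λ|³, and on the −∞ side (kφ > 0, λ = 0) the two roots are real with opposite signs.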

Note that the inequality μ− 4 > 0 is a consequence of φ− > 0. By (3.18), the eigenvector ˜ + that is associated with −μ+ is of A L +4

=

4

+4 + μ4 b+tr +4

∈ C4 × C3 , +4 (0) = +4 ,

(3.25)

where tr +4 := 0 0 0 1

(3.26)

is the reactive left eigenvector of A+ associated with the reactive eigenvalue of A+ . We ± − ± ˜ label L − 4 , L 5 the eigenvectors of A± associated with −μ4 and −μ5 . By the block − − 2 structure of −μA± + μ B± + G ± , spectral separation of μ4 and μ5 (and of μ+4 and ± μ+5 ), the eigenvectors L ± 4 and L 5 are analytic in λ, in a neighborhood of the origin (see for instance [Kat], II.1.4); in particular, +4 = +4 + O(λ), μ+4 b+tr +4 = O(λ).

(3.27)

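The solutions of (3.21), seen as an equation in $\lambda$, are the eigenvalues of the matrix $-\mu\,\partial_v f|_\pm + \mu^2 b_{1|\pm}$. By (1.3) and the block structure of $A_\pm$, we find that the spectrum of $\partial_v f|_\pm$ is

$$\sigma(\partial_v f|_\pm) = \{-s(\varepsilon) - \sigma_\pm,\ -s(\varepsilon),\ -s(\varepsilon) + \sigma_\pm\}.$$

The eigenvalues of $\partial_v f|_\pm$ are distinct, hence, by Rouché's theorem, the eigenvalues of $-\partial_v f|_\pm + \mu b_{1|\pm}$ are analytic in $\mu$, for small $\mu$, with expansions

$$\lambda_1 = s + \sigma_\pm + \beta_1^\pm \mu + O(\mu^2), \qquad \lambda_2 = s + \beta_2^\pm \mu + O(\mu^2), \qquad \lambda_3 = s - \sigma_\pm + \beta_3^\pm \mu + O(\mu^2). \tag{3.28}$$

By (H3) (Sect. 1.5), $\beta_j^\pm > 0$ for all $j$. Inversion of these expansions yields analytic functions $\mu_j^\pm$, called fluid modes, and defined in a neighborhood of the origin in $\mathbb{C}_\lambda$:

$$\begin{aligned} \mu_1^\pm &:= (s + \sigma_\pm)^{-1}\lambda - (s + \sigma_\pm)^{-3}\beta_1^\pm \lambda^2 + O(\lambda^3), \\ \mu_2^\pm &:= s^{-1}\lambda - s^{-3}\beta_2^\pm \lambda^2 + O(\lambda^3), \\ \mu_3^\pm &:= (s - \sigma_\pm)^{-1}\lambda - (s - \sigma_\pm)^{-3}\beta_3^\pm \lambda^2 + O(\lambda^3). \end{aligned}$$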

˜ that are associated with these eigenvalues are By (3.18), the eigenvectors of A ±j ± L j (λ) = ∈ C4 × C3 , ±j (0) = ±j , 1 ≤ j ≤ 3, tr ± μ±j b± j

(3.29)

(3.30)


± ± where the vectors ± 1 , 2 and 3 are the left eigenvectors of A± associated with the fluid eigenvalues −s − σ± , −s, and −s + σ± ; they have the form

tr ±j := ∗ ∗ ∗ 0 ,

1 ≤ j ≤ 3.

(3.31)

The eigenvalues of −∂v f |± + μb1|± being distinct, the associated eigenvectors are analytic as well, so that the L ±j , 1 ≤ j ≤ 3, are analytic in λ; in particular, ±j = ±j + O(λ),

tr ± μ±j b± j = O(λ).

(3.32)
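The inversion leading from (3.28) to the fluid modes (3.29) is a short series reversion. The sketch below spells it out for a generic branch, writing $a$ for the leading coefficient ($a = s + \sigma_\pm$, $s$, or $s - \sigma_\pm$, assumed non-zero) and $\beta$ for the corresponding $\beta_j^\pm$; the ansatz coefficients $m_1$, $m_2$ are introduced here for illustration only. It uses that a solution of (3.21) has the form $\lambda = \mu\,\lambda_j(\mu)$, with $\lambda_j$ an eigenvalue of $-\partial_v f|_\pm + \mu b_{1|\pm}$.

```latex
\begin{align*}
\lambda &= \mu\,\bigl(a + \beta\mu + O(\mu^2)\bigr) = a\mu + \beta\mu^2 + O(\mu^3),\\
\mu &= m_1\lambda + m_2\lambda^2 + O(\lambda^3)
\;\Longrightarrow\;
\lambda = a m_1 \lambda + \bigl(a m_2 + \beta m_1^2\bigr)\lambda^2 + O(\lambda^3),\\
m_1 &= a^{-1}, \qquad m_2 = -a^{-3}\beta,\\
\mu &= a^{-1}\lambda - a^{-3}\beta\lambda^2 + O(\lambda^3).
\end{align*}
```

Matching powers of $\lambda$ in the second line gives $a m_1 = 1$ and $a m_2 + \beta m_1^2 = 0$, which is the pattern of coefficients appearing in (3.29).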

Finally, the equation $\det(-\mu\,\partial_v f|_\pm + \mu^2 b_{1|\pm}) = 0$ has two non-zero solutions $\gamma_6^\pm$, $\gamma_7^\pm$, corresponding to the remaining two (fast) modes, solutions of

$$\kappa\tau_\pm^{-2} c^{-1} s \nu \mu^2 + \bigl(\kappa c^{-1}(s^2 - \tau_\pm^{-2} e_\pm) + \nu s^2\bigr)\tau_\pm^{-1}\mu + s(s^2 - \sigma_\pm^2) = 0. \tag{3.33}$$

The Lax condition (1.11) implies that solutions of (3.33) are distinct and have small frequency expansions

$$\mu_6^\pm = \gamma_6^\pm + O(\lambda), \qquad \mu_7^\pm = \gamma_7^\pm + O(\lambda), \qquad \gamma_6^\pm < 0, \quad \gamma_7^- > 0, \quad \gamma_7^+ < 0. \tag{3.34}$$

We label $L_6^\pm$ and $L_7^\pm$ the eigenvectors of $\tilde A_\pm$ associated with $-\mu_6^\pm$ and $-\mu_7^\pm$. Again, by spectral separation, $L_6^\pm$ and $L_7^\pm$ are analytic in $\lambda$.

Lemma 3.5. For some $r > 0$, Eqs. (3.13) have analytic bases of solutions in $B(0, r)$,

$$\tilde{\mathcal{B}}^\pm := \{\tilde V_j^\pm\}_{1 \le j \le 7}, \qquad \tilde V_j^\pm := e^{-y \mu_j^\pm(\lambda)} L_j^\pm(\lambda), \tag{3.35}$$

where the eigenvalues $\mu_j^\pm$ are given in (3.22), (3.23), (3.29), and (3.34) and the eigenvectors associated with the slow modes are given in (3.25), (3.27), (3.30) and (3.32).

Proof. The above discussion describes analytic families $\mu_j^\pm$, $L_j^\pm$, such that the vectors $\tilde V_j^\pm$ defined in (3.35) are analytic solutions of (3.13). For $\lambda \neq 0$, the eigenvalues $\mu_j^\pm$ are simple, so that the families $\tilde{\mathcal{B}}^\pm$ define bases of Eqs. (3.13). By inspection of the expansions at $\lambda = 0$, the families $\tilde{\mathcal{B}}^\pm$ define bases of Eqs. (3.13) at $\lambda = 0$ as well.

The above low-frequency expansions of the eigenvalues show that

(i) Equation $\tilde W' = \tilde A_-(\lambda)\tilde W$ has a 3-dimensional subspace of solutions associated with slow modes ($\mu_j^-$, $j = 1, 2, 3$) and a 4-dimensional subspace of solutions associated with fast modes ($\mu_4^-$, $\mu_5^-$, $\mu_6^-$, $\mu_7^-$).

(ii) Equation $\tilde W' = \tilde A_+(\lambda)\tilde W$ has a 4-dimensional subspace of solutions associated with slow modes ($\mu_j^+$, $j = 1, 2, 3$, and $\mu_4^+$) and a 3-dimensional subspace of solutions associated with fast modes ($\mu_5^+$, $\mu_6^+$, $\mu_7^+$).


3.1.3. Description of the essential spectrum. We adopt Henry's definition of the essential spectrum [He]:

Definition 3.6 (Essential spectrum). Let $B$ be a Banach space and $T : D(T) \subset B \to B$ a closed, densely defined operator. The essential spectrum of $T$, denoted by $\sigma_{ess}(T)$, is defined as the complement of the set of all $\lambda$ such that $\lambda$ is either in the resolvent set of $T$, or is an eigenvalue with finite multiplicity that is isolated in the spectrum of $T$.

By Lemma 3.3, the matrix $A_\pm(\lambda)$ has a non-trivial center subspace if and only if $\lambda \in \mathcal{C}_\pm$,

$$\mathcal{C}_\pm := \{\lambda \in \mathbb{C},\ \det(-i\xi A_\pm - \xi^2 B_\pm + G_\pm - \lambda) = 0 \text{ for some } \xi \in \mathbb{R}\}.$$

The following lemma can be found in [He] (Theorem A.2, Chap. 5 of [He], based on Theorem 5.1, Chap. 1 of [GK]):

Lemma 3.7. The connected component of $\mathbb{C} \setminus (\mathcal{C}_- \cup \mathcal{C}_+)$ containing real +∞ is a connected component of the complement of the essential spectrum of $L(\varepsilon)$.

The reactive eigenvalues of $-i\xi A_\pm - \xi^2 B_\pm + G_\pm$ are $\lambda = i\xi s - \xi^2 d - k\phi_\pm$. For small $|\xi|$, the fluid eigenvalues satisfy

$$\lambda = i\alpha\xi - \beta\xi^2 + O(\xi^3), \qquad \alpha \in \mathbb{R},\ \beta > 0,$$

as described in Sect. 3.1.2; for large $|\xi|$, they satisfy

$$\lambda = -\xi^2(\alpha + O(\xi^{-1})) \quad \text{(parabolic eigenvalues)}, \tag{3.36}$$

with $\alpha \in \{\nu\tau_\pm^{-1},\ \kappa c^{-1}\tau_\pm^{-1}\}$, or

$$\lambda = is\xi + O(1) \quad \text{(hyperbolic eigenvalue)}. \tag{3.37}$$

This implies that the essential spectrum is confined to the shaded area in Fig. 3, the boundary of which is the union of an arc of parabola and two half-lines. (The origin $\lambda = 0$ is an eigenvalue, associated with eigenfunction $(\bar U^\varepsilon)'$; the existence of bifurcation eigenvalues $\gamma(\varepsilon) \pm i\tau(\varepsilon)$ is assumed in Theorem 1.18, the proof of which is given in Sect. 5.)

Remark 3.8. The essential spectrum, as given by Definition 3.6, is not stable under relatively compact perturbations (see [EE], Chap. 4, Ex. 2.2); namely, a domain of the complement of the essential spectrum of a (closed, densely defined) operator $T$ is either a subset of the complement of the essential spectrum of $T + S$, or is filled with point spectrum of $T + S$, where $S$ is a relatively compact perturbation of $T$.

Remark 3.9. By the Fréchet-Kolmogorov theorem, $L$ is a relatively compact perturbation of $L_\pm$. (This observation is the first step of the proof of Lemma 3.7, see Henry [He].) The pathology described in Remark 3.8 does not occur in the right half-plane here, as we know by an energy estimate that if $\lambda$ is large and real, $\lambda \notin \sigma_p(L)$.
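As a small illustration of the curves bounding the essential spectrum, consider the reactive branch λ(ξ) = iξs − ξ²d − kφ±. It is a parabola opening to the left with vertex at −kφ±. The sketch below samples this curve and checks the parabola relation; the function names and the parameter values are hypothetical, and the fluid branches are not treated.

```python
def reactive_curve(xi, s, d, k_phi):
    """Point lambda(xi) = i*xi*s - xi^2*d - k_phi on the reactive dispersion curve."""
    return complex(-d * xi * xi - k_phi, s * xi)

def on_parabola(lam, s, d, k_phi, tol=1e-12):
    """Check Re(lambda) = -d * (Im(lambda) / s)^2 - k_phi up to a tolerance."""
    return abs(lam.real - (-d * (lam.imag / s) ** 2 - k_phi)) <= tol

def sample_curve(s, d, k_phi, xi_max=5.0, n=101):
    """Sample the curve on a uniform grid of xi in [-xi_max, xi_max]."""
    step = 2.0 * xi_max / (n - 1)
    return [reactive_curve(-xi_max + i * step, s, d, k_phi) for i in range(n)]
```

Every sampled point has real part at most −kφ±, so this branch touches the imaginary axis only on the +∞ side, where φ+ = 0, and only at λ = 0.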


Fig. 3. Spectrum of L(ε)

3.1.4. Gap Lemma and dual basis. Let $\Lambda$ be the connected component of $\mathbb{C} \setminus (\mathcal{C}_- \cup \mathcal{C}_+)$ containing real +∞.

Definition 3.10 (Stable and unstable subspaces at ±∞). Given $\lambda \in \Lambda \cup B(0, r)$, $r$ as in Lemma 3.5, denote by $S(\tilde A_\pm(\lambda))$ the stable subspace of $\tilde A_\pm(\lambda)$ (i.e., the subspace of generalized eigenvectors associated with eigenvalues with negative real parts) and by $U(\tilde A_\pm(\lambda))$ the unstable subspace of $\tilde A_\pm(\lambda)$ (i.e., the subspace of generalized eigenvectors associated with eigenvalues with positive real parts). We define similarly $S(A_\pm(\lambda))$ and $U(A_\pm(\lambda))$.

By definition of $\mathcal{C}_\pm$, given $\lambda \in \Lambda$, the matrices $A_\pm(\lambda)$ do not have purely imaginary eigenvalues, so that $S(A_\pm(\lambda)) \oplus U(A_\pm(\lambda)) = \mathbb{C}^7$, and $S(\tilde A_\pm(\lambda)) \oplus U(\tilde A_\pm(\lambda)) = \mathbb{C}^7$, for all $\lambda \in \Lambda$.

Lemma 3.11. The vector spaces $S(\tilde A_\pm(\lambda))$ and $U(\tilde A_\pm(\lambda))$ have analytic bases in $\Lambda$.

Proof. By simple-connectedness of $\Lambda$, the lemma follows from a result of Kato ([Kat], II.4), that uses spectral separation in $\Lambda$.

Corollary 3.12. Equations (3.13) have analytic bases of solutions in $\Lambda$.

Proof. Basis elements of the stable and unstable spaces defined in Definition 3.10 are associated, through the flow of (3.13), with bases of solutions of (3.13). The matrices $\tilde A_\pm$ depending analytically on $\lambda$, the flow of (3.13) is analytic in $\lambda$.

Lemma 3.13. For $\lambda$ real and large, $\dim S(A_+(\lambda)) = \dim S(A_-(\lambda)) = 3$.

Proof. From Lemma 3.3, $\mu$ is an eigenvalue of $A_\pm(\lambda)$ if and only if $\lambda$ is an eigenvalue of $-\mu A_\pm + \mu^2 B_\pm + G_\pm$. As in Sect. 3.1.3, for large $\mu$, the eigenvalues of $-\mu A_\pm + \mu^2 B_\pm + G_\pm$ are $s\mu + O(1)$ (hyperbolic mode) and $\nu\tau_\pm^{-1}\mu^2 + O(\mu)$, $\kappa\tau_\pm^{-1}c^{-1}\mu^2 + O(\mu)$, $d\mu^2 + O(\mu)$ (parabolic modes), $c^{-1}$ as in Assumption 1.1. Inversion of these expansions gives three stable eigenvalues for both $A_-$ and $A_+$.


Fig. 4. Normal modes on the −∞ side

Remark 3.14. The above lemma implies in particular that $\Lambda$ is a domain of consistent splitting, as defined in [AGJ]. (See also Sect. 3.1 of [LyZ2].)

Given $\lambda \in \Lambda$, the flow of (3.13) associates basis elements of $S(\tilde A_+(\lambda))$ with solutions of (3.13) which are exponentially decaying as $y \to +\infty$, and basis elements of $U(\tilde A_-(\lambda))$ with solutions which are exponentially decaying as $y \to -\infty$. Similarly, the spaces $S(\tilde A_-(\lambda))$ and $U(\tilde A_+(\lambda))$ are associated with exponentially growing solutions, at −∞ and +∞ respectively.

Definition 3.15 (Decaying and growing normal modes). We call decaying dual normal mode at ±∞ any continuous family of dual normal modes $\{\lambda, \tilde U(\lambda)\}$, $\lambda \in B(0, r)$, $r$ as in Lemma 3.5, such that for all $\lambda \in \Lambda \cap B(0, r)$, $\tilde U(\lambda)$ corresponds to a decaying solution of (3.13) at ±∞. Families of normal modes which are not decaying are growing. We define similarly decaying dual normal modes and growing dual normal modes.

By continuity of the eigenvalues and spectral separation in $\Lambda$, if for some $\lambda \in \Lambda$ a continuous family of normal modes corresponds to a decaying (resp. growing) solution, then it corresponds for all $\lambda \in \Lambda$ to a decaying (resp. growing) solution. By (1.11), (3.24) and (3.29), $\mu_1^+$, $\mu_2^+$, $\mu_3^+$ and $\mu_4^+$ are growing (in the sense of Definition 3.15) at +∞, while $\mu_5^+$, $\mu_6^+$ and $\mu_7^+$ are decaying. Similarly, $\mu_3^-$, $\mu_5^-$ and $\mu_6^-$ are growing, while $\mu_1^-$, $\mu_2^-$, $\mu_4^-$ and $\mu_7^-$ are decaying. The normal modes with which the characteristics of (1.1) are associated are pictured in Figs. 4 and 5. In particular, slow normal modes associated with incoming characteristics are growing.

Fig. 5. Normal modes on the +∞ side

Definition 3.16 (Normal residuals). A map $(y, \lambda) \to \Theta^+(y, \lambda) \in \mathbb{C}^7$ defined on $[y_0, +\infty) \times B(0, r)$, for some $y_0 > 0$, $r > 0$, is said to belong to the class of normal residuals if it satisfies the estimates

$$|\Theta^+| \le C, \qquad |\partial_y \Theta^+| \le C(|\lambda| + e^{-\theta|y|})$$

for some $\theta > 0$ and $C > 0$, uniformly in $y \ge y_0$ and $\lambda \in B(0, r)$. We define similarly the class of normal residuals on $(-\infty, -y_0) \times B(0, r)$.

Lemma 3.17 (Fast dual modes). Equations (3.8) have solutions $\tilde W_4^-$, $\tilde W_5^+$, $\tilde W_6^+$, $\tilde W_7^+$ (growing)

and W˜ 5− , W˜ 6− , W˜ 7− (decaying),

which for λ ∈ B(0, r ), r possibly smaller than in Lemma 3.5, satisfy −yμ±j (λ) ˜ ± + λ ˜± , L ±j (0) + e−θ|y| y ≷ ±y0 , W˜ ± j =e 1j 2j

(3.38)

for some y0 > 0 independent of λ, where the constant vectors L ±j (0) are defined in ˜ ± , ˜ ± are normal residuals in the sense of Definition 3.16. Sect. 3.1.2, and 1j

2j

Proof. With the description of the normal modes in Lemma 3.5, this is a direct application of the Gap Lemma (for instance in the form of Proposition 9.1 of [MaZ3]).

Lemma 3.18 (Slow dual modes). Equations (3.8) have solutions $\tilde W_1^-$, $\tilde W_2^-$ (growing) and $\tilde W_3^-$, $\tilde W_1^+$, $\tilde W_2^+$, $\tilde W_3^+$, $\tilde W_4^+$ (decaying),

which for λ ∈ B(0, r ), r possibly smaller than in Lemma 3.5, satisfy −yμ±j (λ) ± ˜ ± , y ≷ ±y0 , W˜ ± L = e (0) + λ j j j

(3.39)


for some y0 > 0 independent of λ, where the constant vectors L ±j (0) are defined in ˜ ± are normal residuals. Sect. 3.1.2, and j

Proof. The Conjugation Lemma ([MeZ]; Lemma 3.1 of [MaZ3]) implies that there ˜ + (·, λ)}λ∈B(0,r ) , for some r > 0 possiexists a family of matrix-valued applications { ˜ + is invertible for all λ and bly smaller than in Lemma 3.5, such that the matrix Id + + ˜ is smooth in y and analytic in λ, with exponential bounds y, the application +

˜ | ≤ C jk e−θ y , |∂λ ∂xk j

for some θ > 0, C jk > 0, for y ≥ y0 ,

for some y0 > 0, and such that any solution W˜ of (3.8) has the form ˜ + )V˜ + , W˜ = (Id +

for y ≥ y0 ,

(3.40)

where V˜ + is a dual normal mode, and, conversely, if V˜ + is a dual normal mode, then W˜ defined by (3.40) solves (3.8) on y ≥ y0 . Equation (3.8) at λ = 0 has a four-dimensional subspace of constant solutions; let {W˜ 0j }1≤ j≤4 be a generating family. The normal modes with which, through (3.40), the W˜ 0j are associated are slow normal modes. Hence, by Lemma 3.5, there exist coordinates c jk such that ˜ + (·, 0)) W˜ 0j = (Id + c jk L +k (0), y ≥ y0 , 1≤k≤4

which implies in particular that the matrix c := (c jk )1≤ j,k≤4 is invertible. Then, for 1 ≤ j ≤ 4, ˜ + (·, 0))L + (0) = (Id + (c−1 ) jk W˜ k0 , j 1≤k≤4 +

˜ (·, 0))L + (0) is constant, hence, by exponential decay of ˜ + , equal in particular, (Id + j to L +j (0). We can conclude that, for 1 ≤ j ≤ 4, ˜ + )V˜ + W˜ +j := (Id + j (where V˜ j+ is defined in Lemma 3.5) is a solution of (3.8) on y ≥ y0 , which can be put in the form (3.39). The proof on the −∞ side is based similarly on the decomposition of the fluid components of the W˜ 0j onto the (fluid) dual slow modes V˜ j− , for 1 ≤ j ≤ 3. 3.1.5. Duality relation and forward basis. We use the duality relation, introduced in [MaZ3], W˜ tr SW = 1

(3.41)

that relates solutions W of the forward Eq. (3.3) with solutions W˜ of the adjoint Eq. (3.8) through the conjugation matrix in (τ, w, bw ) coordinates ⎛ ⎞ −A11 −A12 0 S := ⎝ −A21 −A22 IdC3 ⎠, 0 −IdC3 0


where A is the convection matrix defined in (1.18). Namely, W is a solution of (3.3) if and only if it satisfies (3.41) for all solutions W˜ of (3.8), and conversely W˜ is a solution of (3.8) if and only if it satisfies (3.41) for all solutions W of (3.3). (See Lemma 4.2, [MaZ3]; note that the reactive term contains no derivative, hence does not play any role here.) Remark that there exist vectors rk± such that ±j A±rk± = −δ jk , Let Rk± be vectors of the form Rk±

:=

rk± ∗

1 ≤ j, k ≤ 4.

+ e−θ ± 1k ,

(3.42)

(3.43)

where for 1 ≤ k ≤ 4, rk± are given by (3.43), and where ± 1k are normal residuals. With the notation of Lemmas 3.17 and 3.18, let L ±j (0) if μ±j is slow, ¯L ± := (3.44) j ˜± L ±j (0) + e−θ|y| if μ±j is fast. 1j Lemma 3.19 (Forward and dual basis) For some r > 0 and y0 > 0, • Equation (3.3) has analytic bases of solutions {W1± , . . . , W7± }λ∈∪B(0,r ) , for y ≷ ±y0 ; • Equation (3.8) has analytic bases of solutions {W¯ 1± , . . . , W¯ 7± }λ∈∪B(0,r ) , for y ≷ ±y0 , such that for λ ∈ B(0, r ), ±

xμ j (λ) ± W± (R ± j =e j + λ j ),

W¯ ± j =e

−yμ±j (λ)

¯ ± ), ( L¯ ±j + λ j

y ≷ ±y0 , y ≷ ±y0 ,

(3.45) (3.46)

± ¯± ¯± where R ± j and L j are defined in (3.43) and (3.44), and j and j are normal residuals; the fast forward modes W4− and W7+ satisfy also ε (U¯ ) (x) ± + λ±j (x, λ), x ≷ ±y0 , (3.47) W j (x, λ) = ∗

where |±j | + |∂x ±j | ≤ Ce−θ|x| , for some C, θ > 0, uniformly in λ ∈ B(0, r ). 7 Proof. Given a family {F1, . . . , F7 } of vectors in C , let col(F j ) denote the 7 × 7 matrix col(F j ) := F1 . . . F7 . Let y0 , r, and W˜ ± j as in Lemma 3.17 and 3.18. For all λ ∈ ∪ B(0, r ), the families − − ˜ ˜ {W , . . . , Wn } and {W˜ + , . . . , W˜ n+ } are bases of solutions of (3.8), on y ≤ −y0 and 1

1

˜ 0± := col(W˜ ± ) are invertible y ≥ y0 respectively. In particular, the 7 × 7 matrices W j for all λ ∈ B(0, r ) and y ≷ ±y0 . Let ˜ 0± )tr S)−1 =: col(W 0± ). W0± := ((W k

(3.48)


For the forward modes W 0± j defined in (3.48) to satisfy the low-frequency description ±

xμ j (λ) −θ|x| 0± W 0± (R 0± 1 j + λ0± j =e j +e 2 j ),

y ≷ ±y0 ,

(3.49)

0± where R 0± j are constant vectors and j are normal residuals, it suffices, by (3.41), that

0± 0± the matrices R0± := col(R 0± j ) and := col( j ) satisfy

Ltr SR0 = IdC7 , 0 ˜ 1 )tr S01 = − ˜ tr (L + e−θ|x| 1 SR , ˜ 01 (L + e−θ|x|

˜ 2 )tr S02 = + λ

0 ˜ tr − 2 SR ,

(3.50) (3.51) (3.52)

±

˜ := col( ¯ ± ) appear in the low-frequency description where L± := col(L ±j (0)) and j of the W˜ ± . In (3.50)–(3.52), the ± exponents are omitted. The matrices L± being invertj ible, (3.50) (with + or −) has a unique solution, and, for y0 large enough and r small enough, Eqs. (3.51) and (3.52) have unique solutions in the class of normal residuals. Note that for 1 ≤ j, k ≤ 4, Eq. (3.50) reduces to (3.42), up to exponentially decaying terms, so that the vectors Rk0± have the form (3.43). Remark now that (U¯ ε ) satisfies L(ε)(U¯ ε ) = 0, and decays at both −∞ and +∞, hence (U¯ ε ) is associated with decaying fast normal modes; by Lemma 3.17, there exist constants c±j , such that ε (U¯ ) (y) = c4 − W40− |λ=0 = c+j W 0+ (3.53) j |λ=0 . ∗ 5≤ j≤7

We may assume, without loss of generality, that c7+ = 0. Let now W− := W10− W20− W30− c4− W40− W50− W60− W70− , W+ := W10+ W20+ W30+ W40+ W50+ W60+ 7j=5 c+j W 0+ , j and W± =: col(W ± j ). These forward modes satisfy (3.45) and (3.47). Let finally ±tr ± −1 ˜ 0± and ¯ := (SW ) =: col(W¯ ± W j ), so that, in particular, the slow modes of W ¯ ± coincide. We can prove as above that the low-frequency description (3.49) of the W forward modes carries over to the dual modes through the duality relation, so that (3.46) is satisfied. 3.1.6. The resolvent kernel. Let L 2 (, D (R)) := {φ ∈ D ( × R), for all ϕ ∈ D(R), φ, ϕ ∈ L 2 ()}. A linear continuous operator T : L 2 (R) → L 2 (R) operates on L 2 (R, D (R)), by T φ, ϕ := T φ, ϕ. Let τ(·) δ ∈ L 2 (R, D (R)) be defined by τx δ, ϕ = ϕ(x), for all x ∈ R. Definition 3.20 (Resolvent kernel). Given λ in the resolvent set of L(ε), define the resolvent kernel Gλ of L(ε) as an element of L 2 (Rx , D (R y )) by Gλ := (L(ε) − λ)−1 τ(·) δ.


Given y ∈ R, let s y = sgn(y), and s ˜ D(y) := { j, μ˜ jy slow and decaying},

so that ˜ D(y) = {3}, if y < 0,

˜ D(y) = {1, 2, 3, 4}, if y > 0.

Given x, y ∈ R, let D(x, y) be the set of all ( j, k) such that for all x, y, for λ > 0 and s |λ| small enough, (μsjx x − μky y) < 0, that is, s

D(x, y) = {( j, k), μsjx andμ˜ ky slow and decaying} {( j, j), sx = s y , |y| < |x|, μsjx slow and decaying} s {( j, j), sx = s y , |x| < |y|, μ˜ jy slow and decaying}, so that

⎧ {(1, 1), (2, 2), (1, 3), (2, 3)}, ⎪ ⎪ ⎪ {(1, 3), (2, 3), (3, 3)}, ⎪ ⎪ ⎨ ∅, D(x, y) := {( j, k), 1 ≤ j ≤ 4, 1 ≤ k ≤ 2}, ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ {( j, j), 1 ≤ j ≤ 4}, ∅,

x ≤ y ≤ 0, y ≤ x ≤ 0, y ≤ 0 ≤ x, x ≤ 0 ≤ y, 0 ≤ x ≤ y, 0 ≤ y ≤ x.

Define now the excited term Eλ (x, y) := λ−1 (U¯ ε ) (x)

s

s tr −yμ jy (λ)

[c0j,s y ] jy

e

,

˜ j∈D(y)

and the scattered term

Sλ (x, y) :=

( j,k)∈D(x,y)

s

s tr xμsjx (λ)−yμky (λ)

j,s

[ck,sxy ]r sj x ky

e

,

where the vectors ±j are defined in (3.26) and (3.31), the vectors r ± j are defined in (3.42), j,±

0 ] and [c and the transmission coefficients [ck,± k,± ] are constants.

Proposition 3.21. Under (1.20), for λ ∈ B(0, r ), the radius r being possibly smaller 0 ] and [c j,± ] such that the than in Lemma 3.5, there exist transmission coefficients [ck,± k,± resolvent kernel decomposes as Gλ = Eλ + Sλ + Rλ , where Rλ satisfies

|∂xα ∂ yα Rλ | ≤ Ce−θ|x−y| + Cλα e−θ|x|

+ λ

˜ j∈D(y) 1+min(α,α )

+ λα e−θ|x|

sy

e−yμ j

sx

e xμk

sy

−yμ j

,

( j,k)∈D(x,y)

for α ∈ {0, 1, 2, }, α ∈ {0, 1}, for some C, θ > 0, uniformly in x, y and λ ∈ B(0, r ).


Proof. The duality relation (3.41) allows us to apply Proposition 4.6 of [MaZ3] (and its Corollary 4.7), which describes $G_\lambda$ as sums of pairings of forward and dual modes, for $\lambda$ in the intersection of $\Lambda$ and the resolvent set of $L$. By Lemma 3.19, $G_\lambda$ extends as a meromorphic map on $B(0, r)$. The excited term $E_\lambda$ comprises the pole terms, corresponding to pairings of a fast, decaying forward mode associated with the derivative of the background wave with a slow, decaying dual mode, i.e. $W_7^+/\bar W_3^-$ for $y \le 0$ and $W_4^-/\bar W_j^+$ for $y \ge 0$, $1 \le j \le 4$. The next-to-leading order term is the scattered term $S_\lambda$. It corresponds to pairings of a slow forward mode with a slow dual mode. For $y \le 0$, the scattered term comprises only fluid modes. For $y \le 0 \le x$ and for $0 \le y \le x$, the scattered term vanishes, as there are no outgoing modes to the right of the shock (see Figs. 1 and 4). By the Evans function condition (1.20) and Lemma 6.11 of [MaZ3], the residual $R_\lambda$ does not contain any pole term; it comprises: (a) the contribution of the normal residuals to the fast forward/slow dual pairings involving the derivative of the background profile, (b) the fast forward/slow dual pairings not involving the derivative of the background profile, (c) the contribution of the normal residuals to the slow forward/slow dual pairings, and (d) the slow forward/fast dual pairings. Term (a) is bounded by the first two terms in the upper bound for $R_\lambda$. Term (b) is smaller than term (a) by an $O(\lambda)$ factor. Term (c) is bounded by the third term in the upper bound. By the Lax condition (1.11), the Evans function condition (1.20) and Lemma 6.11 of [MaZ3], term (d) is also bounded by the third term.

3.1.7. The Evans function. By Lemma 3.13, for all $\lambda \in \Lambda$, the dimensions of $U(A_-(\lambda))$ and $S(A_+(\lambda))$, the vector spaces associated with decaying solutions of (3.3) at −∞ and +∞, add up to the full dimension of the ambient space: $\dim U(A_-(\lambda)) + \dim S(A_+(\lambda)) = 7$.

Definition 3.22 (Evans function). On $\Lambda \cup B(0, r)$, define the Evans function as

$$D(\varepsilon, \lambda) := \det(W_1^-, W_2^-, W_4^-, W_7^-, W_5^+, W_6^+, W_7^+)|_{x=0}.$$

The Evans function $D$ satisfies Proposition 1.10; it has a zero at $\lambda = 0$, as reflected in equality (3.53).

3.2. Inverse Laplace transform. Similarly as in Sect. 3.1.6 (or Sect. 2 of [MaZ3]), define the Green function of $L(\varepsilon)$ as

$$G := e^{tL(\varepsilon)} \tau_{(\cdot)}\delta, \tag{3.54}$$

where $\{e^{tL(\varepsilon)}\}_{t \ge 0}$ is the semi-group generated by $L(\varepsilon)$. That is, the kernel of the integral operator $e^{t_0 L(\varepsilon)}$ is the Green function $G$ evaluated at $t = t_0$. Assuming (1.20), the inverse Laplace transform representation of the semi-group by the resolvent operator (see for instance [Pa] Theorem 7.7; [Z3] Prop. 6.24) yields

$$G(\varepsilon, x, t; y) = \frac{1}{2\pi i}\,\mathrm{P.V.}\int_{\eta_0 - i\infty}^{\eta_0 + i\infty} e^{\lambda t} G_\lambda(\varepsilon, x, y)\, d\lambda, \tag{3.55}$$

for $\eta_0 > 0$ sufficiently large.


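3.2.1. Pointwise Green function bounds. Introduce the notations
\[
\mathrm{errfn}(y) := \int_{-\infty}^{y} e^{-z^2}\,dz,
\]
and let, for $y < 0$:
\[
e := [c^0_{3,-}]\,\ell_3^{-\,\mathrm{tr}}
\left(\mathrm{errfn}\!\left(\frac{y - a_3^- t}{\sqrt{4\beta_3^- t}}\right)
- \mathrm{errfn}\!\left(\frac{y + a_3^- t}{\sqrt{4\beta_3^- t}}\right)\right), \tag{3.56}
\]
for $y > 0$:
\[
e := \sum_{1 \le j \le 4} [c^0_{j,+}]\,\ell_j^{+\,\mathrm{tr}}
\left(\mathrm{errfn}\!\left(\frac{y - a_j^+ t}{\sqrt{4\beta_j^+ t}}\right)
- \mathrm{errfn}\!\left(\frac{y + a_j^+ t}{\sqrt{4\beta_j^+ t}}\right)\right), \tag{3.57}
\]
and
\[
E(\varepsilon,x,t;y) := (\bar U^\varepsilon)'(x)\,e(\varepsilon,t;y). \tag{3.58}
\]
In (3.56)–(3.57) and below, the $\{a_j^\pm\}_{1 \le j \le 4}$ are the characteristic speeds, i.e. the limits at $\pm\infty$ of the eigenvalues of $\partial_U F(\varepsilon,\bar U^\varepsilon)$, ordered as in (1.3), the $\beta_j^\pm$, $1 \le j \le 3$, are the positive diffusion rates that were introduced in (3.28) (and which depend on $\varepsilon$, as do the characteristic speeds), and $\beta_4^+ := d$, the species diffusion coefficient.

Let for $y < 0$:
\[
\begin{aligned}
S := {} & \chi_{\{t \ge 1\}} \sum_{1 \le j \le 2} r_j^-\,\ell_j^{-\,\mathrm{tr}}\,(4\pi\beta_j^- t)^{-\frac12}\,
e^{-(x - y - a_j^- t)^2/4\beta_j^- t} \\
& + \chi_{\{t \ge 1\}}\,\frac{e^{-x}}{e^{x} + e^{-x}}
\Bigg( r_3^-\,\ell_3^{-\,\mathrm{tr}}\,(4\pi\beta_3^- t)^{-\frac12}\,
e^{-(x - y + (s - \sigma_-)t)^2/4\beta_3^- t} \\
& \qquad\qquad + \sum_{1 \le j \le 2} [c^{j,-}_{3,-}]\,r_j^-\,\ell_3^{-\,\mathrm{tr}}\,
(4\pi\beta^{j,-}_{3,-} t)^{-\frac12}\,
e^{-(x - z^{j,-}_{3,-})^2/4\beta^{j,-}_{3,-} t} \Bigg),
\end{aligned} \tag{3.59}
\]
and for $y > 0$:
\[
\begin{aligned}
S := {} & \chi_{\{t \ge 1\}}\,\frac{e^{x}}{e^{-x} + e^{x}}
\sum_{1 \le j \le 4} r_j^+\,\ell_j^{+\,\mathrm{tr}}\,(4\pi\beta_j^+ t)^{-\frac12}\,
e^{-(x - y - a_j^+ t)^2/4\beta_j^+ t} \\
& + \chi_{\{t \ge 1\}}\,\frac{e^{-x}}{e^{-x} + e^{x}}
\sum_{\substack{1 \le j \le 4 \\ 1 \le k \le 2}} [c^{k,-}_{j,+}]\,r_k^-\,\ell_j^{+\,\mathrm{tr}}\,
(4\pi\beta^{k,-}_{j,+} t)^{-\frac12}\,
e^{-(x - z^{k,-}_{j,+})^2/4\beta^{k,-}_{j,+} t},
\end{aligned} \tag{3.60}
\]
where the indicator function $\chi_{\{t \ge 1\}}$ is identically equal to 1 for $t \ge 1$ and 0 otherwise, and
\[
z^{j,\pm}_{k,\pm} := a_j^\pm\big(t - |y|\,|a_k^\pm|^{-1}\big), \qquad
\beta^{j,\pm}_{k,\pm} := \frac{|x|}{|a_j^\pm|\,t}\,\beta_j^\pm
+ \frac{|y|}{|a_k^\pm|\,t}\left(\frac{a_j^\pm}{a_k^\pm}\right)^{2}\beta_k^\pm .
\]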
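The scalar building block of the excited term can be evaluated with the standard error function, since $\mathrm{errfn}(y) = \tfrac{\sqrt{\pi}}{2}(1 + \operatorname{erf}(y))$. The following is a minimal sketch for a single mode; the function names and the sample values of the speed `a` and diffusion rate `beta` are illustrative assumptions, and the transmission coefficient and eigenvector factors of (3.56)–(3.57) are omitted.

```python
import math


def errfn(y: float) -> float:
    """Integral of exp(-z^2) over (-inf, y], written via the standard erf."""
    return 0.5 * math.sqrt(math.pi) * (1.0 + math.erf(y))


def excited_profile(y: float, t: float, a: float, beta: float) -> float:
    """Scalar bracket of (3.56)-(3.57) for one mode.

    a and beta are an assumed characteristic speed and diffusion rate;
    the coefficient [c] and the left eigenvector are left out.
    """
    scale = math.sqrt(4.0 * beta * t)
    return errfn((y - a * t) / scale) - errfn((y + a * t) / scale)


if __name__ == "__main__":
    for t in (1.0, 10.0, 100.0, 1.0e4):
        print(t, excited_profile(-1.0, t, a=1.0, beta=1.0))
```

For fixed $y$ and nonzero speed, the bracket approaches a constant of modulus $\sqrt{\pi}$ as $t \to \infty$, which is the time-asymptotic phase shift carried by the excited term.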


Let
\[
H := h(\varepsilon,t,x,y)\,\tau_{x+st}\delta, \qquad h\,\ell_4^+ \equiv 0, \tag{3.61}
\]
where the notation $\tau_{(\cdot)}\delta$ was introduced at the beginning of Sect. 3.1.6. Let finally $S_0$ be the scattered term defined in (3.59)–(3.60) in which $[c^{j,\pm}_{k,\pm}] = 1$ for all $j, k$.

Proposition 3.23. Under (1.20), there exist transmission coefficients $[c^0_{j,\pm}]$ and $[c^{j,\pm}_{k,\pm}]$, satisfying
\[
\begin{cases}
[c^0_{4,+}] = 0, \\
r_k^\pm = [c^0_{k,\pm}]\,(v_-^\varepsilon - v_+^\varepsilon) + [c^{1,-}_{k,\pm}]\,r_1^- + [c^{2,-}_{k,\pm}]\,r_2^-, \\
[c^{1,-}_{4,+}] = [c^{2,-}_{4,+}] = 0,
\end{cases} \tag{3.62}
\]
where $1 \le k \le 3$ if the sign $\pm$ is $+$ and $1 \le k \le 2$ if it is $-$, and $U_\pm^\varepsilon =: (v_\pm^\varepsilon, z_\pm^\varepsilon)$, such that the Green function $G(\varepsilon,x,t;y)$ defined in (3.54) may be decomposed as a sum of hyperbolic, excited, scattered, and residual terms, as follows:
\[
G = H + E + S + R, \tag{3.63}
\]
where $H$, $E$ and $S$ are defined in (3.56)–(3.61), with the estimates
\[
\begin{aligned}
|\partial_t^k \partial_x^\alpha \partial_y^{\alpha'} h| \le {} & C e^{-\theta t}, \\
|\partial_x^\alpha \partial_y^{\alpha'} R| \le {} & C e^{-\theta(|x-y| + t)}
+ C\Big(t^{-\frac12(1+\alpha+\alpha')}(t+1)^{-\frac12} + e^{-\theta t}\Big)\,e^{-(x-y)^2/Mt} \\
& + C\Big(\big(t^{-\frac12} + e^{-\theta|x|}\big)\,t^{-\frac12(\alpha+\alpha')}
+ \alpha'\,t^{-\frac12}\,e^{-\theta|y|}\Big)\,|S_0|,
\end{aligned} \tag{3.64}
\]
uniformly in $\varepsilon$, for $k \in \{0,1\}$, $\alpha \in \{0,1,2\}$, $\alpha' \in \{0,1\}$, for some $\theta, C, M > 0$.

Proof. We only check (3.62), as decomposition (3.63) and bounds (3.64) are easily deduced from Proposition 7.1 of [MaZ3] and Proposition 7.3 of [LRTZ]. (See also Proposition 3.7 of [TZ2], especially Eqs. (3.30)–(3.33) and (3.38).) The description of the residue of $G_\lambda$ at $\lambda = 0$ for $y < 0$ and $y > 0$ implies
\[
[c^0_{3,-}]\,\ell_3^- = [c^0_{4,+}]\,\ell_4^+ + \sum_{1 \le j \le 3} [c^0_{j,+}]\,\ell_j^+,
\]
corresponding to Eq. (1.34) in [MaZ3]. The (reactive) left eigenvector $\ell_4^+$ being orthogonal to the (fluid) left eigenspace $\mathrm{span}\{\ell_j^\pm\}_{1 \le j \le 3}$ (see (3.26) and (3.31)), this implies (3.62)(i). Given $U_0 \in L^1$, the estimates for $H$ and $R$ imply
\[
\lim_{t \to +\infty} \int_{\mathbb{R}^2} (H + R)\,U_0\,dy\,dx = 0.
\]
Hence, by conservation of mass in the fluid variables, (1.7), for all $U_0 \in L^1$,
\[
\lim_{t \to +\infty} \int_{\mathbb{R}^2} \pi(E + S)\,U_0\,dy\,dx = \int_{\mathbb{R}} \pi U_0\,dy, \tag{3.65}
\]


where $\pi : \mathbb{C}^4_{v,z} \to \mathbb{C}^3_v$ is defined by $\pi(v,z) := v$. (Eq. (3.65) corresponds to (1.33) and (7.60) in [MaZ3].) Taking $U_0 \in \mathrm{span}\{\ell_j^\pm\}_{1 \le j \le 3}$, we find (3.62)(ii), and taking $U_0$ parallel to $\ell_4^+$, we find (3.62)(iii).

Remark 3.24. The terms $E$ and $S$ correspond to the low-frequency part of the representation of $G$ by inverse Laplace transform of the resolvent kernel $G_\lambda$, while the term $H$ corresponds to the high-frequency part. As observed in [MaZ3], for low frequencies, the resolvent kernel in the case of real (physical) viscosity obeys essentially the same description as in the artificial (Laplacian) viscosity case, hence the estimates on $E$ and $S$ follow by the analysis in [LRTZ] of the corresponding artificial viscosity system, specialized to the case of strong detonations (more general waves were treated in [LRTZ]). The estimate of the terms $H$ and $R$ follows exactly as for the nonreactive case treated in [MaZ3,Z2].

Remark 3.25. Bound (3.64)(ii) is implied by bounds (7.1)–(7.4) of Proposition 7.1 of [MaZ3] and bounds (3.30), (3.32) and (3.38) of Proposition 3.7 of [TZ2]. Here the contribution of the hyperbolic, delta-function terms to the upper bounds for the spatial derivatives of $R$ is absorbed in $H$, and the short-time, $t \le |a_k^\pm||y|$, contributions of the scattered terms are absorbed in the generic parabolic residual term $e^{-\theta t} e^{-(x-y)^2/Mt}$.

Corollary 3.26. The excited terms $E_\lambda$ and $E$ contain only fluid terms: $E_\lambda\,\ell_4^+ \equiv 0$ and $E\,\ell_4^+ \equiv 0$.

Proof. The equality $E\,\ell_4^+ \equiv 0$ follows from (3.62)(i). The resolvent kernel $G_\lambda$ is the Laplace transform of the Green function $G$, so that the coefficients $[c^0_{j,\pm}]$ in Propositions 3.21 and 3.23 must agree. Hence, (3.62)(i) implies also $E_\lambda\,\ell_4^+ \equiv 0$.

Corollary 3.27. For all $\eta > 0$, for some $C, M > 0$, some $\theta_1(\eta,s) > 0$, the following bounds hold, for $\alpha \in \{0,1,2\}$:
\[
\begin{aligned}
|e^{-\eta y^+}\,\partial_x^\alpha S\,\ell_4^+| &\le C e^{-\theta_1 t}\,e^{-\eta|x-y|/2}, \\
|e^{-\eta y^+}\,\partial_x^\alpha R\,\ell_4^+| &\le C e^{-\theta_1 t}\big(e^{-\eta|x-y|/2} + e^{-(x-y)^2/Mt}\big).
\end{aligned} \tag{3.66}
\]

Proof. By (3.62)(iii), the contribution of the reactive modes to $S$ is
\[
\chi_{\{y > 0\}}\,\chi_{\{t \ge 1\}}\,\frac{e^{x}}{e^{x} + e^{-x}}\;r_4^+\,\ell_4^{+\,\mathrm{tr}}\,(4\pi d t)^{-1/2}\,e^{-(x - y + st)^2/4dt}. \tag{3.67}
\]
Given $0 \le x \le y$, we can bound $e^{-\eta y}\,e^{-(x-y+st)^2/4dt}$, for $|x - y| \le \tfrac12 st$, by
\[
e^{-\eta y}\,e^{-(st/2)^2/4t} \le e^{-\eta y}\,e^{-s^2 t/16} \le e^{-\eta|y-x|}\,e^{-s^2 t/16},
\]
and, for $|x - y| > \tfrac12 st$, by
\[
e^{-\eta y} \le e^{-\eta|y-x|/2}\,e^{-\eta y/2} \le e^{-\eta|y-x|/2}\,e^{-\eta s t/4},
\]
and this implies (3.66)(i). To prove (3.66)(ii), we note that the contribution of the parabolic terms $t^{-\frac12}(t+1)^{-\frac12}\,e^{-(x-y)^2/Mt}$ and $S_0$ to $R\,\ell_4^+$ comes from Riemann saddle-point estimates of the sole scattered terms $S_\lambda\,\ell_4^+$ (see the proof of Proposition 7.1 in [MaZ3] for more details). Hence (3.66)(i) implies (3.66)(ii).

Remark 3.28. The proof of Proposition 7.1 of [MaZ3] shows that Proposition 3.23 applies more generally to linear operators of the form (1.17) that satisfy (1.20) and the conditions (A1)–(A2), (H1)–(H4) of Sect. 1.5.


3.2.2. Convolution bounds. From the pointwise bounds of Proposition 3.23 and Corollaries 3.26 and 3.27, we obtain by standard convolution bounds the following $L^q \to L^p$ estimates, exactly as described in [MaZ1,MaZ2,MaZ3,MaZ4,Z2] for the viscous shock case.

Corollary 3.29. Under (1.20), for all $t \ge 1$, some $C > 0$, any $\eta > 0$, for any $1 \le q \le p$, $1 \le p \le +\infty$, and $f \in L^q \cap W^{1,p}$,
\[
\begin{aligned}
\Big|\int_{\mathbb{R}} (S + R)(\cdot,t;y)\,f(y)\,dy\Big|_{L^p}
&\le C t^{-\frac12(\frac1q - \frac1p)}\,|f|_{L^q}, \\
\Big|\int_{\mathbb{R}} \partial_y(S + R)(\cdot,t;y)\,f(y)\,dy\Big|_{L^p}
&\le C t^{-\frac12(\frac1q - \frac1p) - \frac12}\,|f|_{L^q} + C e^{-\eta t}\,|f|_{L^p}, \\
\Big|\int_{\mathbb{R}} (S + R)(\cdot,t;y)\,\ell_4^+\,e^{-\theta y^+} f(y)\,dy\Big|_{L^p}
&\le C t^{-\frac12(\frac1q - \frac1p) - \frac12}\,|f|_{L^q} + C e^{-\eta t}\,|f|_{L^p}, \\
\Big|\int_{\mathbb{R}} H(\cdot,t;y)\,f(y)\,dy\Big|_{L^p}
&\le C e^{-\eta t}\,|f|_{L^p},
\end{aligned}
\]
where $y^+ := \max(y,0)$. Likewise, for all $x$ and all $t \ge 0$,
\[
\begin{aligned}
|\partial_y e(\cdot,t)|_{L^p} + |\partial_t e(\cdot,t)|_{L^p} &\le C t^{-\frac12(1 - \frac1p)}, \\
|\partial_t \partial_y e(\cdot,t)|_{L^p} &\le C t^{-\frac12(1 - \frac1p) - \frac12}.
\end{aligned}
\]
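The algebraic rates in Corollary 3.29 are those of a heat kernel: by Young's inequality, $|K_t * f|_{L^p} \le |K_t|_{L^r}|f|_{L^q}$ with $1 + \frac1p = \frac1r + \frac1q$, and a Gaussian kernel has $|K_t|_{L^r} \sim t^{-\frac12(1 - \frac1r)} = t^{-\frac12(\frac1q - \frac1p)}$. The sketch below checks this scaling numerically for a single scalar Gaussian; the kernel, the diffusion rate `beta` and the quadrature parameters are illustrative assumptions, not the full scattered term of (3.59)–(3.60).

```python
import math


def heat_kernel(x: float, t: float, beta: float) -> float:
    """Scalar Gaussian (4 pi beta t)^(-1/2) exp(-x^2 / (4 beta t))."""
    return math.exp(-x * x / (4.0 * beta * t)) / math.sqrt(4.0 * math.pi * beta * t)


def lp_norm(t: float, beta: float, p: float,
            half_width: float = 60.0, n: int = 120001) -> float:
    """Trapezoid approximation of the L^p norm in x of the kernel at time t."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * heat_kernel(x, t, beta) ** p
    return (total * h) ** (1.0 / p)


def lp_norm_exact(t: float, beta: float, p: float) -> float:
    """Closed form (4 pi beta t)^(-(1/2)(1 - 1/p)) * p^(-1/(2p))."""
    return (4.0 * math.pi * beta * t) ** (-0.5 * (1.0 - 1.0 / p)) * p ** (-1.0 / (2.0 * p))


if __name__ == "__main__":
    for p in (1.0, 2.0, 4.0):
        for t in (1.0, 4.0):
            print(p, t, lp_norm(t, 1.0, p), lp_norm_exact(t, 1.0, p))
```

Quadrupling $t$ multiplies the $L^2$ norm by $4^{-1/4}$, the rate $t^{-\frac12(1 - \frac1p)}$ with $p = 2$.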

4. Stability: Proof of Theorem 1.14

We often omit to indicate dependence on $\varepsilon$ in the proof below. All the estimates are uniform in $\varepsilon$.

4.1. Linearized stability criterion. Proof of Theorem 1.14. Linear case. Sufficiency of (1.20) for linearized orbital stability follows immediately by the bounds of Corollary 3.29, exactly as in the viscous shock case, setting
\[
\delta(t) := \int_{\mathbb{R}} e(x,t;y)\,U_0(y)\,dy
\]
so that
\[
U - \delta(t)\,\bar U' = \int_{\mathbb{R}} (H + S + R)(x,t;y)\,U_0(y)\,dy;
\]
see [ZH,MaZ3,Z2] for further details. Necessity follows from more general spectral considerations not requiring the detailed bounds of Proposition 3.23; see the discussion of effective spectrum in [ZH,MaZ3,Z2]. The argument goes again exactly as in the viscous shock case.


4.2. Auxiliary energy estimate. Let $\tilde U$ be the solution of (1.2) issued from $\tilde U_0$, and let
\[
U(x,t) := \tilde U(x + \delta(t), t) - \bar U(x). \tag{4.1}
\]
Then, the following auxiliary energy estimate holds.

Lemma 4.1 (Proposition 4.15, [Z2]). Under the hypotheses of Theorem 1.14, assume that $\tilde U_0 \in H^3$, and suppose that, for $0 \le t \le T$, the suprema of $|\dot\delta|$ and the $H^3$ norm of $U$ each remain bounded by a sufficiently small constant. Then, for all $0 \le t \le T$, for some $\theta > 0$,
\[
|U(t)|_{H^3}^2 \le C e^{-\theta t}\,|U(0)|_{H^3}^2 + C \int_0^t e^{-\theta(t-s)}\big(|U|_{L^2}^2 + |\dot\delta|^2\big)(s)\,ds.
\]

4.3. Nonlinear stability. Proof of Theorem 1.14. Nonlinear case. Let $U$ be the perturbation variable associated with solution $\tilde U$ as in (4.1); by a Taylor expansion, $U$ solves the perturbation equation
\[
\partial_t U - LU = \partial_x Q_f(U,\partial_x U) + Q_r(U) + \dot\delta(t)\,(\bar U' + \partial_x U),
\]
where the linear operator $L$ is defined in (1.17), and
\[
|Q_f| \le C|U|\,(|U| + |\partial_x U|), \tag{4.2}
\]
where $C$ depends on $\|U\|_{L^\infty}$ and $\|\bar U\|_{W^{1,\infty}}$.

Lemma 4.2. Under Assumptions 1.1, 1.2, if the temperature $T$ associated with solution $U$ satisfies $\|T\|_{L^\infty} < T_i - T_+$ (by Lemma 1.6, $0 < T_i - T_+$), then the nonlinear reactive term $Q_r$ has the form
\[
Q_r(U) = \ell_4^+\,e^{-\eta_0 x^+}\,q_r(U), \tag{4.3}
\]
where $x^+ := \max(x,0)$, $\eta_0 > 0$ is as in Corollary 1.8, and $q_r(U) = q_r(w,z)$ is a scalar such that
\[
|q_r(U)| \le C|U|^2, \tag{4.4}
\]

where $C$ depends on $\|U\|_{L^\infty}$ and $\|\bar U\|_{L^\infty}$.

Proof. We use the specific form $-k\varphi(T)z\,\ell_4^+$ of the reactive source in (1.1), together with the Taylor expansion
\[
\varphi(\bar T + T)(\bar z + z) - \varphi(\bar T)\bar z - \big(\varphi'(\bar T)\,T\bar z + \varphi(\bar T)\,z\big)
= \varphi'(\bar T)\,Tz + \varphi''(\bar T + \beta T)\,T^2\bar z,
\]
for some $0 < \beta < 1$, and the fact that $\varphi'(\bar T + T) \le C e^{-\eta_0 x^+}$ for $|T| < T_i - T_+$, for $\eta_0 > 0$ as in Corollary 1.8, by $\varphi(T_+) = 0$ together with the property that $\varphi'(T) \equiv 0$ for $T \le T_i$ and exponential convergence of $\bar U(x)$ to $U_+$ as $x \to +\infty$.


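Recalling the standard fact that $\bar U'$ is a stationary solution of the linearized Eqs. (3.1), $L\bar U' = 0$, or
\[
\int_{\mathbb{R}} G(x,t;y)\,\bar U'(y)\,dy = e^{tL}\bar U'(x) = \bar U'(x),
\]
we have by Duhamel's principle:
\[
\begin{aligned}
U(x,t) = {} & \delta(t)\,\bar U'(x) + \int_{\mathbb{R}} G(x,t;y)\,U_0(y)\,dy \\
& + \int_0^t\!\!\int_{\mathbb{R}} G(x,t-s;y)\,\ell_4^+\,e^{-\eta y^+} q_r(U)(y,s)\,dy\,ds \\
& - \int_0^t\!\!\int_{\mathbb{R}} \partial_y G(x,t-s;y)\,\big(Q_f(U,\partial_x U) + \dot\delta U\big)(y,s)\,dy\,ds.
\end{aligned}
\]
Defining
\[
\begin{aligned}
\delta(t) = {} & -\int_{\mathbb{R}} e(y,t)\,U_0(y)\,dy
- \int_0^t\!\!\int_{\mathbb{R}} e(y,t-s)\,\ell_4^+\,e^{-\eta y^+} q_r(U)(y,s)\,dy\,ds \\
& + \int_0^t\!\!\int_{\mathbb{R}} \partial_y e(y,t-s)\,\big(Q_f(U,\partial_x U) + \dot\delta U\big)(y,s)\,dy\,ds,
\end{aligned} \tag{4.5}
\]
following [Z3,MaZ1,MaZ2,MaZ4], and recalling Proposition 3.23, we obtain finally the reduced equations:
\[
\begin{aligned}
U(x,t) = {} & \int_{\mathbb{R}} (H + S + R)(x,t;y)\,U_0(y)\,dy \\
& + \int_0^t\!\!\int_{\mathbb{R}} H(x,t-s;y)\,\Big(\partial_y\big(Q_f(U,\partial_x U) + \dot\delta U\big) + \ell_4^+\,e^{-\eta y^+} q_r(U)\Big)\,dy\,ds \\
& + \int_0^t\!\!\int_{\mathbb{R}} (S + R)(x,t-s;y)\,\ell_4^+\,e^{-\eta y^+} q_r(U)\,dy\,ds \\
& - \int_0^t\!\!\int_{\mathbb{R}} \partial_y(S + R)(x,t-s;y)\,\big(Q_f(U,\partial_x U) + \dot\delta U\big)\,dy\,ds,
\end{aligned} \tag{4.6}
\]
and, differentiating (4.5) with respect to $t$, and recalling Corollary 3.26:
\[
\dot\delta(t) = -\int_{\mathbb{R}} \partial_t e(y,t)\,U_0(y)\,dy
+ \int_0^t\!\!\int_{\mathbb{R}} \partial_y\partial_t e(y,t-s)\,\big(Q_f(U,\partial_x U) + \dot\delta U\big)(y,s)\,dy\,ds, \tag{4.7}
\]
\[
\delta(t) = -\int_{\mathbb{R}} e(y,t)\,U_0(y)\,dy
+ \int_0^t\!\!\int_{\mathbb{R}} \partial_y e(y,t-s)\,\big(Q_f(U,\partial_x U) + \dot\delta U\big)(y,s)\,dy\,ds. \tag{4.8}
\]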

Define
\[
\zeta(t) := \sup_{0 \le s \le t,\ 2 \le p \le \infty}
\Big( |U(\cdot,s)|_{L^p}\,(1+s)^{\frac12(1 - \frac1p)} + |\dot\delta(s)|\,(1+s)^{\frac12} + |\delta(s)| \Big).
\]
We shall establish:

Claim. There exists $c_0 > 0$, such that, for all $t \ge 0$ for which a solution exists with $\zeta$ uniformly bounded by some fixed, sufficiently small constant, there holds
\[
\zeta(t) \le c_0\big(|U_0|_{L^1 \cap H^3} + \zeta(t)^2\big).
\]

From this result, it follows by continuous induction that, provided
\[
|U_0|_{L^1 \cap H^3} < \frac{1}{4c_0^2},
\]
there holds
\[
\zeta(t) \le 2c_0\,|U_0|_{L^1 \cap H^3} \tag{4.9}
\]
for all $t \ge 0$ such that $\zeta$ remains small. For, by standard short-time theory/local well-posedness in $H^3$, and the standard principle of continuation, there exists a solution $U \in H^3$ on the open time-interval for which $|U|_{H^3}$ remains bounded, and on this interval $\zeta$ is well-defined and continuous. Now, let $[0,T)$ be the maximal interval on which $|U|_{H^3}$ remains strictly bounded by some fixed, sufficiently small constant $\delta > 0$. By Lemma 4.1, we have
\[
\begin{aligned}
|U(t)|_{H^3}^2 &\le C\,|U(0)|_{H^3}^2\,e^{-\theta t} + C \int_0^t e^{-\theta(t-\tau)}\big(|U|_{L^2}^2 + |\dot\delta|^2\big)(\tau)\,d\tau \\
&\le C'\big(|U(0)|_{H^3}^2 + \zeta(t)^2\big)(1+t)^{-\frac12},
\end{aligned}
\]
for some $C, C', \theta > 0$, and so the solution continues so long as $\zeta$ remains small, with bound (4.9), at once yielding existence and the claimed sharp $L^p \cap H^3$ bounds, $2 \le p \le \infty$.

Proof of Claim. We must show that each of the quantities
\[
|U|_{L^p}\,(1+s)^{\frac12(1 - \frac1p)}, \qquad |\dot\delta|\,(1+s)^{\frac12}, \qquad \text{and} \qquad |\delta|
\]
is separately bounded by $C(|U_0|_{L^1 \cap H^3} + \zeta(t)^2)$, for some $C > 0$, all $0 \le s \le t$, so long as $\zeta$ remains sufficiently small. By (4.6)–(4.8) and the triangle inequality, we have
\[
|U|_{L^p} \le \mathrm{I}_a + \mathrm{I}_b + \mathrm{I}_c + \mathrm{I}_d, \qquad
|\dot\delta(t)| \le \mathrm{II}_a + \mathrm{II}_b, \qquad
|\delta(t)| \le \mathrm{III}_a + \mathrm{III}_b,
\]
where $\mathrm{I}_a$ is the $L^p$ norm of the first integral term in the right-hand side of (4.6), $\mathrm{I}_b$ the second term, etc., and similarly $\mathrm{II}_a$ is the modulus of the first term in the right-hand side of (4.7), etc. We estimate each term in turn, following the approach of [MaZ1,MaZ4].


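The linear term $\mathrm{I}_a$ satisfies the bound
\[
\mathrm{I}_a \le C\,|U_0|_{L^1 \cap L^p}\,(1+t)^{-\frac12(1 - \frac1p)},
\]
by Proposition 3.23 and Corollary 3.29. Likewise, applying the bounds of Corollary 3.29, we have
\[
\mathrm{I}_b \le C\zeta(t)^2 \int_0^t e^{-\eta(t-s)}(1+s)^{-\frac12}\,ds \le C\zeta(t)^2\,(1+t)^{-\frac12},
\]
and (taking $q = 2$ in the second estimate of Corollary 3.29)
\[
\begin{aligned}
\mathrm{I}_c + \mathrm{I}_d \le {} & C \int_0^t e^{-\eta(t-s)}\big(|U|_{L^\infty} + |\partial_x U|_{L^\infty} + |\dot\delta|\big)\,|U|_{L^p}(s)\,ds \\
& + C \int_0^t (t-s)^{-\frac34 + \frac1{2p}}\big(|U|_{L^\infty} + |\dot\delta|\big)\,|U|_{H^1}(s)\,ds \\
\le {} & C\zeta(t)^2 \int_0^t e^{-\eta(t-s)}(1+s)^{-\frac12(1 - \frac1p) - \frac12}\,ds
+ C\zeta(t)^2 \int_0^t (t-s)^{-\frac34 + \frac1{2p}}(1+s)^{-\frac34}\,ds \\
\le {} & C\zeta(t)^2\,(1+t)^{-\frac12(1 - \frac1p)},
\end{aligned}
\]
\[
\mathrm{II}_a \le |\partial_t e(t)|_{L^\infty}\,|U_0|_{L^1} \le C\,|U_0|_{L^1}\,(1+t)^{-\frac12},
\]
and
\[
\begin{aligned}
\mathrm{II}_b &\le \int_0^t |\partial_y\partial_t e(t-s)|_{L^2}\big(|U|_{L^\infty} + |\dot\delta|\big)\,|U|_{H^1}(s)\,ds \\
&\le C\zeta(t)^2 \int_0^t (t-s)^{-\frac34}(1+s)^{-\frac34}\,ds \le C\zeta(t)^2\,(1+t)^{-\frac12},
\end{aligned}
\]
while
\[
\mathrm{III}_a \le |e(t)|_{L^\infty_y}\,|U_0|_{L^1} \le C\,|U_0|_{L^1},
\]
and
\[
\begin{aligned}
\mathrm{III}_b &\le \int_0^t |\partial_y e(t-s)|_{L^2}\big(|U|_{L^\infty} + |\dot\delta|\big)\,|U|_{H^1}(s)\,ds \\
&\le C\zeta(t)^2 \int_0^t (t-s)^{-\frac14}(1+s)^{-\frac34}\,ds \le C\zeta(t)^2.
\end{aligned}
\]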
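The last steps for $\mathrm{II}_b$ and $\mathrm{III}_b$ rest on two elementary convolution integrals: $\int_0^t (t-s)^{-3/4}(1+s)^{-3/4}\,ds \le C(1+t)^{-1/2}$ and $\int_0^t (t-s)^{-1/4}(1+s)^{-3/4}\,ds \le C$. As a hedged numerical check (the function name, the quadrature and the sample times are illustrative choices), the substitution $t - s = u^4$ removes the endpoint singularity and a midpoint rule suffices:

```python
def convolution_integral(t: float, alpha: float, n: int = 20000) -> float:
    """Midpoint rule for int_0^t (t-s)^(-alpha) (1+s)^(-3/4) ds.

    With t - s = u^4 the integrand becomes 4 u^(3 - 4 alpha) (1 + t - u^4)^(-3/4)
    on [0, t^(1/4)], which is smooth for alpha = 1/4 and alpha = 3/4.
    """
    upper = t ** 0.25
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += 4.0 * u ** (3.0 - 4.0 * alpha) * (1.0 + t - u ** 4) ** (-0.75)
    return total * h


if __name__ == "__main__":
    for t in (1.0, 10.0, 100.0, 1000.0):
        decay = convolution_integral(t, 0.75) * (1.0 + t) ** 0.5
        bounded = convolution_integral(t, 0.25)
        print(t, decay, bounded)
```

Both printed columns stay bounded as $t$ grows, consistent with comparison against the Beta integrals $B(\tfrac14,\tfrac14)$ and $B(\tfrac34,\tfrac14)$ obtained by replacing $1+s$ with $s$.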

This completes the proof of the claim, establishing (1.21) for p ≥ 2. The remaining bounds 1 ≤ p < 2 then follow by a bootstrap argument as described in [Z2]; we omit the details.
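The continuous induction step used with the Claim can be made concrete. If a continuous quantity $\zeta$ satisfies $\zeta \le c_0(d + \zeta^2)$ with $d < 1/(4c_0^2)$, then $\zeta$ can never lie strictly between the two roots of $c_0 z^2 - z + c_0 d = 0$, so a $\zeta$ starting below the smaller root stays below it, and that root is at most $2c_0 d$. A minimal sketch, with `c0` and `data` as arbitrary illustrative constants rather than values from the paper:

```python
import math


def smaller_root(c0: float, data: float) -> float:
    """Smaller root of c0*z^2 - z + c0*data = 0 (requires data < 1/(4 c0^2))."""
    disc = 1.0 - 4.0 * c0 * c0 * data
    if disc <= 0.0:
        raise ValueError("data must be smaller than 1 / (4 c0^2)")
    return (1.0 - math.sqrt(disc)) / (2.0 * c0)


if __name__ == "__main__":
    for c0 in (0.5, 1.0, 3.0):
        data = 0.9 / (4.0 * c0 * c0)  # just inside the smallness threshold
        z = smaller_root(c0, data)
        print(c0, data, z, 2.0 * c0 * data)
```

The comparison $z \le 2c_0 d$ reduces to $\sqrt{1 - x} \ge 1 - x$ for $x = 4c_0^2 d \in [0,1)$.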


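5. Bifurcation: Proof of Theorem 1.18

Given two Banach spaces $X$ and $Y$, we denote by $\mathcal{L}(X,Y)$ the space of linear continuous applications from $X$ to $Y$, and let $\mathcal{L}(X) := \mathcal{L}(X,X)$. We use (1.22) to denote weighted Sobolev spaces and norms. Let $x^+ := \max(0,x)$. Given a constant $\eta > 0$ and a weight function $\omega > 0$, define subspaces of $\mathcal{S}'(\mathbb{R})$ by
\[
L^1_{\eta^+} := \{f,\ e^{\eta(\cdot)^+} f \in L^1\}, \qquad
L^1_\omega := \{f,\ \omega f \in L^1\}, \qquad
L^1_{\omega,\eta^+} := \{f,\ \omega f \in L^1_{\eta^+}\}.
\]

Definition 5.1. Given a constant $\eta > 0$ and a weight function $\omega$ satisfying (1.23), define the Banach spaces $B_1, B_2, X_1, X_2 \subset \mathcal{D}'(\mathbb{R}; \mathbb{C}^3_v \times \mathbb{C}_z)$ by
\[
B_1 := H^1, \qquad B_2 := H^1 \cap (\partial_x L^1 \times L^1_{\eta^+}), \qquad
X_1 := H^2_\omega, \qquad X_2 := H^2_\omega \cap (\partial_x L^1_\omega \times L^1_{\omega,\eta^+}),
\]
with norms
\[
\begin{aligned}
\|(v,z)\|_{B_1} &:= \|(v,z)\|_{H^1}, \\
\|(\partial_x v,z)\|_{B_2} &:= \|(\partial_x v,z)\|_{H^1} + \|v\|_{L^1} + \|e^{\eta(\cdot)^+} z\|_{L^1}, \\
\|(v,z)\|_{X_1} &:= \|(v,z)\|_{H^2_\omega}, \\
\|(\partial_x v,z)\|_{X_2} &:= \|(\partial_x v,z)\|_{H^2_\omega} + \|\omega v\|_{L^1} + \|\omega e^{\eta(\cdot)^+} z\|_{L^1}.
\end{aligned}
\]
In particular, $X_2 \hookrightarrow X_1 \hookrightarrow B_1$, with $\|\cdot\|_{B_1} \le \|\cdot\|_{X_1} \le \|\cdot\|_{X_2}$, and $X_2 \hookrightarrow B_2 \hookrightarrow B_1$, with $\|\cdot\|_{B_1} \le \|\cdot\|_{B_2} \le \|\cdot\|_{X_2}$, and the unit ball in $X_1$ is closed in $B_1$.

5.1. The perturbation equations. If $\tilde U^\varepsilon$ solves (1.2) with initial datum $\tilde U^\varepsilon|_{t=0} = \bar U^\varepsilon + U_0^\varepsilon$, then the perturbation variable $U(\varepsilon,x,t) := \tilde U^\varepsilon(x,t) - \bar U^\varepsilon(x)$ satisfies
\[
\begin{cases}
\partial_t U - L(\varepsilon)U = \partial_x Q_f(\varepsilon,U,\partial_x U) + Q_r(\varepsilon,U), \\
U(\varepsilon,x,0) = U_0^\varepsilon(x).
\end{cases} \tag{5.1}
\]
The nonlinear term $Q_f$ satisfies (4.2), while $Q_r$ satisfies Lemma 4.2.

5.2. Coordinatization. Let $\varphi_\pm^\varepsilon$ be the eigenfunctions of $L(\varepsilon)$ associated with the bifurcation eigenvalues $\gamma(\varepsilon) \pm i\tau(\varepsilon)$, and let $\tilde\varphi_\pm^\varepsilon$ be the corresponding left eigenfunctions. We know from Sect. 3.1.3 that $(\gamma \pm i\tau)(\varepsilon) \in \mathbb{C} \setminus \mathcal{C}_- \cup \mathcal{C}_+$, hence $\varphi_\pm^\varepsilon$ decay exponentially at both $-\infty$ and $+\infty$; in particular, if in (1.23) $\theta_0$ is small enough, then $\varphi_\pm^\varepsilon \in H^2_\omega$. Let $\Pi$ be the $L^2$-projection onto $\mathrm{span}(\varphi_\pm^\varepsilon)$ parallel to $\mathrm{span}(\tilde\varphi_\pm^\varepsilon)^\perp$. Decomposing
\[
U = u_{11}\,\Re\varphi_+^\varepsilon + u_{12}\,\Im\varphi_+^\varepsilon + u_2, \qquad
U_0^\varepsilon = a_1\,\Re\varphi_+^\varepsilon + a_2\,\Im\varphi_+^\varepsilon + b,
\]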


where $u_{11}\,\Re\varphi_+^\varepsilon + u_{12}\,\Im\varphi_+^\varepsilon$ and $a_1\,\Re\varphi_+^\varepsilon + a_2\,\Im\varphi_+^\varepsilon$ belong to $\mathrm{span}(\varphi_\pm^\varepsilon)$ (so that, in particular, $u_{1j}$ and $a_j$ are real), and coordinatizing as $(u_1,u_2)$, $u_1 := (u_{11},u_{12}) \in \mathbb{R}^2$, we obtain after a brief calculation that $U$ solves (5.1) if and only if its coordinates solve the system
\[
\begin{cases}
\partial_t u_1 = \begin{pmatrix} \gamma(\varepsilon) & \tau(\varepsilon) \\ -\tau(\varepsilon) & \gamma(\varepsilon) \end{pmatrix} u_1 + \Pi N(\varepsilon,u_1,u_2), \\
\partial_t u_2 = (1 - \Pi)L(\varepsilon)u_2 + (1 - \Pi)N(\varepsilon,u_1,u_2), \\
u_{1|t=0} = a, \\
u_{2|t=0} = b,
\end{cases} \tag{5.2}
\]
where $N(\varepsilon,u_1,u_2) := (\partial_x Q_f + Q_r)(\varepsilon,\bar U^\varepsilon,U)$.

Given $T_0 > 0$, there exist $\zeta_0 > 0$ and $C > 0$, such that, if $|a| + \|b\|_{H^2_\omega} < \zeta_0$, the initial value problem (5.2) possesses a unique solution $(u_1,u_2)(a,b,\varepsilon) \in C^0([0,T_0], \mathbb{R}^2 \times H^2_\omega)$ satisfying
\[
\begin{aligned}
C^{-1}|a| - C\|b\|_{H^2_\omega}^2 \le |u_1(t)| &\le C\big(|a| + \|b\|_{H^2_\omega}^2\big), \\
\|u_2(t)\|_{H^2_\omega} &\le C\big(\|b\|_{H^2_\omega} + |a|^2\big), \\
\|\partial_{(a,b)}(u_1,u_2)(t)\|_{\mathcal{L}(\mathbb{R}^2 \times H^1, H^1)} &\le C.
\end{aligned} \tag{5.3}
\]
(For more details on the initial value problem (5.2) and estimate (5.3), see [TZ2], Prop. 4.2.)

5.3. Poincaré return map. We express the period map $(a,b,\varepsilon) \to \hat b := u_2(a,b,\varepsilon,T)$ as a discrete dynamical system
\[
\hat b = S(\varepsilon,T)\,b + \tilde N(a,b,\varepsilon,T), \tag{5.4}
\]
where $S(\varepsilon,T) := e^{T(1 - \Pi)L(\varepsilon)}$ is the linearized solution operator in $u_2$ and
\[
\tilde N(a,b,\varepsilon,T) := \int_0^T S(\varepsilon,T-s)\,(1 - \Pi)N(\varepsilon,u_1,u_2)(s)\,ds
\]
the difference between nonlinear and linear solution operators. Evidently, periodic solutions of (5.2) with period $T$ are fixed points of the period map (equilibria of (5.4)) or, equivalently, zeros of the displacement map
\[
\Delta(a,b,\varepsilon,T) := (S(\varepsilon,T) - \mathrm{Id})\,b + \tilde N(a,b,\varepsilon,T).
\]

5.4. Lyapunov–Schmidt reduction. We now carry out a nonstandard Lyapunov–Schmidt reduction following the “inverse temporal dynamics” framework of [TZ2], tailored for the situation that $\mathrm{Id} - S(\varepsilon,T)$ is not uniformly invertible, or, equivalently, the spectrum


of $(1 - \Pi)L(\varepsilon)$ is not bounded away from $\{j\pi/T\}_{j \in \mathbb{Z}}$. In the present situation, $(1 - \Pi)L(\varepsilon)$ has both a 1-dimensional kernel (a consequence of (H4), see Sect. 1.5) and essential spectra accumulating at $\lambda = 0$, and no other purely imaginary spectra, so that $\mathrm{Id} - S(\varepsilon,T)$ inherits the same properties; see [TZ2] for further discussion.

Our goal, and the central point of the analysis, is to solve $\Delta(a,b,\varepsilon,T) = 0$ for $b$ as a function of $(a,\varepsilon,T)$, eliminating the transverse variable and reducing to a standard planar bifurcation problem in the oscillatory variable $a$. A “forward” temporal dynamics technique would be to rewrite $\Delta = 0$ as a fixed point map
\[
b = S(\varepsilon,T)\,b + \tilde N(a,b,\varepsilon,T), \tag{5.5}
\]
then to substitute for $T$ an arbitrarily large integer multiple $jT$. In the strictly stable case $\Re\sigma((1 - \Pi)L) \le -\theta < 0$, $\|S(\varepsilon,jT)\|_{\mathcal{L}(X_1)} < \tfrac12$ for $j$ sufficiently large. Noting that $\tilde N$ is quadratic in its dependency, we would have therefore contractivity of (5.5) with respect to $b$, yielding the desired reduction. However, in the absence of a spectral gap between $\sigma((1 - \Pi)L)$ and the imaginary axis, $\|S(\varepsilon,jT)\|_{\mathcal{L}(X_1)}$ does not decay, and may be always greater than unity; thus, this naive approach does not succeed.

The key idea in [TZ2] is to rewrite $\Delta = 0$ instead in “backward” form
\[
b = (\mathrm{Id} - S(\varepsilon,T))^{-1}\,\tilde N(a,b,\varepsilon,T), \tag{5.6}
\]
then show that $(\mathrm{Id} - S(\varepsilon,T))^{-1}$ is well-defined and bounded on $\mathrm{Range}\,\tilde N$, thus obtaining contractivity by quadratic dependence of $\tilde N$. Since the right inverse $(\mathrm{Id} - S(\varepsilon,T))^{-1}\tilde N$ is formally given by $\sum_{j=0}^{\infty} S(\varepsilon,jT)\,\tilde N$, this amounts to establishing convergence: a stability/cancellation estimate. Quite similar estimates appear in the nonlinear stability theory, where the interaction of linearized evolution $S$ and nonlinear source $\tilde N$ are likewise crucial for decay. The formulation (5.6) can be viewed also as a “by-hand” version of the usual proof of the standard Implicit Function Theorem [TZ2].

Lemma 5.2. Under the assumptions of Theorem 1.18, if the constant $\eta$ in Definition 5.1 satisfies $\eta < \eta_0$, where $\eta_0$ was introduced in Corollary 1.8, then
\[
\tilde N : (a,b,\varepsilon,T) \in \mathbb{R}^2 \times X_1 \times \mathbb{R}^2 \to \tilde N(a,b,\varepsilon,T) \in X_2
\]
is quadratic order, and $C^1$ as a map from $\mathbb{R}^2 \times B_1 \times \mathbb{R}^2$ to $B_2$ for $\|b\|_{X_1}$ uniformly bounded, with
\[
\begin{aligned}
\|\tilde N\|_{X_2} &\le C\big(|a| + \|b\|_{X_1}\big)^2, \\
\|\partial_{(a,b)}\tilde N\|_{\mathcal{L}(\mathbb{R}^2 \times B_1, B_2)} &\le C\big(|a| + \|b\|_{H^2}\big), \\
\|\partial_{(\varepsilon,T)}\tilde N\|_{\mathcal{L}(\mathbb{R}^2, B_2)} &\le C\big(|a| + \|b\|_{H^2}\big)^2.
\end{aligned} \tag{5.7}
\]

Proof. We use the variational bounds of [TZ3] (see Propositions 5 and 6, [TZ3]) and Lemma 4.2. Note that, in (1.23), only $\omega^{-1} \in L^\infty$ and (1.23)(ii) were used at this point.

5.4.1. Pointwise cancellation estimate. We now develop the key cancellation estimates, adapting the pointwise semigroup methods of [ZH,MaZ3,Z2] to the present case. Our starting point is the inverse Laplace transform representation (3.55). Deforming the contour using analyticity of $G_\lambda$ across oscillatory eigenvalues $\lambda_\pm(\varepsilon)$ we obtain $G = \tilde G + O$, where
\[
O(x,t;y) := e^{\lambda_+(\varepsilon)t}\,\varphi_+^\varepsilon(x)\,\tilde\varphi_+^\varepsilon(y)^{\mathrm{tr}}
+ e^{\lambda_-(\varepsilon)t}\,\varphi_-^\varepsilon(x)\,\tilde\varphi_-^\varepsilon(y)^{\mathrm{tr}}
\]


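is the sum of the residues of the integrand at $\lambda_\pm$ (the right- and left-eigenfunctions $\varphi_\pm^\varepsilon$ and $\tilde\varphi_\pm^\varepsilon$ are defined in Sect. 5.2). The Green function $\tilde G$ is the kernel of the integral operator $S(\varepsilon,t)$ defined in Sect. 5.3. Note that, under the assumptions of Theorem 1.18, the Evans function associated with $(1 - \Pi)L(\varepsilon)$ satisfies (1.20), so that, by Remark 3.28, Proposition 3.23 applies to $\tilde G$.

For $\nu, \nu_0 > 0$, let $\Gamma$ be the counterclockwise arc of circle $\partial B(0,r)$ ($r$ as in Proposition 3.21) connecting $-\nu - i\nu_0$ and $-\nu + i\nu_0$. If $\nu$ and $\nu_0 > 0$ are sufficiently small, then $\Gamma$ is entirely contained in the resolvent set of $(1 - \Pi)L(\varepsilon)$, and $\tilde G$ can be decomposed as $G_{\mathrm{I}} + G_{\mathrm{II}}$, with
\[
\begin{aligned}
G_{\mathrm{I}}(\varepsilon,x,t;y) &:= \frac{1}{2\pi i}\int_{\Gamma} e^{\lambda t}\,G_\lambda(\varepsilon,x,y)\,d\lambda, \\
G_{\mathrm{II}}(\varepsilon,x,t;y) &:= \frac{1}{2\pi i}\,\mathrm{P.V.}\left(\int_{-\nu - i\infty}^{-\nu - i\nu_0} + \int_{-\nu + i\nu_0}^{-\nu + i\infty}\right) e^{\lambda t}\,G_\lambda(\varepsilon,x,y)\,d\lambda.
\end{aligned} \tag{5.8}
\]
Let $S_{\mathrm{I}}$ and $S_{\mathrm{II}}$ denote the integral operators with respective kernels $G_{\mathrm{I}}$ and $G_{\mathrm{II}}$, so that $S = S_{\mathrm{I}} + S_{\mathrm{II}}$, and consider parameters $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$, for some $\bar\varepsilon_0 > 0$.

Remark 5.3. The contour $\Gamma$ being contained in the resolvent set of $L$, the elementary bound holds:
\[
|\partial_y G_\lambda| \le C e^{-\theta_\nu|x-y|}, \qquad \lambda \in \Gamma,
\]
for some $\theta_\nu > 0$ depending on $\nu$. See for instance Proposition 4.4, [MaZ3].

Our treatment of the high-frequency term follows [TZ3]:

Lemma 5.4. Under the assumptions of Theorem 1.18, the sequence of operators with kernel $\sum_{n=0}^{N} G_{\mathrm{II}}(\varepsilon,nT)$ is absolutely convergent in $\mathcal{L}(H^1)$, uniformly in $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$.

Proof. Starting from the description of the resolvent kernel given in Proposition 3.21, we find by the same inverse Laplace transform estimates that give terms $H$ and $R$ in Proposition 3.23, that the high-frequency kernel $G_{\mathrm{II}}$, defined in (5.8), may be expressed as
\[
G_{\mathrm{II}} = C e^{-\theta(|x-y| + t)} + h\,\tau_{x+st}\delta, \tag{5.9}
\]
where $C$ and its space-time derivatives are bounded, $\theta > 0$, and $h\,\tau_{x+st}\delta$ is a generic hyperbolic term; in particular $h$ has the form (3.61) and satisfies (3.64)(i). The lemma follows.

Next we turn to the low-frequency component of $\tilde G$. Its fluid terms are handled as in [TZ3]:

Lemma 5.5. Under the assumptions of Theorem 1.18, the sequence of operators with kernel $\sum_{n=0}^{N} G_{\mathrm{I}}(\varepsilon,nT)$ converges in $\mathcal{L}(\partial_x L^1, H^1)$, uniformly with respect to $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$.

Proof. We argue as in the proof of Proposition 3 of [TZ3]. Let $f \in L^1$. By (5.8), $\sum_{n=0}^{N-1} \partial_y G_{\mathrm{I}}\,f$ decomposes into $\mathrm{I} - \mathrm{II}_N$, where
\[
\mathrm{I} = \frac{1}{2i\pi}\int_{\mathbb{R}}\int_{\Gamma} \frac{1}{1 - e^{\lambda T}}\,\partial_y G_\lambda\,f\,d\lambda\,dy, \qquad
\mathrm{II}_N = \frac{1}{2i\pi}\int_{\mathbb{R}}\int_{\Gamma} \frac{e^{NT\lambda}}{1 - e^{\lambda T}}\,\partial_y G_\lambda\,f\,d\lambda\,dy.
\]
For small $\nu$ and $\lambda \in \Gamma$, $(1 - e^{\lambda T})^{-1} = -\lambda^{-1}T^{-1}(1 + O(\lambda))$.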
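The decomposition into $\mathrm{I} - \mathrm{II}_N$ is the geometric-sum identity $\sum_{n=0}^{N-1} e^{n\lambda T} = \frac{1}{1 - e^{\lambda T}} - \frac{e^{N\lambda T}}{1 - e^{\lambda T}}$ applied under the contour integral, and the factor $(1 - e^{\lambda T})^{-1}$ has size $1/|\lambda T|$ near $\lambda = 0$. A short numerical sketch for scalar $\lambda$ (the function names and sample values are illustrative assumptions):

```python
import cmath


def partial_sum(lam: complex, period: float, n_terms: int) -> complex:
    """Sum of exp(n * lam * period) for n = 0, ..., n_terms - 1."""
    return sum(cmath.exp(n * lam * period) for n in range(n_terms))


def boundary_minus_tail(lam: complex, period: float, n_terms: int) -> complex:
    """The 'I - II_N' form: 1/(1 - e^{lam T}) - e^{N lam T}/(1 - e^{lam T})."""
    denom = 1.0 - cmath.exp(lam * period)
    return 1.0 / denom - cmath.exp(n_terms * lam * period) / denom


if __name__ == "__main__":
    lam, period = complex(-0.3, 0.7), 1.5
    print(partial_sum(lam, period, 12), boundary_minus_tail(lam, period, 12))
    small = 1.0e-4 * complex(-1.0, 1.0)
    # modulus of lam*T / (1 - e^{lam T}) tends to 1 as lam -> 0
    print(abs(small * 2.0 / (1.0 - cmath.exp(small * 2.0))))
```

For $\Re\lambda < 0$ the tail term decays geometrically in $N$; the difficulty addressed in the proof is that the contour passes near $\lambda = 0$, where only the structure of $\partial_y G_\lambda$ compensates the $1/|\lambda T|$ growth.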


The boundary term $\mathrm{I}$ is independent of $N$ and is seen to belong to $H^1$ by Remark 5.3. By (3.56)–(3.60), $\lambda^{-1}\partial_y E_\lambda$ and $\lambda^{-1}\partial_y S_\lambda$ have the same form as $E_\lambda$ and $S_\lambda$. By Proposition 3.21, $\lambda^{-1}\partial_y R_\lambda$ behaves like the sum of $R_\lambda$ and a pole term of form $\lambda^{-1}e^{-\theta|x-y|}$. Hence, by the same Riemann saddle-point estimates used to bound $G$ in Proposition 3.23, we find that
\[
\int_{\Gamma} \frac{e^{\lambda NT}}{1 - e^{\lambda T}}\,\partial_y G_\lambda\,d\lambda = (E + S + R)(\varepsilon,NT), \tag{5.10}
\]
up to a constant (independent of $N$) term of the form $Ce^{-\theta|x-y|}$, where the space-time derivatives of $C$ are uniformly bounded. This constant term satisfies the same bound as term $\mathrm{I}$. In (5.10), $E$, $S$, $R$ denote generic excited, scattered and residual terms of form (3.58), (3.59)–(3.60) and (3.64)(ii). By dominated convergence,
\[
H^1\text{-}\lim_{N \to \infty} \int_{\mathbb{R}} E(\varepsilon,NT)\,f(y)\,dy
\]
exists and is equal to a sum of terms of the form
\[
C(\varepsilon,T)\,(\bar U^\varepsilon)' \int_{\mathbb{R}} f(y)\,dy. \tag{5.11}
\]
Besides, by (3.59)–(3.60) and (3.64),
\[
\Big\|\int_{\mathbb{R}} (S + R)(\varepsilon,NT)\,f(y)\,dy\Big\|_{H^1} \le C(NT)^{-\frac14}\,\|f\|_{L^1}. \tag{5.12}
\]
This proves convergence in $H^1$ of the sequence $\mathrm{II}_N$.

We examine finally the contribution to the series $\sum_n S(\varepsilon,nT)$ of the new (not present in [TZ3]), reactive terms.

Lemma 5.6. Under the assumptions of Theorem 1.18, the sequence of operators with kernels $\sum_{n=0}^{N} G_{\mathrm{I}}(\varepsilon,nT)\,\ell_4^+$ is absolutely convergent in $\mathcal{L}(L^1_{\eta^+}, H^1)$, uniformly with respect to $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$.

Proof. Let $f \in L^1$. By (5.8), Proposition 3.23 and Corollary 3.26, the low-frequency kernel $G_{\mathrm{I}}$ satisfies
\[
\int_{\mathbb{R}} e^{-\eta y^+}\,G_{\mathrm{I}}(\varepsilon,t)\,\ell_4^+\,f(y)\,dy = \int_{\mathbb{R}} e^{-\eta y^+}\,(S + R)(\varepsilon,t)\,\ell_4^+\,f(y)\,dy, \tag{5.13}
\]
and, by Corollary 3.27,
\[
\Big\|\int_{\mathbb{R}} e^{-\eta y^+}\,(S + R)(\varepsilon,t)\,\ell_4^+\,f(y)\,dy\Big\|_{H^1} \le C\big(1 + t^{\frac14}\big)\,e^{-\theta_1 t}\,\|f\|_{L^1},
\]
and the upper bound defines for $t = NT$ an absolutely converging series in $H^1$.


From Lemmas 5.4, 5.5 and 5.6 and the fact that $S(\varepsilon,T) \in \mathcal{L}(B_1)$, for all $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$, we can conclude that, under the assumptions of Theorem 1.18, the operator $\mathrm{Id} - S(\varepsilon,T)$ has a right inverse
\[
(\mathrm{Id} - S(\varepsilon,T))^{-1} := \sum_{n \ge 0} S(\varepsilon,nT) : B_2 \to B_1,
\]
that belongs to $\mathcal{L}(B_2,B_1)$, locally uniformly in $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$. We will need the following regularity result for the right inverse:

Lemma 5.7. Under the assumptions of Theorem 1.18, the operator $(\mathrm{Id} - S(\varepsilon,T))^{-1}$ is $C^1$ in $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$, with respect to the $\mathcal{L}(B_2,B_1)$ norm on $B_2$.

Proof. Note that, by (3.55), $\partial_t G$ has kernel $\lambda G_\lambda$; in particular, the small $\lambda$ (low-frequency) estimates of the proofs of Lemmas 5.5 and 5.6 imply the convergence of the sequence $\sum_{n=0}^{N} \partial_T S_{\mathrm{I}}(\varepsilon,nT)$ in $\mathcal{L}(B_2,B_1)$. The contribution of $\partial_T G_{\mathrm{II}}(\varepsilon,nT)$ is handled as in Lemma 5.4, by (5.9) and (3.64)(i) with $k = 1$. Bounds for $\varepsilon$-derivatives are handled as in [TZ3], using either the variational equation $(L - \lambda)\partial_\varepsilon G_\lambda = -(\partial_\varepsilon L)G_\lambda$, or the $\partial_\varepsilon G$ bounds of Proposition 3.11 from [TZ2].

Note that the $\varepsilon$-derivative bound (5.7)(iii) is stated on a proper subspace of $B_1$, namely $X_1$. In this respect, the following lemma, asserting boundedness of the right inverse on $X_2 \hookrightarrow B_2$, in $\mathcal{L}(X_2,X_1)$ norm, is key to the reduction procedure of the following section. (See Remark 5.12.)

Lemma 5.8. Under the assumptions of Theorem 1.18, $(\mathrm{Id} - S(\varepsilon,T))^{-1}$ belongs to $\mathcal{L}(X_2,X_1)$, for all $(\varepsilon,T) \in (-\bar\varepsilon_0,\bar\varepsilon_0) \times (0,+\infty)$.

Proof. The convolution bound
\[
\Big\|\omega^{\frac12} \int_{\mathbb{R}} e^{-\theta|x-y|}\,f(y)\,dy\Big\|_{L^2} \le C \min\big(\|f\|_{L^2_\omega}, \|f\|_{L^1_\omega}\big), \tag{5.14}
\]
where $C$ depends on $\|\omega^{\frac12} e^{-\theta|\cdot|}\|_{L^1 \cap L^2}$, holds by (1.23)(i) and (1.23)(iii). It implies that the contributions of $G_{\mathrm{II}}$, of $\mathrm{I}$ and of the constant pole term in $\mathrm{II}_N$ (see the proofs of Lemmas 5.4 and 5.5) are all bounded in $\mathcal{L}(X_1)$. The scattered and residual terms in $\mathrm{II}_N$ contribute nothing to the limit, by (5.12). We use again Corollary 3.27 to handle the contribution of the reactive term. In (3.66), there are two terms in the upper bound for $(S + R)\,\ell_4^+$. The first term is handled by (5.14), and the second by
\[
e^{-\theta t}\,\Big\|\omega^{\frac12} \int_{\mathbb{R}} e^{-|x-y|^2/Mt}\,f(y)\,dy\Big\|_{L^2}
\le e^{-\theta t}\,\big\|\omega^{\frac12} e^{-|\cdot|^2/Mt}\big\|_{L^2}\,\|f\|_{L^1_\omega}, \tag{5.15}
\]

noting that 1

e−θt ω 2 e−|x−y|

2 /Mt

L

1 1 2 2 L 2 ≤ Ce−θ1 t ω 2 e−|·| /Mt L 2 + ω 2 e−|·| /Mt L 2 |x|
and the upper bound in (5.16) defines for t = N T an absolutely converging series, if θ0 in (1.23) is small enough. We used in (5.16) the growth assumption on ω.


Remark 5.9. The above proof shows that, in (1.23), we need in particular $\theta_0 < \tfrac12\eta_0$, where $\eta_0$ is the decay rate of the background profile (see Corollary 1.8). In addition, we need $\theta_0 < \min(\theta_\nu, \theta, \tfrac{2}{M})$, where $\theta$ is the rate of decay in the upper bounds of Proposition 3.23 and $M$ is the diffusion rate in the upper bound of (3.66)(ii), and $\theta_0$ to be small enough so that $\varphi_\pm^\varepsilon \in H^2_\omega$.

5.4.2. Reduction.

Corollary 5.10. Under the assumptions of Theorem 1.18, the equation
\[
\Delta(a,b,\varepsilon,T) = 0, \qquad (a,b,\varepsilon,T) \in \mathbb{R}^2 \times X_1 \times \mathbb{R}^2, \tag{5.17}
\]
where $\Delta$ is defined in Sect. 5.3, is equivalent to
\[
b = (\mathrm{Id} - S(\varepsilon,T))^{-1}\,\tilde N(a,b,\varepsilon,T) + \omega
\]
for $\omega \in \mathrm{Ker}(\mathrm{Id} - S(\varepsilon,T)) \cap X_1$.

Proof. A simple consequence of the definition in the above section of the right inverse $(\mathrm{Id} - S(\varepsilon,T))^{-1}$. For more details, see the proof of Lemma 2.8 in [TZ2].

Note that, as a consequence of (H4), Sect. 1.5, the kernel of $\mathrm{Id} - S(\varepsilon,T)$ is of dimension one, for all $\varepsilon, T$, generated by $(\bar U^\varepsilon)'$.

Corollary 5.11. Under the assumptions of Theorem 1.18, the map
\[
\mathcal{T}(a,b,\varepsilon,T,\alpha) := (\mathrm{Id} - S(\varepsilon,T))^{-1}\,\tilde N(a,b,\varepsilon,T) + \alpha\,(\bar U^\varepsilon)'
\]
is bounded from $\mathbb{R}^2 \times X_1 \times \mathbb{R}^{2+1}$ to $X_1$, and $C^1$ from $\mathbb{R}^2 \times B_1 \times \mathbb{R}^{2+1}$ to $X_1$, for $|\alpha|$ bounded and $|a| + \|b\|_{X_1} + |\varepsilon|$ sufficiently small, with
\[
\begin{aligned}
\|\mathcal{T}\|_{X_1} &\le C\big(|a| + \|b\|_{X_1}^2\big), \\
\|\partial_{(a,b)}\mathcal{T}\|_{\mathcal{L}(\mathbb{R}^2 \times B_1, B_1)} &\le C\big(|a| + \|b\|_{X_1}\big), \\
\|\partial_T\mathcal{T}\|_{B_1} &\le C\big(|a|^2 + \|b\|_{X_1}^2\big), \\
\|\partial_\varepsilon\mathcal{T}\|_{B_1} &\le C\big(|a|^2 + \|b\|_{X_1}^2 + |\alpha|\big), \\
\|\partial_\alpha\mathcal{T}\|_{B_1} &\le C\big(|a|^2 + \|b\|_{X_1}^2 + 1\big).
\end{aligned}
\]

Proof. Follows from Lemma 5.2, the results of Sect. 5.4.1, and the above remark on the kernel of $\mathrm{Id} - S(\varepsilon,T)$.

Remark 5.12. Without Lemma 5.8 but with Lemmas 5.4 to 5.7, we could see $\mathcal{T}$ as a $C^1$ map from $B_1$ to $B_1$ (for fixed $a, \varepsilon, T, \alpha$), with quadratic bound $\|\mathcal{T}\|_{B_1} \le C\|b\|_{B_1}^2$, and derivative bound $\|\partial_b\mathcal{T}\|_{\mathcal{L}(B_1)} \le C\|b\|_{B_1}$. This would be sufficient to prove existence of a fixed point $b \in B_1$ as in the following proposition, but not to prove regularity of the fixed point with respect to $\varepsilon$, precisely because the $\varepsilon$-derivative bound (5.7)(iii) requires more regularity in $b$, by $\partial_\varepsilon e^{tL(\varepsilon)} = e^{tL(\varepsilon)}\partial_\varepsilon L$ (see also Remark 3.12, [TZ2]). Without regularity of the fixed point of $\mathcal{T}$, we could not use the standard implicit function theorem in Sect. 5.4.3. These issues were discussed in detail in Sect. 2.3 of [TZ2].
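The mechanism of the reduction, a bounded right inverse composed with a quadratic nonlinearity yielding a contraction for small data, can be illustrated in a scalar toy model. Everything in the sketch below is an assumption made for illustration: the right inverse is replaced by the number `right_inverse`, the nonlinearity by the polynomial `nonlinearity`, and the kernel direction is dropped ($\alpha = 0$); it is not the operator setting of the paper.

```python
def nonlinearity(a: float, b: float) -> float:
    """Toy quadratic nonlinearity, standing in for the role of N-tilde."""
    return a * a + a * b + b * b


def solve_reduced(a: float, right_inverse: float = 0.5,
                  tol: float = 1e-14, max_iter: int = 200) -> float:
    """Fixed-point iteration b <- right_inverse * nonlinearity(a, b), from b = 0."""
    b = 0.0
    for _ in range(max_iter):
        b_next = right_inverse * nonlinearity(a, b)
        if abs(b_next - b) < tol:
            return b_next
        b = b_next
    raise RuntimeError("iteration did not converge; a is probably too large")


if __name__ == "__main__":
    for a in (0.0, 0.05, 0.1, 0.2):
        b = solve_reduced(a)
        print(a, b, b / (a * a) if a else 0.0)
```

For small `a` the computed fixed point is of order $a^2$, the toy analogue of a quadratic bound on the reduced variable.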


Proposition 5.13. Under the assumptions of Theorem 1.18, there exists a function $\beta(a,\varepsilon,T,\alpha)$, bounded from $\mathbb{R}^{4+1}$ to $X_1$ and $C^1$ from $\mathbb{R}^{4+1}$ to $B_1$, such that
\[
\Delta(a,\beta(a,\varepsilon,T,\alpha),\varepsilon,T) \equiv 0,
\]
\[
\begin{aligned}
\|\beta\|_{X_1} + \|\partial_{(\varepsilon,T)}\beta\|_{\mathcal{L}(\mathbb{R}^2,B_1)} &\le C\big(|a|^2 + |\alpha|\big), \\
\|\partial_a\beta\|_{\mathcal{L}(\mathbb{R}^2,B_1)} &\le C|a|, \\
\|\partial_\alpha\beta\|_{\mathcal{L}(\mathbb{R},B_1)} &\le C,
\end{aligned} \tag{5.18}
\]
for $|(a,\varepsilon,\alpha)|$ sufficiently small. Moreover, for $|(a,\varepsilon)|$, $\|b\|_{X_1}$ sufficiently small, all solutions of (5.17) lie on the 1-parameter manifold $\{b = \beta(a,\varepsilon,T,\alpha),\ \alpha \in \mathbb{R}\}$.

Proof. A consequence of the Banach fixed point theorem applied to the map $\mathcal{T}$. For more details, see the proof of Proposition 2.9 in [TZ2].

5.4.3. Bifurcation. The bifurcation analysis is straightforward now that we have reduced to a finite-dimensional problem, the only tricky point being to deal with the 1-fold multiplicity of solutions (parametrized by $\alpha$). Define to this end
\[
\tilde\beta(a,\varepsilon,T,\hat\alpha) := \beta(a,\varepsilon,T,|a|\hat\alpha),
\]
with $\hat\alpha$ restricted to a ball in $\mathbb{R}^1$, noting, by (5.18), that
\[
\|\tilde\beta\|_{X_1},\ \|\partial_{(a,\varepsilon,T,\hat\alpha)}\tilde\beta\|_{\mathcal{L}(\mathbb{R}^4,B_1)} \le C|a|,
\]
with $\tilde\beta$ Lipschitz in $(a,\varepsilon,T,\hat\alpha)$ and $C^1$ away from $a = 0$. Solutions $(u_1,u_2)$ of (5.2) originating at
\[
(a,b) = (a,\tilde\beta(a,\varepsilon,T,\hat\alpha)), \tag{5.19}
\]
by (5.3), remain for $0 \le t \le T$ in a cone
\[
\mathcal{C} := \{(u_1,u_2) : |u_2| \le C_1|u_1|\},
\]
$C_1 > 0$. Indeed, (5.3) implies the bound $\|u_2(t)\|_{X_1} \le C|u_1(t)|$, for $\|b\|_{X_1} \le C_1|a|$, for $C_1 > 0$ small enough, for all $0 \le t \le T$.
Transition to Longitudinal Instability of Detonation Waves

47

Likewise, any periodic solution of (5.2) originating in $\mathcal C$, since it necessarily satisfies = 0, must originate from data $(a,b)$ of the form (5.19).

Defining $b \equiv \tilde\beta(a,\varepsilon,T,\hat\alpha)$, and recalling invariance of $\mathcal C$ under the flow (5.2), we may view v(t) as a multiple

$$u_2(x,t) = c(a,\varepsilon,T,\hat\alpha,x,t)\,u_1(t) \tag{5.20}$$

of $u_1(t)$, where $c$ is bounded, Lipschitz in all arguments, and $C^1$ away from $a = 0$. Substituting into (5.2)(i), we obtain a planar ODE,

$$\partial_t u_1 = \begin{pmatrix} \gamma(\varepsilon) & \tau(\varepsilon) \\ -\tau(\varepsilon) & \gamma(\varepsilon) \end{pmatrix} u_1 + M(u_1,\varepsilon,T,t,\hat\alpha,a)$$

in approximate Hopf normal form, with nonlinearity $M := N$ now nonautonomous and depending on the additional parameters $(T,\hat\alpha,a)$, but, by (4.2) and (4.4), still satisfying the key bounds

$$|M|,\ |\partial_{\varepsilon,T,\hat\alpha}M| \le C|u_1|^2; \qquad |\partial_{a,w}M| \le C|u_1| \tag{5.21}$$

along with the planar bifurcation criterion (1.24). From (5.21), we find that $M$ is $C^1$ in all arguments, also at $a = 0$. By standard arguments (see, e.g., [HK,TZ1]), we thus obtain a classical Hopf bifurcation in the variable $u_1$ with regularity $C^1$, yielding existence and uniqueness up to time-translates of a 1-parameter family of solutions originating in $\mathcal C$, indexed by $r$ and $\delta$ with $r := a_1$ and (without loss of generality) $a_2 \equiv 0$. Bound (1.25) is a consequence of (5.3)(i) and (5.20).

Finally, in order to establish uniqueness up to spatial translates, we observe, first, that, by dimensional considerations, the one-parameter family constructed must agree with the one-parameter family of spatial translates, and second, we argue as in [TZ2] that any periodic solution has a spatial translate originating in $\mathcal C$, yielding uniqueness up to translation among all solutions and not only those originating in $\mathcal C$; see Proposition 2.20 and Corollary 2.21 of [TZ2] for further details.

6. Nonlinear Instability: Proof of Theorem 1.19

We describe a nonlinear instability result in a general setting. Consider

$$\partial_t U = LU + \partial_x N(U) + R(U), \tag{6.1}$$

well-posed in $H^s$, where $LU = \partial_x(B\partial_x U) + \partial_x(AU) + GU$, and $|N(U)|, |R(U)| \le C|U|^2$ for $|U| \le C$. Suppose that $L$ has a conjugate pair of simple unstable eigenvalues $\lambda_\pm = \gamma \pm i\tau$, $\gamma > 0$, and the rest of the spectrum is neutrally stable, without loss of generality $|e^{(1-\Pi)Lt}| \le Ct$, where $\Pi$ is the projection onto the eigenspace associated with $\lambda_\pm$. Coordinatizing similarly as in Sect. 5 by

$$U(x,t) = u_{11}\varphi_1(x) + u_{12}\varphi_2(x) + u_2(x,t),$$


where $\varphi_j = O(e^{-\theta|x|})$ are eigenfunctions of $L$, denote $r(t) := |u_1|(t)$. Then, so long as $|U|_{H^s} \le C\varepsilon$, we have existence (by variation of constants, standard continuation) of solutions of (6.1) in $H^s$, with estimates

$$r' = \gamma r + O(\varepsilon)|U|, \qquad u_2' = (1-\Pi)Lu_2 + O(\varepsilon)|U| \tag{6.2}$$

in $L^2$. We shall argue by contradiction. That is, using (6.2), we shall show, for $C > 0$ fixed, $\varepsilon > 0$ sufficiently small, and $|u_2(0)|_{H^1} \le Cr(0)$, that eventually $r(t) \ge \varepsilon$, no matter how small $r(0)$ is, or equivalently $|U|_{H^s}(0)$. This, of course, entails nonlinear instability.

Define $\alpha(x,t) := u_2(x,t)/r(t)$. Then,

$$\alpha' = \frac{r u_2' - u_2 r'}{r^2} = \frac{u_2'}{r} - \frac{u_2}{r}\,\frac{r'}{r},$$

yielding after some rearrangement the equation

$$\alpha' = ((1-\Pi)L - \gamma)\alpha + O\big(\varepsilon(e^{-\theta|x|} + |\alpha| + |\alpha|^2)\big). \tag{6.3}$$
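The rearrangement leading to (6.3) can be sketched as follows. This is only a formal, pointwise reading of (6.2), not the authors' argument: it writes $\varepsilon$ for the small parameter in (6.2) and $\Pi$ for the projection onto the eigenspace associated with $\lambda_\pm$ (both symbols are assumptions about the notation), it uses that $r$ depends on $t$ only, so that $(1-\Pi)L(u_2/r) = ((1-\Pi)Lu_2)/r$, and it glosses over the norms in which the error terms are measured.

```latex
% Assumed notation: \varepsilon = small parameter of (6.2), \Pi = spectral projection.
\begin{align*}
\alpha' &= \frac{u_2'}{r} - \alpha\,\frac{r'}{r}
         = \frac{(1-\Pi)Lu_2 + O(\varepsilon)|U|}{r}
           - \alpha\Big(\gamma + O(\varepsilon)\frac{|U|}{r}\Big) \\
        &= \big((1-\Pi)L - \gamma\big)\alpha
           + O(\varepsilon)\,(1+|\alpha|)\,\frac{|U|}{r}.
\end{align*}
% Since U = u_{11}\varphi_1 + u_{12}\varphi_2 + u_2 with \varphi_j = O(e^{-\theta|x|}):
\begin{align*}
\frac{|U|}{r} \le C\big(e^{-\theta|x|} + |\alpha|\big)
\quad\Longrightarrow\quad
(1+|\alpha|)\,\frac{|U|}{r} \le C\big(e^{-\theta|x|} + |\alpha| + |\alpha|^2\big).
\end{align*}
```

Combining the two displays gives an error term of the form $O(\varepsilon(e^{-\theta|x|} + |\alpha| + |\alpha|^2))$, which is the shape of the remainder in (6.3).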

From (6.3) and a standard variation of constants/contraction mapping argument, we find that $|\alpha(t)|_{H^1}$ remains less than or equal to $C|\alpha(t_0)|_{H^1}$ for $t - t_0$ small. By variation of constants and the semigroup bound $|e^{((1-\Pi)L-\gamma)t}|_{H^1\to H^1} \le Ce^{-\gamma t}$ (note: $\gamma$ is scalar so commutes with $(1-\Pi)L$), we obtain

$$\delta(t) \le C\big(|\alpha(0)|_{H^1} + \varepsilon(1 + \delta(t)^2)\big),$$

for $\delta(t) := \sup_{0\le\tau\le t}|\alpha(\tau)|_{H^1}$. So long as $\delta$ remains less than or equal to unity and $C\varepsilon \le \frac12$, this yields $\delta(t) \le 2C(|\alpha(0)|_{H^1} + \varepsilon)$, and thus

$$\delta(t) \le 2C|\alpha(0)|_{H^1} + \frac12.$$

Substituting into the radial equation, we obtain $r' \ge (\gamma - (1+\delta)\varepsilon)r$, yielding exponential growth for $\varepsilon$ sufficiently small. In particular, $r \ge C\varepsilon$ for some time, and thus $|U|_{H^1} \ge \varepsilon$, a contradiction. We may conclude, therefore, that $|U|_{L^2}$ eventually grows larger than any $\varepsilon$, no matter how small the initial size $r(0)$, and thus we may conclude instability of the trivial solution $U \equiv 0$. Taking now (6.1) to be the perturbation equations about a strong detonation profile, we obtain the result of nonlinear instability of the background profile $\bar U$.

Remark 6.1. In the easier case of a single, real eigenvalue, the scalar, w equation, would play the role of the radial equation here. This case is subsumed in our analysis as well.

Acknowledgements. Thanks to Björn Sandstede and Arnd Scheel for their interest in this work and for stimulating discussions on spatial dynamics and bifurcation in the absence of a spectral gap. Thanks to Gregory Lyng for pointing out reference [Ch]. B.T. thanks Indiana University for their hospitality during the collaborative visit in which the analysis was carried out. B.T. and K.Z. separately thank the Ecole Polytechnique Fédérale de Lausanne for their hospitality during two visits in which a substantial part of the analysis was carried out.


References

[AT] Abouseif, G., Toong, T.Y.: Theory of unstable one-dimensional detonations. Combust. Flame 45, 67–94 (1982)
[AGJ] Alexander, J., Gardner, R., Jones, C.K.R.T.: A topological invariant arising in the analysis of traveling waves. J. Reine Angew. Math. 410, 167–212 (1990)
[AlT] Alpert, R.L., Toong, T.Y.: Periodicity in exothermic hypersonic flows about blunt projectiles. Acta Astron. 17, 538–560 (1972)
[BHRZ] Barker, B., Humpherys, J., Rudd, K., Zumbrun, K.: Stability of viscous shocks in isentropic gas dynamics. Commun. Math. Phys. 281(1), 231–249 (2008)
[Ba] Batchelor, G.K.: An introduction to fluid dynamics. Second paperback edition. Cambridge Mathematical Library. Cambridge: Cambridge University Press, 1999
[BeSZ] Beck, M., Sandstede, B., Zumbrun, K.: Nonlinear stability of time-periodic shocks. Arch. Rat. Mech. Anal. 196, 1011–1076 (2010)
[BM] Bourlioux, A., Majda, A.: Theoretical and numerical structure of unstable detonations. Proc. R. Soc. Lond. A 350, 29–68 (1995)
[BMR] Bourlioux, A., Majda, A., Roytburd, V.: Theoretical and numerical structure for unstable one-dimensional detonations. SIAM J. Appl. Math. 51, 303–343 (1991)
[BDG] Bridges, T.J., Derks, G., Gottwald, G.: Stability and instability of solitary waves of the fifth-order KdV equation: a numerical framework. Phys. D 172(1-4), 190–216 (2002)
[Br1] Brin, L.: Numerical testing of the stability of viscous shock waves. Doctoral thesis, Indiana University, 1998
[Br2] Brin, L.Q.: Numerical testing of the stability of viscous shock waves. Math. Comp. 70(235), 1071–1088 (2001)
[BrZ] Brin, L., Zumbrun, K.: Analytically varying eigenvectors and the stability of viscous shock waves. In: Proc. Seventh Workshop on Partial Differential Equations, Part I (Rio de Janeiro, 2001). Mat. Contemp. 22, 19–32, 2002
[B] Buckmaster, J.D.: An introduction to combustion theory. The mathematics of combustion, Frontiers in App. Math. Philadelphia: SIAM, 1985, pp. 3–46
[BN] Buckmaster, J., Neves, J.: One-dimensional detonation stability: the spectrum for infinite activation energy. Phys. Fluids 31(12), 3572–3576 (1988)
[C] Carr, J.: Applications of centre manifold theory. Applied Mathematical Sciences, 35. New York-Berlin: Springer-Verlag, 1981
[Ch] Chen, G.Q.: Global solutions to the compressible Navier-Stokes equations for a reacting mixture. SIAM J. Math. Anal. 23(3), 609–634 (1992)
[CF] Courant, R., Friedrichs, K.O.: Supersonic flow and shock waves. New York: Springer-Verlag, 1976
[EE] Edmunds, D.E., Evans, W.D.: Spectral theory and differential operators. Oxford: Oxford University Press, 1987
[Er1] Erpenbeck, J.J.: Stability of steady-state equilibrium detonations. Phys. Fluids 5, 604–614 (1962)
[Er2] Erpenbeck, J.J.: Stability of idealized one-reaction detonations. Phys. Fluids 7, 684 (1964)
[Er3] Erpenbeck, J.J.: Detonation stability for disturbances of small transverse wave length. Phys. Fluids 9, 1293–1306 (1966)
[Er4] Erpenbeck, J.J.: Nonlinear theory of unstable one-dimensional detonations. Phys. Fluids 10(2), 274–289 (1967)
[F1] Fickett, W.: Stability of the square wave detonation in a model system. Physica 16D, 358–370 (1985)
[F2] Fickett, W.: Detonation in miniature. In: The mathematics of combustion, Frontiers in App. Math. Philadelphia: SIAM, 1985, pp. 133–182
[FD] Fickett, W., Davis, W.C.: Detonation. Berkeley, CA: University of California Press, 1979, reissued as Detonation: Theory and experiment, Mineola, New York: Dover Press, 2000
[FW] Fickett, W., Wood, W.W.: Flow calculations for pulsating one-dimensional detonations. Phys. Fluids 9, 903–916 (1966)
[G] Gardner, R.: On the detonation of a combustible gas. Trans. Amer. Math. Soc. 277(2), 431–468 (1983)
[GK] Gohberg, I., Krein, M.G.: Introduction to the theory of linear nonselfadjoint operators. Translations of mathematical monographs, Volume 18, Providence, RI: Amer. Math. Soc., 1969
[GZ] Gardner, R., Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51(7), 797–855 (1998)
[GS1] Gasser, I., Szmolyan, P.: A geometric singular perturbation analysis of detonation and deflagration waves. SIAM J. Math. Anal. 24, 968–986 (1993)
[GS2] Gasser, I., Szmolyan, P.: Detonation and deflagration waves with multistep reaction schemes. SIAM J. Appl. Math. 55, 175–191 (1995)


[HK] Hale, J., Koçak, H.: Dynamics and bifurcations. Texts in Applied Mathematics, 3. New York: Springer-Verlag, 1991
[He] Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Mathematics, Volume 840, Berlin: Springer-Verlag, 1981
[HZ] Howard, P., Zumbrun, K.: Stability of undercompressive viscous shock waves. J. Diff. Eq. 225(1), 308–360 (2006)
[HLZ] Humpherys, J., Lafitte, O., Zumbrun, K.: Stability of viscous shock profiles in the high Mach number limit. Commun. Math. Phys. 293(1), 1–36 (2010)
[HLyZ] Humpherys, J., Lyng, G., Zumbrun, K.: Spectral stability of ideal-gas shock layers. Arch. Rat. Mech. Anal. 194(3), 1029–1079 (2009)
[HuZ1] Humpherys, J., Zumbrun, K.: An efficient shooting algorithm for Evans function calculations in large systems. Phys. D 220(2), 116–126 (2006)
[HuZ2] Humpherys, J., Zumbrun, K.: Spectral stability of small amplitude shock profiles for dissipative symmetric hyperbolic–parabolic systems. Z. Angew. Math. Phys. 53, 20–34 (2002)
[JLW] Jenssen, H.K., Lyng, G., Williams, M.: Equivalence of low-frequency stability conditions for multidimensional detonations in three models of combustion. Indiana Univ. Math. J. 54(1), 1–64 (2005)
[Kat] Kato, T.: Perturbation theory for linear operators. Berlin Heidelberg: Springer-Verlag, 1985
[KS] Kasimov, A.R., Stewart, D.S.: Spinning instability of gaseous detonations. J. Fluid Mech. 466, 179–203 (2002)
[LS] Lee, H.I., Stewart, D.S.: Calculation of linear detonation instability: one-dimensional instability of plane detonation. J. Fluid Mech. 216, 102–132 (1990)
[LyZ1] Lyng, G., Zumbrun, K.: A stability index for detonation waves in Majda's model for reacting flow. Phys. D 194(1–2), 1–29 (2004)
[LyZ2] Lyng, G., Zumbrun, K.: One-dimensional stability of viscous strong detonation waves. Arch. Rat. Mech. Anal. 173(2), 213–277 (2004)
[LRTZ] Lyng, G., Raoofi, M., Texier, B., Zumbrun, K.: Pointwise Green function bounds and stability of combustion waves. J. Diff. Eqs. 233(2), 654–698 (2007)
[MM] Marsden, J.E., McCracken, M.: The Hopf bifurcation and its applications. Applied Mathematical Sciences 19, Berlin-Heidelberg-New York: Springer, 1976
[MaZ1] Mascia, C., Zumbrun, K.: Pointwise Green's function bounds and stability of relaxation shocks. Indiana Univ. Math. J. 51(4), 773–904 (2002)
[MaZ2] Mascia, C., Zumbrun, K.: Stability of small-amplitude shock profiles of symmetric hyperbolic-parabolic systems. Comm. Pure Appl. Math. 57(7), 841–876 (2004)
[MaZ3] Mascia, C., Zumbrun, K.: Pointwise Green function bounds for shock profiles of systems with real viscosity. Arch. Rat. Mech. Anal. 169(3), 177–263 (2003)
[MaZ4] Mascia, C., Zumbrun, K.: Stability of large-amplitude viscous shock profiles of hyperbolic-parabolic systems. Arch. Rat. Mech. Anal. 172(1), 93–131 (2004)
[MaZ5] Mascia, C., Zumbrun, K.: Stability of large-amplitude shock profiles of general relaxation systems. SIAM J. Math. Anal. 37(3), 889–913 (2005)
[MeZ] Métivier, G., Zumbrun, K.: Large viscous boundary layers for noncharacteristic nonlinear hyperbolic problems. Mem. Amer. Math. Soc. 175(826) (2005)
[Pa] Pazy, A.: Semigroups of linear operators and applications to partial differential equations. Applied Mathematical Sciences, 44. New York: Springer-Verlag, 1983
[MT] McVey, U.B., Toong, T.Y.: Mechanism of instabilities in exothermic blunt-body flows. Combus. Sci. Tech. 3, 63–76 (1971)
[RZ] Raoofi, R., Zumbrun, K.: Stability of undercompressive viscous shock profiles of hyperbolic–parabolic systems. J. Diff. Eqs. 246(4), 1539–1567 (2009)
[SS] Sandstede, B., Scheel, A.: Hopf bifurcation from viscous shock waves. SIAM J. Math. Anal. 39(6), 2033–2052 (2008)
[ShK] Shizuta, S., Kawashima, Y.: On the normal form of the symmetric hyperbolic-parabolic systems associated with the conservation laws. Tohoku Math. J. (2) 40(3), 449–464 (1988)
[S1] Short, M.: An asymptotic derivation of the linear stability of the square-wave detonation using the Newtonian limit. Proc. R. Soc. Lond. A 452, 2203–2224 (1996)
[S2] Short, M.: Multidimensional linear stability of a detonation wave at high activation energy. SIAM J. Appl. Math. 57(2), 307–326 (1997)
[TT] Tan, D., Tesei, A.: Nonlinear stability of strong detonation waves in gas dynamical combustion. Nonlinearity 10, 355–376 (1997)
[TZ1] Texier, B., Zumbrun, K.: Relative Poincaré–Hopf bifurcation and galloping instability of traveling waves. Methods Appl. Anal. 12(4), 349–380 (2005)
[TZ2] Texier, B., Zumbrun, K.: Galloping instability of viscous shock waves. Physica D 237(10–12), 1553–1601 (2008)


[TZ3] Texier, B., Zumbrun, K.: Hopf bifurcation of viscous shock waves in compressible gas-dynamics and MHD. Arch. Rat. Mech. Anal. 190(1), 107–140 (2008)
[VT] Vanderbauwhede, A., Iooss, G.: Center manifold theory in infinite dimensions. In: Dynamics reported: expositions in dynamical systems, Dynam. Report. Expositions Dynam. Systems (N.S.) 1, Berlin: Springer, 1992, pp. 125–163
[Z1] Zumbrun, K.: Multidimensional stability of planar viscous shock waves. In: Advances in the theory of shock waves, Progr. Nonlinear Differential Equations Appl., 47, Boston, MA: Birkhäuser Boston, 2001, pp. 307–516
[Z2] Zumbrun, K.: Stability of large-amplitude shock waves of compressible Navier–Stokes equations. In: Handbook of mathematical fluid dynamics. Vol. III, Amsterdam: North-Holland, 2004, pp. 311–533
[Z3] Zumbrun, K.: Planar stability criteria for viscous shock waves of systems with real viscosity. In: Hyperbolic systems of balance laws, Lecture Notes in Math., 1911, Berlin: Springer, 2007, pp. 229–326
[Z4] Zumbrun, K.: Stability of viscous detonations in the ZND limit. To appear, Arch. Ration. Mech. Anal. doi:10.1007/s00205-101-03426, 2010
[ZH] Zumbrun, K., Howard, P.: Pointwise semigroup methods and stability of viscous shock waves. Indiana Mathematics Journal 47, 741–871 (1998); Errata, Indiana Univ. Math. J. 51(4), 1017–1021 (2002)
[ZS] Zumbrun, K., Serre, D.: Viscous and inviscid stability of multidimensional planar shock fronts. Indiana Univ. Math. J. 48, 937–992 (1999)

Communicated by P. Constantin

Commun. Math. Phys. 302, 53–111 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1177-6


Critical Measures, Quadratic Differentials, and Weak Limits of Zeros of Stieltjes Polynomials

A. Martínez-Finkelshtein1,2, E. A. Rakhmanov3

1 Department of Statistics and Applied Mathematics, University of Almería, 04120 Almeria, Spain. E-mail: [email protected]

2 Instituto Carlos I de Física Teórica y Computacional, Granada University, 18071 Granada, Spain

3 Department of Mathematics, University of South Florida, Tampa, FL 33620, USA. E-mail: [email protected]

Received: 6 April 2009 / Accepted: 25 July 2010
Published online: 8 January 2011 – © Springer-Verlag 2011

Abstract: We investigate the asymptotic zero distribution of Heine-Stieltjes polynomials – polynomial solutions of second order differential equations with complex polynomial coefficients. In the case when all zeros of the leading coefficients are real, zeros of the Heine-Stieltjes polynomials were interpreted by Stieltjes as discrete distributions minimizing an energy functional. In a general complex situation one deals instead with a critical point of the energy. We introduce the notion of discrete and continuous critical measures (saddle points of the weighted logarithmic energy on the plane), and prove that a weak-* limit of a sequence of discrete critical measures is a continuous critical measure. Thus, the limit zero distributions of the Heine-Stieltjes polynomials are given by continuous critical measures. We give a detailed description of such measures, showing their connections with quadratic differentials. In doing that, we obtain some results on the global structure of rational quadratic differentials on the Riemann sphere that are of independent interest. The problem has a rich variety of connections with other fields of analysis; some of them are briefly mentioned in the paper.

Contents
1. Generalized Lamé Equation
2. Discrete and Continuous Extremal Measures
3. Discrete and Continuous Critical Measures
4. Rational Quadratic Differentials on the Riemann Sphere in a Nutshell
5. Critical Measures in the Field of a Finite System of Fixed Charges
6. Critical Measures and Extremal Problems
7. Weak Limit of Zeros of Heine-Stieltjes Polynomials
8. Heun's Differential Equation (p = 2)
9. General Families of A-Critical Measures
References


1. Generalized Lamé Equation

Let us start with a classical problem more than 125 years old. Given a set of pairwise distinct points fixed on the complex plane $\mathbb C$,

$$\mathcal A = \{a_0, a_1, \dots, a_p\}, \tag{1.1}$$

($p \in \mathbb N$), and two polynomials,

$$A(z) = \prod_{i=0}^{p}(z - a_i), \qquad B(z) = \alpha z^p + \text{lower degree terms} \in \mathbb P_p, \quad \alpha \in \mathbb C, \tag{1.2}$$

where we denote by $\mathbb P_n$ the set of all algebraic polynomials of degree $\le n$, we are interested in the polynomial solutions of the generalized Lamé differential equation (in algebraic form),

$$A(z)\,y''(z) + B(z)\,y'(z) - n(n+\alpha-1)V_n(z)\,y(z) = 0, \tag{1.3}$$

where $V_n$ is a polynomial (in general, depending on $n$) of degree $\le p-1$; if $\deg V_n = p-1$, then $V_n$ is monic. An alternative perspective on the same problem can be stated in terms of the second order differential operator

$$\mathcal L[y](z) := A(z)\,y''(z) + B(z)\,y'(z),$$

and the associated generalized spectral problem (or multiparameter eigenvalue problem, see [97]),

$$\mathcal L[y](z) = n(n+\alpha-1)V_n(z)\,y(z), \quad n \in \mathbb N, \tag{1.4}$$

where $V_n \in \mathbb P_{p-1}$ is the “spectral polynomial”.

Special instances of Eq. (1.3) are well known. For instance, $p = 1$ corresponds to the hypergeometric differential equation. The case $p = 2$ was studied by Lamé in the 1830s in the special setting $B = A'/2$, $a_j \in \mathbb R$, and $a_0 + a_1 + a_2 = 0$, in connection with the separation of variables in the Laplace equation using elliptical coordinates (see e.g. [100, Ch. 23]). For the general situation of $p = 2$ we get Heun’s equation, which still attracts interest and poses open questions (see [76]). Recently, Eq. (1.3) has also found other applications in studies as diverse as the construction of ellipsoidal and sphero-conal h-harmonics of the Dunkl Laplacian [98,99], the quantum asymmetric top [1,17,37], or certain quantum completely integrable systems called generalized Gaudin spin chains [40], and their thermodynamic limits.

Heine [41] proved that for every $n \in \mathbb N$ there exist at most

$$\sigma(n) = \binom{n+p-1}{n} \tag{1.5}$$

different polynomials $V_n$ such that (1.3) (or (1.4)) admits a polynomial solution $y = Q_n \in \mathbb P_n$. These particular $V_n$ are called Van Vleck polynomials, and the corresponding polynomial solutions $y = Q_n$ are known as Heine-Stieltjes (or simply Stieltjes) polynomials. Heine’s theorem states that if the polynomials $A$ and $B$ are algebraically independent (that is, they do not satisfy any algebraic equation with integer coefficients) then for any $n \in \mathbb N$ there exist exactly $\sigma(n)$ Van Vleck polynomials $V_n$, their degree is exactly $p-1$,


and for each $V_n$ Eq. (1.3) has a unique (up to a constant factor) solution $y$ of degree $n$. The condition of algebraic independence of $A$ and $B$ is sufficient but not necessary. It should be noted that the original argument of Heine is far from clear, and even Szegő [92] cites his result in a rather ambiguous form. Recently significant research on the algebraic aspects of this theory has been carried out by B. Shapiro in [82], and we refer the reader to his work for further details. In particular, it has been proved in [82] that for any polynomials $A$ and $B$ like in (1.2) there exists $N \in \mathbb N$ such that for any $n \ge N$, there exist $\sigma(n)$ Van Vleck polynomials $V_n$ of degree exactly $p-1$ such that (1.3) has a polynomial solution of degree exactly $n$.

Stieltjes discovered an electrostatic interpretation of zeros of the polynomials discussed in [41], which attracted common attention to the problem. He studied the problem (1.3) in a particular setting, assuming that $\mathcal A \subset \mathbb R$ and that all residues $\rho_k$ in

$$\frac{B(x)}{A(x)} = \sum_{k=0}^{p}\frac{\rho_k}{x - a_k} \tag{1.6}$$

are strictly positive (which is equivalent to the assumption that the zeros of $A$ alternate with those of $B$ and that the leading coefficient of $B$ is positive). He proved in [90] (see also [92, Theorem 6.8]) that in this case for each $n \in \mathbb N$ there are exactly $\sigma(n)$ different Van Vleck polynomials of degree $p-1$ and the same number of corresponding Heine-Stieltjes polynomials $y$ of degree $n$, given by all possible ways in which the $n$ zeros of $y$ can be distributed in the $p$ open intervals defined by $\mathcal A$ (see Sect. 2).

Further generalizations of the work of Heine and Stieltjes followed several paths; we will mention only some of them. First, under Stieltjes’ assumptions ($\mathcal A \subset \mathbb R$ and $\rho_k > 0$), Van Vleck [95] and Bôcher [15] proved that the zeros of each $V_n$ belong to the convex hull of $\mathcal A$ (see also the work of Shah [78–81]). Pólya [70] showed that this is true for $\mathcal A \subset \mathbb C$ if we keep the assumption of positivity of the residues $\rho_k$. Marden [58], and later, Al-Rashed, Alam and Zaheer (see [2,3,101,102]) established further results on location of the zeros of the Heine-Stieltjes polynomials under weaker conditions on the coefficients $A$ and $B$ of (1.3). An electrostatic interpretation of these zeros in cases when $\mathcal A \subset \mathbb R$ and some residues $\rho_k$ are negative has been studied by Grünbaum [39], and by Dimitrov and Van Assche [25]. For some interlacing properties, see e.g. [18].

We are interested in the asymptotic regime (so-called semiclassical asymptotics) when $n$ (the degree of the Heine-Stieltjes polynomials) tends to infinity. The first general result in this direction, based precisely on the Stieltjes model, is due to Martínez-Finkelshtein and Saff [61]. There the limit distribution of zeros of Heine-Stieltjes polynomials has been established in terms of the traditional extremal problem for the weighted logarithmic energy on a compact set of the plane.

The main goal of this paper is to consider the weak-* asymptotics of the Heine-Stieltjes and Van Vleck polynomials in the general setting of $\mathcal A \subset \mathbb C$ and $\rho_k \in \mathbb C$, which leads to a very different electrostatic problem: an equilibrium problem in the conducting plane (with a finite exceptional set of points). It is essentially known that zeros of Heine-Stieltjes polynomials present a discrete critical measure, that is, a saddle point of the discrete energy functional. A continuous analogue of this notion leads to a concept of “continuous” critical measure, i.e. a critical point of the usual energy functional defined on Borel measures with respect to a certain class of local variations. We prove (Sect. 7) that the weak limit of discrete critical measures is a continuous critical measure (as the number of atoms or mass points tends to infinity). Thus, continuous critical measures are limit distributions of zeros of the Heine-Stieltjes polynomials.
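Returning briefly to Heine's count (1.5), the simplest case $n = 1$ can be checked by hand. The computation below is only an illustrative sketch under two assumptions that are not part of the text: $\alpha \neq 0$, and $B$ has $p$ distinct zeros, labelled $b_1, \dots, b_p$ here (a notation introduced for this example only).

```latex
% Case n = 1 of (1.3): y(z) = z - \zeta, so y'' = 0, y' = 1,
% and n(n + \alpha - 1) = \alpha. Equation (1.3) reduces to
\begin{align*}
B(z) - \alpha\, V_1(z)\,(z - \zeta) = 0 .
\end{align*}
% Hence \zeta must be a zero of B. Writing B(z) = \alpha \prod_{j=1}^{p} (z - b_j)
% (assumption: \alpha \neq 0 and the b_j are pairwise distinct), we get
\begin{align*}
\zeta = b_k, \qquad V_1(z) = \prod_{j \neq k} (z - b_j), \qquad k = 1, \dots, p .
\end{align*}
% Number of Van Vleck polynomials found this way:
\begin{align*}
p = \binom{1 + p - 1}{1} = \sigma(1).
\end{align*}
```

Each $V_1$ obtained this way is monic of degree exactly $p - 1$, and there are $p = \sigma(1)$ of them, in agreement with (1.5).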


To complete the description of the limit zero distributions of these polynomials we have to study more deeply the set of continuous critical measures. The problem, rather complex, is connected to many other classical problems of analysis, and has potentially a large circle of applications. In Sect. 6 we mention a few connections, in particular, to the minimal capacity problem and its generalizations. In Sect. 5 we characterize critical measures in terms of trajectories of a (closed) rational quadratic differential on the Riemann sphere; for completeness of reading we summarize basic results on quadratic differentials in Sect. 4. Further investigation of such differentials is carried out in Sects. 8 (case $p = 2$) and 9 (general case). In the following two sections, 2 and 3, we discuss in some detail the concepts of the discrete and continuous equilibrium.

2. Discrete and Continuous Extremal Measures

2.1. Stieltjes electrostatic model: discrete equilibrium. We denote by $\mathbb M_n$ the class of uniform discrete measures on $\mathbb C$,

$$\mathbb M_n := \Big\{\sum_{k=1}^{n}\delta_{z_k},\ z_k \in \mathbb C\Big\}, \qquad \text{and} \qquad \mathbb M := \bigcup_{n\ge 1}\mathbb M_n,$$

where $\delta_x$ is a unit mass (Dirac delta) at $x$. With any polynomial $P(z) = \prod_{j=1}^{n}(z - \zeta_j)$ we associate its zero counting measure

$$\nu(P) = \sum_{j=1}^{n}\delta_{\zeta_j} \in \mathbb M_n,$$

where the zeros are counted according to their multiplicity. For $\mu = \sum_{k=1}^{n}\delta_{\zeta_k} \in \mathbb M$ we define its (discrete) energy

$$\mathcal E(\mu) := \sum_{i\neq j}\log\frac{1}{|\zeta_i - \zeta_j|}$$

(if two or more $\zeta_j$’s coincide, then $\mathcal E(\mu) = +\infty$). Additionally, given a real-valued function (external field) $\varphi$, finite at $\mathrm{supp}(\mu)$, we consider the weighted energy

$$\mathcal E_\varphi(\mu) := \mathcal E(\mu) + 2\sum_{k=1}^{n}\varphi(\zeta_k). \tag{2.1}$$

In the above mentioned paper [90] Stieltjes introduced the following extremal problem. For a fixed subset $\mathcal A = \{a_0, \dots, a_p\} \subset \mathbb R$, $a_0 < \dots < a_p$, values $\rho_k \ge 0$, $k = 0, 1, \dots, p$, and an arbitrary vector $\mathbf n = (n_1, \dots, n_p) \in \mathbb Z_+^p$ (where $\mathbb Z_+ = \mathbb N \cup \{0\}$), define $|\mathbf n| := n_1 + \dots + n_p$, $\Delta_j := [a_{j-1}, a_j]$, $j = 1, \dots, p$, and $\Delta := \cup_{j=1}^{p}\Delta_j = [a_0, a_p]$. Consider the class of discrete measures

$$\mathbb M_{|\mathbf n|}(\Delta, \mathbf n) := \big\{\mu \in \mathbb M_{|\mathbf n|} : \mathrm{supp}(\mu) \subset \Delta,\ \mu(\Delta_j) = n_j,\ j = 1, \dots, p\big\}, \tag{2.2}$$

and the external field

$$\varphi(x) = \mathrm{Re}\,(\Phi(x)), \qquad \Phi(x) = \sum_{j=0}^{p}\frac{\rho_j}{2}\log\frac{1}{x - a_j}. \tag{2.3}$$


We seek a measure $\mu^* = \mu^*(\mathbf n)$ minimizing the weighted energy (2.1) in the class $\mathbb M_{|\mathbf n|}(\Delta, \mathbf n)$:

$$\mathcal E_\varphi(\mu^*) = \min\big\{\mathcal E_\varphi(\mu) : \mu \in \mathbb M_{|\mathbf n|}(\Delta, \mathbf n)\big\}. \tag{2.4}$$

In other words, we place $n_j$ unit electric charges on the conductor $\Delta_j$ and look for the equilibrium position of such a system of charges in the external field $\varphi$, if the interaction obeys the logarithmic law. Stieltjes proved that the global minimum (2.4) provides the only equilibrium position, and that the zeros of the solution $y = Q_n$ of (1.3) are exactly points of the support of the extremal measure $\mu^*$ in (2.4): $\nu(Q_n) = \mu^*$. Actually, $\mu^*$ provides also the unique component-wise or point-wise minimum of $\mathcal E_\varphi$ (“Nash-type” equilibrium).

The Stieltjes equilibrium problem (2.4) is a constrained one: the constraints are embedded in the definition of the class $\mathbb M_{|\mathbf n|}(\Delta, \mathbf n)$. A classical non-constrained version of the same problem leads to the (weighted) Fekete points. Given a compact $\Gamma \subset \mathbb C$ and $n \in \mathbb N$, we want to find $\mu^* \in \mathbb M_n(\Gamma) := \{\mu \in \mathbb M_n : \mathrm{supp}(\mu) \subset \Gamma\}$ with

$$\mathcal E_\varphi(\mu^*) = \min\big\{\mathcal E_\varphi(\mu) : \mu \in \mathbb M_n(\Gamma)\big\}.$$

Stieltjes’ model for the hypergeometric case ($p = 1$) provides the well known electrostatic interpretation of the Jacobi polynomials. Zeros of the Jacobi polynomials $P_n^{(\alpha,\beta)}$ are also weighted Fekete points for $\Gamma = [-1, 1]$ and $\varphi(x) = \frac{\alpha+1}{2}\log\frac{1}{|x-1|} + \frac{\beta+1}{2}\log\frac{1}{|x+1|}$. Similarly, zeros of Laguerre and Hermite polynomials are weighted Fekete points for $\Gamma = [0, +\infty)$, $\varphi(x) = \frac{\alpha+1}{2}\log\frac{1}{|x|} + \frac{x}{2}$ and $\Gamma = \mathbb R$, $\varphi(x) = \frac{x^2}{2}$, respectively. It was pointed out in [42] that zeros of general orthogonal polynomials with respect to a measure on $\mathbb R$ may be interpreted as weighted Fekete points with an external field $\varphi = \varphi_n$ in general depending on the degree $n$.

Besides its elegance, the electrostatic model just described allows one to establish monotonicity properties of the zeros of the Heine-Stieltjes polynomials as a function of the parameters $\rho_k$.
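The electrostatic interpretation can be checked numerically in the simplest symmetric case. The sketch below is an illustration only, not part of the paper: it takes the Legendre case $\alpha = \beta = 0$, for which the external field is $\varphi(x) = \frac12\log\frac{1}{|x-1|} + \frac12\log\frac{1}{|x+1|}$, and verifies that the gradient of the weighted energy $\sum_{i\neq j}\log\frac{1}{|\zeta_i-\zeta_j|} + 2\sum_k\varphi(\zeta_k)$ vanishes at the zeros of the Legendre polynomial $P_8$. The function names are hypothetical, and the zeros are obtained from NumPy's Gauss-Legendre nodes.

```python
import numpy as np

def weighted_energy_gradient(points, field_derivative):
    """Gradient of sum_{i != j} log(1/|x_i - x_j|) + 2 * sum_k phi(x_k)."""
    x = np.asarray(points, dtype=float)
    diff = x[:, None] - x[None, :]          # diff[k, j] = x_k - x_j
    np.fill_diagonal(diff, np.inf)          # 1/inf == 0 drops the j == k terms
    # Each unordered pair appears twice in the double sum, hence the factor 2.
    return -2.0 * np.sum(1.0 / diff, axis=1) + 2.0 * field_derivative(x)

def legendre_field_derivative(x):
    # Assumed field (case alpha = beta = 0):
    # phi(x) = (1/2) log(1/|x-1|) + (1/2) log(1/|x+1|) = -(1/2) log(1 - x^2) on (-1, 1)
    return x / (1.0 - x**2)

# Zeros of the Legendre polynomial P_8 (the Gauss-Legendre nodes).
zeros, _ = np.polynomial.legendre.leggauss(8)
gradient_at_zeros = weighted_energy_gradient(zeros, legendre_field_derivative)

# An equally spaced configuration, for comparison, is not an equilibrium.
uniform = np.linspace(-0.9, 0.9, 8)
gradient_at_uniform = weighted_energy_gradient(uniform, legendre_field_derivative)

print("max |gradient| at Legendre zeros:", np.max(np.abs(gradient_at_zeros)))
print("max |gradient| at uniform points:", np.max(np.abs(gradient_at_uniform)))
```

The first printed value should be at the level of rounding error, while the second is of order one or larger. The check only confirms that the zeros form a critical configuration; that they give the global minimum is the content of the result of Stieltjes cited above.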
The minimization problem for the discrete energy on which this model is based also admits substantial generalizations (one of them is the subject of the present paper). The problem of the limit distribution of the discrete extremal points as $n \to \infty$ leads to the corresponding continuous energy problems.

2.2. Extremal problem for Borel measures: continuous equilibrium. We denote by $\mathcal M$ (resp., $\mathcal M_{\mathbb R}$) the set of all finite positive (resp., real) Borel measures $\mu$ with compact support $\mathrm{supp}(\mu) \subset \mathbb C$. Hereafter, $|\mu|$ stands for the total variation of $\mu \in \mathcal M_{\mathbb R}$, and $\|\mu\| := |\mu|(\mathbb C)$. For $n \in \mathbb N$, let $\mathcal M_n := \{\mu \in \mathcal M : \|\mu\| = n\}$ be the set of positive Borel measures with total mass $n$ on $\mathbb C$. With every measure $\mu \in \mathcal M_{\mathbb R}$ we can associate its (continuous) logarithmic energy

$$E(\mu) := \iint \log\frac{1}{|x - y|}\,d\mu(x)\,d\mu(y). \tag{2.5}$$

Given the external field $\varphi \in L^1(|\mu|)$, we consider also the weighted energy

$$E_\varphi(\mu) := E(\mu) + 2\int \varphi\,d\mu. \tag{2.6}$$


If $\Gamma$ is a subset of $\mathbb C$, we denote by $\mathcal M(\Gamma)$ (resp., $\mathcal M_{\mathbb R}(\Gamma)$) the restriction of the corresponding families to measures supported on $\Gamma$. Again, a standard extremal problem of potential theory is to seek a global minimizer $\lambda_{\Gamma,\varphi} \in \mathcal M_1(\Gamma)$ such that

$$E_\varphi(\lambda_{\Gamma,\varphi}) = \rho := \min\big\{E_\varphi(\mu) : \mu \in \mathcal M_1(\Gamma)\big\}. \tag{2.7}$$

It is well known that under certain conditions on $\varphi$ this minimizer $\lambda_{\Gamma,\varphi}$ exists and is unique; it is called the equilibrium measure of $\Gamma$ in the external field $\varphi$, see e.g. [77] for further details. For $\varphi \equiv 0$, the measure $\lambda_\Gamma = \lambda_{\Gamma,0}$ is also known as the Robin measure of $\Gamma$. In terms of the extremal constant $\rho$ we can also define the weighted (logarithmic) capacity of $\Gamma$, $\mathrm{cap}_\varphi(\Gamma) = e^{-\rho}$. For $\varphi \equiv 0$ we simplify notation writing $\mathrm{cap}(\Gamma)$ instead of $\mathrm{cap}_0(\Gamma)$. If $\mathrm{cap}(\Gamma) = 0$, then $\Gamma$ is a polar set. Observe that $E(\mu) = +\infty$ for any $\mu \in \mathbb M$, so that any finite set is polar.

There is a number of properties characterizing the equilibrium measure $\lambda_{\Gamma,\varphi}$. For instance, if we define the logarithmic potential of $\mu \in \mathcal M_{\mathbb C}$ by

$$U^\mu(z) := \int \log\frac{1}{|z - t|}\,d\mu(t),$$

then up to a polar subset of $\Gamma$,

$$U^{\lambda_{\Gamma,\varphi}}(z) + \varphi(z) \begin{cases} = \rho^*, & \text{if } z \in \mathrm{supp}(\lambda_{\Gamma,\varphi}), \\ \ge \rho^*, & \text{if } z \in \Gamma, \end{cases} \tag{2.8}$$

where $\rho^*$ is a constant related to $\rho$ and $\varphi$. Furthermore, if $\Gamma$ and $\varphi$ are sufficiently regular,

$$\min_{z\in\Gamma}\big(U^{\lambda_{\Gamma,\varphi}}(z) + \varphi(z)\big) = \max_{\mu\in\mathcal M_1(\Gamma)}\,\min_{z\in\Gamma}\big(U^\mu(z) + \varphi(z)\big). \tag{2.9}$$

This max-min property is a basis for applications of the equilibrium measure in the asymptotic theory of extremal (in particular, orthogonal) polynomials, see [34,66,73], and also the monograph [77].

Like for the discrete measures, we will consider general external fields of the form $\varphi(z) = \mathrm{Re}\,\Phi(z)$, where $\Phi$ is analytic, but in general multivalued. What we require in the sequel is that $\Phi'$ is holomorphic in $\mathbb C\setminus\mathcal A$, allowing further construction below.

Remark 2.1. Further generalizations of this construction can be obtained either considering several measures on respective sets interacting according to a certain law (vector equilibrium) [35], or including additional constraints. For instance, prescribing an upper bound on the density of the extremal measure on $\Gamma$ in (2.7) we obtain the so-called constrained equilibrium [26,74], relevant for the asymptotic description of polynomials of discrete orthogonality. Another way is to impose in (2.7) the size of $\mu$ on each component of $\Gamma$, such as it was done in [61]: if $\mathcal A = \{a_0, \dots, a_p\} \subset \mathbb R$, $a_0 < \dots < a_p$, $\Delta_j := [a_{j-1}, a_j]$, $j = 1, \dots, p$, $\Delta = \cup_{j=1}^{p}\Delta_j = [a_0, a_p]$, and $N$ is the standard simplex in $\mathbb R^{p-1}$,

$$N = \Big\{\theta = (\theta_1, \dots, \theta_p) : \theta_i \ge 0,\ i = 1, \dots, p, \text{ and } \sum_{i=1}^{p}\theta_i = 1\Big\},$$


then for each θ = (θ1 , . . . , θ p ) ∈ N we can consider the global minimum of the weighted energy E ϕ (·) restricted to the class def M1 ( , θ ) = μ ∈ M1 : supp(μ) ⊂ , μ( j ) = θ j , j = 1, . . . , p − 1 . Again, for ϕ like in (2.3) with ρ j ≥ 0 there exists a unique minimizing energy, λ ,ϕ (θ). Remark 2.2. It should be mentioned that a characterization of the weighted Fekete points on the real line and its continuous limit were used in [21] to prove new results on the support of an equilibrium (i.e. extremal) measure in an analytic external field on R. 2.3. Relation between discrete and continuous equilibria. The transfinite diameter of a compact set is defined by the limit process when the number of Fekete points tends to infinity. It was Pólya who proved the remarkable fact that the transfinite diameter of is equal to its capacity. Fekete observed further that the normalized counting measure of Fekete points converges to the equilibrium (Robin) measure of . For the weighted analogue of this result, see [77, Ch. III]. The connection between the discrete and continuous equilibria allowed to use the Stieltjes model in [61] in order to obtain in this situation the limit distribution of zeros p of Heine-Stieltjes polynomials. Namely, if for each vector n = (n 1 , . . . , n p ) ∈ Z+ ∗ we consider the discrete extremal measure μ (n) introduced in (2.4), and assume that |n| → ∞ in such a way that each fraction n k /|n| has a limit, nj = θj, |n|→∞ |n| lim

j = 1, . . . , p,

then μ∗(n)/|n| converges weakly to the equilibrium measure $\lambda_{\Gamma,0}(\theta)$, Γ = [a0, ap], θ = (θ1, . . . , θp), defined in the previous section. In a certain sense, this can be regarded as a generalization of the classical result of Fekete mentioned above.

3. Discrete and Continuous Critical Measures

According to a well-known result of Gauss, there are no stable equilibrium configurations (i.e. local minima of the energy) in a conducting open set under a harmonic external field. Unstable equilibria usually do not attract much attention from the point of view of physics. However, as we will show, they constitute a rich and relevant object that appears naturally in many fields of analysis. We now introduce the concept that plays the leading role in this paper: the family of measures providing saddle points for the logarithmic energy on the plane, with a separate treatment of the discrete and continuous cases.

3.1. Discrete critical measures. We start with the following definition:

Definition 3.1. Let Ω be a domain in C, A ⊂ Ω a subset of zero capacity, and ϕ a C^1 real-valued function in Ω\A. A measure

$$\mu=\sum_{k=1}^{n}\delta_{\zeta_k}\in M_n,\qquad \zeta_i\neq\zeta_j\ \text{ for } i\neq j, \tag{3.1}$$

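is a discrete (A, ϕ)-critical measure in Ω if supp(μ) ⊂ C\A and, for the weighted discrete energy $\mathcal{E}_\varphi(\mu)=\mathcal{E}_\varphi(\zeta_1,\dots,\zeta_n)$, we have

$$\operatorname{grad}\,\mathcal{E}_\varphi(\zeta_1,\dots,\zeta_n)=0, \tag{3.2}$$

or equivalently,

$$\frac{\partial}{\partial z}\,\mathcal{E}_\varphi(\zeta_1,\dots,z,\dots,\zeta_n)\Big|_{z=\zeta_k}=0,\quad k=1,\dots,n,\qquad \frac{\partial}{\partial z}=\frac12\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right).$$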

More generally, if ϕ = Re Φ, where Φ is an analytic (in general, multivalued) function in Ω with a single-valued derivative Φ′, then this definition does not need any modification. In the sequel we omit the mention of Ω if Ω = C. The following proposition is just a reformulation of Eq. (1.3) in this new terminology:

Proposition 3.1. Assume that A = {a0, a1, . . . , ap}, p ∈ N, is a set of pairwise distinct points on C, and the external field ϕ is given by (2.3). Then

$$\mu=\sum_{k=1}^{n}\delta_{\zeta_k}\in M_n,\qquad \zeta_i\neq\zeta_j\ \text{ for } i\neq j, \tag{3.3}$$

supported on C\A, is a discrete (A, ϕ)-critical measure if and only if there exists a polynomial $V_n\in\mathbb{P}_{p-1}$ such that $y(z)=y_n(z)=\prod_{k=1}^{n}(z-\zeta_k)$ is a solution of the differential equation (1.3), with

$$\frac{B(x)}{A(x)}=\sum_{k=0}^{p}\frac{\rho_k}{x-a_k}.$$

In other words, discrete (A, ϕ)-critical measures with external field generated by complex charges fixed at A correspond precisely to zeros of Heine-Stieltjes polynomials.

Proof. A straightforward computation shows that for z ≠ w,

$$2\,\frac{\partial}{\partial z}\log|z-w|=\frac{1}{z-w}.$$

Hence,

$$2\,\frac{\partial}{\partial\zeta_k}\,\mathcal{E}(\zeta_1,\dots,\zeta_n)=-2\,\frac{\partial}{\partial\zeta_k}\sum_{i\neq j}\log|\zeta_i-\zeta_j|=-2\sum_{j\neq k}\frac{1}{\zeta_k-\zeta_j}.$$

On the other hand, the multivalued function ϕ has a single-valued derivative given by (see (2.3))

$$2\,\frac{\partial}{\partial z}\varphi(z)=-\frac12\sum_{j=0}^{p}\frac{\rho_j}{z-a_j}=\Phi'(z).$$

Thus, using the notation from (1.6), we can rewrite condition (3.2) as

$$2\Biggl(\sum_{j\neq k}\frac{1}{\zeta_k-\zeta_j}-\Phi'(\zeta_k)\Biggr)=2\sum_{j\neq k}\frac{1}{\zeta_k-\zeta_j}+\frac{B}{A}(\zeta_k)=0,\quad k=1,\dots,n, \tag{3.4}$$

and with $y(z)\overset{\mathrm{def}}{=}\prod_{i=1}^{n}(z-\zeta_i)$ this identity takes the form

$$\left(\frac{y''}{y'}+\frac{B}{A}\right)(\zeta_k)=0,\quad k=1,\dots,n. \tag{3.5}$$

As a consequence, the polynomial $A(z)y''(z)+B(z)y'(z)\in\mathbb{P}_{n+p-1}$ is divisible by y, so there exists a polynomial $\widetilde V_n\in\mathbb{P}_{p-1}$ such that

$$A(z)\,y''(z)+B(z)\,y'(z)=\widetilde V_n(z)\,y(z),$$

which concludes the proof.

In the sequel we will make use of the following uniform boundedness of the supports of the discrete critical measures corresponding to a sequence of external fields of the form

$$\varphi_n=\operatorname{Re}\Phi_n,\qquad \Phi_n(z)=-\sum_{k=0}^{p}\frac{\rho_k(n)}{2}\log(z-a_k), \tag{3.6}$$

where ρk(n) ∈ C.

Proposition 3.2. Let μn ∈ Mn, n ∈ N, be a discrete (A, ϕn)-critical measure corresponding to an external field (3.6). If

$$\liminf_{n}\ \operatorname{Re}\sum_{k=0}^{p}\frac{\rho_k(n)}{n}>-\frac12, \tag{3.7}$$

then $\bigcup_{n}\operatorname{supp}(\mu_n)$ is bounded in C.

In other words, if we assume that in (1.3) the coefficient B = Bn may depend on n, but Bn /n is bounded (in such a way that (3.7) holds), then the zeros of the Heine-Stieltjes polynomials are also uniformly bounded.
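The system (3.4)–(3.5) can be checked numerically in a classical special case. The sketch below is an illustration under stated assumptions, not part of the original argument. It takes A = {−1, 1} with ρ0 = ρ1 = 1/2, so that B(x)/A(x) = x/(x² − 1), and uses the Chebyshev polynomial T_n as the Heine-Stieltjes polynomial, since (x² − 1)T_n'' + xT_n' = n²T_n. The function names are hypothetical.

```python
import math

def chebyshev_zeros(n):
    """Zeros of the Chebyshev polynomial T_n: cos((2k-1)*pi/(2n)), k = 1..n."""
    return [math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]

def stieltjes_residuals(zeros, charges_at, rho):
    """For each zero z_k return 2*sum_{j != k} 1/(z_k - z_j) + sum_m rho_m/(z_k - a_m),
    i.e. the right-hand expression in (3.4) with B/A = sum_m rho_m/(x - a_m)."""
    residuals = []
    for k, zk in enumerate(zeros):
        pair_term = sum(1.0 / (zk - zj) for j, zj in enumerate(zeros) if j != k)
        field_term = sum(r / (zk - a) for a, r in zip(charges_at, rho))
        residuals.append(2.0 * pair_term + field_term)
    return residuals

# Assumed example data: n = 8, fixed charges at -1 and 1 with rho = 1/2 each.
zeros = chebyshev_zeros(8)
residuals = stieltjes_residuals(zeros, charges_at=[-1.0, 1.0], rho=[0.5, 0.5])
print("largest residual:", max(abs(r) for r in residuals))
print("largest |zero|:  ", max(abs(z) for z in zeros))
```

In this example the residuals vanish up to rounding and the zeros lie in (−1, 1), which agrees with the boundedness statement for fixed ρk.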

Proof. Let μn = nk=1 δζk (n) ∈ Mn , and assume that |ζ1 (n)| ≥ · · · ≥ |ζn (n)|. Since |ζ1 (n)| > 0, by (3.4), n j=2

1 ζ1 (n) =− ρk (n) . 1 − ζ j (n)/ζ1 (n) ζ1 (n) − ak p

k=0

But

|ζ j (n)/ζ1 (n)| ≤ 1

⇒

1 Re 1 − ζ j (n)/ζ1 (n)

so that 1 ζ1 (n) 1 Re ρk (n) ≤− . n−1 ζ1 (n) − ak 2 p

k=0

≥ 1/2,


Hence, if ζ1(n) → ∞ along a subsequence of N, then

$$\liminf_{n}\ \operatorname{Re}\sum_{k=0}^{p}\frac{\rho_k(n)}{n}\le-\frac12,$$

which contradicts our assumptions.

Remark 3.1. It was proved in [82] that for a fixed ϕ of the form (3.6) (that is, ρk(n) ≡ ρk, k = 0, . . . , p), the zeros of the Heine-Stieltjes polynomials accumulate on the convex hull of A.

Remark 3.2. Condition (3.7) is in general necessary for the assertion of Proposition 3.2. Indeed, for p = 0, a0 = 0, and

$$\varphi_n(z)=\frac{n-1}{2}\log|z|,$$

any discrete uniform measure supported at the scaled roots of unity, that is,

$$\mu_n=\sum_{k=1}^{n}\delta_{\zeta_k(n)}\in M_n,\qquad \zeta_k(n)=\zeta_n e^{2\pi i k/n},\quad \zeta_n\in\mathbb{C}\setminus\{0\},$$

is (A, ϕn)-critical, which is easily established using (3.4) and (3.5). Obviously, for ζn → ∞ the support of μn is not uniformly bounded in n.

3.2. Continuous critical measures. Unlike in the discrete case, we now provide a variational definition for the continuous critical measure. Any smooth complex-valued function h in the closure of a domain Ω generates a local variation of Ω by $z\mapsto z^t=z+t\,h(z)$, t ∈ C. It is easy to see that $z\mapsto z^t$ is injective for small values of the parameter t. The transformation above induces a variation of sets $e\mapsto e^t\overset{\mathrm{def}}{=}\{z^t: z\in e\}$, and of (signed) measures $\mu\mapsto\mu^t$, defined by $\mu^t(e^t)=\mu(e)$; in differential form, the pullback measure $\mu^t$ can be written as $d\mu^t(x^t)=d\mu(x)$.

Definition 3.2. Let Ω be a domain in C, A ⊂ Ω a subset of zero capacity, and ϕ a C^1 real-valued function in Ω\A. We say that a signed measure $\mu\in M_{\mathbb{R}}(\Omega)$ is a continuous (A, ϕ)-critical measure if for any h smooth in Ω\A such that $h|_A\equiv 0$,

$$\frac{d}{dt}E_\varphi(\mu^t)\Big|_{t=0}=\lim_{t\to 0}\frac{E_\varphi(\mu^t)-E_\varphi(\mu)}{t}=0. \tag{3.8}$$

Furthermore, if ϕ = Re Φ, where Φ is an analytic (in general, multivalued) function in Ω with a single-valued derivative Φ′, then this definition does not need any modification. In what follows we will always mean by an (A, ϕ)-critical measure the continuous one, satisfying Definition 3.2. Furthermore, in order to simplify notation, we speak about an A-critical measure meaning a continuous (A, ϕ)-critical measure with the external field ϕ ≡ 0. Observe that if A ≠ ∅, this notion is nontrivial. A particularly interesting case is treated in the following lemma:

Lemma 3.1. If ϕ = Re Φ, and Φ is analytic in a simply connected domain Ω, then condition (3.8) is equivalent to

$$f_\varphi(\mu;h)=0, \tag{3.9}$$

with

$$f_\varphi(\mu;h)\overset{\mathrm{def}}{=}\iint\frac{h(x)-h(y)}{x-y}\,d\mu(x)\,d\mu(y)-2\int\Phi'(x)\,h(x)\,d\mu(x). \tag{3.10}$$

Proof. It is sufficient to show that

$$E_\varphi(\mu^t)-E_\varphi(\mu)=-\operatorname{Re}\bigl(t\,f_\varphi(\mu;h)+O(t^2)\bigr).$$

We have

$$E(\mu^t)=\iint\log\frac{1}{|x^t-y^t|}\,d\mu^t(x^t)\,d\mu^t(y^t)=\iint\log\frac{1}{|(x-y)+t\,(h(x)-h(y))|}\,d\mu(x)\,d\mu(y),$$

so that

$$E(\mu^t)-E(\mu)=-\iint\log\left|1+t\,\frac{h(x)-h(y)}{x-y}\right|d\mu(x)\,d\mu(y)=-\operatorname{Re}\iint\log\left(1+t\,\frac{h(x)-h(y)}{x-y}\right)d\mu(x)\,d\mu(y).$$

On the other hand,

$$\int\varphi(x^t)\,d\mu^t(x^t)-\int\varphi(x)\,d\mu(x)=\int\varphi(x+t\,h(x))\,d\mu(x)-\int\varphi(x)\,d\mu(x)=\operatorname{Re}\int\bigl(\Phi(x+t\,h(x))-\Phi(x)\bigr)\,d\mu(x).$$

Taking into account the behavior of log(1 + x) for small x, we conclude that as t → 0,

$$E_\varphi(\mu^t)-E_\varphi(\mu)=-\operatorname{Re}\iint\left(t\,\frac{h(x)-h(y)}{x-y}+O(t^2)\right)d\mu(x)\,d\mu(y)+2\operatorname{Re}\int\bigl(t\,\Phi'(x)\,h(x)+O(t^2)\bigr)\,d\mu(x),$$

and the statement follows.

Remark 3.3. For a finite set A and the external field given by (2.3), the discrete (A, ϕ)-critical measures fit into the same variational definition as their continuous counterparts, as long as we replace in (3.8) the continuous energy $E_\varphi(\mu)$ by $\mathcal{E}_\varphi(\mu)$. Indeed, arguments similar to those used in the proof of Lemma 3.1 show that for μ in (3.3), the condition

$$\frac{d}{dt}\,\mathcal{E}_\varphi(\mu^t)\Big|_{t=0}=0, \tag{3.11}$$

written for

$$h(\zeta)=\frac{A(\zeta)}{\zeta-z},\qquad z\notin A,$$

yields

$$\sum_{i\neq j}\frac{1}{(\zeta_i-z)(\zeta_j-z)}-\frac{B(z)}{A(z)}\sum_{i=1}^{n}\frac{1}{\zeta_i-z}=\frac{D(z)}{A(z)},$$

where D is a polynomial. In particular, the residue of the left hand side (as a function of z) is 0 at z = ζk, k = 1, . . . , n; setting $y(z)\overset{\mathrm{def}}{=}\prod_{i=1}^{n}(z-\zeta_i)$, we arrive again at the system (3.5). Conversely, using the chain rule it is easy to show that condition (3.2) implies (3.11).

Critical measures constitute an important object; for a finite set A the natural description of their structure is in terms of the trajectories of quadratic differentials. In the next section we give an abridged introduction to quadratic differentials on the Riemann sphere in the form needed for our purposes. For a comprehensive account of this theory see for instance [43,72,91,96].

4. Rational Quadratic Differentials on the Riemann Sphere in a Nutshell

Let A and V be monic polynomials of degree p + 1 and p − 1, respectively, with A given by (1.2) with all ak's pairwise distinct. The rational function V/A defines on the Riemann sphere $\overline{\mathbb{C}}$ the quadratic differential

$$\varpi(z)=-\frac{V(z)}{A(z)}\,(dz)^2. \tag{4.1}$$

The only singular points of ϖ (assuming that the zeros of V and A are disjoint) are:

– the points ak ∈ A, where ϖ has simple poles (critical points of order −1);
– the zeros of V of order k ≥ 1, where ϖ has zeros of the same order;
– the infinity, where ϖ has a double pole (critical point of order −2) with the residue −1.

The rest of the points in $\overline{\mathbb{C}}$ are the regular points of ϖ, and their order is 0. All singular points of order ≥ −1 are called finite critical points of ϖ. In a neighborhood of any regular point z0 we can introduce a local parameter

$$\xi=\xi(z)=\int^{z}\sqrt{\varpi}=\int^{z}\sqrt{-\frac{V(t)}{A(t)}}\;dt, \tag{4.2}$$

in terms of which the representation of ϖ is identically equal to one. This parameter is not uniquely determined: any other parameter $\tilde\xi$ with this property satisfies $\tilde\xi=\pm\xi+\mathrm{const}$. The function ξ is called the distinguished or natural parameter near z0.

Following [72] and [91], a smooth curve γ along which

$$-\frac{V(z)}{A(z)}\,(dz)^2>0\quad\Longleftrightarrow\quad\operatorname{Im}\xi(z)=\mathrm{const}$$

is a horizontal arc of the quadratic differential ϖ. More precisely, if γ is given by a parametrization z(t), t ∈ (α, β), then

$$-\frac{V}{A}(z(t))\left(\frac{dz}{dt}\right)^{2}>0,\qquad t\in(\alpha,\beta).$$

A maximal horizontal arc is called a horizontal trajectory (or simply a trajectory) of ϖ. Analogously, trajectories of −ϖ are called orthogonal or vertical trajectories of ϖ; along these curves

$$\frac{V(z)}{A(z)}\,(dz)^2>0\quad\Longleftrightarrow\quad\operatorname{Re}\xi(z)=\mathrm{const}.$$

Any simply connected domain D not containing singular points of ϖ and bounded by two vertical and two horizontal arcs is called a ϖ-rectangle. In other words, if ξ is any distinguished parameter in D, then ξ(D) is a (euclidean) rectangle, and D → ξ(D) is a one-to-one conformal mapping. Obviously, this definition is consistent with the freedom in the selection of the natural parameter ξ. We can define a conformal invariant metric associated with the quadratic differential ϖ, given by the length element $|d\xi|=|\sqrt{V/A}|(z)\,|dz|$; the ϖ-length of a curve γ is

$$\|\gamma\|_{\varpi}=\frac{1}{\pi}\int_{\gamma}\sqrt{\left|\frac{V}{A}\right|}(z)\,|dz|$$

(observe that this definition differs by a normalization constant from Definition 5.3 in [91]). Furthermore, if D is a simply connected domain not containing singular points of ϖ, we can introduce the ϖ-distance by

$$\operatorname{dist}(z_1,z_2;\varpi,D)=\inf\{\|\gamma\|_{\varpi}:\ z_1,z_2\in\bar\gamma,\ \gamma\subset D\}.$$

Trajectories and orthogonal trajectories are in fact geodesics (in the ϖ-metric) connecting any two of their points. Indeed, according to [72, Thm. 8.4], in any simply connected domain D not containing singular points of ϖ, a trajectory arc γ joining z1 with z2 is the shortest: if L1, L2 are the orthogonal trajectories through z1 and z2, respectively, then any rectifiable curve $\tilde\gamma$ that connects L1 with L2 in D satisfies $\|\gamma\|_{\varpi}\le\|\tilde\gamma\|_{\varpi}$.

The local structure of the trajectories is well known (see the references cited at the end of the previous section). For instance, at any regular point trajectories look locally like simple analytic arcs passing through this point, and through every regular point of ϖ there passes a uniquely determined horizontal and a uniquely determined vertical trajectory of ϖ, which are locally orthogonal at this point [91, Theorem 5.5]. If z is a finite critical point of ϖ of order k ≥ −1, then from z emanate k + 2 trajectories under equal angles 2π/(k + 2) (see Fig. 1). In the case of a double pole, the trajectories have either the radial, the circular or the spiral form, depending on whether the residue at this point is negative, positive or non-real, see Fig. 2. In particular, with the assumptions on A and V above, all trajectories of the quadratic differential (4.1) in a neighborhood of infinity are topologically identical to circles.

The global structure of the trajectories is much less clear. The trajectories and orthogonal trajectories of a given differential produce a transversal foliation of the Riemann sphere. The main source of trouble is the existence of the so-called recurrent trajectories, whose closure may have a non-zero plane Lebesgue measure.
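The count of k + 2 trajectories at a finite critical point of order k can be illustrated on a model differential. The sketch below assumes the local model −z^k (dz)², which is an assumption made for illustration (a general ϖ differs by a nonzero factor that only rotates the picture). Along a ray z = r e^{iθ} we have −z^k (dz)² = −r^k e^{i(k+2)θ} (dr)², so the ray is a horizontal arc exactly when e^{i(k+2)θ} = −1. The function names are hypothetical.

```python
import cmath
import math

def critical_directions(k):
    """Angles theta in [0, 2*pi) of the rays z = r*exp(i*theta) that are
    horizontal arcs of the model differential -z^k (dz)^2, for order k >= -1."""
    m = k + 2
    return [(2 * j + 1) * math.pi / m for j in range(m)]

def sign_along_ray(k, theta):
    """Value of -exp(i*(k+2)*theta); equal to 1 when the ray is a horizontal arc."""
    return -cmath.exp(1j * (k + 2) * theta)

for order in (-1, 1, 2):
    angles = critical_directions(order)
    print(order, [round(a, 4) for a in angles])
```

For a simple pole (k = −1) this gives a single ray, and for a simple zero (k = 1) three rays separated by 2π/3, matching the local pictures described above.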
We refer the reader to [91] for further details. A trajectory γ is critical or short if it joins two (not necessarily different) finite critical points of ϖ. The set of critical trajectories of ϖ together with their endpoints (critical points of ϖ) is the critical graph of ϖ. Critical and closed trajectories are the only trajectories of ϖ with finite ϖ-length. The quadratic differential ϖ is called closed if all its trajectories are either critical or closed (i.e. all its trajectories have a finite ϖ-length). In this case the trajectories of ϖ that constitute closed Jordan curves cover the whole plane, except a set of critical trajectories of plane Lebesgue measure zero; see e.g. Fig. 3 for a typical structure of such trajectories.

Fig. 1. The local trajectory structure near a simple zero (left) or a simple pole (right)

Fig. 2. The local trajectory structure near a double pole with a negative (left), positive (center) or non-real (right) residue

Fig. 3. ϖ-rectangles intersecting the support of a positive (left) and sign-changing measure (right); for further details, see Sect. 8.4

If the quadratic differential (4.1) with A given by (1.2) is closed, there exists a set Γ of at most p critical trajectories of ϖ such that the complement to Γ is connected, and $\sqrt{V/A}$ has a single-valued branch in C\Γ.

5. Critical Measures in the Field of a Finite System of Fixed Charges

In what follows we fix the set of p + 1 distinct points A = {a0, a1, . . . , ap} ⊂ C and consider the basic domain Ω = C\A and an external field ϕ of the form

$$\varphi=\operatorname{Re}\Phi,\qquad \Phi(z)=-\sum_{k=0}^{p}\frac{\rho_k}{2}\log(z-a_k),\qquad \Phi'(z)=-\sum_{k=0}^{p}\frac{\rho_k/2}{z-a_k}=-\frac{B(z)}{2A(z)}, \tag{5.1}$$

where we have used notation from (1.6). If {ρ0, . . . , ρp} ⊂ R, then this external field corresponds to the potential of a discrete signed measure supported on A:

$$\varphi(z)=U^{\sigma}(z),\qquad \sigma=\sum_{k=0}^{p}\frac{\rho_k}{2}\,\delta_{a_k}\in M_{p+1}. \tag{5.2}$$

However, if any ρk ∈ C\R, then ϕ is not single-valued in C\A; nevertheless, the notion of an (A, ϕ)-critical measure for this case has been discussed in Definition 3.2. In particular, Lemma 3.1 applies. In this section we state and prove the main structural theorem for (A, ϕ)-critical measures, which asserts that the support of any such measure is a union of analytic curves made of trajectories of a rational quadratic differential. On each arc of its support the measure has an analytic density with respect to the arc-length measure. Finally, we describe the Cauchy transform and the logarithmic potential of an (A, ϕ)-critical measure.

5.1. The main theorem. According to (5.1), A is exactly the set of singularities of the external field ϕ, except for the case when ρk = 0 for some k ∈ {0, . . . , p}. In such a case we do not drop the corresponding ak from the set A; it remains as a fixed point of the class of variations (Definition 3.2). However, the status of the point ak ∈ A with ρk = 0 is different from the case ρk ≠ 0, see the next theorem.

Theorem 5.1. Let A = {a0, a1, . . . , ap} and ϕ be given by (5.1). Then for any continuous (A, ϕ)-critical measure μ there exists a rational function R with poles at A and normalized by

$$R(z)=\left(\frac{\kappa}{z}\right)^{2}+O\!\left(\frac{1}{z^{3}}\right),\quad z\to\infty,\qquad \kappa\overset{\mathrm{def}}{=}\mu(\mathbb{C})+\frac12\sum_{j=0}^{p}\rho_j, \tag{5.3}$$

such that the support supp(μ) consists of a union of trajectories of the quadratic differential $\varpi(z)=-R(z)\,dz^{2}$. If all ρj ∈ R, then ϖ is closed, and supp(μ) is made of a finite number of trajectories of ϖ. If in the representation (5.1), ρj = 0, j ∈ {0, 1, . . . , p}, then aj is either a simple pole or a regular point of R; otherwise R has a double pole at aj.
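A classical example may help to fix the normalization in Theorem 5.1. The sketch below assumes A = {−1, 1}, ϕ ≡ 0 (so ρ0 = ρ1 = 0 and κ = μ(C) = 1) and takes μ to be the arcsine measure dx/(π√(1 − x²)) on [−1, 1]. That this measure is A-critical is an assumption of the example, not a claim made in the text. With the Cauchy transform C(z) = ∫ dμ(x)/(x − z), one expects C(z)² = R(z) with R(z) = 1/(z² − 1), which has simple poles at ±1 and behaves like 1/z² at infinity. The Cauchy transform is approximated by Gauss-Chebyshev quadrature, and the function names are hypothetical.

```python
import math

def cauchy_transform_arcsine(z, n=400):
    """Gauss-Chebyshev approximation of C(z) = int dmu(x)/(x - z), where
    dmu = dx / (pi*sqrt(1 - x^2)) on [-1, 1] and z lies off the interval."""
    nodes = (math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1))
    return sum(1.0 / (x - z) for x in nodes) / n

def R(z):
    """Candidate rational function for this example: simple poles at -1 and 1."""
    return 1.0 / (z * z - 1.0)

z0 = 0.3 + 0.7j          # assumed test point away from [-1, 1]
c0 = cauchy_transform_arcsine(z0)
print("C(z0)^2 =", c0 * c0)
print("R(z0)   =", R(z0))

z_far = 1.0e3            # check the leading behaviour kappa^2 / z^2 with kappa = 1
print("z^2 R(z) at large z:", z_far * z_far * R(z_far))
```

Here the horizontal trajectories of −R(z) dz² include the interval (−1, 1), since −1/(x² − 1) > 0 there, which is consistent with the support of the measure.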


The proof of this theorem reduces to the two lemmas below. The first of them deals with the principal value of the Cauchy transform

$$C^{\mu}(z)\overset{\mathrm{def}}{=}\lim_{\epsilon\to0+}\int_{|z-x|>\epsilon}\frac{1}{x-z}\,d\mu(x) \tag{5.4}$$

of the (A, ϕ)-critical measure μ.

Lemma 5.1. For any (A, ϕ)-critical measure μ there exists a rational function R with the properties listed in Theorem 5.1 such that

$$\bigl(C^{\mu}(z)+\Phi'(z)\bigr)^{2}=R(z)\qquad \mathrm{mes}_2\text{-a.e.}, \tag{5.5}$$

where mes2 is the plane Lebesgue measure on C. Remark 5.1. Formula (5.5) and its variations for equilibriums measures of compact sets of minimal capacity (see Sect. 6) are well-known, although occasionally written in terms of quadratic differentials, see e.g. the work of Nuttall [67], Stahl [86,88], Gonchar and Rakhmanov [36,75], Deift and collaborators [21]. Notice that in the situation considered here the support of the critical measure is not known a-priori. Remark 5.2. Formula (5.5) is also sufficient for μ being (A, ϕ)-critical, so that it in fact characterizes these critical measures. The proof of this statement lies beyond the scope of this already lengthy paper, and we do not go into further details. Proof. Assume that μ is an (A, ϕ)-critical measure for ϕ like in (5.1). We will actually show that (5.5) is valid at any point z ∈ C where the integral defining C μ is absolutely convergent. It is well known that at such a z,

dμ(x) lim+ = 0, (5.6) r →0 |x−z| 0 denote Dr = {ζ ∈ C : |ζ − z| < r }. def Function m(r ) = μ (Dr ) is continuous from the left and monotonically increasing, so that the subset m(r + ε) − m(r − ε) def = r ∈ (0, 1) : m (r ) = lim exists ε→0 2ε has the linear Lebesgue measure 1.


For r ∈ and ε ∈ (0, 1) define the “smooth step” function ⎧ if 0 ≤ x < 1 − ε, ⎨ 0, 2 def (x, ε) = (x−1−ε) (x−1+2ε) , if 1 − ε ≤ x < 1 + ε, 4ε3 ⎩ 1, if x ≥ 1 + ε. It is easy to see that (·, ε) ∈ C 1 (R+ ) and that | ddx (x, ε)| < 1/ε for all ε > 0. Using this function we define on C the C 1 function |ζ − z| def θ (ζ ) = θ (ζ, r, ε) = ,ε , r and consider the condition (3.9) with the following particular choice of h: h(ζ ) = h ε (ζ ; r ) =

A(ζ ) θ (ζ, r, ε). ζ −z

(5.7)

For the sake of brevity we use the notation Kr,ε = Dr (1+ε) \Dr (1−ε) , Fr,ε = C\Dr (1+ε) , def

def

so that Dr (1−ε) , Kr,ε and Fr,ε provide a partition of C. Furthermore, by construction 0, if ζ ∈ Dr (1−ε) , h(ζ ) = A(ζ ) (5.8) ζ −z , if ζ ∈ Fr,ε . Consider first

h ε (x; r ) − h ε (y; r ) dμ(x)dμ(y) x−y = I (Dr (1−ε) × Dr (1−ε) ) + I (Kr,ε × Kr,ε ) + I (Fr,ε × Fr,ε ) + 2I (Dr (1−ε) × Kr,ε ) + 2I (Dr (1−ε) × Fr,ε ) + 2I (Kr,ε × Fr,ε ),

where I () means the integral in the l.h.s. taken over the set . Observe that by (5.8), I (Dr (1−ε) × Dr (1−ε) ) = 0. Let ζ ∈ Kr,ε ; since |ζ − z| |ζ − z| ∂ ζ −z ∂ 1 1 ,ε ,ε , θ (ζ ) = |ζ − z| = r r r r |ζ − z| ∂ζ ∂ζ we have

∂ 1 1 grad θ (ζ ) = θ (ζ ) ≤ . 2 rε ∂ζ

In consequence, for x, y ∈ Kr,ε , h ε (x; r ) − h ε (y; r ) const ≤ , x−y rε

(5.9)

where the constant in the right hand side is independent of ε. Obviously, by definition of h we have that this inequality is valid (with a different constant) if x ∈ Kr,ε and y lies on a compact subset of C.


From (5.9) we conclude that h (x; r ) − h (y; r ) ε ε I (Kr,ε × Kr,ε ) = dμ(x)dμ(y) Kr,ε ×Kr,ε x−y 2 const ≤ μ Kr,ε . rε Taking into account that r ∈ , we have that μ Kr,ε = 2r m (r ), lim ε→0+ ε

(5.10)

(5.11)

so by (5.10), I (Kr,ε × Kr,ε ) = o(1) as ε → 0+. Consider now x ∈ Kr,ε and y ∈ Dr,ε . Then h ε (x; r ) − h ε (y; r ) h ε (x; r ) A(x) = = θ (x). x−y x−y (x − z)(x − y) Consider two cases. If |y − z| < r (1 − 2ε), then h ε (x; r ) − h ε (y; r ) const const ≤ ≤ . x−y r (1 − ε)|x − y| r (1 − ε)(|x| − |y|) Hence, with a different constant, h ε (x; r ) − h ε (y; r ) dμ(x)dμ(y) x∈Kr,ε , |y−z|
r (1+ε) r (1−2ε) const st ≤ dsdt; r t −s r (1−ε) 0 the double integral in the r.h.s. is explicit, and it is straightforward to verify that it is o(1) a ε → 0+. If on the contrary r (1 − 2ε) < |y − z| < r (1 − ε), with x ∈ Kr,ε we can use the estimate (5.11), which yields h ε (x; r ) − h ε (y; r ) dμ(x)dμ(y) x∈Kr,ε , r (1−2ε)<|y−z|
(5.12)


and we conclude that

h ε (x; r ) − h ε (y; r ) dμ(x)dμ(y) lim ε→0+ x−y

h 0 (x; r ) − h 0 (y; r ) = dμ(x)dμ(y) x−y |x−z|≥r, |y−z|≥r

h 0 (x; r ) − h 0 (y; r ) dμ(x)dμ(y) +2 x−y |x−z|≥r, |y−z|
h 0 (ζ ; r ) =

0,

if |ζ − z| < r,

A(ζ ) ζ −z ,

if |ζ − z| > r.

(5.13)

Let us analyze the behavior of each integral as r → 0+ separately. First,

A(x) dμ(x)dμ(y). I2 (r ) = (x − z)(x − y) |x−z|≥r, |y−z|
|x−z|≥r

A(x) x − z

|y−z|
dμ(y) |x − y|

dμ(x) < +∞,

so that applying Fubini’s theorem we conclude that

A(x) dμ(y) I2 (r ) = dμ(x). |x−z|≥r x − z |y−z|
r →0+, r ∈

On the other hand, by (5.8),

I1 (r ) = |x−z|≥r, |y−z|≥r

I2 (r ) = 0.

A(x) A(y) − (x − z)(x − y) (y − z)(x − y)

dμ(x)dμ(y). (5.14)

The identity

$$A(x)(y-z)-A(y)(x-z)+A(z)(x-y)=(x-y)(x-z)(y-z)\,D(x,y,z) \tag{5.15}$$

is immediate, where $D(x,y,z)=\alpha_0(x,y)+\alpha_1(x,y)\,z+\dots+\alpha_{p-2}(x,y)\,z^{p-2}+z^{p-1}$ is a polynomial of degree ≤ p − 1 in each variable. Hence,

$$\frac{A(x)}{(x-z)(x-y)}=\frac{A(y)}{(y-z)(x-y)}-\frac{A(z)}{(x-z)(y-z)}+D(x,y,z). \tag{5.16}$$
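The identity (5.15) can be checked numerically for small p. The sketch below assumes a monic cubic A (that is, p = 2) with arbitrarily chosen roots. In that case a direct expansion suggests D(x, y, z) = x + y + z − (a0 + a1 + a2), which is linear and monic in z as required. The helper names and the sample points are hypothetical.

```python
def A_poly(t, roots):
    """Monic polynomial A(t) = prod (t - a_k) with the given roots."""
    value = 1.0 + 0.0j
    for a in roots:
        value *= (t - a)
    return value

def D_from_identity(x, y, z, roots):
    """Solve (5.15) for D(x, y, z) at pairwise distinct points x, y, z."""
    numerator = (A_poly(x, roots) * (y - z)
                 - A_poly(y, roots) * (x - z)
                 + A_poly(z, roots) * (x - y))
    return numerator / ((x - y) * (x - z) * (y - z))

def D_cubic_closed_form(x, y, z, roots):
    """Expected D for a monic cubic A (p = 2): x + y + z - (sum of the roots)."""
    return x + y + z - sum(roots)

roots = [-1.0, 0.5j, 2.0]                      # assumed example roots a_0, a_1, a_2
x, y, z = 0.3 + 0.2j, -1.1 + 0.5j, 2.0 - 0.7j  # assumed sample points
print(D_from_identity(x, y, z, roots))
print(D_cubic_closed_form(x, y, z, roots))
```

For a monic quadratic A (p = 1) the same computation gives D ≡ 1, in agreement with the stated degree bound p − 1.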


Using it in (5.14) we get that lim

r →0+, r ∈

2 I1 (r ) = D1 (z) − A(z) C μ (z) ,

where

D1 (z) =

D(x, y, z) dμ(x)dμ(y).

(5.17)

Thus, by (5.13),

lim

lim

r →0+, r ∈ ε→0+

2 h ε (x; r ) − h ε (y; r ) dμ(x)dμ(y) = D1 (z) − A(z) C μ (z) . x−y (5.18)

In a similar fashion we can analyze !

(x)h ε (x; r ) dμ(x) =

Kr,ε

+

Again estimates on Kr,ε and (5.11) show that

lim

(x)h ε (x; r ) dμ(x) = ε→0+

"

Fr,ε

|x−z|≥r

(x)h ε (x; r ) dμ(x).

(x)

A(x) dμ(x). x−z

Taking into account (5.1) we can rewrite the right hand side as

1 B(x) − B(z) 1 B(x) dμ(x) = − dμ(x) − 2 x−z 2 |x−z|≥r x−z

1 1 dμ(x). − B(z) 2 x − z |x−z|≥r Thus,

lim lim

r →0 ε→0+

(x)h ε (x; r ) dμ(x) = −

where

D2 (z) =

1 D2 (z) + B(z)C μ (z) , 2

B(x) − B(z) dμ(x) x−z

is a polynomial of degree ≤ p − 1; it is ≡ 0 if ϕ ≡ 0. Combining this last identity with (5.18) and using (3.10) we get that 2 lim lim f ϕ (μ; h ε (·; r )) = D1 (z) − A(z) C μ (z) + D2 (z) + B(z)C μ (z). r →0 ε→0+

(5.19)

(5.20)

Since (3.9) is valid for each ε > 0 and each r in the full-measure set introduced above, we obtain that the right hand side in (5.20) is 0. In consequence,

$$\bigl(A(z)\,C^{\mu}(z)\bigr)^{2}-A(z)B(z)\,C^{\mu}(z)+\frac{B^{2}(z)}{4}=A(z)\bigl(D_1(z)+D_2(z)\bigr)+\frac{B^{2}(z)}{4}.$$

Taking into account (5.1) we rewrite this condition as (5.5), with

$$R(z)\overset{\mathrm{def}}{=}\frac{D_1(z)+D_2(z)}{A(z)}+\bigl(\Phi'(z)\bigr)^{2},$$

and clearly R has a double pole at z = aj if and only if ρj ≠ 0. Finally, the normalization condition (5.3) follows from considering (5.5) as z → ∞. This establishes the assertion of the lemma.

The next lemma is a slightly modified version of [6, Lemma 4] by T. Bergkvist and H. Rullgård. The original lemma considers only positive measures μ for which $(C^{\mu})^{k}$, for certain k ∈ N, is a reciprocal of a polynomial.

Lemma 5.2. Assume that $\mu\in M_{\mathbb{R}}$ is a finite signed Borel measure on the plane whose Cauchy transform $C^{\mu}$ is such that there exist rational functions r and R with possible poles at A satisfying

$$\bigl(C^{\mu}+r\bigr)^{2}(z)=R(z)\qquad \mathrm{mes}_2\text{-a.e.} \tag{5.21}$$

Then μ is supported on a union of analytic arcs that are trajectories of the quadratic differential $\varpi(z)=-R(z)\,dz^{2}$, with possible mass points at A. If all poles of r are simple and have real residues, then additionally ϖ is a closed differential, and the number of connected components of supp(μ) is finite. Finally, if μ is a positive Borel measure, then the intersection of any ϖ-rectangle, not containing the zeros or poles of R, with supp(μ) is connected.

Proof. For the quadratic differential ϖ consider a ϖ-rectangle D (see the definition in Sect. 4), disjoint with A and not containing the zeros of R. We select in this rectangle a holomorphic branch of $\sqrt{R}$ and a distinguished parameter

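$$\xi=\xi(z)\overset{\mathrm{def}}{=}\int^{z}\sqrt{R(t)}\,dt, \tag{5.22}$$

which is a conformal mapping of D onto $\widetilde D\overset{\mathrm{def}}{=}\xi(D)$. Let us define in $\widetilde D$ the following function:

$$\chi(\xi)\overset{\mathrm{def}}{=}\operatorname{sgn}\left(\frac{C^{\mu}+r}{\sqrt{R}}\,(z(\xi))\right).$$

Hence, χ takes only two values, ±1, and for z ∈ D,

$$\bigl(C^{\mu}+r\bigr)(z)=\chi(\xi(z))\,\sqrt{R(z)}. \tag{5.23}$$

We have that, in the sense of generalized derivatives, for z ∈ D,

$$\frac{\partial}{\partial\bar z}\bigl(C^{\mu}+r\bigr)(z)=-\pi\,\mu(z),\qquad \frac{\partial}{\partial\bar z}=\frac12\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right).$$

Differentiating in (5.23) and using the chain rule, we get

$$-\pi\,\mu(z)=\frac{\partial}{\partial\bar z}\Bigl(\chi(\xi(z))\sqrt{R(z)}\Bigr)=\frac{\partial}{\partial\bar z}\bigl(\chi(\xi(z))\bigr)\sqrt{R(z)}=\frac{\partial\chi(\xi)}{\partial\bar\xi}\ \overline{\frac{\partial\xi(z)}{\partial z}}\ \sqrt{R(z)}.$$

Taking into account the definition of ξ in (5.22) we conclude that if z ∈ D and ξ = ξ(z), then

$$\frac{\partial\chi(\xi)}{\partial\bar\xi}=-\frac{\pi\,\mu(z)}{|R(z)|}. \tag{5.24}$$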

In particular, the (generalized) partial derivative of χ (ξ ) along the vertical axis is zero; if ξ = x + i y, this implies that χ (ξ ) is equivalent to a function g(x) which takes only values +1 and −1. Thus, the set ∂χ (ξ ) = 0 ξ: ∂ξ is a union of vertical arcs in the ξ -plane. From (5.24) it follows that the number of these arcs is finite. This means that the image of the support of μ in D by (5.22) is made of vertical lines, that is, supp(μ) ∩ D is a union of horizontal trajectories of . Moreover, if μ is positive, we get by (5.24) that ∂χ (ξ ) ∂ξ

≥0

$ in D,

$ In other words, supp(μ) ∩ D contains at so that χ (ξ ) changes sign at most once in D. most one single analytic arc, which is a horizontal trajectory of . Finally, if all residues of r are real, then

Re

z

μ C + r (t)dt = U μ (z) + Re

z

r (t)dt + const

is harmonic and single-valued in C\ supp(μ), which means that the trajectories of ϖ are either critical or level curves of a harmonic function. Thus, ϖ is closed.

Remark 5.2. See Fig. 3 for an illustration of the statement about the connectedness of the intersection of a ϖ-rectangle with supp(μ). Obviously, in this assertion we can replace the ϖ-rectangle by any simply-connected domain that is mapped one-to-one by ξ onto a convex set.

Remark 5.3. Observe that in Lemma 5.2 we do not assume a priori that $E_\varphi(\mu)<\infty$, so that mass points of μ are allowed.

An immediate consequence of Lemma 5.1 and Lemma 5.2 is

Corollary 5.1. If in (5.1), {ρ0, . . . , ρp} ⊂ R, then in any connected component of C\ supp(μ) we can select a single-valued branch of $\sqrt{R}$ such that the formula

$$U^{\mu}(z)+\varphi(z)=-\operatorname{Re}\int^{z}\sqrt{R(t)}\,dt \tag{5.25}$$

holds there.

5.2. Critical and reflectionless measures. By Theorem 5.1, supp(μ) is a union of analytic arcs.

Definition 5.1. We call a point z ∈ supp(μ) regular if there exists a simply connected open neighborhood B of z such that B ∩ supp(μ) is a Jordan arc.

Lemma 5.3. Let μ be an (A, ϕ)-critical measure. Then the principal value of the Cauchy transform of μ (5.4) satisfies

$$C^{\mu}(z)+\Phi'(z)=0,\qquad z\in\operatorname{supp}(\mu)\setminus A. \tag{5.26}$$

On any simple subarc of supp(μ) the measure μ is absolutely continuous with respect to the arc-length measure, and its density is given by

$$d\mu(z)=\frac{1}{\pi}\left|\sqrt{R(z)}\,dz\right|. \tag{5.27}$$

Remark 5.4. We can reformulate (5.26) as $C^{\mu}+\Phi'=0$ μ-a.e. on C.

Proof. Assume first that z ∈ supp(μ) is regular, and let B be a simply connected open neighborhood of z such that B ∩ supp(μ) is an open analytic arc not containing points of A. By Lemma 5.1,

$$\bigl(C^{\mu}(\zeta)+\Phi'(\zeta)\bigr)^{2}=R(\zeta),\qquad \zeta\in\mathbb{C}\setminus\operatorname{supp}(\mu),$$

where the Cauchy transform is understood in the strong sense (ordinary integral), so that

$$C^{\mu}(\zeta)=-\Phi'(\zeta)+\sqrt{R(\zeta)} \tag{5.28}$$

in each connected component of B\ supp(μ), with an appropriate selection of the branch of the square root. At a regular point z the boundary values of $C^{\mu}$ from both sides of supp(μ),

$$C^{\mu}_{\pm}(z)\overset{\mathrm{def}}{=}\lim_{\zeta\to z^{\pm},\ \zeta\in\mathbb{C}\setminus\operatorname{supp}(\mu)}C^{\mu}(\zeta),$$

are well defined and satisfy the Sokhotsky-Plemelj relations,

$$C^{\mu}_{+}(z)-C^{\mu}_{-}(z)=2\pi i\,\mu'(z),\qquad C^{\mu}_{+}(z)+C^{\mu}_{-}(z)=2\,C^{\mu}(z). \tag{5.29}$$

By (5.28), if the branch of $\sqrt{R}$ coincides on both sides of B ∩ supp(μ), then μ′ ≡ 0 there, which is impossible. Hence, with an appropriate selection of the branch of $\sqrt{R}$ in B,

$$C^{\mu}_{\pm}(z)=-\Phi'(z)\pm\sqrt{R(z)},\qquad z\in B\cap\operatorname{supp}(\mu),$$

and both (5.26) and (5.27) follow from (5.29). Finally, if z ∈ supp(μ)\A is not regular, it must coincide with a zero of the rational function R in (5.5). Taking into account the expression for the density of the measure we see that it vanishes at z at least as the square root of (ζ − z). Then, at this point (5.6) holds, as well as formula (5.5). This concludes the proof of (5.26).
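The relations (5.26), (5.27) and (5.29) can be illustrated on the arcsine measure of [−1, 1] with ϕ ≡ 0, for which C(z) = −1/√(z² − 1) with the branch of the square root behaving like z at infinity. The sketch below rests on assumptions made for the example: the closed form of C, the orientation of the interval from −1 to 1 with the "+" side in the upper half-plane, and R(z) = 1/(z² − 1). The function names are hypothetical, and the boundary values are approximated by evaluating at x ± iε.

```python
import cmath
import math

def cauchy_arcsine(z):
    """Closed form C(z) = -1/sqrt(z^2 - 1) off [-1, 1]; the product of principal
    square roots selects the branch with sqrt(z^2 - 1) ~ z at infinity."""
    return -1.0 / (cmath.sqrt(z - 1.0) * cmath.sqrt(z + 1.0))

def boundary_values(x, eps=1e-9):
    """Approximate boundary values C_+(x), C_-(x) from above and below the cut."""
    return cauchy_arcsine(complex(x, eps)), cauchy_arcsine(complex(x, -eps))

def density(x):
    """Density (1/pi)*|sqrt(R(x))| with R(x) = 1/(x^2 - 1), for -1 < x < 1."""
    return 1.0 / (math.pi * math.sqrt(1.0 - x * x))

x0 = 0.3                                  # assumed sample point inside (-1, 1)
c_plus, c_minus = boundary_values(x0)
print("jump / (2*pi*i):", (c_plus - c_minus) / (2j * math.pi))
print("density:        ", density(x0))
print("C_+ + C_-:      ", c_plus + c_minus)
```

The jump reproduces the density, and the sum of the boundary values vanishes, so the principal value of the Cauchy transform is zero on the support, as (5.26) predicts when Φ′ ≡ 0.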


Formula (5.26) is a direct continuous analogue of property (3.4) for the discrete critical measures, and it may be proved directly (independently from Theorem 5.1) using local variations (h ≡ 0 outside of a small neighborhood of the singularity at z). The original proof of Theorem 5.1 by the second author (unpublished) was based on a combination of Lemmas 5.1 and 5.3. The technique from [6] used in Lemma 5.2 above streamlines the arguments. Observe that for ϕ ≡ 0 we obtain from Lemma 5.3 that for any A-critical measure μ, and for any regular point z ∈ supp(μ), C μ (z) = 0, so that C μ = 0 μ-a.e. Measures with this property are called reflectionless; see [65], where they are treated in the context of the geometric function theory. Remark 5.5. For measures supported on the unit circle T = {z ∈ C : |z| = 1} we can find in literature an alternative definition of the “reflectionlessness” (see [31–33]), characterized by vanishing of the sum of the boundary values of the Carathéodory function of μ and not of its Cauchy transform. For instance, the Lebesgue measure mes1 on T is reflectionless in the sense of [32], but does not satisfy v.p. C mes1 = 0 on T. In this paper we give the “reflectionless measure” the meaning specified above. def

Reflectionless measures on R have their origin in spectral theory; they are in fact spectral measures of self-adjoint Schrödinger operators with reflectionless potentials (see e.g. [20]) and of reflectionless Jacobi operators [94]. It is immediate to show that there are reflectionless measures with infinite energy. So, the following conjecture seems natural:

Conjecture 1. For any positive reflectionless measure μ with finite energy we can find a polar set A in C such that μ is A-critical.

However, a weaker statement follows from Proposition 5.2 below: assume that μ is a reflectionless measure supported on a finite set of analytic arcs $\Gamma = \Gamma_1 \cup \dots \cup \Gamma_k$ of C, and that μ is absolutely continuous with respect to the arc-length measure. Then there exists a discrete finite set A such that this measure is A-critical.

5.3. Equilibrium conditions in terms of potentials, and the S-property. The variational requirements defining a continuous (A, ϕ)-critical measure μ impose equilibrium conditions that we discuss next. We will see that the gradient of the total potential (that is, the force) vanishes at any regular point of the support of the measure located in the conducting part of the plane. However, $\operatorname{grad}(U^\mu + \varphi)$ is not continuous across any arc in supp(μ), and we have to consider separately the force acting on an element of charge from either side of supp(μ). This leads to equality of the normal derivatives, the so-called S-property, see (5.30) below.

Lemma 5.4. The total potential of an (A, ϕ)-critical measure μ satisfies the following properties:

(i) if $\operatorname{supp}(\mu) = \Gamma_1 \cup \dots \cup \Gamma_k$, where $\Gamma_j$ are the connected components of supp(μ), then
$$U^\mu(z) + \varphi(z) = w_j = \mathrm{const}, \qquad z \in \Gamma_j, \quad j = 1, \dots, k.$$

(ii) at any regular point z ∈ supp(μ),
$$\frac{\partial}{\partial n_+}\left(U^\mu + \varphi\right)(z) = \frac{\partial}{\partial n_-}\left(U^\mu + \varphi\right)(z), \tag{5.30}$$

where $n_\pm$ are the normal vectors to supp(μ) at z pointing in opposite directions. Additionally, if z ∈ supp(μ)\A is not regular, then
$$\operatorname{grad}\left(U^\mu + \varphi\right)(z) = 0. \tag{5.31}$$
Furthermore, assume that a finite real measure μ, whose support supp(μ) consists of a union of a finite set of analytic arcs, $\operatorname{supp}(\mu) = \Gamma_1 \cup \dots \cup \Gamma_k$, satisfies conditions (i) and (ii) above. Then μ satisfies an equation of the form (5.5), where R is a rational function with possible poles at A of order ≤ 2.

Remark 5.3. As follows from the last statement of this lemma and Remark 5.2, conditions (i) and (ii) are sufficient for μ to be (A, ϕ)-critical.

Proof. Let μ be an (A, ϕ)-critical measure. From Theorem 5.1 and Lemma 5.3 it follows that the support of μ is made of analytic curves, and that μ has an analytic density at the regular points of its support. Hence, $U^\mu$ is continuous up to the boundary, and (i) is a direct consequence of Corollary 5.1 and the fact that μ lives on trajectories of the quadratic differential $-R(z)(dz)^2$. For any z ∈ C\supp(μ) we have that
$$\frac{\partial}{\partial z}\left(U^\mu + \varphi\right)(z) = \frac{1}{2}\left(C^\mu(z) + \Phi'(z)\right),$$
and this relation is inherited by the limit values on the Carathéodory boundary of C\supp(μ). Using (5.26) we conclude that at regular points of supp(μ)\A,
$$\left[\frac{\partial}{\partial z}\left(U^\mu + \varphi\right)(z)\right]_+ + \left[\frac{\partial}{\partial z}\left(U^\mu + \varphi\right)(z)\right]_- = 0.$$
Observe that for a real-valued function u, ∂u(z)/∂z coincides, up to a factor 1/2, with the (complex) gradient of u. From (i) it follows that $\operatorname{grad}(U^\mu + \varphi)$ is normal to supp(μ), so that
$$\left\|\operatorname{grad}\left(U^\mu + \varphi\right)(z)\right\| = \frac{\partial}{\partial n_\pm}\left(U^\mu + \varphi\right)(z), \qquad z \in \operatorname{supp}(\mu) \setminus A,$$
and (5.30) follows from the last two equalities. Finally, the only possible non-regular points of supp(μ)\A are the zeros of R, where the density vanishes, and we conclude (5.31).

Let us prove the converse. Assume that $\mu \in M_{\mathbb{R}}$ has a bounded support comprised of a finite number of smooth linear connected components $\Gamma_j$. Fix a regular point ζ ∈ supp(μ) (without loss of generality, $\zeta \in \Gamma_j$), and let again B be a simply connected open neighborhood of ζ such that B ∩ supp(μ) is a Jordan arc. It splits B into two disjoint domains, which we denote by $B^\pm$, so that $B \setminus \operatorname{supp}(\mu) = B^+ \cup B^-$. It follows from (i) and (5.30) that
$$U(z) \overset{\mathrm{def}}{=} \begin{cases} U^\mu(z) + \varphi(z) - w_j, & \text{if } z \in B^+, \\ -\left(U^\mu(z) + \varphi(z) - w_j\right), & \text{if } z \in B^-, \\ 0, & \text{if } z \in B \cap \operatorname{supp}(\mu), \end{cases}$$
is harmonic in B. Equivalently, C(z) = ∂U(z)/∂z is holomorphic in B. But
$$C(z) \overset{\mathrm{def}}{=} \begin{cases} C^\mu(z) + \Phi'(z), & \text{if } z \in B^+, \\ -\left(C^\mu(z) + \Phi'(z)\right), & \text{if } z \in B^-, \end{cases} \tag{5.32}$$

is continuous in B. It implies that $R(z) = \left(C^\mu + \Phi'\right)^2(z)$ is holomorphic in B and, in consequence, R is analytic at any regular point of supp(μ). Since R is obviously analytic also in C\supp(μ), we conclude that it is in fact holomorphic in C, except for the set of irregular points of supp(μ), that is, the endpoints of the arcs comprising supp(μ). By (5.31) and (5.32), R vanishes also at the irregular points of supp(μ)\A and at infinity. Since μ is a finite measure with finite energy, $C^\mu$ has a sub-polar growth at any point of C. Altogether it means that R is a rational function with possible poles at A (of order ≤ 2).

5.4. Correspondence of critical measures with closed quadratic differentials. We begin with some general remarks on (A, ϕ)-critical measures for the case (5.2), when all $\rho_k \in \mathbb{R}$ and the external field corresponds to the potential of a discrete signed real measure supported on A. According to Theorem 5.1, for any (A, ϕ)-critical measure μ there exists a closed quadratic differential $\varpi = -R(z)\,dz^2$ such that supp(μ) consists of a finite union of its trajectories. This is not a one-to-one correspondence: even in the class M the same quadratic differential may correspond to a whole family of critical measures.

Example 5.1. Let p = 0, $a_0 = 0$, and $\varphi(z) = \frac{1}{2}\log|z|$ (generated by a charge −1/2 at the origin). Then for any r > 0 the normalized angular (Lebesgue) measure $\mathrm{mes}_1$ living on the circle |z| = r is (A, ϕ)-critical. Each such measure is supported on a trajectory of the same quadratic differential $-(dz)^2/z^2$ (for a discrete analogue of this statement, see Remark 3.2).

Example 5.2. In a more general situation we can consider an A-critical measure μ for an arbitrary configuration A without external field (ϕ ≡ 0); the trajectories of the associated quadratic differential near infinity are closed Jordan curves. Select any such trajectory β containing in the bounded component of its complement both A and supp(μ), and denote by $\widehat{\mu}$ the balayage of μ onto β (see the definition in [77, § II.4]). Then $\mu_1 \overset{\mathrm{def}}{=} 2\widehat{\mu} - \mu$ is another A-critical measure with the same total mass as μ, and such that $\beta \subset \operatorname{supp}(\mu_1)$; observe that $\mu_1$ also corresponds to the same quadratic differential. The verification of this assertion is a simple exercise.

What these very basic examples have in common is that in each case when we were able to construct more than one A-critical measure associated with the same quadratic differential, closed trajectories in the support of measures were present. Moreover, we have seen that an infinite family of critical measures may correspond to the same quadratic differential. This is not the case if we restrict ourselves to critical measures with a connected complement to the support. However, even in this situation the quadratic differential can give rise to more than one (signed) critical measure, as the following example shows.

Example 5.3. Consider the quadratic differential ϖ with 4 simple poles $a_0, \dots, a_3$ at the vertices of a rectangle, and two simple zeros $v_1, v_2$, situated symmetrically at the midpoints of the longest sides of this rectangle. The trajectories of such a differential are depicted in Fig. 4. We can associate to ϖ three different A-critical measures with a connected complement of their supports. Indeed, although the critical trajectories joining poles will always belong to the support of any such critical measure, for the third component of the support we can choose any of the critical trajectories connecting both zeros $v_j$.
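Example 5.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the convention $C^\mu(z) = \int d\mu(t)/(t - z)$ for the Cauchy transform and $\Phi(z) = \frac{1}{2}\log z$ (so that ϕ = Re Φ), and verifies that $(C^\mu + \Phi')^2$ equals $1/(4z^2)$ on both sides of the circle |z| = r. The function names are hypothetical. The differential $-dz^2/(4z^2)$ has the same trajectories (circles centered at the origin) as $-(dz)^2/z^2$.

```python
import cmath
import math


def cauchy_transform_circle(z, r=1.0, n=2000):
    """Approximate C^mu(z) = int dmu(t)/(t - z) for the normalized arc measure on |t| = r.

    The trapezoid rule in the angle is used; it converges very fast
    as long as z stays away from the circle itself.
    """
    total = 0j
    for k in range(n):
        t = r * cmath.exp(2j * math.pi * k / n)
        total += 1.0 / (t - z)
    return total / n


def phi_prime(z):
    # Assumed external field: Phi(z) = (1/2) log z, hence Phi'(z) = 1/(2z).
    return 0.5 / z


def residual(z, r=1.0):
    """|(C^mu + Phi')^2 - 1/(4 z^2)|, expected to be ~0 for z off the circle."""
    lhs = (cauchy_transform_circle(z, r) + phi_prime(z)) ** 2
    return abs(lhs - 1.0 / (4.0 * z * z))
```

Calling `residual` at a point inside the circle and at a point outside it (for any radius r) should return values at the level of rounding error, which reflects the fact that every circle |z| = r carries a critical measure for the same differential.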

Fig. 4. Trajectories of the quadratic differential described in Example 5.3. Bold lines represent the critical graph of ϖ

All these examples illustrate the general difficulty of the analysis of the correspondence between quadratic differentials and critical measures. Nevertheless, we will show below that in the class of positive (A, ϕ)-critical measures μ corresponding to ϕ generated by a positive measure, the mapping associating to such a μ the quadratic differential described in Theorem 5.1 is an injection. Moreover, C\supp(μ) is connected. The assumption of positivity of the mass giving rise to ϕ is necessary, as the following example (first considered by Teichmüller in his “Habilitationsschrift” [93]) shows.

Example 5.4. Let A = {0, 1}, $\sigma = -\alpha\delta_0$, 0 < α < 1 (negative “attracting” charge). Then there exists a unique positive critical measure μ with μ(C) = 1 + α. The corresponding quadratic differential is
$$-\frac{z - c}{z^2 (z - 1)}\,(dz)^2, \qquad c = c(\alpha) \in (0, 1). \tag{5.33}$$

The support supp(μ) is the whole critical graph of this differential, which consists of the segment [c, 1] and a closed loop passing through c and enclosing the origin (see Fig. 5). Thus, we do not have critical measures with a connected complement to supp(μ). Moreover, if we now consider an external field of the opposite sign, $\widetilde{\varphi} = U^{-\sigma}$ (positive “repelling” charge), then the corresponding $(A, \widetilde{\varphi})$-critical measure $\widetilde{\mu}$ is associated with the same quadratic differential (5.33), but now $\operatorname{supp}(\widetilde{\mu}) = [c, 1]$, and $\widetilde{\mu}(\mathbb{C}) = 1 - \alpha$. We point out that this is not a mere artificial example: the whole variety of the critical measures in a similar situation appears in the asymptotic analysis of the Jacobi polynomials with varying non-standard parameters, see e.g. [47] and [60].

We present here a lemma that will allow us to isolate the cases that are of interest to us.

Lemma 5.5. Let $\mu \in M_{\mathbb{R}}$ be an (A, ϕ)-critical measure for $\varphi = U^\sigma$, $\sigma \in M_{\mathbb{R}}$, and assume that β is a closed contour contained entirely in supp(μ), delimiting the bounded domain Ω. Then
$$-\mu(\beta) = 2(\mu + \sigma)(\Omega).$$
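As a quick consistency check of the identity in Lemma 5.5 (a sketch, not part of the original argument), one can test it on Examples 5.1 and 5.4. Here Ω denotes the bounded domain delimited by β. In the second case the value of μ(β) is derived from the lemma under the assumption that μ carries no mass strictly inside the loop.

```latex
% Example 5.1: beta = {|z| = r}, mu = mes_1 on beta, sigma = -(1/2) delta_0.
\begin{align*}
\mu(\beta) = 1, \qquad (\mu + \sigma)(\Omega) = 0 - \tfrac{1}{2}
  \quad &\Longrightarrow \quad
  -\mu(\beta) = -1 = 2(\mu + \sigma)(\Omega). \\[4pt]
% Example 5.4: beta = loop through c around the origin, sigma = -alpha delta_0,
% assuming mu(Omega) = 0.
(\mu + \sigma)(\Omega) = -\alpha
  \quad &\Longrightarrow \quad
  \mu(\beta) = 2\alpha, \qquad
  \mu([c, 1]) = (1 + \alpha) - 2\alpha = 1 - \alpha .
\end{align*}
```

In both cases the mass on the closed contour is forced to be positive only because the enclosed charge σ is negative, which is the mechanism exploited in Proposition 5.1.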

Fig. 5. Critical graph of the quadratic differential given in (5.33)

Proof. Let $n_-$ be the unit outer normal vector to β, and $n_+ = -n_-$. By the Gauss theorem (see e.g. Theorem 1.1, §II.1 of [77]),
$$\frac{1}{2\pi}\int_\beta \frac{\partial}{\partial n_-}\left(U^\mu + \varphi\right)(z)\,|dz| = \frac{1}{2\pi}\int_\beta \frac{\partial}{\partial n_-} U^{\mu+\sigma}(z)\,|dz| = (\mu + \sigma)(\beta \cup \Omega),$$
$$\frac{1}{2\pi}\int_\beta \frac{\partial}{\partial n_+}\left(U^\mu + \varphi\right)(z)\,|dz| = \frac{1}{2\pi}\int_\beta \frac{\partial}{\partial n_+} U^{\mu+\sigma}(z)\,|dz| = -(\mu + \sigma)(\Omega).$$
By the S-property (5.30), both integrals on the left-hand side are equal, and the lemma follows since σ(β) = 0.

Proposition 5.1. If $\varphi = U^\sigma$, with σ ≥ 0, then (i) the support of any positive (A, ϕ)-critical measure has a connected complement; (ii) the correspondence between the positive (A, ϕ)-critical measures and the associated quadratic differentials is injective.

Proof. Statement (i) is an obvious consequence of Lemma 5.5. Let now μ be a positive (A, ϕ)-critical measure and let $\varpi = -R(z)\,dz^2$ be the quadratic differential whose critical trajectories support μ (see Theorem 5.1). We have to prove that ϖ determines μ uniquely. Let Γ be the critical graph of ϖ, and denote by Crit(ϖ) the class of all (signed) (A, ϕ)-critical measures $\widetilde{\mu}$ corresponding to ϖ and such that $\mathbb{C} \setminus \operatorname{supp}(\widetilde{\mu})$ is connected; by (i), μ ∈ Crit(ϖ). If both sides of a critical trajectory γ ⊂ Γ belong to the boundary of the same connected component of C\Γ, then exactly one of the following possibilities holds:

– γ belongs to the support of every measure from Crit(ϖ);
– γ is not contained in the support of any measure from Crit(ϖ).

Indeed, by (5.21), any measure $\widetilde{\mu} \in \operatorname{Crit}(\varpi)$ is recovered from its support using the Sokhotsky-Plemelj formulas. Since both sides of γ belong to a connected component of C\Γ, either possibility above is determined by the analytic continuation of $\sqrt{R}$: $\gamma \subset \operatorname{supp}(\widetilde{\mu})$ if and only if $\sqrt{R}$ has opposite signs on both sides of γ. This is obviously the case if any closed curve, contained in this connected component and joining both sides of γ, encloses an odd number of singular points of ϖ. As a corollary, we conclude that any critical trajectory emanating from a simple pole of ϖ must belong to the support of any $\widetilde{\mu} \in \operatorname{Crit}(\varpi)$.

Let now γ be a critical trajectory joining two different zeros of . Each side of γ is a boundary of a ring domain filled with closed trajectories of . If both sides of γ are in the boundary of the same ring domain, then considerations above apply. So, it remains to consider the case when γ is in the outer boundary of a ring domain (see e.g. the middle arc joining v1 and v2 in Fig. 4). Let γ be a closed trajectory from , and denote by λ the restriction of μ + σ contained inside γ . Since λ is by assumption a positive measure, the gradient (flux) of the potential U λ on γ is directed inwards. By continuity, this also happens on γ , so the restriction of μ + σ to γ cannot be positive. In conclusion, γ does not belong to supp(μ), and it finishes the proof of (ii). Remark 5.6. In fact, the following stronger uniqueness property holds (formulated in the notation introduced in the proof above): if for a quadratic differential , the set Crit ( ) contains more than one measure, then none of the measures in Crit ( ) is positive. This is the case, for instance, of Example 5.3. Combining Lemma 5.4 and Proposition 5.1 and following the logic of Remark 5.2 we get an addendum to Lemma 5.2, which is of independent interest: Proposition 5.2. Assume that μ ∈ M is a finite positive Borel measure on the plane whose Cauchy transform C μ is such that μ 2 C (z) = R(z) mes2 − a.e. for a rational function R. Denote by A the set of poles of R. Then μ is an A-critical measure, and it is uniquely determined by R. 6. Critical Measures and Extremal Problems Critical measures are connected in an essential way with a class of extremal problems that lies on a crossroad of the geometric function theory, approximation theory, potential theory and some other topics that will be mentioned below. We start with one of the oldest problems of that kind. 6.1. Chebotarev’s continuum. For a finite set A = {a0 , . . . , a p } in C we are interested in the continuum of minimal capacity containing A. 
More precisely, if we denote by $\mathcal{F}$ the family of all continua F ⊂ C with A ⊂ F, we seek $\Gamma \in \mathcal{F}$ such that
$$\operatorname{cap}(\Gamma) = \min_{F \in \mathcal{F}} \operatorname{cap}(F). \tag{6.1}$$

This problem was raised by Chebotarev (alternative spelling, Tchebotaröv) in a letter to Pólya (see [71]). Grötzsch [38] and Lavrentiev (or Lavrentieff) [52,53] proved that there exists a unique $\Gamma^* = \Gamma^*(A)$ satisfying (6.1), and that $\Gamma^*$ is a union of critical trajectories of a rational quadratic differential $-R(z)(dz)^2$, where $R = V^*/A$, with A given by (1.2), and $V^*(z) = \prod_{j=1}^{p-1}(z - v_j^*)$. We call this $\Gamma^*$ Chebotarev’s compact or Chebotarev’s continuum. The most recent account of the background of this problem and some of its applications can be found in [69]. Here we describe briefly (and inductively) some basic facts about the geometry of $\Gamma^*$ that will be useful in the sequel. The case p = 1 (that is, when A contains 2 points) is trivial: in this situation $\Gamma^*$ is the segment $[a_0, a_1]$ joining both points.
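The trivial case p = 1 already allows a numerical sanity check of the relation between the Robin measure and the differential. The sketch below assumes $a_0 = -1$, $a_1 = 1$, $A(z) = (z - a_0)(z - a_1)$ and $V^* \equiv 1$ (empty product), so that $R = 1/(z^2 - 1)$. It uses the standard fact that the equilibrium measure of [−1, 1] is the arcsine distribution $dt/(\pi\sqrt{1 - t^2})$ and checks that the square of its Cauchy transform reproduces R away from the segment. The function names are hypothetical.

```python
import math


def cauchy_transform_arcsine(z, n=400):
    """Gauss-Chebyshev approximation of int dmu(t)/(t - z),
    where dmu(t) = dt / (pi * sqrt(1 - t^2)) on [-1, 1]."""
    total = 0j
    for k in range(1, n + 1):
        # Chebyshev nodes carry equal weights 1/n for the arcsine measure.
        t = math.cos((2 * k - 1) * math.pi / (2 * n))
        total += 1.0 / (t - z)
    return total / n


def arcsine_residual(z):
    """|(C^mu(z))^2 - 1/(z^2 - 1)|, expected to be ~0 for z off [-1, 1]."""
    r_value = 1.0 / ((z - 1) * (z + 1))
    return abs(cauchy_transform_arcsine(z) ** 2 - r_value)
```

The check is insensitive to the sign convention chosen for the Cauchy transform, since only its square enters.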

Fig. 6. Chebotarev’s compact for three points

Fig. 7. Chebotarev’s compact for 4 points, when the zeros of $V^*$ are simple (left) or double (right)
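For three points at the vertices of an equilateral triangle, Chebotarev’s continuum is the star formed by the three straight segments joining the center with the vertices. The following sketch checks that these segments are indeed trajectories of $-R(z)(dz)^2$ with $R(z) = (z - v^*)/A(z)$, under the usual convention that the differential is real and positive along a trajectory. The normalization (vertices at the cube roots of unity, so that $A(z) = z^3 - 1$ and $v^* = 0$) and the function name are choices made for this illustration only.

```python
import cmath
import math


def differential_on_arm(k, t):
    """Value of -(z - v)/A(z) * (dz)^2 at z = t * omega^k, 0 < t < 1,
    with dz taken as the unit tangent of the k-th arm of the star."""
    omega = cmath.exp(2j * math.pi / 3)
    z = t * omega ** k
    dz = omega ** k          # direction of the segment from 0 to omega^k
    a_poly = z ** 3 - 1      # A(z) with zeros at the cube roots of unity
    v = 0                    # center of the triangle
    return -(z - v) / a_poly * dz ** 2
```

Along each arm the value reduces to $t/(1 - t^3) > 0$, so the three segments are trajectories, and they meet at the simple zero $v^* = 0$ at equal angles.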

For p = 2 (the first non-trivial case), V ∗ (z) = z − v ∗ . If a0 , a1 and a2 are collinear, say a2 ∈ [a0 , a1 ], then v ∗ = a2 , and we are in the situation of p = 1 considered above. Otherwise ∗ (A) is made of three analytic arcs, each emanating from the respective pole a j ( j = 0, 1, 2) of R, and all merging at v ∗ (see Fig. 6). Point v ∗ = v ∗ (A), as function of A, is uniquely defined, but its analytic representation is not known (not to speak of the zeros of V ∗ for p ≥ 3), see [51]. In the totally symmetric case when a j lie at the vertices of the equilateral triangle, point v ∗ is just its center, and the three curves coincide with the bisectors joining it with the vertices. For further geometric properties of ∗ and v ∗ see [51, Ch. 1]. For p = 3 (four poles), in a generic situation the zeros of V ∗ are simple, and Chebotarev’s compact consists of 5 arcs; if these zeros coalesce forming a double zero, then ∗ (A) is made of 4 arcs (see Fig. 7). When one or both zeros v ∗j coincide with a pole from A we are left in one of the cases previously considered. The number of “degenerate” situations grows fast with p, so in the sequel we restrict our attention to a generic one, when the points from A are in a general position. This notion of genericity (as opposed to some more special or coincidental cases that are possible) means in our case that all zeros of V ∗ are simple and disjoint with A. For p = 5 (6 points) we can find essentially two different configurations: the linearly ordered set (Fig. 8, left) and the branched tree (Fig. 8, right). In general, a non-degenerate

Fig. 8. Chebotarev’s compacts for 6 points

$\Gamma^*(A)$ consists of a set of linear branches, like in Fig. 7, left, and Fig. 8, left, and it may have a branch point at a zero of $V^*$ where three branches merge, like in Fig. 8, right. Thus, $\Gamma^*(A)$ is an analytic tree. It will be crucial for our study of families of positive A-critical measures below. In fact, $\Gamma^*(A)$ plays the role of the “origin” of a coordinate system on the parameter plane of the above-mentioned families of measures. Observe finally that by Lemma 5.4, Proposition 5.1 and the result of Lavrentiev, the Robin (or equilibrium) measure λ of $\Gamma^* = \Gamma^*(A)$ is a (unit) positive A-critical measure with a connected support, and that λ is the only (unit) positive A-critical measure with this property.

6.2. Cuts of minimal capacity and convergence of Padé approximants.

For $A = \{a_0, \dots, a_p\} \subset \mathbb{C}$ we denote by U(A) the class of analytic germs $f(z) = \sum_{n \ge 0} f_n z^{-n}$ at infinity that admit an analytic continuation to C\A. For a fixed germ f ∈ U(A) consider the family $\mathcal{F}(f)$ of cuts F ⊂ C which make f single-valued in their complement:
$$\mathcal{F}(f) \overset{\mathrm{def}}{=} \{F \subset \mathbb{C} : f \text{ holomorphic in } \mathbb{C} \setminus F\}.$$
We are interested in the set $\Gamma = \Gamma(f) \in \mathcal{F}(f)$ such that
$$\operatorname{cap}(\Gamma) = \min_{F \in \mathcal{F}(f)} \operatorname{cap}(F). \tag{6.2}$$

Observe that this problem is a generalization of that considered in the previous section. Approximately 40 years ago one of the hottest topics in the approximation theory of analytic functions was the problem of convergence of diagonal Padé approximants $[n/n]_f = P/Q$ for functions f ∈ U(A). In this connection J. Nuttall made a basic conjecture that the sequence $[n/n]_f$ converges (in capacity) to f ∈ U(A) in $\mathbb{C} \setminus \Gamma(f)$; he also proved his conjecture in some special cases (see review [67]). The problem was completely solved in 1986 by H. Stahl, who proved Nuttall’s conjecture in a striking generality: for closed sets A with cap(A) = 0. In particular, he established that for any f ∈ U(A) there exists an essentially unique set Γ(f) satisfying (6.2); he also characterized it in terms of the equilibrium conditions and the S-property (see Sect. 5.3). We refer the interested reader to [86–89]. We note that our concept of an A-critical measure can also be extended to compact sets A of capacity zero, but in this paper we deal with finite sets A only, an assumption that we keep in the sequel.

Fig. 9. A minimal capacity set for 4 points

For a finite set A there obviously exists only a finite number of different possible solutions (sets of minimal capacity) of (6.2), which we denote by $\Gamma_0, \dots, \Gamma_N$, in such a way that $\Gamma_0$ is Chebotarev’s continuum for A. In other words, for any f ∈ U(A), $\Gamma(f) \in \{\Gamma_0, \dots, \Gamma_N\}$. For any k = 0, . . . , N, the equilibrium measure $\lambda = \lambda_k$ of $\Gamma = \Gamma_k$ is A-critical and satisfies
$$U^\lambda(z) = \rho_k = \mathrm{const}, \qquad z \in \Gamma = \operatorname{supp}(\lambda), \tag{6.3}$$
$$\frac{\partial}{\partial n_+} U^\lambda(z) = \frac{\partial}{\partial n_-} U^\lambda(z), \qquad z \text{ at regular points of } \Gamma, \tag{6.4}$$

where $n_\pm$ are the normal vectors to supp(λ) at z pointing in opposite directions. Moreover, it follows from [87] that these conditions define λ and Γ uniquely in the given homotopic class (we spare details here). A comparison with Lemma 5.4 (for ϕ ≡ 0) shows that among all A-critical measures the Robin (or equilibrium) measures $\lambda_k$ are distinguished by the equality of the equilibrium constants: if $\Gamma_{k,1}, \dots, \Gamma_{k,M}$ are the connected components of $\Gamma = \Gamma_k$, then
$$U^\lambda\big|_{\Gamma_{k,1}} = \dots = U^\lambda\big|_{\Gamma_{k,M}}.$$

Chebotarev’s continuum $\Gamma_0$ corresponds to M = 1; the other $\Gamma_k$’s may be combinatorially characterized using the structure of $\Gamma_0$. We present an example explaining this statement.

Example 6.1. Assume p = 3, $a_0 = -x - iy$ (with x > y > 0), $a_1 = \overline{a_0}$, $a_2 = -a_0$, $a_3 = -a_1$, like in Fig. 7, left. Then there are exactly two different compact sets of minimal capacity. One is Chebotarev’s set $\Gamma_0$ (as the one depicted in Fig. 7, left), corresponding for instance to
$$f(z) = \left(\frac{(z - a_0)(z - a_1)}{(z - a_2)(z - a_3)}\right)^{1/4} \in U(A),$$
and the other, $\Gamma_1$, with two components (as in Fig. 9), corresponding to the function
$$f(z) = \left(\frac{(z - a_0)(z - a_1)}{(z - a_2)(z - a_3)}\right)^{1/2} \in U(A).$$

Observe that $\Gamma_1$ is topologically equivalent to $\Gamma_0$ with the arc connecting $v_1^*$ and $v_2^*$ removed. The quadratic differential associated with $\lambda = \lambda_1$ has a double zero v at the center of the rectangle, so that (5.25) takes the form
$$w - U^\lambda(z) = \operatorname{Re} \int_{a_0}^{z} \frac{t - v}{\sqrt{A(t)}}\,dt,$$
which is the Green function of the Riemann surface of the function $y^2 = A(x)$ with logarithmic poles at $\infty_1$ and $\infty_2$. This is also an elliptic integral of the third kind.

For a general set A with an arbitrary number of points consider Chebotarev’s continuum $\Gamma_0$ and select any arc β of $\Gamma_0$ connecting two zeros of the corresponding quadratic differential (a “zero–zero connection”). Then there exists a compact $\Gamma_1$ of minimal capacity with two connected components, which is topologically equivalent to $\Gamma_0 \setminus \beta$, whose quadratic differential exhibits a double zero instead of the pair of zeros we have selected. We can repeat this operation with any other remaining zero–zero connections, until all these connections are gone. If p = 2m − 1, the maximal number of connected components of a set of minimal capacity is m. The unique Γ with m components is again the zero level of the Green function for the Riemann surface of $y^2 = A(x)$. The compact sets $\Gamma_k$ and their Robin measures $\lambda_k$ associated with A play a central role in any investigation of the strong asymptotics for complex orthogonal polynomials, the denominators of the Padé approximants of functions from U(A), see [68].

6.3. Further connections. The central part in Stahl’s solution of the convergence problem for Padé approximants was a new method of investigation (based directly on the S-property) of the nth root asymptotics of polynomials satisfying complex non-hermitian orthogonality conditions, like those verified by the denominators of the Padé approximants to functions from U(A). This method was further developed in [36] in relation with the best rational approximations; the nth root asymptotics was obtained for complex orthogonal polynomials with respect to varying weights (i.e. depending on the degree of the polynomial). The existence of a varying weight motivates the appearance of an external field in the associated equilibrium problem.
Accordingly, the S-property in the related existence problem should be modified to include the external field too (we omit here the non-essential details). Let ϕ be a harmonic function in a domain Ω ⊂ C. For a curve Γ ⊂ Ω let $\lambda = \lambda_{\Gamma,\varphi}$ be the unit equilibrium measure on Γ in the external field ϕ, so that conditions (2.8) hold. Recall that additionally Γ has the S-property if at the regular points of supp(λ),
$$\frac{\partial}{\partial n_+}\left(U^\lambda + \varphi\right)(z) = \frac{\partial}{\partial n_-}\left(U^\lambda + \varphi\right)(z), \tag{6.5}$$

where $n_\pm$ are the normal vectors to supp(λ) at z pointing in opposite directions (it is assumed that the set of irregular points of supp(λ) has capacity zero). We call the support of such a λ an S-curve. The cornerstone of any application is the problem of existence of such a curve, which we discuss here briefly. Given a harmonic function ϕ in a domain Ω and a homotopic class $\mathcal{F}$ of curves F ⊂ Ω, we should find a curve $\Gamma \in \mathcal{F}$ with the S-property. A direct and constructive approach to the solution of this problem is based on the observation that if such a curve $\Gamma \in \mathcal{F}$ exists, its equilibrium measure $\lambda = \lambda_{F,\varphi}$ is (A, ϕ)-critical for some set A of fixed points, which depends on the definition of $\mathcal{F}$

and on the singularities of ϕ. Then (see Theorem 5.1 and Lemma 5.1) we conclude that $(C^\lambda + \Phi')^2 = R$, where R is some function, meromorphic in Ω with (usually known) poles at A and (usually unknown) zeros. These zeros are the main parameters of the problem; they must be found using a system of equations, typically in terms of periods of $\int^z \sqrt{R}\,dt$, that reflect all the given information (including the geometry of the class $\mathcal{F}$). For ϕ ≡ 0 this is basically a classical method, which goes back to Abel and Riemann (abelian integrals, see the discussion below). The existence of an external field does not change the nature of the problem, but it poses additional technical difficulties. If ϕ ≢ 0, we generally have supp(λ) ≠ Γ (compare with (6.3) and (6.4)), and finding the support of the equilibrium measure might turn out to be a formidable task even for a fixed Γ. We refer to [36] for further details. See also [59], as well as [7], where the S-problem for Ω = C and ϕ = Re(P), where P is a polynomial, was considered.

Another way to prove the existence of the S-curve independently of its construction is based on the electrostatic interpretation of the critical measures, which yields the following extremal problem. Consider the equilibrium energy $E_\varphi(\cdot)$ (see (2.5)–(2.6)) of a curve $F \in \mathcal{F}$ as a functional on $\mathcal{F}$:
$$E_\varphi[F] = E_\varphi(\lambda_{F,\varphi}) : \mathcal{F} \to \mathbb{R}.$$
Under rather general assumptions it is possible to prove that if a curve $\Gamma \in \mathcal{F}$, satisfying
$$E_\varphi[\Gamma] = \max_{F \in \mathcal{F}} E_\varphi[F] \tag{6.6}$$

exists, then it has the S-property. For ϕ ≡ 0 we have $E_\varphi[F] = -\log(\operatorname{cap}(F))$, and (6.6) is equivalent to the minimal capacity problem considered above. For ϕ ≢ 0, this is the weighted capacity (2.7) minimization (see e.g. [77]), and the method was outlined in [36] in connection with the best rational approximation of exp(−x) on [0, +∞). The discrete analogue of problem (6.6) and its connection with Jacobi polynomials was discussed in [57].

In a rather surprising twist, a completely different problem was reduced to the existence of an S-curve in [46], where the semiclassical solution of the focusing nonlinear Schrödinger equation was constructed using methods of the inverse scattering theory. The problem itself, as well as the methods of its solution, did not have a priori any visible connection with those examined in [36]. A partial explanation of the mystery is suggested by the connection of the orthogonal polynomials with the inverse scattering via the matrix Riemann-Hilbert (RH) problem (see e.g. [24] and [30]). Apparently, the corresponding RH problems considered in [36] and [46] are similar. In connection with the theory of nonlinear partial differential equations we should mention also the seminal work of Lax and Levermore [54–56], where the connection between a singular limit of the KdV equation and the energy problem with upper constraint was established.

The recently developed tools of asymptotic analysis of the RH problems of a certain class, such as the non-linear steepest descent method of Deift and Zhou (see [22–24], as well as [10]) combined with the $\bar\partial$-problem ([62,63]), have become powerful weapons in the study of the strong asymptotics of polynomials of complex orthogonality. We can find multiple examples in a series of works of Aptekarev, Baik, Deift, Kuijlaars, McLaughlin, and Miller, to mention a few (see e.g. [4,24,47,49,50,64]; this is necessarily a very partial list).
A very active topic of research that benefited greatly from these ideas is the random matrix theory (RMT), see [24], as well as a closely connected field of random particle ensembles with non-intersecting paths and other determinantal point processes [16,84].

The detailed behavior (as the degree goes to infinity) of the related orthogonal polynomials allows one to establish fine asymptotic results for the random matrices or non-intersecting paths ensembles, see e.g. [5,8,9,11–14,27–29,48] for again a very partial list. One of the main ingredients of the solution of this kind of asymptotic problem (independently of the approach we follow) is the analysis of the S-property related to the concrete situation. The total potential of the corresponding critical (or equilibrium) measure is called the g-function (see [24]). This function typically accounts for the leading term of the asymptotics, and the support of the measure is the set where the essential oscillatory behavior takes place. Hence, the construction of the g-function or of the S-curve, or even establishing the existence of the latter without finding the parameters explicitly, is an important problem. In the presence of a significant external field it has been solved so far for some particular situations. The use of the critical measures in this context may present a new approach to the problem at large.

7. Weak Limit of Zeros of Heine-Stieltjes Polynomials

We return to our original motivation, armed now with the tools developed so far, in order to analyze the possible weak limits of the polynomial solutions of (1.3). We formulate first a statement slightly more general than necessary for our problem. We are given $A = \{a_0, a_1, \dots, a_p\} \subset \mathbb{C}$, and a sequence of external fields of the form
$$\varphi_n = \operatorname{Re} \Phi_n, \qquad \Phi_n(z) = -\sum_{k=0}^{p} \frac{\rho_k(n)}{2}\,\log(z - a_k), \tag{7.1}$$

where $\rho_k(n) \in \mathbb{C}$.

Theorem 7.1. Let $\mu_n \in M_n$, $n \in \mathbb{N}$, be a discrete $(A, \varphi_n)$-critical measure corresponding to an external field (7.1). If for a subsequence $\mathcal{N} \subset \mathbb{N}$ the limits
$$\lim_{n \in \mathcal{N}} \frac{\rho_k(n)}{n} = \rho_k, \qquad k = 0, 1, \dots, p,$$
exist, then any weak-* limit point μ of the normalized measures $\{\mu_n/n\}$, $n \in \mathcal{N}$, is a continuous (A, ϕ)-critical measure with respect to the external field ϕ given by (5.1). In particular, if
$$\operatorname{Re} \sum_{k=0}^{p} \rho_k > -\frac{1}{2}, \tag{7.2}$$
then μ is a unit continuous (A, ϕ)-critical measure.

Proof. Without loss of generality, we assume that $\nu_n \overset{\mathrm{def}}{=} \mu_n/n \overset{*}{\longrightarrow} \mu$, $n \in \mathcal{N}$, where $\overset{*}{\longrightarrow}$ means convergence in the weak-* sense. By Proposition 3.2, if (7.2) holds then $\operatorname{supp}(\mu_n)$, $n \in \mathcal{N}$, are uniformly bounded, so that the set of normalized measures $\nu_n$ is weakly compact, and μ is a probability measure on C. By Remark 3.3, it is sufficient to show that for smooth functions h, condition
$$\frac{d}{dt} E_{\varphi_n}(\nu_n^t)\Big|_{t=0} = 0, \qquad n \in \mathcal{N}, \tag{7.3}$$

implies
$$\frac{d}{dt} E_\varphi(\mu^t)\Big|_{t=0} = 0.$$
Let
$$\mu_n = \sum_{k=1}^{n} \delta_{\zeta_k^{(n)}};$$
reasoning as in the proof of Lemma 3.1 we conclude that (7.3) is equivalent to the condition
$$\frac{1}{n^2} \sum_{i \ne j} \frac{h(\zeta_i^{(n)}) - h(\zeta_j^{(n)})}{\zeta_i^{(n)} - \zeta_j^{(n)}} - \frac{2}{n^2} \sum_{k=1}^{n} \Phi_n'(\zeta_k^{(n)})\, h(\zeta_k^{(n)}) = 0, \qquad n \in \mathcal{N}. \tag{7.4}$$
Observe that
$$\frac{1}{n^2} \sum_{i \ne j} \frac{h(\zeta_i^{(n)}) - h(\zeta_j^{(n)})}{\zeta_i^{(n)} - \zeta_j^{(n)}} = \lim_{\varepsilon \to 0} \iint_{|x - y| > \varepsilon} \frac{h(x) - h(y)}{x - y}\, d\nu_n(x)\, d\nu_n(y).$$
Since (h(x) − h(y))/(x − y) is continuous, standard arguments show that
$$\lim_{n \in \mathcal{N}} \frac{1}{n^2} \sum_{i \ne j} \frac{h(\zeta_i^{(n)}) - h(\zeta_j^{(n)})}{\zeta_i^{(n)} - \zeta_j^{(n)}} = \iint \frac{h(x) - h(y)}{x - y}\, d\mu(x)\, d\mu(y).$$
On the other hand, $\varphi_n/n \to \varphi$, $n \in \mathcal{N}$, locally uniformly in C\A, with ϕ given in (5.1). Since h vanishes on A, the functions $\Phi_n' h$ are continuous, and
$$\lim_{n \in \mathcal{N}} \frac{1}{n^2} \sum_{k=1}^{n} \Phi_n'(\zeta_k^{(n)})\, h(\zeta_k^{(n)}) = \lim_{n \in \mathcal{N}} \frac{1}{n} \int \Phi_n'(x)\, h(x)\, d\nu_n(x) = \int \Phi'(x)\, h(x)\, d\mu(x).$$
In consequence,
$$\lim_{n \in \mathcal{N}} \frac{1}{n^2} \left( \sum_{i \ne j} \frac{h(\zeta_i^{(n)}) - h(\zeta_j^{(n)})}{\zeta_i^{(n)} - \zeta_j^{(n)}} - 2 \sum_{k=1}^{n} \Phi_n'(\zeta_k^{(n)})\, h(\zeta_k^{(n)}) \right) = f_\varphi(\mu; h),$$
with $f_\varphi$ defined in (3.10). Using (7.4), we conclude that $f_\varphi(\mu; h) = 0$, and it remains to apply Lemma 3.1.

We consider next a sequence of pairs $(Q_n, V_n)$ of Heine-Stieltjes polynomials $Q_n$ of degree n and their corresponding Van Vleck polynomials $V_n$. One of the central results of this paper is a description of all the possible limits of the normalized zero-counting measures of the Heine-Stieltjes polynomials $Q_n$. Since the residues in (1.6) are independent of n, by applying Theorem 7.1 with ϕ ≡ 0 we get:

Corollary 7.1. Any weak-* limit point of the normalized zero counting measures $\nu(Q_n)/n$ of the Heine-Stieltjes polynomials is a unit continuous A-critical measure.

Remark 7.2. Taking into account Proposition 3.1, we could restate the last result in terms of the zero-counting measures of Heine-Stieltjes polynomials corresponding to a generalized Lamé equation with coefficients depending on n. An analogue of Theorem 7.1 has been used in [61] for the study of the weak-* limits of the normalized zero counting measures of the Heine-Stieltjes polynomials with varying positive residues and A ⊂ R.

By [82], the zeros of Van Vleck polynomials accumulate on the convex hull of A, so that the set of all Van Vleck polynomials is bounded (say, in the component-wise metric). Hence, in our consideration of a sequence of pairs $(Q_n, V_n)$ of Heine-Stieltjes polynomials $Q_n$ of degree n and their corresponding Van Vleck polynomials $V_n$ we suppose without loss of generality that there exists a monic polynomial V of degree p − 1 such that
$$\lim_{n \to \infty} V_n = V. \tag{7.5}$$

Theorem 7.3. Under assumption (7.5), the normalized zero counting measure ν(Q_n)/n converges (in a weak-* sense) to an A-critical measure μ ∈ M_1; furthermore, the quadratic differential ϖ = −(V(z)/A(z)) dz^2 is closed, the support Γ = supp(μ) consists of critical trajectories of ϖ, C\Γ is connected, and we can fix the single-valued branch of √(V/A) there by lim_{z→∞} z √(V(z)/A(z)) = 1. With this convention,

\[ \lim_{n} |Q_n(z)|^{1/n} = \exp\left( \operatorname{Re} \int^{z} \sqrt{\frac{V}{A}}(t)\, dt \right) \tag{7.6} \]

locally uniformly in C\Γ, where a proper normalization of the integral in the right hand side is chosen, so that

\[ \lim_{z\to\infty} \left( \operatorname{Re} \int^{z} \sqrt{\frac{V}{A}}(t)\, dt - \log|z| \right) = 0. \]

In other words, weak-* limits of the normalized zero counting measures of the Q_n are unit positive A-critical measures. The inverse inclusion (that any unit positive A-critical measure is a weak-* limit of the normalized zero counting measures of Heine-Stieltjes polynomials) is also valid, but it cannot be established using the methods of this paper. We plan to present the proof in a subsequent publication related to the strong asymptotics of Heine-Stieltjes polynomials. However, in the rest of the paper we identify both sets of measures.

Proof. Let μ be any weak-* accumulation point of {ν(Q_n)/n}. By Lemma 5.1, the Cauchy transform of μ must satisfy an equation of the form (5.5). Rewriting (1.3) in the Riccati form and taking limits as n → ∞ (taking (7.5) into account) we conclude that

\[ \left( C^{\mu}(z) \right)^{2} = \frac{V}{A}(z), \qquad z \notin \operatorname{supp}(\mu), \]

and Γ = supp(μ) is a union of critical trajectories of ϖ. The rest of the assertions about Γ follow from Proposition 5.1. Finally, (7.6) is a straightforward consequence of (5.5) and the fact that for any monic polynomial P, log 1/|P(z)| is the potential of its zero-counting measure.
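The last step of the proof rests on the identity (1/n) log |P(z)| = −U(z), where U is the logarithmic potential of the normalized zero-counting measure of a monic polynomial P of degree n. The following is a minimal numerical sketch of that identity only; the function names, the sample zeros and the evaluation point are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def log_potential(zeros, z):
    """U(z) = -(1/n) * sum_k log|z - zeta_k| for the normalized
    zero-counting measure placed on `zeros`."""
    zeros = np.asarray(zeros, dtype=complex)
    return -np.mean(np.log(np.abs(z - zeros)))

def nth_root_modulus(zeros, z):
    """|P(z)|^(1/n) for the monic polynomial P with the given zeros."""
    zeros = np.asarray(zeros, dtype=complex)
    coeffs = np.poly(zeros)            # monic, highest power first
    return np.abs(np.polyval(coeffs, z)) ** (1.0 / len(zeros))

# Hypothetical data: 12 zeros on [0, 1], evaluated at a point off the segment.
sample_zeros = np.linspace(0.0, 1.0, 12)
z0 = 0.3 + 0.8j
lhs = np.log(nth_root_modulus(sample_zeros, z0))   # (1/n) log|P(z0)|
rhs = -log_potential(sample_zeros, z0)             # -U(z0)
```

The two quantities agree up to rounding, which is the mechanism that turns weak-* convergence of ν(Q_n)/n into the n-th root asymptotics (7.6) away from the support.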


A. Martínez-Finkelshtein, E. A. Rakhmanov

Theorem 7.3 provides an analytic description of the weak-* limits of the zero counting measures associated to the Heine-Stieltjes polynomials. However, this description is in a certain sense implicit, since it depends on the limit V of the Van Vleck polynomials V_n, which therefore constitute the main parameters of the problem. We must complement this description with the study of the set of all possible limits V. As follows from Theorem 7.3, this can be done in two steps:

i) describing the global structure of the trajectories of closed rational quadratic differentials with fixed denominators on the Riemann sphere, and the corresponding parameters (numerators). This problem has an independent interest;

ii) extracting from this set the subset giving rise to positive unit A-critical measures.

It was mentioned in Sect. 4 that problem i) is in general very difficult. In particular, we can find critical trajectories of any homotopic type. We have seen in Sect. 5.4 that not all of them will correspond to the support of an A-critical measure, and elucidating this relation is the main step towards the complete description of the weak-* limits of the zero counting measures of Heine-Stieltjes polynomials. We start with a detailed discussion in the next section of the simplest non-trivial case of three poles (p = 2), corresponding to Heun's differential equation. For a general p the geometry becomes so complex that in this paper we just outline the main results (see Sect. 9).

8. Heun's Differential Equation (p = 2)

In this section we concentrate on the differential equation (1.3) with A(z) = (z − a0)(z − a1)(z − a2) and V(z) = z − v. Our goal is to illustrate all previously established general results in the simplest non-trivial situation. Observe that the Van Vleck polynomials now constitute a 1-parameter family, which makes the whole analysis much easier. So we introduce the quadratic differential

\[ \varpi_v = \frac{v - z}{A(z)}\, dz^2 \tag{8.1} \]

and two sets,

V := {v ∈ C : ϖ_v is closed},

as well as the Van Vleck set

V+ := {v ∈ C : v is an accumulation point of the zeros of Van Vleck polynomials}.

A direct consequence of Theorem 7.3 is that V+ ⊂ V. This inclusion is proper. Obviously, our main purpose, motivated by the analysis of the Heine-Stieltjes and Van Vleck polynomials, is to study V+ ; along this path related questions will be dealt with, such as the structure of the closed quadratic differentials of the trivial homotopic type, and the set of positive critical measures. On the other hand, general quadratic differentials and signed critical measures have an independent interest, and some results will be presented below.
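For experimentation, the Van Vleck zeros for p = 2 can be computed as eigenvalues of a small matrix. The sketch below assumes that the generalized Lamé equation can be written as A y'' + B y' = λ_n (z − v) y with B/A = Σ ρ_j/(z − a_j), constant residues ρ_j and λ_n = n(n − 1) + n Σ ρ_j; the normalization of (1.3) in the paper may differ. The function names, the poles and the residues ρ_j = 1/2 are hypothetical choices made only for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def lame_coefficients(a, rho):
    """Ascending coefficients of A(z) = prod_j (z - a_j) and of
    B(z) = A(z) * sum_j rho_j / (z - a_j)."""
    a = np.asarray(a, dtype=complex)
    A = P.polyfromroots(a)
    B = sum(r * P.polyfromroots(np.delete(a, j)) for j, r in enumerate(rho))
    return A, B

def van_vleck_zeros(a, rho, n):
    """Eigenvalues v and eigenvectors y (ascending coefficients, degree n)
    of y -> z*y - (A y'' + B y')/lam on polynomials of degree <= n."""
    A, B = lame_coefficients(a, rho)
    lam = n * (n - 1) + n * sum(rho)
    M = np.zeros((n + 1, n + 1), dtype=complex)
    for k in range(n + 1):
        e = np.zeros(k + 1)
        e[k] = 1.0                                   # the monomial z^k
        Le = P.polyadd(P.polymul(A, P.polyder(e, 2)),
                       P.polymul(B, P.polyder(e, 1)))
        col = P.polysub(P.polymulx(e), Le / lam)     # z^(n+1) term cancels for k = n
        padded = np.zeros(n + 2, dtype=complex)
        padded[:len(col)] = col
        M[:, k] = padded[:n + 1]
    return np.linalg.eig(M)

def residual(a, rho, n, v, y):
    """Largest coefficient of A y'' + B y' - lam (z - v) y."""
    A, B = lame_coefficients(a, rho)
    lam = n * (n - 1) + n * sum(rho)
    lhs = P.polyadd(P.polymul(A, P.polyder(y, 2)), P.polymul(B, P.polyder(y, 1)))
    rhs = lam * P.polymul([-v, 1.0], y)
    return np.max(np.abs(P.polysub(lhs, rhs)))

# Hypothetical example: three real poles, equal positive residues, degree 5.
a, rho, n = [-1.0, 0.0, 2.0], [0.5, 0.5, 0.5], 5
v, Y = van_vleck_zeros(a, rho, n)
```

Each column of Y holds the coefficients of a degree-n polynomial solution paired with the corresponding entry of v. Plotting v for growing n and complex poles is one way to visualize the set V+ numerically, under the stated assumptions.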


Namely, we will show that:

– the set of closed quadratic differentials (8.1) is parametrized by a family of analytic arcs, dense on the plane and joining Chebotarev's center of A (see its definition below) with infinity;

– each arc of this family represents an individual homotopic type of the quadratic differential, so that two values of the parameter v in (8.1) corresponding to different curves yield homotopic classes of critical trajectories non reducible to each other;

– the trivial homotopic type of closed quadratic differentials corresponds only to the union of three subarcs joining the set A with its Chebotarev's center. This star-shaped set is isomorphic to the set of positive A-critical measures, and coincides with V+;

– for each v ∈ V+, the zeros of the corresponding Heine-Stieltjes polynomials accumulate on the two critical trajectories of ϖ_v in (8.1).

8.1. Global structure of trajectories. Throughout this section we denote by Γ_v the critical graph of ϖ_v, that is, the set of critical trajectories of ϖ_v together with their endpoints (critical points of ϖ_v). There exists a unique v* = v*(A) ∈ C such that the critical graph Γ_{v*} is a connected set; Γ_{v*} coincides with Chebotarev's compact Γ* associated with A (see Sect. 6.1). To simplify terminology, in the context of three poles we call v* Chebotarev's center for A. Since the value v* is in many senses exceptional, along with the poles a_i, we will introduce the notation A* := A ∪ {v*} for the “exceptional set”.

We have mentioned in Sect. 4 that the global structure of the trajectories of a quadratic differential can be extremely complicated. For the quadratic differential (8.1) a certain order is imposed by the double pole at infinity with a negative residue.

Proposition 8.1. Let A(z) = (z − a0)(z − a1)(z − a2) be a polynomial with simple roots in C and v ∈ C\A*. Then the quadratic differential (8.1) has a closed critical trajectory β containing v. Let Ω be the bounded domain delimited by β. Then Ω contains at least two points from A.

Proof. Due to the local structure of the trajectories, ϖ_v has a closed trajectory freely homotopic to infinity (in other words, it is topologically identical to a circle and contains all the finite critical points of ϖ_v in the bounded component of its complement). According to Theorem 9.4 of [91], this closed trajectory is embedded in a uniquely determined maximal ring domain R_∞, swept out by homotopic closed trajectories of ϖ_v. We denote by β* the bounded connected component of the boundary of R_∞. Obviously, β* contains at least one critical point of ϖ_v, and no finite critical points of ϖ_v lie in R_∞ (that is, in the unbounded component of C\β*). We conclude that all finite critical points of ϖ_v lie either on β* or in the bounded component of its complement. If this component is empty, it means that β* contains all the critical points, and it is the Chebotarev compact for A. Otherwise, the bounded component of C\β* contains an interior point, and from the local structure of the trajectories of ϖ_v at simple poles we infer that β* cannot contain only poles of ϖ_v. Hence, v ∈ β*, and at least two of the three trajectory arcs emanating from v belong to β*. It is easy to see that either they end up at respective poles (and then again β* = Γ*), or they form a closed loop, which we call β. We call Ω the bounded domain delimited by β. There is only one trajectory emanating from v remaining, which either is recurrent or ends at a pole from A. Hence, β* cannot contain more than one pole, so that Ω contains at least two poles. This concludes the proof.


Fig. 10. Critical graph of a quadratic differential in an exterior (left) and closed interior configurations

Remark 8.1. An examination of the proof of Proposition 8.1 shows that the existence of the extremal trajectory β* containing a zero of ϖ and such that all trajectories outside β* are closed and homotopic to infinity is a fact valid for an arbitrary quadratic differential of the form (4.1).

Let v ∈ C\A*. By the theorem on the local structure of the trajectories of ϖ_v, there are three trajectories originating at v. From Proposition 8.1 it follows that two of them form a single closed loop β that splits C into two domains: the bounded component of the complement, Ω, and the unbounded one, which we denote by D. Let us denote by γ the remaining trajectory emanating from v. Observe that β and γ have a single common point, v; according to the relative position of γ with respect to β we can establish the following basic classification for the critical graph Γ_v of ϖ_v:

1) Exterior configuration: γ\{v} belongs to the unbounded component D of the complement of β (see Fig. 10, left). In this case v ∈ V (the quadratic differential is closed). One of our main results, which we discuss later, is that v ∈ V+ if and only if v has an exterior configuration.

2) Interior closed configuration: γ\{v} belongs to the bounded component Ω of the complement of β, and γ is finite (see Fig. 10, right or Fig. 11, right). In this case γ is a critical trajectory joining v with one of the poles a_j, and v ∈ V (the quadratic differential is closed).

3) Interior recurrent configuration: γ\{v} belongs to the bounded component Ω of the complement of β, and γ is not finite (see Fig. 11, left). In this case γ (in fact, all trajectories in Ω) is a recurrent trajectory, dense in Ω, and v ∉ V.

In order to prove that our classification exhausts all the possibilities it is convenient to single out the following simple statement:

Proposition 8.2. The quadratic differential ϖ_v is closed if and only if there exists one critical trajectory of ϖ_v.

Proof. Obviously, we only need to prove sufficiency. Assume that γ is a critical trajectory joining for instance v and a1. In this case, taking into account the residue at infinity, we conclude that

\[ \operatorname{Re} \int_{a_2}^{a_3} \sqrt{\frac{t - v}{A(t)}}\, dt = 0 \]


Fig. 11. Trajectories of a quadratic differential in an interior recurrent (left) and interior closed configurations

if we integrate along a simple arc in C\γ joining both poles. This means that

\[ \operatorname{Re} \int_{a_2}^{z} \sqrt{\frac{t - v}{A(t)}}\, dt \]

is a harmonic function on the Riemann surface R of y^2 = (x − a2)(x − a3) with two cuts along the lift of γ to R. Reasoning as in [36] we conclude that a zero level curve of this function connects a2 and a3, and its projection on C constitutes the second critical trajectory of ϖ_v. Since the critical graph Γ_v of ϖ_v is a compact set, the differential is closed. The remaining case is analyzed in a similar way, and this concludes the proof.

Recall that by construction (see Proposition 8.1), β is part of the boundary β* of the maximal ring domain R_∞ swept out by homotopic closed trajectories of ϖ_v. Hence, in the case of an exterior configuration, γ ⊂ β* is a critical trajectory. By Proposition 8.2, v ∈ V. An analogous conclusion is obtained if γ\{v} ⊂ Ω is critical. Finally, assume that γ\{v} ⊂ Ω is recurrent. According to Corollaries (1) and (2) of Theorem 11.2 in [91], its limit set is a domain bounded by the closure of a critical trajectory. Since in this case no critical trajectories can exist in Ω, we conclude that the closure of the limit set of γ is Ω, which concludes the proof of the statements above.

Let us summarize part of our conclusions in the following statement:

Theorem 8.2. Let v ∈ V\A*. Then the critical graph Γ_v of ϖ_v is the union of three critical trajectories, Γ_v = α ∪ β ∪ γ, such that β = β(v) is a closed loop that contains v, α = α(v) is an arc in the bounded component Ω of C\β joining two poles from A, and γ = γ(v) connects the remaining pole with the zero v. The set C\Γ_v is the union of two disjoint domains:

– the bounded component Ω\Γ_v of C\Γ_v is a ring domain bounded either by α and β (for an exterior configuration) or by Γ_v (for an interior configuration);

– the unbounded component D\Γ_v = R_∞ of C\Γ_v is a disc domain in C, bounded either by β and γ (for an exterior configuration) or by β (for an interior configuration).

For what follows we will fix an orientation of both arcs α and γ. For instance, we can agree that α goes from a_i to a_j if i < j, and γ always goes from a pole to v.


8.2. Homotopic type of a closed differential. The partition of the plane by the critical graph Γ_v, associated to the closed quadratic differential ϖ_v and described in Theorem 8.2, allows us to introduce a geometric characterization of the trajectories of ϖ_v “at large”. For v ∈ V we define the homotopic type of ϖ_v as the free homotopic class in C\A of any of its closed trajectories in Ω\Γ_v. Any such trajectory is a closed Jordan curve in C\A containing exactly two points from A (say, a1 and a2) in its interior, and leaving in the exterior the third pole, a0. We can further think that the (Carathéodory) boundary component given by the “two-sided α” belongs to this homotopy class (we think of the two-sided α as the closed curve with a1 and a2 in its interior). Then, without loss of geometric information, we can identify the homotopic class of the two-sided α with the homotopic class of α itself, considered now as an arc with fixed endpoints a1 and a2 in C\{a0}.

Thus, for v ∈ V\A* the homotopic type of ϖ_v has a combinatorial component (namely, which two of the three poles from A are joined by α) and a geometric one (given by the homotopy of α in the punctured plane with the third pole removed). It follows from the general theory that there exists a one-(real) parametric family of closed differentials ϖ_v with the prescribed homotopic type. One standard way to parametrize such a family is by the ϖ_v-lengths of the trajectories in Ω\Γ_v; another one is based on the ϖ_v-lengths of the conjugate (orthogonal) trajectories. More precisely, recall from Sect. 4 that the ϖ_v-length of a curve τ is

\[ \|\tau\|_{v} = \frac{1}{\pi} \int_{\tau} \left| \sqrt{\frac{t - v}{A(t)}} \right| |dt|. \]

By definition of the (horizontal) arc, given v ∈ V, all closed trajectories of ϖ_v in Ω\Γ_v have the same ϖ_v-length, equal to

\[ l_v \stackrel{\text{def}}{=} 2 \|\alpha\|_{v}. \]

Moreover, l_v is the minimum of the ϖ_v-length of a closed Jordan curve separating the boundary components of Ω\Γ_v; it is called the length of the circumferences of the cylinder associated with Ω\Γ_v (see [91, Chap. VI]). The conjugate value

\[ h_v \stackrel{\text{def}}{=} \inf \left\{ \|\tau\|_{v} : \tau \text{ connects boundary components of } \Omega \setminus \Gamma_v \right\} \]

is called the height of the cylinder associated with Ω\Γ_v. Again, it coincides with the ϖ_v-length of any arc of orthogonal trajectory connecting the boundary components of Ω\Γ_v.

Lemma 8.1. Let α be a Jordan arc lying in C\A (except for its endpoints) and connecting two poles from A. Then for any value h > 0 there exists a unique v ∈ V such that ϖ_v has the homotopic type α, and h_v = h.

This lemma may be proved by reduction to a general Theorem 21.1 from [91, §21] on the existence of finite differentials. Our quadratic differentials ϖ_v are not finite (due to the double pole at infinity), but they can be approximated by finite differentials in a way preserving essential characteristics.

The same family of differentials of the homotopic type α may be parametrized alternatively by the length l_v of the circumferences of the cylinder associated with Ω\Γ_v. In this case each homotopic type has a minimal (strictly positive) admissible length:


Lemma 8.2. Let α be a Jordan arc lying in C\A (except for its endpoints) and connecting two poles from A. Define

L = L(α) := inf{l_v : v ∈ V and ϖ_v has the homotopic type α}.

Then L > 0 and for any l > L there exists a unique v ∈ V such that ϖ_v has the homotopic type α, and l_v = l.

Lemma 8.2 may be reduced to Lemma 8.1 or derived from the general existence theorems related to the moduli problem, see [91, §21].

8.3. Correspondence between closed differentials and A-critical measures. We have proved in Sect. 5 that for any signed A-critical measure μ with μ(C) = 1 there exists a closed quadratic differential ϖ_v in terms of which the measure and its potential may be analytically expressed. In general, this is not a bijection, since many critical measures correspond to the same quadratic differential. It follows from the proof of Proposition 5.1 that for p = 2 a one-to-one correspondence between closed differentials and signed critical measures is restored if we consider the A-critical measures μ with the additional property that C\supp(μ) is connected. This subclass is the most important for applications.

Let v ∈ V\A*. Obviously, √(V/A), with V(z) = z − v, has a single-valued branch in C\(α ∪ γ); the critical trajectories α = α(v) and γ = γ(v) were introduced in Theorem 8.2. We fix the branch by requiring that lim_{z→∞} z √(V/A)(z) = 1. Next we choose the positive (anti-clockwise) orientation in a neighborhood of infinity, which induces an orientation on each side of α and γ. We denote by the subindex “+” the boundary value of a function at α and γ from the side where the induced orientation matches the given orientation of each arc (see the remark after Theorem 8.2). With this convention, and taking into account that α and γ are trajectories of ϖ_v, we conclude that

\[ d\mu_v(z) \stackrel{\text{def}}{=} \frac{1}{\pi i} \left( \sqrt{\frac{z - v}{A(z)}} \right)_{+} dz \tag{8.2} \]

defines a signed real measure on α ∪ γ. Moreover, taking into account the residue at infinity, we see that ∫ dμ_v = 1. Hence, we have proved the following

Proposition 8.3. For any v ∈ V\A* there exists a unique signed A-critical measure μ_v with μ_v(C) = 1, such that supp(μ_v) = α ∪ γ. Furthermore, μ_v is absolutely continuous on supp(μ_v) with respect to the arc-length measure, and formula (8.2) holds.

Remark 8.3. This construction can be extended in a natural way to Chebotarev's compact (v = v*) and to the degenerate cases when v ∈ A; in these situations, the measure μ_{v*} is positive.

Hence, v → μ_v is a mapping from V into the set of signed unit measures on C. We introduce next an analytic function that will allow us to study the structure of the set V of points v ∈ C such that ϖ_v is closed. Let v0 ∈ V\{v*}; recall that α = α(v0) joins the two poles of ϖ_{v0} not connected with v0 by a critical trajectory. In a simply connected neighborhood of v0, disjoint with α,

\[ w(v) \stackrel{\text{def}}{=} \int_{\alpha} \left( \sqrt{\frac{t - v}{A(t)}} \right)_{+} dt \tag{8.3} \]


is analytic in v, single-valued, and

\[ w'(v_0) = i \int_{\alpha} \frac{1}{\sqrt{(t - v_0) A(t)}}\, dt \neq 0, \tag{8.4} \]

since this is a period of a holomorphic differential on the elliptic Riemann surface of the algebraic function y^2 = (t − v0)A(t). This construction defines an analytic and multi-valued function w in C, with w' ≠ 0; however, formula (8.3) allows one to specify a single-valued branch of w in a neighborhood of a point only in C\{v*}.

Proposition 8.4. For every v0 ∈ V\{v*} there exists a neighborhood B of v0 such that V ∩ B contains an analytic arc ℓ passing through v0, and such that the homotopic class of the trajectories of ϖ_v for v ∈ ℓ is invariant.

Proof. In a small and simply-connected neighborhood B of v0 consider the branch of w given by formula (8.3), so that w(v0) = πi(1 − μ(γ(v0))). Since w'(v0) ≠ 0, the level curve

ℓ := {v : Re w(v) = 0}

is well defined in B, and constitutes an analytic arc passing through v0. Clearly, if v ∈ B ∩ V is such that the homotopic classes of the trajectories of ϖ_v and ϖ_{v0} are the same, then necessarily v ∈ ℓ. Conversely, assume v ∈ ℓ ∩ B, and consider the trajectory of ϖ_v that emanates from the same pole as γ(v0). Due to the continuity of the level curves of

\[ \operatorname{Re} \int^{z} \sqrt{\frac{t - v}{A(t)}}\, dt \]

with respect to a variation of v, this trajectory is either critical (and then the proposition is proved) or recurrent. In the latter case it must intersect the orthogonal trajectory of ϖ_v starting from v (see [91, §11]), which contradicts the hypothesis that Re w(v) = 0.

Now we can describe completely the structure of the set V:

Theorem 8.4. The set V is a union of a countable number of analytic arcs ℓ_k, k ∈ Z, each connecting v* and ∞. Two arcs from V are either identical or have v* as the only finite common point. The homotopic type of the critical trajectories of ϖ_v in C\A remains invariant on each arc ℓ_k\A*. There are three distinguished arcs ℓ_k, k ∈ {0, 1, 2}, such that

(i) ℓ_k connects v* with infinity and passes through a_k;

(ii) for every v ∈ ℓ_k the homotopic class of trajectories of the closed quadratic differential ϖ_v is trivial;

(iii) the function μ_v(γ(v)) is monotonically decreasing from μ_{v*}(γ(v*)) to −∞ as v travels ℓ_k from v* to ∞.

Proof. Assume that the a_j are not collinear (the analysis of the collinear situation is simpler). For v0 = a0 ∈ V, the arc γ(v0) vanishes, and α(v0) = [a1, a2] is the straight segment joining a1 and a2, so that we can fix the single-valued branch of w in a neighborhood of a0 by

\[ w(a_0) = \int_{[a_1, a_2]} \left( \frac{1}{\sqrt{(t - a_1)(t - a_2)}} \right)_{+} dt = \pi i. \tag{8.5} \]


Denote by H the half plane containing a0 and determined by the straight line passing through a1 and a2. Then

\[ w(v) = \int_{[a_1, a_2]} \left( \sqrt{\frac{t - v}{A(t)}} \right)_{+} dt, \]

and (8.5) determines the single-valued branch of w in H. Let ℓ_0 be the level curve

ℓ_0 := {z ∈ H : Re w(z) = 0}.

We have that

\[ w(v^*) = \pi i\, \mu_{v^*}\left( \Gamma^* \setminus \gamma(v^*) \right) \]

(Γ*\γ(v*) is the union of two arcs of the Chebotarev compact joining v* with a1 and a2). Hence, v* ∈ ℓ_0. Since the rotations and translations of the plane do not affect the character of the level curves of w, we can assume that both a1, a2 ∈ R. Then it is immediate to see that ℓ_0 can intersect R at a single point (which belongs to the segment [a1, a2]). The other end of ℓ_0 must diverge to infinity. This establishes the existence and properties of the distinguished arcs ℓ_k, k ∈ {0, 1, 2}, described above.

Consider now the analytic function w in the infinite sector delimited by two contiguous distinguished arcs ℓ_k, k ∈ {0, 1, 2}. Fix there a v0 ∈ V and take the single-valued branch determined by the condition

\[ w(v_0) = \int_{\alpha(v_0)} \left( \sqrt{\frac{t - v_0}{A(t)}} \right)_{+} dt = \pi i\, \mu_{v_0}(\alpha(v_0)). \]

Then the level curve ℓ = {Re w(v) = 0} is an analytic curve passing through v0, which can intersect the boundary of the sector only at v*. Hence, ℓ joins v* with ∞. The number of different curves is given by the number of different homotopic types of closed trajectories, which is countable. This concludes the proof.

Remark 8.5. As v approaches v* along an arc ℓ_k ⊂ V, the support supp(μ_v) = α ∪ γ tends to the Chebotarev set Γ*, but possibly covered several times, in accordance with the homotopy class of ϖ_v on ℓ_k.

Finally, it is convenient to consider another independent parametrization of the measures μ_v, v ∈ V, in order to connect the characteristics of their logarithmic potential U^{μ_v} with the geometrically defined values of the corresponding quadratic differential ϖ_v. Applying Lemma 5.4 we get

Lemma 8.3. Any measure μ_v, v ∈ V, is characterized by the following property: U^{μ_v} is constant on each connected component of supp(μ_v),

\[ U^{\mu_v}(z) \equiv c_\alpha \ \text{ for } z \in \alpha, \qquad U^{\mu_v}(z) \equiv c_\gamma \ \text{ for } z \in \gamma, \]

and at any regular point of supp(μ_v),

\[ \frac{\partial U^{\mu_v}(z)}{\partial n_+} = \frac{\partial U^{\mu_v}(z)}{\partial n_-}, \tag{8.6} \]

where n_± are the normal vectors to supp(μ_v) pointing in the opposite directions. Moreover, we have the following relations:

\[ l_v = 2\mu_v(\alpha) = 2\left(1 - \mu_v(\gamma)\right) = \frac{2}{\pi i}\, w(v), \qquad h_v = \pi |c_\alpha - c_\gamma| = \pi (c_\alpha - c_\gamma). \]


8.4. Positive A-critical measures. Theorem 8.4 gives a complete description of the set V of points v that make the quadratic differential ϖ_v in (8.1) closed. By Proposition 8.3, to every v ∈ V there corresponds a unique signed A-critical measure μ_v, given by formula (8.2). Our next goal is to isolate the subset

V̂+ := {v ∈ V : μ_v is positive}. (8.7)

Proposition 8.5. Let v ∈ V. The measure μ_v is positive if and only if either v = v* or v is in an exterior configuration.

See Sect. 8.1 for the definition of the exterior configuration.

Proof. The case v = v* is trivial, so let us assume that v ≠ v*. Consider

\[ u(z) = \operatorname{Re} \int_{v}^{z} \sqrt{\frac{t - v}{A(t)}}\, dt, \qquad \tilde{u}(z) = \operatorname{Im} \int_{v}^{z} \sqrt{\frac{t - v}{A(t)}}\, dt. \]

Since α = α(v) and γ = γ(v) are trajectories of ϖ_v, the function u is single-valued and harmonic in C\Γ (with Γ = supp(μ_v) = α ∪ γ), continuous up to the boundary, and the closed trajectory β (see Proposition 8.1) is its zero level curve. By the selection of the branch of the square root, u(z) ∼ log |z| as z → ∞, and we see that u(z) > 0 for z ∈ C\(Ω ∪ γ), and u(z) < 0 for z ∈ Ω\Γ. In consequence,

\[ \frac{\partial}{\partial n} u(z) > 0 \ \text{ on } \alpha, \]

and on γ,

\[ \frac{\partial}{\partial n} u(z) \begin{cases} > 0, & \text{if } v \text{ is in an exterior configuration}, \\ < 0, & \text{if } v \text{ is in an interior configuration}, \end{cases} \]

where ∂/∂n denotes the derivative in the sense of the outer normals. By the Cauchy-Riemann equations,

\[ \frac{\partial}{\partial s} \tilde{u}(z) = \frac{\partial}{\partial n} u(z), \]

where ∂/∂s is the derivative along each shore of the cuts α and γ in the direction of the induced orientation. Hence, we conclude that μ_v|_α is always positive, while μ_v|_γ is negative if and only if v is in an interior configuration.

Remark 8.6. Observe that we have proved that always μ_v|_α > 0. For v ≠ v*, by construction μ_v(α) + μ_v(γ) = 1, and μ_v does not change sign on each connected component of Γ, so that μ_v is positive if and only if μ_v(α) ≤ 1. An equivalent condition can be stated in terms of the ϖ_v-length of the critical trajectories α and γ:

\[ \mu_v \geq 0 \quad \Leftrightarrow \quad \|\alpha\|_{v} + \|\gamma\|_{v} = 1. \tag{8.8} \]

Remark 8.7. Figure 3 illustrates that only in the exterior configuration the ϖ_v-rectangles intersect the support of μ_v only once (cf. Lemma 5.2).

Proposition 8.5 provides an “implicit” geometric description of the set V̂+. Our main result of this section describes this set completely:


Theorem 8.8. Let ℓ_k, k ∈ {0, 1, 2}, be the distinguished arcs in V described in Theorem 8.4. The set V̂+ is the union of the sub-arcs ℓ_k^+ of each ℓ_k, k ∈ {0, 1, 2}, connecting a_k with Chebotarev's center v* (and lying in the convex hull of A).

Furthermore, let us denote m_k := μ_{v*}(Γ_k*), k ∈ {0, 1, 2}, where Γ_k* is the arc of the Chebotarev compact Γ* connecting v* with a_k (m_0 + m_1 + m_2 = 1). If v ∈ ℓ_k ∩ V̂+, k ∈ {0, 1, 2}, then the trajectory γ(v) connects v with the pole a_k and

\[ 0 \leq \mu_v(\gamma(v)) \leq m_k. \tag{8.9} \]

In this case both trajectories γ(v) and α(v) are homotopic to a segment. The bijection μ_v(γ(v)) ↔ v is a parametrization of the set ℓ_k ∩ V̂+ by points of the interval [0, m_k].

Proof. Straightforward estimates show that v → μ_v(α(v)) is unbounded on each arc ℓ_k ⊂ V. Furthermore,

\[ \mu_v(\alpha(v)) = 1 \quad \Leftrightarrow \quad \mu_v(\gamma(v)) = 0 \quad \Leftrightarrow \quad v \in A. \]

Since always μ_v(α(v)) > 0 (see Remark 8.6), we conclude that μ_v(α(v)) takes values in (0, 1) only on the portions of the distinguished arcs ℓ_k, k ∈ {0, 1, 2}, joining the Chebotarev center v* with each pole. For any other arc ℓ_k, μ_v(α(v)) > 1, and μ_v is not positive.

See the illustration of the correspondence between the position of v on V and the trajectories of ϖ_v in Fig. 12.

Corollary 8.1. Any positive A-critical measure μ_v has the trivial homotopy type.

Remark 8.9. Taking into account this corollary and the definition of μ_v we can rewrite (8.9) in the following equivalent form:

\[ \frac{1}{\pi i} \int_{\alpha(v)} \left( \sqrt{\frac{z - v}{A(z)}} \right)_{+} dz = \frac{1}{\pi i} \int_{a_{j_1}}^{a_{j_2}} \left( \sqrt{\frac{z - v}{A(z)}} \right)_{+} dz \in [1 - m_k, 1], \qquad k \in \{0, 1, 2\}, \]

where j_1 = min({0, 1, 2}\{k}), j_2 = max({0, 1, 2}\{k}), and we integrate along the straight segment joining a_{j_1} and a_{j_2}. This system of equations defines the set V̂+ completely.

Finally, in relation with our primary goal of the description of the weak-* asymptotics of the zeros of Heine-Stieltjes and Van Vleck polynomials we state the following important result, which however will not be proved completely in this paper.

Theorem 8.10. V+ = V̂+.

The inclusion V+ ⊂ V̂+ follows from the definition of V+ and Theorem 7.3: if v ∈ V+, then μ_v is a limit distribution of the zero counting measures of the Heine-Stieltjes polynomials, so that μ_v ≥ 0. The inverse inclusion (μ_v ≥ 0 ⇒ v ∈ V+) is also valid, but it cannot be established using the methods of this paper. We plan to present the proof in a subsequent publication related to the strong asymptotics of Heine-Stieltjes polynomials. However, as a consequence of Theorem 8.10, we identify the set V+ of accumulation points of the zeros of Van Vleck polynomials with the values of v ∈ V making μ_v ≥ 0.


Fig. 12. Position of v on 0 ∪ 1 ∪ 2 (left) and the corresponding trajectories of the differential v in (8.1)


Remark 8.11. Although V+ and Chebotarev's continuum Γ* are topologically identical and, according to numerical experiments carried out by B. Shapiro, metrically very close, they are not the same (as a consequence, the conjecture made in [83] is false). For simplicity, take a0 = 0, a1 = 1, Im a2 > 0, and define for z in the upper half plane close to the origin

\[ f_1(z) = \int_0^z \sqrt{\frac{z - t}{A(t)}}\, dt, \qquad f_2(z) = \int_0^z \sqrt{\frac{v^* - t}{A(t)}}\, dt, \]

where we integrate along segments joining 0 with z. The change of variables t → zu in the integrand of f_1 yields the asymptotic expansion

\[ f_1(z) = \frac{\pi}{2\sqrt{a_2}}\, z \left( 1 + \frac{a_2 + 1}{8 a_2}\, z + O(z^2) \right), \qquad z \to 0. \]

On the other hand, using the asymptotic expansion of the integrand of f_2 we get

\[ f_2(z) = 2 \sqrt{\frac{v^* z}{a_2}} \left( 1 + \frac{1}{6} \left( 1 + \frac{1}{a_2} - \frac{1}{v^*} \right) z + O(z^2) \right), \qquad z \to 0, \]

where we take the main branch of √z. Observe that Im f_1 = 0 defines locally the set V, while Im f_2 = 0 corresponds to the Chebotarev compact of A. Assuming that both curves are tangent at the origin we conclude that v*/√a2 > 0, so that v* lies on the bisector of the interior angle formed by 1 and a2 at the origin. In order to check the second order tangency, we can invert the mapping y = f_1(z) and analyze F(y) = f_2(f_1^{-1}(y)), which maps the real line into itself at the origin. We get

\[ F(y) = \left( \frac{8}{\pi} \frac{v^*}{\sqrt{a_2}} \right)^{1/2} y^{1/2} + \frac{1}{6} \left( \frac{\sqrt{a_2}}{2\pi^3 v^*} \right)^{1/2} \left( \frac{5(a_2 + 1) v^*}{a_2} - 8 \right) y^{3/2} + O(y^{5/2}). \]

Setting v* = s√a2, s > 0, we get

\[ \frac{(a_2 + 1) v^*}{a_2} = \left( \sqrt{a_2} + \frac{1}{\sqrt{a_2}} \right) s, \]

which is real only if |a2| = 1. This shows that at a0 both curves V and Γ*, however close, are not identical, at least when the triangle with vertices at A is not isosceles.

9. General Families of A-Critical Measures

Most of the arguments presented in Sect. 8 for the case p = 2 may be carried over to the case of an arbitrary p with minor modifications. However, in a certain sense the multidimensional case is significantly more complicated. The length of this paper does not allow us to develop the whole theory, covering both signed A-critical measures and closed quadratic differentials of an arbitrary homotopic type. In turn, without such a theory it is more complicated to separate positive A-critical measures from the signed ones. So, we restrict ourselves here to a less ambitious goal allowing a shorter treatment: we put forward a constructive characterization of the positive A-critical measures. We prove that the constructed measures are indeed positive, but the complete proof of the fact that there are no other positive A-critical measures is the subject of a forthcoming paper.
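Returning briefly to Remark 8.11, the expansion of f_1 can be checked by hand. The following worked computation is a sketch under the conventions of that remark (a0 = 0, a1 = 1, so A(t) = t(t − 1)(t − a2), and the substitution t = zu); the choice of the branch of the square root is assumed to be the one that makes the leading coefficient equal to π/(2√a2).

```latex
\begin{align*}
f_1(z) &= \int_0^z \sqrt{\frac{z-t}{A(t)}}\,dt
        = z \int_0^1 \sqrt{\frac{1-u}{u}}\;
          \frac{du}{\sqrt{(1-zu)(a_2-zu)}}, \\
\frac{1}{\sqrt{(1-zu)(a_2-zu)}}
       &= \frac{1}{\sqrt{a_2}}
          \Bigl(1 + \frac{a_2+1}{2a_2}\,zu + O(z^2)\Bigr), \\
\int_0^1 \sqrt{\frac{1-u}{u}}\,du &= \frac{\pi}{2},
\qquad
\int_0^1 \sqrt{u(1-u)}\,du = \frac{\pi}{8}, \\
f_1(z) &= \frac{\pi}{2\sqrt{a_2}}\,z
          \Bigl(1 + \frac{a_2+1}{8a_2}\,z + O(z^2)\Bigr),
\qquad z \to 0 .
\end{align*}
```

The coefficient (a2 + 1)/(8 a2) arises as (a2 + 1)/(2 a2) multiplied by the ratio (π/8)/(π/2) = 1/4 of the two Beta integrals.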


9.1. Mappings generated by periods of rational quadratic differentials. Let us recall the notation. We have the fixed set A = {a0 , a1 , . . . , a p } of distinct points on C, A(z) = p j=0 (z − a j ), V (z) = def

p−1

(z − v j ),

R(z) = def

j=1

V (z) , and (z) = −R(z) (dz)2 A(z)

(9.1)

is a rational quadratic differential on the Riemann sphere C. Zeros v j ’s of V are not def necessarily simple, and we denote v = {v1 , . . . , v p−1 } ∈ C p−1 with account of their multiplicity. Let also V = {v : is closed}. def

Occasionally, it is convenient to consider $\varpi$ as a differential form $\frac{1}{i}\sqrt{R(z)}\,dz$ on the Riemann surface $\mathcal{R}$ of $\sqrt{R}$, or equivalently, on the hyperelliptic surface of genus $p - 1$ given by $w^2 = A(z)V(z)$. Let $\Gamma = \gamma_1 \cup \dots \cup \gamma_p$ be a set consisting of $p$ disjoint arcs $\gamma_k$, each one connecting a pair of points from $A \cup v$ in such a way that $\mathbb{C} \setminus \Gamma$ is connected and $\sqrt{R}$ is holomorphic in $\mathbb{C} \setminus \Gamma$. The Carathéodory boundary of $\mathbb{C} \setminus \Gamma$ consists of $p$ components $\widehat{\gamma}_k := \gamma_k^+ \cup \gamma_k^-$, enclosing the endpoints of $\gamma_k$. We can consider $\widehat{\gamma}_k$ as cycles in $\mathbb{C} \setminus \Gamma$ with a positive orientation with respect to $\mathbb{C} \setminus \Gamma$. Part of $\mathcal{R}$ over $\mathbb{C} \setminus \Gamma$ splits into two disjoint sheets, so we may consider $\widehat{\gamma}_k$ as cycles on $\mathcal{R}$. Let us define

$$w_k(v) = w_k(v, \Gamma) := \frac{1}{2\pi i} \oint_{\widehat{\gamma}_k} \sqrt{R(z)}\,dz, \quad k = 1, \dots, p, \tag{9.2}$$

where $\sqrt{R}$ on $\widehat{\gamma}_k$ are the boundary values of the branch of $\sqrt{R}$ in $\mathbb{C} \setminus \Gamma$ defined by $\lim_{z \to \infty} z\sqrt{R(z)} = 1$. Clearly, the boundary values $(\sqrt{R})_\pm$ on $\gamma_k^\pm$ are opposite in sign. Therefore, with any choice of orientation of $\gamma_k$ and a proper choice of $\sqrt{R} = (\sqrt{R})_+$ on $\gamma_k$, we will have

$$w_k(v) = \frac{1}{\pi i} \int_{\gamma_k} \sqrt{R(z)}\,dz, \quad k = 1, \dots, p. \tag{9.3}$$

By the Cauchy residue theorem we have that $w_1 + \dots + w_p = 1$ for any $v \in \mathbb{C}^{p-1}$. Thus, we can restrict the mapping $v \to w$ to $p - 1$ components of $w := (w_1, \dots, w_{p-1}) \in \mathbb{C}^{p-1}$. In this way, we have defined the mapping $P(\cdot, \Gamma) : \mathbb{C}^{p-1} \to \mathbb{C}^{p-1}$ such that

$$P(v, \Gamma) = w(v, \Gamma). \tag{9.4}$$
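The normalization $w_1 + \dots + w_p = 1$ can also be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes that the cycles are oriented so that their sum is homologous to a large counterclockwise circle, on which the branch of $\sqrt{R}$ normalized by $z\sqrt{R(z)} \to 1$ can be written explicitly. The sample points `a` and `v` and the helper names `sqrt_R_outside` and `total_period` are illustrative assumptions.

```python
import numpy as np


def sqrt_R_outside(z, a, v):
    """Branch of sqrt(R), R = V/A, normalized by z*sqrt(R(z)) -> 1 at infinity.

    Valid only for |z| much larger than every |a_j| and |v_j|, where
    R(z) = z**(-2) * prod(1 - v_j/z) / prod(1 - a_j/z) and the last ratio
    stays close to 1, so the principal square root is continuous.
    """
    num = np.prod([1 - vj / z for vj in v], axis=0)
    den = np.prod([1 - aj / z for aj in a], axis=0)
    return np.sqrt(num / den) / z


def total_period(a, v, n=2048):
    """(1/(2*pi*i)) times the integral of sqrt(R) over a large circle."""
    radius = 50.0 * max(abs(x) for x in list(a) + list(v))
    theta = 2 * np.pi * np.arange(n) / n
    z = radius * np.exp(1j * theta)
    dz = 1j * z * (2 * np.pi / n)  # trapezoidal rule on a periodic integrand
    return np.sum(sqrt_R_outside(z, a, v) * dz) / (2j * np.pi)


# Illustrative data with p = 3: four points a_j and two zeros v_j (hypothetical).
a = [0.0, 1.0, 2.0 + 1.0j, -1.0 + 0.5j]
v = [0.3 + 0.2j, 1.5 - 0.4j]
print(abs(total_period(a, v) - 1.0))
```

The printed deviation from 1 should be at the level of rounding error, whatever the positions of the zeros $v_j$, since only the residue of $\sqrt{R}$ at infinity contributes.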

Each component function $w_j(v_1, \dots, v_{p-1})$ is analytic in each coordinate $v_k$ (even if $v_k$ is at one of the endpoints of $\gamma_k$). Once defined by the integral in (9.3), this analytic germ allows an analytic continuation along any curve in $\mathbb{C} \setminus A$. Arcs $\gamma_k$ are not an obstacle for the continuation since the integral in (9.3) depends only on the homotopic class of $\Gamma$ in $\mathbb{C} \setminus (A \cup v)$. The homotopy of $\Gamma$ is a continuous modification of all components simultaneously in such a way that they remain disjoint in all intermediate positions.


Fig. 13. An homology basis for R

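Under this assumption we can continuously modify the selected branch of $\sqrt{R}$ in $\mathbb{C} \setminus \Gamma$ along with the motion of $\Gamma$. We note that this notion of homotopy is different from the concept of homotopic class based on a choice of a collection of Jordan contours in $\mathbb{C} \setminus A$, which is standardly used to classify closed differentials (see [91]).

The homology basis $\{\widehat{\gamma}_j\}_{j=1}^{p-1}$ of the domain $\mathbb{C} \setminus \Gamma \subset \mathcal{R}$ defined above may be completed (in several ways) to form a homology basis $\widehat{\gamma}_1, \dots, \widehat{\gamma}_{p-1}, \widehat{\delta}_1, \dots, \widehat{\delta}_{p-1}$ for $\mathcal{R}$. We can select the cycles $\widehat{\delta}_j$ as a lifting to $\mathcal{R}$ of a collection of arcs $\delta_j \subset \mathbb{C} \setminus \Gamma$, each connecting new (different) pairs of points from $A \cup v$ (see Fig. 13). We denote $\Delta := (\delta_1, \dots, \delta_{p-1})$, $\widehat{\Delta} := (\widehat{\delta}_1, \dots, \widehat{\delta}_{p-1})$. Accordingly, we define mappings

$$\widetilde{w}_k(v, \Delta) = \widetilde{w}_k(v) := \frac{1}{2\pi i} \oint_{\widehat{\delta}_k} \sqrt{R(z)}\,dz, \quad k = 1, \dots, p - 1, \tag{9.5}$$

$\widetilde{P}(\cdot, \Delta) : \mathbb{C}^{p-1} \to \mathbb{C}^{p-1}$ such that

$$\widetilde{P}(v, \Delta) = (\widetilde{w}_1(v, \Delta), \dots, \widetilde{w}_{p-1}(v, \Delta)). \tag{9.6}$$

An important new mapping associated with the complete basis of homology on $\mathcal{R}$ is $\widehat{P}(\cdot, \Gamma, \Delta) : \mathbb{C}^{p-1} \to \mathbb{R}^{2p-2}$ given by

$$\widehat{P}(v, \Gamma, \Delta) = (\operatorname{Im} w_1, \dots, \operatorname{Im} w_{p-1}, \operatorname{Im} \widetilde{w}_1, \dots, \operatorname{Im} \widetilde{w}_{p-1}) = (\operatorname{Im}(P(v, \Gamma)), \operatorname{Im}(\widetilde{P}(v, \Delta))). \tag{9.7}$$

In order to prove that both mappings, $P$ and $\widehat{P}$, are locally invertible we need the following lemma, which is a standard fact of the theory of Riemann surfaces, see e.g. [85, Chap. 10] or [44, Chap. 5].

Lemma 9.1. Let $W_1, \dots, W_{p-1}$ be a basis of holomorphic differentials on $\mathcal{R}$ (a cohomology basis). Then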


(i) $p - 1$ vectors

$$\left( \int_{\gamma_j} W_1, \dots, \int_{\gamma_j} W_{p-1} \right) \in \mathbb{C}^{p-1}, \quad j = 1, \dots, p - 1,$$

are linearly independent over $\mathbb{C}$.

(ii) the system of $2p - 2$ vectors, which consists of $p - 1$ vectors in (i) above plus $p - 1$ vectors

$$\left( \int_{\delta_j} W_1, \dots, \int_{\delta_j} W_{p-1} \right) \in \mathbb{C}^{p-1}, \quad j = 1, \dots, p - 1,$$

are linearly independent over $\mathbb{R}$.

Proposition 9.1. Both mappings $w = P(v)$ and $(\operatorname{Im} w, \operatorname{Im} \widetilde{w}) = \widehat{P}(v)$ are locally invertible at any $v = (v_1, \dots, v_{p-1}) \in (\mathbb{C} \setminus A)^{p-1}$ with $v_i \neq v_j$ for $i \neq j$.

Proof. We have that for any $j, k \in \{1, \dots, p - 1\}$,

$$\frac{\partial w_j}{\partial v_k} = \frac{1}{2\pi i} \int_{\gamma_j} \sqrt{R(t)}\, \frac{dt}{t - v_k} = \frac{1}{4\pi i} \oint_{\widehat{\gamma}_j} \frac{V_k(t)}{\sqrt{A(t)V(t)}}\,dt = \frac{1}{2} \oint_{\widehat{\gamma}_j} W_k,$$

where

$$V_k(t) := \frac{V(t)}{t - v_k} \quad \text{and} \quad W_k := \frac{1}{2\pi i}\, \frac{V_k(z)}{\sqrt{A(z)V(z)}}\,dz, \quad k = 1, \dots, p - 1.$$

Since $V_k$ are linearly independent polynomials of degree $p - 2$ ($V$ has simple roots), $\{W_k\}_{k=1}^{p-1}$ is a basis of holomorphic differentials on $\mathcal{R}$. By Lemma 9.1 (i), vectors $\{\partial w / \partial v_k\}_{k=1}^{p-1}$ are independent; this means that the Jacobian of $P$ does not vanish and $P$ is invertible. In a similar fashion, by Lemma 9.1 (ii), vectors $\{\partial w / \partial v_k\}_{k=1}^{p-1}$ and $\{\partial \widetilde{w} / \partial v_k\}_{k=1}^{p-1}$ are linearly independent over $\mathbb{R}$. The matrix of the linear mapping $\{\operatorname{Re} v, \operatorname{Im} v\} \in \mathbb{R}^{2p-2} \to \widehat{P}(v, \Gamma, \Delta) \in \mathbb{R}^{2p-2}$ is therefore nonsingular, and $\widehat{P}$ is locally invertible.

The following result is a multidimensional version of Proposition 8.4.

Proposition 9.2. Let $\mu_0$ be an A-critical measure such that $\Gamma = \operatorname{supp}(\mu_0)$ has connected components $\gamma_1^0, \dots, \gamma_p^0$, and $\mathbb{C} \setminus \Gamma$ is connected. Let $\varpi_0 = R_0(z)(dz)^2$ be the quadratic differential associated with $\mu_0$, where $R_0 = V_0/A$, and $v^0 = (v_1^0, \dots, v_{p-1}^0)$ is the vector of zeros of $V_0$. Assume that $\varpi_0$ is in a general position (that is, all $v_k$'s are pairwise distinct and disjoint with $A$). Then for an $\varepsilon > 0$ and any $m_j \in \mathbb{R}$, $j \in \{1, \dots, p - 1\}$, satisfying

$$|m_j - \mu_0(\gamma_j^0)| < \varepsilon, \quad j = 1, \dots, p - 1,$$

there exists a unique solution $v \in \mathcal{V}$ of the system

$$w_j(v, \Gamma) = m_j, \quad j = 1, \dots, p - 1. \tag{9.8}$$


The quadratic differential $\varpi = -R(z)(dz)^2$, $R(z) = \prod_{k=1}^{p-1} (z - v_k)/A(z)$, is closed. The associated A-critical measure $\mu$ satisfies $\mu(\gamma_j) = m_j$, $j = 1, \dots, p - 1$, where $\operatorname{supp}(\mu) = \gamma_1 \cup \dots \cup \gamma_p$ and $\operatorname{supp}(\mu)$ is homotopic to $\operatorname{supp}(\mu_0)$.

Proof. We use the continuity of the dependence of short or critical trajectories of $\varpi$ on the zeros $v = (v_1, \dots, v_{p-1})$ of $V$. Let, for instance, $\gamma^0$ be an arc of $\operatorname{supp}(\mu_0)$ connecting two $v^0$-points $v_1^0$ and $v_2^0$. For $\varepsilon > 0$ small enough the solution of (9.8) has two points, $v_1$ and $v_2$, close to $v_1^0$ and $v_2^0$, respectively; moreover, one of the trajectories of $\varpi$ comes out of $v_1$ in the direction close to the direction of $\gamma^0$ at $v_1^0$. This trajectory will be close to $\gamma^0$ and will pass near $v_2$. If it does not hit $v_2$ then

$$\frac{1}{\pi i} \int_{v_1}^{v_2} \sqrt{R}\,dt \notin \mathbb{R},$$

in contradiction with (9.8). The same (even simpler) arguments apply to any trajectories connecting two points from $A$ or a point from $A$ with a zero of $V$.

This result introduces a topology on the set of A-critical measures in a general position. We will call a cell any connected component of this topological space. By Proposition 9.2, a cell is a manifold of real dimension $p - 1$. We can use as local coordinates either the vector $(\mu_1 = \mu(\gamma_1), \dots, \mu_{p-1} = \mu(\gamma_{p-1}))$ or $v = (v_1, \dots, v_{p-1})$.

Example 9.1. Consider the case $p = 2$, studied in detail in Sect. 8. In this situation the real dimension of each cell is 1. The image of each non-distinguished cell in the $v$-plane is an analytic arc connecting Chebotarev's center $v^*$ and $\infty$. The unit positive measures, parametrized by the set $\widehat{\mathcal{V}}_+$ (see (8.7)), are represented on the $v$-plane as a union of three cells, which are arcs indexed by $j = 0, 1, 2$; their boundaries are points from $A^* = \{a_0, a_1, a_2, v^*\}$, see Theorem 8.8.

For an arbitrary $p$, the boundary of a cell consists of pieces of manifolds of dimension $< p$. We come to a boundary point of a cell if either:

(i) one of the components (arcs $\gamma_j \subset \operatorname{supp}(\mu)$) degenerates to a point; this case is in turn subdivided into two subcases:
  (a) coalescence of a zero and a pole joined by an arc; reduction of $p$;
  (b) coalescence of two zeros joined by an arc; the polynomial $V$ gains a double zero.
(ii) two components (arcs) of $\operatorname{supp}(\mu)$ meet (at a zero of $V$).

We need to take a closer look at the case (ii). Again, a simple but important example is $p = 2$, Sect. 8. The arc of $\widehat{\mathcal{V}}_+$ with index 1 (see Theorem 8.8) is defined by

$$w_1(v) = \frac{1}{\pi i} \int_{a_1}^{v} \sqrt{R(t)}\,dt = \mu_1 \in (0, m_1) \subset \mathbb{R},$$

with a selection of the proper branch of the function. This is a $P$-parametrization of this cell, which uses the coordinates $\mu_1 = w(v, \Gamma)$, $\Gamma = [a_1, v]$. The extremal values $\mu_1 = 0$ and $\mu_1 = m_1$ represent the boundary of the cell. Function $w(v)$ is analytic at both points; it is convenient to analyze the reconstruction of $\mu_v$ near $\mu_1 = 0$, that is, $v = a_1$. The boundary point $\mu_1 = m_1$, corresponding to $v = v^*$, is better seen from the point of view of the $\widehat{P}$-mapping. Let $\delta$ be an arc from $a_2$ to $v$ (homotopic to the arc $[a_2, v^*] \subset \Gamma^*$ for $v$ close to $v^*$). Together with $w(v) = w(v, \gamma)$ we consider the function

$$\widetilde{w}(v) := \frac{1}{\pi i} \int_{\delta} \sqrt{R(t)}\,dt.$$


Then $v = v^*$ is uniquely defined by equations $\operatorname{Im} w(v) = \operatorname{Im} \widetilde{w}(v) = 0$, and moreover, a nearby point $v$ on the arc $\operatorname{Im} w(v) = 0$ is uniquely determined by a coordinate $h(v) = \operatorname{Im} \widetilde{w}(v) \in \mathbb{R}$. It is important that $h(v^*) = 0$, and $h(v)$ changes sign when $v$ crosses $v^*$ along the arc (see again Fig. 12, where the reconstruction of $\operatorname{supp}(\mu_v)$ in dependence of $v$ is illustrated). At this moment it is not really important to determine which sign of $h(v)$ corresponds to this cell. It is more convenient to introduce a number $s = \pm 1$ (depending on the branches of $w$ and $\widetilde{w}$) such that the cell is given by $\{v \in \mathbb{C} : \operatorname{Im} w(v) = 0,\ \operatorname{Im} \widetilde{w}(v) = sh,\ h > 0\}$. This parametrization, introduced originally around $v = v^*$, may then be extended to the whole arc. We note that $h$ is the "height of the cylinder" (see [91]) and $2\mu = \operatorname{Re} w(v)$ is the "length of the circumferences", discussed in more detail in Sect. 8.2.

9.2. Structure of the set of positive A-critical measures. Let $\Gamma^* = \Gamma^*(A)$ be Chebotarev's continuum associated with $A = \{a_0, \dots, a_p\}$, see Sect. 6.1; it consists of critical trajectories of $\varpi = -R^*\,dz^2$, $R^* = V^*/A$. We will assume again a general position for the set $A$, that is, $A$ and $V^*$ do not have common zeros and $V^*(z) = \prod_{k=1}^{p-1} (z - v_k^*)$ does not have multiple zeros. Thus, the critical set $A^* = A \cup \{v_1^*, \dots, v_{p-1}^*\}$ consists of $2p$ different points, and $\Gamma^*$ is comprised of $2p - 1$ arcs that are critical trajectories of $\varpi$. Each trajectory joins two different points from $A^*$.

We begin the construction of positive critical measures by introducing local $v$-coordinates. As above (see Sect. 9.1), we identify measures $\mu \in \widehat{\mathcal{V}}_+$ with the corresponding polynomials $V(z) = \prod_{k=1}^{p-1} (z - v_k)$; furthermore, we define $V$ by the vector $v = (v_1, \dots, v_{p-1}) \in \mathbb{C}^{p-1}$ of its zeros (numeration is not important). Then each cell in $\widehat{\mathcal{V}}_+$ is a subspace of $\mathbb{C}^{p-1} \cong \mathbb{R}^{2p-2}$, which is a manifold of the real dimension $p - 1$, defined by $p - 1$ real equations of the form

$$\operatorname{Im} w_j(v) = \operatorname{Im} \left( \frac{1}{2\pi i} \oint_{\widehat{\gamma}_j} \sqrt{R(t)}\,dt \right) = 0, \quad j = 1, \dots, p - 1, \tag{9.9}$$

where $\widehat{\Gamma} = \widehat{\gamma}_1 \cup \dots \cup \widehat{\gamma}_{p-1} \cup \widehat{\gamma}_p$ is a union of Jordan contours $\widehat{\gamma}_k$ on $\mathcal{R}$ (double arcs) depending on $v$, but mutually homotopically equivalent for values of $v$ from the same cell. Practically any $v^0 \in G$ has a neighborhood of $v$ satisfying (9.9) with constant $\widehat{\gamma}_k$'s.

Now we come to the procedure of selection of combinatorial (rather than homotopic) types of cells; once the combinatorial type is fixed, the homotopic one will be determined from Chebotarev's continuum, as described next. We start with Chebotarev's continuum $\Gamma^*$ and the corresponding polynomial $V^*(z) = \prod_{k=1}^{p-1} (z - v_k^*)$. Each zero $v^* = v_k^*$, $k = 1, \dots, p - 1$, is connected by component arcs of $\Gamma^*$ with three other points, say $a_1^*, a_2^*, a_3^* \in A^*$. We select one of these three arcs (for definiteness, $[v^*, a_1^*]$) and join the two other arcs to make a single arc $[a_2^*, a_3^*]$, bypassing $v^*$ (we think that the arc $[a_2^*, a_3^*]$ still follows the two arcs from $\Gamma^*$, but without touching $v^*$, instead passing infinitely close to it). This procedure, carried out at each zero $v_k^*$ of $V^*$, creates a compact set $\Gamma$, and consequently, a cell $G(\Gamma)$ of corresponding measures $\mu \in \widehat{\mathcal{V}}_+$. The selection of $\Gamma$, and hence, of the cell $G(\Gamma)$, is made by choosing one of the three connections for each $v_k^*$; there are $3^{p-1}$ ways to make the choice. Any choice splits $\Gamma^*$


into $p$ "disjoint" arcs $\Gamma^* = \gamma_1 \cup \dots \cup \gamma_p$; out of them we select $p - 1$ arcs to make a homology basis for $\mathbb{C} \setminus \Gamma^*$, say $\gamma_1, \dots, \gamma_{p-1}$, and then consider the corresponding cycles $\widehat{\gamma}_k$, as described in Sect. 9.1.

Next, together with the link $\gamma_k$ connecting $v_k^*$ to one of its neighbors from $A^*$ we will mark one more arc $\delta_k$ connecting $v_k^*$ with a different neighbor (i.e., a different point from $A^*$ connected by a branch to $v_k^*$, see Sect. 6.1). The choice of $\gamma_k$ was arbitrary for each $k$; the choice of $\{\delta_k\}_{k=1}^{p-1}$ has to be made in such a way that the $p - 1$ arcs $\gamma_k$ and the $p - 1$ arcs $\delta_k$ are all different. Then the corresponding cycles $\{\widehat{\Gamma}, \widehat{\Delta}\} = \{\widehat{\gamma}_1, \dots, \widehat{\gamma}_{p-1}, \widehat{\delta}_1, \dots, \widehat{\delta}_{p-1}\}$ form a homology basis on $\mathcal{R}$ and may serve to define the mappings $P$ and $\widehat{P}$ as in Sect. 9.1.

We will mention first the description of the cell $G(\Gamma)$ in terms of the mapping $P$. This way of parametrization is equivalent (this equivalence is, however, not completely on the surface) to the "length of the circumferences" parametrization of the closed differentials (see Sect. 8 for the case $p = 2$). Let $w = w(v) = P(v, \widehat{\Gamma})$. We claim that the cell $G(\Gamma)$ is completely defined by the system

$$w_j(v) = t_j \in \mathbb{R}_+, \quad j = 1, \dots, p - 1; \tag{9.10}$$

more precisely, there exists a domain $M(\Gamma) = \{(t_1, \dots, t_{p-1}) \in \mathbb{R}_+^{p-1}\}$ such that for any point $(t_1, \dots, t_{p-1}) \in M(\Gamma)$ system (9.10) has a unique solution $v \in \mathbb{C}^{p-1}$. Moreover, the corresponding measure $\mu = \mu_v$ satisfies $\mu(\gamma_j) = t_j$, and $\operatorname{supp}(\mu) = \gamma_1 \cup \dots \cup \gamma_p = \Gamma_v$ is homotopic to $\Gamma$.

Summarizing, a rough description of the set $\widehat{\mathcal{V}}_+$ of unit positive A-critical measures may be made as follows. The set $\widehat{\mathcal{V}}_+$ is a union of $3^{p-1}$ closed bounded cells $G(\Gamma)$ ($\Gamma = \gamma_1 \cup \dots \cup \gamma_p$ may be selected in $3^{p-1}$ ways). The interior of each cell $G(\Gamma)$ consists of measures $\mu$ in general position with $\operatorname{supp}(\mu)$ homotopic to $\Gamma$. Interiors of different cells are disjoint. Chebotarev's measure $\mu^*$ (the Robin measure of $\Gamma^*$) is the only common point of all boundaries: $\mu^* = \bigcap \partial G(\Gamma)$.

The detailed proof of the assertions above and further analysis is beyond the scope of this paper. We will prove only that there exists a cell $G(\Gamma)$ with the homotopic type $\Gamma$ consisting of positive A-critical measures. It is easier to do it using the $\widehat{P}$-mapping (equivalent to the "height of cylinders" parametrization of the closed differentials). We consider the mapping $\{\operatorname{Im} w, \operatorname{Im} \widetilde{w}\} = \widehat{P}(v; \Gamma, \Delta)$, described in Sect. 9.1, which is invertible in a neighborhood of $v^*$. We select a vector $(s_1, \dots, s_{p-1})$ of signs: each $s_j \in \{-1, +1\}$, and consider

$$\operatorname{Im} w_j(v) = 0, \quad \operatorname{Im} \widetilde{w}_j(v) = s_j h_j, \quad h_j \in \mathbb{R}, \quad j = 1, \dots, p - 1. \tag{9.11}$$

For $h_j = 0$, system (9.11) has a unique solution $v^* = (v_1^*, \dots, v_{p-1}^*)$. For sufficiently small $h_j > 0$ this system is still uniquely solvable. Equations $\operatorname{Im} w_j(v) = 0$ imply that the differential $\varpi$ in (9.1) is closed, the associated measure $\mu$ is A-critical, and $\operatorname{supp}(\mu) = \Gamma_v = \gamma_{1,v} \cup \dots \cup \gamma_{p,v}$. The homotopic type and signs of the components of $\mu$ depend on the behavior of trajectories of $\varpi$ which originate at the points $a^* \in A^*$ and are close to trajectories $\delta_j$. Any such trajectory will hit the corresponding point $v_j$ if $h_j = 0$. If $h_j > 0$, then it will pass from the left of $v_j$ or from the right of $v_j$, see Fig. 14. A change from $s_j$ to $-s_j$ will change the direction of the turn. Therefore, there is a unique selection of vectors $(s_1, \dots, s_{p-1})$ such that all turns are right. Then the branch


Fig. 14. Left and right turns

of $\sqrt{R}$ in $\mathbb{C} \setminus \Gamma_v$ will be close to the branch of $\sqrt{R^*}$ in $\mathbb{C} \setminus \Gamma^*$, and therefore the corresponding measure $\mu$ will be positive. In this sense, the cell we entered contains some positive measures. Therefore, all measures of the cell are positive, since their supports $\operatorname{supp}(\mu)$ are all homotopic.

Acknowledgements. We are indebted to B. Shapiro for interesting discussions and for providing us with the early version of his manuscripts [82] and [83]; after the first version of this paper was made public in the arXiv, we learned about a work in preparation of B. Shapiro and collaborators, which has some overlap with this paper. Fortunately, the methods and the paths we follow are very different. We also gratefully acknowledge many helpful conversations with H. Stahl and A. Vasil'ev, as well as useful remarks from M. Yattselev concerning the first version of this manuscript. The software for computing the parameters of Chebotarev's compacts, provided by the authors of [69] and freely available at their web site, was also useful for gaining some additional insight. AMF is partially supported by Junta de Andalucía, grants FQM-229, P06-FQM-01735, and P09-FQM4643, as well as by the research project MTM2008-06689-C02-01 from the Ministry of Science and Innovation of Spain and the European Regional Development Fund (ERDF). EAR is partially supported by the NSF grant DMS-9801677.

References

1. Agnew, A., Bourget, A.: The semiclassical density of states for the quantum asymmetric top. J. Phys. A. Math. and Theor. 41(18), 185205 (2008)
2. Al-Rashed, A.M., Zaheer, N.: Zeros of Stieltjes and Van Vleck polynomials and applications. J. Math. Anal. Appl. 110(2), 327–339 (1985)
3. Alam, M.: Zeros of Stieltjes and Van Vleck polynomials. Trans. Amer. Math. Soc. 252, 197–204 (1979)
4. Aptekarev, A.I.: Sharp constants for rational approximations of analytic functions. Mat. Sb. 193, 3–72 (2003); Engl. Trans. Sb. Math. 193(3), 1–72 (2003)
5. Aptekarev, A.I., Bleher, P.M., Kuijlaars, A.B.J.: Large n limit of Gaussian random matrices with external source. II. Commun. Math. Phys. 259(2), 367–389 (2005)
6. Bergkvist, T., Rullgård, H.: On polynomial eigenfunctions for a class of differential operators. Math. Res. Lett. 9(2–3), 153–171 (2002)
7. Bertola, M.: Boutroux curves with external field: equilibrium measures without a minimization problem. http://arxiv.org/abs/0705.3062v3 [nlin.SI], 2007
8. Bertola, M., Eynard, B., Harnad, J.: Duality: biorthogonal polynomials and multi-matrix models. Commun. Math. Phys. 229, 73–120 (2002)
9. Bertola, M., Gekhtman, M., Szmigielski, J.: The Cauchy two-matrix model. Commun. Math. Phys. 287(3), 983–1014 (2009)
10. Bleher, P.M., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann–Hilbert problem, and universality in the matrix model. Ann. Math. 150, 185–266 (1999)
11. Bleher, P.M., Delvaux, S., Kuijlaars, A.B.J.: Random matrix model with external source and a constrained vector equilibrium problem. http://arxiv.org/abs/1001.1238v1 [math.ph], 2010


12. Bleher, P.M., Kuijlaars, A.B.J.: Random matrices with external source and multiple orthogonal polynomials. Int. Math. Res. Not. (3), 109–129 (2004)
13. Bleher, P.M., Kuijlaars, A.B.J.: Large n limit of Gaussian random matrices with external source. I. Commun. Math. Phys. 252(1–3), 43–76 (2004)
14. Bleher, P.M., Kuijlaars, A.B.J.: Large n limit of Gaussian random matrices with external source. III. Double scaling limit. Commun. Math. Phys. 270(2), 481–517 (2007)
15. Bôcher, M.: The roots of polynomials that satisfy certain differential equations of the second order. Bull. Amer. Math. Soc. 4, 256–258 (1987)
16. Borodin, A.: Biorthogonal ensembles. Nucl. Phys. B 536, 704–732 (1998)
17. Bourget, A., McMillen, T.: Spectral inequalities for the quantum asymmetrical top. J. Phys. A: Math. Theor. 42(9), 095209 (2009)
18. Bourget, A., McMillen, T., Vargas, A.: Interlacing and non-orthogonality of spectral polynomials for the Lamé operator. Proc. Amer. Math. Soc. 137(5), 1699–1710 (2009)
19. Courant, R.: Dirichlet's Principle, Conformal Mapping, and Minimal Surfaces. New York: Interscience Publishers, Inc., 1950, including an Appendix "Some recent developments in the theory of conformal mapping" by M. Schiffer
20. Craig, W.: The trace formula for Schrödinger operators on the line. Commun. Math. Phys. 126, 379–407 (1989)
21. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95(3), 388–475 (1998)
22. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R., Venakides, S., Zhou, X.: Strong asymptotics of orthogonal polynomials with respect to exponential weights. Comm. Pure Appl. Math. 52(12), 1491–1552 (1999)
23. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure Appl. Math. 52(11), 1335–1425 (1999)
24. Deift, P.A.: Orthogonal polynomials and random matrices: a Riemann-Hilbert approach. New York: New York University Courant Institute of Mathematical Sciences, 1999
25. Dimitrov, D.K., Van Assche, W.: Lamé differential equations and electrostatics. Proc. Amer. Math. Soc. 128(12), 3621–3628 (2000), Erratum: Proc. Amer. Math. Soc. 131(7), 2303 (2003)
26. Dragnev, P., Saff, E.B.: Constrained energy problems with applications to orthogonal polynomials of a discrete variable. J. d'Anal. Math. 72, 229–265 (1997)
27. Duits, M., Geudens, D., Kuijlaars, A.B.J.: A vector equilibrium problem for the two-matrix model in the quartic/quadratic case. http://arxiv.org/abs/1007.3137v1 [math.CA], 2010
28. Duits, M., Kuijlaars, A.B.J.: Universality in the two matrix model: a Riemann-Hilbert steepest descent analysis. Comm. Pure Appl. Math. 62, 1076–1153 (2009)
29. Ercolani, N., McLaughlin, K.D.T.-R.: Asymptotics and integrable structures for biorthogonal polynomials associated to a random two-matrix model. Physica D 152/153, 232–268 (2001)
30. Fokas, A.S., Its, A.R., Kitaev, A.V.: The isomonodromy approach to matrix models in 2D quantum gravity. Comm. Math. Phys. 147, 395–430 (1992)
31. Gesztesy, F., Zinchenko, M.: Local spectral properties of reflectionless Jacobi, CMV, and Schrödinger operators. J. Diff. Eq. 246, 78–107 (2009)
32. Gesztesy, F., Zinchenko, M.: A Borg-type theorem associated with orthogonal polynomials on the unit circle. J. London Math. Soc. 74(2), 757–777 (2006)
33. Gesztesy, F., Zinchenko, M.: Weyl–Titchmarsh theory for CMV operators associated with orthogonal polynomials on the unit circle. J. Approx. Th. 139, 172–213 (2006)
34. Gonchar, A.A., Rakhmanov, E.A.: Equilibrium measure and the distribution of zeros of extremal polynomials. Mat. Sbornik 125(2), 117–127 (1984), translation from Mat. Sb., Nov. Ser. 134(176), No. 3(11), 306–352 (1987)
35. Gonchar, A.A., Rakhmanov, E.A.: The equilibrium problem for vector potentials. Usp. Mat. Nauk 40(4(244)), 155–156 (1985)
36. Gonchar, A.A., Rakhmanov, E.A.: Equilibrium distributions and degree of rational approximation of analytic functions. Math. USSR Sbornik 62(2), 305–348 (1987), translation from Mat. Sb., Nov. Ser. 134(176), No. 3(11), 306–352 (1987)
37. Grosset, M.P., Veselov, A.P.: Lamé equation, quantum top and elliptic Bernoulli polynomials. Proc. Edinb. Math. Soc. (2) 51(3), 635–650 (2008)
38. Grötzsch, H.: Über ein Variationsproblem der konformen Abbildungen. Ber. Verh.- Sächs. Akad. Wiss. Leipzig 82, 251–263 (1930)
39. Grünbaum, F.A.: Variations on a theme of Heine and Stieltjes: An electrostatic interpretation of the zeros of certain polynomials. J. Comput. Appl. Math. 99, 189–194 (1998)
40. Harnad, J., Winternitz, P.: Harmonics on hyperspheres, separation of variables and the Bethe ansatz. Lett. Math. Phys. 33(1), 61–74 (1995)


41. Heine, E.: Handbuch der Kugelfunctionen. Volume II. 2nd. edition. Berlin: G. Reimer (1878)
42. Ismail, M.E.H.: An electrostatic model for zeros of general orthogonal polynomials. Pacific J. Math. 193, 355–369 (2000)
43. Jenkins, J.A.: Univalent functions and conformal mapping. Ergebnisse der Mathematik und ihrer Grenzgebiete. Neue Folge, Heft 18. Reihe: Moderne Funktionentheorie. Berlin: Springer-Verlag, 1958
44. Jost, J.: Compact Riemann Surfaces. Springer Universitext. 3rd. edition. Berlin-Heidelberg, New York: Springer, 2006
45. Kamvissis, S., Rakhmanov, E.A.: Existence and regularity for an energy maximization problem in two dimensions. J. Math. Phys. 46(8), 083505 (2005)
46. Kamvissis, S., McLaughlin, K.D.T.-R., Miller, P.D.: Semiclassical soliton ensembles for the focusing nonlinear Schrödinger equation, Volume 154 of Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 2003
47. Kuijlaars, A.B.J., Martínez-Finkelshtein, A.: Strong asymptotics for Jacobi polynomials with varying nonstandard parameters. J. Anal. Math. 94, 195–234 (2004)
48. Kuijlaars, A.B.J., Martínez-Finkelshtein, A., Wielonsky, F.: Non-intersecting squared Bessel paths and multiple orthogonal polynomials for modified Bessel weights. Commun. Math. Phys. 286(1), 217–275 (2009)
49. Kuijlaars, A.B.J., McLaughlin, K.T.-R.: Asymptotic zero behavior of Laguerre polynomials with negative parameter. Constructive Approximation 20(4), 497–523 (2004)
50. Kuijlaars, A.B.J., McLaughlin, K.T.-R., Van Assche, W., Vanlessen, M.: The Riemann-Hilbert approach to strong asymptotics for orthogonal polynomials on [−1, 1]. Adv. Math. 188(2), 337–398 (2004)
51. Kuz'mina, G.V.: Moduli of families of curves and quadratic differentials. Proc. Steklov Inst. Math. 139, 1–231 (1982)
52. Lavrentieff, M.: Sur un problème de maximum dans la représentation conforme. C. R. 191, 827–829 (1930)
53. Lavrentieff, M.: On the theory of conformal mappings. Trudy Fiz.-Mat. Inst. Steklov. Otdel. Mat. 5, 159–245 (1934) (Russian)
54. Lax, P.D., Levermore, C.D.: The small dispersion limit of the Korteweg-de Vries equation. I. Comm. Pure Appl. Math. 36(3), 253–290 (1983)
55. Lax, P.D., Levermore, C.D.: The small dispersion limit of the Korteweg-de Vries equation. II. Comm. Pure Appl. Math. 36(5), 571–593 (1983)
56. Lax, P.D., Levermore, C.D.: The small dispersion limit of the Korteweg-de Vries equation. III. Comm. Pure Appl. Math. 36(6), 809–829 (1983)
57. Marcellán, F., Martínez-Finkelshtein, A., Martínez-González, P.: Electrostatic models for zeros of polynomials: Old, new, and some open problems. J. Comput. Appl. Math. 207(2), 258–272 (2007)
58. Marden, M.: Geometry of Polynomials, Volume 3 of Math. Surveys. 2nd. edition, Amer. Math. Soc., Providence, R. I., 1966
59. Martines Finkel'shteĭn, A.: On the rate of rational approximation of the function exp(−x) on the positive semi-axis. Vestnik Moskov. Univ. Ser. I Mat. Mekh. (6), 94–96 (1991), Engl. transl. in Moscow Univ. Math. Bull. 6, 65–67 (1991)
60. Martínez-Finkelshtein, A., Orive, R.: Riemann-Hilbert analysis of Jacobi polynomials orthogonal on a single contour. J. Approx. Theory 134(2), 137–170 (2005)
61. Martínez-Finkelshtein, A., Saff, E.B.: Asymptotic properties of Heine-Stieltjes and Van Vleck polynomials. J. Approx. Theory 118(1), 131–151 (2002)
62. McLaughlin, K.T.-R., Miller, P.D.: The ∂̄ steepest descent method and the asymptotic behavior of polynomials orthogonal on the unit circle with fixed and exponentially varying nonanalytic weights. IMRP Int. Math. Res. Pap., pages Art. ID 48673, 1–77 (2006)
63. McLaughlin, K.T.-R., Miller, P.D.: The ∂̄ steepest descent method for orthogonal polynomials on the real line with varying weights. Int. Math. Res. Not. IMRN, pages Art. ID rnn 075, 66 (2008)
64. McLaughlin, K.T.-R., Vartanian, A.H., Zhou, X.: Asymptotics of recurrence relation coefficients, Hankel determinant ratios, and root products associated with Laurent polynomials orthogonal with respect to varying exponential weights. Acta Appl. Math. 100(1), 39–104 (2008)
65. Melnikov, M., Poltoratski, A., Volberg, A.: Uniqueness theorems for Cauchy integrals. Publ. Mat. 52(2), 289–314 (2008)
66. Mhaskar, H.N., Saff, E.B.: Extremal problems for polynomials with exponential weights. Trans. Amer. Math. Soc. 285, 204–234 (1984)
67. Nuttall, J.: Asymptotics of diagonal Hermite-Padé polynomials. J. Approx. Theory 42(4), 299–386 (1984)
68. Nuttall, J.: Asymptotics of generalized Jacobi polynomials. Constr. Approx. 2(1), 59–77 (1986)
69. Ortega-Cerdà, J., Pridhnani, B.: The Pólya-Tchebotaröv problem. In Harmonic Analysis and Partial Differential Equations, pp. 153–170. Contemp. Math., 505, Amer. Math. Soc., Providence, R.I., 2010
70. Pólya, G.: Sur un théorème de Stieltjes. C. R. Acad. Sci. Paris 155, 767–769 (1912)


71. Pólya, G.: Beitrag zur Verallgemeinerung des Verzerrungssatzes auf mehrfach zusammenhängende Gebiete. III. Sitzungsberichte Akad. Berlin 1929, 55–62 (1929)
72. Pommerenke, Ch.: Univalent Functions. Göttingen: Vandenhoeck & Ruprecht, 1975
73. Rakhmanov, E.A.: On asymptotic properties of polynomials orthogonal on the real axis. Math. USSR Sb. 47, 155–193 (1984)
74. Rakhmanov, E.A.: Equilibrium measure and the distribution of zeros of the extremal polynomials of a discrete variable. Sb. Math. 187, 1213–1228 (1996)
75. Rakhmanov, E.A., Perevozhnikova, E.A.: Variations of the equilibrium energy and S-property of compacta of minimal capacity. Preprint, 1994
76. Ronveaux, A. (ed.): Heun's differential equations. New York: The Clarendon Press Oxford University Press (1995), With contributions by F. M. Arscott, S. Yu. Slavyanov, D. Schmidt, G. Wolf, P. Maroni and A. Duval
77. Saff, E.B., Totik, V.: Logarithmic Potentials with External Fields. Volume 316 of Grundlehren der Mathematischen Wissenschaften. Berlin: Springer-Verlag, 1997
78. Shah, G.M.: On the zeros of Van Vleck polynomials. Proc. of the Amer. Math. Soc. 19(6), 1421–1426 (1968)
79. Shah, G.M.: Confluence of the singularities of the generalized Lame's differential equation. J. Natur. Sci. and Math. 91, 33–147 (1969)
80. Shah, G.M.: Monotonic variation of the zeros of Stieltjes and Van Vleck polynomials. J. Indian Math. Soc. (N.S.) 33, 85–92 (1969)
81. Shah, G.M.: On the zeros of Stieltjes and Van Vleck polynomials. Illinois J. Math. 14, 522–528 (1970)
82. Shapiro, B.: Algebro-geometric aspects of Heine–Stieltjes polynomials. http://arxiv.org/abs/0812.4193v2 [math.ph], 2008
83. Shapiro, B., Tater, M.: On spectral polynomials of the Heun equation. I. J. Approx. Theory 162(4), 766–781 (2010)
84. Soshnikov, A.: Determinantal random point fields. Russ. Math. Surv. 55, 923–975 (2000)
85. Springer, G.: Introduction to Riemann surfaces. Reading, Mass: Addison-Wesley Publishing Company, 1957
86. Stahl, H.: Sets of minimal capacity and extremal domains. Preprint, 2008
87. Stahl, H.: Extremal domains associated with an analytic function. I, II. Complex Variables Theory Appl. 4(4), 311–324, 325–338 (1985)
88. Stahl, H.: Orthogonal polynomials with complex-valued weight function. I, II. Constr. Approx. 2(3), 225–240, 241–251 (1986)
89. Stahl, H.: On the convergence of generalized Padé approximants. Constr. Approx. 5(2), 221–240 (1989)
90. Stieltjes, T.J.: Sur certains polynômes qui vérifient une équation différentielle linéaire du second ordre et sur la théorie des fonctions de Lamé. Acta Math. 6, 321–326 (1885)
91. Strebel, K.: Quadratic differentials. Volume 5 of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Berlin: Springer-Verlag, 1984
92. Szegő, G.: Orthogonal Polynomials. Volume 23 of Amer. Math. Soc. Colloq. Publ. fourth edition, Providence, RI: Amer. Math. Soc., 1975
93. Teichmüller, O.: Untersuchungen über konforme und quasikonforme Abbildungen. Deutsche Math. 3, 621–678 (1938)
94. Teschl, G.: Jacobi Operators and Completely Integrable Nonlinear Lattices. Providence, RI: Amer. Math. Soc., 1999
95. Van Vleck, E.B.: On the polynomials of Stieltjes. Bull. Amer. Math. Soc. 4, 426–438 (1898)
96. Vasil'ev, A.: Moduli of families of curves for conformal and quasiconformal mappings. Volume 1788 of Lecture Notes in Mathematics. Berlin: Springer-Verlag, 2002
97. Volkmer, H.: Multiparameter eigenvalue problems and expansion theorems. Lecture Notes Math., 1356, Berlin-Heidelberg, New York: Springer, 1988
98. Volkmer, H.: Generalized ellipsoidal and spheroconal harmonics. SIGMA Symmetry Integrability Geom. Methods Appl. 2, paper 071, pp. 16 (2006)
99. Volkmer, H.: External ellipsoidal harmonics for the Dunkl–Laplacian. SIGMA 4, paper 091, pp. 13 (2008)
100. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis. Cambridge: Cambridge Univ. Press, 1996
101. Zaheer, N.: On Stieltjes and Van Vleck polynomials. Proc. Amer. Math. Soc. 60, 169–174 (1976)
102. Zaheer, N., Alam, M.: On the zeros of Stieltjes and Van Vleck polynomials. Trans. Amer. Math. Soc. 229, 279–288 (1977)

Communicated by S. Zelditch

Commun. Math. Phys. 302, 113–159 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1180-y


Gravitational Descendants in Symplectic Field Theory

Oliver Fabert

Mathematisches Institut, Ludwig-Maximilians-Universität München, Theresienstr. 39, 80333 München, Germany. E-mail: [email protected]

Received: 13 July 2009 / Accepted: 19 August 2010 Published online: 6 January 2011 – © Springer-Verlag 2011

Abstract: It was pointed out by Y. Eliashberg in his ICM 2006 plenary talk that the rich algebraic formalism of symplectic field theory leads to a natural appearance of quantum and classical integrable systems, at least in the case when the contact manifold is the prequantization space of a symplectic manifold. In this paper we generalize the definition of gravitational descendants in SFT from circle bundles in the Morse-Bott case to general contact manifolds. We first show, using the ideas in Okounkov and Pandharipande (Ann Math 163(2):517–560, 2006), that for the basic examples of holomorphic curves in SFT, that is, branched covers of cylinders over closed Reeb orbits, the gravitational descendants have a geometric interpretation in terms of branching conditions. We then follow the ideas in Cieliebak and Latschev (http://arixiv.org/abs/0706.3284v2 [math.SG], 2007) to compute the corresponding sequence of Poisson-commuting functions when the contact manifold is the unit cotangent bundle of a Riemannian manifold.

Contents
Summary
1. Symplectic Field Theory with Gravitational Descendants
   1.1 Symplectic field theory
   1.2 Gravitational descendants
   1.3 Invariance statement
   1.4 The circle bundle case
2. Example: Symplectic Field Theory of Closed Geodesics
   2.1 Symplectic field theory of a single Reeb orbit
   2.2 Gravitational descendants = branching conditions
   2.3 Branched covers of trivial half-cylinders
   2.4 Obstruction bundles and transversality
   2.5 Additional marked points and gravitational descendants
References

Research supported by the German Research Foundation (DFG).


Summary

Symplectic field theory (SFT), introduced by H. Hofer, A. Givental and Y. Eliashberg in 2000 ([EGH]), is a very large project and can be viewed as a topological quantum field theory approach to Gromov-Witten theory. Besides providing a unified view on established pseudoholomorphic curve theories like symplectic Floer homology, contact homology and Gromov-Witten theory, it leads to numerous new applications and opens new routes yet to be explored. While symplectic field theory leads to algebraic invariants with very rich algebraic structures, which are currently studied by a large group of researchers, for all the geometric applications found so far it was sufficient to work with simpler invariants like cylindrical contact homology. Although cylindrical contact homology is not always defined, it is much easier to compute, not only since it involves just moduli spaces of holomorphic cylinders but also due to the simpler algebraic formalism. While the rich algebraic formalism of the higher invariants of symplectic field theory seems to be too complicated for concrete geometric applications, it was pointed out by Eliashberg in his ICM 2006 plenary talk ([E]) that the integrable systems of rational Gromov-Witten theory very naturally appear in rational symplectic field theory by using the link between the rational symplectic field theory of circle bundles in the Morse-Bott version and the rational Gromov-Witten potential of the underlying symplectic manifold. Indeed, after introducing gravitational descendants as in Gromov-Witten theory, it is precisely the rich algebraic formalism of SFT with its Weyl and Poisson structures that provides a natural link between symplectic field theory and (quantum) integrable systems.
In particular, in the case where the contact manifold is a circle bundle over a closed symplectic manifold, the rich algebraic formalism of symplectic field theory seems to provide the right framework to understand the deep relation between Gromov-Witten theory and integrable systems, at least in the genus zero case. In the Morse-Bott case in [E] it follows from the corresponding statements for the Gromov-Witten descendant potential that the sequences of commuting operators and Poisson-commuting functions are independent of auxiliary choices like almost complex structure and abstract perturbations. For general contact manifolds, however, it is well-known that the SFT Hamiltonian in general explicitly depends on choices like contact form, cylindrical almost complex structure and coherent abstract perturbations, and hence is not an invariant of the contact manifold itself. But before we can come to the question of invariance, we first need to give a rigorous definition of gravitational descendants in the context of symplectic field theory. In Gromov-Witten theory the gravitational descendants were defined by integrating powers of the first Chern class of the tautological line bundle over the moduli space, which by Poincaré duality corresponds to counting common zeroes of sections in this bundle. In symplectic field theory, and more generally in every holomorphic curve theory where curves with punctures and/or boundary are considered, we are faced with the problem that the moduli spaces generically have codimension-one boundary, so that the count of zeroes of sections in general depends on the chosen sections in the boundary. It follows that the integration of the first Chern class of the tautological line bundle over a single moduli space has to be replaced by a construction involving all moduli spaces at once.
Note that this is similar to the choice of coherent abstract perturbations for the moduli spaces in symplectic field theory in order to achieve transversality for the Cauchy-Riemann operator. Keeping the interpretation of descendants as common zero sets of sections in powers of the tautological line bundles (which will turn out to be particularly useful when one studies the topological meaning of descendants by localizing on special divisors, see [FR]), we define in this paper the notion of coherent collections of sections in the tautological line bundles over all moduli spaces, which just formalizes how the sections chosen for the lower-dimensional moduli spaces should affect the section chosen for a moduli space on its boundary. To be more precise, since the sections should be invariant under obvious symmetries like reordering of the punctures and the marked points, we actually need to work with multi-sections in order to meet both the symmetry and the transversality assumption. We will then define descendants of moduli spaces $\overline{\mathcal{M}}^j \subset \overline{\mathcal{M}}$, which we obtain inductively as zero sets of these coherent collections of sections $(s_j)$ in the tautological line bundles over the descendant moduli spaces $\overline{\mathcal{M}}^{j-1} \subset \overline{\mathcal{M}}$, and define descendant Hamiltonians $\mathbf{H}^1_{i,j}$ by integrating chosen closed differential forms $\theta_i$ over $\overline{\mathcal{M}}^j$. For these we prove the following theorem.

Theorem. Counting holomorphic curves with one marked point after integrating differential forms and introducing gravitational descendants defines a sequence of distinguished elements
\[ \mathbf{H}^1_{i,j} \in H_*(\hbar^{-1}\mathfrak{W}^0, D^0) \]
in the full SFT homology algebra with differential $D^0 = [\mathbf{H}^0, \cdot] : \hbar^{-1}\mathfrak{W}^0 \to \hbar^{-1}\mathfrak{W}^0$, which commute with respect to the commutator bracket on $H_*(\hbar^{-1}\mathfrak{W}^0, D^0)$,
\[ [\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}] = 0, \qquad (i,j), (k,\ell) \in \{1, \ldots, N\} \times \mathbb{N}. \]

In contrast to the Morse-Bott case considered in [E] it follows that, when the differential in symplectic field theory counting holomorphic curves without additional marked points is no longer zero, the sequences of generating functions no longer commute with respect to the bracket, but only commute after passing to homology. On the other hand, in the same way as the rational symplectic field theory of a contact manifold is defined by counting only curves with genus zero, we immediately obtain a rational version of the above statement by expanding $\mathbf{H}^0$ and the $\mathbf{H}^1_{i,j}$ in powers of the formal variable $\hbar$ for the genus.

Corollary. Counting rational holomorphic curves with one marked point after integrating differential forms and introducing gravitational descendants defines a sequence of distinguished elements
\[ \mathbf{h}^1_{i,j} \in H_*(\mathfrak{P}^0, d^0) \]
in the rational SFT homology algebra with differential $d^0 = \{\mathbf{h}^0, \cdot\} : \mathfrak{P}^0 \to \mathfrak{P}^0$, which commute with respect to the Poisson bracket on $H_*(\mathfrak{P}^0, d^0)$,
\[ \{\mathbf{h}^1_{i,j}, \mathbf{h}^1_{k,\ell}\} = 0, \qquad (i,j), (k,\ell) \in \{1, \ldots, N\} \times \mathbb{N}. \]
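A heuristic way to see why commutativity can only be expected after passing to homology is sketched below. It is an illustration under assumptions, not the argument of this paper: it assumes that the one-marked-point Hamiltonians arise as first-order coefficients of a formal generating function $\mathbf{H}(t)$ satisfying a master equation $[\mathbf{H}(t), \mathbf{H}(t)] = 0$ of the type in [EGH]. The formal variables $t^{i,j}$ and the two-marked-point coefficients $\mathbf{H}^2$ are hypothetical notation introduced only for this sketch, and all signs and symmetry factors coming from the grading are suppressed.

```latex
% Hypothetical expansion: t^{i,j} and H^2 are illustrative notation, signs suppressed.
\begin{align*}
\mathbf{H}(t) &= \mathbf{H}^0 + \sum_{(i,j)} t^{i,j}\,\mathbf{H}^1_{i,j}
  + \frac{1}{2}\sum_{(i,j),(k,\ell)} t^{i,j}\,t^{k,\ell}\,\mathbf{H}^2_{(i,j),(k,\ell)} + O(t^3),\\
\text{order } t^0:&\quad [\mathbf{H}^0, \mathbf{H}^0] = 0
  \;\Longrightarrow\; D^0 \circ D^0 = 0 \quad\text{(graded Jacobi identity)},\\
\text{order } t^1:&\quad [\mathbf{H}^0, \mathbf{H}^1_{i,j}] = 0
  \;\Longrightarrow\; D^0\,\mathbf{H}^1_{i,j} = 0,\\
\text{order } t^2:&\quad [\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}]
  = \pm\,[\mathbf{H}^0, \mathbf{H}^2_{(i,j),(k,\ell)}]
  = \pm\,D^0\,\mathbf{H}^2_{(i,j),(k,\ell)}.
\end{align*}
```

In this picture the bracket of two descendant Hamiltonians is $D^0$-exact, so it vanishes in $H_*(\hbar^{-1}\mathfrak{W}^0, D^0)$ but not necessarily on the chain level.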

As we already outlined above, in contrast to the circle bundle case we have to expect that the sequence of descendant Hamiltonians depends on the auxiliary choices like contact form, cylindrical almost complex structure and coherent abstract polyfold perturbations. Here we prove the following natural invariance statements.


Theorem. For different choices of contact form $\lambda^\pm$, cylindrical almost complex structure $J^\pm$, abstract polyfold perturbations and sequences of coherent collections of sections $(s^\pm_j)$ the resulting systems of commuting operators $\mathbf{H}^{1,-}_{i,j}$ on $H_*(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-})$ and $\mathbf{H}^{1,+}_{i,j}$ on $H_*(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+})$ are isomorphic, i.e., there exists an isomorphism of the Weyl algebras $H_*(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-})$ and $H_*(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+})$ which maps $\mathbf{H}^{1,-}_{i,j} \in H_*(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-})$ to $\mathbf{H}^{1,+}_{i,j} \in H_*(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+})$.

Note that this theorem is an extension of the theorem in [EGH] stating that for different choices of auxiliary data the Weyl algebras $H_*(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-})$ and $H_*(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+})$ are isomorphic. As above we clearly also get a rational version of the invariance statement:

Corollary. For different choices of contact form $\lambda^\pm$, cylindrical almost complex structure $J^\pm$, abstract polyfold perturbations and sequences of coherent collections of sections $(s^\pm_j)$ the resulting systems of Poisson-commuting functions $\mathbf{h}^{1,-}_{i,j}$ on $H_*(\mathfrak{P}^{0,-}, d^{0,-})$ and $\mathbf{h}^{1,+}_{i,j}$ on $H_*(\mathfrak{P}^{0,+}, d^{0,+})$ are isomorphic, i.e., there exists an isomorphism of the Poisson algebras $H_*(\mathfrak{P}^{0,-}, d^{0,-})$ and $H_*(\mathfrak{P}^{0,+}, d^{0,+})$ which maps $\mathbf{h}^{1,-}_{i,j} \in H_*(\mathfrak{P}^{0,-}, d^{0,-})$ to $\mathbf{h}^{1,+}_{i,j} \in H_*(\mathfrak{P}^{0,+}, d^{0,+})$.

As a concrete example beyond the case of circle bundles discussed in [E] we consider the symplectic field theory of a closed geodesic. For this recall that in [F2] the author introduces the symplectic field theory of a closed Reeb orbit $\gamma$, which is defined by counting only those holomorphic curves which are branched covers of the orbit cylinder $\mathbb{R} \times \gamma$ in $\mathbb{R} \times V$. In [F2] we prove that these orbit curves do not contribute to the algebraic invariants of symplectic field theory as long as they do not carry additional marked points.
Our proof explicitly uses that the subset of orbit curves over a fixed orbit is closed under taking boundaries and gluing, which follows from the fact that they are also trivial in the sense that they have trivial contact area and that this contact area is preserved under taking boundaries and gluing. It follows that every algebraic invariant of symplectic field theory has a natural analog defined by counting only orbit curves. In particular, in the same way as we define sequences of descendant Hamiltonians $\mathbf{H}^1_{i,j}$ and $\mathbf{h}^1_{i,j}$ by counting general curves in the symplectization of a contact manifold, we can define sequences of descendant Hamiltonians $\mathbf{H}^1_{\gamma,i,j}$ and $\mathbf{h}^1_{\gamma,i,j}$ by just counting branched covers of the orbit cylinder over $\gamma$ with signs (and weights), where the preservation of the contact area under splitting and gluing of curves proves that for every theorem from above we have a version for $\gamma$. For this let $\mathfrak{W}^0_\gamma$ be the graded Weyl subalgebra of the Weyl algebra $\mathfrak{W}^0$, which is generated only by those $p$- and $q$-variables $p_n = p_{\gamma^n}$, $q_n = q_{\gamma^n}$ corresponding to Reeb orbits which are multiple covers of the fixed orbit $\gamma$ and which are good in the sense of [BM]. In the same way we further introduce the Poisson subalgebra $\mathfrak{P}^0_\gamma$ of $\mathfrak{P}^0$. We further prove that for branched covers of orbit cylinders over any closed Reeb orbit the gravitational descendants indeed have a geometric interpretation in terms of branching conditions, which generalizes the work of [OP] used in [E] for the circle. Since all the considered holomorphic curves factor through the embedding of the closed Reeb orbit into the contact manifold, it only makes sense to consider differential forms of degree zero or one. While it follows from the result $\mathbf{h}^0_\gamma = 0$ in [F2] that the sequences $\mathbf{h}^1_{\gamma,i,j}$ indeed commute with respect to the Poisson bracket (before passing to homology), the same proof as in [F2] shows that every descendant Hamiltonian in the sequence vanishes if the differential form is of degree zero. For differential forms of degree one the strategy of the proof however no longer applies and it is indeed shown in [E] that for $\gamma = V = S^1$ and $\theta = dt$ we get nontrivial contributions from branched covers. In this paper we want to study the corresponding Poisson-commuting sequence in the special case where the contact manifold is the unit cotangent bundle $S^*Q$ of an ($m$-dimensional) Riemannian manifold $Q$, so that every closed Reeb orbit $\gamma$ on $V = S^*Q$ corresponds to a closed geodesic $\bar\gamma$ on $Q$. When the closed geodesic $\bar\gamma$ represents a hyperbolic Reeb orbit in the unit cotangent bundle of a surface $Q$, a simple computation shows that all moduli spaces with $2j+1$ punctures possibly contribute to the descendant Hamiltonian $\mathbf{h}^1_{\gamma,j}$. Since in this case the Fredholm index is $2j-1$ and hence for $j > 0$ strictly smaller than the dimension of the underlying nonregular moduli space of branched covers, which is $4j-2$, transversality cannot be satisfied but the cokernels of the linearized operators fit together to give an obstruction bundle of rank $2j-1$. While, as for every closed Reeb orbit, we have that $\mathbf{h}^1_{\gamma,0} = \mathbf{h}^1_{S^1,0} = \sum_n p_n q_n$, the other Hamiltonians $\mathbf{h}^1_{\gamma,j}$ are not so easy to determine. While in the case of the circle we obtain a complete set of integrals, our following theorem shows that the Hamiltonian system with symmetries obtained for different choices of Reeb orbits does not need to be integrable.
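The role of the degree-zero Hamiltonian $\sum_n p_n q_n$ can be explored with a small computation. The following is a minimal sketch under stated assumptions, not code from this paper: it uses the Poisson bracket of Subsect. 1.1 with all variables treated as even (so no grading signs appear) and $\kappa_{\gamma^n} = n$, truncates to multiplicities $|n| \le$ `n_max`, sets all orientation signs to $+1$, ignores the degree condition (as in the circle case), and builds the Hamiltonians as sums over ordered monomials $q_{n_1} \cdots q_{n_{j+2}}$ with $n_1 + \cdots + n_{j+2} = 0$ divided by $(j+2)!$, the form stated later in this summary. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

# A monomial is a sorted tuple of nonzero integers: n > 0 stands for q_n and
# n < 0 stands for p_{|n|} (the convention q_{-n} = p_n from the text).
# A polynomial is a dict {monomial: Fraction coefficient}.


def add_term(poly, mono, coeff):
    """Add coeff * mono to poly in place, dropping terms that cancel."""
    c = poly.get(mono, Fraction(0)) + coeff
    if c == 0:
        poly.pop(mono, None)
    else:
        poly[mono] = c


def multiply(f, g):
    """Product of two polynomials in commuting (even) variables."""
    out = {}
    for mf, cf in f.items():
        for mg, cg in g.items():
            add_term(out, tuple(sorted(mf + mg)), cf * cg)
    return out


def derivative(f, var):
    """Partial derivative with respect to the variable labelled var."""
    out = {}
    for mono, c in f.items():
        k = mono.count(var)
        if k:
            rest = list(mono)
            rest.remove(var)  # removing one entry keeps the list sorted
            add_term(out, tuple(rest), c * k)
    return out


def poisson(f, g, n_max):
    """{f, g} = sum_n n * (df/dp_n * dg/dq_n - dg/dp_n * df/dq_n), n <= n_max."""
    out = {}
    for n in range(1, n_max + 1):
        plus = multiply(derivative(f, -n), derivative(g, n))
        minus = multiply(derivative(g, -n), derivative(f, n))
        for mono, c in plus.items():
            add_term(out, mono, n * c)
        for mono, c in minus.items():
            add_term(out, mono, -n * c)
    return out


def circle_hamiltonian(j, n_max):
    """Sum over ordered monomials with total index zero, divided by (j+2)!.

    Signs are set to +1 and no degree condition is imposed (assumption).
    """
    out = {}
    indices = [n for n in range(-n_max, n_max + 1) if n != 0]
    for ns in product(indices, repeat=j + 2):
        if sum(ns) == 0:
            add_term(out, tuple(sorted(ns)), Fraction(1, factorial(j + 2)))
    return out
```

With these conventions `circle_hamiltonian(0, 3)` is $p_1 q_1 + p_2 q_2 + p_3 q_3$, and `poisson(circle_hamiltonian(0, 3), circle_hamiltonian(1, 3), 3)` returns the empty dictionary. Bracketing with the single monomial $q_1 q_2$ instead returns $3\, q_1 q_2$: the bracket with $\sum_n p_n q_n$ multiplies a monomial by its total index $n_1 + \cdots + n_k$, which is why only monomials with total index zero can occur. The truncation does not preserve the commutativity of the higher Hamiltonians among themselves, so the sketch makes no claim about those brackets.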

Theorem. Assume that the closed geodesic $\bar\gamma$ represents a hyperbolic Reeb orbit in the unit cotangent bundle of a surface $Q$. Then $\mathbf{g}^1_{\bar\gamma,j} = 0$ and hence $\mathbf{h}^1_{\gamma,j} = 0$ for all $j > 0$.

Apart from the fact that this result shows that the resulting Hamiltonian systems with symmetries are in general not very interesting from the point of view of integrable systems, let us sketch how the Hamiltonian systems with symmetries assigned to a closed Reeb orbit can be applied to embedding problems in symplectic geometry. To this end the author is currently working on a local version of SFT, which generalizes local Gromov-Witten (GW) theory in the same way as the usual SFT generalizes usual GW theory: While in local GW theory we count multiple covers over a fixed super-rigid closed holomorphic curve, in local SFT we count multiple covers over super-rigid punctured holomorphic curves, where the technical assumption of super-rigidity guarantees that multiple covers are isolated. In particular, instead of getting invariants for contact manifolds, we now get the above invariants for closed Reeb orbits by counting multiple covers over the corresponding orbit cylinder. On the one hand, in the very same way as Paolo Rossi was able to compute part of the GW potential of the sphere using the SFT of the circle in [R1], we can use these new SFT invariants for the closed Reeb orbits appearing in the splitting process to derive information about the local GW potential of the original closed holomorphic curve. On the other hand, they can also be used in order to derive a contradiction and hence should be applicable to embedding problems in symplectic geometry.
We claim that our above theorem can be used to show that an exceptional sphere cannot split along a hyperbolic Reeb orbit in the unit cotangent bundle, which also gives an alternative proof of the fact that every oriented embedded Lagrangian in a closed symplectic four-manifold, which intersects an exceptional sphere in a homologically nontrivial way, must have genus zero or one: Since it follows from the above theorem that the descendant Hamiltonians of every hyperbolic orbit representing a closed geodesic are zero, and this then implies that there are no descendant contributions of degree-two classes in the local Gromov-Witten descendant potential, we can easily derive a contradiction using the topological recursion relation in rational Gromov-Witten theory.
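The vanishing statement of the theorem above can be made plausible by a degree count. The sketch below is an illustration under stated assumptions rather than the proof given in this paper. It uses the gradings $|p_\gamma| = m - 3 - \mathrm{CZ}(\gamma)$, $|q_\gamma| = m - 3 + \mathrm{CZ}(\gamma)$ from Subsect. 1.1, the convention $q_{-n} = p_n$ and the degree condition $2(m + j - 3)$ on the monomials of $\mathbf{g}^1_{\bar\gamma,j}$ stated later in this summary, and it assumes that the Conley-Zehnder index of a hyperbolic orbit is multiplicative, $\mathrm{CZ}(\gamma^n) = n\,\mathrm{CZ}(\gamma)$. Bad orbits and signs are ignored.

```latex
% Degree count under the assumption CZ(gamma^n) = n CZ(gamma); bad orbits ignored.
\begin{align*}
|q_n| &= (m-3) + n\,\mathrm{CZ}(\gamma)
  \qquad (n \in \mathbb{Z}\setminus\{0\},\ q_{-n} = p_n),\\
|q_{n_1} \cdots q_{n_{j+2}}| &= (j+2)(m-3) + \mathrm{CZ}(\gamma)\,(n_1 + \cdots + n_{j+2})
  = (j+2)(m-3) \quad\text{if } n_1 + \cdots + n_{j+2} = 0,\\
(j+2)(m-3) = 2(m+j-3) &\iff j\,(m-5) = 0 .
\end{align*}
```

For a surface we have $m = 2$, so a monomial with total index zero has degree $-(j+2)$, while the required degree is $2(j-1)$. The two agree only for $j = 0$, consistent with $\mathbf{h}^1_{\gamma,0} = \sum_n p_n q_n$ being nonzero and with the vanishing for all $j > 0$.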


More precisely, we will show that the resulting system of Poisson-commuting functions $\mathbf{h}^1_{\gamma,j}$, $j \in \mathbb{N}$ on $\mathfrak{P}^0_\gamma$ is isomorphic to the system of Poisson-commuting functions $\mathbf{g}^1_{\bar\gamma,j}$, $j \in \mathbb{N}$ on $\mathfrak{P}^0_{\bar\gamma} = \mathfrak{P}^0_\gamma$, where for every $j \in \mathbb{N}$ the descendant Hamiltonian $\mathbf{g}^1_{\bar\gamma,j}$ is given by
\[ \mathbf{g}^1_{\bar\gamma,j} = \sum \epsilon(\vec{n})\, \frac{q_{n_1} \cdot \ldots \cdot q_{n_{j+2}}}{(j+2)!}, \]
where $q_{-n} = p_n$ and the sum runs over all ordered monomials $q_{n_1} \cdots q_{n_{j+2}}$ with $n_1 + \cdots + n_{j+2} = 0$ and which are of degree $2(m + j - 3)$. Here $\epsilon(\vec{n}) \in \{-1, 0, +1\}$ is fixed by a choice of coherent orientations in symplectic field theory and is zero if and only if one of the orbits $\gamma^{n_1}, \ldots, \gamma^{n_{j+2}}$ is bad. For this recall from [BM] that in order to orient moduli spaces in symplectic field theory one additionally needs to choose orientations for all occurring Reeb orbits, while the resulting invariants are independent of these auxiliary choices. While it follows from our proof that when the degree is maximal we have an obstruction bundle of rank zero over a discrete non-regular moduli space, we show in Proposition 2.8 how (for $j = 1$) this obstruction bundle and hence its orientation is determined by the tangent spaces to the unstable manifolds of the multiply-covered geodesics. While the orientation of a closed Reeb orbit in SFT corresponds to an orientation of the (finite-dimensional) unstable manifold, the sign in front of $p_{n_1^k} p_{n_2^k} q_{n^k}$ ($n_1^k + n_2^k = n^k$) in $\mathbf{g}^1_{\bar\gamma,1}$ is given by comparing the orientations of the finite-dimensional linear subspaces $T W^-(\bar\gamma^2)$ and
\[ (T W^-(\bar\gamma) \oplus T W^-(\bar\gamma)) \cap \Delta = \{(v_1, v_2) \in T W^-(\bar\gamma) \oplus T W^-(\bar\gamma) : v_1(0) = v_2(0)\} \]
of $C^\infty((\bar\gamma^2)^* N)$ ($N$ is the normal bundle to $\bar\gamma$ in $Q$, see Proposition 2.8). For $j > 1$ the obstruction bundle gets much more complicated, but the idea is the same. Apart from the fact that the commutativity condition $\{\mathbf{g}^1_{\bar\gamma,j}, \mathbf{g}^1_{\bar\gamma,k}\} = 0$ clearly leads to relations between the different $\epsilon(\vec{n})$, observe that a choice of orientation for $\gamma$ does not lead to a canonical choice of orientations for its multiples $\gamma^k$. While we expect that it is in general very hard to write down a set of signs $\epsilon(\vec{n})$ explicitly, for all the geometric applications we have in mind and the educational purposes as a test model beyond the Gromov-Witten case we are rather interested in proving vanishing results such as the one above. Ignoring the sign issues, it follows that the sequence $\mathbf{g}^1_{\bar\gamma,j}$ is obtained from the sequence for the circle by removing all summands with the wrong, that is, not maximal degree, so that the system is completely determined by the KdV hierarchy and the Morse indices of the closed geodesic and its iterates. Indeed note that when the underlying Poisson algebra is graded so that the Poisson bracket is of pure degree, then one naturally gets from a Hamiltonian system with symmetries $\mathbf{h}^1_j$ a new Hamiltonian system with symmetries $[\mathbf{h}^1_j]$, where $[\mathbf{h}^1_j]$ denotes the part of $\mathbf{h}^1_j$ with maximal degree, since $\{[\mathbf{h}^1_j], [\mathbf{h}^1_k]\} = [\{\mathbf{h}^1_j, \mathbf{h}^1_k\}]$. Note that with our grading conventions the Poisson bracket is indeed of pure degree since
\[ |p_n| + |q_n| = -\mathrm{CZ}(\gamma^n) + (m-3) + \mathrm{CZ}(\gamma^n) + (m-3) = 2(m-3) \]
is independent of the multiplicity $n$, where $\mathrm{CZ}(\gamma^n)$ denotes the Conley-Zehnder index of $\gamma^n$ (Morse index of $\bar\gamma^n$). On the other hand, since $u(x) = \sum_n p_n e^{inx} + q_n e^{-inx}$ is not of pure degree, our new Hamiltonian systems with symmetries have no good translation (using inverse Fourier transform) into the formal loop space $\{u : S^1 \to \mathbb{R}^k\}$, $k = 1$, which is the classical phase space of the integrable systems of Gromov-Witten theory, see [R2]. Note that in the case of the circle $\bar\gamma = Q = S^1$ the degree condition is automatically fulfilled and we just get back the sequence of descendant Hamiltonians for the circle in [E], which agrees with the sequence of Poisson-commuting integrals of the dispersionless KdV integrable hierarchy, while in the case of a hyperbolic geodesic on a surface it follows from the multiplicativity of the Conley-Zehnder index that none of the monomials $q_{n_1} \cdot \ldots \cdot q_{n_{j+2}}$ has the right degree.

Apart from using the geometric interpretation of gravitational descendants for branched covers of orbit cylinders over a closed Reeb orbit in terms of branching conditions mentioned above, the second main ingredient for the proof is the idea in [CL] to compute the symplectic field theory of $V = S^*Q$ from the string topology of the underlying Riemannian manifold $Q$ by studying holomorphic curves in the cotangent bundle $T^*Q$. More precisely, we compute the symplectic field theory of a closed Reeb orbit $\gamma$ in $S^*Q$ including differential forms and gravitational descendants by studying branched covers of the trivial half-cylinder connecting the closed Reeb orbit in the unit cotangent bundle with the underlying closed geodesic in the cotangent bundle $T^*Q$ with special branching data, where the latter uses the geometric interpretation of gravitational descendants. In order to give a complete proof we also prove the necessary transversality theorems using finite-dimensional obstruction bundles over the underlying nonregular moduli spaces. While on the SFT side one has very complicated obstruction bundles over nonregular moduli spaces of arbitrarily large dimension, on the string side all relevant nonregular moduli spaces already turn out to be discrete, so that the obstruction bundles disappear if the Fredholm index is right. It follows that the system of Poisson-commuting functions for a closed geodesic is completely determined by the KdV hierarchy and the Morse indices of the closed geodesic and its iterates. This paper is organized as follows.
Section One is concerned with the definition and the basic results about gravitational descendants in symplectic field theory. After recalling the basic definitions of symplectic field theory in Subsect. 1.1, we define gravitational descendants in Subsect. 1.2 using the coherent collections of sections and prove that the resulting sequences of descendant Hamiltonians commute after passing to homology. In Subsect. 1.3 we prove the desired invariance statement, and in Subsect. 1.4 we discuss the important case of circle bundles in the Morse-Bott setup outlined in [E]. After treating the general case in Sect. One, Sect. Two is concerned with a concrete example beyond the case of circle bundles, the symplectic field theory of a closed geodesic, which naturally generalizes the case of the circle in [E]. After recalling the definition of symplectic field theory for a closed Reeb orbit including the results from [F2] in Subsect. 2.1, we show in Subsect. 2.2 that for branched covers of orbit cylinders the gravitational descendants have a geometric interpretation in terms of branching conditions. After outlining that there exists a version of the isomorphism in [CL] involving the symplectic field theory of a closed Reeb orbit in the unit cotangent bundle, we study the moduli space of branched covers of the corresponding trivial half-cylinder in the cotangent bundle in Subsect. 2.3. Since we meet the same transversality problems as in [F2], we study the necessary obstruction bundle setup including Banach manifolds and Banach space bundles in Subsect. 2.4. In Subsect. 2.5 we finally prove the above theorem by studying branched covers of the trivial half-cylinder with special branching behavior.

1. Symplectic Field Theory with Gravitational Descendants

1.1. Symplectic field theory. Symplectic field theory (SFT) is a very large project, initiated by Eliashberg, Givental and Hofer in their paper [EGH], designed to describe in a unified way the theory of pseudoholomorphic curves in symplectic and contact topology. Besides providing a unified view on well-known theories like symplectic Floer homology and Gromov-Witten theory, it shows how to assign algebraic invariants to closed contact manifolds $(V, \xi = \{\lambda = 0\})$: Recall that a contact one-form $\lambda$ defines a vector field $R$ on $V$ by $R \in \ker d\lambda$ and $\lambda(R) = 1$, which is called the Reeb vector field. We assume that the contact form is Morse in the sense that all closed orbits of the Reeb vector field are nondegenerate in the sense of [BEHWZ]; in particular, the set of closed Reeb orbits is discrete. The invariants are defined by counting $J$-holomorphic curves in $\mathbb{R} \times V$ which are asymptotically cylindrical over chosen collections of Reeb orbits $\Gamma^\pm = \{\gamma^\pm_1, \ldots, \gamma^\pm_{n^\pm}\}$ as the $\mathbb{R}$-factor tends to $\pm\infty$, see [BEHWZ]. The almost complex structure $J$ on the cylindrical manifold $\mathbb{R} \times V$ is required to be cylindrical in the sense that it is $\mathbb{R}$-independent, links the two natural vector fields on $\mathbb{R} \times V$, namely the Reeb vector field $R$ and the $\mathbb{R}$-direction $\partial_s$, by $J\partial_s = R$, and turns the distribution $\xi$ on $V$ into a complex subbundle of $TV$, $\xi = TV \cap JTV$. We denote by $\overline{\mathcal{M}}_{g,r}(\Gamma^+, \Gamma^-)$ the corresponding compactified moduli space of genus $g$ curves with $r$ additional marked points ([BEHWZ,EGH]). Possibly after choosing abstract perturbations using polyfolds (see [HWZ]), obstruction bundles ([F2]) or domain-dependent structures ([F1]) following the ideas in [CM] we get that $\overline{\mathcal{M}}_{g,r}(\Gamma^+, \Gamma^-)$ is a branched-labelled orbifold with boundaries and corners of dimension equal to the Fredholm index of the Cauchy-Riemann operator for $J$.
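As a quick illustration of the defining conditions $R \in \ker d\lambda$ and $\lambda(R) = 1$ of the Reeb vector field recalled at the beginning of this subsection, consider the standard contact form on $\mathbb{R}^3$. This is only a local model chosen for this illustration (the manifold is not closed and there are no closed Reeb orbits); it is not an example treated in this paper.

```latex
% Standard contact form on R^3 with coordinates (x, y, z); illustrative local model only.
\begin{align*}
\lambda &= dz - y\,dx, \qquad d\lambda = dx \wedge dy, \qquad
  \lambda \wedge d\lambda = dx \wedge dy \wedge dz \neq 0,\\
R &= \partial_z: \qquad \iota_R\, d\lambda = 0, \qquad \lambda(R) = 1,\\
\xi &= \ker \lambda = \mathrm{span}\{\partial_y,\ \partial_x + y\,\partial_z\}.
\end{align*}
```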
Note that in the same way as we will not discuss transversality for the general case but just refer to the upcoming papers on polyfolds by Hofer and his co-workers, in what follows we will for simplicity assume that every moduli space is indeed a manifold with boundaries and corners, since we expect that all the upcoming constructions can be generalized in an appropriate way. Let us now briefly introduce the algebraic formalism of SFT as described in [EGH]: Recall that a multiply-covered Reeb orbit $\gamma^k$ is called bad if $\mathrm{CZ}(\gamma^k) \neq \mathrm{CZ}(\gamma) \bmod 2$, where $\mathrm{CZ}(\gamma)$ denotes the Conley-Zehnder index of $\gamma$. Calling a Reeb orbit $\gamma$ good if it is not bad we assign to every good Reeb orbit $\gamma$ two formal graded variables $p_\gamma, q_\gamma$ with grading
\[ |p_\gamma| = m - 3 - \mathrm{CZ}(\gamma), \qquad |q_\gamma| = m - 3 + \mathrm{CZ}(\gamma) \]
when $\dim V = 2m - 1$. In order to include higher-dimensional moduli spaces we further assume that a string of closed (homogeneous) differential forms $\Theta = (\theta_1, \ldots, \theta_N)$ on $V$ is chosen and assign to every $\theta_i \in \Omega^*(V)$ formal variables $t_i$ with grading
\[ |t_i| = 2 - \deg \theta_i. \]
Finally, let $\hbar$ be another formal variable of degree $|\hbar| = 2(m-3)$. Let $\mathfrak{W}$ be the graded Weyl algebra over $\mathbb{C}$ of power series in the variables $\hbar$, $p_\gamma$ and $t_i$ with coefficients which are polynomials in the variables $q_\gamma$, which is equipped with the associative product in which all variables super-commute according to their grading except for the variables $p_\gamma$, $q_\gamma$ corresponding to the same Reeb orbit $\gamma$,
\[ [p_\gamma, q_\gamma] = p_\gamma q_\gamma - (-1)^{|p_\gamma||q_\gamma|} q_\gamma p_\gamma = \kappa_\gamma \hbar. \]
($\kappa_\gamma$ denotes the multiplicity of $\gamma$.) Following [EGH] we further introduce the Poisson algebra $\mathfrak{P}$ of formal power series in the variables $p_\gamma$ and $t_i$ with coefficients which are polynomials in the variables $q_\gamma$ with Poisson bracket given by
\[ \{f, g\} = \sum_\gamma \kappa_\gamma \left( \frac{\partial f}{\partial p_\gamma} \frac{\partial g}{\partial q_\gamma} - (-1)^{|f||g|} \frac{\partial g}{\partial p_\gamma} \frac{\partial f}{\partial q_\gamma} \right). \]
As in Gromov-Witten theory we want to organize all moduli spaces $\overline{\mathcal{M}}_{g,r}(\Gamma^+, \Gamma^-)$ into a generating function $\mathbf{H} \in \hbar^{-1}\mathfrak{W}$, called Hamiltonian. In order to include also higher-dimensional moduli spaces, in [EGH] the authors follow the approach in Gromov-Witten theory to integrate the chosen differential forms $\theta_1, \ldots, \theta_N$ over the moduli spaces after pulling them back under the evaluation map from the target manifold $V$. The Hamiltonian $\mathbf{H}$ is then defined by
\[ \mathbf{H} = \sum_{\Gamma^+, \Gamma^-} \int_{\overline{\mathcal{M}}_{g,r}(\Gamma^+, \Gamma^-)/\mathbb{R}} ev_1^*\theta_{i_1} \wedge \ldots \wedge ev_r^*\theta_{i_r}\; \hbar^{g-1}\, t^I p^{\Gamma^+} q^{\Gamma^-} \]
with $t^I = t_{i_1} \ldots t_{i_r}$, $p^{\Gamma^+} = p_{\gamma^+_1} \ldots p_{\gamma^+_{n^+}}$ and $q^{\Gamma^-} = q_{\gamma^-_1} \ldots q_{\gamma^-_{n^-}}$. Expanding
\[ \mathbf{H} = \hbar^{-1} \sum_g \mathbf{H}_g\, \hbar^g \]

we further get a rational Hamiltonian $\mathbf{h} = \mathbf{H}_0 \in \mathfrak{P}$, which counts only curves with genus zero. While the Hamiltonian $\mathbf{H}$ explicitly depends on the chosen contact form, the cylindrical almost complex structure, the differential forms and abstract polyfold perturbations making all moduli spaces regular, it is outlined in [EGH] how to construct algebraic invariants, which just depend on the contact structure and the cohomology classes of the differential forms.

1.2. Gravitational descendants. For the relation to integrable systems it is outlined in [E] that, as in Gromov-Witten theory, symplectic field theory must be enriched by considering so-called gravitational descendants of the primary Hamiltonian $\mathbf{H}$. Before we give a rigorous definition of gravitational descendants in SFT, we recall the definition from Gromov-Witten theory. Denote by $\overline{\mathcal{M}}_r = \overline{\mathcal{M}}_{g,r}(X, J)$ the compactified moduli space of closed $J$-holomorphic curves in the closed symplectic manifold $X$ of genus $g$ with $r$ marked points (and fixed homology class). Following [MDSa] we introduce over $\overline{\mathcal{M}}_r$ so-called tautological line bundles $\mathcal{L}_1, \ldots, \mathcal{L}_r$, where the fibre of $\mathcal{L}_i$ over a punctured curve $(u, z_1, \ldots, z_r) \in \mathcal{M}_r$ in the noncompactified moduli space is given by the cotangent line to the underlying, possibly unstable closed nodal Riemann surface $S$ at the $i$th marked point,
\[ (\mathcal{L}_i)_{(u, z_1, \ldots, z_r)} = T^*_{z_i} S, \qquad i = 1, \ldots, r. \]
To be more formal, observe that there exists a canonical map $\pi : \overline{\mathcal{M}}_{r+1} \to \overline{\mathcal{M}}_r$ by forgetting the $(r+1)$st marked point and stabilizing the map, where the fibre over the curve $(u, z_1, \ldots, z_r)$ agrees with the curve itself. Then the tautological line bundle $\mathcal{L}_i$ can be defined as the pull-back of the vertical cotangent line bundle of $\pi : \overline{\mathcal{M}}_{r+1} \to \overline{\mathcal{M}}_r$ under the canonical section $\sigma_i : \overline{\mathcal{M}}_r \to \overline{\mathcal{M}}_{r+1}$ mapping to the $i$th marked point in the fibre. Note that while the vertical cotangent line bundle is rather a sheaf than a true bundle since it becomes singular at the nodes in the fibres, the pull-backs under the canonical sections are indeed true line bundles as the marked points are different from the nodes and hence these sections avoid the singular loci. Denoting by $c_1(\mathcal{L}_i)$ the first Chern class of the complex line bundle $\mathcal{L}_i$, one then considers for the descendant potential of Gromov-Witten theory integrals of the form
\[ \int_{\overline{\mathcal{M}}_r} ev_1^*\theta_{i_1} \wedge c_1(\mathcal{L}_1)^{j_1} \wedge \ldots \wedge ev_r^*\theta_{i_r} \wedge c_1(\mathcal{L}_r)^{j_r}, \]

where (i k , jk ) ∈ {1, . . . , N } × N, which can again be organized into a generating function. Like pulling-back cohomology classes from the target manifold, the introduction of the tautological line bundles hence has the effect that the generating function also sees the higher-dimensional moduli spaces. On the other hand, in contrast to the former, the latter refers to partially fixing the complex structure on the underlying punctured Riemann surface. Before we can turn to the definition of gravitational descendants in SFT, it will turn out to be useful to give an alternative definition, where the integration of the powers of the first Chern classes is replaced by considering zero sets of sections. Restricting for notational simplicity to the case with one marked point, we can define by induction over j+1 j j ∈ N a nested sequence of moduli spaces M1 ⊂ M1 ⊂ M1 such that 1 ev∗ θi ∧ c1 (L) j = ev∗ θi . · j! M1j M1 For j = 1 observe that, since the first Chern class of a line bundle agrees with its Euler class, the homology class obtained by integrating c1 (L) over the compactified moduli space M1 can be represented by the zero set of a generic section s1 in L. Note that here we use that M1 represents a pseudo-cycle and hence has no codimension-one boundary strata. In other words, we find that ev∗ θi ∧ c1 (L) = ev∗ θi , 1 M1

M1

where M1 = s1−1 (0). 1

j−1

⊂ M1 . Now consider the restriction of the tautological line bundle L to M1 Instead of describing the integration of powers of the first Chern class in terms of common zero sets of sections in the same line bundle L, it turns out to be more geometric (see 2.2) to choose a section s j not in L but in its j-fold (complex) tensor product L⊗ j and define M1 = s −1 j (0) ⊂ M1 j

⊗j

Since c1 (L

.

) = j · c1 (L) it follows that ∗ ev θ = j · ev∗ θi ∧ c1 (L), i j j−1 M1

so that by induction

M1

as desired.

j−1

M1

ev∗ θi ∧ c1 (L) j =

1 ev∗ θi · j! M1j
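The induction behind the factor $1/j!$ can be unrolled explicitly. The following is only a sketch: it assumes that the zero-set argument used for $\operatorname{ev}^*\theta_i$ applies verbatim to the forms $\operatorname{ev}^*\theta_i \wedge c_1(\mathcal{L})^k$ restricted to the smaller moduli spaces, which the text uses implicitly.

```latex
\begin{align*}
\int_{\mathcal{M}_1^j} \operatorname{ev}^*\theta_i
  &= j \cdot \int_{\mathcal{M}_1^{j-1}} \operatorname{ev}^*\theta_i \wedge c_1(\mathcal{L}) \\
  &= j\,(j-1) \cdot \int_{\mathcal{M}_1^{j-2}} \operatorname{ev}^*\theta_i \wedge c_1(\mathcal{L})^2 \\
  &= \ldots \\
  &= j! \cdot \int_{\mathcal{M}_1} \operatorname{ev}^*\theta_i \wedge c_1(\mathcal{L})^j .
\end{align*}
```

Each step uses a section of the tensor power $\mathcal{L}^{\otimes k}$ on $\mathcal{M}_1^{k-1}$, whose first Chern class is $k \cdot c_1(\mathcal{L})$, so the factors $j, j-1, \ldots, 1$ accumulate to $j!$.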

Descendants in SFT


While the result of the integration is well-known to be independent of the choice of the almost complex structure and the abstract polyfold perturbations, it also follows that the result is independent of the precise choice of the sequence of sections $s_1, \ldots, s_j$. Like for the almost complex structure and the perturbations this results from the fact that the moduli spaces studied in Gromov-Witten theory have no codimension-one boundary. On the other hand, it is well-known that the moduli spaces in SFT typically have codimension-one boundary, so that now the result of the integration will not only depend on the chosen contact form, cylindrical almost complex structure and abstract polyfold perturbations, but also additionally explicitly depend on the chosen sequences of sections $s_1, \ldots, s_j$. While the Hamiltonian is hence known to depend on all extra choices, it is well-known from Floer theory that we can expect to find algebraic invariants independent of these choices. While the problem of dependency on contact form, cylindrical almost complex structure and abstract polyfold perturbations is sketched in [EGH], we will now show how to include gravitational descendants into their algebraic constructions. For this we will define descendants of moduli spaces, which we obtain as zero sets of coherent collections of sections in the tautological line bundles over all moduli spaces.

From now on let $\mathcal{M}_r$ denote the moduli space $\mathcal{M}_{g,r}(\Gamma^+, \Gamma^-)/\mathbb{R}$ studied in SFT for chosen collections of Reeb orbits $\Gamma^+, \Gamma^-$. In complete analogy to Gromov-Witten theory we can introduce $r$ tautological line bundles $\mathcal{L}_1, \ldots, \mathcal{L}_r$, where the fibre of $\mathcal{L}_i$ over a punctured curve $(u, z_1, \ldots, z_r) \in \mathcal{M}_r$ is again given by the cotangent line to the underlying, possibly unstable nodal Riemann surface (without ghost components) at the $i$th marked point and which again formally can be defined as the pull-back of the vertical cotangent line bundle of $\pi : \mathcal{M}_{r+1} \to \mathcal{M}_r$ under the canonical section $\sigma_i : \mathcal{M}_r \to \mathcal{M}_{r+1}$ mapping to the $i$th marked point in the fibre. Note again that while the vertical cotangent line bundle is rather a sheaf than a true bundle since it becomes singular at the nodes in the fibres, the pull-backs under the canonical sections are still true line bundles as the marked points are different from the nodes and hence these sections avoid the singular loci.

For notational simplicity let us again restrict to the case $r = 1$. Following the compactness statement in [BEHWZ], the codimension-one boundary of $\mathcal{M}_1$ consists of curves with two levels (in the sense of [BEHWZ]), whose moduli spaces can be represented as products $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}$ or $\mathcal{M}_{1,0} \times \mathcal{M}_{2,1}$ of moduli spaces of strictly lower dimension, where the marked point sits on the first or the second level. As we want to keep the notation as simple as possible, note that here and in what follows for product moduli spaces the first index refers to the level and not to the genus of the curve. To be more precise, after introducing asymptotic markers as in [EGH] for orientation issues, one obtains a fibre rather than a direct product, see also [F2]. However, since all the bundles and sections we will consider do or should not depend on these asymptotic markers, we will forget about this issue in order to keep the notation as simple as possible.

On the other hand, it directly follows from the definition of the tautological line bundle $\mathcal{L}$ over $\mathcal{M}_1$ that over the boundary components $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}_{2,1}$ it is given by

$$\mathcal{L}|_{\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}} = \pi_1^* \mathcal{L}_1, \qquad \mathcal{L}|_{\mathcal{M}_{1,0} \times \mathcal{M}_{2,1}} = \pi_2^* \mathcal{L}_2,$$

where $\mathcal{L}_1$, $\mathcal{L}_2$ denotes the tautological line bundle over the moduli space $\mathcal{M}_{1,1}$, $\mathcal{M}_{2,1}$ and $\pi_1$, $\pi_2$ is the projection onto the first or second factor, respectively. With this we can now introduce the notion of coherent collections of sections in (tensor products of) tautological line bundles.


Definition 1.1. Assume that we have chosen sections $s$ in the tautological line bundles $\mathcal{L}$ over all moduli spaces $\mathcal{M}_1$ of $J$-holomorphic curves with one additional marked point. Then this collection of sections $(s)$ is called coherent if for every section $s$ in $\mathcal{L}$ over a moduli space $\mathcal{M}_1$ the following holds: Over every codimension-one boundary component $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}$, $\mathcal{M}_{1,0} \times \mathcal{M}_{2,1}$ of $\mathcal{M}_1$ the section $s$ agrees with the pull-back $\pi_1^* s_1$, $\pi_2^* s_2$ of the chosen section $s_1$, $s_2$ in the tautological line bundle $\mathcal{L}_1$ over $\mathcal{M}_{1,1}$, $\mathcal{L}_2$ over $\mathcal{M}_{2,1}$, respectively.

Remark. Since in the end we will again be interested in the zero sets of these sections, we will assume that all occurring sections are transversal to the zero section. Furthermore, we want to assume that all the chosen sections are indeed invariant under the obvious symmetries like reordering of punctures and marked points. In order to meet both requirements, it follows that we actually need to employ multi-sections as in [CMS], which we however want to suppress for the rest of this exposition.

The important observation is clearly that one can always find coherent collections of (transversal) sections $(s)$ by using induction on the dimension of the underlying moduli space. While for the induction start it suffices to choose a non-vanishing section in the tautological line bundle over the moduli space of orbit cylinders with one marked point, for the induction step observe that the coherency condition fixes the section on the boundary of the moduli space. Here it is important to remark that the coherency condition further ensures that the sections prescribed on two different codimension-one boundary components actually agree on their common boundary strata of higher codimension. On the other hand, we can use our assumption that every moduli space is indeed a manifold with corners to obtain the desired section by simply extending the section from the boundary to the interior of the moduli space in an arbitrary way.
For a given coherent collection of transversal sections $(s)$ we will again define for every moduli space

$$\mathcal{M}_1^1 = s^{-1}(0) \subset \mathcal{M}_1.$$

As an immediate consequence of the above definition we find that $\mathcal{M}_1^1$ is a neat submanifold (with corners) of $\mathcal{M}_1$, i.e., the components of the codimension-one boundary of $\mathcal{M}_1^1$ are given by products $\mathcal{M}^1_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^1_{2,1}$, where $\mathcal{M}^1_{1,1} = s_1^{-1}(0)$, $\mathcal{M}^1_{2,1} = s_2^{-1}(0)$ for the section $s_1$ in $\mathcal{L}_1$ over $\mathcal{M}_{1,1}$, $s_2$ in $\mathcal{L}_2$ over $\mathcal{M}_{2,1}$, respectively. To be more precise, since we actually need to work with multi-sections rather than sections in the usual sense, the zero set is indeed a branched-labelled manifold. On the other hand, since we already suppressed the fact that our moduli spaces are indeed branched and labelled, we want to continue ignoring this technical aspect. On the other hand, we can use the above result as an induction start to obtain for every moduli space $\mathcal{M}_1$ a sequence of nested subspaces $\mathcal{M}_1^j \subset \mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ as in Gromov-Witten theory.
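The boundary description of the first descendant can be written out as a short chain of equalities. This is a sketch under the standing assumptions of the text (moduli spaces are manifolds with corners, and $s$ as well as its boundary restrictions are transversal to the zero section); the unions are understood to run over all codimension-one boundary components.

```latex
\begin{align*}
\partial \mathcal{M}_1^1
  &= s^{-1}(0) \cap \partial \mathcal{M}_1 \\
  &= \bigl(\pi_1^* s_1\bigr)^{-1}(0) \,\cup\, \bigl(\pi_2^* s_2\bigr)^{-1}(0) \\
  &= \bigl(s_1^{-1}(0) \times \mathcal{M}_{2,0}\bigr) \,\cup\, \bigl(\mathcal{M}_{1,0} \times s_2^{-1}(0)\bigr) \\
  &= \bigl(\mathcal{M}^1_{1,1} \times \mathcal{M}_{2,0}\bigr) \,\cup\, \bigl(\mathcal{M}_{1,0} \times \mathcal{M}^1_{2,1}\bigr).
\end{align*}
```

The second equality is exactly the coherency condition of Definition 1.1, and the third uses that the zero set of a pulled-back section is the preimage of the zero set under the projection.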

Definition 1.2. Let $j \in \mathbb{N}$. Assume that for all moduli spaces we have chosen $\mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ such that the components of the codimension-one boundary of $\mathcal{M}_1^{j-1}$ are given by products of the form $\mathcal{M}^{j-1}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{j-1}_{2,1}$. Then we again call a collection of transversal sections $(s_j)$ in the $j$-fold tensor products $\mathcal{L}^{\otimes j}$ of the tautological line bundles over $\mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ coherent if for every section $s_j$ the following holds: Over every codimension-one boundary component $\mathcal{M}^{j-1}_{1,1} \times \mathcal{M}_{2,0}$, $\mathcal{M}_{1,0} \times \mathcal{M}^{j-1}_{2,1}$ of $\mathcal{M}_1^{j-1}$ the section $s_j$ agrees with the pull-back $\pi_1^* s_{1,j}$, $\pi_2^* s_{2,j}$ of the section $s_{1,j}$, $s_{2,j}$ in the line bundle $\mathcal{L}_1^{\otimes j}$ over $\mathcal{M}^{j-1}_{1,1}$, $\mathcal{L}_2^{\otimes j}$ over $\mathcal{M}^{j-1}_{2,1}$, respectively.

With this we will now introduce (gravitational) descendants of moduli spaces.

Definition 1.3. Assume that we have inductively defined a subsequence of nested subspaces $\mathcal{M}_1^j \subset \mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ by requiring that $\mathcal{M}_1^j = s_j^{-1}(0) \subset \mathcal{M}_1^{j-1}$ for a coherent collection of sections $s_j$ in the line bundles $\mathcal{L}^{\otimes j}$ over the moduli spaces $\mathcal{M}_1^{j-1}$. Then we call $\mathcal{M}_1^j$ the $j$th (gravitational) descendant of $\mathcal{M}_1$.

Let $\mathfrak{W}^0$ be the graded Weyl algebra over $\mathbb{C}$ of power series in the variables $\hbar$ and $p_\gamma$ with coefficients which are polynomials in the variables $q_\gamma$, which is obtained from the big Weyl algebra $\mathfrak{W}$ by setting all variables $t_i$ equal to zero. In the same way define the subalgebra $\mathfrak{P}^0$ of the Poisson algebra $\mathfrak{P}$. Apart from the Hamiltonian $\mathbf{H}^0 \in \hbar^{-1}\mathfrak{W}^0$ counting only curves with no additional marked points,

$$\mathbf{H}^0 = \sum_{\Gamma^+, \Gamma^-} \#\mathcal{M}_{g,0}(\Gamma^+, \Gamma^-)/\mathbb{R} \;\; \hbar^{g-1} p^{\Gamma^+} q^{\Gamma^-},$$

we now want to use the chosen differential forms $\theta_i \in \Omega^*(V)$, $i = 1, \ldots, N$ and the sequences $\mathcal{M}_1^j = \mathcal{M}^j_{g,1}(\Gamma^+, \Gamma^-)/\mathbb{R}$ of gravitational descendants to define sequences of new SFT Hamiltonians $\mathbf{H}^1_{i,j} \in \hbar^{-1}\mathfrak{W}^0$, $(i, j) \in \{1, \ldots, N\} \times \mathbb{N}$, by

$$\mathbf{H}^1_{i,j} = \sum_{\Gamma^+, \Gamma^-} \int_{\mathcal{M}^j_{g,1}(\Gamma^+, \Gamma^-)/\mathbb{R}} \operatorname{ev}^*\theta_i \;\; \hbar^{g-1} p^{\Gamma^+} q^{\Gamma^-}.$$

We want to emphasize that the following statement is not yet a theorem in the strict mathematical sense as the analytical foundations of symplectic field theory, in particular, the necessary transversality theorems for the Cauchy-Riemann operator, are not yet fully established. Since it can be expected that the polyfold project by Hofer and his collaborators sketched in [HWZ] will provide the required transversality theorems, we follow other papers in the field in proving everything up to transversality and state it nevertheless as a theorem.

Theorem 1.4. Counting holomorphic curves with one marked point after integrating differential forms and introducing gravitational descendants defines a sequence of distinguished elements

$$\mathbf{H}^1_{i,j} \in H_*\bigl(\hbar^{-1}\mathfrak{W}^0, D^0\bigr)$$

in the full SFT homology algebra with differential $D^0 = [\mathbf{H}^0, \cdot] : \hbar^{-1}\mathfrak{W}^0 \to \hbar^{-1}\mathfrak{W}^0$, which commute with respect to the bracket on $H_*\bigl(\hbar^{-1}\mathfrak{W}^0, D^0\bigr)$,

$$\bigl[\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}\bigr] = 0, \quad (i, j), (k, \ell) \in \{1, \ldots, N\} \times \mathbb{N}.$$


Proof. While the boundary equation $D^0 \circ D^0 = 0$ is well-known to follow from the identity $[\mathbf{H}^0, \mathbf{H}^0] = 0$, the fact that every $\mathbf{H}^1_{i,j}$, $(i, j) \in \{1, \ldots, N\} \times \mathbb{N}$ defines an element in the homology $H_*\bigl(\hbar^{-1}\mathfrak{W}^0, D^0\bigr)$ follows from the identity

$$\bigl[\mathbf{H}^0, \mathbf{H}^1_{i,j}\bigr] = 0,$$

since this proves $\mathbf{H}^1_{i,j} \in \ker D^0$. On the other hand, in order to see that any two $\mathbf{H}^1_{i,j}$, $\mathbf{H}^1_{k,\ell}$ commute after passing to homology it suffices to prove the identity

$$\bigl[\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}\bigr] \pm \bigl[\mathbf{H}^0, \mathbf{H}^2_{(i,j),(k,\ell)}\bigr] = 0$$

for any $(i, j), (k, \ell) \in \{1, \ldots, N\} \times \mathbb{N}$, where the new Hamiltonian $\mathbf{H}^2_{(i,j),(k,\ell)}$ is defined below using descendant moduli spaces with two additional marked points.

The latter two identities directly follow from our definition of gravitational descendants of moduli spaces based on the definition of coherent sections in tautological line bundles and the compactness theorem in [BEHWZ]. Indeed, in the same way as the identity $[\mathbf{H}^0, \mathbf{H}^0] = 0$ follows from the fact that the codimension-one boundary of every moduli space $\mathcal{M}_0$ is formed by products of moduli spaces $\mathcal{M}_{1,0} \times \mathcal{M}_{2,0}$, the second identity $[\mathbf{H}^0, \mathbf{H}^1_{i,j}] = 0$ follows from the fact that the codimension-one boundary of a descendant moduli space $\mathcal{M}_1^j$ is given by products of the form $\mathcal{M}^j_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^j_{2,1}$.

In order to prove the third identity $[\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}] \pm [\mathbf{H}^0, \mathbf{H}^2_{(i,j),(k,\ell)}] = 0$ for every $(i, j), (k, \ell) \in \{1, \ldots, N\} \times \mathbb{N}$, we slightly have to enlarge our definition of gravitational descendants in order to include moduli spaces with two additional marked points. For this observe that for every pair $j, k \in \mathbb{N}$ we can define descendants $\mathcal{M}_2^{(j,k)}$ of $\mathcal{M}_2$ by setting

$$\mathcal{M}_2^{(j,k)} = \mathcal{M}_2^{(j,0)} \cap \mathcal{M}_2^{(0,k)},$$

where $\mathcal{M}_2^{(j,0)}, \mathcal{M}_2^{(0,k)} \subset \mathcal{M}_2$ are defined in the same way as $\mathcal{M}_1^j, \mathcal{M}_1^k \subset \mathcal{M}_1$ by simply forgetting the second or first additional marked point, respectively. Since the boundary of a moduli space of curves with two marked points consists of products of the form $\mathcal{M}_{1,1} \times \mathcal{M}_{2,1}$ and $\mathcal{M}_{1,0} \times \mathcal{M}_{2,2}$, $\mathcal{M}_{1,2} \times \mathcal{M}_{2,0}$, it follows that the boundary of $\mathcal{M}_2^{(j,0)}$ consists of products $\mathcal{M}^j_{1,1} \times \mathcal{M}_{2,1}$, $\mathcal{M}_{1,1} \times \mathcal{M}^j_{2,1}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{(j,0)}_{2,2}$, $\mathcal{M}^{(j,0)}_{1,2} \times \mathcal{M}_{2,0}$. Together with the similar result about the boundary of $\mathcal{M}_2^{(0,k)}$ and using the inclusions we hence obtain that the codimension-one boundary of $\mathcal{M}_2^{(j,k)}$ is given by products of the form $\mathcal{M}^j_{1,1} \times \mathcal{M}^k_{2,1}$, $\mathcal{M}^k_{1,1} \times \mathcal{M}^j_{2,1}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{(j,k)}_{2,2}$, $\mathcal{M}^{(j,k)}_{1,2} \times \mathcal{M}_{2,0}$. While summing over the first two products (with signs) we obtain $[\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}]$, summing over the latter two we get $[\mathbf{H}^0, \mathbf{H}^2_{(i,j),(k,\ell)}]$, which hence sum up to zero.
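The algebraic steps of the proof can be summarized in three lines. This is a sketch only: it assumes that $\mathbf{H}^0$ is an odd element and that the bracket is the graded commutator satisfying the graded Jacobi identity, as is standard in [EGH]; the sign $\mp$ simply mirrors the unspecified sign $\pm$ in the third identity.

```latex
\begin{align*}
D^0 \circ D^0 (f)
  &= \bigl[\mathbf{H}^0, [\mathbf{H}^0, f]\bigr]
   = \tfrac{1}{2}\,\bigl[[\mathbf{H}^0, \mathbf{H}^0], f\bigr] = 0, \\
D^0\bigl(\mathbf{H}^1_{i,j}\bigr)
  &= \bigl[\mathbf{H}^0, \mathbf{H}^1_{i,j}\bigr] = 0, \\
\bigl[\mathbf{H}^1_{i,j}, \mathbf{H}^1_{k,\ell}\bigr]
  &= \mp\, \bigl[\mathbf{H}^0, \mathbf{H}^2_{(i,j),(k,\ell)}\bigr]
   = \mp\, D^0\bigl(\mathbf{H}^2_{(i,j),(k,\ell)}\bigr).
\end{align*}
```

The first line shows that $D^0$ is a differential, the second that each $\mathbf{H}^1_{i,j}$ is a cycle, and the third that the bracket of two such cycles is a boundary and hence vanishes in homology.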

Remark. While the proof suggests that for the above algebraic relations one only has to care about the codimension-one boundary strata of the moduli spaces, it is actually even more important that the coherency condition further ensures that two different codimension-one boundary components can be glued along their common boundary strata of higher codimension.


As above we further again obtain a rational version of the above statement by expanding $\mathbf{H}^0$ and the $\mathbf{H}^1_{i,j}$ in powers of $\hbar$.

Corollary 1.5. Counting rational holomorphic curves with one marked point after integrating differential forms and introducing gravitational descendants defines a sequence of distinguished elements

$$\mathbf{h}^1_{i,j} \in H_*\bigl(\mathfrak{P}^0, d^0\bigr)$$

in the rational SFT homology algebra with differential $d^0 = \{\mathbf{h}^0, \cdot\} : \mathfrak{P}^0 \to \mathfrak{P}^0$, which commute with respect to the Poisson bracket on $H_*\bigl(\mathfrak{P}^0, d^0\bigr)$,

$$\bigl\{\mathbf{h}^1_{i,j}, \mathbf{h}^1_{k,\ell}\bigr\} = 0, \quad (i, j), (k, \ell) \in \{1, \ldots, N\} \times \mathbb{N}.$$

So far we have only considered the case with one additional marked point. On the other hand, the general case with $r$ additional marked points is just notationally more involved. Indeed, as we did in the proof of the above theorem we can easily define for every moduli space $\mathcal{M}_r$ with $r$ additional marked points and every $r$-tuple of natural numbers $(j_1, \ldots, j_r)$ descendants $\mathcal{M}_r^{(j_1, \ldots, j_r)} \subset \mathcal{M}_r$ by setting

$$\mathcal{M}_r^{(j_1, \ldots, j_r)} = \mathcal{M}_r^{(j_1, 0, \ldots, 0)} \cap \ldots \cap \mathcal{M}_r^{(0, \ldots, 0, j_r)},$$

where the descendant moduli spaces $\mathcal{M}_r^{(0, \ldots, 0, j_k, 0, \ldots, 0)} \subset \mathcal{M}_r$ are defined in the same way as the one-point descendant $\mathcal{M}_1^{j_k} \subset \mathcal{M}_1$ by looking at the $r$ tautological line bundles over the moduli space $\mathcal{M}_r = \mathcal{M}_r(\Gamma^+, \Gamma^-)/\mathbb{R}$ separately and forgetting about the other points.

With this we can define the descendant Hamiltonian of SFT, which we will continue denoting by $\mathbf{H}$, while the Hamiltonian defined in [EGH] will from now on be called primary. In order to keep track of the descendants we will assign to every chosen differential form $\theta_i$ now a sequence of formal variables $t_{i,j}$ with grading

$$|t_{i,j}| = 2(1 - j) - \deg \theta_i.$$

Then the descendant Hamiltonian $\mathbf{H}$ of SFT is defined by

$$\mathbf{H} = \sum_{\Gamma^+, \Gamma^-, I} \int_{\mathcal{M}_{g,r}^{(j_1, \ldots, j_r)}(\Gamma^+, \Gamma^-)/\mathbb{R}} \operatorname{ev}_1^*\theta_{i_1} \wedge \ldots \wedge \operatorname{ev}_r^*\theta_{i_r} \;\; \hbar^{g-1} t^I p^{\Gamma^+} q^{\Gamma^-},$$

where $p^{\Gamma^+} = p_{\gamma_1^+} \ldots p_{\gamma_{n^+}^+}$, $q^{\Gamma^-} = q_{\gamma_1^-} \ldots q_{\gamma_{n^-}^-}$ and $t^I = t_{i_1, j_1} \ldots t_{i_r, j_r}$ for $I = ((i_1, j_1), \ldots, (i_r, j_r))$. Note that expanding the Hamiltonian $\mathbf{H}$ in powers of the formal variables $t_{i,j}$,

$$\mathbf{H} = \mathbf{H}^0 + \sum_{i,j} t_{i,j} \mathbf{H}^1_{i,j} + o(t^2),$$

we get back our Hamiltonians $\mathbf{H}^0$ and the sequences of descendant Hamiltonians $\mathbf{H}^1_{i,j}$ from above and it is easy to see that the primary Hamiltonian from [EGH] is recovered by setting all formal variables $t_{i,j}$ with $j > 0$ equal to zero.
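As a small worked example of the grading convention $|t_{i,j}| = 2(1 - j) - \deg\theta_i$, the first values and the step between consecutive descendant levels are computed below. The geometric reading in the closing sentence is an interpretation offered here, not a statement taken from the text.

```latex
\begin{align*}
|t_{i,0}| &= 2 - \deg\theta_i, \\
|t_{i,1}| &= -\deg\theta_i, \\
|t_{i,j+1}| - |t_{i,j}| &= 2\bigl(1 - (j+1)\bigr) - 2(1 - j) = -2 .
\end{align*}
```

Each descendant level thus lowers the grading by two, which is consistent with $\mathcal{M}_1^{j+1} \subset \mathcal{M}_1^j$ being cut out as the zero set of a section of a complex line bundle, i.e. in real codimension two.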


In the same way as it was shown for the primary Hamiltonian in [EGH], the descendant Hamiltonian continues to satisfy the master equation $[\mathbf{H}, \mathbf{H}] = 0$, which is just a generalization of the identities for $\mathbf{H}^0$, $\mathbf{H}^1_{i,j}$ and hence can be shown along the same lines by studying the codimension-one boundaries of descendant moduli spaces. On the other hand, expanding $\mathbf{H} \in \hbar^{-1}\mathfrak{W}$ in terms of powers of $\hbar$,

$$\mathbf{H} = \sum_g \hbar^{g-1} \mathbf{H}_g,$$

note that for the rational descendant Hamiltonian $\mathbf{h} = \mathbf{H}_0 \in \mathfrak{P}$ we still have $\{\mathbf{h}, \mathbf{h}\} = 0$.

1.3. Invariance statement. We now turn to the question of independence of these nice algebraic structures from the choices like contact form, cylindrical almost complex structure, abstract polyfold perturbations and, of course, the choice of the coherent collection of sections. This is the content of the following theorem, where we however again want to emphasize that the following statement is not yet a theorem in the strict mathematical sense as the analytical foundations of symplectic field theory, in particular, the necessary transversality theorems for the Cauchy-Riemann operator, are not yet fully established.

Theorem 1.6. For different choices of contact form $\lambda^\pm$, cylindrical almost complex structure $J^\pm$, abstract polyfold perturbations and sequences of coherent collections of sections $(s_j^\pm)$ the resulting systems of commuting operators $\mathbf{H}^{1,-}_{i,j}$ on $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-}\bigr)$ and $\mathbf{H}^{1,+}_{i,j}$ on $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+}\bigr)$ are isomorphic, i.e., there exists an isomorphism of the Weyl algebras $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-}\bigr)$ and $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+}\bigr)$ which maps $\mathbf{H}^{1,-}_{i,j} \in H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-}\bigr)$ to $\mathbf{H}^{1,+}_{i,j} \in H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+}\bigr)$.

As above we clearly also get a rational version of the invariance statement:

Corollary 1.7. For different choices of contact form $\lambda^\pm$, cylindrical almost complex structure $J^\pm$, abstract polyfold perturbations and sequences of coherent collections of sections $(s_j^\pm)$ the resulting system of Poisson-commuting functions $\mathbf{h}^{1,-}_{i,j}$ on $H_*(\mathfrak{P}^{0,-}, d^{0,-})$ and $\mathbf{h}^{1,+}_{i,j}$ on $H_*(\mathfrak{P}^{0,+}, d^{0,+})$ are isomorphic, i.e., there exists an isomorphism of the Poisson algebras $H_*(\mathfrak{P}^{0,-}, d^{0,-})$ and $H_*(\mathfrak{P}^{0,+}, d^{0,+})$ which maps $\mathbf{h}^{1,-}_{i,j} \in H_*(\mathfrak{P}^{0,-}, d^{0,-})$ to $\mathbf{h}^{1,+}_{i,j} \in H_*(\mathfrak{P}^{0,+}, d^{0,+})$.

This theorem is an extension of the theorem in [EGH] which states that for different choices of auxiliary data the small Weyl algebras $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-}\bigr)$ and $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+}\bigr)$ are isomorphic.

On the other hand, assuming that the contact form, the cylindrical almost complex structure and also the abstract polyfold sections are fixed to have well-defined moduli spaces, the isomorphism of the homology algebras is the identity and hence the theorem states that the sequence of commuting operators is indeed independent of the chosen sequences of coherent collections of sections $(s_j^\pm)$,

$$\mathbf{H}^{1,-}_{i,j} = \mathbf{H}^{1,+}_{i,j} \in H_*\bigl(\hbar^{-1}\mathfrak{W}^0, D^0\bigr).$$

For the proof we have to extend the proof in [EGH] to include gravitational descendants. To this end we have to study sections in the tautological line bundles over moduli spaces of holomorphic curves in symplectic manifolds with cylindrical ends.


Let $(W, \omega)$ be a symplectic manifold with cylindrical ends $(\mathbb{R}^+ \times V^+, \lambda^+)$ and $(\mathbb{R}^- \times V^-, \lambda^-)$ in the sense of [BEHWZ] which is equipped with an almost complex structure $J$ which agrees with the cylindrical almost complex structures $J^\pm$ on $\mathbb{R}^\pm \times V^\pm$. Then we study $J$-holomorphic curves in $W$ which are asymptotically cylindrical over chosen collections of orbits $\Gamma^\pm = \{\gamma_1^\pm, \ldots, \gamma_{n^\pm}^\pm\}$ of the Reeb vector fields $R^\pm$ in $V^\pm$ as the $\mathbb{R}^\pm$-factor tends to $\pm\infty$, see [BEHWZ], and denote by $\mathcal{M}_{g,r}(\Gamma^+, \Gamma^-)$ the corresponding moduli space of genus $g$ curves with $r$ additional marked points ([BEHWZ,EGH]). Possibly after choosing abstract perturbations using polyfolds, obstruction bundles or domain-dependent structures, which agree with chosen abstract perturbations in the boundary as described above, we find that $\mathcal{M}_{g,r}(\Gamma^+, \Gamma^-)$ is a weighted branched manifold of dimension equal to the Fredholm index of the Cauchy-Riemann operator for $J$. Note that as remarked above we will for simplicity assume that every moduli space is indeed a manifold with corners, since this will be sufficient for our example and we expect that all the upcoming constructions can be generalized in an appropriate way. We further extend the chosen differential forms $\theta_1^\pm, \ldots, \theta_N^\pm$ on $V^\pm$ to differential forms $\theta_1, \ldots, \theta_N$ on $W$ as described in [EGH].

From now on let $\mathcal{M}_r$ denote the moduli space $\mathcal{M}_{g,r}(\Gamma^+, \Gamma^-)$ of holomorphic curves in $W$ for chosen collections of Reeb orbits $\Gamma^+, \Gamma^-$. Note in particular that there is no longer an $\mathbb{R}$-action on the moduli space which we have to quotient out. In order to distinguish these moduli spaces in non-cylindrical manifolds from those of holomorphic curves in the cylindrical manifolds, we will use the short-hand notation $\mathcal{M}_r^\pm$ for moduli spaces $\mathcal{M}_{g,r}(\Gamma^+, \Gamma^-)/\mathbb{R}$ of holomorphic curves in $\mathbb{R} \times V^\pm$, respectively. Like in Gromov-Witten theory we can introduce $r$ tautological line bundles $\mathcal{L}_1, \ldots, \mathcal{L}_r$, where the fibre of $\mathcal{L}_i$ over a punctured curve $(u, z_1, \ldots, z_r) \in \mathcal{M}_r$ in the noncompactified moduli space is again given by the cotangent line to the underlying closed Riemann surface at the $i$th marked point and which formally can be defined as the pull-back of the vertical cotangent line bundle under the canonical section $\sigma_i$ of $\pi : \mathcal{M}_{r+1} \to \mathcal{M}_r$ mapping to the $i$th marked point in the fibre.

For notational simplicity let us again restrict to the case $r = 1$. Following the compactness statement in [BEHWZ] the codimension-one boundary of $\mathcal{M}_1$ now consists of curves with one non-cylindrical level and one cylindrical level (in the sense of [BEHWZ]), whose moduli spaces can now be represented as products $\mathcal{M}_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^-_{1,1} \times \mathcal{M}_{2,0}$ or $\mathcal{M}_{1,0} \times \mathcal{M}^+_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}_{2,1}$ of moduli spaces of strictly lower dimension, where the marked point sits on the first or the second level. Again note that here and in what follows for product moduli spaces the first index refers to the level and not to the genus of the curve. Furthermore it follows from the definition of the tautological line bundle $\mathcal{L}$ over $\mathcal{M}_1$ that over the boundary components $\mathcal{M}_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^-_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^+_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}_{2,1}$ it is given by

$$\mathcal{L}|_{\mathcal{M}_{1,1} \times \mathcal{M}^+_{2,0}} = \pi_1^* \mathcal{L}_1, \qquad \mathcal{L}|_{\mathcal{M}_{1,0} \times \mathcal{M}^+_{2,1}} = \pi_2^* \mathcal{L}_2^+,$$

$$\mathcal{L}|_{\mathcal{M}^-_{1,1} \times \mathcal{M}_{2,0}} = \pi_1^* \mathcal{L}_1^-, \qquad \mathcal{L}|_{\mathcal{M}^-_{1,0} \times \mathcal{M}_{2,1}} = \pi_2^* \mathcal{L}_2,$$

where $\mathcal{L}_1^{(-)}$, $\mathcal{L}_2^{(+)}$ denotes the tautological line bundle over the moduli space $\mathcal{M}_{1,1}^{(-)}$, $\mathcal{M}_{2,1}^{(+)}$ and $\pi_1$, $\pi_2$ is the projection onto the first or second factor, respectively. With this we can now introduce collections of sections in (tensor products of) tautological line bundles coherently connecting two chosen coherent collections of sections.


Definition 1.8. Let $W$ be a symplectic manifold with cylindrical ends $V^\pm$ and let $(s_\pm)$ be two coherent collections of sections in the tautological line bundles $\mathcal{L}^\pm$ over all moduli spaces $\mathcal{M}_1^\pm$ of $J$-holomorphic curves with one additional marked point in the cylindrical manifolds $\mathbb{R} \times V^\pm$. Assume that we have chosen transversal sections $s$ in the tautological line bundles $\mathcal{L}$ over all moduli spaces $\mathcal{M}_1$ of $J$-holomorphic curves in the non-cylindrical manifold $W$ with one additional marked point. Then this collection of sections $(s)$ is called coherently connecting $(s_-)$ and $(s_+)$ if for every section $s$ in $\mathcal{L}$ over a moduli space $\mathcal{M}_1$ the following holds: Over every codimension-one boundary component $\mathcal{M}_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^-_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^+_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}_{2,1}$ of $\mathcal{M}_1$ the section $s$ agrees with the pull-back $\pi_1^* s_1$, $\pi_1^* s_{1,-}$ or $\pi_2^* s_{2,+}$, $\pi_2^* s_2$ of the chosen sections $s_{1,(-)}$, $s_{2,(+)}$ in the tautological line bundles $\mathcal{L}_1^{(-)}$ over $\mathcal{M}_{1,1}^{(-)}$, $\mathcal{L}_2^{(+)}$ over $\mathcal{M}_{2,1}^{(+)}$, respectively.

Note that one can always find collections of sections $(s)$ coherently connecting given coherent collections of sections $(s_+)$ and $(s_-)$ as before by using induction on the dimension of the underlying moduli space. Indeed, for the induction step observe that the coherency condition again fixes the section on the boundary of the moduli space, so that the desired section can be obtained by simply extending the section from the boundary to the interior of the moduli space in an arbitrary way.

For a given coherently connecting collection of sections $(s)$ we will again define for every moduli space

$$\mathcal{M}_1^1 = s^{-1}(0) \subset \mathcal{M}_1.$$

As an immediate consequence of the above definition we find that the components of the codimension-one boundary of $\mathcal{M}_1^1$ are given by products $\mathcal{M}^1_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^{1,-}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{1,+}_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}^1_{2,1}$, where $\mathcal{M}^{1,(-)}_{1,1} = s_{1,(-)}^{-1}(0)$, $\mathcal{M}^{1,(+)}_{2,1} = s_{2,(+)}^{-1}(0)$ for the section $s_{1,(-)}$ in $\mathcal{L}_1^{(-)}$ over $\mathcal{M}_{1,1}^{(-)}$, $s_{2,(+)}$ in $\mathcal{L}_2^{(+)}$ over $\mathcal{M}_{2,1}^{(+)}$, respectively. As before we can use this result as an induction start to obtain for every moduli space $\mathcal{M}_1$ a sequence of nested subspaces $\mathcal{M}_1^j \subset \mathcal{M}_1^{j-1} \subset \mathcal{M}_1$.

Definition 1.9. Let $j \in \mathbb{N}$ and let $(s_{j,\pm})$ be two coherent collections of sections in the $j$-fold tensor products $\mathcal{L}^{\pm, \otimes j}$ of the tautological line bundles over the $(j-1)$st gravitational descendants $\mathcal{M}_1^{j-1,\pm} \subset \mathcal{M}_1^\pm$ of all moduli spaces of curves in the cylindrical manifolds $\mathbb{R} \times V^\pm$. Assume that for all moduli spaces of curves in the non-cylindrical manifold $W$ we have chosen $\mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ such that the components of the codimension-one boundary of $\mathcal{M}_1^{j-1}$ are given by products of the form $\mathcal{M}^{j-1}_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^{j-1,-}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{j-1,+}_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}^{j-1}_{2,1}$. Then we again call a collection of transversal sections $(s_j)$ in the $j$-fold tensor products $\mathcal{L}^{\otimes j}$ of the tautological line bundles over $\mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ coherently connecting $(s_{j,-})$ and $(s_{j,+})$ if for every section $s_j$ the following holds: Over every codimension-one boundary component $\mathcal{M}^{j-1}_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^{j-1,-}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{j-1,+}_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}^{j-1}_{2,1}$ of $\mathcal{M}_1^{j-1}$ the section $s_j$ agrees with the pull-back $\pi_1^* s_{1,j}$, $\pi_1^* s_{1,j,-}$ or $\pi_2^* s_{2,j,+}$, $\pi_2^* s_{2,j}$ of the section $s_{1,j,(-)}$, $s_{2,j,(+)}$ in the line bundle $\mathcal{L}_1^{(-), \otimes j}$ over $\mathcal{M}_{1,1}^{j-1,(-)}$, $\mathcal{L}_2^{(+), \otimes j}$ over $\mathcal{M}_{2,1}^{j-1,(+)}$, respectively.


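With this we can now introduce gravitational descendants of moduli spaces for symplectic manifolds with cylindrical ends.

Definition 1.10. Assume that we have inductively defined a subsequence of nested subspaces $\mathcal{M}_1^j \subset \mathcal{M}_1^{j-1} \subset \mathcal{M}_1$ by requiring that $\mathcal{M}_1^j = s_j^{-1}(0) \subset \mathcal{M}_1^{j-1}$ for a collection of sections $s_j$ in the line bundles $\mathcal{L}^{\otimes j}$ over the moduli spaces $\mathcal{M}_1^{j-1}$ coherently connecting the coherent collections of sections $(s_{j,-})$ and $(s_{j,+})$. Then we call $\mathcal{M}_1^j$ the $j$th (gravitational) descendant of $\mathcal{M}_1$.

In order to prove the above invariance theorem we now recall the extension of the algebraic formalism of SFT from cylindrical manifolds to symplectic cobordisms with cylindrical ends as described in [EGH]. Let $\mathfrak{D}^0$ be the space of formal power series in the variables $\hbar$, $p^+_\gamma$ with coefficients which are polynomials in the variables $q^-_\gamma$. Elements in $\mathfrak{W}^{0,\pm}$ then act as differential operators from the right/left on $\mathfrak{D}^0$ via the replacements

$$q^+_\gamma \mapsto \kappa_\gamma \hbar \overleftarrow{\frac{\partial}{\partial p^+_\gamma}}, \qquad p^-_\gamma \mapsto \kappa_\gamma \hbar \overrightarrow{\frac{\partial}{\partial q^-_\gamma}}.$$

Apart from the potential $\mathbf{F}^0 \in \hbar^{-1}\mathfrak{D}^0$ counting only curves in $W$ with no additional marked points,

$$\mathbf{F}^0 = \sum_{\Gamma^+, \Gamma^-} \#\mathcal{M}_{g,0}(\Gamma^+, \Gamma^-) \;\; \hbar^{g-1} p^{\Gamma^+} q^{\Gamma^-},$$

we now want to use the extensions $\theta_i$, $i = 1, \ldots, N$ on $W$ of the chosen differential forms $\theta_1^\pm, \ldots, \theta_N^\pm$ on $V^\pm$ and these sequences $\mathcal{M}_1^j = \mathcal{M}^j_{g,1}(\Gamma^+, \Gamma^-)$ of gravitational descendants to define sequences of new SFT potentials $\mathbf{F}^1_{i,j}$, $(i, j) \in \{1, \ldots, N\} \times \mathbb{N}$, by

$$\mathbf{F}^1_{i,j} = \sum_{\Gamma^+, \Gamma^-} \int_{\mathcal{M}^j_{g,1}(\Gamma^+, \Gamma^-)} \operatorname{ev}^*\theta_i \;\; \hbar^{g-1} p^{\Gamma^+} q^{\Gamma^-}.$$

For the potential counting curves with no additional marked points we have the following identity, which should again be understood as a theorem up to the transversality problem in SFT.

Theorem ([EGH]). The potential $\mathbf{F}^0 \in \hbar^{-1}\mathfrak{D}^0$ satisfies the master equation

$$e^{\mathbf{F}^0} \overleftarrow{\mathbf{H}^{0,+}} - \overrightarrow{\mathbf{H}^{0,-}} e^{\mathbf{F}^0} = 0.$$

In [EGH] it is shown that this implies that

$$D^{\mathbf{F}^0} : \hbar^{-1}\mathfrak{D}^0 \to \hbar^{-1}\mathfrak{D}^0, \qquad D^{\mathbf{F}^0} g = e^{-\mathbf{F}^0} \overrightarrow{\mathbf{H}^{0,-}} g e^{\mathbf{F}^0} - (-1)^{|g|} g e^{\mathbf{F}^0} \overleftarrow{\mathbf{H}^{0,+}} e^{-\mathbf{F}^0}$$

satisfies $D^{\mathbf{F}^0} \circ D^{\mathbf{F}^0} = 0$, and hence can be used to define the homology algebra $H_*\bigl(\hbar^{-1}\mathfrak{D}^0, D^{\mathbf{F}^0}\bigr)$. Furthermore it is shown that the maps

$$F^{0,-} : \hbar^{-1}\mathfrak{W}^{0,-} \to \hbar^{-1}\mathfrak{D}^0, \quad f \mapsto e^{-\mathbf{F}^0} \overrightarrow{f} e^{+\mathbf{F}^0},$$

$$F^{0,+} : \hbar^{-1}\mathfrak{W}^{0,+} \to \hbar^{-1}\mathfrak{D}^0, \quad f \mapsto e^{+\mathbf{F}^0} \overleftarrow{f} e^{-\mathbf{F}^0}$$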


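commute with the boundary operators,

$$F^{0,\pm} \circ D^{0,\pm} = D^{\mathbf{F}^0} \circ F^{0,\pm},$$

and hence descend to maps between the homology algebras

$$F^{0,\pm}_* : H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,\pm}, D^{0,\pm}\bigr) \to H_*\bigl(\hbar^{-1}\mathfrak{D}^0, D^{\mathbf{F}^0}\bigr).$$

Now assume that the contact forms $\lambda^+$ and $\lambda^-$ are chosen such that they define the same contact structure $(V^+, \xi^+) = (V^-, \xi^-) =: (V, \xi)$ and let $W = \mathbb{R} \times V$ be the topologically trivial cobordism. Then in [EGH] the authors prove (up to transversality) the following fundamental theorem.

Theorem ([EGH]). The map

$$\bigl(F^{0,+}_*\bigr)^{-1} \circ F^{0,-}_* : H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,-}, D^{0,-}\bigr) \to H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,+}, D^{0,+}\bigr)$$

is an isomorphism of graded Weyl algebras.

For the proof of the invariance statement we want to show that this map identifies the sequences $\mathbf{H}^{1,\pm}_{i,j}$, $(i, j) \in \{1, \ldots, N\} \times \mathbb{N}$ on $H_*\bigl(\hbar^{-1}\mathfrak{W}^{0,\pm}, D^{0,\pm}\bigr)$. In order to get the right idea for the proof, it turns out to be useful to even enlarge the picture as follows.

Precisely in the same way as for cylindrical manifolds we can define for every tuple $(j_1, \ldots, j_r)$ of natural numbers gravitational descendants $\mathcal{M}_r^{(j_1, \ldots, j_r)} \subset \mathcal{M}_r$ of moduli spaces of curves in non-cylindrical manifolds with more than one additional marked point, which are collected in the descendant potential $\mathbf{F} \in \hbar^{-1}\mathfrak{D}$, where $\mathfrak{D}$ is again obtained from $\mathfrak{D}^0$ by considering coefficients which are formal power series in the graded formal variables $t_{i,j}$, $(i, j) \in \{1, \ldots, N\} \times \mathbb{N}$.

Assuming for the moment that we have proven the fundamental identity

$$e^{\mathbf{F}} \overleftarrow{\mathbf{H}^+} - \overrightarrow{\mathbf{H}^-} e^{\mathbf{F}} = 0$$

and expanding the potential $\mathbf{F} \in \hbar^{-1}\mathfrak{D}$ and the two Hamiltonians $\mathbf{H}^\pm \in \hbar^{-1}\mathfrak{W}^\pm$ in powers of the $t$-variables,

$$\mathbf{F} = \mathbf{F}^0 + \sum_{i,j} t_{i,j} \mathbf{F}^1_{i,j} + o(t^2), \qquad \mathbf{H}^\pm = \mathbf{H}^{0,\pm} + \sum_{i,j} t_{i,j} \mathbf{H}^{1,\pm}_{i,j} + o(t^2),$$

we can deduce besides the master equation for $\mathbf{F}^0$,

$$e^{\mathbf{F}^0} \overleftarrow{\mathbf{H}^{0,+}} - \overrightarrow{\mathbf{H}^{0,-}} e^{\mathbf{F}^0} = 0,$$

and other identities also the identity

$$e^{\mathbf{F}^0} \overleftarrow{\mathbf{H}^{1,+}_{i,j}} - \overrightarrow{\mathbf{H}^{1,-}_{i,j}} e^{\mathbf{F}^0} = \overrightarrow{\mathbf{H}^{0,-}} e^{\mathbf{F}^0} \mathbf{F}^1_{i,j} - e^{\mathbf{F}^0} \mathbf{F}^1_{i,j} \overleftarrow{\mathbf{H}^{0,+}}$$

about $\mathbf{F}^0$, $\mathbf{F}^1_{i,j}$ and $\mathbf{H}^{0,\pm}$, $\mathbf{H}^{1,\pm}_{i,j}$, where we used that

$$e^{\mathbf{F}} = e^{\mathbf{F}^0} \cdot \Bigl(1 + \sum_{i,j} t_{i,j} \mathbf{F}^1_{i,j}\Bigr) + o(t^2).$$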
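The first-order identity can be read off by collecting powers of $t$ in the fundamental identity. The display below is a sketch of that bookkeeping; it assumes that the signs arising from commuting the graded variables $t_{i,j}$ past the other factors can be suppressed, as the text does.

```latex
\begin{align*}
0 &= e^{\mathbf{F}}\overleftarrow{\mathbf{H}^+} - \overrightarrow{\mathbf{H}^-}e^{\mathbf{F}} \\
  &= \Bigl(e^{\mathbf{F}^0}\overleftarrow{\mathbf{H}^{0,+}}
        - \overrightarrow{\mathbf{H}^{0,-}}e^{\mathbf{F}^0}\Bigr) \\
  &\quad + \sum_{i,j} t_{i,j}\Bigl(
        e^{\mathbf{F}^0}\overleftarrow{\mathbf{H}^{1,+}_{i,j}}
      + e^{\mathbf{F}^0}\mathbf{F}^1_{i,j}\overleftarrow{\mathbf{H}^{0,+}}
      - \overrightarrow{\mathbf{H}^{1,-}_{i,j}}e^{\mathbf{F}^0}
      - \overrightarrow{\mathbf{H}^{0,-}}e^{\mathbf{F}^0}\mathbf{F}^1_{i,j}
      \Bigr) + o(t^2).
\end{align*}
```

The $t$-independent bracket vanishes by the master equation for $\mathbf{F}^0$, and setting the coefficient of each $t_{i,j}$ to zero and moving the two terms containing $\mathbf{F}^1_{i,j}$ to the other side gives the identity relating $\mathbf{H}^{1,\pm}_{i,j}$ and $\mathbf{F}^1_{i,j}$.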


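Proof of the theorem. Instead of proving the master equation for the full descendant potential $\mathbf{F}$, we first show that it suffices to prove

$$e^{\mathbf{F}^0} \overleftarrow{\mathbf{H}^{1,+}_{i,j}} - \overrightarrow{\mathbf{H}^{1,-}_{i,j}} e^{\mathbf{F}^0} = \overrightarrow{\mathbf{H}^{0,-}} e^{\mathbf{F}^0} \mathbf{F}^1_{i,j} - e^{\mathbf{F}^0} \mathbf{F}^1_{i,j} \overleftarrow{\mathbf{H}^{0,+}}.$$

Indeed, it is easy to see that the desired identity implies that

$$F^{0,+}\bigl(\mathbf{H}^{1,+}_{i,j}\bigr) - F^{0,-}\bigl(\mathbf{H}^{1,-}_{i,j}\bigr) = e^{+\mathbf{F}^0} \overleftarrow{\mathbf{H}^{1,+}_{i,j}} e^{-\mathbf{F}^0} - e^{-\mathbf{F}^0} \overrightarrow{\mathbf{H}^{1,-}_{i,j}} e^{+\mathbf{F}^0}$$

is equal to

$$e^{-\mathbf{F}^0} \overrightarrow{\mathbf{H}^{0,-}} e^{+\mathbf{F}^0} \mathbf{F}^1_{i,j} - e^{+\mathbf{F}^0} \mathbf{F}^1_{i,j} \overleftarrow{\mathbf{H}^{0,+}} e^{-\mathbf{F}^0} = D^{\mathbf{F}^0}\bigl(\mathbf{F}^1_{i,j}\bigr),$$

so that, after passing to homology, we have

$$F^{0,+}_*\bigl(\mathbf{H}^{1,+}_{i,j}\bigr) = F^{0,-}_*\bigl(\mathbf{H}^{1,-}_{i,j}\bigr) \in H_*\bigl(\hbar^{-1}\mathfrak{D}^0, D^{\mathbf{F}^0}\bigr)$$

as desired.

On the other hand, the above identity directly follows from our definition of gravitational descendants of moduli spaces based on the definition of coherently connecting sections in tautological line bundles and the compactness theorem in [BEHWZ]. Indeed, in the same way as it is shown in [EGH] that the master equation for $\mathbf{F}^0$ and $\mathbf{H}^{0,\pm}$ follows from the fact that the codimension-one boundary of every moduli space $\mathcal{M}_0$ is formed by products of moduli spaces $\mathcal{M}_{1,0} \times \mathcal{M}^+_{2,0}$ and $\mathcal{M}^-_{1,0} \times \mathcal{M}_{2,0}$, the desired identity relating $\mathbf{F}^0$, $\mathbf{F}^1_{i,j}$ and $\mathbf{H}^{0,\pm}$, $\mathbf{H}^{1,\pm}_{i,j}$ can be seen to follow from the fact that the codimension-one boundary of a descendant moduli space $\mathcal{M}_1^j$ is given by products of the form $\mathcal{M}^j_{1,1} \times \mathcal{M}^+_{2,0}$, $\mathcal{M}^{j,-}_{1,1} \times \mathcal{M}_{2,0}$ and $\mathcal{M}_{1,0} \times \mathcal{M}^{j,+}_{2,1}$, $\mathcal{M}^-_{1,0} \times \mathcal{M}^j_{2,1}$: While the two summands involving $\mathbf{F}^0$ and $\mathbf{H}^{1,-}_{i,j}$, $\mathbf{H}^{1,+}_{i,j}$ on the left-hand side of the equation collect all boundary components of the form $\mathcal{M}^{j,-}_{1,1} \times \mathcal{M}_{2,0}$, $\mathcal{M}_{1,0} \times \mathcal{M}^{j,+}_{2,1}$, the two summands involving $\mathbf{F}^1_{i,j}$ and $\mathbf{H}^{0,-}$, $\mathbf{H}^{0,+}$ on the right-hand side of the equation collect all boundary components of the form $\mathcal{M}^-_{1,0} \times \mathcal{M}^j_{2,1}$, $\mathcal{M}^j_{1,1} \times \mathcal{M}^+_{2,0}$, respectively. Note that as for the master equation for $\mathbf{F}^0$ and $\mathbf{H}^{0,\pm}$ the appearance of $\mathbf{F}^0$ in the exponential follows from the fact that the corresponding curves may appear with an arbitrary number of connected components, while the curves counted in $\mathbf{H}^{0,\pm}$, $\mathbf{H}^{1,\pm}_{i,j}$, $\mathbf{F}^1_{i,j}$ can only appear once due to index reasons or since there is just one additional marked point.

Finally, in order to see why we actually have $\mathbf{H}^{1,-}_{i,j} = \mathbf{H}^{1,+}_{i,j}$ on homology if we fixed $\lambda^- = \lambda^+ = \lambda$, $J^- = J^+ = J$ and the abstract polyfold perturbations to have well-defined moduli spaces, observe that in this case $\mathbf{F}^0$ just counts orbit cylinders, so that $F^{0,\pm}$ and hence $F^{0,\pm}_*$ is the identity.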

O. Fabert

1.4. The circle bundle case. In this subsection we briefly discuss the important case of circle bundles over closed symplectic manifolds, which links our constructions to gravitational descendants in Gromov-Witten theory, see also [R]. For this recall that to any closed symplectic manifold (M, ω) with integral symplectic form $[\omega] \in H^2(M, \mathbb{Z})$ one can canonically assign a principal circle bundle π : V → M over (M, ω) by requiring that $c_1(V) = [\omega]$. Furthermore, it is easy to see that an $S^1$-connection form λ with curvature ω on π : V → M is a contact form on the total space V, where the underlying contact structure agrees with the corresponding horizontal plane field ξ = ker λ, while the Reeb vector field R agrees with the infinitesimal generator of the $S^1$-action. Observe that an ω-compatible almost complex structure J on M naturally equips $\mathbb{R} \times V$ with a cylindrical almost complex structure, which maps the Reeb vector field to the $\mathbb{R}$-direction and agrees with J on the horizontal plane field ξ, which is naturally identified with TM. Since every fibre of the circle bundle is hence a closed Reeb orbit for the contact form λ, it follows that the space of orbits is given by $M \times \mathbb{N}$, where the second factor just refers to the multiplicity of the orbit. Hence, while no contact form in this class is Morse unless the symplectic manifold is a point, it is still of Morse-Bott type. Following [EGH] the Weyl algebra $W^0$ in this Morse-Bott case is now generated by sequences of graded formal variables $p_{\alpha,k}$, $q_{\alpha,k}$, $k \in \mathbb{N}$, assigned to cohomology classes α forming a basis of $H^*(M, \mathbb{Z})$. For circle bundles in the Morse-Bott setup we now show that the general theorem from above leads to the following stronger statement. Note that in the following theorem we do not assume that the sequences of coherent collections of sections are necessarily $S^1$-invariant.

Theorem 1.11. For circle bundles over symplectic manifolds, which are equipped with $S^1$-invariant contact forms, cylindrical almost complex structures (and abstract polyfold perturbations) as described above, the descendant Hamiltonians $H^1_{i,j}$ define a sequence of commuting operators on $W^0$, which is independent of the auxiliary data.

Proof. Observe that a map $\tilde u : (\dot S, j) \to (\mathbb{R} \times V, J)$ from a punctured Riemann sphere to the cylindrical manifold $\mathbb{R} \times V$, which is equipped with the canonical cylindrical almost complex structure defined by the ω-compatible almost complex structure J on M, can be viewed as a tuple (h, u), where $u : (\dot S, j) \to (M, J)$ is a J-holomorphic curve in M and h is a holomorphic section in $\mathbb{R} \times u^*V \to \dot S$. It is then easy to see that every moduli space studied in SFT for the contact manifold V carries a natural circle bundle structure after quotienting out the natural $\mathbb{R}$-action. It follows that $D^0 = 0$, so that by our first theorem the $H^1_j$ already commute as elements in $W^0$. On the other hand, as long as the two different collections of auxiliary structures for V are actually obtained as pull-backs of the corresponding auxiliary structures on M, it follows in the same way that the only rigid holomorphic curves in the resulting cobordisms are the orbit cylinders, so that the resulting automorphism is indeed the identity.

For $S^1$ and $S^3$ Eliashberg already pointed out in his ICM 2006 talk, see [E], that the corresponding sequences $h^1_j$ counting only genus zero curves lead to classical integrable systems, while the sequences of commuting operators $H^1_j$ provide deformation quantizations for these hierarchies. This is based on the surprising fact that the sequence $h^1_j$ of Poisson-commuting functions actually agrees with the integrable system for genus zero from Gromov-Witten theory obtained using the underlying Frobenius manifold

Descendants in SFT

structure. In particular, for $V = S^1$ it follows that the resulting Poisson-commuting functions are precisely the commuting integrals of the dispersionless KdV hierarchy,

$$h^1_j = \oint_{S^1} \frac{u^{j+2}(x)}{(j+2)!}\,dx, \qquad u(x) = \sum_{n \in \mathbb{N}} p_n e^{+2\pi i n x} + q_n e^{-2\pi i n x},$$

while in the case of the Hopf fibration $V = S^3$ over $M = S^2$ one arrives at the Poisson-commuting integrals of the continuous limit of the Toda lattice. In order to see why in genus zero the SFT of the circle bundle V is so closely related to the Gromov-Witten theory of its symplectic base M, we recall from the proof of the theorem that every J-holomorphic curve $\tilde u$ can be identified with a tuple (h, u), where u is a J-holomorphic curve in M and h is a holomorphic section in $\mathbb{R} \times u^*V \to \dot S$, whose poles and zeroes correspond to the positive and negative punctures with multiplicities. Since the zeroth Picard group of $S^2$ is trivial and hence every degree zero divisor is indeed a principal divisor, it follows that for every map u the space of sections is isomorphic to $\mathbb{C}$ and hence that the SFT moduli space of J-holomorphic curves in $\mathbb{R} \times V$ is indeed a circle bundle over the corresponding Gromov-Witten moduli space of J-holomorphic curves in M. While this explains the close relation of SFT of circle bundles and Gromov-Witten theory in the genus zero case, the non-triviality of the Picard group for nonzero genus implies that the relation gets much more obscure when we allow for curves of arbitrary genus. Indeed, in the case of $V = S^1$ the sequence $H^1_j$ defined by counting curves of arbitrary genus in $\mathbb{R} \times V$ leads to the deformation quantization of the dispersionless KdV hierarchy, in particular to a quantum integrable system. In contrast, counting curves of all genera in the underlying symplectic manifold, that is, the point, leads by Witten's conjecture, as proven by Kontsevich, to the classical integrable system given by the full KdV hierarchy. At the end of this subsection we again want to emphasize that the above statement crucially relies on the fact that V is equipped with an $S^1$-invariant contact form, cylindrical almost complex structure and abstract polyfold perturbations.
Assuming for the moment that the sequences of coherent collections of sections are also chosen to be $S^1$-invariant, note that in this case the above invariance statement can directly be deduced from the independence of the descendant Gromov-Witten potential of the auxiliary data used to define it, which essentially relies on the fact that all moduli spaces have only boundary components of codimension greater than or equal to two, so that absolute rather than relative virtual classes are defined. In particular, the gravitational descendants can be defined by integrating powers of the first Chern class over the absolute moduli cycle. On the other hand, recall that for the above theorem we did not require that the sequences of coherent collections of sections are necessarily $S^1$-invariant. While our definition of coherent collections of sections seems to be very weak, our above theorem shows that the nice invariance property continues to hold even for a larger class of sections.

2. Example: Symplectic Field Theory of Closed Geodesics

2.1. Symplectic field theory of a single Reeb orbit. We are now going to consider a concrete example, which actually formed the starting point for the formal discussion from above. As above consider a closed contact manifold V with chosen contact form $\lambda \in \Omega^1(V)$ and let J be a compatible cylindrical almost complex structure on $\mathbb{R} \times V$. For any closed


orbit γ of the corresponding Reeb vector field R on V the orbit cylinder $\mathbb{R} \times \gamma$ together with its branched covers are the basic examples of J-holomorphic curves in $\mathbb{R} \times V$. In [F2] we prove that these orbit curves do not contribute to the algebraic invariants of symplectic field theory as long as they do not carry additional marked points. Our proof explicitly uses that the orbit curves (over a fixed orbit) are closed under taking boundaries and gluing, which follows from the fact that orbit curves are also trivial in the sense that they have trivial contact area and that this contact area is preserved under taking boundaries and gluing. In particular, it follows, see [F2], that every algebraic invariant of symplectic field theory has a natural analog defined by counting only orbit curves. Further specifying the underlying Reeb orbit, let us hence introduce the symplectic field theory of the Reeb orbit γ: For this denote by $W^0_\gamma$ the graded Weyl subalgebra of the Weyl algebra $W^0$, which is generated only by those p- and q-variables $p_n = p_{\gamma^n}$, $q_n = q_{\gamma^n}$ corresponding to Reeb orbits which are multiple covers of the fixed orbit γ and which are good in the sense of [BM]. In the same way we further introduce the Poisson subalgebra $P^0_\gamma$ of $P^0$. It will become important that the natural identification of the formal variables $p_n$ and $q_n$ does not lead to an isomorphism of the graded algebras $W^0_\gamma$ and $P^0_\gamma$ with the corresponding graded algebras $W^0_{S^1}$ and $P^0_{S^1}$ for $\gamma = V = S^1$, not only because the gradings of $p_n$ and $q_n$ are different, so that even the commutation rules may change, but also because some variables $p_n$ and $q_n$ may be missing since they would correspond to bad orbits.

In the same way as we introduced the (rational) Hamiltonians $H^0$ and $h^0$ as well as sequences of descendant Hamiltonians $H^1_j$ and $h^1_j$ by counting general curves in the symplectization of a contact manifold, we can define distinguished elements $H^0_\gamma \in \hbar^{-1}W^0_\gamma$ and $h^0_\gamma \in P^0_\gamma$, as well as sequences of descendant Hamiltonians $H^1_{\gamma,j}$ and $h^1_{\gamma,j}$ by just counting branched covers of the orbit cylinder over γ with signs (and weights), where the preservation of the contact area under splitting and gluing of curves proves that for every theorem from above we have a version for γ. For the general part described above we have already emphasized that the theorems are not yet theorems in the strict mathematical sense, since the necessary transversality theorems for the Cauchy-Riemann operator are part of the on-going polyfold project by Hofer and his collaborators and we further used the assumption that all occurring moduli spaces are manifolds with corners. For the rest of this paper we will restrict to the rational case, i.e., we will only be interested in the Poisson-commuting sequences $h^1_{\gamma,j}$ on $H_*(P^0_\gamma, d^0_\gamma)$, but in return solve the occurring analytical problems in full detail. In particular, we have already proven in the paper [F2] that for (rational) orbit curves the transversality problem can indeed be solved using finite-dimensional obstruction bundles instead of infinite-dimensional polybundles. In order to see why this is even necessary, observe that while in the case when $\gamma = V = S^1$ the Fredholm index equals the dimension of the moduli space, for general γ ⊂ V the Fredholm index of a true branched cover is in general strictly smaller than the dimension of the moduli space of branched covers, so that transversality for the Cauchy-Riemann operator can in general not be satisfied. So let us recall the main results about obstruction bundle transversality for orbit curves, where we refer to [F2] for all details.

The first observation for orbit curves is that the cokernels of the linearized Cauchy-Riemann operators indeed fit together to give a smooth vector bundle $\operatorname{Coker} \bar\partial_J$ (of constant rank) over the compactified (nonregular) moduli spaces $\overline{\mathcal{M}}$ of orbit curves. It follows that every transversal section $\bar\nu$ of this cokernel bundle leads to a compact perturbation making the Cauchy-Riemann operator transversal to the zero section in the underlying polyfold setup.


In Gromov-Witten theory we would hence obtain the contribution of the regular perturbed moduli space by integrating the Euler class of the finite-dimensional obstruction bundle over the compactified moduli space. On the other hand, passing from Gromov-Witten theory back to symplectic field theory again, we see that we just arrive at the same problem we had to face when we wanted to define gravitational descendants in symplectic field theory. Indeed, as for the tautological line bundles, the presence of codimension-one boundary of the (nonregular) moduli spaces of branched covers implies that Euler numbers for sections in the cokernel bundles are not defined in general, since the count of zeroes depends on the compact perturbations chosen for the moduli spaces in the boundary. Instead of looking at a single moduli space, we hence again have to consider all moduli spaces at once. Replacing the tautological line bundle L by the cokernel bundle $\operatorname{Coker} \bar\partial_J$ and considering the nonregular moduli space of branched covers instead of the regular moduli space itself, we hence now define coherent collections of sections in the obstruction bundles $\operatorname{Coker} \bar\partial_J$ over all moduli spaces $\overline{\mathcal{M}}$ as follows. Following the compactness statement in [BEHWZ] for the contact manifold $S^1$, the codimension-one boundary of every moduli space of branched covers $\overline{\mathcal{M}}$ again consists of curves with two levels (in the sense of [BEHWZ]), whose moduli spaces can be represented as products $\mathcal{M}_1 \times \mathcal{M}_2$ of moduli spaces of strictly lower dimension, where the first index again refers to the level. On the other hand, it follows from the linear gluing result in [F2] that over the boundary component $\mathcal{M}_1 \times \mathcal{M}_2$ the cokernel bundle $\operatorname{Coker} \bar\partial_J$ is given by

$$\operatorname{Coker} \bar\partial_J|_{\mathcal{M}_1 \times \mathcal{M}_2} = \pi_1^* \operatorname{Coker} \bar\partial^1_J \oplus \pi_2^* \operatorname{Coker} \bar\partial^2_J,$$

where $\operatorname{Coker} \bar\partial^1_J$, $\operatorname{Coker} \bar\partial^2_J$ denotes the cokernel bundle over the moduli space $\mathcal{M}_1$, $\mathcal{M}_2$ and $\pi_1$, $\pi_2$ is the projection onto the first or second factor, respectively.

Assuming that we have chosen sections $\bar\nu$ in the cokernel bundles $\operatorname{Coker} \bar\partial_J$ over all moduli spaces $\overline{\mathcal{M}}$ of branched covers, we again call this collection of sections $(\bar\nu)$ coherent if over every codimension-one boundary component $\mathcal{M}_1 \times \mathcal{M}_2$ of a moduli space $\overline{\mathcal{M}}$ the corresponding section $\bar\nu$ agrees with the pull-back $\pi_1^* \bar\nu^1 \oplus \pi_2^* \bar\nu^2$ of the chosen sections $\bar\nu^1$, $\bar\nu^2$ in the cokernel bundles $\operatorname{Coker} \bar\partial^1_J$ over $\mathcal{M}_1$, $\operatorname{Coker} \bar\partial^2_J$ over $\mathcal{M}_2$, respectively. Since in the end we will again be interested in the zero sets of these sections, we will again assume that all occurring sections are transversal to the zero section. As before it is not hard to see that one can always find such coherent collections of (transversal) sections in the cokernel bundles by using induction on the dimension of the underlying nonregular moduli space of branched covers. Note that the latter is not equal to the Fredholm index. In [F2] we prove the following result about orbit curves with no additional marked points.

Theorem ([F2]). For the cokernel bundle $\operatorname{Coker} \bar\partial_J$ over the compactification $\overline{\mathcal{M}}$ of every moduli space of branched covers over an orbit cylinder with $\dim \mathcal{M} - \operatorname{rank} \operatorname{Coker} \bar\partial_J = 0$ the following holds:
• For every pair $\bar\nu^0$, $\bar\nu^1$ of coherent and transversal sections in $\operatorname{Coker} \bar\partial_J$ the algebraic counts of zeroes of $\bar\nu^0$ and $\bar\nu^1$ are finite and agree, so that we can define an Euler number $\chi(\operatorname{Coker} \bar\partial_J)$ for coherent sections in $\operatorname{Coker} \bar\partial_J$ by $\chi(\operatorname{Coker} \bar\partial_J) := \#(\bar\nu^0)^{-1}(0) = \#(\bar\nu^1)^{-1}(0)$.
• This Euler number is $\chi(\operatorname{Coker} \bar\partial_J) = 0$.

This theorem in turn has the following consequence.

Corollary 2.1. For every closed Reeb orbit γ the Hamiltonian $h^0_\gamma$ vanishes independently of the chosen coherent collection of sections $(\bar\nu)$ in the cokernel bundles over all moduli spaces of branched covers, $h^0_\gamma = h^{0,\bar\nu}_\gamma = 0$. In particular, the sequences of descendant Hamiltonians $h^1_{\gamma,j}$ already Poisson-commute as elements in $P^0_\gamma$.

Note that the latter statement is obvious in the case $\gamma = V = S^1$. While it directly follows from index reasons that $h^1_{S^1,j} = 0$ when the string of differential forms just consists of the zero-form 1 on $S^1$, it is shown in [E] using the results of Okounkov and Pandharipande in [OP] that for the one-form dt on $S^1$ the system of Poisson-commuting functions on $P^0_{S^1}$ is given by

$$h^1_{S^1,j} = \oint_{S^1} \frac{u^{j+2}(x)}{(j+2)!}\,dx, \qquad u(x) = \sum_{n \in \mathbb{N}} p_n e^{+2\pi i n x} + q_n e^{-2\pi i n x},$$

and hence agrees with the dispersionless KdV (or Burgers) integrable hierarchy. Going back from $\gamma = V = S^1$ to the case of orbit curves over general Reeb orbits γ, observe that, since for the orbit curves the evaluation map to V factors through the inclusion map γ ⊂ V, it again only makes sense to consider zero- or one-forms, where we can assume without loss of generality that the zero-form agrees with $1 \in \Omega^0(V)$ and that the integral of the one-form $\theta \in \Omega^1(V)$ over the Reeb orbit is one, $\int_\gamma \theta = 1$.

For the case with no gravitational descendants, note that it follows from index reasons that the only curves to be considered are orbit cylinders with one marked point, since introducing an additional marked point adds two or one to the Fredholm index. Since orbit cylinders are always regular and their contribution hence just equals the integral of the form θ over the closed orbit γ, we get, just like in the case of $\gamma = V = S^1$, that the zeroth descendant Hamiltonian $h^1_{\gamma,0}$ vanishes if deg θ = 0 and

$$h^1_{\gamma,0} = \oint_{S^1} \frac{u^2(x)}{2!}\,dx = \sum_n p_n q_n$$

if deg θ = 1 with the normalization from above. For the sum note that we only assigned formal variables $p_n$, $q_n$ to Reeb orbits which are good in the sense of [BM]. While the Hamiltonians $h^1_{\gamma,0}$ hence agree with the Hamiltonian $h^1_{S^1,0}$ for $\gamma = V = S^1$ up to the problem of bad orbits, since no obstruction bundles have to be considered, it is


easy to see that the argument breaks down when gravitational descendants are introduced, since the underlying orbit curve then has non-zero Fredholm index 1 + 2(j − 1) + deg θ and hence need not be an orbit cylinder anymore. While for the case of a one-form we can hence expect to find new integrals for the nontrivial Hamiltonian $h^1_{S^1,0} = h^1_{\gamma,j}$, we first show that in the case of a zero-form not only the zeroth Hamiltonian but even the whole sequence of descendant Hamiltonians $h^1_{\gamma,j}$ is trivial.

Theorem 2.2. Let γ be a Reeb orbit in any contact manifold V and assume that the string of differential forms on V just consists of the zero-form $1 \in \Omega^0(V)$. Then the sequence of Poisson-commuting functions $h^1_{\gamma,j}$ on $P^0_\gamma$ is trivial,

$$h^1_{\gamma,j} = 0, \qquad j \in \mathbb{N},$$

just like in the case of $\gamma = V = S^1$.

Proof. Since the proof of this theorem follows from exactly the same arguments as the proof of our theorem in [F2] about Euler numbers of coherent sections in obstruction bundles from above, we briefly give the main idea of the proof in [F2] for orbit curves without additional marked points and then discuss its generalization to orbit curves with zero-forms and gravitational descendants. After proving that we can work with finite-dimensional obstruction bundles instead of infinite-dimensional polybundles, recall that the main problem lies in the presence of codimension-one boundary of the (nonregular) moduli space, so that Euler numbers of Fredholm problems are not defined in general, since the count of zeroes in general depends on the compact perturbations chosen for the moduli spaces in the boundary. In [F2] we prove the existence of the Euler number for moduli spaces of orbit curves without additional marked points by induction on the number of punctures. For the induction step we not only use that there exist Euler numbers for the moduli spaces in the boundary, but it is further important that all these Euler numbers are in fact trivial. The vanishing of the Euler number in turn is deduced from the different parities of the Fredholm index of the Cauchy-Riemann operator and the actual dimension of the moduli space of branched covers, following the idea for the vanishing of the Euler characteristic for odd-dimensional manifolds. For the generalization to the case of additional marked points and gravitational descendants, it is clear that it still suffices to work with finite-dimensional obstruction bundles. On the other hand, recall that the only further ingredient to our proof in [F2] was that the Fredholm index and the dimension of the moduli spaces always have different parity. Hence it follows that the proof in [F2] also works for the case when θ is a zero-form, as the actual dimension of the moduli spaces is still even, while it breaks down in the case when θ is a one-form.

Observe that for one-forms it is indeed no longer clear that every Euler number has to be zero, as for $\gamma = V = S^1$ and θ = dt we get nontrivial contributions from true branched covers. While at first glance the major problem seems to be the truly complicated computation of the Euler number (see [HT1,HT2] for related results), we further have the problem that Euler numbers need no longer exist for all Fredholm problems. For the rest of this paper we will hence only be interested in the case where the chosen differential form has degree one, deg θ = 1. While for $\gamma = V = S^1$ we actually get a unique sequence of Poisson-commuting functions, observe that for general fixed Reeb orbits γ in contact manifolds V the


descendant Hamiltonians $h^1_{\gamma,j} = h^{1,\bar\nu}_{\gamma,j}$ may indeed depend on the chosen collection of sections in the cokernel bundles $\operatorname{Coker} \bar\partial_J$. Hence the invariance statement is no longer trivial, but implies that for different choices of coherent abstract perturbations $\bar\nu^\pm$ for the moduli spaces the resulting systems of commuting elements $h^{1,-}_{\gamma,j}$, j = 0, 1, 2, ... and $h^{1,+}_{\gamma,j}$, j = 0, 1, 2, ... on $P^0_\gamma$ are just isomorphic, i.e., there exists an automorphism of the Poisson algebra $P^0_\gamma$ which identifies $h^{1,-}_{\gamma,j} \in P^0_\gamma$ with $h^{1,+}_{\gamma,j} \in P^0_\gamma$ for all $j \in \mathbb{N}$. The above discussion hence shows that the computation of the symplectic field theory of a closed Reeb orbit gets much more difficult when gravitational descendants are considered. In what follows we want to determine it in the special case where the contact manifold is the unit cotangent bundle $S^*Q$ of an (m-dimensional) Riemannian manifold Q, so that every closed Reeb orbit γ on $V = S^*Q$ corresponds to a closed geodesic $\bar\gamma$ on Q. Before we can state the theorem we first want to expand the descendant Hamiltonians $h^1_{S^1,j}$ in terms of the $p_n$- and $q_n$-variables, where we set $p_n = q_{-n}$. Abbreviating $u_n(x) = q_n e^{inx}$ for every nonzero integer n, it follows from $u = \sum_n u_n$ that

$$h^1_{S^1,j} = \oint_{S^1} \frac{u^{j+2}(x)}{(j+2)!}\,dx = \sum_{n_1,\dots,n_{j+2}} \oint_{S^1} \frac{u_{n_1}(x) \cdot \ldots \cdot u_{n_{j+2}}(x)}{(j+2)!}\,dx.$$

On the other hand, note that the integration around the circle corresponds to selecting only those sequences of multiplicities $(n_1, \ldots, n_{j+2})$ whose sum is equal to zero, so that

$$h^1_{S^1,j} = \sum_{n_1 + \ldots + n_{j+2} = 0} \frac{q_{n_1} \cdot \ldots \cdot q_{n_{j+2}}}{(j+2)!}.$$
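
The selection rule just described can be checked numerically. The following is only a minimal sketch under stated assumptions: the variables $q_n$ are replaced by ordinary complex numbers, the modes are truncated at a hypothetical cutoff $|n| \le N$, and the circle is normalized to total length one with modes $e^{2\pi i n x}$. The function names are illustrative and do not come from the paper.

```python
import cmath
import itertools
import math


def zero_sum_tuples(desc, cutoff):
    """Ordered tuples (n_1, ..., n_{desc+2}) of nonzero integers with
    |n_i| <= cutoff whose entries sum to zero."""
    modes = [n for n in range(-cutoff, cutoff + 1) if n != 0]
    return [t for t in itertools.product(modes, repeat=desc + 2) if sum(t) == 0]


def hamiltonian_by_modes(q, desc, cutoff):
    """Sum of q_{n_1} ... q_{n_{desc+2}} / (desc+2)! over zero-sum tuples."""
    total = sum(math.prod(q[n] for n in t) for t in zero_sum_tuples(desc, cutoff))
    return total / math.factorial(desc + 2)


def hamiltonian_by_integral(q, desc, cutoff):
    """Average of u(x)^(desc+2) / (desc+2)! over a circle of length one.

    u(x) = sum_n q_n exp(2 pi i n x). Equispaced sampling is exact here,
    because the number of samples exceeds the largest frequency of u^(desc+2).
    """
    samples = (desc + 2) * cutoff + 1
    acc = 0
    for k in range(samples):
        x = k / samples
        u = sum(q[n] * cmath.exp(2j * cmath.pi * n * x) for n in q)
        acc += u ** (desc + 2)
    return acc / samples / math.factorial(desc + 2)
```

For `desc = 0` the mode sum reduces to the sum of $q_n q_{-n}$ over positive n, which is the expression $\sum_n p_n q_n$ under the identification $p_n = q_{-n}$.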

Apart from the sequence of Poisson-commuting functions for the circle, the grading of the functions given by the grading of $p_n$- and $q_n$-variables will play a central role for the upcoming theorem. For this observe that it follows from the grading conventions in symplectic field theory that the grading of the full Hamiltonian $H^0$ is −1, so that by $H^0 = \sum_g H^0_g \hbar^{g-1}$ the grading for the rational Hamiltonian $h^0 = H^0_0$ is given by

$$|h^0| = |H^0| + |\hbar| = -1 + 2(m - 2).$$

Since this grading has to agree with the grading of $t_j h^1_j$ with $|t_j| = 2(1 - j) - \deg\theta = 1 - 2j$, it follows that for every Reeb orbit γ ⊂ V we have

$$|h^1_{\gamma,j}| = -1 + 2(m - 2) - 1 + 2j = 2(m + j - 3).$$

We already mentioned that the natural identification of the formal variables $p_n$ and $q_n$ does not lead to an isomorphism of the graded algebras $W^0_\gamma$ and $P^0_\gamma$ with the corresponding graded algebras $W^0_{S^1}$ and $P^0_{S^1}$ for $\gamma = V = S^1$, not only because the gradings of $p_n$ and $q_n$ are different, so that even the commutation rules may change, but also because some variables $p_n$ and $q_n$ may be missing since they would correspond to bad orbits. While for the grading of $\gamma = V = S^1$ given by $|p_n| = |q_n| = -2$ every summand in the descendant Hamiltonians $h^1_{S^1,j}$ indeed has the same degree 2(m + j − 3), passing over to a general Reeb orbit γ with the new grading given by $|p_n| = m - 3 - \mathrm{CZ}(\gamma^n)$, $|q_n| = m - 3 + \mathrm{CZ}(\gamma^n)$ the descendant Hamiltonian $h^1_{S^1,j}$ is no longer of pure degree, i.e., different summands of the same descendant Hamiltonian usually have different degree. While the Poisson-commuting sequence for the circle seems not to be related to the sequence of descendant


Hamiltonians for general Reeb orbits γ, we prove the following result in the case when the Reeb orbit corresponds to a closed geodesic.

Theorem 2.3. Assume that the contact manifold is the unit cotangent bundle $V = S^*Q$ of a Riemannian manifold Q, so that the closed Reeb orbit γ corresponds to a closed geodesic $\bar\gamma$ on Q, and that the string of differential forms just consists of a single one-form which integrates to one around the orbit. Then the resulting system of Poisson-commuting functions $h^1_{\gamma,j}$, $j \in \mathbb{N}$ on $P^0_\gamma$ is isomorphic to the system of Poisson-commuting functions $g^1_{\bar\gamma,j}$, $j \in \mathbb{N}$ on $P^0_{\bar\gamma} = P^0_\gamma$, where for every $j \in \mathbb{N}$ the descendant Hamiltonian $g^1_{\bar\gamma,j}$ is given by

$$g^1_{\bar\gamma,j} = \sum \epsilon(\vec n)\,\frac{q_{n_1} \cdot \ldots \cdot q_{n_{j+2}}}{(j+2)!},$$

where the sum runs over all ordered monomials $q_{n_1} \cdot \ldots \cdot q_{n_{j+2}}$ with $n_1 + \cdots + n_{j+2} = 0$ and which are of degree 2(m + j − 3). Further $\epsilon(\vec n) \in \{-1, 0, +1\}$ is fixed by a choice of coherent orientations in symplectic field theory and is zero if and only if one of the orbits $\gamma^{n_1}, \ldots, \gamma^{n_{j+2}}$ is bad.

We have the following corollary, which immediately follows from the behavior of the Conley-Zehnder index for multiple covers.

Corollary 2.4. Assume that the closed geodesic $\bar\gamma$ represents a hyperbolic Reeb orbit in the unit cotangent bundle and dim Q > 1. Then $g^1_{\bar\gamma,j} = 0$ and hence $h^1_{\gamma,j} = 0$ for all j > 0.

Indeed, since for hyperbolic Reeb orbits the Conley-Zehnder index $\mathrm{CZ}(\gamma^n)$ of $\gamma^n$ is given by $\mathrm{CZ}(\gamma^n) = n \cdot \mathrm{CZ}(\gamma)$, an easy computation shows that there are no products of the above form of the desired degree. On the other hand, note that without the degree condition we would just get back the sequence of descendant Hamiltonians for the circle. Forgetting about orientation issues, in simple words we can hence say that the sequence $g^1_{\bar\gamma,j}$ is obtained from the sequence for $\bar\gamma = Q = S^1$ by removing all summands with the wrong, that is, not maximal degree, where the latter can explicitly be computed using the formulas in [Lo] but also follows from our proof. The proof relies on the observation that for orbit curves the gravitational descendants indeed have a geometric meaning in terms of branching conditions, which is a slight generalization of the result for the circle shown by Okounkov and Pandharipande in [OP].

Applying (and generalizing) the ideas of Cieliebak and Latschev in [CL] for relating the symplectic field theory of $V = S^*Q$ to the string topology of the underlying Riemannian manifold Q, we then study branched covers of the corresponding trivial half-cylinders in the cotangent bundle connecting the Reeb orbit γ with the underlying geodesic $\bar\gamma$ to prove that the sequence of Poisson-commuting functions $h^1_{\gamma,j}$ is isomorphic to a sequence of Poisson-commuting functions $g^1_{\bar\gamma,j}$. While the descendant Hamiltonians $h^1_{\gamma,j}$ on the SFT side are defined using very complicated obstruction bundles over (nonregular) moduli spaces of arbitrarily large dimension, the key observation is that for the descendant Hamiltonians $g^1_{\bar\gamma,j}$ on the string side we indeed only have to study obstruction bundles over discrete sets, which clearly disappear if the Fredholm index is right. With this we get that the Poisson-commuting sequences for the closed geodesics can be computed from the sequences for the circle and the Morse indices of the geodesic and its iterates as stated in the theorem.


2.2. Gravitational descendants = branching conditions. Recall that by the theorem from the last subsection we only have to consider the case where θ is a one-form on V, where we still assume without loss of generality that the integral of θ over γ is one. It follows that integrating the pullback of θ under the evaluation map over the moduli space of orbit curves with one additional marked point and dividing out the natural $\mathbb{R}$-action on the target $\mathbb{R} \times S^1 \cong \mathbb{R} \times \gamma$ is equivalent to restricting to orbit curves where the additional marked point is mapped to a special point on $\mathbb{R} \times S^1$. In other words, in what follows we will view $h^1_{\gamma,j}$ no longer as part of the Hamiltonian for γ but as part of the potential for the cylinder over γ equipped with a non-translation-invariant two-form. In order to save notation, $\overline{\mathcal{M}}_1 = \overline{\mathcal{M}}_1(\Gamma^+, \Gamma^-)$ will from now on denote the corresponding moduli space. On the other hand, after introducing coherent collections $(\bar\nu)$ of obstruction bundle sections, it is easy to see that the tautological line bundle $L^{\bar\nu}$ over $\overline{\mathcal{M}}^{\bar\nu}_1$ is just the restriction of the tautological line bundle L over $\overline{\mathcal{M}}_1$ to $\overline{\mathcal{M}}^{\bar\nu}_1 = \bar\nu^{-1}(0) \subset \overline{\mathcal{M}}_1$.

For the orbit curves we now want to give a geometric interpretation of gravitational descendants in terms of branching conditions over the special point on $\mathbb{R} \times S^1$. Before we state the corresponding theorem and give a rigorous proof using the stretching-of-the-neck procedure from SFT, we first informally describe a naive direct approach based on our definition of gravitational descendants from above, which should illuminate the underlying geometric ideas.

Recall that if (h, z) is an element in the non-compactified moduli space $\mathcal{M}^\nu_1 \subset \overline{\mathcal{M}}^{\bar\nu}_1$, the fibre of the canonical line bundle L over (h, z) is given by $L_{(h,z)} = T^*_z S$. Identifying the tangent space to the cylinder at the special point with $\mathbb{C}$, it follows that $s(h, z) = \frac{\partial h}{\partial z}(z) \in T^*_z S$ is a section in the restriction of L to $\mathcal{M}^\nu_1$. Since s is a transversal section in the tautological line bundle over $\mathcal{M}^\nu_1$ if and only if it extends to a section over $\mathcal{M}_1$ such that s ⊕ ν is transversal to the zero section in $L \oplus \operatorname{Coker} \bar\partial_J$ over $\mathcal{M}_1$, we may assume after possibly perturbing ν that s is indeed transversal. On the other hand, since $\frac{\partial h}{\partial z}(z) = 0$ is equivalent to saying that z ∈ S is a branch point of the holomorphic map $h : S \to \mathbb{CP}^1$, it follows that $\mathcal{M}^1_1 := s^{-1}(0) \subset \mathcal{M}_1$ indeed agrees with the space of all orbit curves (h, z) with one additional marked point, where z is a branch point of h. Further moving on to the case j = 2, observe that a natural candidate for a generic section $s_2$ in the restriction of the product line bundle $L^{\otimes 2}$ to $\mathcal{M}^1_1 \subset \mathcal{M}_1$ is given by $s_2(h, z) = \frac{\partial^2 h}{\partial z^2}(z) \in (T^*_z S)^{\otimes 2}$, for which $\mathcal{M}^2_1 = s_2^{-1}(0) \subset \mathcal{M}^1_1$ agrees with the space of holomorphic maps where z ∈ S is now a branch point of order at least two. For general j we can hence proceed by induction and define the section $s_j$ in $L^{\otimes j}$ over $\mathcal{M}^{j-1}_1 := s_{j-1}^{-1}(0) \subset \mathcal{M}_1$ by $s_j(h, z) = \frac{\partial^j h}{\partial z^j}(z)$, so that $\mathcal{M}^j_1$ agrees with the space of holomorphic maps where z ∈ S is a branch point of order at least j.

If the chosen sections $s_1, \ldots, s_j$ over the non-compactified moduli spaces extended in the same way to a coherent collection of sections in the tautological line bundles over the compactified moduli spaces $\overline{\mathcal{M}}^{\bar\nu}_1$, the above would show that in the case of orbit curves considering the j-th descendant moduli space is equivalent, after passing to homology, to requiring that the underlying additional marked point is a branch point of order j. In [OP] it was however shown that already for the case of the circle $\gamma = V = S^1$ the latter assumption is not entirely true, but that one instead additionally obtains corrections from the boundary $\overline{\mathcal{M}}_1 - \mathcal{M}_1$. To this end, we define a branching condition to be a tuple of natural numbers $\mu = (\mu_1, \ldots, \mu_{\ell(\mu)})$ of length $\ell(\mu)$ and total branching order $|\mu| = \mu_1 + \cdots + \mu_{\ell(\mu)}$. Then

Descendants in SFT

143 μ

μ

the moduli space M = M ( + , − ) consists of orbit curves with (μ) connected components, where every connected component carries one additional marked point z i , which is mapped to the special point on R ×γ and is a branch point of order μi − 1 for i = 1, . . . , (μ). For every branching condition μ = (μ1 , . . . , μ(μ) ) we then define ν new Hamiltonians h1γ ,μ = h1,¯ γ ,μ by setting h1γ ,μ =

μ

−

#M1 ( + , − ) p q . +

+ , −

With the following lemma we prove that the abstract descendants-branching correspondence from [OP] holds for every closed Reeb orbit $\gamma \subset V$. For every $j \in \mathbb{N}$ and every branching condition $\mu$ we let $\rho^0_{j,\mu}$ be the number given by integrating the $j$th power of the first Chern class of the tautological line bundle over the moduli space of connected rational curves over $\mathbb{CP}^1$ with one marked point mapped to $0$ and $\ell(\mu)$ additional marked points $z_i$ mapped to $\infty$ which are branch points of order $\mu_i - 1$, $i = 1, \dots, \ell(\mu)$.

Lemma 2.5. Each of the descendant Hamiltonians $\mathbf{h}^1_{\gamma,j}$ can be written as a sum,
$$\mathbf{h}^1_{\gamma,j} = \frac{1}{j!} \cdot \mathbf{h}^1_{\gamma,(j+1)} + \sum_{|\mu| < j} \rho^0_{j,\mu} \cdot \mathbf{h}^1_{\gamma,\mu},$$
where $\mathbf{h}^1_{\gamma,\mu} \in \mathfrak{P}^0_\gamma$ counts branched covers of the orbit cylinder with $\ell(\mu)$ connected components, where each component carries one additional marked point $z_i$, which is mapped to the special point on $\mathbb{R} \times \gamma$ and is a branch point of order $\mu_i - 1$ for $i = 1, \dots, \ell(\mu)$.

Note that the statement of the lemma can be rephrased by saying that the integration of the $j$th power of the first Chern class corresponds to a weighted sum of branching conditions,
$$c_1(\mathcal{L})^j \doteq \frac{1}{j!} \cdot (j+1) + \sum_{|\mu| < j} \rho^0_{j,\mu} \cdot \mu,$$

which is the rational version of the abstract descendants-branching correspondence from [OP] for the circle, where the coefficient $\rho^0_{j,\mu}$ is nonzero and agrees with the coefficient $\rho_{j,\mu}$ from [OP] only if the genus $g$ determined by the Fredholm index, $j + 1 = 2g - 1 + |\mu| + \ell(\mu)$, is zero.

Proof. Recall that the result in [OP] for the circle relies on the degeneration formula from relative Gromov-Witten theory, where the target sphere with three special points $x^- = 0$, $x^+ = \infty$ and $x$ degenerates in such a way that the original sphere only carries the two special points $x^- = 0$, $x^+ = \infty$ while the third special point sits on a second sphere connected to the original one by a node. Viewing a sphere with two special points as a cylinder, it is clear that a corresponding statement can be proven for a Reeb orbit $\gamma$ in a general contact manifold if the standard cylinder is replaced by the orbit cylinder $\mathbb{R} \times \gamma$ in the symplectization of the contact manifold which degenerates to an orbit cylinder with a ghost bubble attached.
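As a quick consistency check of the Fredholm index relation $j + 1 = 2g - 1 + |\mu| + \ell(\mu)$ quoted after Lemma 2.5, one can evaluate it on the branching condition $\mu = (j+1)$ of the leading term, for which $|\mu| = j + 1$ and $\ell(\mu) = 1$. The following is only an arithmetic sketch using that relation and nothing else:

```latex
\begin{align*}
  j + 1 &= 2g - 1 + |\mu| + \ell(\mu) \\
        &= 2g - 1 + (j + 1) + 1 \\
  \Longrightarrow \quad 2g &= 0, \qquad g = 0 .
\end{align*}
```

So a single branch point of order $j$ is compatible with genus zero, and in general the relation gives $g = 0$ exactly when $|\mu| + \ell(\mu) = j + 2$.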


Since the degeneration formula from relative Gromov-Witten theory is no longer applicable, we will have to use the neck-stretching process from symplectic field theory, which however agrees with the degeneration process from relative Gromov-Witten theory in the case of the circle. For this observe that performing a neck-stretching at a small circle around the special point on the standard cylinder we obtain a pair-of-pants together with a complex plane carrying the special point, which can be identified with spheres with three or two special points, respectively. Replacing the circle by a Reeb orbit γ in a general contact manifold the neck-stretching yields besides a complex plane with a special point a pair-of-pants with a positive and a negative cylindrical end over γ together with a cylindrical end over the circle. Note that in order to include infinitesimal deformations needed for the obstruction bundles, we identify the orbit cylinder R ×γ (together with an infinitesimal tubular neighborhood) with (an infinitesimal neighborhood of the zero section in) its normal bundle over R ×S 1 with fibre given by the contact distribution ξ and twist around the puncture given by the linearized Reeb flow along γ . Then the (infinitesimal) neckstretching is performed along the (infinitesimal) hypersurface given by the restriction of the normal bundle to the small circle in R ×S 1 . 
Before we make the proof rigorous by studying coherent collections of sections in the cokernel bundles and the tautological line bundles over the moduli space of branched covers for the circle, observe that Theorem 2.5.5 in [EGH] concerning composition of cobordisms suggests that h1γ , j , viewed as a potential on P0γ , is homotopic, and by h0γ = 0 hence agrees with a potential, which can directly be computed from the potential for the complex plane counting rational curves with one additional marked point mapped to the special point and the potential for the pair-of-pants with its cylindrical ends over γ and the circle counting rational curves with no additional marked points. Indeed it follows from the compactness statement in [BEHWZ] that under the neck-stretching procedure every branched cover of the orbit cylinder with one additional marked point mapped to the special point splits into a branched cover of the complex plane with one additional marked point mapped to the special point and a branched cover of the pair-of-pants with no additional marked points. While from S 1 -symmetry reasons the potential for the complex plane with one special point can only count connected curves, note that under the splitting process the connected curve may split into branched covers of the pair-of-pants with more than one connected component. On the other hand, since the glued curve has genus zero, it follows that the branched cover of the complex plane and any connected component of the branched cover of the pair-of-pants cannot be glued at more than one cylindrical end, so that the number of connected components of the branched cover of the pair-of-pants agrees with the number of cylindrical ends of the branched cover of the complex plane. Note that a collection of closed Reeb orbits in the contact manifold S 1 is naturally identified with a tuple μ = (μ1 , . . . 
, μ(μ) ) of multiplicities and a branched cover is asymptotically cylindrical over the μith iterate of the circle near the puncture z i precisely if z i is a branch point of order μi − 1. With this it follows that h1γ , j can be computed as desired by summing over all branching conditions μ = (μ1 , . . . , μ(μ) ), where for each μ the summand is given by the product of ρ 0j,μ , obtained by integrating the j th power of the first Chern class of the tautological line bundle over the moduli space of branched covers of CP1 with one marked point mapped to the special point and (μ) additional marked points z i mapped to ∞ which are branch points of order μi − 1, with the branching Hamiltonians h1γ ,μ ∈ P0γ , counting branched covers of the orbit cylinder with (μ) connected components, where each component carries one additional marked


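point $z_i$, which is mapped to the special point on $\mathbb{R} \times \gamma$ and is again a branch point of order $\mu_i - 1$ for $i = 1, \dots, \ell(\mu)$.

In order to make the proof rigorous it remains to understand the above statement on the level of coherent collections of sections in the cokernel bundles and the tautological line bundles over the moduli spaces of branched covers for the circle. For this observe that for every chosen collection of Reeb orbits $\Gamma^+, \Gamma^-$ the neck-stretching procedure at a small circle around the special point on the standard cylinder leads to a compactified moduli space $\widetilde{\mathcal{M}}_1 = \widetilde{\mathcal{M}}_1(\Gamma^+, \Gamma^-)$. It is shown in [BEHWZ] that this compactified moduli space has the desired codimension-one boundary components $\mathcal{M}_1 = \mathcal{M}_1(\Gamma^+, \Gamma^-)$ counting branched covers of the original orbit cylinder with one special point and $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}$ with $\mathcal{M}_{1,1} = \mathcal{M}_1(\Gamma)$ and $\mathcal{M}_{2,0} = \mathcal{M}_0(\Gamma^+, \Gamma, \Gamma^-)$ counting branched covers of the complex plane with one additional marked point and of the pair-of-pants with possibly more than one connected component, respectively. On the other hand, in contrast to the degeneration process from relative Gromov-Witten theory, it follows from [BEHWZ] that one also has to consider codimension-one boundary strata of the form $\widetilde{\mathcal{M}}_{1,1} \times \mathcal{M}_{2,0}$ with $\widetilde{\mathcal{M}}_{1,1} = \widetilde{\mathcal{M}}_1(\Gamma_1^+, \Gamma_1^-)$, $\mathcal{M}_{2,0} = \mathcal{M}_0(\Gamma_2^+, \Gamma_2^-)/\mathbb{R}$ and $\mathcal{M}_{1,0} \times \widetilde{\mathcal{M}}_{2,1}$ with $\mathcal{M}_{1,0} = \mathcal{M}_0(\Gamma_1^+, \Gamma_1^-)/\mathbb{R}$, $\widetilde{\mathcal{M}}_{2,1} = \widetilde{\mathcal{M}}_1(\Gamma_2^+, \Gamma_2^-)$, which correspond to a splitting of a curve into two levels during the stretching process and which are irrelevant for the case of the circle due to the $S^1$-symmetry on $\mathcal{M}_{1,0}$ and $\mathcal{M}_{2,0}$, respectively.

First, since the coherent collections of sections in the cokernel bundles over the moduli spaces of branched covers by definition are not affected by the position of the additional marked point, it follows that one can use the same obstruction bundle perturbations $\bar\nu = \bar\nu(\Gamma^+, \Gamma^-)$ throughout the stretching process. In particular, it follows that the regular moduli space $\widetilde{\mathcal{M}}_1^{\bar\nu} = \bar\nu^{-1}(0) \subset \widetilde{\mathcal{M}}_1$ has codimension-one boundary components $\mathcal{M}_1^{\bar\nu}$ and $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}^{\bar\nu}$ as well as $\widetilde{\mathcal{M}}_{1,1}^{\bar\nu_1} \times \mathcal{M}_{2,0}^{\bar\nu_2}$ and $\mathcal{M}_{1,0}^{\bar\nu_1} \times \widetilde{\mathcal{M}}_{2,1}^{\bar\nu_2}$, where $\bar\nu_1$, $\bar\nu_2$ are sections in the cokernel bundles $\operatorname{Coker}^1 \bar\partial_J$, $\operatorname{Coker}^2 \bar\partial_J$ over $\mathcal{M}_{1,0} = \mathcal{M}_0(\Gamma_1^+, \Gamma_1^-)$, $\mathcal{M}_{2,0} = \mathcal{M}_0(\Gamma_2^+, \Gamma_2^-)$ and which are determined by $\bar\nu$ by the coherency condition.

On the other hand, concerning the coherent collections of sections in the tautological line bundles, it can be shown as above that the tautological line bundle $\widetilde{\mathcal{L}}$ over $\widetilde{\mathcal{M}}_1^{\bar\nu}$ agrees with the tautological line bundle $\mathcal{L}$ over $\mathcal{M}_1^{\bar\nu}$, with the pullback $\pi_1^* \mathcal{L}_1$ over $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}^{\bar\nu}$ and with the pullbacks $\pi_1^* \widetilde{\mathcal{L}}_1$ over $\widetilde{\mathcal{M}}_{1,1}^{\bar\nu_1} \times \mathcal{M}_{2,0}^{\bar\nu_2}$ and $\pi_2^* \widetilde{\mathcal{L}}_2$ over $\mathcal{M}_{1,0}^{\bar\nu_1} \times \widetilde{\mathcal{M}}_{2,1}^{\bar\nu_2}$, respectively. Assuming that we have chosen coherent collections of sections $(s)$ in the tautological line bundles $\mathcal{L}$ over all moduli spaces $\mathcal{M}_1^{\bar\nu} = \mathcal{M}_1^{\bar\nu}(\Gamma^+, \Gamma^-)$ of branched covers of the orbit cylinder with one special point and $(s_1)$ in the tautological line bundles $\mathcal{L}_1$ over all moduli spaces $\mathcal{M}_{1,1} = \mathcal{M}_1(\Gamma)$ of branched covers of the complex plane with one special point, we as above can choose coherent collections of sections $(\tilde s)$ connecting $(s)$ and $(s_1)$ by requiring that over every moduli space $\widetilde{\mathcal{M}}_1^{\bar\nu}$ the section $\tilde s$ agrees with the section $s$ over $\mathcal{M}_1^{\bar\nu}$, with the pullback $\pi_1^* s_1$ over $\mathcal{M}_{1,1} \times \mathcal{M}_{2,0}^{\bar\nu}$ and with the pullbacks $\pi_1^* \tilde s_1$ over $\widetilde{\mathcal{M}}_{1,1}^{\bar\nu_1} \times \mathcal{M}_{2,0}^{\bar\nu_2}$ and $\pi_2^* \tilde s_2$ over $\mathcal{M}_{1,0}^{\bar\nu_1} \times \widetilde{\mathcal{M}}_{2,1}^{\bar\nu_2}$, respectively. Proceeding by induction it then follows that the regular descendant moduli space $\widetilde{\mathcal{M}}_1^{\bar\nu, j}$ has codimension-one boundary components $\mathcal{M}_1^{\bar\nu, j}$ and $\mathcal{M}_{1,1}^{j} \times \mathcal{M}_{2,0}^{\bar\nu}$ as well as $\widetilde{\mathcal{M}}_{1,1}^{\bar\nu_1, j} \times \mathcal{M}_{2,0}^{\bar\nu_2}$ and $\mathcal{M}_{1,0}^{\bar\nu_1} \times \widetilde{\mathcal{M}}_{2,1}^{\bar\nu_2, j}$, respectively. Since we have that $\#\mathcal{M}_{1,0}^{\bar\nu_1} = \#\mathcal{M}_{2,0}^{\bar\nu_2} = 0$ by the result in [F] it hence follows that
$$\#\mathcal{M}_1^{\bar\nu, j} = \#\mathcal{M}_{1,1}^{j} \cdot \#\mathcal{M}_{2,0}^{\bar\nu},$$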

which finally proves the descendants-branching correspondence on the level of coherent collections of sections in obstruction bundles and tautological line bundles. Note in particular that using this stretching process we were able to separate the transversality problem from the problem of defining gravitational descendants. Since the moduli space $\mathcal{M}_{1,1} = \mathcal{M}_1(\Gamma)$ is independent of the chosen Reeb orbit and agrees with the moduli space obtained from the degeneration process in relative Gromov-Witten theory, it follows precisely as in the circle bundle case described above that the count of elements in the descendant moduli space $\mathcal{M}_{1,1}^{j}$ is independent of the chosen coherent collection of sections and agrees with the integral of the $j$th power of the first Chern class over $\mathcal{M}_{1,1}$.

2.3. Branched covers of trivial half-cylinders.

In the case when the contact manifold $V$ is the unit cotangent bundle $S^*Q$ of a Riemannian manifold $Q$, Cieliebak and Latschev have shown in [CL] that, when suitably interpreted, the symplectic field theory of $V = S^*Q$ without differential forms and gravitational descendants agrees with the string topology of $Q$. The required isomorphism is established by studying punctured holomorphic curves in $T^*Q$ with boundary on the Lagrangian $Q \subset T^*Q$. For this they equip $T^*Q$ with an almost complex structure $J$ such that $(T^*Q, J)$ is an almost complex manifold with one positive cylindrical end $(\mathbb{R}^+ \times S^*Q, J)$. After showing that the contact area of a holomorphic curve is given by the difference between the sum of the actions of the Reeb orbits in $S^*Q$ and the sum of the lengths of the boundary components on $Q$, they use the natural filtration by action on symplectic field theory and by length on string topology to show that the morphism has the form of a unitriangular matrix.
The entries on the diagonal count cylinders with zero contact area, which are precisely the trivial half-cylinders in $T^*Q$ connecting the geodesic $\bar\gamma$ on $Q$ with the corresponding Reeb orbit $\gamma$ in $S^*Q$. On the other hand, since orbit curves are characterized by the fact that they have zero contact area, it hence directly follows from their proof that there exists a version of their isomorphism statement for the symplectic field theory of a closed Reeb orbit $\gamma$ by studying branched covers over the trivial half-cylinder connecting $\bar\gamma$ and $\gamma$.

For this let us first recall some definitions from [CL]. Let $\mathfrak{A}^0$ be the graded commutative subalgebra of $\mathfrak{W}$ of polynomials in the variables $q_\gamma$, where, following our notation from before, the superscript $0$ indicates that no $t$-variables are involved. The Hamiltonian $\mathbf{H}^0 \in \hbar^{-1}\mathfrak{W}^0$ defines a differential operator $\mathbb{D}^0_{\mathrm{SFT}} := \overrightarrow{\mathbf{H}^0} : \mathfrak{A}^0[[\hbar]] \to \mathfrak{A}^0[[\hbar]]$ via the replacements
$$p_\gamma \mapsto \kappa_\gamma \hbar \overrightarrow{\frac{\partial}{\partial q_\gamma}}.$$
The resulting pair $(\mathfrak{A}^0[[\hbar]], \mathbb{D}^0_{\mathrm{SFT}})$ then has the structure of a BV$_\infty$-algebra, in particular, $\mathbb{D}^0_{\mathrm{SFT}} \circ \mathbb{D}^0_{\mathrm{SFT}} = 0$. Conversely, given a BV$_\infty$-algebra $(\mathfrak{A}^0[[\hbar]], \mathbb{D}^0)$ where $\mathfrak{A}^0$ is a space of polynomials in variables $q$, it follows, see [CL], that $\mathbb{D}^0 : \mathfrak{A}^0[[\hbar]] \to \mathfrak{A}^0[[\hbar]]$ is a true differential operator. In particular, we naturally get a Weyl algebra $\mathfrak{W}^0$ with


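distinguished element $\mathbf{H}^0 \in \hbar^{-1}\mathfrak{W}$ satisfying $[\mathbf{H}^0, \mathbf{H}^0] = 0$ by introducing for each $q$-variable a dualizing $p$-variable, considering the natural commutator relation and using the replacement for $p$-variables from above.

As already mentioned above, in [CL] it is shown that the BV$_\infty$-algebra $(\mathfrak{A}^0[[\hbar]], \mathbb{D}^0_{\mathrm{SFT}})$ representing the symplectic field theory of $S^*Q$ is isomorphic to a BV$_\infty$-algebra $(\mathfrak{C}^0[[\hbar]], \mathbb{D}^0_{\mathrm{string}})$ constructed from the string topology of $Q$, where $\mathfrak{C}^0$ is a space of chains in the string space $\Sigma = \Sigma Q$ of $Q$. The differential is given by $\mathbb{D}^0_{\mathrm{string}} = \partial + \Delta + \hbar\nabla : \mathfrak{C}^0[[\hbar]] \to \mathfrak{C}^0[[\hbar]]$, where $\partial$ is the singular boundary operator and $\nabla$, $\Delta$ are defined using the string bracket and cobracket operations of Chas and Sullivan. The BV$_\infty$-isomorphism $\overrightarrow{\mathbf{L}^0}$ is defined using the potential of $(T^*Q, Q)$,
$$\mathbf{L}^0 = \sum_{g, s^-, \Gamma} \operatorname{ev}_{g,s^-}(\Gamma)\, p^\Gamma \hbar^{g-1},$$
using the evaluation cycles $\operatorname{ev}_{g,s^-}(\Gamma) = \operatorname{ev} = (\operatorname{ev}_1, \dots, \operatorname{ev}_{s^-}) : \mathcal{M}_{g,s^-}(\Gamma) \to \Sigma Q \times \dots \times \Sigma Q$ ($s^-$ times) starting from the moduli space of holomorphic curves in $T^*Q$ with positive asymptotics $\Gamma$, genus $g$ and $s^-$ boundary components on $Q$.

Now for moving from the symplectic field theory of $S^*Q$ to the symplectic field theory of a closed Reeb orbit $\gamma$ in $S^*Q$, we just have to replace $(\mathfrak{A}^0[[\hbar]], \mathbb{D}^0_{\mathrm{SFT}} = \overrightarrow{\mathbf{H}^0})$ by the BV$_\infty$-algebra $(\mathfrak{A}^0_\gamma[[\hbar]], \mathbb{D}^{0,\gamma}_{\mathrm{SFT}} = \overrightarrow{\mathbf{H}^0_\gamma})$ generated only by the $q$-variables representing the multiples of the fixed orbit $\gamma$. Furthermore the potential $\mathbf{L}^0$ of $(T^*Q, Q)$ is now replaced by the potential $\mathbf{L}^0_{\gamma,\bar\gamma}$ counting branched covers of the trivial half-cylinder connecting $\bar\gamma$ in $Q$ and $\gamma$ in $S^*Q$, which defines a BV$_\infty$-isomorphism from $(\mathfrak{A}^0_\gamma[[\hbar]], \mathbb{D}^{0,\gamma}_{\mathrm{SFT}})$ to a BV$_\infty$-algebra $(\mathfrak{C}^0_{\bar\gamma}[[\hbar]], \mathbb{D}^{0,\bar\gamma}_{\mathrm{string}})$. Assigning as for the Reeb orbits formal $q$-variables to multiples of the underlying closed geodesic $\bar\gamma$, the potential $\mathbf{L}^0_{\gamma,\bar\gamma}$ is defined by
$$\mathbf{L}^0_{\gamma,\bar\gamma} = \sum_{g, \Gamma, \bar\Gamma} \#\mathcal{M}_g(\Gamma, \bar\Gamma)\, p^\Gamma q^{\bar\Gamma} \hbar^{g-1},$$
summing over all moduli spaces $\mathcal{M}_g(\Gamma, \bar\Gamma)$ of branched covers of the trivial half-cylinder with Fredholm index zero. Note that it follows from the area estimate from above for curves in $T^*Q$ with boundary on $Q$ in terms of action of the Reeb orbits and length of the boundary component that, assuming enough transversality, the moduli space $\mathcal{M}_g(\Gamma, \bar\Gamma)$ agrees with the preimage of the product stable manifold
$$W^+(\bar\Gamma) = W^+(\bar\gamma^{n_1}) \times \dots \times W^+(\bar\gamma^{n_{s^-}}) \subset \Sigma Q \times \dots \times \Sigma Q$$
of the energy functional $E : \Sigma Q \to \mathbb{R}$ on the string space under the evaluation $\operatorname{ev}_{g,s^-} : \mathcal{M}_{g,s^-}(\Gamma) \to \Sigma Q \times \dots \times \Sigma Q$,
$$\mathcal{M}_g(\Gamma, \bar\Gamma) = \operatorname{ev}_{g,s^-}(\Gamma)^{-1}(W^+(\bar\Gamma)) \subset \mathcal{M}_{g,s^-}(\Gamma).$$
Now the BV$_\infty$-algebra $(\mathfrak{C}^0[[\hbar]], \mathbb{D}^0_{\mathrm{string}})$ is replaced by the BV$_\infty$-algebra $(\mathfrak{C}^0_{\bar\gamma}[[\hbar]], \mathbb{D}^{0,\bar\gamma}_{\mathrm{string}})$ of polynomials in the $q$-variables assigned to multiples of $\bar\gamma$. Since this algebra is now indeed an algebra of polynomials, we have seen above that we assign to $(\mathfrak{C}^0_{\bar\gamma}[[\hbar]], \mathbb{D}^{0,\bar\gamma}_{\mathrm{string}})$ again a Weyl algebra $\mathfrak{W}^0_{\bar\gamma}$ with bracket $[\cdot, \cdot]$ generated by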


$p$- and $q$-variables assigned to multiples of $\bar\gamma$ together with a distinguished element $\mathbf{G}^0_{\bar\gamma} \in \hbar^{-1}\mathfrak{W}^0_{\bar\gamma}$ satisfying $[\mathbf{G}^0_{\bar\gamma}, \mathbf{G}^0_{\bar\gamma}] = 0$. Since the BV$_\infty$-algebras $(\mathfrak{A}^0_\gamma[[\hbar]], \mathbb{D}^{0,\gamma}_{\mathrm{SFT}} = \overrightarrow{\mathbf{H}^0_\gamma})$, $(\mathfrak{C}^0_{\bar\gamma}[[\hbar]], \mathbb{D}^{0,\bar\gamma}_{\mathrm{string}} = \overrightarrow{\mathbf{G}^0_{\bar\gamma}})$ determine the Weyl algebras with Hamiltonians $(\mathfrak{W}^0_\gamma, \mathbf{H}^0_\gamma)$, $(\mathfrak{W}^0_{\bar\gamma}, \mathbf{G}^0_{\bar\gamma})$ and vice versa, it follows that the BV$_\infty$-isomorphism given by the potential $\mathbf{L}^0_{\gamma,\bar\gamma}$ indeed leads to an isomorphism of the structures defined by $(\mathfrak{W}^0_\gamma, \mathbf{H}^0_\gamma)$ and $(\mathfrak{W}^0_{\bar\gamma}, \mathbf{G}^0_{\bar\gamma})$: Indeed, let $\mathfrak{D}^0_{\gamma,\bar\gamma}$ be the space of formal power series in the $p$-variables for multiples of $\gamma$ with coefficients which are polynomials in the $q$-variables assigned to multiples of $\bar\gamma$. Then it follows that $\mathbf{L}^0_{\gamma,\bar\gamma}$ is an element of $\hbar^{-1}\mathfrak{D}^0_{\gamma,\bar\gamma}$ satisfying the master equation
$$e^{\mathbf{L}^0_{\gamma,\bar\gamma}} \overleftarrow{\mathbf{H}^0_\gamma} - \overrightarrow{\mathbf{G}^0_{\bar\gamma}}\, e^{\mathbf{L}^0_{\gamma,\bar\gamma}} = 0.$$
In particular, it follows in the notation of Sect. 1 that the map
$$(\mathbf{L}^{0,+}_{\gamma,\bar\gamma})_*^{-1} \circ (\mathbf{L}^{0,-}_{\gamma,\bar\gamma})_* : H_*\big(\hbar^{-1}\mathfrak{W}^0_{\bar\gamma}, D^{0,\bar\gamma}_{\mathrm{string}}\big) \to H_*\big(\hbar^{-1}\mathfrak{W}^0_\gamma, D^{0,\gamma}_{\mathrm{SFT}}\big)$$
is an isomorphism of Weyl algebras.

In order to understand $D^{0,\bar\gamma}_{\mathrm{string}}$, recall that the differential in the string topology was given by $\mathbb{D}^0_{\mathrm{string}} = \partial + \Delta + \hbar\nabla : \mathfrak{C}^0[[\hbar]] \to \mathfrak{C}^0[[\hbar]]$, where $\nabla$ is defined using the string bracket and $\Delta$ using the string cobracket operations defined by Chas and Sullivan. While the singular boundary $\partial$ does not appear as we restrict ourselves to zero-dimensional moduli spaces, we expect to get contributions of the string bracket and string cobracket to $\mathbf{G}^0_{\bar\gamma}$, where we claim that the string bracket restricts to the operation of concatenating two multiples $\bar\gamma^{n_1}$, $\bar\gamma^{n_2}$ to the multiple $\bar\gamma^{n_1+n_2}$ of $\bar\gamma$, while the string cobracket corresponds to splitting up the multiple $\bar\gamma^{n_1+n_2}$ again into $\bar\gamma^{n_1}$, $\bar\gamma^{n_2}$.

In order to see this note that the compactification of the moduli spaces of branched covers of the trivial half-cylinder counted in the potential $\mathbf{L}^0_{\gamma,\bar\gamma}$ can be entirely understood in terms of branch points of the branched covering map. While branch points moving to the infinite end lead to the appearance of $\mathbf{H}^0_\gamma$ in the master equation, the Hamiltonian $\mathbf{G}^0_{\bar\gamma}$ describes what happens if branch points are moving through the boundary of the branched cover, which itself sits over the boundary of the half-cylinder. The important observation is now that for the codimension-one boundary of the moduli space we only have to consider the case where a single branch point is leaving the branched cover through the boundary. In order to see that this is described by the concatenation and splitting operations of the multiples of $\bar\gamma$, observe that the case when a branch point sits in the boundary of the branched cover is equivalent to the fact that the boundary of $\mathbb{R}^+ \times S^1$ is a critical level set of the branching map followed by the projection to the first factor. Observe that the branch point may leave the branched cover through any point of its boundary, which itself is diffeomorphic to (a number of copies of) the circle. Note that this corresponds to the fact that the concatenation and splitting operation may take place over any point on $\bar\gamma$. It follows that we always get a one-dimensional family of configurations.

Before we continue, we want to restrict ourselves as before to the rational case. In particular, there exists a version of the above isomorphism, given by counting rational branched covers of the trivial half-cylinder, which relates the rational symplectic field


theory $H_*(\mathfrak{P}^0_\gamma, d^0_\gamma)$ of $\gamma$ with $H_*(\mathfrak{P}^0_{\bar\gamma}, d^0_{\bar\gamma})$, where $d^0_{\bar\gamma} = \{\mathbf{g}^0_{\bar\gamma}, \cdot\} : \mathfrak{P}^0_{\bar\gamma} \to \mathfrak{P}^0_{\bar\gamma}$ and $\mathbf{G}^0_{\bar\gamma} = \hbar^{-1}\mathbf{g}^0_{\bar\gamma} + o(\hbar)$. Before we discuss the rational Hamiltonian $\mathbf{g}^0_{\bar\gamma} \in \mathfrak{P}^0_{\bar\gamma}$, recall that it was shown in [F2] that $\mathbf{h}^0_\gamma = 0$. Note that we have indeed not considered additional marked points so far. In particular, it follows from the above isomorphism that also $\mathbf{g}^0_{\bar\gamma}$ has to vanish. Since we have seen above that for $\mathbf{G}^0_{\bar\gamma}$ and hence for $\mathbf{g}^0_{\bar\gamma}$ we always get one-dimensional sets of configurations, the vanishing of $\mathbf{g}^0_{\bar\gamma}$ seems to follow from a naive dimension argument. On the other hand, recall that we have shown in [F2] that the corresponding statement for $\mathbf{h}^0_\gamma$ does not simply follow from a symmetry argument but indeed requires a careful study of sections in obstruction bundles in order to find compact perturbations making the Cauchy-Riemann operator transversal to the zero section. With the work in [F2] it is clear that the same transversality problem should continue to hold for branched covers of trivial half-cylinders. In the next section it will turn out that, like on the symplectic field theory side, also on the string side we are working in a highly degenerate situation, so that the transversality requirement is usually not fulfilled.

2.4. Obstruction bundles and transversality.

In order to solve the transversality problem we follow the author's paper [F2] in employing finite-dimensional obstruction bundles over the nonregular configuration spaces. Here is a sketch of the main points. For this let $\dot S$ denote a (possibly disconnected) punctured Riemann surface with boundary of genus zero with $s^+$ punctures $z_1^+, \dots, z_{s^+}^+$ and $s^-$ boundary circles $C_1, \dots, C_{s^-}$ and fix two ordered sets $\Gamma = (\gamma^{n_1^+}, \dots, \gamma^{n_{s^+}^+})$, $\bar\Gamma = (\bar\gamma^{n_1^-}, \dots, \bar\gamma^{n_{s^-}^-})$ of iterates of $\gamma$, $\bar\gamma$, respectively.
Let $(\xi = TT^*Q / T(\mathbb{R}^+_0 \times S^1), J^\xi)$ denote the complex normal bundle to the trivial half-cylinder $(\mathbb{R}^+_0 \times S^1, \{0\} \times S^1) \to (T^*Q, Q)$ as defined in [CL], which over the boundary $\{0\} \times S^1 \cong \bar\gamma \subset Q$ has the property that $\xi \cap TQ$ agrees with the normal bundle $N$ to the geodesic $\bar\gamma$ in $Q$. Note that the tangent space $TW^+(\bar\gamma^n)$ to the stable manifold of the energy functional in the critical point $\bar\gamma^n$ can be identified with a subspace of the space of normal deformations $C^0((\bar\gamma^n)^*N)$.

Given a branched covering $h : (\dot S, \partial\dot S) \to (\mathbb{R}^+_0 \times S^1, \{0\} \times S^1)$ of the trivial half-cylinder, for $p > 2$ let $H^{1,p}(h^*\xi) \subset C^0(h^*\xi)$ denote the space of $H^{1,p}$-sections in $h^*\xi$ which over every boundary component $C_k \subset \partial\dot S$ restrict to a section in $C^0((\bar\gamma^{n_k^-})^*N)$. Furthermore we will consider the subspace $H^{1,p}_{\bar\Gamma}(h^*\xi) \subset H^{1,p}(h^*\xi)$ consisting of all sections in $h^*\xi$ which over every boundary circle $C_k$ restrict to sections in the subspace $TW^+(\bar\gamma^{n_k^-}) \subset C^0((\bar\gamma^{n_k^-})^*N)$. While the latter Sobolev spaces describe the normal deformations of the branched covering, we introduce, similarly to [F2], for sufficiently small $d > 0$ a Sobolev space with asymptotic weights $H^{1,p,d}_{\mathrm{const}}(\dot S, \mathbb{C})$ in order to keep track of tangential deformations, where, in addition to the definitions in [F2], we impose the natural constraint that the function is real-valued over the boundary. In the same way we define the Banach spaces $L^p(\Lambda^{0,1}\dot S \otimes_{j,J^\xi} h^*\xi)$ and $L^{p,d}(\Lambda^{0,1}\dot S \otimes_{j,i} \mathbb{C})$. Further we denote by $\mathcal{M}_{0,s^-,s^+}$ the moduli space of Riemann surfaces with $s^-$ boundary circles, $s^+$ punctures and genus zero.

Following [F2,BM] for the general case and [W] for the case with boundary, there exists a Banach space bundle $\mathcal{E}$ over a Banach manifold of maps $\mathcal{B}$ in which the Cauchy-Riemann operator $\bar\partial_J$ extends to a smooth section. In our special case it follows as in [F2] that the fibre is given by
$$\mathcal{E}_{h,j} = L^{p,d}(\Lambda^{0,1}\dot S \otimes_{j,i} \mathbb{C}) \oplus L^p(\Lambda^{0,1}\dot S \otimes_{j,J^\xi} h^*\xi),$$
while the tangent space to the Banach manifold of maps $\mathcal{B} = \mathcal{B}_{0,s^-}(\Gamma)$ at $(h, j) \in \mathcal{M} = \mathcal{M}_{0,s^-}(\Gamma)$ is given by
$$T_{h,j}\mathcal{B} = H^{1,p,d}_{\mathrm{const}}(\dot S, \mathbb{C}) \oplus H^{1,p}(h^*\xi) \oplus T_j\mathcal{M}_{0,n}.$$
It follows that the linearization $D_{h,j}$ of the Cauchy-Riemann operator $\bar\partial_J$ is a linear map from $T_{h,j}\mathcal{B}$ to $\mathcal{E}_{h,j}$, which is surjective in the case when transversality for $\bar\partial_J$ is satisfied. In this case it follows from the implicit function theorem that $\ker D_{h,j} = T_{h,j}\mathcal{M}$.

In order to prove that the dimension of the desired moduli space $\mathcal{M}_{\bar\Gamma} = \mathcal{M}(\Gamma, \bar\Gamma) = \operatorname{ev}^{-1}(W^+(\bar\Gamma)) \subset \mathcal{M}(\Gamma)$ agrees with the virtual dimension expected by the Fredholm index, it remains to prove that the evaluation map $\operatorname{ev} : \mathcal{M} \to (\Sigma Q)^{s^-}$ is transversal to the product stable manifold $W^+(\bar\Gamma)$. In order to deal with this additional transversality problem, we introduce the Banach submanifold of maps $\mathcal{B}_{\bar\Gamma} = \operatorname{ev}^{-1}(W^+(\bar\Gamma)) \subset \mathcal{B}$ with tangent space
$$T_{h,j}\mathcal{B}_{\bar\Gamma} = H^{1,p,d}_{\mathrm{const}}(\dot S, \mathbb{C}) \oplus H^{1,p}_{\bar\Gamma}(h^*\xi) \oplus T_j\mathcal{M}_{0,n} = \{v \in T_{h,j}\mathcal{B} : v|_{\partial\dot S} \in TW^+(\bar\Gamma)\}$$
and view the Cauchy-Riemann operator as a smooth section in $\mathcal{E} \to \mathcal{B}_{\bar\Gamma}$. Then we have the following transversality lemma.

Lemma 2.6. Assume that $D_{h,j} : T_{h,j}\mathcal{B}_{\bar\Gamma} \to \mathcal{E}_{h,j}$ is surjective. Then the linearization of the evaluation map $d_{h,j}\operatorname{ev} : T_{h,j}\mathcal{M} \to TW^-(\bar\Gamma) = C^0(\bar\Gamma^*N)/TW^+(\bar\Gamma)$ is surjective.

Proof. Given $v_0 \in TW^-(\bar\Gamma)$, choose $\tilde v \in T_{h,j}\mathcal{B}$ such that $d_{h,j}\operatorname{ev} \cdot \tilde v = v_0$. Since $D_{h,j} : T_{h,j}\mathcal{B}_{\bar\Gamma} \to \mathcal{E}_{h,j}$ is onto, we can find $v \in T_{h,j}\mathcal{B}_{\bar\Gamma}$ with $D_{h,j} v = D_{h,j} \tilde v$, that is, $\tilde v - v \in \ker D_{h,j} = T_{h,j}\mathcal{M}$. Moreover, since $d_{h,j}\operatorname{ev} \cdot v \in TW^+(\bar\Gamma)$ for all $v \in T_{h,j}\mathcal{B}_{\bar\Gamma}$ by definition, we have $d_{h,j}\operatorname{ev} \cdot (\tilde v - v) = d_{h,j}\operatorname{ev} \cdot \tilde v = v_0$ in the quotient $TW^-(\bar\Gamma)$ and the claim follows.

We have seen that, instead of requiring transversality for the Cauchy-Riemann operator in the Banach space bundle over $\mathcal{B}$ and geometric transversality for the evaluation map, it suffices to require transversality for the Cauchy-Riemann operator in the Banach space bundle over the smaller Banach manifold $\mathcal{B}_{\bar\Gamma}$. Along the same lines as for Proposition 2.1 in [F2] it can be shown that the linearized Cauchy-Riemann operator is of the form
$$D_{h,j} : H^{1,p,d}_{\mathrm{const}}(\dot S, \mathbb{C}) \oplus H^{1,p}_{\bar\Gamma}(h^*\xi) \oplus T_j\mathcal{M}_{0,n} \to L^{p,d}(\Lambda^{0,1}\dot S \otimes_{j,i} \mathbb{C}) \oplus L^p(\Lambda^{0,1}\dot S \otimes_{j,J^\xi} h^*\xi),$$
$$D_{h,j} \cdot (v_1, v_2, y) = (\bar\partial v_1 + D_j y,\; D^\xi_h v_2),$$
where $\bar\partial : H^{1,p,d}_{\mathrm{const}}(\dot S, \mathbb{C}) \to L^{p,d}(\Lambda^{0,1}\dot S \otimes_{j,i} \mathbb{C})$ is the standard Cauchy-Riemann operator, $D^\xi_h : H^{1,p}_{\bar\Gamma}(h^*\xi) \to L^p(\Lambda^{0,1}\dot S \otimes_{j,J^\xi} h^*\xi)$ describes the linearization of $\bar\partial_J$ in the


direction of $\xi \subset TT^*Q$ and $D_j : T_j\mathcal{M}_{0,n} \to L^{p,d}(T^*\dot S \otimes_{j,i} \mathbb{C})$ describes the variation of $\bar\partial_J$ with $j \in \mathcal{M}_{0,n}$.

In [F2] we have shown that for branched covers of orbit cylinders the cokernels of the linearizations of the Cauchy-Riemann operator have the same dimension for every branched cover and hence fit together to give a smooth vector bundle over the nonregular moduli space of branched covers, so that we can prove transversality without waiting for the completion of the polyfold project of Hofer, Wysocki and Zehnder. The following proposition, proved in complete analogy, states that this still holds true for branched covers of trivial half-cylinders.

Proposition 2.7. The cokernels of the linearizations of the Cauchy-Riemann operator fit together to give a smooth finite-dimensional vector bundle over the moduli space of branched covers of the half-cylinder.

Proof. As in [F2] this result relies on the transversality of the standard Cauchy-Riemann operator and the super-rigidity of the trivial half-cylinder,
$$\operatorname{coker} \bar\partial = \{0\} \quad\text{and}\quad \ker D^\xi_h = \{0\},$$
where the second statement is now just a linearized version of Lemma 7.2 in [CL] which states that, as for orbit cylinders in the symplectizations, the branched covers of the trivial half-cylinder are characterized by the fact that they carry no energy in the sense that the action of Reeb orbits above agrees with the lengths of the closed geodesics below.

It remains to study the extension $\overline{\operatorname{Coker}}\,\bar\partial_J$ of the cokernel bundle $\operatorname{Coker}\bar\partial_J$ to the compactified moduli space. For this recall that the components of the codimension-one boundary of the nonregular moduli space $\mathcal{M} = \mathcal{M}_{\bar\Gamma}$ of branched covers of the half-cylinder are either of the form $\mathcal{M}_1 \times \mathcal{M}_2$, where $\mathcal{M}_1 = \mathcal{M}_1(\Gamma_1^+, \Gamma_1^-)/\mathbb{R}$, $\mathcal{M}_2 = \mathcal{M}_2(\Gamma_2, \bar\Gamma_2)$ are nonregular compactified moduli spaces of branched covers of the orbit cylinder or of the trivial half-cylinder, respectively, or of the form $\mathcal{M}_0 \times S^1$, where $\mathcal{M}_0 = \mathcal{M}_0(\Gamma, \bar\Gamma_0)$ is again a nonregular compactified moduli space of branched covers of the trivial half-cylinder while $S^1$ refers to the concatenation or splitting locus, which agrees with the locus where the single branch point is leaving the branched covering through the boundary. Note that for $\bar\Gamma = (\bar\gamma^{n_1}, \dots, \bar\gamma^{n_{s^-}})$ the ordered set $\bar\Gamma_0$ is either of the form
$$\bar\Gamma_0 = \big(\bar\gamma^{n_1}, \dots, \bar\gamma^{n_{k-1}}, \bar\gamma^{n_k^1}, \bar\gamma^{n_k^2}, \bar\gamma^{n_{k+1}}, \dots, \bar\gamma^{n_{s^-}}\big)$$
or
$$\bar\Gamma_0 = \big(\bar\gamma^{n_1}, \dots, \bar\gamma^{n_{k-1}}, \bar\gamma^{n_k + n_{k+1}}, \bar\gamma^{n_{k+2}}, \dots, \bar\gamma^{n_{s^-}}\big),$$
corresponding to concatenating $\bar\gamma^{n_k^1}$ and $\bar\gamma^{n_k^2}$ to get $\bar\gamma^{n_k}$ ($n_k^1 + n_k^2 = n_k$) or the splitting of $\bar\gamma^{n_k + n_{k+1}}$ to get $\bar\gamma^{n_k}$ and $\bar\gamma^{n_{k+1}}$.

Restricting to the concatenation case, recall that the chosen special point on the simple closed Reeb orbit determines a special point on the underlying simple geodesic and that we may assume that every holomorphic curve comes equipped with asymptotic markers in the sense of [EGH] not only on the cylindrical ends but also on the boundary circles. In particular, for the concatenation and splitting processes we may assume that all multiply-covered geodesics come equipped with a parametrization by $S^1$. Denoting by $t_1, t_2 \in S^1$ the points on $\bar\gamma^{n_k^1}$, $\bar\gamma^{n_k^2}$, where we want to concatenate the two multiply-covered geodesics to get the multiply-covered geodesic $\bar\gamma^{n_k^1 + n_k^2}$, we see


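that the coordinates must satisfy $n_k^1 t_1 = n_k^2 t_2$ in order to represent the same point on the underlying simple geodesic, so that the configuration space agrees with $S^1$ by setting $t_1 = n_k^2 t$, $t_2 = n_k^1 t$ for $t \in S^1$. While it directly follows from [F2] that over the boundary components $\mathcal{M}_1 \times \mathcal{M}_2 \subset \overline{\mathcal{M}}$ the extended cokernel bundle $\overline{\operatorname{Coker}}\,\bar\partial_J$ is of the form
$$\overline{\operatorname{Coker}}\,\bar\partial_J|_{\mathcal{M}_1 \times \mathcal{M}_2} = \pi_1^* \overline{\operatorname{Coker}}^1 \bar\partial_J \oplus \pi_2^* \overline{\operatorname{Coker}}^2 \bar\partial_J,$$
where $\overline{\operatorname{Coker}}^1 \bar\partial_J$, $\overline{\operatorname{Coker}}^2 \bar\partial_J$ denote the (extended) cokernel bundles over $\mathcal{M}_1$, $\mathcal{M}_2$, respectively, it remains to study the cokernel bundle over the boundary components $\mathcal{M}_0 \times S^1$.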
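The parametrization of the concatenation locus just described can be checked directly. The lines below are a purely arithmetic sketch, under the assumption that $S^1 = \mathbb{R}/\mathbb{Z}$ and that the multiplicities $n_k^1, n_k^2$ act by multiplication; the sample values $n_k^1 = 2$, $n_k^2 = 3$ are hypothetical and serve only as an illustration:

```latex
\begin{align*}
  n_k^1 t_1 &= n_k^1 (n_k^2 t) = n_k^1 n_k^2\, t = n_k^2 (n_k^1 t) = n_k^2 t_2, \\
  \text{e.g. } n_k^1 = 2,\ n_k^2 = 3: \quad
  t_1 &= 3t, \quad t_2 = 2t, \quad 2 t_1 = 6t = 3 t_2 .
\end{align*}
```

So every $t \in S^1$ gives a pair $(t_1, t_2)$ satisfying the constraint $n_k^1 t_1 = n_k^2 t_2$.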

Proposition 2.8. Over the boundary components M0 × S 1 ⊂ M the extended cokernel bundle Coker ∂¯ J is also of product form, 0 Coker ∂¯ J |M0 ×S 1 = π1∗ Coker ∂¯ J ⊕ π2∗ ,

where Coker ∂¯ J denotes the (extended) cokernel bundle over the moduli space M0 and is a vector bundle over S 1 which is determined by the tangent spaces to the stable manifolds of the multiply-covered closed geodesics involved into the concatenation or splitting process. 0

Proof. Still restricting to the concatenation case, let $\dot S_0 = \dot S_0^1 \cup \dot S_0^2$ denote the disconnected Riemann surface of genus zero with $s^+$ punctures and $s^- + 1$ boundary circles $C_1, \dots, C_k^1, C_k^2, \dots, C_{s^-}$, where we assume that $\partial\dot S_0^1 = C_1 \cup \dots \cup C_k^1$ and $\partial\dot S_0^2 = C_k^2 \cup \dots \cup C_{s^-}$. As before we know that the tangent spaces to the corresponding Banach manifolds of maps $\mathcal{B}^0$, $\mathcal{B}^0_{\bar\Gamma_0}$ at a branched covering $(h_0, j_0) : (\dot S_0, \partial\dot S_0) \to (\mathbb{R}^+_0 \times S^1, \{0\} \times S^1)$ are given by
$$T_{h_0,j_0}\mathcal{B}^0 = H^{1,p,d}_{\mathrm{const}}(\dot S_0, \mathbb{C}) \oplus H^{1,p}(h_0^*\xi) \oplus T_{j_0}\mathcal{M}_{0,n},$$
$$T_{h_0,j_0}\mathcal{B}^0_{\bar\Gamma_0} = H^{1,p,d}_{\mathrm{const}}(\dot S_0, \mathbb{C}) \oplus H^{1,p}_{\bar\Gamma_0}(h_0^*\xi) \oplus T_{j_0}\mathcal{M}_{0,n} = \{v \in T_{h_0,j_0}\mathcal{B}^0 : v|_{\partial\dot S_0} \in TW^+(\bar\Gamma_0)\},$$
while the fibre of the corresponding Banach space bundle is given by
$$\mathcal{E}^0_{h_0,j_0} = L^{p,d}(\Lambda^{0,1}\dot S_0 \otimes_{j_0,i} \mathbb{C}) \oplus L^p(\Lambda^{0,1}\dot S_0 \otimes_{j_0,J^\xi} h_0^*\xi).$$
For $(h_0, j_0, t) \in \mathcal{M}_0 \times S^1$ we further introduce the Banach manifold of maps $\mathcal{B}^*_{\bar\Gamma} \subset \mathcal{B}^* \subset \mathcal{B}^0$ which should consist of all branched covers of the trivial half-cylinder in $\mathcal{B}^0$ for which the boundary circles $C_k^1, C_k^2 \cong S^1$ are concatenated at $(t_1, t_2) = (n_k^2 t, n_k^1 t) \in C_k^1 \times C_k^2$, to give the singular Riemann surface $\dot S_*$ with $s^-$ boundary circles $C_1, \dots, C_k^1 \cup_t C_k^2, \dots, C_{s^-}$, and we have for $v_{1,2} := v|_{C_k^{1,2}}$,
$$T_{h_0,j_0,t}\mathcal{B}^* = \{v \in T_{h_0,j_0}\mathcal{B}^0 : v_1(n_k^2 t) = v_2(n_k^1 t)\},$$
$$T_{h_0,j_0,t}\mathcal{B}^*_{\bar\Gamma} = \{v \in T_{h_0,j_0,t}\mathcal{B}^* : v|_{\partial\dot S_*} \in TW^+(\bar\Gamma)\}.$$


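The proof of the general gluing theorem in [MDSa] suggests that over $(h_0, j_0, t) \in \mathcal{M}_0 \times S^1 \subset \overline{\mathcal{M}}$ the extended cokernel bundle $\overline{\operatorname{Coker}}\,\bar\partial_J$ has fibre
$$\big(\overline{\operatorname{Coker}}\,\bar\partial_J\big)_{h_0,j_0,t} = \operatorname{coker} D_{h_0,j_0,t}, \qquad D_{h_0,j_0,t} : T_{h_0,j_0,t}\mathcal{B}^*_{\bar\Gamma} \to \mathcal{E}^0_{h_0,j_0}.$$
Before we describe the relation to the cokernel bundle $\overline{\operatorname{Coker}}^0 \bar\partial_J$ over the first factor $\mathcal{M}_0$ with fibre
$$\big(\overline{\operatorname{Coker}}^0 \bar\partial_J\big)_{h_0,j_0} = \operatorname{coker} D_{h_0,j_0}, \qquad D_{h_0,j_0} : T_{h_0,j_0}\mathcal{B}^0_{\bar\Gamma_0} \to \mathcal{E}^0_{h_0,j_0},$$
observe that we still have
$$\operatorname{coker} D_{h_0,j_0} = \operatorname{coker} D^\xi_{h_0}, \qquad D^\xi_{h_0} : T^\xi_{h_0,j_0}\mathcal{B}^0_{\bar\Gamma_0} \to \mathcal{E}^{0,\xi}_{h_0,j_0},$$
$$\operatorname{coker} D_{h_0,j_0,t} = \operatorname{coker} D^\xi_{h_0,t}, \qquad D^\xi_{h_0,t} : T^\xi_{h_0,j_0,t}\mathcal{B}^*_{\bar\Gamma} \to \mathcal{E}^{0,\xi}_{h_0,j_0},$$
and $\ker D^\xi_{h_0} = \ker D^\xi_{h_0,t} = \{0\}$, where $T^\xi_{h_0,j_0}\mathcal{B}^0_{\bar\Gamma_0} \subset T_{h_0,j_0}\mathcal{B}^0_{\bar\Gamma_0}$, $T^\xi_{h_0,j_0,t}\mathcal{B}^*_{\bar\Gamma} \subset T_{h_0,j_0,t}\mathcal{B}^*_{\bar\Gamma}$ and $\mathcal{E}^{0,\xi}_{h_0,j_0} \subset \mathcal{E}^0_{h_0,j_0}$ are the subspaces corresponding to normal deformations.

Now assume without loss of generality that $t = 0$ and $n_k^1 = n_k^2 = 1$. Viewing $\bar\gamma : S^1 \to Q$ as a map starting from $[0, 1]$ (without identifying $0$ and $1$), we introduce the space of sections $C^\infty_{[0,1]}(\bar\gamma^*N)$. With this auxiliary space it is not very hard to observe that the spaces of deformations of $\bar\gamma$ and $\bar\gamma^2$ can be expressed as linear subspaces
$$C^\infty(\bar\gamma^*N) = \{v \in C^\infty_{[0,1]}(\bar\gamma^*N) : v(0) = v(1)\},$$
$$C^\infty((\bar\gamma^2)^*N) = \{(v_1, v_2) \in C^\infty_{[0,1]}(\bar\gamma^*N) \oplus C^\infty_{[0,1]}(\bar\gamma^*N) : v_1(0) = v_2(1),\ v_1(1) = v_2(0)\}.$$
Observing for the tangent spaces to the stable manifolds $W^+(\bar\gamma)$ at $\bar\gamma$ that
$$TW^+(\bar\gamma) \oplus TW^+(\bar\gamma) \subset \{(v_1, v_2) \in TW^+(\bar\gamma^2) : v_1(0) = v_2(0)\} \subset C^\infty(\bar\gamma^*N) \oplus C^\infty(\bar\gamma^*N),$$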

we get from ξ ξ Th 0 , j0 B0 ¯ = {v ∈ Th 0 , j0 B 0 : v|∂ S˙0 ∈ T W + ( ¯ 0 )}, 0

ξ

ξ

¯ Th 0 , j0 ,0 B 0 ¯ = {v ∈ Th 0 , j0 ,0 B ∗ : v|∂ S˙0 ∈ T W + ( )} ξ

ξ

that Th 0 , j0 B0 ¯ ⊂ Th 0 , j0 ,0 B 0 ¯ with quotient space 0

ξ

Th 0 , j0 B 0 ¯

0

ξ Th 0 , j0 ,t

B0 ¯

=

T W + (γ¯ ) ⊕ T W + (γ¯ ) . {(v1 , v2 ) ∈ T W + (γ¯ ) : v1 (0) = v2 (0)} ξ

ξ

On the other hand, since ker Dh 0 = ker Dh 0 ,0 = {0} we find that ξ

coker Dh 0 ,0 ξ

coker Dh 0

ξ

ξ

=

im Dh 0 ξ

im Dh 0 ,0

=

Th 0 , j0 B 0 ¯ ξ

0

Th 0 , j0 ,0 B 0 ¯

=

T W + (γ¯ ) ⊕ T W + (γ¯ ) , T W + (γ¯ 2 ) ∩ (C ∞ (γ¯ ∗ N ) ⊕ C ∞ (γ¯ ∗ N )

where the first equality follows from the fact that $D^\xi_{h_0}$ and $D^\xi_{h_0,0}$ both map to the same Banach space $E^{0,\xi}_{h_0}$. In order to finish the proof, it hence only remains to prove that
\[ \frac{TW^+(\bar\gamma) \oplus TW^+(\bar\gamma)}{TW^+(\bar\gamma^2) \cap (C^\infty(\bar\gamma^* N) \oplus C^\infty(\bar\gamma^* N))} = \frac{TW^-(\bar\gamma^2)}{(TW^-(\bar\gamma) \oplus TW^-(\bar\gamma)) \cap C^\infty((\bar\gamma^2)^* N)}. \]
But this is an immediate consequence of
\[ \frac{TW^+(\bar\gamma) \oplus TW^+(\bar\gamma)}{TW^+(\bar\gamma^2) \cap (C^\infty(\bar\gamma^* N) \oplus C^\infty(\bar\gamma^* N))} = \frac{TW^+(\bar\gamma) \oplus TW^+(\bar\gamma) \oplus (C^\infty(\bar\gamma^* N) \oplus C^\infty(\bar\gamma^* N))^\perp}{TW^+(\bar\gamma^2)} = \frac{(TW^-(\bar\gamma) \oplus TW^-(\bar\gamma))^\perp}{TW^+(\bar\gamma^2)} \]
and
\[ \frac{TW^-(\bar\gamma^2)}{(TW^-(\bar\gamma) \oplus TW^-(\bar\gamma)) \cap C^\infty((\bar\gamma^2)^* N)} = \frac{TW^-(\bar\gamma^2) \oplus (C^\infty((\bar\gamma^2)^* N))^\perp}{TW^-(\bar\gamma) \oplus TW^-(\bar\gamma)} = \frac{(TW^+(\bar\gamma^2))^\perp}{TW^-(\bar\gamma) \oplus TW^-(\bar\gamma)}, \]
where $A^\perp$ denotes the complement of the linear subspace $A$ in $C^\infty_{[0,1]}(\bar\gamma^* N) \oplus C^\infty_{[0,1]}(\bar\gamma^* N)$. Defining an obstruction bundle over $S^1$ by setting

t =

T W − (γ¯ n k ) n 1k

2

{(v1 , v2 ) ∈ T W − (γ¯ ) ⊕ T W − (γ¯ n k ) : v1 (n 2k t) = v2 (n 1k t)}

and putting everything together we hence found that 0 ∼ Coker ∂¯ J = Coker ∂¯ J h 0 , j0 ,t

as desired.

h 0 , j0

⊕ t ,

With this we can prove the desired statement about $\mathbf{g}^0_{\bar\gamma}$.

Corollary 2.9. We have $\mathbf{g}^0_{\bar\gamma} = 0$.

Proof. It follows that the obstruction bundle over the one-dimensional configuration space has rank
\[ \operatorname{Morse}(\bar\gamma^{n_k}) - \operatorname{Morse}(\bar\gamma^{n^1_k}) - \operatorname{Morse}(\bar\gamma^{n^2_k}) + \dim Q - 1 \geq 0, \]
where the latter inequality can be verified as in [F2] using the multiple cover index formulas in [Lo]. When for index reasons the configuration is expected to be discrete, we get a rank-one obstruction bundle over the boundary of the branched cover, which for orientability reasons must indeed be trivial.


On the other hand, we want to emphasize that the proof of $\mathbf{g}^0_{\bar\gamma} = 0$ is much simpler than the proof of $\mathbf{h}^0_\gamma = 0$ in [F2], which has to involve obstruction bundles of arbitrarily large rank and uses induction. Given that the proof in [F2] also holds for Reeb orbits in general contact manifolds, this does not come as a surprise. Going back to the symplectic field theory of unit cotangent bundles $S^*Q$, it is already mentioned in [CL] that the SFT differential $D^0_{\mathrm{SFT}} = \overrightarrow{\mathbf{H}^0} : \mathfrak{A}^0[[\hbar]] \to \mathfrak{A}^0[[\hbar]]$, involving all moduli spaces of holomorphic curves in $\mathbb{R} \times S^*Q$, is much larger than the string differential $D^0_{\mathrm{string}} = \partial + \Delta + \hbar\nabla : \mathfrak{C}^0[[\hbar]] \to \mathfrak{C}^0[[\hbar]]$, which just involves the singular boundary operator and the string bracket and cobracket operations.

2.5. Additional marked points and gravitational descendants. We now want to understand the system of commuting operators defined for Reeb orbits by studying moduli spaces of branched covers over the cylinder over $\gamma$ in terms of operations defined for the underlying closed geodesic $\bar\gamma$. To this end we have to extend the picture of [CL] used for computing the symplectic field theory of Reeb orbits to include additional marked points on the moduli spaces, integration of differential forms and gravitational descendants.

Reintroducing the sequence of formal variables $t_j$, $j \in \mathbb{N}$, we now consider the graded Weyl algebras $\mathfrak{W}_\gamma$, $\mathfrak{W}_{\bar\gamma}$ of power series in $\hbar$, the $p$-variables corresponding to multiples of $\gamma$, $\bar\gamma$ and the $t$-variables, with coefficients which are polynomials in the $q$-variables corresponding to multiples of $\gamma$, $\bar\gamma$. In the same way we can introduce the graded commutative algebras $\mathfrak{A}_\gamma$, $\mathfrak{C}_{\bar\gamma}$ of power series in $\hbar$ and the $t$-variables, with coefficients which are polynomials in the $q$-variables corresponding to multiples of $\gamma$, $\bar\gamma$. For the expansion $\mathbf{H}_\gamma = \mathbf{H}^0_\gamma + \sum_j t_j \mathbf{H}^1_{\gamma,j} + o(t^2)$ of the Hamiltonian from before, we are hence looking for an extended potential $\mathbf{L}_{\gamma,\bar\gamma}$ as well as an extended string Hamiltonian $\mathbf{G}_{\bar\gamma}$,
\[ \mathbf{L}_{\gamma,\bar\gamma} = \mathbf{L}^0_{\gamma,\bar\gamma} + \sum_j t_j \mathbf{L}^1_{\gamma,\bar\gamma,j} + o(t^2), \qquad \mathbf{G}_{\bar\gamma} = \mathbf{G}^0_{\bar\gamma} + \sum_j t_j \mathbf{G}^1_{\bar\gamma,j} + o(t^2), \]
such that $\overrightarrow{\mathbf{L}_{\gamma,\bar\gamma}} : (\mathfrak{A}_\gamma[[\hbar]], \overrightarrow{\mathbf{H}_\gamma}) \to (\mathfrak{C}_{\bar\gamma}[[\hbar]], \overrightarrow{\mathbf{G}_{\bar\gamma}})$ is an isomorphism of $BV_\infty$-algebras. For this we have to prove the extended master equation
\[ e^{\mathbf{L}_{\gamma,\bar\gamma}} \overleftarrow{\mathbf{H}_\gamma} - \overrightarrow{\mathbf{G}_{\bar\gamma}}\, e^{\mathbf{L}_{\gamma,\bar\gamma}} = 0, \]
while the isomorphism property again follows using the natural filtration given by the $t$-variables. Since we are only interested in the system of commuting operators $\mathbf{H}^1_{\gamma,j}$, $j \in \mathbb{N}$, which is defined by counting branched covers of orbit cylinders with at most one additional marked point, we again will only discuss the required compactness statements in the case of one additional marked point. Furthermore we will still restrict to the rational case. In other words we will prove the following proposition, which is just a reformulation of our theorem from above.

Proposition 2.10. The system of Poisson-commuting functions $\mathbf{h}^1_{\gamma,j}$, $j \in \mathbb{N}$, on $\mathfrak{P}^0_\gamma$ is isomorphic to a system of Poisson-commuting functions $\mathbf{g}^1_{\bar\gamma,j}$, $j \in \mathbb{N}$, on $\mathfrak{P}^0_{\bar\gamma} = \mathfrak{P}^0_\gamma$,


where for every $j \in \mathbb{N}$ the descendant Hamiltonian $\mathbf{g}^1_{\bar\gamma,j}$ is given by
\[ \mathbf{g}^1_{\bar\gamma,j} = \sum \epsilon(\vec n)\, \frac{q_{n_1} \cdots q_{n_{j+2}}}{(j+2)!}, \]
where the sum runs over all ordered monomials $q_{n_1} \cdots q_{n_{j+2}}$ with $n_1 + \cdots + n_{j+2} = 0$ and which are of degree $2(m + j - 3)$. Further, $\epsilon(\vec n) \in \{-1, 0, +1\}$ is fixed by a choice of coherent orientations in symplectic field theory and is zero if and only if one of the orbits $\gamma^{n_1}, \dots, \gamma^{n_{j+2}}$ is bad.

Proof. While the proof seems to require the definition of gravitational descendants for moduli spaces of holomorphic curves not only with punctures but also with boundary, instead of defining them, recall that we have shown in Subsect. 2.2 that the gravitational descendants can be replaced by imposing branching conditions over the special marked point on the orbit cylinder. More precisely, recall that the lemma in Subsect. 2.2 states that we can indeed write each of the Hamiltonians $\mathbf{h}^1_{\gamma,j}$ as a weighted sum,
\[ \mathbf{h}^1_{\gamma,j} = \frac{1}{j!} \cdot \mathbf{h}^1_{\gamma,(j)} + \sum_{|\mu| < j} \rho^0_{j,\mu} \cdot \mathbf{h}^1_{\gamma,\mu}, \]

where $\mathbf{h}^1_{\gamma,\mu} \in \mathfrak{P}^0_\gamma$ counts rational branched covers of the orbit cylinder with $\ell(\mu)$ connected components, each carrying precisely one additional marked point $z_1, \dots, z_{\ell(\mu)}$; these points are mapped to the special point on the orbit cylinder and $z_i$ is a branch point of order $\mu_i - 1$ for all $i = 1, \dots, \ell(\mu)$. While for the invariance statement for gravitational descendants we were studying the compactification of the moduli spaces of holomorphic curves with one additional marked point, it follows from the definition of $\mathbf{h}^1_{\gamma,\mu}$ that now it is natural to study the moduli spaces of branched covers of the trivial half-cylinder with $\ell(\mu)$ connected components, each carrying precisely one additional marked point $z_1, \dots, z_{\ell(\mu)}$; these points are mapped to the special point on the trivial half-cylinder and $z_i$ is a branch point of order $\mu_i - 1$ for all $i = 1, \dots, \ell(\mu)$. While for the orbit cylinder the natural $\mathbb{R}$-action is used to fix not only the $S^1$-coordinate but also the $\mathbb{R}$-coordinate of the special point, note that, in order to find the branched covers of the orbit cylinder counted in $\mathbf{h}^1_{\gamma,\mu}$ in the boundary, for the trivial half-cylinder we still fix the $S^1$-coordinate but allow the $\mathbb{R}$-coordinate to vary in $\mathbb{R}^+ = (0, \infty)$. It follows that besides the boundary phenomena of the moduli spaces of branched covers of the trivial half-cylinder already described above (branch points moving to infinity or leaving the branched cover through the boundary), the new boundary phenomena are the additional marked points moving to infinity or leaving the branched cover through the boundary, which is equivalent to the special point moving to infinity or leaving the half-cylinder through the boundary. In particular, it follows from the latter equivalence that the additional marked points $z_1, \dots, z_{\ell(\mu)}$ move to infinity or leave the branched cover all at once.

While the moving of the additional marked points to infinity, possibly together with other branch points, is counted in $\mathbf{h}^1_{\gamma,\mu}$, the corresponding string Hamiltonian $\mathbf{g}^1_{\bar\gamma,\mu}$ should describe what happens if the additional marked points leave the branched cover through the boundary. Provided that we have found $\mathbf{g}^1_{\bar\gamma,\mu} \in \mathfrak{P}^0_{\bar\gamma}$ for all branching profiles


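$\mu$, it then follows from linearity that we obtain the desired Poisson-commuting sequence $\mathbf{g}^1_{\bar\gamma,j}$ by setting
\[ \mathbf{g}^1_{\bar\gamma,j} = \frac{1}{j!} \cdot \mathbf{g}^1_{\bar\gamma,(j)} + \sum_{|\mu| < j} \rho^0_{j,\mu} \cdot \mathbf{g}^1_{\bar\gamma,\mu}. \]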

On the other hand, recall that in the computation of $\mathbf{g}^0_{\bar\gamma}$ we were faced with a transversality problem. While we have shown that the set of configurations counted for $\mathbf{g}^0_{\bar\gamma}$ is always one-dimensional, one can compute using the Morse indices of the involved multiply-covered geodesics that the Fredholm index may expect the same set to be discrete. In the case when the Fredholm index is right, we have shown that we get an obstruction bundle of rank one to cut down the dimension of the configuration space, which is however trivial by orientability. For $\mathbf{g}^1_{\bar\gamma,\mu}$ we now show that the situation is even nicer.

Lemma 2.11. For every branching condition $\mu$ the set of configurations studied for $\mathbf{g}^1_{\bar\gamma,\mu}$ is already discrete before we add abstract perturbations to the Cauchy-Riemann operator. It follows that, if the Fredholm index is right, there is no obstruction bundle.

Before we show why this lemma leads to a proof of the above proposition and hence of the theorem, note that when $\bar\gamma = Q = S^1$ transversality is always satisfied and hence there are no obstruction bundles at all. On the other hand, note that the above proposition is formulated such that it holds in this case, where we use that $\mathbf{g}^1_{S^1,\mu} = \mathbf{h}^1_{S^1,\mu}$, which follows from the fact that the (rational) potential $\mathbf{L}^0_{S^1,S^1}$ ($\mathbf{l}^0_{S^1,S^1}$) only counts orbit cylinders. In order to see that for an arbitrary closed geodesic $\bar\gamma \subset Q$ the lemma proves the proposition and hence the theorem, observe that the Fredholm index is right precisely when it leads to the maximal degree $2(m + j - 3)$ from the proposition. Since the configuration space is independent of $\bar\gamma$ before perturbing, in this case the lemma tells us that the corresponding configurations counted for $\mathbf{g}^1_{\bar\gamma,\mu}$ indeed agree with the ones counted for $\mathbf{g}^1_{S^1,\mu}$, up to a sign determined by a choice of coherent orientations for the moduli spaces as described in [BM]. On the other hand, the results in [BM] show that the bad orbits indeed cancel out.

For both statements we refer to the work of Cieliebak and Latschev, which shows that the orientation choices for closed Reeb orbits have a natural translation into orientation choices for the underlying closed geodesics, that is, for their unstable manifolds for the energy functional. In particular, we have, see [CL], that the Reeb orbit $\gamma$ is bad if and only if the unstable manifold of $\bar\gamma$ is not orientable. On the other hand, when the Fredholm index is not right, and hence the degree is not maximal, we do not get a contribution to $\mathbf{g}^1_{\bar\gamma,\mu}$ by definition. Hence it just remains to prove the lemma.

Proof of the lemma. For simplicity we first prove the statement for $\mu = (2)$. Following the above description of $\mathbf{g}^1_{\bar\gamma,\mu}$ it follows that $\mathbf{g}^1_{\bar\gamma,(2)}$ describes what happens if the additional marked point, which is a simple branch point, leaves the branched cover through the boundary. While at first it sounds as if $\mathbf{g}^1_{\bar\gamma,(2)}$ agrees with $\mathbf{g}^0_{\bar\gamma}$, note that now the branch point is required to sit over the special point on the boundary of the half-cylinder. Since the $S^1$-coordinate of the special point is fixed, it follows that the branch point can


no longer leave the branched cover through every point on the boundary. In particular, while for $\mathbf{g}^0_{\bar\gamma}$ we obtained a one-dimensional configuration space due to the obvious $S^1$-symmetry, it follows that for the configurations counted in $\mathbf{g}^1_{\bar\gamma,(2)}$ the $S^1$-symmetry is no longer present. Due to the important observation (which we already used to compute $\mathbf{g}^0_{\bar\gamma}$) that for the codimension-one boundary we can assume that there are no other branch points leaving the boundary at the same time, it follows that the set of configurations is indeed discrete. On the other hand, it is clear that this argument immediately generalizes to all branching profiles $\mu$, since all the $\ell(\mu)$ additional marked points are mapped to the same fixed special point. Together with the observation that the additional marked points $z_1, \dots, z_{\ell(\mu)}$ leave the branched cover through the boundary all at once when the special point leaves the half-cylinder through the boundary, but again no other branch points do so for codimension reasons, the corresponding set of configurations stays discrete.

To finish the proof of the theorem, observe that the sign $\epsilon(\vec n) \in \{-1, 0, +1\}$ is fixed by a choice of coherent orientations in symplectic field theory and is zero if and only if one of the orbits $\gamma^{n_1}, \dots, \gamma^{n_{j+2}}$ is bad. For this recall from [BM] that in order to orient moduli spaces in symplectic field theory one additionally needs to choose orientations for all occurring Reeb orbits, while the resulting invariants are independent of these auxiliary choices. Recall that we have shown in Proposition 2.8 how (for $j = 1$) this obstruction bundle and hence its orientation is determined by the tangent spaces to the unstable manifolds of the multiply-covered geodesics. While the orientation of a closed Reeb orbit in SFT corresponds to an orientation of the (finite-dimensional) unstable manifold, the sign in front of $p_{n^1_k} p_{n^2_k} q_{n_k}$ ($n^1_k + n^2_k = n_k$) in $\mathbf{g}^1_{\bar\gamma,1}$ is given by comparing the orientations of the finite-dimensional linear subspaces $TW^-(\bar\gamma^2)$ and $\{ (v_1, v_2) \in TW^-(\bar\gamma) \oplus TW^-(\bar\gamma) : v_1(0) = v_2(0) \}$ of $C^\infty((\bar\gamma^2)^* N)$. For $j > 1$ the obstruction bundle gets much more complicated, but the idea is the same. Apart from the fact that the commutativity condition $\{ \mathbf{g}^1_{\bar\gamma,j}, \mathbf{g}^1_{\bar\gamma,k} \} = 0$ clearly leads to relations between the different $\epsilon(\vec n)$, observe that a choice of orientation for $\gamma$ does not lead to a canonical choice of orientations for its multiples $\gamma^k$. While we expect that it is in general very hard to write down a set of signs $\epsilon(\vec n)$ explicitly, for all the geometric applications we have in mind, and for educational purposes as a test model beyond the Gromov-Witten case, we are rather interested in proving vanishing results such as the one above.

Acknowledgements. This research was supported by the German Research Foundation (DFG). The author thanks K. Cieliebak, Y. Eliashberg, K. Fukaya, M. Hutchings and P. Rossi for useful discussions.

References

[BEHWZ] Bourgeois, F., Eliashberg, Y., Hofer, H., Wysocki, K., Zehnder, E.: Compactness results in symplectic field theory. Geom. and Top. 7, 799–888 (2003)
[BM] Bourgeois, F., Mohnke, K.: Coherent orientations in symplectic field theory. Math. Z. 248, (2003)
[CL] Cieliebak, K., Latschev, J.: The role of string topology in symplectic field theory. http://arxiv.org/abs/0706.3284v2 [math.SG], 2007
[CM] Cieliebak, K., Mohnke, K.: Symplectic hypersurfaces and transversality for Gromov-Witten theory. J. Symp. Geom. 5, 281–356 (2007)
[CMS] Cieliebak, K., Mundet, I., Salamon, D.: Equivariant moduli problems, branched manifolds, and the Euler class. Topology 42(3), 641–700 (2003)
[E] Eliashberg, Y.: Symplectic field theory and its applications. Proceedings of the ICM 2006, available at http://math.stanford.edu/~eliash/Public/eliashberg.pdf, 2006
[EGH] Eliashberg, Y., Givental, A., Hofer, H.: Introduction to symplectic field theory. GAFA 2000 Visions in Mathematics Special Volume, Part II, 560–673 (2000)
[F1] Fabert, O.: Contact homology of Hamiltonian mapping tori. Comm. Math. Helv. 85, 203–241 (2010)
[F2] Fabert, O.: Obstruction bundles over moduli spaces with boundary and the action filtration in symplectic field theory. http://arxiv.org/abs/0709.3312v3 [math.SG], 2010
[FR] Fabert, O., Rossi, P.: String, dilaton and divisor equation in symplectic field theory. http://arxiv.org/abs/1001.3094v2 [math.SG], 2010
[HT1] Hutchings, M., Taubes, C.: Gluing pseudoholomorphic curves along branched covered cylinders I. J. Symp. Geom. 5, 43–138 (2007)
[HT2] Hutchings, M., Taubes, C.: Gluing pseudoholomorphic curves along branched covered cylinders II. J. Symp. Geom. 7, 29–133 (2009)
[HWZ] Hofer, H., Wysocki, K., Zehnder, E.: A general Fredholm theory I: a splicing-based differential geometry. J. Eur. Math. Soc. 9(4), 841–876 (2007)
[L] Li, J.: A degeneration formula of GW-invariants. J. Diff. Geom. 60(2), 199–293 (2002)
[Lo] Long, Y.: Index theory for symplectic paths with applications. Progress in Mathematics 207, Basel-Boston: Birkhäuser, 2002
[MDSa] McDuff, D., Salamon, D.A.: J-holomorphic curves and symplectic topology. AMS Colloquium Publications, Providence, RI: Amer. Math. Soc., 2004
[OP] Okounkov, A., Pandharipande, R.: Gromov-Witten theory, Hurwitz theory and completed cycles. Ann. of Math. 163(2), 517–560 (2006)
[R1] Rossi, P.: Gromov-Witten invariants of target curves via symplectic field theory. J. Geom. Phys. 58, 931–941 (2008)
[R2] Rossi, P.: Integrable systems and holomorphic curves. http://arxiv.org/abs/0912.0451v2 [math.SG], 2010
[Sch] Schwarz, M.: Cohomology operations from $S^1$-cobordisms in Floer homology. Ph.D. thesis, Swiss Federal Inst. of Techn. Zurich, Diss. ETH No. 11182, 1995
[W] Wendl, C.: Automatic transversality and orbifolds of punctured holomorphic curves in dimension four. http://arxiv.org/abs/0802.3842v4 [math.SG], 2009

Communicated by N.A. Nekrasov

Commun. Math. Phys. 302, 161–224 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1182-9

Communications in

Mathematical Physics

A New Variational Approach to the Stability of Gravitational Systems

Mohammed Lemou1, Florian Méhats1, Pierre Raphaël2

1 CNRS and IRMAR, Université de Rennes 1, Rennes, France. E-mail: [email protected]; [email protected]
2 IMT, Université Paul Sabatier, Toulouse, France. E-mail: [email protected]

Received: 25 September 2009 / Accepted: 31 July 2010
Published online: 11 January 2011 – © Springer-Verlag 2011

Abstract: We consider the three dimensional gravitational Vlasov-Poisson system which describes the mechanical state of a stellar system subject to its own gravity. A well-known conjecture in astrophysics is that the steady state solutions which are nonincreasing functions of their microscopic energy are nonlinearly stable under the flow. This was proved at the linear level by several authors based on the pioneering work by Antonov in 1961. Since then, standard variational techniques based on concentration compactness methods as introduced by P.-L. Lions in 1983 have led to the nonlinear stability of subclasses of stationary solutions of ground state type. In this paper, inspired by pioneering works from the physics literature (MNRAS 241:15, 1989), (Mon. Not. R. Astr. Soc. 144:189–217, 1969), (Mon. Not. R. Ast. Soc. 223:623–646, 1988), we use the monotonicity of the Hamiltonian under generalized symmetric rearrangement transformations to prove that nonincreasing steady solutions are local minimizers of the Hamiltonian under equimeasurable constraints, and extract compactness from suitable minimizing sequences. This implies the nonlinear stability of nonincreasing anisotropic steady states under radially symmetric perturbations.

1. Introduction and Main Results

1.1. Setting of the problem. We consider the three dimensional gravitational Vlasov-Poisson system
\[ \begin{cases} \partial_t f + v \cdot \nabla_x f - \nabla\phi_f \cdot \nabla_v f = 0, & (t, x, v) \in \mathbb{R}_+ \times \mathbb{R}^3 \times \mathbb{R}^3, \\ f(t = 0, x, v) = f_0(x, v) \geq 0, \end{cases} \tag{1.1} \]
where, throughout this paper,
\[ \rho_f(x) = \int_{\mathbb{R}^3} f(x, v)\, dv \quad \text{and} \quad \phi_f(x) = -\frac{1}{4\pi |x|} * \rho_f \tag{1.2} \]


are the density and the gravitational Poisson field associated to $f$. This nonlinear transport equation is a well-known model in astrophysics for the description of the mechanical state of a stellar system subject to its own gravity and the dynamics of galaxies, see for instance [10,15]. Unique global classical solutions for initial data $f_0 \in C^1_c$, $f_0 \geq 0$, where $C^1_c$ denotes the space of compactly supported and continuously differentiable functions, have been shown to exist in [40,47,49] and to propagate the corresponding regularity. Two fundamental properties of the nonlinear transport flow (1.1) are first the preservation of the total Hamiltonian
\[ \mathcal{H}(f(t)) = \frac{1}{2} \int_{\mathbb{R}^6} |v|^2 f(t, x, v)\, dx\, dv - \frac{1}{2} \int_{\mathbb{R}^3} |\nabla\phi_f(t, x)|^2\, dx = \mathcal{H}(f(0)), \tag{1.3} \]
and second the preservation of all the so-called Casimir functions: for all $G \in C^1([0, +\infty), \mathbb{R}_+)$ such that $G(0) = 0$,
\[ \int_{\mathbb{R}^6} G(f(t, x, v))\, dx\, dv = \int_{\mathbb{R}^6} G(f_0(x, v))\, dx\, dv. \tag{1.4} \]
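As a brief check of (1.4), the following is a sketch of the standard formal computation. It assumes that $f$ is a smooth, compactly supported solution of (1.1), so that all integrations by parts are justified; this regularity assumption is made here only for the illustration.

```latex
\begin{align*}
\frac{d}{dt}\int_{\mathbb{R}^6} G(f)\,dx\,dv
 &= \int_{\mathbb{R}^6} G'(f)\,\partial_t f\,dx\,dv \\
 &= \int_{\mathbb{R}^6} G'(f)\,\bigl(-v\cdot\nabla_x f
      + \nabla\phi_f\cdot\nabla_v f\bigr)\,dx\,dv \\
 &= \int_{\mathbb{R}^6} \Bigl[-\nabla_x\cdot\bigl(v\,G(f)\bigr)
      + \nabla_v\cdot\bigl(\nabla\phi_f\,G(f)\bigr)\Bigr]\,dx\,dv = 0.
\end{align*}
```

The last line uses that $v$ does not depend on $x$ and that $\nabla\phi_f$ does not depend on $v$, so the transport field is divergence free in phase space; the boundary terms vanish because $G(0) = 0$ and $f$ has compact support.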

This last property induces a continuum of conservation laws and is the major difference between this kind of problem and other nonlinear dispersive problems like nonlinear wave or Schrödinger equations.

1.2. Nonlinear stability of steady state solutions. A classical problem which has attracted a considerable amount of work both in the astrophysical [2–4,25,26,41,42,54] and mathematical communities is the question of the nonlinear stability of stationary states. If we restrict our study to radially symmetric stationary states, that is a priori depending on $(|x|, |v|, x \cdot v)$ only, Jeans' theorem [8] ensures that they can be described as functions of their own microscopic energy and their angular momentum:
\[ e(x, v) = \frac{|v|^2}{2} + \phi_Q(x), \qquad \ell(x, v) = |x \times v|^2, \tag{1.5} \]
\[ Q(x, v) = F(e(x, v), \ell(x, v)). \tag{1.6} \]
The existence of such steady states has been discussed in [8] for a large class of smooth functions $F$. A well-known conjecture in astrophysics [10] is that among these stationary solutions, those which are nonincreasing functions of their microscopic energy $e$ are nonlinearly stable under the Vlasov-Poisson flow. Explicitly:

Conjecture. Nonincreasing anisotropic galaxies $F = F(e, \ell)$ with $\frac{\partial F}{\partial e} < 0$ on the support of $Q$ are stable under spherically symmetric perturbations for the flow (1.1). Nonincreasing isotropic spherical galaxies $F = F(e)$ with $\frac{\partial F}{\partial e} < 0$ on the support of $Q$ are orbitally stable against general perturbations for the flow (1.1).

Remarkably enough, this conjecture has been proved at the linear level by Doremus, Baumann and Feix [14] (see also [17,25,52] for related works), following the pioneering work by Antonov in the 60's [3,4]. These results are based on some coercivity properties of the linearized Hamiltonian under constraints formally arising from the linearization of the Casimir conservation laws (1.4), see Lynden-Bell [41]. At the nonlinear level, the general problem is open. However, the nonlinear stability of a large class of stationary solutions of so-called ground state type including


the polytropic states has been obtained using variational methods in [13,18,20–22,55], completed by [50]. In [28–30], see also [48], we observed that a direct application of Lions' concentration compactness techniques [38,39] implies that for a large class of convex functions $j$, the two-parameter (according to the scaling symmetry of (1.1)) minimization problem
\[ I(M_1, M_j) = \inf_{|f|_{L^1} = M_1,\ |j(f)|_{L^1} = M_j} \mathcal{H}(f), \qquad M_1, M_j > 0, \tag{1.7} \]
is attained up to symmetries on a steady state solution to (1.1) of the form (1.6), and all minimizing sequences to (1.7) are relatively compact up to a translation shift in the natural energy space
\[ \mathcal{E} = \{ f \geq 0 \text{ with } |f|_{\mathcal{E}} = |f|_{L^1} + |f|_{L^\infty} + \| |v|^2 f \|_{L^1} < +\infty \}. \]
The Cazenave-Lions theory of orbital stability [11] then immediately implies the orbital stability of the corresponding ground state steady solution [29]. In fact, this last step requires the knowledge of the uniqueness of the minimizer to (1.7), which is a delicate open problem in general, see [50], but this difficulty was overcome in [30]. Other non-variational approaches based on linearization techniques have also been explored in [23,53]. Recently, Guo and Lin [19] proved the radial stability of the so-called King model $F(e) = (\exp(e_0 - e) - 1)_+$, which is not in the class of ground states as obtained in the framework of (1.7). Adapting a robust approach developed by Lin and Strauss in their study of the Vlasov-Maxwell system [35–37], the authors use the infinity of conservation laws provided by the nonlinear transport to construct a sufficiently large approximation of the kernel of the linearized operator close to the steady state. This allows them to recover a coercivity statement for the linearized energy using Antonov's coercivity property, which after linearization and control of higher order terms for the King model yields the claimed stability in the radial class.

1.3. Additional conserved quantities in the radial setting. Our main purpose in this paper is to describe a generalized variational approach for the nonlinear stability of steady states which fully takes into account the nonlinear transport structure of the problem, and in particular the continuum of constraints at hand from (1.4). First recall that in general, the full set of invariant quantities conserved by the nonlinear transport flow (1.1) depends on the initial data and its possible symmetries. From now on and for the rest of this paper, we shall restrict our attention to spherically symmetric solutions $f(x, v) = f(|x|, |v|, x \cdot v)$, where we will systematically abuse notation and identify $f$ with its image through various diffeomorphisms. We then let $\mathcal{E}_{rad}$ be the space of spherically symmetric distribution functions of finite energy
\[ \mathcal{E}_{rad} = \{ f \in \mathcal{E},\ f \text{ spherically symmetric} \}, \tag{1.8} \]
and recall that if $f$ is spherically symmetric, then $\rho_f(x) = \rho_f(|x|)$ and $\phi_f(x) = \phi_f(|x|)$. This implies in particular from a direct computation that the momentum $\ell = |x \times v|^2$ is conserved by the characteristic flow associated to (1.1), and hence a larger class of Casimir conservation laws (1.4) holds:


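\[ \int_{\mathbb{R}^6} G(f(t, x, v), |x \times v|^2)\, dx\, dv = \int_{\mathbb{R}^6} G(f_0(x, v), |x \times v|^2)\, dx\, dv \tag{1.9} \]
for all $G \in C^1([0, +\infty) \times [0, +\infty), \mathbb{R}_+)$ with $G(0, \ell) = 0$, $\forall \ell \geq 0$. Let us reformulate (1.9) in terms of equimeasurability properties of $f$ and $f_0$. Performing the change of variables
\[ r = |x|, \quad w = |v|, \quad x \cdot v = |x||v| \cos\theta, \quad r, w > 0, \quad \theta \in\, ]0, \pi[, \]
the Lebesgue measure is mapped onto:
\[ \int_{\mathbb{R}^6} f(x, v)\, dx\, dv = 8\pi^2 \int_{r=0}^{+\infty} \int_{w=0}^{+\infty} \int_{\theta=0}^{\pi} f(r, w, \cos\theta)\, r^2 w^2 \sin\theta\, dr\, dw\, d\theta. \]
We then perform the second change of variables
\[ r = r, \quad u = w\, \mathrm{sign}(\cos\theta), \quad \ell = r^2 w^2 \sin^2\theta, \]
and get from Fubini:
\[ \int_{\mathbb{R}^6} f(x, v)\, dx\, dv = \int_{\ell=0}^{+\infty} \int_{(r,u) \in \Omega_\ell} f(r, u, \ell)\, d\nu_\ell\, d\ell \tag{1.10} \]
with
\[ \Omega_\ell = \{ (r, u) \in \mathbb{R}_+ \times \mathbb{R} \text{ with } r^2 u^2 > \ell \} \tag{1.11} \]
and
\[ d\nu_\ell = 4\pi^2\, \mathbf{1}_{r^2 u^2 > \ell}\, (r^2 u^2 - \ell)^{-1/2}\, r |u|\, dr\, du. \tag{1.12} \]
We then define the distribution function of $f$ at given kinetic momentum $\ell$:
\[ \forall \ell > 0,\ \forall s \geq 0, \quad \mu_f(s, \ell) = \nu_\ell \{ (r, u) \in \Omega_\ell,\ f(r, u, \ell) > s \}, \tag{1.13} \]
or equivalently
\[ \mu_f(s, \ell) = 4\pi^2 \int_{r=0}^{+\infty} \int_{u=-\infty}^{+\infty} \mathbf{1}_{f(r,u,\ell) > s}\, (r^2 u^2 - \ell)^{-1/2}\, r |u|\, \mathbf{1}_{r^2 u^2 > \ell}\, dr\, du. \tag{1.14} \]
We now define the set of distribution functions which are equimeasurable to $f$ at given $\ell$ by:
\[ \mathrm{Eq}(f) = \{ g \geq 0 \text{ spherically symmetric},\ \forall s > 0,\ \mu_f(s, \ell) = \mu_g(s, \ell) \text{ a.e. } \ell \}. \tag{1.15} \]
We then have from standard arguments:

Lemma 1.1 (Characterization of $\mathrm{Eq}(f)$). Let $f \in L^1 \cap L^\infty$ be nonnegative and spherically symmetric. Then the following are equivalent:
(i) $g \in \mathrm{Eq}(f)$;
(ii) for all $G(h, \ell) \geq 0$, $C^1$ with $G(0, \ell) = 0$,
\[ \int_{\mathbb{R}^6} G(f(x, v), |x \times v|^2)\, dx\, dv = \int_{\mathbb{R}^6} G(g(x, v), |x \times v|^2)\, dx\, dv \]
holds.

Lemma 1.1 allows us to reformulate the conservation laws of the full Casimir class (1.9) in the radial setting as follows:
\[ \forall t \geq 0, \quad f(t) \in \mathrm{Eq}(f_0). \tag{1.16} \]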
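The weight appearing in (1.12) can be checked by a direct Jacobian computation. The following is a sketch for fixed $r, w > 0$ on the half-range $\theta \in\, ]0, \pi/2[$, where $\cos\theta > 0$ and $u = w$; the other half-range is treated identically with $|\cos\theta|$ and $u = -w$.

```latex
\begin{align*}
\ell = r^2 w^2 \sin^2\theta
 &\;\Longrightarrow\; d\ell = 2 r^2 w^2 \sin\theta\,\cos\theta\,d\theta, \\
\cos\theta = \Bigl(1 - \frac{\ell}{r^2 w^2}\Bigr)^{1/2}
 &= \frac{(r^2 w^2 - \ell)^{1/2}}{r w}, \\
8\pi^2\, r^2 w^2 \sin\theta\,d\theta
 &= \frac{8\pi^2\,d\ell}{2\cos\theta}
  = 4\pi^2\,(r^2 w^2 - \ell)^{-1/2}\, r w\,d\ell .
\end{align*}
```

Since $r^2 u^2 = r^2 w^2 \geq \ell$ and $|u| = w$, summing the two half-ranges $u = \pm w$ reproduces the measure $d\nu_\ell\, d\ell$ of (1.10) and (1.12), with the indicator $\mathbf{1}_{r^2 u^2 > \ell}$ recording the range of $\ell$ at fixed $(r, u)$.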


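1.4. Assumption (A) on the steady state. Before stating the results, let us fix our assumptions on the steady state $Q$.

(i) $Q$ is a continuous, nonnegative, nonzero, compactly supported steady state solution of the Vlasov-Poisson system (1.1).
(ii) There exists a continuous function $F : \mathbb{R} \times \mathbb{R}_+ \to \mathbb{R}_+$ such that
\[ \forall (x, v) \in \mathbb{R}^6, \quad Q(x, v) = F\left( \frac{|v|^2}{2} + \phi_Q(x),\ |x \times v|^2 \right). \tag{1.17} \]
(iii) There exists $e_0 < 0$ such that:
\[ \mathcal{O} = \{ (e, \ell) \in \mathbb{R} \times \mathbb{R}_+ : F(e, \ell) > 0 \} \subset\, ]-\infty, e_0[\, \times \mathbb{R}_+, \qquad F \text{ is } C^1 \text{ on } \mathcal{O} \text{ with } \frac{\partial F}{\partial e} < 0. \]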

Remark 1.2. Note that $\frac{\partial F}{\partial e}$ may be infinite at the boundary of $\mathcal{O}$, as is the case for polytropic ground states $F(e, \ell) = (e_0 - e)_+^q\, \ell^\kappa$, for some $0 < q < 1$ and $\kappa \geq 0$.

Below we list a number of physically relevant models to which our nonlinear stability result applies. All these examples are extracted from [10], to which we refer for a detailed physical description of various gravitational models.

Examples.
– Polytropes and double-power models: The polytropes correspond to the following form of $F$:
\[ F(e, \ell) = (e_0 - e)_+^q\, \ell^\kappa, \qquad 0 < q < 7/2, \quad \kappa \geq 0, \]
where $e_0 < 0$ is a constant threshold energy. A generalization of these polytropes is provided by the so-called double-power model [10]:
\[ F(e, \ell) = \sum_{0 \leq i, j \leq N} \alpha_{ij}\, (e_0 - e)_+^{q_i}\, \ell^{\kappa_j}, \]
where $\alpha_{ij}$ are nonnegative constants.
– Michie-King models:
\[ F(e, \ell) = \exp(-\ell / 2 r_a^2)\, (\exp(e_0 - e) - 1)_+, \]
where $e_0 < 0$ and the constant $r_a > 0$ is the anisotropy radius [10]. When $r_a$ goes to infinity, this model reduces to the King model.
– Osipkov-Merritt models:
\[ F(e, \ell) = G\left( e_0 - e + \frac{\ell}{2 r_a^2} \right), \]
where $e_0 < 0$, $r_a > 0$ are constants, and $G$ is a nonincreasing $C^1$ function such that $G(t) = 0$ for all $t \leq 0$.
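As a worked check of Assumption (A)(iii) and of Remark 1.2 on the simplest example, the derivative of the polytropic profile can be computed directly on the set where $F > 0$, that is for $e < e_0$ and $\ell > 0$ (the case $\ell = 0$ only belongs to this set when $\kappa = 0$, where the factor $\ell^\kappa$ equals 1):

```latex
\begin{align*}
F(e,\ell) &= (e_0 - e)^{q}\,\ell^{\kappa},
 \qquad e < e_0,\ \ell > 0, \\
\frac{\partial F}{\partial e}(e,\ell)
 &= -\,q\,(e_0 - e)^{q-1}\,\ell^{\kappa} \;<\; 0, \\
\frac{\partial F}{\partial e}(e,\ell)
 &\longrightarrow -\infty
 \quad \text{as } e \to e_0^- \text{ when } 0 < q < 1 .
\end{align*}
```

So the monotonicity condition holds on all of $\mathcal{O}$, while for $0 < q < 1$ the derivative blows up at the boundary $e = e_0$, which is exactly the singular behaviour allowed by Remark 1.2.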


1.5. Statement of the results. From (1.16), a natural generalization of (1.7) in the radial setting is to minimize the Hamiltonian under constraints of given equimeasurability. This is a very natural strategy to prove stability in a nonlinear transport setting which goes back in fluid mechanics to the celebrated works of Arnold, see e.g. [5–7], Marchioro and Pulvirenti [43,45], Wolansky and Ghil [56], and references therein, and is also very much present in the physics literature, see in particular Lynden-Bell [41], Gardner [16], Wiechen, Ziegler, Schindler [54], Aly [2] and references therein. The mathematical implementation of the corresponding variational problem is however confronted with the description of bounded sequences in $\mathrm{Eq}(f_0)$ and a possible lack of compactness in general, see for example Alvino, Trombetti and Lions [1] for an introduction to this kind of problem. Our first result is the characterization of nonincreasing states as local minimizers of the Hamiltonian in $\mathcal{E}_{rad}$ under a constraint of equimeasurability:

Theorem 1.3 (Local variational characterization of $Q$). There exists a constant $C_0 > 0$ such that the following holds. For all $R > 0$, there exists $\delta_0(R) > 0$ such that, for all $f \in \mathcal{E}_{rad} \cap \mathrm{Eq}(Q)$ satisfying
\[ |f - Q|_{\mathcal{E}} \leq R, \qquad |\nabla\phi_f - \nabla\phi_Q|_{L^2} \leq \delta_0(R), \tag{1.18} \]
we have
\[ \mathcal{H}(f) - \mathcal{H}(Q) \geq C_0\, |\nabla\phi_f - \nabla\phi_Q|^2_{L^2}. \tag{1.19} \]

If in addition $\mathcal{H}(f) = \mathcal{H}(Q)$, then $f = Q$.

Theorem 1.3 was first obtained by Guo, Rein [23] for a perturbation $f$ near $Q$ (see footnote 1) in the specific case of the isotropic King model, and for isotropic relativistic models $F(e,\ell) = F(e)$ with locally bounded derivative $F'(e)$ in [24]; this excludes any singularity at the boundary, as many polytropic models would have. Let us stress that Theorem 1.3 alone is too weak to yield a stability statement including the full set of radial perturbations. Hence the importance of Theorem 1.3 lies in fact mostly in its proof. Indeed, a new important feature of our analysis is to use a monotonicity property of the Hamiltonian under a generalized Schwarz symmetrization, which is not the standard radial rearrangement but a rearrangement with respect to a given microscopic energy $\frac{|v|^2}{2} + \phi(x)$, at fixed angular momentum $|x \times v|^2$; see Proposition 2.8 for a precise definition and Proposition 3.1 for the monotonicity statement. This monotonicity is very much a consequence of the "bathtub" principle for symmetric rearrangements, see Lieb and Loss [33], and was already observed in the physics literature, see Gardner [16], Aly [2]. It produces a reduced functional $\mathcal{J}(\phi_f)$ which depends on the Poisson field $\phi_f$ only and not on the full distribution function. The outcome is a lower bound
\[ \mathcal{H}(f) - \mathcal{H}(Q) \ge \mathcal{J}(\phi_f) - \mathcal{J}(\phi_Q). \tag{1.20} \]

Footnote 1. And not only $\phi_f$ near $\phi_Q$, which is an issue for the proof of Theorem 1.4.

Interestingly enough, the reduced functional $\mathcal{J}$ was first introduced on physical grounds as a generalized potential energy in the pioneering works by Lynden-Bell [41], see also Wiechen, Ziegler, Schindler [54]. It now turns out from explicit computation that the critical points of $\mathcal{J}$ are the Poisson fields of steady states, and that the Hessian of $\mathcal{J}$ near the Poisson field of a nonincreasing steady state can be directly connected to

New Variational Approach to the Stability of Gravitational Systems


the Hartree-Fock exchange operator [41], which is coercive from Antonov's stability criterion, see Sect. 4, and hence $\phi_Q$ itself is a local minimizer of $\mathcal{J}$. The important outcome of the structure (1.20) is that, by reducing the problem to a problem on the Poisson field only, we are able to extract compactness in the radial setting from any minimizing sequence whose Hamiltonian converges to $\mathcal{H}(Q)$, without the assumption of equimeasurability, thanks to the smoothing and compactness provided by the radial Poisson equation. This allows us to prove the following compactness result, which is the heart of our analysis. Given $f \in \mathcal{E}_{rad}$, we consider the family of its Schwarz symmetrizations $f^*(\cdot,\ell)$, $\ell > 0$, as defined in Proposition 2.7. We then claim:

Theorem 1.4 (Compactness of local minimizing sequences). There exists $\delta > 0$ such that the following holds. Let $f_n$ be a sequence of functions of $\mathcal{E}_{rad}$, bounded in $L^\infty$, such that
\[ |\nabla\phi_{f_n} - \nabla\phi_Q|_{L^2} < \delta, \tag{1.21} \]
and
\[ \limsup_{n\to+\infty} \mathcal{H}(f_n) \le \mathcal{H}(Q), \qquad f_n^* \to Q^* \ \text{in } L^1(\mathbb{R}_+ \times \mathbb{R}_+) \ \text{as } n \to +\infty, \tag{1.22} \]
then
\[ f_n \to Q \ \text{in } L^1(\mathbb{R}^6), \qquad |v|^2 f_n \to |v|^2 Q \ \text{in } L^1(\mathbb{R}^6). \tag{1.23} \]

Theorem 1.4 is the key to the radial Cazenave-Lions theory of orbital stability [11] and implies that any compactly supported nonincreasing steady state $Q$, as defined by (1.17), is nonlinearly stable under the action of the Vlasov-Poisson flow with respect to spherical perturbations. We thus obtain the main result of this paper:

Theorem 1.5 (Nonlinear stability of Q under the nonlinear flow (1.1)). For all $M$ large enough and for all $\varepsilon > 0$, there exists $\eta > 0$ such that the following holds true. Let $f_0 \in \mathcal{E}_{rad} \cap C_c^1$ with
\[ |f_0 - Q|_{L^1} < \eta, \qquad |f_0|_{L^\infty} < M, \qquad \mathcal{H}(f_0) < \mathcal{H}(Q) + \eta, \tag{1.24} \]
then the corresponding global strong solution $f(t)$ to (1.1) satisfies:
\[ \forall t \ge 0, \qquad |f(t) - Q|_{L^1} + \big|\,|v|^2 (f(t) - Q)\big|_{L^1} < \varepsilon, \qquad |f(t)|_{L^\infty} < M. \tag{1.25} \]

Comments on Theorem 1.5. 1. Linear versus nonlinear stability. A natural strategy to pass from linear to nonlinear stability is to try to linearize the problem and estimate higher order terms as perturbations. This turns out to be quite delicate in general and the control of higher order terms may be challenging, see [19,53] for a treatment of the King model, [32] for the polytropic case. Our analysis avoids this classical difficulty using two facts. We first derive a global monotonicity property which is fundamentally a nonlinear property and does not rely on any linearization procedure, Proposition 3.1, and which reduces the problem to understanding a simpler functional on the Poisson field φ only. For this functional, we do apply a linearization procedure that is a Taylor expansion near φ Q , but we avoid the computation of higher order terms thanks to compactness properties of the Hessian, see (4.45), (4.61).


M. Lemou, F. Florian, P. Raphaël

2. Comparison with previous nonlinear stability results. In view of the nonlinear stability result obtained for ground state type minimizers of (1.7), which are not restricted to the radial class, one may ask whether a generic steady solution of the form (1.17) can in fact be obtained as a ground state for (1.7). This is a nontrivial issue which is connected to the notion of equivalence of ensembles in statistical physics. In a forthcoming work [31], and following pioneering ideas from Lieb and Yau [34], we will exhibit a large class of monotonic functions $F$ for which the equivalence of ensembles actually holds. There are however many well known examples where this equivalence of ensembles fails. Note also that physical investigations around these minimization problems can be found in [12] and the references therein.

3. Comparison with 2D incompressible Euler. The conservation of equimeasurability properties by the nonlinear transport flow has also been used in the literature to prove the stability of steady states for the 2D incompressible Euler flow, see for example Marchioro, Pulvirenti [45] and references therein. For a discussion on variational problems with equimeasurability constraints in fluid dynamics, one can also refer to Serre [51]. Our result generalizes this approach to the Vlasov-Poisson system, which is however more delicate due to the non-trivial structure of both the Hamiltonian and the steady state solutions.

The conjecture of stability of nonincreasing radially symmetric steady states is hence proved for radial perturbations. Note that the result is expected to be optimal for anisotropic galaxies with a non-trivial dependence on $\ell$, as some numerical simulations suggest the possible instability of anisotropic models against general perturbations, see [10]. One important open problem after this work is certainly the general setting of nonradial perturbations for spherical models.

1.6. Strategy of the proof.
Let us give a brief insight into the proof of the variational characterization of $Q$ given by Theorem 1.3 and the lower bound (1.20), which are key features of our analysis. It follows in three main steps.

Step 1. Rearrangement with respect to a given Poisson field. Let $\phi$ be a Poisson field and $f \in \mathcal{E}_{rad}$ a radially symmetric distribution function; we aim at defining the Schwarz symmetrization of $f$ with respect to the microscopic energy $e = \frac{|v|^2}{2} + \phi(x)$ at each given kinetic momentum $\ell$. In other words, given $\ell = |x \times v|^2 > 0$, we are looking for a function $f^{*\phi}(x,v) = G\big(\frac{|v|^2}{2} + \phi(x), \ell\big)$ which is a nonincreasing function of $e$ and which is equimeasurable to $f$ in the sense of (1.13), (1.15), i.e.:
\[ \forall t > 0, \qquad \mu_f(t,\ell) = \mu_{f^{*\phi}}(t,\ell) \quad \text{a.e. } \ell > 0. \]
As a simple change of variables formula similar to (1.10) reveals, the choice of $f^{*\phi}$ is essentially unique and given by:
\[ f^{*\phi}(x,v) = f^*\Big( a_\phi\Big(\frac{|v|^2}{2} + \phi(x),\, |x \times v|^2\Big),\, |x \times v|^2 \Big)\, \mathbf{1}_{\frac{|v|^2}{2} + \phi(x) < 0}, \tag{1.26} \]
where $f^*$ is the standard Schwarz symmetrization of $f$ at given $\ell$ (see Proposition 2.7 for a precise statement) and $a_\phi$ is the Jacobian of the change of variables, explicitly:
\[ a_\phi(e,\ell) = 8\pi^2\sqrt{2} \int_0^{+\infty} \Big( e - \phi(r) - \frac{\ell}{2r^2} \Big)_+^{1/2} dr. \tag{1.27} \]


Note that the steady state $Q$ being by assumption a nonincreasing function of its microscopic energy, it is automatically a fixed point for this transformation, see Corollary 2.9:
\[ Q = Q^{*\phi_Q}. \tag{1.28} \]

Step 2. Monotonicity of the Hamiltonian under the $f^{*\phi_f}$ rearrangement. The key property, which can be found in the physics literature, see in particular Aly [2], is now the monotonicity of the Hamiltonian (1.3) under the generalized rearrangement (1.26), see Proposition 3.1:
\[ \forall f \in \mathcal{E}_{rad}, \qquad \mathcal{H}(f) \ge \mathcal{H}(f^{*\phi_f}). \tag{1.29} \]
Pick then $f$ as in the hypothesis of Theorem 1.3, so that $f^* = Q^*$. A slightly more careful analysis of the monotonicity formula (1.29) then implies a lower bound of the Hamiltonian by a functional which depends on the Poisson field $\phi_f$ only:
\[ \mathcal{H}(f) - \mathcal{H}(Q) \ge \mathcal{J}(\phi_f) - \mathcal{J}(\phi_Q) \tag{1.30} \]
with
\[ \mathcal{J}(\phi_f) = \mathcal{H}(Q^{*\phi_f}) + \frac{1}{2} \int_{\mathbb{R}^3} \big| \nabla\phi_{Q^{*\phi_f}} - \nabla\phi_f \big|^2. \]

This dependence on $\phi$ only, which displays nice compactness properties in the radial setting, is the key to the proof of the convergence of minimizing sequences, Theorem 1.4. Another important feature in the proof of Theorem 1.4 will be not only to use the monotonicity (1.29), but also to observe that some norm is controlled by $\mathcal{H}(f) - \mathcal{H}(f^{*\phi_f})$, see Sect. 3.3.

Step 3. Coercivity of the Hessian of $\mathcal{J}$ at $\phi_Q$. From (1.30), the lower bound (1.19) now follows from the lower bound:
\[ \mathcal{J}(\phi_f) - \mathcal{J}(\phi_Q) \ge C\,|\nabla\phi_f - \nabla\phi_Q|_{L^2}^2 \tag{1.31} \]
in the vicinity of $\phi_Q$. This local coercivity lower bound relies on an explicit computation of the Taylor expansion of $\mathcal{J}$ at $\phi_Q$, Proposition 4.3. The steady state equation (1.28) implies that $\phi_Q$ is a critical point of $\mathcal{J}$, while the Hessian at $\phi_Q$ is intimately related to the Lynden-Bell Hartree-Fock exchange operator [41], whose coercivity was essentially proved 40 years ago by Antonov [3,4], see Proposition 4.1, using in particular the fact that in radial symmetry, the kernel of the linearized transport operator close to $Q$ is explicit. Note that the rigorous derivation of the first two derivatives of $\mathcal{J}$ at $\phi_Q$ requires a detailed study of the regularity properties of the Jacobian $a_\phi$ given by (1.27), which a priori displays a $\sqrt{\cdot}$ regularity only.

This paper is organized as follows. In Sect. 2, we introduce the Schwarz symmetrization with respect to the microscopic energy $\frac{|v|^2}{2} + \phi(x)$, at fixed kinetic momentum $|x \times v|^2$, and prove some natural continuity properties of the corresponding object $f^{*\phi}$, Proposition 2.8, and of the Jacobian function $a_\phi$, Lemmas 2.3, 2.4, 2.5. In Sect. 3, we prove the key monotonicity Proposition 3.1, which reduces the analysis to coercivity properties of the functional $\mathcal{J}(\phi)$ near $\phi_Q$, Proposition 3.2. We then conclude the proofs of Theorems 1.3, 1.4, 1.5 assuming Proposition 3.2, which is eventually proved in Sect. 4.


2. Symmetric Rearrangement with Respect to a Given Microscopic Energy

Our aim in this section is to introduce the symmetric rearrangement $f^{*\phi}$ of a distribution function $f \in \mathcal{E}_{rad}$ with respect to a given microscopic energy $\frac{|v|^2}{2} + \phi(x)$. This notion generalizes the standard Schwarz symmetrization and is the well fitted object for the study of the minimization problem (1.19). This symmetrization involves the use of the Jacobian function $a_\phi$ given by (1.27). We start with proving some continuity and differentiability properties of this function which will be used all along the paper, Lemmas 2.3, 2.4, 2.5, and then define $f^{*\phi}$ and give its first properties, Proposition 2.8.

2.1. Definition and differentiability in e of the Jacobian function $a_\phi$. Our aim in this subsection is to study the Jacobian $a_\phi(e,\ell)$ given by (1.27), which appears in the definition of the generalized Schwarz symmetrization (2.63). The class of Poisson potentials $\phi$ which is well fitted for the analysis is
\[ \Phi_{rad} = \big\{ \phi : \mathbb{R}^3 \to \mathbb{R} \ \text{such that there exists } f \in \mathcal{E}_{rad} \text{ with } \phi = \phi_f \big\}. \tag{2.1} \]
Let us start with some properties of the so called effective potential appearing in the definition (1.27), which are elementary but crucial to obtain uniform bounds on $a_\phi$ and its various derivatives.

Lemma 2.1 (Structure of the effective potential for $\phi \in \Phi_{rad}$). Let $\phi = \phi_f \in \Phi_{rad}$ be non zero. For $\ell > 0$, consider the effective potential
\[ \psi_{\phi,\ell}(r) = \phi(r) + \frac{\ell}{2r^2}, \qquad r > 0. \tag{2.2} \]
(i) Structure of $\psi_{\phi,\ell}$: $\psi_{\phi,\ell} \in C^1(\mathbb{R}^3 \setminus \{0\})$ and
\[ e_{\phi,\ell} = \inf_{r \ge 0} \psi_{\phi,\ell}(r) \tag{2.3} \]
is attained at a unique $r_0(\phi,\ell)$. $\psi_{\phi,\ell}$ is strictly decreasing on $(0, r_0(\phi,\ell))$ and strictly increasing on $(r_0(\phi,\ell), +\infty)$ with
\[ \lim_{r \to 0} \psi_{\phi,\ell}(r) = +\infty, \qquad \lim_{r \to +\infty} \psi_{\phi,\ell}(r) = 0. \tag{2.4} \]
Moreover, the function $\ell \mapsto e_{\phi,\ell}$ is continuous on $\mathbb{R}_+^*$, with the uniform bound:
\[ \forall \ell > 0, \qquad \max\Big( \phi(0), -\frac{|f|_{L^1}^2}{2\ell} \Big) \le e_{\phi,\ell} < 0. \tag{2.5} \]
(ii) Level sets of $\psi_{\phi,\ell}$: for $e_{\phi,\ell} < e < 0$, let
\[ r_1(\phi,e,\ell) = \inf\{ r \ge 0 \ \text{s.t. } e - \psi_{\phi,\ell}(r) > 0 \}, \tag{2.6} \]
\[ r_2(\phi,e,\ell) = \sup\{ r \ge 0 \ \text{s.t. } e - \psi_{\phi,\ell}(r) > 0 \}. \tag{2.7} \]
Then $r_1(\phi,e,\ell)$, $r_2(\phi,e,\ell)$ are $C^1$ functions of $e$ with uniform bounds: $\forall\, e_{\phi,\ell} < e < 0$,
\[ 0 < \frac{\ell}{2|f|_{L^1}} \le r_1(\phi,e,\ell) < r_2(\phi,e,\ell) \le \frac{-|f|_{L^1}}{e}. \tag{2.8} \]


Fig. 1. Profile of the effective potential $\psi_{\phi,\ell}(r)$

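(iii) Concavity lower bound: there holds the uniform concavity lower bound
\[ \forall e \in (e_{\phi,\ell}, 0),\ \forall r \in [r_1(\phi,e,\ell), r_2(\phi,e,\ell)], \qquad e - \psi_{\phi,\ell}(r) \ge \frac{\ell}{2 r^2 r_1 r_2}\,\big(r - r_1(\phi,e,\ell)\big)\big(r_2(\phi,e,\ell) - r\big). \tag{2.9} \]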
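The structure described in Lemma 2.1 can be explored numerically. The following is a minimal sketch and not part of the original argument: it assumes a hypothetical Plummer-type potential $\phi(r) = -M/\sqrt{r^2 + b^2}$ (so that $r\phi(r) \to -M$, with $M$ playing the role of $|f|_{L^1}$), illustrative values of $M$, $b$, $\ell$, and helper names chosen here for illustration only. It locates $r_0$, $e_{\phi,\ell}$, $r_1$, $r_2$ by bisection and evaluates the gap between the two sides of (2.9) on a grid.

```python
import math

# Hypothetical test potential (Plummer-type), NOT taken from the paper:
# phi(r) = -M / sqrt(r^2 + b^2), so that r * phi(r) -> -M as r -> infinity.
M, B, ELL = 1.0, 1.0, 0.5  # illustrative values for M, b and ell


def phi(r):
    return -M / math.sqrt(r * r + B * B)


def psi(r):
    """Effective potential psi(r) = phi(r) + ell / (2 r^2), as in (2.2)."""
    return phi(r) + ELL / (2.0 * r * r)


def r2_dpsi(r):
    """r^2 * psi'(r); it is increasing in r, hence has a single zero r_0."""
    return M * r ** 3 / (r * r + B * B) ** 1.5 - ELL / r


def bisect(fn, lo, hi, iters=200):
    """Plain bisection; assumes fn(lo) and fn(hi) have opposite signs."""
    f_lo = fn(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r0 = bisect(r2_dpsi, 1e-6, 1e6)  # unique minimizer of psi
e_min = psi(r0)                  # e_{phi,ell} of (2.3)


def level_set(e):
    """Turning points r1 < r0 < r2 solving psi(r) = e, for e_min < e < 0."""
    lo = 0.5 * ELL / (2.0 * M)   # strictly below the lower bound in (2.8)
    hi = 2.0 * M / abs(e)        # strictly above the upper bound in (2.8)
    r1 = bisect(lambda r: psi(r) - e, lo, r0)
    r2 = bisect(lambda r: psi(r) - e, r0, hi)
    return r1, r2


def concavity_gap(e, n=200):
    """Smallest value, on a grid of [r1, r2], of lhs - rhs in (2.9)."""
    r1, r2 = level_set(e)
    gaps = []
    for k in range(n):
        r = r1 + (r2 - r1) * (k + 0.5) / n
        lhs = e - psi(r)
        rhs = ELL * (r - r1) * (r2 - r) / (2.0 * r * r * r1 * r2)
        gaps.append(lhs - rhs)
    return min(gaps)
```

For instance, calling `level_set(0.5 * e_min)` returns the two turning points at half the minimal energy, and `concavity_gap(0.5 * e_min)` should be nonnegative up to rounding if the bound (2.9) holds for this test potential.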

In Fig. 1, we summarize the properties of $\psi_{\phi,\ell}$ described above.

Remark 2.2. In the sequel, and when there is no ambiguity, we will omit the $(\phi,e,\ell)$ dependence and write $r_0$, $r_1$, $r_2$.

Proof. The proof is elementary but relies in a crucial way on the positivity of $\phi_f'$. Let us recall the standard interpolation estimate for $f \in \mathcal{E}$:
\[ |\nabla\phi_f|_{L^2}^2 \le C\, \big|\,|v|^2 f\big|_{L^1}^{1/2}\, |f|_{L^1}^{7/6}\, |f|_{L^\infty}^{1/3}. \tag{2.10} \]
Let $\phi = \phi_f \in \Phi_{rad}$, then by interpolation and Sobolev embedding, $\rho_f \in L^{5/3}(\mathbb{R}^3)$ and thus $\phi_f \in W^{2,5/3}_{loc}(\mathbb{R}^3) \subset C^0(\mathbb{R}^3)$ and $\phi_f \in C^1(\mathbb{R}^3 \setminus \{0\})$ by elliptic regularity and the radial assumption, from which $\psi_{\phi,\ell} \in C^1(\mathbb{R}^3 \setminus \{0\})$. We now integrate the radial Poisson equation and get:
\[ r^2 \phi_f'(r) = 4\pi \int_0^r s^2 \rho_f(s)\,ds \ge 0, \qquad \lim_{r \to +\infty} r\phi_f(r) = -|f|_{L^1}. \tag{2.11} \]
Note that the second identity is obtained by integrating the first one as follows:
\[ r\phi_f(r) = -4\pi \int_0^r s^2 \rho_f(s)\,ds - 4\pi r \int_r^{+\infty} s \rho_f(s)\,ds. \tag{2.12} \]
We deduce that $\phi = \phi_f$ is continuous, nondecreasing and nonpositive on $[0,+\infty[$ with
\[ \phi(r) \ge \max\Big( \phi(0), -\frac{|f|_{L^1}}{r} \Big), \qquad \forall r \ge 0, \tag{2.13} \]
and there exists $\tilde r > 0$ such that
\[ \phi(r) \le -\frac{|f|_{L^1}}{2r}, \qquad \forall r \ge \tilde r. \tag{2.14} \]


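Thus (2.13), (2.14) imply (2.4). From (2.14), $e_{\phi,\ell}$ given by (2.3) satisfies
\[ e_{\phi,\ell} \le \inf_{r \ge \tilde r} \Big( -\frac{|f|_{L^1}}{2r} + \frac{\ell}{2r^2} \Big) < 0, \]
since by assumption $f \ne 0$, and hence $e_{\phi,\ell}$ is attained at some $r_0 = r_0(\phi,\ell)$. Thus from (2.13):
\[ \forall \ell > 0, \qquad e_{\phi,\ell} = \phi(r_0) + \frac{\ell}{2r_0^2} \ge \max\Big( \phi(0), -\frac{|f|_{L^1}}{r_0} \Big) + \frac{\ell}{2r_0^2} \ge \max\Big( \phi(0), -\frac{|f|_{L^1}^2}{2\ell} \Big), \]
and (2.5) is proved. Observe now from (2.11) again that:
\[ \psi_{\phi,\ell}'(r) = \phi'(r) - \frac{\ell}{r^3}, \qquad \text{and} \qquad \big( r^2 \psi_{\phi,\ell}'(r) \big)' = r^2 \rho_f + \frac{\ell}{r^2} > 0, \tag{2.15} \]
and hence from $\psi_{\phi,\ell}'(r_0) = 0$:
\[ \forall r > 0, \qquad r^2 \psi_{\phi,\ell}'(r) = \int_{r_0}^{r} \Big( \tau^2 \rho_f(\tau) + \frac{\ell}{\tau^2} \Big) d\tau, \tag{2.16} \]
which yields the uniqueness of the minimum $r_0 > 0$ and the claimed monotonicity properties of $\psi_{\phi,\ell}$. Together with (2.4), we conclude from (2.16) that $r_1$, $r_2$ given by (2.6), (2.7) are well defined for $e_{\phi,\ell} < e < 0$, and are $C^1$ functions of $e$ from the implicit function theorem. To prove the uniform bound (2.8), we observe from (2.13):
\[ \Big\{ r \ge 0 \ \text{s.t. } e - \phi(r) - \frac{\ell}{2r^2} > 0 \Big\} \subset \Big\{ r \ge 0 \ \text{s.t. } e + \frac{|f|_{L^1}}{r} - \frac{\ell}{2r^2} > 0 \Big\}, \]
and hence, using from (2.5) that $|f|_{L^1}^2 + 2e\ell > 0$ for $e > e_{\phi,\ell}$:
\[ \Big\{ r \ge 0 \ \text{s.t. } e + \frac{|f|_{L^1}}{r} - \frac{\ell}{2r^2} > 0 \Big\} \subset \Bigg[ \frac{\ell}{|f|_{L^1} + \sqrt{|f|_{L^1}^2 + 2e\ell}},\ \frac{\ell}{|f|_{L^1} - \sqrt{|f|_{L^1}^2 + 2e\ell}} \Bigg]. \]
We then use the definitions (2.6) and (2.7) to get
\[ 0 < \frac{\ell}{|f|_{L^1} + \sqrt{|f|_{L^1}^2 + 2e\ell}} \le r_1(\phi,e,\ell) < r_2(\phi,e,\ell) \le \frac{\ell}{|f|_{L^1} - \sqrt{|f|_{L^1}^2 + 2e\ell}}, \]
which implies (2.8). Let us now prove the continuity of the function $\ell \mapsto e_{\phi,\ell}$ on $\mathbb{R}_+^*$. Let $0 < \ell_1 < \ell_2$ be fixed. From the definitions (2.2) and (2.3), for all $\ell \in [\ell_1, \ell_2]$ we have $e_{\phi,\ell} \le e_{\phi,\ell_2}$; thus, applying (2.8) with $e = \frac{1}{2} e_{\phi,\ell_2}$ gives
\[ \alpha_1 \le r_0(\phi,\ell) \le \alpha_2, \qquad \text{with} \quad \alpha_1 = \frac{\ell_1}{2|f|_{L^1}} > 0, \quad \alpha_2 = \frac{2|f|_{L^1}}{|e_{\phi,\ell_2}|} > 0. \]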

New Variational Approach to the Stability of Gravitational Systems

173

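Hence, $(r,\ell) \mapsto \psi_{\phi,\ell}(r)$ being continuous, the function
\[ \ell \in [\ell_1, \ell_2] \mapsto e_{\phi,\ell} = \min_{r \in [\alpha_1, \alpha_2]} \psi_{\phi,\ell}(r) \]
is continuous. It remains to prove the concavity bound (2.9). Let
\[ w(r) = e - \psi_{\phi,\ell}(r) - \frac{\ell}{2 r^2 r_1 r_2} (r - r_1)(r_2 - r), \]
then
\[ (r w(r))'' = \frac{1}{r} \Big( -\big( r^2 \psi_{\phi,\ell}'(r) \big)' + \frac{\ell}{r^2} \Big) = -r \rho_f(r) \le 0, \]
where we used (2.15). Hence the function $r \mapsto r w(r)$ is concave. Since it vanishes at $r_1$ and $r_2$, we conclude that $w(r) \ge 0$ for all $r \in [r_1, r_2]$ and (2.9) is proved. This concludes the proof of Lemma 2.1.

Let us now define the Jacobian function $a_\phi(e,\ell)$ and examine its differentiability properties in $e$:

Lemma 2.3 (Definition and differentiability properties in e of the Jacobian $a_\phi$). For $\phi = \phi_f \in \Phi_{rad}$ non zero and $\ell > 0$, we define:
\[ a_\phi(e,\ell) = \begin{cases} \nu_\ell\Big\{ (r,u) \in (\mathbb{R}_+)^2 : \dfrac{u^2}{2} + \phi(r) < e \Big\} & \text{for } e < 0 \text{ and } \ell > 0, \\[1ex] +\infty & \text{for } e \ge 0 \text{ and } \ell > 0, \end{cases} \tag{2.17} \]
where $\nu_\ell$ is the measure given by (1.12), equivalently:
\[ \forall \ell > 0,\ \forall e < 0, \qquad a_\phi(e,\ell) = 8\pi^2\sqrt{2} \int_{r_1}^{r_2} \big( e - \psi_{\phi,\ell}(r) \big)^{1/2} dr. \tag{2.18} \]
Then:

(i) Behavior of $a_\phi$: $a_\phi(e,\ell) = 0$ for $e < e_{\phi,\ell}$ and
\[ \forall \ell > 0, \qquad a_\phi(e_{\phi,\ell},\ell) = 0, \qquad \lim_{e \to 0^-} a_\phi(e,\ell) = +\infty. \tag{2.19} \]
(ii) Uniform bounds on $a_\phi$: let
\[ 0 < m_\phi := \inf_{r \ge 0} (r+1)|\phi(r)| < +\infty, \tag{2.20} \]
then the bounds
\[ \forall e < 0, \qquad a_\phi(e,\ell) \le 16\pi^2 |e|^{-1/2} |f|_{L^1}, \tag{2.21} \]
and
\[ \forall e \in \Big[ -\frac{m_\phi^2}{4(2m_\phi + \ell)},\, 0 \Big[, \qquad a_\phi(e,\ell) \ge \frac{4\pi^2}{3} |e|^{-1/2} m_\phi \tag{2.22} \]
hold.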


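(iii) Differentiability in $e$: the map $e \mapsto a_\phi(e,\ell)$ is a $C^1$-diffeomorphism from $(e_{\phi,\ell}, 0)$ to $(0,+\infty)$ with:
\[ \forall e \in\, ]e_{\phi,\ell}, 0[, \qquad \frac{\partial a_\phi}{\partial e}(e,\ell) = 4\pi^2\sqrt{2} \int_{r_1}^{r_2} \big( e - \psi_{\phi,\ell}(r) \big)^{-1/2} dr > 0. \tag{2.23} \]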
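Formulas (2.18) and (2.23) can be sanity-checked on a toy case. The sketch below is an illustration only and is not taken from the paper: it assumes the point-mass potential $\phi(r) = -M/r$, which is the extremal case of (2.13) and does not belong to $\Phi_{rad}$, because for this potential the integrals are classical (Kepler radial action) and give $a_\phi(e,\ell) = 8\pi^3\big(M/\sqrt{2|e|} - \sqrt{\ell}\big)$. The values of $M$, $\ell$ and all function names are hypothetical. The quadrature uses the substitution $r = c - d\cos\theta$ to remove the square-root endpoint behaviour.

```python
import math

M, ELL = 1.0, 0.5  # hypothetical mass and value of ell = |x x v|^2


def psi(r):
    """Effective potential for the toy point-mass potential phi(r) = -M/r."""
    return -M / r + ELL / (2.0 * r * r)


def turning_points(e):
    """Roots r1 < r2 of psi(r) = e, valid for -M^2/(2 ELL) < e < 0."""
    s = math.sqrt(M * M + 2.0 * e * ELL)
    return (M - s) / (2.0 * abs(e)), (M + s) / (2.0 * abs(e))


def radial_integral(e, power, n=4000):
    """Midpoint rule for int_{r1}^{r2} (e - psi(r))^power dr.

    With r = c - d*cos(theta), dr = d*sin(theta) d(theta), so the
    square-root behaviour at r1 and r2 is absorbed by sin(theta).
    """
    r1, r2 = turning_points(e)
    c, d = 0.5 * (r1 + r2), 0.5 * (r2 - r1)
    total = 0.0
    for k in range(n):
        theta = math.pi * (k + 0.5) / n
        r = c - d * math.cos(theta)
        total += (e - psi(r)) ** power * d * math.sin(theta)
    return total * math.pi / n


def jacobian(e):
    """Quadrature version of (2.18)."""
    return 8.0 * math.pi ** 2 * math.sqrt(2.0) * radial_integral(e, 0.5)


def jacobian_de(e):
    """Quadrature version of the derivative formula (2.23)."""
    return 4.0 * math.pi ** 2 * math.sqrt(2.0) * radial_integral(e, -0.5)


def jacobian_exact(e):
    """Closed form for the point-mass toy potential (Kepler radial action)."""
    return 8.0 * math.pi ** 3 * (M / math.sqrt(2.0 * abs(e)) - math.sqrt(ELL))
```

In this toy case `jacobian(e)` can be compared with `jacobian_exact(e)`, and `jacobian_de(e)` with the derivative of the closed form, $8\pi^3 M (2|e|)^{-3/2}$. The closed form also vanishes at $e = -M^2/(2\ell)$ and blows up as $e \to 0^-$, in line with (2.19).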

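Abusing notations, we shall denote in the sequel $a_\phi^{-1}(\cdot,\ell) : (0,+\infty) \to (e_{\phi,\ell}, 0)$ its inverse function.

Proof. Step 1. Bounds on $a_\phi$. First compute from the definitions (2.17) and (1.12): $\forall e < 0$, $\forall \ell > 0$,
\[ \begin{aligned} a_\phi(e,\ell) &= 8\pi^2 \int_{r>0} \int_{u>0} \mathbf{1}_{\frac{u^2}{2} + \phi(r) < e}\, \mathbf{1}_{r^2u^2 > \ell}\, (r^2u^2 - \ell)^{-1/2}\, r u \, dr\, du \\ &= 8\pi^2 \int_{r>0} \mathbf{1}_{e - \phi(r) - \frac{\ell}{2r^2} > 0} \Bigg( \int_{u = \frac{\sqrt{\ell}}{r}}^{\sqrt{2(e - \phi(r))}} (r^2u^2 - \ell)^{-1/2}\, u\, du \Bigg) r\, dr \\ &= 8\pi^2\sqrt{2} \int_{r_1}^{r_2} \Big( e - \phi(r) - \frac{\ell}{2r^2} \Big)_+^{1/2} dr, \end{aligned} \]
this is (2.18) or, equivalently, (1.27). Then $a_\phi(e,\ell) = 0$ for $e \le e_{\phi,\ell}$ and $a_\phi(e,\ell) > 0$ on $(e_{\phi,\ell}, 0)$ from Lemma 2.1. We now estimate $a_\phi$ from above for $e < 0$ using (2.13) and (2.8) as follows:
\[ \begin{aligned} a_\phi(e,\ell) &= 8\pi^2\sqrt{2} \int_{r_1(\phi,e,\ell)}^{r_2(\phi,e,\ell)} \Big( e - \phi(r) - \frac{\ell}{2r^2} \Big)^{1/2} dr \\ &\le 8\pi^2\sqrt{2} \int_{r_1(\phi,e,\ell)}^{r_2(\phi,e,\ell)} \Big( \frac{|f|_{L^1}}{r} \Big)^{1/2} dr \\ &\le 16\pi^2\sqrt{2}\, |f|_{L^1}^{1/2} \Big( r_2(\phi,e,\ell)^{1/2} - r_1(\phi,e,\ell)^{1/2} \Big) \\ &\le 16\pi^2 |f|_{L^1} |e|^{-1/2}, \end{aligned} \]
and (2.21) is proved. To estimate $a_\phi(e,\ell)$ from below, first observe that (2.20) follows from (2.11). We then write:
\[ a_\phi(e,\ell) \ge 8\pi^2\sqrt{2} \int_0^{+\infty} \Big( e + \frac{m_\phi}{r+1} - \frac{\ell}{2r^2} \Big)_+^{1/2} dr \ge 8\pi^2\sqrt{2} \int_{1 + \ell/m_\phi}^{+\infty} \Big( e + \frac{m_\phi}{r+1} - \frac{\ell}{2r^2} \Big)_+^{1/2} dr, \]
and observe that for $r \ge 1 + \ell/m_\phi$, we have
\[ \frac{m_\phi}{r+1} - \frac{\ell}{2r^2} \ge \frac{m_\phi}{r+1} - \frac{\ell}{2(r^2 - 1)} \ge \frac{m_\phi}{r+1} \Big( 1 - \frac{\ell}{2m_\phi(r-1)} \Big) \ge \frac{m_\phi}{2(r+1)}. \]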

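Thus:
\[ \begin{aligned} a_\phi(e,\ell) &\ge 8\pi^2\sqrt{2} \int_{1 + \ell/m_\phi}^{+\infty} \Big( e + \frac{m_\phi}{2(r+1)} \Big)_+^{1/2} dr \\ &\ge 8\pi^2\sqrt{2}\, \Big( \frac{|e|}{m_\phi} \Big)^{1/2} \int_{1 + \ell/m_\phi}^{+\infty} \big( m_\phi - 2|e|(r+1) \big)_+^{1/2} dr \\ &\ge \frac{8\pi^2\sqrt{2}}{3}\, |e|^{-1/2}\, m_\phi \Big( 1 - 2|e| \frac{2m_\phi + \ell}{m_\phi^2} \Big)_+^{3/2}. \end{aligned} \]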

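This means that for $|e| \le \frac{m_\phi^2}{4(2m_\phi + \ell)}$, $a_\phi(e,\ell) \ge \frac{4\pi^2}{3} |e|^{-1/2} m_\phi$, and (2.19) and (2.22) are proved. The continuity and the monotonicity of the map $e \mapsto a_\phi(e,\ell)$ are a consequence of (2.8) and of the dominated convergence theorem, since
\[ \Big( e - \phi(r) - \frac{\ell}{2r^2} \Big)_+^{1/2} \le (-\phi(0))^{1/2}, \qquad \text{for all } e \in\, ]e_{\phi,\ell}, 0[. \]
Step 2. Differentiability of $a_\phi$. We are now in position to prove the differentiability of the function $e \mapsto a_\phi(e,\ell)$, which follows from the version of Lebesgue's derivation theorem given by Lemma A.1. Let us fix $\ell > 0$ and write
\[ a_\phi(e,\ell) = \int_0^{+\infty} g(e,r)\, dr \]
with
\[ g(e,r) = 8\pi^2\sqrt{2}\, \big( e - \psi_{\phi,\ell}(r) \big)_+^{1/2} = 8\pi^2\sqrt{2}\, \big( e - \psi_{\phi,\ell}(r) \big)^{1/2}\, \mathbf{1}_{r_1(\phi,e,\ell) < r < r_2(\phi,e,\ell)} \]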
and with ψφ, given by (2.2). Let e0 ∈]eφ, , 0[ and I =]e0 − ε, e0 + ε[, with ε small enough such that I ⊂]eφ, , 0[. Let us check the assumptions of Lemma A.1. By (2.13), we have | f | L 1 1/2 0 ≤ g(e, r ) ≤ C 1r1 (φ,e,)

Now from the concavity estimate (2.9): √ r r1 r2 C ∂g (e, r ) ≤ √ √ 1r1
(2.25)

where we applied (2.8) and recall that $r_1 = r_1(\phi,e,\ell)$, $r_2 = r_2(\phi,e,\ell)$. We observe that
\[ \int_{r_1}^{r_2} q_e(r)\, dr = \frac{C}{\sqrt{\ell}}\, \frac{|f|_{L^1}^2}{|e_0 + \varepsilon|^2}\, \pi, \tag{2.26} \]
which implies in particular that $q_e \in L^1(\mathbb{R}_+)$ and $\frac{\partial g}{\partial e}(e,r) \in L^1(I \times \mathbb{R}_+)$. Together with (2.24), this implies that Assumption (i) of Lemma A.1 is satisfied. Let us now check Assumption (ii). From the continuity of $r_1(\phi,e,\ell)$ and $r_2(\phi,e,\ell)$ with respect to $e$, we deduce that
\[ q_e(r) \to q_{e_0}(r) \quad \text{as } e \to e_0, \quad \text{for all } r \ne r_1(\phi,e_0,\ell),\ r \ne r_2(\phi,e_0,\ell). \]
This a.e. convergence, coupled to the fact that, by (2.26), the integral of the positive functions $q_e$ is independent of $e$, is enough to conclude, thanks to the Brézis-Lieb Theorem (see Theorem 1.9 of [33]), that $q_e$ converges to $q_{e_0}$ in $L^1(\mathbb{R}_+)$ as $e \to e_0$. Assumption (ii) is then satisfied. Hence Lemma A.1 can be applied and $a_\phi(e,\ell)$ is $C^1$ with respect to $e$ with its derivative given by (2.23). From Lemma 2.1, $\frac{\partial a_\phi}{\partial e}(e,\ell) > 0$ on $]e_{\phi,\ell}, 0[$, so by (2.19) $e \mapsto a_\phi(e,\ell)$ is a $C^1$ diffeomorphism from $]e_{\phi,\ell}, 0[$ to $]0,+\infty[$. This concludes the proof of Lemma 2.3.

2.2. Regularity properties in $\phi$ of $a_\phi$ and $a_\phi^{-1}$. We continue the analysis of the Jacobian $a_\phi$ and claim further continuity and differentiability properties with respect to $\phi$.

Lemma 2.4 (Continuity properties of $a_\phi(e,\ell)$ with respect to $\phi$). Let $f \in \mathcal{E}_{rad}$ be nonzero, let $f_n$ be a bounded sequence in $\mathcal{E}_{rad}$ and denote $\phi_n = \phi_{f_n}$, $\phi = \phi_f$. Assume that $\nabla\phi_{f_n} \to \nabla\phi_f$ in $L^2(\mathbb{R}^3)$ as $n \to +\infty$. Then, for all $\ell > 0$ fixed, the following convergence properties hold as $n \to +\infty$:
\[ e_{\phi_n,\ell} \to e_{\phi,\ell}, \tag{2.27} \]
\[ r_1(\phi_n,e,\ell) \to r_1(\phi,e,\ell), \quad r_2(\phi_n,e,\ell) \to r_2(\phi,e,\ell), \quad \forall e \in\, ]e_{\phi,\ell}, 0[, \tag{2.28} \]
\[ \inf_n \inf_{r > 0}\, (1+r)|\phi_n(r)| > 0, \tag{2.29} \]
\[ a_{\phi_n}(\cdot,\ell) \to a_\phi(\cdot,\ell) \quad \text{uniformly on any compact subset of } ]-\infty, 0[, \tag{2.30} \]
\[ a_{\phi_n}^{-1}(s,\ell) \to a_\phi^{-1}(s,\ell) \quad \text{for all } s > 0. \tag{2.31} \]

Proof. Step 1. Convergence of the potentials. As a standard consequence of interpolation and $\nabla\phi_n \to \nabla\phi$ in $L^2$, we have that
\[ \rho_{f_n} \rightharpoonup \rho_f \quad \text{weakly in } L^{5/3}(\mathbb{R}^3). \tag{2.32} \]


Moreover, by Sobolev embeddings and elliptic regularity, together with the spherical symmetry of $f_n$, we have:
\[ \phi_n \to \phi \quad \text{in } L^\infty(\mathbb{R}_+) \text{ as } n \to +\infty \tag{2.33} \]
and
\[ \phi_n' \to \phi' \quad \text{in } L^\infty([a,b]) \text{ as } n \to +\infty, \text{ for all } 0 < a < b. \tag{2.34} \]
Step 2. Proof of (2.27). To prove that $e_{\phi_n,\ell} \to e_{\phi,\ell}$, we first pass to the limit $n \to \infty$ in (2.3), and get
\[ \limsup_{n \to +\infty} e_{\phi_n,\ell} \le e_{\phi,\ell}. \tag{2.35} \]
Now, we know that the infimum $e_{\phi_n,\ell}$ is attained at $r_n = r_0(\phi_n,\ell)$ and from (2.13) we have
\[ e_{\phi_n,\ell} = \phi_n(r_n) + \frac{\ell}{2r_n^2} \ge -\frac{|f_n|_{L^1}}{r_n} + \frac{\ell}{2r_n^2}. \tag{2.36} \]
We observe that
\[ \inf_n |f_n|_{L^1} > 0. \tag{2.37} \]
Otherwise we would have, up to a subsequence, $f_n \to 0$ in $L^1(\mathbb{R}^6)$, and then $\phi_f = 0$. This means $\rho_f = \Delta\phi_f = 0$, and then $f = 0$, which contradicts the assumption $f \ne 0$. Therefore, (2.36) and (2.37) ensure that the sequence $r_n$ is bounded and bounded away from 0, since (2.35) implies $\limsup e_{\phi_n,\ell} < 0$. Then, any subsequence of $r_n$ satisfies $r_n \to r_0 > 0$ (up to extraction) and one can pass to the limit in $e_{\phi_n,\ell} = \phi_n(r_n) + \frac{\ell}{2r_n^2}$ to get
\[ e_{\phi_n,\ell} \to \phi(r_0) + \frac{\ell}{2r_0^2} \ge e_{\phi,\ell}, \]
thus
\[ \liminf_{n \to +\infty} e_{\phi_n,\ell} \ge e_{\phi,\ell}. \tag{2.38} \]
Finally (2.35) and (2.38) imply (2.27).

Step 3. Proof of (2.28). Let $e \in\, ]e_{\phi,\ell}, 0[$. From Step 1, we have $e \in\, ]e_{\phi_n,\ell}, 0[$ for $n$ large enough, thus $r_1(\phi_n,e,\ell)$ is well-defined. By Lemma 2.1, $r_1 = r_1(\phi_n,e,\ell)$ is characterized by
\[ e = \phi_n(r_1) + \frac{\ell}{2r_1^2} \qquad \text{and} \qquad \phi_n'(r_1) - \frac{\ell}{r_1^3} < 0. \tag{2.39} \]
Moreover, by (2.8), $r_1(\phi_n,e,\ell)$ lies in a compact interval of $\mathbb{R}_+^*$. Therefore, after extraction of a subsequence, we have $r_1(\phi_n,e,\ell) \to r_* > 0$. Thanks to (2.33) and (2.34), one can pass to the limit in (2.39) and obtain
\[ e = \phi(r_*) + \frac{\ell}{2r_*^2} \qquad \text{and} \qquad \phi'(r_*) - \frac{\ell}{r_*^3} \le 0. \]


This is enough to conclude that $r_* = r_1(\phi,e,\ell)$. We have thus proved (2.28) for $r_1$. The proof of $r_2(\phi_n,e,\ell) \to r_2(\phi,e,\ell)$ is similar.

Step 4. Proof of (2.29). Let us prove (2.29):
\[ \inf_n m_{\phi_n} =: m > 0, \]
where $m_{\phi_n}$ is defined by (2.20). Assume that $m = 0$. Then there exists a sequence $r_n$ such that $(1 + r_n)\phi_n(r_n) \to 0$ as $n \to +\infty$. If $r_n$ is bounded, then, up to a subsequence, it goes to some $r_0 \ge 0$ and from (2.33), we get $\phi(r_0) = 0$, which is not possible from (2.11) and $f \ne 0$. Hence $r_n$ is not bounded and, up to a subsequence, it goes to $+\infty$. Let $a > 0$. We get from (2.12),
\[ (1 + r_n)|\phi_n(r_n)| \ge r_n |\phi_n(r_n)| \ge 4\pi \int_0^{r_n} s^2 \rho_{f_n}(s)\, ds \ge 4\pi \int_0^{a} s^2 \rho_{f_n}(s)\, ds \]
for $n$ large enough. Passing to the limit in this inequality and using (2.32), we get $\rho_f = 0$, which again contradicts the assumption $f \ne 0$. We have thus proved (2.29).

Step 5. Proof of (2.30). Now, we prove the uniform convergence of $a_{\phi_n}$. We observe from (2.8) that the interval of integration in the expression (2.18) of $a_{\phi_n}$ is bounded. Thus, the dominated convergence theorem applies since
\[ \Big( e - \phi_n(r) - \frac{\ell}{2r^2} \Big)_+^{1/2} \le (-\phi_n(0))^{1/2} \le C, \]
from (2.33). This yields $a_{\phi_n}(e,\ell) \to a_\phi(e,\ell)$, for all $e < 0$, $\ell > 0$. Now using the monotonicity of the function $e \mapsto a_{\phi_n}(e,\ell)$ at fixed $\ell$ and applying the second Dini theorem, we get the desired uniform convergence.

Step 6. Proof of (2.31). Let $(s,\ell) \in (\mathbb{R}_+^*)^2$. Denote
\[ e_n = a_{\phi_n}^{-1}(s,\ell), \qquad e_0 = a_\phi^{-1}(s,\ell). \]
We will prove that $e_n \to e_0$. From (2.21), we get
\[ |e_n| \le C \Big( \frac{|f_n|_{L^1}}{a_{\phi_n}(e_n,\ell)} \Big)^2 = C \Big( \frac{|f_n|_{L^1}}{s} \Big)^2. \tag{2.40} \]
Now we claim that
\[ |e_n| \ge C \Big( \frac{m}{s} \Big)^2 > 0 \qquad \text{if } |e_n| \le \frac{m^2}{4(\ell + 2m)}. \tag{2.41} \]
Indeed, we first get from (2.22),
\[ |e_n| \ge C \Big( \frac{m_{\phi_n}}{a_{\phi_n}(e_n,\ell)} \Big)^2 = C \Big( \frac{m_{\phi_n}}{s} \Big)^2 > 0, \tag{2.42} \]
provided that $|e_n| \le \frac{m_{\phi_n}^2}{4(\ell + 2m_{\phi_n})}$, with $m_{\phi_n}$ defined by (2.20). From (2.29), we have $m_{\phi_n} \ge m > 0$. Therefore, (2.42) implies (2.41) since the function $t \mapsto \frac{t^2}{\ell + 2t}$ is increasing.


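We then deduce from (2.40) and (2.41) that the sequence $e_n$ belongs to a compact interval of $\mathbb{R}_-^*$; thus, up to a subsequence, we have $e_n \to e_\infty \in \mathbb{R}_-^*$ as $n \to +\infty$. Using (2.30), we have $s = a_{\phi_n}(e_n,\ell) \to a_\phi(e_\infty,\ell)$ as $n \to +\infty$. Hence, $a_\phi(e_\infty,\ell) = a_\phi(e_0,\ell) = s \in (0,\infty)$. Since $e \mapsto a_\phi(e,\ell)$ is invertible from $(e_{\phi,\ell}, 0)$ onto $(0,\infty)$, we deduce that $e_0 = a_\phi^{-1}(s,\ell) = e_\infty$, which means that $e_n \to e_0$ as $n \to +\infty$. The proof of (2.31) is complete. This concludes the proof of Lemma 2.4.

Let us now examine the differentiability of $a_\phi$ and $a_\phi^{-1}$ with respect to $\phi$. To shorten the statement of the next lemma, we introduce a few notations. We consider two nonzero potentials $\phi = \phi_f \in \Phi_{rad}$ and $\tilde\phi = \phi_{\tilde f} \in \Phi_{rad}$ and set:
\[ h = \tilde\phi - \phi. \tag{2.43} \]
For all $\ell > 0$ and $\lambda \in [0,1]$, we recall the notation
\[ e_{\phi+\lambda h,\ell} = \inf_{r \ge 0} \big( \psi_{\phi,\ell}(r) + \lambda h(r) \big), \tag{2.44} \]
where $\psi_{\phi,\ell}(r)$ is defined by (2.2), and denote
\[ \Omega(\phi,\tilde\phi,\ell) = \big\{ (\lambda,e) : \lambda \in [0,1] \text{ and } e \in\, ]e_{\phi+\lambda h,\ell}, 0[ \big\}. \tag{2.45} \]
Let $s \in \mathbb{R}_+^*$ and $\lambda \in [0,1]$. Recall that, by Lemma 2.3, there exists a unique $e \in\, ]e_{\phi+\lambda h,\ell}, 0[$, denoted by $a_{\phi+\lambda h}^{-1}(s,\ell)$, such that $a_{\phi+\lambda h}(e,\ell) = s$. Finally, we set
\[ M = \max\big( |f|_{L^1}, |\tilde f|_{L^1} \big). \tag{2.46} \]
Lemma 2.5 (Differentiability of $a_\phi(e,\ell)$ with respect to $\phi$). Let $\ell > 0$ be fixed. Consider $\phi, \tilde\phi \in \Phi_{rad}$ both nonzero and let $h = \tilde\phi - \phi$. Then, with the notations (2.43)–(2.46), the following holds:

(i) The function $(\lambda,e) \mapsto a_{\phi+\lambda h}(e,\ell)$ is a $C^1$ function on $\Omega(\phi,\tilde\phi,\ell)$. Moreover, we have
\[ \frac{\partial}{\partial\lambda} a_{\phi+\lambda h}(e,\ell) = -4\pi^2\sqrt{2} \int_{r_1(\phi+\lambda h,e,\ell)}^{r_2(\phi+\lambda h,e,\ell)} \big( e - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} h(r)\, dr, \tag{2.47} \]
with the bound:
\[ \forall (\lambda,e) \in \Omega(\phi,\tilde\phi,\ell), \qquad \Big| \frac{\partial a_{\phi+\lambda h}}{\partial\lambda}(e,\ell) \Big| \le C\, \frac{M^2}{\sqrt{\ell}\, e^2}, \tag{2.48} \]
for some universal constant $C > 0$.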


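(ii) Let $s \in \mathbb{R}_+^*$. Then the function $\lambda \mapsto a_{\phi+\lambda h}^{-1}(s,\ell)$ is differentiable on $[0,1]$ and we have
\[ \frac{\partial}{\partial\lambda} a_{\phi+\lambda h}^{-1}(s,\ell) = \frac{\displaystyle \int_{r_1}^{r_2} \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} h(r)\, dr}{\displaystyle \int_{r_1}^{r_2} \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} dr}, \tag{2.49} \]
where $(r_i)_{i=1,2}$ shortly denotes $r_i(\phi+\lambda h, a_{\phi+\lambda h}^{-1}(s,\ell), \ell)$.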
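Formula (2.49) is the implicit-function identity obtained by differentiating $a_{\phi+\lambda h}(e(\lambda),\ell) = s$ in $\lambda$ and using (2.23) and (2.47). The following sketch checks it numerically in a toy case; it is an illustration under stated assumptions, not the paper's argument. It assumes two point-mass potentials $\phi = -M_0/r$ and $\tilde\phi = -M_1/r$ (a limiting case outside $\Phi_{rad}$), for which $\phi + \lambda h$ is again a point-mass potential and the classical Kepler formula $a = 8\pi^3\big(M/\sqrt{2|e|} - \sqrt{\ell}\big)$ can be inverted by hand. All numerical values and names are hypothetical.

```python
import math

ELL = 0.5           # hypothetical value of ell
M0, M1 = 1.0, 1.3   # masses of the toy potentials phi = -M0/r, phi_tilde = -M1/r
S = 40.0            # prescribed value s of the Jacobian


def mass(lam):
    """Mass of the interpolated potential phi + lam * h."""
    return M0 + lam * (M1 - M0)


def h(r):
    """h = phi_tilde - phi for the two toy potentials."""
    return -(M1 - M0) / r


def energy(lam):
    """e = a_{phi + lam h}^{-1}(S, ELL), from the point-mass closed form
    a = 8 pi^3 (M / sqrt(2|e|) - sqrt(ELL))."""
    k = S / (8.0 * math.pi ** 3) + math.sqrt(ELL)
    return -mass(lam) ** 2 / (2.0 * k * k)


def weighted_integral(lam, weight, n=4000):
    """Midpoint rule for int_{r1}^{r2} (e - psi - lam*h)^(-1/2) weight(r) dr,
    with r = c - d*cos(theta) to absorb the endpoint singularities."""
    m, e = mass(lam), energy(lam)
    s = math.sqrt(m * m + 2.0 * e * ELL)
    r1, r2 = (m - s) / (2.0 * abs(e)), (m + s) / (2.0 * abs(e))
    c, d = 0.5 * (r1 + r2), 0.5 * (r2 - r1)
    total = 0.0
    for k in range(n):
        theta = math.pi * (k + 0.5) / n
        r = c - d * math.cos(theta)
        gap = e + m / r - ELL / (2.0 * r * r)  # e - psi(r) - lam * h(r)
        total += weight(r) * d * math.sin(theta) / math.sqrt(gap)
    return total * math.pi / n


def derivative_formula(lam):
    """Right-hand side of (2.49), evaluated by quadrature."""
    return weighted_integral(lam, h) / weighted_integral(lam, lambda r: 1.0)


def derivative_fd(lam, step=1e-6):
    """Central finite difference of lam -> a^{-1}_{phi + lam h}(S, ELL)."""
    return (energy(lam + step) - energy(lam - step)) / (2.0 * step)
```

In this toy setting both `derivative_formula(lam)` and `derivative_fd(lam)` should agree with the hand computation $-M(\lambda)(M_1 - M_0)/k^2$, where $k = s/(8\pi^3) + \sqrt{\ell}$.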

Proof. Recall from Lemma 2.4 that the functions $e_{\phi+\lambda h,\ell}$, $r_1(\phi+\lambda h,e,\ell)$ and $r_2(\phi+\lambda h,e,\ell)$ are continuous functions of $\lambda$ (for fixed $e$ and $\ell$).

Step 1. Proof of (i). The proof of (i) relies on Lemma A.1, exactly in the same manner as the regularity of $a_{\phi+\lambda h}(e,\ell)$ with respect to $e$ in Lemma 2.3. We fix $\ell > 0$ and introduce the following function:
\[ g(\lambda,e,r) = 8\pi^2\sqrt{2}\, \big( e - \psi_{\phi,\ell}(r) - \lambda h(r) \big)_+^{1/2}, \]
so that
\[ a_{\phi+\lambda h}(e,\ell) = \int_{r_1(\phi+\lambda h,e,\ell)}^{r_2(\phi+\lambda h,e,\ell)} g(\lambda,e,r)\, dr. \]
By (2.13) and (2.8), we have the following uniform bound:
\[ g(\lambda,e,r) \le C \Big( \frac{M}{r_1(\phi+\lambda h,e,\ell)} \Big)^{1/2} \le C\, \frac{M}{\sqrt{\ell}}, \]

where M is defined by (2.46). Hence, one deduces from standard dominated convergence that (λ, e) → aφ+λh (e, ) is a C 0 function on [0, 1] × R− and satisfies , ). aφ+λh (e, ) > 0 ⇔ (λ, e) ∈ (φ, φ Let us now prove the differentiability of aφ+λh (e, ) with respect to λ. Let λ0 ∈ [0, 1], e0 = eφ+λ0 h, , and e ∈]e0 , 0[ be fixed. From the continuity of eφ+λh, with respect to λ, we have e ∈]eφ+λh, , 0[ for λ in a neighborhood I0 of λ0 . Hence, for λ ∈ I0 , the distributional partial derivative of g is given by √ −1/2 ∂g (λ, e, r ) = −4π 2 2 e − ψφ, (r ) − λh(r ) 1r1 (φ+λh,e,)
(2.50)

for all r = r1 (φ + λ0 h, e, ), r = r2 (φ + λ0 h, e, ). Now, we use (2.9) and (2.8): 0≤

1 M2 ∂g (λ, e, r ) ≤ C √ √ 1r1
(2.51)


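with
\[ \int_{r_1}^{r_2} q_{\lambda,e}(r)\, dr = C\, \frac{M^2}{\sqrt{\ell}\, e^2}\, \pi. \tag{2.52} \]
As in Step 2 of the proof of Lemma 2.3, one deduces from (2.50), (2.51), (2.52) and from the Brézis-Lieb Theorem that Assumptions (i) and (ii) of Lemma A.1 are satisfied. Hence the function $a_{\phi+\lambda h}(e,\ell)$ is differentiable with respect to $\lambda$ and its derivative $\frac{\partial}{\partial\lambda} a_{\phi+\lambda h}(e,\ell)$ is given by (2.47).

We now claim that $\frac{\partial}{\partial\lambda} a_{\phi+\lambda h}(e,\ell)$ is a continuous function of $(\lambda,e)$ on $\Omega(\phi,\tilde\phi,\ell)$. Indeed, let $\lambda_0 \in [0,1]$ and $e_0 = e_{\phi+\lambda_0 h,\ell}$ be fixed. A direct adaptation of the proof of (2.28) shows that
\[ r_1(\phi+\lambda h,e,\ell) \to r_1(\phi+\lambda_0 h,e_0,\ell), \qquad r_2(\phi+\lambda h,e,\ell) \to r_2(\phi+\lambda_0 h,e_0,\ell) \]
as $(\lambda,e) \to (\lambda_0,e_0)$. Thus we have the following a.e. convergence:
\[ \frac{\partial g}{\partial\lambda}(\lambda,e,r) \to \frac{\partial g}{\partial\lambda}(\lambda_0,e_0,r) \quad \text{as } (\lambda,e) \to (\lambda_0,e_0), \quad \text{for all } r \ne r_1(\phi+\lambda_0 h,e_0,\ell),\ r \ne r_2(\phi+\lambda_0 h,e_0,\ell). \tag{2.53} \]
Hence, using again the domination (2.51) and the fact that $q_{\lambda,e} \to q_{\lambda_0,e_0}$ in $L^1(\mathbb{R}_+)$ as $(\lambda,e) \to (\lambda_0,e_0)$, we deduce from the dominated convergence theorem that
\[ \int_0^{+\infty} \frac{\partial g}{\partial\lambda}(\lambda,e,r)\, dr \to \int_0^{+\infty} \frac{\partial g}{\partial\lambda}(\lambda_0,e_0,r)\, dr \quad \text{as } (\lambda,e) \to (\lambda_0,e_0). \]
In other words, $\frac{\partial}{\partial\lambda} a_{\phi+\lambda h}(e,\ell)$ is a continuous function of $(\lambda,e)$. Similarly, $\frac{\partial}{\partial e} a_{\phi+\lambda h}(e,\ell)$ is a continuous function of $(\lambda,e)$. Therefore, the function $(\lambda,e) \mapsto a_{\phi+\lambda h}(e,\ell)$ is $C^1$ on $\Omega(\phi,\tilde\phi,\ell)$. Since the bound (2.48) stems directly from (2.51) and (2.52), the proof of Item (i) of Lemma 2.5 is complete.

Step 2. Differentiability of $a_{\phi+\lambda h}^{-1}(s,\ell)$. Let $(s,\ell) \in (\mathbb{R}_+^*)^2$. From Lemma 2.4, we already know that the function $\lambda \mapsto a_{\phi+\lambda h}^{-1}(s,\ell)$ is continuous. Let $\lambda_0 \in [0,1]$ and consider a sequence $\lambda_n \in [0,1]$ such that $\lambda_n \to \lambda_0$ as $n \to +\infty$. We write
\[ \frac{a_{\phi+\lambda_n h}^{-1}(s,\ell) - a_{\phi+\lambda_0 h}^{-1}(s,\ell)}{\lambda_n - \lambda_0} = A_1(\lambda_n)\, A_2(\lambda_n), \tag{2.54} \]
where we have set
\[ A_1(\lambda_n) = \frac{a_{\phi+\lambda_n h}^{-1}(s,\ell) - a_{\phi+\lambda_0 h}^{-1}(s,\ell)}{a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_n h}^{-1}(s,\ell), \ell \big) - a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big)}, \]
and
\[ A_2(\lambda_n) = \frac{a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_n h}^{-1}(s,\ell), \ell \big) - a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big)}{\lambda_n - \lambda_0} = \frac{a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_n h}^{-1}(s,\ell), \ell \big) - a_{\phi+\lambda_n h}\big( a_{\phi+\lambda_n h}^{-1}(s,\ell), \ell \big)}{\lambda_n - \lambda_0}, \]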


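and where we simply used that $a_{\phi+\lambda_0 h}\big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big) = s = a_{\phi+\lambda_n h}\big( a_{\phi+\lambda_n h}^{-1}(s,\ell), \ell \big)$. Let us examine separately the convergence of the two factors $A_1$ and $A_2$ in (2.54). From the continuity of $\lambda \mapsto a_{\phi+\lambda h}^{-1}(s,\ell)$, we have:
\[ \lim_{n \to +\infty} a_{\phi+\lambda_n h}^{-1}(s,\ell) = a_{\phi+\lambda_0 h}^{-1}(s,\ell), \tag{2.55} \]
hence
\[ \lim_{n \to +\infty} A_1(\lambda_n) = \frac{1}{\frac{\partial a_{\phi+\lambda_0 h}}{\partial e}\big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big)} = \frac{1}{4\pi^2\sqrt{2} \displaystyle\int_{r_1}^{r_2} \big( a_{\phi+\lambda_0 h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda_0 h(r) \big)^{-1/2} dr}. \tag{2.56} \]
Let us now examine the convergence of the term $A_2(\lambda_n)$, that we rewrite as follows:
\[ A_2(\lambda_n) = -\frac{1}{\lambda_n - \lambda_0} \int_{\lambda_0}^{\lambda_n} \frac{\partial A}{\partial\lambda}(\mu, e_n)\, d\mu, \]
where we have denoted
\[ A(\lambda,e) = a_{\phi+\lambda h}(e,\ell), \qquad e_n = a_{\phi+\lambda_n h}^{-1}(s,\ell). \]
By (2.55), we have $e_n \to e_0 = a_{\phi+\lambda_0 h}^{-1}(s,\ell)$. Moreover, from Step 1, we know that the function $(\lambda,e) \mapsto \frac{\partial A}{\partial\lambda}(\lambda,e)$ is continuous at $(\lambda_0,e_0)$. Hence, we have
\[ \lim_{\mu \to \lambda_0,\ n \to \infty} \frac{\partial A}{\partial\lambda}(\mu,e_n) = \frac{\partial A}{\partial\lambda}(\lambda_0,e_0) = \frac{\partial a_{\phi+\lambda h}}{\partial\lambda}\Big|_{\lambda=\lambda_0} \big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big), \]
from which we deduce that
\[ \lim_{n \to +\infty} A_2(\lambda_n) = -\frac{\partial a_{\phi+\lambda h}}{\partial\lambda}\Big|_{\lambda=\lambda_0} \big( a_{\phi+\lambda_0 h}^{-1}(s,\ell), \ell \big) = 4\pi^2\sqrt{2} \int_{r_1}^{r_2} \big( a_{\phi+\lambda_0 h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda_0 h(r) \big)^{-1/2} h(r)\, dr, \]
where we used (2.47). Finally, (2.56) and the above limit of $A_2(\lambda_n)$ give (2.49). This concludes the proof of Lemma 2.5.

2.3. Rearrangement with respect to a given microscopic energy. In this section, we introduce the Schwarz symmetrization of a function $f$ with respect to a given microscopic energy $e = \frac{|v|^2}{2} + \phi(x)$ at given momentum $\ell > 0$. We start by defining a suitable rearrangement of $f \in \mathcal{E}_{rad}$ at given momentum $\ell > 0$ which preserves the generalized Casimir functionals (1.9). We proceed as for the usual Schwarz symmetrization, see [33,27,1]. Let us recall the definition (1.13), (1.14) of the distribution function of $f$ at given $\ell > 0$:
\[ \mu_f(s,\ell) = \nu_\ell\big\{ (r,u) \in \Omega_\ell,\ f(r,u,\ell) > s \big\} = 4\pi^2 \int_{r=0}^{+\infty} \int_{u=-\infty}^{+\infty} \mathbf{1}_{f(r,u,\ell) > s}\, (r^2u^2 - \ell)^{-1/2}\, r|u|\, \mathbf{1}_{r^2u^2 > \ell}\, dr\, du. \]
We then have the following elementary lemma: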


Lemma 2.6 (Properties of $\mu_f$). Let $f\in L^1\cap L^\infty(\mathbb{R}^6)$, nonnegative and spherically symmetric, and let $\mu_f(s,\ell)$ be the distribution function of $f$ at given $\ell$ as defined by (1.14). Then there exists a set $A$ with $|A|_{\mathbb{R}_+}=0$ such that
$$\forall\ell\in\mathbb{R}_+\setminus A,\ \forall s>0,\quad \mu_f(s,\ell)<+\infty,\tag{2.57}$$
$$\forall\ell\in\mathbb{R}_+\setminus A,\ \forall s\ge|f|_{L^\infty},\quad \mu_f(s,\ell)=0.\tag{2.58}$$
Moreover, $\forall\ell\in\mathbb{R}_+\setminus A$, the map $s\mapsto\mu_f(s,\ell)$ is right continuous on $\mathbb{R}_+^*$.

We may now introduce the generalized Schwarz symmetrization:

Proposition 2.7 (Schwarz symmetrization at fixed $\ell>0$). Let $f\in L^1\cap L^\infty(\mathbb{R}^6)$, nonnegative and spherically symmetric, let $\mu_f(t,\ell)$ be given by (1.14) and let $A$ be the zero measure set given by Lemma 2.6. We define the Schwarz symmetrization $f^*(\cdot,\ell)$ of $f$ at fixed $\ell$ as being the pseudo inverse of $\mu_f(\cdot,\ell)$:
$$\forall t\ge0,\ \forall\ell\in\mathbb{R}_+^*\setminus A,\quad f^*(t,\ell)=\begin{cases}\sup\{s\ge0:\ \mu_f(s,\ell)>t\} & \text{for } t<\mu_f(0,\ell),\\ 0 & \text{for } t\ge\mu_f(0,\ell),\end{cases}\tag{2.59}$$
with $\mu_f$ given by (1.14). Then $f^*(\cdot,\ell)$ is a nonincreasing function on $[0,\infty)$ and
$$\forall t\ge0,\ \forall\ell\in\mathbb{R}_+^*\setminus A,\quad \mu_f(t,\ell)=\big|\{s>0;\ f^*(s,\ell)>t\}\big|_{\mathbb{R}_+}.\tag{2.60}$$
In particular
$$|f^*|_{L^p(\mathbb{R}_+\times\mathbb{R}_+)}=|f|_{L^p(\mathbb{R}^6)},\quad\forall p\in[1,+\infty].\tag{2.61}$$
Moreover, the contractivity relation
$$|f^*-g^*|_{L^1}\le|f-g|_{L^1}\tag{2.62}$$
holds.
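The definitions (1.14) and (2.59) can be tried out on a step function. The following is only a minimal sketch under simplifying assumptions that are not in the paper: the weighted measure $\nu_\ell$ is replaced by equal-size cells of measure `cell`, the momentum $\ell$ is dropped, and all function names are hypothetical. It evaluates the pseudo inverse directly from its definition, so that equimeasurability and the contractivity (2.62) can be checked on small examples.

```python
def distribution(values, cell, s):
    """Discrete analogue of mu_f(s): measure of the set where f > s."""
    return cell * sum(1 for v in values if v > s)


def rearrangement(values, cell, t):
    """Pseudo inverse f*(t) = sup{s >= 0 : mu_f(s) > t}, and 0 for t >= mu_f(0)."""
    if t >= distribution(values, cell, 0.0):
        return 0.0
    # mu_f is a right-continuous step function, so the sup is the largest
    # level v for which the set {f >= v} has measure strictly above t.
    levels = [v for v in set(values)
              if cell * sum(1 for u in values if u >= v) > t]
    return max(levels)


def sample_rearrangement(values, cell):
    """Values of f* at the midpoints of the cells [k*cell, (k+1)*cell)."""
    return [rearrangement(values, cell, (k + 0.5) * cell)
            for k in range(len(values))]


def l1_distance(a, b, cell):
    """Discrete L^1 distance between two step functions on the same cells."""
    return cell * sum(abs(x - y) for x, y in zip(a, b))


# Illustrative data (assumed, not from the paper).
CELL = 0.5
F = [0.0, 3.0, 1.0, 3.0, 2.0]
G = [1.0, 0.5, 4.0, 0.0, 2.0]
F_STAR = sample_rearrangement(F, CELL)
G_STAR = sample_rearrangement(G, CELL)
```

In this toy setting `F_STAR` is simply `F` sorted in decreasing order, which is the discrete counterpart of the statement that $f^*(\cdot,\ell)$ is nonincreasing and equimeasurable with $f$.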

Lemma 2.6 and Proposition 2.7 can be derived from standard arguments by adapting for example the arguments in [44]; this is left to the reader. Given $f\in\mathcal{E}_{rad}$ and $\phi\in\Phi_{rad}$, we now define the rearrangement of $f$ with respect to the microscopic energy $\frac{|v|^2}{2}+\phi(x)$.

Proposition 2.8 (Symmetric rearrangement with respect to a given microscopic energy). Let $f\in\mathcal{E}_{rad}$ and $\phi\in\Phi_{rad}$ non zero. Let $f^*$ be the symmetric rearrangement of $f$ defined by (2.59). We define the rearrangement $f^{*\phi}$ of $f$ with respect to the microscopic energy $\frac{|v|^2}{2}+\phi(x)$ by:
$$f^{*\phi}(x,v)=f^*\Big(a_\phi\Big(\frac{|v|^2}{2}+\phi(x),|x\times v|^2\Big),|x\times v|^2\Big)\,1_{\frac{|v|^2}{2}+\phi(x)<0},\tag{2.63}$$
where $a_\phi$ is defined in Lemma 2.3. Then
$$f^{*\phi}\in\mathrm{Eq}(f),\tag{2.64}$$
where $\mathrm{Eq}(f)$ is defined by (1.15), and in particular:
$$|f^{*\phi}|_{L^p}=|f|_{L^p},\quad\forall p\in[1,+\infty].\tag{2.65}$$
Moreover, $f^{*\phi}\in\mathcal{E}_{rad}$ with estimate:
$$\big||v|^2f^{*\phi}\big|_{L^1}\le C\,|\nabla\phi|_{L^2}^{4/3}\,|f|_{L^1}^{7/9}\,|f|_{L^\infty}^{2/9}.\tag{2.66}$$


M. Lemou, F. Florian, P. Raphaël

Before proving this proposition, we give a corollary which says that nonincreasing steady states are invariant through the above rearrangement.

Corollary 2.9 (Identification of $Q^{*\phi_Q}$). Let $Q$ satisfy Assumption (A) and let $Q^{*\phi_Q}$ be defined according to (2.63). Then $Q^{*\phi_Q}=Q$ and we have
$$Q^*\big(a_{\phi_Q}(e,\ell),\ell\big)=F(e,\ell),\quad\forall\ell>0,\ \forall e\in[e_{\phi_Q,\ell},0[,\tag{2.67}$$
where $e_{\phi_Q,\ell}$ is defined in Lemma 2.1. In particular, for all $\ell>0$, $Q^*(\cdot,\ell)$ is a $C^1$ function on $]0,\mu_Q(0,\ell)[$, where $\mu_Q$ is defined by (1.13).

Proof of Corollary 2.9. Let $\ell>0$ be fixed and recall the function $F$ defined in Assumption (A). Assume first that $F(e,\ell)=0$ for all $e<0$. From definition (1.13) we have that
$$\mu_Q(s,\ell)=\nu_\ell\Big\{(r,u):\ F\Big(\frac{|u|^2}{2}+\phi_Q(r),\ell\Big)>s\Big\}=0\quad\text{for all } s\ge0.$$
This implies from (2.59) that $Q^*(\cdot,\ell)=0$, and then identity (2.67) is satisfied. Assume now that $F(\cdot,\ell)$ is not zero on $\mathbb{R}_-^*$ and let
$$e_0(\ell)=\sup\{e<0:\ F(e,\ell)>0\}.\tag{2.68}$$
By Assumption (A), we have $e_0(\ell)\le e_0<0$ and the function $e\mapsto F(e,\ell)$ is continuous, strictly decreasing on $]-\infty,e_0(\ell)]$ and vanishes for $e\ge e_0(\ell)$. As $F$ is nonnegative, we have from (1.13):
$$\mu_Q(F(e,\ell),\ell)=\nu_\ell\Big\{(r,u):\ F\Big(\frac{|u|^2}{2}+\phi_Q(r),\ell\Big)>F(e,\ell)\Big\},\quad\forall e\in\mathbb{R},$$
and, $F(\cdot,\ell)$ being strictly decreasing on $]-\infty,e_0(\ell)]$, this identity implies
$$\mu_Q(F(e,\ell),\ell)=\nu_\ell\Big\{(r,u):\ \frac{|u|^2}{2}+\phi_Q(r)<e\Big\}=a_{\phi_Q}(e,\ell),\quad\forall e\le e_0(\ell).\tag{2.69}$$
Assume that $\mu_Q(0,\ell)=0$, then $\mu_Q(\cdot,\ell)=0$ since it is a nonincreasing function. Hence, from definition (2.59) we get $Q^*(\cdot,\ell)=0$. Now, we write (2.69) for $e=e_0(\ell)$ and deduce from the structure of $a_{\phi_Q}$ that $e_0(\ell)\le e_{\phi_Q,\ell}$. This means that $F(e,\ell)=0$ for $e\in[e_{\phi_Q,\ell},0[$, and identity (2.67) is satisfied.

We now assume $\mu_Q(0,\ell)>0$, which implies from (2.69) that $e_0(\ell)>e_{\phi_Q,\ell}$. We know that $a_{\phi_Q}(\cdot,\ell)$ (resp. $F(\cdot,\ell)$) is continuous and one-to-one from $[e_{\phi_Q,\ell},e_0(\ell)]$ to $[0,a_{\phi_Q}(e_0(\ell),\ell)]$ (resp. $[0,F(e_{\phi_Q,\ell},\ell)]$). Hence, identity (2.69) ensures that $\mu_Q(\cdot,\ell)$ is invertible from $[0,F(e_{\phi_Q,\ell},\ell)]$ to $[0,a_{\phi_Q}(e_0(\ell),\ell)]$ and $Q^*$ (which is by definition its pseudo inverse) is its inverse in this case. Therefore, (2.69) implies
$$Q^*\big(a_{\phi_Q}(e,\ell),\ell\big)=F(e,\ell),\quad\forall e\in[e_{\phi_Q,\ell},e_0(\ell)].$$
Now (2.69) implies that $a_{\phi_Q}(e,\ell)\ge a_{\phi_Q}(e_0(\ell),\ell)=\mu_Q(0,\ell)$ for $e\in[e_0(\ell),0[$, which together with the definition of $Q^*$ ensures that both terms in (2.67) vanish for $e\in[e_0(\ell),0[$. This ends the proof of (2.67). Finally, using (2.67), we conclude that the stated $C^1$ regularity of $Q^*$ on $]0,a_{\phi_Q}(e_0(\ell),\ell)[$ is an immediate consequence of the $C^1$ regularity and the non vanishing derivatives of $F$ and $a_{\phi_Q}$ on $]e_{\phi_Q,\ell},e_0(\ell)[$.

To end the proof of Corollary 2.9, it remains to identify $Q$ and $Q^{*\phi_Q}$ for a.e. $x,v$. Let $(x,v)\in\mathbb{R}^6$ such that $\ell=|x\times v|^2>0$ and let $e(x,v)=\frac{|v|^2}{2}+\phi_Q(r)\ge\psi_{\phi_Q,\ell}(r)\ge e_{\phi_Q,\ell}$, where we used that $|v|^2\ge\frac{\ell}{r^2}$. If $e(x,v)<0$, then (2.67) gives directly $Q(x,v)=F(e(x,v),\ell)=Q^{*\phi_Q}(x,v)$, by Assumption (A) and (2.63). If $e(x,v)\ge0$, then we have $Q(x,v)=F(e(x,v),\ell)=Q^{*\phi_Q}(x,v)=0$, using again (2.63). This concludes the proof of Corollary 2.9.

Proof of Proposition 2.8. We first notice that the formula (2.63) is well-defined for a.e. $(x,v)\in\mathbb{R}^6$ by Proposition 2.7. Indeed, from (1.10) we have that
$$\big|\{(x,v)\in\mathbb{R}^6:\ |x\times v|^2\in A\}\big|_{\mathbb{R}^6}=0,$$

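where $A$ is the measure zero exceptional set given in Lemma 2.6.

Step 1. The change of variables formula. The equimeasurability of $f$ and $f^{*\phi}$ relies on the following elementary change of variables formula: let two nonnegative functions $\alpha\in C^0(\mathbb{R})\cap L^\infty(\mathbb{R})$, $\beta\in L^1(\mathbb{R}_+\times\mathbb{R}_+)$, then $\forall\ell>0$,
$$2\int_0^{+\infty}\!\!\int_0^{+\infty}\alpha\Big(\frac{u^2}{2}+\phi(r)\Big)\,\beta\Big(a_\phi\Big(\frac{u^2}{2}+\phi(r),\ell\Big),\ell\Big)\,1_{\frac{u^2}{2}+\phi(r)<0}\,d\nu_\ell=\int_0^{+\infty}\alpha\big(a_\phi^{-1}(s,\ell)\big)\,\beta(s,\ell)\,ds,\tag{2.70}$$
where $\nu_\ell$ is given by (1.12). This implies in particular from (1.10):
$$\int_{\mathbb{R}^6}\alpha\Big(\frac{|v|^2}{2}+\phi(x)\Big)\,\beta\Big(a_\phi\Big(\frac{|v|^2}{2}+\phi(x),|x\times v|^2\Big),|x\times v|^2\Big)\,dx\,dv=\int_0^{+\infty}\!\!\int_0^{+\infty}\alpha\big(a_\phi^{-1}(s,\ell)\big)\,\beta(s,\ell)\,ds\,d\ell.\tag{2.71}$$
Let us prove (2.70). We first perform the change of variable on the integral on $u$ in the lhs of (2.70), $e=\frac{u^2}{2}+\phi(r)$, to obtain
$$2\int_{r,u\ge0}\alpha\Big(\frac{u^2}{2}+\phi(r)\Big)\,\beta\Big(a_\phi\Big(\frac{u^2}{2}+\phi(r),\ell\Big),\ell\Big)\,1_{\frac{u^2}{2}+\phi(r)<0}\,d\nu_\ell=4\pi^2\sqrt{2}\int_{r=0}^{+\infty}\!\!\int_{e=-\infty}^{0}\alpha(e)\,\beta\big(a_\phi(e,\ell),\ell\big)\Big(e-\phi(r)-\frac{\ell}{2r^2}\Big)^{-1/2}1_{e-\phi(r)-\frac{\ell}{2r^2}>0}\,de\,dr.$$
Now from Fubini and (2.23), we get
$$2\int_{r,u\ge0}\alpha\Big(\frac{u^2}{2}+\phi(r)\Big)\,\beta\Big(a_\phi\Big(\frac{u^2}{2}+\phi(r),\ell\Big),\ell\Big)\,1_{\frac{u^2}{2}+\phi(r)<0}\,d\nu_\ell=\int_{e_{\phi,\ell}}^{0}\alpha(e)\,\beta\big(a_\phi(e,\ell),\ell\big)\,\frac{\partial a_\phi}{\partial e}(e,\ell)\,de.$$
From Lemma 2.3, for any $\ell>0$, the map $e\mapsto a_\phi(e,\ell)$ is a $C^1$-diffeomorphism from $]e_{\phi,\ell},0[$ to $]0,+\infty[$, and we may therefore perform the change of variable $s=a_\phi(e,\ell)$, which together with (2.19) yields (2.70).

Step 2. Equimeasurability and proof of (2.66). Let now $\phi\in\Phi_{rad}$, $f\in\mathcal{E}_{rad}$ and $f^{*\phi}$ given by (2.63). We first prove that $f^{*\phi}\in\mathrm{Eq}(f)$ according to the definition (1.15). For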


all $t\ge0$ and $\ell>0$, we use the definition of $\mu_f(\cdot,\ell)$ (1.14), of $f^{*\phi}$ (2.63) and the formula (2.70) with $\alpha=1$ and $\beta(s,\ell)=1_{f^*(s,\ell)>t}$ to get:
$$\mu_{f^{*\phi}}(t,\ell)=2\int_0^{+\infty}\!\!\int_0^{+\infty}1_{f^{*\phi}(r,u,\ell)>t}\,d\nu_\ell=\int_0^{+\infty}1_{f^*(s,\ell)>t}\,ds,$$
and hence from (2.60):
$$\forall t\ge0,\ \text{a.e. }\ell>0,\quad\mu_{f^{*\phi}}(t,\ell)=\mu_f(t,\ell),$$
which implies the equimeasurability of $f$ and $f^{*\phi}$ according to the definition (1.15). It remains to control the kinetic energy of $f^{*\phi}$ according to (2.66). Indeed:
$$\big||v|^2f^{*\phi}\big|_{L^1}=2\int\Big(\frac{|v|^2}{2}+\phi\Big)f^{*\phi}(x,v)\,dx\,dv+2\int\nabla\phi(x)\cdot\nabla\phi_{f^{*\phi}}\,dx\le2\int\nabla\phi(x)\cdot\nabla\phi_{f^{*\phi}}\,dx\lesssim|\nabla\phi|_{L^2}\,\big||v|^2f^{*\phi}\big|_{L^1}^{1/4}\,|f^{*\phi}|_{L^1}^{7/12}\,|f^{*\phi}|_{L^\infty}^{1/6},$$
where we used (2.63) and the interpolation inequality (2.10). This together with a straightforward localization argument concludes the proof of (2.66). This concludes the proof of Proposition 2.8.

Let us conclude this section with an elementary lemma which will be useful in the sequel.

Lemma 2.10 (Pseudo inverse of $f^*(a_\phi(\cdot,\ell),\ell)$). Let $f\in\mathcal{E}_{rad}$ and $\phi\in\Phi_{rad}$ be given nonzero functions, and let $\ell>0$ such that $f^*(0,\ell)>0$. The function $e\mapsto f^*(a_\phi(e,\ell),\ell)$ is nonincreasing from $[e_{\phi,\ell},0[$ to $[0,f^*(0,\ell)]$. We define its pseudo inverse, which we denote (with abuse of notation) $s\mapsto(f^*\circ a_\phi)^{-1}(s,\ell)$, as follows:
$$(f^*\circ a_\phi)^{-1}(s,\ell)=\sup\{e\in[e_{\phi,\ell},0[:\ f^*(a_\phi(e,\ell),\ell)>s\},\tag{2.72}$$
for all $s\in]0,f^*(0,\ell)[$. Then $s\mapsto(f^*\circ a_\phi)^{-1}(s,\ell)$ is a nonincreasing function and $\forall(x,v)\in(\mathbb{R}^3)^2$ such that $|x\times v|^2=\ell$, $\forall s\in]0,f^*(0,\ell)[$,
$$f^{*\phi}(x,v)>s\ \Rightarrow\ \frac{|v|^2}{2}+\phi(x)\le(f^*\circ a_\phi)^{-1}(s,\ell),\tag{2.73}$$
$$f^{*\phi}(x,v)\le s\ \Rightarrow\ \frac{|v|^2}{2}+\phi(x)\ge(f^*\circ a_\phi)^{-1}(s,\ell).\tag{2.74}$$

Proof. Let $\ell>0$ and $s\in(0,f^*(0,\ell))$, then $f^*(a_\phi(e_{\phi,\ell},\ell),\ell)=f^*(0,\ell)>s$ and hence $\{e\in[e_{\phi,\ell},0):\ f^*(a_\phi(e,\ell),\ell)>s\}$ is not empty. This means that $(f^*\circ a_\phi)^{-1}(s,\ell)$ is well defined for $s\in(0,f^*(0,\ell))$. The monotonicity of $(f^*\circ a_\phi)^{-1}$ follows from the monotonicity of $f^*$ and $a_\phi$. Let now $(x,v)\in\mathbb{R}^6$ be such that $|x\times v|^2=\ell>0$. Assume $f^{*\phi}(x,v)>s$, then $f^*\big(a_\phi(\frac{|v|^2}{2}+\phi(x),\ell),\ell\big)>s$ and thus $\frac{|v|^2}{2}+\phi(x)<0$. Thus we have either $\frac{|v|^2}{2}+\phi(x)<e_{\phi,\ell}$, and in this case (2.73) is trivial, or $\frac{|v|^2}{2}+\phi(x)\in[e_{\phi,\ell},0)$, and this implies $\frac{|v|^2}{2}+\phi(x)\le(f^*\circ a_\phi)^{-1}(s,\ell)$ from the definition (2.72). Thus (2.73) is then proved. Assume now $f^{*\phi}(x,v)\le s$. If $\frac{|v|^2}{2}+\phi(x)\ge0$ then (2.74) is trivial, otherwise $f^{*\phi}(x,v)\le s<f^*(0,\ell)$ implies that $\frac{|v|^2}{2}+\phi(x)\in(e_{\phi,\ell},0)$. Thus for all $e\in\{e\in[e_{\phi,\ell},0):\ f^*(a_\phi(e,\ell),\ell)>s\}$, which is a non empty set, $\frac{|v|^2}{2}+\phi(x)\ge e$, and (2.74) follows.
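The two implications (2.73) and (2.74) only use the monotonicity of $e\mapsto f^*(a_\phi(e,\ell),\ell)$. A minimal sketch of this mechanism, assuming a hypothetical nonincreasing step function `g` on a finite energy grid in place of $f^*\circ a_\phi$ (neither the grid nor `g` comes from the paper), reads as follows.

```python
def pseudo_inverse(g, grid, s):
    """Discrete version of (2.72): sup{e in grid : g(e) > s}, or None if empty."""
    candidates = [e for e in grid if g(e) > s]
    return max(candidates) if candidates else None


def implications_hold(g, grid, s):
    """Check the analogues of (2.73) and (2.74) at every grid point."""
    e_star = pseudo_inverse(g, grid, s)
    if e_star is None:
        return False
    for e in grid:
        if g(e) > s and not e <= e_star:   # analogue of (2.73)
            return False
        if g(e) <= s and not e >= e_star:  # analogue of (2.74)
            return False
    return True


def g(e):
    """Assumed nonincreasing profile with two plateaus, vanishing near e = 0."""
    if e < -0.6:
        return 2.0
    if e < -0.3:
        return 1.0
    return 0.0


# Energy grid on [-1, 0), playing the role of [e_{phi,l}, 0[.
GRID = [k / 100 for k in range(-100, 0)]
```

For instance `pseudo_inverse(g, GRID, 1.0)` picks the last grid point of the plateau where `g` equals 2, and the plateau where `g` equals 1 lies entirely to its right, which is the content of (2.74) at level $s=1$.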

3. Nonlinear Stability of the Vlasov Poisson System

This section is devoted to the proof of the main results of this paper. We first exhibit the key monotonicity formula involving the generalized symmetric rearrangement with respect to the Poisson field (2.63), Proposition 3.1, which allows us to reduce the study of the minimization problem of Theorem 1.3 to that of an unconstrained minimization problem on the Poisson field only. The study of this new problem, that is the proof of Proposition 3.2, is postponed to Sect. 4, and immediately yields Theorem 1.3. We then show how to extract compactness from minimizing sequences to prove Theorem 1.4, which now implies Theorem 1.5 from standard arguments.

3.1. The monotonicity formula. Given $f\in\mathcal{E}_{rad}$, we will note to ease notation:
$$\hat f=f^{*\phi_f},\tag{3.1}$$
and recall from Proposition 2.8 that:
$$\hat f\in\mathcal{E}_{rad}\cap\mathrm{Eq}(f).\tag{3.2}$$
We introduce the functional of $\phi\in\Phi_{rad}$:
$$\mathcal{J}_{f^*}(\phi)=\mathcal{H}(f^{*\phi})+\frac12\int|\nabla\phi-\nabla\phi_{f^{*\phi}}|^2,\tag{3.3}$$

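and claim the following monotonicity formula which is a fundamental key for our analysis -see also [2] for related statements:

Proposition 3.1 (Monotonicity of the Hamiltonian under the $f^{*\phi_f}$ rearrangement). Let $f\in\mathcal{E}_{rad}$, non zero, and $\hat f$ given by (3.1), then:
$$\mathcal{H}(f)\ge\mathcal{J}_{f^*}(\phi_f)\ge\mathcal{H}(\hat f).\tag{3.4}$$
Moreover, $\mathcal{H}(f)=\mathcal{H}(\hat f)$ if and only if $f=\hat f$.

Proof. Let $f,g\in\mathcal{E}_{rad}$, then:
$$\begin{aligned}\mathcal{H}(f)&=\frac12\int_{\mathbb{R}^6}|v|^2f-\frac12\int_{\mathbb{R}^3}|\nabla\phi_f|^2\\&=\frac12\int_{\mathbb{R}^6}|v|^2g+\int_{\mathbb{R}^6}\phi_fg+\frac12\int_{\mathbb{R}^3}|\nabla\phi_f|^2+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)(f-g)\\&=\mathcal{H}(g)+\frac12\int|\nabla\phi_g|^2-\int\nabla\phi_f\cdot\nabla\phi_g+\frac12\int|\nabla\phi_f|^2+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)(f-g),\end{aligned}$$
and hence the general formula: $\forall f,g\in\mathcal{E}_{rad}$,
$$\mathcal{H}(f)=\mathcal{H}(g)+\frac12|\nabla\phi_f-\nabla\phi_g|_{L^2}^2+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f(x)\Big)(f-g)\,dx\,dv.\tag{3.5}$$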
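The identity (3.5) is purely algebraic: it only uses that the potential energy is a symmetric quadratic form in $f$. A minimal finite-dimensional sketch, under assumptions that are not in the paper (distributions are vectors, `kinetic` stands for $\frac{|v|^2}{2}$, the potential is $\phi_f=-Gf$ for a hypothetical symmetric matrix `G`, so that $f\cdot Gf$ plays the role of $|\nabla\phi_f|_{L^2}^2$), lets one check the formula numerically.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, vector) for row in matrix]


def potential(G, f):
    """Toy potential phi_f = -G f (attractive interaction)."""
    return [-x for x in matvec(G, f)]


def hamiltonian(kinetic, G, f):
    """Toy Hamiltonian: kinetic energy minus half the potential energy f.Gf."""
    return dot(kinetic, f) - 0.5 * dot(f, matvec(G, f))


def identity_gap(kinetic, G, f, g):
    """Left-hand side minus right-hand side of the analogue of (3.5)."""
    diff = [x - y for x, y in zip(f, g)]
    microscopic_energy = [k + p for k, p in zip(kinetic, potential(G, f))]
    rhs = (hamiltonian(kinetic, G, g)
           + 0.5 * dot(diff, matvec(G, diff))
           + dot(microscopic_energy, diff))
    return hamiltonian(kinetic, G, f) - rhs


# Illustrative data (assumed): G must be symmetric for the identity to hold.
KINETIC = [1.0, 2.0, 3.0]
G_MATRIX = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
F_VEC = [1.0, 0.5, 2.0]
G_VEC = [0.0, 1.0, 1.0]
```

With these values both sides equal 1.25, so `identity_gap` returns zero; the symmetry of `G_MATRIX` is what replaces the integration by parts $\int\phi_fg=-\int\nabla\phi_f\cdot\nabla\phi_g$ used in the proof.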

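We apply this formula with $g=\hat f=f^{*\phi_f}$ and rewrite the result using (3.3):
$$\mathcal{H}(f)=\mathcal{J}_{f^*}(\phi_f)+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f(x)\Big)(f-\hat f)\,dx\,dv.$$
We now claim:
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f(x)\Big)(f-\hat f)\,dx\,dv\ge0,\tag{3.6}$$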

with equality if and only if $f=\hat f$, which immediately implies (3.4). The proof of (3.6) is reminiscent of the standard inequality for symmetric rearrangement known as the "bathtub" principle
$$\int_{\mathbb{R}^6}|x|f\ge\int_{\mathbb{R}^6}|x|f^*,$$
see [33]. Indeed, use $f(x,v)=\int_{t=0}^{+\infty}1_{t<f(x,v)}\,dt$ and Fubini to derive:
$$\begin{aligned}\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)(f-\hat f)\,dx\,dv&=\int_{t=0}^{+\infty}dt\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)\big(1_{t<f(x,v)}-1_{t<\hat f(x,v)}\big)\,dx\,dv\\&=\int_{t=0}^{+\infty}dt\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)\big(1_{\hat f(x,v)\le t<f(x,v)}-1_{f(x,v)\le t<\hat f(x,v)}\big)\,dx\,dv\\&=\int_{\ell=0}^{\infty}d\ell\int_{t=0}^{f^*(0,\ell)}dt\Big(\int_{S_{1,\ell}(t)}-\int_{S_{2,\ell}(t)}\Big)\Big(\frac{u^2}{2}+\phi_f(r)\Big)\,d\nu_\ell,\end{aligned}\tag{3.7}$$
where $d\nu_\ell$ is given by (1.12), and
$$S_{1,\ell}(t)=\{(r,u)\in\Omega_\ell,\ \hat f(r,u,\ell)\le t<f(r,u,\ell)\},\qquad S_{2,\ell}(t)=\{(r,u)\in\Omega_\ell,\ f(r,u,\ell)\le t<\hat f(r,u,\ell)\}.$$
We now use (2.73) in Lemma 2.10 to obtain: $\forall t\in(0,f^*(0,\ell))$,
$$\int_{S_{2,\ell}(t)}\Big(\frac{u^2}{2}+\phi_f(r)\Big)\,d\nu_\ell\le(f^*\circ a_\phi)^{-1}(t,\ell)\,\nu_\ell(S_{2,\ell}(t)),$$
where we recall that
$$\nu_\ell(S_{2,\ell}(t))=4\pi^2\int_{S_{2,\ell}(t)}1_{r^2u^2>\ell}\,(r^2u^2-\ell)^{-1/2}\,r|u|\,dr\,du.$$
We then observe from $\hat f\in\mathrm{Eq}(f)$ that: for a.e. $t>0$, $\nu_\ell(S_{1,\ell}(t))=\nu_\ell(S_{2,\ell}(t))$, and deduce
$$\int_{S_{2,\ell}(t)}\Big(\frac{u^2}{2}+\phi_f(r)\Big)\,d\nu_\ell\le\int_{S_{1,\ell}(t)}(f^*\circ a_\phi)^{-1}(t,\ell)\,d\nu_\ell.$$
Injecting this into (3.7) and using (2.74) yields:
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)(f-\hat f)\,dx\,dv\ge\int_{\ell=0}^{\infty}d\ell\int_{t=0}^{f^*(0,\ell)}dt\int_{S_{1,\ell}(t)}\Big(\frac{u^2}{2}+\phi_f(r)-(f^*\circ a_\phi)^{-1}(t,\ell)\Big)\,d\nu_\ell\ge0$$
and the analogous inequality for $S_{2,\ell}(t)$:
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_f\Big)(f-\hat f)\,dx\,dv\ge\int_{\ell=0}^{\infty}d\ell\int_{t=0}^{f^*(0,\ell)}dt\int_{S_{2,\ell}(t)}\Big((f^*\circ a_\phi)^{-1}(t,\ell)-\frac{u^2}{2}-\phi_f(r)\Big)\,d\nu_\ell\ge0.$$
Moreover, assume that $\int_{\mathbb{R}^6}\big(\frac{|v|^2}{2}+\phi_f(x)\big)(f-\hat f)\,dx\,dv=0$. Recalling that $\nu_\ell(S_{1,\ell}(t))=\nu_\ell(S_{2,\ell}(t))=0$ for $t>f^*(0,\ell)$, the above two chains of inequalities imply that for a.e. $t,\ell>0$, either $\nu_\ell(S_{1,\ell}(t))=\nu_\ell(S_{2,\ell}(t))=0$ or $\nu_\ell(S_{1,\ell}(t))=\nu_\ell(S_{2,\ell}(t))>0$ with:
$$\frac{u_1^2}{2}+\phi_f(r_1)=(f^*\circ a_\phi)^{-1}(t,\ell)=\frac{u_2^2}{2}+\phi_f(r_2),$$
for a.e. $(r_1,u_1)\in S_{1,\ell}(t)$, a.e. $(r_2,u_2)\in S_{2,\ell}(t)$, which contradicts the fact that $\hat f(r_1,u_1,\ell)\le t<\hat f(r_2,u_2,\ell)$. We conclude that for a.e. $t,\ell>0$, $\nu_\ell(S_{1,\ell}(t))=\nu_\ell(S_{2,\ell}(t))=0$, which implies $f=\hat f$. This concludes the proof of (3.6) and of Proposition 3.1.

3.2. Reduction to a variational problem on $\phi$ and proof of Theorem 1.3. We now claim the following local coercivity property of the functional of $\phi$ given by (3.3). To ease notations, we let for $\phi\in\Phi_{rad}$:
$$\mathcal{J}(\phi)=\mathcal{J}_{Q^*}(\phi)=\mathcal{H}(Q^{*\phi})+\frac12\int_{\mathbb{R}^3}|\nabla\phi-\nabla\phi_{Q^{*\phi}}|^2.\tag{3.8}$$
Proposition 3.2 ($\phi_Q$ is a local strict minimizer of $\mathcal{J}$). There exists a constant $C_0>0$ such that the following holds. For all $R>0$, there exists $\delta_0(R)\in]0,\frac12|\nabla\phi_Q|_{L^2}]$ such that, for all $f\in\mathcal{E}_{rad}$ satisfying
$$|f-Q|_{\mathcal{E}}\le R,\qquad|\nabla\phi_f-\nabla\phi_Q|_{L^2}\le\delta_0(R),$$
we have
$$\mathcal{J}(\phi_f)-\mathcal{J}(\phi_Q)\ge C_0|\nabla\phi_f-\nabla\phi_Q|_{L^2}^2.\tag{3.9}$$


The proof of this proposition essentially relies on Antonov's coercivity property and is postponed to Sect. 4. Theorem 1.3 is now a straightforward consequence of Propositions 3.1 and 3.2.

Proof of Theorem 1.3. Let $R>0$ and $f\in\mathcal{E}_{rad}\cap\mathrm{Eq}(Q)$ satisfying (1.18), where $\delta_0(R)$ is as in Proposition 3.2. In particular, note that
$$|\nabla\phi_f-\nabla\phi_Q|_{L^2}\le\frac12|\nabla\phi_Q|_{L^2}$$
implies that $\phi_f\ne0$ and $f\ne0$. Then the monotonicity property (3.4), $f^*=Q^*$ and (3.3) yield:
$$\mathcal{H}(f)-\mathcal{H}(Q)\ge\mathcal{J}_{f^*}(\phi_f)-\mathcal{H}(Q)=\mathcal{J}(\phi_f)-\mathcal{H}(Q).\tag{3.10}$$
On the other hand, recall from Corollary 2.9 that our assumption on the ground state $Q$ ensures
$$\hat Q=Q^{*\phi_Q}=Q\quad\text{and thus}\quad\mathcal{H}(Q)=\mathcal{J}(\phi_Q).$$
Injecting this together with (3.9) into (3.10) yields:
$$\mathcal{H}(f)-\mathcal{H}(Q)\ge\mathcal{J}(\phi_f)-\mathcal{J}(\phi_Q)\ge C_0|\nabla\phi_f-\nabla\phi_Q|_{L^2}^2,\tag{3.11}$$
this is (1.19). If in addition $\mathcal{H}(f)=\mathcal{H}(Q)$, then $\phi_f=\phi_Q$ and hence using $f^*=Q^*$:
$$\mathcal{H}(f^{*\phi_f})=\mathcal{H}(Q^{*\phi_f})=\mathcal{H}(Q^{*\phi_Q})=\mathcal{H}(Q)=\mathcal{H}(f).$$
We thus are in the case of equality of Proposition 3.1 from which:
$$f=f^{*\phi_f}=f^{*\phi_Q}=Q^{*\phi_Q}=Q.$$
This concludes the proof of Theorem 1.3.
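The monotonicity (3.4) used in this proof rests on the bathtub-type inequality (3.6). A discrete caricature may help fix ideas; it is only a sketch under assumptions that are not in the paper: equal-weight cells, a single momentum level, pairwise distinct energies, and hypothetical function names. Rearranging the values so that the largest sits on the lowest energy can only decrease the sum of energy times value.

```python
def rearrange_along_energy(energy, values):
    """Put the largest value on the lowest energy cell, the next largest on
    the next lowest energy, and so on (a discrete stand-in for f -> f^{*phi})."""
    order = sorted(range(len(energy)), key=lambda i: energy[i])
    decreasing = sorted(values, reverse=True)
    rearranged = [0.0] * len(values)
    for rank, cell_index in enumerate(order):
        rearranged[cell_index] = decreasing[rank]
    return rearranged


def energy_gap(energy, values):
    """Discrete analogue of the integral in (3.6): sum of e * (f - f_hat)."""
    rearranged = rearrange_along_energy(energy, values)
    return sum(e * (v - h) for e, v, h in zip(energy, values, rearranged))


# Illustrative data (assumed): negative microscopic energies, as on the
# support of the rearranged function.
ENERGY = [-3.0, -1.0, -2.0, -0.5]
VALUES = [0.0, 2.0, 1.0, 3.0]
VALUES_HAT = rearrange_along_energy(ENERGY, VALUES)
```

Here `VALUES_HAT` has the same entries as `VALUES` (equimeasurability), the gap is strictly positive, and it vanishes when the input is already rearranged, mirroring the equality case $f=\hat f$.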

3.3. Compactness of minimizing sequences. We are now in position to prove Theorem 1.4.

Proof of Theorem 1.4. The key to extract compactness is the monotonicity formula (3.11) which yields a lower bound on the Hamiltonian involving the Poisson field $\phi_f$ only, while standard Sobolev embeddings ensure that $\phi_f$ enjoys nice compactness properties in the radial setting.

Step 1. Weak convergence in $L^p$, $p>1$. Let
$$R=|Q|_{\mathcal{E}}+C(1+|\nabla\phi_Q|_{L^2})^{4/3}|Q|_{L^1}^{7/9}|Q|_{L^\infty}^{2/9}+|Q|_{L^1}+|Q|_{L^\infty},\tag{3.12}$$
where $C$ is the constant in the interpolation inequality (2.66). Let $f_n\in\mathcal{E}_{rad}$ be a sequence satisfying (1.21), (1.22), where $\delta$ will be fixed further, satisfying in particular
$$\delta\le\min\Big(1,\frac12|\nabla\phi_Q|_{L^2}\Big).\tag{3.13}$$


Observe that (1.21) and (3.13) imply $\phi_{f_n}\ne0$. The sequence $f_n^*$ is bounded in $L^1$ by (1.22), so $f_n$ is itself bounded in $L^1$. Moreover, from $\mathcal{H}(f_n)<C$, the $L^\infty$ bound of $f_n$ and the interpolation inequality (2.10), $|v|^2f_n$ is uniformly bounded in $L^1$. Hence $f_n$ is bounded in the energy space $\mathcal{E}_{rad}$. We then get:
$$f_n\rightharpoonup f\in\mathcal{E}_{rad}\quad\text{in } L^p\text{ for all }1<p<+\infty,\tag{3.14}$$
up to a subsequence. Moreover, by a standard consequence of interpolation, Sobolev embeddings and elliptic regularity, we have
$$|\nabla\phi_{f_n}-\nabla\phi_f|_{L^2}\to0\quad\text{and}\quad|\phi_{f_n}-\phi_f|_{L^\infty}\to0\quad\text{as } n\to+\infty.\tag{3.15}$$
From assumptions (1.21) and (1.22):
$$|\nabla\phi_f-\nabla\phi_Q|_{L^2}\le\delta.\tag{3.16}$$
In particular, $\phi_f\ne0$, since $\delta<|\nabla\phi_Q|_{L^2}$ from (3.13). Hence, by Proposition 2.8, we have
$$Q^{*\phi_f}\in\mathrm{Eq}(Q).\tag{3.17}$$
Step 2. Strong convergence in $\mathcal{E}$ of the sequence $Q^{*\phi_{f_n}}$. We now aim at extracting a preliminary compactness from $f_n$. Let
$$\tilde f_n=Q^{*\phi_{f_n}},\qquad\tilde f=Q^{*\phi_f},\tag{3.18}$$
and observe that $\tilde f_n$ is in fact a function of $\phi_{f_n}$. We then claim that the strong convergence (3.15) automatically implies some strong compactness in $\mathcal{E}$ for $\tilde f_n$:
$$(1+|v|^2)\tilde f_n\to(1+|v|^2)\tilde f\quad\text{in } L^1(\mathbb{R}^6).\tag{3.19}$$
We claim also that there exists $\delta_1(R)$ such that, for $0<\delta\le\delta_1(R)$ we have
$$|\nabla\phi_{\tilde f}-\nabla\phi_Q|_{L^2}\le\frac{\delta_0(R)}{2},\tag{3.20}$$
where $R$ is defined by (3.12) and $\delta_0(R)$ is defined in Theorem 1.3. We are now ready to fix the constant $\delta$ of Theorem 1.4 as follows:
$$\delta=\min\Big(1,\frac12|\nabla\phi_Q|_{L^2},\delta_1(R)\Big).$$
Proof of (3.19), (3.20). We first claim the a.e. convergence:
$$\tilde f_n\to\tilde f\quad\text{as } n\to+\infty\text{ for a.e. }(x,v)\in\mathbb{R}^6.\tag{3.21}$$
Indeed, let $(x,v)\in\mathbb{R}^6$ such that $|x\times v|^2=\ell>0$. If $e=\frac{|v|^2}{2}+\phi_f(x)<0$, then from (3.15), $\frac{|v|^2}{2}+\phi_{f_n}(x)<e/2$ for $n$ large enough and
$$\Big|\frac{|v|^2}{2}+\phi_{f_n}(x)\Big|=-\frac{|v|^2}{2}-\phi_{f_n}(x)\le-\phi_{f_n}(0)\le C.\tag{3.22}$$
We now recall from Lemma 2.4 that for all $\ell>0$:
$$a_{\phi_{f_n}}(e,\ell)\to a_{\phi_f}(e,\ell),\tag{3.23}$$


uniformly with respect to $e$ lying in a compact subset of $]-\infty,0[$. Therefore, from $\frac{|v|^2}{2}+\phi_{f_n}(x)<e/2<0$ and from (3.22),
$$a_{\phi_{f_n}}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x),|x\times v|^2\Big)\to a_{\phi_f}\Big(\frac{|v|^2}{2}+\phi_f(x),|x\times v|^2\Big)\quad\text{as } n\to+\infty.$$
Since, by Corollary 2.9, Lemma 2.3 and Assumption (A), the function $Q^*(\cdot,\ell)$ is continuous, this implies $Q^{*\phi_{f_n}}(x,v)\to Q^{*\phi_f}(x,v)$. Similarly, $\frac{|v|^2}{2}+\phi_f(x)>0$ implies $\frac{|v|^2}{2}+\phi_{f_n}(x)>0$ for $n$ large enough and thus $Q^{*\phi_{f_n}}(x,v)=Q^{*\phi_f}(x,v)=0$. Hence $Q^{*\phi_{f_n}}\to Q^{*\phi_f}$ a.e. in $\mathbb{R}^6$ and (3.21) is proved. Now recall from Proposition 2.8 and from $\phi_{f_n}\ne0$, $\phi_f\ne0$, that $\tilde f_n\in\mathrm{Eq}(Q)$ and $\tilde f\in\mathrm{Eq}(Q)$ so that
$$\forall n\ge1,\quad\int_{\mathbb{R}^6}\tilde f_n=\int_{\mathbb{R}^6}\tilde f=\int_{\mathbb{R}^6}Q.$$
The almost everywhere convergence of $\tilde f_n=Q^{*\phi_{f_n}}$ to $\tilde f$ and the fact that $|\tilde f_n|_{L^1}=|\tilde f|_{L^1}$ allow us to apply the Brézis-Lieb Lemma (see [33], Theorem 1.9) and get the strong $L^1$ convergence,
$$\tilde f_n\to\tilde f\quad\text{in } L^1\text{ as } n\to+\infty.\tag{3.24}$$
It remains to prove the strong convergence of the kinetic energy. Let us decompose
$$\tilde f_n=1_{|v|^2\le R}\,\tilde f_n+1_{|v|^2>R}\,\tilde f_n=g_{n,R}+h_{n,R}.$$
The $L^1$ convergence (3.24) implies: $\forall R>0$, $|v|^2g_{n,R}\to|v|^21_{|v|^2\le R}\,\tilde f$ in $L^1$. Consider the other term. We recall that $\tilde f_n=Q^{*\phi_{f_n}}$ is supported in the set $\frac{|v|^2}{2}+\phi_{f_n}(x)<0$. Hence, by interpolation,
$$\big||v|^2h_{n,R}\big|_{L^1}=\int|v|^2h_{n,R}(x,v)\,dx\,dv\le-2\int\phi_{f_n}(x)h_{n,R}(x,v)\,dx\,dv\lesssim|\nabla\phi_{f_n}|_{L^2}\,\big||v|^2h_{n,R}\big|_{L^1}^{1/4}\,|h_{n,R}|_{L^1}^{7/12}\,|Q|_{L^\infty}^{1/6},$$
which yields
$$\big||v|^2h_{n,R}\big|_{L^1}\le C\,|h_{n,R}|_{L^1}^{7/9}.$$
By writing
$$|h_{n,R}|_{L^1}\le|Q^{*\phi_{f_n}}-Q^{*\phi_f}|_{L^1}+\int_{|v|^2>R}Q^{*\phi_f}(x,v)\,dx\,dv,$$
we obtain that $\big||v|^2h_{n,R}\big|_{L^1}$ converges to 0 when $R\to+\infty$ and $n\to+\infty$ independently. This together with the convergence of $|v|^2g_{n,R}$ concludes the proof of (3.19).

We now turn to the proof of (3.20) and claim that it follows directly from (3.16) and the definition $\tilde f=Q^{*\phi_f}$. Indeed, arguing by contradiction, we extract a subsequence $\nabla\phi_n\to\nabla\phi_Q$ in $L^2$ and $g_n=Q^{*\phi_n}$ such that $|\nabla\phi_{g_n}-\nabla\phi_Q|_{L^2}\ge\frac{\delta_0(R)}{2}$. From (2.66), $g_n$ is a bounded sequence in $\mathcal{E}_{rad}$ and then the same proof as for (3.19) yields
$$(1+|v|^2)g_n\to(1+|v|^2)Q^{*\phi_Q}=(1+|v|^2)Q\quad\text{in } L^1$$
and hence $\nabla\phi_{g_n}\to\nabla\phi_Q$ in $L^2$, a contradiction. This concludes the proof of (3.20).


Step 3. Identification of the limit. Following (3.1), we let:
$$\hat f_n=f_n^{*\phi_{f_n}}.\tag{3.25}$$
We now claim that the variational characterization of $Q$ given by Theorem 1.3 and the monotonicity of Proposition 3.1 allow us to identify the limit:
$$\tilde f=Q\quad\text{and}\quad\phi_{\tilde f}=\phi_f=\phi_Q,\tag{3.26}$$
and to obtain the additional convergence:
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)\big(f_n-\hat f_n\big)\,dx\,dv\to0\quad\text{as } n\to+\infty.\tag{3.27}$$
Proof of (3.26), (3.27). First observe from (3.19), $|\tilde f_n|_{L^\infty}=|Q^*|_{L^\infty}$ and (2.10) that:
$$\mathcal{H}(\tilde f_n)\to\mathcal{H}(\tilde f),\qquad\nabla\phi_{\tilde f_n}\to\nabla\phi_{\tilde f}\ \text{in } L^2.\tag{3.28}$$
From (2.71),
$$|\hat f_n-\tilde f_n|_{L^1}=\int_{\mathbb{R}^6}|f_n^*-Q^*|\Big(a_{\phi_{f_n}}\Big(\frac{|v|^2}{2}+\phi_{f_n},|x\times v|^2\Big),|x\times v|^2\Big)1_{\frac{|v|^2}{2}+\phi_{f_n}(x)<0}\,dx\,dv=\int_0^{+\infty}\!\!\int_0^{+\infty}|f_n^*-Q^*|(s,\ell)\,ds\,d\ell\to0$$
as $n\to+\infty$, which holds from the assumption (1.22). Together with (3.19), this yields:
$$\hat f_n\to\tilde f\quad\text{in } L^1(\mathbb{R}^6).\tag{3.29}$$
We now invoke the identity (3.5) with $g=\tilde f_n$ to derive:
$$\mathcal{H}(\tilde f_n)-\mathcal{H}(Q)+\frac12|\nabla\phi_{f_n}-\nabla\phi_{\tilde f_n}|_{L^2}^2+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)(f_n-\hat f_n)\,dx\,dv=\mathcal{H}(f_n)-\mathcal{H}(Q)+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)(\tilde f_n-\hat f_n)\,dx\,dv.\tag{3.30}$$
Let us examine the various terms of this identity. From (3.28) and (3.20),
$$|\nabla\phi_{\tilde f_n}-\nabla\phi_Q|_{L^2}<\delta_0(R)\tag{3.31}$$
for $n$ large enough, where $\delta_0(R)$ is defined in Theorem 1.3. Moreover, from the definition (3.18) and from Proposition 2.8, we have the following estimates on $\tilde f_n$ and $\tilde f$:
$$|\tilde f_n|_{L^1}=|Q|_{L^1},\quad|\tilde f_n|_{L^\infty}=|Q|_{L^\infty},\quad\big||v|^2\tilde f_n\big|_{L^1}\le C|\nabla\phi_{f_n}|_{L^2}^{4/3}|Q|_{L^1}^{7/9}|Q|_{L^\infty}^{2/9},$$
$$|\tilde f|_{L^1}=|Q|_{L^1},\quad|\tilde f|_{L^\infty}=|Q|_{L^\infty},\quad\big||v|^2\tilde f\big|_{L^1}\le C|\nabla\phi_f|_{L^2}^{4/3}|Q|_{L^1}^{7/9}|Q|_{L^\infty}^{2/9},$$
where $C$ is the constant in the interpolation inequality (2.66). Since, by (3.13) and (1.21), we have
$$|\nabla\phi_{f_n}|_{L^2}\le1+|\nabla\phi_Q|_{L^2},$$


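we deduce from (3.15) and (3.12) that
$$|\tilde f_n-Q|_{\mathcal{E}}\le|Q|_{\mathcal{E}}+|\tilde f_n|_{\mathcal{E}}\le R,\qquad|\tilde f-Q|_{\mathcal{E}}\le|Q|_{\mathcal{E}}+|\tilde f|_{\mathcal{E}}\le R.\tag{3.32}$$
Therefore, from (3.31), (3.32) and $\tilde f_n\in\mathrm{Eq}(Q)$, the variational characterization of $Q$ given by Theorem 1.3 ensures:
$$\mathcal{H}(\tilde f_n)-\mathcal{H}(Q)\ge0.$$
Next, from (3.6):
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)(f_n-\hat f_n)\,dx\,dv\ge0.$$
We now claim that:
$$\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}\Big)(\tilde f_n-\hat f_n)\,dx\,dv\to0\quad\text{as } n\to+\infty.\tag{3.33}$$
Indeed, from (2.71):
$$\begin{aligned}\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)(\tilde f_n-\hat f_n)\,dx\,dv&=\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)(Q^*-f_n^*)\Big(a_{\phi_{f_n}}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x),|x\times v|^2\Big),|x\times v|^2\Big)1_{\frac{|v|^2}{2}+\phi_{f_n}<0}\,dx\,dv\\&=\int_0^{+\infty}\!\!\int_0^{+\infty}a^{-1}_{\phi_{f_n},\ell}(s)\,(Q^*-f_n^*)(s,\ell)\,ds\,d\ell,\end{aligned}$$
where we recall that $a^{-1}_{\phi_{f_n},\ell}$ is the inverse of the diffeomorphism $e\mapsto a_{\phi_{f_n}}(e,\ell)$. From Lemma 2.3, (2.5), and (3.15), we have $|a^{-1}_{\phi_{f_n},\ell}(s)|\le-e_{\phi_{f_n},\ell}\le-\phi_{f_n}(0)\le C$, and hence the $L^1$ convergence of $f_n^*$ to $Q^*$ yields (3.33). Finally, since (1.22) gives $\limsup_{n\to+\infty}\mathcal{H}(f_n)\le\mathcal{H}(Q)$, we deduce that all the nonnegative quantities in the left-hand side of (3.30) converge to 0:
$$\nabla\phi_{f_n}-\nabla\phi_{\tilde f_n}\to0\quad\text{in } L^2,\tag{3.34}$$
$$\mathcal{H}(\tilde f_n)\to\mathcal{H}(Q),\tag{3.35}$$
and (3.27) holds, and moreover:
$$\mathcal{H}(f_n)\to\mathcal{H}(Q).\tag{3.36}$$
Hence, (3.28) and (3.35) imply $\mathcal{H}(\tilde f)=\mathcal{H}(Q)$ and thus (3.20), (3.32) and Theorem 1.3 ensure:
$$\tilde f=Q\quad\text{and}\quad\phi_{\tilde f}=\phi_Q=\phi_f,$$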


where we used (3.15), (3.28) and (3.34) for the last identity. This concludes the proof of (3.26), (3.27).

Step 4. Strong convergence in $L^1$ of $f_n$ to $Q$. We first note that (3.29) and (3.26) imply $\hat f_n\to Q$ in $L^1$. We then claim that the extra gain (3.27) -which is again a consequence of the monotonicity (3.4)- allows us to identify $Q$ as the limit of $f_n$. We indeed claim that (3.27) and $|\hat f_n-Q|_{L^1}\to0$ imply
$$\int_{\mathbb{R}^6}(Q-f_n)_+\,dx\,dv\to0\quad\text{as } n\to+\infty.\tag{3.37}$$
Let us assume (3.37) and conclude the proof of Theorem 1.4. We first claim that (3.37), together with $\hat f_n\to Q$ in $L^1$, imply
$$\int_{\mathbb{R}^6}(f_n-Q)_+\,dx\,dv\to0\quad\text{as } n\to+\infty.\tag{3.38}$$
Indeed we observe that for all $g,h\in L^1(\mathbb{R}^6)$ with $g\ge0$, $h\ge0$, we have
$$\int_0^{+\infty}\mathrm{meas}\{g\le t<h\}\,dt=\int_{\mathbb{R}^6}(h-g)_+\,dx\,dv\le|g-h|_{L^1},\tag{3.39}$$
and thus:
$$\begin{aligned}\int_{\mathbb{R}^6}(f_n-Q)_+\,dx\,dv&\le\int_{\mathbb{R}^6}(f_n-\hat f_n)_+\,dx\,dv+\int_{\mathbb{R}^6}(\hat f_n-Q)_+\,dx\,dv\\&\le\int_0^{+\infty}\mathrm{meas}\{\hat f_n\le t<f_n\}\,dt+|\hat f_n-Q|_{L^1}\\&=\int_0^{+\infty}\mathrm{meas}\{f_n\le t<\hat f_n\}\,dt+|\hat f_n-Q|_{L^1}\\&=\int_{\mathbb{R}^6}(\hat f_n-f_n)_+\,dx\,dv+|\hat f_n-Q|_{L^1}\\&\le\int_{\mathbb{R}^6}(Q-f_n)_+\,dx\,dv+\int_{\mathbb{R}^6}(\hat f_n-Q)_++|\hat f_n-Q|_{L^1}\\&\le\int_{\mathbb{R}^6}(Q-f_n)_+\,dx\,dv+2|\hat f_n-Q|_{L^1},\end{aligned}$$
where we repeatedly used (3.39) and the fact that $\hat f_n\in\mathrm{Eq}(f_n)$ implies
$$\forall t>0,\quad\mathrm{meas}\{\hat f_n\le t<f_n\}=\mathrm{meas}\{f_n\le t<\hat f_n\}.$$
As $\hat f_n\to Q$ in $L^1$, we then conclude that (3.37) implies (3.38). Finally adding (3.37) and (3.38) gives $|f_n-Q|_{L^1}\to0$ as $n\to+\infty$. Furthermore, (3.36) and the strong convergence $\nabla\phi_{f_n}\to\nabla\phi_Q$ in $L^2$ imply:
$$\big||v|^2f_n\big|_{L^1}\to\big||v|^2Q\big|_{L^1}\quad\text{as } n\to+\infty.$$
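The layer-cake identity (3.39), used repeatedly in the chain of inequalities of Step 4, can be checked by hand on step functions. The sketch below makes assumptions that are not in the paper: cells of equal measure `weight`, nonnegative integer values (so that the integral in $t$ reduces to a sum over integer levels), and hypothetical function names.

```python
def layer_cake_lhs(g, h, weight):
    """Sum over integer levels t of meas{g <= t < h}; for integer-valued step
    functions this equals the t-integral on the left of (3.39)."""
    top = max(h, default=0)
    return sum(weight * sum(1 for gi, hi in zip(g, h) if gi <= t < hi)
               for t in range(top))


def positive_part_integral(g, h, weight):
    """Integral of (h - g)_+ for step functions on equal-measure cells."""
    return sum(weight * max(hi - gi, 0) for gi, hi in zip(g, h))


def l1_distance(g, h, weight):
    """Discrete L^1 distance, the upper bound on the right of (3.39)."""
    return sum(weight * abs(hi - gi) for gi, hi in zip(g, h))


# Illustrative data (assumed): integer values on four cells of measure 0.25.
WEIGHT = 0.25
G_STEP = [0, 2, 5, 1]
H_STEP = [3, 2, 1, 4]
# A permutation of G_STEP, i.e. an equimeasurable function in this toy setting.
G_PERM = [5, 0, 1, 2]
```

When `G_PERM` is a permutation of `G_STEP`, the two positive parts have the same integral, which is the discrete form of $\mathrm{meas}\{\hat f_n\le t<f_n\}=\mathrm{meas}\{f_n\le t<\hat f_n\}$ integrated in $t$.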


Together with the a.e. convergence of $f_n$, this yields the strong convergence of $|v|^2f_n$ to $|v|^2Q$. Note that the uniqueness of the limit now implies the convergence of the whole sequence $f_n$, which completes the proof of (1.23).

Proof of (3.37). It is a consequence of (3.27) and of the convergence:
$$\hat f_n\to Q\quad\text{in } L^1.\tag{3.40}$$
We first claim that (3.27) remains true if one replaces $\phi_{f_n}$ by $\phi_Q$:
$$0\le T_n:=\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\big(f_n-f_n^{*\phi_Q}\big)\,dx\,dv\to0,\quad\text{as } n\to+\infty.\tag{3.41}$$
The fact that $T_n\ge0$ can be proved exactly in the same way as for (3.6), since $f_n$ and $f_n^{*\phi_Q}$ are equimeasurable. Let us now prove that $T_n\to0$. We observe from assumption (1.22) that $|f_n^{*\phi_Q}|_{L^1}=|f_n^*|_{L^1}=|f_n|_{L^1}\to|Q|_{L^1}$, and that $f_n^*(s,\ell)\to Q^*(s,\ell)$, for a.e. $s>0$, $\ell>0$. This implies that
$$f_n^{*\phi_Q}(x,v)\to Q(x,v),\quad\text{for a.e. }(x,v)\in\mathbb{R}^6.$$
As a consequence of the Brézis-Lieb Lemma (see [33], Theorem 1.9), we then get
$$|f_n^{*\phi_Q}-Q|_{L^1}\to0.\tag{3.42}$$
We now write
$$\begin{aligned}T_n-\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)\big(f_n-f_n^{*\phi_{f_n}}\big)\,dx\,dv&=\int_{\mathbb{R}^6}\big(\phi_Q-\phi_{f_n}\big)\big(f_n-f_n^{*\phi_Q}\big)\,dx\,dv+\int_{\mathbb{R}^6}\Big(\frac{|v|^2}{2}+\phi_{f_n}(x)\Big)\big(f_n^{*\phi_{f_n}}-f_n^{*\phi_Q}\big)\,dx\,dv\\&\le\int_{\mathbb{R}^6}|\phi_{f_n}-\phi_Q|\,\big|f_n-f_n^{*\phi_Q}\big|\,dx\,dv+\int_{\mathbb{R}^6}\Big(-\frac{|v|^2}{2}-\phi_{f_n}(x)\Big)_+\big|f_n^{*\phi_{f_n}}-f_n^{*\phi_Q}\big|\,dx\,dv\\&\le|\phi_{f_n}-\phi_Q|_{L^\infty}\,|f_n-f_n^{*\phi_Q}|_{L^1}+|\phi_{f_n}(0)|\,|f_n^{*\phi_{f_n}}-f_n^{*\phi_Q}|_{L^1}\\&\to0,\end{aligned}$$
where we have used the definition (2.63) of $f^{*\phi}$, the uniform convergence of the potential $\phi_{f_n}$, the boundedness of $f_n$ and $f_n^{*\phi_Q}$ in the energy space, and the $L^1$ convergences (3.40) and (3.42). Using $T_n\ge0$ and the convergence (3.27), we finally deduce that $T_n\to0$, and (3.41) is proved. Arguing as in the proof of (3.6), we write (3.41) in the following equivalent form
$$T_n=\int_{t=0}^{+\infty}dt\Big(\int_{S_1^n(t)}\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\,dx\,dv-\int_{S_2^n(t)}\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\,dx\,dv\Big)\to0,\tag{3.43}$$
where
$$S_1^n(t)=\{(x,v)\in\mathbb{R}^6,\ f_n^{*\phi_Q}(x,v)\le t<f_n(x,v)\},\qquad S_2^n(t)=\{(x,v)\in\mathbb{R}^6,\ f_n(x,v)\le t<f_n^{*\phi_Q}(x,v)\}.$$


Now from (2.74), we have
$$\frac{|v|^2}{2}+\phi_Q(x)\ge(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2),$$
if $(x,v)\in S_1^n(t)$. Thus
$$T_n\ge\int_{t=0}^{+\infty}dt\Big(\int_{S_1^n(t)}(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\,dx\,dv-\int_{S_2^n(t)}\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\,dx\,dv\Big).\tag{3.44}$$
As a consequence of the equimeasurability of $f_n^{*\phi_Q}$ and $f_n$, we claim that
$$\int_{S_1^n(t)}(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\,dx\,dv-\int_{S_2^n(t)}(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\,dx\,dv=0.\tag{3.45}$$
Indeed, we first use the change of variables $r=|x|$, $u=|v|$, $\ell=|x\times v|^2$, to get
$$\int_{S_1^n(t)}(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\,dx\,dv=\int_{\ell=0}^{\infty}\int_{S_{1,\ell}^n(t)}(f_n^*\circ a_{\phi_Q})^{-1}(t,\ell)\,d\nu_\ell(r,u)\,d\ell=\int_{\ell=0}^{\infty}(f_n^*\circ a_{\phi_Q})^{-1}(t,\ell)\,\nu_\ell(S_{1,\ell}^n(t))\,d\ell,$$
and the same identity holds for $S_2^n(t)$, where $\nu_\ell$ is given by (1.12), and
$$S_{1,\ell}^n(t)=\{(r,u)\in\Omega_\ell,\ f_n^{*\phi_Q}(r,u,\ell)\le t<f_n(r,u,\ell)\},\qquad S_{2,\ell}^n(t)=\{(r,u)\in\Omega_\ell,\ f_n(r,u,\ell)\le t<f_n^{*\phi_Q}(r,u,\ell)\}.$$
Since $f_n^{*\phi_Q}\in\mathrm{Eq}(f_n)$, we have:
$$\nu_\ell(S_{1,\ell}^n(t))=\nu_\ell(S_{2,\ell}^n(t))=4\pi^2\int_{S_{2,\ell}^n(t)}1_{r^2u^2>\ell}\,(r^2u^2-\ell)^{-1/2}\,r|u|\,dr\,du.$$
This implies (3.45) and then (3.44) gives:
$$T_n\ge\int_{t=0}^{+\infty}dt\int_{S_2^n(t)}\Big((f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\Big)\,dx\,dv.\tag{3.46}$$
Now from (2.73), we have
$$(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\ge\frac{|v|^2}{2}+\phi_Q(x),$$
for $(x,v)\in S_2^n(t)$. Thus, from (3.41) and (3.46), we get
$$A_n=\Big((f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\Big)1_{S_2^n(t)}(x,v)\to0,\tag{3.47}$$


as $n\to+\infty$, for almost every $(t,x,v)\in\mathbb{R}_+\times\mathbb{R}^3\times\mathbb{R}^3$. We now claim that this implies
$$B_n=\Big((Q^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-\Big(\frac{|v|^2}{2}+\phi_Q(x)\Big)\Big)1_{\bar S_2^n(t)}(x,v)\to0,\tag{3.48}$$
as $n\to+\infty$, for almost every $(t,x,v)\in\mathbb{R}_+\times\mathbb{R}^3\times\mathbb{R}^3$, where
$$\bar S_2^n(t)=\{(x,v)\in\mathbb{R}^6,\ f_n(x,v)\le t<Q(x,v)\}.$$
To prove (3.48), we write
$$S_2^n=\big(S_2^n\setminus\bar S_2^n\big)\cup\big(S_2^n\cap\bar S_2^n\big),\qquad\bar S_2^n=\big(\bar S_2^n\setminus S_2^n\big)\cup\big(S_2^n\cap\bar S_2^n\big),$$
and get
$$\begin{aligned}A_n-B_n={}&\Big(\frac{|v|^2}{2}+\phi_Q(x)-(Q^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\Big)1_{\bar S_2^n(t)\setminus S_2^n(t)}\\&+\Big((f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-\frac{|v|^2}{2}-\phi_Q(x)\Big)1_{S_2^n(t)\setminus\bar S_2^n(t)}\\&+\Big[(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-(Q^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\Big]1_{S_2^n(t)\cap\bar S_2^n(t)}.\end{aligned}\tag{3.49}$$
We shall now examine the behavior of each of these terms when $n\to\infty$. We first observe from (3.39) and (3.42) that
$$\int_0^{+\infty}\mathrm{meas}\big(S_2^n(t)\setminus\bar S_2^n(t)\big)\,dt\le|f_n^{*\phi_Q}-Q|_{L^1}\to0,$$
which implies (up to a subsequence extraction)
$$1_{S_2^n(t)\setminus\bar S_2^n(t)}\longrightarrow0,\quad\text{for a.e. }(t,x,v)\in\mathbb{R}_+\times\mathbb{R}^3\times\mathbb{R}^3.$$
Using in addition the estimate
$$\big|(f_n^*\circ a_{\phi_Q})^{(-1)}(t,|x\times v|^2)\big|\le|e_{\phi_Q,|x\times v|^2}|\le|\phi_Q(0)|,$$
we deduce that the first two terms of the decomposition (3.49) go to 0 when $n$ goes to infinity, for almost every $(t,x,v)\in\mathbb{R}_+\times\mathbb{R}^3\times\mathbb{R}^3$. We now treat the third term and show that
$$q_0=\liminf_{n\to\infty}\Big[(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)-(Q^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)\Big]1_{S_2^n(t)\cap\bar S_2^n(t)}\ge0,\tag{3.50}$$
for almost every $(t,x,v)$. To prove (3.50), one may assume that $1_{S_2^n(t)\cap\bar S_2^n(t)}(x,v)=1$ for $n$ large enough, $(t,x,v)$ being fixed, otherwise $q_0=0$ and (3.50) is proved. Let us also recall from standard arguments that the strong $L^1$ convergence (1.22) together with the monotonicity of $f_n^*$ in $e$ and the continuity of $Q^*$ in $e$ ensure:
$$\text{a.e. }\ell>0,\ \forall e\in(e_{\phi_Q,\ell},0),\quad f_n^*(a_{\phi_Q}(e,\ell),\ell)\to Q^*(a_{\phi_Q}(e,\ell),\ell).$$


Hence, from (1.10), we deduce that for a.e. $(x,v)\in\mathbb{R}^6$, we have
$$\forall e\in(e_{\phi_Q,\ell},0),\quad f_n^*(a_{\phi_Q}(e,\ell),\ell)\to Q^*(a_{\phi_Q}(e,\ell),\ell),\quad\text{where }\ell=|x\times v|^2>0.\tag{3.51}$$
Let then $(t,x,v)$ be fixed such that $1_{S_2^n(t)\cap\bar S_2^n(t)}(x,v)=1$ for $n$ large enough and (3.51) holds. From
$$Q(x,v)=Q^*\Big(a_{\phi_Q}\Big(\frac{|v|^2}{2}+\phi_Q(x),\ell\Big),\ell\Big)>t$$
and from the continuity of $Q^*(\cdot,\ell)$, we deduce that
$$(Q^*\circ a_{\phi_Q})^{-1}(t,\ell)=\sup\{e\in]e_{\phi_Q,\ell},0[:\ Q^*(a_{\phi_Q}(e,\ell),\ell)>t\}.\tag{3.52}$$
Take now any $e$ such that
$$e_{\phi_Q,\ell}<e<0\quad\text{and}\quad Q^*(a_{\phi_Q}(e,\ell),\ell)>t,\tag{3.53}$$
then from (3.51): $f_n^*(a_{\phi_Q}(e,\ell),\ell)>t$, for $n$ large enough. Using the definition of the pseudo-inverse given in Lemma 2.10, we then get $e\le(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2)$ for $n$ large enough, and hence
$$e\le\liminf_{n\to\infty}(f_n^*\circ a_{\phi_Q})^{-1}(t,|x\times v|^2).$$

and (3.50) is proved. We now turn to the decomposition (3.49) and get from (3.50), lim inf(An − Bn ) ≥ 0, for a.e. (t, x, v). Finally, observing that Bn ≥ 0 and using (3.47), we conclude that (3.48) holds true: |v|2 − φ Q (x) 1{ fn ≤t

|v|2 + φ Q (x), |x × v|2 2

> t.

By Assumption (A) and Corollary 2.9, $e\mapsto F(e,|x\times v|^2)$ is continuous and strictly decreasing with respect to $e=\frac{|v|^2}{2}+\phi_Q(x)$ for $(x,v)\in\{Q>0\}$, and thus: $t<Q(x,v)$ implies
$$(Q^*\circ a_{\phi_Q})^{(-1)}(t,|x\times v|^2)-\frac{|v|^2}{2}-\phi_Q(x)>0.$$


We then deduce from (3.54) that 1{ fn ≤t
0

we may apply the dominated convergence theorem to conclude: ∞ 1{ fn ≤t
0

Injecting this into (3.39) yields (3.37). This concludes the proof of Theorem 1.4.

3.4. Nonlinear stability of $Q$. We now turn to the proof of Theorem 1.5 which is a direct consequence of Theorem 1.4 and the known regularity of strong solutions to the Vlasov-Poisson system.

Proof of Theorem 1.5. Let $f_0\in\mathcal{E}_{rad}\cap C_c^1$ and let $f(t)\in\mathcal{E}_{rad}$ be the corresponding global strong solution to (1.1). Then from the properties of the flow of the Vlasov-Poisson system (1.1),
$$\nabla\phi_f\in C([0,+\infty),L^2(\mathbb{R}^3))\tag{3.55}$$
holds, and
$$\forall t\ge0,\quad f(t)\in\mathrm{Eq}(f_0),\quad\mathcal{H}(f(t))=\mathcal{H}(f_0).\tag{3.56}$$
Note that $f(t)\in\mathrm{Eq}(f_0)$ means the equality of the symmetric rearrangements at given $\ell$: for all $t\ge0$, $f(t)^*=f_0^*$. Recall also from the property of contractivity of the symmetric rearrangement (2.62) that:
$$|f_0^*-Q^*|_{L^1}\le|f_0-Q|_{L^1}.\tag{3.57}$$
Remark finally that the following inequality can be proved by interpolation: for all $f\in\mathcal{E}_{rad}$ satisfying $\mathcal{H}(f)\le\mathcal{H}(Q)+1$, $|f|_{L^1}\le|Q|_{L^1}+1$ and $|f|_{L^\infty}\le M$, we have
$$|\nabla\phi_f-\nabla\phi_Q|_{L^2}\le C_Q|f-Q|_{L^1}^{7/12}.\tag{3.58}$$
Let us fix $\varepsilon_0>0$ such that
$$C_Q\,\varepsilon_0^{7/12}=\frac{\delta}{2},\tag{3.59}$$
where $C_Q$ is the constant in (3.58) and $\delta$ is as in Theorem 1.4. An equivalent reformulation of Theorem 1.4 is the following: for all $\varepsilon>0$, there exists $\eta$ satisfying
$$0<\eta\le\min(\varepsilon_0,1)\tag{3.60}$$


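such that the following holds true: if $f \in \mathcal{E}_{rad}$ is such that
\[ |f^* - Q^*|_{L^1} < \eta, \quad |f|_{L^\infty} < M, \quad \mathcal{H}(f) < \mathcal{H}(Q) + \eta \tag{3.61} \]
and
\[ |\nabla\phi_f - \nabla\phi_Q|_{L^2} < \delta, \tag{3.62} \]
then we have
\[ |f - Q|_{L^1} < \min(\varepsilon, \varepsilon_0), \quad |f|_{L^\infty} < M, \quad \big||v|^2 (f - Q)\big|_{L^1} < \varepsilon. \tag{3.63} \]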

Let $f_0 \in \mathcal{E}_{rad} \cap C_c^1$ satisfy (1.24). From (3.56) and (3.57), we first deduce that the corresponding solution $f(t)$ of (1.1) satisfies (3.61) for all $t \ge 0$. Hence, if we prove that
\[ \forall t \ge 0, \quad |\nabla\phi_{f(t)} - \nabla\phi_Q|_{L^2} < \delta, \tag{3.64} \]
then $f(t)$ satisfies (3.63) for all $t \ge 0$, which is nothing but (1.25). Therefore, it remains to prove (3.64). By (1.24) and (3.60), the initial data $f(0) = f_0$ satisfies $|f(0) - Q|_{L^1} < \min(\varepsilon_0, 1)$, $|f(0)|_{L^\infty} < M$ and $\mathcal{H}(f(0)) < \mathcal{H}(Q) + \eta$, thus (3.58) and (3.59) imply that $|\nabla\phi_{f(0)} - \nabla\phi_Q|_{L^2} < \delta/2$. Now, assume that (3.64) is not true. Then there exists $t_1 > 0$ such that $|\nabla\phi_{f(t_1)} - \nabla\phi_Q|_{L^2} \ge \delta$ and, by the continuity property (3.55), there exists $t_2 > 0$ such that
\[ |\nabla\phi_{f(t_2)} - \nabla\phi_Q|_{L^2} = \frac{2\delta}{3}. \tag{3.65} \]
Hence, since the function $f(t_2)$ satisfies (3.61) and (3.62), it satisfies (3.63). Therefore, we have $|f(t_2) - Q|_{L^1} < \varepsilon_0$, $|f(t_2)|_{L^\infty} \le M$ and $\mathcal{H}(f(t_2)) \le \mathcal{H}(Q) + \eta$, thus (3.58) and (3.59) imply that $|\nabla\phi_{f(t_2)} - \nabla\phi_Q|_{L^2} \le \delta/2$, which contradicts (3.65). The proof of Theorem 1.5 is complete.

4. Study of the Reduced Functional J

This section is devoted to the proof of Proposition 3.2, which requires a detailed study of the reduced functional $\mathcal{J}$ defined by (3.8). In particular, we aim at proving that $\mathcal{J}$ is twice differentiable at $\phi_Q$, that $\phi_Q$ is a critical point and that the Hessian at $\phi_Q$ is positive definite. Remarkably enough, this last fact holds because the Hessian of $\mathcal{J}$ is deeply connected to the structure of the linearized transport operator close to $\phi_Q$, which is explicit in the radial setting. The coercivity of the corresponding Hartree-Fock exchange operator [41] then follows from Antonov's celebrated coercivity property [3,4].


4.1. Antonov's coercivity and the structure of the linearized transport operator. Let us start by describing some properties of the linearized transport operator generated by $\phi_Q$, which is deeply connected to the structure of the Hessian of $\mathcal{J}$, see Proposition 4.3. These properties rely on standard abstract functional analysis results and a remarkable coercivity property due to Antonov [3,4].

Let $F_e$ denote the partial derivative with respect to $e$ of the function $F(e,\ell) = Q^* \circ a_{\phi_Q}(e,\ell)$ defined in Assumption (A). Abusing notation, we will write, when no confusion is possible:
\[ F_e = F_e(x,v) = \frac{\partial F}{\partial e}\Big(\frac{|v|^2}{2} + \phi_Q(x),\ |x\times v|^2\Big). \tag{4.1} \]
Denote
\[ \Omega = \big\{(x,v) \in \mathbb{R}^6 :\ Q(x,v) > 0\big\}. \tag{4.2} \]
At any $(x,v) \in \Omega$, we have $\big(\frac{|v|^2}{2} + \phi_Q(x), |x\times v|^2\big) \in \mathcal{O}$, where $\mathcal{O}$ is defined in Assumption (A), hence $F_e(x,v) < 0$. Moreover, the function $(x,v) \mapsto F_e(x,v)$ is continuous on $\Omega$. We now consider the weighted $L^2$ Hilbert space
\[ L^{2,r}_{|F_e|} = \Big\{ f \in L^1_{loc}(\Omega) \text{ spherically symmetric with } \int \frac{f^2}{|F_e|}\,dx\,dv < +\infty \Big\} \]
and introduce an orthogonal decomposition
\[ L^{2,r}_{|F_e|} = L^{2,even}_{|F_e|} \oplus L^{2,odd}_{|F_e|}, \]
where
\[ L^{2,even}_{|F_e|} = \big\{ f \in L^{2,r}_{|F_e|} \text{ with } f(x,-v) = f(x,v) \big\}, \qquad L^{2,odd}_{|F_e|} = \big\{ f \in L^{2,r}_{|F_e|} \text{ with } f(x,-v) = -f(x,v) \big\}. \]
We then consider the unbounded transport operator
\[ Tf = v\cdot\nabla_x f - \nabla\phi_Q\cdot\nabla_v f, \qquad D(T) = \big\{ f \in L^{2,r}_{|F_e|},\ Tf \in L^{2,r}_{|F_e|} \big\}. \]
Note that $C_c^\infty(\Omega) \subset D(T)$ is dense in $L^{2,r}_{|F_e|}$ and hence $D(T)$ is dense in $L^{2,r}_{|F_e|}$. We claim the following properties of $T$:

Proposition 4.1 (Properties of T). (i) Structure of the kernel: $iT$ is a self-adjoint operator with kernel
\[ N(T) = \Big\{ f \in L^{2,r}_{|F_e|} \text{ of the form } f(x,v) = \tilde f\Big(\frac{|v|^2}{2} + \phi_Q(x),\ |x\times v|^2\Big) \Big\}. \tag{4.3} \]


(ii) Coercivity of the Antonov functional: The Antonov functional
\[ \mathcal{A}(g,g) := \int \frac{g^2}{|F_e|}\,dx\,dv - |\nabla\phi_g|_{L^2}^2 \tag{4.4} \]
is continuous on $L^{2,r}_{|F_e|}$. Moreover,
\[ \forall \xi \in D(T) \cap L^{2,odd}_{|F_e|}, \quad \mathcal{A}(T\xi, T\xi) \ge \int \frac{\xi^2}{|F_e|}\,\frac{\phi_Q'(r)}{r}\,dx\,dv. \tag{4.5} \]
(iii) Let $g \in [N(T)]^\perp \cap L^{2,even}_{|F_e|}$. Then $\mathcal{A}(g,g) \ge 0$, and $\mathcal{A}(g,g) = 0$ if and only if $g = 0$.

Proof. Step 1. Description of the kernel. Property (i) relies on the integration of the characteristic equations associated with $Tf = 0$ and is a standard consequence of the integrability of Newton's equation with a central force field in radial symmetry. The proof is similar to that of Jeans' theorem in [8]; see also [19].

Step 2. Proof of (ii). Let $g \in C_c^0(\Omega)$. We integrate by parts to get:
\[ |\nabla\phi_g|_{L^2}^2 = -\int_{\mathbb{R}^6} g(x,v)\,\phi_g(x)\,dx\,dv \le |g|_{L^{2,r}_{|F_e|}} \Big( \int_{\mathbb{R}^6} (\phi_g(x))^2\, |F_e|\,dx\,dv \Big)^{1/2} \lesssim |g|_{L^{2,r}_{|F_e|}}\, |\nabla\phi_g|_{L^2}, \]
where we used (C.1) proved in the Appendix. The density of $C_c^0(\Omega)$ in $L^{2,r}_{|F_e|}$ allows us to extend this estimate:
\[ \forall g \in L^{2,r}_{|F_e|}, \quad |\nabla\phi_g|_{L^2} \lesssim |g|_{L^{2,r}_{|F_e|}}, \]
and the continuity of (4.4) on $L^{2,r}_{|F_e|}$ follows.

Antonov's coercivity property is now the following claim:
\[ \forall \xi \in C_c^\infty(\Omega) \cap L^{2,odd}_{|F_e|}, \quad \mathcal{A}(T\xi, T\xi) \ge \int \frac{\xi^2}{|F_e|}\,\frac{\phi_Q'(r)}{r}\,dx\,dv. \tag{4.6} \]
In the case where the function $F$ depends only on $e = |v|^2/2 + \phi(x)$, a proof of this inequality can be found in [25,23,46,52]. In our context $F$ depends on $e = |v|^2/2 + \phi(x)$ and $\ell = |x\times v|^2$, and for the sake of clarity and completeness we give a proof of this inequality in Appendix B, which is a simple extension of the proof in [23]. Let us extend this estimate to all $\xi \in D(T) \cap L^{2,odd}_{|F_e|}$ using standard regularization arguments. Let $\xi \in D(T) \cap L^{2,odd}_{|F_e|}$ and assume first that $\mathrm{Supp}(\xi) \subset \Omega$. From the continuity of $F_e$, we deduce that $F_e(x,v) \le \delta < 0$ for all $(x,v) \in \mathrm{Supp}(\xi)$. Let $\zeta_n(x,v) = \frac{1}{n^6}\,\zeta\big(\frac{|x|}{n}, \frac{|v|}{n}\big) \in C_c^\infty(\mathbb{R}^6)$ with $\zeta \ge 0$ be a mollifying sequence; then, from standard regularization arguments,
\[ \zeta_n \xi \to \xi, \quad \zeta_n (T\xi) \to T\xi \quad \text{in } L^2_{|F_e|} \text{ as } n \to +\infty, \]


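and
\[ T(\zeta_n \xi) \to T\xi \quad \text{in } L^2_{|F_e|} \text{ as } n \to +\infty. \]
Antonov's coercivity property applied to $\zeta_n \xi \in C_c^\infty(\Omega) \cap L^{2,odd}_{|F_e|}$, the continuity of $\mathcal{A}$ on $L^2_{|F_e|}$ and the boundedness of $\frac{\phi_Q'(r)}{r}$ yield the claim. Consider now a general $\xi \in D(T) \cap L^{2,odd}_{|F_e|}$. We let $\tilde\chi_n$ be a $C^\infty$ function such that
\[ \begin{cases} \tilde\chi_n(s) = 0 & \text{for } s \le \frac{1}{2n}, \\ \tilde\chi_n \text{ increasing} & \text{on } \big[\frac{1}{2n}, \frac{1}{n}\big], \\ \tilde\chi_n(s) = 1 & \text{for } s \ge \frac{1}{n}, \end{cases} \tag{4.7} \]
and we set
\[ \chi_n(x,v) = \tilde\chi_n(Q(x,v)). \tag{4.8} \]
Then $\chi_n$ is a $C^1$ function with compact support in $\Omega$, satisfying $T\chi_n = 0$. Therefore $\chi_n \xi \in L^{2,odd}_{|F_e|}$ has compact support in $\Omega$ and
\[ T(\chi_n \xi) = \chi_n\, T\xi \to T\xi \quad \text{in } L^2_{|F_e|}, \]
and hence the previous step and the continuity of $\mathcal{A}$ on $L^2_{|F_e|}$ yield (4.5).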
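The identity $T(\chi_n\xi) = \chi_n T\xi$ used in this truncation step can be checked in one line. The sketch below assumes enough regularity to apply the Leibniz and chain rules, and uses only that $T$ is a first-order differential operator and that $Q$ is a stationary solution, i.e. $TQ = 0$:
\[ T(\chi_n \xi) = \chi_n\, T\xi + \xi\, T\chi_n, \qquad T\chi_n = \tilde\chi_n'(Q)\, TQ = 0. \]
The cutoff is therefore transparent to $T$ precisely because it is taken as a function of $Q$ itself rather than of $(x,v)$ directly.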

Step 3. Proof of (iii). We first observe that the transport operator exchanges parity in $v$:
\[ \forall \xi \in D(T), \quad \xi \in L^{2,odd}_{|F_e|} \Rightarrow T\xi \in L^{2,even}_{|F_e|}, \qquad \xi \in L^{2,even}_{|F_e|} \Rightarrow T\xi \in L^{2,odd}_{|F_e|}. \]
This implies:
\[ R\big(T|_{L^{2,odd}_{|F_e|}}\big) = R(T) \cap L^{2,even}_{|F_e|}. \tag{4.9} \]
On the other hand, $iT$ being self-adjoint, there holds (see Cor. II.17, p. 28 in [9]):
\[ \overline{R(T)} = N(T)^\perp. \tag{4.10} \]
Let $g \in [N(T)]^\perp \cap L^{2,even}_{|F_e|}$. From (4.9) and (4.10), we infer the existence of a sequence $\xi_n \in D(T) \cap L^{2,odd}_{|F_e|}$ such that
\[ T\xi_n \to g \quad \text{in } L^{2,r}_{|F_e|} \tag{4.11} \]
as $n \to +\infty$. Hence, from the continuity of the Antonov functional on $L^{2,r}_{|F_e|}$, we have
\[ \mathcal{A}(T\xi_n, T\xi_n) \to \mathcal{A}(g,g). \tag{4.12} \]
Moreover, by (4.5), we have
\[ \mathcal{A}(T\xi_n, T\xi_n) \ge \int \frac{\xi_n^2}{|F_e|}\,\frac{\phi_Q'(r)}{r}\,dx\,dv \ge 0. \tag{4.13} \]
Thus (4.12) and (4.13) imply $\mathcal{A}(g,g) \ge 0$.


Assume now that $\mathcal{A}(g,g) = 0$. Then (4.12) and (4.13) imply that
\[ \int \frac{\xi_n^2}{|F_e|}\,\frac{\phi_Q'(r)}{r}\,dx\,dv \to 0 \tag{4.14} \]
as $n \to +\infty$. Solving the Poisson equation in radial coordinates yields:
\[ r^2 \phi_Q'(r) = 4\pi \int_0^r \rho_Q(s)\, s^2\,ds. \]
Denote $r_0 = \inf_{(x,v)\in\Omega} |x|$. From the definition (4.2) of $\Omega$ and the continuity of $Q$, we have a sequence $r_j \to r_0$, $r_j > r_0$, such that $\rho_Q(r_j) > 0$. Hence, for all $r > r_0$, we have $r^2\phi_Q'(r) \ge r_j^2 \phi_Q'(r_j) > 0$ for $j$ large enough. Thus, the function $\frac{\phi_Q'(r)}{|F_e|\, r}$ is continuous and strictly positive on $\Omega$, and (4.14) implies that
\[ \xi_n \to 0 \quad \text{in } L^2_{loc}(\Omega). \]
Therefore, $T\xi_n \rightharpoonup 0$ in the distribution sense $\mathcal{D}'(\Omega)$ and, by (4.11), $g = 0$. This concludes the proof of Proposition 4.1.

A standard consequence of the explicit description of the kernel of $T$ given by (4.3) is that we can compute the projection onto $N(T)$; see [19] for related statements. For later use, we introduce the following homogeneous Sobolev space:
\[ \dot H^1_r = \big\{ h \in \dot H^1(\mathbb{R}^3) \text{ s.t. } h \text{ is radially symmetric} \big\}. \]

Lemma 4.2 (Projection onto the kernel of T). Let
\[ D = \big\{ (e,\ell) \in \mathbb{R}_-^* \times \mathbb{R}_+^* :\ e > e_{\phi_Q,\ell} \big\}, \tag{4.15} \]
where $e_{\phi_Q,\ell}$ is defined by (2.3). Given $h \in \dot H^1_r$, we define the projection operator:
\[ Ph(x,v) = \frac{\displaystyle\int_{r_1}^{r_2} \Big( e(x,v) - \phi_Q(r) - \frac{\ell(x,v)}{2r^2} \Big)^{-1/2} h(r)\,dr}{\displaystyle\int_{r_1}^{r_2} \Big( e(x,v) - \phi_Q(r) - \frac{\ell(x,v)}{2r^2} \Big)^{-1/2} dr}\ \mathbf{1}_{(e(x,v),\ell(x,v)) \in D}, \tag{4.16} \]
where $r_1 = r_1(\phi_Q, e(x,v), \ell(x,v))$ and $r_2 = r_2(\phi_Q, e(x,v), \ell(x,v))$ are defined by (2.6), (2.7), and where
\[ e(x,v) = \frac{|v|^2}{2} + \phi_Q(x), \qquad \ell(x,v) = |x\times v|^2. \]
Then:
\[ h F_e \in L^{2,r}_{|F_e|}, \qquad (Ph) F_e \in L^{2,r}_{|F_e|} \tag{4.17} \]
and
\[ (Ph)|F_e| \in N(T), \qquad (h - Ph) F_e \in [N(T)]^\perp \cap L^{2,even}_{|F_e|} \tag{4.18} \]
with $F_e$ given by (4.1). The proof is given in Appendix C.
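That functions of $(e,\ell)$ alone lie in the kernel of $T$, as stated in (4.3) and used in (4.18), can be verified directly. The computation below is a sketch assuming a smooth profile $\tilde f$ and a radial potential, so that $\nabla\phi_Q = \phi_Q'(r)\,x/r$; it uses the identity $|x\times v|^2 = |x|^2|v|^2 - (x\cdot v)^2$:
\[ Te = v\cdot\nabla\phi_Q - \nabla\phi_Q\cdot v = 0, \]
\[ v\cdot\nabla_x \ell = 2|v|^2 (x\cdot v) - 2(x\cdot v)|v|^2 = 0, \qquad \nabla\phi_Q\cdot\nabla_v \ell = \frac{\phi_Q'(r)}{r}\Big( 2|x|^2 (x\cdot v) - 2(x\cdot v)|x|^2 \Big) = 0, \]
\[ T\big[\tilde f(e,\ell)\big] = \partial_e \tilde f\; Te + \partial_\ell \tilde f\; T\ell = 0. \]
Since $Ph$ in (4.16) and $F_e$ in (4.1) depend on $(x,v)$ only through $(e,\ell)$, this is consistent with $(Ph)|F_e| \in N(T)$.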


4.2. Differentiability of J. Our aim in this section is to prove the differentiability of $\mathcal{J}$ at $\phi_Q$ and to compute the first two derivatives. We shall in particular exhibit an intimate link between the Hessian of $\mathcal{J}$ and the projection operator (4.16).

Proposition 4.3 (Differentiability of J). The functional $\mathcal{J}$ defined by (3.8) on $\Phi_{rad}$ satisfies the following properties.

(i) Differentiability of $\mathcal{J}$. Let $\phi = \phi_f \in \Phi_{rad}$ and $\tilde\phi = \phi_{\tilde f} \in \Phi_{rad}$, both nonzero. Then the function
\[ \lambda \mapsto \mathcal{J}(\phi + \lambda(\tilde\phi - \phi)) \]
is twice differentiable on $[0,1]$.

(ii) Taylor expansion of $\mathcal{J}$ near $\phi_Q$. Let $R > 0$ and
\[ f \in B_R := \{ g \in \mathcal{E}_{rad} \text{ such that } |g - Q|_{\mathcal{E}} < R \}. \tag{4.19} \]
Then we have the following Taylor expansion near $\phi_Q$:
\[ \mathcal{J}(\phi_f) - \mathcal{J}(\phi_Q) = \frac{1}{2} D^2\mathcal{J}(\phi_Q)(\phi_f - \phi_Q, \phi_f - \phi_Q) + \varepsilon_R(\phi_f)\, |\nabla\phi_f - \nabla\phi_Q|_{L^2}^2, \tag{4.20} \]
where $\varepsilon_R(\phi_f) \to 0$ as $|\nabla\phi_f - \nabla\phi_Q|_{L^2} \to 0$ with $f \in B_R$, and where the second derivative of $\mathcal{J}$ in the direction $h$ is given by
\[ D^2\mathcal{J}(\phi_Q)(h,h) = \int_{\mathbb{R}^3} |\nabla h|^2\,dx + \int_{\mathbb{R}^6} h(x)\big(h(x) - Ph(e,\ell)\big) F_e(e,\ell)\,dx\,dv \tag{4.21} \]
with $Ph$ given by (4.16) and $e = \frac{|v|^2}{2} + \phi_Q(x)$, $\ell = |x\times v|^2$.

Proof. Let us decompose $\mathcal{J}$ into a kinetic part and a potential part:
\[ \mathcal{J}(\phi) = \mathcal{J}_{Q^*}(\phi) = \mathcal{H}(Q^{*\phi}) + \frac{1}{2} |\nabla\phi - \nabla\phi_{Q^{*\phi}}|_{L^2}^2 = \frac{1}{2}\int |\nabla\phi|^2\,dx + \mathcal{J}_0(\phi) \tag{4.22} \]
with
\[ \mathcal{J}_0(\phi) = \int_{\mathbb{R}^6} \Big( \frac{|v|^2}{2} + \phi(x) \Big)\, Q^*\Big( a_\phi\Big( \frac{|v|^2}{2} + \phi(x), |x\times v|^2 \Big), |x\times v|^2 \Big)\,dx\,dv = \int_{\mathbb{R}^6} \Big( \frac{|v|^2}{2} + \phi(x) \Big)\, Q^{*\phi}(x,v)\,dx\,dv. \tag{4.23} \]
Observe that (4.23) seems to suggest that two derivatives of $\mathcal{J}_0$ should involve two derivatives of $Q^*$ and $a_\phi$, which are not available, in particular because the integral (2.17) defining $a_\phi$ has only $\sqrt{\cdot}$ regularity. We claim that this is in fact not the case, and that suitable integrations by parts and changes of variables, together with a careful tracking of the dependence on $(e, \phi, \ell)$ of the various estimates on $a_\phi$ and its derivatives given by Lemmas 2.3, 2.4, 2.5, will yield the result.


Step 1. Bounds for the support of $Q^*$. In Corollary 2.9, we have identified the function $Q^*$:
\[ Q^*(s,\ell) = F\big( a_{\phi_Q}^{-1}(s,\ell), \ell \big), \quad \forall \ell > 0,\ \forall s \ge 0. \tag{4.24} \]
Recall that, by Assumption (A), for all $\ell \ge 0$ the function $e \mapsto F(e,\ell)$ is nonincreasing. Let us define
\[ L = \big\{ \ell > 0 :\ F(e_{\phi_Q,\ell}, \ell) > 0 \big\}, \tag{4.25} \]
where $e_{\phi_Q,\ell}$ is defined in Lemma 2.1. By Lemma 2.1 (i) and by the continuity of $F$, the function $\ell \mapsto F(e_{\phi_Q,\ell}, \ell)$ is continuous on $\mathbb{R}_+^*$, thus $L$ is an open set. If $\ell \in \mathbb{R}_+^* \setminus L$, then $a_{\phi_Q}^{-1}(s,\ell) \ge e_{\phi_Q,\ell}$ implies
\[ F\big(a_{\phi_Q}^{-1}(s,\ell), \ell\big) \le F(e_{\phi_Q,\ell}, \ell) = 0 \]
for all $s \ge 0$, thus
\[ \forall \ell \in \mathbb{R}_+^* \setminus L, \quad Q^*(\cdot,\ell) = 0. \tag{4.26} \]
In particular, since $Q = Q^{*\phi_Q}$ is not zero, the measure of $L$ cannot be zero. Let now $\ell \in L$ and let $s_0(\ell) = a_{\phi_Q}(e_0(\ell), \ell)$, where we recall the definition (2.68) of $e_0(\ell)$. From Assumption (A), Lemma 2.3 and (4.24), we infer that the function $Q^*(\cdot,\ell)$ is continuous on $\mathbb{R}_+$, that its support is $[0, s_0(\ell)]$ and that this function is strictly decreasing and $C^1$ on $]0, s_0(\ell)[$. Furthermore, from (2.21), we deduce that
\[ \forall \ell \in L, \quad 0 < s_0(\ell) \le s_0 := 16\pi^2\, |Q|_{L^1}\, |e_0|^{-1/2}. \tag{4.27} \]
Finally, let us prove that the set $L$ is bounded. From Assumption (A), $(x,v) \mapsto Q(x,v)$ is compactly supported, thus there exist $r_0, u_0 > 0$ such that $Q(x,v) = 0$ for all $(x,v)$ such that $|x| \ge r_0$ or $|v| \ge u_0$. Hence, we have $Q(x,v) = 0$ for all $(x,v)$ such that $|x\times v|^2 \ge r_0^2 u_0^2$ and then, by definition of $Q^*$, $Q^*(\cdot,\ell) = 0$ for all $\ell \ge \ell_0 := r_0^2 u_0^2$. Therefore, we have
\[ L \subset\ ]0, \ell_0[. \tag{4.28} \]
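The bound on the angular momentum in Step 1 is an instance of Lagrange's identity. As a short check, with $r_0$, $u_0$ the support radii introduced in that step:
\[ |x\times v|^2 = |x|^2 |v|^2 - (x\cdot v)^2 \le |x|^2 |v|^2, \]
so that $Q(x,v) \neq 0$ forces $|x| < r_0$ and $|v| < u_0$, hence
\[ \ell = |x\times v|^2 < r_0^2 u_0^2 = \ell_0. \]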

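Step 2. First derivative of $\mathcal{J}_0$. We first transform the expression (4.23) of $\mathcal{J}_0$. Using the change of variable (2.71) and the bounds (4.26), (4.27) for the support of $Q^*$, we get
\[ \forall \phi \in \Phi_{rad}\setminus\{0\}, \quad \mathcal{J}_0(\phi) = \int_{\ell \in L} \int_0^{s_0(\ell)} a_\phi^{-1}(s,\ell)\, Q^*(s,\ell)\,d\ell\,ds, \tag{4.29} \]
where we recall that $a_\phi^{-1}(\cdot,\ell)$ is defined as the inverse function of $e \mapsto a_\phi(e,\ell)$ at given $\phi \in \Phi_{rad}\setminus\{0\}$ and $\ell > 0$. Let $\phi$ and $\tilde\phi$ be as in Proposition 4.3 (i) and $h = \tilde\phi - \phi$. Let us differentiate the following function with respect to $\lambda \in [0,1]$:
\[ \mathcal{J}_0(\phi + \lambda h) = \int_L \int_0^{s_0(\ell)} a_{\phi+\lambda h}^{-1}(s,\ell)\, Q^*(s,\ell)\,d\ell\,ds. \tag{4.30} \]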


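Let
\[ g(\lambda, s, \ell) = a_{\phi+\lambda h}^{-1}(s,\ell)\, Q^*(s,\ell). \]
According to (2.49), we have
\[ \frac{\partial g}{\partial\lambda}(\lambda, s, \ell) = \frac{\displaystyle\int_{r_1}^{r_2} \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} h(r)\,dr}{\displaystyle\int_{r_1}^{r_2} \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} dr}\; Q^*(s,\ell), \]
where $r_i$, $i = 1, 2$, shortly denotes $r_i(\phi + \lambda h, a_{\phi+\lambda h}^{-1}(s,\ell), \ell)$ defined by (2.6), (2.7), and $\psi_{\phi,\ell}(r)$ is defined by (2.2). Therefore,
\[ \Big| \frac{\partial g}{\partial\lambda}(\lambda, s, \ell) \Big| \le |h|_{L^\infty}\, Q^*(s,\ell) \in L^1(\mathbb{R}_+ \times \mathbb{R}_+), \]
and we deduce from dominated convergence that $\mathcal{J}_0$ is differentiable at $\phi$ in the direction $h$ with:
\[ D\mathcal{J}_0(\phi)(h) = \int_L \int_{s=0}^{s_0(\ell)} \frac{\displaystyle\int_{r_1}^{r_2} \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2} h(r)\,dr}{\displaystyle\int_{r_1}^{r_2} \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2} dr}\; Q^*(s,\ell)\,d\ell\,ds. \]
Using the change of variable $s \mapsto e = a_\phi^{-1}(s,\ell)$ and (2.23), we now get the following equivalent expression:
\[ D\mathcal{J}_0(\phi)(h) = 4\pi^2\sqrt{2} \int_L \int_{e_{\phi,\ell}}^0 Q^*\big(a_\phi(e,\ell), \ell\big) \int_{r_1}^{r_2} \big( e - \psi_{\phi,\ell}(r) \big)^{-1/2} h(r)\,dr\,de\,d\ell. \tag{4.31} \]

Step 3. Second derivative of $\mathcal{J}_0$. Let us now compute the second derivative of $\mathcal{J}_0(\phi + \lambda h)$ with respect to $\lambda$. First, we write the first derivative in a more convenient form. Let
\[ D_{\phi,\ell} = \Big\{ (r,e) \in \mathbb{R}_+^* \times \mathbb{R}_-^* \text{ s.t. } e - \phi(r) - \frac{\ell}{2r^2} > 0 \Big\} = \big\{ (r,e) \in \mathbb{R}_+^* \times\, ]e_{\phi,\ell}, 0[ \text{ s.t. } r_1(\phi,e,\ell) < r < r_2(\phi,e,\ell) \big\}. \]
An integration by parts gives
\[ \frac{\partial}{\partial\lambda} \mathcal{J}_0(\phi + \lambda h) = 8\pi^2\sqrt{2} \int_L \int_{e = e_{\phi+\lambda h,\ell}}^0 Q^*\big( a_{\phi+\lambda h}(e,\ell), \ell \big)\, \frac{\partial}{\partial e} \Big( \int_{r_1(\phi+\lambda h, e, \ell)}^{r_2(\phi+\lambda h, e, \ell)} \big( e - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{1/2} h(r)\,dr \Big)\,de\,d\ell \]
\[ = -8\pi^2\sqrt{2} \int_L \int_{D_{\phi+\lambda h,\ell}} \frac{\partial Q^*}{\partial s}\big( a_{\phi+\lambda h}(e,\ell), \ell \big)\, \frac{\partial a_{\phi+\lambda h}}{\partial e}(e,\ell)\, \big( e - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{1/2} h(r)\,dr\,de\,d\ell, \]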


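where the boundary terms of the integration by parts vanish. Now, we perform the change of variable $e \mapsto s = a_{\phi+\lambda h}(e,\ell)$ and get
\[ \frac{\partial}{\partial\lambda} \mathcal{J}_0(\phi + \lambda h) = -8\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \int_{r=0}^{+\infty} G(\lambda, s, \ell, r)\,dr\,ds\,d\ell, \tag{4.32} \]
with
\[ G(\lambda, s, \ell, r) = \frac{\partial Q^*}{\partial s}(s,\ell)\, \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)_+^{1/2}\, h(r). \]
We have
\[ \frac{\partial G}{\partial\lambda} = \frac{1}{2} \frac{\partial Q^*}{\partial s}(s,\ell) \Big( \frac{\partial a_{\phi+\lambda h}^{-1}}{\partial\lambda}(s,\ell) - h(r) \Big) h(r)\, \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2}\, \mathbf{1}_{r_1 < r < r_2}. \tag{4.33} \]
From (2.49):
\[ \Big| \frac{\partial a_{\phi+\lambda h}^{-1}}{\partial\lambda}(s,\ell) - h(r) \Big| \le 2 |h|_{L^\infty}. \tag{4.34} \]
Moreover, applying (2.9) to the potential $\phi + \lambda h$ gives
\[ \big( a_{\phi+\lambda h}^{-1}(s,\ell) - \psi_{\phi,\ell}(r) - \lambda h(r) \big)^{-1/2} \le \frac{1}{\sqrt{\ell}}\, \frac{r \sqrt{2 r_1 r_2}}{\sqrt{(r - r_1)(r_2 - r)}} \tag{4.35} \]
for $r \in\, ]r_1, r_2[$. Inserting (4.34) and (4.35) in (4.33) yields
\[ \Big| \frac{\partial G}{\partial\lambda} \Big| \le C\, |h|_{L^\infty}\, |r h|_{L^\infty} \Big( -\frac{\partial Q^*}{\partial s}(s,\ell) \Big) \frac{1}{\sqrt{\ell}}\, \frac{\sqrt{r_1 r_2}}{\sqrt{(r - r_1)(r_2 - r)}}, \tag{4.36} \]
for $r \in\, ]r_1, r_2[$, where we recall that $\frac{\partial Q^*}{\partial s} < 0$ for $\ell \in L$ and $0 < s < s_0(\ell)$. In order to apply Lebesgue's derivation Lemma A.1, one has to bound the right-hand side of (4.36) by suitable $L^1$ functions. To this aim, we need estimates for
\[ r_1 = r_1(\phi + \lambda h, a_{\phi+\lambda h}^{-1}(s,\ell), \ell), \qquad r_2 = r_2(\phi + \lambda h, a_{\phi+\lambda h}^{-1}(s,\ell), \ell). \]
We claim that, for all $\phi = \phi_f \in \Phi_{rad}$,
\[ \forall \ell \le \ell_0,\ \forall s \le s_0(\ell), \quad r_2(\phi, a_\phi^{-1}(s,\ell), \ell) \le C\, |f|_{L^1}\, \frac{m_\phi + \ell_0 + s_0^2}{m_\phi^2}, \tag{4.37} \]
where $m_\phi$ is defined by (2.20) and where $C$ is a universal constant. Let us assume (4.37) and conclude the computation of the second derivative. Let
\[ m = \min(m_\phi, m_{\tilde\phi}) > 0, \qquad M = \max(|f|_{L^1}, |\tilde f|_{L^1}). \]
For $\lambda \in [0,1]$, we have
\[ m_{\phi+\lambda h} = \inf_{r > 0}\, (r+1)\big( (1-\lambda)|\phi(r)| + \lambda|\tilde\phi(r)| \big) \ge (1-\lambda) m_\phi + \lambda m_{\tilde\phi} \ge m > 0. \]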


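From (4.28), (4.36) and (4.37), we get
\[ \Big| \frac{\partial G}{\partial\lambda} \Big| \le C\, |h|_{L^\infty}\, |r h|_{L^\infty}\, q_\lambda(s, \ell, r), \tag{4.38} \]
for $\ell \in L$, $s \le s_0(\ell)$, with
\[ 0 \le q_\lambda(s,\ell,r) = -\frac{\partial Q^*}{\partial s}(s,\ell)\, \frac{\mathbf{1}_{r_1 < r < r_2}}{\sqrt{\ell}\, \sqrt{(r - r_1)(r_2 - r)}} \]
and where the constant $C$ only depends on $m$, $M$, $\ell_0$ and $s_0$. By Lemma 2.4 and the continuity of $r_i(\phi + \lambda h, e, \ell)$, $i = 1, 2$, with respect to $\lambda$, we have for all $\lambda_0 \in [0,1]$,
\[ q_\lambda(s,\ell,r) \to q_{\lambda_0}(s,\ell,r) \quad \text{as } \lambda \to \lambda_0 \text{ for a.e. } r, s, \ell. \]
Moreover,
\[ \int_L \int_0^{s_0(\ell)} \int_0^{+\infty} q_\lambda(s,\ell,r)\,dr\,ds\,d\ell = \pi \int_L Q^*(0,\ell)\, \frac{d\ell}{\sqrt{\ell}}. \]
Since $Q^* \in L^\infty(\mathbb{R}_+ \times \mathbb{R}_+)$ and from (4.28), this integral is finite and its value is independent of $\lambda$. Now we invoke the Brézis-Lieb Lemma to conclude:
\[ q_\lambda(s,\ell,r) \to q_{\lambda_0}(s,\ell,r) \quad \text{as } \lambda \to \lambda_0 \text{ in } L^1([0,s_0] \times [0,\ell_0] \times \mathbb{R}_+). \]
We conclude from (4.32), (4.38) and Lemma A.1 that $\lambda \mapsto \mathcal{J}_0(\phi + \lambda h)$ is twice differentiable on $[0,1]$. In particular, $\mathcal{J}_0$ is twice differentiable at $\phi$ in the direction $h$ with:
\[ D^2\mathcal{J}_0(\phi)(h,h) = -4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \int_{r_1}^{r_2} \frac{\partial Q^*}{\partial s}(s,\ell) \Big( \frac{\partial a_\phi^{-1}}{\partial\lambda}(s,\ell) - h(r) \Big) h(r)\, \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2}\,dr\,ds\,d\ell. \]
By (2.49), this expression can be simplified into
\[ D^2\mathcal{J}_0(\phi)(h,h) = 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell) \int_{r_1}^{r_2} \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2} (h(r))^2\,dr\,ds\,d\ell \]
\[ \qquad - 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell)\, \frac{\Big( \displaystyle\int_{r_1}^{r_2} \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2} h(r)\,dr \Big)^2}{\displaystyle\int_{r_1}^{r_2} \big( a_\phi^{-1}(s,\ell) - \psi_{\phi,\ell}(r) \big)^{-1/2} dr}\,ds\,d\ell. \tag{4.39} \]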
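The factor $\pi$ in the integral of $q_\lambda$ in Step 3 comes from an elementary integral over the radial interval $]r_1, r_2[$. As a sketch, with the substitution $r = r_1 + (r_2 - r_1)\sin^2\theta$, $\theta \in [0, \pi/2]$ (the variable $\theta$ is introduced here only for this check):
\[ dr = 2(r_2 - r_1)\sin\theta\cos\theta\,d\theta, \qquad (r - r_1)(r_2 - r) = (r_2 - r_1)^2 \sin^2\theta\cos^2\theta, \]
\[ \int_{r_1}^{r_2} \frac{dr}{\sqrt{(r - r_1)(r_2 - r)}} = \int_0^{\pi/2} 2\,d\theta = \pi. \]
The value does not depend on $r_1$ and $r_2$, which is why the total mass of the dominating function is independent of $\lambda$, and the remaining integral in $s$ of $-\partial Q^*/\partial s$ over $[0, s_0(\ell)]$ gives $Q^*(0,\ell)$ since $Q^*(s_0(\ell),\ell) = 0$.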

Proof of the claim (4.37). In order to show that r1 and r2 are not allowed to go to infinity, we shall use (2.8). To this aim, we first need to show that e = aφ−1 (s, ) is not allowed to go to zero when ≤ 0 and s ≤ s0 (). From (2.22), we deduce that, for < 0 , s ≤ s0 (), mφ mφ mφ 4π 2 m φ 1/2 ( , |e| ≥ min . (4.40) , ≥ C min ( 2 2m φ + 3 s m φ + 0 s0 Finally (4.37) can be deduced from (2.8) and (4.40).


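Step 4. Identification of the first and second derivatives of $\mathcal{J}$ at $\phi_Q$. Let $f \in \mathcal{E}_{rad}$ and $h = \phi_f - \phi_Q$. We claim that
\[ D\mathcal{J}(\phi_Q)(h) = 0. \tag{4.41} \]
In order to prove this claim, we first remark from (4.22) that
\[ D\mathcal{J}(\phi_Q)(h) = D\mathcal{J}_0(\phi_Q)(h) + \int_{\mathbb{R}^3} \nabla\phi_Q \cdot \nabla h\,dx. \tag{4.42} \]
Moreover, by (4.31), we have
\[ D\mathcal{J}_0(\phi_Q)(h) = 4\pi^2\sqrt{2} \int_L \int_{e_{\phi_Q,\ell}}^0 \int_0^{+\infty} F(e,\ell)\, \big( e - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} h(r)\, \mathbf{1}_{e - \psi_{\phi_Q,\ell}(r) > 0}\,dr\,de\,d\ell, \]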

( where we used (2.67). Applying the change of variable e → u = 2(e − φ Q (r )), it +∞ +∞ +∞ 2 u + φ Q (r ), h(r ) 1r u> dν d DJ0 (φ Q )(h) = 2 F 2 0 0 0 = Q(x, v)h(x) d xdv comes, R6

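where we used (1.10), Assumption (A), and recall that $h$ is radially symmetric. Hence, from the Poisson equation, we deduce after an integration by parts that
\[ D\mathcal{J}_0(\phi_Q)(h) = -\int_{\mathbb{R}^3} \nabla\phi_Q \cdot \nabla h\,dx, \]
which together with (4.42) implies (4.41). Let us now identify the second derivative of $\mathcal{J}$ at $\phi_Q$. We have
\[ D^2\mathcal{J}(\phi_Q)(h,h) = D^2\mathcal{J}_0(\phi_Q)(h,h) + \int_{\mathbb{R}^3} |\nabla h|^2\,dx \tag{4.43} \]
and, by (4.39),
\[ D^2\mathcal{J}_0(\phi_Q)(h,h) = 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell) \int_{r_1}^{r_2} \big( a_{\phi_Q}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} (h(r))^2\,dr\,ds\,d\ell \]
\[ \qquad - 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell)\, \frac{\Big( \displaystyle\int_{r_1}^{r_2} \big( a_{\phi_Q}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} h(r)\,dr \Big)^2}{\displaystyle\int_{r_1}^{r_2} \big( a_{\phi_Q}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} dr}\,ds\,d\ell. \]
Using first the change of variable $s \mapsto e = a_{\phi_Q}^{-1}(s,\ell)$, (2.23) and (2.67), we get
\[ D^2\mathcal{J}_0(\phi_Q)(h,h) = 4\pi^2\sqrt{2} \int_L \int_{e_{\phi_Q,\ell}}^{e_0(\ell)} F_e(e,\ell) \int_{r_1}^{r_2} \big( e - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} (h(r))^2\,dr\,de\,d\ell \]
\[ \qquad - 4\pi^2\sqrt{2} \int_L \int_{e_{\phi_Q,\ell}}^{e_0(\ell)} F_e(e,\ell)\, \frac{\Big( \displaystyle\int_{r_1}^{r_2} \big( e - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} h(r)\,dr \Big)^2}{\displaystyle\int_{r_1}^{r_2} \big( e - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} dr}\,de\,d\ell. \]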


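We next apply the change of variable $e \mapsto u = \sqrt{2(e - \phi_Q(x))}$ and use (1.10) to get:
\[ D^2\mathcal{J}_0(\phi_Q)(h,h) = \int_{\mathbb{R}^6} F_e(e,\ell)\,(h(x))^2\,dx\,dv - \int_{\mathbb{R}^6} F_e(e,\ell)\, h(x)\, Ph(e,\ell)\,dx\,dv, \]
where we used the definition (4.16) and where we shortly denoted
\[ e = \frac{|v|^2}{2} + \phi_Q(x), \qquad \ell = |x\times v|^2. \]
This together with (4.43) concludes the proof of (4.21).

Step 5. Proof of the Taylor expansion (4.20). We are now ready to prove the Taylor expansion (4.20). We first deduce from (4.41) and from the fact that $\mathcal{J}(\phi_Q + \lambda h)$ is twice differentiable with respect to $\lambda$ that
\[ \mathcal{J}(\phi_Q + h) - \mathcal{J}(\phi_Q) = \int_0^1 (1-\lambda)\, \frac{\partial^2}{\partial\lambda^2} \mathcal{J}(\phi_Q + \lambda h)\,d\lambda. \]
Hence, for $h \neq 0$,
\[ \mathcal{J}(\phi_Q + h) - \mathcal{J}(\phi_Q) - \frac{1}{2} D^2\mathcal{J}(\phi_Q)(h,h) = \int_0^1 (1-\lambda) \big( D^2\mathcal{J}(\phi_Q + \lambda h) - D^2\mathcal{J}(\phi_Q) \big)(h,h)\,d\lambda \]
\[ \qquad = |\nabla h|_{L^2}^2 \int_0^1 (1-\lambda) \big( D^2\mathcal{J}_0(\phi_Q + \lambda h) - D^2\mathcal{J}_0(\phi_Q) \big)\Big( \frac{h}{|\nabla h|_{L^2}}, \frac{h}{|\nabla h|_{L^2}} \Big)\,d\lambda. \tag{4.44} \]
We now claim the following continuity property:
\[ \sup_{\lambda \in [0,1]}\ \sup_{|\nabla\tilde h|_{L^2} = 1} \Big| \big( D^2\mathcal{J}_0(\phi_Q + \lambda(\phi_f - \phi_Q)) - D^2\mathcal{J}_0(\phi_Q) \big)(\tilde h, \tilde h) \Big| \to 0 \tag{4.45} \]
as $|\nabla\phi_f - \nabla\phi_Q|_{L^2} \to 0$, $f$ satisfying (4.19). Assume (4.45). Then:
\[ \int_0^1 (1-\lambda) \big( D^2\mathcal{J}_0(\phi_Q + \lambda h) - D^2\mathcal{J}_0(\phi_Q) \big)\Big( \frac{h}{|\nabla h|_{L^2}}, \frac{h}{|\nabla h|_{L^2}} \Big)\,d\lambda \to 0 \]
and (4.44) now yields (4.20).

Proof of (4.45). We argue by contradiction and consider $\varepsilon > 0$, $f_n$ satisfying (4.19), $\tilde h_n$ and $\lambda_n \in [0,1]$ such that
\[ |\nabla\phi_{f_n} - \nabla\phi_Q|_{L^2} < \frac{1}{n}, \qquad |\nabla\tilde h_n|_{L^2} = 1, \tag{4.46} \]
and
\[ \Big| D^2\mathcal{J}_0(\phi_Q + \lambda_n(\phi_{f_n} - \phi_Q))(\tilde h_n, \tilde h_n) - D^2\mathcal{J}_0(\phi_Q)(\tilde h_n, \tilde h_n) \Big| > \varepsilon. \tag{4.47} \]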


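We denote $h_n = \lambda_n(\phi_{f_n} - \phi_Q)$. Recall from (4.39):
\[ D^2\mathcal{J}_0(\phi_Q + h_n)(\tilde h_n, \tilde h_n) = 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell) \int_{r_1^n}^{r_2^n} \big( a_{\phi_Q + h_n}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) - h_n(r) \big)^{-1/2} (\tilde h_n(r))^2\,dr\,ds\,d\ell \]
\[ \qquad - 4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} \frac{\partial Q^*}{\partial s}(s,\ell)\, \frac{\Big( \displaystyle\int_{r_1^n}^{r_2^n} \big( a_{\phi_Q + h_n}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) - h_n(r) \big)^{-1/2} \tilde h_n(r)\,dr \Big)^2}{\displaystyle\int_{r_1^n}^{r_2^n} \big( a_{\phi_Q + h_n}^{-1}(s,\ell) - \psi_{\phi_Q,\ell}(r) - h_n(r) \big)^{-1/2} dr}\,ds\,d\ell, \tag{4.48} \]
where we have denoted, for $i = 1, 2$,
\[ r_i^n = r_i\big( \phi_Q + h_n, a_{\phi_Q + h_n}^{-1}(s,\ell), \ell \big). \]
By (4.46) and standard radial Sobolev embeddings, the sequence of radially symmetric functions $\tilde h_n$ is compact in $L^\infty([a,b])$ for all $0 < a < b$. By diagonal extraction, we deduce the pointwise convergence of $\tilde h_n$ (up to a subsequence) to a function $\tilde h$:
\[ \forall r \in \mathbb{R}_+^*, \quad \tilde h_n(r) \to \tilde h(r) \quad \text{as } n \to +\infty. \tag{4.49} \]
Moreover,
\[ r^{1/2}\, |\tilde h_n(r)| \le \Big( \int_r^{+\infty} s^2 (\tilde h_n'(s))^2\,ds \Big)^{1/2} \le |\nabla\tilde h_n|_{L^2} = 1, \tag{4.50} \]
thus, in particular, $r^{1/2}\tilde h$ belongs to $L^\infty(\mathbb{R}_+)$. Let us analyze the convergence of (4.48). In a first step, recalling (4.26) and (4.27), we fix $\ell \in L$ and $s \in\, ]0, s_0(\ell)]$ and set
\[ e_n = a_{\phi_Q + h_n}^{-1}(s,\ell), \qquad e_\infty = a_{\phi_Q}^{-1}(s,\ell) < 0. \]
From (4.46), the uniform bound of $f_n$ in $\mathcal{E}_{rad}$ and Lemma 2.4, we have:
\[ e_n \to e_\infty \quad \text{as } n \to +\infty. \tag{4.51} \]
For $k = 0, 1$ or $2$, we introduce the functions
\[ g_k(n, s, \ell, r) = \big( e_n - \psi_{\phi_Q,\ell}(r) - h_n(r) \big)^{-1/2} (\tilde h_n(r))^k\, \mathbf{1}_{\psi_{\phi_Q,\ell}(r) + h_n(r) < e_n} \tag{4.52} \]
and
\[ g_k(\infty, s, \ell, r) = \big( e_\infty - \psi_{\phi_Q,\ell}(r) \big)^{-1/2} (\tilde h(r))^k\, \mathbf{1}_{\psi_{\phi_Q,\ell}(r) < e_\infty}. \]
We claim:
\[ \forall \ell \in L,\ \forall s \in\, ]0, s_0(\ell)], \quad \int_0^{+\infty} g_k(n, s, \ell, r)\,dr \to \int_0^{+\infty} g_k(\infty, s, \ell, r)\,dr \quad \text{for } k = 0, 1 \text{ or } 2. \tag{4.53} \]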


Indeed, from (4.49), (4.51), (4.46) and the bound of f n which imply |h n | L ∞ → 0, we first deduce that, for all s > 0, > 0, the function gk (n, s, , r ) converges pointwise in r ∈ R∗+ to the function gk (∞, s, , r ), for k = 0, 1 or 2, as n → +∞. Moreover, by applying (2.9) to the function φ Q + h n , we get ( r | h n (r )|k r1n r2n 1r1n
(4.54)

where we used (4.50). Now we observe from the uniform bound of f n in E, from (4.46) and from (2.29) that we have sup |Q + λn ( f n − Q)| L 1 < ∞ and 0 < inf inf (r + 1)|φ Q (r ) + h n (r )| < +∞. n r ≥0

n

Injecting the bound (4.37) on r2n into (4.54) gives: 0 ≤ gk (n, s, , r ) ≤ K (

1r1n
.

(4.55)

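Denote $r_i^\infty := r_i(\phi_Q, e_\infty, \ell)$ for $i = 1, 2$ and
\[ q_n(r) = \frac{\mathbf{1}_{r_1^n < r < r_2^n}}{\sqrt{\ell\,(r - r_1^n)(r_2^n - r)}}, \qquad q_\infty(r) = \frac{\mathbf{1}_{r_1^\infty < r < r_2^\infty}}{\sqrt{\ell\,(r - r_1^\infty)(r_2^\infty - r)}}. \]
By Lemma 2.4, we have $r_i^n \to r_i^\infty$ as $n \to +\infty$, for $i = 1, 2$. Therefore, the function $q_n$ converges pointwise to $q_\infty$. Since
\[ \int_0^{+\infty} q_n(r)\,dr = \int_0^{+\infty} q_\infty(r)\,dr = \frac{\pi}{\sqrt{\ell}}, \tag{4.56} \]
we deduce from the Brézis-Lieb Lemma that $q_n \to q_\infty$ in $L^1(\mathbb{R}_+)$ as $n \to +\infty$. Finally, by applying the generalized dominated convergence theorem, we obtain (4.53). Now, we remark that, with the notation (4.52), (4.48) reads
\[ D^2\mathcal{J}_0(\phi_Q + h_n)(\tilde h_n, \tilde h_n) = -4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} G(n, s, \ell)\,ds\,d\ell \]
and
\[ D^2\mathcal{J}_0(\phi_Q)(\tilde h, \tilde h) = -4\pi^2\sqrt{2} \int_L \int_0^{s_0(\ell)} G(\infty, s, \ell)\,ds\,d\ell, \]
with
\[ G(n, s, \ell) = -\frac{\partial Q^*}{\partial s}(s,\ell) \left( \int_0^\infty g_2(n, s, \ell, r)\,dr - \frac{\Big( \displaystyle\int_0^\infty g_1(n, s, \ell, r)\,dr \Big)^2}{\displaystyle\int_0^\infty g_0(n, s, \ell, r)\,dr} \right) \]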


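and
\[ G(\infty, s, \ell) = -\frac{\partial Q^*}{\partial s}(s,\ell) \left( \int_0^\infty g_2(\infty, s, \ell, r)\,dr - \frac{\Big( \displaystyle\int_0^\infty g_1(\infty, s, \ell, r)\,dr \Big)^2}{\displaystyle\int_0^\infty g_0(\infty, s, \ell, r)\,dr} \right). \]
From (4.53), we get that, for all $\ell \in L$ and $s \in\, ]0, s_0(\ell)[$,
\[ G(n, s, \ell) \to G(\infty, s, \ell) \quad \text{as } n \to +\infty. \tag{4.57} \]
Moreover, by the Cauchy-Schwarz inequality, we have
\[ 0 \le \Big( \int_0^\infty g_1(n, s, \ell, r)\,dr \Big)^2 \le \int_0^\infty g_0(n, s, \ell, r)\,dr \int_0^\infty g_2(n, s, \ell, r)\,dr. \tag{4.58} \]
Therefore, (4.55), (4.56) and (4.58) yield the estimate
\[ 0 \le G(n, s, \ell) \le -\frac{\pi K}{\sqrt{\ell}}\, \frac{\partial Q^*}{\partial s}(s,\ell). \tag{4.59} \]
Remark that the function in the right-hand side of (4.59) belongs to $L^1$ since it is nonnegative and
\[ \int_L \int_0^{s_0(\ell)} -\frac{\partial Q^*}{\partial s}(s,\ell)\,ds\, \frac{d\ell}{\sqrt{\ell}} = \int_L Q^*(0,\ell)\, \frac{d\ell}{\sqrt{\ell}} < +\infty, \]
where we used (4.28) and $|Q^*(0,\ell)| \le |Q|_{L^\infty}$. Finally, we deduce from (4.57), (4.59) and from dominated convergence that
\[ D^2\mathcal{J}_0(\phi_Q + h_n)(\tilde h_n, \tilde h_n) \to D^2\mathcal{J}_0(\phi_Q)(\tilde h, \tilde h) \tag{4.60} \]
as $n \to +\infty$. This contradicts (4.47) and concludes the proof of (4.45). This concludes the proof of Proposition 4.3.

Remark 4.4. Note that an argument similar to the proof of (4.60) shows that, for all bounded sequences $\tilde h_n \in \dot H^1_r$, after extraction of a subsequence,
\[ D^2\mathcal{J}_0(\phi_Q)(\tilde h_n, \tilde h_n) \to D^2\mathcal{J}_0(\phi_Q)(\tilde h, \tilde h) \tag{4.61} \]
as $n \to +\infty$. Indeed, we never used in the proof of (4.60) the fact that $\tilde h_n$ is a Poisson field. The important consequence is then that the quadratic form $D^2\mathcal{J}_0(\phi_Q)$ is compact on $\dot H^1_r$.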


4.3. Proof of Proposition 3.2. We are now in position to conclude the proof of Proposition 3.2. The coercivity property (3.9) will appear as a consequence of the fact that the Hessian (4.20) can be connected to the Antonov functional (4.4) via the projection operator (4.16), and the key step is then Antonov's coercivity property (4.5).

Proof. Step 1. Strict positivity of the Hessian. We claim that:
\[ \forall h \in \dot H^1_r \setminus \{0\}, \quad D^2\mathcal{J}(\phi_Q)(h,h) > 0. \tag{4.62} \]
Indeed, let $h \in \dot H^1_r \setminus \{0\}$ and consider the projection $Ph$ given by (4.16). From (4.17), the functions $hF_e$ and $(Ph)F_e$ belong to $L^{2,r}_{|F_e|}$ and hence $g = (h - Ph)F_e \in L^{2,r}_{|F_e|}$. By the orthogonality property (4.18), we have
\[ \int (h - Ph)^2 F_e\,dx\,dv = -\int (h - Ph)^2 F_e\,dx\,dv + 2\int h\,(h - Ph)\, F_e\,dx\,dv = \int \frac{g^2}{|F_e|}\,dx\,dv - 2\int \nabla h \cdot \nabla\phi_g\,dx, \]
where we used the Poisson equation. We may thus rewrite the Hessian (4.21):
\[ D^2\mathcal{J}(\phi_Q)(h,h) = \int |\nabla h|^2\,dx + \int_{\mathbb{R}^6} (h - Ph)^2 F_e\,dx\,dv = \int_{\mathbb{R}^6} \frac{g^2}{|F_e|}\,dx\,dv - |\nabla\phi_g|_{L^2}^2 + |\nabla h - \nabla\phi_g|_{L^2}^2 = \mathcal{A}(g,g) + |\nabla h - \nabla\phi_g|_{L^2}^2. \]
Now, from (4.18) and Proposition 4.1 (iii), we deduce that $\mathcal{A}(g,g) \ge 0$. Therefore, $D^2\mathcal{J}(\phi_Q)(h,h)$ is nonnegative. Moreover, if $D^2\mathcal{J}(\phi_Q)(h,h) = 0$, then $\mathcal{A}(g,g) = |\nabla h - \nabla\phi_g|_{L^2} = 0$, and using Proposition 4.1 (iii) again allows us to conclude that $g = \phi_g = h = 0$. This ends the proof of (4.62).

Step 2. Coercivity of the Hessian and conclusion. In Remark 4.4, we have seen that the quadratic form $D^2\mathcal{J}_0(\phi_Q)$ is compact on $\dot H^1_r$. Hence, from (4.22), the Fredholm alternative can be applied to the quadratic form $D^2\mathcal{J}(\phi_Q)$. Together with the strict positivity property (4.62), this implies the coercivity of this quadratic form:
\[ \forall h \in \dot H^1_r, \quad D^2\mathcal{J}(\phi_Q)(h,h) \ge c\, |\nabla h|_{L^2}^2, \tag{4.63} \]
for some universal constant $c > 0$. We may now conclude the proof of (3.9). Let $R > 0$ be fixed. From Proposition 4.3 (ii), there exists $\delta_0(R)$, chosen in $]0, \frac{1}{2}|\nabla\phi_Q|_{L^2}]$, such that, for all $f \in \mathcal{E}_{rad}$ satisfying
\[ |f - Q|_{\mathcal{E}} \le R, \qquad |\nabla\phi_f - \nabla\phi_Q|_{L^2} \le \delta_0(R), \]
we have
\[ |\varepsilon_R(\phi_f)| \le \frac{c}{4}, \]
where $c$ is the constant in (4.63) and $\varepsilon_R$ is defined in (4.20). Hence, for such $f$ and with $h = \phi_f - \phi_Q$, we deduce from (4.63) and (4.20) that
\[ \mathcal{J}(\phi_Q + h) - \mathcal{J}(\phi_Q) \ge \frac{c}{4}\, |\nabla h|_{L^2}^2. \]
The proof of Proposition 3.2 is complete.
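The final constant $c/4$ follows from a two-line computation. As a sketch, reading the smallness condition on $\varepsilon_R$ as the two-sided bound $|\varepsilon_R(\phi_f)| \le c/4$ (an assumption on how the bound is meant) and writing $h = \phi_f - \phi_Q$, the Taylor expansion (4.20) and the coercivity (4.63) give
\[ \mathcal{J}(\phi_Q + h) - \mathcal{J}(\phi_Q) = \frac{1}{2} D^2\mathcal{J}(\phi_Q)(h,h) + \varepsilon_R(\phi_f)\, |\nabla h|_{L^2}^2 \ge \Big( \frac{c}{2} - \frac{c}{4} \Big) |\nabla h|_{L^2}^2 = \frac{c}{4}\, |\nabla h|_{L^2}^2. \]
Any threshold strictly below $c/2$ for $|\varepsilon_R|$ would work equally well; $c/4$ simply leaves half of the Hessian's coercivity.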


Acknowledgements. The authors would like to thank P.-E. Jabin for stimulating discussions about this work, and are indebted to J.-J. Aly for having kindly guided them through the physics references on the subject, in particular the important pioneering works [2,41,54]. M. Lemou was supported by the Agence Nationale de la Recherche, ANR Jeunes Chercheurs MNEC. F. Méhats was supported by the Agence Nationale de la Recherche, ANR project QUATRAIN. P. Raphaël was supported by the Agence Nationale de la Recherche, ANR Projet Blanc OndeNonLin and ANR Jeune Chercheur SWAP.

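Appendix A. Dominated Convergence Lemma

Lemma A.1. Let $I$ be an interval of $\mathbb{R}$ and let $g(\lambda, r)$ be a real-valued function in $C^0(I, L^1(\mathbb{R}_+))$. Let
\[ G(\lambda) = \int_{\mathbb{R}_+} g(\lambda, r)\,dr, \]
and denote by $\frac{\partial g}{\partial\lambda}$ the weak partial derivative of $g$ with respect to $\lambda$. Assume that $g$ satisfies the following assumptions:

(i) $\frac{\partial g}{\partial\lambda} \in L^1(I \times \mathbb{R}_+)$ and, for all $\lambda_0 \in I$, $\lim_{\lambda \to \lambda_0} \frac{\partial g}{\partial\lambda}(\lambda, r) = \frac{\partial g}{\partial\lambda}(\lambda_0, r)$ for a.e. $r$;

(ii) for all $\lambda \in I$, $\big| \frac{\partial g}{\partial\lambda}(\lambda, r) \big| \le q_\lambda(r)$ a.e., where $q_\lambda \in L^1(\mathbb{R}_+)$, and for all $\lambda_0 \in I$, $q_\lambda \to q_{\lambda_0}$ in $L^1(\mathbb{R}_+)$ as $\lambda \to \lambda_0$.

Then $G$ is $C^1$ on $I$ and
\[ G'(\lambda) = \int_{\mathbb{R}_+} \frac{\partial g}{\partial\lambda}(\lambda, r)\,dr. \]

Proof. Let $\lambda_0, \lambda \in I$. Since $g \in C^0(I, L^1(\mathbb{R}_+))$ and $\frac{\partial g}{\partial\lambda} \in L^1(I \times \mathbb{R}_+)$, we have
\[ g(\lambda, r) - g(\lambda_0, r) = \int_{\lambda_0}^{\lambda} \frac{\partial g}{\partial\lambda}(\mu, r)\,d\mu. \]
Hence, by Fubini,
\[ \frac{G(\lambda) - G(\lambda_0)}{\lambda - \lambda_0} = \int_{\mathbb{R}_+} \frac{g(\lambda, r) - g(\lambda_0, r)}{\lambda - \lambda_0}\,dr = \frac{1}{\lambda - \lambda_0} \int_{\lambda_0}^{\lambda} \int_{\mathbb{R}_+} \frac{\partial g}{\partial\lambda}(\mu, r)\,dr\,d\mu. \tag{A.1} \]
Now, we use a generalized version of the dominated convergence theorem as stated in [33] (see the Remark after Theorem 1.8) and deduce from Assumptions (i) and (ii) that
\[ \lim_{\mu \to \lambda_0} \int_{\mathbb{R}_+} \frac{\partial g}{\partial\lambda}(\mu, r)\,dr = \int_{\mathbb{R}_+} \frac{\partial g}{\partial\lambda}(\lambda_0, r)\,dr. \tag{A.2} \]
Hence, by using (A.2), we pass to the limit in (A.1) and obtain
\[ \lim_{\lambda \to \lambda_0} \frac{G(\lambda) - G(\lambda_0)}{\lambda - \lambda_0} = \int_{\mathbb{R}_+} \frac{\partial g}{\partial\lambda}(\lambda_0, r)\,dr, \]
which proves the differentiability of $G$. In fact, we observe that the same Assumptions (i) and (ii), together with the same generalized version of the dominated convergence theorem, provide the continuity of $G'$. This ends the proof of the lemma.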


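Appendix B. Proof of the Antonov Inequality (4.6)

Let $\Omega$ be defined by (4.2) and let $\xi \in C_c^\infty(\Omega) \cap L^{2,odd}_{|F_e|}$. We recall that the linear transport operator $T$ is defined by
\[ T\xi = v\cdot\nabla_x \xi - \nabla_x\phi_Q \cdot \nabla_v \xi. \]
Our aim is to prove the coercivity property (4.6). Let $g = T\xi$; we have from the Poisson equation
\[ r^2 \phi_g'(r) = \int_0^r s^2 \rho_g(s)\,ds = \frac{1}{4\pi} \int_{|x| \le r} \rho_g(x)\,dx = \frac{1}{4\pi} \int_{|x| \le r} \int_{\mathbb{R}^3} \big( v\cdot\nabla_x\xi - \nabla_x\phi_Q\cdot\nabla_v\xi \big)\,dv\,dx \]
\[ \qquad = \frac{1}{4\pi} \int_{|x| \le r} \nabla_x \cdot \Big( \int_{\mathbb{R}^3} v\,\xi\,dv \Big)\,dx = \frac{1}{4\pi} \int_{|x| = r} \frac{x}{r} \cdot \Big( \int_{\mathbb{R}^3} v\,\xi\,dv \Big)\,d\sigma(x). \]
Now we observe from the spherical symmetry of $\xi$ that the quantity $\int_{\mathbb{R}^3} \frac{x}{r}\cdot v\,\xi\,dv$ only depends on $r = |x|$. Therefore
\[ r\,\phi_g'(r) = \int_{\mathbb{R}^3} (x\cdot v)\,\xi\,dv. \tag{B.1} \]
We then use the Cauchy-Schwarz inequality and $\mathrm{Supp}(\xi) \subset \Omega$ to estimate:
\[ r^2 \phi_g'(r)^2 \le \Big( \int_{\mathbb{R}^3} (x\cdot v)^2 |F_e|\,dv \Big) \Big( \int_{\mathbb{R}^3} \frac{\xi^2}{|F_e|}\,dv \Big), \tag{B.2} \]
where we recall that $F_e$ is given by (4.1). Now we claim that
\[ \int_{\mathbb{R}^3} (x\cdot v)^2 |F_e|\,dv = r^2 \rho_Q(r). \tag{B.3} \]
Indeed, we first pass to spherical coordinates in $v$,
\[ u = |v|, \qquad |x\times v|^2 = r^2 u^2 \sin^2\theta, \quad \theta \in [0, \pi[, \quad \text{with } r = |x|, \tag{B.4} \]
and get (recall that $F_e \le 0$)
\[ \int_{\mathbb{R}^3} (x\cdot v)^2 |F_e|\,dv = -4\pi \int_0^{\pi/2} \int_{u=0}^{+\infty} r^2 u^4 \cos^2\theta\, \sin\theta\; \frac{\partial F}{\partial e}\Big( \frac{u^2}{2} + \phi_Q(r),\ r^2 u^2 \sin^2\theta \Big)\,du\,d\theta. \]
Now we perform the change of variable
\[ (u, \theta) \mapsto \Big( e = \frac{u^2}{2} + \phi_Q(r),\ \ell = r^2 u^2 \sin^2\theta \Big), \tag{B.5} \]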


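and obtain
\[ \int_{\mathbb{R}^3} (x\cdot v)^2 |F_e|\,dv = -2\pi\sqrt{2} \int_{\ell=0}^{+\infty} \int_{e = \phi_Q(r) + \frac{\ell}{2r^2}}^{+\infty} \Big( e - \phi_Q(r) - \frac{\ell}{2r^2} \Big)^{1/2} \frac{\partial F}{\partial e}(e,\ell)\,de\,d\ell. \]
We then integrate by parts with respect to the variable $e$:
\[ \int_{\mathbb{R}^3} (x\cdot v)^2 |F_e|\,dv = \pi\sqrt{2} \int_{\ell=0}^{+\infty} \int_{e = \phi_Q(r) + \frac{\ell}{2r^2}}^{+\infty} \Big( e - \phi_Q(r) - \frac{\ell}{2r^2} \Big)^{-1/2} F(e,\ell)\,de\,d\ell. \]
Using the same changes of variables (B.4) and (B.5), we get
\[ r^2 \rho_Q(x) = r^2 \int_{\mathbb{R}^3} F\Big( \frac{|v|^2}{2} + \phi_Q(x),\ |x\times v|^2 \Big)\,dv = \pi\sqrt{2} \int_{\ell=0}^{+\infty} \int_{e = \phi_Q(r) + \frac{\ell}{2r^2}}^{+\infty} \Big( e - \phi_Q(r) - \frac{\ell}{2r^2} \Big)^{-1/2} F(e,\ell)\,de\,d\ell, \]
and (B.3) follows. Now, we integrate the inequality (B.2) with respect to $r$ and use (B.3) to get
\[ \int_{\mathbb{R}^3} |\nabla_x\phi_g|^2\,dx = 4\pi \int_0^{+\infty} r^2 \phi_g'(r)^2\,dr \le \int_{\mathbb{R}^3} \int_0^{+\infty} 4\pi r^2 \rho_Q(r)\, \frac{\xi^2}{|F_e|}\,dr\,dv = \int_{\mathbb{R}^3 \times \mathbb{R}^3} \rho_Q(x)\, \frac{\xi^2}{|F_e|}\,dx\,dv. \]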

From the definition (4.4) of the Antonov functional, we then deduce
\[ \mathcal{A}(T\xi, T\xi) \ge \int \frac{1}{|F_e|} \big( (T\xi)^2 - \rho_Q(x)\,\xi^2 \big)\,dx\,dv. \tag{B.6} \]
Let now $\xi = (x\cdot v)\, q(x,v)$ and write, from the definition of $T$,
\[ (T\xi)^2 = \big( q\,T(x\cdot v) + (x\cdot v)\,Tq \big)^2 = (x\cdot v)^2 (Tq)^2 + (x\cdot v)\,T(x\cdot v)\,T(q^2) + q^2 (T(x\cdot v))^2 \]
\[ \qquad = (x\cdot v)^2 (Tq)^2 + T\big( (x\cdot v)\, q^2\, T(x\cdot v) \big) - (x\cdot v)\, q^2\, T(T(x\cdot v)). \]
We observe, from the Poisson equation $\Delta\phi_Q = \rho_Q$, that
\[ T(T(x\cdot v)) = -(x\cdot v)\,\Delta\phi_Q - v\cdot\nabla_x\phi_Q = -(x\cdot v) \Big( \rho_Q(r) + \frac{\phi_Q'(r)}{r} \Big). \]
Thus
\[ (T\xi)^2 - \rho_Q(r)\,\xi^2 = (x\cdot v)^2 (Tq)^2 + T\big( (x\cdot v)\, q^2\, T(x\cdot v) \big) + \xi^2\, \frac{\phi_Q'(r)}{r}. \tag{B.7} \]
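The formula for $T(T(x\cdot v))$ used to reach (B.7) can be checked by hand. The sketch below assumes a smooth radial potential, so that $\nabla\phi_Q = \phi_Q'(r)\,x/r$ and $\Delta\phi_Q = \phi_Q''(r) + \frac{2}{r}\phi_Q'(r)$:
\[ T(x\cdot v) = |v|^2 - x\cdot\nabla\phi_Q = |v|^2 - r\,\phi_Q'(r), \]
\[ T(|v|^2) = -2\,v\cdot\nabla\phi_Q = -2\,(x\cdot v)\,\frac{\phi_Q'(r)}{r}, \qquad T\big(r\,\phi_Q'(r)\big) = v\cdot\nabla_x\big(r\,\phi_Q'(r)\big) = \frac{x\cdot v}{r}\,\big( \phi_Q'(r) + r\,\phi_Q''(r) \big), \]
\[ T(T(x\cdot v)) = -(x\cdot v) \Big( \phi_Q''(r) + \frac{3}{r}\phi_Q'(r) \Big) = -(x\cdot v) \Big( \Delta\phi_Q + \frac{\phi_Q'(r)}{r} \Big). \]
With $\Delta\phi_Q = \rho_Q$, the term $-(x\cdot v)\,q^2\,T(T(x\cdot v))$ equals $\xi^2\big(\rho_Q + \phi_Q'(r)/r\big)$, and the $\rho_Q\,\xi^2$ part cancels against the one in (B.6), leaving exactly the weight $\phi_Q'(r)/r$ of (4.6).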


We now insert this expression into (B.6) and directly get the desired Antonov's inequality (4.6), provided the following claim is proved:
$$\int_{|x\cdot v|>\varepsilon^3,\;|x|>\varepsilon} T\big((x\cdot v)\,q^2\,T(x\cdot v)\big)\,dx\,dv \to 0 \quad\text{as } \varepsilon\to 0. \tag{B.8}$$

Proof of (B.8). We shall in fact deal with the singularity at $x\cdot v = 0$ in the integral (B.8), recalling that $q(x,v) = \xi(x,v)/(x\cdot v)$. We observe that the function $|x|\,q(x,v)$ is bounded. To see this, let $x \neq 0$ and $R_x$ be the orthogonal transformation of $\mathbb{R}^3$ such that $R_x\frac{x}{|x|} = e_1$, where $e_1 = (1,0,0)^T$. Then, due to the spherical symmetry,
$$|x|\,q(x,v) = \frac{\xi(|x|e_1, R_xv)}{e_1\cdot R_xv} = \omega(|x|, R_xv), \quad\text{with}\quad \omega(r,v) = \frac{\xi(re_1,v)}{e_1\cdot v} = \frac{\tilde\xi(r,|v|,e_1\cdot v)}{e_1\cdot v}.$$
Now, we recall that $\xi$ is odd in $v$ and hence $\tilde\xi$ is odd with respect to the last coordinate, thus $\omega$ is bounded and so is $rq$. Note also that $q$ is smooth on $|x\cdot v|>\delta$, for all $\delta>0$. Let $\varepsilon>0$. We have
$$\int_{|x\cdot v|>\varepsilon^3,\;|x|>\varepsilon} T\big((x\cdot v)\,q^2\,T(x\cdot v)\big)\,dx\,dv = -2\varepsilon^3\int_{x\cdot v=\varepsilon^3,\;|x|>\varepsilon} q^2\,(T(x\cdot v))^2\,\frac{d\sigma_1(x,v)}{\sqrt{|x|^2+|v|^2}} - \frac{2}{\varepsilon}\int_{|x|=\varepsilon}\int_{x\cdot v>\varepsilon^3}(x\cdot v)^2q^2\,T(x\cdot v)\,d\sigma_2(x)\,dv,$$

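where $d\sigma_1(x,v)$ is the measure on the set $\{(x,v) \text{ s.t. } x\cdot v=\varepsilon^3 \text{ and } |x|>\varepsilon\}$ induced by the Lebesgue measure of $\mathbb{R}^6$, and $d\sigma_2(x)$ is the usual measure on the sphere $\{x\in\mathbb{R}^3;\ |x|=\varepsilon\}$. Now let $R>0$ such that $\mathrm{Supp}(\xi)\subset\{(x,v),\ |x|^2+|v|^2\le R^2\}$, then
$$\left|\int_{|x\cdot v|>\varepsilon^3,\;|x|>\varepsilon} T\big((x\cdot v)\,q^2\,T(x\cdot v)\big)\,dx\,dv\right| \le 2\varepsilon\int_{x\cdot v=\varepsilon^3,\;|x|>\varepsilon} r^2q^2\,(T(x\cdot v))^2\,\frac{d\sigma_1(x,v)}{\sqrt{|x|^2+|v|^2}} + \frac{2}{\varepsilon}\int_{|x|=\varepsilon}\int_{\mathbb{R}^3}\xi^2\,|T(x\cdot v)|\,dv\,d\sigma_2(x) \le 2\varepsilon\,|rq|^2_{L^\infty}\,I(\varepsilon,R) + \frac{C}{\varepsilon}\,\varepsilon^2,$$
where we have set
$$I(\varepsilon,R) = \int_{x\cdot v=\varepsilon^3,\;|x|^2+|v|^2<R^2}(T(x\cdot v))^2\,\frac{d\sigma_1(x,v)}{\sqrt{|x|^2+|v|^2}},$$
and where we have used in the last estimate that $rq$ is bounded and that $\xi$ is compactly supported. We claim that
$$I(\varepsilon,R) \le C_R, \tag{B.9}$$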


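where $C_R$ is independent of $\varepsilon$, which concludes the proof of (B.8). Indeed, we integrate by parts to get:
$$\begin{aligned}I(\varepsilon,R) &= -\int_{x\cdot v>\varepsilon^3,\;|x|^2+|v|^2<R^2}T(T(x\cdot v))\,dx\,dv + \int_{x\cdot v>\varepsilon^3,\;|x|^2+|v|^2=R^2}T(x\cdot v)\,\big(x\cdot v - v\cdot\nabla_x\phi_Q\big)\,\frac{d\sigma_3(x,v)}{\sqrt{|x|^2+|v|^2}}\\ &\le \int_{|x|^2+|v|^2<R^2}|T(T(x\cdot v))|\,dx\,dv + \int_{|x|^2+|v|^2=R^2}\big|T(x\cdot v)\,\big(x\cdot v - v\cdot\nabla_x\phi_Q\big)\big|\,\frac{d\sigma_3(x,v)}{\sqrt{|x|^2+|v|^2}} \le C_R,\end{aligned}$$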

where $d\sigma_3(x,v)$ denotes the usual measure on the sphere $|x|^2+|v|^2=R^2$. This concludes the proof of (B.9) and (4.6).

Appendix C. Proof of Lemma 4.2

Let us first prove that, for $h$ in the radial class of Lemma 4.2, the projection $Ph$ given by (4.16) is well-defined. From Lemma 2.1, the denominator in the definition (4.16) is finite and non zero for $(x,v)$ such that $(e(x,v),\ell(x,v))\in D$ and we have
$$|(Ph)(x,v)| \le \sup_{r_1\le r\le r_2}|h(r)|.$$

Let us now prove that h Fe belongs to L 2,r |F | . By a change of variable, we have

e

h 2 |Fe | d xdv r2 ∂ Q∗ −1/2 −1 aφ Q (s, ) − φ Q (r ) − 2 (s, ) (h(r ))2 dr dsd ∂s 2r L 0 r1 √ r2 s0 () r |h(r )|2 r1r2 ∂ Q∗ (s, ) ≤ −C dr dsd, √ ∂s (r − r1 )(r2 − r ) L 0 r1

√ = −4π 2 2

+∞

where we used (4.26), (4.27) for the support of $Q^*$ (recall the definition (4.25) of $L$) and (2.9) for $\phi_Q$. From (4.28), (4.37) and the radial Sobolev bound
$$r^{1/2}\,|h(r)| \le \left(\int_r^{+\infty}s^2\,(h'(s))^2\,ds\right)^{1/2} \le |\nabla h|_{L^2},$$

we deduce that 2 2 h |Fe | d xdv ≤ −C|∇h| L 2 ≤ C|∇h|2L 2

0

L 0

s0 () 0

∂ Q∗ (s, ) ∂s

r2

r1

√

1 dr dsd (r − r1 )(r2 − r )

d Q ∗ (0, ) √ ≤ C|∇h|2L 2 .

(C.1)


Now we prove that (Ph)Fe belongs to L 2,r |F | . The same change of variable as above gives e

(Ph)2 |Fe | d xdv √ s0 () ∂ Q ∗ (s, ) = −4π 2 2 ∂s L 0

2

−1/2 r2 −1 h(r )dr r1 aφ (s, ) − φ Q (r ) − 2 2r

Q

r2 −1 −1/2 dr r1 aφ (s, ) − φ Q (r ) − 2 Q

dsd.

2r

Hence, using the Cauchy-Schwarz inequality, we get 2 (Ph) |Fe | d xdv ≤ h 2 |Fe | d xdv, which concludes the proof of (4.17). Observe now from (4.16), (4.17) that (Ph)Fe is a L 2,r |Fe | function of e(x, v) and (x, v) and hence belongs to N (T ) from (4.3). It remains to prove that (h −Ph)Fe is orthogonal to N (T ). Indeed, let θ = θ (e(x, v), (x, v)) ∈ N (T ), then passing from the variables x, v to the variable r, e, , we get: (Ph)(x, v)θ (e(x, v), (x, v))d xdv √ = 2π 2 2 =

0

+∞ 0 eφ Q ,

r2

r1

−1/2 e − φ Q (r ) − 2 h(r )θ (e, )dr ded 2r

h(x, v)θ (e(x, v), (x, v))d xdv.

Hence $(h-Ph)F_e$ and $\theta$ are orthogonal for the $L^{2,r}_{|F_e|}$ scalar product and (4.18) follows. This ends the proof of Lemma 4.2.

Communicated by P. Constantin

Commun. Math. Phys. 302, 225–252 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1184-7


On Asymptotic Stability of Moving Kink for Relativistic Ginzburg-Landau Equation

E. A. Kopylova (1), A. I. Komech (1,2)

1 Institute for Information Transmission Problems RAS, B. Karetnyi 19, GSP-4, Moscow 101447, Russia
2 Fakultät für Mathematik, Universität Wien, Norbergslavasse 15, 1090 Wien, Austria.
E-mail: [email protected]; [email protected]

Received: 22 October 2009 / Accepted: 28 September 2010
Published online: 13 January 2011 – © Springer-Verlag 2011

Supported partly by the grants of DFG, FWF and RFBR.
Supported partly by the Alexander von Humboldt Research Award.

Abstract: We prove the asymptotic stability of the moving kinks for the nonlinear relativistic wave equations in one space dimension with a Ginzburg-Landau potential: starting in a small neighborhood of the kink, the solution, asymptotically in time, is the sum of a uniformly moving kink and dispersive part described by the free Klein-Gordon equation. The remainder decays in a global energy norm. Our recent results on the weighted energy decay for the Klein-Gordon equations play a crucial role in the proofs.

1. Introduction

There has been widespread interest in the dynamics of topological excitations of classical relativistic field theories [2,3]. These excitations are finite energy solutions which do not decay to one of the true ground states because of topological constraints, said differently, these excitations are separated by an infinitely high potential barrier from the ground state. In our contribution we will study in mathematical detail one of the simplest examples. The field, $\psi$, is real valued and defined on the line, $\psi:\mathbb{R}\to\mathbb{R}$. The Hamiltonian function reads
$$\mathcal{H}(\psi,\pi) = \int_{\mathbb{R}}\left(\frac12|\pi(x)|^2 + \frac12|\psi'(x)|^2 + U(\psi(x))\right)dx \tag{1.1}$$
with $\pi$ the momentum canonically conjugate to $\psi$ and a smooth potential $U$. This leads to the equation of motion
$$\ddot\psi(x,t) = \psi''(x,t) + F(\psi(x,t)), \quad x\in\mathbb{R}, \tag{1.2}$$


where $F(\psi) = -U'(\psi)$. For an introduction, let us consider the Ginzburg-Landau quartic double well potential of the form $U(\psi) = (\psi^2-a^2)^2/(4a^2)$. Then the topological excitations are defined through $\mathcal{H}<\infty$ and the boundary conditions
$$\lim_{x\to\pm\infty}\psi(x,t) = \pm a \tag{1.3}$$
with a fixed $a>0$. Amongst them there are soliton-like solutions which travel with constant velocity,
$$\psi(x,t) = a\tanh\left(\gamma\,\frac{x-vt-q}{\sqrt{2}}\right),$$
where $\gamma = 1/\sqrt{1-v^2}$ is the Lorentz contraction. The solitons are related by a Lorentz boost, since Eq. (1.2) is relativistically invariant. We will consider more general double well potentials for which
$$U(\pm a) = U'(\pm a) = 0, \qquad U''(\pm a) > 0, \tag{1.4}$$
and
$$U(\psi) > 0 \quad\text{for } \psi\in(-a,a), \tag{1.5}$$
similarly to the quartic potential. In this case the soliton-like solutions also exist,
$$\psi(x,t) = s(\gamma(x-vt-q)), \qquad v,q\in\mathbb{R}, \quad |v|<1, \tag{1.6}$$
where $s(\cdot)$ is a “kink” solution to the corresponding stationary equation
$$s''(x) - U'(s(x)) = 0, \qquad s(\pm\infty) = \pm a. \tag{1.7}$$
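For the quartic potential, the stationary equation (1.7) and the boosted form (1.6) of the kink can be checked numerically. The Python sketch below is an illustration only: the values $a = 1.5$, $v = 0.6$, $q = 0.3$, the sample grid and the finite-difference step are arbitrary assumptions, and the helper names are hypothetical. It evaluates the residual $s'' - U'(s)$ for $s(x) = a\tanh(x/\sqrt2)$ and the residual $\ddot\psi - \psi'' - F(\psi)$ of Eq. (1.2) for $\psi(x,t) = s(\gamma(x-vt-q))$.

```python
import math

A = 1.5  # assumed value of the parameter a > 0

def U_prime(psi):
    # Derivative of U(psi) = (psi^2 - a^2)^2 / (4 a^2).
    return psi * (psi * psi - A * A) / (A * A)

def s(x):
    # Static kink profile for the quartic potential.
    return A * math.tanh(x / math.sqrt(2.0))

def static_residual(x, h=1e-3):
    # s''(x) - U'(s(x)) with a central second difference; should vanish by (1.7).
    s_xx = (s(x + h) - 2.0 * s(x) + s(x - h)) / (h * h)
    return s_xx - U_prime(s(x))

def moving_residual(x, t, v=0.6, q=0.3, h=1e-3):
    # psi_tt - psi_xx - F(psi), F = -U', for psi(x, t) = s(gamma (x - v t - q)).
    gamma = 1.0 / math.sqrt(1.0 - v * v)

    def psi(xx, tt):
        return s(gamma * (xx - v * tt - q))

    psi_tt = (psi(x, t + h) - 2.0 * psi(x, t) + psi(x, t - h)) / (h * h)
    psi_xx = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / (h * h)
    return psi_tt - psi_xx + U_prime(psi(x, t))

grid = [0.25 * k for k in range(-20, 21)]
max_static = max(abs(static_residual(x)) for x in grid)
max_moving = max(abs(moving_residual(x, 0.7)) for x in grid)
```

Both residuals are at the level of the finite-difference error, which reflects the relation $\gamma^2(1-v^2) = 1$ behind the Lorentz invariance of Eq. (1.2).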

In general our goal is to clarify the special role of the soliton-like solutions (1.6) as long time asymptotics for any finite energy topological excitations satisfying (1.3). Namely, if one chooses some arbitrary finite energy initial state satisfying (1.3), one would expect that for $t\to\infty$ the solution separates into two pieces: one piece is a finite collection of travelling solitons of the form (1.6) and their negatives with some velocities $v_j\in(-1,1)$ and the shifts $q_j$ depending in a complicated way on the initial data, and the second radiative piece which is a dispersive solution to the free Klein-Gordon equation which propagates to infinity with the velocity 1. Our aim here is to elucidate this general picture by mathematical arguments for initial data sufficiently close to a soliton (1.6).

Let us discuss our choice of the smooth potentials $U$. The condition (1.5) is necessary and sufficient for the existence of a finite energy static solution $s(x)$ to (1.7) when (1.4) holds. Indeed, the condition is obviously sufficient. On the other hand, the “energy conservation”
$$(s'(x))^2/2 - U(s(x)) = E \tag{1.8}$$
and $s(\pm\infty)=\pm a$ imply that $E=0$. Therefore, $U(\psi)>0$ for $\psi\in(-a,a)$ since otherwise the boundary conditions $s(\pm\infty)=\pm a$ would fail. As a byproduct, our kink solution is monotone increasing, and
$$s'(x) > 0, \qquad x\in\mathbb{R}. \tag{1.9}$$
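With $E = 0$ in (1.8) and $s' > 0$ from (1.9), the kink solves the first-order equation $s' = \sqrt{2U(s)}$. The sketch below integrates this equation for the quartic potential with a classical Runge-Kutta scheme and compares the result with $a\tanh(x/\sqrt2)$. It is an illustration under assumptions not made in the paper: the value $a = 1.5$, the normalisation $s(0) = 0$ (which fixes the translation of the kink), the step count and the function names are all hypothetical choices.

```python
import math

A = 1.5  # assumed value of the parameter a > 0

def U(psi):
    # Quartic double well potential U(psi) = (psi^2 - a^2)^2 / (4 a^2).
    return (psi * psi - A * A) ** 2 / (4.0 * A * A)

def rhs(s_val):
    # E = 0 in (1.8) together with s' > 0 gives s' = sqrt(2 U(s)).
    return math.sqrt(2.0 * U(s_val))

def integrate_kink(x_end, n_steps=600):
    """Classical RK4 for s' = sqrt(2 U(s)) on [0, x_end], starting from s(0) = 0."""
    h = x_end / n_steps
    s_val = 0.0
    for _ in range(n_steps):
        k1 = rhs(s_val)
        k2 = rhs(s_val + 0.5 * h * k1)
        k3 = rhs(s_val + 0.5 * h * k2)
        k4 = rhs(s_val + h * k3)
        s_val += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s_val

x_end = 3.0
numeric = integrate_kink(x_end)
exact = A * math.tanh(x_end / math.sqrt(2.0))
```

The computed profile stays strictly between $0$ and $a$ and increases with $x$, in line with (1.9).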


Let us note that only the behavior of $U$ near the interval $[-a,a]$ is of importance since the solution is expected to be close to a soliton. However, we will assume additionally the potential to be bounded from below,
$$\inf_{\psi\in\mathbb{R}}U(\psi) > -\infty, \tag{1.10}$$
to have a well posed Cauchy problem for all finite energy initial states. Summarising, we formulate our first basic condition on the potential, for technical reasons adding a flatness condition.

Condition U1. The potential $U$ is a real smooth function which satisfies (1.4), (1.5), (1.10), and the following condition holds with some $m>0$,
$$U(\psi) = \frac{m^2}{2}(\psi\mp a)^2 + O(|\psi\mp a|^{14}), \qquad \psi\to\pm a. \tag{1.11}$$

Let us comment on the condition (1.11) (see also Remark 4.10). First, the condition means that $U''(-a) = U''(a)$, though we do not need the potential to be reflection symmetric. We consider the solutions close to the kink, $\psi(x,t) = s(\gamma(x-vt-q)) + \phi(x,t)$, with small perturbations $\phi(x,t)$. For such solutions the condition (1.11) and the asymptotics (1.3) mean that Eq. (1.2) is almost a linear Klein-Gordon equation for large $|x|$, which is helpful for application of the dispersive properties. Finally, we expect that the degree 14 in (1.11) is technical, and a smaller degree should be sufficient. Let us note that a similar condition has been introduced in [4,5] in the context of the Schrödinger equation.

Further we need some assumptions on the spectrum of the linearised equation. Let us rewrite Eq. (1.2) in the vector form,
$$\begin{cases}\dot\psi(x,t) = \pi(x,t)\\ \dot\pi(x,t) = \psi''(x,t) + F(\psi(x,t))\end{cases} \qquad x\in\mathbb{R}. \tag{1.12}$$
Now the soliton-like solutions (1.6) become
$$Y_{q,v}(t) = (\psi_v(x-vt-q),\ \pi_v(x-vt-q)) \tag{1.13}$$
for $q,v\in\mathbb{R}$ with $|v|<1$, where
$$\psi_v(x) = s(\gamma x), \qquad \pi_v(x) = -v\,\psi_v'(x). \tag{1.14}$$
The states $S_{q,v} := Y_{q,v}(0)$ form the solitary manifold
$$\mathcal{S} := \{S_{q,v} : q,v\in\mathbb{R},\ |v|<1\}. \tag{1.15}$$
The linearized operator near the soliton solution $Y_{q,v}(t)$ is (see Sect. 4, formula (4.20))
$$A_v = \begin{pmatrix} v\nabla & 1\\ \Delta - m^2 - V_v(y) & v\nabla\end{pmatrix}, \qquad \nabla = \frac{d}{dx}, \quad \Delta = \frac{d^2}{dx^2},$$
where
$$V_v(x) = -F'(\psi_v(x)) - m^2 = U''(\psi_v(x)) - m^2. \tag{1.16}$$


By (1.7) and condition U1, we have
$$V_v(x) \sim C\,(s(\gamma x)\mp a)^{12} \sim C\,e^{-12m\gamma|x|}, \qquad x\to\pm\infty, \tag{1.17}$$
since
$$s(x)\mp a \sim C\,e^{-m|x|}, \qquad x\to\pm\infty. \tag{1.18}$$
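The exponential rate in (1.18) depends only on $m^2 = U''(\pm a)$, so it can be illustrated with the quartic kink even though, as noted further on, the quartic potential does not satisfy (1.11). For $U(\psi) = (\psi^2-a^2)^2/(4a^2)$ one has $U''(\pm a) = 2$, hence $m = \sqrt2$, and $s(x) - a = -2a\,e^{-mx}/(1+e^{-mx})$. The sketch below evaluates the ratio $(s(x)-a)/e^{-mx}$ at a few points; the value $a = 1.5$ and the sample points are illustrative assumptions.

```python
import math

A = 1.5             # assumed value of the parameter a > 0
M = math.sqrt(2.0)  # m^2 = U''(+-a) = 2 for U = (psi^2 - a^2)^2 / (4 a^2)

def s(x):
    # Static kink for the quartic potential.
    return A * math.tanh(x / math.sqrt(2.0))

def tail_ratio(x):
    # (s(x) - a) / exp(-m x); tends to the constant C = -2a as x -> +infinity.
    return (s(x) - A) / math.exp(-M * x)

ratios = [tail_ratio(x) for x in (4.0, 6.0, 8.0)]
gaps = [abs(ratio + 2.0 * A) for ratio in ratios]
```

The gaps to the limiting constant $-2a$ shrink as $x$ grows, consistent with the asymptotics (1.18) on the right half-line.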

In Sect. 4 we show that the spectral properties of the operator $A_v$ are determined by the corresponding properties of its determinant, which is the Schrödinger operator
$$H_v = -(1-v^2)\Delta + m^2 + V_v. \tag{1.19}$$
The spectral properties of $H_v$ are identical for all $v\in(-1,1)$ since the relation $V_v(x) = V_0(\gamma x)$ implies
$$H_v = T_v^{-1}H_0T_v, \quad\text{where } T_v: \psi(x)\mapsto\psi(x/\gamma). \tag{1.20}$$
This equivalence manifests the relativistic invariance of Eq. (1.12). The continuous spectrum of the operator $H_v$ coincides with $[m^2,\infty)$. The point 0 belongs to the discrete spectrum with corresponding eigenfunction $\psi_v'$. By (1.14) and (1.9) we have $\psi_v'(x) = \gamma s'(\gamma x) > 0$ for $x\in\mathbb{R}$. Hence, $\psi_v'$ is the groundstate, and all remaining discrete spectrum is contained in $(0,m^2]$.

For $\alpha\in\mathbb{R}$, $p\ge1$, and $l = 0,1,2,\dots$ let us denote by $W^{l,p}_\alpha$ the weighted Sobolev space of the functions with the finite norm
$$\|\psi\|_{W^{l,p}_\alpha} = \sum_{k=0}^{l}\|(1+|x|)^\alpha\psi^{(k)}\|_{L^p} < \infty.$$
Denote $H^l_\alpha := W^{l,2}_\alpha$, so $L^2_\alpha := H^0_\alpha$ are the Agmon's weighted spaces.

Definition 1.1 (cf. [9,16]). A nonzero solution $\psi\in L^2_{-1/2-0}(\mathbb{R})\setminus L^2(\mathbb{R})$ to $H_v\psi = m^2\psi$ is called a resonance.

Now we can formulate our second basic condition on the potential.

Condition U2. For any $v\in(-1,1)$,
i) 0 is the only eigenvalue of $H_v$.
ii) $m^2$ is not a resonance of $H_v$.

We show that Condition U2 implies the boundedness of the resolvent of the operator $A_v$ in the corresponding weighted Agmon spaces at the edge points $\pm im/\gamma$ of its continuous spectrum. Both conditions U1, U2 can be satisfied though it is non-obvious. Let us note that the quartic Ginzburg-Landau potential does not satisfy (1.11) and condition U2. We will prove elsewhere that the corresponding examples of potentials satisfying both U1 and U2 can be constructed as smoothened piece-wise quadratic potentials.

We now can formulate the main result of our paper. Namely, we will prove the following asymptotics:
$$(\psi(x,t),\pi(x,t)) \sim (\psi_{v_\pm}(x-v_\pm t-q_\pm),\ \pi_{v_\pm}(x-v_\pm t-q_\pm)) + W_0(t)\Phi_\pm, \qquad t\to\pm\infty \tag{1.21}$$


for solutions to (1.12) with initial states close to a soliton-like solution (1.13). Here $W_0(t)$ is the dynamical group of the free Klein-Gordon equation, $\Phi_\pm$ are the corresponding asymptotic states, and the remainder converges to zero $\sim t^{-1/2}$ in the global energy norm of the Sobolev space $H^1(\mathbb{R})\oplus L^2(\mathbb{R})$.

Let us comment on previous results in this field.

• Orbital stability of the kinks. For 1D relativistic nonlinear Ginzburg-Landau equations (1.2) the orbital stability of the kinks has been proved in [10].
• The Schrödinger equation. The asymptotics of type (1.21) were established for the first time by Soffer and Weinstein [23,24] (see also [19]) for nonlinear $U(1)$-invariant Schrödinger equation with a potential for small initial states and sufficiently small nonlinear coupling constant. The results have been extended by Buslaev and Perelman [4] to the translation invariant 1D nonlinear $U(1)$-invariant Schrödinger equation. The novel techniques [4] are based on the “separation of variables” along the solitary manifold and in the transversal directions. The symplectic projection allows one to exclude the unstable directions corresponding to the zero discrete spectrum of the linearized dynamics. Similar techniques were developed by Miller, Pego and Weinstein for the 1D modified KdV and RLW equations, [17,18]. The extensions to higher dimensions were obtained in [6,12,22,27].
• Nonrelativistic Klein-Gordon equations. The asymptotics of type (1.21) were extended to the nonlinear 3D Klein-Gordon equations with a potential [25], and for translation invariant system of the 3D Klein-Gordon equation coupled to a particle [11].
• Wave front of 3D Ginzburg-Landau equation. The asymptotic stability of wave front was proved for 3D relativistic Ginzburg-Landau equation with initial data which differ from the wave front on a compact set [7]. The wave front is the solution which depends on one space variable only, so it is not a soliton. The equation differs from the 1D equation (1.2) by the additional 2D Laplacian which improves the dispersive decay for the corresponding linearized Klein-Gordon equation in the continuous spectral space.

Proving the asymptotic stability of the solitons and kinks for relativistic equations remained an open problem until now. The investigation crucially depends on the spectral properties for the linearized equation which are completely unknown for higher dimensions. For the 1D case the main obstacle was the slow decay $\sim t^{-1/2}$ for the free 1D Klein-Gordon equation (see the discussion in [7, Introduction]).

Let us comment on our approach. We follow the general strategy of [4–7,11,25]: symplectic projection onto the solitary manifold, modulation equations, linearization of the transversal equations and further Taylor expansion of the nonlinearity, etc. We develop for relativistic equations a general scheme which is common in almost all papers in this area: dispersive estimates for the solutions to the linearized equation, virial and $L^1$-$L^\infty$ estimates and the method of majorants. However, the corresponding statements and their proofs in the context of relativistic equations are completely new.

Let us comment on our novel techniques.

I. The decay $\sim t^{-3/2}$ from Theorem 4.7 for the linearized transversal dynamics relies on our novel approach [13,14] to the 1D Klein-Gordon equation.
II. The novel “virial type” estimate (4.42) is the relativistic version of the bound [5, (1.2.5)] used in [5] in the context of the nonlinear Schrödinger equation (see Remark 4.10).


III. We establish an appropriate relativistic version (4.31) of $L^1\to L^\infty$ estimates. Both estimates (4.42) and (4.31) play a crucial role in obtaining the bounds for the majorants.
IV. Finally, we give the complete proof of the soliton asymptotics (1.21). In the context of the Schrödinger equation, the proof of the corresponding asymptotics were sketched in [5].

Our paper is organized as follows. In Sect. 2 we formulate the main theorem. In Sect. 3 we introduce the symplectic projection onto the solitary manifold. The linearized equation is defined in Sect. 4. In Sect. 5 we split the dynamics in two components: along the solitary manifold and in the transversal directions. In Sect. 6 the modulation equations for the parameters of the soliton are displayed. The time decay of the transversal component is established in Sects. 7-11. Finally, in Sect. 12 we obtain the soliton asymptotics (1.21).

2. Main Results

2.1. Existence of dynamics. We consider the Cauchy problem for the Hamilton system (1.12) which we write as
$$\dot Y(t) = \mathcal{F}(Y(t)), \quad t\in\mathbb{R}: \qquad Y(0) = Y_0. \tag{2.1}$$
Here $Y(t) = (\psi(t),\pi(t))$, $Y_0 = (\psi_0,\pi_0)$, and all derivatives are understood in the sense of distributions. To formulate our results precisely, let us introduce a suitable phase space for the Cauchy problem (2.1).

Definition 2.1. i) $E_\alpha := H^1_\alpha\oplus L^2_\alpha$ is the space of the states $Y = (\psi,\pi)$ with finite norm
$$\|Y\|_{E_\alpha} = \|\psi\|_{H^1_\alpha} + \|\pi\|_{L^2_\alpha} < \infty. \tag{2.2}$$
ii) The phase space $\mathcal{E} := \mathcal{S} + E$, where $E = E_0$ and $\mathcal{S}$ is defined in (1.15). The metric in $\mathcal{E}$ is defined as
$$\rho_{\mathcal{E}}(Y_1,Y_2) = \|Y_1-Y_2\|_E, \qquad Y_1,Y_2\in\mathcal{E}. \tag{2.3}$$
iii) $W := W^{2,1}_0\oplus W^{1,1}_0$ is the space of the states $Y = (\psi,\pi)$ with the finite norm
$$\|Y\|_W = \|\psi\|_{W^{2,1}_0} + \|\pi\|_{W^{1,1}_0} < \infty. \tag{2.4}$$

Obviously, the Hamilton functional (1.1) is continuous on the phase space $\mathcal{E}$. The existence and uniqueness of the solutions to the Cauchy problem (2.1) follows by methods [15,20,26]:

Proposition 2.2. (i) For any initial data $Y_0\in\mathcal{E}$ there exists the unique solution $Y(t)\in C(\mathbb{R},\mathcal{E})$ to the problem (2.1).
(ii) For every $t\in\mathbb{R}$, the map $U(t): Y_0\mapsto Y(t)$ is continuous in $\mathcal{E}$.
(iii) The energy is conserved, i.e.
$$\mathcal{H}(Y(t)) = \mathcal{H}(Y_0), \qquad t\in\mathbb{R}. \tag{2.5}$$


2.2. Solitary manifold and main result. Let us consider the solitons (1.14). The substitution to (1.12) gives the following stationary equations:
$$\begin{cases}-v\,\psi_v'(y) = \pi_v(y),\\ -v\,\pi_v'(y) = \psi_v''(y) + F(\psi_v(y)).\end{cases} \tag{2.6}$$

Definition 2.3. A soliton state is $S(\sigma) := (\psi_v(x-b),\pi_v(x-b))$, where $\sigma := (b,v)$ with $b\in\mathbb{R}$ and $v\in(-1,1)$.

Obviously, the soliton solution (1.13) admits the representation $S(\sigma(t))$, where
$$\sigma(t) = (b(t),v(t)) = (vt+q,\ v). \tag{2.7}$$

Definition 2.4. A solitary manifold is the set $\mathcal{S} := \{S(\sigma): \sigma\in\Sigma := \mathbb{R}\times(-1,1)\}$.

The main result of our paper is the following theorem.

Theorem 2.5. Let the conditions U1 and U2 hold, and $Y(t)$ be the solution to the Cauchy problem (2.1) with an initial state $Y_0\in\mathcal{E}$ which is close to a kink $S(\sigma_0) = S_{q_0,v_0}$:
$$Y_0 = S(\sigma_0) + X_0, \qquad d_0 := \|X_0\|_{E_\beta\cap W} \ll 1, \tag{2.8}$$
where $\beta > 5/2$. Then for $d_0$ sufficiently small the solution admits the asymptotics:
$$Y(x,t) = (\psi_{v_\pm}(x-v_\pm t-q_\pm),\ \pi_{v_\pm}(x-v_\pm t-q_\pm)) + W_0(t)\Phi_\pm + r_\pm(x,t), \qquad t\to\pm\infty, \tag{2.9}$$
where $v_\pm$ and $q_\pm$ are constants, $\Phi_\pm\in E$, and $W_0(t)$ is the dynamical group of the free Klein-Gordon equation, while
$$\|r_\pm(t)\|_E = O(|t|^{-1/2}). \tag{2.10}$$

It suffices to prove the asymptotics (2.9) for $t\to+\infty$ since the system (1.12) is time reversible.

3. Symplectic Projection

3.1. Symplectic structure and hamiltonian form. The system (2.1) reads as the Hamilton system
$$\dot Y = J\,D\mathcal{H}(Y), \qquad J := \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}, \qquad Y = (\psi,\pi)\in\mathcal{E}, \tag{3.1}$$
where $D\mathcal{H}$ is the Fréchet derivative of the Hamilton functional (1.1). Let us identify the tangent space of $\mathcal{E}$, at every point, with the space $E$. Consider the symplectic form $\Omega$ on $E$ defined by
$$\Omega(Y_1,Y_2) = \langle Y_1, JY_2\rangle, \qquad Y_1,Y_2\in E, \tag{3.2}$$
where
$$\langle Y_1,Y_2\rangle := \langle\psi_1,\psi_2\rangle + \langle\pi_1,\pi_2\rangle$$
and $\langle\psi_1,\psi_2\rangle = \int\psi_1(x)\psi_2(x)\,dx$, etc. It is clear that the form $\Omega$ is non-degenerate, i.e. $\Omega(Y_1,Y_2) = 0$ for every $Y_2\in E$ $\Rightarrow$ $Y_1 = 0$.

Definition 3.1. i) The symbol $Y_1\nmid Y_2$ means that $Y_1\in E$, $Y_2\in E$, and $Y_1$ is symplectic orthogonal to $Y_2$, i.e. $\Omega(Y_1,Y_2) = 0$.
ii) A projection operator $P: E\to E$ is said to be symplectic orthogonal if $Y_1\nmid Y_2$ for $Y_1\in\mathrm{Ker}\,P$ and $Y_2\in\mathrm{Range}\,P$.


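3.2. Symplectic projection onto solitary manifold. Let us consider the tangent space $T_{S(\sigma)}\mathcal{S}$ of the manifold $\mathcal{S}$ at a point $S(\sigma)$. The vectors
$$\tau_1 = \tau_1(v) := \partial_bS(\sigma) = (-\psi_v'(y),\ -\pi_v'(y)), \qquad \tau_2 = \tau_2(v) := \partial_vS(\sigma) = (\partial_v\psi_v(y),\ \partial_v\pi_v(y)) \tag{3.3}$$
form a basis in $T_{S(\sigma)}\mathcal{S}$. Here $y := x-b$ is the “moving frame coordinate”. Let us stress that the functions $\tau_j$ are always regarded as functions of $y$ rather than those of $x$. Formula (1.14) implies that
$$\tau_j(v)\in E_\alpha, \qquad v\in(-1,1), \quad j = 1,2, \quad \forall\alpha\in\mathbb{R}. \tag{3.4}$$

Lemma 3.2. The symplectic form $\Omega$ is nondegenerate on the tangent space $T_{S(\sigma)}\mathcal{S}$, i.e. $T_{S(\sigma)}\mathcal{S}$ is a symplectic subspace.

Proof. Let us compute the vectors $\tau_1$ and $\tau_2$. Recall that $\psi_v(y) = s(\gamma y)$ and $\pi_v = -v\psi_v'(y) = -v\gamma s'(\gamma y)$ with $\gamma = 1/\sqrt{1-v^2}$. Then
$$\tau_1 = (\tau_1^1,\tau_1^2) = \big(-\gamma s'(\gamma y),\ v\gamma^2s''(\gamma y)\big), \qquad \tau_2 = (\tau_2^1,\tau_2^2) = \big(vy\gamma^3s'(\gamma y),\ -\gamma^3s'(\gamma y) - v^2y\gamma^4s''(\gamma y)\big).$$
Therefore
$$\Omega(\tau_1,\tau_2) = \langle\tau_1^1,\tau_2^2\rangle - \langle\tau_1^2,\tau_2^1\rangle = \gamma^4\langle s'(\gamma y),s'(\gamma y)\rangle > 0. \tag{3.5}$$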
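The value of the symplectic form in (3.5) can be confirmed by quadrature for the quartic kink $s(x) = a\tanh(x/\sqrt2)$, for which $\gamma^4\langle s'(\gamma y),s'(\gamma y)\rangle = \gamma^3\int (s')^2\,dx = \gamma^3\,\tfrac{2\sqrt2}{3}\,a^2$. The sketch below assumes the convention $\Omega(Y_1,Y_2) = \langle\psi_1,\pi_2\rangle - \langle\pi_1,\psi_2\rangle$ that follows from (3.2) with the matrix $J$ of (3.1); the values $a = 1.5$, $v = 0.6$, the truncation length and the helper names are illustrative assumptions.

```python
import math

A, V = 1.5, 0.6                       # assumed sample values of a and v
GAMMA = 1.0 / math.sqrt(1.0 - V * V)
R2 = math.sqrt(2.0)

def sp(x):
    # s'(x) for s(x) = a tanh(x / sqrt 2).
    return A / (R2 * math.cosh(x / R2) ** 2)

def spp(x):
    # s''(x) for the same profile.
    return -A * math.tanh(x / R2) / math.cosh(x / R2) ** 2

def tau1(y):
    # Components of tau_1 as computed in the proof of Lemma 3.2.
    return (-GAMMA * sp(GAMMA * y), V * GAMMA**2 * spp(GAMMA * y))

def tau2(y):
    # Components of tau_2 as computed in the proof of Lemma 3.2.
    g = GAMMA
    return (V * y * g**3 * sp(g * y),
            -g**3 * sp(g * y) - V * V * y * g**4 * spp(g * y))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def omega(t1, t2, L=30.0):
    # Omega(Y1, Y2) = <psi1, pi2> - <pi1, psi2>, truncated to [-L, L].
    return simpson(lambda y: t1(y)[0] * t2(y)[1] - t1(y)[1] * t2(y)[0], -L, L)

value = omega(tau1, tau2)
expected = GAMMA**3 * (2.0 * R2 / 3.0) * A * A
```

The cross terms containing $y\,s'(\gamma y)\,s''(\gamma y)$ cancel exactly, which is why only the positive term $\gamma^4\langle s',s'\rangle$ survives.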

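Now we show that in a small neighborhood of the soliton manifold $\mathcal{S}$ a “symplectic orthogonal projection” onto $\mathcal{S}$ is well-defined. Let us introduce the translations $T_q: (\psi(x),\pi(x))\mapsto(\psi(x-q),\pi(x-q))$, $q\in\mathbb{R}$. Note that the manifold $\mathcal{S}$ is invariant with respect to the translations.

Definition 3.3. For any $\bar v<1$ denote by $\Sigma(\bar v) = \{\sigma = (b,v): b\in\mathbb{R},\ |v|\le\bar v\}$.

Let us note that $\mathcal{S}\subset\mathcal{E}_\alpha$ with $\alpha<-1/2$.

Lemma 3.4. Let $\alpha<-1/2$ and $\bar v<1$. Then
i) there exists a neighborhood $\mathcal{O}_\alpha(\mathcal{S})$ of $\mathcal{S}$ in $\mathcal{E}_\alpha$ and a mapping $\boldsymbol{\Pi}: \mathcal{O}_\alpha(\mathcal{S})\to\mathcal{S}$ such that $\boldsymbol{\Pi}$ is uniformly continuous on $\mathcal{O}_\alpha(\mathcal{S})$ in the metric of $\mathcal{E}_\alpha$,
$$\boldsymbol{\Pi}Y = Y \ \text{ for } Y\in\mathcal{S}, \quad\text{and}\quad Y-S\nmid T_S\mathcal{S}, \ \text{ where } S = \boldsymbol{\Pi}Y. \tag{3.6}$$
ii) $\mathcal{O}_\alpha(\mathcal{S})$ is invariant with respect to the translations $T_q$, and
$$\boldsymbol{\Pi}T_qY = T_q\boldsymbol{\Pi}Y, \quad\text{for } Y\in\mathcal{O}_\alpha(\mathcal{S}) \text{ and } q\in\mathbb{R}. \tag{3.7}$$
iii) For any $\bar v<1$ there exists an $r_\alpha(\bar v)>0$ s.t. $S(\sigma)+X\in\mathcal{O}_\alpha(\mathcal{S})$ if $\sigma\in\Sigma(\bar v)$ and $\|X\|_{E_\alpha}<r_\alpha(\bar v)$.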


Proof. We have to find $\sigma = \sigma(Y)$ such that $S(\sigma) = \boldsymbol{\Pi}Y$ and
$$\Omega(Y-S(\sigma),\ \partial_{\sigma_j}S(\sigma)) = 0, \qquad j = 1,2. \tag{3.8}$$
Let us fix an arbitrary $\sigma^0\in\Sigma$ and note that the system (3.8) involves two smooth scalar functions of $Y$. Then for $Y$ close to $S(\sigma^0)$, the existence of $\sigma$ follows by the standard finite dimensional implicit function theorem if we show that the $2\times2$ Jacobian matrix with elements $M_{lj}(Y) = \partial_{\sigma_l}\Omega(Y-S(\sigma^0),\partial_{\sigma_j}S(\sigma^0))$ is non-degenerate at $Y = S(\sigma^0)$. First note that all the derivatives exist by (3.4). The non-degeneracy holds by Lemma 3.2 and the definition (3.3) since $M_{lj}(S(\sigma^0)) = -\Omega(\partial_{\sigma_l}S(\sigma^0),\partial_{\sigma_j}S(\sigma^0))$. Thus, there exists some neighborhood $\mathcal{O}_\alpha(S(\sigma^0))$ of $S(\sigma^0)$, where $\boldsymbol{\Pi}$ is well defined and satisfies (3.6), and the same is true in the union $\mathcal{O}_\alpha(\mathcal{S}) = \cup_{\sigma^0\in\Sigma}\mathcal{O}_\alpha(S(\sigma^0))$. The identity (3.7) holds for $Y, T_qY\in\mathcal{O}_\alpha(\mathcal{S})$, since the form $\Omega$ and the manifold $\mathcal{S}$ are invariant with respect to the translations. It remains to modify $\mathcal{O}_\alpha(\mathcal{S})$ by the translations: we set $\mathcal{O}_\alpha(\mathcal{S}) = \cup_{b\in\mathbb{R}}T_b\mathcal{O}_\alpha(\mathcal{S})$. Then the second statement obviously holds. The last two statements and the uniform continuity in the first statement follow by translation invariance and the compactness arguments.

We refer to $\boldsymbol{\Pi}$ as the symplectic orthogonal projection onto $\mathcal{S}$.

4. Linearization on the Solitary Manifold

Let us consider a solution to the system (1.12), and split it as the sum
$$Y(t) = S(\sigma(t)) + X(t), \tag{4.1}$$
where $\sigma(t) = (b(t),v(t))\in\Sigma$ is an arbitrary smooth function of $t\in\mathbb{R}$. In detail, denote $Y = (\psi,\pi)$ and $X = (\Psi,\Pi)$. Then (4.1) means that
$$\begin{cases}\psi(x,t) = \psi_{v(t)}(x-b(t)) + \Psi(x-b(t),t),\\ \pi(x,t) = \pi_{v(t)}(x-b(t)) + \Pi(x-b(t),t).\end{cases} \tag{4.2}$$
Let us substitute (4.2) to (1.12), and linearize the equations in $X$. Setting $y = x-b(t)$ which is the “moving frame coordinate”, we obtain that
$$\begin{cases}\dot\psi = \dot v\,\partial_v\psi_v(y) - \dot b\,\psi_v'(y) + \dot\Psi(y,t) - \dot b\,\Psi'(y,t) = \pi_v(y) + \Pi(y,t),\\ \dot\pi = \dot v\,\partial_v\pi_v(y) - \dot b\,\pi_v'(y) + \dot\Pi(y,t) - \dot b\,\Pi'(y,t) = \psi_v''(y) + \Psi''(y,t) + F(\psi_v(y)+\Psi(y,t)).\end{cases} \tag{4.3}$$
Using Eq. (2.6), we obtain from (4.3) the following equations for the components of the vector $X(t)$:
$$\begin{cases}\dot\Psi(y,t) = \Pi(y,t) + \dot b\,\Psi'(y,t) + (\dot b-v)\,\psi_v'(y) - \dot v\,\partial_v\psi_v(y),\\ \dot\Pi(y,t) = \Psi''(y,t) + \dot b\,\Pi'(y,t) + (\dot b-v)\,\pi_v'(y) - \dot v\,\partial_v\pi_v(y) + F(\psi_v(y)+\Psi(y,t)) - F(\psi_v(y)).\end{cases} \tag{4.4}$$
We can write Eq. (4.4) as
$$\dot X(t) = A(t)X(t) + T(t) + N(t), \qquad t\in\mathbb{R}, \tag{4.5}$$


E. A. Kopylova, A. I. Komech

where $T(t)$ is the sum of terms which do not depend on $X$, and $N(t)$ is at least quadratic in $X$. The linear operator $A(t) = A_{v,w}$ depends on two parameters, $v = v(t)$ and $w = \dot b(t)$, and can be written in the form

$$A_{v,w} := \begin{pmatrix} w\nabla & 1 \\ \Delta + F'(\psi_v) & w\nabla \end{pmatrix} = \begin{pmatrix} w\nabla & 1 \\ \Delta - m^2 - V_v(y) & w\nabla \end{pmatrix}, \tag{4.6}$$

where

$$V_v(y) = -F'(\psi_v) - m^2. \tag{4.7}$$

Furthermore, $T(t)$ and $N(t) = N(\sigma, X)$ are given by

$$T = \begin{pmatrix} (w - v)\psi_v' - \dot v\, \partial_v \psi_v \\ (w - v)\pi_v' - \dot v\, \partial_v \pi_v \end{pmatrix}, \qquad N(\sigma, X) = \begin{pmatrix} 0 \\ N(v, \Psi) \end{pmatrix}, \tag{4.8}$$

where $v = v(t)$, $w = w(t)$, $\sigma = \sigma(t) = (b(t), v(t))$, $X = X(t)$, and

$$N(v, \Psi) = F(\psi_v + \Psi) - F(\psi_v) - F'(\psi_v)\Psi. \tag{4.9}$$

Remark 4.1. Formulas (3.3) and (4.8) imply:

$$T(t) = -(w - v)\tau_1 - \dot v\, \tau_2, \tag{4.10}$$

and hence $T(t) \in T_{S(\sigma(t))} S$, $t \in \mathbb{R}$. This fact suggests an unstable character of the nonlinear dynamics along the solitary manifold.

4.1. Linearized equation. Here we collect some Hamiltonian and spectral properties of the operator $A_{v,w}$. First, let us consider the linear equation

$$\dot X(t) = A_{v,w} X(t), \quad t \in \mathbb{R} \tag{4.11}$$

with arbitrary fixed $v \in (-1, 1)$ and $w \in \mathbb{R}$. Let us define the space $E^+ := H^2(\mathbb{R}) \oplus H^1(\mathbb{R})$.

Lemma 4.2. i) For any $v \in (-1, 1)$ and $w \in \mathbb{R}$, Eq. (4.11) can be represented as the Hamiltonian system,

$$\dot X(t) = J D\mathcal{H}_{v,w}(X(t)), \quad t \in \mathbb{R}, \tag{4.12}$$

where $D\mathcal{H}_{v,w}$ is the Fréchet derivative of the Hamiltonian functional

$$\mathcal{H}_{v,w}(X) = \frac{1}{2} \int \left[ |\Pi|^2 + |\Psi'|^2 + (m^2 + V_v)|\Psi|^2 \right] dy + \int \Pi\, w \Psi'\, dy. \tag{4.13}$$

ii) The energy conservation law holds for the solutions $X(t) \in C^1(\mathbb{R}, E^+)$,

$$\mathcal{H}_{v,w}(X(t)) = \mathrm{const}, \quad t \in \mathbb{R}. \tag{4.14}$$

iii) The skew-symmetry relation holds:

$$\Omega(A_{v,w} X_1, X_2) = -\Omega(X_1, A_{v,w} X_2), \quad X_1, X_2 \in E. \tag{4.15}$$

Proof. i) The equation (4.11) reads as follows:

$$\frac{d}{dt} \begin{pmatrix} \Psi \\ \Pi \end{pmatrix} = \begin{pmatrix} \Pi + w\Psi' \\ \Psi'' - (m^2 + V_v)\Psi + w\Pi' \end{pmatrix}. \tag{4.16}$$

The equations correspond to the Hamilton form since

$$\Pi + w\Psi' = D_\Pi \mathcal{H}_{v,w}, \qquad \Psi'' - (m^2 + V_v)\Psi + w\Pi' = -D_\Psi \mathcal{H}_{v,w}.$$

ii) The energy conservation law follows by (4.12) and the chain rule for the Fréchet derivatives:

$$\frac{d}{dt} \mathcal{H}_{v,w}(X(t)) = \langle D\mathcal{H}_{v,w}(X(t)), \dot X(t) \rangle = \langle D\mathcal{H}_{v,w}(X(t)), J D\mathcal{H}_{v,w}(X(t)) \rangle = 0, \quad t \in \mathbb{R}, \tag{4.17}$$

since the operator $J$ is skew-symmetric by (3.1), and $D\mathcal{H}_{v,w}(X(t)) \in E$ for $X(t) \in E^+$.

iii) The skew-symmetry holds since $A_{v,w} X = J D\mathcal{H}_{v,w}(X)$, and the linear operator $X \mapsto D\mathcal{H}_{v,w}(X)$ is symmetric as the Fréchet derivative of a real quadratic form.

Lemma 4.3. The operator $A_{v,w}$ acts on the tangent vectors $\tau = \tau_j(v)$ to the solitary manifold as follows:

$$A_{v,w}[\tau_1] = (v - w)\tau_1', \qquad A_{v,w}[\tau_2] = (w - v)\tau_2' + \tau_1. \tag{4.18}$$

Proof. In detail, we have to show that

$$A_{v,w} \begin{pmatrix} -\psi_v' \\ -\pi_v' \end{pmatrix} = \begin{pmatrix} (v - w)\psi_v'' \\ (v - w)\pi_v'' \end{pmatrix}, \qquad A_{v,w} \begin{pmatrix} \partial_v \psi_v \\ \partial_v \pi_v \end{pmatrix} = \begin{pmatrix} (w - v)\partial_v \psi_v' \\ (w - v)\partial_v \pi_v' \end{pmatrix} + \begin{pmatrix} -\psi_v' \\ -\pi_v' \end{pmatrix}.$$

Indeed, differentiate Eqs. (2.6) in $b$ and $v$, and obtain that the derivatives of the soliton state in parameters satisfy the following equations:

$$\begin{aligned}
-v\psi_v'' &= \pi_v', & -v\pi_v'' &= \psi_v''' + F'(\psi_v)\psi_v', \\
-\psi_v' - v\,\partial_v \psi_v' &= \partial_v \pi_v, & -\pi_v' - v\,\partial_v \pi_v' &= \partial_v \psi_v'' + F'(\psi_v)\partial_v \psi_v.
\end{aligned} \tag{4.19}$$

Then (4.18) follows from (4.19) by the definition of $A_{v,w}$ in (4.6).

Now we consider the operator $A_v = A_{v,v}$ corresponding to $w = v$:

$$A_v := \begin{pmatrix} v\nabla & 1 \\ \Delta - m^2 - V_v & v\nabla \end{pmatrix}. \tag{4.20}$$

In that case the linearized equation has the following additional specific features. The continuous spectrum of the operator $A_v$ coincides with

$$(-i\infty, -im/\gamma] \cup [im/\gamma, i\infty). \tag{4.21}$$

From (4.18) it follows that the tangent vector $\tau_1(v)$ is the zero eigenvector, and $\tau_2(v)$ is the corresponding root vector of the operator $A_v$, i.e.

$$A_v[\tau_1(v)] = 0, \qquad A_v[\tau_2(v)] = \tau_1(v). \tag{4.22}$$


Lemma 4.4. The zero root space of the operator $A_v$ is two-dimensional for any $v \in (-1, 1)$.

Proof. It suffices to check that the equation $A_v u = \tau_2(v)$ has no solution in $L^2 \oplus L^2$. Indeed, the equation reads

$$\begin{pmatrix} v\nabla & 1 \\ \Delta - m^2 - V_v & v\nabla \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} v\gamma^2 y \psi_v' \\ -\gamma^2 \psi_v' - v^2\gamma^2 y \psi_v'' \end{pmatrix}. \tag{4.23}$$

From the first equation we get $u_2 = v\gamma^2 y\psi_v' - v u_1'$. Then the second equation implies that

$$H_v u_1 = \gamma^2(1 + v^2)\psi_v' + 2v^2\gamma^2 y\psi_v'', \tag{4.24}$$

where $H_v$ is the Schrödinger operator defined in (1.19). Setting $u_1 = -\frac{1}{2} v^2\gamma^4 y^2 \psi_v' + \tilde u_1$, we reduce the equation to

$$H_v \tilde u_1 = -\gamma^2 \psi_v', \tag{4.25}$$

since $\psi_v''' = \gamma^2(m^2 + V_v)\psi_v'$ by the first line of (4.19). Hence, $\tilde u_1$ is the root function of the operator $H_v$ since $\psi_v'$ is an eigenfunction. However, this is impossible since $H_v$ is a selfadjoint operator.

Lemma 4.5. The operator $A_v$ has only the eigenvalue $\lambda = 0$.

Proof. Let us consider the eigenvalue problem for the operator $A_v$:

$$\begin{pmatrix} v\nabla & 1 \\ \Delta - m^2 - V_v & v\nabla \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \lambda \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}.$$

From the first equation we have $u_2 = -(v\nabla - \lambda)u_1$. Then the second equation implies that

$$(H_v + \lambda^2 - 2v\lambda\nabla) u_1 = 0. \tag{4.26}$$

Hence, for $v = 0$ the operator $A_0$ has only the eigenvalue $\lambda = 0$ by Condition U2 i). Further, let us consider the case $v \neq 0$. Taking the scalar product with $u_1$, we obtain

$$\langle H_v u_1, u_1 \rangle + \lambda^2 \langle u_1, u_1 \rangle = 0.$$

Hence, $\lambda^2$ is real since the operator $H_v$ is selfadjoint. The nonzero eigenvalues can bifurcate either from the point $\lambda = 0$ or from the edge points $\pm im/\gamma$ of the continuous spectrum of the operator $A_v$. Let us consider each case separately.

i) The point $\lambda = 0$ cannot bifurcate since it is isolated, and the zero root space is two-dimensional by Lemma 4.4.

ii) The bifurcation from the edge points is also impossible. Indeed, the bifurcated eigenvalue $\lambda \in (-im/\gamma, im/\gamma)$ is pure imaginary because $\lambda^2$ is real. Hence, (4.26) is equivalent to

$$(H_v + \gamma^2\lambda^2) p = 0, \tag{4.27}$$

where $p(x) = e^{\gamma^2 v\lambda x} u_1(x) \in L^2$, which is forbidden by Condition U2 i) since $-\gamma^2\lambda^2 \in (0, m^2)$.


4.2. Decay for the linearized dynamics. Let us consider the linearized equation

$$\dot X(t) = A_v X(t), \quad t \in \mathbb{R}, \tag{4.28}$$

where $A_v = A_{v,v}$ is given in (4.20) with $V_v$ defined in (4.7).

Definition 4.6. For $|v| < 1$, denote by $P_v^d$ the symplectic orthogonal projection of $E$ onto the tangent space $T_{S(\sigma)} S$, and $P_v^c = I - P_v^d$.

Note that by the linearity,

$$P_v^d X = \sum p_{jl}(v)\, \tau_j(v)\, \Omega(\tau_l(v), X), \quad X \in E \tag{4.29}$$

with some smooth coefficients $p_{jl}(v)$. Hence, the projector $P_v^d$, in the variable $y = x - b$, does not depend on $b$.

The following decay estimates will play the key role in our proofs. The first estimate follows from our assumption U2 by Theorem 3.15 of [14] since the condition of type [14, (1.3)] holds in our case (see also [13]).

Theorem 4.7. Let the condition U2 hold, and $\beta > 5/2$. Then for any $X \in E_\beta$, the weighted energy decay holds:

$$\|e^{A_v t} P_v^c X\|_{E_{-\beta}} \le C(v)(1 + t)^{-3/2} \|X\|_{E_\beta}, \quad t \in \mathbb{R}. \tag{4.30}$$

Corollary 4.8. For $\beta > 5/2$ and for $X \in E_\beta \cap W$,

$$\|(e^{A_v t} P_v^c X)_1\|_{L^\infty} \le C(v)(1 + t)^{-1/2} (\|X\|_W + \|X\|_{E_\beta}), \quad t \in \mathbb{R}. \tag{4.31}$$

Here $(\cdot)_1$ stands for the first component of the vector function.

Proof. Let us apply the projector $P_v^c$ to both sides of (4.28):

$$P_v^c \dot X = A_v P_v^c X = A_v^0 P_v^c X + \mathcal{V} P_v^c X, \tag{4.32}$$

where

$$A_v^0 = \begin{pmatrix} v\nabla & 1 \\ \Delta - m^2 & v\nabla \end{pmatrix}, \qquad \mathcal{V} = \begin{pmatrix} 0 & 0 \\ -V_v & 0 \end{pmatrix}.$$

Hence, the Duhamel representation gives

$$e^{A_v t} Y = e^{A_v^0 t} Y + \int_0^t e^{A_v^0 (t - \tau)} \mathcal{V} e^{A_v \tau} Y\, d\tau, \quad Y = P_v^c X, \quad t > 0. \tag{4.33}$$

Let us note that $e^{A_v^0 t} Z = e^{A_0^0 t} T_{vt} Z$, where $T_{vt} Z(x, t) = Z(x + vt, t)$. Then (4.33) reads

$$e^{A_v t} Y = e^{A_0^0 t} T_{vt} Y + \int_0^t e^{A_0^0 (t - \tau)} T_{vt} [\mathcal{V} e^{A_v \tau} Y]\, d\tau, \quad t > 0. \tag{4.34}$$


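Applying estimate (265) from [21], the Hölder inequality and Theorem 4.7 we obtain

$$\begin{aligned}
\|(e^{A_v t} Y)_1\|_{L^\infty} &\le C(1 + t)^{-1/2} \|T_{vt} Y\|_W + C \int_0^t (1 + t - \tau)^{-1/2} \|T_{vt}[V_v (e^{A_v \tau} Y)_1]\|_{W_0^{1,1}}\, d\tau \\
&= C(1 + t)^{-1/2} \|Y\|_W + C \int_0^t (1 + t - \tau)^{-1/2} \|V_v (e^{A_v \tau} Y)_1\|_{W_0^{1,1}}\, d\tau \\
&\le C(1 + t)^{-1/2} \|X\|_W + C \int_0^t (1 + t - \tau)^{-1/2} \|e^{A_v \tau} P_v^c X\|_{E_{-\beta}}\, d\tau \\
&\le C(1 + t)^{-1/2} \|X\|_W + C \int_0^t (1 + t - \tau)^{-1/2} (1 + \tau)^{-3/2} \|X\|_{E_\beta}\, d\tau \\
&\le C(1 + t)^{-1/2} (\|X\|_W + \|X\|_{E_\beta}).
\end{aligned}$$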

4.3. Taylor expansion for nonlinear term. Now let us expand $N(v, \Psi)$ from (4.9) in the Taylor series

$$N(v, \Psi) = N_2(v, \Psi) + N_3(v, \Psi) + \cdots + N_{12}(v, \Psi) + N_R(v, \Psi) = N_I(v, \Psi) + N_R(v, \Psi), \tag{4.35}$$

where

$$N_j(v, \Psi) = \frac{F^{(j)}(\psi_v)}{j!} \Psi^j, \quad j = 2, \ldots, 12 \tag{4.36}$$

and $N_R$ is the remainder. By condition U1 we have

$$F(\psi) = -m^2(\psi \mp a) + O(|\psi \mp a|^{13}), \quad \psi \to \pm a.$$

Hence, the functions $F^{(j)}(\psi_v(y))$, $2 \le j \le 12$, decrease exponentially as $|y| \to \infty$ by (1.18) and (1.14). Therefore,

$$\|N_I\|_{L^2_\beta \cap W_0^{1,1}} = R(\|\Psi\|_{L^\infty}) \|\Psi\|_{L^\infty} \|\Psi\|_{H^1_{-\beta}} = R(\|\Psi\|_{L^\infty}) \|\Psi\|_{L^\infty} \|X\|_{E_{-\beta}}. \tag{4.37}$$

For the remainder $N_R$ we have

$$|N_R| = R(\|\Psi\|_{L^\infty}) |\Psi|^{13}, \tag{4.38}$$

where $R(A)$ is a general notation for a positive function which remains bounded as $A$ is sufficiently small.

Lemma 4.9. The bounds hold:

$$\|N_R\|_{W_0^{1,1}} = R(\|\Psi\|_{L^\infty}) \|\Psi\|^{11}_{L^\infty}, \tag{4.39}$$

$$\|N_R\|_{L^2_{5/2+\nu}} = R(\|\Psi\|_{L^\infty}) (1 + t)^{4+\nu} \|\Psi\|^{12}_{L^\infty}, \quad 0 < \nu < 1/2. \tag{4.40}$$


Proof. Step i) By the Cauchy formula,

$$N_R(x, t) = \frac{\Psi^{13}(x, t)}{(13)!} \int_0^1 (1 - \rho)^{12} F^{(13)}(\psi_v + \rho\Psi(x, t))\, d\rho. \tag{4.41}$$

Therefore,

$$\|N_R\|_{L^1} = R(\|\Psi\|_{L^\infty}) \int |\Psi|^{13}\, dx = R(\|\Psi\|_{L^\infty}) \|\Psi\|^{11}_{L^\infty} \|\Psi\|^2_{L^2} = R(\|\Psi\|_{L^\infty}) \|\Psi\|^{11}_{L^\infty},$$

since $\|\Psi\|_{L^2} \le C(d_0)$ by the results of [10]. Differentiating (4.41) in $x$, we obtain

$$N_R' = \frac{\Psi^{13}}{(13)!} \int_0^1 (1 - \rho)^{12} (\psi_v' + \rho\Psi') F^{(14)}(\psi_v + \rho\Psi)\, d\rho + \frac{\Psi^{12}\Psi'}{(12)!} \int_0^1 (1 - \rho)^{12} F^{(13)}(\psi_v + \rho\Psi)\, d\rho.$$

Hence,

$$\|N_R'\|_{L^1} = R(\|\Psi\|_{L^\infty}) \left[ \|\Psi\|^{13}_{L^\infty} + \|\Psi\|^{11}_{L^\infty} \int |\Psi(x)\Psi'(x)|\, dx \right] \le R(\|\Psi\|_{L^\infty}) \|\Psi\|^{11}_{L^\infty},$$

since $\int |\Psi(x)\Psi'(x)|\, dx \le \|\Psi\|_{L^2} \|\Psi'\|_{L^2} \le C(d_0)$. Then (4.39) follows.

Step ii) The bound (4.38) implies

$$\|N_R\|_{L^2_{5/2+\nu}} = R(\|\Psi\|_{L^\infty}) \|\Psi\|^{12}_{L^\infty} \|\Psi\|_{L^2_{5/2+\nu}}.$$

We will prove in Appendix A that

$$\|\Psi(t)\|_{L^2_{5/2+\nu}} \le C(d_0)(1 + t)^{4+\nu}. \tag{4.42}$$

Then (4.40) follows.

Remark 4.10. Our choice of the degree 14 in the condition (1.11) is due to the competition between the factors in the estimate (4.40) for the remainder. Namely, the factor $(1 + t)^{4+\nu}$ with $\nu < 1/2$ comes from the virial type estimate (4.42) describing the expansion of the support for the perturbation of the kink. On the other hand, $\|\Psi\|^{12}_{L^\infty} \sim t^{-6}$ by the crucial decay estimate (7.1). Hence, the right-hand side of (4.40) decays like $\sim t^{-2+\nu}$, where $-2 + \nu < -3/2$, which is sufficient for the method of majorants (in the integral inequalities (9.2) and (9.3)).


5. Symplectic Decomposition of the Dynamics

Here we decompose the dynamics in two components: along the manifold $S$ and in transversal directions. Equation (4.5) is obtained without any assumption on $\sigma(t)$ in (4.1). We are going to choose $S(\sigma(t)) := \boldsymbol{\Pi} Y(t)$, but then we need to know that

$$Y(t) \in O_\alpha(S), \quad t \in \mathbb{R} \tag{5.1}$$

with some $O_\alpha(S)$ defined in Lemma 3.4. It is true for $t = 0$ by our main assumption (2.8) with sufficiently small $d_0 > 0$. Then $S(\sigma(0)) = \boldsymbol{\Pi} Y(0)$ and $X(0) = Y(0) - S(\sigma(0))$ are well defined. We will prove below that (5.1) holds with $\alpha = -\beta$ if $d_0$ is sufficiently small. First, we choose $\bar v < 1$ such that

$$|v(0)| \le \bar v. \tag{5.2}$$

Denote by $r_{-\beta}(\bar v)$ the positive number from Lemma 3.4 iii) which corresponds to $\alpha = -\beta$. Then $S(\sigma) + X \in O_{-\beta}(S)$ if $\sigma = (b, v)$ with $|v| < \bar v$ and $\|X\|_{E_{-\beta}} < r_{-\beta}(\bar v)$. Therefore, $S(\sigma(t)) = \boldsymbol{\Pi} Y(t)$ and $X(t) = Y(t) - S(\sigma(t))$ are well defined for $t \ge 0$ so small that $\|X(t)\|_{E_{-\beta}} < r_{-\beta}(\bar v)$. This is formalized by the standard definition of the "exit time". First, we introduce the "majorants"

$$m_1(t) := \sup_{s \in [0, t]} (1 + s)^{3/2} \|X(s)\|_{E_{-\beta}}, \qquad m_2(t) := \sup_{s \in [0, t]} (1 + s)^{1/2} \|\Psi(s)\|_{L^\infty}. \tag{5.3}$$

Here $X = (X_1, X_2) = (\Psi, \Pi)$. Let us denote by $\varepsilon \in (0, r_{-\beta}(\bar v))$ a fixed number which we will specify below.

Definition 5.1. $t_*$ is the exit time

$$t_* = \sup\{t \ge 0 : m_j(s) < \varepsilon, \ j = 1, 2, \ 0 \le s \le t\}. \tag{5.4}$$

Let us note that $m_j(0) < \varepsilon$ for sufficiently small $d_0$. One of our main goals is to prove that $t_* = \infty$ if $d_0$ is sufficiently small. This would follow if we show that

$$m_j(t) < \varepsilon/2, \quad 0 \le t < t_*. \tag{5.5}$$
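The continuity argument behind the exit time (5.4) and the target bound (5.5) can be illustrated on a scalar toy inequality. The sketch below is only an illustration under assumptions that are not taken from the paper: a single continuous majorant $m(t) \ge 0$ with $m(0) \le a$ that satisfies $m \le a + K m^2$, where `a` and `K` are hypothetical constants. If $4aK < 1$, such an $m$ cannot cross the gap between the two roots of $K m^2 - m + a = 0$, so it stays below the smaller root, which is at most $2a$.

```python
import math


def bootstrap_bound(a, K):
    """Smaller root of K*m**2 - m + a = 0 (toy model, not the paper's system).

    A continuous m(t) >= 0 with m(0) <= a and m <= a + K*m**2 for all t
    stays below this root, because the open interval between the two roots
    is forbidden by the inequality.  Requires a >= 0, K > 0, 4*a*K < 1.
    """
    disc = 1.0 - 4.0 * a * K
    if a < 0 or K <= 0 or disc <= 0:
        raise ValueError("need a >= 0, K > 0 and 4*a*K < 1")
    # Rationalised form of (1 - sqrt(disc)) / (2*K); avoids cancellation
    # when a is small.
    return 2.0 * a / (1.0 + math.sqrt(disc))


if __name__ == "__main__":
    for a in (1e-1, 1e-2, 1e-3):
        print(a, bootstrap_bound(a, K=2.0))
```

In this toy setting the returned bound always lies between `a` and `2*a`, which mirrors how smallness of the initial data is meant to keep the majorants below $\varepsilon/2$.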

6. Modulation Equations

In this section we present the modulation equations which allow us to construct the solutions $Y(t)$ of Eq. (2.1) close at each time $t$ to a kink, i.e. to one of the functions described in Definition 2.3 with time varying ("modulating") parameters $(b, v) = (b(t), v(t))$. We look for a solution to (2.1) in the form $Y(t) = S(\sigma(t)) + X(t)$ by setting $S(\sigma(t)) = \boldsymbol{\Pi} Y(t)$, which is equivalent to the symplectic orthogonality condition of type (3.7),

$$X(t) \nmid T_{S(\sigma(t))} S, \quad t < t_*. \tag{6.1}$$

The projection $\boldsymbol{\Pi} Y(t)$ is well defined for $t < t_*$ by Lemma 3.4 iii). Now we derive the "modulation equations" for the parameters $\sigma(t) = (b(t), v(t))$. For this purpose, let us write (6.1) in the form

$$\Omega(X(t), \tau_j(t)) = 0, \quad j = 1, 2, \tag{6.2}$$

where the vectors $\tau_j(t) = \tau_j(\sigma(t))$ span the tangent space $T_{S(\sigma(t))} S$. It would be convenient for us to use some other parameters $(c, v)$ instead of $\sigma = (b, v)$, where $c(t) = b(t) - \int_0^t v(\tau)\, d\tau$ and

$$\dot c(t) = \dot b(t) - v(t) = w(t) - v(t). \tag{6.3}$$


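Lemma 6.1. Let $Y(t)$ be a solution to the Cauchy problem (2.1), and (6.2) hold. Then the parameters $c(t)$ and $v(t)$ satisfy the equations

$$\dot c = \frac{\Omega(\tau_1, \tau_2)\Omega(N, \tau_2) + \Omega(X, \partial_v \tau_1)\Omega(N, \tau_2) - \Omega(X, \partial_v \tau_2)\Omega(N, \tau_1)}{D}, \tag{6.4}$$

$$\dot v = \frac{-\Omega(\tau_1, \tau_2)\Omega(N, \tau_1) - \Omega(X, \tau_2')\Omega(N, \tau_1) - \Omega(X, \tau_1')\Omega(N, \tau_2)}{D}, \tag{6.5}$$

where $D = \Omega^2(\tau_1, \tau_2) + O(\|X\|_{E_{-\beta}})$.

Proof. Differentiating the orthogonality conditions (6.2) in $t$ we obtain

$$0 = \Omega(\dot X, \tau_j) + \Omega(X, \dot\tau_j) = \Omega(A_{v,w} X + T + N, \tau_j) + \Omega(X, \dot\tau_j), \quad j = 1, 2. \tag{6.6}$$

First, let us compute the principal (i.e. non-vanishing at $X = 0$) term $\Omega(T, \tau_j)$. By (4.10),

$$\Omega(T, \tau_1) = -\dot v\, \Omega(\tau_2, \tau_1) = \dot v\, \Omega(\tau_1, \tau_2); \qquad \Omega(T, \tau_2) = -\dot c\, \Omega(\tau_1, \tau_2). \tag{6.7}$$

Second, let us compute $\Omega(A_{v,w} X, \tau_j)$. The skew-symmetry (4.15) implies that $\Omega(A_{v,w} X, \tau_j) = -\Omega(X, A_{v,w} \tau_j)$. Then by (4.18) we have

$$\Omega(A_{v,w} X, \tau_1) = \Omega(X, \dot c\, \tau_1'), \tag{6.8}$$

$$\Omega(A_{v,w} X, \tau_2) = -\Omega(X, \dot c\, \tau_2' + \tau_1) = -\Omega(X, \dot c\, \tau_2'), \tag{6.9}$$

since $\Omega(X, \tau_1) = 0$. Finally, let us compute the last term $\Omega(X, \dot\tau_j)$ in (6.6). For $j = 1, 2$ one has $\dot\tau_j = \dot b\, \partial_b \tau_j + \dot v\, \partial_v \tau_j = \dot v\, \partial_v \tau_j$ since the vectors $\tau_j$ do not depend on $b$ according to (3.3). Hence,

$$\Omega(X, \dot\tau_j) = \Omega(X, \dot v\, \partial_v \tau_j). \tag{6.10}$$

As the result, by (6.7)–(6.10), Eq. (6.6) becomes

$$\begin{aligned}
0 &= \dot c\, \Omega(X, \tau_1') + \dot v \left( \Omega(\tau_1, \tau_2) + \Omega(X, \partial_v \tau_1) \right) + \Omega(N, \tau_1), \\
0 &= -\dot c \left( \Omega(X, \tau_2') + \Omega(\tau_1, \tau_2) \right) + \dot v\, \Omega(X, \partial_v \tau_2) + \Omega(N, \tau_2).
\end{aligned}$$

Since $\Omega(\tau_1, \tau_2) \neq 0$ by (3.5), the determinant $D$ of the system does not vanish for small $\|X\|_{E_{-\beta}}$ and we obtain (6.4)–(6.5).

Corollary 6.2. Formulas (6.4)–(6.5) imply

$$|\dot c(t)|, \ |\dot v(t)| \le C(v) \|\Psi(t)\|^2_{L^2_{-\beta}} \le C(v) \|X(t)\|^2_{E_{-\beta}}, \quad 0 \le t < t_*. \tag{6.11}$$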


7. Decay for the Transversal Dynamics

In Sect. 12 we will show that our main Theorem 2.5 can be derived from the following time decay of the transversal component $X(t)$:

Proposition 7.1. Let all conditions of Theorem 2.5 hold. Then $t_* = \infty$, and

$$\|X(t)\|_{E_{-\beta}} \le \frac{C(v, d_0)}{(1 + |t|)^{3/2}}, \qquad \|\Psi(t)\|_{L^\infty} \le \frac{C(v, d_0)}{(1 + |t|)^{1/2}}, \quad t \ge 0. \tag{7.1}$$

We will derive (7.1) in Sect. 11 from our Eq. (4.5) for the transversal component $X(t)$. This equation can be specified using Corollary 6.2. Indeed, (4.10) implies that

$$\|T(t)\|_{E_\beta \cap W} \le C(v) \|X\|^2_{E_{-\beta}}, \quad 0 \le t < t_* \tag{7.2}$$

by (6.11) since $w - v = \dot c$. Thus (4.5) becomes the equation

$$\dot X(t) = A(t) X(t) + T(t) + N_I(t) + N_R(t), \quad 0 \le t < t_*, \tag{7.3}$$

where $A(t) = A_{v(t), w(t)}$, $T(t)$ satisfies (7.2), and

$$\begin{aligned}
\|N_I(t)\|_{E_\beta \cap W} &\le C(v) \|\Psi\|_{L^\infty} \|X\|_{E_{-\beta}}, \\
\|N_R\|_{E_{5/2+\nu}} &\le C(v)(1 + t)^{4+\nu} \|\Psi\|^{12}_{L^\infty}, \quad 0 < \nu < 1/2, \\
\|N_R\|_W &\le C(v) \|\Psi\|^{11}_{L^\infty},
\end{aligned} \qquad 0 \le t < t_*, \tag{7.4}$$

by (4.37) and (4.39)–(4.40). In the remaining part of our paper we will analyze mainly Eq. (7.3) to establish the decay (7.1). We are going to derive the decay using the bounds (7.2) and (7.4), and the orthogonality condition (6.1).

Let us comment on two main difficulties in proving (7.1). The difficulties are common for the problems studied in [4]. First, the linear part of the equation is nonautonomous, hence we cannot apply directly the methods of scattering theory. Similarly to the approach of [4], we reduce the problem to the analysis of the frozen linear equation,

$$\dot X(t) = A_1 X(t), \quad t \in \mathbb{R}, \tag{7.5}$$

where $A_1$ is the operator $A_{v_1}$ defined by (4.6) with $v_1 = v(t_1)$ for a fixed $t_1 \in [0, t_*)$. Then we estimate the error by the method of majorants.

Second, even for the frozen equation (7.5), the decay of type (7.1) for all solutions does not hold without the orthogonality condition of type (6.1). Namely, by (4.22) Eq. (7.5) admits the secular solutions

$$X(t) = C_1 \tau_1(v) + C_2 [\tau_1(v) t + \tau_2(v)] \tag{7.6}$$

which arise also by differentiation of the soliton (1.13) in the parameters $q$ and $v$ in the moving coordinate $y = x - v_1 t$. Hence, we have to take into account the orthogonality condition (6.1) in order to avoid the secular solutions. For this purpose we will apply the corresponding symplectic orthogonal projection which kills the "runaway solutions" (7.6).

Remark 7.2. The solution (7.6) lies in the tangent space $T_{S(\sigma_1)} S$ with $\sigma_1 = (b_1, v_1)$ (for an arbitrary $b_1 \in \mathbb{R}$), which suggests an unstable character of the nonlinear dynamics along the solitary manifold (cf. Remark 4.1 ii)).
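The linear growth in the secular solutions (7.6) is the usual signature of a Jordan block, as encoded in (4.22). The following finite-dimensional sketch is an illustration only: it replaces the operator by the hypothetical $2 \times 2$ matrix acting on coefficient pairs $(c_1, c_2)$ of $c_1\tau_1 + c_2\tau_2$, with $A\tau_1 = 0$ and $A\tau_2 = \tau_1$. The names `apply_A` and `flow` are introduced here for the example.

```python
def apply_A(x):
    """Toy Jordan block in the basis (tau1, tau2): A tau1 = 0, A tau2 = tau1.

    x = (c1, c2) stands for c1*tau1 + c2*tau2, so A x = c2*tau1.
    """
    c1, c2 = x
    return (c2, 0.0)


def flow(x0, t):
    """exp(A t) x0 = x0 + t * A x0, exact here because A applied twice is 0."""
    ax = apply_A(x0)
    return (x0[0] + t * ax[0], x0[1] + t * ax[1])


if __name__ == "__main__":
    C1, C2 = 1.0, 2.0
    for t in (0.0, 1.0, 10.0):
        # Coefficients of tau1 and tau2 at time t: (C1 + C2*t, C2),
        # i.e. C1*tau1 + C2*(tau1*t + tau2), the form of (7.6).
        print(t, flow((C1, C2), t))
```

The $\tau_1$ coefficient grows linearly in $t$ whenever $C_2 \neq 0$, which is why no decay can hold on the tangent space and a projection onto the complementary subspace is needed.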


Definition 7.3. Denote by $\mathcal{X}_v = P_v^c E$ the space symplectic orthogonal to $T_{S(\sigma)} S$ with $\sigma = (b, v)$ (for an arbitrary $b \in \mathbb{R}$).

Now we have the symplectic orthogonal decomposition

$$E = T_{S(\sigma)} S + \mathcal{X}_v, \quad \sigma = (b, v) \tag{7.7}$$

and the symplectic orthogonality (6.1) can be written in the following equivalent forms:

$$P^d_{v(t)} X(t) = 0, \qquad P^c_{v(t)} X(t) = X(t), \quad 0 \le t < t_*. \tag{7.8}$$

Remark 7.4. The tangent space $T_{S(\sigma)} S$ is invariant under the operator $A_v$ by (4.22), hence the space $\mathcal{X}_v$ is also invariant by (4.15): $A_v X \in \mathcal{X}_v$ on a dense domain of $X \in \mathcal{X}_v$.

8. Frozen Form of Transversal Dynamics

Now let us fix an arbitrary $t_1 \in [0, t_*)$, and rewrite Eq. (7.3) in a "frozen form"

$$\dot X(t) = A_1 X(t) + (A(t) - A_1) X(t) + T(t) + N_I(t) + N_R(t), \quad 0 \le t < t_*, \tag{8.1}$$

where $A_1 = A_{v(t_1), v(t_1)}$ and

$$A(t) - A_1 = \begin{pmatrix} (w(t) - v(t_1))\nabla & 0 \\ 0 & (w(t) - v(t_1))\nabla \end{pmatrix}.$$

The next trick is important since it allows us to kill the "bad terms" $(w(t) - v(t_1))\nabla$ in the operator $A(t) - A_1$. Let us change the variables $(y, t) \mapsto (y_1, t) = (y + d_1(t), t)$, where

$$d_1(t) := \int_{t_1}^t (w(s) - v(t_1))\, ds, \quad 0 \le t \le t_1. \tag{8.2}$$

Next define

$$\tilde X(t) = (\Psi(y_1 - d_1(t), t), \Pi(y_1 - d_1(t), t)). \tag{8.3}$$

Then we obtain the final form of the "frozen equation" for the transversal dynamics

$$\dot{\tilde X}(t) = A_1 \tilde X(t) + \tilde T(t) + \tilde N_I(t) + \tilde N_R(t), \quad 0 \le t \le t_1, \tag{8.4}$$

where $\tilde T(t)$, $\tilde N_I(t)$ and $\tilde N_R(t)$ are $T(t)$, $N_I(t)$ and $N_R(t)$ expressed in terms of $y_1 = y + d_1(t)$. Now we derive appropriate bounds for the "remainder terms" in (8.4). Let us recall the following well-known inequality: for any $\alpha \in \mathbb{R}$,

$$(1 + |y + x|)^\alpha \le (1 + |y|)^\alpha (1 + |x|)^{|\alpha|}, \quad x, y \in \mathbb{R}. \tag{8.5}$$
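Inequality (8.5) is elementary, and a quick random test makes the role of the exponent $|\alpha|$ on the right-hand side visible for negative $\alpha$. The sketch below is a numerical sanity check only, not a proof; the helper names, the sampling ranges and the relative tolerance `rtol` (which absorbs floating-point rounding in the equality cases) are choices made for this example.

```python
import random


def peetre_holds(x, y, alpha, rtol=1e-12):
    """Check (1+|y+x|)^alpha <= (1+|y|)^alpha * (1+|x|)^|alpha| up to rounding."""
    lhs = (1.0 + abs(y + x)) ** alpha
    rhs = (1.0 + abs(y)) ** alpha * (1.0 + abs(x)) ** abs(alpha)
    return lhs <= rhs * (1.0 + rtol)


def check_random(n=10000, seed=0):
    """Test the inequality on n random triples (x, y, alpha)."""
    rng = random.Random(seed)
    return all(
        peetre_holds(rng.uniform(-50, 50), rng.uniform(-50, 50), rng.uniform(-4, 4))
        for _ in range(n)
    )


if __name__ == "__main__":
    print(check_random())
```

For $\alpha \ge 0$ the bound reduces to $1 + |y + x| \le (1 + |y|)(1 + |x|)$, and for $\alpha < 0$ to $1 + |y| \le (1 + |y + x|)(1 + |x|)$; both follow from the triangle inequality.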

Lemma 8.1. For $f \in L^2_\alpha$ with any $\alpha \in \mathbb{R}$ the following bound holds:

$$\|f(y_1 - d_1)\|_{L^2_\alpha} \le \|f\|_{L^2_\alpha} (1 + |d_1|)^{|\alpha|}, \quad d_1 \in \mathbb{R}. \tag{8.6}$$

Proof. The bound (8.6) follows from (8.5) since

$$\begin{aligned}
\|f(y_1 - d_1)\|^2_{L^2_\alpha} &= \int |f(y_1 - d_1)|^2 (1 + |y_1|)^{2\alpha}\, dy_1 = \int |f(y)|^2 (1 + |y + d_1|)^{2\alpha}\, dy \\
&\le \int |f(y)|^2 (1 + |y|)^{2\alpha} (1 + |d_1|)^{2|\alpha|}\, dy \le (1 + |d_1|)^{2|\alpha|} \|f\|^2_{L^2_\alpha}.
\end{aligned}$$

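Corollary 8.2. The following bounds hold for $0 \le t \le t_1$ by (7.2) and (7.4):

$$\begin{aligned}
\|\tilde T(t)\|_{E_\beta} &\le C(v)(1 + |d_1(t)|)^\beta \|X\|^2_{E_{-\beta}}, & \|\tilde T(t)\|_W &\le C(v) \|X\|^2_{E_{-\beta}}, \\
\|\tilde N_I(t)\|_{E_\beta} &\le C(v)(1 + |d_1(t)|)^\beta \|\Psi\|_{L^\infty} \|X\|_{E_{-\beta}}, & \|\tilde N_I(t)\|_W &\le C(v) \|\Psi\|_{L^\infty} \|X\|_{E_{-\beta}}, \\
\|\tilde N_R\|_{E_{5/2+\nu}} &\le C(v)(1 + |d_1(t)|)^{5/2+\nu} (1 + t)^{4+\nu} \|\Psi\|^{12}_{L^\infty}, & & 0 < \nu < 1/2, \\
\|\tilde N_R\|_W &\le C(v) \|\Psi\|^{11}_{L^\infty}.
\end{aligned} \tag{8.7}$$

9. Integral Inequality

Equation (8.4) can be written in the integral form:

$$\tilde X(t) = e^{A_1 t} \tilde X(0) + \int_0^t e^{A_1 (t - s)} [\tilde T(s) + \tilde N_I(s) + \tilde N_R(s)]\, ds, \quad 0 \le t \le t_1. \tag{9.1}$$

We apply the symplectic orthogonal projection $P_1^c := P^c_{v(t_1)}$ to both sides, and get

$$P_1^c \tilde X(t) = e^{A_1 t} P_1^c \tilde X(0) + \int_0^t e^{A_1 (t - s)} P_1^c [\tilde T(s) + \tilde N_I(s) + \tilde N_R(s)]\, ds.$$

We have used here that $P_1^c$ commutes with the group $e^{A_1 t}$ since the space $\mathcal{X}_1 := P_1^c E$ is invariant with respect to $e^{A_1 t}$ by Remark 7.4. Applying (4.30) we obtain that

$$\|P_1^c \tilde X(t)\|_{E_{-\beta}} \le \frac{C \|\tilde X(0)\|_{E_\beta}}{(1 + t)^{3/2}} + C \int_0^t \frac{\|\tilde T(s) + \tilde N_I(s) + \tilde N_R(s)\|_{E_\beta}}{(1 + |t - s|)^{3/2}}\, ds.$$

Then for $5/2 < \beta < 3$ and $0 \le t \le t_1$ the bounds (8.7) imply

$$\|P_1^c \tilde X(t)\|_{E_{-\beta}} \le \frac{C(\bar d_1(0)) \|X(0)\|_{E_\beta}}{(1 + t)^{3/2}} + C(\bar d_1(t)) \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty}}{(1 + |t - s|)^{3/2}}\, ds, \tag{9.2}$$

where $\bar d_1(t) := \sup_{0 \le s \le t} |d_1(s)|$. Similarly, (4.31) and (8.7) imply

$$\begin{aligned}
\|(P_1^c \tilde X(t))_1\|_{L^\infty} &\le \frac{C \|\tilde X(0)\|_{E_\beta \cap W}}{(1 + t)^{1/2}} + C \int_0^t \frac{\|\tilde T(s) + \tilde N_I(s) + \tilde N_R(s)\|_{E_\beta \cap W}}{(1 + |t - s|)^{1/2}}\, ds \\
&\le \frac{C(\bar d_1(0)) \|X(0)\|_{E_\beta \cap W}}{(1 + t)^{1/2}} + C(\bar d_1(t)) \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty} + \|\Psi(s)\|^{11}_{L^\infty}}{(1 + |t - s|)^{1/2}}\, ds.
\end{aligned} \tag{9.3}$$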


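Lemma 9.1. For $t_1 < t_*$ we have

$$|d_1(t)| \le C\varepsilon^2, \quad 0 \le t \le t_1. \tag{9.4}$$

Proof. To estimate $d_1(t)$, we note that

$$w(s) - v(t_1) = w(s) - v(s) + v(s) - v(t_1) = \dot c(s) - \int_s^{t_1} \dot v(\tau)\, d\tau \tag{9.5}$$

by (6.3). Hence, the definitions (8.2), (5.3), and Corollary 6.2 imply that

$$\begin{aligned}
|d_1(t)| &= \left| \int_{t_1}^t (w(s) - v(t_1))\, ds \right| \le \int_t^{t_1} \left( |\dot c(s)| + \int_s^{t_1} |\dot v(\tau)|\, d\tau \right) ds \\
&\le C m_1^2(t_1) \int_t^{t_1} \left( \frac{1}{(1 + s)^3} + \int_s^{t_1} \frac{d\tau}{(1 + \tau)^3} \right) ds \le C m_1^2(t_1) \le C\varepsilon^2, \quad 0 \le t \le t_1.
\end{aligned} \tag{9.6}$$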
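The last step of (9.6) uses that the double integral of $(1 + s)^{-3}$ is bounded uniformly in $t$ and $t_1$. The sketch below evaluates that integral in closed form and cross-checks it with a midpoint rule. It is an illustration with hypothetical function names, and the uniform bound $1$ it exhibits is simply one admissible value of the constant, not a value stated in the paper.

```python
def majorant_integral(t, t1):
    """Closed form of int_t^{t1} [ (1+s)^-3 + int_s^{t1} (1+tau)^-3 dtau ] ds."""
    a, b = 1.0 + t, 1.0 + t1
    # int_t^{t1} (1+s)^-3 ds
    first = 0.5 * (a ** -2 - b ** -2)
    # int_t^{t1} (1/2) * [ (1+s)^-2 - (1+t1)^-2 ] ds
    second = 0.5 * (1.0 / a - 1.0 / b) - 0.5 * (t1 - t) * b ** -2
    return first + second


def majorant_integral_numeric(t, t1, n=20000):
    """Midpoint-rule check of the same quantity (inner integral done exactly)."""
    h = (t1 - t) / n
    b = 1.0 + t1
    total = 0.0
    for k in range(n):
        s = t + (k + 0.5) * h
        inner = 0.5 * ((1.0 + s) ** -2 - b ** -2)
        total += ((1.0 + s) ** -3 + inner) * h
    return total


if __name__ == "__main__":
    for t1 in (1.0, 10.0, 1000.0):
        print(t1, majorant_integral(0.0, t1))
```

Each of the two pieces is at most $1/2$, so the whole expression stays below $1$ however large $t_1$ is; this is what allows $|d_1(t)|$ to be controlled by $m_1^2(t_1)$ alone.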

Now (9.2) and (9.3) imply that for $t_1 < t_*$ and $0 \le t \le t_1$,

$$\|P_1^c \tilde X(t)\|_{E_{-\beta}} \le \frac{C \|X(0)\|_{E_\beta}}{(1 + t)^{3/2}} + C \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty}}{(1 + |t - s|)^{3/2}}\, ds, \tag{9.7}$$

$$\|(P_1^c \tilde X(t))_1\|_{L^\infty} \le \frac{C \|X(0)\|_{E_\beta \cap W}}{(1 + t)^{1/2}} + C \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty} + \|\Psi(s)\|^{11}_{L^\infty}}{(1 + |t - s|)^{1/2}}\, ds. \tag{9.8}$$

10. Symplectic Orthogonality

Finally, we are going to replace $P_1^c \tilde X(t)$ by $X(t)$ in the left-hand sides of (9.7) and (9.8). We will prove that this is possible since $d_0 \ll 1$ in (2.8).

Lemma 10.1. For sufficiently small $\varepsilon > 0$, we have for $t_1 < t_*$,

$$\|X(t)\|_{E_{-\beta}} \le C \|P_1^c \tilde X(t)\|_{E_{-\beta}}, \qquad \|\Psi(t)\|_{L^\infty} \le 2 \|(P_1^c \tilde X(t))_1\|_{L^\infty}, \quad 0 \le t \le t_1,$$

where the constant $C$ does not depend on $t_1$.

Proof. The proof is based on the symplectic orthogonality (7.8), i.e.

$$P^d_{v(t)} X(t) = 0, \quad t \in [0, t_1] \tag{10.1}$$

and on the fact that all the spaces $\mathcal{X}(t) := P^c_{v(t)} E$ are almost parallel for all $t$.


Namely, we first note that $\|\Psi(t)\|_{L^\infty} = \|\tilde\Psi(t)\|_{L^\infty}$, and $\|X(t)\|_{E_{-\beta}} \le C \|\tilde X(t)\|_{E_{-\beta}}$ by Lemma 8.1, since $|d_1(t)| \le \mathrm{const}$ for $t \le t_1 < t_*$ by (9.4). Therefore, it suffices to prove that

$$\|\tilde\Psi(t)\|_{L^\infty} \le 2 \|(P_1^c \tilde X(t))_1\|_{L^\infty}, \qquad \|\tilde X(t)\|_{E_{-\beta}} \le 2 \|P_1^c \tilde X(t)\|_{E_{-\beta}}, \quad 0 \le t \le t_1. \tag{10.2}$$

This estimate will follow from

$$\|(P_1^d \tilde X(t))_1\|_{L^\infty} \le \frac{1}{2} \|\tilde\Psi(t)\|_{L^\infty}, \qquad \|P_1^d \tilde X(t)\|_{E_{-\beta}} \le \frac{1}{2} \|\tilde X(t)\|_{E_{-\beta}}, \quad 0 \le t \le t_1, \tag{10.3}$$

since $P_1^c \tilde X(t) = \tilde X(t) - P_1^d \tilde X(t)$. To prove (10.3), we write (10.1) as

$$\tilde P^d_{v(t)} \tilde X(t) = 0, \quad t \in [0, t_1], \tag{10.4}$$

where $\tilde P^d_{v(t)} \tilde X(t)$ is $P^d_{v(t)} X(t)$ expressed in terms of the variable $y_1 = y + d_1(t)$. Hence, (10.3) follows from (10.4) if the difference $P_1^d - \tilde P^d_{v(t)}$ is small uniformly in $t$, i.e.

$$\|P_1^d - \tilde P^d_{v(t)}\| < 1/2, \quad 0 \le t \le t_1. \tag{10.5}$$

It remains to justify (10.5) for small enough $\varepsilon > 0$. In order to prove the bound (10.5), we will need the formula (4.29) and the following relation which follows from (4.29):

$$\tilde P^d_{v(t)} \tilde X(t) = \sum p_{jl}(v(t))\, \tilde\tau_j(v(t))\, \Omega(\tilde\tau_l(v(t)), \tilde X(t)), \tag{10.6}$$

where $\tilde\tau_j(v(t))$ are the vectors $\tau_j(v(t))$ expressed in the variables $y_1$. In detail (cf. (3.3)),

$$\tilde\tau_1(v) := (-\psi_v'(y_1 - d_1(t)), -\pi_v'(y_1 - d_1(t))), \qquad \tilde\tau_2(v) := (\partial_v \psi_v(y_1 - d_1(t)), \partial_v \pi_v(y_1 - d_1(t))), \tag{10.7}$$

where $v = v(t)$. Since $\tau_j$ are smooth functions rapidly decaying at infinity, Lemma 9.1 implies

$$\|\tilde\tau_j(v(t)) - \tau_j(v(t))\|_{E_\beta} \le C\varepsilon^2, \quad 0 \le t \le t_1, \quad j = 1, 2. \tag{10.8}$$

Furthermore,

$$\tau_j(v(t)) - \tau_j(v(t_1)) = \int_{t_1}^t \dot v(s)\, \partial_v \tau_j(v(s))\, ds,$$

and therefore

$$\|\tau_j(v(t)) - \tau_j(v(t_1))\|_{E_\beta} \le C \int_t^{t_1} |\dot v(s)|\, ds, \quad 0 \le t \le t_1, \tag{10.9}$$

$$|p_{jl}(v(t)) - p_{jl}(v(t_1))| = \left| \int_t^{t_1} \dot v(s)\, \partial_v p_{jl}(v(s))\, ds \right| \le C \int_t^{t_1} |\dot v(s)|\, ds, \quad 0 \le t \le t_1, \tag{10.10}$$

since $|\partial_v p_{jl}(v(s))|$ is uniformly bounded by (5.2). Further,

$$\int_t^{t_1} |\dot v(s)|\, ds \le C m_1^2(t_1) \int_t^{t_1} \frac{ds}{(1 + s)^3} \le C\varepsilon^2, \quad 0 \le t \le t_1. \tag{10.11}$$

Hence, the bounds (10.5) will follow from (4.29), (10.6) and (10.8)–(10.10) if we choose $\varepsilon > 0$ small enough. The proof is completed.


11. Decay of Transversal Component

Here we prove Proposition 7.1.

Step i) We fix $\varepsilon > 0$ and $t_* = t_*(\varepsilon)$ for which Lemma 10.1 holds. Then the bounds of type (9.7) and (9.8) hold with $\|P_1^c \tilde X(t)\|_{E_{-\beta}}$ and $\|(P_1^c \tilde X(t))_1\|_{L^\infty}$ in the left-hand sides replaced by $\|X(t)\|_{E_{-\beta}}$ and $\|\Psi(t)\|_{L^\infty}$:

$$\|X(t)\|_{E_{-\beta}} \le \frac{C \|X(0)\|_{E_\beta}}{(1 + t)^{3/2}} + C \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty}}{(1 + |t - s|)^{3/2}}\, ds, \tag{11.1}$$

$$\|\Psi(t)\|_{L^\infty} \le \frac{C \|X(0)\|_{E_\beta \cap W}}{(1 + t)^{1/2}} + C \int_0^t \frac{\|X(s)\|^2_{E_{-\beta}} + \|\Psi(s)\|_{L^\infty} \|X(s)\|_{E_{-\beta}} + (1 + s)^{3/2+\beta} \|\Psi(s)\|^{12}_{L^\infty} + \|\Psi(s)\|^{11}_{L^\infty}}{(1 + |t - s|)^{1/2}}\, ds \tag{11.2}$$

for $0 \le t \le t_1$ and $t_1 < t_*$. This implies an integral inequality for the majorants $m_1$ and $m_2$. Namely, multiplying both sides of (11.1) by $(1 + t)^{3/2}$, and taking the supremum in $t \in [0, t_1]$, we obtain

$$m_1(t_1) \le C \|X(0)\|_{E_\beta} + C \sup_{t \in [0, t_1]} \int_0^t \frac{(1 + t)^{3/2}}{(1 + |t - s|)^{3/2}} \left[ \frac{m_1^2(s)}{(1 + s)^3} + \frac{m_1(s) m_2(s)}{(1 + s)^2} + \frac{m_2^{12}(s)(1 + s)^{3/2+\beta}}{(1 + s)^6} \right] ds$$

for $t_1 < t_*$. Taking into account that $m_j(t)$ are monotone increasing functions, we get

$$m_1(t_1) \le C \|X(0)\|_{E_\beta} + C [m_1^2(t_1) + m_1(t_1) m_2(t_1) + m_2^{12}(t_1)] I_1(t_1), \quad t_1 < t_*, \tag{11.3}$$

where

$$I_1(t_1) = \sup_{t \in [0, t_1]} \int_0^t \frac{(1 + t)^{3/2}\, ds}{(1 + |t - s|)^{3/2} (1 + s)^{9/2-\beta}} \le \bar I_1 < \infty, \quad t_1 \ge 0, \quad 5/2 < \beta < 3.$$

Therefore, (11.3) becomes

$$m_1(t_1) \le C \|X(0)\|_{E_\beta} + C \bar I_1 [m_1^2(t_1) + m_1(t_1) m_2(t_1) + m_2^{12}(t_1)], \quad t_1 < t_*. \tag{11.4}$$

Similarly, multiplying both sides of (11.2) by $(1 + t)^{1/2}$, and taking the supremum in $t \in [0, t_1]$, we get

$$m_2(t_1) \le C \|X(0)\|_{E_\beta \cap W} + C [m_1^2(t_1) + m_1(t_1) m_2(t_1) + m_2^{12}(t_1) + m_2^{11}(t_1)] I_2(t_1), \quad t_1 < t_*, \tag{11.5}$$

where

$$I_2(t_1) = \sup_{t \in [0, t_1]} \int_0^t \frac{(1 + t)^{1/2}\, ds}{(1 + |t - s|)^{1/2} (1 + s)^{9/2-\beta}} \le \bar I_2 < \infty, \quad t_1 \ge 0, \quad 5/2 < \beta < 3.$$
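The uniform boundedness of $I_1(t_1)$ and $I_2(t_1)$ rests on the exponent $9/2 - \beta$ exceeding $1$ when $\beta < 3$, so that $(1 + s)^{-(9/2-\beta)}$ is integrable. The sketch below evaluates both quantities numerically on a grid. It is a rough sanity check under choices made here (the sample value $\beta = 2.75$, a midpoint rule, a finite grid for the supremum, and the helper names), not a derivation of the constants $\bar I_1$, $\bar I_2$.

```python
def weighted_convolution(t, p, q, n=2000):
    """(1+t)^p * int_0^t (1+t-s)^(-p) * (1+s)^(-q) ds via the midpoint rule."""
    if t == 0:
        return 0.0
    h = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += (1.0 + t - s) ** (-p) * (1.0 + s) ** (-q) * h
    return (1.0 + t) ** p * total


def sup_over_grid(t1, p, q, samples=40):
    """Approximate sup over t in [0, t1] on a uniform grid of sample points."""
    return max(weighted_convolution(t1 * j / samples, p, q) for j in range(samples + 1))


if __name__ == "__main__":
    beta = 2.75          # sample value in (5/2, 3), chosen for illustration
    q = 4.5 - beta       # exponent 9/2 - beta
    for t1 in (10.0, 100.0, 1000.0):
        print(t1, sup_over_grid(t1, 1.5, q), sup_over_grid(t1, 0.5, q))
```

Splitting the integral at $s = t/2$ gives the same conclusion analytically: on $[0, t/2]$ the kernel $(1 + t - s)^{-p}$ is comparable to $(1 + t)^{-p}$, and on $[t/2, t]$ the weight $(1 + s)^{-(9/2-\beta)}$ supplies the decay.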


Therefore, (11.5) becomes

$$m_2(t_1) \le C \|X(0)\|_{E_\beta \cap W} + C \bar I_2 [m_1^2(t_1) + m_1(t_1) m_2(t_1) + m_2^{12}(t_1) + m_2^{11}(t_1)], \quad t_1 < t_*. \tag{11.6}$$

Inequalities (11.4) and (11.6) imply that $m_1(t_1)$ and $m_2(t_1)$ are bounded for $t_1 < t_*$, and moreover,

$$m_1(t_1), \ m_2(t_1) \le C \|X(0)\|_{E_\beta \cap W}, \quad t_1 < t_* \tag{11.7}$$

since $m_1(0) = \|X(0)\|_{E_{-\beta}}$ and $m_2(0) = \|\Psi(0)\|_{L^\infty}$ are sufficiently small by (2.8).

Step ii) The constant $C$ in the estimate (11.7) does not depend on $t_*$ by Lemma 10.1. We choose $d_0$ in (2.8) so small that $\|X(0)\|_{E_\beta \cap W} < \varepsilon/(2C)$. It is possible due to (2.8). Finally, this implies that $t_* = \infty$, and (11.7) holds for all $t_1 > 0$ if $d_0$ is small enough.

12. Soliton Asymptotics

Here we prove our main Theorem 2.5 using the decay (7.1). The estimates (6.11) and (7.1) imply that

$$|\dot c(t)| + |\dot v(t)| \le \frac{C_1(v, d_0)}{(1 + t)^3}, \quad t \ge 0. \tag{12.1}$$

Therefore, $c(t) = c_+ + O(t^{-2})$ and $v(t) = v_+ + O(t^{-2})$, $t \to \infty$. Similarly,

$$b(t) = c(t) + \int_0^t v(s)\, ds = v_+ t + q_+ + \alpha(t), \qquad \alpha(t) = O(t^{-1}). \tag{12.2}$$

We have obtained the solution $Y(x, t) = (\psi(x, t), \pi(x, t))$ to (1.12) in the form

$$Y(x, t) = Y_{v(t)}(x - b(t), t) + X(x - b(t), t), \tag{12.3}$$

where we define now $v(t) = \dot b(t) = v_+ + \dot\alpha(t)$. Since $\|Y_{v(t)}(x - b(t), t) - Y_{v_+}(x - v_+ t - q_+, t)\|_E = O(t^{-1})$, it remains to extract the dispersive wave $W_0(t)\Phi_+$ from the term $X(x - b(t), t)$. Substituting (12.3) into (1.12) we obtain by (2.6) the inhomogeneous Klein-Gordon equation for $X(x - b(t), t)$:

$$\dot X(y, t) = A_v^0 X(y, t) + R(y, t), \quad 0 \le t \le \infty, \tag{12.4}$$

where $y = x - b(t)$, and

$$A_v^0 = \begin{pmatrix} v\nabla & 1 \\ \Delta - m^2 & v\nabla \end{pmatrix}, \qquad R(t) = \begin{pmatrix} \dot v\, \partial_v \psi_v \\ \dot v\, \partial_v \pi_v + F(\Psi + \psi_v) - F(\psi_v) + m^2 \Psi \end{pmatrix}.$$

Now we change the variable $y \mapsto y_1 = y + \alpha(t) + q_+$. Then we obtain the "frozen" equation

$$\dot{\tilde X}(t) = A_+ \tilde X(t) + \tilde R(t), \quad 0 \le t \le \infty, \tag{12.5}$$


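where $\tilde X(t)$ and $\tilde R(t)$ are $X(t)$ and $R(t)$ as functions of $y = y_1 - \alpha(t) - q_+$, and

$$A_+ = \begin{pmatrix} v_+\nabla & 1 \\ \Delta - m^2 & v_+\nabla \end{pmatrix}. \tag{12.6}$$

Equation (12.5) implies

$$\tilde X(t) = W_+(t) \tilde X(0) + \int_0^t W_+(t - s) \tilde R(s)\, ds, \tag{12.7}$$

where $W_+(t) = e^{A_+ t}$ is the integral operator with integral kernel

$$W_+(y_1 - z, t) = W_0(y_1 - z + v_+ t, t) = W_0(x - z, t),$$

since by (12.2) $y_1 + v_+ t = y + \alpha(t) + q_+ + v_+ t = x - b(t) + \alpha(t) + q_+ + v_+ t = x$. Hence, Eq. (12.7) implies

$$X(x - b(t), t) = W_0(t) \tilde X(0) + \int_0^t W_0(t - s) \tilde R(s)\, ds. \tag{12.8}$$

Let us rewrite (12.8) as

$$X(x - b(t), t) = W_0(t) \left[ \tilde X(0) + \int_0^\infty W_0(-s) \tilde R(s)\, ds \right] - \int_t^\infty W_0(t - s) \tilde R(s)\, ds = W_0(t)\Phi_+ + r_+(t).$$

To establish the asymptotics (2.9), it suffices to prove that

$$\Phi_+ = \tilde X(0) + \int_0^\infty W_0(-s) \tilde R(s)\, ds \in E \quad \text{and} \quad \|r_+(t)\|_E = O(t^{-1/2}). \tag{12.9}$$

Assumption (2.8) implies that $\tilde X(0) \in E$. Let us split $\tilde R(s)$ as the sum

$$\tilde R(s) = \begin{pmatrix} \dot v\, \partial_v \tilde\psi_v \\ \dot v\, \partial_v \tilde\pi_v \end{pmatrix} + \begin{pmatrix} 0 \\ F(\tilde\Psi + \tilde\psi_v) - F(\tilde\psi_v) + m^2 \tilde\Psi \end{pmatrix} = \tilde R'(s) + \tilde R''(s).$$

By (12.1), we obtain

$$\|\tilde R'(s)\|_E = O(s^{-3}). \tag{12.10}$$

Let us consider $\tilde R'' = (0, \tilde R''_2)$. We have

$$\tilde R''_2 = F(\tilde\Psi + \tilde\psi_v) - F(\tilde\psi_v) + m^2 \tilde\Psi = (F'(\tilde\psi_v) + m^2)\tilde\Psi + \tilde N(v, \tilde\Psi) = -\tilde V_v \tilde\Psi + \tilde N(v, \tilde\Psi).$$

By (1.17) and (7.1), we obtain

$$\|\tilde V_v \tilde\Psi(s)\|_{L^2} \le C \|\tilde\Psi(s)\|_{L^2_{-\beta}} \le C(v, d_0)(1 + |s|)^{-3/2}, \tag{12.11}$$

since $|q_+ + \alpha(s)| \le C$. Finally, (7.1), (7.4), and (8.6) imply

$$\|\tilde N(v, \tilde\Psi(s))\|_{L^2} \le C(v, d_0)(1 + |s|)^{-3/2}. \tag{12.12}$$

Hence, (12.11)–(12.12) imply

$$\|\tilde R''(s)\|_E = O(s^{-3/2}), \tag{12.13}$$

and (12.9) follows by (12.10) and (12.13).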


A. Virial Type Estimates

Here we prove the weighted estimate (4.42). Let us recall that we split the solution $Y(t) = (\psi(\cdot, t), \pi(\cdot, t)) = S(\sigma(t)) + X(t)$, and denote $X(t) = (\Psi(t), \Pi(t))$, $(\Psi_0, \Pi_0) := (\Psi(0), \Pi(0))$. Our basic condition (2.8) implies that for some $\nu > 0$,

$$\|X_0\|_{E_{5/2+\nu}} \le d_0 < \infty. \tag{A.1}$$

Proposition A.1. Let the potential $U$ satisfy conditions U1, and $X_0$ satisfy (A.1). Then the bounds hold

$$\|\Psi(t)\|_{L^2_{5/2+\nu}} \le C(v, d_0)(1 + t)^{4+\nu}, \quad t > 0. \tag{A.2}$$

We will deduce the proposition from the following two lemmas. The first lemma is well known. Denote

$$e(x, t) = \frac{|\pi(x, t)|^2}{2} + \frac{|\psi'(x, t)|^2}{2} + U(\psi(x, t)).$$

Lemma A.2. For the solution $\psi(x, t)$ of the Klein-Gordon equation (1.2) the local energy estimate holds

$$\int_{a_1}^{a_2} e(x, t)\, dx \le \int_{a_1 - t}^{a_2 + t} e(x, 0)\, dx, \quad a_1 < a_2, \quad t > 0. \tag{A.3}$$

Proof. The estimate follows by standard arguments: multiplication of Eq. (1.2) by $\dot\psi(x, t)$ and integration over the trapezium $ABCD$, where $A = (a_1 - t, 0)$, $B = (a_1, t)$, $C = (a_2, t)$, $D = (a_2 + t, 0)$. Then (A.3) is obtained after partial integration using that $U(\psi) \ge 0$.

Lemma A.3. For any $\sigma \ge 0$ and $b \in \mathbb{R}$,

$$\int (1 + |x - b|^\sigma) e(x, t)\, dx \le C(\sigma)(1 + t + |b|)^{\sigma+1} \int (1 + |x|^\sigma) e(x, 0)\, dx. \tag{A.4}$$

Proof. By (A.3)

$$\int (1 + |y|^\sigma) \left( \int_{y+b-1}^{y+b} e(x, t)\, dx \right) dy \le \int (1 + |y|^\sigma) \left( \int_{y+b-1-t}^{y+b+t} e(x, 0)\, dx \right) dy.$$

Hence,

$$\int e(x, t) \left( \int_{x-b}^{x-b+1} (1 + |y|^\sigma)\, dy \right) dx \le \int e(x, 0) \left( \int_{x-b-t}^{x-b+1+t} (1 + |y|^\sigma)\, dy \right) dx. \tag{A.5}$$

Obviously,

$$\int_{x-b}^{x-b+1} (1 + |y|^\sigma)\, dy \ge c(\sigma)(1 + |x - b|^\sigma) \tag{A.6}$$

with some $c(\sigma) > 0$. On the other hand,

$$\int_{x-b-t}^{x-b+1+t} (1 + |y|^\sigma)\, dy \le (2t + 1)(1 + t + |b| + |x|)^\sigma \le C(1 + t + |b|)^{\sigma+1} (1 + |x|^\sigma), \tag{A.7}$$

since $\sigma \ge 0$. Finally, (A.5)–(A.7) imply (A.4).

Asymptotic Stability of Moving Kink for Relativistic Ginzburg-Landau Equation

Proof of Proposition A.1. First, we verify that U0 = (1 + |x|5+2ν )U (ψ0 (x))d x < C(d0 ), ψ0 (x) = ψ(x, 0).

251

(A.8)

Indeed, ψ0 (x) = ψv0 (x − q0 ) + 0 (x) is bounded since 0 ∈ H 1 (R). Hence U1 implies that |U (ψ0 (x))| ≤ C(d0 )(ψ0 (x) ± a)2 ≤ C(d0 ) (ψv0 (x − q0 ) ± a)2 + 02 (x) and then (A.8) follows by (1.14), (1.18) and (A.1). Further, we have

2 ˙ (y, s)ds − 0 (y) dy = (1 + |y| ) 0 t ˙ 2 (y, s)ds. ≤ 2d02 + 2t (1 + |y|5+2ν )dy

(t) 2L 2 5/2+ν

5+2ν

t

(A.9)

0

Due to (4.2) and (12.1)–(12.2) we have 2 ˙ ˙ 2 (y, s) = b(s)ψ (y + b(s), s) + π(y + b(s), s) − v∂ ˙ v ψv (y) ≤ C(v, d0 ) (ψ (y + b(s), s))2 + π 2 (y + b(s), s) + (∂v ψv (y))2 ≤ C(v, d0 ) e(y + b(s), s) + (∂v ψv (y))2 . (A.10) Substituting (A.10) into (A.9) and changing variables we obtain by (A.4) and (A.8) that t 2 2 5+2ν (t) L 2 ≤ 2d0 + C(v, d0 )t )e(x, s)d x + C(v) ds (1 + |x − b(s)| 5/2+ν 0 ≤ 2d02 + C(v, d0 )t 2 + C(v, d0 )t (1 + |x|5+2ν )e(x, 0)d x t × (1 + s + |b(s)|)6+2ν ds 0

$$\le 2d_0^2 + C(v,d_0)\,t^2 + C(v,d_0)(1 + t)^{8+2\nu}\big(\|X_0\|^2_{E_{5/2+\nu}} + U_0\big) \le C(v,d_0)(1 + t)^{8+2\nu}.$$

Communicated by H. Spohn

Commun. Math. Phys. 302, 253–289 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1186-5


The Interaction of a Gap with a Free Boundary in a Two Dimensional Dimer System

M. Ciucu1, C. Krattenthaler2

1 Department of Mathematics, Indiana University, Bloomington, IN 47405-5701, USA. E-mail: [email protected]

2 Fakultät für Mathematik der Universität Wien, Nordbergstraße 15, A-1090 Wien, Austria

Received: 11 December 2009 / Accepted: 23 August 2010 Published online: 14 January 2011 – © Springer-Verlag 2011

Abstract: Let $\ell$ be a fixed vertical lattice line of the unit triangular lattice in the plane, and let H be the half plane to the left of $\ell$. We consider lozenge tilings of H that have a triangular gap of side-length two and in which $\ell$ is a free boundary, i.e., tiles are allowed to protrude out half-way across $\ell$. We prove that the correlation function of this gap near the free boundary has asymptotics $\frac{1}{4\pi r}$, $r \to \infty$, where $r$ is the distance from the gap to the free boundary. This parallels the electrostatic phenomenon by which the field of an electric charge near a conductor can be obtained by the method of images.

Research partially supported by NSF grant DMS-0500616.

Research partially supported by the Austrian Science Foundation FWF, grants Z130-N13 and S9607-N13, the latter in the framework of the National Research Network “Analytic Combinatorics and Probabilistic Number Theory.”

1. Introduction

The study of the interaction of gaps in dimer coverings was introduced in the literature by Fisher and Stephenson [15]. This pioneering work contains three different types of gap interaction in dimer systems on the square lattice: (i) interaction of two dimer-gaps (equivalently, interaction of two fixed dimers required to be contained in the dimer coverings); (ii) interaction of two non-dimer-gaps (specifically, two monomers), and (iii) the interaction of a dimer-gap with a constrained boundary (edge or corner). The first of these types of interactions was later generalized by Kenyon [20] to an arbitrary number of dimer-gaps on the square and hexagonal lattices, and recently by Kenyon, Okounkov and Sheffield [22] to general planar bipartite lattices. Interactions of the second type were studied by the first author of the present paper in [5–9], where close analogies to two dimensional electrostatics were established. Two instances of interaction of non-dimer-gaps with constrained boundaries can be found in [21, Sect. 7.5] (interaction of a monomer with a constrained straight line boundary on the square lattice), and respectively [6, Theorem 2.2] (interaction of a family of triangular gaps with a constrained straight line boundary on the hexagonal lattice).

In this paper we determine the interaction of a triangular gap with a free straight line boundary (i.e., dimers are allowed to protrude out across it) on the hexagonal lattice. This type of interaction has not been treated before in the literature. (We are aware of one other paper, namely [12], addressing the asymptotic behavior of lozenge tilings under the presence of a free boundary, but the regions considered there contain no gaps.) We find that the gap is attracted to the free boundary in precise analogy to the (two dimensional) electrostatic phenomenon in which an electric charge is attracted by a straight line conductor when placed near it. This develops further the analogy between dimer systems with gaps and electrostatics that the first author has described in [6–9]. More generally, our result shows that in any physical system that can be modeled by dimer coverings, a gap will tend to be attracted to an interface corresponding to a free boundary. This effect, purely entropic in origin, is reminiscent of the Cheerios effect by which an air bubble at the surface of a liquid in a container is attracted to the walls [35] (note that the Cheerios effect is not entropic in origin).

2. Set-up and Results

There seem to be no methods in the literature for finding the interaction of a gap “in a sea of dimers” with a free boundary. However, as V. I. Arnold said, “mathematics is a part of physics where experiments are cheap.” We now design such an experiment in order to determine the interaction of a gap in a dimer system on the hexagonal lattice with a free boundary.

Consider the triangular lattice in the plane consisting of unit equilateral triangles, drawn so that one family of lattice lines is vertical. Note that the hexagonal lattice is the dual of the triangular lattice. (To be precise, the hexagonal graph arises as the graph whose vertices are the unit triangles, and whose edges connect precisely those unit triangles that share an edge.) Dimers on the hexagonal lattice then correspond to lozenges (i.e., unit rhombi) consisting of pairs of adjacent unit triangles. The free boundary we choose is a lattice line $\ell$, say vertical, on the triangular lattice, to the left of which the plane is covered completely and without overlapping by lozenges, except for a gap $\triangleleft_2$ in the shape of a triangle of side-length 2, pointing to the left; the lozenges are allowed to protrude halfway across the free boundary, to its right. (Figure 1 pictures a portion of such a tiling.)

We define the correlation function (or simply correlation) of the hole $\triangleleft_2$ with the free boundary as follows. Choose a rectangular system of coordinates in which $\ell$ is the y-axis, the origin is some lattice point on $\ell$, and the unit is the lattice spacing. Let $\triangleleft_2(k)$ be the placement of $\triangleleft_2$ so that the center $C$ of its right side has coordinates $(-k\sqrt{3}, 0)$ (i.e., $C$ and the origin are the endpoints of a string of $k$ contiguous horizontal lozenges; Fig. 1 illustrates $\triangleleft_2(2)$, the origin being denoted by $O$ there). Let $H_{n,x}$ be the lattice hexagon of side-lengths $2n, 2n, 2x, 2n, 2n, 2x$ (in counter-clockwise order, starting with the southwestern side) centered at the origin (thus $H_{n,x}$ is vertically symmetric about $\ell$, and its horizontal symmetry axis cuts $\triangleleft_2(k)$ into two equal parts; for example, Fig. 2 shows the hexagon $H_{4,4}$ with the triangular hole $\triangleleft_2(2)$). Let $F_{n,x}$ be the region obtained from the left half of $H_{n,x}$ by regarding its boundary along $\ell$ as free (i.e., lozenges in a tiling of $F_{n,x}$ are allowed to protrude outward across $\ell$). Figure 3 shows the region $F_{3,3}$ together with such a lozenge tiling; the origin is labelled by $O$.

Fig. 1. A partial lozenge tiling of the left half plane with a gap

Following [15] and [6], for any fixed integer $k \ge 0$ we define the correlation of $\triangleleft_2(k)$ with the free boundary $\ell$, denoted $\omega_f(k)$, by

$$\omega_f(k) := \lim_{n\to\infty} \frac{M(F_{n,n} \setminus \triangleleft_2(k))}{M(F_{n,n})}, \tag{2.1}$$

where $M(R)$ stands for the number of lozenge tilings of the region $R$ (if $R$ has portions of the boundary that are free, as in our case, then it is understood that what we count is tilings in which lozenges are allowed to protrude out across the free portions). A tiling of $F_{4,4} \setminus \triangleleft_2(2)$ of this type is illustrated in Fig. 1. We note that, by [11], in a regular hexagon lozenges have maximum entropy statistics (in the scaling limit) at the center. According to this, (2.1) is a natural definition for the correlation function. An analogous definition was used in [6].

In fact, it is worth generalizing the definition of correlation (2.1) to the situation when the side-lengths $2n$ and $2x$ of the half hexagon $F_{n,x}$ go to infinity at different rates. More precisely, for any real number $\xi > 0$, define $\omega_f(k;\xi)$ by

$$\omega_f(k;\xi) := \lim_{n\to\infty} \frac{M(F_{n,\xi_n n} \setminus \triangleleft_2(k))}{M(F_{n,\xi_n n})}, \tag{2.2}$$

where $(\xi_n)_{n\ge 1}$ is a suitable sequence of rational numbers approaching $\xi$. (“Suitable” here means that we have to choose $\xi_n$ in such a way that $\xi_n n$ is integral.) The number $\omega_f(k;\xi)$ is the correlation of the triangular gap $\triangleleft_2(k)$ with the free boundary, obtained when the large regions used in the definition are the left halves of hexagons that are not necessarily regular, but have their left vertical side $\xi$ times as long as the two oblique sides.

Fig. 2. The hexagon $H_{n,x}$ with $n = x = 4$

Fig. 3. A lozenge tiling of the region $F_{n,n}$ with $n = 3$

In Lemma 13 we obtain an exact expression for $\omega_f(k;\xi)$ in terms of an integral. What affords this is an exact formula for $M(F_{n,x} \setminus \triangleleft_2(k))$, which we present in Theorem 4. We then deduce the asymptotics of $\omega_f(k;\xi)$ as $k \to \infty$ using Laplace’s method (see Lemma 14 and the proof of Theorem 1 in Sect. 7). The result is the following.

Theorem 1. As $k \to \infty$, the correlation $\omega_f(k;\xi)$ is asymptotically

$$\omega_f(k;\xi) \sim \frac{1}{\pi(1+\xi)^2\sqrt{\xi(2+\xi)}} \cdot \frac{1}{k}\left(\frac{2}{1+\xi}\right)^{4k}. \tag{2.3}$$
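The leading term in (2.3) is easy to explore numerically. The following is a minimal sketch, not part of the original argument: the function names `omega_f_asymptotic` and `regular_case` are hypothetical, and the code only evaluates the right-hand side of (2.3), not the correlation itself. It checks that for $\xi = 1$ the expression reduces to $1/(4\pi\sqrt{3}\,k)$, where $\sqrt{3}\,k$ is the distance of the point $C = (-k\sqrt{3}, 0)$ from the free boundary, and it prints the ratio of consecutive values for a few choices of $\xi$.

```python
import math


def omega_f_asymptotic(k: int, xi: float) -> float:
    """Right-hand side of (2.3): leading-order value of omega_f(k; xi)."""
    prefactor = 1.0 / (math.pi * (1.0 + xi) ** 2 * math.sqrt(xi * (2.0 + xi)))
    return prefactor * (2.0 / (1.0 + xi)) ** (4 * k) / k


def regular_case(k: int) -> float:
    """1 / (4 pi d) with d = sqrt(3) * k, the distance from C to the y-axis."""
    return 1.0 / (4.0 * math.pi * math.sqrt(3.0) * k)


# For xi = 1 the exponential factor is 1 and (2.3) is a pure 1/distance law.
for k in (1, 10, 100):
    assert math.isclose(omega_f_asymptotic(k, 1.0), regular_case(k), rel_tol=1e-12)

# For xi != 1 consecutive values differ by roughly the factor (2/(1+xi))^4.
for xi in (0.5, 1.0, 2.0):
    ratio = omega_f_asymptotic(51, xi) / omega_f_asymptotic(50, xi)
    print(f"xi={xi}: ratio={ratio:.6f}, (2/(1+xi))^4={(2.0 / (1.0 + xi)) ** 4:.6f}")
```

The printed ratios differ from $(2/(1+\xi))^4$ only by the factor $k/(k+1)$ coming from the $1/k$ term, which makes the growth for $\xi < 1$ and the decay for $\xi > 1$ visible directly.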

Remark 1. Note that, by the results of [11], we should expect distorted dimer statistics around the gap for $\xi \ne 1$. As the theorem above shows, the distortion is quite radical. Indeed, for $\xi = 1$, Theorem 1 gives

$$\omega_f(k) = \omega_f(k;1) \sim \frac{1}{4\pi\sqrt{3}} \cdot \frac{1}{k} = \frac{1}{4\pi} \cdot \frac{1}{d(\triangleleft_2(k), \ell)}, \tag{2.4}$$

where $d$ is the Euclidean distance. However, for $\xi \ne 1$, $\omega_f(k;\xi)$ decays exponentially to 0 or blows up exponentially, according as $\xi > 1$ or $\xi < 1$.

Remark 2. The exponential behavior of $\omega_f(k;\xi)$ for $\xi \ne 1$ is in fact closely mirrored also in the setting of [6], where the correlation of holes was defined by including them at the center of large hexagons. Indeed, using arguments of [6], it follows that the correlation of two $\triangleleft_2$’s on the symmetry axis of the hexagons is exponential for $\xi \ne 1$. This has the interesting consequence that the limits of entries of the inverse Kasteleyn matrices of hexagons on the one hand and of “corresponding” tori on the other do not agree, not even at the center of the hexagons. To be more precise, note that the center of large hexagons is in the liquid regime of [22]. By [32, Ch. 8 and 9] (see also [22, Theorem 2.1]), there exists an ergodic Gibbs measure coming from (weighted) lozenge tilings on a large torus whose slope is the same as the slope of the limit shape at the center of the large hexagon. One might be tempted to guess that, in this situation, the entries of the inverse Kasteleyn matrix (which determine correlation) behave similarly in the limit. However, using [22, Theorem 4.3, (7), with $P(z,w) = a + bz + cw$ and $Q(z,w) = 1$] and appropriate asymptotic expansions in the special case that we are interested in, it turns out that the correlation of two $\triangleleft_2$’s under any torus measure in the liquid regime behaves polynomially in the distance of the $\triangleleft_2$’s, in contrast to the exponential behavior for the hexagon mentioned above. This provides a concrete example highlighting the subtlety of taking limits of entries of the inverse Kasteleyn matrix: they depend quite sensitively on how the infinite plane is achieved as a limit of graphs.

In [7] the first author described how a distribution of fixed holes on the triangular lattice defines in a natural way two vector fields. The F-field is a discrete vector field defined at the center of each left-pointing unit triangle $e$, and equal to the expected orientation of the lozenge covering $e$ (under the uniform measure on the set of tilings). To define the T-field, one introduces an extra “test-hole” $t$ and measures the relative change in the correlation function under small displacements of it, as the other holes are kept fixed. One can prove (details will appear elsewhere) that in the scaling limit of the lattice spacing approaching zero, this relative change is given by the scalar product of the displacement vector with a certain vector $T(z)$, where $z$ is the point to which the test hole $t$ shrinks when the lattice spacing approaches zero. This defines the second field.

When these fields are generated by lozenge tilings that cover the entire plane with the exception of a finite collection of fixed-size holes (the case treated in [7] and [9]), both the T-field and the scaling limit of the F-field turn out to be equal, up to a constant multiple, to the electrostatic field of the two dimensional physical system obtained by viewing the holes as electrical charges. But what if we do not tile the entire plane, but only the half-plane to the left of the free boundary $\ell$, and we have no holes? The above definitions for the F-field and T-field would still work, provided (i) the scaling limit of the discrete field defining F exists, and (ii) the scaling limit of the relative changes in the correlation function under small displacements of a test hole exists and is given by taking scalar products of the displacement vector with the vectors of a certain field.

Our exact determination of $\omega_f(k;\xi)$ (see Lemma 13) allows us to verify (ii) for displacements along the horizontal direction. $\triangleleft_2(k)$ plays now the role of a test charge. The expression that measures the relative change in correlation in this case is $\frac{\omega_f(k+1;\xi)}{\omega_f(k;\xi)} - 1$. What we have to do is to determine the leading term in the asymptotics of this expression as $k \to \infty$. We obtain the following result, whose proof is given in Sect. 7.

Theorem 2. We have

$$\frac{\omega_f(k+1;\xi)}{\omega_f(k;\xi)} - 1 = \begin{cases} \left(\dfrac{2}{1+\xi}\right)^4 - 1 + O\!\left(\dfrac{1}{k}\right) & \text{if } \xi \ne 1, \\[2mm] -\dfrac{1}{k} + O\!\left(\dfrac{1}{k^2}\right) & \text{if } \xi = 1, \end{cases} \tag{2.5}$$

as $k \to \infty$.

Remark 3. In order to indicate the dependence on $\xi$ (the asymptotic ratio of the sides of our half hexagon), we write $T_\xi$ for the T-field. By symmetry, displacements of $\triangleleft_2(k)$ parallel to $\ell$ leave $\omega_f$ unchanged, so the relative change in $\omega_f$ corresponding to such displacements is zero. Suppose that $\xi \ne 1$. Then, provided the field $T_\xi$ exists, it follows from Theorem 2 that its value at $z$ is

$$T_\xi(z) = \frac{e_1}{2\sqrt{3}}\left[\left(\frac{2}{1+\xi}\right)^4 - 1\right], \tag{2.6}$$

where $e_1$ is the unit vector in the positive direction of the x-axis (the $2\sqrt{3}$ at the denominator comes from the fact that $T_\xi$ arises from the expression on the left-hand side of (2.5) divided by the product of the displacement, $\sqrt{3}$ in this case, and the “charge” of the hole $\triangleleft_2(k)$, which is 2; see [7] for details). In particular, the field $T_\xi$ is constant. On the other hand, if $\xi = 1$, then the second assertion in Theorem 2 yields

$$T_1(z) = -\frac{e_1}{2\sqrt{3}\,k} = -\frac{e_1}{2\,d(z,\ell)}. \tag{2.7}$$

Note that by [7] we would obtain (up to a multiplicative constant of 2) the same T-field at $z$ if we look at tilings of the entire plane, with the mirror image of our test-hole $\triangleleft_2(k)$ being a fixed hole. This is analogous to the phenomenon in electrostatics by which the field created by an electric charge placed near a conductor can be obtained by the method of images (see e.g. [13, Chap. 6]).

The F-field could be determined by an “experiment” analogous to the one we described at the beginning of this section: simply replace $\triangleleft_2(k)$ by $L(k)$, the horizontal lozenge contained in $\triangleleft_2(k)$. Recall that, by definition, the F-field at a left-pointing unit triangle $e$ is determined by the probabilities $p_1, p_2, p_3$ that $e$ is occupied by a horizontal, northwest-pointing, or southwest-pointing lozenge, respectively. More precisely (cf. [9]),

$$F(e) = p_1 e_1 + p_2 e_2 + p_3 e_3,$$

where $e_1, e_2, e_3$ are unit vectors parallel to the long diagonals of the above three lozenges, that is, $e_1 = (1, 0)$, $e_2 = \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$, $e_3 = \left(-\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)$. Hence, since $p_2 = p_3$ by symmetry and $p_1 + p_2 + p_3 = 1$, it suffices to determine $p_1$, that is, the limit of the proportion of the number of lozenge tilings of $F_{n,x}$ that contain $L(k)$, as $n$ and $x$ go to infinity so that $x/n$ approaches a fixed positive real number $\xi$. It turns out that, for fixed $n$, $x$, and $k$, the number of lozenge tilings of $F_{n,x} \setminus L(k)$ is given by a formula similar to (3.2), namely by the formula in Theorem 15. By lemmas that are analogous to Lemmas 12–14 (see Lemmas 16 and 17, and the text in between), one can then derive that the probability $p_1$ is given by (see Corollary 18)

$$p_1 = \frac{2}{\pi}\arctan\frac{1}{\sqrt{\xi(2+\xi)}}. \tag{2.8}$$
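Formulas (2.6) and (2.8) can be compared numerically. The sketch below is only an illustration under stated assumptions: the names `p1`, `f_intensity`, `t_intensity` and `bisect` are hypothetical, the F-field is evaluated directly from $F(e) = p_1 e_1 + p_2 e_2 + p_3 e_3$ with $p_2 = p_3 = (1 - p_1)/2$, and the T-field intensity is the coefficient of $e_1$ in (2.6). It confirms that $p_1 = 1/3$ at $\xi = 1$ (all three lozenge orientations equally likely) and locates by bisection the second value of $\xi$ at which the two intensities coincide, which can be compared with the numerical value quoted in Remark 4.

```python
import math

SQRT3 = math.sqrt(3.0)


def p1(xi: float) -> float:
    """Probability of a horizontal lozenge, formula (2.8)."""
    return (2.0 / math.pi) * math.atan(1.0 / math.sqrt(xi * (2.0 + xi)))


def f_intensity(xi: float) -> float:
    """x-component of F = p1*e1 + p2*e2 + p3*e3 with p2 = p3 = (1 - p1)/2."""
    p = p1(xi)
    q = (1.0 - p) / 2.0
    e1, e2, e3 = (1.0, 0.0), (-0.5, SQRT3 / 2.0), (-0.5, -SQRT3 / 2.0)
    fy = p * e1[1] + q * e2[1] + q * e3[1]
    assert abs(fy) < 1e-12  # the vertical component cancels by symmetry
    return p * e1[0] + q * e2[0] + q * e3[0]


def t_intensity(xi: float) -> float:
    """Coefficient of e1 in (2.6)."""
    return ((2.0 / (1.0 + xi)) ** 4 - 1.0) / (2.0 * SQRT3)


def bisect(g, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection for a sign change of g on [lo, hi]."""
    g_lo = g(lo)
    assert g_lo * g(hi) < 0, "no sign change on the bracket"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# The bracket [2, 5] is an assumption chosen by inspecting the two functions.
crossing = bisect(lambda xi: t_intensity(xi) - f_intensity(xi), 2.0, 5.0)
print(f"p1(1) = {p1(1.0):.6f}, F(1) = {f_intensity(1.0):.2e}, crossing at xi = {crossing:.5f}")
```

Evaluating the F-field from the three unit vectors, rather than from a closed form, keeps the sketch tied to the definition given above; the x-component it returns simplifies to $(3p_1 - 1)/2$.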

Using standard formulas for trigonometric functions, it can be seen that the value of $p_1$ in (2.8) agrees with the probability of finding a lozenge in the center of a random tiling of a hexagon with side-lengths $2n, 2n, 2x, 2n, 2n, 2x$ in the limit $n, x \to \infty$ so that $x/n$ approaches $\xi$, as given in [11, Conjecture 6.1] (with $x = y = 0$, $\alpha = \gamma = 1$, $\beta = \xi$), proved in [3, Theorem 3.12]. (In fact, in the special case that is relevant here, Conjecture 6.1 of [11] was proved earlier in [10, Cor. 4].) Thus, the free boundary has no disturbing effect at all on the lozenge statistics. Let $F_\xi$ denote the F-field for the above situation. Then the definition of the F-field and the above considerations imply the following result.

Corollary 3. Let $e(k)$ be the leftmost left-pointing unit triangle of $\triangleleft_2(k)$. Then

$$F_\xi(e(k)) = \frac{3p_1 - 1}{2}\, e_1, \tag{2.9}$$

where $p_1$ is given by (2.8).

Remark 4. In the case where $\xi = 1$, Eqs. (2.7) and (2.9) imply that, in sharp contrast to the case of lozenge tilings of the plane with a finite number of fixed size holes, where the T- and F-field are the same up to a constant multiple (cf. the second paragraph after Remark 2), for the half-plane with free boundary the fields T and F have radically different behavior: while in the scaling limit the former behaves as the electrostatic field near a conductor, the latter is zero. It is amusing that, aside from $\xi = 1$, there is precisely one other value of $\xi$ where the field intensities in (2.6) and (2.9) agree. Figure 4 shows a plot of the two functions, with the intensity in (2.9) being the one approaching $-1/2$ for $\xi \to \infty$. Numerically, this other value of $\xi$ is 3.28262....

Fig. 4. Plot of the field intensities $T_\xi$ and $F_\xi$

Our approach to proving Theorems 1 and 2 consists of solving first the counting problem exactly, see Theorem 4. This result generalizes Andrews’ theorem [1] (which proved MacMahon’s conjecture on symmetric plane partitions) in the case $q = 1$. Its proof is given in Sects. 4 and 5, with some auxiliary results proved separately in Sect. 6. It is based on the “exhaustion/identification of factors” method described in [25, Sect. 2.4]. In Sect. 7, we perform the asymptotic calculations needed to derive Theorems 1 and 2 from the exact counting results. The final section, Sect. 8, presents the results that are needed for the determination of the F-field $F_\xi$ reported in Corollary 3.

3. An Exact Tiling Enumeration Formula

Tilings of the region $F_{n,x}$ are clearly equivalent to tilings of the hexagon $H_{n,x}$ that are invariant under reflection across its symmetry axis $\ell$. Counting such tilings was a problem considered (in the equivalent form of symmetric plane partitions) by MacMahon in the early twentieth century (see [28, p. 270]). MacMahon conjectured that the number of vertically symmetric lozenge tilings of a hexagon with side-lengths $2n, 2n, 2x, 2n, 2n, 2x$ is equal to

$$\frac{\left(x + \frac{1}{2}\right)_{2n}}{\left(\frac{1}{2}\right)_{2n}} \prod_{s=1}^{n} \frac{(2x + 2s)_{4n-4s+1}}{(2s)_{4n-4s+1}}, \tag{3.1}$$

where $(\alpha)_m$ is the Pochhammer symbol, defined by $(\alpha)_m := \alpha(\alpha + 1) \cdots (\alpha + m - 1)$ for $m \ge 1$, and $(\alpha)_0 := 1$. This was first proved by Andrews [1]. Other proofs, and refinements, were later found by e.g. Gordon [17], Macdonald [27, pp. 83–85], Proctor [31, Prop. 7.3], Fischer [14], and the second author of the present paper [23, Theorem 13].

Fig. 5. A symmetric lozenge tiling of the hexagon $H_{n,x}$ with two holes

Our “experiment”, counting $M(F_{n,x} \setminus \triangleleft_2(k))$, is by the same token equivalent to counting vertically symmetric lozenge tilings of $H_{n,x}$ with two missing triangles (compare Figs. 1 and 5). This is in fact a generalization of MacMahon’s symmetric plane partitions problem (see Remark 5). The key result that allows deducing Theorems 1 and 2 is the following.

Theorem 4. For all positive integers $n, x$ and nonnegative integers $k \le n - 1$, we have

$$M(F_{n,x} \setminus \triangleleft_2(k)) = \binom{4k+1}{2k} \frac{(n+k)!}{(x+n-k)_{2k+1}} \prod_{s=1}^{n} \frac{(2x+2s)_{4n-4s+1}}{(2s)_{4n-4s+1}}$$
$$\times \sum_{i=0}^{n-k-1} \frac{\left(\frac{1}{2}\right)_i}{i!\,(n-k-i-1)!^2\,(n+k-i+1)_{n-k}\,(n+k-i+1)_i\,\left(2n-i+\frac{1}{2}\right)_i}$$
$$\cdot \Big( (x)_i\,(x+i+1)_{n-k-i-1}\,(x+n+k+1)_{n-k} - (x)_{n-k}\,(x+n+k+1)_{n-k-i-1}\,(x+2n-i+1)_i \Big). \tag{3.2}$$
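Both (3.1) and (3.2) are finite products and sums of Pochhammer symbols, so they can be evaluated exactly with rational arithmetic. The sketch below is an illustration only: the helper names `poch`, `shared_product`, `macmahon` and `gap_count` are hypothetical, and the code evaluates the right-hand sides of the two formulas rather than counting tilings. It checks the specialization discussed in Remark 5 (replacing $x$ by $x-1$, $n$ by $n+1$ and $k$ by $n$ in (3.2) gives (3.1)) for a few small parameters.

```python
from fractions import Fraction
from math import comb, factorial

HALF = Fraction(1, 2)


def poch(a, m: int) -> Fraction:
    """Pochhammer symbol (a)_m = a (a+1) ... (a+m-1), with (a)_0 = 1."""
    result = Fraction(1)
    for j in range(m):
        result *= a + j
    return result


def shared_product(n: int, x: int) -> Fraction:
    """prod_{s=1}^{n} (2x+2s)_{4n-4s+1} / (2s)_{4n-4s+1}, common to (3.1) and (3.2)."""
    out = Fraction(1)
    for s in range(1, n + 1):
        m = 4 * n - 4 * s + 1
        out *= poch(2 * x + 2 * s, m) / poch(2 * s, m)
    return out


def macmahon(n: int, x: int) -> Fraction:
    """Right-hand side of (3.1)."""
    return poch(x + HALF, 2 * n) / poch(HALF, 2 * n) * shared_product(n, x)


def gap_count(n: int, x: int, k: int) -> Fraction:
    """Right-hand side of (3.2), for 0 <= k <= n - 1 and positive n, x."""
    prefactor = (comb(4 * k + 1, 2 * k) * Fraction(factorial(n + k))
                 / poch(x + n - k, 2 * k + 1) * shared_product(n, x))
    total = Fraction(0)
    for i in range(n - k):
        coeff = poch(HALF, i) / (
            factorial(i) * factorial(n - k - i - 1) ** 2
            * poch(n + k - i + 1, n - k) * poch(n + k - i + 1, i)
            * poch(2 * n - i + HALF, i)
        )
        bracket = (
            poch(x, i) * poch(x + i + 1, n - k - i - 1) * poch(x + n + k + 1, n - k)
            - poch(x, n - k) * poch(x + n + k + 1, n - k - i - 1) * poch(x + 2 * n - i + 1, i)
        )
        total += coeff * bracket
    return prefactor * total


# Specialization of Remark 5, checked on a small range of parameters.
for n in (1, 2):
    for x in (2, 3, 4):
        assert gap_count(n + 1, x - 1, n) == macmahon(n, x)

print(macmahon(1, 1), gap_count(2, 1, 0))
```

Using `Fraction` throughout avoids any rounding in the half-integer Pochhammer symbols, so integrality of the results can be tested exactly.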

Remark 5. Replacing $x$ by $x - 1$, $n$ by $n + 1$, and $k$ by $n$, one can see that the above formula specializes to MacMahon’s formula (3.1). More precisely, because of forced lozenges (see Fig. 6), the enumeration problem in the statement of Theorem 4 reduces to the problem of enumerating vertically symmetric lozenge tilings of a hexagon with side-lengths $2n, 2n, 2x, 2n, 2n, 2x$.

Fig. 6. Forced lozenges when the hole touches the left border

The proof of Theorem 4 is given in the next two sections. In Sect. 4, we show that $M(F_{n,x} \setminus \triangleleft_2(k))$ can be expressed in terms of a certain Pfaffian. This Pfaffian is then evaluated in Sect. 5.

4. Lozenge Tilings and Nonintersecting Lattice Paths

The purpose of this section is to find a manageable expression for $M(F_{n,x} \setminus \triangleleft_2(k))$ (see Lemma 6 at the end of this section). In this context, we will find it more convenient to think of the tilings of $F_{n,x} \setminus \triangleleft_2(k)$ directly as tilings of a half hexagon with an open boundary (cf. Fig. 7) as opposed to symmetric tilings of a hexagon with two holes (cf. Fig. 5). There is a well known bijection between lozenge tilings of lattice regions and families of “paths of lozenges” (see Fig. 7), which in turn are equivalent to families of non-intersecting lattice paths (see Fig. 8). Its application to our situation is illustrated in Figs. 7 and 8. The origin of the system of coordinates indicated in Fig. 8 corresponds to the point $O$ in Fig. 7 (note that the bottommost path of lozenges in Fig. 7 is empty for the illustrated tiling; the corresponding lattice path in Fig. 8 has no steps). By this bijection, lozenge tilings of $F_{n,x} \setminus \triangleleft_2(k)$ are seen to be equinumerous with families $(P_1, P_2, \dots, P_{2n})$ of non-intersecting lattice paths consisting of unit horizontal and vertical steps in the positive direction, where $P_i$ runs from $A_i = (-i, i)$ to some point from the set $I \cup \{S_1, S_2\}$, $i = 1, 2, \dots, 2n$, with

$$I = \{(-1, s) : s = 1, 2, \dots, 2x + 2n\}, \qquad S_1 = (-2k-1,\ x+n+k), \qquad S_2 = (-2k-2,\ x+n+k+1), \tag{4.1}$$

and the additional condition that $S_1$ and $S_2$ must be ending points of some paths.

Fig. 7. A lozenge tiling of the region $F_{n,x} \setminus \triangleleft_2(k)$; the right boundary is free. The dotted lines mark paths of lozenges. They determine the tiling uniquely

Fig. 8. The paths of lozenges of Fig. 7 drawn as non-intersecting lattice paths on $\mathbb{Z}^2$

At this point, we need a slight extension of Stembridge’s Theorem 3.2 in [34] (which is, in fact, derivable from the minor summation formula of Ishikawa and Wakayama [19, Theorem 2]). The reader should recall that the Pfaffian of a skew-symmetric $2n \times 2n$ matrix $A$ can be defined by (see e.g. [34, p. 102])

$$\operatorname{Pf} A := \sum_{\pi \in \mathcal{M}[1,\dots,2n]} \operatorname{sgn}\pi \prod_{\substack{i<j \\ i,j \text{ matched in } \pi}} A_{i,j}, \tag{4.2}$$

where $\mathcal{M}[1, 2, \dots, 2n]$ denotes the set of all perfect matchings (1-factors) of (the complete graph on) $\{1, 2, \dots, 2n\}$, and where $\operatorname{sgn}\pi = (-1)^{\operatorname{cr}(\pi)}$, with $\operatorname{cr}(\pi)$ denoting the number of “crossings” of $\pi$, that is, the number of quadruples $i < j < k < l$ such that, under $\pi$, $i$ is paired with $k$, and $j$ is paired with $l$. It is a well-known fact (see e.g. [34, Prop. 2.2]) that

$$(\operatorname{Pf} A)^2 = \det A. \tag{4.3}$$
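Definition (4.2) translates directly into a brute-force procedure for small matrices. The sketch below is an illustration, not an efficient algorithm: the function names are hypothetical, indices are 0-based, and the enumeration over all perfect matchings grows factorially. It computes the Pfaffian as a signed sum over matchings with the crossing sign, and checks (4.3) against an exact determinant.

```python
from fractions import Fraction


def perfect_matchings(points):
    """Yield all perfect matchings of a sorted list as lists of pairs (i, j), i < j."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def crossings(matching) -> int:
    """Number of pairs of edges {i,k}, {j,l} with i < j < k < l."""
    return sum(1 for (i, k) in matching for (j, l) in matching if i < j < k < l)


def pfaffian(a):
    """Pfaffian via (4.2): signed sum over perfect matchings."""
    total = 0
    for matching in perfect_matchings(list(range(len(a)))):
        term = -1 if crossings(matching) % 2 else 1
        for i, j in matching:
            term *= a[i][j]
        total += term
    return total


def determinant(a) -> Fraction:
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in a]
    n = len(m)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for cc in range(c, n):
                m[r][cc] -= factor * m[c][cc]
    return det


def skew(upper):
    """Skew-symmetric matrix from the rows of its strictly upper triangle."""
    n = len(upper) + 1
    a = [[0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            j = i + 1 + offset
            a[i][j], a[j][i] = value, -value
    return a


A4 = skew([[1, 2, 3], [4, 5], [6]])
# For a 4 x 4 matrix (4.2) reads A12*A34 - A13*A24 + A14*A23 = 6 - 10 + 12.
print(pfaffian(A4), determinant(A4))
```

The middle term carries the minus sign because the matching that pairs 1 with 3 and 2 with 4 has exactly one crossing.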

Theorem 5. Let {A1 , A2 , . . . , A p , S1 , S2 , . . . , Sq } and I = {I1 , I2 , . . . } be finite sets of lattice points in the integer lattice Z2 , with p + q even. Then q Q H = (−1) 2 Pf (sgn π ) · P nonint (Aπ → S ∪ I ), (4.4) −H t 0 π ∈Sp

where S p denotes the symmetric group on {1, 2, . . . , p}, Aπ = (Aπ(1) , Aπ(2) , . . . , Aπ( p) ), and P nonint (Aπ → S ∪ I ) is the number of families (P1 , P2 , . . . , Pp ) of nonintersecting lattice paths consisting of unit horizontal and vertical steps in the positive direction, with Pk running from Aπ(k) to Sk , for k = 1, 2, . . . , q, and to I jk , for k = q + 1, q + 2, . . . , p, the indices being required to satisfy jq+1 < jq+2 < · · · < j p . The matrix Q = (Q i, j )1≤i, j≤ p is defined by Q i, j =

P(Ai → Is ) · P(A j → It ) − P(A j → Is ) · P(Ai → It ) , (4.5) 1≤s
Interaction of a Gap with a Free Boundary in a Dimer System

265

where P(A → E) denotes the number of lattice paths from A to E, and the matrix H = (Hi, j )1≤i≤ p, 1≤ j≤q by Hi, j = P(Ai → S j ). In the special case when the starting and ending points satisfy a certain compatibility condition (called D-compatibility in [34]), the only permutation π which contributes to the right-hand side of (4.4) is the identity permutation, and (4.4) reduces to [34, Theorem 3.2]. In our context, the compatibility condition is not satisfied. However, the same arguments that prove [34, Theorem 3.2] can be used to obtain (4.4). (Alternatively, one could use the minor summation formula of Ishikawa and Wakayama [19, Theorem 2]. In it, choose m = p, r = q, and the skew-symmetric matrix B to be Bi, j = 1 for i < j — which makes all principal Pfaffian minors of B equal 1 — to expand the Pfaffian on the left-hand side of (4.4) into a sum of minors of a certain matrix. Each minor can then be seen to count certain families of nonintersecting lattice paths by the general form of the Lindström–Gessel–Viennot theorem [26, Lemma 1], [16, Theorem 1], and, altogether, these are the families that are described in the statement of Theorem 5.) Remark 6. In the case where p > q + |I |, there are more starting points than available ending points. However, Theorem 5 still holds: then the right-hand side of (4.4) is clearly zero, and the Pfaffian on the left-hand side follows to be zero by the above indicated arguments that prove Theorem 5. We now apply Theorem 5 to our situation, that is, p = 2n, q = 2, Ai = (−i, i), for i = 1, 2, . . . , 2n, and S1 , S2 , and I are given by (4.1). It is not difficult to convince oneself that, for this choice of starting and ending points, all families of nonintersecting lattice paths counted on the right-hand side of (4.4) give rise to even permutations π . Hence, the right-hand side of (4.4) indeed counts the families of nonintersecting lattice paths that we need to count. 
By Theorem 5, their number is equal to the negative value of the Pfaffian of Q H , (4.6) Mn (x) := −H t 0 where Q is a (2n) × (2n) skew-symmetric matrix with (i, j)-entry Q i, j given by (4.5), and where H is a (2n) × 2 matrix, in which the (i, j)-entry Hi, j is equal to the number of paths from Ai to S j , i = 1,2, . . . , 2n, j = 1, 2. (It is the negative value of the Pfaffian q

because of the sign (−1) 2 on the right-hand side of (4.4), as we have q = 2.) In particular, using the fact that the number of lattice paths on the integer lattice Z2 between two given lattice points is given by a binomial coefficient, we have x +n−k−1 , (4.7) Hi,1 = i − 2k − 1 x +n−k−1 Hi,2 = . (4.8) i − 2k − 2 On the other hand, substituting Ai = (−i, i) and Is = (−1, s) in (4.5), we have Q i, j = (P(Ai → Is ) · P(A j → It ) − P(A j → Is ) · P(Ai → It )) 1≤s
=

1≤s

s−1 i −1

t −1 j −1

−

s−1 j −1

t −1 i −1

266

M. Ciucu, C. Krattenthaler

=

1≤s≤t≤2x+2n

=

1≤t≤2x+2n

=

2x+2n t=1

s−1 i −1

t −1 j −1

t −1 t − j −1 i

j −i t

−

1≤s≤t≤2x+2n

1≤t≤2x+2n

s−1 j −1

t −1 i −1

t −1 t i −1 j

t t , i j

(4.9)

(4.10)

where we used the well-known identity

$$\sum_{t=0}^{X}\binom{t}{i-1}=\binom{X+1}{i} \tag{4.11}$$

to obtain (4.9). We may obtain an alternative expression for $Q_{i,j}$ by replacing $\frac1t\binom{t}{i}=\frac1i\binom{t-1}{i-1}$ in the last expression by $\frac1i\sum_{l=0}^{i-1}\binom{t-j}{l}\binom{j-1}{i-l-1}$, this equality being true because of the Chu–Vandermonde summation (cf. e.g. [18, Sect. 5.1, (5.27)])

$$\sum_{l=0}^{L}\binom{M}{l}\binom{N}{L-l}=\binom{M+N}{L}, \tag{4.12}$$

where $L$ is a non-negative integer. Thus, we arrive at

$$\begin{aligned}
Q_{i,j} &= \sum_{l=0}^{i-1}\sum_{t=1}^{2x+2n}\frac{j-i}{i}\binom{t-j}{l}\binom{j-1}{i-l-1}\binom{t}{j}\\
&= \sum_{l=0}^{i-1}\sum_{t=1}^{2x+2n}\frac{j-i}{i}\binom{j-1}{i-l-1}\binom{l+j}{l}\binom{t}{l+j}\\
&= \sum_{l=0}^{i-1}\frac{j-i}{i}\binom{j-1}{i-l-1}\binom{l+j}{l}\binom{2x+2n+1}{l+j+1}, \qquad (4.13)
\end{aligned}$$

the last line again being due to (4.11). To summarize, we have obtained the following result.

Lemma 6. For all positive integers $n$, $x$ and nonnegative integers $k$, we have

$$M\bigl(F_{n,x}\setminus\triangleright_2(k)\bigr) = -\operatorname{Pf} M_n(x), \tag{4.14}$$

where $M_n(x)$ is given by (4.6), with $Q_{i,j}$ defined in (4.10) or (4.13), and $H_{i,j}$ defined in (4.7) and (4.8).
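The agreement of the two expressions (4.10) and (4.13) for $Q_{i,j}$ can be spot-checked in exact arithmetic. The following is a minimal sketch for small positive integers only; the function names `q_sum` and `q_closed` are illustrative and do not appear in the text.

```python
from fractions import Fraction
from math import comb


def q_sum(i, j, x, n):
    """Q_{i,j} as the single sum (4.10), for positive integers x and n."""
    upper = 2 * x + 2 * n
    return sum(Fraction(j - i, t) * comb(t, i) * comb(t, j)
               for t in range(1, upper + 1))


def q_closed(i, j, x, n):
    """Q_{i,j} as the finite sum over l in (4.13)."""
    return sum(Fraction(j - i, i)
               * comb(j - 1, i - l - 1)
               * comb(l + j, l)
               * comb(2 * x + 2 * n + 1, l + j + 1)
               for l in range(i))


if __name__ == "__main__":
    n = 2
    for x in (1, 2, 3):
        for i in range(1, 2 * n + 1):
            for j in range(1, 2 * n + 1):
                assert q_sum(i, j, x, n) == q_closed(i, j, x, n)
    print("(4.10) and (4.13) agree on all tested entries")
```

The sum (4.10) has a number of terms that grows with $x$, whereas (4.13) has exactly $i$ terms and is visibly a polynomial in $x$; this is the form that the later steps rely on.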

Interaction of a Gap with a Free Boundary in a Dimer System


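5. Proof of Theorem 4

In the sequel, we shall interpret sums by

$$\sum_{k=m}^{n-1}\operatorname{Expr}(k)=\begin{cases}\displaystyle\sum_{k=m}^{n-1}\operatorname{Expr}(k) & n>m,\\[2mm] 0 & n=m,\\[1mm] \displaystyle-\sum_{k=n}^{m-1}\operatorname{Expr}(k) & n<m.\end{cases} \tag{5.1}$$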
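A small sketch of the convention (5.1) may help: it makes a sum with polynomial summand behave polynomially in its upper limit, also when that limit drops below the lower one. The helper name `signed_sum` is an assumption made for this illustration.

```python
def signed_sum(expr, m, n):
    """Sum of expr(k) for k = m, ..., n-1, interpreted as in (5.1)."""
    if n > m:
        return sum(expr(k) for k in range(m, n))
    if n == m:
        return 0
    # n < m: minus the sum over k = n, ..., m-1
    return -sum(expr(k) for k in range(n, m))


if __name__ == "__main__":
    # sum_{k=0}^{n-1} k = n(n-1)/2 continues to hold for negative n
    for n in range(-5, 6):
        assert signed_sum(lambda k: k, 0, n) == n * (n - 1) // 2
    print("closed form n(n-1)/2 holds for all tested n, including n < 0")
```

With this convention the usual splitting rule "sum from a to b-1 plus sum from b to c-1 equals sum from a to c-1" holds for every ordering of a, b, c.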

In particular, using this convention, the expression for $Q_{i,j}$ given in (4.10) makes sense for negative integers $x$ also (in which case the upper bound in the sum can be negative) and is actually equal to the expression in (4.13). It is the latter fact that we shall frequently make use of.

Our proof of Theorem 4 involves a sequence of five steps. By Lemma 6, we know that the number that we want to compute is the negative of a Pfaffian. We shall frequently use the fact (4.3) that the square of the Pfaffian of a skew-symmetric matrix is equal to its determinant. By definition, $\operatorname{Pf} M_n(x)$ is a polynomial in $x$. In Step 1 we prove that $\det M_n(x)=\det M_n(-2n-x)$. With $d$ denoting the degree of $\operatorname{Pf} M_n(x)$ as a polynomial in $x$, this implies that

$$\operatorname{Pf} M_n(x) = (-1)^d\operatorname{Pf} M_n(-2n-x). \tag{5.2}$$
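The symmetry behind (5.2) can be spot-checked numerically for small parameters. The sketch below assembles $M_n(x)$ from (4.6), (4.7), (4.8) and the polynomial form (4.13), and evaluates the Pfaffian by expansion along the first row. All function names (`gbinom`, `q_entry`, `build_M`, `pfaffian`) are illustrative assumptions, and the recursive expansion is only practical for very small matrices.

```python
from fractions import Fraction


def gbinom(a, k):
    """Generalized binomial coefficient binom(a, k); zero for k < 0."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)
    for r in range(k):
        result *= Fraction(a - r, r + 1)
    return result


def q_entry(i, j, x, n):
    """Q_{i,j} via (4.13); valid as a polynomial in x, so x may be negative."""
    return sum(Fraction(j - i, i) * gbinom(j - 1, i - l - 1) * gbinom(l + j, l)
               * gbinom(2 * x + 2 * n + 1, l + j + 1) for l in range(i))


def build_M(n, k, x):
    """The (2n+2) x (2n+2) skew-symmetric matrix M_n(x) of (4.6)."""
    size = 2 * n + 2
    M = [[Fraction(0)] * size for _ in range(size)]
    for i in range(1, 2 * n + 1):
        for j in range(1, 2 * n + 1):
            M[i - 1][j - 1] = q_entry(i, j, x, n)
        for eps in (1, 2):
            h = gbinom(x + n - k - 1, i - 2 * k - eps)  # (4.7), (4.8)
            M[i - 1][2 * n + eps - 1] = h
            M[2 * n + eps - 1][i - 1] = -h
    return M


def pfaffian(A):
    """Pfaffian by expansion along the first row (small matrices only)."""
    m = len(A)
    if m == 0:
        return Fraction(1)
    if m % 2:
        return Fraction(0)
    total = Fraction(0)
    for j in range(1, m):
        if A[0][j] == 0:
            continue
        keep = [r for r in range(m) if r not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j - 1) * A[0][j] * pfaffian(minor)
    return total


if __name__ == "__main__":
    n, k = 2, 0
    for x in range(-6, 3):
        left = pfaffian(build_M(n, k, x))
        right = pfaffian(build_M(n, k, -2 * n - x))
        assert left ** 2 == right ** 2  # det M_n(x) = det M_n(-2n - x)
    print("symmetry of det M_n(x) confirmed on the tested range")
```

Comparing squares of Pfaffians, rather than the Pfaffians themselves, mirrors the statement of Step 1, which concerns the determinant; the sign $(-1)^d$ in (5.2) is a separate matter.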

Subsequently, in Step 2 we show that

$$\prod_{\substack{s=1\\ s\ne n-k}}^{n}(x+s)^2_{2n-2s+1} \tag{5.3}$$

divides $\det M_n(x)$ as a polynomial in $x$ (here, $(x+s)^2_{2n-2s+1}$ is the square of the Pochhammer symbol $(x+s)_{2n-2s+1}$), while in Step 3 we show that

$$\prod_{s=1}^{n-1}\left(x+s+\tfrac12\right)^2_{2n-2s}$$

divides $\det M_n(x)$ (where, similarly, $(x+s+\frac12)^2_{2n-2s}$ is the square of the Pochhammer symbol $(x+s+\frac12)_{2n-2s}$). Both combined, this proves that

$$\prod_{\substack{s=1\\ s\ne n-k}}^{n}(x+s)_{2n-2s+1}\ \prod_{s=1}^{n-1}\left(x+s+\tfrac12\right)_{2n-2s},$$

which is a polynomial of degree

$$\sum_{s=1}^{n}(4n-4s+1)-(2k+1)=n(2n-1)-(2k+1),$$


divides $\operatorname{Pf} M_n(x)$ as a polynomial in $x$. The computation in Step 4 then shows that the degree of $\operatorname{Pf} M_n(x)$, as a polynomial in $x$, is at most $2n^2+n-4k-3$. Altogether, this implies that

$$-\operatorname{Pf} M_n(x) = P_n(x)\prod_{\substack{s=1\\ s\ne n-k}}^{n}(x+s)_{2n-2s+1}\ \prod_{s=1}^{n}\left(x+s+\tfrac12\right)_{2n-2s}, \tag{5.4}$$

where $P_n(x)$ is a polynomial in $x$ of degree at most $2n^2+n-4k-3-n(2n-1)+(2k+1)=2n-2k-2$. In Step 5, we determine the value of $P_n(x)$ at $x=0,-1,\dots,-n+k+1$ (see (5.30)). The corresponding calculations make use of an auxiliary lemma due to Mehta and Wang [29], see Theorem 7 and Corollary 10 in Sect. 6. By (5.2), this gives us at the same time the value of $P_n(x)$ at $x=-2n,-2n+1,\dots,-n-k-1$. In total, these are $2n-2k$ explicit evaluations of $P_n(x)$ at special values of $x$. Given the fact that the degree of $P_n(x)$ is at most $2n-2k-2$, they determine $P_n(x)$ uniquely, and an explicit expression for $P_n(x)$ can be written down using Lagrange interpolation. If this is substituted into (5.4), then the evaluation of $-\operatorname{Pf} M_n(x)$ is complete. After some manipulations, one arrives at the expression in (3.2).

Step 1. $\det M_n(x)=\det M_n(-2n-x)$. We prove this claim by transforming, up to sign, $M_n(x)$ into $M_n(-2n-x)$ by a sequence of elementary row and column operations (which, of course, leave the value of the determinant invariant). To be precise, for $i=2n,2n-1,\dots,2$ (in this order), we add

$$\sum_{a=1}^{i-1}\binom{i-1}{a-1}\cdot(\text{row } a)$$

to row $i$, and then for $j=2n,2n-1,\dots,2$, we add

$$\sum_{b=1}^{j-1}\binom{j-1}{b-1}\cdot(\text{column } b)$$

to column $j$. Let $M_n^{(1)}(x)$ denote the matrix which arises after these row and column operations. According to (4.9), the $(i,j)$-entry in $M_n^{(1)}(x)$ is

$$\sum_{t=1}^{2x+2n}\sum_{a=1}^{i}\sum_{b=1}^{j}\binom{i-1}{a-1}\binom{j-1}{b-1}\left(\binom{t-1}{b-1}\binom{t}{a}-\binom{t-1}{a-1}\binom{t}{b}\right) \tag{5.5}$$

for $1\le i,j\le 2n$. By (4.7) and (4.8), for $1\le i\le 2n$ and $j=2n+\varepsilon$, $\varepsilon=1,2$, the $(i,j)$-entry of $M_n^{(1)}(x)$ is

$$\sum_{a=1}^{i}\binom{i-1}{a-1}\binom{x+n-k-1}{a-2k-\varepsilon}, \tag{5.6}$$


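and, for $1\le j\le 2n$ and $i=2n+\varepsilon$, $\varepsilon=1,2$, it is

$$-\sum_{b=1}^{j}\binom{j-1}{b-1}\binom{x+n-k-1}{b-2k-\varepsilon}. \tag{5.7}$$

By Chu–Vandermonde summation (4.12), we have

$$\sum_{a=1}^{i}\binom{i-1}{a-1}\binom{t+\gamma}{a+\eta}=\sum_{a=1}^{i}\binom{i-1}{i-a}\binom{t+\gamma}{a+\eta}=\binom{t+i+\gamma-1}{i+\eta},$$

whence the expression (5.5) simplifies to

$$\begin{aligned}
\sum_{t=1}^{2x+2n}&\left(\binom{t+i-1}{i}\binom{t+j-2}{j-1}-\binom{t+i-2}{i-1}\binom{t+j-1}{j}\right)\\
&=\sum_{t=-2x-2n}^{-1}\left(\binom{-t+i-1}{i}\binom{-t+j-2}{j-1}-\binom{-t+i-2}{i-1}\binom{-t+j-1}{j}\right)\\
&=(-1)^{i+j-1}\sum_{t=-2x-2n}^{-1}\left(\binom{t}{i}\binom{t}{j-1}-\binom{t}{i-1}\binom{t}{j}\right)\\
&=(-1)^{i+j-1}\sum_{t=-2x-2n}^{-1}\left(\binom{t+1}{i}\binom{t}{j-1}-\binom{t}{i-1}\binom{t+1}{j}\right) \qquad (5.8)\\
&=(-1)^{i+j}\sum_{t=0}^{-2x-2n-1}\left(\binom{t+1}{i}\binom{t}{j-1}-\binom{t}{i-1}\binom{t+1}{j}\right) \qquad (5.9)\\
&=(-1)^{i+j}\sum_{t=1}^{-2x-2n}\left(\binom{t}{i}\binom{t-1}{j-1}-\binom{t-1}{i-1}\binom{t}{j}\right). \qquad (5.10)
\end{aligned}$$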

Here, we used the identity

$$\begin{aligned}
\binom{t}{i}\binom{t}{j-1}-\binom{t}{i-1}\binom{t}{j}
&=\left(\binom{t}{i}+\binom{t}{i-1}\right)\binom{t}{j-1}-\binom{t}{i-1}\left(\binom{t}{j}+\binom{t}{j-1}\right)\\
&=\binom{t+1}{i}\binom{t}{j-1}-\binom{t}{i-1}\binom{t+1}{j}
\end{aligned}$$

to obtain (5.8), and our convention (5.1) for sums to obtain (5.9). Comparison with (4.9) shows that this last expression is, up to the sign $(-1)^{i+j}$, exactly $Q_{i,j}$ with $x$ replaced by $-2n-x$. In a similar vein, the expression (5.6) simplifies to

$$\binom{x+n+i-k-2}{i-2k-\varepsilon}=(-1)^{i+\varepsilon}\binom{-x-n-k+1-\varepsilon}{i-2k-\varepsilon}, \tag{5.11}$$

while expression (5.7) simplifies to the same expression with $i$ replaced by $j$. Upon setting $\varepsilon=2$, this shows that the $(i,2n+2)$-entry in $M_n^{(1)}(x)$ is, up to the sign $(-1)^i$,


identical with the $(i,2n+2)$-entry in $M_n(-2n-x)$, with an analogous statement being true for the $(2n+2,j)$-entry of $M_n^{(1)}(x)$ and the $(2n+2,j)$-entry of $M_n(-2n-x)$.

We do one last row and one last column operation: in $M_n^{(1)}(x)$, we add the last row to the next-to-last row, and we add the last column to the next-to-last column. Let $M_n^{(2)}(x)$ denote the resulting matrix. By (5.11), for $i=1,2,\dots,2n$, the $(i,2n+1)$-entry of $M_n^{(2)}(x)$ is equal to

$$(-1)^{i+1}\binom{-x-n-k}{i-2k-1}+(-1)^{i}\binom{-x-n-k-1}{i-2k-2}=(-1)^{i+1}\binom{-x-n-k-1}{i-2k-1}, \tag{5.12}$$

which is, up to the sign $(-1)^{i+1}$, exactly the $(i,2n+1)$-entry in $M_n(-2n-x)$. An analogous statement is true for the $(2n+1,j)$-entries of $M_n^{(2)}(x)$ and $M_n(-2n-x)$.

In summary, as the two by two block in the lower right corner of $M_n(x)$ consists of zeros, the computations (5.10)–(5.12) show that the $(i,j)$-entry of $M_n^{(2)}(x)$ is $(-1)^{i+j}$ times the $(i,j)$-entry of $M_n(-2n-x)$. This implies $\det M_n(x)=\det M_n^{(2)}(x)=\det M_n(-2n-x)$, as claimed.

Step 2. $\prod_{s=1,\,s\ne n-k}^{n}(x+s)^2_{2n-2s+1}$ divides $\det M_n(x)$. (The reader should recall the remarks on notation after (5.3).) We begin by observing that the product in the claim can be also rewritten as

$$\prod_{\substack{s=1\\ s\ne n-k}}^{n}(x+s)^2_{2n-2s+1}=\prod_{s=1}^{n}(x+s)^{2s+2\chi(s<n-k)-2}\times\prod_{s=n+1}^{2n-1}(x+s)^{4n-2s+2\chi(s>n+k)-2}, \tag{5.13}$$

where $\chi(\mathcal A)=1$ if $\mathcal A$ is true and $\chi(\mathcal A)=0$ otherwise. In view of Step 1, it suffices to establish that

$$\prod_{s=1}^{n}(x+s)^{2s+2\chi(s<n-k)-2}$$

divides $\det M_n(x)$. Now let $s$, $a$ and $b$ be integers with $1\le s\le n$ and $1\le a\le b\le 2n$. We claim that

$$\sum_{i=a}^{b}\binom{b-a}{i-a}\cdot\bigl(\text{row } i \text{ of } M_n(-s)\bigr)=0, \tag{5.14}$$

as long as

(A) $a-b\le 2n-2s<a$, and

(B) either $b\le 2k$, or $b\ge 2k+3$ and $a-b+k+1\le n-s<a-k-1$.


Indeed, if we specialize (5.14) to the $j$th column, where $j\le 2n$, we obtain, using the expression (4.10) for $Q_{i,j}$,

$$\begin{aligned}
\sum_{i=a}^{b}\binom{b-a}{i-a}Q_{i,j}\Big|_{x=-s}
&=\sum_{i=a}^{b}\binom{b-a}{i-a}\sum_{t=1}^{2n-2s}\frac{j-i}{t}\binom{t}{i}\binom{t}{j}\\
&=\sum_{t=1}^{2n-2s}\sum_{i=a}^{b}\binom{b-a}{b-i}\left(\binom{t}{i}\binom{t-1}{j-1}-\binom{t-1}{i-1}\binom{t}{j}\right)\\
&=\sum_{t=1}^{2n-2s}\left(\binom{b-a+t}{b}\binom{t-1}{j-1}-\binom{b-a+t-1}{b-1}\binom{t}{j}\right). \qquad (5.15)
\end{aligned}$$

Here we used the Chu–Vandermonde summation (4.12) in the last line. Since, for $a-b\le 2n-2s<a$ (which is Condition (A)), the binomial coefficients containing the parameter $b$ in expression (5.15) are identically zero throughout the summation range, it is clear that the corresponding sum vanishes.

On the other hand, if we specialize (5.14) to the $(2n+1)$st column, we obtain, again using the Chu–Vandermonde summation (4.12),

$$\sum_{i=a}^{b}\binom{b-a}{i-a}H_{i,1}\Big|_{x=-s}=\sum_{i=a}^{b}\binom{b-a}{b-i}\binom{n-s-k-1}{i-2k-1}=\binom{b-a+n-s-k-1}{b-2k-1},$$

which vanishes for $b\le 2k$, and for $b\ge 2k+2$ and $0\le b-a+n-s-k-1<b-2k-1$, the last inequality being equivalent to $a-b+k+1\le n-s<a-k$, and if we specialize (5.14) to the $(2n+2)$nd column, we obtain

$$\sum_{i=a}^{b}\binom{b-a}{i-a}H_{i,2}\Big|_{x=-s}=\sum_{i=a}^{b}\binom{b-a}{b-i}\binom{n-s-k-1}{i-2k-2}=\binom{b-a+n-s-k-1}{b-2k-2},$$

which vanishes for $b\le 2k-1$, and for $b\ge 2k+3$ and $0\le b-a+n-s-k-1<b-2k-2$, the last inequality being equivalent to $a-b+k+1\le n-s<a-k-1$. This establishes our claim.

In order to prove that $(x+s)^{2s}$ divides $\det M_n(x)$ for $1\le s<n-k$, we use (5.14) with $a=2n-2s+1$ and $2n-2s+1\le b\le 2n$. It is not difficult to see that for these choices of $s$, $a$ and $b$ Conditions (A) and (B) are satisfied, so that we obtain $2s$ linear combinations of the rows that are linearly independent (as, for our choices of $a$ and $b$, the coefficients in (5.14) form a triangular array) and vanish when $x=-s$. This implies divisibility by $(x+s)^{2s}$ (cf. e.g. [24, Lemma in Sect. 2]).

To prove that $(x+s)^{2s-2}$ divides $\det M_n(x)$ for $n-k\le s\le n$, we use (5.14) with $a=2n-2s+1$ and $2n-2s+1\le b\le 2k$ on the one hand, and with $a=n+k-s+2$ and $2k+3\le b\le 2n$ on the other hand. Again, it is not difficult to see that for both types of choices of $s$, $a$ and $b$ Conditions (A) and (B) are satisfied, so that we obtain


$(2k+2s-2n)+(2n-2k-2)=2s-2$ linear combinations of the rows that are linearly independent and vanish at $x=-s$. In the same way as before, this implies divisibility by $(x+s)^{2s-2}$.

Step 3. $\prod_{s=1}^{n-1}(x+s+\frac12)^2_{2n-2s}$ divides $\det M_n(x)$. (The reader should recall the remarks on notation after (5.3).) We begin by observing that the product in the claim can be also rewritten as

$$\prod_{s=1}^{n-1}\left(x+s+\tfrac12\right)^2_{2n-2s}=\prod_{s=1}^{n-1}\left(x+s+\tfrac12\right)^{2s}\ \prod_{s=n}^{2n-2}\left(x+s+\tfrac12\right)^{4n-2s-2}.$$

In view of Step 1, it suffices to establish that

$$\prod_{s=1}^{n-1}\left(x+s+\tfrac12\right)^{2s}$$

divides $\det M_n(x)$. In order to prove the claim for $s<n-k$, we shall show that

$$\begin{aligned}
&(n-s-k)\cdot\bigl(\text{row } (2n-2s+1) \text{ of } M_n(-s-\tfrac12)\bigr)\\
&\quad+\tfrac12(n-s-k)\cdot\bigl(\text{row } (2n-2s) \text{ of } M_n(-s-\tfrac12)\bigr)\\
&\quad+\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i+2}}\bigl(\text{row } i \text{ of } M_n(-s-\tfrac12)\bigr)=0,
\end{aligned} \tag{5.16}$$

and that

$$\begin{aligned}
&\bigl(\text{row } i \text{ of } M_n(-s-\tfrac12)\bigr)+\frac{2i+2s-2n-2k-3}{i-2k-1}\bigl(\text{row } (i-1) \text{ of } M_n(-s-\tfrac12)\bigr)\\
&\quad+\frac{(2i+2s-2n-2k-3)(2i+2s-2n-2k-5)}{4(i-2k-1)(i-2k-2)}\bigl(\text{row } (i-2) \text{ of } M_n(-s-\tfrac12)\bigr)=0
\end{aligned} \tag{5.17}$$

for $i=2n-2s+2,2n-2s+3,\dots,2n$. As these are linearly independent row combinations, the claim will follow.

In order to prove the claim for $s\ge n-k$, we shall show that

$$\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\bigl(\text{row } i \text{ of } M_n(-s-\tfrac12)\bigr)=0, \tag{5.18}$$

that

$$\bigl(\text{row } i \text{ of } M_n(-s-\tfrac12)\bigr)=0 \tag{5.19}$$


for $i=2n-2s,2n-2s+1,\dots,2k$, and that

$$\begin{aligned}
&\bigl(\text{row } i \text{ of } M_n(-s-\tfrac12)\bigr)+\frac{2i+2s-2n-2k-3}{i-2k-1}\bigl(\text{row } (i-1) \text{ of } M_n(-s-\tfrac12)\bigr)\\
&\quad+\frac{(2i+2s-2n-2k-3)(2i+2s-2n-2k-5)}{4(i-2k-1)(i-2k-2)}\bigl(\text{row } (i-2) \text{ of } M_n(-s-\tfrac12)\bigr)=0
\end{aligned} \tag{5.20}$$

for $i=2k+3,2k+4,\dots,2n$. Again, as these are linearly independent row combinations, the claim will follow.

Let us first suppose $s\ge n-k$. We start with the proof of (5.18). Specializing (5.18) to the $j$th column, $j=1,2,\dots,2n$, by (4.9) we see that we must prove the identity

$$\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\left(\sum_{t=1}^{2n-2s-1}\binom{t-1}{j-1}\binom{t}{i}-\sum_{t=1}^{2n-2s-1}\binom{t-1}{i-1}\binom{t}{j}\right)=0. \tag{5.21}$$

In order to see that this is indeed true, we first extend the sum over $i$ to the range $i=0,1,\dots,2n-2s-1$, thereby obtaining

$$\begin{aligned}
&\sum_{i=0}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\left(\sum_{t=1}^{2n-2s-1}\binom{t-1}{j-1}\binom{t}{i}-\sum_{t=1}^{2n-2s-1}\binom{t-1}{i-1}\binom{t}{j}\right)-\frac{1}{2^{2n-2s-1}}\sum_{t=1}^{2n-2s-1}\binom{t-1}{j-1}\\
&\quad=\sum_{t=1}^{2n-2s-1}\binom{t-1}{j-1}\sum_{i=0}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\binom{t}{i}-\sum_{t=1}^{2n-2s-1}\binom{t}{j}\sum_{i=0}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\binom{t-1}{i-1}\\
&\qquad-\frac{1}{2^{2n-2s-1}}\binom{2n-2s-1}{j}
\end{aligned}$$

for the left-hand side of (5.21). Next we interchange the sum over $i$ with the sums over $t$, and subsequently we evaluate the (now inner) sums over $i$ by means of the binomial theorem. In this manner, the left-hand side of (5.21) becomes

$$\begin{aligned}
&\frac{1}{2^{2n-2s-1}}\sum_{t=1}^{2n-2s-1}(1-2)^{t}\binom{t-1}{j-1}+\frac{1}{2^{2n-2s-2}}\sum_{t=1}^{2n-2s-1}(1-2)^{t-1}\binom{t}{j}-\frac{1}{2^{2n-2s-1}}\binom{2n-2s-1}{j}\\
&\quad=-\frac{1}{2^{2n-2s-1}}\left(\binom{2n-2s-1}{j}+\sum_{t=1}^{2n-2s-1}(-1)^t\left(\binom{t}{j}+\binom{t-1}{j}\right)\right)=0,
\end{aligned}$$

as desired.


On the other hand, specializing (5.18) to the $j$th column, $j=2n+1,2n+2$, by (4.7) and (4.8) we obtain

$$\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i-1}}\binom{n-k-s-\frac32}{i-2k-\varepsilon},$$

where $\varepsilon=1,2$, which is indeed zero since the binomial coefficient always vanishes because of $i\le 2n-2s-1<2k+\varepsilon$, the last inequality being due to our assumption $s\ge n-k$.

That (5.19) holds can be easily checked by inspection.

For the proof of (5.20), we observe that we have $Q_{i,j}\big|_{x=-s-\frac12}=0$ for all $i\ge 2n-2s$, because in this case the appearance of the binomial coefficient $\binom{t}{i}$ in the sum in formula (4.10) implies that all summands of this sum vanish. In its turn, this entails that the left-hand side of (5.20) specialized to the $j$th column, where $1\le j\le 2n$, is trivially zero since $i>i-1>i-2\ge 2k+1\ge 2n-2s+1>2n-2s$, by our assumptions. To see that the left-hand side of (5.20) is as well zero when it is specialized to the $(2n+1)$st or $(2n+2)$nd column amounts to a routine verification using the expressions (4.7) and (4.8) for the corresponding matrix entries.

We now assume that $s<n-k$ and turn our attention to (5.16). The reader should notice that the relations (5.16) and (5.18) are relatively similar, the essential difference being the two extra terms in (5.16) corresponding to the $(2n-2s)$th and the $(2n-2s+1)$st row, respectively. If $1\le j\le 2n$, the proof of relation (5.16) specialized to column $j$ is therefore identical with the proof of relation (5.18) specialized to column $j$, because the entries in the first $2n$ columns of the $(2n-2s)$th and the $(2n-2s+1)$st row evaluated at $x=-s-\frac12$ are all zero. (The reader should recall formula (4.10).) To show the relation (5.16) specialized to the $(2n+1)$st respectively to the $(2n+2)$nd column requires however more work. We have to prove

$$(n-s-k)\left(\binom{n-s-k-\frac32}{2n-2s-2k-\varepsilon+1}+\frac12\binom{n-s-k-\frac32}{2n-2s-2k-\varepsilon}\right)+\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i+2}}\binom{n-s-k-\frac32}{i-2k-\varepsilon}=0,$$

where $\varepsilon=1,2$, respectively, after simplification,

$$\frac{(n-s-k)(\varepsilon-2)}{2}\cdot\frac{(-n+s+k+\varepsilon-\frac12)_{2n-2s-2k-\varepsilon}}{(2n-2s-2k-\varepsilon+1)!}+\sum_{i=1}^{2n-2s-1}\frac{(-1)^i}{2^{2n-2s-i+2}}\binom{n-s-k-\frac32}{i-2k-\varepsilon}=0. \tag{5.22}$$

We reverse the order of summation in the sum over $i$ (that is, we replace $i$ by $2n-2s-i-1$), and subsequently we write the (new) sum over $i$ in standard hypergeometric notation

$${}_pF_q\!\left[\begin{matrix}a_1,\dots,a_p\\ b_1,\dots,b_q\end{matrix};z\right]=\sum_{m=0}^{\infty}\frac{(a_1)_m\cdots(a_p)_m}{m!\,(b_1)_m\cdots(b_q)_m}\,z^m. \tag{5.23}$$


Thereby we obtain

$$\frac{(n-s-k)(\varepsilon-2)}{2}\cdot\frac{(-n+s+k+\varepsilon-\frac12)_{2n-2s-2k-\varepsilon}}{(2n-2s-2k-\varepsilon+1)!}-\frac{(-n+k+s+\varepsilon+\frac12)_{2n-2s-2k-\varepsilon-1}}{8\,(2n-2s-2k-\varepsilon-1)!}\,{}_2F_1\!\left[\begin{matrix}1,\,-2n+2k+2s+\varepsilon+1\\ -n+k+s+\varepsilon+\frac12\end{matrix};\frac12\right] \tag{5.24}$$

for the left-hand side of (5.22). If $\varepsilon=2$, then the ${}_2F_1$-series in (5.24) can be evaluated using Gauß' second ${}_2F_1$-summation (cf. [33, (1.7.1.9); App. (III.6)])

$${}_2F_1\!\left[\begin{matrix}a,\,-N\\ \frac12+\frac a2-\frac N2\end{matrix};\frac12\right]=\begin{cases}0 & \text{if } N \text{ is an odd nonnegative integer,}\\[2mm] \dfrac{\left(\frac12\right)_{N/2}}{\left(\frac12-\frac a2\right)_{N/2}} & \text{if } N \text{ is an even nonnegative integer.}\end{cases} \tag{5.25}$$

As a result, in this case, the expression (5.24) vanishes, whence (5.22) with $\varepsilon=2$ is satisfied, and thus relation (5.16) specialized to the $(2n+2)$nd column.

If $\varepsilon=1$, the ${}_2F_1$-series in (5.24) cannot be directly evaluated by means of Gauß' formula. However, we may in a first stage apply the contiguous relation

$${}_2F_1\!\left[\begin{matrix}a,\,b\\ c\end{matrix};z\right]={}_2F_1\!\left[\begin{matrix}a,\,b-1\\ c\end{matrix};z\right]+\frac{az}{c}\,{}_2F_1\!\left[\begin{matrix}a+1,\,b\\ c+1\end{matrix};z\right]$$

to transform (5.24) into

$$\begin{aligned}
&-\frac{(n-s-k)}{2}\cdot\frac{(-n+s+k+\frac12)_{2n-2s-2k-1}}{(2n-2s-2k)!}\\
&\quad-\frac{(-n+k+s+\frac32)_{2n-2s-2k-2}}{8\,(2n-2s-2k-2)!}\left({}_2F_1\!\left[\begin{matrix}1,\,-2n+2k+2s+1\\ -n+k+s+\frac32\end{matrix};\frac12\right]-\frac{1}{2n-2k-2s-3}\,{}_2F_1\!\left[\begin{matrix}2,\,-2n+2k+2s+2\\ -n+k+s+\frac52\end{matrix};\frac12\right]\right).
\end{aligned}$$

Both ${}_2F_1$-series in the last expression can now be evaluated by means of Gauß' formula (5.25). The first series simply vanishes, while the second series evaluates to a non-zero expression. If this is substituted, after some simplifications we obtain

$$-\frac{(-n+s+k+\frac12)_{2n-2s-2k-1}}{4\,(2n-2s-2k-1)!}-\frac{(-n+k+s+\frac32)_{2n-2s-2k-2}}{8\,(2n-2s-2k-2)!}=0.$$
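Gauß' second summation (5.25), used twice above, involves only terminating series and can therefore be confirmed in exact rational arithmetic. The following is a minimal sketch; the helper names `poch`, `gauss_lhs` and `gauss_rhs` are illustrative, and the parameters are restricted to values for which no Pochhammer symbol in a denominator vanishes.

```python
from fractions import Fraction


def poch(a, m):
    """Pochhammer symbol (a)_m = a (a+1) ... (a+m-1)."""
    result = Fraction(1)
    for r in range(m):
        result *= a + r
    return result


def gauss_lhs(a, N):
    """Terminating series 2F1[a, -N; 1/2 + a/2 - N/2; 1/2]."""
    c = Fraction(1 + a - N, 2)
    total = Fraction(0)
    factorial = 1
    for m in range(N + 1):
        if m > 0:
            factorial *= m
        total += poch(a, m) * poch(-N, m) / (factorial * poch(c, m)) * Fraction(1, 2 ** m)
    return total


def gauss_rhs(a, N):
    """Right-hand side of (5.25)."""
    if N % 2 == 1:
        return Fraction(0)
    return poch(Fraction(1, 2), N // 2) / poch(Fraction(1 - a, 2), N // 2)


if __name__ == "__main__":
    for a in (2, 4, 6):          # even a, even N: nonzero value
        for N in (0, 2, 4, 6):
            assert gauss_lhs(a, N) == gauss_rhs(a, N)
    for a in (1, 3):             # odd N: the series vanishes
        for N in (1, 3, 5):
            assert gauss_lhs(a, N) == 0
    print("(5.25) confirmed for the tested parameters")
```

The two parameter families tested correspond to the two uses in the text: $a=1$ with odd $N$ (the series vanishes) and $a=2$ with even $N$ (a nonzero value).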

This shows that for $\varepsilon=1$ the expression (5.24) vanishes as well, whence (5.22) with $\varepsilon=1$ is satisfied, and thus also relation (5.16) specialized to the $(2n+1)$st column.

The verification of (5.17) is completely analogous to that of (5.20) and is left to the reader.

Step 4. $\operatorname{Pf} M_n(x)$ is a polynomial in $x$ of degree at most $2n^2+n-4k-3$. By (4.13), $Q_{i,j}$ is a polynomial in $x$ of degree $i+j$. On the other hand, by recalling the definitions (4.7) and (4.8) of $H_{i,1}$ and $H_{i,2}$, respectively, one sees that the degree of $H_{i,1}$ in $x$ is $i-2k-1$, while the degree of $H_{i,2}$ is $i-2k-2$. It follows that, in the defining expansion of the determinant $\det M_n(x)$, each nonzero term has degree

$$\sum_{i=1}^{2n}i+\sum_{j=1}^{2n}j-2(2k+1)-2(2k+2)=4n^2+2n-8k-6.$$

The Pfaffian being the square root of the determinant (cf. (4.3)), the claim follows.

Step 5. Evaluation of $P_n(x)$ at $x=0,-1,\dots,-n+k+1$. The polynomial $P_n(x)$ is defined by means of (5.4). So, what we would like to do is to set $x=-\sigma$ in (5.4), $\sigma$ being one of $0,1,\dots,n-k-1$, evaluate $\operatorname{Pf} M_n(-\sigma)$, divide both sides of (5.4) by the products on the right-hand side of (5.4), and get the evaluation of $P_n(x)$ at $x=-\sigma$. However, the first product on the right-hand side of (5.4) unfortunately is zero for $x=-\sigma$, $1\le\sigma\le n-k-1$. (It is not zero for $\sigma=0$.) Therefore we have to find a way around this difficulty.

Fix a $\sigma$ with $1\le\sigma\le n-k-1$. Before setting $x=-\sigma$ in (5.4), we have to cancel $(x+\sigma)^\sigma$ (see (5.13)) on both sides of (5.4). That is, we should write (5.4) in the form

$$\begin{aligned}
P_n(x)=-\frac{1}{(x+\sigma)^\sigma}\operatorname{Pf} M_n(x)
&\times\prod_{\substack{s=1\\ s\ne\sigma}}^{n-k-1}(x+s)^{-s}\ \prod_{s=n-k+1}^{n}(x+s)^{-s+1}\ \prod_{s=n+1}^{2n-1}(x+s)^{-2n+s-\chi(s>n+k)+1}\\
&\times\prod_{s=1}^{n}\left(x+s+\tfrac12\right)^{-1}_{2n-2s},
\end{aligned} \tag{5.26}$$

and subsequently specialize $x=-\sigma$. However, in order to be able to perform this step, we need to evaluate

$$-\frac{1}{(x+\sigma)^\sigma}\operatorname{Pf} M_n(x)\Big|_{x=-\sigma}.$$

In order to accomplish this, we apply Lemma 11 with $N=2n+2$, $a=2n-2\sigma$, $b=2n$, and $A=M_n(x)$. Indeed, $(x+\sigma)$ is a factor of each entry in the $i$th row in matrix $M_n(x)$, for $i=2n-2\sigma+1,2n-2\sigma+2,\dots,2n$. We obtain

$$-\frac{1}{(x+\sigma)^\sigma}\operatorname{Pf} M_n(x)\Big|_{x=-\sigma}=-\operatorname{Pf}(\widehat Q)\,\operatorname{Pf}(S), \tag{5.27}$$

where

$$\widehat Q=\begin{pmatrix}\overline Q & \overline H\\ -\overline H^{\,t} & 0\end{pmatrix},$$

with $\overline Q$ being given by

$$\overline Q=\bigl(Q_{i,j}\big|_{x=-\sigma}\bigr)_{1\le i,j\le 2n-2\sigma}$$

and $\overline H$ by

$$\overline H=\bigl(H_{i,j}\big|_{x=-\sigma}\bigr)_{1\le i\le 2n-2\sigma,\ 1\le j\le 2}, \tag{5.28}$$

and where

$$S=\left(\frac{1}{x+\sigma}\,Q_{i+2n-2\sigma,\,j+2n-2\sigma}\Big|_{x=-\sigma}\right)_{1\le i,j\le 2\sigma}.$$

We point out that (5.27) also holds for $\sigma=0$ once we interpret the Pfaffian of an empty matrix (namely the Pfaffian of $S$) as 1. In particular, under that convention, the arguments below can be used for $0\le\sigma\le n-k-1$, that is, including $\sigma=0$.

We must now compute $\operatorname{Pf}(\widehat Q)$ and $\operatorname{Pf}(S)$. We start with the evaluation of $\operatorname{Pf}(S)$. It follows from (4.13) that the $(i,j)$-entry of $S$ is given by

$$S_{i,j}=\sum_{l=0}^{i+2n-2\sigma-1}(-1)^{l+j+1}\,\frac{j-i}{i+2n-2\sigma}\binom{j+2n-2\sigma-1}{i+2n-2\sigma-l-1}\binom{l+j+2n-2\sigma}{l}\cdot\frac{(2n-2\sigma+1)!\,(l+j-1)!}{(l+j+2n-2\sigma+1)!}.$$

If we write this using hypergeometric notation, we obtain the alternative expression

$$S_{i,j}=\frac{(-1)^{j+1}\,(j-i)_{i+2n-2\sigma}}{(2n-2\sigma+j+1)!\,(j)_{i-j+2n-2\sigma+1}}\,{}_3F_2\!\left[\begin{matrix}1-i-2n+2\sigma,\ 1+j+2n-2\sigma,\ j\\ 1-i+j,\ 2+j+2n-2\sigma\end{matrix};1\right].$$

Rewrite this expression as the limit

$$S_{i,j}=\lim_{\varepsilon\to0}\frac{(-1)^{j+1}\,(j-i)_{i+2n-2\sigma}}{(2n-2\sigma+j+1)!\,(j)_{i-j+2n-2\sigma+1}}\,{}_3F_2\!\left[\begin{matrix}1-i-2n+2\sigma,\ 1+j+2n-2\sigma,\ j\\ 1-i+j,\ 2+\varepsilon+j+2n-2\sigma\end{matrix};1\right].$$

Now we apply one of Thomae's ${}_3F_2$-transformation formulas (cf. [4, Ex. 7, p. 98])

$${}_3F_2\!\left[\begin{matrix}a,\,b,\,c\\ d,\,e\end{matrix};1\right]=\frac{\Gamma(e)\,\Gamma(d+e-a-b-c)}{\Gamma(e-a)\,\Gamma(d+e-b-c)}\,{}_3F_2\!\left[\begin{matrix}a,\,-b+d,\,-c+d\\ d,\,-b-c+d+e\end{matrix};1\right].$$

Thus, we obtain

$$\begin{aligned}
S_{i,j}=\lim_{\varepsilon\to0}\ &\frac{(-1)^{j+1}\,\Gamma(2n-2\sigma+\varepsilon+1)\,\Gamma(2n-2\sigma+j+\varepsilon+2)\,(j-i)_{i+2n-2\sigma}}{\Gamma(\varepsilon-i+2)\,\Gamma(4n-4\sigma+i+j+\varepsilon+1)\,(2n-2\sigma+j+1)!\,(j)_{i-j+2n-2\sigma+1}}\\
&\times{}_3F_2\!\left[\begin{matrix}1-i-2n+2\sigma,\ -i-2n+2\sigma,\ 1-i\\ 1-i+j,\ 2+\varepsilon-i\end{matrix};1\right],
\end{aligned}$$

or, in usual sum notation,

$$\begin{aligned}
S_{i,j}=\lim_{\varepsilon\to0}\sum_{l=0}^{i-1}\ &\frac{(-1)^{j+1}\,(j-i)\,\Gamma(2n-2\sigma+\varepsilon+1)\,\Gamma(2n-2\sigma+j+\varepsilon+2)}{\Gamma(l-i+\varepsilon+2)\,\Gamma(4n-4\sigma+i+j+\varepsilon+1)}\\
&\cdot\frac{(1-i)_l\,(l-i+j+1)_{2n-2\sigma+i-l-1}\,(2n-2\sigma+i-l)_l}{l!\,(2n-2\sigma+j+1)!\,(j)_{2n-2\sigma+i-j-l+1}}.
\end{aligned}$$

Because of the term $\Gamma(l-i+\varepsilon+2)$ in the denominator, in the limit only the summand for $l=i-1$ does not vanish. After simplification, this leads to

$$S_{i,j}=\frac{(-1)^{i+j}\,(j-i)\,(2n-2\sigma+i-1)!\,(2n-2\sigma+j-1)!}{(4n-4\sigma+i+j)!\,(2n-2\sigma+1)!}.$$

Fig. 9. (a) A lozenge tiling for the degenerate region. (b) Forced lozenges in case x = 0

We must evaluate the Pfaffian

$$\operatorname*{Pf}_{1\le i,j\le 2\sigma}(S_{i,j}).$$

By factoring some terms out of rows and columns, we see that

$$\operatorname*{Pf}_{1\le i,j\le 2\sigma}(S_{i,j})=(-1)^\sigma\,(2n-2\sigma+1)!^{-\sigma}\times\prod_{i=1}^{2\sigma}(2n-2\sigma+i-1)!\ \operatorname*{Pf}_{1\le i,j\le 2\sigma}\left(\frac{j-i}{(4n-4\sigma+i+j)!}\right).$$

This Pfaffian can be evaluated in closed form by Corollary 10 in the next section. The result is that

$$\operatorname{Pf}(S)=(-1)^\sigma\,(2n-2\sigma+1)!^{-\sigma}\prod_{i=1}^{2\sigma}(2n-2\sigma+i-1)!\times\prod_{i=0}^{\sigma-1}\frac{(2i+1)!}{(4n-2\sigma+2i+1)!}. \tag{5.29}$$

We finally turn to the evaluation of $\operatorname{Pf}(\widehat Q)$. If we compare (5.28) with (4.6), then we see that $\widehat Q=M_{n-\sigma}(0)$. Hence, using Lemma 6 with $n$ replaced by $n-\sigma$ and with $x=0$, we see that $-\operatorname{Pf}(\widehat Q)$ is equal to $M(F_{n-\sigma,0}\setminus\triangleright_2(k))$. (The reader should recall the definitions of the region $F_{n,x}$ and of the triangular hole $\triangleright_2(k)$ given in the Introduction, see again Fig. 7.) Fig. 9a shows a typical example where $n-\sigma=5$ and $k=2$. Since this region is degenerate, there are many forced lozenges, see Fig. 9b. The enumeration problem therefore reduces to the problem of determining the number of symmetric lozenge tilings of a hexagon with side-lengths $2k,2k,2,2k,2k,2$. This number is given by formula (3.1) with $n=k$ and $x=1$. If we substitute this in (5.27), together with the evaluation (5.29), then, after some manipulation, we obtain


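$$-\frac{1}{(x+\sigma)^\sigma}\operatorname{Pf} M_n(x)\Big|_{x=-\sigma}=(-1)^\sigma\binom{4k+1}{2k}\frac{(2\sigma)!}{(2n-2\sigma+1)!^{\sigma}\,2^\sigma\,\sigma!}\times\prod_{i=1}^{2\sigma}(2n-2\sigma+i-1)!\ \prod_{i=0}^{\sigma-1}\frac{(2i)!}{(4n-2\sigma+2i+1)!}.$$

Hence, by inserting this in (5.26), we have

$$\begin{aligned}
P_n(-\sigma)={}&(-1)^\sigma\binom{4k+1}{2k}\frac{(2\sigma)!}{(2n-2\sigma+1)!^{\sigma}\,2^\sigma\,\sigma!}\times\prod_{i=1}^{2\sigma}(2n-2\sigma+i-1)!\ \prod_{i=0}^{\sigma-1}\frac{(2i)!}{(4n-2\sigma+2i+1)!}\\
&\times\prod_{\substack{s=1\\ s\ne\sigma}}^{n-k-1}(-\sigma+s)^{-s}\ \prod_{s=n-k+1}^{n}(-\sigma+s)^{-s+1}\ \prod_{s=n+1}^{2n-1}(-\sigma+s)^{-2n+s-\chi(s>n+k)+1}\\
&\times\prod_{s=1}^{n}\left(-\sigma+s+\tfrac12\right)^{-1}_{2n-2s}.
\end{aligned} \tag{5.30}$$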

This completes the proof of Theorem 4.

6. An Auxiliary Determinant Evaluation, and an Auxiliary Pfaffian Factorization

Mehta and Wang proved the following determinant evaluation in [29]. (There is a typo in the formula stated in [29, Eq. (7)] in that the binomial coefficient $\binom nk$ is missing there.)

Theorem 7 ([29, Eq. (7)]). For all real numbers $a$, $b$ and positive integers $n$, we have

$$\det_{0\le i,j\le n-1}\bigl((a+j-i)\,\Gamma(b+i+j)\bigr)=\prod_{i=0}^{n-1}i!\,\Gamma(b+i)\ \sum_{k=0}^{n}(-1)^k\binom nk\bigl((b-a)/2\bigr)_k\,\bigl((b+a)/2\bigr)_{n-k}, \tag{6.1}$$

as long as the arguments occurring in the gamma functions avoid their singularities.

The sum on the right-hand side of (6.1) can be alternatively expressed as the coefficient of $z^n/n!$ in $(1+z)^{(a-b)/2}(1-z)^{(-a-b)/2}$. Therefore, in the case $a=0$ we obtain the following simpler determinant evaluation.

Corollary 8. For all real numbers $b$ and positive integers $n$, we have

$$\det_{0\le i,j\le n-1}\bigl((j-i)\,\Gamma(b+i+j)\bigr)=\chi(n\text{ is even})\,\frac{n!}{(n/2)!}\,(b/2)_{n/2}\prod_{i=0}^{n-1}i!\,\Gamma(b+i),$$

as long as the arguments occurring in the gamma functions avoid their singularities. Here, as before, $\chi(\mathcal A)=1$ if $\mathcal A$ is true and $\chi(\mathcal A)=0$ otherwise.


One can obtain the following slightly (but, for our purposes, essentially) stronger statement. It is stated as Eq. (4) in [29], with the argument for obtaining it hinted at the bottom of p. 231 of [29]. Since, from there, it is not completely obvious how to actually complete the argument, we provide a proof.

Proposition 9. For all real numbers $b$ and positive even integers $n$, we have

$$\operatorname*{Pf}_{0\le i,j\le n-1}\bigl((j-i)\,\Gamma(b+i+j)\bigr)=\prod_{i=0}^{\frac n2-1}(2i+1)!\,\Gamma(b+2i+1), \tag{6.2}$$

as long as the arguments occurring in the gamma functions avoid their singularities.

Proof. Since the Pfaffian of a skew-symmetric matrix equals the square root of its determinant (cf. (4.3)), the formula given by Corollary 8 yields, after a little manipulation, that

$$\operatorname*{Pf}_{0\le i,j\le n-1}\bigl((j-i)\,\Gamma(b+i+j)\bigr)=\varepsilon\prod_{i=0}^{\frac n2-1}(2i+1)!\,\Gamma(b+2i+1), \tag{6.3}$$

where $\varepsilon=+1$ or $\varepsilon=-1$. In order to determine the sign $\varepsilon$, we argue by induction on (even) $n$. Let us suppose that we have already proved (6.2) up to $n-2$. We now multiply both sides of (6.3) by $b+1$ and then let $b$ tend to $-1$. Thus, on the right-hand side we obtain the expression

$$\varepsilon\left(\prod_{i=0}^{\frac n2-1}(2i+1)!\right)\left(\prod_{i=1}^{\frac n2-1}\Gamma(2i)\right). \tag{6.4}$$

On the other hand, by the definition of the Pfaffian, on the left-hand side we obtain

$$\lim_{b\to-1}\left((b+1)\sum_{\pi\in\mathcal M[0,\dots,n-1]}\operatorname{sgn}\pi\prod_{\substack{i<j\\ i,j\text{ matched in }\pi}}(j-i)\,\Gamma(b+i+j)\right) \tag{6.5}$$

(with the obvious meaning of $\mathcal M[0,\dots,n-1]$; cf. the sentence containing (4.2)). In this sum, matchings $\pi$ for which all matched pairs $i,j$ satisfy $i+j>1$ do not contribute, because the corresponding summands vanish. However, there is only one possible pair $i,j$ with $0\le i<j$ for which $i+j\le1$, namely $(i,j)=(0,1)$. Therefore, the sum in (6.5) reduces to

$$\begin{aligned}
\lim_{b\to-1}(b+1)(1-0)\,\Gamma(b+1)\sum_{\pi\in\mathcal M[2,\dots,n-1]}\operatorname{sgn}\pi\prod_{\substack{i<j\\ i,j\text{ matched in }\pi}}(j-i)\,\Gamma(i+j-1)
&=\operatorname*{Pf}_{2\le i,j\le n-1}\bigl((j-i)\,\Gamma(i+j-1)\bigr)\\
&=\operatorname*{Pf}_{0\le i,j\le n-3}\bigl((j-i)\,\Gamma(i+j+3)\bigr),
\end{aligned}$$

where the next-to-last equality holds by the definition (4.2) of the Pfaffian. Now we can use the induction hypothesis to evaluate the last Pfaffian. Comparison with (6.4) yields that $\varepsilon=+1$.


By using the reflection formula (cf. [2, Theorem 1.2.1])

$$\Gamma(x)\,\Gamma(1-x)=\frac{\pi}{\sin\pi x}$$

for the gamma function, and the substitutions $i\to n-i-1$ and $j\to n-j-1$, it is not difficult to see that Proposition 9 is equivalent to the following.

Corollary 10. For all positive even integers $n$, we have

$$\operatorname*{Pf}_{0\le i,j\le n-1}\left(\frac{j-i}{\Gamma(b+i+j)}\right)=\prod_{i=0}^{\frac n2-1}\frac{(2i+1)!}{\Gamma(b+n+2i-1)}.$$
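Proposition 9 and Corollary 10 lend themselves to a quick exact check when $b$ is a positive integer, since then every gamma value is a factorial. The sketch below is an illustration under that restriction; `pfaffian`, `gamma_int`, `prop9_sides` and `cor10_sides` are names chosen here, and the row expansion is only suitable for small $n$.

```python
from fractions import Fraction
from math import factorial


def pfaffian(A):
    """Pfaffian by expansion along the first row (small matrices only)."""
    m = len(A)
    if m == 0:
        return Fraction(1)
    if m % 2:
        return Fraction(0)
    total = Fraction(0)
    for j in range(1, m):
        keep = [r for r in range(m) if r not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j - 1) * A[0][j] * pfaffian(minor)
    return total


def gamma_int(m):
    """Gamma(m) for a positive integer m."""
    return factorial(m - 1)


def prop9_sides(n, b):
    """Both sides of (6.2) for even n and integer b >= 1."""
    A = [[Fraction((j - i) * gamma_int(b + i + j)) for j in range(n)] for i in range(n)]
    rhs = Fraction(1)
    for i in range(n // 2):
        rhs *= factorial(2 * i + 1) * gamma_int(b + 2 * i + 1)
    return pfaffian(A), rhs


def cor10_sides(n, b):
    """Both sides of Corollary 10 for even n and integer b >= 1."""
    A = [[Fraction(j - i, gamma_int(b + i + j)) for j in range(n)] for i in range(n)]
    rhs = Fraction(1)
    for i in range(n // 2):
        rhs *= Fraction(factorial(2 * i + 1), gamma_int(b + n + 2 * i - 1))
    return pfaffian(A), rhs


if __name__ == "__main__":
    for n in (2, 4, 6):
        for b in (1, 2, 3):
            lhs, rhs = prop9_sides(n, b)
            assert lhs == rhs
            lhs, rhs = cor10_sides(n, b)
            assert lhs == rhs
    print("Proposition 9 and Corollary 10 hold for the tested n and b")
```

For instance, with $n=4$ and $b=1$ the Pfaffian in (6.2) is $1\cdot120-4\cdot48+18\cdot6=36$, which equals $(1!\,\Gamma(2))(3!\,\Gamma(4))$.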

We close this section by proving a factorization of a certain specialization of a Pfaffian that we need in Step 5 in Sect. 5.

Lemma 11. Let $N$, $a$, $b$ be positive integers with $a<b\le N$, where $N$ and $b-a$ are even. Let $A=(A_{i,j})_{1\le i,j\le N}$ be a skew-symmetric matrix with the following properties:

(1) The entries of $A$ are polynomials in $x$.

(2) The entries in rows $a+1,a+2,\dots,b$ (and, hence, also in the corresponding columns) are divisible by $x+\sigma$.

Then

$$\frac{1}{(x+\sigma)^{(b-a)/2}}\operatorname{Pf} A\,\Big|_{x=-\sigma}=\operatorname{Pf}\widehat A\cdot\operatorname{Pf} S, \tag{6.6}$$

where $\widehat A$ is the matrix which arises from $A$ by deleting rows and columns $a+1,a+2,\dots,b$ and subsequently specializing $x=-\sigma$, and

$$S=\left(\frac{1}{x+\sigma}A_{i,j}\Big|_{x=-\sigma}\right)_{a+1\le i,j\le b}.$$

Proof. By the definition (4.2) of the Pfaffian, we have

$$\frac{1}{(x+\sigma)^{(b-a)/2}}\operatorname{Pf} A\,\Big|_{x=-\sigma}=\left(\frac{1}{(x+\sigma)^{(b-a)/2}}\sum_{\pi\in\mathcal M[1,\dots,N]}\operatorname{sgn}\pi\prod_{\substack{i<j\\ i,j\text{ matched in }\pi}}A_{i,j}\right)\Bigg|_{x=-\sigma}.$$

Let $\mathcal M_1$ denote the subset of $\mathcal M[1,\dots,N]$ consisting of those matchings that pair all the elements from $\{a+1,a+2,\dots,b\}$ among themselves (and, hence, all the elements of the complement $\{1,2,\dots,a,b+1,b+2,\dots,N\}$ among themselves). Let $\mathcal M_2$ be the complement $\mathcal M[1,\dots,N]\setminus\mathcal M_1$. Then


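$$\begin{aligned}
\frac{1}{(x+\sigma)^{(b-a)/2}}\operatorname{Pf} A\,\Big|_{x=-\sigma}
&=\left(\frac{1}{(x+\sigma)^{(b-a)/2}}\sum_{\pi\in\mathcal M_1}\operatorname{sgn}\pi\prod_{\substack{i<j\\ i,j\text{ matched in }\pi}}A_{i,j}\right)\Bigg|_{x=-\sigma}\\
&\quad+\left(\frac{1}{(x+\sigma)^{(b-a)/2}}\sum_{\pi\in\mathcal M_2}\operatorname{sgn}\pi\prod_{\substack{i<j\\ i,j\text{ matched in }\pi}}A_{i,j}\right)\Bigg|_{x=-\sigma}.
\end{aligned} \tag{6.7}$$

Each term in the third line of (6.7) vanishes, since the product contains more than $(b-a)/2$ factors $A_{i,j}$ that are divisible by $x+\sigma$. On the other hand, every matching $\pi$ in $\mathcal M_1$ is the disjoint union of a matching $\pi'\in\mathcal M[1,2,\dots,a,b+1,b+2,\dots,N]$ and a matching $\pi''\in\mathcal M[a+1,a+2,\dots,b]$. If we also use the simple fact that $\operatorname{sgn}\pi=\operatorname{sgn}\pi'\cdot\operatorname{sgn}\pi''$ (as there are no crossings between paired elements of $\pi'$ and paired elements of $\pi''$), then we obtain

$$\begin{aligned}
\frac{1}{(x+\sigma)^{(b-a)/2}}\operatorname{Pf} A\,\Big|_{x=-\sigma}
&=\left(\frac{1}{(x+\sigma)^{(b-a)/2}}\sum_{\substack{\pi'\in\mathcal M[1,\dots,a,b+1,\dots,N]\\ \pi''\in\mathcal M[a+1,\dots,b]}}\operatorname{sgn}\pi'\cdot\operatorname{sgn}\pi''\left(\prod_{\substack{i<j\\ i,j\text{ matched in }\pi'}}A_{i,j}\right)\left(\prod_{\substack{i<j\\ i,j\text{ matched in }\pi''}}A_{i,j}\right)\right)\Bigg|_{x=-\sigma}\\
&=\left(\sum_{\pi'\in\mathcal M[1,\dots,a,b+1,\dots,N]}\operatorname{sgn}\pi'\prod_{\substack{i<j\\ i,j\text{ matched in }\pi'}}A_{i,j}\Big|_{x=-\sigma}\right)\cdot\left(\sum_{\pi''\in\mathcal M[a+1,\dots,b]}\operatorname{sgn}\pi''\prod_{\substack{i<j\\ i,j\text{ matched in }\pi''}}\frac{1}{x+\sigma}A_{i,j}\Big|_{x=-\sigma}\right).
\end{aligned}$$

By the definition (4.2) of the Pfaffian, the last expression is exactly the right-hand side of (6.6).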

7. Proofs of Theorems 1 and 2

In our proofs we make use of the following lemmas.

Lemma 12. Let $\beta$ be a real number with either $\beta>0$ or $\beta<-1$. Then, for fixed positive $k$ and all sequences $(\beta_n)_{n\ge1}$ with $\beta_n\to\beta$ as $n\to\infty$, we have

$$\lim_{n\to\infty}\frac{1}{\sqrt n}\,{}_5F_4\!\left[\begin{matrix}-2n,\ \frac12,\ -n+k+1,\ -n+k+1,\ \beta_n n\\ -2n+\frac12,\ -n-k,\ -n-k,\ \beta_n n+1\end{matrix};1\right]=\frac{\sqrt2}{\sqrt\pi}\int_0^1\frac{(1-\alpha)^{4k+2}}{\left(1+\frac{\alpha}{\beta}\right)\sqrt{\alpha(2-\alpha)}}\,d\alpha, \tag{7.1}$$

where, on the left-hand side, we used again the standard notation (5.23) for hypergeometric series.

Proof. We write the ${}_5F_4$-series in (7.1) explicitly as a sum over $l$:

$$\sum_{l=0}^{n-k-1}\frac{\Gamma(2n+1)\,\Gamma(l+\frac12)\,\Gamma(2n-l+\frac12)\,\Gamma(n-k)^2\,\Gamma(n+k-l+1)^2\,\beta_n n}{\Gamma(2n-l+1)\,\Gamma(\frac12)\,\Gamma(l+1)\,\Gamma(2n+\frac12)\,\Gamma(n-k-l)^2\,\Gamma(n+k+1)^2\,(\beta_n n+l)}. \tag{7.2}$$

Let us denote the summand in this sum by $F(n,l)$. We have

$$\begin{aligned}
\frac{\partial}{\partial l}F(n,l)=F(n,l)\Bigl(&\psi(l+\tfrac12)-\psi(l+1)+\psi(2n-l+1)-\psi(2n-l+\tfrac12)\\
&+2\psi(n-k-l)-2\psi(n+k-l+1)-\frac{1}{\beta_n n+l}\Bigr),
\end{aligned}$$

where $\psi(x):=\bigl(\frac{d}{dx}\Gamma(x)\bigr)/\Gamma(x)$ is the digamma function. Because of the functional equation $\psi(x+1)=\psi(x)+\frac1x$ (cf. [2, Eq. (1.2.15) with n = 1]), we have

$$\psi(l+1)-\psi(l+\tfrac12)\ge\psi(2n-l+1)-\psi(2n-l+\tfrac12)$$

for $0\le l\le n$. Moreover, since either $\beta>0$ or $\beta<-1$, for large enough $n$ we have

$$\psi(n+k-l+1)\ge\psi(n-k-l)+\frac{1}{n+k-l}>\psi(n-k-l)-\frac{1}{\beta_n n+l}.$$

Altogether, this implies that $\frac{\partial}{\partial l}F(n,l)<0$ for $0\le l\le n-k-1$, that is, for fixed large enough $n$, the summand $F(n,l)$ is monotone decreasing as a function in $l$. In particular, for $0\le l\le n-k-1$ we have

$$0<F(n,l)\le F(n,0)=1. \tag{7.3}$$

The sum (7.2) may therefore be approximated by an integral:

$$\begin{aligned}
\sum_{l=0}^{n-k-1}F(n,l)&=\sum_{l=0}^{\log n-1}F(n,l)+\sum_{l=\log n}^{n-k-\log n-1}F(n,l)+\sum_{l=n-k-\log n}^{n-k-1}F(n,l)\\
&=O(\log n)+\int_{\log n-1}^{n-k-\log n-1}F(n,l)\,dl,\qquad\text{as } n\to\infty.
\end{aligned} \tag{7.4}$$

The next step is to apply Stirling's approximation

$$\log\Gamma(z)=\left(z-\tfrac12\right)\log(z)-z+\tfrac12\log(2\pi)+O\!\left(\frac1z\right) \tag{7.5}$$

for the gamma function, in the form

$$\begin{aligned}
\log\Gamma(an+bl+c)&=\left(an+bl+c-\tfrac12\right)\left(\log\Bigl(a+b\frac ln\Bigr)+\log(n)+\log\Bigl(1+\frac{c}{an+bl}\Bigr)\right)\\
&\qquad-(an+bl+c)+\tfrac12\log(2\pi)+O\!\left(\frac{1}{an+bl}\right)\\
&=\left(an+bl+c-\tfrac12\right)\left(\log\Bigl(a+b\frac ln\Bigr)+\log(n)\right)-(an+bl)+\tfrac12\log(2\pi)+O\!\left(\frac{1}{an+bl}\right),
\end{aligned}$$

where $a$, $b$, $c$ are real numbers with $a\ge0$. If this is used in the defining expression for $F(n,l)$, then after cancellations we obtain

$$\begin{aligned}
\log F(n,l)&=\tfrac12\log(2)+(4k+2)\log\Bigl(1-\frac ln\Bigr)-\tfrac12\log\Bigl(2-\frac ln\Bigr)-\tfrac12\log\Bigl(\frac ln\Bigr)-\tfrac12\log(n)-\tfrac12\log(\pi)\\
&\qquad-\log\Bigl(1+\frac{l}{\beta_n n}\Bigr)+O\!\left(\frac1l\right)+O\!\left(\frac{1}{n-l}\right)+O\!\left(\frac{1}{2n-l}\right)+O\!\left(\frac1n\right)\\
&=\log\left(\frac{\sqrt2\,\bigl(1-\frac ln\bigr)^{4k+2}}{\sqrt n\,\bigl(1+\frac{l}{\beta_n n}\bigr)\sqrt{\pi\,\frac ln\bigl(2-\frac ln\bigr)}}\right)+O\!\left(\frac{1}{\log n}\right),
\end{aligned}$$

as long as $\log n\le l\le n-k-\log n$. Substitution of this approximation in (7.4) yields

$$\sum_{l=0}^{n-k-1}F(n,l)=\left(\int_{\log n-1}^{n-k-\log n-1}\frac{\sqrt2\,\bigl(1-\frac ln\bigr)^{4k+2}}{\sqrt n\,\bigl(1+\frac{l}{\beta_n n}\bigr)\sqrt{\pi\,\frac ln\bigl(2-\frac ln\bigr)}}\,dl\right)\left(1+O\!\left(\frac{1}{\log n}\right)\right)+O(\log n),$$

or, after the substitution $l=\alpha n$,

$$\sum_{l=0}^{n-k-1}F(n,l)=\sqrt n\left(\int_{(\log n-1)/n}^{(n-k-\log n-1)/n}\frac{\sqrt2\,(1-\alpha)^{4k+2}}{\bigl(1+\frac{1}{\beta_n}\alpha\bigr)\sqrt{\pi\,\alpha(2-\alpha)}}\,d\alpha\right)\left(1+O\!\left(\frac{1}{\log n}\right)\right)+O(\log n).$$

The assertion of the lemma follows now immediately.

We can now get an exact formula for the correlation $\omega_f(k;\xi)$ defined in (2.2).


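Lemma 13. For any $\xi>0$ and $0\le k\in\mathbb Z$, we have

$$\begin{aligned}
\omega_f(k;\xi)&=\frac1\pi\binom{4k+1}{2k}\frac{1}{(1+\xi)^{4k+2}\sqrt{\xi(2+\xi)}}\\
&\qquad\times\left((2+\xi)\int_0^1\frac{(1-\alpha)^{4k+2}}{\bigl(1+\frac\alpha\xi\bigr)\sqrt{\alpha(2-\alpha)}}\,d\alpha-\xi\int_0^1\frac{(1-\alpha)^{4k+2}}{\bigl(1-\frac{\alpha}{2+\xi}\bigr)\sqrt{\alpha(2-\alpha)}}\,d\alpha\right)\\
&=\frac1\pi\binom{4k+1}{2k}\frac{2}{(1+\xi)^{4k+2}\sqrt{\xi(2+\xi)}}\int_0^1\frac{(1-\alpha)^{4k+3}}{\bigl(1+\frac\alpha\xi\bigr)\bigl(1-\frac{\alpha}{2+\xi}\bigr)\sqrt{\alpha(2-\alpha)}}\,d\alpha.
\end{aligned} \tag{7.6}$$

Proof. By Theorem 4 and formula (3.1), the ratio between $M(F_{n,x}\setminus\triangleright_2(k))$ and $M(F_{n,x})$ is, when written in hypergeometric notation,

$$\begin{aligned}
&\binom{4k+1}{2k}\frac{(n+k)!\,\bigl(\frac12\bigr)_{2n}\,(x+1)_{n-k-1}\,(x+n+k+1)_{n-k-1}}{(x+n-k)_{2k+1}\,\bigl(x+\frac12\bigr)_{2n}\,(n-k-1)!^2\,(n+k+1)_{n-k}}\\
&\quad\times\left((x+2n)\,{}_5F_4\!\left[\begin{matrix}-2n,\ \frac12,\ -n+k+1,\ -n+k+1,\ x\\ -2n+\frac12,\ -n-k,\ -n-k,\ x+1\end{matrix};1\right]-x\,{}_5F_4\!\left[\begin{matrix}-2n,\ \frac12,\ -n+k+1,\ -n+k+1,\ -2n-x\\ -2n+\frac12,\ -n-k,\ -n-k,\ -2n-x+1\end{matrix};1\right]\right).
\end{aligned}$$

We now substitute $x=\xi_n n$ in this expression. Use of Lemma 12 (which applies, as $\xi>0$), together with Stirling's formula (7.5), yields the assertion.

Lemma 14. For any $\beta\ne0$ we have

$$\int_0^1\frac{(1-\alpha)^{4k+2}}{\bigl(1+\frac\alpha\beta\bigr)\sqrt{\alpha(2-\alpha)}}\,d\alpha\sim\sqrt{\frac{\pi}{8k}},\qquad k\to\infty. \tag{7.7}$$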

Proof. Let $I_\beta(k)$ be the integral on the left-hand side of (7.7). The asymptotics of $I_\beta(k)$ as k → ∞ can be readily found using Laplace's method as presented for instance in [30]. Conditions (i)–(v) of [30, pp. 121–122] are readily checked. By [30, Theorem 6.1, p. 125], the large z asymptotics of $\int_a^b e^{-zp(t)}q(t)\,dt$ is determined by the quantities λ, μ, $p_0$ and $q_0$ in the series expansions $p(t) - p(a) = p_0(t-a)^{\mu} + p_1(t-a)^{\mu+1} + \cdots$ and $q(t) = q_0(t-a)^{\lambda-1} + q_1(t-a)^{\lambda} + \cdots$. Namely, under the above assumptions one has

$$e^{zp(a)}\int_a^b e^{-zp(t)}q(t)\,dt = \Gamma\left(\frac{\lambda}{\mu}\right)\frac{q_0}{\mu\,p_0^{\lambda/\mu}}\,\frac{1}{z^{\lambda/\mu}} + O\left(\frac{1}{z^{\lambda/\mu+1}}\right). \tag{7.8}$$

In the case of $I_\beta(k)$ we have $p(t) = -\ln(1-t)$, $q(t) = \frac{1}{(1+t/\beta)\sqrt{t(2-t)}}$, a = 0, and b = 1. These yield parameters λ = 1/2, μ = 1, $p_0 = 1$, and $q_0 = 1/\sqrt2$. In addition, p(a) = 0. As in our case z = 4k + 2, under these specializations (7.8) becomes (7.7).
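The asymptotic statement (7.7) can be sanity-checked numerically. The following is a minimal sketch, not part of the paper: the function names `laplace_integral` and `leading_term`, the step count, and the substitution α = s² (used only to remove the integrable endpoint singularity before applying Simpson's rule) are illustrative assumptions.

```python
import math

def laplace_integral(k, beta, steps=20000):
    """Simpson approximation of I_beta(k), the left-hand side of (7.7).

    Assumption of this sketch: substitute alpha = s**2, so that
    d(alpha) / sqrt(alpha * (2 - alpha)) = 2 ds / sqrt(2 - s**2)
    and the integrand becomes smooth on [0, 1].
    `steps` must be even; beta must satisfy beta > 0 or beta < -1.
    """
    def g(s):
        a = s * s
        return 2.0 * (1.0 - a) ** (4 * k + 2) / ((1.0 + a / beta) * math.sqrt(2.0 - a))

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def leading_term(k):
    """Right-hand side of (7.7): sqrt(pi / (8k))."""
    return math.sqrt(math.pi / (8 * k))

if __name__ == "__main__":
    for beta in (1.0, -3.0):
        for k in (10, 100, 1000):
            ratio = laplace_integral(k, beta) / leading_term(k)
            print(f"beta={beta:+.0f}  k={k:5d}  I/leading = {ratio:.5f}")
```

The printed ratios should approach 1 as k grows, for both values of β that enter the proof of Theorem 2.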


M. Ciucu, C. Krattenthaler

Proof of Theorem 1. Combine Lemmas 13 (first expression) and 14 with Stirling's approximation (7.5) for the binomial coefficient $\binom{4k+1}{2k} = \frac{\Gamma(4k+2)}{\Gamma(2k+1)\,\Gamma(2k+2)}$ in (7.6).

Proof of Theorem 2. The case ξ ≠ 1 follows directly from Theorem 1. From now on, let ξ = 1. Set $D_k := 3I_1(k) - I_{-3}(k)$, where $I_\beta(k)$ denotes the integral on the left-hand side of (7.7) with ξ = 1. Using the earlier notation $\omega_f(k) = \omega_f(k;1)$, we have by Lemma 13 that

$$\omega_f(k+1) - \omega_f(k) = \frac{1}{\pi}\binom{4k+1}{2k}\frac{1}{2^{4k+2}\sqrt3}\left\{\left(\frac{(4k+3)(4k+5)}{4(2k+2)(2k+3)} - 1\right)D_{k+1} + (D_{k+1} - D_k)\right\},$$

and thus

$$\frac{\omega_f(k+1) - \omega_f(k)}{\omega_f(k)} = \left(\frac{(4k+3)(4k+5)}{4(2k+2)(2k+3)} - 1\right)\frac{D_{k+1}}{D_k} + \frac{D_{k+1} - D_k}{D_k}. \tag{7.9}$$

By two applications of Lemma 14 it follows that

$$D_k \sim \frac{\sqrt\pi}{\sqrt{2k}},\qquad k\to\infty. \tag{7.10}$$

Thus $D_{k+1}/D_k \to 1$ as k → ∞, and elementary arithmetic implies that the first term on the right-hand side of (7.9) is asymptotically −1/(2k) as k → ∞. To determine the asymptotics of the second term, write by Lemma 13

$$D_{k+1} - D_k = 3\,[I_1(k+1) - I_1(k)] - [I_{-3}(k+1) - I_{-3}(k)]. \tag{7.11}$$

As $I_\beta(k)$ is the integral on the left-hand side of (7.7), we have

$$I_\beta(k+1) - I_\beta(k) = \int_0^1\frac{(1-\alpha)^{4k+2}}{(1+\frac{\alpha}{\beta})\sqrt{\alpha(2-\alpha)}}\left[(1-\alpha)^4 - 1\right]d\alpha. \tag{7.12}$$

The asymptotics of the integral in (7.12) follows by Laplace's method, in the same manner as the proof of Lemma 14. In this case λ = 3/2, μ = 1, and Eqs. (7.8) and (7.12) imply that

$$I_\beta(k+1) - I_\beta(k) \sim -\frac{\sqrt\pi}{4\sqrt2\,k^{3/2}},\qquad k\to\infty. \tag{7.13}$$

Equations (7.11) and (7.13) determine the asymptotics of $D_{k+1} - D_k$, and combining this with the asymptotics of $D_k$ given by (7.10) we obtain that the second term on the right-hand side of (7.9) has asymptotics −1/(2k) as k → ∞. The two terms on the right-hand side of (7.9) thus have a sum that is asymptotically −1/(2k) − 1/(2k) = −1/k, and Theorem 2 is proved.
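The conclusion of this proof, namely that the right-hand side of (7.9) behaves like −1/k, can be illustrated numerically. The sketch below is not from the paper: the names `integral_I`, `D` and `relative_increment`, the step count, and the substitution α = s² inside the quadrature are assumptions made for illustration only.

```python
import math

def integral_I(k, beta, steps=20000):
    """Simpson approximation of I_beta(k) from (7.7), using alpha = s**2.

    The substitution is an assumption of this sketch; it turns the
    integrand into a smooth function of s on [0, 1]. `steps` must be even.
    """
    def g(s):
        a = s * s
        return 2.0 * (1.0 - a) ** (4 * k + 2) / ((1.0 + a / beta) * math.sqrt(2.0 - a))

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def D(k):
    """D_k = 3 I_1(k) - I_{-3}(k), as in the proof of Theorem 2."""
    return 3.0 * integral_I(k, 1.0) - integral_I(k, -3.0)

def relative_increment(k):
    """Right-hand side of (7.9), i.e. (omega_f(k+1) - omega_f(k)) / omega_f(k)."""
    c = (4 * k + 3) * (4 * k + 5) / (4.0 * (2 * k + 2) * (2 * k + 3))
    d_k, d_k1 = D(k), D(k + 1)
    return (c - 1.0) * d_k1 / d_k + (d_k1 - d_k) / d_k

if __name__ == "__main__":
    for k in (25, 100, 400):
        # k times the relative increment should tend to -1.
        print(k, k * relative_increment(k))
```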


8. Lozenge Occupation Probability for a Free Boundary

This section contains the results that are relevant for the calculation of the F-field $F_\xi$ reported in Sect. 2. Since the proofs are very similar to those of Theorem 4 and Lemmas 12–14, here we only give a brief outline of how to derive these results.

By a method completely analogous to the one used in the proof of Theorem 4 given in Sect. 5, one can derive the following theorem. Recall that $F_{n,x}$ is the half hexagon with side-lengths 2n, 2x, 2n and that L(k) is the horizontal lozenge on its symmetry axis at distance $k\sqrt3$ from the free boundary.

Theorem 15. For all positive integers n, x and nonnegative integers k ≤ n − 1, we have

$$M(F_{n,x}\setminus L(k)) = \prod_{s=1}^{n}\frac{(2x+2s)_{4n-4s+1}}{(2s)_{4n-4s+1}} \times \sum_{i=0}^{n-k-1}\frac{(\frac12)_i}{i!\,(2n-i)!\,(2n-i+\frac12)_i}\Big((x)_i\,(x+i+1)_{2n-i} - (x)_{2n-i}\,(x+2n-i+1)_i\Big). \tag{8.1}$$

The sum in (8.1) can be written as a difference of two hypergeometric series, which turn out to be ${}_3F_2$-series. For the asymptotic analysis of these ${}_3F_2$-series, we need the following counterpart of Lemma 12.

Lemma 16. Let β be a real number with either β > 0 or β < −1. Then, for fixed positive k and all sequences $(\beta_n)_{n\ge1}$ with $\beta_n \to \beta$ as n → ∞, we have

$$\lim_{n\to\infty}\frac{1}{\sqrt n}\,{}_3F_2\!\left[\begin{matrix}-2n,\ \frac12,\ \beta_n n\\ -2n+\frac12,\ \beta_n n+1\end{matrix};1\right] = \int_0^1\frac{\sqrt2}{(1+\frac{\alpha}{\beta})\sqrt{\pi\,\alpha(2-\alpha)}}\,d\alpha, \tag{8.2}$$

where, on the left-hand side, we used again the standard notation (5.23) for hypergeometric series.

Using Stirling's formula (7.5) and the above lemma, it is straightforward to determine the asymptotics of (8.1) as n and x tend to infinity so that x/n approaches ξ. We obtain the following counterpart of Lemma 13.

Lemma 17. For any ξ > 0, any sequence $(\xi_n)_{n\ge1}$ with $\lim_{n\to\infty}\xi_n = \xi$ and $\xi_n n \in \mathbb{Z}$, and 0 ≤ k ∈ Z, we have

$$\lim_{n\to\infty}\frac{M(F_{n,\xi_n n}\setminus L(k))}{M(F_{n,\xi_n n})} = \frac{1}{\pi\sqrt{\xi(2+\xi)}}\left((2+\xi)\int_0^1\frac{d\alpha}{(1+\frac{\alpha}{\xi})\sqrt{\alpha(2-\alpha)}} - \xi\int_0^1\frac{d\alpha}{(1-\frac{\alpha}{2+\xi})\sqrt{\alpha(2-\alpha)}}\right)$$

$$= \frac{1}{\pi\sqrt{\xi(2+\xi)}}\int_0^1\frac{2(1-\alpha)}{(1+\frac{\alpha}{\xi})(1-\frac{\alpha}{2+\xi})\sqrt{\alpha(2-\alpha)}}\,d\alpha. \tag{8.3}$$

It is a routine matter to check that

$$\int\frac{d\alpha}{(1+\frac{\alpha}{\xi})\sqrt{\alpha(2-\alpha)}} = 2\sqrt{\frac{\xi}{2+\xi}}\,\arctan\sqrt{\frac{\alpha(2+\xi)}{(2-\alpha)\xi}}. \tag{8.4}$$


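Corollary 18. For any ξ > 0, any sequence $(\xi_n)_{n\ge1}$ with $\lim_{n\to\infty}\xi_n = \xi$ and $\xi_n n \in \mathbb{Z}$, and 0 ≤ k ∈ Z, we have

$$\lim_{n\to\infty}\frac{M(F_{n,\xi_n n}\setminus L(k))}{M(F_{n,\xi_n n})} = \frac{2}{\pi}\arctan\frac{1}{\sqrt{\xi(2+\xi)}}. \tag{8.5}$$

Proof. From Lemma 17 (first expression) and (8.4), we obtain

$$\lim_{n\to\infty}\frac{M(F_{n,\xi_n n}\setminus L(k))}{M(F_{n,\xi_n n})} = \frac{2}{\pi}\left(\arctan\sqrt{\frac{2+\xi}{\xi}} - \arctan\sqrt{\frac{\xi}{2+\xi}}\right).$$

The expression in (8.5) then follows by standard formulas for the arctangent function.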
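The three forms of this limit (the integral in Lemma 17, the difference of arctangents in the proof, and the closed form (8.5)) can be compared numerically. This is only an illustrative sketch: the function names, the Simpson quadrature, and the substitution α = s² are assumptions and do not appear in the paper.

```python
import math

def limit_closed_form(xi):
    """Right-hand side of (8.5)."""
    return (2.0 / math.pi) * math.atan(1.0 / math.sqrt(xi * (2.0 + xi)))

def limit_two_arctans(xi):
    """Difference of arctangents appearing in the proof of Corollary 18."""
    return (2.0 / math.pi) * (math.atan(math.sqrt((2.0 + xi) / xi))
                              - math.atan(math.sqrt(xi / (2.0 + xi))))

def limit_integral(xi, steps=2000):
    """Second expression of Lemma 17, integrated by Simpson's rule.

    Assumption of this sketch: alpha = s**2 removes the endpoint
    singularity, since d(alpha)/sqrt(alpha*(2-alpha)) = 2 ds/sqrt(2-s**2).
    `steps` must be even.
    """
    def g(s):
        a = s * s
        return 4.0 * (1.0 - a) / ((1.0 + a / xi) * (1.0 - a / (2.0 + xi)) * math.sqrt(2.0 - a))

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return (total * h / 3.0) / (math.pi * math.sqrt(xi * (2.0 + xi)))

if __name__ == "__main__":
    for xi in (0.25, 1.0, 4.0):
        print(xi, limit_integral(xi), limit_two_arctans(xi), limit_closed_form(xi))
```

For ξ = 1 the closed form is (2/π)·arctan(1/√3) = 1/3, which gives a convenient spot check.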

Acknowledgements. We are grateful to the referee for a very careful reading of the original manuscript and many helpful suggestions on the presentation of the material.

References 1. Andrews, G.E.: Plane partitions I: The MacMahon conjecture. In: Studies in foundations and combinatorics, G.-C. Rota, ed., Adv. in Math. Suppl. Studies, Vol. 1, New York London: Academic Press, pp. 131–150, 1978 2. Andrews, G.E., Askey, R.A., Roy, R.: Special functions. In: Encyclopedia of Math. And Its Applications 71, Cambridge: Cambridge University Press, 1999 3. Baik, J., Kriecherbauer, T., McLaughlin, K.T.-R., Miller, P.D.: Discrete orthogonal polynomials. In: Asymptotics and applications Ann. Math. Studies, Princeton, NJ: Princeton University Press, 2007 4. Bailey, W.N.: Generalized hypergeometric series. Cambridge: Cambridge University Press, 1935 5. Ciucu, M.: Rotational invariance of quadromer correlations on the hexagonal lattice. Adv. in Math. 191, 46–77 (2005) 6. Ciucu, M.: A random tiling model for two dimensional electrostatics. Mem. Amer. Math. Soc. 178(839), 1–106 (2005) 7. Ciucu, M.: Dimer packings with gaps and electrostatics. Proc. Natl. Acad. Sci. USA 105, 2766–2772 (2008) 8. Ciucu, M.: The scaling limit of the correlation of holes on the triangular lattice with periodic boundary conditions. Mem. Amer. Math. Soc. 199(935), 1–100 (2009) 9. Ciucu, M.: The emergence of the electrostatic field as a Feynman sum in random tilings with holes. Trans. Amer. Math. Soc. 362, 4921–4954 (2010) 10. Ciucu, M., Krattenthaler, C.: The number of centered lozenge tilings of a symmetric hexagon. J. Combin. Theory Ser. A 86, 103–126 (1999) 11. Cohn, H., Larsen, M., Propp, J.: The shape of a typical boxed plane partition. New York J. of Math. 4, 137– 165 (1998) 12. Di Francesco, P., Reshetikhin, N.: Asymptotic shapes with free boundaries. preprint; http://arxiv.org/abs/ 0908.1630v1 [mathph], 2009 13. Feynman, R.P.: The Feynman Lectures on Physics, vol. II, Reading, MA: Addison-Wesley, 1963 14. Fischer, I.: Another refinement of the Bender–Knuth (ex-)conjecture. Eur. J. Combin. 27, 290–321 (2006) 15. 
Fisher, M.E., Stephenson, J.: Statistical mechanics of dimers on a plane lattice. II. Dimer correlations and monomers. Phys. Rev. 132(2 ), 1411–1431 (1963) 16. Gessel, I.M., Viennot, X.: Determinants, paths, and plane partitions. Preprint, 1989, available at: http:// people.brandeis.edu/~gessel/homepage/papers/pp.pdf (1989) 17. Gordon, B.: A proof of the Bender–Knuth conjecture. Pac. J. Math. 108, 99–113 (1983) 18. Graham, R.L. Knuth, D.E., Patashnik, O.: Concrete Mathematics. Reading, MA: Addison-Wesley, 1989 19. Ishikawa, M., Wakayama, M.: Minor summation formula for pfaffians. Linear and Multilinear Algebra 39, 285–305 (1995) 20. Kenyon, R.: Local statistics of lattice dimers. Ann. Inst. H. Poincaré Probab. Statist. 33, 591–618 (1997) 21. Kenyon, R.: The asymptotic determinant of the discrete Laplacian. Acta Math. 185, 239–286 (2000) 22. Kenyon, R., Okounkov, A., Sheffield, S.: Dimers and amoebae. Ann. of Math. 163, 1019–1056 (2006)


23. Krattenthaler, C.: The major counting of nonintersecting lattice paths and generating functions for tableaux. Mem. Amer. Math. Soc. 115(552), (1995) 24. Krattenthaler, C.: An alternative evaluation of the Andrews–Burge determinant. In: Mathematical Essays in Honor of Gian-Carlo Rota, B. E. Sagan, R. P. Stanley eds., Progress in Math., Vol. 161, Boston: Birkhäuser, 1998, pp. 263–270 25. Krattenthaler, C.: Advanced determinant calculus. Séminaire Lotharingien Combin. 42 (“The Andrews Festschrift”) (1999), Article B42q, 67 pp 26. Lindström, B.: On the vector representations of induced matroids. Bull. London Math. Soc. 5, 85–90 (1973) 27. Macdonald, I.G.: Symmetric Functions and Hall Polynomials. Second edition, New York-London: Oxford University Press, 1995 28. MacMahon, P.A.: Combinatory Analysis. Vol. 2, Cambridge: Cambridge University Press, 1916; reprinted. New York: Chelsea, 1960 29. Mehta, M.L., Wang, R.: Calculation of a certain determinant. Commun. Math. Phys. 214, 227–232 (2000) 30. Olver, F.W.J.: Asymptotics and special functions. Reprint of the 1974 original [New York: Academic Press] Wellesley, MA: A K Peters, Ltd., 1997 31. Proctor, R.A.: Bruhat lattices, plane partitions generating functions, and minuscule representations. Europ. J. Combin. 5, 331–350 (1984) 32. Sheffield, S.: Random surfaces. Astérisque, Vol. 304, Paris: Soc. Math. France, 2005 33. Slater, L.J.: Generalized hypergeometric functions. Cambridge: University Press Cambridge, 1966 34. Stembridge, J.R.: Nonintersecting paths, pfaffians and plane partitions. Adv. in Math. 83, 96–131 (1990) 35. Vella, D., Mahadevan, L.: The “Cheerios effect”. Amer. J. Phys. 73, 817–825 (2005) Communicated by H. Spohn

Commun. Math. Phys. 302, 291–344 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1185-6

Communications in

Mathematical Physics

Spectral Simplicity and Asymptotic Separation of Variables

Luc Hillairet1, Chris Judge2

1 Laboratoire de Mathématiques Jean Leray, UMR 6629, Université de Nantes, 2, rue de la Houssinière, 44322 Nantes Cedex 3, France. E-mail: [email protected]

2 Department of Mathematics, Indiana University, Bloomington, IN 47405, USA. E-mail: [email protected]

Received: 18 January 2010 / Accepted: 4 July 2010
Published online: 15 January 2011 – © Springer-Verlag 2011

Abstract: We describe a method for comparing the spectra of two real-analytic families, $(a_t)$ and $(q_t)$, of quadratic forms that both degenerate as a positive parameter t tends to zero. We suppose that the family $(a_t)$ is amenable to ‘separation of variables’ and that each eigenspace of $a_t$ is 1-dimensional for some t. We show that if $(q_t)$ is asymptotic to $(a_t)$ at first order as t → 0, then the eigenspaces of $(q_t)$ are also 1-dimensional for all but countably many t. As an application, we prove that for the generic triangle (simplex) in Euclidean space (constant curvature space form) each eigenspace of the Laplacian acting on Dirichlet functions is 1-dimensional.

1. Introduction

In this paper we continue a study of generic spectral simplicity that began with [HlrJdg09] and [HlrJdg10]. In particular, we develop a method that allows us to prove the following.

Theorem 1.1. For almost every Euclidean triangle $T \subset \mathbb{R}^2$, each eigenspace of the Dirichlet Laplacian associated to T is one-dimensional.

Although we establish the existence of triangles with simple Laplace spectrum, we do not know the exact geometry of a single triangle that has simple spectrum. Up to homothety and isometry, there are only two Euclidean triangles whose Laplace spectrum has been explicitly computed, the equilateral triangle and the right isosceles triangle, and in both of these cases the Laplace spectrum has multiplicities [Lame, Pinsky80, Berard79, Harmer08]. Numerical results indicate that other triangles might have spectra with multiplicities [BryWlk84]. Non-isometric triangles have different spectra [Durso88, Hillairet05].

More generally, we prove that almost every simplex in Euclidean space has simple Laplace spectrum. Our method applies to other settings as well. For example, we have the following.


L. Hillairet, C. Judge

Theorem 1.2. For all but countably many α, each eigenspace of the Dirichlet Laplacian associated to the geodesic triangle $T_\alpha$ in the hyperbolic plane with angles 0, α, and α, is one-dimensional.

If α = π/3, then $T_\alpha$ is isometric to a fundamental domain for the group $SL_2(\mathbb{Z})$ acting on the upper half-plane as linear fractional transformations. P. Cartier [Cartier71] conjectured that $T_{\pi/3}$ has simple spectrum. This conjecture remains open (see [Sarnak03]).

Until now, the only extant methods for proving that a domain has simple Laplace spectrum consisted of either explicit computation of the spectrum, a perturbation of a sufficiently well-understood domain, or a perturbation within an infinite dimensional space of domains. As an example of the first approach, using separation of variables one can compute the Laplace spectrum of each rectangle exactly and find that this spectrum is simple iff the ratio of the squares of the sidelengths is not a rational number. In [HlrJdg09] we used this fact and an analytic perturbation to show that almost every polygon with at least four sides has simple spectrum. The method for proving spectral simplicity by making perturbations in an infinite dimensional space originates with J. Albert [Albert78] and K. Uhlenbeck [Uhlenbeck72]. In particular, it is shown in [Uhlenbeck72] that the generic compact domain with smooth boundary has simple spectrum.

In the case of Euclidean triangles, the last method does not apply since the space of triangles is finite dimensional. We also do not know how to compute the Laplace spectrum of a triangle other than the right isosceles and equilateral ones. One does know the eigenfunctions of these two triangles sufficiently well to apply the perturbation method, but unfortunately the eigenvalues do not split at first order and it is not clear to us what happens at second order.

As a first step towards describing our approach, we consider the following example. Let $T_t$ be the family of Euclidean right triangles with vertices (0,0), (1,0), and (1, t) and let $q_t$ denote the associated Dirichlet energy form

$$q_t(u) = \int_{T_t}\|\nabla u\|^2\,dx\,dy.$$

For each $u, v \in C_0^\infty(T_t)$, we have $q_t(u,v) = \langle\Delta_t u, v\rangle$, where $\Delta_t$ is the Laplacian, and hence the spectrum of $\Delta_t$ equals the spectrum of $q_t$ on the domain $H_0^1(T_t)$ with respect to the $L^2$-inner product on $T_t$. As t tends to zero, the triangle $T_t$ degenerates to the segment that joins (0,0) and (1,0). The spectrum of an interval is simple and hence one can hope to use this to show that $T_t$ has simple spectrum for some small t > 0 (Fig. 1). Indeed, the spectral study of domains that degenerate to a one-dimensional object is quite well developed. In particular, the asymptotic behaviour of the spectrum of ordered

Fig. 1. The triangle Tt and the sector St .

Spectral Simplicity


eigenvalues involves a limiting one-dimensional Schrödinger operator (see, for example, [ExnPst05, FrdSlm09] and [Grieser]). Using these kinds of results it can be proved that for each n ∈ N, there exists $t_n > 0$ so that the first n eigenvalues of $T_{t_n}$ are simple (as in [LuRowl]). Unfortunately, this does not imply the existence of a triangle all of whose eigenvalues are simple.

This subtle point is perhaps best illustrated by a different example whose spectrum can be explicitly calculated: Let $C_t$ be the cylinder $[0,1] \times \mathbb{R}/t\mathbb{Z}$. The spectrum of the Dirichlet Laplacian on $C_t$ is

$$\left\{\pi^2\cdot\left(k^2 + 4\cdot\ell^2/t^2\right)\ \middle|\ (k,\ell)\in\mathbb{N}\times(\mathbb{N}\cup\{0\})\right\}.$$

Moreover, for each t > 0 and $(k,\ell)\in\mathbb{N}\times\mathbb{N}$, each eigenspace is 2-dimensional. On the other hand, the first n eigenvalues of the cylinder $C_t$ are simple iff $t < 2(n^2-1)^{-1/2}$.

The example indicates that the degeneration approach to proving spectral simplicity does not work at the ‘zeroth order’ approximation. The method that we describe here is at the next order. In the case of the degenerating triangles $T_t$, there is a second quadratic form $a_t$ to which $q_t$ is asymptotic in the sense that $a_t - q_t$ is controlled by $t\cdot a_t$. Geometrically, the quadratic form $a_t$ corresponds to the Dirichlet energy form on the sector, $S_t$, of the unit disc with angle arctan(t) and it is quite a standard idea to analyse the spectra of thin right triangles using thin sectors (see for example [BryWlk84]).

The spectrum of the sectorial form $a_t$ can be analyzed using polar coordinates and separation of variables. In particular, we obtain the Dirichlet quadratic form b associated to the interval of angles [0, arctan(t)], and, associated to each eigenvalue $(\ell\cdot\pi/\arctan(t))^2$ of b, we have a quadratic form $a_t^\ell$ on the radial interval [0, 1]. Each eigenfunction of $a_t^\ell$ is of the form $r \mapsto J_\nu(\sqrt\lambda\cdot r)$, where $J_\nu$ is a Bessel function of order $\nu = \ell\pi/\arctan(t)$ and where the eigenvalue, λ, is determined by the condition that this function vanish at r = 1.
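The cylinder example above can be checked by direct enumeration. The sketch below is illustrative only: the function names and the tolerance are assumptions, and eigenvalues are listed after division by π², with multiplicity 1 for ℓ = 0 and 2 for ℓ ≥ 1 as described in the text.

```python
import math

def cylinder_levels(t, n):
    """Sorted values k^2 + 4*l^2/t^2 (eigenvalues divided by pi^2), with multiplicity.

    Only k <= n and l <= n*t/2 + 1 are enumerated; this covers every
    eigenvalue that does not exceed n^2, which is all that is needed
    to identify the first n eigenvalues.
    """
    levels = []
    ell_max = int(n * t / 2.0) + 1
    for k in range(1, n + 1):
        for ell in range(ell_max + 1):
            value = k * k + 4.0 * ell * ell / (t * t)
            multiplicity = 1 if ell == 0 else 2
            levels.extend([value] * multiplicity)
    levels.sort()
    return levels

def first_n_simple(t, n, tol=1e-9):
    """True when the first n eigenvalues of C_t are pairwise distinct
    and distinct from the (n+1)-st entry of the list."""
    levels = cylinder_levels(t, n)
    return all(levels[i + 1] - levels[i] > tol for i in range(n))

def threshold(n):
    """The bound 2 * (n^2 - 1)^(-1/2) stated in the text (n >= 2)."""
    return 2.0 / math.sqrt(n * n - 1)

if __name__ == "__main__":
    for n in (2, 5, 10):
        t_star = threshold(n)
        print(n, first_n_simple(0.99 * t_star, n), first_n_simple(1.01 * t_star, n))
```

For every fixed t > 0 the test eventually fails as n grows, which is the point of the example: simplicity of the first n eigenvalues for small t does not produce a single cylinder with simple spectrum.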
The spectrum of $a_t$ is the union of the spectra of $a_t^\ell$ over ℓ ∈ N. Figure 2 presents the main qualitative features of the spectrum of $a_t$ after renormalization by multiplying by $t^2$. For each ℓ ∈ N, the (renormalized) real-analytic eigenvalue branches of a coming from $a_t^\ell$ converge to the threshold $(\ell\cdot\pi)^2$. The eigenvalues of $a_t^\ell$ are simple for all t, and for all but countably many t, the spectrum of $a_t$ is simple.

From the asymptotics of the zeroes of the Bessel function, one can show that the distance between any two (renormalized) real-analytic eigenbranches of $a_t^\ell$ is of order at least $t^{2/3}$. This ‘super-separation’ of eigenvalues is central to our method. Indeed, simplicity would then follow if one were to prove that each real-analytic eigenvalue branch of $q_t$ lies in an O(t) neighborhood of a real-analytic eigenvalue branch of $a_t$ and that at most one eigenfunction branch of $q_t$ has its eigenvalue branch lying in this neighborhood. In fact, as sets, the distance between the spectrum of $a_t$ and the spectrum of $q_t$ is O(t), and, with some work, one can prove that each (renormalized) real-analytic eigenvalue branch of $q_t$ converges to a threshold in $\{(\ell\cdot\pi)^2 \mid \ell\in\mathbb{N}\}$ (Theorem 13.1). Nonetheless, infinitely many real-analytic eigenbranches of $a_t$ converge to each threshold and the crossing pattern of these branches and the branches of $q_t$ can be quite complicated.

Fig. 2. The spectrum of the family $a_t$

Semiclassical analysis predicts that the eigenvalues of $a_t^\ell$ become separated at order t away from the threshold $(\ell\cdot\pi)^2$ (see Remark 10.5). On the other hand, two real-analytic eigenbranches that converge to the same threshold stay separated at order $t^{2/3}$. In order to use the super-separation of eigenvalues, we will need to show that each eigenvector branch of $q_t$ whose eigenvalue branch converges to a particular threshold does not interact with eigenvector branches of $a_t$ that converge to another threshold (see Lemmas 12.3 and 12.4). In this sense, we will asymptotically separate variables.

One somewhat novel feature of this work is the melding of techniques from semiclassical analysis and techniques from analytic perturbation theory. We apply quasimode and concentration estimates to make comparative estimates of the eigenvalues and eigenfunctions of $a_t$ and $q_t$. We then feed these estimates into the variational formulae of analytic perturbation theory in order to track the real-analytic branches.

So far, our description of the method has been limited to the special case of degenerating right triangles. In §15 we make a change of variables that places the problem for right triangles into the following more general context. We suppose that there exists a positive abstract quadratic form b with simple discrete spectrum, and define

$$a_t(u\otimes\varphi) = t^2\cdot(\varphi,\varphi)\int_0^\infty|u'(x)|^2\,dx + b(\varphi)\int_0^\infty|u(x)|^2\,dx. \tag{1}$$

We consider this family of quadratic forms relative to the weighted $L^2$-inner product defined by

$$\langle u\otimes\varphi,\ v\otimes\psi\rangle_\sigma = (\varphi,\psi)\int_0^\infty u\cdot v\ \sigma\,dx,$$

where σ is a smooth positive function with σ′ < 0 and $\lim_{x\to\infty}\sigma(x) = 0$. See §11. The spectrum of $a_t$ decomposes into the joint spectra of

$$a_t^\mu(u) = t^2\int_0^\infty|u'(x)|^2\,dx + \mu\int_0^\infty|u(x)|^2\,dx, \tag{2}$$

where μ is an eigenvalue of b (and hence is positive). Because σ is a decreasing function, an eigenfunction of $a_t^\mu$ with eigenvalue E oscillates for $x \ll x_E = (E - \mu\cdot\sigma)^{-1}(0)$ and decays rapidly for $x \gg x_E$. Since σ′ < 0, one can approximate the eigenfunction (or a quasimode at energy E) with Airy functions in a neighborhood of $x_E$. A good deal of the present work is based on this approximation by Airy functions. For example, the asymptotics of the zeroes of Airy functions underlies the super-separation of eigenvalues.

The following is the general result.

Theorem 1.3 (Theorem 14.1). If $q_t$ is a real-analytic family of positive quadratic forms that is asymptotic to $a_t$ at first order (see Definition 3.1), then for all but countably many t, the spectrum of $q_t$ is simple.

Using an induction argument that begins with the triangle, we obtain the following:

Corollary 1.4. For almost every simplex in Euclidean space, each eigenspace of the associated Dirichlet Laplacian is one-dimensional.

Dirichlet boundary conditions can be replaced by any boundary condition that corresponds to a positive quadratic form b. In particular, one can choose any mixed Dirichlet-Neumann condition on the faces of the simplex except for all Neumann.

Using the ‘pulling a vertex’ technique of [HlrJdg09], we can extend generic simplicity to certain classes of polyhedra. For example, a d-dimensional polytope P is called k-stacked if P can be triangulated by introducing only faces of dimension d − 1 [Grünbaum].

Corollary 1.5. Almost every (d − 1)-stacked convex polytope $P \subset \mathbb{R}^d$ with n vertices has simple Dirichlet spectrum.

Finally, we note that by perturbing the curvature of Euclidean space as in §4 of [HlrJdg09] we obtain the following:

Corollary 1.6. Almost every simplex in a constant curvature space form has simple Dirichlet Laplace spectrum.

Organization of the paper. In §2 we use standard resolvent estimates to quantify the assertion that if two quadratic forms are close, then their spectra are close. In particular, we consider the projection, $P_a^I(u)$, of an eigenfunction u of q with eigenvalue E onto the eigenspaces of a whose eigenvalues lie in an interval I ∋ E. We show that this projection is essentially a quasimode at energy E for a.

In §3 we specialize these estimates to the case of two real-analytic families of quadratic forms $a_t$ and $q_t$. We define what it means for $q_t$ to be asymptotic to $a_t$ at first order. We show that if the first order variation, $\dot a_t$, of $a_t$ is nonnegative, then each real-analytic eigenbranch of $a_t$ converges as t tends to zero, and if $q_t$ is asymptotic to $a_t$ at first order, then the eigenbranches of $q_t$ also converge.

In §4, we use the variational formula along a real-analytic eigenfunction branch $u_t$ to derive an estimate on the projection $P_{a_t}^I(u_t)$. This results in the assertion that the function

$$t \mapsto \frac{\dot a_t\big(P_{a_t}^I(u_t)\big)}{\big\|P_{a_t}^I(u_t)\big\|^2}$$

is integrable (Theorem 4.2). The integrability will be used several times in the sequel to control the projection $P_{a_t}^I(u_t)$, and in particular, it will be used to prove that the eigenspaces essentially become one-dimensional in the limit. This result depends on both analytic perturbation theory and resolvent estimates.

Sections 5 through 10 are devoted to the study of the one dimensional quadratic forms $a_t^\mu$ in (2). Most of the material in these sections is based on asymptotics of solutions to second order ordinary differential equations (see, for example, [Olver]). In §6 we provide uniform estimates on the $L^2$-norm of quasimodes and on the exponential decay of eigenfunctions for large x. In §7 we make a well-known change of variables to transform the second order ordinary differential equation associated to $a_t^\mu$ into the inhomogeneous Airy equation. In §8 we use elementary estimates of the Airy kernel to estimate both quasimodes and eigenfunctions near the turning point $x_E$.

In §9 we use the preceding estimates to prove Proposition 9.1 which essentially says that the $L^2$-mass of both eigenfunctions and quasimodes of $a_t^\mu$ does not concentrate at $x_E$ as t tends to zero. This proposition is an essential ingredient in proving the projection estimates of §12. But first we use it in §10 to prove that each real-analytic eigenvalue branch of $a_t^\mu$ converges to a threshold μ/σ(0).

In §10 we also establish the ‘super-separation’ of eigenvalue branches for $a_t^\mu$. In the case of degenerating right triangles, we may use the uniform asymptotics of the Bessel function (see [Olver]) to obtain the ‘super-separation’ near the threshold. We prove it directly in Proposition 10.4 for general σ.

In §11 we establish some basic properties of the quadratic form defined in (1). In §12 we combine results of §2, §4, and §9 to derive estimates on $P_{a_t}^I(u_t)$, where $u_t$ is a real-analytic eigenfunction branch of $q_t$ with eigenvalue branch $E_t$ converging to a point $E_0$ belonging to the interior of an interval I.

In §13 we show that each eigenvalue branch of $q_t$ converges to some threshold μ/σ(0) (Theorem 13.1). This leads to the following natural question: which thresholds μ/σ(0) are limits of some real-analytic eigenvalue branch of $q_t$? Strangely enough, we do not answer it here.

In §14 we prove the generic simplicity of $q_t$. In §15, we show how simplices and other domains in Euclidean space fit into the general framework presented here. Finally, in §16 we prove a generalization of Theorem 1.2.

2. Quasimode Estimates for Quadratic Forms

Let H be a real Hilbert space with inner product $\langle\cdot,\cdot\rangle$. Let a be a real-valued, densely defined, closed quadratic form on H. Let dom(a) ⊂ H denote the domain of a. In the sequel, we will assume that the spectrum spec(a) of a with respect to $\langle\cdot,\cdot\rangle$ is discrete. Moreover, we will assume that for each λ ∈ spec(a), the associated eigenspace $V_\lambda$ is finite dimensional, and we will assume that there exists an orthonormal collection, $\{\psi_\ell\}_{\ell\in\mathbb{N}}$, of eigenfunctions such that the span of $\{\psi_\ell\}$ is dense in H. The following estimate is standard:

Lemma 2.1 (Resolvent estimate). Suppose that the distance, δ, from E to the spectrum of a is positive. If

$$|a(w,v) - E\cdot\langle w,v\rangle| \le \varepsilon\cdot\|v\|$$

then

$$\|w\| \le \frac{\varepsilon}{\delta}.$$
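In a finite-dimensional model the resolvent estimate can be verified by hand. The sketch below is an illustration under stated assumptions, not the paper's argument: the form a is taken to be diagonal in an orthonormal eigenbasis, the hypothesis of Lemma 2.1 is read as holding for every v, and the name `resolvent_quantities` is hypothetical. In that setting the smallest admissible ε is the norm of (A − E)w.

```python
import math
import random

def resolvent_quantities(eigenvalues, coefficients, energy):
    """Return (||w||, epsilon, delta) for w = sum_l c_l * psi_l.

    Assumptions of this sketch: a is diagonal with the given eigenvalues
    in the orthonormal basis psi_l, so a(w, v) - E<w, v> = <(A - E) w, v>
    and the sharpest epsilon in the hypothesis is ||(A - E) w||.
    """
    epsilon = math.sqrt(sum(((lam - energy) * c) ** 2
                            for lam, c in zip(eigenvalues, coefficients)))
    delta = min(abs(lam - energy) for lam in eigenvalues)
    norm_w = math.sqrt(sum(c * c for c in coefficients))
    return norm_w, epsilon, delta

if __name__ == "__main__":
    random.seed(0)
    eigenvalues = [1.0, 4.0, 9.0, 16.0, 25.0]
    energy = 5.5  # not an eigenvalue, so delta > 0
    for _ in range(5):
        coefficients = [random.uniform(-1.0, 1.0) for _ in eigenvalues]
        norm_w, epsilon, delta = resolvent_quantities(eigenvalues, coefficients, energy)
        print(f"||w|| = {norm_w:.4f}  <=  epsilon/delta = {epsilon / delta:.4f}")
```

Equality occurs when w is a multiple of an eigenvector whose eigenvalue is closest to E, so the bound cannot be improved in general.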


Given a closed interval I ⊂ [0, ∞), define $P_a^I$ to be the orthogonal projection onto $\oplus_{\lambda\in I}V_\lambda$.

Definition 2.2. Let q be a real-valued, closed quadratic form defined on dom(a). We will say that q is ε-close to a if and only if for each v, w ∈ dom(a), we have

$$|q(v,w) - a(v,w)| \le \varepsilon\cdot a(v)^{\frac12}\cdot a(w)^{\frac12}. \tag{3}$$

For each quadratic form q defined on dom(a), define

$$n_q(u) = \left(\|u\|^2 + q(u)\right)^{\frac12}.$$

If q is ε-close to a, then the norms $n_q$ and $n_a$ are equivalent on dom(a). Thus, the form domains of q and a with respect to $\|\cdot\|$ coincide. We will denote this common form domain by D.

Lemma 2.3. Let q and a be quadratic forms such that q is ε-close to a. If u is an eigenfunction of q with eigenvalue E contained in the open interval I ⊂ R, then

$$a\left(u - P_a^I(u)\right) \le \varepsilon^2\cdot a(u)\cdot\left(1 + \frac{E}{\delta}\right)^2, \tag{4}$$

where δ is the distance from E to the complement R\I.

Proof. Let v ∈ D. Since $q(u,v) = E\cdot\langle u,v\rangle$, from (3) we have

$$|E\cdot\langle u,v\rangle - a(u,v)| \le \varepsilon\cdot a(u)^{\frac12}\cdot a(v)^{\frac12}. \tag{5}$$

There exists a linear functional f such that for all v ∈ D we have

$$E\cdot\langle u,v\rangle - a(u,v) = \langle f,v\rangle. \tag{6}$$

Write $f = \sum_\ell f_\ell\cdot\psi_\ell$ and define $v_{\rm test} = \sum_\ell\lambda_\ell^{-1}f_\ell\cdot\psi_\ell$. Observe that

$$E\cdot\langle u,v_{\rm test}\rangle - a(u,v_{\rm test}) = \langle f,v_{\rm test}\rangle = \sum_\ell\frac{|f_\ell|^2}{\lambda_\ell} = a(v_{\rm test}).$$

By substituting $v = v_{\rm test}$ into (5), we find that

$$\sum_\ell\frac{|f_\ell|^2}{\lambda_\ell} \le \varepsilon^2\cdot a(u). \tag{7}$$

Let $u = \sum_\ell u_\ell\cdot\psi_\ell$. From (6) we find that $u_\ell = (E-\lambda_\ell)^{-1}\cdot f_\ell$ for each ℓ ∈ N. Therefore,

$$a\left(u - P_a^I(u)\right) = \sum_{\lambda_\ell\notin I}\lambda_\ell\cdot\frac{|f_\ell|^2}{|E-\lambda_\ell|^2} \le \varepsilon^2\cdot a(u)\cdot\sup_{\lambda_\ell\notin I}\frac{\lambda_\ell^2}{|E-\lambda_\ell|^2},$$

where the inequality follows from (7). We have

$$\sup_{\lambda_\ell\notin I}\frac{\lambda_\ell^2}{|E-\lambda_\ell|^2} \le \sup_{|1-x|>\delta/E}\frac{x^2}{|1-x|^2} = \left(\frac{E}{\delta}+1\right)^2.$$

The desired bound follows.


The preceding lemma provides control of the norm of $P_a^I(u)$. In particular, we have the following:

Corollary 2.4. Let q and a be quadratic forms such that q is ε-close to a. If u is an eigenfunction of q with eigenvalue E contained in the open interval I ⊂ R, then

$$\left\|P_a^I(u)\right\|^2 \ge \left(1 - \varepsilon^2\cdot\left(1+\frac{E}{\delta}\right)^2\right)\cdot\frac{a(u)}{\sup(I)}, \tag{8}$$

where δ is the distance from E to the complement R\I.

Proof. Since $a(u - P_a^I(u), P_a^I(u)) = 0$, we have $a(P_a^I(u)) = a(u) - a\left(u - P_a^I(u)\right)$. Thus, it follows from Lemma 2.3 that

$$a\left(P_a^I(u)\right) \ge \left(1 - \varepsilon^2\left(1+\frac{E}{\delta}\right)^2\right)\cdot a(u).$$

Since, on the other hand, $a(P_a^I(u)) \le \sup(I)\cdot\|P_a^I(u)\|^2$, the claim follows.

We use the preceding to prove the following.

Lemma 2.5. Let I be an interval, let E ∈ I, and let δ denote the distance from E to the complement R\I. Let u be an eigenfunction of q with eigenvalue E. If $\varepsilon < (1+E/\delta)^{-1}$ and q is ε-close to a, then for each v ∈ D, we have

$$\left|a\left(P_a^I(u),v\right) - E\cdot\left\langle P_a^I(u),v\right\rangle\right| \le \frac{\varepsilon\cdot\sup(I)}{\left(1-\varepsilon^2\left(1+\frac{E}{\delta}\right)^2\right)^{\frac12}}\cdot\left\|P_a^I(u)\right\|\cdot\|v\|. \tag{9}$$

Proof. Let $\tilde u = P_a^I(u)$, and $\tilde v = P_a^I(v)$. Since $P_a^I$ is an orthogonal projection that commutes with a, we have $a(\tilde u,v) = a(\tilde u,\tilde v) = a(u,\tilde v)$ and $\langle\tilde u,v\rangle = \langle\tilde u,\tilde v\rangle = \langle u,\tilde v\rangle$. Therefore, by replacing v with $\tilde v$ in (5) we obtain

$$|a(\tilde u,v) - E\cdot\langle\tilde u,v\rangle| \le \varepsilon\cdot a(u)^{\frac12}\cdot a(\tilde v)^{\frac12}.$$

Since $\tilde v\in P_a^I(H)$, we have $a(\tilde v) \le \sup(I)\cdot\|\tilde v\|^2 \le \sup(I)\cdot\|v\|^2$. By the hypothesis and Corollary 2.4, we have

$$a(u) \le \left(1-\varepsilon^2\left(1+\frac{E}{\delta}\right)^2\right)^{-1}\cdot\sup(I)\cdot\left\|P_a^I(u)\right\|^2.$$

By combining these estimates, we obtain the claim.


Let $\{a_n\}_{n\in\mathbb{N}}$ and $\{q_n\}_{n\in\mathbb{N}}$ be sequences of quadratic forms defined on D. For each n, let $E_n$ be an eigenvalue of $q_n$.

Proposition 2.6. Suppose that $\lim_{n\to\infty}E_n$ exists and is finite. If the quadratic form $q_n$ is 1/n-close to $a_n$ for each n, then there exist N > 0 and C > 0 such that for each n > N and each eigenfunction u of $q_n$ with eigenvalue $E_n$, we have

$$\left|a_n\left(P_{a_n}^I(u),v\right) - E_n\cdot\left\langle P_{a_n}^I(u),v\right\rangle\right| \le C\cdot\frac1n\cdot\left\|P_{a_n}^I(u)\right\|\cdot\|v\|. \tag{10}$$

Proof. Let $E_0 = \lim_{n\to\infty}E_n$ and let I be an open interval that contains $E_0$. Let $\delta_n$ be the distance from $E_n$ to R\I. Since $E_n$ converges to $E_0$ and I is open, there exists $\delta_0 > 0$ and $N_0$ so that if $n > N_0$, then $\delta_n > \delta_0$. Choose $N \ge \max\{N_0, 1+2E_0/\delta_0\}$ so that if n > N, then $E_n < 2E_0$. Then for each n > N we have $n^{-1}(1+E_n/\delta_n) \le 1$, and we can apply Lemma 2.5 to obtain the claim.

3. Asymptotic Families and Eigenvalue Convergence

Given a mapping of the form $t \mapsto f_t$, we will use $\dot f_t$ to denote its first derivative. More precisely, we define

$$\dot f_t := \left.\frac{d}{ds}\right|_{s=t}f_s.$$

Let $a_t$ and $q_t$ be real-analytic families of closed quadratic forms densely defined on D ⊂ H for t > 0.¹ In this section, we show that the nonnegativity of both $a_t$ and $\dot a_t$ implies that each real-analytic eigenvalue branch of $a_t$ converges as t tends to zero. We then show that if q is asymptotic to a in the following sense then the eigenvalue branches of $q_t$ also converge (Proposition 3.4).

Definition 3.1. We will say that $q_t$ is asymptotic to $a_t$ at first order iff there exists C > 0 such that for each t > 0 and u, v ∈ D,

$$|q_t(u,v) - a_t(u,v)| \le C\cdot t\cdot a_t(u)^{\frac12}\cdot a_t(v)^{\frac12}, \tag{11}$$

and

$$|\dot q_t(v) - \dot a_t(v)| \le C\cdot a_t(v). \tag{12}$$

Remark 3.2. By reparameterizing the family (replacing t by, say, t/C) one may assume, without loss of generality, that C = 1. We will do so in what follows.

In what follows, we will assume that the eigenvalues and eigenfunctions of $a_t$ and $q_t$ vary real-analytically. To be precise, we will suppose that for each t > 0, there exists an orthonormal collection $\{\psi_\ell(t)\}_{\ell\in\mathbb{N}}$ of eigenvectors whose span is dense in H such that $t \mapsto \psi_\ell(t)$ is real-analytic for each ℓ ∈ N. This assumption is satisfied if the operators that represent $a_t$ and $q_t$ with respect to $\langle\cdot,\cdot\rangle$ have a compact resolvent for each t > 0. See, for example, Remark 4.22 in §VII.4 of [Kato]. The following proposition is well-known:

¹ For notational simplicity, we will often drop the index t, but note that each object related to a or q will, in general, depend on t.

300

L. Hillairet, C. Judge

Proposition 3.3. If $a_t \ge 0$ and $\dot a_t \ge 0$ for all small $t$, then each real-analytic eigenvalue branch of $a_t$ converges to a finite limit as $t$ tends to zero.

Proof. Let $\lambda_t$ be a real-analytic eigenvalue branch of $a_t$. By standard perturbation theory (see [Kato])
\[ \dot\lambda_t \cdot \|u_t\|^2 = \dot a(u_t). \tag{13} \]
Thus, since $\dot a_t \ge 0$, the function $t \mapsto \lambda_t$ is increasing in $t$. Since $\lambda_t$ is bounded below, the limit $\lim_{t\to 0}\lambda_t$ exists.

If $q_t$ is asymptotic to $a_t$, then the eigenvalues of $q_t$ also converge.

Proposition 3.4. Suppose that for each $t > 0$, the quadratic forms $a_t$ and $\dot a_t$ are nonnegative. If $q_t$ is asymptotic to $a_t$ at first order, then each real-analytic eigenvalue branch of $q_t$ converges to a finite limit.

Proof. Let $(E_t, u_t)$ be a real-analytic eigenbranch of $q_t$ with respect to $\langle\cdot,\cdot\rangle$. We have
\[ \dot q_t(u_t) = \dot E_t \cdot \|u_t\|^2. \tag{14} \]
Using (11), we have
\[ q_t(v) \ge \frac12 \cdot a_t(v) \tag{15} \]
for all $t$ sufficiently small. Since $a_t \ge 0$, we have $q_t \ge 0$ and hence $E_t \ge 0$ for small $t$. From (12) and Remark 3.2 we have $\dot q_t(u_t) \ge \dot a_t(u_t) - a_t(u_t)$ and hence, since $\dot a_t \ge 0$, we have $\dot q_t(u_t) \ge -a_t(u_t)$. By combining this fact with (14) and (15), we find that
\[ \dot E_t + 2 \cdot E_t \ge 0 \tag{16} \]
for sufficiently small $t$. To finish the proof, define the function $f$ by $f(t) = E_t \cdot \exp(2t)$. By (16) we have $f'(t) \ge 0$ for $t < t_0$ and, since $q_t$ is non-negative, $f$ is bounded from below. Therefore $\lim_{t\to 0} f(t)$ exists and is finite, and so does $\lim_{t\to 0} E_t$.

4. An Integrability Condition

Let $q_t$ be a real-analytic family that is asymptotic to $a_t$ at first order. In this section, we use the estimates of §2 to derive an integrability condition (Theorem 4.2) that will be used in §14 to prove that the spectrum of $q_t$ is simple for most $t$ under certain additional conditions.

Let $E_t$ be a real-analytic eigenvalue branch of $q_t$ that converges to $E_0$ as $t$ tends to zero. Let $V_t$ be the associated real-analytic family of eigenspaces. Let $I$ be a compact interval whose interior contains $E_0$.

Remark 4.1. The definition of $V_t$ implies that, for each $t > 0$, the vector space $V_t$ is a subspace of $\ker(A_t - E_t \cdot I)$. If a distinct real-analytic eigenvalue branch crosses the branch $E_t$ at $t = t_0$, then $V_{t_0}$ is a proper subspace of $\ker(A_{t_0} - E_{t_0} \cdot I)$.

Spectral Simplicity

301

Theorem 4.2. Let $q_t$ be asymptotic to $a_t$ at first order, and suppose that for each $t > 0$, we have
\[ 0 \le \dot a_t(v) \le t^{-1} \cdot a_t(v). \tag{17} \]
If $t \mapsto u_t \in V_t$ is continuous on the complement of a countable set, then the function
\[ t \mapsto \frac{\dot a_t\big(P^I_{a_t}(u_t)\big)}{\big\|P^I_{a_t}(u_t)\big\|^2} \tag{18} \]
is integrable on each interval of the form $(0, t^*]$.

Proof. Let $\chi_t = P^I_{a_t}(u_t)$. Since the spectrum of $a_t$ is discrete and $E_t$ is real-analytic, the operator family $t \mapsto P^I_{a_t}$ is real-analytic on the complement of a countable set. By combining this with the hypothesis, we find that the function $\dot a\big(P^I_{a_t}(u_t)\big)/\|P^I_{a_t}(u_t)\|^2$ is locally integrable on $(0, t^*]$ for each $t^* > 0$. By Lemma 4.3 below, there exists a constant $C > 0$ such that
\[ \dot E_t \ge C \cdot \frac{\dot a_t(\chi_t)}{\|\chi_t\|^2} - C. \]
Integration then gives
\[ E_{t^*} - E_t \ge C \int_t^{t^*} \frac{\dot a_s(\chi_s)}{\|\chi_s\|^2}\, ds - C(t^* - t). \]
Since $E_t \ge 0$ and the integrand is nonnegative, the integral on the right-hand side converges as $t$ tends to zero.

Lemma 4.3. Suppose that for each $t > 0$, we have
\[ 0 \le \dot a_t(v) \le t^{-1} \cdot a_t(v). \tag{19} \]
If $q_t$ is asymptotic to $a_t$ at first order, then there exist $t' > 0$ and a constant $C > 0$ such that for each $t \le t'$ and each eigenvector $u \in V_t$ we have
\[ \left| \dot E_t \cdot \|u\|^2 - \dot a_t\big(P^I_{a_t}(u)\big) \right| \le C \cdot \|u\|^2 \tag{20} \]
and
\[ \big\|P^I_a(u)\big\| \ge \frac{1}{C} \cdot \|u\|. \tag{21} \]

Proof. Since $V_t$ is the real-analytic family of eigenspaces associated to $E_t$, for each $t > 0$ and $u \in V_t$ we have $\dot q(u) = \dot E \cdot \|u\|^2$ (see Remark 4.1). Since $E_t$ converges to $E_0$, we find using (11) that there exists $t_0$ so that for $t < t_0$,
\[ a_t(u) \le 2 q_t(u) = 2 E_t \cdot \|u\|^2 \le 2(E_0 + 1) \cdot \|u\|^2. \tag{22} \]
Thus, from (12) we find that
\[ \left| \dot E \cdot \|u\|^2 - \dot a_t(u) \right| \le 2(E_0 + 1) \cdot \|u\|^2 \tag{23} \]
for $t < t_0$.


Let $\chi_t = P^I_{a_t}(u)$. Since $\dot a_t$ is a nonnegative quadratic form, we have
\[ \dot a_t(u) \le \dot a_t(\chi_t) + 2\,\dot a_t(\chi_t)^{\frac12} \cdot \dot a_t(u - \chi_t)^{\frac12} + \dot a_t(u - \chi_t) \]
and
\[ \dot a_t(\chi_t) \le \dot a_t(u) + 2\,\dot a_t(u)^{\frac12} \cdot \dot a_t(\chi_t - u)^{\frac12} + \dot a_t(\chi_t - u). \]
The former estimate provides a bound on $\dot a_t(u) - \dot a_t(\chi_t)$ and the latter one gives a bound on its negation. In particular, we find that
\[ |\dot a_t(u) - \dot a_t(\chi_t)| \le 2 \cdot \max\left\{ \dot a_t(u)^{\frac12}, \dot a_t(\chi_t)^{\frac12} \right\} \cdot \dot a_t(u - \chi_t)^{\frac12} + \dot a_t(u - \chi_t). \]
Thus, by (19), we have
\[ |\dot a_t(u) - \dot a_t(\chi_t)| \le \frac{2}{t} \cdot \max\left\{ a_t(\chi_t)^{\frac12}, a_t(u)^{\frac12} \right\} \cdot a_t(u - \chi_t)^{\frac12} + \frac{a_t(u - \chi_t)}{t}. \tag{24} \]
Let $\delta_t$ be the distance from $E_t$ to the complement $\mathbb{R}\setminus I$. Since $E_0$ belongs to the interior of $I$ and $E_t \to E_0$, there exist $\delta > 0$ and $0 < t_1 \le t_0$ so that if $t < t_1$, then $\delta_t \ge \delta$. Hence we may apply Lemma 2.3 to find that
\[ a_t(u - \chi_t) \le t^2 \cdot a_t(u) \cdot \left(1 + \frac{2E_0}{\delta}\right)^2 \]
for $t < t_1$. Since $a_t$ is non-negative, from (22) we have $a_t(\chi_t) \le a_t(u) \le 2(E_0 + 1) \cdot \|u\|^2$ for $t \le t_0$. By combining these estimates with (24) we find that for $t \le t_1$,
\[ |\dot a_t(u) - \dot a_t(\chi_t)| \le 2(E_0 + 1) \cdot \|u\|^2 \cdot \left(1 + \frac{2E_0}{\delta}\right) \cdot \left(2 + t \cdot \left(1 + \frac{2E_0}{\delta}\right)\right). \tag{25} \]
Estimate (20) then follows from (23), (25) and the triangle inequality.

If $E_0 > 0$, then there exists $0 < t_2 \le t_1$ such that if $t < t_2$, then
\[ a_t(u) \ge \frac12 \cdot q_t(u) = \frac12 \cdot E_t \cdot \|u\|^2 \ge \frac14 \cdot E_0 \cdot \|u\|^2. \]
Thus, if $E_0 > 0$, then (21) follows from Corollary 2.4. On the other hand, if $E_0 = 0$, then let $t_1$ and $\delta$ be as above. Since $P^I_a$ is a spectral projection and the eigenspaces are orthogonal, we have
\[ a\big(u - P^I_a(u)\big) \ge \delta \cdot \big\|u - P^I_a(u)\big\|^2. \]
Thus, by Lemma 2.3 and (22) we have
\[ 2t^2 \cdot (E_0 + 1) \cdot \left(1 + \frac{E_0 + 1}{\delta}\right) \cdot \|u\|^2 \ge \delta \cdot \big\|u - P^I_a(u)\big\|^2. \]
In particular, if $t^2 < (\delta/8) \cdot (E_0 + 1)^{-1} \cdot (1 + (E_0 + 1)/\delta)^{-1}$, then
\[ \frac14 \|u\|^2 \ge \big\|u - P^I_a(u)\big\|^2. \]
Estimate (21) then follows from the triangle inequality.


5. Definition and Basic Properties

In the sequel $\sigma : [0, \infty) \to \mathbb{R}^+$ will be a smooth positive function such that
• $\lim_{x\to\infty} \sigma(x) = 0$,
• $\sigma'(x) < 0$ for all $x \ge 0$,
• $|\sigma''|$ has at most polynomial growth on $[0, \infty)$.

For $u, v \in C_0^\infty((0, \infty))$, define
\[ \langle u, v\rangle_\sigma = \int_0^\infty u(x) \cdot v(x) \cdot \sigma(x)\, dx. \]
Let $H_\sigma$ denote the Hilbert space obtained by completing $C_0^\infty((0, \infty))$ with respect to the norm $\|u\|_\sigma := \sqrt{\langle u, u\rangle_\sigma}$. Let $H^1(0, \infty)$ and $H_0^1(0, \infty)$ denote, respectively, the classical Sobolev spaces with respect to Lebesgue measure on $(0, \infty)$. For each $t > 0$ and $u$ in $H^1(0, \infty)$, we define
\[ a_t^\mu(u) = \int_0^\infty \left( t^2 \cdot |u'(x)|^2 + \mu \cdot |u(x)|^2 \right) dx. \]

Remark 5.1. If $\mu > 0$, then since $\sigma$ is decreasing, we have
\[ \|u\|_\sigma^2 \le \sigma(0) \int_0^\infty |u(x)|^2\, dx \le \frac{\sigma(0)}{\mu}\, a_t^\mu(u). \]

Let $\mathrm{dom}_D(a_t^\mu) = H_0^1(0, \infty) \cap H_\sigma$ and let $\mathrm{dom}_N(a_t^\mu) = H^1(0, \infty) \cap H_\sigma$. Both $\mathrm{dom}_D(a_t^\mu)$ and $\mathrm{dom}_N(a_t^\mu)$ are closed form domains for $a$ that are dense in $H_\sigma$.

Definition 5.2. The spectrum of the quadratic form $a_t^\mu$ restricted to $\mathrm{dom}_D(a_t^\mu)$ (resp. $\mathrm{dom}_N(a_t^\mu)$) with respect to $\langle\cdot,\cdot\rangle_\sigma$ will be called the Dirichlet (resp. Neumann) spectrum of $a_t^\mu$.

In the sequel, we will drop the subscript 'D' from $\mathrm{dom}_D(a_t^\mu)$ and the subscript 'N' from $\mathrm{dom}_N(a_t^\mu)$. In particular, unless stated otherwise, all of the results below hold for both the Neumann and Dirichlet boundary conditions. When we refer to the 'spectrum' of $a_t^\mu$, we will mean either the Dirichlet or the Neumann spectrum.

Proposition 5.3. If $\mu > 0$ and $t > 0$, then the quadratic form $a_t^\mu$ has discrete spectrum with respect to $\langle\cdot,\cdot\rangle_\sigma$.

Proof. By a standard result in spectral theory (see, for example, Theorem XIII.64 [Reed-Simon]) it suffices to prove that for each $r > 0$ the set
\[ A_r = \left\{ u \in \mathrm{dom}(a_t^\mu) \;\middle|\; a_t^\mu(u) \le r,\ \|u\|_\sigma \le 1 \right\} \]
is compact with respect to $\|\cdot\|_\sigma$. To verify this, one uses Rellich's Lemma on compact sets. The decay of $\sigma$ prevents the escape of mass at infinity.


6. Estimates of Quasimodes and Eigenfunctions

In the sequel, unless otherwise stated, we assume that $\mu > 0$. Let $r \in H_\sigma$ and let $E \ge 0$. In this section, we begin our analysis of functions $w$ in $\mathrm{dom}(a_t^\mu)$ that satisfy
\[ a_t^\mu(w, v) - E \cdot \langle w, v\rangle_\sigma = \langle r, v\rangle_\sigma \tag{26} \]
for all $v \in \mathrm{dom}(a_t^\mu)$. In applications, the function $r$ in (26) will be negligible. For example, if $r = 0$, then $w$ is an eigenfunction with eigenvalue $E$. More generally, if
\[ a_{t_n}^\mu(w_n, v) - E_n \cdot \langle w_n, v\rangle_\sigma = \langle r_n, v\rangle_\sigma, \tag{27} \]
where $t_n \to 0$, $w_n \in \mathrm{dom}(a_{t_n}^\mu)$, $\lim E_n = E_0$ and $\|r_n\|_\sigma = O(t_n^\rho) \cdot \|w_n\|_\sigma$, then the sequence $w_n$ is called a quasimode of order $\rho$ at energy $E_0$. (See also Proposition 2.6 and Remark 9.2.)

Our goal is to understand the behavior of both eigenfunctions and quasimodes. Of course, in most situations, either the eigenfunction estimate will be stronger than the quasimode estimate and/or the proof will be simpler. In the following, we will first provide a general estimate, valid for any quasimode, and then, as needed, we will state and prove the stronger result for eigenfunctions.

By unwinding the definitions, Eq. (26) may be rewritten as
\[ \int_0^\infty \left( t^2 \cdot w'(x) \cdot v'(x) + f_E(x) \cdot w(x) \cdot v(x) \right) dx = \int_0^\infty r(x) \cdot v(x) \cdot \sigma(x)\, dx, \tag{28} \]
where $f_E(x) = \mu - E \cdot \sigma(x)$. By integrating (28) by parts, we find that $w$ satisfies (28) for all $v \in C_0^\infty((0, \infty))$ if and only if for each $x \in (0, \infty)$,
\[ -t^2 \cdot w''(x) + f_E(x) \cdot w(x) = r(x) \cdot \sigma(x). \tag{29} \]
The function $w$ is a Dirichlet (resp. Neumann) eigenfunction of $a_t^\mu$ if and only if $w$ is in $H_\sigma \cap H^1$, satisfies Eq. (29) and $w(0) = 0$ (resp. $w'(0) = 0$).

Let $E \ge \mu/\sigma(0)$. For instance, we may choose $E$ to be an eigenvalue of $a_t^\mu$. Since $\sigma$ is strictly decreasing, there exists a unique point $x_E \in [0, \infty)$ such that $f_E(x_E) = 0$. In particular, if $x > x_E$, then $f_E(x) > 0$ and if $x < x_E$, then $f_E(x) < 0$. If $w$ is an eigenfunction ($r = 0$), then one expects $w$ to behave like an exponential function when $x \gg x_E$ and to oscillate for $x \ll x_E$. Moreover, as $t$ tends to zero, one expects that both types of behavior will become more and more extreme. On the other hand, since $\lim_{x\to\infty} \sigma(x) = 0$, we do not know, for example, that $r$ is bounded as $x$ tends to infinity. In particular, for a non-zero $r$, we have no direct argument that shows that a solution $w$ to (29) has exponential decay or is, in fact, bounded.


6.1. A general $L^2$ estimate. For each $E \ge \mu/\sigma(0)$ and $s \in [0, \mu)$, let $x_E^s \ge x_E$ be the unique solution to
\[ f_E(x_E^s) = s. \tag{30} \]
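To make the definition of $x_E^s$ concrete, the following is a minimal numerical sketch rather than anything taken from the paper. It assumes the illustrative weight $\sigma(x) = (1+x)^{-2}$ and $\mu = 1$ (both are assumptions chosen only because they satisfy the standing hypotheses), and the names `sigma`, `f` and `turning_point` are hypothetical. Since $f_E$ is increasing with limit $\mu > s$, plain bisection is enough to locate the solution of (30).

```python
import math

MU = 1.0  # illustrative value of mu (an assumption, not from the paper)


def sigma(x):
    # Sample weight, assumed for illustration: smooth, positive,
    # strictly decreasing, and tending to 0 at infinity.
    return 1.0 / (1.0 + x) ** 2


def f(E, x):
    # f_E(x) = mu - E * sigma(x), as in the text.
    return MU - E * sigma(x)


def turning_point(E, s=0.0, tol=1e-12):
    """Solve f_E(x) = s by bisection, for E >= mu/sigma(0) and 0 <= s < mu."""
    if E < MU / sigma(0.0) or not (0.0 <= s < MU):
        raise ValueError("need E >= mu/sigma(0) and 0 <= s < mu")
    lo, hi = 0.0, 1.0
    while f(E, hi) < s:  # f_E increases to mu > s, so this loop terminates
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(E, mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this particular $\sigma$ the solution is available in closed form, $x_E^s = \sqrt{E/(\mu - s)} - 1$, which gives a simple way to check the sketch; with $s = 0$ the function returns the turning point $x_E$ itself.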

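Remark 6.1. Since the derivative of $\sigma$ does not vanish, the mappings $E \mapsto x_E$ and $E \mapsto x_E^s$ are smooth from $[\mu/\sigma(0), \infty)$ to $(0, \infty)$. Note also that $\lim_{s\to 0} x_E^s = x_E$ and $\lim_{s\to\mu} x_E^s = \infty$.

The following estimate shows that if $w_t$ satisfies (26) and
\[ \lim_{t\to 0} \frac{\|r\|_\sigma}{\|w_t\|_\sigma} = 0, \]
then, for any fixed $s$, the $L^2$ mass of $w_t$ concentrates in the region $\{x \mid x \le x_E^s\}$ as $t$ goes to 0. Additional work is required to prove that $w_t$ actually concentrates in the classically allowed region $\{x \mid x \le x_E\}$. See Proposition 9.1.

Lemma 6.2. Let $K \subset [\mu/\sigma(0), \infty)$ be compact and let $s \in (0, \mu)$. There exists a constant $C$ such that for each $E \in K$, $r \in H_\sigma$, and solution $w$ to (29) we have
\[ \int_{x_E^s}^\infty |w(x)|^2\, dx \le C \cdot \left( t^2 + \frac{\|r\|_\sigma}{\|w\|_\sigma} \right) \cdot \int_0^\infty |w(x)|^2\, dx. \]
The constant $C$ depends only upon $K$, $\mu$, $\sigma$, and $s$.

Proof. Let $\chi : \mathbb{R} \to [0, 1]$ be a smooth function such that $\chi(x) = 0$ for all $x \le 0$ and $\chi(x) = 1$ for all $x \ge 1$. For each $M \in \mathbb{R}$, define
\[ \rho_M(x) = \chi\!\left( \frac{x - x_E}{x_E^s - x_E} \right) \cdot \chi(M + 1 - x). \]
Substitute $\rho_M \cdot w$ for $v$ in (28). Since $\rho_M$ vanishes for $x \le x_E$, we obtain
\[ \int_{x_E}^\infty \left( t^2 \cdot (w')^2 \cdot \rho_M + t^2 \cdot w \cdot w' \cdot \rho_M' + f_E(x) \cdot w^2 \cdot \rho_M \right) dx = \int_{x_E}^\infty r \cdot w \cdot \rho_M \cdot \sigma\, dx. \tag{31} \]
By integrating by parts, one finds that
\[ \int_{x_E}^\infty w \cdot w' \cdot \rho_M'\, dx = -\frac12 \int_{x_E}^\infty w^2 \cdot \rho_M''\, dx, \]
and hence (31) is equivalent to
\[ t^2 \int_{x_E}^\infty (w')^2 \cdot \rho_M\, dx + \int_{x_E}^\infty f_E \cdot w^2 \cdot \rho_M\, dx = \frac{t^2}{2} \int_{x_E}^\infty \rho_M'' \cdot w^2\, dx + \int_{x_E}^\infty r \cdot w \cdot \rho_M \cdot \sigma\, dx. \]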


The first integral on the left-hand side is positive. Moreover, since $f_E \cdot w^2 \cdot \rho_M$ is non-negative on $[x_E, x_E^s]$, and $0 \le \rho_M \le 1$, we have
\[ \int_{x_E^s}^\infty f_E \cdot w^2 \cdot \rho_M\, dx \le \frac{t^2}{2} \int_{x_E}^\infty |\rho_M''| \cdot w^2\, dx + \int_{x_E}^\infty |r \cdot w| \cdot \sigma\, dx. \tag{32} \]
The function $|\rho_M''|$ is bounded by a constant multiplied by $|x_E^s - x_E|^{-2}$. Therefore, since $x_E < x_E^s$, and $x_E$, $x_E^s$ are smooth over the compact $K$ (see Remark 6.1), there exists $C' > 0$ such that for each $E \in K$, $x \in [0, \infty)$, and $M \in \mathbb{R}$, we have $|\rho_M''(x)| \le C'$. By applying this estimate to (32) and applying the Cauchy-Schwarz inequality, we obtain
\[ \int_{x_E^s}^\infty f_E \cdot w^2 \cdot \rho_M\, dx \le \frac{C' \cdot t^2}{2} \int_{x_E}^\infty w^2\, dx + \|r\|_\sigma \cdot \|w\|_\sigma. \]
For $x \ge x_E^s$, we have $\rho_M(x) = \chi(M + 1 - x)$. Thus, since $f_E \cdot w^2$ is integrable, by the Lebesgue dominated convergence theorem, we may let $M$ tend to $\infty$ and obtain
\[ \int_{x_E^s}^\infty f_E \cdot w^2\, dx \le \frac{C' \cdot t^2}{2} \int_{x_E}^\infty w^2\, dx + \|r\|_\sigma \cdot \|w\|_\sigma. \]
Since $\sigma$ is decreasing, the function $f_E(x)$ is increasing and
\[ \|w\|_\sigma^2 \le \sigma(0) \int_0^\infty w^2(x)\, dx. \]
Therefore, we find that
\[ f_E(x_E^s) \cdot \int_{x_E^s}^\infty w^2\, dx \le \left( \frac{C' \cdot t^2}{2} + \sigma(0) \cdot \frac{\|r\|_\sigma}{\|w\|_\sigma} \right) \int_0^\infty w^2\, dx. \]
Since $f_E(x_E^s) = s$, the lemma follows by choosing $C$ to be $s^{-1} \max(C'/2, \sigma(0))$.

6.2. An estimate of the $L^2$ mass of an eigenfunction. If $w$ is an eigenfunction, then the bound given in Lemma 6.2 can be greatly improved. In particular, an eigenfunction is exponentially small in the classically forbidden region, and hence one can make $L^2$ estimates with polynomial weights. See Lemma 6.4. First, we quantify the exponential decay of each eigenfunction.

Lemma 6.3. Let $w$ be an eigenfunction of $a_t^\mu$ with eigenvalue $\lambda \le E$. If $x \ge y \ge x_E^s$, then
\[ w^2(x) \le w^2(y) \cdot \exp\left( -\frac{\sqrt{2s}}{t} \cdot (x - y) \right). \tag{33} \]

Proof. The proof is a straightforward convexity estimate using the maximum principle.

This estimate allows us to prove the following.


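Lemma 6.4. For each $\nu > 0$, there exists a function $\beta_\nu : (\mu/\sigma(0), \infty) \times (0, \mu) \to \mathbb{R}$ such that if $w$ is an eigenfunction of $a_t^\mu$ with eigenvalue $\lambda \le E$, and $t \le 1$, then
\[ \int_{3x_E^s}^\infty w^2(x) \cdot (1 + x^\nu)\, dx \le \beta_\nu(E, s) \cdot t \cdot \int_{x_E^s}^\infty w^2(x)\, dx. \]

Proof. Let $\alpha = \sqrt{2s}/t$. By exchanging the roles of $x$ and $y$ in (33), we find that for all $x \in [x_E^s, y]$,
\[ w^2(x) \ge w^2(y) \cdot \exp(\alpha \cdot (y - x)). \tag{34} \]
Integrating with respect to $x$, we obtain
\[ \int_{x_E^s}^y w^2(x)\, dx \ge \frac{1}{\alpha} \cdot w^2(y) \cdot \left( \exp(\alpha \cdot (y - x_E^s)) - 1 \right), \]
and thus
\[ w^2(y) \le \frac{\alpha}{\exp(\alpha \cdot (y - x_E^s)) - 1} \int_{x_E^s}^y w^2(x)\, dx. \tag{35} \]
If $u \ge 0$, then $u^\nu \le c_\nu \cdot e^u$, where $c_\nu = \sup\{u^\nu e^{-u} \mid u > 0\}$. Hence, we have
\[ x^\nu \le c_\nu \cdot \left(\frac{2}{\alpha}\right)^\nu \cdot e^{\alpha \cdot x/2}. \]
By combining this with (33), we find that for $x \ge y$,
\[ w^2(x) \cdot x^\nu \le c_\nu \cdot \left(\frac{2}{\alpha}\right)^\nu \cdot w^2(y) \cdot \exp\left( -\alpha \cdot \left( \frac{x}{2} - y \right) \right). \]
By integrating, we find that
\[ \int_y^\infty w^2(x) \cdot x^\nu\, dx \le c_\nu \cdot \left(\frac{2}{\alpha}\right)^{\nu+1} \cdot w^2(y) \cdot \exp(\alpha \cdot y/2). \]
Putting this together with (35) gives
\[ \int_y^\infty w^2(x) \cdot x^\nu\, dx \le 2 \cdot c_\nu \cdot \left(\frac{2}{\alpha}\right)^\nu \cdot \frac{\exp(\alpha \cdot y/2)}{\exp(\alpha \cdot y - \alpha \cdot x_E^s) - 1} \int_{x_E^s}^y w^2(x)\, dx. \tag{36} \]
If we let
\[ c_\nu' = \sup\left\{ \frac{x \cdot \exp(3x/2)}{\exp(2x) - 1} \;\middle|\; x > 0 \right\} \]
and set $y = 3 \cdot x_E^s$, then we have
\[ \frac{\exp(\alpha \cdot y/2)}{\exp(\alpha \cdot y - \alpha \cdot x_E^s) - 1} \le \frac{c_\nu'}{\alpha \cdot x_E^s}. \]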


By substituting this into (36) we obtain
\[ \int_{3x_E^s}^\infty w^2(x) \cdot x^\nu\, dx \le \frac{2 c_\nu \cdot c_\nu'}{x_E^s} \cdot t^{\nu+1} \cdot s^{-\frac{\nu+1}{2}} \int_{x_E^s}^\infty w^2(x)\, dx. \tag{37} \]
The claim then follows by specializing (37) to the case $\nu = 0$ and adding the resulting estimate to (37). In particular, we may define
\[ \beta_\nu(E, s) = 2 \cdot \frac{c_0 \cdot c_0' + c_\nu \cdot c_\nu'}{x_E^s \cdot s^{\frac{\nu+1}{2}}}. \]

6.3. Comparing weighted $L^2$ inner products on eigenfunctions. Let $p : [0, \infty) \to \mathbb{R}$ be a positive continuous function of (at most) polynomial growth. That is, there exist constants $C_p$ and $\nu_p$ such that if $x \ge 0$, then $0 < p(x) \le C_p \cdot (1 + x^{\nu_p})$. We will regard $p$ as a weight for an $L^2$-inner product.

Proposition 6.5. Let $p$ be as above. There exists a function $\alpha : [\mu/\sigma(0), \infty) \times (0, \mu) \to \mathbb{R}$ such that if $s \in (0, \mu)$, then
\[ \lim_{E \to \mu/\sigma(0)} \alpha(E, s) = 0 \tag{38} \]

and a function β : (μ/σ (0), ∞) × (0, μ) → R such that if w± is an eigenfunction of μ at with eigenvalue λ± ≤ E, then ∞ ∞ ∞ ≤ s) + β(E, s) · t) w · w · p d x − p(0) w · w d x w 2 d x. (α(E, + − + − 0

0

0

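The functions $\alpha$ and $\beta$ depend only on $p$, $E$, $\sigma$, and $\mu$.

Proof. Set $\alpha(E, s) = \sup\{ |p(x) - p(0)| \mid 0 \le x \le 3x_E^s \}$. Since $p$ is continuous and $\lim_{E\to\mu/\sigma(0)} x_E^s = 0$ we have (38). Using the Cauchy-Schwarz inequality we find that
\[ \left| \int_0^{3x_E^s} w_+ \cdot w_- \cdot p\, dx - p(0) \int_0^{3x_E^s} w_+ \cdot w_-\, dx \right| \le \alpha(E, s) \cdot \|w_+\| \cdot \|w_-\|. \]
We also have
\[ \left( \int_{3x_E^s}^\infty w_+ \cdot w_- \cdot p\, dx \right)^2 \le \left( \int_{3x_E^s}^\infty |w_+|^2 \cdot p\, dx \right) \cdot \left( \int_{3x_E^s}^\infty |w_-|^2 \cdot p\, dx \right). \]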


By Lemma 6.4 we have
\[ \int_{3x_E^s}^\infty |w_\pm|^2 \cdot p\, dx \le C_p \cdot \beta_{\nu_p}(E, s) \cdot t \int_0^\infty |w_\pm|^2\, dx \]
and also
\[ p(0) \int_{3x_E^s}^\infty |w_\pm|^2\, dx \le p(0) \cdot \beta_0(E, s) \cdot t \int_0^\infty |w_\pm|^2\, dx. \]
The claim then follows from combining these estimates and using the triangle inequality.

7. The Langer-Cherry Transform

We wish to analyse the behavior of the solutions to (29) for $x$ near $x_E$ and for $t$ small. To do this, we will use a transform to put the solution into a normal form. The transform that we will use was first considered by Langer [Langer31] and Cherry [Cherry50] and is a variant of the Liouville-Green transformation. See Chapter 11 in [Olver].

As above, let $f_E = \mu - E \cdot \sigma$, where $\sigma$ is smooth with $\sigma' < 0$ and $\lim_{x\to\infty} \sigma(x) = 0$. For $E \ge \mu/\sigma(0)$, there exists a unique $x_E \in [0, \infty)$ such that $f_E(x_E) = 0$. In the present context, the Langer-Cherry transform is based on the function $\phi_E : [0, \infty) \to \mathbb{R}$ defined by
\[ \phi_E(x) = \operatorname{sign}(x - x_E) \cdot \left| \frac32 \int_{x_E}^x |f_E(u)|^{\frac12}\, du \right|^{\frac23}. \tag{39} \]
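As a sanity check on definition (39), the following is a minimal numerical sketch, not part of the paper. It assumes the illustrative choices $\sigma(x) = e^{-x}$, $\mu = 1$ and $E = 2$, so that $x_E = \ln 2$; the names `sigma`, `f_E`, `phi_E` and `check_identity` are hypothetical. The integral in (39) is evaluated with Simpson's rule after the substitution $u = x_E + (x - x_E)s^2$, which removes the square-root behaviour of the integrand at the turning point, and the identity $(\phi_E')^2 \cdot \phi_E = f_E$ is then tested with a central difference.

```python
import math

MU, E = 1.0, 2.0  # illustrative parameters (assumptions, not from the paper)


def sigma(x):
    # Sample weight, assumed for illustration: positive, decreasing, -> 0.
    return math.exp(-x)


def f_E(x):
    return MU - E * sigma(x)


X_E = math.log(E / MU)  # turning point f_E(X_E) = 0 for this sigma


def phi_E(x, n=400):
    """Evaluate (39) by Simpson's rule in the variable s, u = X_E + (x - X_E) s^2."""
    d = x - X_E
    if d == 0.0:
        return 0.0

    def g(s):
        u = X_E + d * s * s
        # |f_E(u)|^(1/2) times |du/ds| = 2 |d| s
        return math.sqrt(abs(f_E(u))) * 2.0 * abs(d) * s

    h = 1.0 / n  # n must be even for Simpson's rule
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    integral = total * h / 3.0
    return math.copysign((1.5 * integral) ** (2.0 / 3.0), d)


def check_identity(x, h=1e-4):
    """Return ((phi')^2 * phi, f_E) at x, with phi' from a central difference."""
    dphi = (phi_E(x + h) - phi_E(x - h)) / (2.0 * h)
    return dphi ** 2 * phi_E(x), f_E(x)
```

Calling `check_identity(x)` at points on either side of $x_E$ should return two nearly equal numbers, negative to the left of the turning point and positive to the right.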

Before defining the Langer-Cherry transform, we collect some facts concerning $\phi_E$.

Lemma 7.1. Let $U = [\mu/\sigma(0), \infty) \times [0, \infty)$.
(1) The map $(E, x) \mapsto \phi_E(x)$ is smooth on $U$.
(2) $\phi_E'(x) > 0$ for each $(E, x) \in U$.
(3) $(\phi_E')^2 \cdot \phi_E = f_E$.
(4) The map $(E, x) \mapsto f_E(x)/\phi_E(x)$ defined for $x \ne x_E$ extends to a smooth map from $U$ to $\mathbb{R}^+$.
(5) The limit
\[ \lim_{x\to\infty} x^{-\frac23} \cdot \phi_E(x) = (3/2)^{\frac23} \cdot \mu^{\frac13} \]
holds uniformly for $E$ in each compact subset of $[\mu/\sigma(0), \infty)$.

Proof. These properties follow directly from the definition (39) or from the alternative expression (41) below that we now prove. Since $\sigma'(x) < 0$ for all $x \in [0, \infty)$, the map $I : U \to \mathbb{R}$,
\[ I(E, u) = \int_0^1 -E \cdot \sigma'(s \cdot u + (1 - s) \cdot x_E)\, ds, \]


is smooth and positive on $U$. The map $\pi : U \to \mathbb{R}$ defined by
\[ \pi(E, x) = \int_0^1 s^{\frac12} \cdot I^{\frac12}(E, s \cdot x + (1 - s) \cdot x_E)\, ds \tag{40} \]
is also smooth and positive. Since $f_E(x_E) = 0$ and $f_E'(x) = -E\sigma'(x)$, the fundamental theorem of calculus gives that $\mu - E \cdot \sigma(u) = (u - x_E) \cdot I(E, u)$. Direct computation shows that
\[ \phi_E(x) = (x - x_E) \cdot \left( \frac32 \cdot \pi(E, x) \right)^{\frac23}. \tag{41} \]

Definition 7.2. Let $w : [0, \infty) \to \mathbb{R}$ and let $E \ge \mu/\sigma(0)$. Define the Langer-Cherry transform of $w$ at energy $E$ to be the function
\[ W_E = \left( (\phi_E')^{\frac12} \cdot w \right) \circ \phi_E^{-1}. \tag{42} \]
It follows from Lemma 7.1 that the Langer-Cherry transform maps $C^k([0, \infty))$ to $C^k([\phi_E(0), \infty))$. The importance of this transform is due to its effect on solutions to the ordinary differential equation (29). In what follows we let
\[ \rho_E = (\phi_E')^{-\frac12}. \tag{43} \]

Proposition 7.3. Let $r : [0, \infty) \to \mathbb{R}$ and let $w \in C^2([0, \infty))$. Let $W_E$ be the Langer-Cherry transform of $w$ at energy $E$. Then $w$ satisfies $t^2 \cdot w'' - f_E \cdot w = -r \cdot \sigma$ if and only if $W_E$ satisfies
\[ t^2 \cdot W_E'' - y \cdot W_E = -t^2 \cdot (\rho_E^3 \cdot \rho_E'') \circ \phi_E^{-1} \cdot W_E - (\rho_E^3 \cdot r \cdot \sigma) \circ \phi_E^{-1}. \tag{44} \]

The proof is a straightforward but lengthy computation. See also, for example, §11.3 in [Olver], where the function fˆ is related to h E by fˆ = h 4 . In the analysis that follows, we will treat the right-hand side of (44) as an error term for t and r small. The following estimates will help justify this treatment. Lemma 7.4. Let K ⊂ [μ/σ (0), ∞) be compact. There exists C > 0 such that if x ≥ 0 and E ∈ K , then 1 −1 ρ (x) ≤ C · x 6 , E

and

1 |ρ E (x)| ≤ C · 1 + x 6 .

Moreover, there exists ν such that

ρ (x) ≤ C · 1 + x ν . E The exponent ν depends only on σ . The constant C depends only on μ, σ , and K .


Proof. By part (3) of Lemma 7.1, we have $\rho = (\phi/f)^{\frac14}$. Hence since $\lim_{x\to\infty} f_E(x) = \mu$, we find from part (5) that
\[ \lim_{x\to\infty} \rho_E \cdot x^{-\frac16} = \left( \frac{3}{2\mu} \right)^{\frac16} \]
uniformly for $E \in K$. The first two estimates follow. To prove the last estimate, one computes using $f = (\phi')^2 \cdot \phi$ that
\[ \rho'' = -\frac{5}{16} \frac{f^{\frac34}}{\phi^{\frac{11}{4}}} - \frac14 \frac{\phi^{\frac14} \cdot f''}{f^{\frac54}} + \frac{5}{16} \frac{\phi^{\frac14} \cdot (f')^2}{f^{\frac94}}. \]
By part (5) of Lemma 7.1, both $\phi$ and $1/\phi$ have polynomial growth that is uniform for $E \in K$. By assumption, $\sigma''$ has at most polynomial growth, and hence, by integration, the function $\sigma'$ also has at most polynomial growth. Therefore, $f''$ and $f'$ both have polynomial growth that is uniform over $K$. Therefore, since $\lim_{x\to\infty} f_E(x) = \mu > 0$, we find that $\rho''$ has uniform polynomial growth.

Lemma 7.5. Let $I \subset [0, \infty)$ be a compact interval and let $K \subset [\mu/\sigma(0), \infty)$ be a compact set. There exists a constant $C$ such that for each $E \in K$, if $w$ is a solution to (29) and $W_E$ is the Langer-Cherry transform of $w$ at energy $E$, then we have
\[ \int_{\phi_E(I)} \left| t^2 \cdot W_E''(y) - y \cdot W_E(y) \right|^2 dy \le C \cdot \left( \|r\|_\sigma^2 + t^4 \int_0^\infty |w|^2\, dx \right). \]
The constant $C$ depends only on $\mu$, $\sigma$, $I$, and $K$.

Proof. For each continuous function $F : I \to \mathbb{R}$, let $|F|_\infty = \sup\{|F(x)| \mid x \in I\}$. We perform the change of variables $y = \phi_E(x)$. Since $\phi_E' = \rho_E^{-2}$, we have $dy = \rho_E^{-2} \cdot dx$, and thus by (42) and (43),
\[ |W|^2\, dy = \rho_E^{-4} \cdot |w|^2\, dx. \tag{45} \]
Therefore
\[ \int_{\phi_E(I)} \left| (\rho_E^3 \cdot \rho_E'') \circ \phi_E^{-1} \right|^2 \cdot |W|^2\, dy \le |\rho_E|_\infty^2 \cdot |\rho_E''|_\infty^2 \cdot \int_I |w|^2\, dx, \]
and
\[ \int_{\phi_E(I)} \left| (\rho_E^3 \cdot r \cdot \sigma) \circ \phi_E^{-1} \right|^2 dy \le |\rho_E^4 \cdot \sigma|_\infty \cdot \|r\|_\sigma^2. \]
The claim then follows from squaring and integrating (44) and applying the above estimates.

Suppose $w$ is an eigenfunction of $a_t^\mu$, and denote by $\lambda$ its eigenvalue. If we perform the Langer-Cherry transform at energy $E = \lambda$ then $r = 0$ and hence the conclusion of Lemma 7.5 is stronger. Actually, we will need the following strengthening of Lemma 7.5, which treats the case when $w$ is an eigenfunction of $a_t^\mu$ but $E$ is close to, but not necessarily exactly, the corresponding eigenvalue.


Lemma 7.6. Let $K \subset [\mu/\sigma(0), \infty)$ be compact. There exists a constant $C_K$ such that if $t < 1$, $w$ is an eigenfunction of $a_t^\mu$ with eigenvalue $\lambda \in K$, and $W$ is the Langer-Cherry transform of $w$ at energy $E \in K$, then
\[ \int_{\phi_E(0)}^\infty \left| t^2 \cdot W'' - y \cdot W \right|^2 dy \le C_K \cdot \left( |\lambda - E|^2 + t^4 \right) \int_0^\infty w^2\, dx. \tag{46} \]

Proof. Since $-t^2 \cdot w'' + (\mu - \lambda \cdot \sigma) \cdot w = 0$, the function $w$ satisfies $-t^2 \cdot w'' + f_E \cdot w = r$ with $r = (\lambda - E) \cdot \sigma \cdot w$. Therefore we may apply Proposition 7.3. In particular, it suffices to bound the integrals of the squares of the terms appearing on the right-hand side of (44). By Lemma 7.4 there exist $\nu_1$ and $C_1$ (depending only on $K$) such that $|\rho_E(x)|^{-4} \cdot |\rho_E^3 \cdot \rho_E''(x)|^2 \le C_1 \cdot (1 + x^{\nu_1})$. Hence by changing variables (recall that $W^2\, dy = \rho_E^{-4} w^2\, dx$) we find that
\[ \int_{\phi_E(0)}^\infty \left| (\rho_E^3 \cdot \rho_E'') \circ \phi_E^{-1} \right|^2 \cdot |W(y)|^2\, dy \le C_1 \int_0^\infty |w(x)|^2 \cdot (1 + x^{\nu_1})\, dx. \]
Since $w$ is an eigenfunction, we can apply Lemma 6.4. By fixing $s = \mu/2$, we obtain a constant $C_2$, depending only on $K$, such that
\[ \int_{3x_E^s}^\infty |w(x)|^2 \cdot (1 + x^{\nu_1})\, dx \le C_2 \cdot t \int_{x_E^s}^\infty |w(x)|^2\, dx. \]
Let $x^* = \sup\{x_E^s \mid E \in K,\ s = \mu/2\}$. Then
\[ \int_0^{3x_E^s} |w(x)|^2 \cdot (1 + x^{\nu_1})\, dx \le \left( 1 + (3x^*)^{\nu_1} \right) \int_0^\infty |w(x)|^2\, dx. \]
In sum, if $t \le 1$, then we have a constant $C_3$ such that
\[ \int_{\phi_E(0)}^\infty \left| (\rho_E^3 \cdot \rho_E'') \circ \phi_E^{-1}(y) \right|^2 \cdot |W(y)|^2\, dy \le C_3 \int_0^\infty |w(x)|^2\, dx. \]

A similar argument shows that there exists C4 —depending only on K —such that ∞ ∞ 2 3 (y) · W (y) dy ≤ C |w(x)|2 d x. (ρ E · σ 2 ) ◦ φ −1 4 E φ E (0)

By putting these estimates together we obtain the claim.

0

The following lemma will allow us to control scalar products in w when they are expressed on the Cherry-Langer side, in the limit as E tends to μ/σ (0) and t tends to 0. It will be used in the proof of Theorem 10.4.


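Lemma 7.7. Let $q : [0, \infty) \to \mathbb{R}$ be a positive continuous function of at most polynomial growth. Given $\varepsilon > 0$, there exists $\delta > 0$ such that if $t < \delta$, $E < \mu/\sigma(0) + \delta$, and $w_\pm$ is an eigenfunction of $a_t^\mu$ with eigenvalue $\lambda_\pm \le E$, then
\[ \left| \int_{\phi_E(0)}^\infty W_+ \cdot W_-\, dy - \frac{1}{\rho_E^4(0) \cdot q(0)} \int_0^\infty w_+ \cdot w_- \cdot q\, dx \right| \le \varepsilon \cdot \|w_+\| \cdot \|w_-\|, \tag{47} \]
where $W_\pm$ is the Langer-Cherry transform of $w_\pm$ at energy $E$, and $\|\cdot\|$ is the standard (unweighted) $L^2$ norm.

Proof. Changing variables gives
\[ \int_{\phi_E(0)}^\infty W_+ \cdot W_-\, dy = \int_0^\infty w_+ \cdot w_- \cdot \rho_E^{-4}(x)\, dx. \]
By Lemma 7.4, the function $\rho_E^{-4}$ is bounded, and hence we can apply Proposition 6.5. In particular, choose $\delta_1 > 0$ so that if $E < \mu/\sigma(0) + \delta_1$, then $\alpha_p(E, \mu/2) < \varepsilon/4$ and choose $\delta_2 \le \delta_1$ so that if $t < \delta_2$, then $\beta(\delta_1, \mu/2) \cdot t < \varepsilon/4$. Thus, if $E < \mu/\sigma(0) + \delta_2$ and $t < \delta_2$, then
\[ \left| \int_{\phi_E(0)}^\infty W_+ \cdot W_-\, dy - \rho_E^{-4}(0) \int_0^\infty w_+ \cdot w_-\, dx \right| \le \frac{\varepsilon}{2} \cdot \|w_+\| \cdot \|w_-\|. \]
In a similar fashion we can apply Proposition 6.5 to find $\delta \le \delta_2$ so that if $E < \mu/\sigma(0) + \delta$ and $t < \delta$, then
\[ \left| \int_0^\infty w_+ \cdot w_- \cdot q\, dx - q(0) \int_0^\infty w_+ \cdot w_-\, dx \right| \le \frac{\varepsilon}{2} \cdot \rho_E^4(0) \cdot q(0) \cdot \|w_+\| \cdot \|w_-\|. \]
The claim follows.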

8. Airy Approximations

In this section we analyse solutions to the inhomogeneous equation
\[ t^2 \cdot W''(y) - y \cdot W(y) = g(y). \tag{48} \]
To do this, we will use a solution operator, $\tilde K_t$, for the associated homogeneous equation
\[ t^2 W_0'' - y \cdot W_0 = 0. \tag{49} \]
The function $W_0$ is a solution to (49) if and only if $A(u) = W_0(t^{\frac23} \cdot u)$ is a solution to the Airy equation
\[ A'' - u \cdot A = 0. \tag{50} \]
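The rescaling between (49) and (50) can be checked numerically with a short sketch, which is an illustration only and not part of the paper. It builds one solution of the Airy equation from its power series (the normalisation $A(0) = 1$, $A'(0) = 0$ is an assumption made for convenience, so this is a solution of (50) but not the standard function Ai), sets $W_0(y) = A(t^{-2/3} y)$, and evaluates the residual of (49) with a second-order central difference. The names `airy_like`, `scaled_solution` and `residual` are hypothetical.

```python
import math


def airy_like(u, terms=80):
    """Power-series solution of A'' = u * A with A(0) = 1, A'(0) = 0.

    The coefficients satisfy a_{n+3} = a_n / ((n + 3)(n + 2)), so only
    the powers u^0, u^3, u^6, ... appear.
    """
    total, term, n = 0.0, 1.0, 0
    for _ in range(terms):
        total += term
        term *= u ** 3 / ((n + 3) * (n + 2))
        n += 3
    return total


def scaled_solution(y, t):
    # W_0(y) = A(t^(-2/3) * y), the inverse of the substitution A(u) = W_0(t^(2/3) u)
    return airy_like(t ** (-2.0 / 3.0) * y)


def residual(y, t, h=1e-4):
    """Finite-difference value of t^2 * W_0''(y) - y * W_0(y)."""
    w_plus = scaled_solution(y + h, t)
    w_mid = scaled_solution(y, t)
    w_minus = scaled_solution(y - h, t)
    second = (w_plus - 2.0 * w_mid + w_minus) / (h * h)
    return t * t * second - y * w_mid
```

For moderate values of $y$ and $t$ the residual should vanish up to discretisation error, whereas the individual terms $t^2 W_0''$ and $y W_0$ are of order one.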

Using, for example, the method of variation of constants, one can construct an integral kernel $K$ for an 'inverse' of the operator $A(u) \mapsto A''(u) - u \cdot A(u)$ in terms of Airy functions. We give the construction of $K$ as well as its basic properties in Appendix A. By rescaling (or by direct construction) we obtain an integral kernel for the operator $A(x) \mapsto t^2 \cdot A''(x) - x \cdot A(x)$. To be precise, define
\[ \tilde K_t(y, z) = t^{-\frac43} \cdot K\left( t^{-\frac23} \cdot y,\ t^{-\frac23} \cdot z \right), \]
where $K$ is the integral kernel constructed in Appendix A.


Lemma 8.1. Let $-\infty < a \le b \le \infty$. For each locally integrable $g : [a, b] \to \mathbb{R}$ of at most polynomial growth, the function
\[ y \mapsto \int_a^b \tilde K_t(y, z) \cdot g(z)\, dz \tag{51} \]
is a solution to (48).

Proof. This follows from Lemma A.1 or directly from the variation of constants construction.

The following estimate is crucial to the proof of Proposition 9.1.

Lemma 8.2. Let $g : \mathbb{R} \to \mathbb{R}$ be continuous. For each $-\infty < a < 0 < b$, there exist constants $C$ and $t_0 > 0$ such that if $t < t_0$ and $W$ satisfies (48), then
\[ \int_a^0 W^2 \le C \cdot \left( \int_a^{\frac a2} W^2 + t^{-\frac53} \int_a^b g^2 \right), \tag{52} \]
and
\[ \int_0^{\frac b2} W^2 \le C \cdot \left( t^{\frac13} \int_a^{\frac a2} W^2 + \int_{\frac b2}^b W^2 + t^{-\frac53} \int_a^b g^2 \right). \tag{53} \]
The constants $C$ and $t_0$ can be chosen to depend continuously upon $a$ and $b$.

Proof. Define $W_0$ on $[a, b]$ by
\[ W_0(y) = W(y) - \int_a^b \tilde K_t(y, z) \cdot g(z)\, dz. \]
Using Lemma 8.1 and linearity, $W_0$ is a solution to (49). Using the Cauchy-Schwarz-Bunyakovsky inequality, we find that
\[ |W(y) - W_0(y)|^2 \le \left( \int_a^b \left| \tilde K_t(y, z) \right|^2 dz \right) \cdot \left( \int_a^b |g(z)|^2\, dz \right). \tag{54} \]
A change of variables gives
\[ \int_a^b \int_a^b \left| \tilde K_t(y, z) \right|^2 dy\, dz = t^{-\frac43} \int_{t^{-2/3} a}^{t^{-2/3} b} \int_{t^{-2/3} a}^{t^{-2/3} b} |K(u, v)|^2\, du\, dv. \tag{55} \]
By Lemma A.3 in Appendix A, the latter integral is less than $C_{\mathrm{Airy}} \cdot \sqrt{\delta} \cdot t^{-\frac13}$, where $C_{\mathrm{Airy}}$ is a universal constant and $\delta = \max\{|a|, b\}$. Therefore, by integrating (54) over an interval $I \subset [a, b]$ and substituting (55), we find that
\[ \|W - W_0\|_I^2 \le C_0 \cdot t^{-\frac53} \cdot \|g\|_{[a,b]}^2, \tag{56} \]
where $C_0 = C_{\mathrm{Airy}} \cdot \sqrt{\delta}$ and $\|\cdot\|_J$ denotes the $L^2$-norm over the interval $J$. In particular, by the triangle inequality we have
\[ \|W\|_I \le \|W_0\|_I + C_0^{\frac12} \cdot t^{-\frac56} \cdot \|g\|_{[a,b]}, \]


and hence
\[ \|W\|_I^2 \le 2 \cdot \|W_0\|_I^2 + 2 \cdot C_0 \cdot t^{-\frac53} \cdot \|g\|_{[a,b]}^2. \tag{57} \]
Similarly,
\[ \|W_0\|_I^2 \le 2 \cdot \|W\|_I^2 + 2 \cdot C_0 \cdot t^{-\frac53} \cdot \|g\|_{[a,b]}^2. \tag{58} \]
The function $u \mapsto W_0(t^{\frac23} \cdot u)$ satisfies the Airy equation (131). Hence, it follows from Lemma A.4 (in which $s$ is replaced by $t^{-\frac23}$) that there exist constants $M$ and $t_0 > 0$, depending continuously on $a$ and $b$, such that if $t \le t_0$, then
\[ \int_a^0 W_0^2\, dy \le M \int_a^{\frac a2} W_0^2\, dy \tag{59} \]
and
\[ \int_0^{\frac b2} W_0^2\, dy \le M \left( t^{\frac13} \int_a^{\frac a2} W_0^2\, dy + \int_{\frac b2}^b W_0^2\, dy \right). \tag{60} \]
By combining (59) with (57) and (58), we obtain (52). By combining (60) with (57) and (58), we obtain (53).

9. A Non-Concentration Estimate

Fix $\mu$ and $\sigma$ and let $a_t^\mu$ be the family of quadratic forms defined as in §5. The purpose of this section is to prove the following non-concentration estimate (see Remark 9.3) that is crucial to our proof of generic spectral simplicity.

Proposition 9.1. Let $K$ be a compact subset of $(\mu \cdot \sigma(0)^{-1}, \infty)$, and $C > 0$. There exist constants $t_0 > 0$ and $\kappa > 0$ such that if $E \in K$, if $t < t_0$, and if for each $v \in \mathrm{dom}(a_t^\mu)$, the function $w$ satisfies
\[ \left| a_t^\mu(w, v) - E \cdot \langle w, v\rangle_\sigma \right| \le C \cdot t \cdot \|w\|_\sigma \cdot \|v\|_\sigma, \tag{61} \]
then
\[ \int_0^\infty (E \cdot \sigma(x) - \mu) \cdot |w(x)|^2\, dx \ge \kappa \cdot \|w\|_\sigma^2. \tag{62} \]
The constants $t_0$ and $\kappa$ depend only upon $K$, $C$, $\mu$, and $\sigma$.

In contrast to previous estimates, Proposition 9.1 is concerned with so-called noncritical energies, those values of $E$ that are strictly greater than the threshold $\mu/\sigma(0)$.

Remark 9.2. Estimate (61) is a special case of an estimate of the following form: For all $v \in \mathrm{dom}(a_t^\mu)$,
\[ \left| a_t^\mu(w, v) - E_t \cdot \langle w, v\rangle_\sigma \right| \le t^\rho \cdot \|w\|_\sigma \cdot \|v\|_\sigma. \tag{63} \]
By the Riesz representation theorem, estimate (63) is equivalent to Eq. (26) with $r$ such that $\|r\| \le t^\rho \cdot \|w\|$. In other words, a sequence $w_n$ satisfying (63) is what we have called a quasimode of order $\rho$ at energy $E_0$.


Remark 9.3. Suppose that $w_n$ is a sequence of eigenfunctions of $a_{t_n}^\mu$ with $t_n$ tending to zero as $n$ tends to infinity. Then, by Lemma 6.3, each $w_n$ decays exponentially in the region $\{x \mid E \cdot \sigma(x) - \mu < 0\}$ and the rate of decay increases as $n$ increases. In particular, we can use Lemma 6.4 to prove that the measure $|w_n(x)|^2\, dx$ concentrates in the 'classically allowed region' $\{x \mid E \cdot \sigma(x) - \mu \ge 0\}$. Proposition 9.1 is a twofold strengthening of this latter statement: We prove that if $E$ is not critical then $|w_n(x)|^2\, dx$ does not concentrate solely on $\{x \mid E \cdot \sigma(x) - \mu = 0\}$, and we prove that this also holds true for a quasimode of order 1. Estimate (62) for eigenfunctions could be obtained using a contradiction argument which is standard in the study of semiclassical measures. (See [Hillairet10] for closely related topics.) However, we believe that this method fails for first order quasimodes.

Proof of Proposition 9.1. Let $E > \mu/\sigma(0)$. Then $f_E(0) < 0$ and since $f_E = \mu - E \cdot \sigma$ is strictly increasing with $\lim_{x\to\infty} f_E(x) = \mu$, there exists a unique $x_E > 0$ such that $f_E(x_E) = 0$. Since $f_E$ changes sign at $x_E$, we have
\[ \int_0^\infty (-f_E) \cdot w^2\, dx = \int_0^{x_E} |f_E| \cdot w^2\, dx - \int_{x_E}^\infty |f_E| \cdot w^2\, dx. \tag{64} \]
Thus, by Lemmas 9.5 and 9.4 below, there exist constants $C^+, c^- > 0$ and $t^* > 0$ such that if $t < t^*$, then
\[ \int_0^\infty (-f_E) \cdot w^2\, dx \ge c^- \int_0^\infty w^2\, dx - C^+ \cdot t^{\frac13} \int_0^\infty w^2\, dx. \tag{65} \]
Thus, if $t < t_0 = (c^-/2C^+)^3$, then we have (62) with $\kappa = c^-/(2 \cdot \sigma(0))$.

Lemma 9.4. There exist constants $C^+$ and $t^+ > 0$ so that if $t < t^+$, then
\[ \int_{x_E}^\infty |f_E| \cdot w^2\, dx \le C^+ \cdot t^{\frac13} \int_0^{x_E} w^2\, dx. \tag{66} \]

Lemma 9.5. There exist constants $c^- > 0$ and $t^- > 0$ so that if $t < t^-$, then
\[ \int_0^{x_E} |f_E| \cdot w^2\, dx \ge c^- \int_0^\infty w^2\, dx. \tag{67} \]

The proofs of Lemmas 9.5 and 9.4 are based on estimates provided in Sects. 6, 7, and 8. In preparation for these proofs we provide the common context. First note that the Riesz representation theorem provides $r \in H_\sigma$ so that for all $v \in \mathrm{dom}(a_t)$, $a_t(w, v) - E \cdot \langle w, v\rangle_\sigma = \langle r, v\rangle_\sigma$, where $\|r\|_\sigma \le C_0 \cdot t \cdot \|w\|_\sigma$.


Let $W$ denote the Langer-Cherry transform of $w$ at energy $E$ (see §7). In particular, $W = \left( (\phi_E')^{\frac12} \cdot w \right) \circ \phi_E^{-1}$, where $\phi_E$ is defined by (39). By Proposition 7.3, the function $W$ satisfies (48) with $g$ equal to the right-hand side of Eq. (44).

As a last preparation for the proofs, we define the endpoints of the intervals over which we will apply the estimates from the preceding sections. Let $x_E^+$ be defined by $f_E(x_E^+) = \mu/2$. In other words, $x_E^+ = x_E^{\mu/2}$, where $x_E^s$ is defined in (30). Define $y_E^+ = 2 \cdot \phi_E(x_E^+)$ and $y_E^- = \phi_E(0)$. Since $\sigma$ is decreasing, we have $0 < x_E < x_E^+$, and hence since $\phi_E$ is strictly increasing, we have $y_E^- < 0 < y_E^+$. It follows from Lemma 7.1 and Remark 6.1 that $y_E^+$ and $y_E^-$ depend smoothly on $E$.

Proof of Lemma 9.4. Since $\sigma$ is decreasing, we have $\sup\{|f_E| \mid x \ge x_E\} = \mu$, and thus
\[ \int_{x_E}^\infty |f_E| \cdot w^2\, dx \le \mu \int_{x_E}^\infty w^2\, dx. \tag{68} \]
Since $\phi_E([x_E, x_E^+]) = [0, y_E^+/2]$ and $W^2 \cdot dy = (\phi_E')^2 \cdot w^2 \cdot dx$, we have
\[ \int_{x_E}^\infty w^2\, dx \le C_1 \int_0^{\frac{y_E^+}{2}} W^2\, dy + \int_{x_E^+}^\infty w^2\, dx, \tag{69} \]
where $C_1 = \max\{(\phi_E'(x))^{-2} \mid E \in K,\ x \in [x_E, x_E^+]\}$. By Lemma 8.2 there exist constants $C_E$ and $t_E > 0$ so that if $t < t_E$, then
\[ \int_0^{\frac{y_E^+}{2}} W^2\, dy \le C_E \left( t^{\frac13} \int_{y_E^-}^{\frac{y_E^-}{2}} W^2\, dy + \int_{\frac{y_E^+}{2}}^{y_E^+} W^2\, dy + t^{-\frac53} \int_{y_E^-}^{y_E^+} g^2\, dy \right). \tag{70} \]
The constants $C_E$ and $t_E$ depend continuously on $E$ and hence $C_2 = \sup\{C_E \mid E \in K\}$ is finite and $t_2 = \inf\{t_E \mid E \in K\}$ is positive. Since $W^2 \cdot dy = (\phi_E')^2 \cdot w^2 \cdot dx$, we have
\[ t^{\frac13} \int_{y_E^-}^{\frac{y_E^-}{2}} W^2\, dy + \int_{\frac{y_E^+}{2}}^{y_E^+} W^2\, dy \le C_3 \cdot \left( t^{\frac13} \int_0^\infty w^2\, dx + \int_{x_E^+}^\infty w^2\, dx \right), \tag{71} \]
where $C_3 := \sup\{(\phi_E'(x))^2 \mid E \in K,\ x \in [0, \phi_E^{-1}(y_E^+)]\}$. By Lemma 7.5, there exists a constant $C^*$ so that
\[ \int_{y_E^-}^{y_E^+} g^2\, dy \le C^* \cdot t^2 \int_0^\infty w^2\, dx. \tag{72} \]


By substituting (71) and (72) into (70) we find that if $t < t_2$, then
\[
\int_0^{y_E^+/2} W^2 \, dy \;\le\; C_4 \cdot t^{1/3} \int_0^{\infty} w^2 \, dx + C_5 \int_{x_E^+}^{\infty} w^2 \, dx, \tag{73}
\]
where $C_4 = C_2 \cdot (C_3 + C^*)$ and $C_5 = C_2 \cdot C_3$. By Lemma 6.2, there exists a constant $C_6$ so that if $t < 1$, then
\[
\int_{x_E^+}^{\infty} w^2 \, dx \;\le\; C_6 \cdot t^{1/3} \int_0^{\infty} w^2 \, dx. \tag{74}
\]
By combining (69), (73), and (74), we find that if $t < t_3 := \min\{1, t_2\}$, then
\[
\int_{x_E}^{\infty} w^2 \, dx \;\le\; C_7 \cdot t^{1/3} \int_0^{\infty} w^2 \, dx, \tag{75}
\]
where $C_7 = C_1 \cdot C_4 + C_1 \cdot C_5 \cdot C_6 + C_6$. Finally, split the integral on the right-hand side of (75) into the integral over $[0, x_E]$ and the integral over $[x_E, \infty)$. Then subtract the latter integral from both sides of (75). It follows that if $t < \min\{t_3, (2C_7)^{-3}\}$, then
\[
\frac{1}{2} \int_{x_E}^{\infty} w^2 \, dx \;\le\; C_7 \cdot t^{1/3} \int_0^{x_E} w^2 \, dx.
\]
The claim then follows by combining this with (68). We have the following corollary of the proof.

Corollary 9.6. There exist constants $C'$ and $t' > 0$ such that if $t < t'$, then
\[
\int_{x_E}^{\infty} w^2 \, dx \;\le\; C' \cdot t^{1/3} \int_0^{x_E} w^2 \, dx.
\]
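The absorption step that closes the proof of Lemma 9.4 can be spelled out as follows. This is only a sketch of the arithmetic; the shorthand $A$ and $B$ is introduced here for illustration and is not the paper's notation, and the fragment assumes the `amsmath` package.

```latex
% Shorthand (not from the source):
%   A = \int_0^{x_E} w^2 dx,   B = \int_{x_E}^\infty w^2 dx.
\begin{align*}
B &\le C_7\, t^{1/3} (A + B)
  && \text{by (75)} \\
(1 - C_7\, t^{1/3})\, B &\le C_7\, t^{1/3} A
  && \text{subtract } C_7 t^{1/3} B \\
\tfrac12 B &\le C_7\, t^{1/3} A
  && \text{if } t < (2C_7)^{-3},\ \text{so } C_7 t^{1/3} < \tfrac12 \\
\int_{x_E}^{\infty} |f_E|\, w^2\, dx &\le \mu B \le 2\mu\, C_7\, t^{1/3} A
  && \text{by (68)}
\end{align*}
```

Under these assumptions one admissible choice in (66) would be $C^+ = 2\mu C_7$ and $t^+ = \min\{t_3, (2C_7)^{-3}\}$, and the same computation gives the corollary with $C' = 2C_7$.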

Proof of Lemma 9.5. Since $W^2 \, dy = (\phi_E')^2 \cdot w^2 \, dx$ we have
\[
\int_0^{x_E} |f_E| \cdot w^2 \, dx \;\ge\; c_1 \int_{y_E^-}^{0} \left| f_E \circ \phi_E^{-1} \right| \cdot W^2 \, dy,
\]
where $c_1 = \inf\{\phi_E'(x)^{-2} \mid E \in K,\ x \in [0, x_E]\}$. Since $f_E \circ \phi_E^{-1}$ is negative and increasing on $[y_E^-, y_E^-/2]$, we have
\[
\int_{y_E^-}^{y_E^-/2} \left| f_E \circ \phi_E^{-1} \right| \cdot W^2 \, dy \;\ge\; c_2 \int_{y_E^-}^{y_E^-/2} W^2 \, dy,
\]
where $c_2 = \inf\{ |f_E \circ \phi_E^{-1}(y_E^-/2)| \mid E \in K \}$. Putting these two estimates together we have
\[
\int_0^{x_E} |f_E| \cdot w^2 \, dx \;\ge\; c_1 \cdot c_2 \int_{y_E^-}^{y_E^-/2} W^2 \, dy. \tag{76}
\]
It follows from Lemma 7.1 that $c_1$ and $c_2$ are both positive.


By Lemma 8.2, there exist constants $C_E$ and $t_E > 0$ so that if $t < t_E$, then
\[
\int_{y_E^-}^{0} W^2 \;\le\; C_E \cdot \left( \int_{y_E^-}^{y_E^-/2} W^2 + t^{-5/3} \int_{y_E^-}^{y_E^+} g^2 \right). \tag{77}
\]
Moreover, $C_E$ and $t_E$ depend continuously on $E$, and hence the constants $c_3 = \sup\{1/C_E \mid E \in K\}$ and $t_1 = \inf\{t_E \mid E \in K\}$ are both positive. By manipulating (77) we find that
\[
\int_{y_E^-}^{y_E^-/2} W^2 \;\ge\; c_3 \int_{y_E^-}^{0} W^2 - t^{-5/3} \int_{y_E^-}^{y_E^+} g^2 \tag{78}
\]

for each $t < t_1$. By combining (76), (78), and (72) we find that for $t < t_1$,
\[
\int_0^{x_E} |f_E| \cdot w^2 \, dx \;\ge\; c_4 \int_{y_E^-}^{0} W^2 \, dy - C \cdot t^{1/3} \int_0^{\infty} w^2 \, dx, \tag{79}
\]
where $c_4 = c_1 \cdot c_2 \cdot c_3$ and $C = c_1 \cdot c_2 \cdot C^*$. Since $W^2 \, dy = (\phi_E')^2 \cdot w^2 \, dx$, we have
\[
\int_{y_E^-}^{0} W^2 \, dy \;\ge\; c_5 \int_0^{x_E} w^2 \, dx, \tag{80}
\]
where $c_5 = \inf\{(\phi_E'(x))^2 \mid E \in K,\ x \in [0, x_E]\}$ is positive by Lemma 7.1. By substituting (80) into (79) and applying Corollary 9.6, we find that if $t < t_2 = \min\{t_1, t'\}$, then
\[
\int_0^{x_E} |f_E| \cdot w^2 \, dx \;\ge\; c_4 \cdot c_5 \int_0^{\infty} w^2 \, dx - \left( C + c_4 \cdot c_5 \cdot C' \right) \cdot t^{1/3} \int_0^{\infty} w^2 \, dx. \tag{81}
\]
If $t < t^- = \min\left\{ t_2, \left( c_4 \cdot c_5 / 2(C + c_4 \cdot c_5 \cdot C') \right)^3 \right\}$, then (67) holds with $c^- = c_4 \cdot c_5 / 2$.
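The passage from (79) and (80) to (81), and the role of the threshold $t^-$, can be sketched as follows. Here $C'$ denotes the constant of Corollary 9.6 and $C = c_1 c_2 C^*$ the constant appearing in (79); the shorthand $A$ and $T$ is ours, introduced only for this illustration (fragment assumes `amsmath`).

```latex
% Shorthand (not from the source):
%   A = \int_0^{x_E} w^2 dx,   T = \int_0^\infty w^2 dx,   so  T - A = \int_{x_E}^\infty w^2 dx.
\begin{align*}
A &= T - \int_{x_E}^{\infty} w^2\, dx \;\ge\; T - C' t^{1/3} A \;\ge\; \bigl(1 - C' t^{1/3}\bigr)\, T
  && \text{Corollary 9.6 and } A \le T \\
\int_0^{x_E} |f_E|\, w^2\, dx &\ge c_4 c_5\, A - C\, t^{1/3}\, T
  && \text{(79) and (80)} \\
&\ge c_4 c_5\, T - \bigl(C + c_4 c_5 C'\bigr)\, t^{1/3}\, T
  && \text{this is (81)} \\
&\ge \tfrac12\, c_4 c_5\, T
  && \text{if } \bigl(C + c_4 c_5 C'\bigr)\, t^{1/3} \le \tfrac12 c_4 c_5
\end{align*}
```

The last condition is the cube-root form of the bound on $t$ in the definition of $t^-$, which is why (67) then holds with $c^- = c_4 c_5 / 2$.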

10. Convergence, Estimation, and Separation of Eigenvalues

Let $a_t^\mu$ be the family of quadratic forms defined as in §5. In this section, we will evaluate the limit to which each real-analytic eigenvalue branch converges (Proposition 10.1), estimate the asymptotic behavior of eigenvalues (Proposition 10.3), and show that if both $t$ and $E - \mu/\sigma(0)$ are sufficiently small, then eigenvalues near energy $E$ must be 'super-separated' at order $t$ (Theorem 10.4).

10.1. Convergence. Let $t \mapsto \lambda_t$ be a real-analytic eigenvalue branch of $a_t^\mu$ with respect to $\langle \cdot, \cdot \rangle_\sigma$. Since $|w'|^2 \ge 0$ and $\sigma$ is decreasing, we have
\[
\lambda_t \;\ge\; \frac{\mu \int_0^{\infty} |w_t|^2 \, dx}{\int_0^{\infty} |w_t|^2 \cdot \sigma \, dx} \;\ge\; \frac{\mu}{\sigma(0)}. \tag{82}
\]
The first derivative of $a_t^\mu$,
\[
\dot{a}_t^\mu(u) = 2t \int_0^{\infty} |u'(x)|^2 \, dx,
\]
is nonnegative, and hence by Proposition 3.3, the eigenbranch $\lambda_t$ converges as $t$ tends to zero.

Proposition 10.1. We have
\[
\lim_{t \to 0} \lambda_t = \frac{\mu}{\sigma(0)}.
\]

Proof. Let $w_t$ be an eigenfunction branch associated to $\lambda_t$. The variational formula (13) becomes
\[
\dot{\lambda}_t \cdot \|w_t\|_\sigma^2 = 2t \int_0^{\infty} |w_t'(x)|^2 \, dx. \tag{83}
\]
Using the eigenvalue equation for $a_t^\mu$ with respect to $\langle \cdot, \cdot \rangle_\sigma$ we find that
\[
t^2 \int_0^{\infty} |w_t'(x)|^2 \, dx = \int_0^{\infty} (\lambda_t \cdot \sigma(x) - \mu) \cdot |w_t(x)|^2 \, dx.
\]
By combining this with (83) and (82) we find that
\[
\dot{\lambda}_t \cdot \|w_t\|_\sigma^2 \;\ge\; \frac{2}{t} \cdot \int_0^{\infty} (\lambda_t \cdot \sigma(x) - \mu) \cdot |w_t(x)|^2 \, dx. \tag{84}
\]
Suppose to the contrary that $\lambda_0 := \lim_{t \to 0} \lambda_t \ne \mu/\sigma(0)$. Then by (82), we have $\lambda_0 > \mu/\sigma(0)$. Let $K$ be the compact interval $[\lambda_0, \lambda_0 + 1]$. Then for all $t$ sufficiently small, $\lambda_t \in K$. Hence we can apply Proposition 9.1, with $E = \lambda_t$, and obtain a constant $\kappa > 0$ such that
\[
\int_0^{\infty} (\lambda_t \cdot \sigma(x) - \mu) \cdot |w_t(x)|^2 \, dx \;\ge\; \kappa \cdot \|w_t\|_\sigma^2.
\]
By combining this with (84) we find that
\[
\frac{d}{dt} \lambda_t \;\ge\; \frac{2 \cdot \kappa}{t}.
\]
The left-hand side is integrable on an interval of the form $[0, t_0)$, but the right-hand side is not integrable on such an interval. The claim follows.
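The integrability contradiction at the end of this proof can be made explicit. The following is a sketch, with $s$ an auxiliary lower limit (our notation) and $t_0$ taken small enough that the differential inequality holds on $(0, t_0]$.

```latex
% Integrate d\lambda_t/dt >= 2\kappa/t over [s, t_0], 0 < s < t_0.
\[
\lambda_{t_0} - \lambda_s
  \;=\; \int_s^{t_0} \frac{d\lambda_t}{dt}\, dt
  \;\ge\; 2\kappa \int_s^{t_0} \frac{dt}{t}
  \;=\; 2\kappa \log\frac{t_0}{s}
  \;\longrightarrow\; \infty
  \qquad (s \to 0^+).
\]
% On the other hand, (82) gives \lambda_s >= \mu/\sigma(0), hence
\[
\lambda_{t_0} - \lambda_s \;\le\; \lambda_{t_0} - \frac{\mu}{\sigma(0)} \;<\; \infty
  \qquad \text{for all } s \in (0, t_0),
\]
% which is incompatible with the logarithmic divergence above.
```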


10.2. Airy eigenvalues. The remainder of this section concerns quantitative estimates on the eigenvalues of $a_t^\mu$ for $t$ small. In particular, we will use the Langer-Cherry transform to compare the eigenvalues of $a_t^\mu$ to the eigenvalues of the operator associated to the Airy equation. We first define and study the eigenvalue problem for the model operator.

For each $z \in \mathbb{R}$ and $u \in C_0^\infty([z, \infty))$ define
\[
A_z(u)(y) = -u''(y) + y \cdot u(y).
\]
The operator $A_z$ is symmetric with respect to the $L^2([z, \infty), dy)$ inner product, and we have $\langle A_z(u), u \rangle \ge z \cdot \|u\|^2$. Thus, by the method of Friedrichs, we may extend $A_z$ to a densely defined, self-adjoint operator on $L^2([z, \infty), dy)$ with either Dirichlet or Neumann conditions at $y = z$. Let $A_\pm$ be the solutions to the Airy equation defined in Appendix A.

Proposition 10.2. The real number $\nu$ is a Dirichlet (resp. Neumann) eigenvalue of $A_z$ with respect to the $L^2$-norm if and only if $z - \nu$ is a zero of $A_-$ (resp. $A_-'$). Moreover, each eigenspace of $A_z$ is 1-dimensional and each eigenvalue of $A_z$ is strictly greater than $z$.

Proof. If $\psi$ is an eigenfunction with eigenvalue $\nu$, then $x \mapsto \psi(x + \nu)$ is a solution to the Airy equation that decays as $x$ tends to infinity. Sturm-Liouville theory ensures that the associated eigenspaces are one-dimensional.

10.3. Estimation.

Proposition 10.3. There exist $\delta_0$ and $C$ such that for any $t \le \delta_0$, if $\lambda \in \left[ \frac{\mu}{\sigma(0)}, \frac{\mu}{\sigma(0)} + \delta_0 \right]$ is a Dirichlet (resp. Neumann) eigenvalue of $a_t^\mu$, then there exists a zero, $z$, of $A_-$ (resp. $A_-'$) such that
\[
\left| \phi_\lambda(0) - t^{2/3} \cdot z \right| \le C \cdot t^2. \tag{85}
\]

Proof. We set $\epsilon = \frac{1}{2} \min\left\{ \rho_E^{-4}(0) \mid E \in \left[ \frac{\mu}{\sigma(0)}, \frac{\mu}{\sigma(0)} + 1 \right] \right\}$, and we choose $\delta_0$ to be the minimum of 1 and the $\delta$ provided by Lemma 7.7 that is associated with this $\epsilon$ and $q$ identically 1. Let $K$ be the compact interval $\left[ \frac{\mu}{\sigma(0)}, \frac{\mu}{\sigma(0)} + \delta_0 \right]$. Let $w$ be an eigenfunction with eigenvalue $\lambda \in K$ and $t \le \delta_0$. Let $W$ be the Langer-Cherry transform of $w$ at energy $\lambda$. According to Lemma 7.7 and to the choice we made of $\epsilon$, we have
\[
\frac{1}{2\rho_\lambda(0)^4} \int_0^{\infty} w^2 \, dx \;\le\; \int_{\phi_\lambda(0)}^{\infty} W^2 \, dy \;\le\; \frac{3}{2\rho_\lambda(0)^4} \int_0^{\infty} w^2 \, dx.
\]
Combining with Lemma 7.6 (and using that $\rho_E(0)$ is uniformly bounded away from 0 over the compact $K$), there exists a constant $C$ such that
\[
\int_{\phi_\lambda(0)}^{\infty} |t^2 \cdot W'' - y \cdot W|^2 \, dy \;\le\; C \cdot t^4 \int_{\phi_\lambda(0)}^{\infty} |W(y)|^2 \, dy. \tag{86}
\]
Setting $U(x) = W(t^{2/3} \cdot x)$, we have
\[
\int_{t^{-2/3} \cdot \phi_\lambda(0)}^{\infty} |U'' - x \cdot U|^2 \, dx \;\le\; C \cdot t^{8/3} \int_{t^{-2/3} \cdot \phi_\lambda(0)}^{\infty} |U(x)|^2 \, dx. \tag{87}
\]

Let $z_t = t^{-2/3} \cdot \phi_\lambda(0)$. Then $U(z_t) = 0$ (resp. $U'(z_t) = 0$) if $\lambda$ is a Dirichlet (resp. Neumann) eigenvalue. In particular, $U$ belongs to the domain of $A_{z_t}$. Moreover, from (87) we have that
\[
\|A_{z_t}(U)\|^2 \le C \cdot t^{8/3} \cdot \|U\|^2. \tag{88}
\]
Thus, since $A_{z_t}$ is self-adjoint,
\[
\left\langle A_{z_t}^2(U), U \right\rangle \le C \cdot t^{8/3} \cdot \|U\|^2. \tag{89}
\]
Thus, by the minimax principle, $A_{z_t}^2$ has an eigenvalue in the interval $[0, C t^{8/3}]$. Hence $A_{z_t}$ has an eigenvalue in the interval $[-C^{1/2} t^{4/3}, C^{1/2} t^{4/3}]$, and the claim follows from Proposition 10.2.
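The rescaling that turns (86) into (87) is a short change of variables. The sketch below uses only the substitution $U(x) = W(t^{2/3} x)$ from the proof, with $y = t^{2/3} x$ and $z_t = t^{-2/3} \phi_\lambda(0)$ (fragment assumes `amsmath`).

```latex
% Chain rule: U''(x) = t^{4/3} W''(y) with y = t^{2/3} x, and dx = t^{-2/3} dy.
\begin{align*}
U''(x) - x\, U(x)
  &= t^{4/3} W''(y) - t^{-2/3}\, y\, W(y)
   = t^{-2/3} \bigl( t^2 W''(y) - y\, W(y) \bigr), \\
\int_{z_t}^{\infty} |U'' - x\, U|^2\, dx
  &= t^{-4/3} \cdot t^{-2/3} \int_{\phi_\lambda(0)}^{\infty} |t^2 W'' - y\, W|^2\, dy \\
  &\le t^{-2} \cdot C\, t^{4} \int_{\phi_\lambda(0)}^{\infty} |W(y)|^2\, dy
   && \text{by (86)} \\
  &= C\, t^{2} \cdot t^{2/3} \int_{z_t}^{\infty} |U(x)|^2\, dx
   = C\, t^{8/3} \int_{z_t}^{\infty} |U(x)|^2\, dx .
\end{align*}
```

Taking square roots of the eigenvalue bound for $A_{z_t}^2$ then gives the window of half-width $C^{1/2} t^{4/3}$, and multiplying $|z_t - z| \le C^{1/2} t^{4/3}$ by $t^{2/3}$ yields the order $t^2$ in (85).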

10.4. Separation. We next show that, as $t$ tends to zero, the eigenvalues of $a_t^\mu$ with respect to $\langle \cdot, \cdot \rangle_\sigma$ are separated at order greater than $t$. More precisely, we have the following.

Theorem 10.4. Let $t_1, t_2, t_3, \ldots$ be a sequence of positive real numbers such that $\lim_{n \to \infty} t_n = 0$. For each $n \in \mathbb{Z}^+$, let $\lambda_n^+$ and $\lambda_n^-$ be distinct eigenvalues of the quadratic form $a_{t_n}^\mu$. If $\lim_{n \to \infty} \lambda_n^\pm = \mu/\sigma(0)$, then
\[
\lim_{n \to \infty} \frac{1}{t_n} \cdot \left| \lambda_n^+ - \lambda_n^- \right| = \infty.
\]

This fact may be understood by using the following semiclassical heuristics: The threshold $\mu/\sigma(0)$ is the bottom of the potential, and the eigenvalues near it are driven by the shape of this minimum. Since $\sigma'(0) \ne 0$, the asymptotics are given by the eigenvalues of the model problem $P_t u = -t^2 \cdot u'' + x \cdot u = 0$ on $(0, \infty)$. Denote by $e_n(t)$ the $n$th eigenvalue of the model operator. Using homogeneity, $e_n(t)$ behaves like $e_n(1) \cdot t^{2/3}$ (and $e_n(1)$ actually is some zero of the Airy function, see Proposition 10.2). For fixed $n$, the separation between two eigenvalues is thus of order $t^{2/3}$. It would be relatively straightforward to make the preceding reasoning rigorous in the case of a finite number of real-analytic eigenvalue branches. (For instance we could use [FrdSlm09].) Unfortunately, this is not enough for our purposes. In Sect. 14 we will need the result for a sequence of eigenvalues that may belong to an infinite number of distinct branches.

Remark 10.5. The same semiclassical heuristics show that this super-separation does not hold near an energy strictly greater than $\mu/\sigma(0)$. Indeed, near a non-critical energy, the spectrum is separated at order $t$.

Proof of Theorem 10.4. Suppose to the contrary that there exists a subsequence (that we will abusively call $t_n$) such that $|\lambda_n^- - \lambda_n^+|/t_n$ is bounded. Let $w_n^\pm$ denote a sequence of eigenfunctions associated to $\lambda_n^\pm$ with $\|w_n^\pm\|_\sigma = 1$. Since $\lambda_n^- \ne \lambda_n^+$, we have
\[
\langle w_n^-, w_n^+ \rangle_\sigma = 0.
\]
Let $W_n^\pm$ denote the Langer-Cherry transform of $w_n^\pm$ at the energy $E_n = \sup\{\lambda_n^-, \lambda_n^+\}$. By hypothesis $\lim_{n \to \infty} E_n = \mu/\sigma(0)$. By Lemma 7.6 and Lemma 7.7, we find that there exist $N_1$ and $C$ such that if $n > N_1$, then
\[
\left\| \left( -t_n^2 \cdot \partial_y^2 + y \right) W_n^\pm \right\|^2 \le C \cdot t_n^2 \cdot \left\| W_n^\pm \right\|^2. \tag{90}
\]


Since $\langle w_n^-, w_n^+ \rangle_\sigma = 0$ and $\|w_n^\pm\|_\sigma = 1$, it follows from Lemma 7.7 that there exists $N_2 > N_1$ such that if $n > N_2$, then
\[
\left| \langle W_n^-, W_n^+ \rangle \right| \le \frac{1}{2} \cdot \|W_n^-\| \cdot \|W_n^+\|.
\]
This implies that for any linear combination of $W_n^+$ and $W_n^-$ we have
\[
|\alpha_+|^2 \|W_n^+\|^2 + |\alpha_-|^2 \|W_n^-\|^2 \le 2 \|\alpha_+ W_n^+ + \alpha_- W_n^-\|^2.
\]
Therefore, it follows from (90) that if $W$ belongs to the span, $\mathcal{W}_n$, of $\{W_n^-, W_n^+\}$, then
\[
\left\| \left( -t_n^2 \cdot \partial_y^2 + y \right) W \right\|^2 \le 4 \cdot C \cdot t_n^2 \cdot \|W\|^2.
\]
Let $U(x) = W(t_n^{2/3} \cdot x)$ and let $\mathcal{U}_n$ denote the vector space corresponding to $\mathcal{W}_n$. If $U \in \mathcal{U}_n$, then
\[
\left\| \left( \partial_x^2 - x \right) U \right\|^2 \le 4 \cdot C \cdot t_n^{2/3} \cdot \|U\|^2. \tag{91}
\]
Since $w_n^\pm$ satisfies the boundary condition at 0, the Langer-Cherry transform $W_n^\pm$ at energy $E_n$ satisfies the boundary condition at $\phi_{E_n}(0)$. It follows that $\mathcal{U}_n \subset \mathrm{dom}(A_{z_n})$, where $z_n = t_n^{-2/3} \cdot \phi_{E_n}(0)$. By (91) we have
\[
\left\langle A_{z_n}^2(U), U \right\rangle \le 4 \cdot C \cdot t_n^{2/3} \cdot \|U\|^2
\]
for each $U \in \mathcal{U}_n$. Hence, by the minimax principle, $A_{z_n}^2$ has at least two independent eigenvectors with eigenvalues in the interval $[0, 4C \cdot t_n^{2/3}]$. Thus, $A_{z_n}$ has at least two independent eigenvectors with eigenvalues in the interval $[-2\sqrt{C} \cdot t_n^{1/3}, 2\sqrt{C} \cdot t_n^{1/3}]$. By Proposition 10.2, the eigenvalues of $A_{z_n}$ are simple, and hence $A_{z_n}$ has at least two distinct eigenvalues, $\nu_n^+ < \nu_n^-$, lying in $[-2\sqrt{C} \cdot t_n^{1/3}, 2\sqrt{C} \cdot t_n^{1/3}]$. By Proposition 10.2, the number $a_n^\pm = z_n - \nu_n^\pm$ is a zero of the function $A_-$. Note that
\[
|a_n^+ - a_n^-| \le 4\sqrt{C} \cdot t_n^{1/3}. \tag{92}
\]

Since $A_-$ is real-analytic and $A_-(x) \ne 0$ for $x$ nonnegative, the zeroes $Z$ of $A_-$ are a countable discrete subset of $(-\infty, 0)$. In particular, there is a unique bijection $\ell : Z \to \mathbb{Z}^+$ such that $a < a'$ implies $\ell(a) > \ell(a')$ and $\lim_{k \to \infty} \ell^{-1}(k) = -\infty$. From the asymptotics of $A_-$ (see Appendix A) one finds that there exists a constant $c > 0$ so that
\[
\lim_{k \to \infty} k^{-2/3} \cdot \ell^{-1}(k) = -c, \tag{93}
\]
\[
\lim_{k \to \infty} k^{1/3} \cdot \left( \ell^{-1}(k) - \ell^{-1}(k + 1) \right) = \frac{2}{3} \cdot c. \tag{94}
\]
Since $\lim_{n \to \infty} t_n = 0$, estimate (92) implies that $\lim_{n \to \infty} a_n^\pm = -\infty$, and hence $\lim_{n \to \infty} \ell(a_n^\pm) = \infty$. Therefore, since $a_n^+ \ne a_n^-$ for all $n$, we have from (94) that there exists $N$ such that if $n > N$ then
\[
\left| a_n^+ - a_n^- \right| \ge \frac{c}{2} \cdot \ell(a_n^+)^{-1/3}.
\]


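By combining this with (92) we find that
\[
\left( \ell(a_n) \cdot t_n \right)^{1/3} \ge \frac{c}{4\sqrt{C}}. \tag{95}
\]
But since $\lim_{n \to \infty} E_n = \mu/\sigma(0)$, we have $\lim_{n \to \infty} \phi_{E_n}(0) = 0$. Therefore, by Proposition 10.3 we have $\lim_{n \to \infty} t_n^{2/3} \cdot a_n^\pm = 0$. By (93) we have
\[
\lim_{n \to \infty} \frac{a_n}{\ell(a_n)^{2/3}} = -c.
\]
Thus, $\lim_{n \to \infty} t_n^{2/3} \cdot \ell(a_n)^{2/3} = 0$. This contradicts (95).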
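The two limits (93) and (94) can be checked against the classical large-$k$ expansion of the zeros of the Airy function. The sketch below rests on an assumption not verifiable from this excerpt alone, namely that $A_-$ (defined in Appendix A) is a scalar multiple of the standard decaying Airy function $\mathrm{Ai}$; it writes $\ell^{-1}(k)$ for the $k$th zero counted from the right (the symbol $\ell$ for the numbering bijection is our choice of notation).

```latex
% Assumption: A_- is proportional to Ai, whose k-th zero a_k = \ell^{-1}(k) satisfies
% the classical expansion (note (3\pi/8)(4k-1) = (3\pi/2)(k - 1/4)):
\[
\ell^{-1}(k)
  = -\Bigl( \tfrac{3\pi}{8}\,(4k-1) \Bigr)^{2/3} \bigl( 1 + O(k^{-2}) \bigr)
  = -c\,\bigl( k - \tfrac14 \bigr)^{2/3} \bigl( 1 + O(k^{-2}) \bigr),
  \qquad c = \Bigl( \tfrac{3\pi}{2} \Bigr)^{2/3}.
\]
% Consequence for (93):
\[
k^{-2/3}\, \ell^{-1}(k) \;\longrightarrow\; -c .
\]
% Consequence for (94), by the mean value theorem applied to s -> s^{2/3}:
\[
\ell^{-1}(k) - \ell^{-1}(k+1)
  = c \Bigl[ \bigl( k + \tfrac34 \bigr)^{2/3} - \bigl( k - \tfrac14 \bigr)^{2/3} \Bigr] + O(k^{-4/3})
  = \tfrac23\, c\, k^{-1/3} \bigl( 1 + o(1) \bigr).
\]
```

Note that (94) needs the remainder in the expansion to be smaller than $k^{-1/3}$; the leading behavior (93) alone would not control consecutive gaps.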

11. Separation of Variables in the Abstract

Recall that the first step in our method for proving generic simplicity consists of finding a family $a_t$ such that $q_t$ is asymptotic to $a_t$ and such that $a_t$ decomposes as a direct sum of '1-dimensional' quadratic forms $a_t^\mu$ of the type considered in the previous sections. In the present section we discuss the decomposition of $a_t$ into forms $a_t^\mu$. Although the content is very well-known, we include it here for the purpose of establishing notation and context.

Let $\langle \cdot, \cdot \rangle_\sigma$ be the inner product on $H_\sigma$ defined in §5. Let $H'$ be a real Hilbert space with inner product $(\cdot, \cdot)$. Consider the tensor product $\mathcal{H} := H_\sigma \otimes H'$ completed with respect to the inner product $\langle \cdot, \cdot \rangle$ determined by
\[
\langle u_1 \otimes \varphi_1, u_2 \otimes \varphi_2 \rangle := \langle u_1, u_2 \rangle_\sigma \cdot (\varphi_1, \varphi_2). \tag{96}
\]

Let $b$ be a positive, closed, densely defined quadratic form on $H'$. We will assume that the spectrum of $b$ with respect to $(\cdot, \cdot)$ is discrete and the eigenspaces are finite dimensional. For each $t > 0$ and $u \otimes \varphi \in C_0^\infty([0, \infty)) \otimes \mathrm{dom}(b)$, define
\[
a_t(u \otimes \varphi) = t^2 \cdot (\varphi, \varphi) \int_0^{\infty} |u'(x)|^2 \, dx + b(\varphi) \int_0^{\infty} |u(x)|^2 \, dx. \tag{97}
\]
Let $Y \subset C_0^\infty([0, \infty))$ be a subspace. The restriction of $a_t$ to $Y \otimes H'$ is a nonnegative real quadratic form. By Theorem 1.17 in Chap. VI of [Kato], this restriction has a unique minimal closed extension. In particular, let $\mathrm{dom}(a_t)$ be the collection of $u \in H_\sigma \otimes H'$ such that there exists a sequence $u_n \in Y \otimes \mathrm{dom}(b)$ such that $\lim_{n \to \infty} \|u_n - u\| = 0$ and $u_n$ is Cauchy in the norm $[u]_t := a_t(u) + \|u\|_{\mathcal{H}}$. For each $u \in \mathrm{dom}(a_t)$ define
\[
a_t(u) := \lim_{n \to \infty} a_t(u_n),
\]
where $u_n$ is a sequence as above. For $t, t' > 0$ the norms $[\cdot]_t$ and $[\cdot]_{t'}$ are equivalent, and hence $\mathrm{dom}(a_t)$ does not depend on $t$.

Remark 11.1. In applications, either $Y = C_0^\infty([0, \infty))$ or $Y$ consists of smooth functions whose support is compact and does not include zero. In the former case, eigenfunctions of $a_t$ will satisfy a Neumann condition at $x = 0$ and in the latter case they will satisfy a Dirichlet condition at $x = 0$.


Proposition 11.2. The family $t \mapsto a_t$ is a real-analytic family of type (a) in the sense of Kato.²

Proof. For each $t$ the form $a_t$ is closed with respect to $\langle \cdot, \cdot \rangle$, the domain $\mathrm{dom}(a_t)$ is constant in $t$, and for each $u \in \mathrm{dom}(a_t)$, the function $t \mapsto a_t(u)$ is analytic in $t$.

Example 11.3. Let $H'$ be the space of square integrable functions on a compact Lipschitz domain $U \subset \mathbb{R}^n$ with the usual inner product. Then $H_\sigma \otimes H'$ is isomorphic to the completion of $C_0^\infty((0, \infty) \times U)$ with respect to the inner product
\[
\langle f, g \rangle = \int_U \int_0^{\infty} f(x, y) \cdot g(x, y) \cdot \sigma(x) \, dx \, dy.
\]
Let $\tilde{b}$ be the quadratic form defined on $H^1(U)$ by
\[
\tilde{b}(\phi) = \int_U |\nabla \phi|^2 \, dx \, dy. \tag{98}
\]
We define $b$ to be the restriction of $\tilde{b}$ to any closed subset of $H^1(U)$ on which it defines a positive quadratic form. In this case the quadratic form $a_t$ is equivalent to the form
\[
a_t(u) = \int_{\mathbb{R}^+ \times U} \left( t^2 \cdot |\partial_x u|^2 + |\nabla_y u|^2 \right) dx \, dy. \tag{99}
\]

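For each $\mu > 0$ and $t > 0$, we define the quadratic form $a_t^\mu$ as in §5. The form $a_t^\mu$ is equivalent to the construction above with $H' = \mathbb{R}$ with its standard inner product and $b(s) = \mu \cdot s^2$. The norms $[\cdot]_{t,\mu}$ and $[\cdot]_{t',\mu'}$ that are used to extend $a_t^\mu$ and $a_{t'}^{\mu'}$ are equivalent. Hence $\mathrm{dom}(a_t^\mu)$ is independent of $t$ and $\mu$. We will denote this common domain by $D$.

Proposition 11.4. If $\phi$ is a $\mu$-eigenvector for $b$ with respect to $\langle \cdot, \cdot \rangle_{H'}$, and $v$ is a $\lambda$-eigenvector of $a_t^\mu$ with respect to $\langle \cdot, \cdot \rangle_\sigma$, then $v \otimes \phi$ is a $\lambda$-eigenvector of $a_t$ with respect to $\langle \cdot, \cdot \rangle_{\mathcal{H}}$. Conversely, if $u$ is a $\lambda$-eigenvector of $a_t$ with respect to $\langle \cdot, \cdot \rangle_{\mathcal{H}}$, then $u$ is a finite sum $\sum v_\mu \otimes \phi_\mu$, where $v_\mu$ is a $\lambda$-eigenvector of $a_t^\mu$ with respect to $\langle \cdot, \cdot \rangle_\sigma$ and $\phi_\mu$ is a $\mu$-eigenvector of $b$ with respect to $\langle \cdot, \cdot \rangle_{H'}$.

Proof. Straightforward.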
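For the first half of Proposition 11.4, the "straightforward" verification on simple tensors can be sketched as follows. The computation polarizes (97) and uses the description of $a_t^\mu$ as the same construction with the fiber $\mathbb{R}$ and $b(s) = \mu s^2$, so that $a_t^\mu(v, u) = t^2 \int_0^\infty v' u' \, dx + \mu \int_0^\infty v u \, dx$; the test vectors $u$ and $\psi$ are auxiliary names introduced here (fragment assumes `amsmath`).

```latex
% Assumptions: b(\phi, \psi) = \mu (\phi, \psi) for all \psi in dom(b)      (\phi is a \mu-eigenvector of b),
%              a^\mu_t(v, u) = \lambda <v, u>_\sigma for all u in D         (v is a \lambda-eigenvector of a^\mu_t).
\begin{align*}
a_t(v \otimes \phi,\; u \otimes \psi)
  &= t^2\, (\phi, \psi) \int_0^{\infty} v' u'\, dx + b(\phi, \psi) \int_0^{\infty} v\, u\, dx
   && \text{polarization of (97)} \\
  &= (\phi, \psi) \Bigl( t^2 \int_0^{\infty} v' u'\, dx + \mu \int_0^{\infty} v\, u\, dx \Bigr)
   = (\phi, \psi)\; a_t^\mu(v, u) \\
  &= \lambda\, (\phi, \psi)\, \langle v, u \rangle_\sigma
   = \lambda\, \langle v \otimes \phi,\; u \otimes \psi \rangle
   && \text{by (96)}
\end{align*}
```

Since finite sums of simple tensors $u \otimes \psi$ form a core for $a_t$ by construction, the identity extends to all test vectors in $\mathrm{dom}(a_t)$.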

Proposition 11.5. For each analytic eigenvalue branch $\lambda_t$ of $a_t$, there exists a unique $\mu \in \mathrm{spec}(b)$ such that $\lambda_t$ is an analytic eigenvalue branch of $a_t^\mu$. In particular, $\lambda_t$ decreases to $\frac{\mu}{\sigma(0)}$ as $t$ tends to 0.

Proof. Let $t_0 > 0$. For each $\mu \in \mathrm{spec}(b)$, consider the set $A_\mu$ of $t \in (0, t_0)$ such that $\lambda_t \in \mathrm{spec}(a_t^\mu)$. By Proposition 11.4, the union $\bigcup_\mu A_\mu$ equals $(0, t_0)$. Since $\mathrm{spec}(b)$ is countable, the Baire Category Theorem implies that there exists $\mu \in \mathrm{spec}(b)$ such that $A_\mu$ has nonempty interior $A_\mu^0$. For each real-analytic eigenvalue branch $\nu_t$ of $a_t^\mu$, let $B_\nu \subset A_\mu^0$ be the set of $t$ such that $\nu_t = \lambda_t$. Since there are only countably many eigenvalue branches, the Baire Category Theorem implies that there exists an eigenvalue branch $\nu_t$ of $a_t^\mu$ such that $B_\nu$ has nonempty interior $B_\nu^0$. Since $\lambda_t$ and $\nu_t$ are real-analytic functions that coincide on a nonempty open set, they agree for all $t$. The latter statement then follows from Proposition 10.1.

² See Chap. VII §4.2 in [Kato].


Corollary 11.6. If each eigenspace of $b$ is 1-dimensional, then for each $t$ belonging to the complement of a countable set, each eigenspace of $a_t$ with respect to $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ is 1-dimensional.

Proof. Use the assumption that $b$ is simple and the fact that the eigenbranches are analytic.

We end this section by establishing some notation that will be useful in the sections that follow. For each eigenvalue $\mu$ of $b$, let $V_\mu$ denote the associated eigenspace and let $P_\mu : H' \to V_\mu$ denote the associated orthogonal projection. Define $\Pi_\mu$ on $H_\sigma \otimes H'$ by
\[
\Pi_\mu(v \otimes w) = v \otimes P_\mu(w).
\]
If $M$ is a collection of eigenvalues $\mu$ of $b$, then we define $\Pi_M$ to be the orthogonal projection onto the direct sum of $\mu$-eigenspaces. That is,
\[
\Pi_M = \sum_{\mu \in M} \Pi_\mu.
\]
The subscript for $\Pi$ may represent either an eigenvalue or a set of eigenvalues.

Assumption 11.7. In what follows we assume that each eigenspace of $b$ with respect to $\langle \cdot, \cdot \rangle$ is 1-dimensional.

One convenient consequence of this assumption is that for each $w \in \mathcal{H}$, there exists $\tilde{w}_\mu \in H_\sigma$ and a unit norm eigenvector $\phi_\mu$ of $b$ such that
\[
\Pi_\mu(w) = \tilde{w}_\mu \otimes \phi_\mu. \tag{100}
\]
Indeed, for each $\mu \in \mathrm{spec}(b)$, let $\phi_\mu \in V_\mu$. Since $\dim(V_\mu) = 1$, each vector in $H_\sigma \otimes V_\mu$ is of the form $v \otimes \phi_\mu$. In particular, there exists $\tilde{w}_\mu$ so that (100) holds. Note that
\[
w = \sum_{\mu \in \mathrm{spec}(b)} \Pi_\mu(w) = \sum_{\mu \in \mathrm{spec}(b)} \tilde{w}_\mu \otimes \phi_\mu.
\]

12. Projection Estimates

In this section, $q_t$ will denote a family of quadratic forms densely defined on $\mathcal{H}$ that is asymptotic at first order³ to the family $a_t$ defined in the preceding section. Let $P_{a_t}^I$ be the orthogonal projection onto the direct sum of eigenspaces of $a_t$ associated to the eigenvalues of $a_t$ that belong to the interval $I$ (see §2). We will provide some basic estimates on
\[
w := P_{a_t}^I(u). \tag{101}
\]
We begin with the following quasimode type estimate. In the sequel $\phi_\mu$ will denote a unit norm eigenvector of $b$ with eigenvalue $\mu$. By Assumption 11.7, $\phi_\mu$ is unique up to sign.

³ See Definition 3.1.


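Lemma 12.1. Let $J \subset I$ be a proper closed subinterval of a compact interval $I$. There exist constants $C > 0$ and $t_0 > 0$ such that if $\mu \in \mathrm{spec}(b)$, $t < t_0$, $u$ is an eigenfunction of $q_t$ with eigenvalue $E \in J$, and $z \in D$, then the projection $w = P_{a_t}^I(u)$ satisfies
\[
\left| a_t\left( \Pi_\mu w, z \otimes \phi_\mu \right) - E \cdot \left\langle \Pi_\mu w, z \otimes \phi_\mu \right\rangle \right| \le C \cdot t \cdot \|z\|_\sigma \cdot \|w\|. \tag{102}
\]

Proof. Since $q_t$ and $a_t$ are asymptotic at first order, Lemma 2.5 applies. In particular, by letting $\delta = \mathrm{dist}(J, \partial I)$, $t_0 = \frac{1}{2}(1 + E/\delta)^{-1}$, and $C = (4/3) \cdot \sup(I)$, we have for $t < t_0$ and $v \in D \otimes \mathrm{dom}(b)$,
\[
|a_t(w, v) - E \cdot \langle w, v \rangle| \le C \cdot t \cdot \|v\| \cdot \|w\|. \tag{103}
\]
For each $\mu' \in \mathrm{spec}(b)$, there exists $\tilde{w}_{\mu'} \in D$ so that
\[
w = \sum_{\mu' \in \mathrm{spec}(b)} \tilde{w}_{\mu'} \otimes \phi_{\mu'} \tag{104}
\]
and $v = \tilde{v}_\mu \otimes \phi_\mu$. If $\mu' \ne \mu$, then $b(\phi_{\mu'}, \phi_\mu) = 0$ and $(\phi_{\mu'}, \phi_\mu) = 0$, and hence using (96) and (97) we find that
\[
a_t(\tilde{w}_{\mu'} \otimes \phi_{\mu'}, z \otimes \phi_\mu) - E \cdot \left\langle \tilde{w}_{\mu'} \otimes \phi_{\mu'}, z \otimes \phi_\mu \right\rangle = 0.
\]
Thus,
\[
a_t(w, v) - E \cdot \langle w, v \rangle = a_t\left( \Pi_\mu w, v \right) - E \cdot \left\langle \Pi_\mu w, v \right\rangle.
\]
The claim then follows from substituting this into (103).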

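Lemma 12.2. Let $J \subset I$ be a proper closed subinterval of a compact interval $I$. Let $\mu \in \mathrm{spec}(b)$ with $\mu < \sigma(0) \cdot \inf(I)$ and let $\epsilon > 0$. There exist constants $\kappa > 0$ and $t_0 > 0$ such that if $t < t_0$, $u$ is an eigenfunction of $q_t$ with eigenvalue $E \in J$, and $\|\Pi_\mu w\| \ge \epsilon \cdot \|w\|$, where $w = P_{a_t}^I(u)$, then we have
\[
\dot{a}_t\left( \Pi_\mu(w) \right) \ge \frac{\kappa}{t} \cdot \left\| \Pi_\mu(w) \right\|^2. \tag{105}
\]

Proof. We have $\Pi_\mu w = \tilde{w}_\mu \otimes \phi_\mu$ for some $\tilde{w}_\mu \in D$. Since, by assumption, $\|\phi_\mu\| = 1$, we have $\|\Pi_\mu(w)\| = \|\tilde{w}_\mu\|_\sigma$ and hence the assumption becomes $\|\tilde{w}_\mu\|_\sigma \ge \epsilon \cdot \|w\|$. Therefore, Lemma 12.1 gives
\[
\left| a_t^\mu\left( \tilde{w}_\mu, z \right) - E_t \cdot \left\langle \tilde{w}_\mu, z \right\rangle_\sigma \right| \le \frac{C}{\epsilon} \cdot t \cdot \|z\|_\sigma \cdot \|\tilde{w}_\mu\|_\sigma \tag{106}
\]
for all sufficiently small $t$. Since $\mu/\sigma(0) < \inf(I)$, the compact set $I$ is a subset of $(\mu/\sigma(0), \infty)$. Hence we may apply Proposition 9.1 to obtain $\kappa > 0$ and $t_1 > 0$ so that if $t < t_1$, then
\[
\int_0^{\infty} (E_t \cdot \sigma(x) - \mu) \cdot |\tilde{w}_\mu|^2 \, dx \ge \kappa \cdot \|\tilde{w}_\mu\|_\sigma^2. \tag{107}
\]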


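Inspection of (97) gives that
\[
\dot{a}_t(\tilde{w}_\mu \otimes \phi_\mu, \tilde{w}_\mu \otimes \phi_\mu) = 2t \cdot (\phi_\mu, \phi_\mu) \int_0^{\infty} \partial_x \tilde{w}_\mu \cdot \partial_x \tilde{w}_\mu. \tag{108}
\]
In particular
\[
\dot{a}\left( \tilde{w}_\mu \otimes \phi_\mu \right) = 2t \int_0^{\infty} \left| \partial_x \tilde{w}_\mu \right|^2 dx.
\]
Thus, by using the definition of $a_t^\mu$ and estimates (106) and (107) we find that
\[
\begin{aligned}
\dot{a}\left( \Pi_\mu(w) \right) &= \frac{2}{t} \left( a_t^\mu(\tilde{w}_\mu) - \mu \int_0^{\infty} |\tilde{w}_\mu|^2 \, dx \right) \\
&\ge \frac{2}{t} \int_0^{\infty} (E_t \cdot \sigma - \mu) \, |\tilde{w}_\mu|^2 \, dx - \frac{2C}{\epsilon} \cdot \|\tilde{w}_\mu\|_\sigma^2 \\
&\ge 2 \left( \frac{\kappa}{t} - \frac{C}{\epsilon} \right) \cdot \|\tilde{w}_\mu\|_\sigma^2.
\end{aligned}
\]
By choosing $t_0 = \min\{t_1, C/(\epsilon \cdot \kappa)\}$ we obtain the claim.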

Remark 12.3. In the preceding lemma the constants $t_0$ and $\kappa$ a priori depend on the chosen $\mu$. However, since there is only a finite number of eigenvalues of $b$ that satisfy $\mu \le \sigma(0) \inf I$, we can choose $t_0$ and $\kappa$ depending only on $I$ and not on the eigenvalue $\mu$.

It will be convenient to introduce the following notation. Given $\mu \in \mathrm{spec}(b)$, define
\[
\tilde{\mu} = \frac{\mu}{\sigma(0)},
\]
where $\sigma$ is as in §11. For each compact interval $I$, define
\[
\begin{aligned}
M_I &= \{\mu \in \mathrm{spec}(b) \mid \tilde{\mu} \in I\}, \\
M_I^- &= \{\mu \in \mathrm{spec}(b) \mid \tilde{\mu} < \inf I\}, \\
M_I^+ &= \{\mu \in \mathrm{spec}(b) \mid \tilde{\mu} > \sup I\}.
\end{aligned}
\]
The spectrum $\mathrm{spec}(b)$ equals the disjoint union of $M_I^-$, $M_I$, and $M_I^+$, and in particular, each $v \in \mathcal{H}$ can be orthogonally decomposed as
\[
v = \Pi_{M_I^-}(v) + \Pi_{M_I}(v) + \Pi_{M_I^+}(v).
\]
The following lemma is crucial to our proof of generic simplicity. The proof uses both Theorem 4.2 and, by way of Lemma 12.2, Proposition 9.1.

Lemma 12.4. Let $E_t$ be a real-analytic eigenvalue branch of $q_t$, and let $V_t$ be the associated family of eigenspaces. Let $t \mapsto u_t$ be a map from $(0, t_0]$ to $V_t$ that is continuous on the complement of a countable set. If $w_t = P_{a_t}^I(u_t)$, then
\[
\liminf_{t \to 0} \frac{\left\| \Pi_{M_I^-}(w_t) \right\|}{\|w_t\|} = 0. \tag{109}
\]
Here if $w_t = 0$, then we interpret the ratio to be equal to 1.


Proof. Suppose that (109) is false. We have the orthogonal decomposition
\[
\Pi_{M_I^-}(w_t) = \sum_{\mu \in M_I^-} \Pi_\mu(w_t),
\]
and hence there exist $\epsilon > 0$ and $t_0 > 0$ such that for each $t < t_0$ there exists $\mu_t \in M_I^-$ such that
\[
\left\| \Pi_{\mu_t}(w_t) \right\| \ge \epsilon \cdot \|w_t\|. \tag{110}
\]
Using the orthogonal decomposition of $w$ as in (104) we find that
\[
\dot{a}_t(w_t) = \sum_{\mu \in \mathrm{spec}(b)} \dot{a}_t\left( \Pi_\mu(w_t) \right).
\]
(See also (108).) In particular, since the quadratic form $\dot{a}_t$ is nonnegative, we have that $\dot{a}_t(w_t) \ge \dot{a}_t(\Pi_{\mu_t}(w_t))$. Thus we may apply Lemma 12.2 with $J = E((0, t_0])$ as well as (110) to find that
\[
\dot{a}_t(w_t) \ge \frac{\epsilon \cdot \kappa}{t} \cdot \|w_t\|^2
\]
for all $t$ sufficiently small with some $\kappa$ independent of $t$ (according to Remark 12.3). Thus, it follows from Theorem 4.2 that the function $1/t$ is integrable on an interval whose left endpoint is zero. This is absurd.

Lemma 12.5. Let $I$ be a compact interval. If $w$ belongs to the range of $P_{a_t}^I$, then $\Pi_{M_I^+}(w) = 0$. In particular,
\[
\|w\|^2 = \left\| \Pi_{M_I}(w) \right\|^2 + \left\| \Pi_{M_I^-}(w) \right\|^2.
\]

Proof. By definition, $w$ is a linear combination of eigenfunctions of $a_t$ whose eigenvalues belong to $I$. Hence by Proposition 11.4, we have
\[
w = \sum_{\mu \in \mathrm{spec}(b)} \; \sum_{\lambda \in I \cap \mathrm{spec}(a_t^\mu)} v_{\lambda,\mu} \otimes \phi_\mu,
\]
where $v_{\lambda,\mu}$ belongs to the $\lambda$-eigenspace of $a_t^\mu$ and $\phi_\mu$ belongs to the $\mu$-eigenspace of $b$. Hence
\[
\Pi_{M_I^+}(w) = \sum_{\mu \in M_I^+} \; \sum_{\lambda \in I \cap \mathrm{spec}(a_t^\mu)} v_{\lambda,\mu} \otimes \phi_\mu. \tag{111}
\]
According to Proposition 10.1, each eigenvalue $\lambda$ of $a_t^\mu$ satisfies $\lambda \ge \tilde{\mu}$. If $\mu \in M_I^+$, then $\tilde{\mu} \ge \sup(I)$. Hence each term in (111) vanishes.


13. The Limits of the Eigenvalue Branches of $q_t$

Proposition 3.4 implies that each real-analytic eigenvalue branch $E_t$ of $q_t$ converges as $t$ tends to zero. In this section we use the results of the previous section to show that each limit belongs to the set
\[
\widetilde{\mathrm{spec}}(b) = \{\tilde{\mu} \mid \mu \in \mathrm{spec}(b)\}.
\]

Theorem 13.1. For each real-analytic eigenvalue branch $E_t$ of $q_t$, we have
\[
\lim_{t \to 0} E_t \in \widetilde{\mathrm{spec}}(b).
\]

Proof. Suppose to the contrary that the limit, $E_0$, does not belong to $\widetilde{\mathrm{spec}}(b)$. Since $\widetilde{\mathrm{spec}}(b)$ is discrete, there exists a nontrivial compact interval $J$ such that $E_0 \in J$ and such that
\[
J \cap \widetilde{\mathrm{spec}}(b) = \emptyset. \tag{112}
\]
Since $J$ is nontrivial and $E_t$ is continuous, there exists $t_0$ such that if $t < t_0$, then $E_t \in J$. Let $I$ be a compact interval such that $J \subset I \subset \mathbb{R} \setminus \widetilde{\mathrm{spec}}(b)$. Let $u_t$ be a real-analytic eigenfunction branch associated to $E_t$ and let $w_t = P_{a_t}^I(u_t)$. We have chosen $I$ so that $M_I = \emptyset$. Thus, by Lemma 12.5,
\[
\left\| \Pi_{M_I^-}(w_t) \right\|^2 = \|w_t\|^2.
\]
This contradicts Lemma 12.4.

14. Generic Simplicity of $q_t$

In this section, we prove that the spectrum of $q_t$ is generically simple. We will make crucial use of the 'super-separation' of the eigenvalues of $a_t$ for small $t$ (see Theorem 10.4). Before providing the details of the proof, we first illustrate how super-separation can be useful in proving simplicity.

Suppose that there exists an eigenvalue branch $E_t$ of $q_t$ such that $E_t \to \tilde{\mu}$ and the associated real-analytic family of eigenspaces $V_t$ is at least two dimensional. If for each $u_t \in V_t$ we knew that $\|\Pi_\mu u_t\|$ were uniformly bounded away from 0, then, arguing as in the beginning of the proof of Lemma 12.2, we would find that $\Pi_\mu u_t$ is a first order quasimode of $a_t^\mu$ at energy $\tilde{\mu}$. Then, since $\dim(V_t) \ge 2$, we would have a sequence $t_n$ tending to zero and two distinct eigenvalues $\lambda, \lambda'$ of $a_{t_n}^\mu$ such that $|\lambda - \lambda'|/t_n$ is bounded. This would contradict super-separation.

Theorem 14.1. Let $E_t$ be a real-analytic eigenvalue branch of $q_t$, and let $V_t$ be the associated real-analytic family of eigenspaces (see Remark 4.1). For each $t \in (0, t_0]$ we have $\dim(V_t) = 1$.

Since each eigenvalue branch of $q_t$ is real-analytic and the spectrum of each $q_t$ is discrete with finite dimensional eigenspaces, we have the following corollary.

Corollary 14.2. Let $E_t$ be a real-analytic eigenbranch. Then $E_t$ is a simple eigenvalue of $q_t$ for all $t$ in the complement of a discrete subset of $(0, t_0]$.


Proof of Theorem 14.1. Suppose that the conclusion does not hold. Since $V_t$ is a real-analytic family of vector spaces, its dimension is constant and so for each $t \in (0, t_0]$, we have $\dim(V_t) > 1$. By Theorem 13.1 there exists $\mu \in \mathrm{spec}(b)$ such that $E_t$ tends to $\tilde{\mu} = \mu/\sigma(0)$ as $t$ tends to zero. Let $I$ be a compact interval so that $I \cap \widetilde{\mathrm{spec}}(b) = \{\tilde{\mu}\}$. By Lemma 14.3 below, there exists $t_3 \le t_0$ and a map $t \mapsto u_t$ from $(0, t_3]$ into $V_t$ that is continuous on the complement of a discrete set $Z$ so that if $t \in (0, t_3] \setminus Z$, then
\[
\left\| \Pi_\mu(w_t) \right\| < \frac{1}{2} \cdot \|w_t\|,
\]
where $w_t = P_{a_t}^I(u_t)$. Thus, since $\{\mu\} = M_I$, Lemma 12.5 gives that
\[
\left\| \Pi_{M_I^-}(w_t) \right\| \ge \frac{1}{2} \cdot \|w_t\|.
\]
This contradicts Lemma 12.4.
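The last step of this proof is an application of the Pythagorean identity of Lemma 12.5. A minimal sketch, writing $\Pi$ for the spectral projections of Sect. 11 (the symbol is our rendering of the paper's notation) and using that $M_I = \{\mu\}$:

```latex
% Lemma 12.5 with M_I = {\mu}:  ||w_t||^2 = ||\Pi_\mu(w_t)||^2 + ||\Pi_{M_I^-}(w_t)||^2.
\begin{align*}
\bigl\| \Pi_{M_I^-}(w_t) \bigr\|^2
  &= \|w_t\|^2 - \bigl\| \Pi_\mu(w_t) \bigr\|^2
   \;\ge\; \|w_t\|^2 - \tfrac14 \|w_t\|^2
   \;=\; \tfrac34\, \|w_t\|^2, \\
\bigl\| \Pi_{M_I^-}(w_t) \bigr\|
  &\ge \tfrac{\sqrt3}{2}\, \|w_t\| \;\ge\; \tfrac12\, \|w_t\| .
\end{align*}
```

So the ratio in (109) stays at least $1/2$ along $t \in (0, t_3] \setminus Z$, which is what is needed to contradict Lemma 12.4.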

Lemma 14.3. Let $E_t$ be a real-analytic eigenvalue branch of $q_t$ such that for each $t > 0$ we have $\dim(V_t) > 1$. Let $\mu \in \mathrm{spec}(b)$ be such that $\lim_{t \to 0} E_t = \tilde{\mu}$, and let $I$ be a compact interval such that
\[
I \cap \widetilde{\mathrm{spec}}(b) = \{\tilde{\mu}\}.
\]
There exists $t_0 > 0$ and a function $t \mapsto u_t$ that maps $(0, t_0]$ to $V_t$, is continuous on the complement of a discrete set, and satisfies
\[
\left\| \Pi_\mu(w_t) \right\| \le \frac{1}{2} \cdot \|w_t\| \tag{113}
\]
where $w_t = P_{a_t}^I(u_t)$.

To prove Lemma 14.3, we will use the following well-known fact.

Lemma 14.4. Let $\{g_k : (a, b) \to \mathbb{R} \mid k \in \mathbb{N}\}$ be a collection of real-analytic functions. If for each $k \in \mathbb{N}$ and $t \in (a, b)$ we have $g_{k+1}(t) > g_k(t)$, then the set $\{t \in (a, b) \mid g_k(t) = 0,\ k \in \mathbb{N}\}$ is a discrete subset of $(a, b)$.

Proof. Suppose that $g_k(t') = 0$ for some $k \in \mathbb{N}$ and $t' \in (a, b)$. Since $g_k$ is real-analytic there exists an open set $U \ni t'$ such that if $t \in U \setminus \{t'\}$, then $g_k(t) \ne 0$. Since $k' > k''$ implies $g_{k'}(t) > g_{k''}(t)$ we have
\[
t' \in g_{k+1}^{-1}(0, \infty) = \bigcap_{k' > k} g_{k'}^{-1}(0, \infty)
\]
and
\[
t' \in g_{k-1}^{-1}(-\infty, 0) = \bigcap_{k' < k} g_{k'}^{-1}(-\infty, 0).
\]
It follows that if
\[
t \in W := U \cap g_{k+1}^{-1}(0, \infty) \cap g_{k-1}^{-1}(-\infty, 0),
\]
$t \ne t'$, and $k' \in \mathbb{N}$, then $g_{k'}(t) \ne 0$. Since $W$ is open, we have the claim.


Proof of Lemma 14.3. By Lemma 12.1, there exist $C$ and $t_1 > 0$ such that if $t \le t_1$, $z \in D$, and $u$ is an eigenfunction with eigenvalue $E_t$, then
\[
\left| a_t^\mu\left( \tilde{w}_\mu, z \right) - E_t \cdot \left\langle \tilde{w}_\mu, z \right\rangle_\sigma \right| \le C \cdot t \cdot \|w\| \cdot \|z\|_\sigma, \tag{114}
\]
where $w = P_{a_t}^I(u)$ and $\tilde{w}_\mu \otimes \phi_\mu = \Pi_\mu w$.

Since $a_t^\mu$ is a real-analytic family of type (a) in the sense of [Kato], for each $k \in \mathbb{N}$, there exists a real-analytic function $\lambda_k : (0, t_1] \to \mathbb{R}$ so that for each $t \in (0, t_1]$, we have $\mathrm{spec}(a_t^\mu) = \{\lambda_k(t) \mid k \in \mathbb{N}\}$. Since each eigenspace of $a_t^\mu$ is 1-dimensional, we may assume that $k' > k$ implies $\lambda_{k'}(t) > \lambda_k(t)$ for all $t \in (0, t_1]$. By Theorem 10.4, there exists $t_0 \in (0, t_1]$ such that if $t < t_0$ and $k \ne k'$, then
\[
|\lambda_k(t) - \lambda_{k'}(t)| > 4C \cdot t. \tag{115}
\]
For each $k \in \mathbb{N}$ and $t \in (0, t_0)$, define $g_k^\pm(t) = \lambda_k(t) - E_t \pm 2C \cdot t$. Thus, by Lemma 14.4, the set
\[
Z = \bigcup_{k \in \mathbb{N}} \left( (g_k^+)^{-1}\{0\} \cup (g_k^-)^{-1}\{0\} \right)
\]
is discrete in $(0, t_0]$. On each component $J$ of the complement $(0, t_0] \setminus Z$, we have either

• for all $t \in J$, we have $\mathrm{dist}\left( E_t, \mathrm{spec}(a_t^\mu) \right) \ge 2C \cdot t$, or
• for all $t \in J$, we have $\mathrm{dist}\left( E_t, \mathrm{spec}(a_t^\mu) \right) < 2C \cdot t$.

It suffices to construct in each of these cases a continuous map $t \mapsto u_t$ from $J$ to $V_t$ that satisfies (113). Without loss of generality, each interval $J$ is precompact in $(0, t_0]$, for otherwise we may, for example, add the discrete set $\{1/n \mid n \in \mathbb{N}\}$ to $Z$.

We consider the first case. Let $u_t$ be a real-analytic eigenfunction branch of $q_t$ associated to $E_t$. By estimate (114), we may apply Lemma 2.1 with $\epsilon = C \cdot t \cdot \|w_t\|$ and find that
\[
\|\tilde{w}_t\|_\sigma \le \frac{1}{2} \cdot \|w_t\|. \tag{116}
\]
Since $\|\Pi_\mu w\| = \|\tilde{w}_\mu\|_\sigma$, the desired (113) follows.

We consider the second case. By (115) and since $J \subset (0, t_0)$ there exists a unique $k$ such that if $t \in J$, then
\[
|E_t - \lambda_k(t)| < 2C \cdot t. \tag{117}
\]
Let $t \mapsto \tilde{v}_t$ be the unique eigenfunction branch of $a_t^\mu$ associated to the eigenvalue branch $\lambda_k$. Since $\dim(V_t) > 1$ and $V_t$ is an analytic family of vector spaces, there exist analytic eigenfunction branches $x_t, x_t' \in V_t$ so that for each $t$, the eigenvectors $x_t$ and $x_t'$ are independent. The function $t \mapsto \langle x_t, \tilde{v}_t \otimes \phi_\mu \rangle$ is real-analytic, and thus it vanishes on at most a finite subset $Z_J \subset J$. Away from $Z_J$, set
\[
c(t) = - \frac{\langle x_t', \tilde{v}_t \otimes \phi_\mu \rangle}{\langle x_t, \tilde{v}_t \otimes \phi_\mu \rangle}.
\]


Then $u_t = c(t) \cdot x_t + x_t'$ depends real-analytically on $t$ and satisfies $\langle u_t, \tilde{v}_t \otimes \phi_\mu \rangle = 0$.

For each $t \in J \setminus Z_J$, let $r_t$ denote the restriction of the quadratic form $a_t^\mu$ to the orthogonal complement of $\tilde{v}_t \otimes \phi_\mu$ in $D \otimes \mathrm{dom}(b)$. Let $w_t = P_{a_t}^I(u_t)$ and let $\tilde{w}_{\mu,t} \in D$ be such that $\Pi_\mu w_t = \tilde{w}_{\mu,t} \otimes \phi_\mu$. From (114), we have
\[
\left| r_t\left( \tilde{w}_{\mu,t}, z \right) - E_t \cdot \left\langle \tilde{w}_{\mu,t}, z \right\rangle_\sigma \right| \le C \cdot t \cdot \|w_t\| \cdot \|z\|_\sigma.
\]
It follows from (115) that $\mathrm{dist}(E_t, \mathrm{spec}(r_t)) \ge 2C \cdot t$. Hence Lemma 2.1 applies with $\epsilon = 2C \cdot t \cdot \|w\|$ to give (113).

Therefore, on the complement of $Z \cup \bigcup_J Z_J$, we have constructed a real-analytic function $t \mapsto u_t \in V_t$ so that (113) holds.

15. Stretching Along an Axis

In this section, we consider a family of quadratic forms $q_t$ obtained by 'stretching' certain domains in Euclidean space $\mathbb{R}^{n+1}$ that fiber over an interval. To be precise, let $I = [0, c]$ be an interval, let $Y \subset \mathbb{R}^n$ be a compact domain with Lipschitz boundary, and let $\rho : [0, c] \to \mathbb{R}$ be a smooth nonnegative function. For $t > 0$, define $\phi_t : I \times Y \to \mathbb{R}^{n+1}$ by
\[
\phi_t(x, y) = (x/t, \rho(x) \cdot y). \tag{118}
\]
We will consider the Dirichlet Laplacian associated to the domain $\Omega_t = \phi_t(I \times Y)$.

Example 15.1 (Triangles and simplices). Let $Y = [0, a]$ and $\rho(x) = x$. Then $\Omega_t$ is the right triangle with vertices $(0, 0)$, $(c/t, 0)$, $(c/t, c)$. More generally, if $\rho(x) = x$ and $Y$ is an $n$-simplex, then $\Omega_t$ is an $(n + 1)$-simplex.

Theorem 15.2. If $\rho : [0, a] \to \mathbb{R}$ is smooth, $\rho(0) = 0$, $\rho' > 0$,
\[
\lim_{\epsilon \to 0} \int_\epsilon^c \frac{dx}{\rho(x)} = \infty,
\]
and each eigenspace of the Dirichlet Laplacian acting on $L^2(Y)$ is 1-dimensional, then for all but countably many $t$, each eigenspace of the Dirichlet Laplacian acting on $L^2(\Omega_t)$ is 1-dimensional.

Proof. In order to apply Theorem 14.1, we make the following change of variables. Define $\psi : (0, c] \to [0, \infty)$ by
\[
\psi(x) = \int_x^c \frac{dx}{\rho(x)}.
\]
By hypothesis, $\psi$ is an orientation reversing homeomorphism. Define $\Phi_t : C^\infty([0, \infty) \times Y) \to C^\infty(\Omega_t)$ by
\[
\Phi_t(u) = \rho^{\frac{n-1}{2}} \cdot u \circ (\psi \times \mathrm{Id}) \circ \phi_t,
\]
where $\phi_t$ is defined by (118). We will use $\Phi_t$ to pull back the $L^2$ inner product and the Dirichlet energy form.


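First note that the Jacobian matrix of $\varphi_t$ is
\[ J\varphi_t = \begin{pmatrix} 1/t & 0 \\ \partial_x \rho \cdot y & \rho \cdot \mathrm{Id} \end{pmatrix}, \tag{119} \]
where $\mathrm{Id}$ is the $n \times n$ identity matrix, and hence the Jacobian determinant $|J\varphi_t|$ equals $t^{-1} \cdot \rho^n$. The Jacobian determinant of $\psi \times \mathrm{Id}$ is $\rho^{-1}$. It follows that
\[ \int_{\Omega_t} \Phi_t(u) \cdot \Phi_t(v)\, dV = \frac{1}{t} \int_0^\infty\!\!\int_Y u \cdot v\ \sigma(x)\, dx\, dy, \tag{120} \]
where $\sigma = \rho^2 \circ \psi^{-1}$ and where $dy$ denotes Lebesgue measure on $Y \subset \mathbb{R}^n$. In order to have an inner product that does not depend on $t$, we rescale by $t$. Define
\[ \langle u, v \rangle = \int_0^\infty\!\!\int_Y u \cdot v\ \sigma(x)\, dx\, dy. \]
Define a family of quadratic forms on $C^\infty([0, \infty) \times Y)$ by
\[ q_t(u) = t \cdot \int_{\Omega_t} |\nabla(\Phi_t(u))|^2\, dx\, dy. \]
The map $\Phi_t$ defines an isomorphism from each eigenspace of $q_t$ with respect to $\langle \cdot, \cdot \rangle$ to the corresponding eigenspace of the Dirichlet energy form on $\Omega_t$ with respect to the $L^2$-inner product on $\Omega_t$. In particular, it suffices to show that each eigenspace of $q_t$ with respect to $\langle \cdot, \cdot \rangle$ is 1-dimensional. Define
\[ a_t(u) = \int_0^\infty\!\!\int_Y t^2 \cdot |\partial_x u|^2 + |\nabla_y u|^2\, dx\, dy. \]
By Theorem 14.1, it suffices to show that $q_t$ is asymptotic to $a_t$ at first order. Let $\tau = \rho' \circ \psi^{-1}$. A straightforward calculation of moderate length shows that
\[ q_t(u, v) - a_t(u, v) = t \cdot \bigl(I_1(u, v) + I_2(u, v) + I_3(u, v) + I_4(u, v) + I_5(u, v) + I_3(v, u) + I_4(v, u) + I_5(v, u)\bigr), \]
where
\begin{align*}
I_1(u, v) &= \frac{(n-1)^2}{4} \cdot t \cdot \int_0^\infty\!\!\int_Y \tau^2 \cdot u \cdot v\, dx\, dy, \\
I_2(u, v) &= t \int_0^\infty\!\!\int_Y \tau^2 \cdot (y \cdot \nabla_y u) \cdot (y \cdot \nabla_y v)\, dx\, dy, \\
I_3(u, v) &= \frac{n-1}{2} \cdot t \cdot \int_0^\infty\!\!\int_Y \tau^2 \cdot u \cdot (y \cdot \nabla_y v)\, dx\, dy, \\
I_4(u, v) &= t \int_0^\infty\!\!\int_Y \tau \cdot \partial_x u \cdot (y \cdot \nabla_y v)\, dx\, dy, \\
I_5(u, v) &= \frac{n-1}{2} \cdot t \cdot \int_0^\infty\!\!\int_Y \tau \cdot u \cdot \partial_x v\, dx\, dy.
\end{align*}
To get (11), it suffices to show that for each $k = 1, \ldots, 5$, there exists a constant $C_k$ such that $|I_k(u, v)| \le C_k \cdot a_t(u)^{1/2} \cdot a_t(v)^{1/2}$ for $t < 1$.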
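For orientation, the change of variables can be made explicit in the triangle case of Example 15.1. The following is only a sketch: it takes $\tau = \rho' \circ \psi^{-1}$ as the reading of $\tau$, and the variable name $s$ for the new coordinate is introduced here.

```latex
% Worked instance for Example 15.1 (\rho(x) = x on [0, c]), using the
% definitions of \psi and \sigma above and the reading \tau = \rho' \circ \psi^{-1}
% (an assumption of this sketch).
\begin{align*}
\psi(x) &= \int_x^c \frac{ds}{s} = \log\frac{c}{x}, &
\psi^{-1}(s) &= c\,e^{-s}, \\
\sigma(s) &= \rho(\psi^{-1}(s))^2 = c^2 e^{-2s}, &
\tau(s) &= \rho'(\psi^{-1}(s)) = 1 .
\end{align*}
% The hypothesis of Theorem 15.2 holds: \int_\epsilon^c dx/x = \log(c/\epsilon) \to \infty.
% With \tau \equiv 1 each I_k is t times a t-independent bilinear integral, e.g.
% I_1(u,v) = \tfrac{(n-1)^2}{4}\, t \int_0^\infty\!\!\int_Y u\,v\,dx\,dy .
```

In this case the weight $\sigma$ decays exponentially in the new variable, which reflects the fact that the vertex $x = 0$ of the triangle has been sent to infinity.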


First note that by assumption $|\rho'|$, and hence $|\tau|$, is bounded by a constant $C$. Second, note that if $\lambda_0 > 0$ is the smallest eigenvalue of the Dirichlet Laplacian on $L^2(Y)$, then for each $u \in C^\infty([0, \infty) \times Y)$ we have
\[ \int_0^\infty\!\!\int_Y u^2\, dx\, dy \le \frac{1}{\lambda_0} \int_0^\infty\!\!\int_Y |\nabla_y u|^2\, dx\, dy. \tag{121} \]
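The inequality (121) is the usual Poincaré inequality on $Y$ applied slice by slice. A minimal sketch follows, assuming that $u(x, \cdot)$ satisfies the Dirichlet condition on $\partial Y$ for each $x$; the eigenbasis $(e_j)$ and the coefficients $c_j$ are notation introduced here.

```latex
% Sketch of (121), assuming u(x, .) vanishes on the boundary of Y for each x and
% (e_j) is an L^2(Y)-orthonormal basis of Dirichlet eigenfunctions with
% eigenvalues \lambda_0 \le \lambda_1 \le \dots (notation introduced here).
\begin{align*}
u(x,y) &= \sum_{j \ge 0} c_j(x)\, e_j(y), \\
\int_Y u(x,y)^2\,dy &= \sum_{j \ge 0} c_j(x)^2, \\
\int_Y |\nabla_y u(x,y)|^2\,dy &= \sum_{j \ge 0} \lambda_j\, c_j(x)^2
   \;\ge\; \lambda_0 \sum_{j \ge 0} c_j(x)^2 .
\end{align*}
% Integrating the last inequality over x \in [0, \infty) gives (121).
```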

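If $n = 1$, then $I_1(u, v)$ vanishes. Otherwise, apply the Cauchy-Schwarz inequality and estimate (121). More precisely,
\begin{align*}
\frac{4}{(n-1)^2} \cdot |I_1(u, v)|
&\le t \cdot C^2 \left( \int_0^\infty\!\!\int_Y u^2\, dx\, dy \right)^{1/2} \cdot \left( \int_0^\infty\!\!\int_Y v^2\, dx\, dy \right)^{1/2} \\
&\le \frac{t \cdot C^2}{\lambda_0} \left( \int_0^\infty\!\!\int_Y |\nabla_y u|^2\, dx\, dy \right)^{1/2} \cdot \left( \int_0^\infty\!\!\int_Y |\nabla_y v|^2\, dx\, dy \right)^{1/2} \\
&\le \frac{t \cdot C^2}{\lambda_0} \cdot a_t(u)^{1/2} \cdot a_t(v)^{1/2}.
\end{align*}
To bound $|I_2(u, v)|$, note that $|y \cdot \nabla_y u|^2 \le |y|^2 \cdot |\nabla_y u|^2$ and that $|y|^2$ is bounded since $Y$ is compact. The desired bound of $|I_2(u, v)|$ then follows from an application of the Cauchy-Schwarz inequality. If $n = 1$, then $I_3(u, v)$ vanishes. Otherwise, we apply the Cauchy-Schwarz inequality and estimate (121) as in the bound of $|I_1(u, v)|$. To bound $|I_4(u, v)|$ we apply Cauchy-Schwarz as follows:
\[ \int_0^\infty\!\!\int_Y |t \cdot \partial_x u| \cdot |y \cdot \nabla_y v|\, dx\, dy \le \left( \int_0^\infty\!\!\int_Y |t \cdot \partial_x u|^2\, dx\, dy \right)^{1/2} \left( \int_0^\infty\!\!\int_Y |y \cdot \nabla_y v|^2\, dx\, dy \right)^{1/2}. \]
It then follows that
\[ |I_4(u, v)| \le C \cdot a_t(u)^{1/2} \cdot a_t(v)^{1/2}. \]
To bound $|I_5(u, v)|$ apply Cauchy-Schwarz and argue in a fashion similar to the above. Condition (12) also follows using that $(t\, I_k)' = 2 I_k$.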

15.1. Changing the boundary condition.

Theorem 15.2 extends to a more general boundary condition that we describe here. Inspecting the proof, the only property of the Laplace operator on $Y$ that we have used is that it satisfies the Poincaré inequality (121). This holds for any mixed Dirichlet-Neumann boundary condition except Neumann on all faces. As a consequence, on the faces of $\Omega_t$ of the form $I \times \partial Y$ we may impose any kind of boundary condition except full Neumann. On the face $\{1\} \times Y$ we may take either Dirichlet or Neumann, since we have allowed Dirichlet or Neumann at 0 for the one-dimensional model operators $a_t^\mu$.


16. Domains in the Hyperbolic Plane with a Cusp

Recall that the hyperbolic metric on the upper half-plane $\mathbb{R} \times \mathbb{R}^+$ is defined by $(dx^2 + dy^2)/y^2$. The associated Riemannian measure is given by $d\mu = y^{-2}\, dx\, dy$ and the gradient is given by $\nabla f = y^2 (\partial_x f \cdot \partial_x + \partial_y f \cdot \partial_y)$. Let $h : (-\eta, \eta) \to \mathbb{R}$ be a positive real-analytic function such that $h'(0) = 0$. For each $t < \eta$, define $\Omega_t$ by
\[ \Omega_t = \left\{ (x, y) \in \mathbb{R} \times \mathbb{R}^+ \mid -t \le x \le t \text{ and } y \ge h(x) \right\}. \]
The domain $\Omega_t$ is unbounded but has finite hyperbolic area. It is known that the hyperbolic Dirichlet Laplacian acting on $L^2(\Omega_t, d\mu)$ is compactly resolved and hence has discrete spectrum (see e.g. [LaxPhl]).⁴

Example 16.1. Let $h : (-1, 1) \to \mathbb{R}$ be defined by $h(x) = \sqrt{1 - x^2}$. For each $t < 1$, the domain $\Omega_t$ is a hyperbolic triangle with one ideal vertex. In particular, $\Omega_{1/2}$ is a fundamental domain for the modular group $SL(2, \mathbb{Z})$ acting on $\mathbb{R} \times \mathbb{R}^+ \subset \mathbb{C}$ as linear fractional transformations.

Theorem 16.2. For all but countably many $t$, each eigenspace of the Dirichlet Laplacian acting on $L^2(\Omega_t, d\mu)$ is 1-dimensional.

The remainder of this section is devoted to the proof of Theorem 16.2. The spectrum of the hyperbolic Laplacian on $\Omega_t$ coincides with the spectrum of the Dirichlet energy form
\[ E(u) = \int_{\Omega_t} |\partial_x u|^2 + |\partial_y u|^2\, dx\, dy \tag{122} \]
with respect to the inner product
\[ \langle u, v \rangle_\mu = \int_{\Omega_t} u \cdot v\ \frac{dx\, dy}{y^2}. \tag{123} \]

In order to study the variational behavior of the eigenvalues, we first adjust the domains by constructing a family of diffeomorphisms $\varphi_t$ from the fixed set $U = [-1, 1] \times [h(0), \infty)$ onto $\Omega_t$. In particular, define
\[ \varphi_t \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} t \cdot a \\ b + h(t \cdot a) - h(0) \end{pmatrix}. \]
For each $u \in C_0^\infty(U)$, we define $\tilde{u} = \psi \cdot u \circ \varphi_t^{-1}$, where
\[ \psi(x, y) = \frac{y}{y - h(x) + h(0)}. \]
Since $\varphi_t$ is a smooth diffeomorphism from $U$ onto $\Omega_t$ and $\psi$ is smooth on $\Omega_t$, the mapping $u \mapsto \tilde{u}$ is a bijection from $C_0^\infty(U)$ onto $C_0^\infty(\Omega_t)$.

⁴ The Neumann Laplacian is not compactly resolved and, in fact, has essential spectrum.


Since the Jacobian of $\varphi_t$ is
\[ J(\varphi_t) \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} t & 0 \\ t \cdot h'(t \cdot a) & 1 \end{pmatrix} \tag{124} \]
and $\psi \circ \varphi_t = (y \circ \varphi_t)/b$, we find that, for any smooth $u$ and $v$ compactly supported in $U$,
\[ \int_{\Omega_t} \tilde{u} \cdot \tilde{v}\ t^{-1}\, \frac{dx\, dy}{y^2} = \int_U u \cdot v\ \frac{da\, db}{b^2}. \tag{125} \]
In particular, the mapping $u \mapsto \tilde{u}$ extends to an isometry of $H := L^2(U, da \cdot db/b^2)$ onto $L^2(\Omega_t, t^{-1} d\mu)$. We now pull back the Dirichlet energy form from $\Omega_t$ to $U$. In particular, we define $q_t : C_0^\infty(U) \to \mathbb{R}$ by
\[ q_t(u) = t \cdot E(\tilde{u}). \]
The form extends to a closed densely defined form on $H$. By construction, $\lambda$ belongs to the spectrum of $q_t$ if and only if $t^{-2} \cdot \lambda$ belongs to the Laplace spectrum of the hyperbolic triangle $\Omega_t$. Because $h$ is real-analytic, $t \mapsto \varphi_t$ is a real-analytic family of bi-Lipschitz homeomorphisms. It follows that $q_t$ is a real-analytic family of quadratic forms of type (a) in the sense of Kato [Kato]. On $C_0^\infty(U)$, we also define
\[ a_t(u) = \int_U t^2 \cdot |\partial_b u|^2 + |\partial_a u|^2\, da\, db. \]
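The identity (125) and the scaling of the spectrum can be checked directly. A short sketch follows; the shorthand $\bar\psi = \psi \circ \varphi_t$ and the norms $\|\cdot\|_H$, $\|\cdot\|_\mu$ are written out here for convenience.

```latex
% Check of (125), writing \bar\psi = \psi \circ \varphi_t.
% From (124), dx\,dy = t\,da\,db, and y \circ \varphi_t = b \cdot \bar\psi.
\begin{align*}
\int_{\Omega_t} \tilde u\,\tilde v\; t^{-1}\,\frac{dx\,dy}{y^2}
 &= \int_U (\bar\psi\,u)(\bar\psi\,v)\; t^{-1}\,\frac{t\,da\,db}{(b\,\bar\psi)^2}
  = \int_U u\,v\,\frac{da\,db}{b^2}, \\
\frac{q_t(u)}{\|u\|_H^2}
 &= \frac{t\,E(\tilde u)}{t^{-1}\,\|\tilde u\|_\mu^2}
  = t^2\,\frac{E(\tilde u)}{\|\tilde u\|_\mu^2} .
\end{align*}
% The second line is the Rayleigh-quotient form of the statement that \lambda is
% in the spectrum of q_t exactly when t^{-2}\lambda is a Laplace eigenvalue of \Omega_t.
```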

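Theorem 16.2 follows from Theorem 14.1 and the following proposition.

Proposition 16.3. $q_t$ is asymptotic to $a_t$ at first order.

Proof. Let $\bar{u} = (\psi \circ \varphi_t) \cdot u$. One computes that
\[ (\partial_y \tilde{u}) \circ \varphi_t = \partial_b \bar{u}, \qquad (\partial_x \tilde{u}) \circ \varphi_t = \frac{1}{t} \cdot \partial_a \bar{u} - h'(ta) \cdot \partial_b \bar{u}. \]
Thus, by making a change of variables in the integral that defines $E$, we find that
\[ q_t(u) = \int_U |\partial_a \bar{u}|^2 - 2t \cdot h'(ta) \cdot \partial_a \bar{u} \cdot \partial_b \bar{u} + t^2 \cdot (1 + h'(ta)^2)\, |\partial_b \bar{u}|^2\, da\, db, \tag{126} \]
where $\bar{u} = \bar{\psi} \cdot u$. To aid in computation we define a weighted gradient
\[ \bar{\nabla} w = [\partial_a w,\ t \cdot \partial_b w], \]
and we define
\[ A_t = \begin{pmatrix} 1 & -h'(t \cdot a) \\ -h'(t \cdot a) & 1 + h'(t \cdot a)^2 \end{pmatrix}. \]
Thus, (126) becomes
\[ q_t(u, v) = \int_U \bar{\nabla} \bar{u} \cdot A_t \cdot \bar{\nabla} \bar{v}\, da\, db \]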


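and
\[ a_t(u, v) = \int_U \bar{\nabla} u \cdot \bar{\nabla} v\, da\, db. \]
Letting $\bar{\psi} = \psi \circ \varphi_t$, we have
\[ \bar{\nabla} \bar{w} = \bar{\psi} \cdot \bar{\nabla} w + w \cdot \bar{\nabla} \bar{\psi}, \]
and hence $q_t(u, v) - a_t(u, v)$ is the sum of four terms:
\begin{align*}
&\int_U \bar{\nabla} u \cdot (\bar{\psi}^2 \cdot A_t - I) \cdot \bar{\nabla} v\, da\, db, \tag{127} \\
&\int_U \bar{\psi} \cdot v \cdot (\bar{\nabla} \bar{\psi} \cdot A_t \cdot \bar{\nabla} u)\, da\, db, \tag{128} \\
&\int_U \bar{\psi} \cdot u \cdot (\bar{\nabla} \bar{\psi} \cdot A_t \cdot \bar{\nabla} v)\, da\, db, \tag{129} \\
&\int_U (\bar{\nabla} \bar{\psi} \cdot A_t \cdot \bar{\nabla} \bar{\psi}) \cdot u \cdot v\, da\, db, \tag{130}
\end{align*}
where $I$ denotes the $2 \times 2$ identity matrix. To finish the proof, it suffices to show that each of these terms is bounded by $O(t) \cdot a_t(u)^{1/2} \cdot a_t(v)^{1/2}$, where $O(t)$ represents a function that is bounded by a constant times $t$ for $t$ small.

In order to estimate these terms, we use elementary estimates of $h(t \cdot a)$, $h'(t \cdot a)$, $\bar{\psi}$, and $\bar{\nabla} \bar{\psi}$. In particular, since $h'(0) = 0$ we have that $|h(t \cdot a) - h(0)| = O(t)$ and $|h'(t \cdot a)| = O(t)$ uniformly for $a \in [-1, 1]$. Thus, since
\[ \bar{\psi}(a, b) = 1 + \frac{h(t \cdot a) - h(0)}{b}, \]
we find that $|\bar{\psi}^2(a, b) - 1| = O(t)$ and $|\bar{\nabla} \bar{\psi}| = O(t)$ uniformly for $(a, b) \in U$.

To bound (127), note that $\operatorname{tr}(\bar{\psi}^2 \cdot A_t - I) = 2(\bar{\psi}^2 - 1) + \bar{\psi}^2 \cdot h'(t \cdot a)^2$ and $\det(\bar{\psi}^2 \cdot A_t - I) = (\bar{\psi}^2 - 1)^2 - \bar{\psi}^2 \cdot h'(t \cdot a)^2$. Hence $\operatorname{tr}(\bar{\psi}^2 \cdot A_t - I) = O(t)$ and $\det(\bar{\psi}^2 \cdot A_t - I) = O(t^2)$. It follows that the eigenvalues of $\bar{\psi}^2 \cdot A_t - I$ are $O(t)$. Therefore,
\[ \left| \int_U \bar{\nabla} u \cdot (\bar{\psi}^2 \cdot A_t - I) \cdot \bar{\nabla} v\, da\, db \right| = O(t) \cdot \int_U |\bar{\nabla} u| \cdot |\bar{\nabla} v|\, da\, db. \]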
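The step from the trace and determinant bounds to the eigenvalue bound is elementary linear algebra. A minimal sketch for a general real symmetric $2 \times 2$ matrix follows; the name $B$ is introduced here and stands for $\bar\psi^2 \cdot A_t - I$.

```latex
% Sketch: a real symmetric 2x2 matrix B with \operatorname{tr} B = O(t) and
% \det B = O(t^2) has both eigenvalues of size O(t).
\begin{align*}
\lambda_\pm &= \tfrac12\Bigl(\operatorname{tr} B \pm \sqrt{(\operatorname{tr} B)^2 - 4\det B}\Bigr), \\
|\lambda_\pm| &\le |\operatorname{tr} B| + |\det B|^{1/2} = O(t), \\
|x \cdot B\, y| &\le \max(|\lambda_+|,|\lambda_-|)\,|x|\,|y| \quad\text{for } x, y \in \mathbb{R}^2 .
\end{align*}
```

Applying the last line pointwise with $x = \bar\nabla u$ and $y = \bar\nabla v$ gives the integral bound for (127).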

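To estimate (128) we first note that the eigenvalues of $A_t$ are $O(1)$. Then we apply Cauchy-Schwarz,
\[ |\bar{\nabla} \bar{\psi} \cdot \bar{\nabla} u| \le |\bar{\nabla} \bar{\psi}| \cdot |\bar{\nabla} u|, \]
and then the elementary estimate on $|\bar{\nabla} \bar{\psi}|$ to find that
\[ \int_U |\bar{\psi}| \cdot |v| \cdot |\bar{\nabla} \bar{\psi} \cdot A_t \cdot \bar{\nabla} u|\, da\, db \le O(t) \int_U |v| \cdot |\bar{\nabla} u|\, da\, db. \]
Cauchy-Schwarz applied to the latter integral gives
\[ \int_U |v| \cdot |\bar{\nabla} u|\, da\, db \le \left( \int_U |v|^2\, da\, db \right)^{1/2} \cdot \left( \int_U |\bar{\nabla} u|^2\, da\, db \right)^{1/2}. \]
From a Poincaré inequality (Lemma 16.4 below) we find that
\[ \int_U |v|^2\, da\, db \le \pi^2 \int_U |\bar{\nabla} v|^2\, da\, db. \]
In sum we find that the expression in (128) is bounded by $O(t) \cdot a_t(u)^{1/2} \cdot a_t(v)^{1/2}$. Switching the roles of $u$ and $v$, we obtain the same bound for the expression in (129).

To estimate (130) we use the fact that the eigenvalues of $A_t$ are $O(1)$ and the fact that $|\bar{\nabla} \bar{\psi}|^2 = O(t^2)$ to find that
\[ \int_U |\bar{\nabla} \bar{\psi} \cdot A_t \cdot \bar{\nabla} \bar{\psi}| \cdot |u| \cdot |v|\, da\, db = O(t) \cdot \int_U |u| \cdot |v|\, da\, db. \]
By applying Cauchy-Schwarz and the Poincaré inequality of Lemma 16.4 below we obtain the claim. Condition (12) follows using the same kind of arguments.

Lemma 16.4. Any $u \in C_0^\infty(U)$ satisfies
\[ \int_U |u|^2\, da\, db \le \pi^2 \int_U |\partial_a u|^2\, da\, db. \]
Proof. Since $u$ vanishes at $a = \pm 1$, we may decompose $u = \sum_{k \ge 1} u_k(b) \sin\bigl(k\pi(a+1)/2\bigr)$. Then we have
\[ \int_U |\partial_a u|^2\, da\, db = \sum_k \frac{k^2 \pi^2}{4} \int_{h(0)}^\infty u_k(b)^2\, db \ge \frac{\pi^2}{4} \sum_k \int_{h(0)}^\infty u_k(b)^2\, db = \frac{\pi^2}{4} \int_U |u|^2\, da\, db, \]
which implies the claim since $4/\pi^2 \le \pi^2$.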

Acknowledgements. L.H. would like to thank Indiana University for its invitation and hospitality and the ANR programs ‘Teichmüller’ and ‘Résonances et chaos quantiques’ for their support. C.J. thanks the Université de Nantes, MATPYL program, L’Institut Fourier, and the Max Planck Institut für Mathematik-Bonn for hospitality and support.

Appendix A. Solutions to the Airy Equation

Here we consider solutions to Airy's differential equation
\[ A''(u) = u \cdot A(u) \tag{131} \]
for $u \in \mathbb{R}$. It is well-known that there exist unique solutions $A_+$ and $A_-$ that satisfy⁵
\[ A_\pm(u) = \frac{u^{-1/4}}{2^{(1 \pm 1)/2}} \cdot \exp\left( \pm \frac{2}{3} \cdot u^{3/2} \right) \left( 1 + O\left(u^{-3/2}\right) \right) \tag{132} \]

⁵ The functions $\pi^{-1/2} \cdot A_\pm$ are the classical Airy functions Ai and Bi. See, for example, [Olver] Chap. 11.


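and
\[ A_\pm(-u) = u^{-1/4} \left( \cos\left( \frac{2}{3} \cdot u^{3/2} \mp \frac{\pi}{4} \right) + O\left(u^{-3/2}\right) \right), \tag{133} \]
where $u^{3/2} \cdot O(u^{-3/2})$ is bounded on $[1, \infty)$. Let $W$ denote the Wronskian of $\{A_+, A_-\}$. Define $K : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ by
\[ K(u, v) = W^{-1} \cdot \begin{cases}
A_+(u) \cdot A_-(v) & \text{if } v \ge u \ge 0 \text{ or } v \ge 0 \ge u, \\
A_-(u) \cdot A_+(v) & \text{if } u \ge v \ge 0, \\
A_+(u) \cdot A_-(v) - A_-(u) \cdot A_+(v) & \text{if } u \le v \le 0, \\
0 & \text{otherwise.}
\end{cases} \]

Lemma A.1. Let $-\infty < \alpha \le 0 \le \beta \le \infty$. For each locally integrable function $g : [\alpha, \beta] \to \mathbb{R}$ of at most polynomial growth, we have
\[ (\partial_u^2 - u) \int_\alpha^\beta K(u, v) \cdot g(v)\, dv = g(u). \tag{134} \]

Proof. The Wronskian $W$ is constant and hence, by variation of parameters for example, we find that the function
\[ P(u) = W^{-1} \cdot A_+(u) \int_u^\beta A_-(v) \cdot g(v)\, dv + W^{-1} \cdot A_-(u) \int_0^u A_+(v) \cdot g(v)\, dv \]
is a solution to $P''(u) - u \cdot P(u) = g(u)$. Hence $K$ satisfies (134).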
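The claim about $P$ can be verified by differentiating twice. The sketch below assumes the sign convention $W = A_+ A_-' - A_+' A_-$ for the Wronskian, which the text does not spell out; with the opposite convention the right-hand side changes sign.

```latex
% Check of the variation-of-parameters formula for P, assuming the sign
% convention W = A_+ A_-' - A_+' A_- (an assumption; the text does not fix it).
\begin{align*}
W\,P(u) &= A_+(u)\int_u^\beta A_- g\,dv + A_-(u)\int_0^u A_+ g\,dv, \\
W\,P'(u) &= A_+'(u)\int_u^\beta A_- g\,dv + A_-'(u)\int_0^u A_+ g\,dv
  \quad\text{(boundary terms } \mp A_+ A_- g \text{ cancel)}, \\
W\,P''(u) &= A_+''(u)\int_u^\beta A_- g\,dv + A_-''(u)\int_0^u A_+ g\,dv
  + \bigl(A_+ A_-' - A_+' A_-\bigr)(u)\,g(u) \\
 &= u\,W\,P(u) + W\,g(u) \qquad\text{using } A_\pm'' = u\,A_\pm .
\end{align*}
% Constancy of W: W' = A_+ A_-'' - A_+'' A_- = u A_+ A_- - u A_+ A_- = 0.
```

Reading off the integrand of $P$ on the ranges $v \ge u$ and $0 \le v \le u$ (and, for $u < 0$, rewriting $\int_0^u = -\int_u^0$) reproduces the four cases in the definition of $K$.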

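Lemma A.2. There exists a constant $C_{\mathrm{Airy}}$ so that
\[ |K(u, v)| \le C_{\mathrm{Airy}} \cdot \begin{cases}
\exp(-|v - u|) & \text{if } u, v \ge 0, \\
|u \cdot v|^{-1/4} & \text{if } u \le v \le 0, \\
|u|^{-1/4} \cdot \exp(-v) & \text{if } u \le 0 \le v,
\end{cases} \tag{135} \]
and
\[ |\partial_u K(u, v)| \le C_{\mathrm{Airy}} \cdot \begin{cases}
\exp(-|v - u|) & \text{if } u, v \ge 0, \\
|u \cdot v|^{1/4} & \text{if } u \le v \le 0, \\
|u|^{1/4} \cdot \exp(-v) & \text{if } u \le 0 \le v.
\end{cases} \tag{136} \]

Proof. Straightforward using the definition of $K$ and the asymptotic behavior of the Airy functions [Olver].

Lemma A.3. There exists a constant $C$ so that
\[ \int_{-\alpha}^{\alpha} \int_{-\alpha}^{\alpha} |K(u, v)|^2\, du\, dv \le C \cdot \sqrt{\alpha}. \tag{137} \]

Proof. This follows directly from Lemma A.2.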


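Lemma A.4. Let $b^- < a^- < 0 < b^+ < a^+$. There exist constants $C$ and $s_0$ such that if $s > s_0$ and $A$ is a solution to (131), then
\[ \int_{s \cdot a^-}^{0} A^2\, du \le C \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du \tag{138} \]
and
\[ \int_0^{s \cdot b^+} A^2\, du \le C \left( s^{-1/2} \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du + \int_{s \cdot b^+}^{s \cdot 2b^+} A^2\, du \right). \tag{139} \]
The constants $C$ and $s_0$ may be chosen to depend continuously on $a^-$, $b^-$, $a^+$, and $b^+$.

Proof. Let $0 < \alpha < \beta$. By using (133) and the identity $\cos^2(\xi) = 2^{-1} \cdot (1 + \cos(2\xi))$, we have
\[ \int_{-\beta}^{-\alpha} A_\pm^2\, du = \frac{1}{2} \int_\alpha^\beta u^{-1/2}\, du + \frac{1}{2} \int_\alpha^\beta u^{-1/2} \cdot \cos(2\xi)\, du + O\left( \int_\alpha^\beta (1 + u)^{-2}\, du \right), \]
where $\xi = (2/3) \cdot u^{3/2} \mp \pi/4$. Integration by parts gives
\[ \int_\alpha^\beta u^{-1/2} \cdot \cos(2\xi)\, du = \frac{1}{2} \cdot u^{-1} \cdot \sin(2\xi) \Big|_\alpha^\beta + \frac{1}{2} \int_\alpha^\beta u^{-2} \cdot \sin(2\xi)\, du, \]
and hence we have
\[ \int_{-\beta}^{-\alpha} A_\pm^2\, du = \beta^{1/2} - \alpha^{1/2} + O\left( \beta^{-1} + \alpha^{-1} \right). \tag{140} \]
Since $A_\pm$ is bounded on $[-1, 0]$ we also have
\[ \int_{-\beta}^{0} A_\pm^2\, du = \beta^{1/2} + O(1). \tag{141} \]
Using (133) and the fact that $2 \cos(\xi + \frac{\pi}{4}) \cos(\xi - \frac{\pi}{4}) = \cos(2\xi)$, we find that for $0 < \alpha < \beta$, we have
\[ \int_{-\beta}^{-\alpha} A_+ \cdot A_-\, du = O\left( \beta^{-1} + \alpha^{-1} \right). \tag{142} \]
Since $A_\pm$ is bounded on $[-1, 0]$, it follows that
\[ \int_{-\beta}^{0} A_+ \cdot A_-\, du = O(1). \tag{143} \]
We now specialize to the case $\alpha = -s \cdot a^-$ and $\beta = -s \cdot b^-$. By (140) and (141), there exists $s_1$, depending continuously on $b^- < a^- < 0$, such that for $s > s_1$,
\[ \int_{s \cdot b^-}^{s \cdot a^-} A_\pm^2\, du \ge m \int_{s \cdot a^-}^{0} A_\pm^2\, du, \tag{144} \]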


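where
\[ m = \frac{1}{2} \cdot \left( 1 - \left( \frac{a^-}{b^-} \right)^{1/2} \right). \]
By (141) and (142), there exists a constant $s_2$, depending continuously on $b^-, a^- < 0$, such that if $s > s_2$, then
\[ \left| \int_{s \cdot b^-}^{s \cdot a^-} A_+ \cdot A_-\, du \right| \le \frac{m}{2} \int_{s \cdot a^-}^{0} A_\pm^2\, du. \tag{145} \]
If $A$ is a general solution to (131), then there exist $c_+, c_- \in \mathbb{R}$ such that $A = c_+ \cdot A_+ + c_- \cdot A_-$. Using (145) we find that
\[ 2 |c_+ \cdot c_-| \cdot \left| \int_{s \cdot b^-}^{s \cdot a^-} A_+ \cdot A_-\, du \right| \le \frac{m}{2} \left( c_+^2 \int_{s \cdot a^-}^{0} A_+^2\, du + c_-^2 \int_{s \cdot a^-}^{0} A_-^2\, du \right). \]
By combining this with (144) we find that if $s > \max\{s_1, s_2\}$, then
\[ \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du \ge \frac{m}{4} \int_{s \cdot a^-}^{0} A^2\, du. \tag{146} \]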
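The passage from (144) and (145) to (146) can be spelled out as follows. This is a sketch of one way to fill in the step; the abbreviations $I_\pm$, $J_\pm$, $X$ and $S$ are introduced here and are not the authors' notation.

```latex
% Sketch. Notation introduced here for brevity (not from the text):
% I_\pm = \int_{s a^-}^{0} A_\pm^2\,du,  J_\pm = \int_{s b^-}^{s a^-} A_\pm^2\,du,
% X = \int_{s b^-}^{s a^-} A_+ A_-\,du,  S = c_+^2 I_+ + c_-^2 I_- .
\begin{align*}
|X| &\le \tfrac m2 \min(I_+, I_-) \le \tfrac m2 \sqrt{I_+ I_-}
  && \text{by (145) for both signs,} \\
2|c_+ c_-|\,|X| &\le \tfrac m2 \cdot 2\,|c_+|\sqrt{I_+}\,|c_-|\sqrt{I_-} \le \tfrac m2\, S
  && \text{since } 2xy \le x^2 + y^2, \\
\int_{s b^-}^{s a^-} A^2\,du &= c_+^2 J_+ + c_-^2 J_- + 2 c_+ c_- X
  \ge m\,S - \tfrac m2\, S = \tfrac m2\, S
  && \text{by (144).}
\end{align*}
% To reach (146) one also needs \int_{s a^-}^{0} A^2\,du \le 2S for large s;
% this uses (141) and (143), which give I_\pm \to \infty while
% \int_{s a^-}^{0} A_+ A_-\,du = O(1).
```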

This finishes the proof of the first estimate.

To prove the second estimate, first define $f(u) = \exp((2/3) \cdot u^{3/2})$ and let $0 < \alpha < \beta$. By using (132) and integrating by parts we find that, for $\beta$ large,
\[ \int_0^\beta A_+^2\, du = \frac{1}{4} \cdot \beta^{-1} \cdot f(\beta) \cdot \left( 1 + O(\beta^{-1}) \right). \]
It follows that there exists $s_3$ so that for $s > s_3$,
\[ \int_{s \cdot b^+}^{s \cdot 2b^+} A_+^2\, du \ge \frac{1}{2} \cdot \int_0^{s \cdot b^+} A_+^2\, du. \tag{147} \]
Equation (132) also implies that
\[ \int_\alpha^\beta A_+ \cdot A_-\, du = \beta^{1/2} - \alpha^{1/2} + O\left( \beta^{-1} + \alpha^{-1} \right). \]
In particular, there exists $s_4 > 0$ so that if $s > s_4$, then
\[ \int_{s \cdot b^+}^{s \cdot 2b^+} A_+ \cdot A_-\, du \ge 0. \tag{148} \]
By (132), the function $A_-^2$ is integrable on $[0, \infty)$. Let $I$ be the value of this integral. Using (140) we find that there exists $s_5$ such that if $s > s_5$, then
\[ \int_0^{s \cdot b^+} A_-^2\, du \le M \cdot s^{-1/2} \int_{s \cdot b^-}^{s \cdot a^-} A_-^2\, du, \tag{149} \]


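where $M = 2I / \left( |b^-|^{1/2} - |a^-|^{1/2} \right)$. From (140) and (142) we find that there exists $s_6$ such that if $s > s_6$, then
\[ \left| \int_{s \cdot b^-}^{s \cdot a^-} A_+ \cdot A_-\, du \right| \le \frac{1}{2} \int_{s \cdot b^-}^{s \cdot a^-} A_\pm^2\, du. \tag{150} \]
Let $A = c_+ A_+ + c_- A_-$ be a general solution to the Airy equation. From (147) and (148) it follows that if $s > \max\{s_3, s_4\}$, then
\[ c_+^2 \int_0^{s \cdot b^+} A_+^2\, du \le 2 \int_{s \cdot b^+}^{2s \cdot b^+} A^2\, du. \tag{151} \]
From (150) we have that if $s > s_6$, then
\[ 2 |c_+ \cdot c_-| \cdot \left| \int_{s \cdot b^-}^{s \cdot a^-} A_+ \cdot A_-\, du \right| \le \frac{1}{2} \cdot \left( c_+^2 \int_{s \cdot b^-}^{s \cdot a^-} A_+^2\, du + c_-^2 \int_{s \cdot b^-}^{s \cdot a^-} A_-^2\, du \right). \]
It follows that for $s > s_6$,
\[ c_-^2 \int_{s \cdot b^-}^{s \cdot a^-} A_-^2\, du \le 2 \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du. \]
Putting this together with (149) gives
\[ c_-^2 \int_0^{s \cdot b^+} A_-^2\, du \le 2M \cdot s^{-1/2} \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du. \tag{152} \]
By combining (151) and (152) we find that
\[ \frac{1}{2} \int_0^{s \cdot b^+} A^2\, du \le 2M \cdot s^{-1/2} \int_{s \cdot b^-}^{s \cdot a^-} A^2\, du + 2 \int_{s \cdot b^+}^{2s \cdot b^+} A^2\, du. \]
This completes the proof of the second estimate.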

References

[Albert78] Albert, J.H.: Generic properties of eigenfunctions of elliptic partial differential operators. Trans. Amer. Math. Soc. 238, 341–354 (1978)
[Berard79] Bérard, P.: Spectres et groupes cristallographiques. C.R. Acad. Sci. Paris Sér. A-B 288(23), A1059–A1060 (1979)
[BryWlk84] Berry, M.V., Wilkinson, M.: Diabolical points in the spectra of triangles. Proc. Roy. Soc. London Ser. A 392(1802), 15–43 (1984)
[Cartier71] Cartier, P.: Some numerical computations relating to automorphic functions. In: Computers in number theory. Proceedings of the Science Research Council Atlas Symposium No. 2. Edited by A.O.L. Atkin, B.J. Birch. London-New York: Academic Press, 1971
[Courant-Hilbert] Courant, R., Hilbert, D.: Methods of Mathematical Physics. Volume 1. New York: Wiley Classics, 1989
[Cherry50] Cherry, T.M.: Uniform asymptotic formulae for functions with transition points. Trans. Amer. Math. Soc. 68, 224–257 (1950)
[Durso88] Durso, C.: Inverse spectral problem for triangular domains. Thesis, MIT, 1988
[ExnPst05] Exner, P., Post, O.: Convergence of spectra of graph-like thin manifolds. J. Geom. Phys. 54(1), 77–115 (2005)
[FrdSlm09] Friedlander, L., Solomyak, M.: On the spectrum of the Dirichlet laplacian in a narrow strip. Israel J. Math. 170, 337–354 (2009)
[Grieser] Grieser, D.: Thin tubes in mathematical physics, global analysis and spectral geometry. In: Analysis on Graphs and its Applications. Proceedings of Symposia in Pure Mathematics. Edited by P. Exner, J. Keating, P. Kuchment, T. Sunada, A. Teplyaev. Providence, RI: Amer. Math. Soc., 2008
[Grünbaum] Grünbaum, B.: Convex polytopes. 2nd ed. Graduate Texts in Mathematics 221. New York: Springer-Verlag, 2003
[Harmer08] Harmer, M.: The spectra of the spherical and euclidean triangle groups. J. Aust. Math. Soc. 84(2), 217–227 (2008)
[Hillairet05] Hillairet, L.: Contribution of periodic diffractive geodesics. J. Funct. Anal. 226(1), 48–89 (2005)
[Hillairet10] Hillairet, L.: Eigenvalue variations and semiclassical concentration. In: Spectrum and Dynamics: Proceedings of the Workshop Held in Montréal, QC, April 7–11, 2008. Edited by D. Jakobson, S. Nonnenmacher, I. Polterovich. Montréal, QC: Amer. Math. Soc., 2010
[HlrJdg09] Hillairet, L., Judge, C.: Generic spectral simplicity of polygons. Proc. Amer. Math. Soc. 137(6), 2139–2145 (2009)
[HlrJdg10] Hillairet, L., Judge, C.: The eigenvalues of the Laplacian on domains with small slits. Trans. Amer. Math. Soc. 362(12), 6231–6259 (2010)
[Kato] Kato, T.: Perturbation Theory for Linear Operators. Springer-Verlag Classics in Mathematics. Berlin-Heidelberg-New York: Springer-Verlag, 1995
[Lame] Lamé, G.: Leçons sur la théorie mathématique de l'élasticité des corps solides. Paris: Bachelier, 1852
[Langer31] Langer, R.E.: On the asymptotic solutions of ordinary differential equations with an application to the Bessel functions of large order. Trans. of the Amer. Math. Soc. 33(1), 23–64 (1931)
[LaxPhl] Lax, P., Phillips, R.: Scattering theory for automorphic forms. Princeton, NJ: Princeton U. Press, 1976
[LuRowl] Lu, Z., Rowlett, J.: The fundamental gap, http://arxiv.org/abs/1003.0191v1 [math.sp], 2010
[Olver] Olver, F.W.J.: Asymptotics and Special Functions. AKP Classics. Wellesley, MA: A K Peters, Ltd., 1997
[Pinsky80] Pinsky, M.A.: The eigenvalues of an equilateral triangle. SIAM J. Math. Anal. 11(5), 819–827 (1980)
[Reed-Simon] Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York-London: Academic Press, 1978
[Sarnak03] Sarnak, P.: Spectra of hyperbolic surfaces. Bull. Amer. Math. Soc. (N.S.) 40(4), 441–478 (2003)
[Uhlenbeck72] Uhlenbeck, K.: Eigenfunctions of Laplace operators. Bull. Amer. Math. Soc. 78, 1073–1076 (1972)

Communicated by S. Zelditch

Commun. Math. Phys. 302, 345–357 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1160-2

Communications in

Mathematical Physics

Existence and Uniqueness of SRB Measure on $C^1$ Generic Hyperbolic Attractors

Hao Qiu

School of Mathematical Sciences, Peking University, Beijing 100871, China. E-mail: [email protected]

Received: 26 February 2010 / Accepted: 25 June 2010
Published online: 17 December 2010 – © Springer-Verlag 2010

Abstract: Let $M$ be a smooth Riemannian manifold. We show that for $C^1$ generic $f \in \mathrm{Diff}^1(M)$, if $f$ has a hyperbolic attractor $\Lambda_f$, then there exists a unique SRB measure supported on $\Lambda_f$. Moreover, the SRB measure happens to be the unique equilibrium state of the potential function $\psi_f \in C^0(\Lambda_f)$ defined by $\psi_f(x) = -\log|\det(Df|E_x^u)|$, $x \in \Lambda_f$, where $E_x^u$ is the unstable space of $T_x M$.

1. Preliminary

Let $M$ be a smooth Riemannian manifold. Assume $m$ is the volume measure of $M$ induced by the Riemannian metric. Denote by $\delta_x$ the atomic probability measure supported on $x \in M$. For any $C^1$ diffeomorphism $f$ and ergodic measure $\mu$, the statistical basin of $\mu$ is defined as
\begin{align*}
B(\mu) &= \Bigl\{ x \in M : \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k x) = \int \varphi\, d\mu,\ \forall \varphi \in C^0(M) \Bigr\} \\
&= \Bigl\{ x \in M : \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \delta_{f^k x} = \mu \Bigr\},
\end{align*}
and its elements are called generic points of $\mu$. If $m(B(\mu)) > 0$, we call $\mu$ an SRB measure. The theory of SRB measures has been extensively studied since it was introduced by Sinai, Ruelle and Bowen in the early 1970's. The classical SRB theory says that, if a dynamical system has sufficient differentiability and hyperbolicity, then it has an SRB measure. A particular example is given by $C^{k,\alpha}$ hyperbolic attractors, where $k = 1, 2, 3, \ldots$ and $0 < \alpha \le 1$. In this situation, we have both existence and uniqueness of the SRB measure that is supported on such an attractor (see, for instance, [1,8]).


With abundance of results in the case of high differentiability, people are curious whether the theory maintains for “most” f ∈ Diff1 (M). Towards this question, Campbell and Quas obtained the following C 1 generic result for circle expanding maps (see [2]). Theorem (Campbell, Quas). Let E 1 denote the set of C 1 expanding maps of the unit circle S 1 onto itself. Assume m is the normalized Lebesgue measure over S 1 . Then for generic T ∈ E 1 , there is a unique SRB measure μT , with m(B(μT )) = 1. In this paper, we push forward the above result to the setting of C 1 hyperbolic attractors: Let f 0 be a C 1 diffeomorphism of M. Assume there exists a compact invariant transitive hyperbolic set f0 , and an open neighborhood ⊃ f0 , so that f 0 () ⊂ and i≥0 f 0i () = f0 . By stability theory of an isolated hyperbolic set (see [7]), there exists a C 1 neighborhood U of f 0 , so that for any f ∈ U, the f -maximal invariant set of , denoted by f , is also hyperbolic. Moreover, for each f ∈ U there is a unique homeomorphism r f : f0 → f that is C 0 close to id| f0 , with f | f ◦ r f = r f ◦ f 0 | f0 . The main result of the paper is Theorem A. There exists a generic set U in U with the following property: for any f ∈ U , there is a unique SRB measure μ f supported on f , with m(B(μ f ) ∩ ) = m(). Moreover, μ f depends continuously in weak*-topology on f ∈ U . The proof of Theorem A is formulated through Sects. 3, 4. It basically follows Bowen’s convention of equilibrium state thermodynamical formalism developed in [1]. Thus we give in Sect. 2 a partial review on related concepts and results of this topic. Notation Hypotheses:

u ⊕ Es = u s 1) For any f ∈ U, denote by E x∈ f E x ⊕ E x the hyperbolic splitting f f u for T f M, and u = dim E f . 2) For compact metric space X and continuous map T over it, denote by M(X ) the set of Borel probability measures on X , by M(X ; T ) the set of T -invariant Borel probability measures on X , and by E(X ; T ) the set of T -ergodic Borel probability measures on X . 3) For any compact C 1 submanifold ⊂ M, denote by Tx the tangent space of at x, by T the tangent bundle of , and by m the volume measure induced by submanifold immersion. 4) For any finite set A, denote by A the cardinality of A.

2. A Partial Review on Thermodynamical Formalism

Most contents of this section can be found in [1,3,9]. Let $X$ be a compact metric space, and $T$ be a continuous map on it. We call such a pair $(X, T)$ a topological dynamical system. For any $\varphi \in C^0(X)$ ($\varphi$ is usually called a potential function), the topological pressure of $\varphi$ (w.r.t. $T$) is defined by
\[ P(T; \varphi) = \sup_{\mu \in M(X;T)} \Bigl\{ h_\mu(T) + \int_X \varphi\, d\mu \Bigr\}, \]
where $h_\mu(T)$ is the measure theoretical entropy of $T$ with respect to $\mu$. If the topological entropy
\[ h(T) = \sup_{\mu \in M(X;T)} h_\mu(T) < \infty, \]
then $|P(T; \varphi)| < \infty$ for any $\varphi \in C^0(X)$. In this situation, $P(T; \cdot) : C^0(X) \to \mathbb{R}$ has the following elementary properties (see Theorem 9.7 of [9]):

1. (Continuity) For any $\varphi', \varphi'' \in C^0(X)$,
\[ |P(T; \varphi') - P(T; \varphi'')| \le \|\varphi' - \varphi''\|_{C^0(X)}. \tag{2.1} \]
2. (Convexity) For any $\varphi', \varphi'' \in C^0(X)$ and $0 \le t \le 1$,
\[ P(T; t\varphi' + (1 - t)\varphi'') \le t P(T; \varphi') + (1 - t) P(T; \varphi''). \]

As a consequence of convexity, for any $\varphi, \phi \in C^0(X)$ and $t_1 < t_2 < t_3$, we have
\[ \frac{P(T; \varphi + t_2 \phi) - P(T; \varphi + t_1 \phi)}{t_2 - t_1} \le \frac{P(T; \varphi + t_3 \phi) - P(T; \varphi + t_1 \phi)}{t_3 - t_1} \tag{2.2} \]
and
\[ \frac{P(T; \varphi + t_2 \phi) - P(T; \varphi + t_1 \phi)}{t_2 - t_1} \le \frac{P(T; \varphi + t_3 \phi) - P(T; \varphi + t_2 \phi)}{t_3 - t_2}. \tag{2.3} \]
In particular, taking $t_1 = 0$, (2.2) implies that $(P(T; \varphi + t\phi) - P(T; \varphi))/t$ monotonically decreases as $t \to 0^+$. Moreover, taking $t_2 = 0$, (2.3) implies that $(P(T; \varphi + t\phi) - P(T; \varphi))/t$, $t > 0$, is bounded from below. Thus $\lim_{t \to 0^+} (P(T; \varphi + t\phi) - P(T; \varphi))/t$ exists, and equals $\inf_{t > 0} (P(T; \varphi + t\phi) - P(T; \varphi))/t$. We denote the limit by $\tau(T; \varphi, \phi)$, i.e.,
\[ \tau(T; \varphi, \phi) = \inf_{t > 0} \frac{P(T; \varphi + t\phi) - P(T; \varphi)}{t} = \lim_{t \to 0^+} \frac{P(T; \varphi + t\phi) - P(T; \varphi)}{t}. \tag{2.4} \]
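The two slope inequalities are the standard three-chord property of a convex function of one real variable. A brief sketch follows; the shorthand $g(t) = P(T; \varphi + t\phi)$ and the weight $\lambda$ are introduced here.

```latex
% Sketch of why convexity gives (2.2) and (2.3). The shorthand
% g(t) = P(T; \varphi + t\phi) and the weight \lambda are introduced here.
\begin{align*}
t_2 &= \lambda t_1 + (1-\lambda) t_3, \qquad \lambda = \frac{t_3 - t_2}{t_3 - t_1} \in (0,1), \\
g(t_2) &\le \lambda\, g(t_1) + (1-\lambda)\, g(t_3) \qquad \text{(convexity)}, \\
g(t_2) - g(t_1) &\le (1-\lambda)\bigl(g(t_3) - g(t_1)\bigr)
   = \frac{t_2 - t_1}{t_3 - t_1}\bigl(g(t_3) - g(t_1)\bigr), \\
g(t_3) - g(t_2) &\ge \lambda\bigl(g(t_3) - g(t_1)\bigr)
   = \frac{t_3 - t_2}{t_3 - t_1}\bigl(g(t_3) - g(t_1)\bigr).
\end{align*}
% Dividing the third line by t_2 - t_1 gives (2.2); dividing the fourth by
% t_3 - t_2 and chaining with (2.2) gives (2.3).
```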

Lemma 2.1. Assume $h(T) < \infty$ and $\varphi \in C^0(X)$. Then

1. For any $\phi \in C^0(X)$,
\[ \tau(T; \varphi, \phi) \ge -\tau(T; \varphi, -\phi). \tag{2.5} \]
2. $\tau(T; \varphi, \cdot) : C^0(X) \to \mathbb{R}$ is continuous. More precisely, for any $\phi', \phi'' \in C^0(X)$,
\[ |\tau(T; \varphi, \phi') - \tau(T; \varphi, \phi'')| \le \|\phi' - \phi''\|_{C^0(X)}. \]

Proof. Let $t_2 = 0$ and take the limit as $t_1 \to 0^-$, respectively $t_3 \to 0^+$, in (2.3). Then the first statement is clear by the definition of $\tau(T; \varphi, \phi)$. The second statement follows from (2.1) by direct computation.

Lemma 2.2. Assume $h(T) < \infty$ and $\phi \in C^0(X)$. Then

1. $\tau(T; \cdot, \phi) : C^0(X) \to \mathbb{R}$ is upper semicontinuous.
2. For any $\varphi \in C^0(X)$, if $\tau(T; \varphi, \phi) = -\tau(T; \varphi, -\phi)$, then $\tau(T; \cdot, \phi)$ is continuous at $\varphi$.

Proof. Upper semicontinuity of $\tau(T; \cdot, \phi)$ is clear from the first "=" of (2.4). For the second statement, let $\varphi_k \to \varphi$ in $C^0$; then upper semicontinuity of $\tau(T; \cdot, \phi)$ gives
\[ \limsup_{k \to \infty} \tau(T; \varphi_k, \phi) \le \tau(T; \varphi, \phi) \]
and
\[ \limsup_{k \to \infty} \tau(T; \varphi_k, -\phi) \le \tau(T; \varphi, -\phi). \]
Therefore, if $\tau(T; \varphi, \phi) = -\tau(T; \varphi, -\phi)$, we have
\begin{align*}
\tau(T; \varphi, \phi) = -\tau(T; \varphi, -\phi) &\le -\limsup_{k \to \infty} \tau(T; \varphi_k, -\phi) = \liminf_{k \to \infty} -\tau(T; \varphi_k, -\phi) \\
&\le \liminf_{k \to \infty} \tau(T; \varphi_k, \phi) \le \limsup_{k \to \infty} \tau(T; \varphi_k, \phi) \le \tau(T; \varphi, \phi),
\end{align*}
where the first inequality on the second line uses (2.5). Then the above "≤" must all be "=". In particular,
\[ \liminf_{k \to \infty} \tau(T; \varphi_k, \phi) = \limsup_{k \to \infty} \tau(T; \varphi_k, \phi) = \tau(T; \varphi, \phi), \]
thus $\lim_{k \to \infty} \tau(T; \varphi_k, \phi) = \tau(T; \varphi, \phi)$.

An equilibrium state of $\varphi$ (w.r.t. $T$) is a $T$-invariant probability measure $\nu$ satisfying
\[ P(T; \varphi) = h_\nu(T) + \int_X \varphi\, d\nu. \]
A tangent functional to $P(T; \cdot)$ at $\varphi$ is a finite signed measure $\mu$ on $X$ such that
\[ P(T; \varphi + \phi) - P(T; \varphi) \ge \int_X \phi\, d\mu, \quad \forall \phi \in C^0(X). \]
Let $\mathrm{Eq}(T; \varphi)$ be the collection of equilibrium states of $\varphi$ w.r.t. $T$, and let $t(T; \varphi)$ be the collection of tangent functionals to $P(T; \cdot)$ at $\varphi$.

Lemma 2.3. Assume $h_\cdot(T) : M(X; T) \to \mathbb{R}$ is upper semicontinuous. Then for any $\varphi \in C^0(X)$, $\mathrm{Eq}(T; \varphi) = t(T; \varphi)$.

Proof. See Theorem 9.15 of [9].

Lemma 2.4. Assume $h_\cdot(T) : M(X; T) \to \mathbb{R}$ is upper semicontinuous and $\varphi \in C^0(X)$. Then the following statements are equivalent:

1) $\#\mathrm{Eq}(T; \varphi) = 1$.
2) $\tau(T; \varphi, \phi) = -\tau(T; \varphi, -\phi)$, $\forall \phi \in C^0(X)$.
3) For any $\nu \in \mathrm{Eq}(T; \varphi)$, we have $\int \phi\, d\nu = \tau(T; \varphi, \phi)$, $\forall \phi \in C^0(X)$.


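Proof. Consider "1) ⇒ 2)" first. Assume $\#\mathrm{Eq}(T; \varphi) = 1$. Suppose there exists $\phi' \in C^0(X)$ so that $\tau(T; \varphi, \phi') \ne -\tau(T; \varphi, -\phi')$. Then by (2.5), $\tau(T; \varphi, \phi') > -\tau(T; \varphi, -\phi')$. We claim that for any $a \in [-\tau(T; \varphi, -\phi'), \tau(T; \varphi, \phi')]$, there exists $\nu \in \mathrm{Eq}(T; \varphi)$ so that $\int \phi'\, d\nu = a$. In fact, consider $\langle \phi' \rangle$, the one-dimensional linear space generated by $\phi'$. We define the linear functional $\tilde{A} : \langle \phi' \rangle \to \mathbb{R}$ by $\tilde{A}(t\phi') = at$. Then the first "=" of (2.4) yields
\begin{align*}
P(T; \varphi + t\phi') - P(T; \varphi) &\ge t\,\tau(T; \varphi, \phi') \ge at = \tilde{A}(t\phi'), \\
P(T; \varphi - t\phi') - P(T; \varphi) &\ge t\,\tau(T; \varphi, -\phi') \ge -at = \tilde{A}(-t\phi'),
\end{align*}
\[ \tag{2.6} \]
for $t \ge 0$. This implies that the graph of $\tilde{A}$ is under the graph of $P(T; \varphi + \cdot) - P(T; \varphi)$ restricted to $\langle \phi' \rangle$. Applying the Hahn-Banach theorem, which is possible by the convexity of $P(T; \varphi + \cdot) - P(T; \varphi)$, we can extend $\tilde{A}$ to $A \in C^0(X)^*$, so that $A(t\phi') = at$, and the graph of $A$ is under the graph of $P(T; \varphi + \cdot) - P(T; \varphi)$, i.e.
\[ P(T; \varphi + \phi) - P(T; \varphi) \ge A(\phi), \quad \forall \phi \in C^0(X). \]
Let $\nu$ be the signed measure associated to $A$ by the Riesz representation theorem; then $\nu \in t(T; \varphi)$, and by Lemma 2.3, $\nu \in \mathrm{Eq}(T; \varphi)$. Clearly, $\int \phi'\, d\nu = a$. Therefore, for arbitrary $-\tau(T; \varphi, -\phi') \le a_1 < a_2 \le \tau(T; \varphi, \phi')$, there must be $\nu_1, \nu_2 \in \mathrm{Eq}(T; \varphi)$ so that $\int \phi'\, d\nu_1 = a_1$ and $\int \phi'\, d\nu_2 = a_2$. This contradicts $\#\mathrm{Eq}(T; \varphi) = 1$.

For "2) ⇒ 3)", let $\nu$ be an arbitrary equilibrium state of $\varphi$. Viewing it as a tangent functional, we have
\[ P(T; \varphi + t\phi) - P(T; \varphi) \ge \int t\phi\, d\nu = t \int \phi\, d\nu, \quad \forall \phi \in C^0(X). \tag{2.7} \]
Dividing (2.7) by $t > 0$, respectively $t < 0$, and taking the limit as $t \to 0^+$, respectively $0^-$, we obtain
\[ \tau(T; \varphi, \phi) \ge \int \phi\, d\nu \ge -\tau(T; \varphi, -\phi), \quad \forall \phi \in C^0(X). \tag{2.8} \]
Then if $\tau(T; \varphi, \phi) = -\tau(T; \varphi, -\phi)$ for any $\phi \in C^0(X)$, (2.8) yields
\[ \int \phi\, d\nu = \tau(T; \varphi, \phi), \quad \forall \phi \in C^0(X). \]
"3) ⇒ 1)" is trivial.

Corollary 2.5. Assume $h_\cdot(T) : M(X; T) \to \mathbb{R}$ is upper semicontinuous. Denote by $\mathcal{R} \subset C^0(X)$ the set of potential functions that have a unique equilibrium state. Then $\mathcal{R}$ is a $G_\delta$ set in $C^0(X)$.

Proof. Let $\{\phi_i\}_i$ be a countable and dense subset of $C^0(X)$. By Lemma 2.4 and 2) of Lemma 2.1, $\mathcal{R}$ can be represented as
\[ \mathcal{R} = \bigcap_i \{ \varphi \in C^0(X) \mid \tau(T; \varphi, \phi_i) = -\tau(T; \varphi, -\phi_i) \}. \]
Since $\tau(T; \varphi, \phi_i) \ge -\tau(T; \varphi, -\phi_i)$,
\begin{align*}
\{ \varphi \in C^0(X) \mid \tau(T; \varphi, \phi_i) = -\tau(T; \varphi, -\phi_i) \}
&= \bigcap_{\varepsilon > 0} \{ \varphi \in C^0(X) \mid \tau(T; \varphi, \phi_i) + \tau(T; \varphi, -\phi_i) < \varepsilon \} \\
&= \bigcap_{\varepsilon > 0} \Bigl\{ \varphi \in C^0(X) : \inf_{t > 0} \frac{P(T; \varphi + t\phi_i) + P(T; \varphi - t\phi_i) - 2P(T; \varphi)}{t} < \varepsilon \Bigr\} \\
&= \bigcap_{\varepsilon > 0} \bigcup_{t > 0} \{ \varphi \in C^0(X) : P(T; \varphi + t\phi_i) + P(T; \varphi - t\phi_i) - 2P(T; \varphi) < t\varepsilon \}.
\end{align*}
This implies that $\mathcal{R}$ is $G_\delta$.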
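As a sanity check on Lemma 2.4 and Corollary 2.5, one can work out a toy system by hand. The example below is an illustration chosen here, not one taken from the paper: $T$ is the identity on a two-point space, so entropy plays no role.

```latex
% Toy example (an illustration, not from the paper): X = \{x_1, x_2\}, T = \mathrm{id}.
% Every probability measure is T-invariant and h_\mu(T) = 0, so with
% \varphi_j = \varphi(x_j) and \phi_j = \phi(x_j):
\begin{align*}
P(T;\varphi) &= \sup_{0 \le p \le 1} \bigl(p\,\varphi_1 + (1-p)\,\varphi_2\bigr) = \max(\varphi_1, \varphi_2), \\
\tau(T;\varphi,\phi) &=
\begin{cases}
\phi_1 & \text{if } \varphi_1 > \varphi_2, \\
\phi_2 & \text{if } \varphi_1 < \varphi_2, \\
\max(\phi_1,\phi_2) & \text{if } \varphi_1 = \varphi_2,
\end{cases} \\
-\tau(T;\varphi,-\phi) &= \min(\phi_1,\phi_2) \quad \text{if } \varphi_1 = \varphi_2 .
\end{align*}
% Hence \tau(T;\varphi,\phi) = -\tau(T;\varphi,-\phi) for all \phi exactly when
% \varphi_1 \ne \varphi_2, which is also exactly when the equilibrium state is
% unique (\delta_{x_1} or \delta_{x_2}). The set \{\varphi_1 \ne \varphi_2\} is open
% and dense in C^0(X) \cong \mathbb{R}^2, in particular a G_\delta.
```

When $\varphi_1 = \varphi_2$ every probability measure on $X$ is an equilibrium state, and the values $\int \phi\, d\nu$ fill the whole interval $[\min(\phi_1, \phi_2), \max(\phi_1, \phi_2)]$, as in the proof of "1) ⇒ 2)".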

Remark 2.6. In fact, one may go further and prove that $\mathcal{R}$ is a dense $G_\delta$ set in $C^0(X)$; see Corollary 9.15.1 of [9].

Under the condition of Corollary 2.5, for any $\varphi \in \mathcal{R}$ we denote by $\mu_\varphi$ the unique equilibrium state of $\varphi$.

Corollary 2.7. $\mu_\varphi$ depends continuously in the weak*-topology on $\varphi \in \mathcal{R}$.

Proof. By 3) of Lemma 2.4, we have $\int \phi\, d\mu_\varphi = \tau(T; \varphi, \phi)$ for any $\phi \in C^0(X)$. Thus it is sufficient to prove that for any $\phi \in C^0(X)$, $\tau(T; \cdot, \phi)$ is continuous at $\varphi$, and this follows from 2) of Lemma 2.2.

Denote by $d : X \times X \to \mathbb{R}$ the distance function of $X$. Call $E \subset X$ $(n, \varepsilon)$-separated if, whenever $x, y$ are two distinct points in $E$, one can find $0 \le i \le n - 1$ with $d(T^i x, T^i y) > \varepsilon$.

Lemma 2.8. Given $\varepsilon > 0$ and $\psi \in C^0(X)$, for each $n \in \mathbb{N}$, let $E_n \subset X$ be an $(n, \varepsilon)$-separated set, and let $\mu_n \in M(X)$ be defined by
\[ \mu_n = \sum_{x \in E_n} \frac{e^{S_n \psi(x)}}{\sum_{y \in E_n} e^{S_n \psi(y)}} \cdot \frac{1}{n} \sum_{i=0}^{n-1} \delta_{T^i x}, \]
where $S_n \psi = \sum_{i=0}^{n-1} \psi \circ T^i$. Assume $\mu_{n_i} \to \mu$ in the weak*-topology; then $\mu \in M(X; T)$ and
\[ h_\mu(T) + \int_X \psi\, d\mu \ge \limsup_{i \to \infty} \frac{1}{n_i} \log \sum_{x \in E_{n_i}} e^{S_{n_i} \psi(x)}. \]

Proof. See part (2) of the proof of Theorem 9.10 in [9].

3. Generic Properties of $P(f|_{\Lambda_f}; \psi_f)$ and $\mathrm{Eq}(f|_{\Lambda_f}; \psi_f)$ for $f \in U$

For any $f \in U$, we define $\psi_f \in C^0(\Lambda_f)$ by $\psi_f(x) = -\log|\det(Df|E_x^u)|$, $x \in \Lambda_f$. With the preparations of the previous section, we now study $P(f|_{\Lambda_f}; \psi_f)$ and $\mathrm{Eq}(f|_{\Lambda_f}; \psi_f)$ for generic $f \in U$. Indeed, since $f|_{\Lambda_f}$ is expansive, the entropy map
\[ h_\cdot(f|_{\Lambda_f}) : M(\Lambda_f; f|_{\Lambda_f}) \to \mathbb{R} \]


is upper semicontinuous, thus h( f | f ) < ∞ (see Theorem 8.2 of [9]). Then all the results presented in the previous section hold in this situation. Recall that by classical SRB theory, if f ∈ U ∩ Diff2 (M), we have P( f | f ; ψ f ) = 0, Eq( f | f ; ψ f ) = 1.

(3.9)

Indeed, this is another presentation of the Ruelle-Pesin formula (see [4]). The next proposition says that this property holds for “most” f ∈ U. Proposition 3.1. 1. For any f ∈ U, P( f | f ; ψ f ) = 0. 2. There exists a generic subset U ⊂ U, so that for any f ∈ U , Eq( f | f ; ψ f ) = 1. Proof. We introduce a continuous map : U → C 0 ( f0 ) defined by ( f ) = ψ f ◦ r f . By invariance of topological pressure under conjugation, we have P( f | f ; ψ f ) = P( f 0 | f0 ; ( f )),

Eq( f | f ; ψ f ) = r f ∗ Eq( f 0 | f0 ; ( f )). (3.10)

For the first statement, let f be an arbitrary diffeomorphism in U, and { f k }k be C 2 C1

diffeomorphisms so that f k → f . Therefore P( f | f ; ψ f ) = P( f 0 | f0 ; ( f )) = lim P( f 0 | f0 ; ( f k )) k→∞ (3.9)

= lim P( f k | fk ; ψ fk ) = 0. k→∞

For the second statement, abusing the notations in Corollary 2.5, we denote by R the set of potentials in C 0 ( f0 ) that have unique equilibrium state w.r.t. f 0 | f0 . Let U = −1 (R). Clearly, for any f ∈ U , Eq( f | f , ψ f ) = 1. By Corollary 2.5, R is a G δ set, thus U is a G δ set in U. Moreover, by (3.9), U ∩ Diff2 (M) ⊂ U . This implies that U is dense in U. In the sequel, for any f ∈ U , we denote by μ f the unique equilibrium state for ψ f w.r.t. f | f . Derived directly from (3.10) and Corollary 2.7, we have: Corollary 3.2. μ f depends continuously in weak*-topology on f ∈ U . Corollary 3.3. μ f is ergodic (w.r.t f ). Proof. Let μ f = E ( f ; f | f ) μdη(μ) be the ergodic decomposition of μ f , where η ∈ M(M( f ; f | f )) with η(E( f ; f | f )) = 1. Therefore by Theorem 8.4 of [9], 0 = h μ f ( f | f ) +

f

ψ f dμ f =

E ( f ; f | f )

{h μ ( f | f ) +

f

ψ f dμ}dη(μ). (3.11)

By (3.9), h μ ( f | f ) + f ψ f dμ ≤ 0, and “=” holds if and only if μ = μ f . Then (3.11) implies that μ = μ f for η a.e. μ. Thus μ f is ergodic.
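The vanishing of the pressure in (3.9) can be checked by hand in a toy setting. The following Python sketch is not the attractor setting of this paper: it assumes a piecewise-linear interval map with two full branches of slopes 1/p and 1/(1 - p) (a hypothetical example chosen only for illustration), takes ψ = -log|T'|, and uses one point in each cylinder of length n as the separated set. The function name and its parameters are illustrative.

```python
import itertools
import math

def log_partition_sum(p, n):
    """Return (1/n) * log of the sum of exp(S_n psi) over all n-cylinders.

    Toy model (an assumption, not from the paper): a two-branch linear map
    with slopes 1/p and 1/(1-p), and potential psi = -log|T'|, so that
    psi = log(p) on branch 0 and psi = log(1-p) on branch 1.
    """
    total = 0.0
    for word in itertools.product((0, 1), repeat=n):
        k = sum(word)  # number of visits to branch 1 along the orbit
        s_n = (n - k) * math.log(p) + k * math.log(1.0 - p)  # Birkhoff sum S_n psi
        total += math.exp(s_n)
    return math.log(total) / n

value = log_partition_sum(0.3, 10)
```

In this toy model the weights exp(S_n ψ) are the lengths of the n-cylinders, so they sum to 1 and the quantity vanishes for every n, up to rounding. This is the same kind of sum that appears on the right-hand side of Lemma 2.8.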


4. Volume Estimate of B(μ f ) ∩ for f ∈ U Now we carry on to compute, for any fixed f ∈ U , the volume of B(μ f ) ∩ . Our aim is to derive estimate m(B(μ f ) ∩ ) = m()

(4.12)

through the thermodynamical properties P( f | f ; ψ f ) = 0,

Eq( f | f ; ψ f ) = {μ f }.

(4.13)

Recall that if we consider a local unstable manifold , by Bowen’s standard technique developed in [1], one can obtain the following estimate: m (B(μ f ) ∩ ) = m ()

(4.14)

C2

from (4.13). Then, when f is of class, by an absolutely continuous holonomy map derived by stable foliation of f , one can transfer (4.14) to every u-dimensional C 1 compact submanifold that is transverse to stable foliation (in the sequel, we call them u-transversal C 1 compact submanifold or u-TCSM in abbreviation). Observe that can be foliated by a smooth family of u-TCSM’s. Thus applying Fubini’s Theorem, one can integrate (4.14) over this family to obtain estimate (4.12). However, for f ∈ Diff1 (M), the above holonomy map is, in general, not absolutely continuous (see [6]). Our strategy in this situation is to generalize Bowen’s technique for every u-TCSM in to obtain (4.14). More specifically, we will prove: Lemma 4.1. Let ⊂ be a u-TCSM. Then m (B(μ f ) ∩ ) = m (). As an immediate consequence of Lemma 4.1 and Fubini’s Theorem, we have: Proposition 4.2. m(B(μ f ) ∩ ) = m(). Then Proposition 4.2, Proposition 3.1, Corollary 3.2 and Corollary 3.3 jointly accomplish the proof of Theorem A. Now we only need to prove Lemma 4.1. To illustrate the argument in a simple case, we first prove the lemma for those ’s so that: case *) for any i ∈ N, f i ∩ = ∅ and f i ∩ f = ∅. The proof of the general case is very similar. Before the formal argument, we need the following preparative lemma: Lemma 4.3. Let ⊂ be a u-TCSM. Then, 1. Given C1 > 0, there exist δ1 > 0, so that for any i, j ∈ N and any compact disk D ⊂ f i , diam(D) ≤ δ1 ⇒ m f i (D) ≤ C1 . 2. Given C2 > 1, there exist δ2 > 0, so that for any i ∈ N and any x ∈ f i , y ∈ f j , d(x, y) ≤ δ2 ⇒ C2−1 ≤ | det(D f |Tx f i )| · | det(D f |Ty f j )|−1 ≤ C2 , where d(·, ·) is the distance function of M. Proof. The detail of the proof is omitted. The key observation is that, due to λ-lemma (see p. 82 of [5]), f i “C 1 -converges” to f as i → ∞. Thus we can apply the argument of compactness over f ∪ i≥0 f i .


4.1. Proof of Lemma 4.1 for Case *). For of case *), we consider the positive invariant set = f ∪ i≥0 f i and potential ψ ∈ C 0 () defined by if x ∈ f ; ψ f (x), ψ(x) = i − log | det(D f |Tx f )|, if x ∈ f i , i = 0, 1, 2, . . . . By definition of case *), one sees that ψ is well defined. By λ-lemma, is a compact set, thus (, f |) is a topological dynamical system. Furthermore, i≥0 f i = f , which implies that any invariant measure on must be supported on f . Then the thermodynamical properties for f | f with potential ψ f can be handed to f | with potential ψ. Therefore, by (4.13) and upper-semicontinuity of h · ( f | f ) : M( f , f | f ) → R, we have 1) P( f |; ψ) = 0, Eq( f |; ψ) = {μ f }, (4.15) 2) h · ( f |) : M(, f |) → R is upper semicontinuous. For any r > 0, let Kr ⊂ M(; f |) be defined by {ν ∈ M(; f |) : h ν ( f |) + ψdν ≥ −r }.

Then by 1) of (4.15), r >0 Kr = {μ f }. Furthermore, by 2) of (4.15), Kr is closed in M(; f |), thus closed in M(). This implies M()\Kr is open in M(). Therefore by local compactness and local convexity of M(), M()\Kr can be covered by a countable family of open sets {Vi }i in M(), so that each Vi is convex, and the closure of Vi is contained in M()\Kr . For any W ⊂ M(), let (W, n) and (W) be defined by (W, n) = {x ∈ :

n−1 1 δ f i x ∈ W}, n i=0

1 i→∞ n i

(W) = {x ∈ : lim It is easy to see that (W) ⊂

n≥0

n i −1

δ f ni x ∈ W, for some{n i }i }.

i=0

i≥n

(W, i) whenever W is open.

Claim. For any V ∈ {Vi }i , m ((V)) = 0. Proof. We choose arbitrary C1 > 0, C2 > 1, and determine δ1 = ε1 (C1 , ), δ2 = ε2 (C2 , ) by Lemma 4.3. Let δ = min{δ1 , δ2 }. Moreover, choose 0 < ε < δ so that for any x, y ∈ M, d( f x, f y) < δ whenever d(x, y) < ε. For each n ∈ N, select E n an (n, ε) separated set that is maximal in (V, n). For each x ∈ E n , let B n,ε (x) = {y ∈ : d( f i x, f i y) ≤ ε, 0 ≤ i ≤ n − 1}. Due to maximality, (V, n) ⊂ x∈E n Bn,ε (x). Then m ((V, n)) ≤ m (Bn,ε (x)) = dm (y) x∈E n

=

x∈E n

x∈E n n−1 f n (Bn,ε (x)) i=0

Bn,ε (x)

| det(D f |T f −n+i y f i )|−1 dm f n (y )


≤ C2n

e Sn ψ(x) m f n ( f n (Bn,ε (x)))

x∈E n

≤ C1 C2n

e Sn ψ(x) ,

(4.16)

x∈E n

n−1 where Sn ψ = i=0 ψ ◦ f i. Now we apply Lemma 2.8 to (, f |), ψ and {E n }n . For each n ∈ N, let νn =

x∈E n

n e Sn ψ(x) 1 · δ f ix, Sn ψ(x) n x∈E n e i=0

and {νn i }i be a subsequence converging to some ν in weak*-sense. Then Lemma 2.8 gives 1 Sni ψ(x) lim sup log e ≤ h ν ( f |) + ψdν. (4.17) i→∞ n i x∈E ni

n−1

n−1 Observe that νn is a convex combination of { n1 i=0 δ f i x , x ∈ E n }, and n1 i=0 δfix ∈ V for any x ∈ E n . By convexity of V we have νn ∈ V, thus ν = limi→∞ νn i ∈ V ⊂ M()\Kr . Then by definition of Kr , h ν ( f |) + ψdν < −r . Therefore lim sup i→∞

1 log e Sni ψ(x) < −r. ni

(4.18)

x∈E ni

Clearly, (4.18) holds for any {n i }i such that νn i converges. Substituting n i in (4.18) by n, lim sup n→∞

1 log m ((V, n)) < −r. n

(4.19)

Combine (4.19) with (4.16), lim sup n→∞

1 1 log m ((V, n)) ≤ lim sup log e Sn ψ(x) + log C2 n n→∞ n x∈E n

< −r + log C2 .

(4.20)

1 log m ((V, n)) ≤ −r. n

(4.21)

Let C2 → 1, we have lim sup n→∞

Then, given 0 < σ < r , there exist C > 1, so that

m ((V, n)) ≤ Ce−(r −σ )n .

(4.22)

(V, i) because V is open, m ((V)) ≤ lim sup m ((V, i)) ≤ lim sup Ce−(r −σ )i = 0.

Note that (V) ⊂

n≥0

i≥n

n→∞

n→∞

i≥n

This ends the proof of the claim.

i≥n

(4.23)
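The last step of the claim is a Borel-Cantelli type estimate. As a sketch, write $A_n$ for the n-th set bounded in (4.22) and $A$ for the limit set, which is contained in $\bigcap_{n}\bigcup_{i \ge n} A_i$ (the labels $A_n$, $A$ and the abbreviation $\rho = r - \sigma > 0$ are introduced here only for illustration). The geometric tail then gives:

```latex
\[
  m(A) \;\le\; \inf_{n \ge 0} \sum_{i \ge n} m(A_i)
       \;\le\; \inf_{n \ge 0} \sum_{i \ge n} C e^{-\rho i}
       \;=\; \inf_{n \ge 0} \frac{C e^{-\rho n}}{1 - e^{-\rho}}
       \;=\; 0 .
\]
```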


As a consequence of the claim, m ((M()\Kr )) ≤ then by


m ((Vi )) = 0,

i

r >0 Kr

= {μ f },

m ((M()\{μ f })) = lim m ((M()\Kr )) = 0. r →0

Clearly, we have = (M()\{μ f })∪(B(μ f )∩) and (M()\{μ f })∩(B(μ f )∩ ) = ∅. Thus m (B(μ f ) ∩ ) = m () − m ((M()\{μ f })) = m (). This completes the proof of Lemma 4.1 for case *).

(4.24)

4.2. Proof of Lemma 4.1. Now we are going to apply the above argument in the general case. Note that the crucial point in the previous proof is that we “naturally” extend ψ f to ψ, in the sense that ψ| is “compatible” with volume measure on . However, without the assumption in case *), such an extension may be unrealizable. For example, assume there exists x ∈ ∩ f so that Tx = E xu , ψ(x) should equal − log | det(D f |E xu )| if x is referred to a point in f , while ψ(x) must be − log | det(D f |Tx )| if x is considered contained in , and | det(D f |E xu )| = | det(D f |Tx )| in general. Similar problem happens when there exists y ∈ ∩ f i with Ty = Ty f i . To overcome this problem, we introduce the framework of the Grassmann bundle, in which the previously mentioned Tx and E xu (respectively, Ty and Ty f i ) are forced apart. In precise words, let π : G(M, u) → M be the u-dimensional Grassmann bundle over M. For any V ⊂ T M a u-dimensional linear subspace, we write [V ] to denote the corresponding element in G(M, u). The topology of G(M, u) is determined by the distance function ˆ d([V ], [V ]) = min{l(γ ) + ∠π([V ]) (V, Pγ V )|γ : [0, 1] → M is piecewise smooth with γ (0) = π([V ]), γ (1) = π([V ])}, where l(γ ) is the length of γ , Pγ the parallel translation along γ , and ∠π([V ]) (V, Pγ V ) sup{v − v | v ∈ V, v ∈ Pγ V , v = v = 1}. Under this topology π : G(M, u) → M is a continuous map. Let fˆ : G(M, u) → G(M, u) be a homeomorphism defined by fˆ[V ] = [D f (V )]. Then f ◦ π = π ◦ fˆ. Let potential ψˆ ∈ C 0 (G(M, u)) be defined by ˆ ψ([V ]) = − log | det(D f |V )|.

Proof of Lemma 4.1. Still as in case *), we consider = f ∪ i≥0 f i . Moreover, define the following sets of G(M, u) that are related to : ˆ ˆf = ˆ = ˆ = ˆf ∪ [E xu ], [Tx ], fˆi . x∈ f

x∈

i≥0


ˆ f, ˆ and ˆ respectively onto f , and . In particular, π | ˆf : Clearly, π maps ˆ ˆ ˆ f) f → f is a homeomorphism. Then by upper semicontinuity of h · ( f | f ), h · ( f | ˆ is upper semicontinuous. Moreover, since (ψ ◦ π )| f = ψ f , by (4.13) and invariance of topological pressure, ˆ = 0, ˆ f ; ψ) P( fˆ|

ˆ f ; ψ) ˆ = {μˆ f }, Eq( fˆ|

(4.25)

ˆ f )−1 where μˆ f (π | ∗ μf. ˆ is a compact set, thus (, ˆ fˆ|) ˆ is a topological dynamical system. By λ-lemma, i ˆ = ˆ f . Then for a similar reason mentioned before (4.15), we Furthermore, i≥0 fˆ have ˆ ψ) ˆ = 0, Eq( fˆ|; ˆ ψ) ˆ = {μˆ f }, 1) P( fˆ|; (4.26) ˆ : M(; ˆ fˆ|) ˆ → R is upper semicontinuous. 2) h · ( fˆ|) ˆ fˆ|) ˆ be defined by For any r > 0, let Kˆ r ⊂ M(; ˆ νˆ ≥ −r }. ˆ + ˆ fˆ|) ˆ : h νˆ ( fˆ|) ψd {ˆν ∈ M(; ˆ

Then by 1) of (4.26), r >0 Kˆ r = {μˆ f }. Furthermore, by 2) of (4.26) Kˆ r is closed in ˆ fˆ|), ˆ thus closed in M(). ˆ This implies M()\ ˆ Kˆ r is open in M(). ˆ ThereM(; ˆ ˆ ˆ fore by local compactness and local convexity of M(), M()\Kr can be covered by a countable family of open sets {Vˆ i }i , so that each Vˆ i is convex, and the closure of Vˆ i is ˆ Kˆ r . contained in M()\ In the sequel, for any x ∈ , we write xˆ to represent [Tx ] for simplicity. For any ˆ ⊂ M(), ˆ n) and ( ˆ be defined by ˆ let ( ˆ W, ˆ W) W n−1 1 ˆ n) = {xˆ ∈ ˆ ˆ W, ˆ : ( δ fˆi xˆ ∈ W}, n i=0

1 i→∞ n i

ˆ = {xˆ ∈ ˆ W) ˆ : lim (

n i −1

ˆ for some {n i }i }. δ fˆni xˆ ∈ W,

i=0

ˆ ˆ V))) Claim. For any Vˆ ∈ {Vˆ i }i , m (π(( = 0. Proof. Again, we choose arbitrary C1 > 0, C2 > 1, determine δ1 = ε1 (C1 , ), δ2 = δ2 (C2 , ) and δ = min{δ1 , δ2 } by Lemma 4.3, and choose 0 < ε < δ as in case *). For ˆ n)). ˆ V, each n ∈ N, select E n an (n, ε) separated set (w.r.t. f ) that is maximal in π(( ˆ ˆ ˆ We write E n = {xˆ : x ∈ E n }, then E n is (n, ε) separated (w.r.t. f ). For each x ∈ E n , let Bn,ε (x) = {y ∈ : d( f i x, f i y) ≤ ε, 0 ≤ i ≤ n − 1}. Due to maximality, ˆ n)) ⊂ ˆ V, π(( x∈E n Bn,ε (x). Then similar to (4.16), ˆ n))) ≤ ˆ V, m (π((

x∈E n

where Sn ψˆ =

n−1 i=0

ψˆ ◦ fˆi .

m (Bn,ε (x)) ≤ C1 C2n

x∈ ˆ Eˆ n

ˆ

ˆ e Sn ψ(x) ,

(4.27)


ˆ fˆ|), ˆ ψˆ and Eˆ n . With same argument as in between Now we apply Lemma 2.8 to (, (4.17) and (4.21), we have lim sup n→∞

1 ˆ n))) ≤ −r, ˆ V, log m (π(( n

(4.28)

which implies, as in (4.23), that ˆ ˆ V))) m (π(( = 0.

(4.29)

This ends the proof of claim. As a consequence of the claim, ˆ ˆ Kˆ r ))) ≤ )\ m (π((M(

ˆ Vˆ i ))) = 0, m (π((

i

then by

ˆ = {μˆ f },

r >0 Kr

ˆ ˆ μˆ f }))) = lim m (π((M( ˆ ˆ Kˆ r ))) = 0. m (π((M( )\{ )\ r →0

Moreover, it is easy to check that ˆ ˆ μˆ f })) = (M()\{μ f }). π((M( )\{ Then similar to (4.24), m (B(μ f ) ∩ ) = m () − m ((M()\{μ f }) ˆ ˆ μˆ f }))) = m (). )\{ = m () − m (π((M(

(4.30)

This completes the proof of Lemma 4.1.

Acknowledgements. The author sincerely thanks Professor HU Huyi and Professor GAN Shaobo for posing to him the problem addressed in this paper, and for helpful discussions with them. The author also thanks Professor WEN Lan, Professor SUN Wenxiang and Professor CAO Yongluo for their useful comments.

References
1. Bowen, R.: Equilibrium states and ergodic theory of Anosov diffeomorphisms. Lecture Notes in Mathematics 470. New York: Springer-Verlag, 1975
2. Campbell, J., Quas, A.: A Generic C^1 Expanding Map has a Singular SRB Measure. Commun. Math. Phys. 221, 335–349 (2001)
3. Keller, G.: Equilibrium states in ergodic theory. Cambridge: Cambridge University Press, 1998
4. Ledrappier, F., Young, L-S.: The metric entropy of diffeomorphisms, Part I: Characterization of measures satisfying Pesin’s entropy formula. Annals Math. 122, 509–539 (1985)
5. Palis, J., de Melo, W.: Geometric theory of dynamical systems: an introduction. New York: Springer-Verlag, 1982
6. Robinson, C., Young, L-S.: Nonabsolutely continuous foliations for an Anosov diffeomorphism. Invent. Math. 61, 159–176 (1980)
7. Shub, M.: Global stability of dynamical systems. New York: Springer-Verlag, 1987
8. Viana, M.: Stochastic dynamics of deterministic systems. Lecture Notes 21st Braz. Math. Colloq. Rio de Janeiro: IMPA, 1997
9. Walters, P.: An introduction to ergodic theory. Graduate Texts in Mathematics 79, New York: Springer-Verlag, 1982

Communicated by G. Gallavotti

Commun. Math. Phys. 302, 359–402 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1131-7

Communications in

Mathematical Physics

KAM Theory in Configuration Space and Cancellations in the Lindstedt Series

Livia Corsi1, Guido Gentile1, Michela Procesi2

1 Dipartimento di Matematica, Università di Roma Tre, Roma, I-00146, Italy. E-mail: [email protected]; [email protected]
2 Dipartimento di Matematica, Università di Napoli “Federico II”, Napoli, I-80126, Italy. E-mail: [email protected]

Received: 8 March 2010 / Accepted: 18 May 2010
Published online: 29 September 2010 – © Springer-Verlag 2010

Abstract: The KAM theorem for analytic quasi-integrable anisochronous Hamiltonian systems yields that the perturbation expansion (Lindstedt series) for any quasi-periodic solution with Diophantine frequency vector converges. If one studies the Lindstedt series by following a perturbation theory approach, one finds that convergence is ultimately related to the presence of cancellations between contributions of the same perturbation order. In turn, this is due to symmetries in the problem. Such symmetries are easily visualised in action-angle coordinates, where the KAM theorem is usually formulated by exploiting the analogy between Lindstedt series and perturbation expansions in quantum field theory and, in particular, the possibility of expressing the solutions in terms of tree graphs, which are the analogue of Feynman diagrams. If the unperturbed system is isochronous, Moser’s modifying terms theorem ensures that an analytic quasi-periodic solution with the same Diophantine frequency vector as the unperturbed Hamiltonian exists for the system obtained by adding a suitable constant (counterterm) to the vector field. Also in this case, one can follow the alternative approach of studying the perturbation expansion for both the solution and the counterterm, and again convergence of the two series is obtained as a consequence of deep cancellations between contributions of the same order. In this paper, we revisit Moser’s theorem, by studying the perturbation expansion one obtains by working in Cartesian coordinates. We investigate the symmetries giving rise to the cancellations which make possible the convergence of the series. We find that the cancellation mechanism works in a completely different way in Cartesian coordinates, and the interpretation of the underlying symmetries in terms of tree graphs is much more subtle than in the case of action-angle coordinates.

1. Introduction

Consider an isochronous Hamiltonian system, described by the Hamiltonian $H(\alpha, A) = \omega \cdot A + \varepsilon f(\alpha, A)$, with $f$ real analytic in $\mathbb{T}^d \times \mathcal{A}$ and $\mathcal{A}$ an open subset of $\mathbb{R}^d$. The corresponding Hamilton equations are
\[
  \dot\alpha = \omega + \varepsilon\, \partial_A f(\alpha, A), \qquad
  \dot A = -\varepsilon\, \partial_\alpha f(\alpha, A). \tag{1.1}
\]
Let $(\alpha_0(t), A_0(t)) = (\alpha_0 + \omega t, A_0)$ be a solution of (1.1) for $\varepsilon = 0$. For $\varepsilon \neq 0$, in general, there is no quasi-periodic solution to (1.1) with frequency vector $\omega$ which reduces to $(\alpha_0(t), A_0(t))$ as $\varepsilon \to 0$. However, one can prove that, if $\varepsilon$ is small enough and $\omega$ satisfies some Diophantine condition, then there is a ‘correction’ $\mu(\varepsilon, A_0)$, analytic in both $\varepsilon$ and $A_0$, such that the modified equations
\[
  \dot\alpha = \omega + \varepsilon\, \partial_A f(\alpha, A) + \mu(\varepsilon, A_0), \qquad
  \dot A = -\varepsilon\, \partial_\alpha f(\alpha, A) \tag{1.2}
\]

admit a quasi-periodic solution with frequency vector $\omega$ which reduces to $(\alpha_0(t), A_0(t))$ as $\varepsilon \to 0$. This is a well known result, called the modifying terms theorem, or translated torus theorem, first proved by Moser [20]. By writing the solution as a power series in $\varepsilon$ (Lindstedt series), the existence of an analytic solution means that the series converges. This is ultimately related to some deep cancellations in the series; see [1] for a review.

Equations like (1.1) naturally arise when studying the stability of an elliptic equilibrium point. For instance, one can think of a mechanical system near a minimum point for the potential energy, where the Hamiltonian describing the system looks like
\[
  H(x_1, \ldots, x_d, y_1, \ldots, y_d) = \frac{1}{2} \sum_{j=1}^{d} \left( y_j^2 + \omega_j^2 x_j^2 \right) + \varepsilon F(x_1, \ldots, x_d, \varepsilon), \tag{1.3}
\]
where $F$ is a real analytic function at least of third order in its arguments, the vector $\omega = (\omega_1, \ldots, \omega_d)$ satisfies some Diophantine condition, and the factor $\varepsilon$ can be assumed to be obtained after a rescaling of the original coordinates (such a rescaling makes sense if one wants to study the behaviour of the system near the origin). Indeed, the corresponding Hamilton equations, written in action-angle variables, are of the form (1.1). Unfortunately, the action-angle variables are singular near the equilibrium, and hence there are problems in the region where one of the actions is much smaller than the others. Thus, it can be worthwhile to work directly in the original Cartesian coordinates. In fact, there has been a lot of interest in KAM theory in configuration space, that is, without action-angle variables; see for instance [6,19,22].

1.1. Set up of the problem.

In this paper we consider the ordinary differential equations
\[
  \ddot x_j + \omega_j^2 x_j + f_j(x_1, \ldots, x_d, \varepsilon) + \eta_j x_j = 0, \qquad j = 1, \ldots, d, \tag{1.4}
\]
where $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$, $\varepsilon$ is a real parameter (perturbation parameter), the function $f(x, \varepsilon) = (f_1(x, \varepsilon), \ldots, f_d(x, \varepsilon))$ is real analytic in $x$ and $\varepsilon$ at $(x, \varepsilon) = (0, 0)$ and at least quadratic in $x$,
\[
  f_j(x, \varepsilon) = \sum_{p=1}^{\infty} \varepsilon^p \sum_{\substack{s_1, \ldots, s_d \ge 0 \\ s_1 + \cdots + s_d = p+1}} f_{j, s_1, \ldots, s_d}\, x_1^{s_1} \cdots x_d^{s_d}, \tag{1.5}
\]
(by taking $f_j(x, \varepsilon) = -\varepsilon\, \partial_{x_j} F(x, \varepsilon)$ one recovers the Hamilton equations corresponding to the Hamiltonian (1.3)), $\eta = (\eta_1, \ldots, \eta_d)$ is a vector of parameters, and the frequency vector (or rotation vector) $\omega = (\omega_1, \ldots, \omega_d)$ satisfies the Diophantine condition
\[
  |\omega \cdot \nu| > \gamma_0 |\nu|^{-\tau} \qquad \forall \nu \in \mathbb{Z}^d_*, \tag{1.6}
\]


with $\mathbb{Z}^d_* = \mathbb{Z}^d \setminus \{0\}$, $\tau > d - 1$ and $\gamma_0 > 0$. Here and henceforth · denotes the standard scalar product in $\mathbb{R}^d$, and $|\nu| = |\nu_1| + \ldots + |\nu_d|$. In light of Moser’s theorem of the modifying terms, one expects that, by taking the (arbitrary) unperturbed solution $x_{0,j}(t) = C_j \cos \omega_j t + S_j \sin \omega_j t = c_j e^{i\omega_j t} + c_j^* e^{-i\omega_j t}$, $j = 1, \ldots, d$, there exists a function $\eta(\varepsilon, c)$, analytic both in $\varepsilon$ and $c = (c_1, \ldots, c_d)$, such that, by fixing $\eta_j = \eta_j(\varepsilon, c)$, there exists a quasi-periodic solution to (1.4) with frequency vector $\omega$, which reduces to the unperturbed one as $\varepsilon \to 0$. In fact, this is what happens: the result is just a rephrasing of Moser’s modifying terms theorem, with the advantage that it extends to the regions of phase space where the action-angle variables cannot be defined, and hence is not surprising; see also [6].

What is less obvious is the cancellation mechanism which is behind the convergence of the perturbation series. The problem can be described as follows. One can try to write again, as in action-angle variables, the solution as a power series in $\varepsilon$, and study directly the convergence of the series. In general, when considering the Lindstedt series of some KAM problem, first of all one identifies the terms of the series which are an obstruction to convergence: such terms are usually called resonances (or self-energy clusters, by analogy to what happens in quantum field theory). Crudely speaking, the series is given by the sum of infinitely many terms (finitely many for each perturbation order), and each term looks like a product of ‘small divisors’ times some harmless factors: a resonance is a particular structure in the product which allows a dangerous accumulation of small divisors.

This phenomenon is very easily visualised when each term of the series is graphically represented as a tree graph (tree tout court in the following), that is, a set of points and lines connecting them in such a way that no loop arises; we refer to [10,13,15] for an introduction to the tree formalism. Shortly, in any tree, each line $\ell$ carries a label $j_\ell \in \{1, \ldots, d\}$ and a label $\nu_\ell \in \mathbb{Z}^d$ (that one calls momentum, again inspired by the terminology of quantum field theory) and with each such line a small divisor $\delta_{j_\ell}(\omega \cdot \nu_\ell)$ is associated; here $u \to \delta_j(u)$ is a smooth function, which depends on both the model under study and the coordinates one is working with, for instance $\delta_j(u) = u$ for (1.2), while $\delta_j(u) = u^2 - \omega_j^2$ for (1.4). Then a resonance becomes a subgraph which is between two lines $\ell_1$ and $\ell_2$ with the same small divisors, i.e. $\delta_{j_{\ell_1}}(\omega \cdot \nu_{\ell_1}) = \delta_{j_{\ell_2}}(\omega \cdot \nu_{\ell_2})$. A tree with a chain of resonances represents a term of the series containing a factor $\delta_j(\omega \cdot \nu)$ to a very large power, and this produces a factorial $k!$ to some positive power when bounding some terms contributing to the $k$th order in $\varepsilon$ of the Lindstedt series, so preventing a proof of convergence.

However, a careful analysis of the resonances shows that there are cancellations to all perturbation orders. This is what can be proved in the case of the standard anisochronous KAM theorem, as first pointed out by Eliasson [8]; see also [9,10], for a proof which more deeply exploits the similarity with the techniques of quantum field theory. More precisely the cancellation mechanism works in the following way. Given a tree $\theta$ and two lines $\ell_1$ and $\ell_2$ of $\theta$ with the same small divisor, consider all possible resonances which can be inserted between $\ell_1$ and $\ell_2$. For each possible resonance one obtains a different tree, which represents a term of the perturbation series, and each term can be written as the product of a numerical value corresponding to the resonance times a numerical value associated to the points and lines of $\theta$ which are outside the resonance: this second numerical value is the same for all such trees, and hence factorises out. When summing together the numerical values corresponding to all resonances, there are compensations and the sum is in fact much smaller than each summand (for more details we refer to [10,13]).
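To see heuristically why repeated small divisors threaten convergence, here is a rough estimate under assumptions that are not spelled out in the text: the momenta appearing at order $k$ satisfy $|\nu| \le c k$ for some constant $c$, each divisor is only known to obey the Diophantine lower bound $\delta \ge \gamma |\nu|^{-\tau}$, and a chain of resonances lets the same divisor appear a number of times proportional to $k$ (taken equal to $k$ here for simplicity). The symbols $c$ and $\delta$ are illustrative.

```latex
\[
  \frac{1}{\delta^{k}}
  \;\le\; \Bigl(\frac{(ck)^{\tau}}{\gamma}\Bigr)^{k}
  \;=\; \Bigl(\frac{c^{\tau}}{\gamma}\Bigr)^{k} k^{\tau k}
  \;\le\; \Bigl(\frac{(ce)^{\tau}}{\gamma}\Bigr)^{k} (k!)^{\tau},
  \qquad \text{using } k^{k} \le e^{k}\, k! .
\]
```

This is only an upper bound, so it shows what a naive term-by-term estimate yields, not that the series actually diverges; the cancellations described above are what rule out such growth.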


For the isochronous case, already in action-angle variables [1], there are some kinds of resonances which do not cancel each other. Nevertheless there are other kinds of resonances for which the gain factor due to the cancellation is more than what is needed (that is, one has a second order instead of a first order cancellation). Thus, the hope naturally arises that one can use the extra gain factors to compensate the lack of gain factors for the first kind of resonances, and in fact this happens. Indeed, the resonances for which there is no cancellation cannot accumulate too much without entailing the presence of as many resonances with the extra gain factors, in such a way that the overall number of gain factors is, on average, one per resonance (this is essentially the meaning of Lemma 5.4 in [1]). When working in Cartesian coordinates, one immediately meets a difficulty. If one writes down the lowest order resonances, there is no cancellation at all. This is slightly surprising because a cancellation is expected somewhere: if the resonances do not cancel each other, in principle one can construct trees containing chains of arbitrarily many resonances, and these trees represent terms of the formal power series expansion for which a bound proportional to some factorial seems unavoidable. However, we shall show that there are cancellations, as soon as one has at least two resonances. So, one has the curious phenomenon that resonances which do not cancel each other are allowed, but they cannot accumulate too much. Moreover, the cancellation mechanism is more involved than in other cases (including the same problem in action-angle variables). First of all, the resonances are no longer diagonal in the momenta, that is, the lines $\ell_1$ and $\ell_2$ considered above can have different momenta $\nu_{\ell_1}$ and $\nu_{\ell_2}$. Second, the cancellation does not operate simply by collecting together all resonances to a given order and then summing the corresponding numerical values. 
As we mentioned, in this way no cancellation is produced: to obtain a cancellation one has to consider all possible ways to connect two resonances to each other. Thus, there is a cancellation only if there is a chain of at least two resonances. What emerges eventually is that working in Cartesian coordinates rather complicates the analysis. On the other hand, as remarked above, it can be worthwhile to investigate the problem in Cartesian coordinates. Moreover, the cancellations are due to remarkable symmetries in the problem, which can be of interest on their own; in this regard we mention the problem of the reducibility of the skew-product flows with Bryuno base [11], where the convergence of the corresponding Lindstedt series is also due to some cancellation mechanism and hence to some deep symmetry of the system. In this paper we shall assume the standard Diophantine condition on the frequency vector ω; see (1.6) below. Of course one could consider more general Diophantine conditions than the standard one (for instance a Bryuno condition [5]; see also [12] for a discussion using the Lindstedt series expansion). This would make the analysis slightly more complicated, without shedding further light on the problem. An important feature of the Lindstedt series method is that, from a conceptual point of view, the general strategy is exactly the same independent of the kind of coordinates one uses (and independent of the fact that the system is a discrete map or a continuous flow; see [2,10,15]). What is really important for the analysis is the form of the unperturbed solution: the simpler such a solution is the easier the analysis. Of course, an essential issue is that the system one wants to study is a perturbation of one which is exactly soluble. This is certainly true in the case of quasi-integrable Hamiltonian systems, but of course the range of applicability is much wider, and includes also non-Hamiltonian systems; see for instance [14,16]. 
Moreover an assumption of this kind is more or less always implicit in whatever method one can envisage to deal with small divisor problems of this kind; see also [6].


In the anisochronous case, the cancellations are due to symmetry properties of the model, essentially the symplectic character of the problem, as first pointed out by Eliasson [8]. The cancellation mechanism for the resonances is deeply related to that assuring the formal solubility of the equations of motion, which in turn is due to a symmetry property as already shown by Poincaré [21]. We refer to [17] for a detailed comparison between Eliasson’s method and the tree formalism that we are using here. Note that, despite what is sometimes claimed in the literature, Eliasson did not study how the resonances have to be regrouped in order to exhibit the cancellation; on the contrary, he proved that, because of the aforementioned symmetry properties, the sum of (the leading parts of) all possible resonances must cancel out; a proof of the cancellation through a careful regrouping of the resonances was first given by Gallavotti [9]. Subsequently, stressing further the analogy with quantum field theory, Bricmont et al. showed that the cancellations can be interpreted as a consequence of suitable Ward identities of the corresponding field theory [4] (see also [7]): the symmetry property corresponds to the translation invariance of the field theory. In the isochronous case, in terms of Cartesian coordinates the cancellation mechanism works in a completely different way with respect to action-angle coordinates. However, as we shall see, the cancellation is still related to underlying symmetry properties: it would be interesting to relate the symmetry properties that we find to invariance properties of the corresponding quantum field model, as done in [4] for the KAM theorem.

1.2. Statement of the results.

Now, we give a formal statement of our results. As stressed above, the main point of the paper is not in the results themselves, but in the method used to prove them, in particular in the analysis of the perturbation series and of the cancellation mechanism which is at the base of the convergence of the series. We look for quasi-periodic solutions $x(t)$ of (1.4) with frequency vector $\omega$. Therefore we expand the function $x(t)$ by writing
\[
  x(t) = \sum_{\nu \in \mathbb{Z}^d} e^{i \nu \cdot \omega t}\, x_\nu, \tag{1.7}
\]
and we denote by $f_\nu(x, \varepsilon)$ the $\nu$th Fourier coefficient of the function that we obtain by Taylor-expanding $f(x, \varepsilon)$ in powers of $x$ and Fourier-expanding $x$ according to (1.7). Thus, in Fourier space (1.4) becomes
\[
  \left[ (\omega \cdot \nu)^2 - \omega_j^2 \right] x_{j,\nu} = f_{j,\nu}(x, \varepsilon) + \eta_j x_{j,\nu}. \tag{1.8}
\]
For $\varepsilon = 0$, $\eta = 0$, the vector $x^{(0)}(t)$ with components
\[
  x_j^{(0)}(t) = c_j e^{i\omega_j t} + c_j^* e^{-i\omega_j t}, \qquad j = 1, \ldots, d, \tag{1.9}
\]
is a solution of (1.4) for any choice of the complex constant $c = (c_1, \ldots, c_d)$. Here and henceforth $*$ denotes complex conjugation. Define $e_j$ as the vector with components $\delta_{ij}$ (Kronecker delta). Then we can split (1.8) into two sets of equations, called respectively the bifurcation equation and the range equation,
\[
  f_{j,\sigma e_j}(x, \varepsilon) + \eta_j x_{j,\sigma e_j} = 0, \qquad j = 1, \ldots, d, \quad \sigma = \pm 1, \tag{1.10a}
\]
\[
  \left[ (\omega \cdot \nu)^2 - \omega_j^2 \right] x_{j,\nu} = f_{j,\nu}(x, \varepsilon) + \eta_j x_{j,\nu}, \qquad j = 1, \ldots, d, \quad \nu \neq \pm e_j. \tag{1.10b}
\]
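The passage from (1.4) to (1.8) is a one-line computation. As a sketch: inserting the expansion (1.7) into (1.4), and using that differentiating $e^{i\nu\cdot\omega t}$ twice in $t$ produces the factor $-(\omega\cdot\nu)^2$, the coefficient of $e^{i\nu\cdot\omega t}$ in the $j$th equation reads:

```latex
\[
  -(\omega\cdot\nu)^2 x_{j,\nu} + \omega_j^2 x_{j,\nu}
  + f_{j,\nu}(x,\varepsilon) + \eta_j x_{j,\nu} = 0
  \quad\Longleftrightarrow\quad
  \bigl[(\omega\cdot\nu)^2 - \omega_j^2\bigr]\, x_{j,\nu}
  = f_{j,\nu}(x,\varepsilon) + \eta_j x_{j,\nu} .
\]
```

For $\nu = \pm e_j$ the bracket vanishes identically, which is why these harmonics are split off as the bifurcation equation (1.10a).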


We shall study both Eqs. (1.10) simultaneously, by showing that for all choices of the parameters c there exist suitable counterterms η, depending analytically on ε and c, such that (1.10) admits a quasi-periodic solution with frequency vector ω, which is analytic in ε, c, and t. Moreover, with the choice x j,e j = c j for all j = 1, . . . , d, the counterterms are uniquely determined. We formulate the following result. Theorem 1.1. Consider the system described by Eqs. (1.4) and let (1.9) be a solution at ε = 0, η = 0. Set (c) = max{|c1 |, . . . , |cd |, 1}. There exist a positive constant η0 , small enough and independent of ε, c, and a unique function η(ε, c), holomorphic in the domain |ε| 3 (c) ≤ η0 and real for real ε, such that the system x¨ j + ω2j x j + f j (x1 , . . . , xd , ε) + η j (ε, c) x j = 0,

j = 1, . . . , d,

admits a solution x(t) = x(t, ε, c) of the form (1.7), holomorphic in the domain |ε| 3 (c)e3|ω| |Im t| ≤ η0 and real for real ε, t, with Fourier coefficients $x_{j,e_j} = c_j$ and $x_{j,\nu} = O(\varepsilon)$ if $\nu \neq \pm e_j$ for $j = 1, \ldots, d$.

The proof is organised as follows. After introducing the small divisors and proving some simple preliminary properties in Sect. 2, we develop in Sect. 3 a graphical representation for the power series of the counterterms and the solution (tree expansion). In particular we perform a multiscale analysis which allows us to single out the contributions (self-energy clusters) which give problems when trying to bound the coefficients of the series. In Sect. 4 we show that, as far as such contributions are neglected, there is no difficulty in obtaining power-like estimates on the coefficients: these estimates, which are generalisations of the Siegel-Bryuno bounds holding for anisochronous systems [9,10], would imply the convergence of the series and hence analyticity. In Sect. 5 we discuss how to deal with the self-energy clusters: in particular we single out the leading part of their contributions (localised values), which are proved in Sect. 6 to satisfy some deep symmetry properties. Finally, in Sect. 7 we show how the symmetry properties can be exploited in order to obtain cancellations involving the localised parts, in such a way that the remaining contributions can still be bounded in a summable way. This will yield the convergence of the full series and hence the analyticity of both the solution and the counterterms.

Note that the system dealt with in Theorem 1.1 can be non-Hamiltonian. On the other hand the most general case for a Hamiltonian system near a stable equilibrium allows for Hamiltonians of the form
\[
  H(x_1, \ldots, x_d, y_1, \ldots, y_d) = \frac{1}{2} \sum_{j=1}^{d} \left( y_j^2 + \omega_j^2 x_j^2 \right) + \varepsilon F(x_1, \ldots, x_d, y_1, \ldots, y_d, \varepsilon), \tag{1.11}
\]
which lead to the equations
\[
  \dot x_j = y_j + \varepsilon\, \partial_{y_j} F(x, y, \varepsilon), \qquad
  \dot y_j = -\omega_j^2 x_j - \varepsilon\, \partial_{x_j} F(x, y, \varepsilon). \tag{1.12}
\]
Also in this case one can consider the modified equations
\[
  \dot x_j = y_j + \varepsilon\, \partial_{y_j} F(x, y, \varepsilon), \qquad
  \dot y_j = -\omega_j^2 x_j - \varepsilon\, \partial_{x_j} F(x, y, \varepsilon) + \eta_j x_j, \tag{1.13}
\]


which are not of the form considered in Theorem 1.1. However, a result in the same spirit as Theorem 1.1 still holds.

Theorem 1.2. Consider the system described by Eqs. (1.13) and let $(x^{(0)}(t), y^{(0)}(t))$ be a solution at $\varepsilon = 0$, $\eta = 0$, with $x^{(0)}(t)$ given by (1.9) and $y^{(0)}(t) = \dot x^{(0)}(t)$. Set $\Gamma(c) = \max\{|c_1|,\dots,|c_d|,1\}$. Then there exist a positive constant $\eta_0$, small enough and independent of $\varepsilon, c$, and a unique function $\eta(\varepsilon,c)$, holomorphic in the domain $|\varepsilon|\,\Gamma^3(c) \le \eta_0$ and real for real $\varepsilon$, such that the system

$$\dot x_j = y_j + \varepsilon\,\partial_{y_j}F(x,y,\varepsilon), \qquad \dot y_j = -\omega_j^2 x_j - \varepsilon\,\partial_{x_j}F(x,y,\varepsilon) + \eta_j(\varepsilon,c)\,x_j$$

admits a solution $(x(t,\varepsilon,c), y(t,\varepsilon,c))$, holomorphic in the domain $|\varepsilon|\,\Gamma^3(c)\,e^{3|\omega|\,|\mathrm{Im}\,t|} \le \eta_0$ and real for real $\varepsilon, t$, with Fourier coefficients $x_{j,e_j} = y_{j,e_j}/i\omega_j = c_j$ and $x_{j,\nu} = y_{j,\nu} = O(\varepsilon)$ if $\nu \neq \pm e_j$ for $j = 1,\dots,d$.

The proof follows the same lines as that of Theorem 1.1, and it is discussed in Appendices A and B. Finally in Appendix C we briefly sketch an alternative approach based on the resummation of the perturbation series.

2. Preliminary Results

We shall denote by $\mathbb{N}$ the set of (strictly) positive integers, and set $\mathbb{Z}_+ = \mathbb{N} \cup \{0\}$. For any $j = 1,\dots,d$ and $\nu \in \mathbb{Z}^d$ define the small divisors

$$\delta_j(\omega\cdot\nu) := \min\{|\omega\cdot\nu - \omega_j|,\ |\omega\cdot\nu + \omega_j|\} = |\omega\cdot(\nu - \sigma(\nu,j)\,e_j)|, \tag{2.1}$$

where $\sigma(\nu,j)$ is the minimizer. Note that the Diophantine condition (1.6) implies that

$$\delta_j(\omega\cdot\nu) \ge \gamma\,|\nu|^{-\tau} \qquad \forall j = 1,\dots,d,\ \ \forall \nu \neq 0,\ \sigma(\nu,j)\,e_j, \tag{2.2a}$$

$$\delta_j(\omega\cdot\nu) + \delta_{j'}(\omega\cdot\nu') \ge \gamma\,|\nu - \nu'|^{-\tau} \qquad \forall j,j' = 1,\dots,d,\ \ \forall \nu \neq \nu',\ \ \nu - \nu' \neq \sigma(\nu,j)\,e_j - \sigma(\nu',j')\,e_{j'}, \tag{2.2b}$$

for a suitable $\gamma > 0$. We can (and shall) assume that $\gamma$ is sufficiently smaller than $\gamma_0$, and hence than δ(0) = min{|ω1|, . . . , |ωd|} and ω := min{||ωi| − |ωj|| : 1 ≤ i < j ≤ d}.

Lemma 2.1. Given $\nu, \nu' \in \mathbb{Z}^d$, with $\nu \neq \nu'$, and $\delta_j(\omega\cdot\nu) = \delta_{j'}(\omega\cdot\nu')$ for some $j, j' \in \{1,\dots,d\}$, then either $|\nu - \nu'| \ge |\nu| + |\nu'| - 2$ or $|\nu - \nu'| = 2$.

Proof. One has $\delta_j(\omega\cdot\nu) = |\omega\cdot\nu - \sigma\omega_j|$ and $\delta_{j'}(\omega\cdot\nu') = |\omega\cdot\nu' - \sigma'\omega_{j'}|$, with $\sigma = \sigma(\nu,j)$ and $\sigma' = \sigma(\nu',j')$. Set $\bar\nu = \nu - \sigma e_j$ and $\bar\nu' = \nu' - \sigma' e_{j'}$. By the Diophantine condition (1.6) one can have $\delta_j(\omega\cdot\nu) = \delta_{j'}(\omega\cdot\nu')$, and hence $|\omega\cdot\bar\nu| = |\omega\cdot\bar\nu'|$, if and only if $\bar\nu = \pm\bar\nu'$. If $\bar\nu = -\bar\nu'$ then for $\sigma = -\sigma'$ one has $|\nu - \nu'| = |\nu| + |\nu'|$, while for $\sigma = \sigma'$ one obtains $|\nu - \nu'| \ge |\nu| + |\nu'| - 2$. If $\bar\nu = \bar\nu'$ and $j = j'$ one has $\nu_i = \nu'_i$ for all $i \neq j$ and $\nu_j - \sigma = \nu'_j - \sigma'$, and hence $|\nu_j - \nu'_j| = 2$. If $\bar\nu = \bar\nu'$ and $j \neq j'$ then $\nu_i = \nu'_i$ for all $i \neq j, j'$, while $\nu_j - \sigma = \nu'_j$ and $\nu_{j'} = \nu'_{j'} - \sigma'$, and hence $|\nu_j - \nu'_j| = |\nu_{j'} - \nu'_{j'}| = 1$.
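The definition (2.1) can be made concrete with a short numerical sketch. The function names `small_divisor` and `shifted_mode`, the 0-based component index, the sample frequency vector, and the tie-breaking rule toward $\sigma = +1$ are assumptions of this illustration and are not taken from the paper.

```python
import math

def small_divisor(omega, nu, j):
    """Return (delta, sigma) as in (2.1) for the component index j (0-based).

    delta = min(|omega.nu - omega_j|, |omega.nu + omega_j|) and sigma is the
    sign realising the minimum, so that delta = |omega.nu - sigma*omega_j|.
    """
    u = sum(w * n for w, n in zip(omega, nu))
    with_plus = abs(u - omega[j])    # candidate for sigma = +1
    with_minus = abs(u + omega[j])   # candidate for sigma = -1
    # Assumption of this sketch: ties are resolved in favour of sigma = +1.
    return (with_plus, 1) if with_plus <= with_minus else (with_minus, -1)

def shifted_mode(nu, j, sigma):
    """Return the vector nu - sigma * e_j."""
    bar = list(nu)
    bar[j] -= sigma
    return tuple(bar)

# Illustrative frequency vector (an assumption, chosen only for the example).
omega = (1.0, math.sqrt(2.0))

for nu in [(1, 0), (-1, 0), (0, 1), (2, -1)]:
    delta, sigma = small_divisor(omega, nu, 0)
    bar = shifted_mode(nu, 0, sigma)
    # Second expression in (2.1): delta = |omega . (nu - sigma e_j)|.
    check = abs(sum(w * n for w, n in zip(omega, bar)))
    print(nu, sigma, delta, check)
```

The two printed numbers agree up to rounding, which is the second equality in (2.1); the modes $\nu = \pm e_j$ give a vanishing small divisor, which is why they are excluded in (2.2a).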


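Lemma 2.2. Let $\nu, \nu' \in \mathbb{Z}^d$ be such that $\nu \neq \nu'$ and, for some $n \in \mathbb{Z}_+$, $j, j' \in \{1,\dots,d\}$, both $\delta_j(\omega\cdot\nu) \le 2^{-n}\gamma$ and $\delta_{j'}(\omega\cdot\nu') \le 2^{-n}\gamma$ hold. Then either $|\nu - \nu'| > 2^{(n-2)/\tau}$ or $|\nu - \nu'| = 2$ and $\delta_j(\omega\cdot\nu) = \delta_{j'}(\omega\cdot\nu')$.

Proof. Write $\delta_j(\omega\cdot\nu) = |\omega\cdot\nu - \sigma\omega_j|$ and $\delta_{j'}(\omega\cdot\nu') = |\omega\cdot\nu' - \sigma'\omega_{j'}|$, with $\sigma = \sigma(\nu,j)$ and $\sigma' = \sigma(\nu',j')$, and set $\bar\nu = \nu - \sigma e_j$ and $\bar\nu' = \nu' - \sigma' e_{j'}$ as above. If $\bar\nu \neq \bar\nu'$, by the Diophantine condition (2.2b), one has

$$\gamma\,|\bar\nu - \bar\nu'|^{-\tau} < |\omega\cdot(\bar\nu - \bar\nu')| \le |\omega\cdot\bar\nu| + |\omega\cdot\bar\nu'| < 2^{-(n-1)}\gamma,$$

which implies $|\bar\nu - \bar\nu'| > 2^{(n-1)/\tau}$, and hence we have $|\nu - \nu'| > 2^{(n-2)/\tau}$ in such a case. If $\bar\nu = \bar\nu'$ then, as in Lemma 2.1, one has $|\nu - \nu'| = 2$ and $\delta_j(\omega\cdot\nu) = \delta_{j'}(\omega\cdot\nu')$.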

Remark 2.3. Note that $|\nu - \nu'| \le 2$ and $\delta_j(\omega\cdot\nu) = \delta_{j'}(\omega\cdot\nu')$ if and only if $\nu - \nu' = \sigma(\nu,j)\,e_j - \sigma(\nu',j')\,e_{j'}$.

Lemma 2.4. Let $\nu_1,\dots,\nu_p \in \mathbb{Z}^d$ and $j_1,\dots,j_p \in \{1,\dots,d\}$, with $p \ge 2$, be such that $|\nu_i - \nu_{i-1}| \le 2$ and $\delta_{j_i}(\omega\cdot\nu_i) = \delta_{j_1}(\omega\cdot\nu_1) \le \gamma$ for $i = 2,\dots,p$. Then $|\nu_1 - \nu_p| \le 2$.

Proof. Set $\sigma_i = \sigma(\nu_i, j_i)$ and $\bar\nu_i = \nu_i - \sigma_i e_{j_i}$ for $i = 1,\dots,p$. For all $i = 2,\dots,p$, the assumption $\delta_{j_i}(\omega\cdot\nu_i) = \delta_{j_{i-1}}(\omega\cdot\nu_{i-1})$ implies $\bar\nu_i = \pm\bar\nu_{i-1}$, which in turn yields $\bar\nu_i = \bar\nu_{i-1}$, since $|\nu_i - \nu_{i-1}| \le 2$. In particular $\bar\nu_1 = \bar\nu_p$, and hence $|\nu_1 - \nu_p| \le 2$.

3. Multiscale Analysis and Diagrammatic Rules

As we are looking for $x(t,\varepsilon,c)$ and $\eta(\varepsilon,c)$ analytic in $\varepsilon$, we formally write

$$x_{j,\nu} = \sum_{k=0}^{\infty} \varepsilon^k x^{(k)}_{j,\nu}, \qquad \eta_j = \sum_{k=1}^{\infty} \varepsilon^k \eta^{(k)}_j. \tag{3.1}$$

It is not difficult to see that using (3.1) in (1.10) one can recursively compute (at least formally) the coefficients $x^{(k)}_{j,\nu}$, $\eta^{(k)}_j$ to all orders. Here we introduce a graphical representation for each contribution to $x^{(k)}_{j,\nu}$, $\eta^{(k)}_j$, which will allow us to study the convergence of the series.

3.1. Trees. A graph is a set of points and lines connecting them. A tree $\theta$ is a graph with no cycle, such that all the lines are oriented toward a unique point (root) which has only one incident line (root line). All the points in a tree except the root are called nodes. The orientation of the lines in a tree induces a partial ordering relation ($\preceq$) between the nodes and the lines: we can imagine that each line carries an arrow pointing toward the root; see Fig. 1. Given two nodes $v$ and $w$, we shall write $w \prec v$ every time $v$ is along the path (of lines) which connects $w$ to the root. We call $E(\theta)$ the set of end nodes in $\theta$, that is, the nodes which have no entering line, and $V(\theta)$ the set of internal nodes in $\theta$, that is, the set of nodes which have at least one entering line. Set $N(\theta) = E(\theta) \cup V(\theta)$. For all $v \in N(\theta)$ denote by $s_v$ the number of lines entering the node $v$.


Fig. 1. An unlabelled tree: the arrows on the lines all point toward the root, according to the tree partial ordering

Remark 3.1. One has $\sum_{v \in V(\theta)} s_v = |N(\theta)| - 1$.

We denote by $L(\theta)$ the set of lines in $\theta$. We call an internal line a line exiting an internal node and an end line a line exiting an end node. Since a line $\ell \in L(\theta)$ is uniquely identified with the node $v$ which it leaves, we may write $\ell = \ell_v$. We write $\ell_w \prec \ell_v$ if $w \prec v$; we say that a node $w$ precedes a line $\ell$, and write $w \prec \ell$, if $\ell_w \preceq \ell$.

Notation 3.2. (1) If $\ell$ and $\ell'$ are two comparable lines, i.e., $\ell' \prec \ell$, we denote by $P(\ell,\ell')$ the (unique) path of lines connecting $\ell'$ to $\ell$, the lines $\ell$ and $\ell'$ being excluded. (2) Each internal line $\ell \in L(\theta)$ can be seen as the root line of the tree $\theta_\ell$ whose nodes and lines are those of $\theta$ which precede $\ell$, that is, $N(\theta_\ell) = \{v \in N(\theta) : v \prec \ell\}$ and $L(\theta_\ell) = \{\ell' \in L(\theta) : \ell' \preceq \ell\}$.

3.2. Tree labels. With each end node $v \in E(\theta)$ we associate a mode label $\nu_v \in \mathbb{Z}^d$, a component label $j_v \in \{1,\dots,d\}$, and a sign label $\sigma_v \in \{\pm\}$; see Fig. 2. We call $E^{\sigma}_j(\theta)$ the set of end nodes $v \in E(\theta)$ such that $j_v = j$ and $\sigma_v = \sigma$. With each internal node $v \in V(\theta)$ we associate a component label $j_v \in \{1,\dots,d\}$, and an order label $k_v \in \mathbb{Z}_+$. Set $V_0(\theta) = \{v \in V(\theta) : k_v = 0\}$ and $N_0(\theta) = E(\theta) \cup V_0(\theta)$. We also associate a sign label $\sigma_v \in \{\pm\}$ with each $v \in V_0(\theta)$. The internal nodes $v$ with $k_v \ge 1$ will be drawn as black bullets, while the end nodes and the internal nodes with $k_v = 0$ will be drawn as white bullets and white squares, respectively; see Fig. 2. With each line $\ell$ we associate a momentum label $\nu_\ell \in \mathbb{Z}^d$, a component label $j_\ell \in \{1,\dots,d\}$, a sign label $\sigma_\ell \in \{\pm\}$, and a scale label $n_\ell \in \mathbb{Z}_+ \cup \{-1\}$; see Fig. 3. Denote by $s_{v,j}$ the number of lines $\ell$ with component label $j_\ell = j$ entering the node $v$, and by $r_{v,j,\sigma}$ the number of end lines with component label $j$ and sign label $\sigma$ which enter the node $v$. Of course $s_v = s_{v,1} + \dots + s_{v,d}$ and $s_{v,j} \ge r_{v,j,+} + r_{v,j,-}$ for all $j = 1,\dots,d$.


Fig. 2. Nodes and labels associated with the nodes: (a) end node $v$ with $s_v = 0$, $j_v \in \{1,\dots,d\}$, $\sigma_v \in \{\pm\}$, and $\nu_v = \sigma_v e_{j_v}$ (cf. Sect. 3.3); (b) internal node $v$ with $s_v \ge 2$, $j_v \in \{1,\dots,d\}$, and $k_v = s_v - 1$ (cf. Sect. 3.3); (c) internal node $v$ with $s_v = 2$, $j_v \in \{1,\dots,d\}$, $k_v = 0$, $\sigma_v \in \{\pm\}$ (cf. Sect. 3.3)

Fig. 3. Labels associated with a line $\ell$. One has $\sigma_\ell = \sigma(\nu_\ell, j_\ell)$ (cf. Sect. 3.3). Moreover if $\ell = \ell_v$ then $j_\ell = j_v$; if $v \in V_0(\theta)$ one has also $\sigma_\ell = \sigma_v$; if $\nu_\ell = \sigma_\ell e_{j_\ell}$ then $n_\ell = -1$, otherwise $n_\ell \ge 0$ (cf. Sect. 3.3)

Finally call

$$k(\theta) := \sum_{v \in V(\theta)} k_v$$

the order of the tree $\theta$. In the following we shall call trees tout court the trees with labels, and we shall use the term unlabelled trees for the trees without labels.

3.3. Constraints on the tree labels.

Constraint 3.3. We have the following constraints on the labels of the nodes (see Fig. 2):
(1) if $v \in V(\theta)$ one has $s_v \ge 2$;
(2) if $v \in E(\theta)$ one has $\nu_v = \sigma_v e_{j_v}$;
(3) if $v \in V(\theta)$ then $k_v = s_v - 1$, except for $s_v = 2$, where both $k_v = 1$ and $k_v = 0$ are allowed.

Constraint 3.4. The following constraints will be imposed on the labels of the lines:
(1) $j_\ell = j_v$, $\nu_\ell = \nu_v$, and $\sigma_\ell = \sigma_v$ if $\ell$ exits $v \in E(\theta)$;
(2) $j_\ell = j_v$ if $\ell$ exits $v \in V(\theta)$;
(3) if $\ell$ is an internal line then $\sigma_\ell = \sigma(\nu_\ell, j_\ell)$, i.e., $\delta_{j_\ell}(\omega\cdot\nu_\ell) = |\omega\cdot\nu_\ell - \sigma_\ell\,\omega_{j_\ell}|$ (see (2.1) for notations);
(4) if $v \in V_0(\theta)$ then (see Fig. 4)
  1. $s_v = 2$;
  2. both lines $\ell_1$ and $\ell_2$ entering $v$ are internal and have $\sigma_{\ell_1} = \sigma_{\ell_2} = \sigma_v$ and $j_{\ell_1} = j_{\ell_2} = j_v$;
  3. either $\nu_{\ell_1} = \sigma_v e_{j_v}$ and $\nu_{\ell_2} \neq \sigma_v e_{j_v}$, or $\nu_{\ell_1} \neq \sigma_v e_{j_v}$ and $\nu_{\ell_2} = \sigma_v e_{j_v}$;
  4. $\sigma_{\ell_v} = \sigma_v$;
(5) if $\ell$ is an internal line and $\nu_\ell = \sigma_\ell e_{j_\ell}$, then $\ell$ enters a node $v \in V_0(\theta)$;
(6) $n_\ell \ge 0$ if $\nu_\ell \neq \sigma_\ell e_{j_\ell}$ and $n_\ell = -1$ otherwise.


Fig. 4. If there is an internal node $v$ with $k_v = 0$ then $s_v = 2$ and the following constraints are imposed on the other labels: $\sigma_{\ell_v} = \sigma_{\ell_1} = \sigma_{\ell_2} = \sigma_v$; $j_{\ell_v} = j_{\ell_1} = j_{\ell_2} = j_v$; exactly one of the two entering lines $\ell_1$, $\ell_2$ has momentum equal to $\sigma_v e_{j_v}$ (the figure shows one of the two cases). (The scale labels are not shown)


Fig. 5. Conservation law: (a) $v$ with $k_v = s_v - 1 \ge 1$, so that $\nu_\ell = \nu_{\ell_1} + \dots + \nu_{\ell_{s_v}}$, (b) $v$ with $s_v = 2$ and $k_v = 0$. (The scale labels are not shown)

Notation 3.5. Given a tree $\theta$, call $\ell_0$ its root line and consider the internal lines $\ell_1,\dots,\ell_p \in L(\theta)$ on scale $-1$ (if any) such that one has $n_\ell \ge 0$ for all $\ell \in P(\ell_0,\ell_i)$, $i = 1,\dots,p$; we shall say that $\ell_1,\dots,\ell_p$ are the lines on scale $-1$ which are closest to the root of $\theta$. For each such line $\ell_i$, call $\theta_i = \theta_{\ell_i}$. Then we call pruned tree $\breve\theta$ the subgraph with set of nodes and set of lines

$$N(\breve\theta) = N(\theta) \setminus \bigcup_{i=1}^{p} N(\theta_i), \qquad L(\breve\theta) = L(\theta) \setminus \bigcup_{i=1}^{p} L(\theta_i),$$

respectively. By construction, $\breve\theta$ is a tree, except that, with respect to the constraints listed above, one has $s_v = 1$ whenever $k_v = 0$; moreover one has $\nu_\ell \neq \sigma_\ell e_{j_\ell}$ (and hence $n_\ell \ge 0$) for all internal lines $\ell \in L(\breve\theta)$ except possibly the root line.

Constraint 3.6. The modes of the end nodes and the momenta of the lines are related as follows: if $\ell = \ell_v$ one has the conservation law

$$\nu_\ell = \sum_{\substack{w \in E(\theta) \\ w \preceq v}} \nu_w - \sum_{\substack{w \in V_0(\theta) \\ w \preceq v}} \sigma_w e_{j_w} = \sum_{\substack{w \in E(\breve\theta) \\ w \preceq v}} \nu_w.$$

Note that by Constraint 3.6 one has $\nu_\ell = \nu_v$ if $v \in E(\theta)$, and $\nu_\ell = \nu_{\ell_1} + \dots + \nu_{\ell_{s_v}}$ if $v \in V(\theta)$, $k_v \ge 1$, and $\ell_1,\dots,\ell_{s_v}$ are the lines entering $v$; see Fig. 5. Moreover for any line $\ell \in L(\theta)$ one has $|\nu_\ell| \le |E(\breve\theta)|$.
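The conservation law can be illustrated with a small recursive computation. This is only a sketch: the `Node` class, the 0-based component labels, and the sample tree are assumptions made for the example, and the remaining constraints on the labels (scales, the sign rule for internal lines) are not checked here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    j: int                        # component label (0-based in this sketch)
    sigma: Optional[int] = None   # sign label, used by end nodes and k = 0 nodes
    k: Optional[int] = None       # order label; None marks an end node
    children: List["Node"] = field(default_factory=list)

def momentum(v: Node, d: int) -> Tuple[int, ...]:
    """Momentum of the line exiting v, following the conservation law.

    End nodes contribute nu_v = sigma_v e_{j_v}; each internal node with
    k_v = 0 subtracts sigma_v e_{j_v} from the sum over its subtree.
    """
    if v.k is None:
        nu = [0] * d
        nu[v.j] = v.sigma
        return tuple(nu)
    total = [0] * d
    for w in v.children:
        for i, x in enumerate(momentum(w, d)):
            total[i] += x
    if v.k == 0:
        total[v.j] -= v.sigma
    return tuple(total)

# Hypothetical example with d = 2: a node with k = 0 whose entering lines
# carry momenta e_0 (the line that would be on scale -1) and (1, 1).
left = Node(j=0, k=2, children=[Node(0, 1), Node(1, 1), Node(1, -1)])
right = Node(j=0, k=1, children=[Node(0, 1), Node(1, 1)])
v0 = Node(j=0, sigma=1, k=0, children=[left, right])

print(momentum(left, 2), momentum(right, 2), momentum(v0, 2))
```

In the example the line exiting the node with $k_v = 0$ carries the same momentum as the entering line whose momentum differs from $\sigma_v e_{j_v}$, in agreement with Fig. 5(b).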


Remark 3.7. In the following we shall repeatedly consider the operation of changing the sign label of the nodes. Of course this change produces the change of other labels, consistently with the constraints mentioned above: for instance, if we change the label $\sigma_v$ of an end node $v$ into $-\sigma_v$, then also $\nu_v$ is changed into $-\nu_v$; if we change the sign labels of all the end nodes, then also the momenta of all the lines are changed, according to the conservation law (Constraint 3.6); and so on.

Two unlabelled trees are called equivalent if they can be transformed into each other by continuously deforming the lines in such a way that they do not cross each other. We shall call equivalent two trees if the same happens in such a way that all labels match.

Notation 3.8. We denote by $\mathcal{T}^k_{j,\nu}$ the set of inequivalent trees of order $k$ with tree component $j$ and tree momentum $\nu$, that is, such that the component label and the momentum of the root line are $j$ and $\nu$, respectively. Finally for $n \ge -1$ define $\mathcal{T}^k_{j,\nu}(n)$ the set of trees $\theta \in \mathcal{T}^k_{j,\nu}$ such that $n_\ell \le n$ for all $\ell \in L(\theta)$.

Remark 3.9. For $\theta \in \mathcal{T}^k_{j,\nu}$, by writing $\nu = (\nu_1,\dots,\nu_d)$, one has $\nu_i = |E^{+}_i(\breve\theta)| - |E^{-}_i(\breve\theta)|$ for $i = 1,\dots,d$. In particular for $\nu = \sigma e_j$, one has $|E^{\sigma}_j(\breve\theta)| = |E^{-\sigma}_j(\breve\theta)| + 1 \ge 1$, and $|E^{\sigma}_{j'}(\breve\theta)| = |E^{-\sigma}_{j'}(\breve\theta)|$ for all $j' \neq j$.

Lemma 3.10. The number of unlabelled trees $\theta$ with $N$ nodes is bounded by $4^N$. If $k(\theta) = k$ then $|E(\theta)| \le E_0 k$ and $|V(\theta)| \le V_0 k$, for suitable positive constants $E_0$ and $V_0$.

Proof. The bound $|V(\theta)| \le |E(\theta)| - 1$ is easily proved by induction using that $s_v \ge 2$ for all $v \in V(\theta)$. So it is enough to bound $|E(\theta)|$. The definition of order and Remark 3.1 yield $|E(\theta)| = 1 + k(\theta) + |V_0(\theta)|$, and the bound $|V_0(\theta)| \le 2k(\theta) - 1$ immediately follows by induction on the order of the tree, simply using that $s_v \ge 2$ for $v \in V(\theta)$. Thus, the assertions are proved with $E_0 = V_0 = 3$.
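The counting identities used in the proof can be checked on randomly generated trees. The sketch below is only an illustration: the tuple encoding of trees and the generator `random_tree` are assumptions of the example, and the generator enforces Constraint 3.3 only (it does not require the lines entering a node with $k_v = 0$ to be internal), so the bound on $|V_0(\theta)|$ is not tested.

```python
import random

def random_tree(depth, rng):
    """Random tree as nested tuples: ('end',) or ('int', k_v, children).

    Internal nodes have s_v >= 2 and k_v = s_v - 1, except that for
    s_v = 2 the value k_v = 0 is also allowed (Constraint 3.3).
    """
    if depth == 0 or rng.random() < 0.3:
        return ("end",)
    s = rng.randint(2, 4)
    k = s - 1
    if s == 2 and rng.random() < 0.5:
        k = 0
    return ("int", k, [random_tree(depth - 1, rng) for _ in range(s)])

def counts(tree):
    """Return (|E|, |V|, |V_0|, k(theta), sum over internal nodes of s_v)."""
    if tree[0] == "end":
        return (1, 0, 0, 0, 0)
    _, k, children = tree
    n_end = n_int = n_zero = order = entering = 0
    for child in children:
        e, v, v0, kk, s = counts(child)
        n_end += e
        n_int += v
        n_zero += v0
        order += kk
        entering += s
    return (n_end, n_int + 1, n_zero + (1 if k == 0 else 0),
            order + k, entering + len(children))

rng = random.Random(0)
for _ in range(200):
    E, V, V0, k, S = counts(random_tree(5, rng))
    assert S == E + V - 1      # Remark 3.1
    assert E == 1 + k + V0     # identity used in the proof of Lemma 3.10
    assert V <= max(E - 1, 0)  # |V| <= |E| - 1 (a single end node has V = 0)
```

The identity $|E(\theta)| = 1 + k(\theta) + |V_0(\theta)|$ follows by summing $s_v - 1$ over the internal nodes, since $s_v - 1 = k_v$ when $k_v \ge 1$ and $s_v - 1 = 1$ when $k_v = 0$.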

3.4. Tree expansion. Now we shall see how to associate with each tree $\theta \in \mathcal{T}^k_{j,\nu}$ a contribution to the coefficients $x^{(k)}_{j,\nu}$ and $\eta^{(k)}_j$ of the power series in (3.1). For all $j = 1,\dots,d$ set $c^{+}_j = c_j$ and $c^{-}_j = c^{*}_j$. We associate with each end node $v \in E(\theta)$ a node factor

$$F_v := c^{\sigma_v}_{j_v}, \tag{3.2}$$

and with each internal node $v \in V(\theta)$ a node factor

$$F_v := \begin{cases} \dfrac{s_{v,1}! \cdots s_{v,d}!}{s_v!}\, f_{j_v,s_{v,1},\dots,s_{v,d}}, & k_v \ge 1, \\[2mm] -\dfrac{1}{2c^{\sigma_v}_{j_v}}, & k_v = 0, \end{cases} \tag{3.3}$$

where the coefficients $f_{j,s_1,\dots,s_d}$ are defined in (1.5).


Fig. 6. The functions $\psi$ and $\Psi_n$

Let $\psi$ be a non-decreasing $C^{\infty}$ function defined in $\mathbb{R}_+$, such that (see Fig. 6)

$$\psi(u) = \begin{cases} 1, & \text{for } u \ge 7\gamma/8, \\ 0, & \text{for } u \le 5\gamma/8, \end{cases} \tag{3.4}$$

and set $\chi(u) := 1 - \psi(u)$. For all $n \in \mathbb{Z}_+$ define $\chi_n(u) := \chi(2^n u)$ and $\psi_n(u) := \psi(2^n u)$, and set (see Fig. 6)

$$\Psi_n(u) = \chi_{n-1}(u)\,\psi_n(u), \tag{3.5}$$

where $\chi_{-1}(u) = 1$. Note that $\chi_{n-1}(u)\chi_n(u) = \chi_n(u)$, and hence $\{\Psi_n(u)\}_{n \in \mathbb{Z}_+}$ is a partition of unity.

We associate with each line $\ell$ a propagator $G_\ell := G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell)$, where

$$G^{[n]}_j(u) := \begin{cases} \dfrac{\Psi_n(\delta_j(u))}{u^2 - \omega_j^2}, & n \ge 0, \\[2mm] 1, & n = -1. \end{cases} \tag{3.6}$$

Remark 3.11. The number of scale labels which can be associated with a line $\ell$ in such a way that $G_\ell \neq 0$ is at most 2. In particular, given a line $\ell$ with momentum $\nu_\ell = \nu$ and scale $n_\ell = n$, such that $\Psi_n(\delta_j(\omega\cdot\nu)) \neq 0$, then (see Fig. 6)

$$2^{-(n+1)}\gamma \le \frac{5}{8}\,2^{-n}\gamma \le \delta_j(\omega\cdot\nu) \le \frac{7}{8}\,2^{-(n-1)}\gamma \le 2^{-(n-1)}\gamma, \tag{3.7}$$

and if $\Psi_n(\delta_j(\omega\cdot\nu))\,\Psi_{n+1}(\delta_j(\omega\cdot\nu)) \neq 0$, then

$$\frac{5}{8}\,2^{-n}\gamma \le \delta_j(\omega\cdot\nu) \le \frac{7}{8}\,2^{-n}\gamma. \tag{3.8}$$

We define

$$\mathcal{V}(\theta) := \Big(\prod_{\ell \in L(\theta)} G_\ell\Big)\Big(\prod_{v \in N(\theta)} F_v\Big), \tag{3.9}$$

and call $\mathcal{V}(\theta)$ the value of the tree $\theta$.

Remark 3.12. The number of trees $\theta \in \mathcal{T}^k_{j,\nu}$ with $\mathcal{V}(\theta) \neq 0$ is bounded proportionally to $C^k$, for some positive constant $C$. This immediately follows from Lemma 3.10 and the observation that the number of trees obtained from a given unlabelled tree by assigning the labels to the nodes and the lines is also bounded by a constant to the power $k$ (use Remark 3.11 to bound the number of allowed scale labels).
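The scale decomposition (3.4) to (3.6) can be explored numerically. The sketch below is an illustration under stated assumptions: the particular smooth step used for $\psi$ (built from $e^{-1/t}$), the value $\gamma = 1$, and the function names are choices made for the example; the paper only requires $\psi$ to be non-decreasing, $C^\infty$, and to satisfy (3.4).

```python
import math

GAMMA = 1.0  # illustrative value of gamma (an assumption of this sketch)

def _bump(t):
    """exp(-1/t) for t > 0 and 0 otherwise (smooth on the real line)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def psi(u, gamma=GAMMA):
    """One admissible choice for (3.4): 0 for u <= 5g/8, 1 for u >= 7g/8."""
    t = (u - 5.0 * gamma / 8.0) / (gamma / 4.0)
    return _bump(t) / (_bump(t) + _bump(1.0 - t))

def chi_n(u, n, gamma=GAMMA):
    """chi_n(u) = 1 - psi(2^n u), with the convention chi_{-1} = 1."""
    return 1.0 if n < 0 else 1.0 - psi(2.0 ** n * u, gamma)

def Psi(u, n, gamma=GAMMA):
    """Psi_n(u) = chi_{n-1}(u) psi_n(u), as in (3.5)."""
    return chi_n(u, n - 1, gamma) * psi(2.0 ** n * u, gamma)

def propagator(u, omega_j, n, gamma=GAMMA):
    """G_j^{[n]}(u) as in (3.6), with delta_j(u) = min(|u - w_j|, |u + w_j|)."""
    if n == -1:
        return 1.0
    delta = min(abs(u - omega_j), abs(u + omega_j))
    return Psi(delta, n, gamma) / (u * u - omega_j ** 2)

def active_scales(u, n_max=60, gamma=GAMMA):
    """Scales n in [0, n_max] with Psi_n(u) != 0."""
    return [n for n in range(n_max + 1) if Psi(u, n, gamma) != 0.0]

for u in [2.0, 0.8, 0.3, 0.01, 1e-6]:
    scales = active_scales(u)
    total = sum(Psi(u, n) for n in scales)
    print(u, scales, total)
```

For each sample value at most two scales are active and the corresponding values of $\Psi_n$ add up to 1 within rounding; consequently the sum over $n \ge 0$ of `propagator(u, omega_j, n)` reproduces $1/(u^2 - \omega_j^2)$ whenever $\delta_j(u) > 0$.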


Remark 3.13. In any tree $\theta$ there is at least one end node with node factor $c^{\sigma}_j$ for each internal node $v$ with $k_v = 0$, $\sigma_v = \sigma$ and $j_v = j$ (this is easily proved by induction on the order of the pruned tree): the node factors $-1/2c^{\sigma}_j$ do not introduce any singularity at $c^{\sigma}_j = 0$. Therefore for any tree $\theta$ the corresponding value $\mathcal{V}(\theta)$ is well defined because both propagators and node factors are finite quantities. Remark 3.12 implies that also

$$\sum_{\theta \in \mathcal{T}^k_{j,\nu}} \mathcal{V}(\theta)$$

is well defined for all $k \in \mathbb{N}$, all $j \in \{1,\dots,d\}$, and all $\nu \in \mathbb{Z}^d$.

Lemma 3.14. For all $k \in \mathbb{N}$, all $j = 1,\dots,d$, and any $\theta \in \mathcal{T}^k_{j,\sigma e_j}$, there exists $\theta' \in \mathcal{T}^k_{j,-\sigma e_j}$ such that $c^{-\sigma}_j\,\mathcal{V}(\theta) = c^{\sigma}_j\,\mathcal{V}(\theta')$. The tree $\theta'$ is obtained from $\theta$ by changing the sign labels of all the nodes $v \in N_0(\theta)$.

Proof. The proof is by induction on the order of the tree. For any tree $\theta \in \mathcal{T}^k_{j,e_j}$ consider the tree $\theta' \in \mathcal{T}^k_{j,-e_j}$ obtained from $\theta$ by replacing all the labels $\sigma_v$ of all nodes $v \in N_0(\theta)$ with $-\sigma_v$, so that the mode labels $\nu_v$ are replaced with $-\nu_v$ and the momenta $\nu_\ell$ with $-\nu_\ell$ (see Remark 3.7). Call $\ell_1,\dots,\ell_p$ the lines on scale $-1$ (if any) closest to the root of $\theta$, and for $i = 1,\dots,p$ denote by $v_i$ the node $\ell_i$ enters and $\theta_i = \theta_{\ell_i}$ (recall (2) in Notation 3.2). As an effect of the change of the sign labels, each tree $\theta_i$ is replaced with a tree $\theta'_i$ such that $c^{-\sigma_{v_i}}_{j_{v_i}}\,\mathcal{V}(\theta_i) = c^{\sigma_{v_i}}_{j_{v_i}}\,\mathcal{V}(\theta'_i)$, by the inductive hypothesis. Thus, for each node $v_i$ the quantity $F_{v_i}\mathcal{V}(\theta_i)$ is not changed. Moreover, neither the propagators of the lines $\ell \in L(\breve\theta)$ nor the node factors corresponding to the internal nodes $v \in V(\breve\theta)$ with $k_v \neq 0$ change, while the node factors $c^{\sigma_v}_{j_v}$ of the nodes $v \in E(\breve\theta)$ are changed into $c^{-\sigma_v}_{j_v}$. On the other hand one has $|E^{+}_i(\breve\theta)| = |E^{-}_i(\breve\theta)|$ for all $i \neq j$, whereas $|E^{+}_j(\breve\theta)| = |E^{-}_j(\breve\theta)| + 1$ and $|E^{+}_j(\breve\theta')| + 1 = |E^{-}_j(\breve\theta')|$. Therefore one obtains $c^{-\sigma}_j\,\mathcal{V}(\theta) = c^{\sigma}_j\,\mathcal{V}(\theta')$, and the assertion follows.

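For $k \in \mathbb{N}$, $j \in \{1,\dots,d\}$, and $\sigma \in \{\pm\}$, define

$$\eta^{(k)}_{j,\sigma} = -\frac{1}{c^{\sigma}_j} \sum_{\theta \in \mathcal{T}^k_{j,\sigma e_j}} \mathcal{V}(\theta).$$

Lemma 3.15. For all $k \in \mathbb{N}$ and all $j = 1,\dots,d$ one has $\eta^{(k)}_{j,+} = \eta^{(k)}_{j,-}$.

Proof. Lemma 3.14 implies

$$c^{-}_j \sum_{\theta \in \mathcal{T}^k_{j,e_j}} \mathcal{V}(\theta) = c^{+}_j \sum_{\theta \in \mathcal{T}^k_{j,-e_j}} \mathcal{V}(\theta)$$

for all $k \in \mathbb{N}$ and all $j = 1,\dots,d$, so that the assertion follows from the definition of $\eta^{(k)}_{j,\sigma}$.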


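Lemma 3.16. Equations (1.10) formally hold, i.e., they hold to all perturbation orders, provided that for all $k \in \mathbb{N}$ and $j = 1,\dots,d$ we set formally

$$x_{j,\nu} = \sum_{k=1}^{\infty} \varepsilon^k x^{(k)}_{j,\nu}, \qquad x^{(k)}_{j,\nu} = \sum_{\theta \in \mathcal{T}^k_{j,\nu}} \mathcal{V}(\theta) \quad \forall \nu \in \mathbb{Z}^d \setminus \{\pm e_j\}, \qquad x^{(k)}_{j,\pm e_j} = 0, \tag{3.10}$$

$$\eta_j = \sum_{k=1}^{\infty} \varepsilon^k \eta^{(k)}_j, \qquad \eta^{(k)}_j = -\frac{1}{c_j} \sum_{\theta \in \mathcal{T}^k_{j,e_j}} \mathcal{V}(\theta). \tag{3.11}$$

Proof. The proof is a direct check.

Remark 3.17. In $\eta^{(k)}_j$, defined as (3.11), there is no singularity in $c_j = 0$ because $\mathcal{V}(\breve\theta)$ contains at least one factor $c^{+}_j = c_j$ by Remark 3.9.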

In the light of Lemma 3.16 one can wonder why the definition of the propagators for $\nu_\ell \neq \sigma_\ell e_{j_\ell}$ is so involved; as a matter of fact one could define

$$G_\ell = \frac{1}{(\omega\cdot\nu_\ell)^2 - \omega_{j_\ell}^2}.$$

However, since $\sum_{n \ge 0} \Psi_n(u) \equiv 1$, the two definitions are equivalent. We use the definition (3.6) so that we can immediately identify the factors $O(2^n)$ which could prevent the convergence of the power series (3.1). In what follows we shall make this idea more precise.

3.5. Clusters. A cluster $T$ on scale $n$ is a maximal set of nodes and lines connecting them such that all the lines have scales $n' \le n$ and there is at least one line with scale $n$; see Fig. 7. The lines entering the cluster $T$ and the line coming out from it (unique if existing at all) are called the external lines of the cluster $T$. We call $V(T)$, $E(T)$, and $L(T)$ the set of internal nodes, of end nodes, and of lines of $T$, respectively; note that the external lines of $T$ do not belong to $L(T)$. Define also $E^{\sigma}_j(T)$ as the set of end nodes $v \in E(T)$ such that $\sigma_v = \sigma$ and $j_v = j$. By setting

$$k(T) := \sum_{v \in V(T)} k_v,$$

we say that the cluster $T$ has order $k$ if $k(T) = k$.

3.6. Self-energy clusters. We call self-energy cluster any cluster $T$ such that (see Fig. 8)
(1) $T$ has only one entering line and one exiting line,
(2) one has $n_\ell \le \min\{n_{\ell_T}, n_{\ell'_T}\} - 2$ for any $\ell \in L(T)$,
(3) one has $|\nu_{\ell_T} - \nu_{\ell'_T}| \le 2$ and $\delta_{j_{\ell_T}}(\omega\cdot\nu_{\ell_T}) = \delta_{j_{\ell'_T}}(\omega\cdot\nu_{\ell'_T})$.

Notation 3.18. For any self-energy cluster $T$ we denote by $\ell_T$ and $\ell'_T$ the exiting and the entering line of $T$ respectively. We call $P_T$ the path of lines $\ell \in L(T)$ connecting $\ell'_T$ to $\ell_T$, i.e., $P_T = P(\ell_T, \ell'_T)$ (recall (1) in Notation 3.2), and set $n_T = \min\{n_{\ell_T}, n_{\ell'_T}\}$.
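The cluster structure of Sect. 3.5 can be computed mechanically once the scale labels are assigned. The following sketch is one possible reading of the definition, not the authors' construction: nodes are joined whenever the line between them has scale at most $n$, and a connected set counts as a cluster on scale $n$ if it contains a line with scale exactly $n$. The dictionaries `parent` and `scale` and the sample tree are hypothetical.

```python
from collections import defaultdict

def clusters_on_scale(parent, scale, n):
    """Return the clusters on scale n as sorted lists of node ids.

    parent[v] is the node entered by the line exiting v (None for the node
    attached to the root line, which never lies inside a cluster);
    scale[v] is the scale label of the line exiting v.
    """
    rep = {v: v for v in parent}  # union-find representatives

    def find(v):
        while rep[v] != v:
            rep[v] = rep[rep[v]]  # path halving
            v = rep[v]
        return v

    # Join the two endpoints of every line with scale <= n.
    for v, p in parent.items():
        if p is not None and scale[v] <= n:
            rep[find(v)] = find(p)

    components = defaultdict(list)
    for v in parent:
        components[find(v)].append(v)

    # Keep the components containing at least one line with scale exactly n.
    marked = {find(v) for v, p in parent.items()
              if p is not None and scale[v] == n}
    return sorted(sorted(c) for r, c in components.items() if r in marked)

# Hypothetical tree: node 0 is attached to the root line.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
scale = {0: 1, 1: 2, 2: 0, 3: 0, 4: 1, 5: 3}

for n in range(4):
    print(n, clusters_on_scale(parent, scale, n))
```

In the example the clusters are nested as the scale grows, which is the structure depicted in Fig. 7; checking whether a given cluster is a self-energy cluster would additionally require the momenta and the conditions (1) to (3) above.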


Fig. 7. Example of tree and the corresponding clusters: once the scale labels have been assigned to the lines of the tree as in (a), one obtains the cluster structure depicted in (b)

Fig. 8. Example of self-energy cluster: consider the cluster $T$ on scale 3 in Fig. 7, and suppose that the mode labels of the end nodes are such that $|\nu_1 + \nu_2 + \nu_3 + \nu_4 + \nu_5 + \nu_6| \le 2$ and $\delta_{j_{\ell_T}}(\omega\cdot\nu_{\ell_T}) = \delta_{j_{\ell'_T}}(\omega\cdot\nu_{\ell'_T})$. Then $T$ is a self-energy cluster with external lines $\ell'_T$ (entering line) and $\ell_T$ (exiting line). The path $P_T$ is such that $P_T = \{\ell\}$

Remark 3.19. Notice that, by Remark 2.3, for any self-energy cluster the label $\nu_{\ell_T}$ is uniquely fixed by the labels $j_{\ell_T}$, $\sigma_{\ell_T}$, $j_{\ell'_T}$, $\sigma_{\ell'_T}$, $\nu_{\ell'_T}$. In particular, for fixed $\nu$ and $j$ such that $\delta_j(\omega\cdot\nu) \le \gamma$, there are only $2d - 1$ momenta $\nu' \neq \nu$ such that $|\nu' - \nu| \le 2$ and $\delta_{j'}(\omega\cdot\nu') = \delta_j(\omega\cdot\nu)$ for some $j'$ and $\sigma'$, depending on $\nu'$. All the other $\nu''$ with small divisor equal to $\delta_j(\omega\cdot\nu)$ are far away from $\nu$, according to Lemma 2.1.

We say that a line $\ell$ is a resonant line if it is both the exiting line of a self-energy cluster and the entering line of another self-energy cluster, that is, $\ell$ is resonant if there exist two self-energy clusters $T_1$ and $T_2$ such that $\ell = \ell'_{T_1} = \ell_{T_2}$; see Fig. 9.

Remark 3.20. The notion of self-energy cluster was first introduced by Eliasson, in the context of the KAM theorem, in [8], where it was called resonance. We prefer the term self-energy cluster to stress further the analogy with quantum field theory.

The notion of equivalence given for trees can be extended in the obvious way to self-energy clusters.


Fig. 9. Example of resonant line: $\ell$ is resonant if both $T_1$ and $T_2$ are self-energy clusters

Fig. 10. A self-energy cluster in Ekj,σ, j,σ (ω · ν, n); T contains at least one line on scale ≤ n and n such that min{n , n } ≥ n + 2

Notation 3.21. We denote by $\mathcal{R}^k_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n)$ the set of inequivalent self-energy clusters $T$ on scale $\le n$ of order $k$, such that $\nu_{\ell'_T} = \nu'$, $j_{\ell_T} = j$, $\sigma_{\ell_T} = \sigma$, $j_{\ell'_T} = j'$ and $\sigma_{\ell'_T} = \sigma'$. By definition of cluster for $T \in \mathcal{R}^k_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n)$ one must have $n \le n_T - 2$. For $j = j'$ and $\sigma = \sigma'$ define also $\mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ the set of self-energy clusters $T \in \mathcal{R}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ such that (1) $\ell'_T$ enters the same node $v$ which $\ell_T$ exits and (2) $k_v = 0$. We call $v_T$ such a special node and set $\overline{\mathcal{R}}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n) = \mathcal{R}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n) \setminus \mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$; see Fig. 10.

Notation 3.22. For any $T \in \mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ we call $\theta_T$ the tree which has as root line the line $\ell \in L(T)$ entering $v_T$ (one can imagine to obtain $\theta_T$ from $T$ by 'removing' the node $v_T$); see Fig. 11. Note that $\theta_T \in \mathcal{T}^k_{j,\sigma e_j}(n)$.

Notation 3.23. Consider a self-energy cluster $T$ such that $n_\ell \neq -1$ for all lines $\ell \in P_T$. If $T \in \mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ for some $k, j, \sigma, \nu', n$ then we define the pruned self-energy cluster $\breve T$ as the subgraph with $N(\breve T) = \{v_T\} \cup N(\breve\theta_T)$ and $L(\breve T) = L(\breve\theta_T)$. For all other self-energy clusters $T$, call $\ell_1,\dots,\ell_p \in L(T)$ the internal lines on scale $-1$ (if any) which are closest to the exiting line of $T$, that is, such that $n_\ell \ge 0$ for all lines $\ell \in P(\ell_T, \ell_i)$, $i = 1,\dots,p$. For each line $\ell_i$ set $\theta_i = \theta_{\ell_i}$. Then the pruned self-energy cluster $\breve T$ is the subgraph with set of nodes and set of lines

$$N(\breve T) = N(T) \setminus \bigcup_{i=1}^{p} N(\theta_i), \qquad L(\breve T) = L(T) \setminus \bigcup_{i=1}^{p} L(\theta_i),$$

respectively.


Fig. 11. An example of self-energy cluster T ∈ Ekj,σ, j,σ (ω · ν, n) and the corresponding tree θT . (Only the mode labels of the end nodes are shown in T and θT .)

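Remark 3.24. For $T \in \mathcal{R}^k_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n)$ such that $n_\ell \ge 0$ for all $\ell \in P_T$, one has $|E^{+}_i(\breve T)| = |E^{-}_i(\breve T)|$ for all $i \neq j, j'$. If $j \neq j'$ then $|E^{-\sigma'}_{j'}(\breve T)| = |E^{\sigma'}_{j'}(\breve T)| + 1$ and $|E^{\sigma}_j(\breve T)| = |E^{-\sigma}_j(\breve T)| + 1$; if $j = j'$, $\sigma = \sigma'$ and $T \in \overline{\mathcal{R}}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ then $|E^{\sigma}_j(\breve T)| = |E^{-\sigma}_j(\breve T)|$, while if $j = j'$ and $\sigma = -\sigma'$ then $|E^{\sigma}_j(\breve T)| = |E^{-\sigma}_j(\breve T)| + 2$. Finally, for any $T \in \mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ one has $|E^{\sigma}_j(\breve T)| = |E^{-\sigma}_j(\breve T)| + 1 \ge 1$.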

We shall define

$$\mathcal{V}(T, \omega\cdot\nu_{\ell'_T}) := \Big(\prod_{\ell \in L(T)} G_\ell\Big)\Big(\prod_{v \in N(T)} F_v\Big), \tag{3.12}$$

where $\mathcal{V}(T, \omega\cdot\nu_{\ell'_T})$ will be called the value of the self-energy cluster $T$. The value $\mathcal{V}(T, \omega\cdot\nu_{\ell'_T})$ depends on $\omega\cdot\nu_{\ell'_T}$ through the propagators of the lines $\ell \in P_T$.

Remark 3.25. The value of a self-energy cluster $T \in \mathcal{E}^k_{j,\sigma,j,\sigma}(u, n)$ does not depend on $u$ so that we shall write

$$\mathcal{V}(T, u) = \mathcal{V}(T) = -\frac{1}{2c^{\sigma}_j}\,\mathcal{V}(\theta_T).$$

We define also for future convenience

$$M^{(k)}_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n) := \sum_{T \in \mathcal{R}^k_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n)} \mathcal{V}(T, \omega\cdot\nu'). \tag{3.13}$$

Note that $M^{(k)}_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ is the sum of two contributions, one depending only on $n$ and one depending on $(\omega\cdot\nu', n)$, which are defined as in (3.13) but for the sum restricted to the set $\mathcal{E}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ and $\overline{\mathcal{R}}^k_{j,\sigma,j,\sigma}(\omega\cdot\nu', n)$ respectively.

Remark 3.26. Both the quantities $M^{(k)}_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n)$ and the coefficients $x^{(k)}_{j,\nu}$ and $\eta^{(k)}_j$ are well defined to all orders because the number of terms which one sums over is finite (by the same argument in Remark 3.12). At least formally, we can define

$$M_{j,\sigma,j',\sigma'}(\omega\cdot\nu') = \sum_{k=1}^{\infty} \varepsilon^k \sum_{n \ge -1} M^{(k)}_{j,\sigma,j',\sigma'}(\omega\cdot\nu', n).$$


We define the depth $D(T)$ of a self-energy cluster $T$ recursively as follows: we set $D(T) = 1$ if there is no self-energy cluster containing $T$, and set $D(T) = D(T') + 1$ if $T$ is contained inside a self-energy cluster $T'$ and no other self-energy clusters inside $T'$ (if any) contain $T$. We denote by $S_D(\theta)$ the set of self-energy clusters of depth $D$ in $\theta$, and by $S_D(\theta, T)$ the set of self-energy clusters of depth $D$ in $\theta$ contained inside $T$.

Notation 3.27. Call $\mathring\theta = \theta \setminus S_1(\theta)$ the subgraph of $\theta$ formed by the set of nodes and lines of $\theta$ which are outside the set $S_1(\theta)$ (the external lines of the self-energy clusters $T \in S_1(\theta)$ being included in $\mathring\theta$), and, analogously, for $T \in S_D(\theta)$ call $\mathring T = T \setminus S_{D+1}(\theta, T)$ the subgraph of $T$ formed by the set of nodes and lines of $T$ which are outside the set $S_{D+1}(\theta, T)$. We denote by $V(\mathring T)$, $E(\mathring T)$, and $L(\mathring T)$ the set of internal nodes, of end nodes, and of lines of $\mathring T$, and by $k(\mathring T)$ the order of $\mathring T$, that is, the sum of the labels $k_v$ of all the internal nodes $v \in V(\mathring T)$.

Lemma 3.28. Given a line $\ell \in L(\theta)$, if $T$ is the self-energy cluster with largest depth containing $\ell$ (if any), $\ell \in P_T$ and there is no line $\ell' \in P_T$ preceding $\ell$ with $n_{\ell'} = -1$, one can write $\nu_\ell = \nu^0_\ell + \nu_{\ell'_T}$. Then one has $|\nu^0_\ell| \le E_1 k(\mathring T)$, for a suitable positive constant $E_1$, if $k(\mathring T) \ge 1$, and $|\nu^0_\ell| \le 2$ if $k(\mathring T) = 0$.

Proof. We first prove that for any tree $\theta$, if we denote by $\ell_0$ its root line, one has

$$|\nu_{\ell_0}| \le \begin{cases} E_1 k(\mathring\theta) - 2, & \text{if } \ell_0 \text{ does not exit a self-energy cluster}, \\ E_1 k(\mathring\theta), & \text{if } \ell_0 \text{ exits a self-energy cluster}, \end{cases} \tag{3.14}$$

for a suitable constant $E_1 \ge 4$. The proof is by induction on the order of the tree $\theta$. If $k(\theta) = 1$ (and hence $\mathring\theta = \theta$) then the only internal line of $\theta$ is $\ell_0$ and $|\nu_{\ell_0}| \le 2$, so that the assertion trivially holds provided $E_1 \ge 4$. If $k(\theta) > 1$ let $v_0$ be the node which $\ell_0$ exits. If $v_0$ is not contained inside a self-energy cluster let $\ell_1,\dots,\ell_m$, $m \ge 0$, be the internal lines entering $v_0$ and $\theta_i = \theta_{\ell_i}$ for all $i = 1,\dots,m$. Finally let $\ell_{m+1},\dots,\ell_{m+m'}$ be the end lines entering $v_0$. By definition we have $k(\mathring\theta) = k_{v_0} + k(\mathring\theta_1) + \dots + k(\mathring\theta_m)$. If $k_{v_0} > 0$, we have $\nu_{\ell_0} = \nu_{\ell_1} + \dots + \nu_{\ell_{m+m'}}$. This implies in turn

$$|\nu_{\ell_0}| \le |\nu_{\ell_1}| + \dots + |\nu_{\ell_m}| + m' \le E_1\big(k(\mathring\theta_1) + \dots + k(\mathring\theta_m)\big) + m' \le E_1\big(k(\mathring\theta) - m - m' + 1\big) + m'.$$

The assertion follows for $E_1 \ge 4$ by the inductive hypothesis (the worst possible case is $m = 0$, $m' = 2$). If $k_{v_0} = 0$ then $s_{v_0} = 2$ and $m' = 0$. Moreover one of the lines, say $\ell_1$, is on scale $n_{\ell_1} = -1$ while for the other line one has $\nu_{\ell_0} = \nu_{\ell_2}$. Once more the bound follows from the inductive hypothesis since $|\nu_{\ell_2}| \le E_1 k(\mathring\theta_2) \le E_1(k(\mathring\theta) - 1)$. Finally, if $v_0$ is contained inside a self-energy cluster, then $\ell_0$ exits a self-energy cluster $T_1$. There will be $p$ self-energy clusters $T_1,\dots,T_p$, $p \ge 1$, such that the exiting line of $T_i$ is the entering line of $T_{i-1}$, for $i = 2,\dots,p$, while the entering line $\ell'$ of $T_p$ does not exit any self-energy cluster. By Lemma 2.4, one has $|\nu_{\ell_0} - \nu_{\ell'}| \le 2$ and $k(\mathring\theta) = k(\mathring\theta_{\ell'})$. Then, by the inductive hypothesis, one finds $|\nu_{\ell_0}| \le 2 + E_1 k(\mathring\theta_{\ell'}) - 2 = E_1 k(\mathring\theta)$.

Now for $\ell$ and $T$ as in the statement we prove, by induction on the order of the self-energy cluster, the bound

$$|\nu^0_\ell| \le \begin{cases} E_1 k(\mathring T_\ell) - 2, & \text{if } k(\mathring T_\ell) \ge 1, \\ 2, & \text{if } k(\mathring T_\ell) = 0, \end{cases} \tag{3.15}$$


Fig. 12. The self-energy cluster $T$ considered in the proof of Lemma 3.28, with $m = 2$, $m' = 3$, and a chain of $p$ self-energy clusters between $\ell$ and $v$ (one has $p \ge 0$, and $\ell = \ell_v$ if $p = 0$)

where $\mathring T_\ell$ is the set of nodes and lines of $\mathring T$ which precede $\ell$. The bound is trivially satisfied when $k(\mathring T_\ell) = 0$. Otherwise let $v$ be the node in $V(\mathring T)$ between $\ell$ and $\ell'_T$ which is closest to $\ell$. If $k_v = 0$ the bound follows trivially by using the bound (3.14). If $k_v \ge 1$, call $\ell_1,\dots,\ell_m$, $m \ge 0$, the internal lines entering $v$ which are not along the path $P_T$, and $\ell_{m+1},\dots,\ell_{m+m'}$ the end lines entering $v$; one has $m + m' \ge 1$. There is a further line $\ell_0 \in P_T$ entering $v$ such that $\nu_{\ell_0} = \nu^0_{\ell_0} + \nu_{\ell'_T}$; see Fig. 12. Using also Lemma 2.4 one has $|\nu^0_\ell| \le 2 + |\nu^0_{\ell_0}| + |\nu_{\ell_1}| + \dots + |\nu_{\ell_m}| + m'$. As $n_{\ell_0} \le n_T - 2$ one has $k(\mathring T_{\ell_0}) \ge 1$ and hence, by (3.14) and the inductive hypothesis, one has

$$|\nu^0_\ell| \le 2 + E_1 k(\mathring T_{\ell_0}) - 2 + E_1\big(k(\mathring\theta_1) + \dots + k(\mathring\theta_m)\big) + m',$$

where $\theta_i = \theta_{\ell_i}$ for all $i = 1,\dots,m$. Thus, since $k(\mathring T_{\ell_0}) + k(\mathring\theta_1) + \dots + k(\mathring\theta_m) + (m + m') = k(\mathring T_\ell)$ and $m + m' \ge 1$, one finds $|\nu^0_\ell| \le E_1\big(k(\mathring T_\ell) - m - m'\big) + m' \le E_1 k(\mathring T_\ell) - 2$, provided $E_1 \ge 4$. Therefore, the assertion follows with, say, $E_1 = 4$.

Notation 3.29. Given a tree θ and a line ∈ L(θ ), call = (θ ) the subgraph formed by the set of nodes and lines which do not precede ; see Fig. 13. Let us call

˚ the set of nodes and lines of which are outside any self-energy cluster contained inside . Lemma 3.30. Given a tree θ let 0 and be the root line and an arbitrary internal line preceding 0 . If k( ˚ ) ≥ 1 one has |ν 0 − ν | ≤ E 2 k( ˚ ), for a suitable positive constant E 2 .


Fig. 13. The set = (θ ) and the subtree θ determined by the line ∈ L(θ ). If is the root line then

= ∅

Proof. We prove by induction on the order of the bound E 2 k( ˚ ) − 2, if 0 does not exit a self-energy cluster, |ν 0 − ν | ≤ E 2 k( ˚ ), if 0 exits a self-energy cluster.

(3.16)

We mimic the proof of (3.14) in Lemma 3.28. The case k( ˚ ) = 1 is trivial provided E 2 ≥ 3, so let us consider k( ˚ ) > 1 and call v0 the node which 0 exits. If v0 is not contained inside a self-energy cluster and kv0 ≥ 1 then ν 0 = ν 1 + · · · + ν m+m , where 1 , . . . , m are the internal lines entering v0 , with (say) m ∈ P( 0 , ) ∪ { }, and m+1 , . . . , m+m are the end lines entering v0 . Hence k( ˚ ) = kv0 + k(θ˚1 ) + · · · + k(θ˚m−1 ) + k( ˚ m ), where θi = θ i and m = (θ m ) ( m = ∅ if m = ). Thus, the assertion follows by (3.14) and the inductive hypothesis. If v0 is not contained inside a self-energy cluster and kv0 = 0 then two lines 1 and 2 enter v0 , and one of them, say 1 , is such that |ν 1 | = 1. If = 2 the result is trivial. If 2 ∈ P( 0 , ) the bound follows once more from the inductive hypothesis. If = 1 one has |ν 0 − ν | ≤ |ν 0 | + 1 ≤ E 1 k(θ˚2 ) + 1 ≤ E 2 k( ˚ ) − 2, where θ2 = θ 2 , provided E 2 ≥ E 1 + 3, if E 1 is the constant defined in Lemma 3.28. If 1 ∈ P( 0 , ) denote by 1 the line on scale −1 along the path { 1 } ∪ P( 1 , ) which is closest to . Again call θ2 = θ 2 and J1 the subgraph formed by the set of nodes and lines preceding 1 (with 1 included) but not ; define also θ1 as the tree obtained from J1 by (1) reverting the arrows of all lines along { 1 , } ∪ P( 1 , ), (2) replacing 1 with an end line carrying the same sign and component labels as 1 , and (3) replacing all the labels σv , v ∈ N0 (J1 ) with −σv . One has, by using also (3.14), |ν 0 − ν | ≤ |ν 0 | + |ν | ≤ E 1 k(θ˚1 ) + E 1 k(θ˚2 ) ≤ E 2 k( ˚ ) − 2, provided E 2 ≥ E 1 + 2 so that the bound follows once more. Finally, if v0 is contained inside a self-energy cluster, then 0 exits a self-energy cluster T1 . There will be p selfenergy clusters T1 , . . . , T p , p ≥ 1, such that the exiting line of Ti is the entering line of Ti−1 , for i = 2, . . . , p, while the entering line of T p does not exit any self-energy cluster. 
By Lemma 2.4, one has |ν 0 −ν | ≤ 2 and k( ˚ ) = k( ˚ ), where = (θ ). Then, ˚ the inductive hypothesis yields |ν 0 − ν | ≤ 2 + |ν − ν | ≤ 2 + E 2 k( ˚ ) − 2 = E 2 k( ). Therefore the assertion follows with, say, E 2 = E 1 + 3 (and hence E 2 = 7 if E 1 = 4).

Remark 3.31. Lemma 3.28 will be used in Sect. 5 to control the change of the momenta as an effect of the regularisation procedure (to be defined). Furthermore, both Lemmas 3.28 and 3.30 will be used in Sect. 7 to show that the resonant lines which are not regularised cannot accumulate too much.


L. Corsi, G. Gentile, M. Procesi

4. Dimensional Bounds In this section we discuss how to prove that the series (3.10) and (3.11) converge if the resonant lines are excluded. We shall see in the following sections how to take into account the presence of the resonant lines. Call Nn (θ ) the number of non-resonant lines ∈ L(θ ) such that n ≥ n, and Nn (T ) the number of non-resonant lines ∈ L(T ) such that n ≥ n. The analyticity assumption on f yields that one has |Fv | ≤ sv +kv

∀v ∈ V (θ )\V0 (θ ),

(4.1)

for a suitable positive constant . Lemma 4.1. Assume that 2−(n +2) γ ≤ δ j (ω · ν ) ≤ 2−(n −2) γ for all trees θ and all lines ∈ L(θ ). Then there exists a positive constant c such that for any tree θ one has Nn (θ ) ≤ c 2−n/τ k(θ ). Proof. We prove that Nn (θ ) ≤ max{0, c 2−n/τ k(θ ) − 2} by induction on the order of θ . 1. First of all note that for a tree θ to have a line on scale n ≥ n one needs k(θ ) ≥ kn = E 0−1 2(n−2)/τ , as it follows from the Diophantine condition (2.2a) and Lemma 3.10. Hence the bound is trivially true for k < kn . 2. For k(θ ) ≥ kn , let 0 be the root line of θ and set ν = ν 0 and j = j 0 . If n 0 < n the assertion follows from the inductive hypothesis. If n 0 ≥ n, call 1 , . . . , m the lines with scale ≥ n − 1 which are closest to 0 (that is, such that n ≤ n − 2 for all p = 1, . . . , m and all lines ∈ P( 0 , p )). The case m = 0 is trivial. If m ≥ 2 the bound follows once more from the inductive hypothesis. 3. If m = 1, then 1 is the only entering line of a cluster T . Set ν = ν 1 , j = j 1 and n = n 1 . By hypothesis one has δ j (ω · ν) ≤ 2−(n−2) γ and δ j (ω · ν ) ≤ 2−(n−3) γ , so that, by Lemma 2.2, either |ν − ν | > 2(n−5)/τ or |ν − ν | ≤ 2 and δ j (ω · ν) = δ j (ω · ν ). In the first case, since νw − σw e jw = νw, ν − ν = w∈E(T )

w∈V (T ) kw =0

w∈E(T˘ )

the same argument used to prove Lemma 3.10 yields |ν − ν | ≤ |E(T )| ≤ E 0 k(T ), and hence k(T ) ≥ E 0−1 2(n−5)/τ . Thus, if θ1 = θ 1 , one has k(θ ) = k(T ) + k(θ1 ), so that Nn (θ ) = 1 + Nn (θ1 ) ≤ c 2−n/τ k(θ1 ) − 1 ≤ c 2−n/τ k(θ ) − c 2−n/τ k(T ) − 1 ≤ c 2−n/τ k(θ ) − 2, provided c ≥ E 0 25/τ . 4. If instead |ν − ν | ≤ 2 and δ j (ω · ν) = δ j (ω · ν ), then the only way for T not to be a self-energy cluster is that n 1 = n 0 − 1 = n − 1 and there is at least a line ∈ T with n = n − 2. But then δ j (ω · ν) = δ j (ω · ν ) so that |ν − ν | > 2(n−6)/τ and we can reason as in the previous case provided c ≥ E 0 26/τ . Otherwise T is a self-energy cluster and 1 can be either resonant or not-resonant. Call 1 , . . . , m the lines with scale ≥ n − 1 which are closest to 1 . Once more the cases m = 0 and m ≥ 2 are trivial.

KAM Theory in Configuration Space and Cancellations in the Lindstedt Series


5. If m = 1, then 1 is the only entering line of a cluster T . If θ1 = θ 1 , then Nn (θ ) = 1 + Nn (θ1 ) if 1 is resonant and Nn (θ ) ≤ 2 + Nn (θ1 ) if 1 is non-resonant. Consider first the case of 1 being non-resonant. Set ν = ν 1 , j = j 1 and n = n 1 . By reasoning as before we find that one has either |ν − ν | > 2(n−5)/τ or |ν − ν | ≤ 2 and δ j (ω · ν ) = δ j (ω · ν ). If |ν − ν | > 2(n−5)/τ then k(T ) ≥ E 0−1 2(n−5)/τ ; thus, by using that k(θ ) = k(T ) + k(T ) + k(θ1 ), we obtain Nn (θ ) ≤ 2 + Nn (θ1 ) ≤ c 2−n/τ k(θ ) − c 2−n/τ k(T ) − c 2−n/τ k(T ) ≤ c 2−n/τ k(θ ) − c 2−n/τ k(T ) ≤ c 2−n/τ k(θ ) − 2, provided c ≥ 2E 0 25/τ . 6. Otherwise one has |ν−ν | ≤ 2, |ν −ν | ≤ 2, and δ j (ω·ν) = δ j (ω·ν ) = δ j (ω·ν ). Since we are assuming 1 to be non-resonant then, T is not a self-energy cluster. But then there is at least a line ∈ T with n = n − 2 and we can reason as in item 4. 7. So we are left with the case in which 1 is resonant and hence T is a self-energy cluster. Let 1 be the entering line of T . Once more 1 is either resonant or nonresonant. If it is non-resonant we repeat the same argument as done before for 1 . If it is resonant, we iterate the construction, and so on. Therefore we proceed until either we find a non-resonant line on scale ≥ n, for which we can reason as before, or we reach a tree θ of order so small that it cannot contain any line on scale ≥ n (i.e., k(θ ) < kn ). 8. Therefore the assertion follows with, say, c = 2E 0 26/τ .
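The role of the conditions on $c$ in the proof can be seen in one line. As a worked step (using only the lower bound on $k(T)$ and the condition on $c$ stated in item 3), one has

```latex
\[
c\,2^{-n/\tau}\,k(T) \;\ge\; c\,2^{-n/\tau}\,E_0^{-1}\,2^{(n-5)/\tau}
\;=\; c\,E_0^{-1}\,2^{-5/\tau} \;\ge\; 1
\qquad \text{whenever } c \ge E_0\,2^{5/\tau},
\]
```

so that subtracting $c\,2^{-n/\tau}k(T)$ lowers the bound by at least one unit. The same computation with $2^{(n-6)/\tau}$ in place of $2^{(n-5)/\tau}$ gives the condition of item 4, and the final choice $c = 2E_0\,2^{6/\tau}$ satisfies all the conditions at once.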

Remark 4.2. One can wonder why in Lemma 4.1 did we assume 2−(n +2) γ ≤ δ j (ω · ν ) ≤ 2−(n −2) γ when Remark 3.11 assures the stronger condition 2−(n +1) γ ≤ δ j (ω · ν ) ≤ 2−(n −1) γ . The reason is that later on we shall need to slightly change the momenta of the lines, in such a way that the scales in general no longer satisfy the condition (3.7) noted in Remark 3.11. However the condition assumed for proving Lemma 4.1 will still be satisfied. For any tree θ we call L R (θ ) and L NR (θ ) the sets of resonant lines and of non-resonant lines respectively, in L(θ ). Then we can write ⎛ ⎛ ⎞ ⎞⎛ ⎞ V (θ ) = ⎝ G ⎠ V NR (θ ), V NR (θ ) := ⎝ G ⎠ ⎝ Fv ⎠ , (4.2) ∈L R (θ)

∈L NR (θ)

v∈N (θ)

where each propagator G can be bounded as C0 2n , for some constant C0 . Lemma 4.3. For all trees θ with k(θ ) = k one has | V NR (θ )| ≤ C k 3k (c), where

(c) := max{|c1 |, . . . , |cd |, 1} and C is a suitable positive constant. Proof. One has

⎛

|V NR (θ )| ≤ C0k 3k (c)k ⎝

⎞ 2n ⎠ ≤ C0k 3k (c)k

∈L N R (θ)

≤ C0k 3k (c)k exp c log 2 k

∞

2−n/τ n .

n=1

The last sum converges: this is enough to prove the lemma.

∞ n=0

2n Nn (θ)
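The convergence of the last sum can be made explicit. As a minimal sketch (the function names and the sample value of τ are illustrative assumptions, not taken from the text), the series $\sum_{n\ge 1} n\,r^n$ with $r = 2^{-1/\tau} < 1$ has the closed form $r/(1-r)^2$, and the partial sums approach it:

```python
import math


def partial_sum(tau: float, n_max: int) -> float:
    """Partial sum of sum_{n>=1} n * 2**(-n/tau), truncated at n_max."""
    return sum(n * 2.0 ** (-n / tau) for n in range(1, n_max + 1))


def closed_form(tau: float) -> float:
    """Value of the full series: r / (1 - r)**2 with r = 2**(-1/tau)."""
    r = 2.0 ** (-1.0 / tau)
    return r / (1.0 - r) ** 2


if __name__ == "__main__":
    tau = 3.0  # sample exponent, chosen only for illustration
    print(partial_sum(tau, 2000), closed_form(tau))
```

The closed form grows as τ increases, which is why the constants in the bound depend on τ.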


Fig. 14. A chain of self-energy clusters

So far the only bound that we have on the propagators of the resonant lines is $|G_\ell| \le 1/(\omega_{j_\ell}\,\delta_{j_\ell}(\omega\cdot\nu_\ell)) \le C_0\,2^{n_\ell}$. What we need is to obtain a gain factor proportional to $2^{-n_\ell}$ for each resonant line $\ell$ with $n_\ell \ge 1$.

Lemma 4.4. Given $\theta$ such that $\mathscr{V}(\theta) \ne 0$, let $\ell \in L(\theta)$ be a resonant line and let $T$ be the self-energy cluster of largest depth containing $\ell$ (if any). Then there is at least one non-resonant line in $T$ on scale $\ge n_\ell - 1$.

Proof. Set $n = n_\ell$. There are in general $p \ge 2$ self-energy clusters $T_1, \dots, T_p$, contained inside $T$, connected by resonant lines $\ell_1, \dots, \ell_{p-1}$, and $\ell$ is one of such lines, while the entering line $\ell_p$ of $T_p$ and the exiting line $\ell_0$ of $T_1$ are non-resonant. Moreover $\delta(\omega\cdot\nu_{\ell_i}) = \delta(\omega\cdot\nu_\ell)$ for all $i = 0, \dots, p$, so that all the lines $\ell_0, \dots, \ell_p$ have scales either $n, n-1$ or $n, n+1$, by Remark 3.11. In any case the lines $\ell_0, \ell_p$ must be in $T$ by definition of the self-energy cluster.

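5. Renormalisation

Now we shall see how to deal with the resonant lines. In principle, one can have trees containing chains of arbitrarily many self-energy clusters (see Fig. 14), and this produces an accumulation of small divisors, and hence a bound proportional to $k!$ to some positive power for the corresponding values.

Let $K_0$ be such that $E_1 K_0 = 2^{-8/\tau}$. For $T \in \mathfrak{R}^k_{j,\sigma,j',\sigma'}(u,n)$, define the localisation operator $\mathscr{L}$ by setting
\[
\mathscr{L}\mathscr{V}(T,u) :=
\begin{cases}
\mathscr{V}(T,\sigma'\omega_{j'}), & k(\mathring{T}) \le K_0\,2^{n_T/\tau},\ n_\ell \ge 0\ \ \forall \ell \in \mathcal{P}_T,\\
0, & \text{otherwise},
\end{cases}
\tag{5.1}
\]
which will be called the localised value of the self-energy cluster $T$. Define also $\mathscr{R} := 1 - \mathscr{L}$, by setting, for $T \in \mathfrak{R}^k_{j,\sigma,j',\sigma'}(u,n)$,
\[
\mathscr{R}\mathscr{V}(T,u) =
\begin{cases}
\displaystyle (u - \sigma'\omega_{j'}) \int_0^1 \mathrm{d}t\, \partial_u \mathscr{V}\bigl(T,\sigma'\omega_{j'} + t\,(u - \sigma'\omega_{j'})\bigr), & k(\mathring{T}) \le K_0\,2^{n_T/\tau},\ n_\ell \ge 0\ \ \forall \ell \in \mathcal{P}_T,\\
\mathscr{V}(T,u), & \text{otherwise},
\end{cases}
\tag{5.2}
\]
so that
\[
\mathscr{L} M^{(k)}_{j,\sigma,j',\sigma'}(u,n) = \sum_{T \in \mathfrak{R}^k_{j,\sigma,j',\sigma'}(u,n)} \mathscr{L}\mathscr{V}(T,u),
\tag{5.3a}
\]
\[
\mathscr{R} M^{(k)}_{j,\sigma,j',\sigma'}(u,n) = \sum_{T \in \mathfrak{R}^k_{j,\sigma,j',\sigma'}(u,n)} \mathscr{R}\mathscr{V}(T,u).
\tag{5.3b}
\]
We shall call $\mathscr{R}$ the regularisation operator and $\mathscr{R}\mathscr{V}(T,u)$ the regularised value of $T$.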
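The first case of (5.2) is consistent with the definition of the regularisation operator as one minus the localisation operator: the integral is the difference between the value at $u$ and the value at the localisation point, written through the fundamental theorem of calculus. A minimal numerical sketch, in which a toy function `value` stands in for the dependence of the cluster value on $u$ and `u0` for the localisation point (both are assumptions made only for illustration):

```python
import math


def value(u: float) -> float:
    """Toy stand-in for the u-dependence of a cluster value (any smooth function)."""
    return math.sin(u) / (2.0 + u * u)


def d_value(u: float) -> float:
    """Derivative of the toy function above (quotient rule)."""
    return (math.cos(u) * (2.0 + u * u) - 2.0 * u * math.sin(u)) / (2.0 + u * u) ** 2


def localised(u0: float) -> float:
    """Localised value: the function frozen at the localisation point u0."""
    return value(u0)


def regularised(u: float, u0: float, steps: int = 2000) -> float:
    """(u - u0) * int_0^1 dt value'(u0 + t (u - u0)), by composite Simpson.

    `steps` must be even.
    """
    h = 1.0 / steps
    total = d_value(u0) + d_value(u)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * d_value(u0 + i * h * (u - u0))
    return (u - u0) * total * h / 3.0


if __name__ == "__main__":
    u, u0 = 1.3, 0.7
    # localised + regularised should reproduce the original value
    print(localised(u0) + regularised(u, u0), value(u))
```

The prefactor `(u - u0)` is what produces the gain: the regularised value vanishes linearly as `u` approaches the localisation point.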


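Remark 5.1. If $T \in \mathfrak{E}^k_{j,\sigma,j,\sigma}(u,n)$ the localisation operator acts as
\[
\mathscr{L}\mathscr{V}(T) =
\begin{cases}
\mathscr{V}(T), & k(\mathring{T}) \le K_0\,2^{n_T/\tau},\\
0, & k(\mathring{T}) > K_0\,2^{n_T/\tau}.
\end{cases}
\]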

Remark 5.2. If in a self-energy cluster T there is a line ∈ PT such that ν = σ e j (and hence n = −1) then L V (T , u) = 0 for all self-energy clusters containing T such that ∈ PT . Recall the definition of the sets S D (θ ) and S D (θ, T ) after Remark 3.26. For any tree θ we can write its value as ⎞⎛ ⎛ ⎞⎛ ⎞ V (T, ω · ν T )⎠ ⎝ G ⎠ ⎝ Fv ⎠ , (5.4) V (θ ) = ⎝ T ∈S1 (θ)

∈L(θ\S1 (θ))

v∈N (θ\S1 (θ))

and, recursively, for any self-energy cluster T of depth D we have ⎛ V (T, ω · ν T ) = ⎝

⎞⎛

T ∈SD+1 (θ,T )

⎛

×⎝

V (T , ω · ν )⎠ ⎝

T

⎞ Fv ⎠ .

⎞ G ⎠

∈L(T \SD+1 (θ,T ))

(5.5)

v∈N (T \SD+1 (θ,T ))

Then we modify the diagrammatic rules given in Sect. 3 by assigning a further label OT ∈ {R, L }, which will be called the operator label, to each self-energy cluster T . Then, by writing V (θ ) according to (5.4) and (5.5), one replaces V (T, ω · ν T ) with L V (T, ω · ν T ) if OT = L and with R V (T, ω · ν T ) if OT = R. When considering the regularised value of a self-energy cluster T ∈ Rk (u, n) with k(T˚ ) ≤ K 0 2n T /τ j,σ, j ,σ

and n ≥ 0 for all ∈ PT , then we have also an interpolation parameter t to consider: we shall denote it by tT to keep trace of the self-energy cluster which it is associated with. We set tT = 1 for a regularised self-energy cluster T with either k(T˚ ) > K 0 2n T /τ or PT containing at least one line with n = −1. We call renormalised trees the trees θ carrying the further labels OT , associated with the self-energy clusters T of θ . As an effect of the localisation and regularisation operators the arguments of the propagators of some lines are changed. Remark 5.3. For any self-energy cluster T the localised value L V (T, u) does not depend on the operator labels of the self-energy clusters containing T .

Given a self-energy cluster T ∈ Rkj,σ, j ,σ (u, n) such that no line along PT is on scale −1, let be a line such that (1) ∈ PT , and (2) T is the self-energy cluster with largest depth containing . If one has OT = R, then the quantity ω · ν is changed according to the operator labels of all the self-energy clusters T such that (1) T contains T , (2) no line along PT has scale −1, and (3) ∈ PT . Call T p ⊂ T p−1 ⊂ · · · ⊂ T1 such


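self-energy clusters, with $T_p = T$. If $O_{T_i} = \mathscr{R}$ for all $i = 1, \dots, p$, then $\omega\cdot\nu_\ell$ is replaced with
\[
\begin{aligned}
\omega\cdot\nu_\ell(\mathbf{t}_\ell) ={}& \omega\cdot\nu^0_\ell + \sigma_{\ell_p}\omega_{j_{\ell_p}}
+ t_p\bigl(\omega\cdot\nu^0_{\ell_p} + \sigma_{\ell_{p-1}}\omega_{j_{\ell_{p-1}}} - \sigma_{\ell_p}\omega_{j_{\ell_p}}\bigr)\\
&+ \sum_{i=2}^{p-1} t_p \cdots t_i \bigl(\omega\cdot\nu^0_{\ell_i} + \sigma_{\ell_{i-1}}\omega_{j_{\ell_{i-1}}} - \sigma_{\ell_i}\omega_{j_{\ell_i}}\bigr)
+ t_p \cdots t_1 \bigl(\omega\cdot\nu_{\ell_1} - \sigma_{\ell_1}\omega_{j_{\ell_1}}\bigr),
\end{aligned}
\tag{5.6}
\]
where we have set $\mathbf{t}_\ell = (t_1, \dots, t_p)$, $\ell'_{T_i} = \ell_i$ and $t_{T_i} = t_i$ for simplicity. Otherwise let $T_q$ be the self-energy cluster of highest depth, among $T_1, \dots, T_{p-1}$, with $O_{T_q} = \mathscr{L}$ (so that $O_{T_i} = \mathscr{R}$ for $i \ge q+1$). In that case, instead of (5.6), one has
\[
\begin{aligned}
\omega\cdot\nu_\ell(\mathbf{t}_\ell) ={}& \omega\cdot\nu^0_\ell + \sigma_{\ell_p}\omega_{j_{\ell_p}}
+ t_p\bigl(\omega\cdot\nu^0_{\ell_p} + \sigma_{\ell_{p-1}}\omega_{j_{\ell_{p-1}}} - \sigma_{\ell_p}\omega_{j_{\ell_p}}\bigr)\\
&+ \sum_{i=q+1}^{p-1} t_p \cdots t_i \bigl(\omega\cdot\nu^0_{\ell_i} + \sigma_{\ell_{i-1}}\omega_{j_{\ell_{i-1}}} - \sigma_{\ell_i}\omega_{j_{\ell_i}}\bigr),
\end{aligned}
\tag{5.7}
\]
with the same notations used in (5.6). If $O_{T_p} = \mathscr{L}$, since $\omega\cdot\nu_\ell$ is replaced with $\omega\cdot\nu^0_\ell + \sigma_{\ell'_T}\omega_{j_{\ell'_T}}$ for $\ell \in \mathcal{P}_T$, we can write $\omega\cdot\nu^0_\ell + \sigma_{\ell'_T}\omega_{j_{\ell'_T}}$ as in (5.6) by setting $t_p = 0$. More generally, if we set $t_T = 0$ whenever $O_T = \mathscr{L}$, we see that we can always claim that, under the action of the localisation and regularisation operators, the momentum $\nu_\ell$ of any line $\ell \in \mathcal{P}_T$ is changed to $\nu_\ell(\mathbf{t}_\ell)$, in such a way that $\omega\cdot\nu_\ell(\mathbf{t}_\ell)$ is given by (5.6).

Lemma 5.4. Given $\theta$ such that $\mathscr{V}(\theta) \ne 0$, for all $\ell \in L(\theta)$ one has $4\,\delta_{j_\ell}(\omega\cdot\nu_\ell) \le 5\,\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \le 6\,\delta_{j_\ell}(\omega\cdot\nu_\ell)$.

Proof. The proof is by induction on the depth of the self-energy cluster.

1. Consider first the case that $\ell \in \mathcal{P}_T$, with $O_T = \mathscr{L}$. Set $n = n_T$, $\nu = \nu_{\ell'_T}$, $\sigma = \sigma_{\ell'_T}$, and $j = j_{\ell'_T}$. Then $\omega\cdot\nu$ is replaced with $\sigma\omega_j$, and, as a consequence, $\omega\cdot\nu_\ell$ is replaced with $\omega\cdot\nu_\ell(\mathbf{t}_\ell) = \omega\cdot\nu^0_\ell + \sigma\omega_j$. Define $\tilde n_\ell$ such that
\[
2^{-(\tilde n_\ell + 1)}\gamma \le \delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \le 2^{-(\tilde n_\ell - 1)}\gamma,
\tag{5.8}
\]
where $\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) = |\omega\cdot\nu^0_\ell + \sigma\omega_j - \sigma_\ell\omega_{j_\ell}| \ge \gamma\,|\nu^0_\ell|^{-\tau}$ by the Diophantine condition (2.2b). Therefore $2^{\tilde n_\ell - 1} \le |\nu^0_\ell|^{\tau} \le (E_1 k(\mathring{T}))^{\tau} \le (E_1 K_0)^{\tau}\,2^{n} = 2^{n-8}$, and hence $\tilde n_\ell \le n - 7$. Since $|\omega\cdot\nu - \sigma\omega_j| \le 2^{-n+2}\gamma$ by the inductive hypothesis, one has
\[
\begin{aligned}
\delta_{j_\ell}(\omega\cdot\nu_\ell) &= \bigl|\omega\cdot\nu^0_\ell + \omega\cdot\nu - \sigma_\ell\omega_{j_\ell}\bigr|\\
&\ge \bigl|\omega\cdot\nu^0_\ell + \sigma\omega_j - \sigma_\ell\omega_{j_\ell}\bigr| - \bigl|\omega\cdot\nu - \sigma\omega_j\bigr|
\ge \frac{15}{16}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j),
\end{aligned}
\]
because $\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \ge 2^{-(\tilde n_\ell + 1)}\gamma \ge 2^{-n+6}\gamma \ge 2^{4}\,|\omega\cdot\nu - \sigma\omega_j|$. In the same way one can bound $\delta_{j_\ell}(\omega\cdot\nu_\ell) \le |\omega\cdot\nu^0_\ell + \sigma\omega_j - \sigma_\ell\omega_{j_\ell}| + |\omega\cdot\nu - \sigma\omega_j|$, so that we conclude that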

\[
\frac{15}{16}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \le \delta_{j_\ell}(\omega\cdot\nu_\ell) \le \frac{17}{16}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j).
\tag{5.9}
\]

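This yields the assertion.

2. Consider now the case that $O_T = \mathscr{R}$. In that case $\omega\cdot\nu_\ell(\mathbf{t}_\ell)$ is given by (5.6). Define $\tilde n_\ell$ as in (5.8), with $\sigma = \sigma_{\ell_p}$ and $j = j_{\ell_p}$. We want to prove that
\[
\frac{7}{8}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \le \delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \le \frac{9}{8}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j)
\tag{5.10}
\]
for all $\mathbf{t}_\ell = (t_1, \dots, t_p)$, with $t_i \in [0,1]$ for $i = 1, \dots, p$. This immediately implies the assertion because, by using also (5.9), we obtain
\[
\frac{14}{17}\,\delta_{j_\ell}(\omega\cdot\nu_\ell) \le \frac{7}{8}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \le \delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \le \frac{9}{8}\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \le \frac{18}{15}\,\delta_{j_\ell}(\omega\cdot\nu_\ell),
\]
and hence $4\,\delta_{j_\ell}(\omega\cdot\nu_\ell) \le 5\,\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \le 6\,\delta_{j_\ell}(\omega\cdot\nu_\ell)$. By the inductive hypothesis and the discussion of the case 1, in (5.8) we have
\[
\bigl|\omega\cdot\nu^0_{\ell_i} + \sigma_{\ell_{i-1}}\omega_{j_{\ell_{i-1}}} - \sigma_{\ell_i}\omega_{j_{\ell_i}}\bigr| \le 2^{-n_i+2}\gamma, \qquad i = 1, \dots, p,
\]
where $n_i = n_{\ell_i}$. Moreover one has $n_i \ge n_{i+1}$ for $i = 1, \dots, p-1$, so that we obtain
\[
\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \ge \delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) - \sum_{i=1}^{p} 2^{-n_i+2}\gamma \ge \delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) - 2^{-n+3}\gamma.
\]
Since $\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j) \ge 2^{-(\tilde n_\ell+1)}\gamma$ and $\tilde n_\ell \le n - 7$, one finds $\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \ge (1 - 2^{-3})\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j)$. In the same way one has $\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell)) \le (1 + 2^{-3})\,\delta_{j_\ell}(\omega\cdot\nu^0_\ell + \sigma\omega_j)$, so that (5.10) follows.

Remark 5.5. Given a renormalised tree $\theta$, with $\mathscr{V}(\theta) \ne 0$, if a line $\ell \in L(\theta)$ has scale $n_\ell$ then $\Psi_{n_\ell}(\delta_{j_\ell}(\omega\cdot\nu_\ell(\mathbf{t}_\ell))) \ne 0$, and hence, by Lemma 5.4, one has $2^{-(n_\ell+2)}\gamma \le \delta_{j_\ell}(\omega\cdot\nu_\ell) \le 2^{-(n_\ell-2)}\gamma$. Therefore, Lemma 4.1 still holds for the renormalised trees without any changes in the proof (see also Remark 4.2).

Remark 5.6. Another important consequence of Lemma 5.4 (and of Inequality (3.8) in Remark 3.11) is that the number of scale labels which can be associated with each line of a renormalised tree is still at most 2.

6. Symmetries and Identities

Now we shall prove some symmetry properties on the localised value of the self-energy clusters.

Lemma 6.1. If $T \in \mathfrak{E}^k_{j,\sigma,j,\sigma}(u,n)$ is such that $\breve{T}$ does not contain any end node $v$ with $F_v = c_j^{-\sigma}$ then there exists $T' \in \mathfrak{R}^k_{j,\sigma,j,\sigma}(u,n)$ such that $-2\,\mathscr{L}\mathscr{V}(T) = \mathscr{L}\mathscr{V}(T',u)$.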


Fig. 15. The self-energy cluster T , the tree θT , and the self-energy cluster T in the proof of Lemma 6.1

Fig. 16. The sets F1 (T ) = {T1 , T2 } and F2 (T ) = {T3 } corresponding to the self-energy cluster T in Fig. 11

˘ Proof. If T ∈ Ekj,σ, j,σ (u, n) one has |E σj (T˘ )| = |E −σ j (T )|+1 (see Remark 3.24), so that −σ ˘ σ ˘ if |E (T )| = 0, then also |E (T )| = 1. This means that jv = j for all v ∈ E(T˘ )\{v0 }, j

j

k if E σj (T˘ ) = {v0 }. Consider the self-energy cluster T ∈ R j,σ, j,σ (u, n) obtained from θT by replacing the line exiting v0 with an entering line carrying a momentum ν such that ω · ν = u and n T = n T ; see Fig. 15. With the exception of v0 , the nodes of θT have the same node factors as T ; in particular they have the same combinatorial factors. If we compute the propagators G of ∈ L(T ), by setting u = σ ω j , then they are the same as the corresponding propagators of θT . Finally, as n T = n T , one has L V (T ) = 0 if and only if also L V (T , u) = 0. Thus, by recalling also Remark 3.25, one finds −2L V (T ) = L V (T , u).

For T ∈ Ekj,σ, j,σ (u, n) let us call F1 (T ) the set of all inequivalent self-energy clusters k

T ∈ R j,σ, j,σ (u, n) obtained from θT by replacing a line exiting an end node v ∈ E σj (θ˘T ) with an entering line carrying a momentum ν such that ω·ν = u and with n T = n T . Call also F2 (T ) the set of all inequivalent self-energy clusters T ∈ Rkj,σ, j,−σ (u , n), with u = u − 2σ ω j , obtained from θT by replacing a line exiting an end node v ∈ E −σ (θ˘T ) j

(if any) with an entering line carrying a momentum ν such that ω · ν = u and with n T = n T ; see Fig. 16.

Lemma 6.2. For all T ∈ Ekj,σ, j,σ (u, n) one has ⎛ ⎞ ⎝2cσj L V (T ) + cσj L V (T , u)⎠ = c−σ j T ∈F

1 (T )

T ∈F

L V (T , u ),

2 (T )

where u = u − 2σ ω j and the right hand side is meant as zero if F2 (T ) = ∅. Proof. The case k(T ) > K 0 2n T /τ is trivial so that we consider only the case k(T ) ≤ K 0 2n T /τ . By construction any T ∈ Ekj,σ, j,σ (u, n) is such that T˘ contains at least an end


˘ node v such that Fv = cσj , hence |E σj (T˘ )| ≥ 1. By Lemma 6.1 either |E −σ j (T )| ≥ 1 k

or there exists T ∈ R j,σ, j,σ (u, n) such that 2L V (T ) + L V (T , u) = 0. Hence the ˘ assertion is proved if E −σ j (T ) = ∅. ˘ So, let us consider the case |E −σ j (T )| ≥ 1. First of all note that there is a 1-to-1 correspondence between the lines of θT and the lines and external lines, respectively, of both T ∈ F1 (T ) and T ∈ F2 (T ); the same holds for the internal nodes. Moreover the propagators both of any T ∈ F1 (T ) and of any T ∈ F2 (T ) are equal to the corresponding propagators of T when setting u = σ ω j and u = −σ ω j , respectively. Also the node factors of the internal nodes of all self-energy clusters T ∈ F1 (T ) ∪ F2 (T ) are the same as those of T . For T ∈ F1 (T ) one has |E i+ (T˘ )| = |E i− (T˘ )| for all i = 1, . . . , d, whereas for T ∈ F2 (T ) one has |E i+ (T˘ )| = |E i− (T˘ )| for all i = j and ˘ |E σj (T˘ )| = |E −σ j (T )| + 2; thus, one has ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ σ σ σ ⎝ c v ⎠ = cσj ⎝ c v ⎠ = c−σ ⎝ c v⎠ v∈E(T˘ )

jv

v∈E(T˘ )

jv

j

v∈E(T˘ )

for all T ∈ F1 (T ) and all T ∈ F2 (T ). Therefore, if we write

⎛

⎞

−2cσj L V (T ) = V (θT ) = A (T ) ⎝

jv

v∈E(T˘ )

cσjvv ⎠ ,

(6.1)

where A (T ) depends only on T , then one finds ⎛ ⎞ 1 L V (T , u) = A (T ) σ ⎝ cσjvv ⎠ rv, j,σ , cj T ∈F1 (T )

v∈E(T˘ )

v∈V (T˘ )

with the same factor A (T ) as in (6.1). Analogously one has ⎛ ⎞ σ 1 L V (T , u ) = A (T ) −σ ⎝ c jvv ⎠ rv, j,−σ , cj T ∈F (T ) ˘ ˘ v∈E(T )

2

v∈V (T )

again with the same factor A (T ) as in (6.1), so one can write ⎞ ⎛ ⎝−2cσj V (T ) + cσj L V (T , u)⎠ − c−σ L V (T , u ) j ⎛

T ∈F1 (T )

= B(T ) ⎝−1 +

rv, j,σ

T ∈F2 (T )

⎞ − rv, j,−σ ⎠ ,

v∈V (T˘ )

where

⎛ B(T ) = A (T ) ⎝

v∈E(T˘ )

⎞ cσjvv ⎠ .

(6.2)


Fig. 17. A self-energy cluster T and the corresponding sets G1 (T ) = {T, T1 }, G2 (T ) = {T2 , T3 }, and G3 (T ) = {T4 , T5 }

On the other hand one has

rv, j,σ = |E σj (T˘ )|,

v∈V (T˘ )

˘ so that the term in the last parentheses of (6.2) gives −1 + |E σj (T˘ )| − |E −σ j (T )| = 0. Therefore the assertion is proved.

For T ∈ Rkj,σ, j ,σ (u, n) with j = j and n ≥ 0 for all ∈ PT , call G1 (T ) the set of self-energy clusters T ∈ Rkj,σ, j ,σ (u, n) obtained from T by exchanging the entering line with a line exiting an end node v ∈ E σ (T˘ ) (if any). Call also G2 (T ) the set of T

j

self-energy clusters T ∈ Rkj,σ, j ,−σ (u , n), with u = u −2σ ω j , obtained from T by (1) replacing the momentum of T with a momentum ν such that ω · ν = u , (2) changing ˘ the sign label of an end node v ∈ E −σ j (T ) into σ , and (3) exchanging the lines T and v . Finally call G3 (T ) the set of self-energy clusters T ∈ Rkj,−σ, j ,σ (u, n), obtained from T by (1) replacing the entering line T with a line exiting a new end node v0 with σv0 = σ and ν v0 = σ e j , (2) replacing all the labels σv of the nodes v ∈ N0 (T ) ∪ {v0 } with −σv and (3) replacing a line exiting an end node v ∈ E σj (T˘ ), with the entering line T ; see Fig. 17. Again we force n T = n T for all T ∈ G1 (T ) ∪ G2 (T ) ∪ G3 (T ). Lemma 6.3. For all T ∈ Rkj,σ, j ,σ (u, n), with j = j and n ≥ 0 for all ∈ PT , one has cσj L V (T , u) = c−σ L V (T , u ), j T ∈G1 (T )

σ L c−σ j c j T ∈G1 (T )

T ∈G2 (T )

V (T , u) =

cσj c−σ L j T ∈G3 (T )

V (T , u).


Proof. Again we consider only the case k(T˚ ) ≤ K 0 2n T /τ . For fixed T ∈ Rkj,σ, j ,σ (u, n), with j = j , let θ ∈ Tkj,σ e j (n) be the tree obtained from T by replacing the entering line T with a line exiting a new end node v0 with σv0 = σ and ν v0 = σ e j . Note that ˘ in particular one has |E σj (θ˘ )| = |E −σ j (θ )|. Any T ∈ G1 (T ) can be obtained from θ by replacing a line exiting an end node v ∈ E σ (θ˘ ) with an entering line , with the same j

labels as T , so that

cσj

T

˘ V (θ ). L V (T , u) = |E σj (θ)|

T ∈G1 (T )

On the other hand, any T ∈ G2 (T ) can be obtained from θ by replacing a line exiting an ˘ end node v ∈ E −σ j (θ) with an entering line T , with labels ν − 2σ e j , j , −σ , hence c−σ j

˘ L V (T , u) = |E −σ j (θ )| V (θ ),

T ∈G2 (T )

so that the first equality is proved. Now, let θ ∈ Tkj,−σ e j (n) be the tree obtained from θ by replacing all the labels σv of the nodes v ∈ N0 (θ ) with −σv . Any T ∈ G3 (T ) can be obtained from θ by replacing a line exiting an end node v ∈ E σj (θ˘ ) with an entering line T , carrying the same labels as T . Hence, by Lemma 3.14,

σ c−σ j c j

−σ ˘ σ ˘ σ L V (T , u) = c−σ j |E j (θ )| V (θ ) = c j |E j (θ )| V (θ )

T ∈G1 (T )

= cσj c−σ j

T ∈G

L V (T , u),

3 (T )

which yields the second identity, and hence completes the proof.

Lemma 6.4. For all k ∈ Z+ , all j, j = 1, . . . , d, and all σ, σ ∈ {±}, one has (i) η(k) = η(k) (|c1 |2 , . . . , |cd |2 ), i.e., η(k) depends on c only through the quantities |c1 |2 , . . . , |cd |2 ; (k) (k) σ (k) (ii) L M j,σ, j ,σ (u, n) = c−σ j c j M j, j (n), where M j, j (n) does not depend on the indices σ, σ . (k)

Proof. One works on the single trees contributing to L M j,σ, j ,σ (u, n). Then the proof follows from Lemma 3.14 and the results above.

Remark 6.5. Note that Lemma 6.4 could be reformulated as (k)

(k)

L M j,σ, j ,σ (u, n) = ∂cσ cσj L M j,σ, j,σ (n), j

(k)

with M j,σ, j,σ (n) defined after (3.13). We omit the proof of the identity, since it will not be used.


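7. Cancellations and Bounds

We have seen in Sect. 4 that, as far as resonant lines are not considered, no problems arise in obtaining 'good bounds', i.e., bounds on the tree values of order $k$ proportional to some constant to the power $k$ (see Lemma 4.3). For the same bound to hold for all tree values we need a gain factor proportional to $2^{-n_\ell}$ for each resonant line $\ell$ on scale $n_\ell \ge 1$.

Let us consider a tree $\theta$, and write its value as in (5.4). Let $\ell$ be a resonant line. Then $\ell$ exits a self-energy cluster $T_2$ and enters a self-energy cluster $T_1$; see Fig. 9. By construction $T_1 \in \mathfrak{R}^{k_1}_{j_1,\sigma_1,j'_1,\sigma'_1}(\omega\cdot\nu_{\ell'_{T_1}}, n_1)$ and $T_2 \in \mathfrak{R}^{k_2}_{j_2,\sigma_2,j'_2,\sigma'_2}(\omega\cdot\nu_{\ell'_{T_2}}, n_2)$, for suitable values of the labels, with the constraint $j'_1 = j_2 = j_\ell$ and $\sigma'_1 = \sigma_2 = \sigma_\ell$.

If $O_{T_1} = O_{T_2} = \mathscr{L}$, we consider also all trees obtained from $\theta$ by replacing $T_1$ and $T_2$ with other clusters $T'_1 \in \mathfrak{R}^{k_1}_{j_1,\sigma_1,j'_1,\sigma'_1}(\omega\cdot\nu_{\ell'_{T_1}}, n_1)$ and $T'_2 \in \mathfrak{R}^{k_2}_{j_2,\sigma_2,j'_2,\sigma'_2}(\omega\cdot\nu_{\ell'_{T_2}}, n_2)$, respectively, with $O_{T'_1} = O_{T'_2} = \mathscr{L}$. In this way
\[
\mathscr{L}\mathscr{V}(T_1, \omega\cdot\nu_{\ell'_{T_1}})\; G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell)\; \mathscr{L}\mathscr{V}(T_2, \omega\cdot\nu_{\ell'_{T_2}})
\]
is replaced with
\[
\mathscr{L} M^{(k_1)}_{j_1,\sigma_1,j_\ell,\sigma_\ell}(\omega\cdot\nu_{\ell'_{T_1}}, n_1)\; G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell)\; \mathscr{L} M^{(k_2)}_{j_\ell,\sigma_\ell,j'_2,\sigma'_2}(\omega\cdot\nu_{\ell'_{T_2}}, n_2).
\tag{7.1}
\]
Then consider also all trees in which the factor (7.1) is replaced with
\[
\mathscr{L} M^{(k_1)}_{j_1,\sigma_1,j_\ell,-\sigma_\ell}(\omega\cdot\nu_{\ell'_{T_1}}, n_1)\; G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu'_\ell)\; \mathscr{L} M^{(k_2)}_{j_\ell,-\sigma_\ell,j'_2,\sigma'_2}(\omega\cdot\nu_{\ell'_{T_2}}, n_2),
\tag{7.2}
\]
with $\nu'_\ell$ such that $\omega\cdot\nu_\ell - \sigma_\ell\omega_{j_\ell} = \omega\cdot\nu'_\ell + \sigma_\ell\omega_{j_\ell}$; see Fig. 18. Because of Lemmas 6.2 and 6.3 the sum of the two contributions (7.1) and (7.2) gives
\[
\mathscr{L} M^{(k_1)}_{j_1,\sigma_1,j_\ell,\sigma_\ell}(\omega\cdot\nu_{\ell'_{T_1}}, n_1)\,\Bigl( G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell) + G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu'_\ell) \Bigr)\, \mathscr{L} M^{(k_2)}_{j_\ell,\sigma_\ell,j'_2,\sigma'_2}(\omega\cdot\nu_{\ell'_{T_2}}, n_2),
\]
where
\[
\begin{aligned}
G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell) + G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu'_\ell)
&= \frac{\Psi_{n_\ell}(\delta_{j_\ell}(\omega\cdot\nu_\ell))}{\omega\cdot\nu_\ell - \sigma_\ell\omega_{j_\ell}}
\left( \frac{1}{\omega\cdot\nu_\ell + \sigma_\ell\omega_{j_\ell}} + \frac{1}{\omega\cdot\nu'_\ell - \sigma_\ell\omega_{j_\ell}} \right)\\
&= \frac{2\,\Psi_{n_\ell}(\delta_{j_\ell}(\omega\cdot\nu_\ell))}{(\omega\cdot\nu_\ell + \sigma_\ell\omega_{j_\ell})(\omega\cdot\nu'_\ell - \sigma_\ell\omega_{j_\ell})},
\end{aligned}
\tag{7.3}
\]
and hence $|G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu_\ell) + G^{[n_\ell]}_{j_\ell}(\omega\cdot\nu'_\ell)| \le 2\,\omega_{j_\ell}^{-2}$. This provides the gain factor $O(2^{-n_\ell})$ we were looking for, with respect to the original bound $C_0\,2^{n_\ell}$ on the propagator $G_\ell$.

If $O_{T_1} = \mathscr{R}$ then if $k(\mathring{T}_1) > K_0\,2^{n_{T_1}/\tau}$ we can extract a factor $C^{k(\mathring{T}_1)}$ from $\mathscr{V}(T_1, \omega\cdot\nu_{\ell'_{T_1}})$ ($C$ is the constant appearing in Lemma 4.3), and, after writing $C^{k(\mathring{T}_1)} = C^{2k(\mathring{T}_1)}\,C^{-k(\mathring{T}_1)}$, use that $C^{-k(\mathring{T}_1)} \le C^{-K_0 2^{n_{T_1}/\tau}} \le \mathrm{const.}\,2^{-n_{T_1}}$ in order to obtain a gain factor $O(2^{-n_\ell})$.

If $k(\mathring{T}_1) \le K_0\,2^{n_{T_1}/\tau}$ and $n_{\ell'} \ge 0$ for all $\ell' \in \mathcal{P}_{T_1}$, we obtain a gain factor proportional to $2^{-n_\ell}$ because of the first line of (5.2). Of course whenever one has such a case,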


Fig. 18. Graphical representation of the cancellation mechanism discussed in the text: ν = ν − 2σ e j . If we sum the two contributions we obtain a gain factor O(2−n )

then one has a derivative acting on V (T, u) – see (5.2). Therefore one needs to control derivatives like ⎛ ⎞⎛ ⎞ ∂u G ⎝ G ⎠ ⎝ Fv ⎠ , (7.4) ∂u V (T, u) = ∈PT

∈L(T )\{ }

v∈N (T )

where ∂u G =

n (δ(ω · ν )) ∂u n (δ(ω · ν )) − 2ω · ν . 2 2 (ω · ν ) − ω j ((ω · ν )2 − ω2j )2

(7.5)

The derived propagator (7.5) can be easily bounded by |∂u G | ≤ C1 22n ,

(7.6)

for some positive constant C1 . In principle, given a line , one could have one derivative of G for each self-energy cluster containing . This should be a problem, because in a tree of order k, a propagator G could be derived up to O(k) times, and no bound proportional to some constant to the power k can be expected to hold to order k. In fact, it happens that no propagator has to be derived more than once. This can be seen by reasoning as follows. Let T be a self-energy cluster of depth D(T ) = 1. If OT = R then a gain factor O(2−n T ) is obtained. When writing ∂u V (T, u) according to (7.4) one obtains |PT | terms, one for each line ∈ PT . Then we can bound the derivative of G according to −n (7.6). By collecting together the gain factor and the bound (7.6) we obtain 22n 2 T . We can interpret such a bound by saying that, at the cost of replacing the bound 2n of −n the propagator G with its square 22n , we have a gain factor 2 T for the self-energy cluster T . Suppose that is contained inside other self-energy clusters besides T , say T p ⊂ T p−1 ⊂ · · · ⊂ T1 (hence T p is that with largest depth, and D(T p ) = p + 1). Then, when taking the contribution to (7.4) with the derivative ∂u acting on the propagator G , we consider together the labels OTi = R and OTi = L for all i = 1, . . . , p (in other words we do not distinguish between localised and regularised values for such self-energy


clusters), because we do not want to produce further derivatives on the propagator G . Of course we have obtained no gain factor corresponding to the entering lines of the self-energy clusters T1 , . . . , T p , and all these lines can be resonant lines. So, eventually we shall have to keep track of this. Then we can iterate the procedure. If the self-energy cluster T does not contain any line whose propagator is derived, we split its value into the sum of the localised value plus the regularised value. On the contrary, if a line along the path PT of T is derived we do not separate the localised value of T from its regularised value. Note that, if T is contained inside a regularised self-energy cluster, then both ω · ν and ω · ν in (7.1) and (7.2) must be replaced with ω · ν (t ) and ω · ν (t ), respectively, but still ω · ν (t ) − σ ω j = ω · ν (t ) + σ ω j , so that the cancellation (7.3) still holds. Let us call a ghost line a resonant line such that (1) is along the path PT of a regularised self-energy cluster T and either (2a) enters or exits a self-energy cluster T ⊂ T containing a line whose propagator is derived or (2b) the propagator of is derived. Then, eventually one obtains a gain 2−n for all resonant lines , except for the ghost lines. In other words we can say that there is an overall factor proportional to ⎛ ⎞⎛ ⎞ −n n ⎝ 2 ⎠ ⎝ 2 ⎠ , (7.7) ∈L R (θ)

∈L G (θ)

where L G (θ ) is the set of ghost lines. Indeed, in case (2a) there is no gain corresponding to the line , so that we can insert a ‘good’ factor 2−n provided we allow also a compensating ‘bad’ factor 2n . In case (2b) one can reason as follows. Call (with some abuse of notation) T1 and T2 the self-energy clusters which enters and exits, respectively. If OT1 = OT2 = L , we consider ] L V (T1 , ω · ν T ) ∂u G [n j (ω · ν (t )) L V (T2 , ω · ν T2 ), 1

and, by summing over all possible self-energy clusters as done in (7.1), we obtain (k2 ) [n ] 1) L M (k j1 ,σ1 , j ,σ (ω · ν T , n 1 ) ∂u G j (ω · ν (t )) L M j ,σ , j ,σ (ω · ν T , n 2 );

1

2

2

2

then we sum this contribution with (k )

(k2 ) (ω ,−σ , j2 ,σ2

1 ] L M j1 ,σ (ω · ν T , n 1 ) ∂u G [n j (ω · ν (t )) L M j 1 , j ,−σ 1

· ν T , n 2 ), 2

where ν = ν − 2σ e j ; again we can use Lemmas 6.2 and 6.3 to obtain (k1 ) [n ] [n ] , n 1 ) ∂u G (ω · ν (ω · ν (t )) + ∂ G (ω · ν (t )) L M j1 ,σ u j j , j ,σ 1 T 1

×L

(k2 ) M j ,σ (ω , j2 ,σ2

· ν T , n 2 ), 2

where 2∂u n (δ(ω · ν ( t ))) (ω · ν ( t ) + σ ω j )(ω · ν ( t ) − σ ω j ) 4(ω · ν ( t ) − σ ω j )n (δ(ω · ν ( t ))) , − (ω · ν ( t ) + σ ω j )2 (ω · ν ( t ) − σ ω j )2

[n ] ] ∂u G [n j (ω · ν ( t )) + ∂u G j (ω · ν ( t )) =


so that we have not only the gain factor 2−n due to the cancellation, but also a factor 2n because of the term ∂u n (δ(ω · ν )). A trivial but important remark is that all the ghost lines contained inside the same self-energy cluster have different scales: in particular there is at most one ghost line on a given scale n. Therefore we can rely upon Lemma 4.4 and Lemma 5.4, to ensure that for each such line there is also at least one non-resonant line on scale ≥ n − 3 (inside the same self-energy cluster). Therefore we can bound the second product in (7.7) as ⎛ ⎞ ∞ ⎝ 2n ⎠ ≤ 2n Nn−3 (θ) , ∈L G (θ)

n=1

which in turn is bounded as a constant to the power k = k(θ ), as argued in the proof of Lemma 4.3. Finally if k(T˚1 ) ≤ K 0 2n T1 /τ and T1 contains at least one line ∈ PT1 with n = −1, in general there are p ≥ 1 self-energy clusters T p ⊂ T p−1 ⊂ · · · ⊂ T1 = T1 such that ∈ PTi for i = 1, . . . , p, and T p is the one with largest depth containing . For i = 1, . . . , p call i the exiting line of the self-energy cluster Ti and θi = θ i . Denote also, for i = 1, . . . , p − 1, by i = i+1 (θi ) (recall Notation 3.29). By Lemma 3.30 one has |ν i − ν i+1 | ≤ E 2 k( ˚ i ) for i = 1, . . . , p − 1. Moreover one has |ν 1 − σ e j | ≤ E 2 (k( ˚ 1 ) + · · · + k( ˚ p−1 )). On the other hand one has γ −n +2 ≤ δ ji (ω · ν i ) + δ ji+1 (ω · ν i+1 ) ≤ 2 Ti+1 γ , |ν i − ν i+1 |τ γ −n ≤ δ j1 (ω · ν 1 ) ≤ 2 T1 γ , τ |ν 1 − σ e j | so that one can write C

k( ˚ 1 )+···+k( ˚ p−1 ))

≤C

3k( ˚ 1 )+···+k( ˚ p−1 )) −n T1

2

p

2

−n T i

,

(7.8)

i=2

which assures the gain factors for all self-energy clusters T1 , . . . , T p . To conclude the analysis, if OT1 = L but OT2 = R, one can reason in the same way by noting that |n T − n | ≤ 1. 2
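The cancellation (7.3) on which these gain factors rest is elementary algebra and can be checked numerically. In the sketch below the scale function is frozen to a constant `psi` (an assumption: in the text it is evaluated at the same argument in both propagators), while `omega_j`, `sigma` and `x` are sample values; `x` plays the role of the small divisor shared by the two momenta, and all function names are illustrative.

```python
def propagator(u: float, omega_j: float, psi: float = 1.0) -> float:
    """psi / (u**2 - omega_j**2): a propagator with the scale function frozen."""
    return psi / (u * u - omega_j * omega_j)


def paired_sum(x: float, omega_j: float, sigma: int, psi: float = 1.0) -> float:
    """Sum of the two propagators whose arguments differ by 2*sigma*omega_j."""
    u = x + sigma * omega_j        # first argument: u - sigma*omega_j = x
    u_prime = x - sigma * omega_j  # second argument: u' + sigma*omega_j = x
    return propagator(u, omega_j, psi) + propagator(u_prime, omega_j, psi)


def closed_form(x: float, omega_j: float, sigma: int, psi: float = 1.0) -> float:
    """2 psi / ((u + sigma*omega_j) * (u' - sigma*omega_j)), as in (7.3)."""
    u = x + sigma * omega_j
    u_prime = x - sigma * omega_j
    return 2.0 * psi / ((u + sigma * omega_j) * (u_prime - sigma * omega_j))


if __name__ == "__main__":
    x, omega_j, sigma = 1e-3, 1.7, 1
    print(propagator(x + sigma * omega_j, omega_j))  # large: of order 1/x
    print(paired_sum(x, omega_j, sigma))             # bounded as x -> 0
    print(closed_form(x, omega_j, sigma))
```

Each propagator alone grows like `1/x`, whereas the pair stays bounded by a multiple of `1/omega_j**2` for small `x`, which is the gain discussed in the text.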

Lemma 7.1. Set (c) = max{|c1 |, . . . , |cd |, 1}. There exists a positive constant C such that for k ∈ N, j ∈ {1, . . . , d} and ν ∈ Zd one has | θ∈Tk V (θ )| ≤ C k 3k (c). j,ν

Proof. Each time one has a resonant line $\ell$, when summing together the values of all self-energy clusters, a gain $B_1 2^{-n_\ell}$ is obtained (either by the cancellation mechanism described at the beginning of this section or as an effect of the regularisation operator $\mathcal{R}$). The number of trees of order $k$ is bounded by $B_2^k$ for some constant $B_2$; see Remark 3.12. The derived propagators can be bounded by (7.6). By taking into account also the bound of Lemma 4.3, setting $B_3 = C_0$, and bounding by $B_4^k$, with

$$B_4 = \exp\Big( 3c \log 2 \sum_{n=0}^{\infty} 2^{-n/\tau}\, n \Big),$$

the product of the propagators (both derived and non-derived) of the non-resonant lines times the derived propagators of the resonant lines, we obtain the assertion with $C = B_1 B_2 B_3 B_4$.

Lemma 7.2. The function (1.7), with x j,ν as in (3.10), and the counterterms η j defined in (3.11) are analytic in ε and c, for |ε| 3 (c) ≤ η0 with η0 small enough and (c) = max{|c1 |, . . . , |cd |, 1}. Therefore the solution x(t, ε, c) is analytic in t, ε, c for |ε| 3 (c)e3|ω| |Im t| ≤ η0 , with η0 small enough. Proof. Just collect together all the results above, in order to obtain the convergence of the series for η0 small enough and |ε| ξ (c) ≤ η0 , for some constant ξ . Moreover (k) x j,ν = 0 for |ν| > ξ k, for the same constant ξ . Lemma 3.10 gives ξ = 3.

A. Momentum-Depending Perturbation

Here we discuss the Hamiltonian case in which the perturbation depends also on the coordinates $y_1, \dots, y_d$, as in (1.13). As we shall see, differently from the $y$-independent case, here the Hamiltonian structure of the system is fundamental. It is more convenient to work in complex variables $z$, $w = z^*$, with $z_j = (y_j + i\omega_j x_j)/\sqrt{2\omega_j}$, where the Hamilton equations are of the form

$$-i\dot z_j = \omega_j z_j + \varepsilon\,\partial_{w_j} F(z,w,\varepsilon) + \eta_j z_j, \qquad i\dot w_j = \omega_j w_j + \varepsilon\,\partial_{z_j} F(z,w,\varepsilon) + \eta_j w_j, \tag{A.1}$$

with

$$F(z,w,\varepsilon) = \sum_{p=0}^{\infty} \varepsilon^p \sum_{\substack{s_1^+,\dots,s_d^+,s_1^-,\dots,s_d^- \ge 0 \\ s_1^+ + \dots + s_d^+ + s_1^- + \dots + s_d^- = p+3}} a_{s_1^+,\dots,s_d^+,s_1^-,\dots,s_d^-}\; z_1^{s_1^+} \cdots z_d^{s_d^+}\, w_1^{s_1^-} \cdots w_d^{s_d^-}. \tag{A.2}$$

Note that, since the Hamiltonian (1.11) is real, one has

$$a_{s^+,s^-} = a^*_{s^-,s^+}, \qquad s^\pm = (s_1^\pm, \dots, s_d^\pm) \in \mathbb{Z}_+^d. \tag{A.3}$$

Let us write

$$f_j^+(z,w,\varepsilon) = \varepsilon\,\partial_{w_j} F(z,w,\varepsilon), \qquad f_j^-(z,w,\varepsilon) = \varepsilon\,\partial_{z_j} F(z,w,\varepsilon),$$

so that

$$f_j^\sigma(z,w,\varepsilon) = \sum_{p=1}^{\infty} \varepsilon^p \sum_{\substack{s^+,s^- \in \mathbb{Z}_+^d \\ s_1^+ + \dots + s_d^+ + s_1^- + \dots + s_d^- = p+1}} f^\sigma_{j,s^+,s^-}\; z_1^{s_1^+} \cdots z_d^{s_d^+}\, w_1^{s_1^-} \cdots w_d^{s_d^-}, \qquad \sigma = \pm,$$

with $f^+_{j,s^+,s^-} = (s_j^- + 1)\, a_{s^+,s^-+e_j}$ and $f^-_{j,s^+,s^-} = (s_j^+ + 1)\, a_{s^++e_j,s^-}$, and hence

$$f^-_{j,s^+,s^-} = \big(f^+_{j,s^-,s^+}\big)^*, \qquad j = 1,\dots,d, \quad s^+, s^- \in \mathbb{Z}_+^d, \tag{A.4a}$$

$$(s^+_{j_2} + 1)\, f^+_{j_1,s^++e_{j_2},s^-} = (s^-_{j_1} + 1)\, f^-_{j_2,s^+,s^-+e_{j_1}}, \qquad j_1, j_2 = 1,\dots,d, \quad s^+, s^- \in \mathbb{Z}_+^d, \tag{A.4b}$$

$$(s^-_{j_2} + 1)\, f^+_{j_1,s^+,s^-+e_{j_2}} = (s^-_{j_1} + 1)\, f^+_{j_2,s^+,s^-+e_{j_1}}, \qquad j_1, j_2 = 1,\dots,d, \quad s^+, s^- \in \mathbb{Z}_+^d, \tag{A.4c}$$

$$(s^+_{j_2} + 1)\, f^-_{j_1,s^++e_{j_2},s^-} = (s^+_{j_1} + 1)\, f^-_{j_2,s^++e_{j_1},s^-}, \qquad j_1, j_2 = 1,\dots,d, \quad s^+, s^- \in \mathbb{Z}_+^d. \tag{A.4d}$$

Expanding the solution $(z(t), w(t))$ in Fourier series with frequency vector $\omega$, (A.1) gives

$$(\omega\cdot\nu - \omega_j)\, z_{j,\nu} = \eta_j z_{j,\nu} + f^+_{j,\nu}(z,w,\varepsilon), \qquad (-\omega\cdot\nu - \omega_j)\, w_{j,\nu} = \eta_j w_{j,\nu} + f^-_{j,\nu}(z,w,\varepsilon). \tag{A.5}$$

We write the unperturbed solutions as

$$z_j^{(0)}(t) = c_j^+ e^{i\omega_j t}, \qquad w_j^{(0)}(t) = c_j^- e^{-i\omega_j t}, \qquad j = 1,\dots,d,$$

with $c_j = c_j^+ \in \mathbb{C}$ and $c_j^- = c_j^*$. As in Sect. 1.2 we can split (A.5) into

$$f^+_{j,e_j}(z,w,\varepsilon) + \eta_j z_{j,e_j} = 0, \qquad j = 1,\dots,d, \tag{A.6a}$$

$$f^-_{j,-e_j}(z,w,\varepsilon) + \eta_j w_{j,-e_j} = 0, \qquad j = 1,\dots,d, \tag{A.6b}$$

$$\big[(\omega\cdot\nu) - \omega_j\big]\, z_{j,\nu} = f^+_{j,\nu}(z,w,\varepsilon) + \eta_j z_{j,\nu}, \qquad j = 1,\dots,d, \quad \nu \ne e_j, \tag{A.6c}$$

$$\big[-(\omega\cdot\nu) - \omega_j\big]\, w_{j,\nu} = f^-_{j,\nu}(z,w,\varepsilon) + \eta_j w_{j,\nu}, \qquad j = 1,\dots,d, \quad \nu \ne -e_j, \tag{A.6d}$$

so that first of all one has to show that the same choice of $\eta_j$ makes both (A.6a) and (A.6b) hold simultaneously, and that such $\eta_j$ is real.

We consider a tree expansion very close to the one performed in Sect. 3: we simply drop (3) in Constraint 3.4. We denote by $\mathcal{T}^k_{j,\nu,\sigma}$ the set of inequivalent trees of order $k$, tree component $j$, tree momentum $\nu$ and tree sign $\sigma$, that is, the sign label of the root line is $\sigma$. We introduce $\breve\theta$ and $\mathring\theta$ as in Notation 3.5 and 3.27 respectively, and we define the value of a tree as follows. The node factors are defined as in (3.2) for the end nodes, while for the internal nodes $v \in V(\theta)$ we define

$$F_v = \begin{cases} \dfrac{s^+_{v,1}! \cdots s^+_{v,d}!\; s^-_{v,1}! \cdots s^-_{v,d}!}{s_v!}\; f^{\sigma_v}_{j_v,s_v^+,s_v^-}, & k_v \ge 1, \\[2ex] -\dfrac{1}{2 c^{\sigma_v}_{j_v}}, & k_v = 0. \end{cases} \tag{A.7}$$

The propagators are defined as $G_\ell = 1$ if $\nu_\ell = \sigma_\ell e_{j_\ell}$ and

$$G_\ell = G^{[n_\ell]}_{j_\ell}(\sigma_\ell\, \omega\cdot\nu_\ell), \qquad G^{[n]}_j(u) = \frac{\Psi_n(|u - \omega_j|)}{u - \omega_j}, \tag{A.8}$$

otherwise, and we define $V(\theta)$ as in (3.9).


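Finally we set $z_{j,e_j} = w^*_{j,-e_j} = c_j$, and formally define

$$z_{j,\nu} = \sum_{k=1}^{\infty} \varepsilon^k z^{(k)}_{j,\nu}, \qquad z^{(k)}_{j,\nu} = \sum_{\theta \in \mathcal{T}^k_{j,\nu,+}} V(\theta), \qquad \nu \ne e_j,$$

$$w_{j,\nu} = \sum_{k=1}^{\infty} \varepsilon^k w^{(k)}_{j,\nu}, \qquad w^{(k)}_{j,\nu} = \sum_{\theta \in \mathcal{T}^k_{j,\nu,-}} V(\theta), \qquad \nu \ne -e_j, \tag{A.9}$$

and

$$\eta_{j,\sigma} = \sum_{k=1}^{\infty} \varepsilon^k \eta^{(k)}_{j,\sigma}, \qquad \eta^{(k)}_{j,\sigma} = -\frac{1}{c^\sigma_j} \sum_{\theta \in \mathcal{T}^k_{j,\sigma e_j,\sigma}} V(\theta). \tag{A.10}$$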

Note that Remarks 3.9, 3.13 and 3.17 still hold.

Lemma A.1. With the notations introduced above, one has $\eta^*_{j,+} = \eta_{j,-}$ and $z^*_{j,\nu} = w_{j,-\nu}$.

Proof. By definition we only have to prove that for any $\theta \in \mathcal{T}^k_{j,\nu,+}$ there exists $\theta' \in \mathcal{T}^k_{j,-\nu,-}$ such that $V(\theta)^* = V(\theta')$. The proof is by induction on the order of the tree. Given $\theta \in \mathcal{T}^k_{j,\nu,+}$, let us consider the tree $\theta'$ obtained from $\theta$ by replacing the labels $\sigma_v$ of all the nodes $v \in N_0(\theta)$ with $-\sigma_v$ and the labels $\sigma_\ell$ of all the lines $\ell \in L(\theta)$ with $-\sigma_\ell$. Call $\ell_1, \dots, \ell_p$ the lines on scale $-1$ (if any) closest to the root of $\theta$, and denote by $v_i$ the node $\ell_i$ enters and by $\theta_i$ the tree with root line $\ell_i$. Each tree $\theta_i$ is then replaced with a tree $\theta_i'$ such that $V(\theta_i)^* = V(\theta_i')$ by the inductive hypothesis. Moreover, as for any internal line $\ell$ in $\theta'$ the momentum becomes $-\nu_\ell$, the propagators do not change. Finally, for any $v \in V(\breve\theta)$ the node factor is changed into

$$F'_v = \begin{cases} \dfrac{s^-_{v,1}! \cdots s^-_{v,d}!\; s^+_{v,1}! \cdots s^+_{v,d}!}{s_v!}\; f^{-\sigma_v}_{j_v,s_v^-,s_v^+}, & k_v \ge 1, \\[2ex] -\dfrac{1}{2 c^{-\sigma_v}_{j_v}}, & k_v = 0. \end{cases} \tag{A.11}$$

Hence by (A.4a) one has $V(\theta)^* = V(\theta')$.

Lemma A.2. With the notations introduced above, one has η j,+ ∈ R. Proof. We only have to prove that for any θ ∈ Tkj,e j ,+ there exists θ ∈ Tkj,e j ,+ such that c+j V (θ )∗ = c−j V (θ ). Let v0 ∈ E +j (θ˘ ) (existing by Remark 3.9) and let us consider the tree θ obtained from θ by (1) exchanging the root line 0 with v0 , (2) replacing all the labels σv of all the nodes v ∈ N0 (θ )\{v0 } with −σv , and (3) replacing all the labels σ of all the internal lines with −σ , except for those in P( v0 , 0 ) which remain the same. The propagators do not change; this is trivial for the lines outside P( v0 , 0 ), while for ∈ P( v0 , 0 ) one


can reason as follows. The line divides E(θ˘ )\{v0 } into two disjoint sets of end nodes E(θ˘ , p) and E(θ˘ , s) such that if = w one has E(θ˘ , p) = {v ∈ E(θ˘ )\{v0 } : v ≺ w} and E(θ˘ , s) = (E(θ˘ )\{v0 })\E(θ˘ , p). If ν ( p) =

νv ,

˘ p) v∈E(θ,

ν (s) =

νv ,

˘ v∈E(θ,s)

one has ν ( p) + ν (s) = 0. When considering as a line in θ one has ν = ν ( p) + e j while in θ one has ν = −ν (s) + e j . Hence, as we have not changed the sign label σ , also G does not change. The node factors of the internal nodes are changed into their complex conjugates; this can be obtained as in Lemma A.1 for the internal nodes w such that w ∈ / P( v0 , 0 ) while for the other nodes one can reason as follows. First of all if v is such that v ∈ P( v0 , 0 )∪{ v0 }, there is a line v ∈ P( v0 , 0 )∪{ 0 } entering v. We shall denote j v = j1 , σ v = σ , j v = j2 , and σ v = σ . Moreover we call siσ the number of lines outside P( v0 , 0 ) ∪ { 0 } with component label i and sign label σ entering v. Let us consider first the case σ = σ = +. When considering v as node of θ one has + ∗ s1 ! · · · sd+ !s1− ! · · · sd− !(s +j2 + 1) + ∗ Fv = f j1 ,s+ +e j ,s− 2 sv ! =

s1+ ! · · · sd+ !s1− ! · · · sd− !(s +j2 + 1) sv !

f j−,s− ,s+ +e . 1

j2

+ When considering v as node of θ one has s+v = s− + e j1 and s− v = s , so that

Fv =

s1+ ! · · · sd+ !s1− ! · · · sd− !(s − j1 + 1) sv !

f j+2 ,s− +e j ,s+ , 1

and hence by (A.4b) Fv∗ = Fv . Reasoning analogously one obtains Fv∗ = Fv also in the cases σ = σ = − and σ = σ , using again (A.4b) when σ = σ = −, and (A.4c) and (A.4d) for σ = −, σ = + and σ = +, σ = − respectively. Hence the assertion is proved.

We define the self-energy clusters as in Sect. 3.6, but replacing the constraint (3) with (3 ) one has |ν T − ν T | ≤ 2 and |σ T ω · ν T − ω j T | = |σ T ω · ν T − ω j |. We T

introduce T˘ and T˚ as in Notation 3.23 and 3.27 respectively, and we can define V (T ) as in (3.12) and the localisation and the regularisation operators as in Sect. 5. Note that the main difference with the y-independent case is in the role of the sign label σ . In fact, here the sign label of a line does not depend on its momentum and component labels, and the small divisor is given by δ j,σ (ω · ν) = |σ ω · ν − ω j |. Hence the dimensional bounds of Sect. 4 and the symmetries discussed in Sect. 6 and summarised in Lemma 6.1 can be proved word by word as in the y-independent case, except for the second equality in Lemma 6.3 where one has to take into account a change of signs. More precisely for T ∈ Rkj,σ, j ,σ (u, n), with j = j and n ≥ 0 for all ∈ PT , we define G1 (T ) as in Sect. 6 and G3 (T ) as in Sect. 6 but replacing also the sign labels σ of the lines ∈ L(T ) with −σ .


Lemma A.3. For all T ∈ Rkj,σ, j ,σ (u, n), with j = j and n ≥ 0 for all ∈ PT , one has σ c−σ L V (T , u) = cσj c−σ L V (T , u). (A.12) j c j j T ∈G1 (T )

T ∈G3 (T )

Proof. We consider only the case k(T˚ ) ≤ K 0 2n T /τ . For fixed T ∈ Rkj,σ, j ,σ (u, n), with j = j , let θ ∈ Tkj,σ e j ,σ (n) be the tree obtained from T by replacing the entering line T with a line exiting a new end node v0 with σv0 = σ and ν v0 = σ e j . As in the proof of Lemma 6.3 one has cσj L V (T , u) = |E σj (θ˘ )| V (θ ). T ∈G1 (T )

Now, let θ ∈ Tkj,−σ e j ,−σ (n) be the tree obtained from θ by replacing all the labels σv of the nodes v ∈ N0 (θ ) with −σv , and the labels σ of all the lines ∈ L(θ ) with −σ . Any T ∈ G3 (T ) can be obtained from θ by replacing a line exiting an end node v ∈ E σj (θ˘ ) with entering line T , carrying the same labels as T . Hence, by Lemma A.1, −σ σ σ ˘ σ ˘ ∗ c−σ L V (T , u) = c−σ j c j j |E j (θ)| V (θ ) = c j |E j (θ )| V (θ ) T ∈G1 (T ) −σ = c−σ j c j

(L V (T , u))∗ .

T ∈G3 (T )

On the other hand, exactly as in Lemma A.2 one can prove that for any T ∈ G3 (T ) there exists T ∈ G3 (T ) such that ∗ σ c−σ j (L V (T , u)) = c j L V (T , u),

and hence the assertion follows.

The cancellation mechanism and the bounds proved in Sect. 7 follow by the same reasoning (in fact it is even simpler); see the next appendix for details. B. Matrix Representation of the Cancellations As we have discussed in Sect. 5 the only obstacle to convergence of the formal power series of the solution is given by the accumulation of resonant lines; see Fig. 14. The cancellation mechanism described in Sect. 7 can be expressed in matrix notation. This is particularly helpful in the y-dependent case. For this reason, and for the fact that the formalism introduced in Appendix A includes the y-independent case, we prefer to work here with the variables (z, w). We first develop a convenient notation. Given ν such that σ (ν, 1) = + and δ1,+ (ω·ν) < γ let us group together, in an ordered set S(ν), all the ν such that ν = ν ( j, σ ) := ν − e1 + σ e j , σ = ±1 and j = 1, . . . , d, see Remark 3.19. By definition one has δ1,+ (ω · ν) = δ j,σ (ω · ν ( j, σ )) for all j = 1, . . . , d and σ = ±. Then we construct a 2d × 2d localised self-energy matrix L M (k) (ω · ν, n) with entries L M (k) j,σ, j ,σ (ω · ν ( j , σ ), n). We also define the 2d × 2d diagonal propagator matrix G [n] (ω · ν) with


[n] [n] [n] entries G j,σ, j ,σ (ω · ν) = δ j, j δσ,σ G j (ω · ν ( j, σ )), with G j (u) defined according to (A.8), and δa,b is the Kronecker delta. As in Sect. 7 let us consider a chain of two self-energy clusters; see Fig. 9. By definition its value is ] L V (T1 , ω · ν 1 ) G [n j (ω · ν ) L V (T2 , ω · ν 2 ),

with ν 1 = ν T and ν 2 = ν T2 . 1 Notice that, if one sets also for the sake of simplicity, σ1 = σ T , j1 = j T , σ2 = σ T2 , 1

1

and j2 = j T2 , by the constraint (3 ) in the definition of self-energy clusters given in Appendix A, one has ν 1 − ν = σ1 e j1 − σ e j and ν − ν 2 = σ e j − σ2 e j2 ; moreover ν 1 , ν , ν 2 all belong to a single set S(ν) for some ν. As done in Sect. 7 let us sum together the values of all the possible self-energy clusters T1 and T2 with fixed labels associated with the external lines, and of fixed orders k1 and k2 , respectively. We obtain (k )

(k )

1 2 (ω · ν ( j , σ ), n T1 ) G j[n ,σ ] , j ,σ (ω · ν) L M j ,σ (ω · ν ( j2 , σ2 ), n T2 ). L M j1 ,σ 1 , j ,σ , j2 ,σ2

If we also sum over all possible values of the labels j , σ we get d σ =± j =1

[n ]

(k )

(k )

2 L M j 1,σ , j ,σ (ω · ν ( j , σ ), n T1 ) G j ,σ (ω · ν)L M j ,σ (ω · ν ( j2 , σ2 ), n T2 ) 1 1 , j ,σ , j2 ,σ2

= L M (k1 ) (ω · ν, n T1 ) G [n ] (ω · ν) L M (k2 ) (ω · ν, n T2 )

j1 ,σ1 , j2 ,σ2

,

(i.e. the entry j1 , σ1 , j2 , σ2 of the matrix in square brackets). By the definition (A.8) of the propagators and by the symmetries of Lemma 6.1, G [n] (ω · ν) and L M (k) (ω · ν, n) have the form ⎛

1 0 ⎜ 0 −1 ⎜ ⎜ (|ω · ν − ω |) 0 n 1 ⎜ [n] ⎜ G (ω · ν) = ⎜ .. ω · ν − ω1 ⎜ . ⎜ ⎝ 0

⎞ ···

0 .. ..

. .

···

0

⎟ ⎟ ⎟ .. ⎟ . ⎟, ⎟ .. . 0 ⎟ ⎟ 1 0 ⎠ 0 0 −1 .. .

(B.1)

and c1∗ c1 (k) ⎜ M1,1 (n) ⎜ c1 c1 ⎜ ⎜ .. ⎜ . L M (k) (ω · ν , n) = ⎜ ⎜ ⎜ ⎜ c ∗ c1 ⎝ (k) Md,1 (n) d cd c1 ⎛

c1∗ cd (k) (n) · · · M 1,d ∗ c1 c1 c1 cd ∗ .. c j c j c∗j c∗j (k) M j, j (n) . ∗ c j c j c j c j cd∗ cd cd∗ c1∗ (k) (n) · · · M d,d cd c1∗ cd cd

c1∗ c1∗

c1∗ cd∗

⎞

⎟ ⎟ ⎟ ⎟ ⎟ ⎟, ⎟ ⎟ cd∗ cd∗ ⎟ ⎠ cd cd∗

c1 cd∗


respectively. A direct computation gives L M (k1 ) (ω · ν, n T1 ) G [n ] (ω · ν) L M (k2 ) (ω · ν, n T2 ) =

n (|ω · ν − ω1 |) −σ1 σ2 c j1 c j2 ω · ν − ω1

d

j1 ,σ1 , j2 ,σ2

M j1 , j (n T1 ) M j, j2 (n T2 ) |c j |2

j=1

(−1)1+σ 1 = 0,

σ =±

(B.2)

for all choices of the scales $n_\ell$, $n_{T_1}$, $n_{T_2}$ and of the orders $k_1$, $k_2$. This proves the necessary cancellation. Note that this is an exact cancellation in terms of the variables $(z, w)$: all chains of localised self-energy clusters of length $p \ge 2$ can be ignored as their values sum up to zero. In the $y$-independent case, and in terms of the variables $x$, the cancellation is only partial, and one only finds $\mathcal{L}M^{(k_1)}\, G^{[n]}\, \mathcal{L}M^{(k_2)} = O(2^{-n})$, as discussed in Sect. 7.

C. Resummation of the Perturbation Series

The fact that the series obtained by systematically eliminating the self-energy clusters converges, as seen in Sect. 4, suggests that one may follow another approach, alternative to what we have described so far, and leading to the same result. Indeed, one can consider a resummed expansion, where one really gets rid of the self-energy clusters at the price of changing the propagators into new dressed propagators (again, the terminology is borrowed from quantum field theory). This is a standard procedure, already exploited in the case of KAM tori [10], lower-dimensional tori [10,12], skew-product systems [11], etc. The convergence of the perturbation series reflects the fact that the dressed propagators can be bounded proportionally to (a power of) the original ones for all values of the perturbation parameter $\varepsilon$. In our case, the latter property can be seen as a consequence of the cancellation mechanism just described.

In a few words, and oversimplifying the strategy, the dressed propagators are obtained starting from a tree expansion where no self-energy clusters are allowed, and then 'inserting arbitrary chains of self-energy clusters': this means that each propagator $G^{[n]} = G^{[n]}(\omega\cdot\nu)$ is replaced by a dressed propagator

$$\mathcal{G}^{[n]} = G^{[n]} + G^{[n]} M G^{[n]} + G^{[n]} M G^{[n]} M G^{[n]} + \cdots, \tag{C.1}$$

where $M = M(\omega\cdot\nu)$ denotes the insertion of all possible self-energy clusters compatible with the labels of the propagators of the external lines ($M$ is the matrix with entries $M_{j,\sigma,j',\sigma'}(\omega\cdot\nu'(j',\sigma'))$ formally defined in Remark 3.26). Then, formally, one can sum together all possible contributions in (C.1), so as to obtain

$$\mathcal{G}^{[n]} = G^{[n]} \big( 1 - M G^{[n]} \big)^{-1} = \big( A^{-1} - B \big)^{-1}, \qquad A := G^{[n]}, \quad B := M. \tag{C.2}$$

For the sake of simplicity, let us also identify the self-energy values with their localised parts, so as to replace in (C.1), and hence in (C.2), $M$ with $\mathcal{L}M$, where $\mathcal{L}$ is the localisation operator. Then, in the notations we are using, the cancellation (B.2) reads $BAB = 0$, so that every term of (C.1) containing two or more factors $B$ vanishes, which implies

$$\mathcal{G}^{[n]} = A + ABA.$$

Therefore one finds $\|\mathcal{G}^{[n]}\| \le \|A\| + \|A\|^2 \|B\| = O(2^{2n})$. So the values of the trees appearing in the resummed expansion can be bounded as done in Sect. 4, with the only difference that now, instead of the propagators $G_\ell$ bounded proportionally to $2^{n_\ell}$, one has the dressed propagators $\mathcal{G}^{[n_\ell]}$ bounded proportionally to $2^{2n_\ell}$.

Of course, the argument above should be made more precise. First of all one should take into account also the regularised values of the self-energy clusters. Moreover, the dressed propagators should be defined recursively, by starting from the lower scales: indeed, the dressed propagator of a line on scale $n$ is defined in terms of the values of the self-energy clusters on scales $< n$, as in (C.2), and the latter in turn are defined in terms of (dressed) propagators on scales $< n$, according to (3.13). As a consequence, the cancellation mechanism becomes more involved because the propagators are no longer of the form (B.1); in particular the symmetry properties of the self-energy values should be proved inductively on the scale label.

In conclusion, really proceeding by following the strategy outlined above requires some work (essentially the same amount as performed in this paper). We do not push forward the analysis, which in principle could be worked out by reasoning as done in the papers quoted above.
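The algebra behind the cancellation (B.2) and the resummation (C.2) can be checked numerically on a toy example. The sketch below is only illustrative: the amplitudes `c`, the coefficients `M` and the scalar `g` are made-up values, and the block structure assumed for the two matrices (a diagonal propagator matrix with entries $\sigma g$, and a localised self-energy matrix with entries $M_{j,j'}\, c_j^{-\sigma} c_{j'}^{\sigma'}$) is our reading of Appendix B rather than a quotation of it. Under these assumptions it verifies that $BAB = 0$ and that $A + ABA$ inverts $A^{-1} - B$.

```python
# Toy check of BAB = 0 and (A^{-1} - B)^{-1} = A + ABA.
# All numerical values are hypothetical; only the algebraic structure matters.

def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def matsub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def max_abs(P):
    return max(abs(x) for row in P for x in row)

c = [1 + 2j, -0.5 + 1j, 2 - 1j]      # hypothetical amplitudes c_j
M = [[0.3, -1.2, 0.7],               # hypothetical coefficients M_{j,j'}
     [2.0, 0.1, -0.4],
     [-0.9, 1.5, 0.25]]
g = 4.0                              # stands for Psi_n(|u|)/u with u = omega.nu - omega_1

# Ordered index set of pairs (j, sigma), sigma = +1 or -1.
index = [(j, s) for j in range(len(c)) for s in (+1, -1)]

def amp(j, s):
    """c_j^sigma: c_j for sigma = +1, its complex conjugate for sigma = -1."""
    return c[j] if s > 0 else c[j].conjugate()

# A: diagonal propagator matrix, entry sigma * g on the (j, sigma) slot.
A = [[s * g if (j, s) == (jp, sp) else 0.0 for (jp, sp) in index]
     for (j, s) in index]
A_inv = [[1.0 / (s * g) if (j, s) == (jp, sp) else 0.0 for (jp, sp) in index]
         for (j, s) in index]
# B: localised self-energy matrix with the assumed block structure.
B = [[M[j][jp] * amp(j, -s) * amp(jp, sp) for (jp, sp) in index]
     for (j, s) in index]

BAB = matmul(matmul(B, A), B)
dressed = matadd(A, matmul(matmul(A, B), A))      # candidate A + ABA
product = matmul(matsub(A_inv, B), dressed)       # should be the identity
size = len(index)
identity_defect = max(abs(product[i][k] - (1.0 if i == k else 0.0))
                      for i in range(size) for k in range(size))

print("max |BAB| =", max_abs(BAB))
print("max |(A^-1 - B)(A + ABA) - 1| =", identity_defect)
```

The cancellation comes entirely from the sum over $\sigma = \pm$ of the sign carried by the propagator, so it holds for any choice of `c`, `M` and `g`; both printed quantities should be zero up to rounding.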

References

1. Bartuccelli, M.V., Gentile, G.: Lindstedt series for perturbations of isochronous systems: a review of the general theory. Rev. Math. Phys. 14(2), 121–171 (2002)
2. Berretti, A., Gentile, G.: Bryuno function and the standard map. Commun. Math. Phys. 220(3), 623–656 (2001)
3. Bollobás, B.: Graph theory. An introductory course. Graduate Texts in Mathematics 63, New York-Berlin: Springer-Verlag, 1979
4. Bricmont, J., Gawędzki, K., Kupiainen, A.: KAM theorem and quantum field theory. Commun. Math. Phys. 201(3), 699–727 (1999)
5. Bryuno, A.D.: Analytic form of differential equations. I, II. Trudy Moskov. Mat. Obšč. 25, 119–262 (1971); ibid. 26, 199–239 (1972). English translations: Trans. Moscow Math. Soc. 25, 131–288 (1971); ibid. 26, 199–239 (1972)
6. de la Llave, R., González, A., Jorba, À., Villanueva, J.: KAM theory without action-angle variables. Nonlinearity 18(2), 855–895 (2005)
7. De Simone, E., Kupiainen, A.: The KAM theorem and renormalization group. Erg. Th. Dynam. Syst. 29(2), 419–431 (2009)
8. Eliasson, L.H.: Absolutely convergent series expansions for quasi periodic motions. Math. Phys. Electron. J. 2, Paper 4, 33 pp. (electronic) (1996)
9. Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164(1), 145–156 (1994)
10. Gallavotti, G., Bonetto, F., Gentile, G.: Aspects of ergodic, qualitative and statistical theory of motion. Texts and Monographs in Physics, Berlin: Springer-Verlag, 2004
11. Gentile, G.: Resummation of perturbation series and reducibility for Bryuno skew-product flows. J. Stat. Phys. 125(2), 321–361 (2006)
12. Gentile, G.: Degenerate lower-dimensional tori under the Bryuno condition. Erg. Th. Dynam. Syst. 27(2), 427–457 (2007)
13. Gentile, G.: Diagrammatic methods in classical perturbation theory. Encyclopedia of Complexity and System Science, Vol. 2, Ed. R.A. Meyers, Berlin: Springer, 2009, pp. 1932–1948
14. Gentile, G.: Quasi-periodic motions in strongly dissipative forced systems. Erg. Th. Dynam. Syst. 30(5), 1457–1469 (2010)
15. Gentile, G.: Quasi-periodic motions in dynamical systems. Review of a renormalisation group approach. J. Math. Phys. 51, no. 1, 015207, 34 pp. (2010)
16. Gentile, G., Bartuccelli, M., Deane, J.: Summation of divergent series and Borel summability for strongly dissipative equations with periodic or quasi-periodic forcing terms. J. Math. Phys. 46, no. 6, 062704, 21 pp. (2005)
17. Gentile, G., Mastropietro, V.: Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A Review with Some Applications. Rev. Math. Phys. 8(3), 393–444 (1996)
18. Harary, F.: Graph theory. Reading, MA-Menlo Park, CA-London: Addison-Wesley Publishing Co., 1969
19. Levi, M., Moser, J.: A Lagrangian proof of the invariant curve theorem for twist mappings. In: Smooth ergodic theory and its applications (Seattle, WA, 1999), Proc. Sympos. Pure Math. 69, Providence, RI: Amer. Math. Soc., 2001, pp. 733–746
20. Moser, J.: Convergent series expansions for quasi-periodic motions. Math. Ann. 169, 136–176 (1967)
21. Poincaré, H.: Les méthodes nouvelles de la mécanique céleste. Vol. I–III, Paris: Gauthier-Villars, 1892–1899
22. Salamon, D., Zehnder, E.: KAM theory in configuration space. Comment. Math. Helv. 64, 84–132 (1989)

Communicated by G. Gallavotti

Commun. Math. Phys. 302, 403–423 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1168-7

Communications in

Mathematical Physics

On the $C^*$-Algebra of a Locally Injective Surjection and its KMS States

Klaus Thomsen

Institut for Matematiske Fag, Ny Munkegade, 8000 Aarhus C, Denmark. E-mail: [email protected]

Received: 10 March 2010 / Accepted: 2 August 2010
Published online: 20 November 2010 – © Springer-Verlag 2010

Abstract: It is shown that a locally injective surjection on a compact metric space is a factor of a local homeomorphism in such a way that the associated $C^*$-algebras are isomorphic. This is subsequently used to obtain upper and lower bounds for the possible $\beta$-values of KMS states for generalized gauge actions on the $C^*$-algebra.

1. Introduction

In [Th] the construction of a $C^*$-algebra from an étale groupoid, as introduced by J. Renault in [Re1], was generalized to a larger class of locally compact groupoids called semi-étale groupoids, where the range and source maps are locally injective, but not necessarily open. The main purpose of the generalization was to make the powerful techniques for étale groupoids available to the study of dynamical systems via the groupoid constructed in increasing generality by Renault, Deaconu and Anantharaman-Delaroche, [Re1,D,A], also when the underlying map is not open. In particular, as shown in [Th], this makes it possible to handle general (one-sided) subshifts.

One of the intriguing connections between dynamical systems and $C^*$-algebras is the relation between the thermodynamical formalism of Ruelle, as described in [Ru], and quantum statistical mechanics, as described in [BR]. One relation between these formalisms is very concrete and direct and manifests itself in almost all of the $C^*$-algebraic settings of quantum statistical mechanics through a bijective correspondence between KMS states and measures fixed by a dual Ruelle operator. This relation is implicit in the work of J. Renault, [Re1,Re2], and has been developed further by R. Exel, [E]. By using this correspondence Kumjian and Renault, [KR], were able to use Walters' results, [W2], on the convergence of the Ruelle operator to extend most results on the existence and uniqueness of KMS states for the generalized gauge actions on Cuntz-Krieger algebras, which have been among the favourite models in quantum statistical mechanics.

The main purpose of the present work is to show that there is a canonical way to pass from a locally injective continuous surjection to a local homeomorphism in such a way that the $C^*$-algebras of the corresponding groupoids, one of them defined as in [Th], are isomorphic. The construction is a generalization of W. Krieger's construction of a canonical extension for a sofic shift, [Kr1,Kr2], now known as the left Krieger cover. The canonical local homeomorphic extension of a general locally injective surjection which we construct is undoubtedly useful for other purposes, and it seems to deserve a more thorough investigation. Here we use it to investigate the KMS states of the generalized gauge actions. In fact, we restrict our considerations even further by focusing only on the possible values of the inverse temperature $\beta$ for such KMS states. The results we obtain give bounds on the possible $\beta$-values and ensure the existence of KMS states under mild conditions on the potential function. We depart from the work of Exel in [E], and the main tool to prove existence of KMS states is a method developed by Matsumoto, Watatani and Yoshida in [MWY] and Pinzari, Watatani and Yonetani in [PWY].

Concerning bounds on the possible $\beta$-values of KMS states, the main novelty is the observation that it is not so much the entropy of the map which provides the bounds but rather the exponential growth rate of the number of pre-images. The relevant entity is thus an invariant $h_m$ which was introduced by Hurley in [Hu] and studied further in [FFN], among others. For forward expansive maps the invariant of Hurley is equal to the topological entropy, but in general it is smaller than the topological entropy. The invariant of Hurley controls the existence of KMS states completely when the potential function is strictly positive or strictly negative: for such potential functions there is a KMS state if and only if $h_m$ is not zero. We refer to Sect. 6 for more details on our results on KMS states.

2. Recap about $C_r^*(\Gamma_\varphi)$

Let $X$ be a locally compact Hausdorff space and $\varphi : X \to X$ a continuous map.
We assume that $\varphi$ is locally injective, meaning that there is a basis for the topology of $X$ consisting of sets on which $\varphi$ is injective. Set

$$\Gamma_\varphi = \left\{ (x,k,y) \in X \times \mathbb{Z} \times X : \ \exists\, a, b \in \mathbb{N}, \ k = a - b, \ \varphi^a(x) = \varphi^b(y) \right\}.$$

This is a groupoid with the set of composable pairs being

$$\Gamma_\varphi^{(2)} = \left\{ \big( (x,k,y), (x',k',y') \big) \in \Gamma_\varphi \times \Gamma_\varphi : \ y = x' \right\}.$$

The multiplication and inversion are given by

$$(x,k,y)(y,k',y') = (x, k+k', y') \quad \text{and} \quad (x,k,y)^{-1} = (y,-k,x).$$

To turn $\Gamma_\varphi$ into a locally compact topological groupoid, fix $k \in \mathbb{Z}$. For each $n \in \mathbb{N}$ such that $n + k \ge 0$, set

$$\Gamma_\varphi(k,n) = \left\{ (x,l,y) \in X \times \mathbb{Z} \times X : \ l = k, \ \varphi^{k+i}(x) = \varphi^i(y), \ i \ge n \right\}.$$

This is a closed subset of the topological product $X \times \mathbb{Z} \times X$ and hence a locally compact Hausdorff space in the relative topology. Since $\varphi$ is locally injective, $\Gamma_\varphi(k,n)$ is an open subset of $\Gamma_\varphi(k,n+1)$, and hence the union

$$\Gamma_\varphi(k) = \bigcup_{n \ge -k} \Gamma_\varphi(k,n)$$

is a locally compact Hausdorff space in the inductive limit topology. The disjoint union

$$\Gamma_\varphi = \bigsqcup_{k \in \mathbb{Z}} \Gamma_\varphi(k)$$

is then a locally compact Hausdorff space in the topology where each $\Gamma_\varphi(k)$ is an open and closed set. In fact, as is easily verified, $\Gamma_\varphi$ is a locally compact groupoid in the sense of [Re1]. Note that the unit space $\Gamma_\varphi^0$ of $\Gamma_\varphi$ equals $X$ via the identification $x \mapsto (x,0,x)$. The local injectivity of $\varphi$ ensures that the range map $r(x,k,y) = x$ is locally injective, i.e. $\Gamma_\varphi$ is semi-étale. We can therefore define the corresponding $C^*$-algebra $C_r^*(\Gamma_\varphi)$ as in [Th]. Briefly, $C_r^*(\Gamma_\varphi)$ is the completion of the $*$-algebra $\operatorname{alg}^* \Gamma_\varphi$ generated by the continuous and compactly supported functions on $\Gamma_\varphi$ under the convolution product

$$f g(x,k,y) = \sum_{z,\ m+n=k} f(x,n,z)\, g(z,m,y),$$

and the involution $f^*(x,k,y) = \overline{f(y,-k,x)}$. The elements of $\operatorname{alg}^* \Gamma_\varphi$ are all bounded and of compact support, but not necessarily continuous. The elements of $\operatorname{alg}^* \Gamma_\varphi$ whose supports are contained in the unit space, identified with $X$ as it is, generate under the completion an abelian $C^*$-algebra $D_\varphi$ which contains $C_0(X)$ and consists of bounded functions vanishing at infinity. The restriction map extends to a conditional expectation $P_\varphi : C_r^*(\Gamma_\varphi) \to D_\varphi$.

Let us now restrict the attention to the case where $X$ is compact and metrizable. One of the results from [Th] is that $C_r^*(\Gamma_\varphi)$ can then be realized as a crossed product $C_r^*(R_\varphi) \times_{\widehat{\varphi}} \mathbb{N}$ in the sense of Paschke, where $C_r^*(R_\varphi)$ is the $C^*$-subalgebra of $C_r^*(\Gamma_\varphi)$ generated by $C_c(\Gamma_\varphi(0))$ and $\widehat{\varphi}$ is the endomorphism of $C_r^*(\Gamma_\varphi)$ given by conjugation with the isometry $V_\varphi$, where

$$V_\varphi(x,k,y) = \begin{cases} m(x)^{-\frac{1}{2}} & \text{when } k = 1 \text{ and } y = \varphi(x), \\ 0 & \text{otherwise.} \end{cases}$$

The function $m : X \to \mathbb{N}$ which enters here is also going to play an important role in the present paper and it is equal to $m = N \circ \varphi$, with $N(x) = \#\varphi^{-1}(x)$. While this crossed product description is useful for several purposes, including the calculation of the $K$-theory groups of $C_r^*(\Gamma_\varphi)$, it is going to be instrumental here to relate to a crossed product description in the sense of Exel, [E].

3. $C_r^*(\Gamma_\varphi)$ as a Crossed Product in the Sense of Exel

Let $f \in D_\varphi$. Then $P_\varphi\big( V_\varphi f V_\varphi^* \big)(x) = m(x)^{-1} f(\varphi(x))$. Since $m \in D_\varphi$ this shows that $f \circ \varphi \in D_\varphi$. We can therefore define a $*$-endomorphism $\alpha_\varphi$ of $D_\varphi$ such that

$$\alpha_\varphi(f) = f \circ \varphi. \tag{3.1}$$

Note that $\alpha_\varphi$ is unital, and injective since $\varphi$ is surjective. Let $f \in D_\varphi$, and let $1_{\Gamma_\varphi(1,0)}$ be the characteristic function of the open and compact subset $\Gamma_\varphi(1,0)$ of $\Gamma_\varphi$. Then $1^*_{\Gamma_\varphi(1,0)}\, f\, 1_{\Gamma_\varphi(1,0)} \in D_\varphi$ and

$$1^*_{\Gamma_\varphi(1,0)}\, f\, 1_{\Gamma_\varphi(1,0)}(x) = \sum_{z \in \varphi^{-1}(x)} f(z). \tag{3.2}$$

Hence the function $X \ni x \mapsto \sum_{z \in \varphi^{-1}(x)} f(z)$ is in $D_\varphi$. In particular, the function

$$N(x) = \#\varphi^{-1}(x) = \sum_{z \in \varphi^{-1}(x)} 1$$

is in $D_\varphi$. This allows us to define $L_\varphi : D_\varphi \to D_\varphi$ such that

$$L_\varphi(f)(x) = N(x)^{-1} \sum_{z \in \varphi^{-1}(x)} f(z).$$

$L_\varphi$ is a unital positive linear map and $L_\varphi\big( f \alpha_\varphi(g) \big) = L_\varphi(f)\, g$ for all $f, g \in D_\varphi$. Hence $L_\varphi$ is a transfer operator in the sense of Exel, cf. [E] and [EV], so that the crossed product $D_\varphi \rtimes_{\alpha_\varphi, L_\varphi} \mathbb{N}$ is defined. Observe that $L_\varphi$ is faithful and that the Standing Hypotheses of [EV], Hypotheses 3.1, are all satisfied. The following result generalizes Theorem 9.2 in [EV], and to some extent also Theorem 4.18 of [Th].

Theorem 3.1. There is a $*$-isomorphism $D_\varphi \rtimes_{\alpha_\varphi, L_\varphi} \mathbb{N} \to C_r^*(\Gamma_\varphi)$ which is the identity on $D_\varphi$ and takes the isometry $S$ of Exel (cf. [E]) to the isometry $V_\varphi \in C_r^*(\Gamma_\varphi)$.

Proof. Since $\varphi$ is locally injective there is a partition of unity $\{b_i\}_{i=1}^k$ in $C(X) \subseteq D_\varphi$ such that $\varphi$ is injective on $\operatorname{supp} b_i$ for each $i$. It is then straightforward to check that

$$f = \sum_{i=1}^{k} (b_i m)^{\frac{1}{2}}\; \alpha_\varphi \circ L_\varphi \big( (b_i m)^{\frac{1}{2}} f \big)$$

for all $f \in D_\varphi$, so that $\big\{ (b_i m)^{\frac{1}{2}} \big\}_{i=1}^{k}$ is a quasi-basis for the conditional expectation $\alpha_\varphi \circ L_\varphi$ of $D_\varphi$ onto $\alpha_\varphi(D_\varphi)$ in the sense of [EV]. It is also straightforward to check that $V_\varphi f = \alpha_\varphi(f) V_\varphi$ and $V_\varphi^* f V_\varphi = L_\varphi(f)$ for all $f \in D_\varphi$. Furthermore,

$$1 = \sum_{i=1}^{k} (b_i m)^{\frac{1}{2}}\, V_\varphi V_\varphi^*\, (b_i m)^{\frac{1}{2}}.$$

It follows therefore from Corollary 7.2 of [EV] that there is a $*$-homomorphism $\rho : D_\varphi \rtimes_{\alpha_\varphi, L_\varphi} \mathbb{N} \to C_r^*(\Gamma_\varphi)$ which is the identity on $D_\varphi$ and takes the isometry $S$ to the isometry $V_\varphi \in C_r^*(\Gamma_\varphi)$. To see that $\rho$ is surjective we must show that $C_r^*(\Gamma_\varphi)$ is generated by $D_\varphi$ and $V_\varphi$. From the expression for $V_\varphi^n V_\varphi^{*n}$ given in the proof of Theorem 4.8 of [Th], combined with Corollary 4.5 from [Th], it follows that the $C^*$-algebra generated by $V_\varphi$ and $D_\varphi$ contains the characteristic function $1_{R(\varphi^n)}$ for each $n$. It follows then that it contains

$$C(X)\, 1_{R(\varphi^n)}\, C(X) \tag{3.3}$$

since $C(X) \subseteq D_\varphi$. Among the functions in (3.3) are the elements of $C(R(\varphi^n))$ which are restrictions to $R(\varphi^n)$ of product type functions, $X \times X \ni (x,y) \mapsto f(x) g(y)$, with $f, g \in C(X)$. These functions generate $C(X \times X)$ and their restrictions generate $C(R(\varphi^n))$, so it follows that the $C^*$-algebra generated by $V_\varphi$ and $D_\varphi$ contains $C(R(\varphi^n))$ for each $n$. Since

$$C_r^*(R_\varphi) = \overline{\bigcup_n C(R(\varphi^n))},$$

we conclude from Theorem 4.6 of [Th] that it coincides with $C_r^*(\Gamma_\varphi)$, proving that $\rho$ is surjective. Finally, it follows from Theorem 4.2 of [EV] that $\rho$ is injective since the gauge action on $C_r^*(\Gamma_\varphi)$ can serve as the required $\mathbb{T}$-action.
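The two identities that drive Section 3, the transfer-operator relation $L_\varphi(f \alpha_\varphi(g)) = L_\varphi(f) g$ and the quasi-basis expansion used in the proof of Theorem 3.1, can be illustrated numerically. The sketch below is only a toy model under our own assumptions: it takes $\varphi$ to be the doubling map $x \mapsto 2x \bmod 1$ on the circle (a local homeomorphism with $N \equiv 2$, so $m \equiv 2$ and functions on the circle stand in for $D_\varphi$), uses five hypothetical tent functions as the partition of unity $\{b_i\}$, and checks the identities only at finitely many sample points. None of these choices comes from the paper.

```python
import math

def phi(x):
    """Doubling map on the circle R/Z, represented by [0, 1)."""
    return (2.0 * x) % 1.0

def alpha(g):
    """Endomorphism alpha_phi(g) = g o phi."""
    return lambda x: g(phi(x))

def transfer(f):
    """Transfer operator: average of f over the two preimages of y."""
    return lambda y: 0.5 * (f(y / 2.0) + f((y + 1.0) / 2.0))

def circle_dist(x, c):
    """Distance between x and c on the circle of length 1."""
    return abs((x - c + 0.5) % 1.0 - 0.5)

def tent(i, n=5):
    """Tent function centred at i/n; the n tents sum to 1 and each has
    support of length 2/n < 1/2, on which the doubling map is injective."""
    centre = i / n
    return lambda x: max(0.0, 1.0 - n * circle_dist(x, centre))

def f(x):
    return 1.0 + math.cos(2 * math.pi * x) + 0.3 * math.sin(4 * math.pi * x)

def g(x):
    return math.sin(2 * math.pi * x) - 0.5 * math.cos(6 * math.pi * x)

M_CONST = 2.0                       # m = N o phi, identically 2 here
samples = [k / 97.0 for k in range(97)]

def transfer_defect():
    """Largest violation of L(f * alpha(g)) = L(f) * g on the samples."""
    lhs = transfer(lambda x: f(x) * alpha(g)(x))
    lf = transfer(f)
    return max(abs(lhs(y) - lf(y) * g(y)) for y in samples)

def quasi_basis_defect(n=5):
    """Largest violation of f = sum_i u_i * (alpha o L)(u_i * f),
    with u_i = (b_i m)^(1/2), on the samples."""
    def reconstructed(x):
        total = 0.0
        for i in range(n):
            b = tent(i, n)
            u = lambda t, b=b: math.sqrt(b(t) * M_CONST)
            h = lambda t, u=u: u(t) * f(t)
            total += u(x) * alpha(transfer(h))(x)
        return total
    return max(abs(reconstructed(x) - f(x)) for x in samples)

print("transfer identity defect:", transfer_defect())
print("quasi-basis defect:", quasi_basis_defect())
```

The quasi-basis identity works here for the reason given in the proof: for each $i$, at most one of the two points $x$ and $x + \tfrac12$ lies in the support of $b_i$, so the cross term drops out and the sum collapses to $\sum_i b_i(x) f(x) = f(x)$.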

4. A Canonical Local Homeomorphism Extending (X, ϕ) In this section we show that the continuous map ψ from the Gelfand spectrum of Dϕ to itself which corresponds to the endomorphism (3.1) of Dϕ is a local homeomorphism and that the corresponding dynamical system is a canonical extension of (X, ϕ). The proof is based on the well-known contravariant equivalence between compact Hausdorff spaces and unital abelian C ∗ -algebras.

To simplify notation, set $D = D_\varphi$ and let $\widehat{D}$ be the Gelfand spectrum of $D_\varphi$. Recall that $\widehat{D}$ consists of the unital $*$-homomorphisms $c : D \to \mathbb{C}$, also known as the characters of $D$. $\widehat{D}$ is closed in the weak*-topology of the unit ball in the dual space $D^*$ of $D$ and obtains in this way a compact topology. Since $X$ is compact and metrizable it follows that $D$ is separable and it follows that also $\widehat{D}$ is metrizable. Finally, recall that every element $d \in D$ becomes a continuous function on $\widehat{D}$ in the natural way; viz. $d(c) = c(d)$, and this recipe gives rise to an (isometric) $*$-isomorphism between $D$ and $C(\widehat{D})$ which we suppress in the notation by simply identifying $D$ and $C(\widehat{D})$ whenever it is convenient.

There is a map $\pi : \widehat{D} \to X$ arising from the fact that every character of $C(X)$ comes from evaluation at a point in $X$: Given a character $c \in \widehat{D}$ of $D$ there is a unique point $\pi(c) \in X$ such that $c(f) = f(\pi(c))$ for all $f \in C(X)$. Note that $\pi$ is continuous. We define $\psi : \widehat{D} \to \widehat{D}$ such that $\psi(c)(g) = c(g \circ \varphi)$ for all $g \in D$. It follows straightforwardly from the definition of the topology of $\widehat{D}$ that $\psi$ is continuous. Hence $(\widehat{D}, \psi)$ is a dynamical system. Note that

$$f(\varphi \circ \pi(c)) = f \circ \varphi(\pi(c)) = c(f \circ \varphi) = \psi(c)(f) = f(\pi \circ \psi(c))$$

for all $f \in C(X)$, proving that $\pi : (\widehat{D}, \psi) \to (X, \varphi)$ is equivariant. Define $\iota : X \to \widehat{D}$ by $\iota(x) = c_x \in \widehat{D}$, where $c_x$ is the character defined such that $c_x(g) = g(x)$ for all $g \in D$. Since $g(\psi \circ \iota(x)) = c_x(g \circ \varphi) = g(\varphi(x)) = c_{\varphi(x)}(g)$ we see that also $\iota : (X, \varphi) \to (\widehat{D}, \psi)$ is equivariant. Furthermore $\pi \circ \iota(x) = x$ for all $x \in X$, proving that $\iota$ is injective and $\pi$ surjective. Note, however, that $\iota$ is generally not continuous.

Since $g \in D$, $c_x(g) = 0\ \forall x \in X \Rightarrow g = 0$, the range $\iota(X)$ of $\iota$ is dense in $\widehat{D}$.


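It is evident that the construction of $(\widehat{D}, \psi)$ is canonical in the following sense: If $\varphi' : X' \to X'$ is another locally injective surjection of a compact Hausdorff space $X'$, then a conjugacy from $(X, \varphi)$ to $(X', \varphi')$ induces a conjugacy from $(\widehat{D}, \psi)$ to $(\widehat{D'}, \psi')$ which extends the given conjugacy in the sense that the diagram

$$\begin{array}{ccc} \widehat{D} & \longrightarrow & \widehat{D'} \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi'} \\ X & \longrightarrow & X' \end{array}$$

commutes. It remains now only to establish the following

Proposition 4.1. $\psi$ is a surjective local homeomorphism.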

Proof. $\psi$ is locally injective: Let $c \in \widehat{D}$ and set $z = \pi(\psi(c)) = \varphi(\pi(c))$. By Lemma 3.6 of [Th] there is an open neighborhood $U$ of $z$ and open sets $V_i$, $i = 1, 2, \dots, j$, where $j = \#\varphi^{-1}(z)$, such that 1) $\varphi^{-1}(U) \subseteq V_1 \cup V_2 \cup \dots \cup V_j$, 2) $V_i \cap V_{i'} = \emptyset$ when $i \neq i'$, and 3) $\varphi$ is injective on $V_i$ for each $i$. Without loss of generality we may assume that $\pi(c) \in V_1$. Let $h, H \in C(X)$ be such that $0 \leq h \leq 1$, $h(\pi(c)) = 1$, $\varphi(\operatorname{supp} h) \subseteq U$, $Hh = h$ and $\operatorname{supp} H \subseteq V_1$. Set

$$W = \{c' \in \widehat{D} : c'(h) > 0\};$$

clearly an open subset of $\widehat{D}$. To show that $c \in W$ we choose a sequence $\{z_k\}$ in $X$ such that $\lim_k \iota(z_k) = c$. Then $\pi(c) = \lim_k \pi \circ \iota(z_k) = \lim_k z_k$ so that

$$c(h) = \lim_k \iota(z_k)(h) = \lim_k h(z_k) = h(\pi(c)) = 1.$$

$W$ is therefore an open neighborhood of $c$ in $\widehat{D}$. To show that $\psi$ is injective on $W$, let $c', c'' \in W$ and choose sequences $\{z'_k\}$ and $\{z''_k\}$ in $X$ such that $\lim_k \iota(z'_k) = c'$ and $\lim_k \iota(z''_k) = c''$. Since

$$\lim_k h(z'_k) = \lim_k \iota(z'_k)(h) = c'(h) > 0,$$

it follows that $h(z'_k) > 0$ for all large $k$. Hence $\varphi(z'_k) \in U$, $H(z'_k) = 1$ and $z'_k \in V_1$ for all large $k$. It follows that

$$\psi(c')\Big(\sum_{v \in \varphi^{-1}(\cdot)} fH(v)\Big) = \lim_k \iota \circ \varphi(z'_k)\Big(\sum_{v \in \varphi^{-1}(\cdot)} fH(v)\Big) = \lim_k \sum_{v \in \varphi^{-1}(\varphi(z'_k))} fH(v) = \lim_k f(z'_k) = c'(f)$$

for all $f \in D$. Similarly,

$$\psi(c'')\Big(\sum_{v \in \varphi^{-1}(\cdot)} fH(v)\Big) = c''(f)$$

for all $f \in D$. It follows that $\psi(c') = \psi(c'') \Rightarrow c' = c''$, proving that $\psi$ is injective on $W$.


$\psi$ is open: Let $f \in D$ be a non-negative function and set

$$V = \{c \in \widehat{D} : c(f) > 0\}.$$

It suffices to show that $\psi(V)$ is open in $\widehat{D}$, so we consider an element $c \in V$, and set

$$W = \Big\{c' \in \widehat{D} : c'\Big(\sum_{v \in \varphi^{-1}(\cdot)} f(v)\Big) > \frac{c(f)}{2}\Big\}.$$

Let $\{z_k\}$ be a sequence in $X$ such that $\lim_k \iota(z_k) = c$ and note that

$$\psi(c)\Big(\sum_{v \in \varphi^{-1}(\cdot)} f(v)\Big) = \lim_k \iota(\varphi(z_k))\Big(\sum_{v \in \varphi^{-1}(\cdot)} f(v)\Big) = \lim_k \sum_{v \in \varphi^{-1}(\varphi(z_k))} f(v) \geq \lim_k f(z_k) = \lim_k \iota(z_k)(f) = c(f) > \frac{c(f)}{2}.$$

It follows that $W$ is an open neighborhood of $\psi(c)$. It suffices therefore to show that $W \subseteq \psi(V)$. Let $c' \in W$ and choose a sequence $\{z'_k\}$ in $X$ such that $\lim_{k \to \infty} \iota(z'_k) = c'$ in $\widehat{D}$. For all large $k$,

$$\sum_{v \in \varphi^{-1}(z'_k)} f(v) = \iota(z'_k)\Big(\sum_{v \in \varphi^{-1}(\cdot)} f\Big) > \frac{c(f)}{2},$$

so for all large $k$ there are elements $v_k \in \varphi^{-1}(z'_k)$ such that $f(v_k) \geq \frac{c(f)}{2M}$, where $M = \max_{x \in X} \#\varphi^{-1}(x)$. Let $c''$ be a condensation point in $\widehat{D}$ of the sequence $\{\iota(v_k)\}$. For the corresponding subsequence $\{v_{k_i}\}$ we find that $\psi(c'') = \lim_i \iota(\varphi(v_{k_i})) = \lim_i \iota(z'_{k_i}) = c'$. Since

$$c''(f) = \lim_i f(v_{k_i}) \geq \frac{c(f)}{2M} > 0,$$

it follows that $c'' \in V$, proving that $W \subseteq \psi(V)$.

$\psi$ is surjective: If $\psi(\widehat{D}) \neq \widehat{D}$, there is an element $f \in D$ such that $f \neq 0$ and $f \geq 0$, while $\psi(c)(f) = 0$ for all $c \in \widehat{D}$. Since $\psi(c)(f) = c(f \circ \varphi)$ it follows that $f \circ \varphi = 0$. This is impossible since $f \neq 0$ and $\varphi$ is surjective.

The dynamical system $(\widehat{D}, \psi)$ will be called the canonical local homeomorphic extension of $(X, \varphi)$. It can be shown that $(\widehat{D}, \psi)$ is the left Krieger cover of $(X, \varphi)$ when $(X, \varphi)$ is a one-sided sofic shift.

5. Isomorphism of the $C^*$-Algebras $C_r^*(\Gamma_\varphi)$ and $C_r^*(\Gamma_\psi)$

Since $\psi$ is a local homeomorphism the $C^*$-algebras $C_r^*(R_\psi)$ and $C_r^*(\Gamma_\psi)$ coincide with the one considered in [A]. In particular, the abelian $C^*$-algebra $D_\psi$ is equal to $C(\widehat{D}) = D_\varphi$. In this section we show that this identification, $D_\varphi = D_\psi$, is the restriction of an isomorphism between $C_r^*(\Gamma_\varphi)$ and $C_r^*(\Gamma_\psi)$. As above we let $N \in D$ be the function $N(x) = \#\varphi^{-1}(x)$, and set $m = N \circ \varphi$.


Lemma 5.1. $c(N) = \#\psi^{-1}(c)$ for all $c \in \widehat{D}$.

Proof. For any $f \in D$, let $I(f)$ denote the function

$$I(f)(x) = \sum_{v \in \varphi^{-1}(x)} f(v).$$

It follows from (3.2) that $I(f) \in D$. Let $c \in \widehat{D}$ and let $\{z_k\}$ be a sequence in $X$ such that $\lim_k \iota(z_k) = c$. Set $z = \pi(c)$, and let $U, V_1, V_2, \dots, V_j$ be as in Lemma 3.6 of [Th], i.e. 1)-3) from the proof of Proposition 4.1 hold. Since $\lim_k N(z_k) = c(N)$ we can assume that $N(z_k) = c(N)$ for all $k$, and since $\lim_k z_k = \lim_k \pi \circ \iota(z_k) = z$ in $X$ we can assume that $z_k \in U$ for all $k$. Choose functions $h_i, H_i \in C_c(X)$, $i = 1, 2, \dots, j$, such that $0 \leq h_i \leq 1$, $h_i(w_i) = 1$, where $\{w_i\} = V_i \cap \varphi^{-1}(z)$, $\varphi(\operatorname{supp} h_i) \subseteq U$, $H_i h_i = h_i$ and $\operatorname{supp} H_i \subseteq V_i$ for all $i$. Observe that $c(N) \leq j$ and set

$$g_F = \prod_{i \in F} I(h_i) \in D$$

for every subset $F \subseteq \{1, 2, \dots, j\}$ with $c(N)$ elements. For all sufficiently large $k$ there is a subset $F \subseteq \{1, 2, \dots, j\}$ with $c(N)$ elements such that $g_F(z_k) \geq \frac{1}{2}$. Indeed, since $N(z_k) = c(N)$ there is for each $k$ a subset $F_k \subseteq \{1, 2, \dots, j\}$ with $c(N)$ elements and elements $v_k^i \in V_i$, $i \in F_k$, such that $\varphi^{-1}(z_k) = \{v_k^i : i \in F_k\}$. When $g_{F_k}(z_k) < \frac{1}{2}$ there must be at least one $i_k \in F_k$ for which

$$h_{i_k}\big(v_k^{i_k}\big) < \Big(\frac{1}{2}\Big)^{\frac{1}{c(N)}}.$$

Hence, if $g_{F_k}(z_k) < \frac{1}{2}$ for infinitely many $k$, a condensation point of the sequence $\{v_k^{i_k}\}$ would give us, for some $i \in \{1, 2, \dots, j\}$, a point in $V_i \cap \varphi^{-1}(z)$ other than $w_i$, contradicting property 3) of the $V_i$'s. Hence $g_{F_k}(z_k) \geq \frac{1}{2}$ for all sufficiently large $k$. Since there are only finitely many subsets of $\{1, 2, \dots, j\}$ we can pass to a subsequence of $\{z_k\}$ to arrange that the same subset $F$ works for all $k$, i.e. that

$$g_F(z_k) \geq \frac{1}{2} \tag{5.1}$$

for all $k$. Since $N(z_k) = c(N) = \#F$ this implies that $\varphi^{-1}(z_k) = \{v_k^i : i \in F\}$ for some (unique) elements $v_k^i \in V_i$, $i \in F$. For each $i$, let $c_i$ be a condensation point of $\{\iota(v_k^i)\}$ in $\widehat{D}$. Then $\psi(c_i) = \lim_k \psi(\iota(v_k^i)) = \lim_k \iota(z_k) = c$ for all $i$. Since $c_i(h_{i'}) = \lim_k h_{i'}(v_k^i) = 0$ if and only if $i \neq i'$ for $i, i' \in F$, we conclude that $c_i \neq c_{i'}$ when $i \neq i'$, proving that $\#\psi^{-1}(c) \geq c(N)$. As shown in the proof of Proposition 4.1, $\psi$ is injective on

$$W_i = \{c' \in \widehat{D} : c'(h_i) > 0\}.$$


To show that $\#\psi^{-1}(c) \leq N(c)$ it suffices therefore to show that every element $c'$ of $\psi^{-1}(c)$ is contained in $W_i$ for some $i \in F$. To this end we pick a sequence $\{y_k\}$ in $X$ such that $\lim_k \iota(y_k) = c'$ in $\widehat{D}$. Set $z'_k = \varphi(y_k)$ and note that $\lim_k \iota(z'_k) = \psi(c') = c$ while $\lim_k z'_k = \lim_k \pi \circ \psi \circ \iota(y_k) = \pi \circ \psi(c') = z$. In particular,

$$N(z'_k) = c(N) \tag{5.2}$$

and

$$z'_k \in U \tag{5.3}$$

for all sufficiently large $k$. Furthermore, by using (5.1) we find that

$$\lim_k g_F(z'_k) = c(g_F) = \lim_k g_F(z_k) \geq \frac{1}{2}. \tag{5.4}$$

By combining (5.2), (5.3) and (5.4) we find that

$$\varphi^{-1}(z'_k) \subseteq \bigcup_{i \in F} h_i^{-1}\Big(\Big(\frac{1}{4}, \infty\Big)\Big)$$

for all large $k$. Since $y_k \in \varphi^{-1}(z'_k)$, it follows that

$$y_k \in \bigcup_{i \in F} h_i^{-1}\Big(\Big(\frac{1}{4}, \infty\Big)\Big)$$

for all large $k$. Hence there is an $i \in F$ such that $y_k \in h_i^{-1}\big(\big(\frac{1}{4}, \infty\big)\big)$ for infinitely many $k$, which implies that

$$c'(h_i) = \lim_k h_i(y_k) \geq \frac{1}{4}.$$

Hence $c' \in W_i$.

Corollary 5.2. $\#\psi^{-1}(\psi(c)) = c(m)$ for all $c \in \widehat{D}$.

Proof. Using Lemma 5.1 for the first equality we find that $\#\psi^{-1}(\psi(c)) = \psi(c)(N) = c(N \circ \varphi) = c(m)$.

Lemma 5.3. $1^*_{\Gamma_\psi(1,0)}\, f\, 1_{\Gamma_\psi(1,0)}(c) = c\big(\sum_{z \in \varphi^{-1}(\cdot)} f(z)\big)$ for all $c \in \widehat{D}$ and all $f \in D$.

Proof. Since both sides are continuous in $c$ and $\iota(X)$ is dense in $\widehat{D}$ it suffices to establish the identity when $c = c_x$ for some $x \in X$. It follows from Proposition 4.1 that we can apply (3.2) with $\psi$ replacing $\varphi$ to conclude that

$$1^*_{\Gamma_\psi(1,0)}\, f\, 1_{\Gamma_\psi(1,0)}(c_x) = \sum_{c' \in \psi^{-1}(c_x)} c'(f).$$

In comparison we have that

$$c_x\Big(\sum_{z \in \varphi^{-1}(\cdot)} f(z)\Big) = \sum_{z \in \varphi^{-1}(x)} f(z).$$


So it remains only to show that

$$\psi^{-1}(c_x) = \{c_z : z \in \varphi^{-1}(x)\}. \tag{5.5}$$

In fact, since the two sets have the same number of elements by Lemma 5.1, it suffices to check that $\psi(c_z) = c_x$ when $z \in \varphi^{-1}(x)$. This is straightforward: $\psi(c_z)(f) = c_z(f \circ \varphi) = f(\varphi(z)) = f(x) = c_x(f)$ for all $f \in D$.

Note that (5.5) means that

$$\psi^{-1}(\iota(X)) = \iota(X). \tag{5.6}$$

We can now adopt the proof of Theorem 3.1 to get the following:

Theorem 5.4. There is a $*$-isomorphism $C_r^*(\Gamma_\varphi) \to C_r^*(\Gamma_\psi)$ which is the identity on $D_\varphi$ and takes the isometry $V_\varphi \in C_r^*(\Gamma_\varphi)$ to $V_\psi \in C_r^*(\Gamma_\psi)$.

Proof. We will appeal to Theorem 3.1 above and combine it with Corollary 7.2 of [EV] for the existence of a $*$-homomorphism $C_r^*(\Gamma_\varphi) \to C_r^*(\Gamma_\psi)$ with the stated properties. We need therefore to check that

1) $V_\psi f = (f \circ \varphi) V_\psi$,
2) $c(V_\psi^* f V_\psi) = c\big(N(\cdot)^{-1} \sum_{z \in \varphi^{-1}(\cdot)} f(z)\big)$, $c \in \widehat{D}$, and
3) $1 = \sum_{i=1}^{k} (b_i m)^{1/2}\, V_\psi V_\psi^*\, (b_i m)^{1/2}$,

where $f \in D$. To check 1) note first that

$$\{(c_x, 1, c_y) : \varphi(x) = y\}$$

is dense in $\Gamma_\psi(1, 0)$. This follows from the density of $\iota(X)$ in $\widehat{D}$, the openness of $\psi$ and (5.6). Since both sides of 1) are elements in $C_c(\Gamma_\psi(1, 0))$ it suffices therefore to check 1) on elements of the form $(c_x, 1, c_y)$ with $\varphi(x) = y$, where it is easy:

$$V_\psi f(c_x, 1, c_y) = V_\psi(c_x, 1, c_y) f(c_y) = m(x)^{-1/2} f(\varphi(x)) = (f \circ \varphi) V_\psi(c_x, 1, c_y).$$

The identity 2) is established in a similar way: Since both sides are continuous functions on $\widehat{D}$ it suffices to check it on elements from $\iota(X)$:

$$\begin{aligned}
c_x(V_\psi^* f V_\psi) &= \sum_{c' \in \psi^{-1}(c_x)} V_\psi^*(c_x, -1, c')\, c'(f)\, V_\psi(c', 1, c_x) \\
&= \sum_{c' \in \psi^{-1}(c_x)} \#\psi^{-1}(\psi(c'))^{-1}\, c'(f) \\
&= \sum_{y \in \varphi^{-1}(x)} N(x)^{-1} f(y) \qquad \text{(by Corollary 5.2 and (5.6))} \\
&= c_x\Big(N(\cdot)^{-1} \sum_{z \in \varphi^{-1}(\cdot)} f(z)\Big).
\end{aligned}$$


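To check 3) note that $\sum_{i=1}^{k} (b_i m)^{1/2}\, V_\psi V_\psi^*\, (b_i m)^{1/2} \in C_c(R(\psi))$. Since elements of the form $(c_x, c_y)$ with $(x, y) \in R(\varphi)$ are dense in $R(\psi)$ it suffices to show that for $(x, y) \in R(\varphi)$,

$$\sum_{i=1}^{k} (b_i m)^{1/2}\, V_\psi V_\psi^*\, (b_i m)^{1/2}(c_x, c_y) = \begin{cases} 0 & \text{when } x \neq y \\ 1 & \text{when } x = y. \end{cases}$$

So let $(x, y) \in R(\varphi)$. Then $\varphi(x) = \varphi(y)$ and we find that

$$\begin{aligned}
\sum_{i=1}^{k} (b_i m)^{1/2}\, V_\psi V_\psi^*\, (b_i m)^{1/2}(c_x, c_y) &= \sum_{i=1}^{k} b_i(x)^{1/2} m(x)^{1/2} m(x)^{-1/2} m(y)^{-1/2} b_i(y)^{1/2} m(y)^{1/2} \qquad \text{(using Corollary 5.2)} \\
&= \begin{cases} 0 & \text{when } x \neq y \\ 1 & \text{when } x = y \end{cases}
\end{aligned}$$

since $\varphi$ is injective on $\operatorname{supp} b_i$ and $\sum_{i=1}^{k} b_i = 1$. This establishes the existence of a $*$-homomorphism $\mu : C_r^*(\Gamma_\varphi) \to C_r^*(\Gamma_\psi)$ which is the identity on $D_\varphi$ and takes $V_\varphi$ to $V_\psi$. The injectivity of $\mu$ follows from the faithfulness of the conditional expectation $P_\varphi : C_r^*(\Gamma_\varphi) \to D_\varphi$ and the observation that $P_\psi \circ \mu = P_\varphi$. And, finally, the surjectivity of $\mu$ follows from the fact that $C_r^*(\Gamma_\psi)$ is generated by $V_\psi$ and $D_\psi = D_\varphi$.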

By Theorem 5.4 we can identify $C_r^*(\Gamma_\varphi)$ with $C_r^*(\Gamma_\psi)$ and we will do that freely in the following.

Remark 5.5. The isomorphism of Theorem 5.4 is clearly equivariant with respect to the gauge actions and it induces therefore an isomorphism between the corresponding fixed point algebras, $C_r^*(\Gamma_\varphi)^{\mathbb{T}}$ and $C_r^*(\Gamma_\psi)^{\mathbb{T}}$. Since $\psi$ is a local homeomorphism we have the equality $C_r^*(\Gamma_\psi)^{\mathbb{T}} = C_r^*(R_\psi)$. Since there are subshifts $\sigma$ for which $C_r^*(R_\sigma) \subsetneq C_r^*(\Gamma_\sigma)^{\mathbb{T}}$ it follows that in general the isomorphism in Theorem 5.4 does not take $C_r^*(R_\varphi)$ onto $C_r^*(R_\psi)$.

6. KMS States

Let $F : X \to \mathbb{R}$ be a real-valued function from $D$. Such a function defines a continuous action $\alpha^F : \mathbb{R} \to \operatorname{Aut} C_r^*(\Gamma_\varphi)$ such that $\alpha_t^F(d) = d$ when $d \in D_\varphi$ and $\alpha_t^F(V_\varphi) = e^{iFt} V_\varphi$, cf. [E]. The action $\alpha^F$ can also be defined from the one-cocycle on $\Gamma_\varphi$ defined by $F$ as in the last line on p. 2072 in [KR], but the definition above allows us to combine Theorem 3.1 with the work of Exel in [E] to establish the connection between the KMS states of $\alpha^F$ and the Borel probability measures on $\widehat{D}$ fixed by the dual of a Ruelle-type operator. Let $\beta \in \mathbb{R} \setminus \{0\}$. A state $\omega$ on $C_r^*(\Gamma_\psi)$ is a KMS state with inverse temperature $\beta$ for $\alpha^F$ (or just a $\beta$-KMS state for short) when

$$\omega(xy) = \omega\big(y\, \alpha_{i\beta}^F(x)\big) \tag{6.1}$$

for all $\alpha^F$-analytic elements $x, y$ of $C_r^*(\Gamma_\varphi)$.


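Let $\tau_\lambda$, $\lambda \in \mathbb{T}$, be the gauge action on $C_r^*(\Gamma_\psi)$ (so that $\tau_{e^{it}} = \alpha_t^F$ when $F$ is constant 1) and let $P_\psi : C_r^*(\Gamma_\psi) \to D$ be the conditional expectation. Let $S(D)$ denote the set of states on $D$. When $\chi \in S(D)$ the composition $\chi \circ P_\psi$ is a state on $C_r^*(\Gamma_\psi)$. Note that $\chi \circ P_\psi$ is gauge-invariant since $P_\psi \circ \tau_\lambda = P_\psi$ for all $\lambda \in \mathbb{T}$. Let $Q : C_r^*(\Gamma_\psi) \to C_r^*(R_\psi)$ be the conditional expectation

$$Q(x) = \int_{\mathbb{T}} \tau_\lambda(x)\, d\lambda.$$

Lemma 6.1. Let $\omega$ be a $\beta$-KMS state for $\alpha^F$. Then $\omega \circ Q$ is a gauge-invariant $\beta$-KMS state for $\alpha^F$.

Proof. Let $x, y \in C_r^*(\Gamma_\varphi)$ be analytic for $\alpha^F$. Since $\tau$ commutes with $\alpha^F$ we find that

$$\begin{aligned}
\omega \circ Q(xy) &= \int_{\mathbb{T}} \omega(\tau_\lambda(xy))\, d\lambda = \int_{\mathbb{T}} \omega(\tau_\lambda(x)\tau_\lambda(y))\, d\lambda = \int_{\mathbb{T}} \omega\big(\tau_\lambda(y)\, \alpha_{i\beta}^F(\tau_\lambda(x))\big)\, d\lambda \\
&= \int_{\mathbb{T}} \omega\big(\tau_\lambda(y)\, \tau_\lambda(\alpha_{i\beta}^F(x))\big)\, d\lambda = \omega \circ Q\big(y\, \alpha_{i\beta}^F(x)\big).
\end{aligned}$$

For any $\beta \in \mathbb{R}$, define $L_{-\beta F} : D \to D$ such that

$$L_{-\beta F}(g)(x) = \sum_{y \in \varphi^{-1}(x)} e^{-\beta F(y)} g(y).$$

Theorem 6.2. Let $\beta \in \mathbb{R} \setminus \{0\}$. The map $\chi \mapsto \chi \circ P_\psi$ is a bijection from the states $\chi \in S(D)$ which satisfy that

$$\chi \circ L_{-\beta F} = \chi \tag{6.2}$$

onto the gauge-invariant $\beta$-KMS states for $\alpha^F$.

Proof. Consider first the case $\beta > 0$. By Proposition 9.2 and Sect. 11 in [E] it suffices to show that any gauge-invariant $\beta$-KMS state $\omega$ of $\alpha^F$ factorizes through $P_\psi$, and this follows from Lemma 2.24 of [Th] in the following way. Since $\omega$ is gauge-invariant we have that $\omega = \omega \circ Q$. Let $\{d_j\}$ be a partition of unity in $D$. Since $\alpha_{i\beta}^F(\sqrt{d_j}) = \sqrt{d_j}$ it follows from the KMS condition (6.1) that $\sum_j \omega(\sqrt{d_j}\, x \sqrt{d_j}) = \omega(x)$ for all $x \in C_r^*(\Gamma_\varphi)$. In particular, $\omega(Q(x)) = \sum_j \omega(\sqrt{d_j}\, Q(x) \sqrt{d_j})$ and hence $\omega(P_\psi(Q(x))) = \omega(Q(x))$ by Lemma 2.24 of [Th] because $Q(x) \in C_r^*(R_\psi)$. Since $P_\psi \circ Q = P_\psi$ this shows that $\omega = \omega \circ P_\psi$ as desired. The case $\beta < 0$ follows from the preceding case by observing that $\omega$ is a $\beta$-KMS state for $\alpha^F$ if and only if $\omega$ is a $(-\beta)$-KMS state for $\alpha^{-F}$.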

It follows from [E] that every $\beta$-KMS state is gauge invariant when $F$ is strictly positive or strictly negative. This is not the case in general, but note that if there is a $\beta$-KMS state for $\alpha^F$ then there is also one which is gauge invariant by Lemma 6.1. We have deliberately omitted $\beta = 0$ as an admissible $\beta$-value for KMS-states because they correspond to trace states and they exist only in rather exceptional cases, e.g. when $\varphi$ has a fixed point $x_0$ for which $\varphi^{-1}(x_0) = \{x_0\}$.


6.1. Bounds on the possible $\beta$-values. Define $I_{\beta F} : D \to D$ such that

$$I_{\beta F}(g)(x) = \frac{e^{\beta F(x)}}{m(x)}\, g \circ \varphi(x).$$

Then $L_{-\beta F} \circ I_{\beta F}(g) = g$ for all $g \in D$, so if $\chi \in S(D)$ satisfies (6.2) we find that

$$\chi = \chi \circ I_{\beta F}. \tag{6.3}$$

Thus $1 \in \operatorname{Spectrum}(L^*_{-\beta F}) \cap \operatorname{Spectrum}(I^*_{\beta F})$ when there is a state $\chi \in S(D)$ for which (6.2) holds. Let $\rho(T)$ be the spectral radius of an operator $T$. Since $\operatorname{Spectrum}(L^*_{-\beta F}) \cap \operatorname{Spectrum}(I^*_{\beta F}) = \operatorname{Spectrum}(L_{-\beta F}) \cap \operatorname{Spectrum}(I_{\beta F})$, cf. [DS], we find that

$$1 \leq \rho(I_{\beta F}) \tag{6.4}$$

and

$$1 \leq \rho(L_{-\beta F}) \tag{6.5}$$

when (6.2) holds. To get the most out of these inequalities we consider a non-invertible invariant $h_m$ which has been introduced for general dynamical systems by M. Hurley in [Hu] and developed further in [FFN]. For a locally injective map like the map $\varphi$ we consider here, the invariant $h_m(\varphi)$ is simply given by the formula

$$h_m(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log\Big(\max_{x \in X} \#\varphi^{-n}(x)\Big), \tag{6.6}$$

cf. [FFN], or, alternatively, as

$$h_m(\varphi) = \sup_{x \in X} \limsup_{n} \frac{1}{n} \log \#\varphi^{-n}(x),$$

cf. Corollary 2.4 of [FFN]. For forward expansive maps, and hence in particular for one-sided subshifts, $h_m$ equals the topological entropy $h$, but in general we only have the inequality $h_m(\varphi) \leq h(\varphi)$. It can easily happen that $h_m(\varphi) < h(\varphi)$ even when $\varphi$ is a local homeomorphism. The next lemma shows that for a locally injective surjection, as the map $\varphi$ we consider, the invariant $h_m$ agrees with that of its canonical local homeomorphic extension.

Lemma 6.3. $h_m(\psi) = h_m(\varphi)$.

Proof. It follows from (5.6) that $\#\psi^{-k}(\iota(x)) = \#\varphi^{-k}(x)$ for all $x \in X$. Since $\#\psi^{-k}(c) = \sum_{c' \in \psi^{-k}(c)} 1$ depends continuously on $c \in \widehat{D}$ and $\iota(X)$ is dense in $\widehat{D}$, we conclude that $\max_{c \in \widehat{D}} \#\psi^{-k}(c) = \max_{x \in X} \#\varphi^{-k}(x)$. Hence $h_m(\psi) = h_m(\varphi)$, cf. (6.6).
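As an illustration of formula (6.6), the following sketch estimates $h_m$ for a one-sided subshift of finite type. It rests on assumptions that are not part of the text: the shift is taken to be given by a 0-1 transition matrix `transition` with no zero rows or columns, where `transition[i][j] == 1` means that symbol `i` may be followed by symbol `j`. Under that assumption the $n$-step preimages of a point $x$ correspond to the allowed words of length $n$ that may precede $x_0$, so their number is a column sum of the $n$-th matrix power. All function names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_preimage_count(transition, n):
    """Return max_x #phi^{-n}(x) for the one-sided shift defined by `transition`.

    Assumption: transition[i][j] == 1 means symbol i may be followed by j,
    so the number of n-step preimages of a point starting with symbol j is
    the j-th column sum of the n-th power of the matrix.
    """
    size = len(transition)
    power = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = mat_mul(power, transition)
    return max(sum(power[i][j] for i in range(size)) for j in range(size))

def h_m_estimate(transition, n):
    """Finite-n approximation (1/n) log max_x #phi^{-n}(x) of formula (6.6)."""
    return math.log(max_preimage_count(transition, n)) / n

full_two_shift = [[1, 1], [1, 1]]
golden_mean = [[1, 1], [1, 0]]
print(h_m_estimate(full_two_shift, 50))   # close to log 2
print(h_m_estimate(golden_mean, 200))     # approaches log((1 + sqrt(5)) / 2)
```

For one-sided subshifts the text notes that $h_m$ equals the topological entropy, which is why the estimates approach the logarithm of the spectral radius of the matrix; exact integer arithmetic is used so that large powers do not lose precision.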


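In the following we let $M(X)$ denote the set of Borel probability measures on $X$ and $M_\varphi(X)$ the subset of $M(X)$ consisting of the $\varphi$-invariant elements of $M(X)$. Similarly, we let $M(\widehat{D})$ be the set of Borel probability measures on $\widehat{D}$ and $M_\psi(\widehat{D})$ the set of $\psi$-invariant elements in $M(\widehat{D})$.

Lemma 6.4. Let $\beta \in \mathbb{R}$ and assume that there is a state $\chi \in S(D)$ such that (6.2) holds. It follows that there are measures $\nu, \nu' \in M_\psi(\widehat{D})$ such that

$$\beta \int_{\widehat{D}} F\, d\nu \leq h_m(\varphi) \tag{6.7}$$

and

$$\int_{\widehat{D}} \log \#\psi^{-1}(c)\, d\nu'(c) \leq \beta \int_{\widehat{D}} F\, d\nu'. \tag{6.8}$$

Proof. Let $\delta > 0$. It follows from (6.5) that $\rho(L_{-\beta F}) \geq 1$ which implies that

$$-\delta \leq \frac{1}{k} \log \big\| L^k_{-\beta F}(1) \big\|_\infty = \frac{1}{k} \log\Big(\sup_{c \in \widehat{D}} \sum_{c' \in \psi^{-k}(c)} e^{-\beta \sum_{j=0}^{k-1} F(\psi^j(c'))}\Big)$$

for all large $k$. There is therefore, for each large $k$, a point $c_k \in \widehat{D}$ such that

$$-2\delta \leq \frac{1}{k} \log\Big(e^{-\beta \sum_{j=0}^{k-1} F(\psi^j(c_k))} \sup_{c \in \widehat{D}} \#\psi^{-k}(c)\Big).$$

Let $\nu$ be a weak* condensation point of the sequence

$$\frac{1}{k} \sum_{j=0}^{k-1} \delta_{\psi^j(c_k)}$$

in $M(\widehat{D})$. Then $\nu \in M_\psi(\widehat{D})$ by Theorem 6.9 of [W1] and

$$\frac{1}{k} \sum_{j=0}^{k-1} -\beta F(\psi^j(c_k)) \leq \int_{\widehat{D}} -\beta F\, d\nu + \delta$$

for infinitely many $k$. It follows that

$$-2\delta \leq \frac{1}{k} \log \sup_{c \in \widehat{D}} \#\psi^{-k}(c) + \int_{\widehat{D}} -\beta F\, d\nu + \delta$$

for infinitely many $k$, and we conclude therefore that $0 \leq h_m(\psi) + \int_{\widehat{D}} -\beta F\, d\nu$. Since $h_m(\psi) = h_m(\varphi)$ by Lemma 6.3 we get (6.7). Similarly it follows from (6.4) that

$$1 \leq \lim_{k \to \infty} \sup_{c \in \widehat{D}} \left(\frac{e^{\beta \sum_{j=0}^{k-1} F(\psi^j(c))}}{\prod_{j=0}^{k-1} m(\psi^j(c))}\right)^{\frac{1}{k}},$$

which implies that

$$-\delta \leq \frac{1}{k} \log\Big(\sup_{c \in \widehat{D}} e^{\sum_{j=0}^{k-1} \left(\beta F(\psi^j(c)) - \log m(\psi^j(c))\right)}\Big)$$

for all large $k$. We can then work as before with $-\beta F$ replaced by $\beta F - \log m$ to produce the measure $\nu' \in M_\psi(\widehat{D})$ such that $-2\delta \leq \int_{\widehat{D}} (\beta F - \log m)\, d\nu' + \delta$. We omit the repetition. Since $\nu'$ is $\psi$-invariant we have that $\int_{\widehat{D}} \log m\, d\nu' = \int_{\widehat{D}} \log \#\psi^{-1}(c)\, d\nu'(c)$. In this way we get (6.8).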

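When $H : X \to \mathbb{R}$ is a bounded real-valued function, set $A_H^\varphi(k) = \inf_{x \in X} \sum_{j=0}^{k-1} H(\varphi^j(x))$. Then $A_H^\varphi(k + n) \geq A_H^\varphi(k) + A_H^\varphi(n)$ for all $n, k$ and we can set

$$A_H^\varphi = \lim_{k \to \infty} \frac{A_H^\varphi(k)}{k} = \sup_n \frac{A_H^\varphi(n)}{n}.$$

Similarly, we set $B_H^\varphi(k) = \sup_{x \in X} \sum_{j=0}^{k-1} H(\varphi^j(x))$ and

$$B_H^\varphi = \lim_{k \to \infty} \frac{B_H^\varphi(k)}{k} = \inf_n \frac{B_H^\varphi(n)}{n}.$$

Proposition 6.5. When $\beta > 0$ is the inverse temperature of a KMS state for $\alpha^F$ we have that $A^\varphi_{\log m} \leq \beta B_F^\varphi$ and $\beta A_F^\varphi \leq h_m(\varphi)$. When $\beta < 0$ is the inverse temperature of a KMS state for $\alpha^F$ we have that $A^\varphi_{\log m} \leq \beta A_F^\varphi$ and $\beta B_F^\varphi \leq h_m(\varphi)$.

Proof. Let $\nu$ and $\nu'$ be the measures from Lemma 6.4. When $\beta > 0$ we find that

$$h_m(\varphi) \geq \beta \int_{\widehat{D}} F\, d\nu = \beta\, \frac{1}{n} \int_{\widehat{D}} \sum_{k=0}^{n-1} F \circ \psi^k\, d\nu \geq \beta\, \frac{A_F^\varphi(n)}{n}$$

and

$$\frac{A^\varphi_{\log m}(n)}{n} \leq \frac{1}{n} \int_{\widehat{D}} \sum_{k=0}^{n-1} \log m \circ \psi^k\, d\nu' \leq \beta\, \frac{1}{n} \int_{\widehat{D}} \sum_{k=0}^{n-1} F \circ \psi^k\, d\nu' \leq \beta\, \frac{B_F^\varphi(n)}{n}$$

for all $n$. The two first inequalities of Proposition 6.5 follow from this. The case $\beta < 0$ is handled similarly.

Corollary 6.6. Assume that $h_m(\varphi) = 0$. There are no KMS states for $\alpha^F$ unless $A_F^\varphi \leq 0 \leq B_F^\varphi$.

Lemma 6.7. Assume that there is a $\beta$-KMS state for $\alpha^F$. It follows that there is a measure $\nu \in M_\psi(\widehat{D})$ such that

$$\beta \int_{\widehat{D}} F\, d\nu \geq \limsup_n \frac{1}{n} \log \inf_{c \in \widehat{D}} \#\psi^{-n}(c). \tag{6.9}$$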


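Proof. Let $\chi \in S(D)$ be a state such that $\chi \circ L_{-\beta F} = \chi$. Then

$$\chi\Big(\sum_{c' \in \psi^{-k}(\cdot)} e^{-\beta \sum_{j=0}^{k-1} F \circ \psi^j(c')}\Big) = 1 \tag{6.10}$$

for all $k$ and hence

$$\inf_{c \in \widehat{D}} \#\psi^{-k}(c)\; \chi\Big(\frac{1}{\#\psi^{-k}(\cdot)} \sum_{c' \in \psi^{-k}(\cdot)} e^{-\beta \sum_{j=0}^{k-1} F \circ \psi^j(c')}\Big) \leq 1 \tag{6.11}$$

for all $k \in \mathbb{N}$. Since $\log$ is concave we can apply Jensen's inequality to the state $\mu$ on $D$ defined by

$$\mu(g) = \chi\Big(\frac{1}{\#\psi^{-k}(\cdot)} \sum_{c' \in \psi^{-k}(\cdot)} g(c')\Big).$$

Then (6.11) gives the estimate

$$\log \inf_{c \in \widehat{D}} \#\psi^{-k}(c) - \beta \mu\Big(\sum_{j=0}^{k-1} F \circ \psi^j\Big) \leq 0 \tag{6.12}$$

for all $k$. We can therefore choose a condensation point $\nu \in M_\psi(\widehat{D})$ of the sequence $\mu_k$, $k = 1, 2, \dots$, where

$$\mu_k(g) = \mu\Big(\frac{1}{k} \sum_{j=0}^{k-1} g \circ \psi^j\Big),$$

such that (6.9) holds.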

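Theorem 6.8. Assume that $F$ is continuous and that there is a $\beta$-KMS state for $\alpha^F$. Set

$$m = \lim_{n \to \infty} \frac{1}{n} \log\Big(\min_{x \in X} \#\varphi^{-n}(x)\Big)$$

and

$$M = \lim_{n \to \infty} \frac{1}{n} \log\Big(\max_{x \in X} \#\varphi^{-n}(x)\Big).$$

There is then a $\varphi$-invariant Borel probability measure $\mu \in M_\varphi(X)$ such that

$$\beta \int_X F\, d\mu \in [m, M].$$

Proof. By Proposition 6.5 and Lemma 6.7 there are measures $\nu, \nu' \in M_\psi(\widehat{D})$ such that $\beta \int_{\widehat{D}} F\, d\nu \leq M$ and $m \leq \beta \int_{\widehat{D}} F\, d\nu'$. Since $F$ is continuous on $X$ by assumption we have that $F(c) = F(\pi(c))$ for all $c \in \widehat{D}$. It follows that with an appropriate convex combination

$$\mu = s\, \nu \circ \pi^{-1} + (1 - s)\, \nu' \circ \pi^{-1}$$

we have that $m \leq \beta \int_X F\, d\mu \leq M$.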


6.2. Existence of KMS states. While Proposition 6.5 and Theorem 6.8 give upper and lower bounds on the possible $\beta$-values of a KMS state for $\alpha^F$ they say nothing about existence. This is where the work of Matsumoto, Watatani and Yoshida, [MWY], and Pinzari, Watatani and Yonetani, [PWY], comes in.

Theorem 6.9 (cf. [PWY] and [MWY]). Let $B$ be a unital commutative $C^*$-algebra and $L : B \to B$ a positive linear operator with spectral radius $\rho(L)$. Then $\rho(L)$ is in the spectrum of $L$ and there is a state $\omega \in S(B)$ such that $\omega \circ L = \rho(L)\omega$.

Proof. We adopt arguments from [PWY] to show that $\rho(L)$ is in the spectrum of $L$ and then arguments from [MWY] to produce the state $\omega$. Recall that $\operatorname{Spectrum}(L) = \operatorname{Spectrum}(L^*)$, cf. [DS]. By definition of $\rho(L)$ there is an element $z \in \operatorname{Spectrum}(L^*)$ with $|z| = \rho(L)$. Let $\{z_n\}$ be a sequence of complex numbers such that $|z_n| > \rho(L)$ for all $n$ and $\lim_n z_n = z$. It follows then from the principle of uniform boundedness that there is an element $\mu \in B^*$ such that

$$\lim_{n \to \infty} \| R(z_n)\mu \| = \infty,$$

where $R(z) = (z - L^*)^{-1}$ is the resolvent. Since $B^*$ is spanned by the states we may assume that $\mu \in S(B)$. Since $|z_n| > \rho(L^*)$ the resolvent $R(z_n)$ is given by the norm convergent Neumann series

$$R(z_n) = \sum_{k=0}^{\infty} z_n^{-k-1} (L^*)^k.$$

Since $\mu$ is a state and $L$ a positive operator it follows that

$$|R(z_n)\mu| \leq \sum_{k=0}^{\infty} |z_n|^{-k-1} (L^*)^k \mu = R(|z_n|)\mu$$

in $B^*$ where $|R(z_n)\mu|$ is the total variation measure of $R(z_n)\mu$. Hence $\| R(z_n)\mu \| \leq \| R(|z_n|)\mu \|$, and we conclude that $\lim_{n \to \infty} \| R(|z_n|)\mu \| = \infty$, which implies that $\rho(L) = \lim_{n \to \infty} |z_n|$ is in $\operatorname{Spectrum}(L^*) = \operatorname{Spectrum}(L)$. Set

$$\mu_n = \frac{R(|z_n|)\mu}{\| R(|z_n|)\mu \|}.$$

A glance at the Neumann series shows that $\mu_n$ is a state since $L$ is positive. As

$$(\rho(L) - L^*)\mu_n = (\rho(L) - |z_n|)\mu_n + \| R(|z_n|)\mu \|^{-1} \mu$$

converges to 0 in norm, any weak* condensation point $\omega$ of $\{\mu_n\}$ will be a state such that $\omega \circ L = \rho(L)\omega$.
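The simplest instance of Theorem 6.9 may help fix ideas. The sketch below assumes, beyond what the text states, that $B$ is the algebra of functions on $n$ points, so that $L$ is an entrywise non-negative $n \times n$ matrix and a state is a probability row vector $w$; the condition $\omega \circ L = \rho(L)\omega$ then reads $wL = \rho(L)w$. It also assumes that the matrix is primitive, and it replaces the resolvent argument of the proof by plain power iteration, which is a different and more elementary technique. The function name is hypothetical.

```python
def perron_state(matrix, iterations=500):
    """Approximate rho and a probability row vector w with w @ matrix = rho * w.

    Assumption: `matrix` is a square, entrywise non-negative, primitive
    matrix, so plain power iteration converges. This is a numerical
    stand-in for the resolvent argument in the proof, not a copy of it.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    rho = 0.0
    for _ in range(iterations):
        # Apply the dual operator: (w L)_j = sum_i w_i * L_ij.
        image = [sum(w[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        rho = sum(image)                      # total mass, since w has mass 1
        w = [value / rho for value in image]  # renormalise to a state
    return rho, w

L = [[1.0, 1.0], [1.0, 0.0]]
rho, w = perron_state(L)
print(rho, w)
```

Renormalising to total mass 1 in every step mirrors the normalisation of $\mu_n$ in the proof, and it makes the eigenvalue estimate available as the mass of the image vector.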

Corollary 6.10. Let $\beta \in \mathbb{R} \setminus \{0\}$ satisfy that the spectral radius $\rho(L_{-\beta F})$ of $L_{-\beta F}$ is 1. It follows that there is a gauge invariant $\beta$-KMS state for $\alpha^F$.

Proof. Combine Theorem 6.9 with Theorem 6.2.


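Lemma 6.11. Let $A \subseteq \widehat{D}$ be a closed subset such that $\psi^{-1}(A) \subseteq A$. Assume that $A^\psi_{F|_A} > 0$. It follows that there are states $\omega, \nu, \nu' \in S(D)$ and a $\beta \in [0, \infty)$ such that

1) $\nu \circ \psi = \nu$, $\nu' \circ \psi = \nu'$,
2) $\omega(A) = \nu(A) = \nu'(A) = 1$,
3) $\beta\nu(F) \leq \lim_{n \to \infty} \frac{1}{n} \log \max_{c \in A} \#\psi^{-n}(c) \leq \beta\nu'(F)$, and
4) $\omega \circ L_{-\beta F} = \omega$.

Proof. Set $\delta = A^\psi_{F|_A} = \lim_n \inf_{c \in A} \frac{1}{n} \sum_{k=0}^{n-1} F(\psi^k(c))$. Since $\psi^{-1}(A) \subseteq A$ we can for any $t \in \mathbb{R}$ define a positive linear operator $L^A_{-tF} : C(A) \to C(A)$ such that

$$L^A_{-tF}(g)(c) = \sum_{c' \in \psi^{-1}(c)} e^{-tF(c')} g(c').$$

Then

$$L^A_{-tF} \circ r_A = r_A \circ L_{-tF}, \tag{6.13}$$

where $r_A : D \to C(A)$ is the restriction map. To estimate the spectral radius of $L^A_{-tF}$ we observe that when $t \geq 0$ we get the estimate

$$\sup_{c \in A} \big(L^A_{-tF}\big)^n(1)(c) = \sup_{c \in A} \sum_{c' \in \psi^{-n}(c)} e^{-t \sum_{k=0}^{n-1} F(\psi^k(c'))} \leq \sup_{c \in A} \sum_{c' \in \psi^{-n}(c)} e^{-n\frac{t\delta}{2}} \leq e^{-n\frac{t\delta}{2}} \sup_{c \in A} \#\psi^{-n}(c)$$

for infinitely many $n$. It follows that

$$\lim_{t \to \infty} \rho\big(L^A_{-tF}\big) = \lim_{t \to \infty} \lim_{n \to \infty} \Big(\sup_{c \in A} \big(L^A_{-tF}\big)^n(1)(c)\Big)^{\frac{1}{n}} = 0.$$

On the other hand

$$\rho\big(L^A_0\big) = \lim_{n \to \infty} \Big(\sup_{c \in A} \#\psi^{-n}(c)\Big)^{\frac{1}{n}} \geq 1.$$

Since

$$\Big| \rho\big(L^A_{-tF}\big) - \rho\big(L^A_{-t'F}\big) \Big| \leq |t - t'|\, \|F\|_\infty$$

for all $t, t' \in \mathbb{R}$, cf. Proposition 2.2 of [ABL], it follows that $[0, \infty) \ni t \mapsto \rho\big(L^A_{-tF}\big)$ is continuous. Hence the intermediate value theorem of calculus implies the existence of a $\beta \in [0, \infty)$ such that $\rho\big(L^A_{-\beta F}\big) = 1$. Then Theorem 6.9 implies the existence of a state $\omega' \in S(C(A))$ such that $\omega' \circ L^A_{-\beta F} = \omega'$. Set $\omega = \omega' \circ r_A$ and note that (6.13) implies that $\omega \circ L_{-\beta F} = \omega$. Since $\omega(f) = 0$ for all $f \in D$ with support in $\widehat{D} \setminus A$ it follows that $\omega(A) = 1$.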


To construct the $\psi$-invariant states $\nu$ and $\nu'$ let $\epsilon > 0$ and note that

$$\lim_{n \to \infty} \frac{1}{n} \log\Big(\sup_{c \in A} \sum_{c' \in \psi^{-n}(c)} e^{-\beta \sum_{k=0}^{n-1} F(\psi^k(c'))}\Big) = 0. \tag{6.14}$$

For $n \in \mathbb{N}$ there are $c_n, c'_n \in \psi^{-n}(A)$ such that

$$\sum_{k=0}^{n-1} F(\psi^k(c_n)) = \inf_{c' \in \psi^{-n}(A)} \sum_{k=0}^{n-1} F(\psi^k(c')) \leq \sup_{c' \in \psi^{-n}(A)} \sum_{k=0}^{n-1} F(\psi^k(c')) = \sum_{k=0}^{n-1} F(\psi^k(c'_n)).$$

Then

$$-\beta\, \frac{1}{n} \sum_{k=0}^{n-1} F(\psi^k(c'_n)) + \frac{1}{n} \log \sup_{c \in A} \#\psi^{-n}(c) \leq 0 \leq -\beta\, \frac{1}{n} \sum_{k=0}^{n-1} F(\psi^k(c_n)) + \frac{1}{n} \log \sup_{c \in A} \#\psi^{-n}(c) \tag{6.15}$$

asymptotically as $n$ goes to infinity. Let $\nu$ and $\nu'$ be states of $D$ such that the corresponding measures on $\widehat{D}$ are weak* condensation points of the sequences $\frac{1}{n} \sum_{k=0}^{n-1} \delta_{\psi^k(c_n)}$ and $\frac{1}{n} \sum_{k=0}^{n-1} \delta_{\psi^k(c'_n)}$, $n = 1, 2, 3, \dots$, respectively. Then 1) holds by Theorem 6.9 of [W1] and $\nu(A) = \nu'(A) = 1$ since $A$ is closed and $\psi^k(c_n), \psi^k(c'_n) \in A$ for all $k, n$. The estimates 3) follow from (6.15).

Theorem 6.12. Assume that $h_m(\varphi) > 0$.

1) If $A_F^\varphi > 0$ there is a $\beta$-KMS state for $\alpha^F$ such that $\beta A_F^\varphi \leq h_m(\varphi) \leq \beta B_F^\varphi$.
2) If $B_F^\varphi < 0$ there is a $\beta$-KMS state for $\alpha^F$ such that $\beta B_F^\varphi \leq h_m(\varphi) \leq \beta A_F^\varphi$.
3) When $F$ is continuous there is in both cases, 1) or 2), a $\varphi$-invariant Borel probability measure $\mu \in M_\varphi(X)$ such that

$$\beta \int_X F\, d\mu = h_m(\varphi). \tag{6.16}$$

Proof. 1) follows directly from Lemma 6.11 applied with $A = \widehat{D}$ and 2) follows by applying 1) to $-F$.

3) Assume now that $F$ is continuous. Since we either have that $\beta A_F^\varphi \leq h_m(\varphi) \leq \beta B_F^\varphi$ or $\beta B_F^\varphi \leq h_m(\varphi) \leq \beta A_F^\varphi$ there is a sequence $n_1 < n_2 < \cdots$ in $\mathbb{N}$ and points $x_i, y_i \in X$ such that

$$h_m(\varphi) - \frac{1}{i} \leq \beta\, \frac{1}{n_i} \sum_{j=0}^{n_i - 1} F \circ \varphi^j(x_i)$$

and

$$\beta\, \frac{1}{n_i} \sum_{j=0}^{n_i - 1} F \circ \varphi^j(y_i) \leq h_m(\varphi) + \frac{1}{i}$$

for all $i$. For each $i$ we can then find a number $s_i \in [0, 1]$ such that

$$h_m(\varphi) - \frac{1}{i} \leq \beta\, \frac{1}{n_i} \int_X \sum_{j=0}^{n_i - 1} F \circ \varphi^j\, d\nu_i \leq h_m(\varphi) + \frac{1}{i}, \tag{6.17}$$

where $\nu_i = s_i \delta_{x_i} + (1 - s_i)\delta_{y_i}$. Any weak* condensation point $\mu$ of the sequence

$$\frac{1}{n_i} \sum_{j=0}^{n_i - 1} \nu_i \circ \varphi^{-j}$$

will be $\varphi$-invariant by Theorem 6.9 of [W1] and satisfy $\beta \int_X F\, d\mu = h_m(\varphi)$ thanks to (6.17).

Corollary 6.13. Assume that $F$ is continuous and either strictly positive or strictly negative. There is no KMS-state for $\alpha^F$ if $h_m(\varphi) = 0$. If $h_m(\varphi) > 0$ there is a $\beta$-KMS-state for $\alpha^F$ such that

$$\beta = \frac{h_m(\varphi)}{\int_X F\, d\mu}$$

for some $\mu \in M_\varphi(X)$.

Proof. The first assertion follows from Corollary 6.6 and the second from Theorem 6.12.

Example 6.14. Assume that $\varphi : X \to X$ is uniformly $n$-to-1, i.e. that $\#\varphi^{-1}(x) = n$ for all $x \in X$. Note that $n \geq 2$ since we assume that $\varphi$ is not injective. Then $h_m(\varphi) = \log n$ and it follows from Theorem 6.12 and Theorem 6.8 that there is exactly one $\beta$ such that the gauge action on $C_r^*(\Gamma_\varphi)$ has a $\beta$-KMS state, namely $\beta = \log n$. In many cases $\log n$ is also the topological entropy, $h(\varphi)$. This is for example the case when $\varphi$ is an affine map on $\mathbb{T}^k$. To see that in general $\log n$ is smaller than the topological entropy, let $f : Y \to Y$ be an arbitrary homeomorphism of a compact metric space $Y$. Then $\varphi \times f : X \times Y \to X \times Y$ is also locally injective and $n$-to-1. In particular $h_m(\varphi \times f) = \log n$, while the topological entropy is $h(\varphi) + h(f)$ which can be any number $\geq \log n$.
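The value $\beta = \log n$ in Example 6.14 can be reproduced numerically from the criterion of Corollary 6.10. The sketch below makes an assumption that goes beyond the text: $\varphi$ is the full one-sided shift on finitely many symbols and $F(x)$ depends only on the first symbol $x_0$, with the hypothetical list `weights` holding its values. In that case $L_{-\beta F}(1)$ is the constant $\sum_a e^{-\beta\,\mathtt{weights}[a]}$, so the spectral radius equals this sum and the condition $\rho(L_{-\beta F}) = 1$ becomes a scalar equation that bisection can solve. The function names are hypothetical.

```python
import math

def pressure_sum(beta, weights):
    """Value of L_{-beta F}(1) when F(x) = weights[x_0] on the full shift."""
    return sum(math.exp(-beta * f) for f in weights)

def solve_beta(weights, tol=1e-12):
    """Find beta > 0 with sum_a exp(-beta * weights[a]) = 1 by bisection.

    Assumption: all weights are strictly positive and there are at least
    two symbols, so the sum decreases strictly from len(weights) > 1 to 0.
    """
    lo, hi = 0.0, 1.0
    while pressure_sum(hi, weights) > 1.0:
        hi *= 2.0                      # enlarge the bracket until the sum drops below 1
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pressure_sum(mid, weights) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(solve_beta([1.0, 1.0, 1.0]))  # gauge action on the full 3-shift
```

With all weights equal to 1 the equation is $n e^{-\beta} = 1$, in agreement with the example; unequal weights give a simple way to experiment with the bounds of Proposition 6.5.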

References

[A] Anantharaman-Delaroche, C.: Purely infinite C*-algebras arising from dynamical systems. Bull. Soc. Math. France 125, 199–225 (1997)
[ABL] Antonevich, A.B., Bakhtin, V.I., Lebedev, A.V.: T-entropy and variational principle for the spectral radius of transfer and weighted shift operators. http://arXiv.org/abs/0809.3116v2 [math.DS], 2008
[BKR] Boyd, S., Keswari, N., Raeburn, I.: Faithful Representations of Crossed Products by Endomorphisms. Proc. Amer. Math. Soc. 118, 427–436 (1993)
[BR] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. New York-Heidelberg-Berlin: Springer Verlag, 1981
[D] Deaconu, V.: Groupoids associated with endomorphisms. Trans. Amer. Math. Soc. 347, 1779–1786 (1995)
[DS] Dunford, N., Schwartz, J.T.: Linear Operators, Part I: General Theory. New York: Interscience Publishers, 1966
[E] Exel, R.: Crossed products by finite index endomorphisms and KMS states. J. Func. Anal. 199, 153–158 (2003)
[EV] Exel, R., Vershik, A.: C*-algebras of Irreversible Dynamical Systems. Canad. J. Math. 58, 39–63 (2006)
[FFN] Fiebig, D., Fiebig, U., Nitecki, Z.: Entropy and preimage sets. Erg. Th. & Dyn. Sys. 23, 1785–1806 (2003)
[Hu] Hurley, M.: On topological entropy of maps. Erg. Th. & Dyn. Sys. 15, 557–568 (1995)
[Kr1] Krieger, W.: On sofic systems I. Israel J. Math. 48, 305–330 (1984)
[Kr2] Krieger, W.: On sofic systems II. Israel J. Math. 60, 167–176 (1987)
[KR] Kumjian, A., Renault, J.: KMS-states on C*-algebras associated to expansive maps. Proc. Amer. Math. Soc. 134, 2067–2078 (2006)
[MWY] Matsumoto, K., Watatani, Y., Yoshida, M.: KMS states for gauge actions on C*-algebras associated with subshifts. Math. Z. 228, 489–509 (1998)
[PWY] Pinzari, C., Watatani, Y., Yonetani, K.: KMS states, entropy and the variational principle in full C*-dynamical systems. Commun. Math. Phys. 213, 331–379 (2000)
[Re1] Renault, J.: A Groupoid Approach to C*-algebras. LNM 793, Berlin-Heidelberg-New York: Springer Verlag, 1980
[Re2] Renault, J.: AF-equivalence relations and their co-cycles. Operator Algebras and Mathematical Physics, Conference Proceedings, Constanza 2001, Bucharest: The Theta Foundation, 2003, pp. 365–377
[Ru] Ruelle, D.: Thermodynamic Formalism. Encyclopedia of Mathematics and its Applications 5, Reading, MA: Addison-Wesley, 1978
[Th] Thomsen, K.: Semi-étale groupoids and applications. Ann. l'Inst. Fourier 60(3), 759–800 (2010)
[W1] Walters, P.: An Introduction to Ergodic Theory. New York-Heidelberg-Berlin: Springer Verlag, 1982
[W2] Walters, P.: Convergence of the Ruelle operator for a function satisfying Bowen's condition. Trans. Amer. Math. Soc. 353, 327–347 (2000)

Communicated by A. Connes

Commun. Math. Phys. 302, 425–451 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1134-4


Meixner Class of Non-commutative Generalized Stochastic Processes with Freely Independent Values II. The Generating Function

Marek Bożejko (1), Eugene Lytvynov (2)

(1) Instytut Matematyczny, Uniwersytet Wrocławski, Pl. Grunwaldzki 2/4, 50-384 Wrocław, Poland. E-mail: [email protected]

(2) Department of Mathematics, Swansea University, Singleton Park, Swansea SA2 8PP, U.K. E-mail: [email protected]

Received: 15 March 2010 / Accepted: 6 June 2010
Published online: 15 September 2010 – © Springer-Verlag 2010

Abstract: Let $T$ be an underlying space with a non-atomic measure $\sigma$ on it. In [Comm. Math. Phys. 292, 99–129 (2009)] the Meixner class of non-commutative generalized stochastic processes with freely independent values, $\omega = (\omega(t))_{t \in T}$, was characterized through the continuity of the corresponding orthogonal polynomials. In this paper, we derive a generating function for these orthogonal polynomials. The first question we have to answer is: What should serve as a generating function for a system of polynomials of infinitely many non-commuting variables? We construct a class of operator-valued functions $Z = (Z(t))_{t \in T}$ such that $Z(t)$ commutes with $\omega(s)$ for any $s, t \in T$. Then a generating function can be understood as $G(Z, \omega) = \sum_{n=0}^{\infty} \int_{T^n} P^{(n)}(\omega(t_1), \dots, \omega(t_n)) Z(t_1) \cdots Z(t_n)\, \sigma(dt_1) \cdots \sigma(dt_n)$, where $P^{(n)}(\omega(t_1), \dots, \omega(t_n))$ is (the kernel of the) $n$th orthogonal polynomial. We derive an explicit form of $G(Z, \omega)$, which has a resolvent form and resembles the generating function in the classical case, albeit it involves integrals of non-commuting operators. We finally discuss a related problem of the action of the annihilation operators $\partial_t$, $t \in T$. In contrast to the classical case, we prove that the operators $\partial_t$ related to the free Gaussian and Poisson processes have a property of globality. This result is genuinely infinite-dimensional, since in one dimension one loses the notion of globality.

1. Introduction and Preliminaries

This paper serves as a continuation of our research started in [10]. We recall that the Meixner class of non-commutative generalized stochastic processes with freely independent values was characterized in [10] through the continuity of the corresponding orthogonal polynomials. The main aim of the present paper is to derive the generating function for these orthogonal polynomials.

Let us first briefly recall some known results on the generating function of Meixner polynomials, in both the classical and free cases. Below, when speaking of orthogonal


polynomials on the real line, we will always assume that their measure of orthogonality has infinite support and is centered. According to e.g. [11] (see also the original paper [16]), the Meixner class of orthogonal polynomials on $\mathbb R$ consists of all monic orthogonal polynomials $(P^{(n)})_{n=0}^\infty$ whose (exponential) generating function has the form
\[ G(z,x) := \sum_{n=0}^\infty \frac{P^{(n)}(x)}{n!}\, z^n = \exp\big(x\Psi(z) + \Phi(z)\big) = \sum_{k=0}^\infty \frac{1}{k!}\big(x\Psi(z) + \Phi(z)\big)^k, \]
where $z$ is from a neighborhood of zero in $\mathbb C$, and $\Psi$ and $\Phi$ are analytic functions in a neighborhood of zero such that $\Psi(0) = \Phi(0) = \Phi'(0) = 0$. This assumption automatically implies that
\[ \Phi(z) = -C(\Psi(z)), \tag{1.1} \]
where $C(z)$ is the cumulant generating function of the measure of orthogonality, $\mu$:
\[ C(z) = \sum_{n=1}^\infty \frac{z^n}{n!}\, C^{(n)}, \]
$C^{(n)}$ being the $n$th cumulant of $\mu$. Recall that
\[ C(z) = \log\left(\int_{\mathbb R} e^{zs}\, \mu(ds)\right). \]

Each system of Meixner polynomials is characterized by three parameters $k > 0$, $\lambda \in \mathbb R$, and $\eta \ge 0$. The corresponding orthogonal polynomials satisfy the recursion relation
\[ x P^{(n)}(x) = P^{(n+1)}(x) + \lambda n P^{(n)}(x) + (kn + \eta n(n-1)) P^{(n-1)}(x), \tag{1.2} \]
and the generating function takes the form
\[ G(z,x) = \exp\big(x\Psi_{\lambda,\eta}(z) - k\, C_{\lambda,\eta}(\Psi_{\lambda,\eta}(z))\big), \tag{1.3} \]
where the functions $\Psi_{\lambda,\eta}(z)$ and $C_{\lambda,\eta}(z)$ are determined by the parameters $\lambda$ and $\eta$ only. In particular, $C_{\lambda,\eta}(z)$ is the cumulant generating function of the measure of orthogonality corresponding to the parameters $k = 1$, $\lambda$ and $\eta$. We refer the reader to e.g. [16] for an explicit form of $\Psi_{\lambda,\eta}(z)$ and $C_{\lambda,\eta}(z)$. We also note that these functions depend continuously on their parameters $\lambda$ and $\eta$, see [18] for details.

Let us now outline the infinite dimensional case, see [13,14,18] for further details. Let $T$ be a complete, connected, oriented $C^\infty$ Riemannian manifold and let $\mathcal B(T)$ be the Borel $\sigma$-algebra on $T$. Let $\sigma$ be a Radon, non-atomic, non-degenerate measure on $(T, \mathcal B(T))$. (For simplicity, the reader may think of $T$ as $\mathbb R^d$ and of $\sigma$ as the Lebesgue measure.) Let $\mathcal D$ denote the space of all real-valued infinitely differentiable functions on $T$ with compact support. We endow $\mathcal D$ with the standard nuclear space topology. Let $\mathcal D'$ denote the dual space of $\mathcal D$ with respect to the center space $L^2(T, \sigma)$. Thus, $\mathcal D'$ consists of generalized functions (distributions) on $T$. Let $\mathcal C$ denote the cylinder $\sigma$-algebra on $\mathcal D'$, i.e., the minimal $\sigma$-algebra on $\mathcal D'$ with respect to which, for any $\xi \in \mathcal D$, the mapping $\mathcal D' \ni \omega \mapsto \langle \omega, \xi\rangle \in \mathbb R$ is Borel-measurable. Here and below, $\langle\cdot,\cdot\rangle$ denotes the pairing between elements of a given linear topological space and its dual space.


Let $\mu$ be a probability measure on $(\mathcal D', \mathcal C)$ (a generalized stochastic process). The cumulant generating function of $\mu$ is given by
\[ C(\xi) = \log\left(\int_{\mathcal D'} e^{\langle\omega,\xi\rangle}\, \mu(d\omega)\right), \quad \xi \in \mathcal D. \]
The Meixner class of generalized stochastic processes with independent values may be identified as follows. We fix arbitrary smooth functions $\lambda : T \to \mathbb R$ and $\eta : T \to [0,\infty)$, and define a probability measure $\mu$ on $(\mathcal D', \mathcal C)$ whose cumulant generating function is
\[ C(\xi) = \int_T C_{\lambda(t),\eta(t)}(\xi(t))\, \sigma(dt), \quad \xi \in \mathcal D. \]
Here, $C_{\lambda(t),\eta(t)}(\cdot)$ is as in (1.3). Consider the set of all continuous polynomials on $\mathcal D'$, i.e., functions on $\mathcal D'$ which have the form
\[ F(\omega) = \sum_{i=0}^n \langle \omega^{\otimes i}, f^{(i)}\rangle, \quad n \in \mathbb N_0. \]
Here, for each $i$, $f^{(i)}$ belongs to the $i$th symmetric tensor power of $\mathcal D$, i.e., $f^{(i)} \in \mathcal D^{\odot i}$, where $\odot$ denotes symmetric tensor product. Note that $\mathcal D^{\odot i}$ consists of all smooth symmetric functions on $T^i$ with compact support. For each $f^{(n)} \in \mathcal D^{\odot n}$, we denote by $P(f^{(n)}) = P(f^{(n)}, \omega)$ the orthogonal projection of the monomial $\langle\omega^{\otimes n}, f^{(n)}\rangle$ onto the $n$th chaos, i.e., onto the orthogonal difference in $L^2(\mathcal D', \mu)$ of the closures of the polynomials of order $\le n$ and of order $\le n-1$, respectively. Then $P(f^{(n)})$ is a continuous polynomial. By construction, for any $f^{(n)} \in \mathcal D^{\odot n}$ and $g^{(m)} \in \mathcal D^{\odot m}$ with $n \ne m$, the polynomials $P(f^{(n)})$ and $P(g^{(m)})$ are orthogonal. Furthermore, for each $\omega \in \mathcal D'$, one can recursively define $P^{(n)}(\omega) \in \mathcal D'^{\odot n}$, $n \in \mathbb N$, so that $P(f^{(n)}, \omega) = \langle P^{(n)}(\omega), f^{(n)}\rangle$. The (exponential) generating function of these polynomials is defined by
\[ G(\xi,\omega) := \sum_{n=0}^\infty \frac{1}{n!}\, \langle P^{(n)}(\omega), \xi^{\otimes n}\rangle, \]
where $\xi$ is from a neighborhood of zero in $\mathcal D$. We have:
\[ G(\xi,\omega) = \exp\left[\langle \omega(\cdot), \Psi_{\lambda(\cdot),\eta(\cdot)}(\xi(\cdot))\rangle - \int_T C_{\lambda(t),\eta(t)}\big(\Psi_{\lambda(t),\eta(t)}(\xi(t))\big)\, \sigma(dt)\right], \tag{1.4} \]

compare with (1.3). Note that the measure σ now plays the role of the parameter k in (1.3). Below, in the free case, we will use, for many objects, the same notations as those used for their counterpart in the classical case. However, it should always be clear from the context which objects are meant.


Introduced by Anshelevich [1] and Saitoh, Yoshida [19], the free Meixner class of orthogonal polynomials on $\mathbb R$ consists of all monic orthogonal polynomials $(P^{(n)})_{n=0}^\infty$ on $\mathbb R$ whose (usual) generating function has the form
\[ G(z,x) := \sum_{n=0}^\infty P^{(n)}(x)\, z^n = \big(1 - x\Psi(z) - \Phi(z)\big)^{-1} = \sum_{k=0}^\infty \big(x\Psi(z) + \Phi(z)\big)^k, \]
where $z$ is from a neighborhood of zero and $\Psi$ and $\Phi$ satisfy the same conditions as in the classical case. Then the function $\Phi(z)$ automatically takes the form as in (1.1), but with $C(\cdot)$ being the free cumulant generating function of the measure of orthogonality, $\mu$:
\[ C(z) := \sum_{n=1}^\infty z^n\, C^{(n)}, \]

where $C^{(n)}$ is the $n$th free cumulant of $\mu$, see [1,3]. A system of such polynomials is also characterized by three parameters $k > 0$, $\lambda \in \mathbb R$, $\eta \ge 0$, and the polynomials satisfy the recursion relation (1.2) with the factors $n$ and $n-1$ replaced by $[n]_0$ and $[n-1]_0$, respectively. Here, for $q \in \mathbb R$ and $n = 0, 1, 2, \dots$, we denote $[n]_q := (1 - q^n)/(1 - q)$, so that $[0]_0 = 0$ and $[n]_0 = 1$ for all $n = 1, 2, \dots$. Thus,
\begin{align*}
P^{(0)}(x) &= 1, \quad P^{(1)}(x) = x,\\
x P^{(1)}(x) &= P^{(2)}(x) + \lambda P^{(1)}(x) + k P^{(0)}(x),\\
x P^{(n)}(x) &= P^{(n+1)}(x) + \lambda P^{(n)}(x) + (k + \eta) P^{(n-1)}(x), \quad n \ge 2.
\end{align*}
Furthermore, the generating function $G(z,x)$ takes the form as in (1.3) but with the resolvent function replacing the exponential function. In fact, we have [1]
\[ \Psi_{\lambda,\eta}(z) = \frac{z}{1 + \lambda z + \eta z^2}, \quad C_{\lambda,\eta}(z) = \frac{2z^2}{1 - \lambda z + \sqrt{(1 - \lambda z)^2 - 4z^2\eta}}, \tag{1.5} \]
\[ C_{\lambda,\eta}(\Psi_{\lambda,\eta}(z)) = \frac{z^2}{1 + \lambda z + \eta z^2}, \tag{1.6} \]
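The composition identity (1.6) can be checked numerically from the two closed forms in (1.5). The following is only a sanity-check sketch: the function names and the sample parameter values are illustrative assumptions, not taken from the paper, and the check is restricted to small real $z$ where the square root in (1.5) is real.

```python
import math


def psi(z, lam, eta):
    """Psi_{lam,eta}(z) = z / (1 + lam*z + eta*z^2), first formula in (1.5)."""
    return z / (1.0 + lam * z + eta * z * z)


def free_cumulant_gf(z, lam, eta):
    """C_{lam,eta}(z) = 2 z^2 / (1 - lam*z + sqrt((1 - lam*z)^2 - 4 eta z^2)), second formula in (1.5)."""
    discriminant = (1.0 - lam * z) ** 2 - 4.0 * eta * z * z
    return 2.0 * z * z / (1.0 - lam * z + math.sqrt(discriminant))


def composed_closed_form(z, lam, eta):
    """Right-hand side of (1.6): z^2 / (1 + lam*z + eta*z^2)."""
    return z * z / (1.0 + lam * z + eta * z * z)


def max_gap(lam, eta, points):
    """Largest deviation between C(Psi(z)) and the closed form over the sample points."""
    return max(
        abs(free_cumulant_gf(psi(z, lam, eta), lam, eta) - composed_closed_form(z, lam, eta))
        for z in points
    )


# Illustrative parameters (assumptions): lam = 0.7, eta = 0.4, small real z.
sample_points = [-0.1, -0.05, 0.01, 0.05, 0.1]
gap = max_gap(lam=0.7, eta=0.4, points=sample_points)
print(gap)
```

For small $z$ the two sides agree up to floating-point rounding, which is what one expects from the algebraic identity $(1 - \lambda w)^2 - 4\eta w^2 = (1 - \eta z^2)^2/(1 + \lambda z + \eta z^2)^2$ with $w = \Psi_{\lambda,\eta}(z)$.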

so that
\[ G(z,x) = \left(1 - x\, \frac{z}{1 + \lambda z + \eta z^2} + k\, \frac{z^2}{1 + \lambda z + \eta z^2}\right)^{-1}. \tag{1.7} \]
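As a consistency check between the free recursion relation and the resolvent-type generating function (1.7), one can expand (1.7) as a power series in $z$ at a fixed value of $x$ and compare the coefficients with the polynomials produced by the recursion. The sketch below does this in exact rational arithmetic; all function names and the sample values of $x$, $\lambda$, $\eta$, $k$ are illustrative assumptions.

```python
from fractions import Fraction


def bracket(n, q=0):
    """q-number [n]_q = 1 + q + ... + q^(n-1), equal to (1 - q^n)/(1 - q) for q != 1."""
    return sum(Fraction(q) ** i for i in range(n))


def polynomials_from_recursion(x, lam, eta, k, n_max):
    """Values P^(0)(x), ..., P^(n_max)(x) from recursion (1.2) with n, n-1 replaced by [n]_0, [n-1]_0."""
    values = [Fraction(1)]
    for n in range(n_max):
        bn, bn1 = bracket(n), bracket(n - 1)
        lower = values[n - 1] if n >= 1 else Fraction(0)
        values.append((x - lam * bn) * values[n] - (k * bn + eta * bn * bn1) * lower)
    return values


def series_coefficients(x, lam, eta, k, n_max):
    """Taylor coefficients in z of (1.7), rewritten as a ratio of two quadratics in z:
    (1 + lam z + eta z^2) / (1 + (lam - x) z + (eta + k) z^2)."""
    numerator = [Fraction(1), Fraction(lam), Fraction(eta)]
    denominator = [Fraction(1), Fraction(lam) - x, Fraction(eta) + k]
    coeffs = []
    for n in range(n_max + 1):
        c = numerator[n] if n < len(numerator) else Fraction(0)
        for i in range(1, min(n, 2) + 1):
            c -= denominator[i] * coeffs[n - i]
        coeffs.append(c)
    return coeffs


# Illustrative values (assumptions, not from the paper).
x, lam, eta, k = Fraction(3, 2), Fraction(1, 3), Fraction(2, 5), Fraction(5, 7)
from_recursion = polynomials_from_recursion(x, lam, eta, k, 8)
from_series = series_coefficients(x, lam, eta, k, 8)
print(from_recursion == from_series)
```

The comparison is exact because both sides are computed with `Fraction`; the same check with $k$ replaced by $\sigma(\Delta_j)$ applies to the one-variable polynomials used later in the proof of the main theorem.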

We also note that the class of orthogonal polynomials which is now called the free Meixner class, was derived in the conditionally free central limit theorem and in the conditionally free Poisson limit theorem in [9], see also [6] for a characterization of these polynomials in terms of a regression problem. In [3] (see also [2]), Anshelevich introduced and studied multivariate orthogonal polynomials of non-commuting variables with a resolvent-type generating function. He, in particular, noticed that the generating function G(z, x) should be defined for noncommuting indeterminates (z 1 , . . . , z k ) = z (which form coefficients by the orthogonal


polynomials) and non-commuting indeterminates $(x_1, \dots, x_k) = x$ (which are variables of the polynomials), and the $z_i$-variables must commute with the $x_j$-variables for all $i, j = 1, \dots, k$. The generating function is then supposed to have the form
\[ G(z,x) = \left(1 - \sum_{i=1}^k x_i \Psi_i(z) - \Phi(z)\right)^{-1}. \tag{1.8} \]

We refer to [2,3] for an extension of formula (1.1) to the multivariate case, for a recursion relation satisfied by the corresponding orthogonal polynomials, for an operator model of these polynomials, and for further related results.

In part 1 of this paper, [10], we identified the Meixner class of non-commutative generalized stochastic processes $\omega = (\omega(t))_{t\in T}$ as those
a) which have free independent values;
b) whose orthogonal polynomials are continuous in $\omega$.

The main aim of the present paper is to derive the generating function for a system of orthogonal polynomials as in b). However, when discussing a generating function for a system of polynomials of infinitely many non-commuting variables, the first question we have to answer is: What should serve as a generating function?

Developing the idea of [3], we will proceed in this paper as follows. Think informally of each polynomial of $\omega$ as $\langle P^{(n)}(\omega), f^{(n)}\rangle$, where $P^{(n)}(\omega)$ is an operator-valued distribution on $T^n$ and $f^{(n)}$ is a test function on $T^n$. We will consider a class of test operator-valued functions on $T$, denoted by $\mathcal Z(T)$. We assume that, for each $Z \in \mathcal Z(T)$ and $t \in T$, the operator $Z(t)$ commutes with each polynomial $\langle P^{(n)}(\omega), f^{(n)}\rangle$. (However, for $s, t \in T$, $Z(s)$ and $Z(t)$ do not need to commute.) In Sect. 2, we give a rigorous meaning to a 'dual pairing' $\langle P^{(n)}(\omega), Z^{\boxtimes n}\rangle$ and define a generating function
\[ G(Z,\omega) = \sum_{n=0}^\infty \langle P^{(n)}(\omega), Z^{\boxtimes n}\rangle, \quad Z \in \mathcal Z(T). \]
Here $Z^{\boxtimes n}(t_1, \dots, t_n) := Z(t_1)\cdots Z(t_n)$ for $(t_1, \dots, t_n) \in T^n$. We also show that the generating function $G(Z,\omega)$ uniquely characterizes the corresponding system of polynomials.

In Sect. 3, we prove that the generating function of the Meixner system is given by
\begin{align*}
G(Z,\omega) &= \left(1 - \langle \omega(\cdot), \Psi_{\lambda(\cdot),\eta(\cdot)}(Z(\cdot))\rangle + \int_T C_{\lambda(t),\eta(t)}\big(\Psi_{\lambda(t),\eta(t)}(Z(t))\big)\, \sigma(dt)\right)^{-1}\\
&= \left(1 - \Big\langle \omega, \frac{Z}{1 + \lambda Z + \eta Z^2}\Big\rangle + \int_T \frac{Z(t)^2}{1 + \lambda(t)Z(t) + \eta(t)Z(t)^2}\, \sigma(dt)\right)^{-1}, \tag{1.9}
\end{align*}

where ω is Meixner’s non-commutative generalized stochastic processes with freely independent values corresponding to functions λ and η. The reader is advised to compare formula (1.9) with the generating function in the classical infinite dimensional case, formula (1.4), and with the generating function in the finite-dimensional free case, formulas (1.7) and (1.8). In Sect. 4, we discuss a related problem of the action of the annihilation operator at point t ∈ T , denoted by ∂t in [10]. Recall that, in the classical infinite-dimensional


case, the annihilation operator $\partial_t$ can be represented as an analytic function of the Hida–Malliavin derivative $D_t$, more precisely $\partial_t = \Psi^{-1}_{\lambda(t),\eta(t)}(D_t)$. (Recall that $D_t$ is the derivative in the direction of the delta-function $\delta_t$.) We discuss a free counterpart of this result in the free Gauss–Poisson case, i.e., when $\eta \equiv 0$. A striking difference from the classical case is that we represent $\partial_t$ not just as a function of the free derivative $D_t$ in the direction $\delta_t$ (this being impossible), but rather as a function of an operator $D_t G$. More precisely, we show that $\partial_t = \Psi^{-1}_{\lambda(t),0}(D_t G)$. Here $G$ is a 'global' operator, which is independent of $t$. In fact, $G$ is a sum of certain integrals of $D_s$ over the whole space $T$. It should be stressed that this result is genuinely infinite-dimensional, since in one dimension we lose the notion of 'globality'. We expect that a similar result should also hold in the general case, not necessarily when $\eta \equiv 0$, and we hope to return to this problem in our future research.

We finish the paper with a discussion of a free differential equation satisfied by the cumulant generating function for a free Meixner class. Such an equation in the multivariate case was first derived by Anshelevich [2]. We show how this equation may be properly interpreted in our infinite dimensional setting.

2. Generating Function: Construction and Uniqueness of Corresponding Polynomials

Just as in [10], we will assume that $T$ is a locally compact Polish space. We denote by $\mathcal B(T)$ the Borel $\sigma$-algebra on $T$, and by $\mathcal B_0(T)$ the collection of all relatively compact sets from $\mathcal B(T)$. For any fixed $A \in \mathcal B_0(T)$, we will denote by $\mathcal B(A)$ the trace $\sigma$-algebra of $\mathcal B(T)$ on $A$, i.e., $\{B \in \mathcal B(T) \mid B \subset A\}$.

2.1. Construction of the integral of an operator-valued function with respect to an operator-valued measure.

Let $\mathcal G$ be a real separable Hilbert space, and let $\mathcal L(\mathcal G)$ denote the Banach space of all bounded linear operators in $\mathcal G$.
We will call a mapping $Z : T \to \mathcal L(\mathcal G)$ simple if it has the form
\[ Z(t) = \sum_{i=1}^n Z_i\, \chi_{\Delta_i}(t), \tag{2.1} \]
where $Z_1, \dots, Z_n \in \mathcal L(\mathcal G)$, $\Delta_1, \dots, \Delta_n \in \mathcal B_0(T)$, $n \in \mathbb N$, and $\chi_{\Delta_i}(t)$ denotes the indicator function of the set $\Delta_i$. We denote by $\mathcal Z(T)$ the set of all mappings $Z : T \to \mathcal L(\mathcal G)$ such that there exist a set $A \in \mathcal B_0(T)$ and a sequence of simple mappings $\{Z_n\}_{n=1}^\infty$ which vanish outside $A$ and satisfy
\[ \sup_{t\in T} \|Z(t) - Z_n(t)\|_{\mathcal L(\mathcal G)} \to 0 \quad \text{as } n \to \infty. \tag{2.2} \]
Clearly, $\mathcal Z(T)$ is a normed vector space equipped with the norm
\[ \|Z\|_\infty := \sup_{t\in T} \|Z(t)\|_{\mathcal L(\mathcal G)}. \]
By construction, the set of all simple mappings forms a dense subspace in $\mathcal Z(T)$.

Remark 2.1. It can be easily shown that any mapping $Z : T \to \mathcal L(\mathcal G)$ which is continuous and which vanishes outside a compact set in $T$ belongs to $\mathcal Z(T)$.


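Let $\mathcal H$ be another real, separable Hilbert space. We consider a mapping $\mathcal B_0(T) \ni \Delta \mapsto M(\Delta) \in \mathcal L(\mathcal H)$. We assume:

(A1) $M(\emptyset) = 0$.

(A2) $M(\cdot)$ admits a decomposition $M(\Delta) = U(\Delta) + V(\Delta)$, $\Delta \in \mathcal B_0(T)$, with $U(\Delta), V(\Delta) \in \mathcal L(\mathcal H)$ being such that, for any mutually disjoint sets $\Delta_1, \Delta_2 \in \mathcal B_0(T)$, we have
\[ \operatorname{Ran} U(\Delta_1) \perp \operatorname{Ran} U(\Delta_2), \quad \operatorname{Ran} V(\Delta_1)^* \perp \operatorname{Ran} V(\Delta_2)^*, \]
where $\operatorname{Ran} A$ denotes the range of a bounded linear operator $A$, and the symbol $\perp$ refers to orthogonality in $\mathcal H$.

(A3) For any $A \in \mathcal B_0(T)$, any sequence of mutually disjoint sets $\Delta_n \in \mathcal B(A)$, $n \in \mathbb N$, and any $F \in \mathcal H$,
\[ U\Big(\bigcup_{n=1}^\infty \Delta_n\Big) F = \sum_{n=1}^\infty U(\Delta_n) F, \quad V\Big(\bigcup_{n=1}^\infty \Delta_n\Big)^* F = \sum_{n=1}^\infty V(\Delta_n)^* F, \]
where the series converge in $\mathcal H$.

Remark 2.2. The reader will see below that Assumptions (A1)–(A3) are sufficient for our purposes.

For each $Z \in \mathcal Z(T)$, we will now identify an integral $\int_T Z \otimes dM$ as a bounded linear operator in the Hilbert space $\mathcal G \otimes \mathcal H$. We fix any $A \in \mathcal B_0(T)$. Let $Z$ be a simple mapping as in (2.1) such that $\Delta_i \subset A$ for all $i = 1, \dots, n$. Without loss of generality, we may assume that the sets $\Delta_1, \dots, \Delta_n$ are mutually disjoint. We define
\[ \int_T Z \otimes dU := \sum_{i=1}^n Z_i \otimes U(\Delta_i) \in \mathcal L(\mathcal G \otimes \mathcal H). \]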

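By (A2), $\operatorname{Ran}(Z_i \otimes U(\Delta_i)) \perp \operatorname{Ran}(Z_j \otimes U(\Delta_j))$, $i \ne j$, where $\perp$ refers to orthogonality in $\mathcal G \otimes \mathcal H$. Hence, for each $F \in \mathcal G \otimes \mathcal H$,
\begin{align*}
\Big\| \Big(\int_T Z \otimes dU\Big) F \Big\|^2_{\mathcal G\otimes\mathcal H}
&= \sum_{i=1}^n \| Z_i \otimes U(\Delta_i) F \|^2_{\mathcal G\otimes\mathcal H}\\
&\le \sum_{i=1}^n \|Z_i\|^2_{\mathcal L(\mathcal G)}\, \| 1 \otimes U(\Delta_i) F \|^2_{\mathcal G\otimes\mathcal H}\\
&\le \max_{i=1,\dots,n} \|Z_i\|^2_{\mathcal L(\mathcal G)} \sum_{i=1}^n \| 1 \otimes U(\Delta_i) F \|^2_{\mathcal G\otimes\mathcal H}\\
&= \|Z\|^2_\infty\, \Big\| 1 \otimes U\Big(\bigcup_{i=1}^n \Delta_i\Big) F \Big\|^2_{\mathcal G\otimes\mathcal H}\\
&\le \|Z\|^2_\infty\, \| 1 \otimes U(A) F \|^2_{\mathcal G\otimes\mathcal H}. \tag{2.3}
\end{align*}
Note that the latter estimate follows from the inequality
\[ \| 1 \otimes U(A_1) F \|_{\mathcal G\otimes\mathcal H} \le \| 1 \otimes U(A_2) F \|_{\mathcal G\otimes\mathcal H}, \quad A_1, A_2 \in \mathcal B_0(T),\ A_1 \subset A_2, \]
which, in turn, is a consequence of (A2) and (A3). Hence, by (2.3),
\[ \Big\| \int_T Z \otimes dU \Big\|_{\mathcal L(\mathcal G\otimes\mathcal H)} \le \|Z\|_\infty\, \|U(A)\|_{\mathcal L(\mathcal H)}. \tag{2.4} \]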
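The orthogonality argument behind (2.3) can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden simplification, not the setting of the paper: it takes $\mathcal G = \mathbb R$ (so the $Z_i$ are scalars), $\mathcal H = \mathbb R^6$, replaces $T$ by the finite index set $\{0,\dots,5\}$, and lets $U(\Delta)$ be the coordinate projection onto the indices in $\Delta$, which has mutually orthogonal ranges on disjoint sets.

```python
def project(vec, index_set):
    """Coordinate projection U(Delta): keep the entries whose index lies in Delta."""
    return [v if i in index_set else 0.0 for i, v in enumerate(vec)]


def norm_sq(vec):
    """Squared Euclidean norm."""
    return sum(v * v for v in vec)


def integral_simple(coeffs, blocks, vec):
    """(int Z dU) F = sum_i Z_i U(Delta_i) F for a simple scalar-valued Z."""
    out = [0.0] * len(vec)
    for c, block in zip(coeffs, blocks):
        for i, v in enumerate(project(vec, block)):
            out[i] += c * v
    return out


# Hypothetical data: three mutually disjoint sets Delta_i, scalar values Z_i, a vector F.
blocks = [{0, 1}, {2}, {3, 4, 5}]
coeffs = [0.5, -2.0, 1.25]
F = [1.0, -3.0, 0.5, 2.0, 0.0, -1.5]

lhs = norm_sq(integral_simple(coeffs, blocks, F))
# First line of (2.3): orthogonal ranges turn the squared norm into a sum of squares.
orthogonal_sum = sum(c * c * norm_sq(project(F, b)) for c, b in zip(coeffs, blocks))
# Last line of (2.3): ||Z||_inf^2 * ||U(A) F||^2 with A the union of the blocks.
A = set().union(*blocks)
rhs = max(abs(c) for c in coeffs) ** 2 * norm_sq(project(F, A))
print(lhs, orthogonal_sum, rhs)
```

In this model the first equality of (2.3) holds exactly, and the final bound is far from sharp unless $|Z_i|$ is constant across the blocks.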

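Let now $Z$ be an arbitrary element of $\mathcal Z(T)$, and let $\{Z_n\}_{n=1}^\infty$ be an approximating sequence of simple mappings as in the definition of $\mathcal Z(T)$. By (2.4), for any $m, n \in \mathbb N$,
\[ \Big\| \int_T Z_n \otimes dU - \int_T Z_m \otimes dU \Big\|_{\mathcal L(\mathcal G\otimes\mathcal H)} = \Big\| \int_T (Z_n - Z_m) \otimes dU \Big\|_{\mathcal L(\mathcal G\otimes\mathcal H)} \le \|Z_n - Z_m\|_\infty\, \|U(A)\|_{\mathcal L(\mathcal H)}. \]
Hence, $\big(\int_T Z_n \otimes dU\big)_{n=1}^\infty$ is a Cauchy sequence in $\mathcal L(\mathcal G \otimes \mathcal H)$, and so it has a limit, which we denote by $\int_T Z \otimes dU$. Clearly, the definition of $\int_T Z \otimes dU$ does not depend on the choice of approximating sequence of simple mappings.

Note that, if $Z(\cdot)$ belongs to $\mathcal Z(T)$, then also $Z(\cdot)^*$ belongs to $\mathcal Z(T)$. We can therefore define, for each $Z \in \mathcal Z(T)$,
\[ \int_T Z \otimes dV := \Big( \int_T Z^* \otimes dV^* \Big)^*. \tag{2.5} \]
Finally, we set
\[ \int_T Z \otimes dM := \int_T Z \otimes dU + \int_T Z \otimes dV. \]
By (2.4) and (2.5),
\[ \Big\| \int_T Z \otimes dM \Big\|_{\mathcal L(\mathcal G\otimes\mathcal H)} \le \|Z\|_\infty \big( \|U(A)\|_{\mathcal L(\mathcal H)} + \|V(A)\|_{\mathcal L(\mathcal H)} \big). \tag{2.6} \]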

Thus, we have proved

Proposition 2.1. Let $M$ satisfy (A1)–(A3). Then, for each $A \in \mathcal B_0(T)$, there exists a constant $C_1(A) \ge 0$ such that, for each $Z \in \mathcal Z(T)$ satisfying $Z(t) = 0$ for all $t \notin A$, we have
\[ \Big\| \int_T Z \otimes dM \Big\|_{\mathcal L(\mathcal G\otimes\mathcal H)} \le C_1(A)\, \|Z\|_\infty. \]

Remark 2.3. The reader is advised to compare our construction of $\int_T Z \otimes dM$ with constructions of operator-valued integrals available in [7,12,17].


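Let us consider the special case where $\mathcal G = \mathbb R$, and so $\mathcal L(\mathcal G) = \mathbb R$. As easily seen, the set $\mathcal Z(T)$ is now the space $B_0(T)$ of all bounded measurable functions $f : T \to \mathbb R$ with compact support. Furthermore, for each $f \in B_0(T)$, the operator $\int_T f\, dM := \int_T f \otimes dM \in \mathcal L(\mathcal H)$ is characterized by the formula
\[ \Big( \int_T f\, dM\, F_1, F_2 \Big)_{\mathcal H} := \int_T f\, dM_{F_1,F_2}, \quad F_1, F_2 \in \mathcal H. \tag{2.7} \]
Here, for any $A \in \mathcal B_0(T)$ and any $F_1, F_2 \in \mathcal H$, the mapping
\[ \mathcal B(A) \ni \Delta \mapsto M_{F_1,F_2}(\Delta) := (M(\Delta)F_1, F_2)_{\mathcal H} \in \mathbb R \]
is a signed measure on $(A, \mathcal B(A))$. By Proposition 2.1, the total variation of $M_{F_1,F_2}$ on $A$ satisfies
\[ |M_{F_1,F_2}|(A) \le C_1(A)\, \|F_1\|_{\mathcal H}\, \|F_2\|_{\mathcal H}. \tag{2.8} \]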

Remark 2.4. Assume that $T = \mathbb R$ and $M(\cdot)$ is an orthogonal resolution of the identity in $\mathcal H$, i.e., a projection-valued measure on $(\mathbb R, \mathcal B(\mathbb R))$. Then $M(\cdot)$ clearly satisfies the above assumptions and $\int_{\mathbb R} f\, dM$ is a usual spectral integral (see e.g. [5,20]).

2.2. Generating function uniquely identifies polynomials.

We will now consider a sequence $(M^{(n)})_{n=1}^\infty$ of operator-valued measures on $\mathcal B_0(T^n)$, respectively. Our initial assumptions on each $M^{(n)}$ will be slightly weaker than those in Subsect. 2.1. We assume that, for each $n \in \mathbb N$, we are given a function $\mathcal B_0(T^n) \ni \Delta \mapsto M^{(n)}(\Delta) \in \mathcal L(\mathcal H)$ which satisfies the following assumption:

(B) For any $F_1, F_2 \in \mathcal H$ and any $A \in \mathcal B_0(T)$, the mapping
\[ \mathcal B_0(A^n) \ni \Delta \mapsto M^{(n)}_{F_1,F_2}(\Delta) := (M^{(n)}(\Delta)F_1, F_2)_{\mathcal H} \in \mathbb R \]
is a signed measure on $(A^n, \mathcal B(A^n))$ whose total variation on $A^n$ satisfies
\[ |M^{(n)}_{F_1,F_2}|(A^n) \le C_2(A)^n\, \|F_1\|_{\mathcal H}\, \|F_2\|_{\mathcal H}, \tag{2.9} \]
where the constant $C_2(A)$ only depends on $A$, and is independent of $F_1, F_2 \in \mathcal H$ and $n \in \mathbb N$.

Analogously to (2.7), we may then identify, for each $f^{(n)} \in B_0(T^n)$, the integral $\int_{T^n} f^{(n)}\, dM^{(n)}$ as an element of $\mathcal L(\mathcal H)$. (This operator may be thought of as a polynomial of the $n$th order.)

For any $Z_1, \dots, Z_n \in \mathcal Z(T)$, we define
\[ (Z_1 \boxtimes Z_2 \boxtimes \cdots \boxtimes Z_n)(t_1, t_2, \dots, t_n) := Z_1(t_1) Z_2(t_2) \cdots Z_n(t_n), \]
where the right-hand side is understood in the sense of the usual product of operators. Note that, in the case where $\mathcal G = \mathbb R$, for any $f_1, f_2, \dots, f_n \in \mathcal Z(T) = B_0(T)$, we evidently have $f_1 \otimes f_2 \otimes \cdots \otimes f_n = f_1 \boxtimes f_2 \boxtimes \cdots \boxtimes f_n$.


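For each $Z \in \mathcal Z(T)$, we would like to identify an integral $\int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)}$ as an element of $\mathcal L(\mathcal G \otimes \mathcal H)$. However, we cannot do this under the above assumptions, so we define a four-linear form
\[ \Big( \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \Big)(G_1, F_1, G_2, F_2) := \int_{T^n} (Z(t_1)\cdots Z(t_n) G_1, G_2)_{\mathcal G}\, dM^{(n)}_{F_1,F_2}(t_1, \dots, t_n), \tag{2.10} \]
for $G_1, G_2 \in \mathcal G$, $F_1, F_2 \in \mathcal H$. As easily follows from the definition of $\mathcal Z(T)$ and (B), the function
\[ T^n \ni (t_1, \dots, t_n) \mapsto (Z(t_1)\cdots Z(t_n) G_1, G_2)_{\mathcal G} \in \mathbb R \]
is indeed measurable, the integral in (2.10) is finite, and moreover,
\[ \Big| \Big( \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \Big)(G_1, F_1, G_2, F_2) \Big| \le \|Z\|^n_\infty\, C_2(\operatorname{supp} Z)^n\, \|G_1\|_{\mathcal G} \|G_2\|_{\mathcal G} \|F_1\|_{\mathcal H} \|F_2\|_{\mathcal H}. \tag{2.11} \]
Here, $\operatorname{supp} Z$ denotes the support of $Z$. Hence, $\int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)}$ is a bounded (and so continuous) form.

Remark 2.5. If there exists an operator $Q^{(n)} \in \mathcal L(\mathcal G \otimes \mathcal H)$ such that
\[ (Q^{(n)} G_1 \otimes F_1, G_2 \otimes F_2)_{\mathcal G\otimes\mathcal H} = \Big( \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \Big)(G_1, F_1, G_2, F_2), \quad G_1, G_2 \in \mathcal G,\ F_1, F_2 \in \mathcal H, \]
then we can identify $\int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)}$ with the operator $Q^{(n)}$. However, the estimate (2.11) is not sufficient for this to hold.

We define a generating function of $(M^{(n)})_{n=1}^\infty$ as follows. We set
\[ \operatorname{Dom}(G) := \{ Z \in \mathcal Z(T) : \|Z\|_\infty\, C_2(\operatorname{supp} Z) < 1 \}. \]
Note that for each $Z \in \mathcal Z(T)$, one can find $\varepsilon > 0$ such that, for each $a \in (-\varepsilon, \varepsilon)$, $aZ$ belongs to $\operatorname{Dom}(G)$. By virtue of (2.11), for each $Z \in \operatorname{Dom}(G)$,
\[ G(Z) := 1 + \sum_{n=1}^\infty \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \tag{2.12} \]
defines a bounded four-linear form on $\mathcal G \times \mathcal H \times \mathcal G \times \mathcal H$. Here, $1$ denotes the form which corresponds to the identity operator in $\mathcal G \otimes \mathcal H$.

Remark 2.6. Just as in Remark 2.5, if there exists an operator $Q \in \mathcal L(\mathcal G \otimes \mathcal H)$ such that
\[ (Q G_1 \otimes F_1, G_2 \otimes F_2)_{\mathcal G\otimes\mathcal H} = G(Z)(G_1, F_1, G_2, F_2), \quad G_1, G_2 \in \mathcal G,\ F_1, F_2 \in \mathcal H, \]
then we can identify $G(Z)$ with the operator $Q$.

The following proposition shows that the generating function uniquely identifies the sequence $(M^{(n)})_{n=1}^\infty$.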


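Proposition 2.2. Let $(M^{(n)})_{n=1}^\infty$ and $(\tilde M^{(n)})_{n=1}^\infty$ satisfy condition (B). Assume that
\[ G(Z) = \tilde G(Z), \quad Z \in \operatorname{Dom}(G) \cap \operatorname{Dom}(\tilde G). \tag{2.13} \]
(Here, $\tilde G(Z)$ denotes the generating function of $(\tilde M^{(n)})_{n=1}^\infty$.) Then, for each $n \in \mathbb N$, $M^{(n)} = \tilde M^{(n)}$.

Proof. Let $Z \in \mathcal Z(T)$. Fix $\varepsilon > 0$ such that, for each $a \in (-\varepsilon, \varepsilon)$, $aZ \in \operatorname{Dom}(G) \cap \operatorname{Dom}(\tilde G)$. Then, by (2.13), for each $G_1, G_2 \in \mathcal G$, $F_1, F_2 \in \mathcal H$ and each $a \in (-\varepsilon, \varepsilon)$,
\[ \sum_{n=1}^\infty a^n \Big( \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \Big)(G_1, F_1, G_2, F_2) = \sum_{n=1}^\infty a^n \Big( \int_{T^n} Z^{\boxtimes n} \otimes d\tilde M^{(n)} \Big)(G_1, F_1, G_2, F_2). \]
Hence, for each $n \in \mathbb N$,
\[ \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} = \int_{T^n} Z^{\boxtimes n} \otimes d\tilde M^{(n)}, \quad Z \in \mathcal Z(T). \]
Now, take as Hilbert space $\mathcal G$ the full Fock space over $\ell_2$: $\mathcal G = \mathcal F(\ell_2)$. Fix $n \in \mathbb N$ and choose any mutually orthogonal vectors $e_1, \dots, e_n$ in $\ell_2$ with norm 1. Fix arbitrary $\Delta_1, \dots, \Delta_n \in \mathcal B_0(T)$ and define $Z \in \mathcal Z(T)$ by
\[ Z(t) := \sum_{i=1}^n a^+(e_i)\, \chi_{\Delta_i}(t), \]
$a^+(e_i)$ being the creation operator at $e_i$. Set $G_1 := \Omega$, the vacuum, and $G_2 := e_1 \otimes e_2 \otimes \cdots \otimes e_n$. Then, for any $F_1, F_2 \in \mathcal H$,
\begin{align*}
\Big( \int_{T^n} Z^{\boxtimes n} \otimes dM^{(n)} \Big)(G_1, F_1, G_2, F_2)
&= \sum_{i_1, i_2, \dots, i_n = 1, \dots, n} (e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_n},\ e_1 \otimes e_2 \otimes \cdots \otimes e_n)_{\mathcal F(\ell_2)}\, M^{(n)}_{F_1,F_2}(\Delta_{i_1} \times \Delta_{i_2} \times \cdots \times \Delta_{i_n})\\
&= M^{(n)}_{F_1,F_2}(\Delta_1 \times \Delta_2 \times \cdots \times \Delta_n).
\end{align*}
Therefore,
\[ M^{(n)}_{F_1,F_2}(\Delta_1 \times \Delta_2 \times \cdots \times \Delta_n) = \tilde M^{(n)}_{F_1,F_2}(\Delta_1 \times \Delta_2 \times \cdots \times \Delta_n). \]
Hence, by (B), for any $\Delta \in \mathcal B_0(T^n)$,
\[ M^{(n)}_{F_1,F_2}(\Delta) = \tilde M^{(n)}_{F_1,F_2}(\Delta), \]
which implies the proposition. $\square$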
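The key step of the proof of Proposition 2.2 is that pairing with $e_1 \otimes \cdots \otimes e_n$ in the full Fock space kills every term of the $n$-fold sum except the one with $(i_1, \dots, i_n) = (1, \dots, n)$. The sketch below enumerates this sum for a small $n$; `toy_rectangle_measure` is a purely hypothetical stand-in for $M^{(n)}_{F_1,F_2}(\Delta_{i_1} \times \cdots \times \Delta_{i_n})$, and index words stand for the basis tensors $e_{i_1} \otimes \cdots \otimes e_{i_n}$.

```python
from itertools import product


def basis_tensor_inner(word_a, word_b):
    """Inner product of e_{a1} x ... x e_{an} and e_{b1} x ... x e_{bn} in the full
    (unsymmetrized) Fock space over an orthonormal family: 1 if the words agree, else 0."""
    return 1 if tuple(word_a) == tuple(word_b) else 0


def surviving_words(n):
    """Index words (i_1, ..., i_n) whose tensor is not orthogonal to e_1 x ... x e_n."""
    target = tuple(range(1, n + 1))
    return [w for w in product(range(1, n + 1), repeat=n) if basis_tensor_inner(w, target)]


def paired_sum(n, rectangle_measure):
    """Sum over all words of <e_word, e_1 x ... x e_n> * measure(word)."""
    target = tuple(range(1, n + 1))
    return sum(
        basis_tensor_inner(word, target) * rectangle_measure(word)
        for word in product(range(1, n + 1), repeat=n)
    )


def toy_rectangle_measure(word):
    """Hypothetical values of the signed measure on rectangles indexed by the word."""
    return sum((position + 1) * index for position, index in enumerate(word)) + 0.5


print(surviving_words(3), paired_sum(3, toy_rectangle_measure))
```

Because the Fock space is not symmetrized, only the ordered word $(1, \dots, n)$ survives, which is why the values of $M^{(n)}_{F_1,F_2}$ on all rectangles $\Delta_1 \times \cdots \times \Delta_n$ can be read off from the generating function.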


3. Generating Function for a Free Meixner Process

We start by briefly recalling the construction of a free Meixner process from [10]. Let $T$ be as in Sect. 2, and denote $\mathcal D := C_0(T)$. Let $\sigma$ be a Radon non-atomic measure on $(T, \mathcal B(T))$ which satisfies $\sigma(O) > 0$ for each open, non-empty set $O$ in $T$. Fix some functions $\lambda, \eta \in C(T)$, which play the role of parameters of the free Meixner process. Consider the extended Fock space
\[ \mathcal F = \mathbb R \oplus \bigoplus_{n=1}^\infty L^2(T^n, \gamma_n). \tag{3.1} \]
Here, for $n \in \mathbb N$, the measure $\gamma_n$ on $(T^n, \mathcal B(T^n))$ satisfies
\[ \int_{T^n} f^{(n)}(t_1, \dots, t_n)\, \gamma_n(dt_1, \dots, dt_n) = \sum_{\substack{i \in \mathbb N,\ l_1, \dots, l_i \in \mathbb N,\\ l_1 + \cdots + l_i = n}} \int_{T^i} f^{(n)}(\underbrace{t_1, \dots, t_1}_{l_1 \text{ times}}, \dots, \underbrace{t_i, \dots, t_i}_{l_i \text{ times}})\, \eta^{l_1 - 1}(t_1) \cdots \eta^{l_i - 1}(t_i)\, \sigma(dt_1) \cdots \sigma(dt_i) \]
for any measurable function $f^{(n)} : T^n \to [0, \infty]$. In particular, $\gamma_n = \sigma^{\otimes n}$ if and only if $\eta \equiv 0$.

The free Meixner process is defined as the family $(X(f))_{f \in \mathcal D}$ of bounded linear operators in $\mathcal F$ given by
\[ X(f) = X^+(f) + X^0(f) + X^-(f), \]
where the creation operator $X^+(f)$, the neutral operator $X^0(f)$ and the (extended) annihilation operator $X^-(f)$ are defined by formulas (4.1)–(4.3) in [10]. We also have a representation of each $X(f)$ as
\[ X(f) = \int_T \sigma(dt)\, f(t)\, \omega(t) = \langle \omega, f\rangle, \]
where
\[ \omega(t) = \partial_t^\dagger + \lambda(t)\, \partial_t^\dagger \partial_t + \partial_t + \eta(t)\, \partial_t^\dagger \partial_t \partial_t \tag{3.2} \]

with $\partial_t^\dagger$ and $\partial_t$ being the creation and annihilation operator at point $t$, respectively (see [10, Cor. 4.2]). The corresponding system of orthogonal polynomials is denoted in this paper by
\[ \langle P^{(n)}(\omega), f^{(n)}\rangle, \quad f^{(n)} \in \mathcal D^{(n)} := C_0(T^n),\ n \in \mathbb N_0. \]
These are the bounded linear operators in $\mathcal F$ which are recursively defined through
\begin{align*}
P^{(0)}(\omega) &= 1, \quad P^{(1)}(\omega)(t) = \omega(t),\\
P^{(n)}(\omega)(t_1, \dots, t_n) &= \omega(t_1) P^{(n-1)}(\omega)(t_2, \dots, t_n) - \delta(t_1, t_2) \lambda(t_1) P^{(n-1)}(\omega)(t_2, \dots, t_n)\\
&\quad - \delta(t_1, t_2) P^{(n-2)}(\omega)(t_3, \dots, t_n) - [n-2]_0\, \delta(t_1, t_2, t_3) \eta(t_1) P^{(n-2)}(\omega)(t_3, \dots, t_n), \quad n \ge 2,
\end{align*}


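where $\delta(t_1, t_2)$ and $\delta(t_1, t_2, t_3)$ are the 'delta-functions' defined as in [10, Sect. 2]. In particular, for any $f_1, \dots, f_n \in \mathcal D$, $n \ge 2$,
\begin{align*}
\langle P^{(n)}(\omega), f_1 \otimes \cdots \otimes f_n\rangle
&= \langle \omega, f_1\rangle \langle P^{(n-1)}(\omega), f_2 \otimes \cdots \otimes f_n\rangle\\
&\quad - \langle P^{(n-1)}(\omega), (\lambda f_1 f_2) \otimes f_3 \otimes \cdots \otimes f_n\rangle\\
&\quad - \int_T f_1(t) f_2(t)\, \sigma(dt)\, \langle P^{(n-2)}(\omega), f_3 \otimes \cdots \otimes f_n\rangle\\
&\quad - [n-2]_0\, \langle P^{(n-2)}(\omega), (\eta f_1 f_2 f_3) \otimes f_4 \otimes \cdots \otimes f_n\rangle. \tag{3.3}
\end{align*}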

Recall also that we may extend the definition of $X(f)$ and of $\langle P^{(n)}(\omega), f^{(n)}\rangle$ to the case where $f \in B_0(T)$ and $f^{(n)} \in B_0(T^n)$, respectively.

Our aim now is to derive the generating function for these orthogonal polynomials. So, let us fix a Hilbert space $\mathcal G$. From now on, for simplicity of notation, we will sometimes identify operators $X \in \mathcal L(\mathcal G)$ and $Y \in \mathcal L(\mathcal F)$ with the operators $X \otimes 1$ and $1 \otimes Y$ in $\mathcal L(\mathcal G \otimes \mathcal F)$.

For each $f \in \mathcal D$, we clearly have $\langle \omega, f\rangle = \int_T f\, dM$, where for each $\Delta \in \mathcal B_0(T)$, $M(\Delta) := X(\chi_\Delta)$. Note that $M$ satisfies conditions (A1)–(A3) with $U(\Delta) = X^+(\chi_\Delta) + X^0(\chi_\Delta)$, $V(\Delta) = X^-(\chi_\Delta)$. Indeed, (A1) is trivially satisfied. For any $\Delta \in \mathcal B_0(T)$, we have, by (4.1) and (4.2) in [10],
\[ X^+(\Delta)\Omega = \chi_\Delta, \quad X^0(\Delta)\Omega = 0, \tag{3.4} \]
where $\Omega$ is the vacuum in $\mathcal F$, and for any $n \in \mathbb N$ and any $f^{(n)} \in L^2(T^n, \gamma_n)$,
\[ (X^+(\Delta) f^{(n)})(t_1, \dots, t_{n+1}) = \chi_\Delta(t_1)\, f^{(n)}(t_2, \dots, t_{n+1}), \tag{3.5} \]
\[ (X^0(\Delta) f^{(n)})(t_1, \dots, t_n) = \chi_\Delta(t_1)\, \lambda(t_1)\, f^{(n)}(t_1, \dots, t_n). \tag{3.6} \]
Furthermore,
\[ X^-(\Delta)^* = X^+(\Delta). \tag{3.7} \]
Now, (A2) and (A3) easily follow from (3.4)–(3.7). Therefore, by Subsect. 2.1, we define, for each $Z \in \mathcal Z(T)$,
\[ \langle \omega, Z\rangle := \int_T Z \otimes dM \in \mathcal L(\mathcal G \otimes \mathcal F). \]

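It easily follows from (2.6) and the definition of the space $\mathcal F$ that
\[ \|\langle \omega, Z\rangle\|_{\mathcal L(\mathcal G\otimes\mathcal F)} \le \|Z\|_\infty\, C_3(\operatorname{supp} Z), \quad Z \in \mathcal Z(T), \tag{3.8} \]
where
\[ C_3(A) := 2\sqrt{\sigma(A)} + 2 \sup_{t\in A} \eta(t) + \sup_{t\in A} |\lambda(t)|, \quad A \in \mathcal B_0(T). \tag{3.9} \]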

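For any $n \in \mathbb N$ and any $Z_1, \dots, Z_n \in \mathcal Z(T)$, we recursively define an operator $\langle P^{(n)}(\omega), Z_1 \boxtimes \cdots \boxtimes Z_n\rangle$ from $\mathcal L(\mathcal G \otimes \mathcal F)$ as follows. By analogy with (3.3), we set $\langle P^{(0)}(\omega), Z^{\boxtimes 0}\rangle := 1$, $\langle P^{(1)}(\omega), Z\rangle := \langle \omega, Z\rangle$, and for $n \ge 2$,
\begin{align*}
\langle P^{(n)}(\omega), Z_1 \boxtimes \cdots \boxtimes Z_n\rangle
&= \langle \omega, Z_1\rangle \langle P^{(n-1)}(\omega), Z_2 \boxtimes \cdots \boxtimes Z_n\rangle\\
&\quad - \langle P^{(n-1)}(\omega), (\lambda Z_1 Z_2) \boxtimes Z_3 \boxtimes \cdots \boxtimes Z_n\rangle\\
&\quad - \int_T Z_1(t) Z_2(t)\, \sigma(dt)\, \langle P^{(n-2)}(\omega), Z_3 \boxtimes \cdots \boxtimes Z_n\rangle\\
&\quad - [n-2]_0\, \langle P^{(n-2)}(\omega), (\eta Z_1 Z_2 Z_3) \boxtimes Z_4 \boxtimes \cdots \boxtimes Z_n\rangle. \tag{3.10}
\end{align*}
Note that, for any $Z_1, Z_2 \in \mathcal Z(T)$, the point-wise (non-commutative) product $Z_1 Z_2$ belongs to $\mathcal Z(T)$, and for each $Z \in \mathcal Z(T)$, $\lambda Z$ and $\eta Z$ also belong to $\mathcal Z(T)$. In formula (3.10) and below, for each $Z \in \mathcal Z(T)$, the integral $\int_T Z(t)\, \sigma(dt)$ is understood in Bochner's sense, see e.g. [5,20]. It then easily follows by induction from (3.8)–(3.10) and a standard estimate of the norm of a Bochner integral that, for any $A \in \mathcal B_0(T)$, $n \in \mathbb N$, and any $Z_1, \dots, Z_n \in \mathcal Z(T)$ with support in $A$:
\[ \|\langle P^{(n)}(\omega), Z_1 \boxtimes \cdots \boxtimes Z_n\rangle\|_{\mathcal L(\mathcal G\otimes\mathcal F)} \le C_4(A)^n\, \|Z_1\|_\infty \cdots \|Z_n\|_\infty, \tag{3.11} \]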

where
\begin{align*}
C_4(A) &:= C_3(A) + \sigma(A) + \sup_{t\in A} |\lambda(t)| + \sup_{t\in A} \eta(t)\\
&= 2\sqrt{\sigma(A)} + \sigma(A) + 2 \sup_{t\in A} \eta(t) + 3 \sup_{t\in A} |\lambda(t)|. \tag{3.12}
\end{align*}

Hence, for each $Z \in \mathcal Z(T)$ such that $\|Z\|_\infty\, C_4(\operatorname{supp} Z) < 1$, the sum
\[ G(Z) = 1 + \sum_{n=1}^\infty \langle P^{(n)}(\omega), Z^{\boxtimes n}\rangle \tag{3.13} \]
defines an operator from $\mathcal L(\mathcal G \otimes \mathcal F)$.

Next, we set, for each $n \in \mathbb N$ and $\Delta \in \mathcal B_0(T^n)$, $M^{(n)}(\Delta) := \langle P^{(n)}(\omega), \chi_\Delta\rangle$. Analogously to (3.11), we conclude that the sequence $(M^{(n)})_{n=1}^\infty$ satisfies condition (B), and so the function $G$ defined by (3.13) is the generating function of the operator-valued measures $(M^{(n)})_{n=1}^\infty$ in the sense of Subsect. 2.2. Hence, by Proposition 2.2, the generating function $G$ uniquely identifies $(M^{(n)})_{n=1}^\infty$, and hence also the polynomials $\langle P^{(n)}(\omega), f^{(n)}\rangle$, $f^{(n)} \in \mathcal D^{(n)}$. To stress the dependence of the generating function $G(Z)$ on the free generalized stochastic process $\omega$, we will write $G(Z, \omega)$.

Theorem 3.1. Fix any $A \in \mathcal B_0(T)$. Then there exists a constant $C_5(A) > 0$ such that, for any $Z \in \mathcal Z(T)$ satisfying $\operatorname{supp} Z \subset A$ and $\|Z\|_\infty < C_5(A)$, formula (1.9) holds. Furthermore, we have
\[ G(Z, \omega) = \big(1 - f(Z)\, \langle \omega(\cdot), \Psi_{\lambda(\cdot),\eta(\cdot)}(Z(\cdot))\rangle\big)^{-1} f(Z), \tag{3.14} \]
where
\[ f(Z) := \left(1 + \int_T \frac{Z(t)^2}{1 + \lambda(t)Z(t) + \eta(t)Z(t)^2}\, \sigma(dt)\right)^{-1}. \tag{3.15} \]

Remark 3.1. The right-hand side of formula (1.9) should be understood in the following sense: for any real-valued function $f(x) = \sum_{n=0}^\infty a_n x^n$ which is real-analytic on


$(-r, r)$, we write, for a bounded linear operator $B$ whose norm is less than $r$: $f(B) := \sum_{n=0}^\infty a_n B^n$. Under our assumption on $Z \in \mathcal Z(T)$, we then have $\dfrac{Z^l}{1 + \lambda Z + \eta Z^2} \in \mathcal Z(T)$, $l = 1, 2$.

Proof. We divide the proof into several steps.

Step 1. First, for a fixed $A \in \mathcal B_0(T)$, let us explicitly specify a possible choice of a constant $C_5(A)$ in the theorem. For each $t \in T$, define $\alpha(t), \beta(t) \in \mathbb C$ so that
\[ \alpha(t) + \beta(t) = -\lambda(t), \quad \alpha(t)\beta(t) = \eta(t). \]
Hence, for each $x \in \mathbb R$,
\[ 1 + \lambda(t)x + \eta(t)x^2 = (1 - \alpha(t)x)(1 - \beta(t)x). \]
The right-hand side of formula (1.9) now reads as
\[ \left(1 - \big\langle \omega, Z(1 - \alpha Z)^{-1}(1 - \beta Z)^{-1}\big\rangle + \int_T Z(t)^2 (1 - \alpha(t)Z(t))^{-1} (1 - \beta(t)Z(t))^{-1}\, \sigma(dt)\right)^{-1} \tag{3.16} \]
(we consider the above operator in the complexification of the real Hilbert space $\mathcal G \otimes \mathcal F$, for which we keep the same notation). Set
\[ \alpha_A := \sup_{t\in A} |\alpha(t)|, \quad \beta_A := \sup_{t\in A} |\beta(t)|. \]
Choose $C_6(A) > 0$ so that
\[ \Big(\sum_{k=0}^\infty \alpha_A^k\, C_6(A)^k\Big) \Big(\sum_{l=0}^\infty \beta_A^l\, C_6(A)^l\Big)\, C_6(A)\, \big(C_3(A) + C_6(A)\sigma(A)\big) < 1. \tag{3.17} \]
Then, by virtue of (3.8), we have that, for each $Z \in \mathcal Z(T)$ such that $\operatorname{supp} Z \subset A$ and $\|Z\|_\infty \le C_6(A)$, formula (3.16) defines a bounded linear operator in $\mathcal L(\mathcal G \otimes \mathcal F)$. Recalling (3.11)–(3.13), we set
\[ C_5(A) := \min\{C_4(A)^{-1}, C_6(A)\}. \tag{3.18} \]
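The factorization used at the start of Step 1 can be checked numerically in the scalar case. The sketch below assumes the sign convention $\alpha + \beta = -\lambda$, $\alpha\beta = \eta$, which is the one under which $1 + \lambda x + \eta x^2 = (1 - \alpha x)(1 - \beta x)$ holds, and it also compares the product of the two geometric series that appear in (3.17) with $1/(1 + \lambda x + \eta x^2)$. The function names and sample values are illustrative assumptions.

```python
import cmath


def factor_parameters(lam, eta):
    """Complex numbers alpha, beta with alpha + beta = -lam and alpha * beta = eta."""
    root = cmath.sqrt(lam * lam - 4.0 * eta)
    return (-lam + root) / 2.0, (-lam - root) / 2.0


def truncated_double_series(x, alpha, beta, terms=60):
    """Product of the truncated geometric series for (1 - alpha x)^(-1) and (1 - beta x)^(-1)."""
    series_alpha = sum((alpha * x) ** k for k in range(terms))
    series_beta = sum((beta * x) ** l for l in range(terms))
    return series_alpha * series_beta


# Illustrative values (assumptions): complex-conjugate roots and a small real x.
lam, eta, x = 0.7, 0.4, 0.1
alpha, beta = factor_parameters(lam, eta)
quadratic = 1.0 + lam * x + eta * x * x
factored = (1 - alpha * x) * (1 - beta * x)
series_value = truncated_double_series(x, alpha, beta)
print(abs(factored - quadratic), abs(series_value - 1.0 / quadratic))
```

When $\lambda^2 < 4\eta$ the numbers $\alpha$, $\beta$ are genuinely complex, which is why the proof passes to the complexification of $\mathcal G \otimes \mathcal F$.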

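Then, for each $Z \in \mathcal Z(T)$ such that $\operatorname{supp} Z \subset A$ and $\|Z\|_\infty < C_5(A)$, the left- and right-hand sides of formula (1.9) identify bounded linear operators in $\mathcal G \otimes \mathcal F$. Let us denote the operators on the left- and right-hand sides of formula (1.9) by $L(Z)$ and $R(Z)$, respectively. Fix any $\Theta, \Upsilon \in \mathcal G \otimes \mathcal F$. It follows that, for any $Z \in \mathcal Z(T)$ such that $\operatorname{supp} Z \subset A$, the functions
\[ f^{(L)}(z) := (L(zZ)\Theta, \Upsilon)_{\mathcal G\otimes\mathcal F}, \quad f^{(R)}(z) := (R(zZ)\Theta, \Upsilon)_{\mathcal G\otimes\mathcal F} \]
are analytic on $\{z \in \mathbb C : |z| < C_5(A)\|Z\|_\infty^{-1}\}$.

Step 2. Fix any $A \in \mathcal B_0(T)$. Choose any set partition $\mathcal P = \{\Delta_1, \dots, \Delta_J\}$ of $A$, i.e.,
\[ A = \bigcup_{j=1}^J \Delta_j, \quad \Delta_j \in \mathcal B_0(T),\ j = 1, \dots, J,\ J \in \mathbb N, \]
and the sets $\Delta_j$ are mutually disjoint. Set
\[ \lambda_j := \inf_{t\in\Delta_j} \lambda(t), \quad \eta_j := \inf_{t\in\Delta_j} \eta(t), \quad j = 1, \dots, J, \]
and define a function
\[ \lambda_{\mathcal P}(t) := \begin{cases} \lambda_j, & \text{if } t \in \Delta_j,\ j = 1, \dots, J,\\ 0, & \text{if } t \in A^c, \end{cases} \]
and analogously a function $\eta_{\mathcal P}(t)$. Now, we define a generalized operator-valued process $\omega_{\mathcal P}(t)$ and corresponding non-commutative polynomials $\langle P^{(n)}(\omega_{\mathcal P}), f^{(n)}\rangle$, $f^{(n)} \in B_0(T^n)$, in the same way as $\omega(t)$ and $\langle P^{(n)}(\omega), f^{(n)}\rangle$ were defined, but by using the functions $\lambda_{\mathcal P}$ and $\eta_{\mathcal P}$ instead of $\lambda$ and $\eta$, respectively. We stress that these are also defined in the extended Fock space $\mathcal F$ constructed through the function $\eta$. Hence, generally speaking, the operators $\langle P^{(n)}(\omega_{\mathcal P}), f^{(n)}\rangle$ are not self-adjoint in $\mathcal F$. This, however, does not lead to any problem when we define a generating function $G_{\mathcal P}(Z)$ of these polynomials. In particular, the corresponding operator-valued measure
\[ M_{\mathcal P}(\Delta) := \langle \omega_{\mathcal P}, \chi_\Delta\rangle, \quad \Delta \in \mathcal B_0(T), \]
satisfies conditions (A1)–(A3) with $U(\Delta) = X^+(\chi_\Delta) + X^0_{\mathcal P}(\chi_\Delta)$ and $V(\Delta) = X^-_{\mathcal P}(\chi_\Delta)$, where
\[ X^+(\chi_\Delta) := \int_\Delta \partial_t^\dagger\, \sigma(dt), \quad X^0_{\mathcal P}(\chi_\Delta) := \int_\Delta \lambda_{\mathcal P}(t)\, \partial_t^\dagger \partial_t\, \sigma(dt), \quad X^-_{\mathcal P}(\chi_\Delta) := \int_\Delta \big(\partial_t + \eta_{\mathcal P}(t)\, \partial_t^\dagger \partial_t \partial_t\big)\, \sigma(dt), \]
compare with (3.2). (We leave the evaluation of the adjoint operator of $X^-_{\mathcal P}(\chi_\Delta)$ in $\mathcal F$ to the interested reader.) Furthermore, analogously to (3.11), we get, for any $A \in \mathcal B_0(T)$, $n \in \mathbb N$ and any $Z_1, \dots, Z_n \in \mathcal Z(T)$ with support in $A$,
\[ \|\langle P^{(n)}(\omega_{\mathcal P}), Z_1 \boxtimes \cdots \boxtimes Z_n\rangle\|_{\mathcal L(\mathcal G\otimes\mathcal F)} \le C_4(A)^n\, \|Z_1\|_\infty \cdots \|Z_n\|_\infty, \tag{3.19} \]
with the same constant $C_4(A)$ given by (3.12). (We, in particular, used that $\eta_{\mathcal P}(t) \le \eta(t)$ for all $t \in A$.)

Step 3. By definition, for each $j = 1, \dots, J$, the polynomials $\big(\langle P^{(n)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes n}\rangle\big)_{n=1}^\infty$ satisfy the recursion relation
\[ \langle P^{(n)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes n}\rangle = \big(\langle \omega_{\mathcal P}, \chi_{\Delta_j}\rangle - \lambda_j\big) \langle P^{(n-1)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes (n-1)}\rangle - \big(\sigma(\Delta_j) + [n-2]_0\, \eta_j\big) \langle P^{(n-2)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes (n-2)}\rangle, \quad n \ge 2. \]
Therefore,
\[ \langle P^{(n)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes n}\rangle = P^{(n)}_{\lambda_j, \eta_j, \sigma(\Delta_j)}\big(\langle \omega_{\mathcal P}, \chi_{\Delta_j}\rangle\big), \tag{3.20} \]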

Meixner Class of Non-commutative Generalized Stochastic Processes II

441

(n)

where (Pλ j ,η j ,σ ( j ) )∞ n=0 is a system of polynomials on R recursively defined by (u) = 1, Pλ(0) j ,η j ,σ ( j )

Pλ(1) (u) = u, j ,η j ,σ ( j )

(n)

(n−1)

Pλ j ,η j ,σ ( j ) (u) = (u − λ j )Pλ j ,η j ,σ ( j ) (u) − (σ ( j ) (n−2)

+[n − 2]0 η j )Pλ j ,η j ,σ ( j ) (u), n ≥ 2.

(3.21)

(n)

By [1], the generating function of (Pλ j ,η j ,σ ( j ) )∞ n=0 is given by ∞

z

n

(n) Pλ j ,η j ,σ ( j ) (u)

= 1−u

n=0

σ ( j )z 2 z + 1 + λ j z + η j z2 1 + λ j z + η j z2

−1 . (3.22)

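More precisely, for each $r > 0$, there exists $\varepsilon_{r,A} > 0$ such that formula (3.22) holds for each $u \in \mathbb R$ with $|u| \le r$ and for each $z \in \mathbb C$ such that $|z| < \varepsilon_{r,A}$. Let $Z_j \in \mathcal L(\mathcal G)$ be such that $\|Z_j\|_{\mathcal L(\mathcal G)} < C_5(A)$, where $C_5(A)$ is given by (3.18). Then, by (3.20) and (3.22), we get

\[
1 + \sum_{n=1}^\infty Z_j^n\, \langle P^{(n)}(\omega_{\mathcal P}), \chi_{\Delta_j}^{\otimes n}\rangle
= \left( 1 - \langle \omega_{\mathcal P}, \chi_{\Delta_j}\rangle\, \frac{Z_j}{1 + \lambda_j Z_j + \eta_j Z_j^2} + \sigma(\Delta_j)\, \frac{Z_j^2}{1 + \lambda_j Z_j + \eta_j Z_j^2} \right)^{-1}, \quad j = 1, \dots, J. \tag{3.23}
\]

Denote

\[
U_j := \langle \omega_{\mathcal P}, \chi_{\Delta_j}\rangle\, \frac{Z_j}{1 + \lambda_j Z_j + \eta_j Z_j^2} - \sigma(\Delta_j)\, \frac{Z_j^2}{1 + \lambda_j Z_j + \eta_j Z_j^2}, \quad j = 1, \dots, J.
\]

Then (3.23) is equivalent to

\[
\sum_{n=1}^\infty \langle P^{(n)}(\omega_{\mathcal P}), (Z_j \chi_{\Delta_j})^{\otimes n}\rangle = \sum_{n=1}^\infty U_j^n, \quad j = 1, \dots, J. \tag{3.24}
\]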
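The scalar identity behind (3.21)–(3.22) can be checked numerically. The sketch below is only an illustration: the function names and sample parameter values are hypothetical, and it assumes that $[n]_0$ equals 0 for $n = 0$ and 1 for $n \ge 1$, which is what the rule $D t^n = [n]_0 t^{n-1}$ for the free derivative suggests. It builds the polynomials from the recursion and compares a truncated power series with the closed form.

```python
def bracket0(n):
    """Free analogue [n]_0 of n (assumed here: 0 for n = 0, 1 for n >= 1)."""
    return 0 if n == 0 else 1


def meixner_polys(u, lam, eta, sigma, n_max):
    """Values P^(0)(u), ..., P^(n_max)(u) obtained from the recursion (3.21)."""
    values = [1.0, u]
    for n in range(2, n_max + 1):
        values.append((u - lam) * values[n - 1]
                      - (sigma + bracket0(n - 2) * eta) * values[n - 2])
    return values[: n_max + 1]


def generating_function(u, z, lam, eta, sigma):
    """Closed form on the right-hand side of (3.22)."""
    denom = 1.0 + lam * z + eta * z * z
    return 1.0 / (1.0 - u * z / denom + sigma * z * z / denom)


def truncated_series(u, z, lam, eta, sigma, n_max=200):
    """Partial sum of sum_n z^n P^(n)(u); meaningful only for small |z|."""
    polys = meixner_polys(u, lam, eta, sigma, n_max)
    return sum(z ** n * p for n, p in enumerate(polys))


if __name__ == "__main__":
    # Hypothetical sample values, chosen so that the series converges quickly.
    args = dict(lam=0.3, eta=0.5, sigma=1.2)
    print(truncated_series(0.7, 0.05, **args))
    print(generating_function(0.7, 0.05, **args))
```

The two printed numbers should agree to rounding error for small $|z|$, in line with the restriction $|z| < \varepsilon_{r,A}$ above.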

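Step 4. We claim that, for any $n \in \mathbb N$ and any $j_1, j_2, \dots, j_n \in \{1, 2, \dots, J\}$ such that $j_1 \ne j_2$, $j_2 \ne j_3$, …, $j_{n-1} \ne j_n$, and any $k_1, k_2, \dots, k_n \in \mathbb N$, we have

\[
\langle P^{(k_1 + k_2 + \cdots + k_n)}(\omega_{\mathcal P}), \chi_{\Delta_{j_1}}^{\otimes k_1} \otimes \chi_{\Delta_{j_2}}^{\otimes k_2} \otimes \cdots \otimes \chi_{\Delta_{j_n}}^{\otimes k_n}\rangle
= \langle P^{(k_1)}(\omega_{\mathcal P}), \chi_{\Delta_{j_1}}^{\otimes k_1}\rangle\, \langle P^{(k_2)}(\omega_{\mathcal P}), \chi_{\Delta_{j_2}}^{\otimes k_2}\rangle \cdots \langle P^{(k_n)}(\omega_{\mathcal P}), \chi_{\Delta_{j_n}}^{\otimes k_n}\rangle. \tag{3.25}
\]

Indeed, first we can prove by induction in $k_1 \in \mathbb N$ that, for any fixed $k_2 \in \mathbb N$, and any $j_1, j_2 \in \{1, 2, \dots, J\}$, $j_1 \ne j_2$,

\[
\langle P^{(k_1 + k_2)}(\omega_{\mathcal P}), \chi_{\Delta_{j_1}}^{\otimes k_1} \otimes \chi_{\Delta_{j_2}}^{\otimes k_2}\rangle
= \langle P^{(k_1)}(\omega_{\mathcal P}), \chi_{\Delta_{j_1}}^{\otimes k_1}\rangle\, \langle P^{(k_2)}(\omega_{\mathcal P}), \chi_{\Delta_{j_2}}^{\otimes k_2}\rangle.
\]

Then, we prove (3.25) by induction in $n \in \mathbb N$.

Step 5. Now, fix any $Z_1, \dots, Z_J \in \mathcal L(\mathcal G)$ such that

\[
\max_{j = 1, \dots, J} \|Z_j\|_{\mathcal L(\mathcal G)} < \frac{C_5(A)}{J}. \tag{3.26}
\]

Then, it follows from the derivation of the constant $C_5(A)$ (see, in particular, (3.16)–(3.18)) that

\[
\max_{j = 1, \dots, J} \|U_j\|_{\mathcal L(\mathcal G \otimes \mathbb F)} < \frac{1}{J}.
\]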

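By (3.24) and (3.25), we have:

\[
\begin{aligned}
\sum_{n=1}^\infty (U_1 + U_2 + \cdots + U_J)^n
&= \sum_{n=1}^\infty \sum_{\substack{j_1, j_2, \dots, j_n \in \{1, 2, \dots, J\} \\ j_1 \ne j_2,\ j_2 \ne j_3, \dots,\ j_{n-1} \ne j_n}}
\left( \sum_{k_1=1}^\infty U_{j_1}^{k_1} \right) \left( \sum_{k_2=1}^\infty U_{j_2}^{k_2} \right) \cdots \left( \sum_{k_n=1}^\infty U_{j_n}^{k_n} \right) \\
&= \sum_{n=1}^\infty \sum_{\substack{j_1, j_2, \dots, j_n \in \{1, 2, \dots, J\} \\ j_1 \ne j_2,\ j_2 \ne j_3, \dots,\ j_{n-1} \ne j_n}}
\left( \sum_{k_1=1}^\infty \langle P^{(k_1)}(\omega_{\mathcal P}), (Z_{j_1} \chi_{\Delta_{j_1}})^{\otimes k_1}\rangle \right) \\
&\qquad \times \left( \sum_{k_2=1}^\infty \langle P^{(k_2)}(\omega_{\mathcal P}), (Z_{j_2} \chi_{\Delta_{j_2}})^{\otimes k_2}\rangle \right) \cdots \left( \sum_{k_n=1}^\infty \langle P^{(k_n)}(\omega_{\mathcal P}), (Z_{j_n} \chi_{\Delta_{j_n}})^{\otimes k_n}\rangle \right) \\
&= \sum_{n=1}^\infty \sum_{\substack{j_1, j_2, \dots, j_n \in \{1, 2, \dots, J\} \\ j_1 \ne j_2,\ j_2 \ne j_3, \dots,\ j_{n-1} \ne j_n}} \sum_{k_1=1}^\infty \sum_{k_2=1}^\infty \cdots \sum_{k_n=1}^\infty \\
&\qquad \times \langle P^{(k_1 + k_2 + \cdots + k_n)}(\omega_{\mathcal P}), (Z_{j_1} \chi_{\Delta_{j_1}})^{\otimes k_1} \otimes (Z_{j_2} \chi_{\Delta_{j_2}})^{\otimes k_2} \otimes \cdots \otimes (Z_{j_n} \chi_{\Delta_{j_n}})^{\otimes k_n}\rangle \\
&= \sum_{n=1}^\infty \langle P^{(n)}(\omega_{\mathcal P}), (Z_1 \chi_{\Delta_1} + Z_2 \chi_{\Delta_2} + \cdots + Z_J \chi_{\Delta_J})^{\otimes n}\rangle.
\end{aligned}
\]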
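The first equality in this chain is purely combinatorial: every nonempty word in the letters $U_1, \dots, U_J$ splits uniquely into maximal runs of equal letters, and summing each run over its length gives $\sum_{k \ge 1} U_j^k = U_j (1 - U_j)^{-1}$. A minimal numerical sketch of this regrouping is given below. It uses small non-commuting $2 \times 2$ matrices as hypothetical stand-ins for the operators $U_j$; all names and sample values are illustrative assumptions, not objects from the paper.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def mat_sum(mats):
    total = ZERO
    for m in mats:
        total = mat_add(total, m)
    return total


def geometric_tail(u):
    """sum_{k>=1} u^k = u (1 - u)^{-1}, valid when the series converges."""
    return mat_mul(u, mat_inv(mat_sub(IDENTITY, u)))


def left_side(us):
    """sum_{n>=1} (U_1 + ... + U_J)^n."""
    return geometric_tail(mat_sum(us))


def right_side(us, n_max=80):
    """Sum over words j_1 != j_2 != ... != j_n of products of geometric tails."""
    vs = [geometric_tail(u) for u in us]
    # ending[j]: sum of all admissible products of the current length ending in V_j
    ending = list(vs)
    result = mat_sum(ending)
    for _ in range(2, n_max + 1):
        total_prev = mat_sum(ending)
        ending = [mat_mul(mat_sub(total_prev, ending[j]), vs[j])
                  for j in range(len(vs))]
        result = mat_add(result, mat_sum(ending))
    return result


if __name__ == "__main__":
    sample = [[[0.05, 0.02], [0.0, 0.03]],
              [[0.01, -0.04], [0.06, 0.02]],
              [[-0.03, 0.05], [0.02, 0.04]]]
    print(left_side(sample))
    print(right_side(sample))
```

The sample matrices are kept small in norm so that both series converge, mirroring the role of the bound $\max_j \|U_j\| < 1/J$ in Step 5.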

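Setting

\[
Z(t) = Z_1 \chi_{\Delta_1}(t) + Z_2 \chi_{\Delta_2}(t) + \cdots + Z_J \chi_{\Delta_J}(t), \quad t \in T, \tag{3.27}
\]

we thus get

\[
1 + \sum_{n=1}^\infty \langle P^{(n)}(\omega_{\mathcal P}), Z^{\otimes n}\rangle
= \left( 1 - \left\langle \omega_{\mathcal P}, \frac{Z}{1 + \lambda_{\mathcal P} Z + \eta_{\mathcal P} Z^2} \right\rangle + \int_T \frac{Z(t)^2}{1 + \lambda_{\mathcal P}(t) Z(t) + \eta_{\mathcal P}(t) Z(t)^2}\, \sigma(dt) \right)^{-1}, \tag{3.28}
\]

provided

\[
\|Z\|_\infty < \frac{C_5(A)}{J}. \tag{3.29}
\]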

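Step 6. We note that the estimate of $\|Z\|_\infty$ in (3.29) depends on $J$, the number of elements in the partition $\mathcal P$ of $A$. It will now be shown that one can get rid of this dependence. Denote the left- and right-hand sides of formula (3.28) by $L_{\mathcal P}(Z)$ and $R_{\mathcal P}(Z)$, respectively. Analogously to Step 1, we see that $L_{\mathcal P}(Z)$ and $R_{\mathcal P}(Z)$ are in $\mathcal L(\mathcal G \otimes \mathbb F)$ for any $Z \in \mathcal Z(T)$ such that $\operatorname{supp} Z \subset A$ and $\|Z\|_\infty < C_5(A)$, and furthermore, for fixed $\Phi, \Upsilon \in \mathcal G \otimes \mathbb F$ and any $Z \in \mathcal Z(T)$ such that $\operatorname{supp} Z \subset A$, the functions

\[
f^{(L)}_{\mathcal P}(z) := \big(L_{\mathcal P}(zZ)\Phi, \Upsilon\big)_{\mathcal G \otimes \mathbb F}, \qquad
f^{(R)}_{\mathcal P}(z) := \big(R_{\mathcal P}(zZ)\Phi, \Upsilon\big)_{\mathcal G \otimes \mathbb F} \tag{3.30}
\]

are analytic on $\{z \in \mathbb C : |z| < C_5(A) \|Z\|_\infty^{-1}\}$. Fix any $\delta \in (0, 1)$. Let $Z(t)$ be of the form (3.27) and let

\[
\|Z\|_\infty < \delta\, C_5(A).
\]

Then the corresponding functions $f^{(L)}_{\mathcal P}(z)$ and $f^{(R)}_{\mathcal P}(z)$ as in (3.30) are analytic on $\{z \in \mathbb C : |z| < \delta^{-1}\}$. By Step 5,

\[
f^{(L)}_{\mathcal P}(z) = f^{(R)}_{\mathcal P}(z), \quad z \in \mathbb C,\ |z| < (J\delta)^{-1}.
\]

Hence, by the uniqueness of analytic continuation,

\[
f^{(L)}_{\mathcal P}(z) = f^{(R)}_{\mathcal P}(z), \quad z \in \mathbb C,\ |z| < \delta^{-1}.
\]

In particular,

\[
f^{(L)}_{\mathcal P}(1) = f^{(R)}_{\mathcal P}(1). \tag{3.31}
\]

Since $\Phi, \Upsilon \in \mathcal G \otimes \mathbb F$ were arbitrary, by (3.30) and (3.31), $L_{\mathcal P}(Z) = R_{\mathcal P}(Z)$. Since $\delta \in (0, 1)$ was arbitrary, we conclude that formula (3.28) holds for $Z(t)$ of the form (3.27) provided

\[
\|Z\|_\infty < C_5(A). \tag{3.32}
\]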

Step 7. Fix any simple mapping $Z \in \mathcal Z(T)$ satisfying $\operatorname{supp} Z \subset A$ and (3.32). Without loss of generality, we may assume that $Z(t)$ has form (3.27), where $\Delta_1, \dots, \Delta_J$ form some partition $\mathcal P$ of $A$. Consider any partition $\mathcal P' = \{\Delta'_1, \dots, \Delta'_K\}$ of $A$ which is finer than $\mathcal P$, i.e., any element $\Delta'_i \in \mathcal P'$ is a subset of some $\Delta_j \in \mathcal P$. Clearly, $Z(t)$ can be written down in the form

\[
Z(t) = Z'_1 \chi_{\Delta'_1}(t) + Z'_2 \chi_{\Delta'_2}(t) + \cdots + Z'_K \chi_{\Delta'_K}(t), \quad t \in T.
\]

Therefore, by Step 6, for this $Z(t)$, formula (3.28) holds in which $\mathcal P$ is replaced with $\mathcal P'$, i.e., $L_{\mathcal P'}(Z) = R_{\mathcal P'}(Z)$. For each $n \in \mathbb N$, denote by $\mathcal P_n$ a partition of $A$ which is finer than $\mathcal P$ and such that, for each $\Delta \in \mathcal P_n$,

\[
\sup_{s, t \in \Delta} |\lambda(t) - \lambda(s)| \vee \sup_{s, t \in \Delta} |\eta(t) - \eta(s)| \le \frac{1}{n}.
\]

(Clearly, such $\mathcal P_n$ exists.) By the dominated convergence theorem,

\[
L_{\mathcal P_n}(Z) \to L(Z), \qquad R_{\mathcal P_n}(Z) \to R(Z) \quad \text{as } n \to \infty
\]

in $\mathcal L(\mathcal G \otimes \mathbb F)$. Hence, $L(Z) = R(Z)$, i.e., formula (1.9) holds for any simple mapping $Z \in \mathcal Z(T)$ satisfying $\operatorname{supp} Z \subset A$ and (3.32).

Step 8. For a general $Z \in \mathcal Z(T)$ satisfying $\operatorname{supp} Z \subset A$ and (3.32), formula (1.9) follows by approximation of $Z$ by simple mappings and by the dominated convergence theorem. Finally, formulas (3.14), (3.15) follow directly from (1.9), since under our assumptions, the operator

\[
1 + \int_T \frac{Z(t)^2}{1 + \lambda(t) Z(t) + \eta(t) Z(t)^2}\, \sigma(dt)
\]

is invertible.

Corollary 3.1. Let $\Delta_1, \Delta_2, \dots, \Delta_n \in \mathcal B_0(T)$ ($n \ge 2$) be such that $\Delta_i \cap \Delta_{i+1} = \emptyset$, $i = 1, 2, \dots, n-1$. Let $k_1, k_2, \dots, k_n \in \mathbb N$ and let, for each $i = 1, 2, \dots, n$, $g^{(k_i)} \in \mathcal B_0(T^{k_i})$ vanish outside the set $\Delta_i^{k_i}$. Then

\[
\langle P^{(k_1 + k_2 + \cdots + k_n)}(\omega), g^{(k_1)} \otimes g^{(k_2)} \otimes \cdots \otimes g^{(k_n)}\rangle
= \langle P^{(k_1)}(\omega), g^{(k_1)}\rangle\, \langle P^{(k_2)}(\omega), g^{(k_2)}\rangle \cdots \langle P^{(k_n)}(\omega), g^{(k_n)}\rangle.
\]

Proof. The statement follows analogously to the proof of formula (3.25).

4. Annihilation Operators and Free Differentiation

Recall that the free Meixner process $(X(f))_{f \in \mathcal D}$ is a family of bounded linear operators acting in $\mathbb F$. In view of the unitary isomorphism between $\mathbb F$ and the non-commutative $L^2$-space $L^2(\tau)$ (see [10]), each $X(f)$ acts in $L^2(\tau)$ as the operator of left multiplication by $\langle \omega, f\rangle$. In view of the expansion (3.2) it is, in particular, desirable to better understand the action of the annihilation operators $\partial_t$, $t \in T$. Each such operator is well defined as a linear operator acting on the set of continuous polynomials in $\omega$ (denoted by $\mathcal{CP}$ in [10]) through

\[
\partial_t \langle P^{(n)}(\omega), f^{(n)}\rangle = \langle P^{(n-1)}(\omega), f^{(n)}(t, \cdot)\rangle, \quad f^{(n)} \in \mathcal D^{(n)}.
\]

(Recall that each continuous polynomial has a unique representation as a finite sum of orthogonal polynomials $\langle P^{(n)}(\omega), f^{(n)}\rangle$ with $f^{(n)} \in \mathcal D^{(n)}$.)

In the classical case, in one dimension, the annihilation operator $\partial$, defined by $\partial P^{(n)}(t) = n P^{(n-1)}(t)$, is an analytic function of the operator of differentiation, $D$. Indeed, it directly follows from (1.3) that

\[
\partial = \Psi^{-1}_{\lambda, \eta}(D), \tag{4.1}
\]

cf. [16]. This result has its counterpart in the infinite-dimensional case: as follows from (1.4), the annihilation operator at a point $t$ satisfies

\[
\partial_t = \Psi^{-1}_{\lambda(t), \eta(t)}(D_t), \tag{4.2}
\]

where $D_t$ is the operator of differentiation in the direction of the delta-function at $t$ (often called Hida–Malliavin derivative), cf. [13,18]. In particular, if $\lambda(t) = \eta(t) = 0$ (Gaussian white noise at $t$),

\[
\partial_t = D_t, \tag{4.3}
\]

and more generally, if $\eta(t) = 0$ (Poisson white noise if $\lambda(t) \ne 0$),

\[
\partial_t = \frac{1}{\lambda(t)} \left( e^{\lambda(t) D_t} - 1 \right) = \sum_{k=1}^\infty \frac{\lambda(t)^{k-1}}{k!}\, D_t^k. \tag{4.4}
\]

In the free case, in one dimension, one has

\[
G(z, t) = \big(1 - t\, \Psi_{\lambda, \eta + k}(z)\big)^{-1} f(z), \tag{4.5}
\]

where

\[
\Psi_{\lambda, \eta + k}(z) = \frac{z}{1 + \lambda z + (\eta + k) z^2}, \qquad
f(z) = \frac{1 + \lambda z + \eta z^2}{1 + \lambda z + (\eta + k) z^2}.
\]

Therefore, the corresponding annihilation operator, defined by $\partial P^{(n)}(t) = [n]_0 P^{(n-1)}(t)$, satisfies

\[
\partial = \Psi^{-1}_{\lambda, \eta + k}(D), \tag{4.6}
\]

where $D$ now denotes the operator of free differentiation: $D t^n = [n]_0 t^{n-1}$, or equivalently $D f(t) = \frac{f(t) - f(0)}{t}$, cf. [15]. Note that the operator in (4.1) is independent of the parameter $k$, unlike the operator in (4.6). In the infinite-dimensional case, we define operators of free differentiation by setting, for each $t \in T$,

\[
D_t \langle \omega^{\otimes n}, f^{(n)}\rangle = [n]_0\, \langle \omega^{\otimes(n-1)}, f^{(n)}(t, \cdot)\rangle, \quad f^{(n)} \in \mathcal D^{(n)}. \tag{4.7}
\]

However, the operator $\partial_t$ cannot be represented as a function of $D_t$. Indeed, the operator $f(Z)$ in (3.14) is ‘global’: according to (3.15), $f(Z)$ depends on the whole ‘trajectory’ $(Z(s))_{s \in T}$. Still, in the free Gauss–Poisson case, we will now derive a free counterpart of formulas (4.2)–(4.4).


So, let $\eta \equiv 0$. Let $NC_{\ge 2}(1, 2, \dots, n)$ denote the set of all non-crossing partitions of $\{1, 2, \dots, n\}$ such that each element of a partition contains at least two points. Analogously to [10], we define, for each $\zeta \in NC_{\ge 2}(1, 2, \dots, n)$,

\[
W^-(\zeta)(t_1, \dots, t_n) = \prod_{l \ge 2}\ \prod_{\{i_1, i_2, \dots, i_l\} \in \zeta} \lambda^{l-2}(t_{i_1})\, \delta(t_{i_1}, t_{i_2}, \dots, t_{i_l}).
\]
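To make the index set $NC_{\ge 2}(1, 2, \dots, n)$ concrete, the following sketch enumerates non-crossing partitions whose blocks all have at least two points. The function names are hypothetical, and the weight function treats $\lambda$ as a constant purely for illustration, so that each block of size $l$ contributes the scalar factor $\lambda^{l-2}$ appearing in $W^-(\zeta)$ (the delta-functions are not modelled).

```python
from itertools import combinations, product


def nc_partitions_min2(points):
    """All non-crossing partitions of `points` into blocks of size >= 2."""
    points = tuple(points)
    if not points:
        return [[]]  # the empty set has exactly one (empty) partition
    first, rest = points[0], points[1:]
    result = []
    for size in range(1, len(rest) + 1):
        for others in combinations(range(len(rest)), size):
            block = (first,) + tuple(rest[i] for i in others)
            # Points between consecutive block elements (and after the last one)
            # must be partitioned among themselves to avoid crossings.
            gaps, prev = [], -1
            for idx in others:
                gaps.append(rest[prev + 1: idx])
                prev = idx
            gaps.append(rest[prev + 1:])
            for choice in product(*(nc_partitions_min2(g) for g in gaps)):
                partition = [block]
                for part in choice:
                    partition.extend(part)
                result.append(partition)
    return result


def scalar_weight(partition, lam):
    """Product of lam**(l - 2) over blocks of size l (constant-lambda assumption)."""
    weight = 1.0
    for block in partition:
        weight *= lam ** (len(block) - 2)
    return weight


if __name__ == "__main__":
    for zeta in nc_partitions_min2(range(1, 5)):
        print(sorted(zeta), scalar_weight(zeta, lam=2.0))
```

For $n = 4$ this lists the partitions $\{1,2\},\{3,4\}$; $\{1,4\},\{2,3\}$; and $\{1,2,3,4\}$, with scalar weights $1$, $1$ and $\lambda^2$.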

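We define a linear operator $G$ acting on $\mathcal{CP}$ (a ‘global’ operator) by

\[
G := 1 + \sum_{n=2}^\infty\ \sum_{\zeta \in NC_{\ge 2}(1, 2, \dots, n)} \int_{T^n} \sigma(dt_1)\, \sigma(dt_2) \cdots \sigma(dt_n)\, W^-(\zeta)(t_1, \dots, t_n)\, D_{t_1} D_{t_2} \cdots D_{t_n}. \tag{4.8}
\]

In fact, by virtue of (4.7), when the operator $G$ acts on a polynomial from $\mathcal{CP}$, all but finitely many terms in the sum in (4.8) vanish. For example,

\[
\begin{aligned}
G \langle \omega^{\otimes 4}, f_1 \otimes f_2 \otimes f_3 \otimes f_4\rangle
&= \bigg( 1 + \int_{T^2} \sigma(dt_1)\, \sigma(dt_2)\, \delta(t_1, t_2)\, D_{t_1} D_{t_2} \\
&\qquad + \int_{T^3} \sigma(dt_1)\, \sigma(dt_2)\, \sigma(dt_3)\, \lambda(t_1)\, \delta(t_1, t_2, t_3)\, D_{t_1} D_{t_2} D_{t_3} \\
&\qquad + \int_{T^4} \sigma(dt_1)\, \sigma(dt_2)\, \sigma(dt_3)\, \sigma(dt_4)\, \big( \delta(t_1, t_2)\, \delta(t_3, t_4) + \delta(t_1, t_4)\, \delta(t_2, t_3) \\
&\qquad\qquad + \lambda(t_1)^2\, \delta(t_1, t_2, t_3, t_4) \big)\, D_{t_1} D_{t_2} D_{t_3} D_{t_4} \bigg) \langle \omega^{\otimes 4}, f_1 \otimes f_2 \otimes f_3 \otimes f_4\rangle \\
&= \langle \omega^{\otimes 4}, f_1 \otimes f_2 \otimes f_3 \otimes f_4\rangle + \langle f_1 f_2\rangle_\sigma\, \langle \omega^{\otimes 2}, f_3 \otimes f_4\rangle
+ \langle \lambda f_1 f_2 f_3\rangle_\sigma\, \langle \omega, f_4\rangle \\
&\qquad + \langle f_1 f_2\rangle_\sigma \langle f_3 f_4\rangle_\sigma + \langle f_1 f_4\rangle_\sigma \langle f_2 f_3\rangle_\sigma + \langle \lambda^2 f_1 f_2 f_3 f_4\rangle_\sigma,
\end{aligned} \tag{4.9}
\]

where $\langle f\rangle_\sigma := \int_T f(t)\, \sigma(dt)$.

Theorem 4.1. Let $\eta \equiv 0$. For each $t \in T$, the operator $\partial_t$ acting on $\mathcal{CP}$ has the following representation:

\[
\partial_t = \Psi^{-1}_{\lambda(t), 0}(D_t G) = \frac{D_t G}{1 - \lambda(t) D_t G} = \sum_{k=1}^\infty \lambda(t)^{k-1} (D_t G)^k. \tag{4.10}
\]

In particular, if $\lambda(t) = 0$, $\partial_t = D_t G$.

Remark 4.1. When the operator $\sum_{k=1}^\infty \lambda(t)^{k-1} (D_t G)^k$ acts on a polynomial from $\mathcal{CP}$, all but finitely many terms in the sum vanish.

Remark 4.2. The reader is advised to compare formulas (4.4) and (4.10). Recall that the free counterpart of $k!$ is $[k]_0! = 1$.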


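Proof. First, we mention that, by (1.5), $\Psi_{\lambda, 0}(z) = z/(1 + \lambda z)$, and so $\Psi^{-1}_{\lambda, 0}(z) = z/(1 - \lambda z)$. For any $g_1, g_2, \dots, g_k \in \mathcal D$, $k \ge 2$, we denote

\[
R(g_1, g_2, \dots, g_k) := \sum_{\zeta \in NC_{\ge 2}(1, 2, \dots, k)} \int_{T^k} \sigma(dt_1)\, \sigma(dt_2) \cdots \sigma(dt_k)\, W^-(\zeta)(t_1, t_2, \dots, t_k)\, g_1(t_1)\, g_2(t_2) \cdots g_k(t_k).
\]

Then, by (4.8), for any $f_1, \dots, f_n \in \mathcal D$, $n \in \mathbb N$,

\[
G \langle \omega^{\otimes n}, f_1 \otimes f_2 \otimes \cdots \otimes f_n\rangle
= \langle \omega^{\otimes n}, f_1 \otimes f_2 \otimes \cdots \otimes f_n\rangle
+ \sum_{k=2}^n R(f_1, f_2, \dots, f_k)\, \langle \omega^{\otimes(n-k)}, f_{k+1} \otimes f_{k+2} \otimes \cdots \otimes f_n\rangle.
\]

Hence,

\[
\begin{aligned}
D_t G \langle \omega^{\otimes n}, f_1 \otimes f_2 \otimes \cdots \otimes f_n\rangle
&= f_1(t)\, \langle \omega^{\otimes(n-1)}, f_2 \otimes f_3 \otimes \cdots \otimes f_n\rangle \\
&\quad + \sum_{k=2}^{n-1} R(f_1, f_2, \dots, f_k)\, f_{k+1}(t)\, \langle \omega^{\otimes(n-k-1)}, f_{k+2} \otimes f_{k+3} \otimes \cdots \otimes f_n\rangle \\
&= \sum_{k=1}^n R(f_1, f_2, \dots, f_{k-1})\, f_k(t)\, \langle \omega^{\otimes(n-k)}, f_{k+1} \otimes f_{k+2} \otimes \cdots \otimes f_n\rangle,
\end{aligned}
\]

where $R(f_1, f_2, \dots, f_k) := 1$ for $k = 0$, and $R(f_1) := 0$. Therefore,

\[
\begin{aligned}
\sum_{k=1}^\infty \lambda(t)^{k-1} (D_t G)^k \langle \omega^{\otimes n}, f_1 \otimes f_2 \otimes \cdots \otimes f_n\rangle
&= \sum_{k=1}^n \lambda(t)^{k-1} \sum_{\substack{\{i_1, i_2, \dots, i_k\} \subset \{1, 2, \dots, n\} \\ i_1 < i_2 < \cdots < i_k}} f_{i_1}(t)\, f_{i_2}(t) \cdots f_{i_k}(t) \\
&\quad \times R(f_1, f_2, \dots, f_{i_1 - 1})\, R(f_{i_1 + 1}, f_{i_1 + 2}, \dots, f_{i_2 - 1}) \cdots R(f_{i_{k-1} + 1}, f_{i_{k-1} + 2}, \dots, f_{i_k - 1}) \\
&\quad \times \langle \omega^{\otimes(n - i_k)}, f_{i_k + 1} \otimes f_{i_k + 2} \otimes \cdots \otimes f_n\rangle.
\end{aligned} \tag{4.11}
\]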

Let now n ∈ N. Recall that, in [10] (after Theorem 2.1), we introduced the class G n of ±1-marked non-crossing partitions of {1, 2, . . . , n} such that each element of a partition with mark −1 has at least two elements and no element of a partition with mark +1 is ‘within’ any other element of this partition. Analogously, for any set {i 1 , i 2 , . . . , i k } ⊂ N, with i 1 < i 2 < · · · < i k , we denote by G(i 1 , i 2 , . . . , i k ) the corresponding class of marked partitions of {i 1 , i 2 , . . . , i k }. For any two sets {i 1 , i 2 , . . . , i k } and { j1 , j2 , . . . , jl }, with i 1 < i 2 < · · · < i k < j1 < j2 < · · · < jl , and any κ1 ∈ G(i 1 , i 2 , . . . , i k ), κ2 ∈ G( j1 , j2 , . . . , jl ), we may consider κ1 ∪ κ2 as an element − of G(i 1 , i 2 , . . . , i k , j1 , j2 , . . . , jl ). We also denote by G − n and G (i 1 , i 2 , . . . , i k ) the subsets of G n and G(i 1 , i 2 , . . . , i k ), respectively, consisting of all marked partitions all of whose elements have mark −1. (Note that each set from such a partition has at least two elements.)


By [10, Theorem 2.3] and using the notation introduced in that paper, we have, for any f 1 , . . . , f n ∈ D, ω⊗n , f 1 ⊗ · · · ⊗ f n =

=

κ∈G − n

+

n

Tn

n κ∈G n T

σ (dt1 ) · · · σ (dtn ) :W (κ)ω(t1 ) · · · ω(tn ): f (t1 ) · · · f (tn )

σ (dt1 ) · · · σ (dtn ) :W (κ)ω(t1 ) · · · ω(tn ): f (t1 ) · · · f (tn )

k=1 {i 1 ,i 2 ,...,i k }⊂{1,2,...,n} i 1
R( f 1 , f 2 , . . . , f i 1 −1 )R( f i 1 +1 , f i 1 +2 , . . . , f i 2 −1 )

× · · · × R( f i k−1 +1 , f i k−1 +2 , . . . , f i k −1 ) σ (dti 1 )σ (dti 2 ) · · · σ (dti k )σ (dti k +1 )σ (dti k +2 ) · · · σ (dtn ) × n−i k +k κ∈G(i k +1,i k +2,...,n) T

×:W (ζ (i 1 , i 2 , . . . , i k ) ∪ κ)ω(ti 1 )ω(ti 2 ) · · · ω(ti k )ω(ti k+1 )ω(ti k+2 ) · · · ω(tn ): × f i 1 (ti 1 ) f i 2 (ti 2 ) · · · f i k (ti k ) f i k +1 (ti k+1 ) f i k +2 (ti k+2 ) · · · f n (tn ).

(4.12)

Here, ζ (i 1 , i 2 , . . . , i k ) denotes the element of G(i 1 , i 2 , . . . , i k ) which has only one element: the set {i 1 , i 2 , . . . , i k } with mark +1. Also, in the case where i k = n, the summation sign κ∈G(ik +1,ik +2,...,n) is supposed to be omiited, and so is κ in W (ζ (i 1 , i 2 , . . . , i k ) ∪ κ). By (4.12), ∂t ω⊗n , f 1 ⊗ · · · ⊗ f n

n λ(t)k−1 = k=1

f i1 (t) f i2 (t) · · · f ik (t)

{i 1 ,i 2 ,...,i k }⊂{1,2,...,n} i 1
×R( f 1 , f 2 , . . . , f i1 −1 )R( f i1 +1 , f i1 +2 , . . . , f i2 −1 ) · · · R( f ik−1 +1 , f ik−1 +2 , . . . , f ik −1 ) σ (dtik +1 )σ (dtik +2 ) · · · σ (dtn ) × n−i k κ∈G(i k +1,i k +2,...,n) T

×:W (κ)ω(tik+1 )ω(tik+2 ) · · · ω(tn ): f ik +1 (tik+1 ) f ik +2 (tik+2 ) · · · f n (tn ) n λ(t)k−1 f i1 (t) f i2 (t) · · · f ik (t) = k=1

{i 1 ,i 2 ,...,i k }⊂{1,2,...,n} i 1
×R( f 1 , f 2 , . . . , f i1 −1 )R( f i1 +1 , f i1 +2 , . . . , f i2 −1 ) · · · R( f ik−1 +1 , f ik−1 +2 , . . . , f ik −1 ) × ω⊗(n−ik ) , f ik +1 ⊗ f ik +2 ⊗ · · · ⊗ f n . By (4.11) and (4.13), the theorem follows.

(4.13)

We finish the paper with a discussion of a free differential equation satisfied by the cumulant generating function for a free Meixner class. In the one-dimensional case, Anshelevich [2] proved that the free cumulant generating function, $C_{\lambda, \eta}(z)$, satisfies the following second-order free differential equation:

\[
D^2 C_{\lambda, \eta}(z) = 1 + \lambda\, D C_{\lambda, \eta}(z) + \eta\, \big(D C_{\lambda, \eta}(z)\big)^2, \tag{4.14}
\]

where $D$ is the operator of free differentiation. Note that, by (1.5), $D C_{\lambda, \eta}(z) = z^{-1} C_{\lambda, \eta}(z)$ and $D^2 C_{\lambda, \eta}(z) = z^{-2} C_{\lambda, \eta}(z)$. In fact, Eq. (4.14) uniquely characterizes the corresponding free Meixner distribution. This result admits a generalization to the multivariate case, see Theorem 3 in [2], in particular, formula (14). So our aim now is to derive an infinite dimensional analog of Eq. (4.14).

We fix a free Meixner class. By [10, Cor. 4.1], for each $A \in \mathcal B_0(T)$ there exists a constant $C_7(A) > 0$ such that, for each $f \in \mathcal B_0(T)$ with $\operatorname{supp} f \subset A$ and $\|f\|_\infty < C_7(A)$, the free cumulant generating function is given by

\[
C(f) = \int_T \sigma(dt)\, 2 f^2(t) \left( 1 - \lambda(t) f(t) + \sqrt{(1 - \lambda(t) f(t))^2 - 4 f^2(t)\, \eta(t)} \right)^{-1}. \tag{4.15}
\]

Recall that, for a real-valued functional $F(f)$ on $\mathcal B_0(T)$ and $\varphi \in \mathcal B_0(T)$, the derivative of $F(f)$ in direction $\varphi$ is defined by

\[
D_\varphi F(f) := \frac{d}{d\varepsilon} F(f + \varepsilon \varphi) \Big|_{\varepsilon = 0}. \tag{4.16}
\]

If a function $\frac{\delta F(f)}{\delta f(t)} : T \to \mathbb R$ exists such that

\[
D_\varphi F(f) = \int_T \sigma(dt)\, \varphi(t)\, \frac{\delta F(f)}{\delta f(t)}, \quad \varphi \in \mathcal B_0(T),
\]

then $\frac{\delta F(f)}{\delta f(t)}$ is called the functional derivative of $F(f)$. Note that, even if $\frac{\delta F(f)}{\delta f(t)}$ does not exist as a function, in some cases one may interpret the functional derivative as a distribution. Analogously, one defines the second functional derivative, $\frac{\delta^2 F(f)}{\delta f(s)\, \delta f(t)}$, through

\[
D_\psi D_\varphi F(f) = \int_{T^2} \sigma(ds)\, \sigma(dt)\, \frac{\delta^2 F(f)}{\delta f(s)\, \delta f(t)}\, \psi(s)\, \varphi(t), \quad \varphi, \psi \in \mathcal B_0(T).
\]

Let us assume that a functional $F(f)$ has the form

\[
F(f) = \int_T \sigma(dt)\, \Upsilon(t, f(t)) \tag{4.17}
\]

with some function $\Upsilon : \mathbb R^2 \to \mathbb R$. Then, by (4.16),

\[
D_\varphi F(f) = \int_T \sigma(dt)\, D_z \Upsilon(t, z) \Big|_{z = f(t)}\, \varphi(t), \tag{4.18}
\]

where $D_z \Upsilon(t, z)$ denotes the derivative of the function $\Upsilon(t, z)$ in the $z$ variable. Hence, the first and second functional derivatives of $F(f)$ are

\[
\frac{\delta F(f)}{\delta f(t)} = D_z \Upsilon(t, z) \Big|_{z = f(t)}, \tag{4.19}
\]

\[
\frac{\delta^2 F(f)}{\delta f(s)\, \delta f(t)} = \delta(s, t)\, D_z^2 \Upsilon(t, z) \Big|_{z = f(t)}. \tag{4.20}
\]


By analogy, for $F(f)$ of the form (4.17), we define a free directional derivative by the formula (4.18) in which $D_z \Upsilon(t, z)$ now denotes the free derivative of the function $\Upsilon(t, z)$ in the $z$ variable:

\[
D_z \Upsilon(t, z) = \frac{\Upsilon(t, z) - \Upsilon(t, 0)}{z}. \tag{4.21}
\]
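As a sanity check on the free derivative and on Eq. (4.14), the sketch below works with the scalar integrand of (4.15), that is, with constant $\lambda$ and $\eta$ and a single point $z$ in place of $f(t)$. The helper names and sample values are hypothetical. On power series the free derivative simply drops the constant term and shifts the coefficients, and for the cumulant function it reduces to division by $z$, as noted after (4.14).

```python
import math


def free_derivative(coeffs):
    """Free derivative of sum_n coeffs[n] z^n: (f(z) - f(0)) / z shifts coefficients."""
    return coeffs[1:]


def cumulant(z, lam, eta):
    """Scalar integrand of (4.15): 2 z^2 / (1 - lam z + sqrt((1 - lam z)^2 - 4 eta z^2))."""
    root = math.sqrt((1.0 - lam * z) ** 2 - 4.0 * eta * z * z)
    return 2.0 * z * z / (1.0 - lam * z + root)


def residual_4_14(z, lam, eta):
    """D^2 C - (1 + lam DC + eta (DC)^2), using DC = C / z and D^2 C = C / z^2."""
    c = cumulant(z, lam, eta)
    dc = c / z
    d2c = c / (z * z)
    return d2c - (1.0 + lam * dc + eta * dc * dc)


if __name__ == "__main__":
    print(free_derivative([5, 0, 0, 1]))          # coefficients of D(5 + z^3) = z^2
    print(residual_4_14(0.1, lam=0.3, eta=0.5))   # should be close to zero
```

The residual vanishes up to rounding for $z$ small enough that the square root is real, consistent with the smallness condition $\|f\|_\infty < C_7(A)$.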

The first and second free functional derivatives are then given by (4.19), (4.20) with $D_z$ as in (4.21). Hence, by (4.15), the first and second free functional derivatives of $C(f)$ are

\[
\begin{aligned}
\frac{\delta C(f)}{\delta f(t)} &= 2 f(t) \left( 1 - \lambda(t) f(t) + \sqrt{(1 - \lambda(t) f(t))^2 - 4 f^2(t)\, \eta(t)} \right)^{-1}, \\
\frac{\delta^2 C(f)}{\delta f(s)\, \delta f(t)} &= \delta(s, t)\, 2 \left( 1 - \lambda(t) f(t) + \sqrt{(1 - \lambda(t) f(t))^2 - 4 f^2(t)\, \eta(t)} \right)^{-1}.
\end{aligned} \tag{4.22}
\]

Proposition 4.1. For each $A \in \mathcal B_0(T)$ and each $f \in \mathcal B_0(T)$ with $\operatorname{supp} f \subset A$ and $\|f\|_\infty < C_7(A)$, the cumulant generating function $C(f)$ satisfies the following free differential equation:

\[
\frac{\delta^2 C(f)}{\delta f(s)\, \delta f(t)} = \delta(s, t) \left[ 1 + \lambda(t)\, \frac{\delta C(f)}{\delta f(t)} + \eta(t) \left( \frac{\delta C(f)}{\delta f(t)} \right)^2 \right]. \tag{4.23}
\]

The above equation is understood as an equality of two signed measures on $(T^2, \mathcal B(T^2))$.

Proof. Equation (4.23) is equivalent to stating that, for any $\varphi, \psi \in \mathcal B_0(T)$,

\[
\int_T \sigma(dt)\, \psi(t)\, \varphi(t)\, D_z^2 C_{\lambda(t), \eta(t)}(z) \Big|_{z = f(t)}
= \int_T \sigma(dt)\, \psi(t)\, \varphi(t) \left[ 1 + \lambda(t)\, D_z C_{\lambda(t), \eta(t)}(z) \Big|_{z = f(t)} + \eta(t) \left( D_z C_{\lambda(t), \eta(t)}(z) \Big|_{z = f(t)} \right)^2 \right]. \tag{4.24}
\]

But (4.14) yields that, for each $t \in T$,

\[
D_z^2 C_{\lambda(t), \eta(t)}(z) = 1 + \lambda(t)\, D_z C_{\lambda(t), \eta(t)}(z) + \eta(t)\, \big( D_z C_{\lambda(t), \eta(t)}(z) \big)^2,
\]

from which (4.24) follows.

Acknowledgements. We would like to thank the referee for a careful reading of the manuscript and making very useful comments and suggestions. The research was partially supported by the International Joint Project grant 2008/R2 of the Royal Society. The authors also acknowledge the financial support of the SFB 701 “Spectral structures and topological methods in mathematics”, Bielefeld University. MB was partially supported by the Polish Ministry of Science and Higher Education, grant N N201 364436. EL was partially supported by the PTDC/MAT/67965/2006 grant, University of Madeira.


References

1. Anshelevich, M.: Free martingale polynomials. J. Funct. Anal. 201, 228–261 (2003)
2. Anshelevich, M.: Free Meixner states. Commun. Math. Phys. 276, 863–899 (2007)
3. Anshelevich, M.: Orthogonal polynomials with a resolvent-type generating function. Trans. Amer. Math. Soc. 360, 4125–4143 (2008)
4. Berezansky, Y.M., Lytvynov, E., Mierzejewski, D.A.: The Jacobi field of a Lévy process. Ukrainian Math. J. 55, 853–858 (2003)
5. Berezansky, Y.M., Sheftel, Z.G., Us, G.F.: Functional analysis. Vol. I, II. Basel: Birkhäuser Verlag, 1996
6. Bożejko, M., Bryc, W.: On a class of free Lévy laws related to a regression problem. J. Funct. Anal. 236, 59–77 (2006)
7. Biane, P., Speicher, R.: Stochastic calculus with respect to free Brownian motion and analysis on Wigner space. Probab. Theory Related Fields 112, 373–409 (1998)
8. Bożejko, M., Demni, N.: Generating functions of Cauchy–Stieltjes type for orthogonal polynomials. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 12, 91–98 (2009)
9. Bożejko, M., Leinert, M., Speicher, R.: Convolutions and limit theorems for conditionally free random variables. Pacific J. Math. 175, 357–388 (1996)
10. Bożejko, M., Lytvynov, E.: Meixner class of non-commutative generalized stochastic processes with freely independent values I. A characterization. Commun. Math. Phys. 292, 99–129 (2009)
11. Chihara, T.S.: An introduction to orthogonal polynomials. New York-London-Paris: Gordon and Breach Science Publishers, 1978
12. Hudson, R.L., Parthasarathy, K.R.: Quantum Ito’s formula and stochastic evolutions. Commun. Math. Phys. 93, 301–323 (1984)
13. Lytvynov, E.: Polynomials of Meixner’s type in infinite dimensions: Jacobi fields and orthogonality measures. J. Funct. Anal. 200, 118–149 (2003)
14. Lytvynov, E.: Orthogonal decompositions for Lévy processes with an application to the gamma, Pascal, and Meixner processes. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6, 73–102 (2003)
15. Lytvynov, E., Rodionova, I.: Lowering and raising operators for the free Meixner class of orthogonal polynomials. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 12, 387–399 (2009)
16. Meixner, J.: Orthogonale Polynomsysteme mit einem besonderen Gestalt der erzeugenden Funktion. J. London Math. Soc. 9, 6–13 (1934)
17. Parthasarathy, K.R.: An introduction to quantum stochastic calculus. Basel: Birkhäuser Verlag, 1992
18. Rodionova, I.: Analysis connected with generating functions of exponential type in one and infinite dimensions. Methods Funct. Anal. Topology 11, 275–297 (2005)
19. Saitoh, N., Yoshida, H.: The infinite divisibility and orthogonal polynomials with a constant recursion formula in free probability theory. Probab. Math. Statist. 21, 159–170 (2001)
20. Yosida, K.: Functional analysis. Sixth edition. Berlin-New York: Springer-Verlag, 1980

Communicated by Y. Kawahigashi

Commun. Math. Phys. 302, 453–476 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1159-8

Communications in

Mathematical Physics

On the Global Existence of Mild Solutions to the Boltzmann Equation for Small Data in $L^D$

Diogo Arsénio

Laboratoire Jacques-Louis Lions, Université Pierre et Marie Curie, Paris, France. E-mail: [email protected]

Received: 15 March 2010 / Accepted: 24 May 2010
Published online: 13 November 2010 – © Springer-Verlag 2010

Abstract: We develop a new theory of existence of global solutions to the Boltzmann equation for small initial data. These new mild solutions are analogous to the mild solutions for the Navier-Stokes equations. The existence comes as a result of the study of the competing phenomena of dispersion, due to the transport operator, and of singularity formation, due to the nonlinear Boltzmann collision operator. It is the joint use of the so-called dispersive estimates with new convolution inequalities on the gain term of the collision operator that allows one to obtain uniform bounds on the solutions and thus demonstrate the existence of solutions.

1. Mild Solutions

The Boltzmann equation is a model for gases in a low density regime. That is, a many particles system where we assume $N r^2 \approx 1$ as $N \to \infty$ and $r \to 0$, $N$ being the number of particles and $r$ their diameter. Denoting by $F(t, x, v) \ge 0$ the density function describing the microscopic state of the gas at time $t \in [0, \infty)$, position $x \in \Omega \subset \mathbb R^D$ and velocity $v \in \mathbb R^D$, where $D \ge 2$ is the dimension, then the initial value problem for Boltzmann’s equation reads as follows:

\[
\partial_t F + v \cdot \nabla_x F = Q(F, F), \qquad F(0, x, v) = F^{in}(x, v), \tag{1.1}
\]

for some initial density $F^{in} \ge 0$. The Boltzmann operator $Q(F, F)$ is a quadratic integral operator and accounts for the variations of the density $F$ due to the interparticle elastic collisions in the gas. It is defined by

\[
Q(F, G)(v) = \int_{\mathbb R^D} \int_{S^{D-1}} \big( F' G'_* - F G_* \big)\, b(v - v_*, \sigma)\, d\sigma\, dv_*, \tag{1.2}
\]

where the unit sphere $S^{D-1} \subset \mathbb R^D$ is endowed with its standard surface measure and we have denoted $F = F(v)$, $F' = F(v')$, $G_* = G(v_*)$ and $G'_* = G(v'_*)$, and

\[
v' = \frac{v + v_*}{2} + \frac{|v - v_*|}{2}\, \sigma, \qquad v'_* = \frac{v + v_*}{2} - \frac{|v - v_*|}{2}\, \sigma. \tag{1.3}
\]

It is natural to split $Q(F, G)$ into its gain part $Q^+(F, G)(v) = \int\!\!\int F' G'_*\, b\, d\sigma\, dv_*$ and its loss part $Q^-(F, G)(v) = \int\!\!\int F G_*\, b\, d\sigma\, dv_*$. One can show that the quadruple of pre-collisional and post-collisional velocities $(v, v_*, v', v'_*)$ parametrized by $\sigma \in S^{D-1}$ provides the family of all solutions to the system of $D + 1$ equations

\[
v + v_* = v' + v'_*, \qquad |v|^2 + |v_*|^2 = |v'|^2 + |v'_*|^2, \tag{1.4}
\]

which, at the kinetic level, expresses the fact that interparticle collisions are assumed elastic and thus conserve momentum and energy. The so-called collision kernel or collisional cross section $b(z, \sigma) \in L^1_{loc}(\mathbb R^D \times S^{D-1})$ is a nonnegative measurable function that is determined by the molecular forces that are being considered in the gas. In fact, by the Galilean invariance of collisions, the kernel only depends on the magnitude of the relative velocity $|v - v_*|$ and the deflection angle $\theta$ defined by $\cos\theta = \frac{v - v_*}{|v - v_*|} \cdot \sigma$, so that we will often write $b(v - v_*, \sigma) = b(|v - v_*|, \cos\theta)$ interchangeably, by abuse of language. Moreover, noticing that the change of variable $\sigma \mapsto -\sigma$ only exchanges $v'$ with $v'_*$, it is always possible to replace the collision kernel $b(z, \sigma)$ in the Boltzmann collision operator $Q(F, F)$ by its symmetrized version $[b(z, \sigma) + b(z, -\sigma)]\, 1_{\{\cos\theta \ge 0\}}$ without changing its nature. Therefore, without any loss of generality, we will always assume that the cross section is supported on $\{\cos\theta \ge 0\}$.

The Boltzmann equation is a cornerstone of collisional kinetic theory and its Cauchy problem has thus always attracted much interest. However, due to its nonlinear nature, constructing solutions to (1.1) remains a challenging problem, even though it has already been tackled with numerous strategies. Most notably, the DiPerna-Lions theory of renormalized solutions [11] provides the existence, globally in time, of solutions to (1.1) based solely on the physical a priori estimates and for large initial data. Unfortunately, the uniqueness of renormalized solutions is unknown. On the other hand, several local well-posedness results are available, of which we will only mention the weak solutions of Illner, Kaniel and Shinbrot [17,18] and the strong solutions of Guo [13,14]. In this work, we establish the existence (without uniqueness) for small initial data of a new class of weak solutions to (1.1), which we call mild solutions and are analogous to the mild solutions for the incompressible Navier-Stokes equations developed by Kato [19].

1.1. Strategy. Throughout this work, we will assume that the spatial domain is in fact the entire space $\mathbb R^D$, which will be crucial in order to allow for the mechanisms of transfer of integrability to act on the solutions as explained later on in Sect. 3. Considering any $f(t, x, v), g(t, x, v) \in L^1_{loc}(\mathbb R \times \mathbb R^D \times \mathbb R^D)$, it is well-known from the linear transport theory that

\[
(\partial_t + v \cdot \nabla_x) f(t, x, v) = g(t, x, v), \quad \text{in the sense of distributions} \tag{1.5}
\]

if and only if

\[
\partial_t \big( f(t, x + tv, v) \big) = g(t, x + tv, v), \quad \text{in the sense of distributions.} \tag{1.6}
\]

In particular, it holds that $f(t, x + tv, v) \in W^{1,1}_{loc}\big(\mathbb R; L^1_{loc}(\mathbb R^D \times \mathbb R^D)\big)$, which in turn implies (after possibly redefining $f(t, x + tv, v)$ on a set of measure zero) the time continuity $f(t, x + tv, v) \in C_{loc}\big(\mathbb R; L^1_{loc}(\mathbb R^D \times \mathbb R^D)\big)$ and thus allows one to give an unambiguous sense to weak solutions of the Cauchy problem for the transport equation (1.5). Considering then any specific initial value $f(0, x, v) = f^{in}(x, v)$ in the space $L^1_{loc}(\mathbb R^D \times \mathbb R^D)$, it holds that, for almost every $(t, x, v) \in \mathbb R \times \mathbb R^D \times \mathbb R^D$,

\[
f(t, x, v) = f^{in}(x - tv, v) + \int_0^t g\big(s, x - (t - s)v, v\big)\, ds. \tag{1.7}
\]
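Formula (1.7) can be tested numerically in a simplified setting. The sketch below is an illustration only: it works in one space and one velocity dimension (the text assumes $D \ge 2$), and the initial datum `f_in`, the source `g` and all helper names are hypothetical smooth choices. It evaluates the right-hand side of (1.7) by Simpson quadrature and checks by central finite differences that the result satisfies $\partial_t f + v\, \partial_x f = g$.

```python
import math


def f_in(x, v):
    """Hypothetical smooth initial datum."""
    return math.exp(-x * x - v * v)


def g(t, x, v):
    """Hypothetical smooth source term."""
    return math.exp(-t) * math.sin(x) * math.cos(v)


def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def duhamel(t, x, v):
    """Right-hand side of (1.7): free transport plus the integrated source."""
    free = f_in(x - t * v, v)
    forced = simpson(lambda s: g(s, x - (t - s) * v, v), 0.0, t)
    return free + forced


def transport_residual(t, x, v, h=1e-4):
    """Finite-difference approximation of (d/dt + v d/dx) f - g."""
    dt = (duhamel(t + h, x, v) - duhamel(t - h, x, v)) / (2.0 * h)
    dx = (duhamel(t, x + h, v) - duhamel(t, x - h, v)) / (2.0 * h)
    return dt + v * dx - g(t, x, v)


if __name__ == "__main__":
    print(duhamel(0.0, 0.3, 0.7), f_in(0.3, 0.7))  # initial condition is recovered
    print(transport_residual(1.0, 0.3, 0.7))       # small, limited by discretization
```

The residual is not exactly zero because both the quadrature and the finite differences are approximate; it shrinks as the step sizes are refined.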

This representation is merely Duhamel’s formula for the transport equation and is obtained by simply integrating (1.5) along its characteristic curves. For convenience, we will utilize the operators $\mathcal T$ and $\mathcal N$ defined by

\[
\mathcal T f^{in}(t, x, v) = f^{in}(x - tv, v) \quad \text{and} \quad \mathcal N g(t, x, v) = \int_0^t g\big(s, x - (t - s)v, v\big)\, ds, \tag{1.8}
\]

so that Duhamel’s formula may be written as $f = \mathcal T f^{in} + \mathcal N g$. Then, equipped with this integral formulation, it becomes simple to demonstrate a basic local well-posedness result for the Boltzmann equation (1.1). Indeed, by possibly assuming that the collision kernel is globally integrable, it is readily seen that the Boltzmann operator satisfies, for any given $T > 0$,

\[
\begin{aligned}
\big\| \mathcal N Q(F, F) - \mathcal N Q(G, G) \big\|_{L^\infty_t([0, T]; L^\infty_{x,v})}
&= \big\| \mathcal N Q(F, F - G) + \mathcal N Q(F - G, G) \big\|_{L^\infty_t([0, T]; L^\infty_{x,v})} \\
&\le T\, \|b\|_{L^1(\mathbb R^D \times S^{D-1})} \Big( \|F\|_{L^\infty_t([0, T]; L^\infty_{x,v})} + \|G\|_{L^\infty_t([0, T]; L^\infty_{x,v})} \Big) \\
&\quad \times \|F - G\|_{L^\infty_t([0, T]; L^\infty_{x,v})}.
\end{aligned} \tag{1.9}
\]

Consequently, the nonlinear operator $\mathcal K F = \mathcal T F^{in} + \mathcal N Q(F, F)$ is a well-defined contraction on the complete metric space $\{\|F\|_{L^\infty} \le R\}$ for some $R > 0$, provided $T > 0$ is small enough. It follows then from the Banach fixed point theorem that the above operator $\mathcal K$ has a unique fixed point, thus showing the local well-posedness of Boltzmann’s equation, i.e. the existence and uniqueness of weak solutions in $L^\infty_t([0, T]; L^\infty_{x,v})$ for some small time $T > 0$. It had already been noticed by Lions in [20] that such local well-posedness results could be proven in $L^\infty$ as a simple and direct way to construct solutions that could be applied to the weak-strong uniqueness principle for the Boltzmann equation demonstrated therein. As emphasized by Lions, we do not make any claim regarding the originality of this simple result.

It is then legitimate to attempt to extend this method of construction of mild solutions to other function spaces. Thus, we wish to find fixed points of the operator $\mathcal K$ in other functional settings such as the mixed Lebesgue spaces $L^p_t L^q_x L^r_v$, and the success of this approach therefore relies on obtaining norm preserving estimates similar to (1.9) for some mixed Lebesgue norm. It turns out that such stable controls are only available for the gain term $Q^+(F, G)$ of the Boltzmann operator, as demonstrated in Sect. 3.3, and it will therefore not be possible to prove contractivity properties for the operator $\mathcal K$, which is the reason why the methods employed here will fail to yield the uniqueness of solutions. The uniqueness may however be recovered in some situations with some mildly restrictive assumptions, which is currently under study.

The key idea in obtaining norm preserving estimates on the operator $F \mapsto \mathcal N Q^+(F, F)$ is based on the analysis of two distinct mechanisms of transfer of integrability. On the one hand, we have the dispersion estimates, which we present in Sect. 3.1 and were initially developed by Castella and Perthame in [9]. These estimates express the fact that densities are transported along the microscopic trajectories under the action of the transport operator. Loosely speaking, this dispersive character due to the action of the operator $\mathcal N$ will allow the transfer of some integrability from the velocity variable into the time and space variables. On the other hand, even though the nonlinear nature of the Boltzmann operator may lead to the formation of singularities and blowups, which is physically due to the collisions in the gas, we have a convoluting effect of the gain term of the Boltzmann operator whose action allows us to tame and control the singularities with appropriate norms. These effects are expressed through new convolution inequalities on the Boltzmann operator exposed in Sect. 3.2, which will let us transfer the integrability back into the velocity variable and thus obtain norm preserving estimates. In fact, the gain term is already known to possess some convoluting properties as in [1,12,15,16,24] and even to provide smoothing effects as seen in [7,20,21,27].

Thus, what will be possible to achieve is in fact the stability of a complete metric space $\{\|F\|_{L^p_t L^q_x L^r_v} \le R\}$ for some $R > 0$ under the action of the operator $\mathcal K$. This will provide the weak relative compactness of the solutions $F_n$, $n \in \mathbb N$, to an approximate Boltzmann equation $F_n = \mathcal K_n F_n$, where $\mathcal K_n$ is a truncated operator where the nonlinearity is tamed. By a compactness argument, we will then let $n$ tend to infinity and thus recover a fixed point $F = \mathcal K F$ in the limit, thus yielding the existence of mild solutions for Boltzmann’s equation in the small, i.e. for small time or small initial values. In fact, we will focus on obtaining existence results valid globally in time for small initial data, but it will be clear that the same method may be applied to yield the existence of solutions locally in time and for large initial data.

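2. Global Existence for Small Initial Data

The main result in this work is expressed in the following theorem.

Theorem 2.1. Let the dimension be D = 2 or D = 3 only and let $2 < \lambda_0 \le 3$ be a fixed parameter. Let the cross section $b(z,\sigma) \ge 0$ satisfy
\[
b(z,\sigma) \in L_z^{\frac{D}{D-\left(1+\frac{1}{\alpha}\right)}}\left(\mathbb{R}^D; L^1_\sigma\left(\mathbb{S}^{D-1}\right)\right) \quad \text{for some } \alpha \ge \lambda_0 \text{ such that } \alpha > D,
\]
and
\[
b(z,\sigma) = b(|z|,\cos\theta) \le a_0(|z|)\, b_0(\cos\theta), \quad \text{for some } a_0(|z|) \in L_z^{\frac{D}{D-2}}\left(\mathbb{R}^D\right) \text{ and some } \frac{b_0(\cos\theta)}{\sin^{\frac{\lambda_0+1}{2\lambda_0}}\frac{\theta}{2}} \in L^1_\sigma\left(\mathbb{S}^{D-1}\right). \tag{2.1}
\]
Then, there exists a constant $C_0 > 0$ such that for every initial data satisfying
\[
F^{in}(x,v) \in L^D_{x,v}\left(\mathbb{R}^D \times \mathbb{R}^D\right), \quad F^{in} \ge 0 \quad \text{and} \quad \left\|F^{in}\right\|_{L^D_{x,v}} \|a_0\|_{L_z^{\frac{D}{D-2}}} \left\| \frac{b_0(\cos\theta)}{\sin^{\frac{\lambda_0+1}{2\lambda_0}}\frac{\theta}{2}} \right\|_{L^1_\sigma} < C_0, \tag{2.2}
\]
there exists a weak solution F(t, x, v) to the Boltzmann equation (1.1) with initial data $F^{in}$ and satisfying
\[
F \in L^\lambda_t\left([0,\infty); L_x^{D\frac{\lambda}{\lambda-1}}\left(\mathbb{R}^D; L_v^{D\frac{\lambda}{\lambda+1}}\left(\mathbb{R}^D\right)\right)\right) \quad \text{for all } \lambda_0 \le \lambda \le \infty \text{ such that } \lambda > D. \tag{2.3}
\]
Furthermore, the Boltzmann collision operator satisfies
\[
Q(F,F) \in L_t^{\frac{\alpha}{2}}\left([0,\infty); L_x^{\frac{D\alpha}{2(\alpha-1)}} L_v^{\frac{D\alpha}{\alpha+1}}\right). \tag{2.4}
\]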
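The exponents appearing in (2.3) are tied to the dispersion estimate of Proposition 3.3 (Sect. 3.1), used with a = D and q = λ. The following is a minimal sketch, in exact rational arithmetic, of that bookkeeping; the function names are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

def solution_space_exponents(D, lam):
    """Return (1/q, 1/p, 1/r) for L^lam_t L^{D lam/(lam-1)}_x L^{D lam/(lam+1)}_v."""
    lam = Fraction(lam)
    inv_q = 1 / lam                   # time exponent q = lam
    inv_p = (lam - 1) / (D * lam)     # space exponent p = D*lam/(lam-1)
    inv_r = (lam + 1) / (D * lam)     # velocity exponent r = D*lam/(lam+1)
    return inv_q, inv_p, inv_r

def satisfies_dispersion_relations(D, lam):
    """Check 2/q = D(1/r - 1/p), 2/a = 1/p + 1/r and a < q, with a = D."""
    inv_q, inv_p, inv_r = solution_space_exponents(D, lam)
    inv_a = Fraction(1, D)
    return (2 * inv_q == D * (inv_r - inv_p)
            and 2 * inv_a == inv_p + inv_r
            and inv_a > inv_q)        # a < q is the restriction lam > D
```

For instance, `satisfies_dispersion_relations(3, 3)` fails only because of the strict inequality a < q, which is the restriction λ > D in (2.3).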

We provide the demonstration of the above theorem in Sect. 4. A few comments are now in order:

1. First of all, as explained in Sect. 1.1, notice that the above weak solutions enjoy, at the very least, the temporal continuity along the characteristics
\[
F(t, x+tv, v) \in C_{loc}\left([0,\infty); L^1_{loc}\left(\mathbb{R}^D \times \mathbb{R}^D\right)\right), \tag{2.5}
\]
so that the initial value problem makes sense.

2. Note that the integrability assumption on the angular kernel $b_0$ in (2.2) becomes more stringent as the parameter $\lambda_0$ tends closer to two. On the other hand, a smaller value for $\lambda_0$ yields more spaces characterizing the weak solution in (2.3). In three dimensions, it is pointless to choose any value other than $\lambda_0 = 3$ because of the restriction λ > D. However, in two dimensions, if one is willing to impose stricter conditions on the angular kernel $b_0$, it is possible to enlarge the existence spaces in (2.3) by letting $\lambda_0$ approach the value 2.

3. The integrability assumption (2.1) on the collision kernel makes the above existence result only suitable for cross sections that have a decaying kinetic kernel at infinity. However, it allows for some local integrable singularities. Thus, examples of kernels suitable for the application of the above theorem are given by
\[
b(|z|,\cos\theta) = \frac{1}{|z|^A (1+|z|)^{B-A}} \sin^C\theta \, \mathbf{1}_{\{\cos\theta \ge 0\}} \tag{2.6}
\]
and
\[
b(|z|,\cos\theta) = \left( \chi(|z|) \frac{1}{|z|^A} + (1-\chi)(|z|) \frac{1}{|z|^B} \right) \sin^C\theta \, \mathbf{1}_{\{\cos\theta \ge 0\}}, \tag{2.7}
\]
for any $A < D-2$, $B > D - \left(1+\frac{1}{\alpha}\right)$ and $C > \frac{\lambda_0+1}{2\lambda_0} - (D-1)$, where $\chi(|z|)$ is a cutoff function satisfying $\mathbf{1}_{\{|z| \le 1\}} \le \chi(|z|) \le \mathbf{1}_{\{|z| \le 2\}}$, say. These examples remain rather artificial as they are not physical (see [26] for more details on cross sections).


D. Arsénio

4. Even though the above results provide the basic framework for understanding the mechanisms of transfer of integrability between the transport and collision operators, it is clear that they can be largely improved. For instance, by using spaces with weights and regularity, as was done in the homogeneous case by Mouhot and Villani in [24], it should be possible to extend the well-posedness results to more function spaces and, by the same token, considerably relax the assumptions on the cross section.

5. The smallness condition (2.2) on the norm $L^D_{x,v}$ of the initial data illustrates the existing competition between the dispersive effects of the transport operator and the singularity formations due to collisions. Indeed, suppose that the initial data has finite mass, i.e. finite $L^1_{x,v}$ norm, no matter how large. Then, loosely speaking, it is still possible to ensure that the $L^D_{x,v}$ norm is arbitrarily small by sufficiently spreading this mass about the whole space so that the inequality (2.2) is satisfied. Thus, the condition (2.2) guarantees that the ensuing dispersion may overcome the large amount of collisions measured by the norm of the collision kernel.

6. The significance of these existence results lies in the fact that they yield global solutions for a very large class of initial data. Indeed, only an integrability condition without any regularity or pointwise bound is imposed on the initial value. Furthermore, there is no need to renormalize the equation.

7. Some major drawbacks have to be acknowledged. Indeed, the existence is only true for small initial data and necessitates stringent hypotheses on the collision kernel. However, due to the sometimes crude methods of proof we employed, it seems that the results are not yet optimal. Our methods are based on the splitting of the Boltzmann operator into its gain and its loss part and thus do not take advantage of the cancellation properties between these two components. Still, this work truly exhibits the mechanisms of transfer of integrability between the dispersive effects of the transport operator and the convoluting effects of the Boltzmann operator, thus leaving much room for interesting research perspectives. Finally, it is crucial to consider the whole space in order to exploit the dispersive effects and so it seems difficult to adapt the present methods to other spatial domains.

8. We have stated Theorem 2.1 as an existence result near vacuum. However, it is possible to prove a similar existence assertion near a Maxwellian equilibrium M(x − tv, v), i.e. M(x, v) is a Gaussian distribution in both variables x and v, by using the fact that
\[
Q(F,F) = Q(F,F) - Q(M,M) = Q(F-M,F-M) + Q(F-M,M) + Q(M,F-M), \tag{2.8}
\]
and adapting the proof where necessary. This is precisely the kind of mild solution that may be used to perform hydrodynamic limits of the Boltzmann equation. This is currently under study.

9. Notice the similarity of the present theory with the mild solutions for the incompressible Navier-Stokes equations built by Kato [19], which are also set in an $L^D$ context, where D is the space dimension. It is strongly expected that a link through hydrodynamic limits exists between the mild solutions of the Boltzmann and Navier-Stokes equations.

10. It is possible to obtain similar existence results for large initial data but for a small time of existence. Moreover, one can show, at least in some cases, the uniqueness of mild solutions by exploiting the global conservation of mass, which is a natural a


priori estimate on the solutions of the Boltzmann equation. This is currently under study.

11. With an elementary Picard iteration scheme and using the same controls on the gain term of the Boltzmann collision operator that are utilized in the proof of Theorem 2.1, it is possible to prove a global existence and uniqueness result for the gain-term-only Boltzmann equation for small initial data. This result is not in contradiction with the blowup results for the same equation from [2]. Furthermore, the ensuing weak solutions provide very convenient candidates for the so-called beginning condition for the Kaniel-Shinbrot iteration scheme from [17,18].

3. Transfer of Integrability

In this section, we first expose the mechanisms of transfer of integrability in Sects. 3.1 and 3.2 and then draw some consequences in the form of norm preserving estimates in Sect. 3.3.

3.1. Dispersion estimates. We recall here the so-called dispersion estimates developed by Castella and Perthame in [9], which are reminiscent of the well-known Strichartz estimates for the wave, Klein-Gordon and Schrödinger equations from [25]. We also provide proofs of these results for the convenience of the reader. One may also consult [8] for a clear exposition of the subject.

Lemma 3.1. Let $1 \le r \le p \le \infty$ and $f \in L^r_v\left(\mathbb{R}^D; L^p_x\left(\mathbb{R}^D\right)\right)$. Then,
\[
\|f(x,v)\|_{L^p_x L^r_v} \le \|f(x,v)\|_{L^r_v L^p_x}. \tag{3.1}
\]

Proof. The above assertion is readily verified in the case p = r and in the case p = ∞. We conclude by applying the Riesz-Thorin interpolation theorem for mixed Lebesgue spaces (see [4, p. 316]).

Proposition 3.2. Let $1 \le r \le p \le \infty$ and $f \in L^r_x\left(\mathbb{R}^D; L^p_v\left(\mathbb{R}^D\right)\right)$. Then, for any $t \ne 0$, we have that
\[
\|f(x-tv,v)\|_{L^p_x L^r_v} \le \frac{1}{|t|^{D\left(\frac{1}{r}-\frac{1}{p}\right)}} \|f(x,v)\|_{L^r_x L^p_v}. \tag{3.2}
\]

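Proof. Using first the change of variable $v = \frac{x-y}{t}$ and then using Lemma 3.1, we infer
\[
\|f(x-tv,v)\|_{L^p_x L^r_v} = |t|^{-\frac{D}{r}} \left\| f\left(y, \tfrac{x-y}{t}\right) \right\|_{L^p_x L^r_y} \le |t|^{-\frac{D}{r}} \left\| f\left(y, \tfrac{x-y}{t}\right) \right\|_{L^r_y L^p_x} = |t|^{-D\left(\frac{1}{r}-\frac{1}{p}\right)} \|f(y,x)\|_{L^r_y L^p_x}, \tag{3.3}
\]
which concludes the proof of the proposition.

Proposition 3.3. Let $1 \le a, p, q, r \le \infty$ be such that
\[
\frac{2}{q} = D\left(\frac{1}{r}-\frac{1}{p}\right), \quad \frac{2}{a} = \frac{1}{p}+\frac{1}{r} \quad \text{and} \quad a < q. \tag{3.4}
\]
Then, for any $f \in L^a_{x,v}\left(\mathbb{R}^D \times \mathbb{R}^D\right)$, we have that
\[
\|f(x-tv,v)\|_{L^q_t\left(\mathbb{R}; L^p_x L^r_v\right)} \le C \|f(x,v)\|_{L^a_{x,v}}, \tag{3.5}
\]
for some fixed constant C > 0.

Proof. First, let us assume that a = 2, so that q > 2 by (3.4). It follows that $r = p'$. Then, for any $g \in L^{q'}_t L^{p'}_x L^{p}_v$, we have that
\[
\begin{aligned}
\left\| \int_{\mathbb{R}} g(t, x+tv, v)\, dt \right\|^2_{L^2_{x,v}}
&= \int\!\!\int \int_{\mathbb{R}}\int_{\mathbb{R}} g(t, x+tv, v)\, g(s, x+sv, v)\, dt\, ds\, dx\, dv \\
&= \int\!\!\int \int_{\mathbb{R}} g(t,x,v) \int_{\mathbb{R}} g(s, x-(t-s)v, v)\, ds\, dt\, dx\, dv \\
&\le \|g(t,x,v)\|_{L^{q'}_t L^{p'}_x L^{p}_v} \left\| \int_{\mathbb{R}} g(s, x-(t-s)v, v)\, ds \right\|_{L^{q}_t L^{p}_x L^{p'}_v}.
\end{aligned} \tag{3.6}
\]
Next, using Proposition 3.2 and the classical Hardy-Littlewood-Sobolev inequality, we deduce
\[
\begin{aligned}
\left\| \int_{\mathbb{R}} g(s, x-(t-s)v, v)\, ds \right\|_{L^{q}_t L^{p}_x L^{p'}_v}
&\le \left\| \int_{\mathbb{R}} \|g(s, x-(t-s)v, v)\|_{L^{p}_x L^{p'}_v}\, ds \right\|_{L^q_t} \\
&\le \left\| \int_{\mathbb{R}} |t-s|^{-\frac{2}{q}} \|g(s,x,v)\|_{L^{p'}_x L^{p}_v}\, ds \right\|_{L^q_t}
\le C \|g(t,x,v)\|_{L^{q'}_t L^{p'}_x L^{p}_v},
\end{aligned} \tag{3.7}
\]
where C > 0 is a fixed constant. Note that the use above of the classical Hardy-Littlewood-Sobolev inequality is valid because it is assumed that $\frac{2}{q} < 1$. Combining now the estimates (3.6) and (3.7), we obtain
\[
\left\| \int_{\mathbb{R}} g(t, x+tv, v)\, dt \right\|_{L^2_{x,v}} \le C^{\frac{1}{2}} \|g(t,x,v)\|_{L^{q'}_t L^{p'}_x L^{p}_v}. \tag{3.8}
\]
We can now turn to estimating the norm of f(x − tv, v). Using (3.8), we infer
\[
\begin{aligned}
\left| \int\!\!\int \int_{\mathbb{R}} f(x-tv,v)\, g(t,x,v)\, dt\, dx\, dv \right|
&= \left| \int\!\!\int f(x,v) \int_{\mathbb{R}} g(t, x+tv, v)\, dt\, dx\, dv \right| \\
&\le \|f(x,v)\|_{L^2_{x,v}} \left\| \int_{\mathbb{R}} g(t, x+tv, v)\, dt \right\|_{L^2_{x,v}} \\
&\le C^{\frac{1}{2}} \|f(x,v)\|_{L^2_{x,v}} \|g(t,x,v)\|_{L^{q'}_t L^{p'}_x L^{p}_v}.
\end{aligned} \tag{3.9}
\]
Taking the supremum over all $g \in L^{q'}_t L^{p'}_x L^{p}_v$ in (3.9) permits us to conclude
\[
\|f(x-tv,v)\|_{L^{q}_t L^{p}_x L^{p'}_v} \le C^{\frac{1}{2}} \|f(x,v)\|_{L^2_{x,v}}, \tag{3.10}
\]
thus proving that (3.5) holds in the case a = 2.

If $a \ne 2$, it suffices to apply inequality (3.10) to $|f|^{\frac{a}{2}}$ to deduce
\[
\|f(x-tv,v)\|_{L^q_t L^p_x L^r_v} = \left\| |f(x-tv,v)|^{\frac{a}{2}} \right\|^{\frac{2}{a}}_{L_t^{\frac{2q}{a}} L_x^{\frac{2p}{a}} L_v^{\frac{2r}{a}}} \le C^{\frac{1}{a}} \left\| |f(x,v)|^{\frac{a}{2}} \right\|^{\frac{2}{a}}_{L^2_{x,v}} = C^{\frac{1}{a}} \|f(x,v)\|_{L^a_{x,v}}, \tag{3.11}
\]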

which concludes the proof of the proposition.

Proposition 3.4. Let $1 \le k, l, p, q, r \le \infty$ be such that
\[
\frac{1}{q} = D\left(\frac{1}{r}-\frac{1}{p}\right) \quad \text{and} \quad 1 < l < k < \infty, \quad 1+\frac{1}{k} = \frac{1}{q}+\frac{1}{l}. \tag{3.12}
\]
Then, for any $g \in L^l_t\left(\mathbb{R}; L^r_x\left(\mathbb{R}^D; L^p_v\left(\mathbb{R}^D\right)\right)\right)$, we have the following estimate:
\[
\left\| \int_0^t g(s, x-(t-s)v, v)\, ds \right\|_{L^k_t L^p_x L^r_v} \le C \|g(t,x,v)\|_{L^l_t L^r_x L^p_v}, \tag{3.13}
\]
for some fixed constant C > 0.

Proof. First, using Proposition 3.2, we estimate
\[
\left\| \int_0^t g(s, x-(t-s)v, v)\, ds \right\|_{L^p_x L^r_v} \le \int_0^t \|g(s, x-(t-s)v, v)\|_{L^p_x L^r_v}\, ds \le \int_0^t (t-s)^{-\frac{1}{q}} \|g(s,x,v)\|_{L^r_x L^p_v}\, ds. \tag{3.14}
\]
Next, applying the Hardy-Littlewood-Sobolev inequality to the last integral in (3.14), we infer
\[
\left\| \int_0^t (t-s)^{-\frac{1}{q}} \|g(s,x,v)\|_{L^r_x L^p_v}\, ds \right\|_{L^k_t} \le C \|g(t,x,v)\|_{L^l_t L^r_x L^p_v}, \tag{3.15}
\]

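which concludes the proof.

Proposition 3.5. Let $1 \le l, p, q, r \le \infty$ be such that
\[
\frac{1}{q} = D\left(\frac{1}{r}-\frac{1}{p}\right), \quad 1 = \frac{1}{l}+\frac{1}{2q}, \quad 1 < q \quad \text{and} \quad \frac{1}{p}+\frac{1}{r} \le 1. \tag{3.16}
\]
Then, for any $g \in L^l_t\left(\mathbb{R}; L^r_x\left(\mathbb{R}^D; L^p_v\left(\mathbb{R}^D\right)\right)\right)$, we have the following estimate:
\[
\left\| \int_0^t g(s, x-(t-s)v, v)\, ds \right\|_{L^\infty_t L_{x,v}^{\frac{2pr}{p+r}}} \le C \|g(t,x,v)\|_{L^l_t L^r_x L^p_v}, \tag{3.17}
\]
for some fixed constant C > 0.

Proof. First, using Hölder's inequality and Proposition 3.2, we find that
\[
\begin{aligned}
\left\| \int_0^t g(s, x-(t-s)v, v)\, ds \right\|_{L_{x,v}^{\frac{2pr}{p+r}}}
&= \left\| \int_0^t\!\!\int_0^t g(s, x-(t-s)v, v)\, g(u, x-(t-u)v, v)\, ds\, du \right\|^{\frac{1}{2}}_{L_{x,v}^{\frac{pr}{p+r}}} \\
&\le \left( \int_0^t\!\!\int_0^t \left\| g(s, x-(t-s)v, v)\, g(u, x-(t-u)v, v) \right\|_{L_{x,v}^{\frac{pr}{p+r}}} ds\, du \right)^{\frac{1}{2}} \\
&= \left( \int_0^t\!\!\int_0^t \left\| g(s,x,v)\, g(u, x-(s-u)v, v) \right\|_{L_{x,v}^{\frac{pr}{p+r}}} ds\, du \right)^{\frac{1}{2}} \\
&\le \left( \int_0^t\!\!\int_0^t \|g(s,x,v)\|_{L^r_x L^p_v} \|g(u, x-(s-u)v, v)\|_{L^p_x L^r_v}\, du\, ds \right)^{\frac{1}{2}} \\
&\le \left( \int_0^t\!\!\int_0^t \|g(s,x,v)\|_{L^r_x L^p_v} |s-u|^{-\frac{1}{q}} \|g(u,x,v)\|_{L^r_x L^p_v}\, du\, ds \right)^{\frac{1}{2}}.
\end{aligned} \tag{3.18}
\]
If q < ∞, applying the classical Hardy-Littlewood-Sobolev inequality to the last integral of (3.18) yields
\[
\left\| \int_0^t g(s, x-(t-s)v, v)\, ds \right\|_{L_{x,v}^{\frac{2pr}{p+r}}} \le C \|g(s,x,v)\|_{L_s^{(2q)'} L^r_x L^p_v}, \tag{3.19}
\]
where C > 0 is a constant. On the other hand, if q = ∞, the same holds true by direct computation, which concludes the proof.

3.2. Convolution inequalities. Recall that the gain and loss operators, $Q^+$ and $Q^-$ respectively, are defined by
\[
\begin{aligned}
Q^+(F,G) &= \int_{\mathbb{R}^D}\int_{\mathbb{S}^{D-1}} F' G'_*\, b(v-v_*,\sigma)\, d\sigma\, dv_*, \\
Q^-(F,G) &= \int_{\mathbb{R}^D}\int_{\mathbb{S}^{D-1}} F G_*\, b(v-v_*,\sigma)\, d\sigma\, dv_*.
\end{aligned} \tag{3.20}
\]
Employing the well-known pre-post-collisional change of variables, which merely permutes $(v, v_*)$ and $(v', v'_*)$ and has unit Jacobian, these operators become, in their Maxwellian (weak) formulation,
\[
\begin{aligned}
\int_{\mathbb{R}^D} Q^+(F,G)\varphi\, dv &= \int_{\mathbb{R}^D}\int_{\mathbb{R}^D}\int_{\mathbb{S}^{D-1}} F(v) G(v_*) \varphi(v')\, b(v-v_*,\sigma)\, d\sigma\, dv_*\, dv, \\
\int_{\mathbb{R}^D} Q^-(F,G)\varphi\, dv &= \int_{\mathbb{R}^D}\int_{\mathbb{R}^D}\int_{\mathbb{S}^{D-1}} F(v) G(v_*) \varphi(v)\, b(v-v_*,\sigma)\, d\sigma\, dv_*\, dv.
\end{aligned} \tag{3.21}
\]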


Clearly, the loss operator acts as a convolution in the variables $v$ and $v_*$. Thus, defining $a(z) = \int_{\mathbb{S}^{D-1}} b(z,\sigma)\, d\sigma$ and using Young's inequality, one easily finds that
\[
\left\| Q^-(F,G) \right\|_{L^s_v} \le \left\| \int_{\mathbb{R}^D} F(v) G(v_*)\, a(v-v_*)\, dv_* \right\|_{L^s_v} \le \|F\|_{L^p} \|G\|_{L^q} \|a\|_{L^r}, \tag{3.22}
\]
where $1+\frac{1}{s} = \frac{1}{p}+\frac{1}{q}+\frac{1}{r}$ and $s \le p$. Since $Q^-(F,G)$ is merely a product between $a * G$ and F, it is not possible to improve the integrability of F for parameters s > p. Otherwise, we would be able to choose some fixed a and G so that there exists $\varphi \in C_0^\infty\left(\mathbb{R}^D\right)$ satisfying $0 \le \varphi \le a * G$, thus yielding $\|F\varphi\|_{L^s} \le C\|F\|_{L^p}$, where $C = \|G\|_{L^q}\|a\|_{L^r}$, which is obviously a contradiction when s > p as soon as φ is not trivial.

At first, due to its intricate nature, it is unclear whether $Q^+$ will satisfy an identical estimate or not. In fact, $Q^+$ bears a much nicer structure since it is known to have some convoluting effects (see [1,12,15,16,24]) and even to provide a gain of regularity (see [7,20,21,27]) and so, it is reasonable to hope for a similar inequality to hold. It turns out that a slight modification of the above argument shows that a convolution inequality holds for the gain operator as well. Indeed, writing the Maxwellian formulation and using Hölder's and Young's inequalities, we deduce
\[
\begin{aligned}
\left| \int_{\mathbb{R}^D} Q^+(F,G)(v)\varphi(v)\, dv \right|
&\le \int_{\mathbb{R}^D}\int_{\mathbb{R}^D}\int_{\mathbb{S}^{D-1}} \left| F G_* \varphi' \right| b(v-v_*,\sigma)\, d\sigma\, dv_*\, dv \\
&\le \int\!\!\int\!\!\int \left| F(v) G(v-V)\, \varphi\left(v - \tfrac{V}{2} + \tfrac{|V|}{2}\sigma\right) \right| b(V,\sigma)\, dV\, dv\, d\sigma \\
&\le \|\varphi\|_{L^{s'}} \int \left\| F(v) G(v-V) \right\|_{L^s_v} a(V)\, dV \\
&\le \|\varphi\|_{L^{s'}} \left\| F(v) G(v-V) \right\|_{L^{r'}_V L^s_v} \|a\|_{L^r} \\
&\le \|\varphi\|_{L^{s'}} \|F\|_{L^p} \|G\|_{L^q} \|a\|_{L^r},
\end{aligned} \tag{3.23}
\]
where $1 \le s \le p, q \le r' \le \infty$ and $1+\frac{1}{s} = \frac{1}{p}+\frac{1}{q}+\frac{1}{r}$. Taking the supremum in (3.23) over all $\varphi \in L^{s'}$, we conclude
\[
\left\| Q^+(F,G) \right\|_{L^s} \le \|F\|_{L^p} \|G\|_{L^q} \|a\|_{L^r}, \qquad \left\| Q^-(F,G) \right\|_{L^s} \le \|F\|_{L^p} \|G\|_{L^q} \|a\|_{L^r}, \tag{3.24}
\]
for any $1 \le p, q, r, s \le \infty$ such that $1+\frac{1}{s} = \frac{1}{p}+\frac{1}{q}+\frac{1}{r}$ and $s \le p, q \le r'$.

However, this simple argument retains the restriction on the parameters $s \le p, q$ for $Q^+$, which is absolutely not sufficient in order to carry out our arguments on the mechanisms of transfer of integrability leading to the existence of mild solutions. It is fortunate that a better convolution inequality for the gain term $Q^+$ including the full range of parameters $1 \le p, q, r, s \le \infty$ is available, as shown in Proposition 3.6 below. Its validity comes however under some further integrability condition on the angular collision kernel. It is to be emphasized that this extension of the parameters range constitutes the originality of this new inequality. Even though it may seem of rather technical nature, it is crucial in our work and its demonstration remains elementary since it merely involves the utilization of Hölder's inequality and some changes of variables (just as Young's convolution inequality is a consequence of Hölder's inequality).


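Finally, we remark that the convoluting nature of the gain term had been noticed in several previous studies, especially in the works of Gustafsson [15,16], Mouhot and Villani [24], and Duduchava, Kirsch and Rjasanow [12]. Simultaneously to our work, another similar result has been obtained independently by Alonso and Carneiro [1]. However, none of these results included the whole parameter range for p, q, r, s in (3.24) for $Q^+$, since they regarded the cross section as a weight rather than as an element of the convolution.

Proposition 3.6. Let $b(z,\sigma) \ge 0$ be a collision kernel satisfying
\[
b(z,\sigma) = b(|z|,\cos\theta) = a_0(|z|)\, b_0(\cos\theta), \tag{3.25}
\]
where $\cos\theta = \frac{z}{|z|}\cdot\sigma$, and let the parameters $1 \le p, q, r, s \le \infty$ be such that
\[
\frac{1}{p}+\frac{1}{q}+\frac{1}{r} = 1+\frac{1}{s}. \tag{3.26}
\]
Then,
\[
\left\| Q^+(F,G) \right\|_{L^s} \le 2^{\frac{D}{2}} C_0 \|F\|_{L^p} \|G\|_{L^q} \|a_0\|_{L^r}, \tag{3.27}
\]
where
\[
C_0 = \int_{\mathbb{S}^{D-1}} b_0(\cos\theta)\, d\sigma \ \text{ if } s \le p, \quad \text{and} \quad C_0 = \int_{\mathbb{S}^{D-1}} \frac{b_0(\cos\theta)}{\sin^{D\left(\frac{1}{p}-\frac{1}{s}\right)}\frac{\theta}{2}}\, d\sigma \ \text{ if } s \ge p. \tag{3.28}
\]

Proof. Let $\varphi \in L^{s'}_v\left(\mathbb{R}^D\right)$. By duality and using the collisional changes of variables, we will need to control
\[
\begin{aligned}
\int Q^+(F,G)(v)\varphi(v)\, dv
&= \int F' G'_*\, \varphi(v)\, a_0(v-v_*)\, b_0(\cos\theta)\, d\sigma\, dv\, dv_* \\
&= \int F(v) G(v_*)\, a_0(v-v_*)\, \varphi(v')\, b_0(\cos\theta)\, d\sigma\, dv\, dv_* \\
&= \int F(v) G(v_*)\, a_0(v-v_*)\, \Phi_0(v,v_*)\, dv\, dv_*,
\end{aligned} \tag{3.29}
\]
where we have written $\Phi_0(v,v_*) = \int_{\mathbb{S}^{D-1}} \varphi(v')\, b_0(\cos\theta)\, d\sigma$. To this end, we employ the set of parameters $1 \le p_1, p_2, p_3, p_4, p_5, p_6 \le \infty$ given by Lemma 3.7 with s replaced by $s'$ and we define
\[
\begin{aligned}
\alpha_1 &= \frac{p}{p_1}, & \alpha_2 &= \frac{p}{p_2}, & \alpha_3 &= \frac{p}{p_3}, \\
\beta_1 &= \frac{q}{p_1}, & \beta_2 &= \frac{q}{p_4}, & \beta_3 &= \frac{q}{p_5}, \\
\gamma_1 &= \frac{r}{p_2}, & \gamma_2 &= \frac{r}{p_4}, & \gamma_3 &= \frac{r}{p_6}, \\
\rho_1 &= \frac{s'}{p_3}, & \rho_2 &= \frac{s'}{p_5}, & \rho_3 &= \frac{s'}{p_6},
\end{aligned} \tag{3.30}
\]
so that $\alpha_1+\alpha_2+\alpha_3 = 1$, $\beta_1+\beta_2+\beta_3 = 1$, $\gamma_1+\gamma_2+\gamma_3 = 1$ and $\rho_1+\rho_2+\rho_3 = 1$. Furthermore, in accordance with that lemma, we may choose $\frac{1}{p_3} = \max\left\{0, \frac{1}{p}-\frac{1}{s}\right\}$. Then, defining an auxiliary kernel
\[
b_1(\cos\theta) = \frac{b_0(\cos\theta)}{\sin^{\frac{D}{p_3}}\frac{\theta}{2}} \tag{3.31}
\]
and writing
\[
\begin{aligned}
c_0 &= \left( \int_{\mathbb{S}^{D-1}} b_1(\cos\theta)\, d\sigma \right)^{\frac{1}{s'}}, \\
\Phi_1(v,v_*) &= c_0^{\frac{s'}{s}} \left( \int_{\mathbb{S}^{D-1}} \left|\varphi(v')\right|^{s'} \sin^D\frac{\theta}{2}\, b_1(\cos\theta)\, d\sigma \right)^{\frac{1}{s'}}, \\
\Phi_2(v,v_*) = \Phi_3(v,v_*) &= c_0^{\frac{s'}{s}} \left( \int_{\mathbb{S}^{D-1}} \left|\varphi(v')\right|^{s'} b_1(\cos\theta)\, d\sigma \right)^{\frac{1}{s'}},
\end{aligned} \tag{3.32}
\]
we obtain, simply using Hölder's inequality, that
\[
|\Phi_0| \le \Phi_1^{\rho_1} \cdot \Phi_2^{\rho_2} \cdot \Phi_3^{\rho_3}. \tag{3.33}
\]
Therefore, we may decompose the last integrand in (3.29) as
\[
\left| F(v) G(v_*)\, a_0(v-v_*)\, \Phi_0(v,v_*) \right| \le \left| P_1 \cdot P_2 \cdot P_3 \cdot P_4 \cdot P_5 \cdot P_6 \right|, \tag{3.34}
\]
where
\[
\begin{aligned}
P_1 &= F(v)^{\alpha_1} G(v_*)^{\beta_1}, & P_2 &= F(v)^{\alpha_2} a_0(v-v_*)^{\gamma_1}, \\
P_3 &= F(v)^{\alpha_3} \Phi_1(v,v_*)^{\rho_1}, & P_4 &= G(v_*)^{\beta_2} a_0(v-v_*)^{\gamma_2}, \\
P_5 &= G(v_*)^{\beta_3} \Phi_2(v,v_*)^{\rho_2}, & P_6 &= a_0(v-v_*)^{\gamma_3} \Phi_3(v,v_*)^{\rho_3},
\end{aligned} \tag{3.35}
\]
so that, by Hölder's inequality again,
\[
\left| \int Q^+(F,G)(v)\varphi(v)\, dv \right| \le \|P_1\|_{L^{p_1}_{v,v_*}} \|P_2\|_{L^{p_2}_{v,v_*}} \|P_3\|_{L^{p_3}_{v,v_*}} \|P_4\|_{L^{p_4}_{v,v_*}} \|P_5\|_{L^{p_5}_{v,v_*}} \|P_6\|_{L^{p_6}_{v,v_*}}. \tag{3.36}
\]
Next, we estimate each of the six resulting terms separately, which is trivial for $P_1$, $P_2$ and $P_4$ since their variables are separated because they do not contain the functions $\Phi_k$. Indeed, one easily verifies that
\[
\|P_1\|_{L^{p_1}_{v,v_*}} = \|F\|^{\alpha_1}_{L^p_v} \|G\|^{\beta_1}_{L^q_v}, \quad
\|P_2\|_{L^{p_2}_{v,v_*}} = \|F\|^{\alpha_2}_{L^p_v} \|a_0\|^{\gamma_1}_{L^r_v}, \quad
\|P_4\|_{L^{p_4}_{v,v_*}} = \|G\|^{\beta_2}_{L^q_v} \|a_0\|^{\gamma_2}_{L^r_v}. \tag{3.37}
\]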


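On the other hand, the terms $P_3$, $P_5$ and $P_6$ only satisfy
\[
\begin{aligned}
\|P_3\|_{L^{p_3}_{v,v_*}} &= \left\| F^{\alpha_3}(v) \left\| \Phi_1(v,v_*) \right\|^{\rho_1}_{L^{s'}_{v_*}} \right\|_{L^{p_3}_v}, \\
\|P_5\|_{L^{p_5}_{v,v_*}} &= \left\| G^{\beta_3}(v_*) \left\| \Phi_2(v,v_*) \right\|^{\rho_2}_{L^{s'}_{v}} \right\|_{L^{p_5}_{v_*}}, \\
\|P_6\|_{L^{p_6}_{v,v_*}} &= \left\| a_0^{\gamma_3}(v) \left\| \Phi_3(v+v_*,v_*) \right\|^{\rho_3}_{L^{s'}_{v_*}} \right\|_{L^{p_6}_v}.
\end{aligned} \tag{3.38}
\]
Thus, in order to carry on these estimates, we will need to exploit the explicit definition of each $\Phi_k$. To this end, for any given $\sigma \in \mathbb{S}^{D-1}$, we consider the function
\[
R_\sigma(v) = \frac{v}{2} + \frac{|v|}{2}\sigma \tag{3.39}
\]
defined for any $v \in \mathbb{R}^D$. It is then easy to see that $R_\sigma$ is a well-defined bijection from $\mathbb{R}^D \setminus \{v : \sigma\cdot v = -|v|\}$ onto $\left\{u \in \mathbb{R}^D : \sigma\cdot u > 0\right\}$ with an inverse given by
\[
R_\sigma^{-1}(u) = 2u - \frac{|u|^2}{\sigma\cdot u}\sigma. \tag{3.40}
\]
Furthermore, it is readily seen, with the use of spherical coordinates, that the Jacobian of $R_\sigma^{-1}$ is given by $\frac{2^{D-1}|u|^2}{(\sigma\cdot u)^2}$. In other words, for any measurable function $P : \left\{u \in \mathbb{R}^D : \sigma\cdot u > 0\right\} \to \mathbb{R}$, it holds that
\[
\int_{\mathbb{R}^D \setminus \{\sigma\cdot v = -|v|\}} P\left(R_\sigma(v)\right) dv = \int_{\{\sigma\cdot u > 0\}} P(u)\, \frac{2^{D-1}|u|^2}{(\sigma\cdot u)^2}\, du. \tag{3.41}
\]
Finally, it is straightforward to check that if θ is the angle between v and σ, then $\frac{\theta}{2}$ is the angle between $R_\sigma(v)$ and σ. Therefore, it holds that $\frac{v}{|v|}\cdot\sigma = \cos\theta = 2\cos^2\frac{\theta}{2} - 1 = 2\left(\frac{R_\sigma(v)\cdot\sigma}{|R_\sigma(v)|}\right)^2 - 1$.

Thus, employing the function $R_\sigma$ with the explicit expression for $\Phi_1$ and then the change of variable formula (3.41), we arrive at
\[
\begin{aligned}
\left\| \Phi_1(v,v_*) \right\|^{s'}_{L^{s'}_{v_*}}
&= c_0^{\frac{s'^2}{s}} \int\!\!\int \left( \frac{R_\sigma(V)\cdot\sigma}{|R_\sigma(V)|} \right)^D \left| \varphi\left(v + R_\sigma(V)\right) \right|^{s'} b_1\left( 1 - 2\left( \frac{R_\sigma(V)\cdot\sigma}{|R_\sigma(V)|} \right)^2 \right) d\sigma\, dV \\
&= c_0^{\frac{s'^2}{s}} \int\!\!\int_{\{\sigma\cdot u > 0\}} \left( \frac{u\cdot\sigma}{|u|} \right)^D \left| \varphi(v+u) \right|^{s'} b_1\left( 1 - 2\left( \frac{u\cdot\sigma}{|u|} \right)^2 \right) \frac{2^{D-1}|u|^2}{(\sigma\cdot u)^2}\, d\sigma\, du \\
&= c_0^{\frac{s'^2}{s}} \int \left| \varphi(v+u) \right|^{s'} \left| \mathbb{S}^{D-2} \right| \int_0^{\frac{\pi}{2}} b_1(-\cos 2\theta)\, 2^{D-1} \cos^{D-2}\theta \sin^{D-2}\theta\, d\theta\, du \\
&= c_0^{s' + \frac{s'^2}{s}} \int |\varphi(u)|^{s'}\, du.
\end{aligned} \tag{3.42}
\]
As to the term $\Phi_2$, we treat it similarly, recalling that $b_1(\cos\theta)$ is supported on $\{\cos\theta \ge 0\}$,
\[
\begin{aligned}
\left\| \Phi_2(v,v_*) \right\|^{s'}_{L^{s'}_{v}}
&= c_0^{\frac{s'^2}{s}} \int\!\!\int \left| \varphi\left(v_* + R_\sigma(V)\right) \right|^{s'} b_1\left( 2\left( \frac{R_\sigma(V)\cdot\sigma}{|R_\sigma(V)|} \right)^2 - 1 \right) d\sigma\, dV \\
&= c_0^{\frac{s'^2}{s}} \int\!\!\int_{\{\sigma\cdot u > 0\}} \left| \varphi(v_*+u) \right|^{s'} b_1\left( 2\left( \frac{u\cdot\sigma}{|u|} \right)^2 - 1 \right) \frac{2^{D-1}|u|^2}{(\sigma\cdot u)^2}\, d\sigma\, du \\
&= c_0^{\frac{s'^2}{s}} \int \left| \varphi(v_*+u) \right|^{s'} \left| \mathbb{S}^{D-2} \right| \int_0^{\frac{\pi}{2}} b_1(\cos 2\theta)\, \frac{2^{D-1} \cos^{D-2}\theta \sin^{D-2}\theta}{\cos^D\theta}\, d\theta\, du \\
&\le 2^{\frac{D}{2}} c_0^{s' + \frac{s'^2}{s}} \int |\varphi(u)|^{s'}\, du.
\end{aligned} \tag{3.43}
\]
Finally, the term $\Phi_3$ receives the simpler treatment
\[
\left\| \Phi_3(v+v_*,v_*) \right\|^{s'}_{L^{s'}_{v_*}} = c_0^{\frac{s'^2}{s}} \int\!\!\int \left| \varphi\left(V + R_\sigma(v)\right) \right|^{s'} b_1\left( \frac{v}{|v|}\cdot\sigma \right) d\sigma\, dV = c_0^{s' + \frac{s'^2}{s}} \int |\varphi(u)|^{s'}\, du. \tag{3.44}
\]
Therefore, incorporating (3.42), (3.43) and (3.44) into (3.38), we arrive at
\[
\begin{aligned}
\|P_3\|_{L^{p_3}_{v,v_*}} &= c_0^{s'\rho_1} \|F\|^{\alpha_3}_{L^p_v} \|\varphi\|^{\rho_1}_{L^{s'}_v}, \\
\|P_5\|_{L^{p_5}_{v,v_*}} &\le 2^{\frac{D}{2p_5}} c_0^{s'\rho_2} \|G\|^{\beta_3}_{L^q_v} \|\varphi\|^{\rho_2}_{L^{s'}_v}, \\
\|P_6\|_{L^{p_6}_{v,v_*}} &= c_0^{s'\rho_3} \|a_0\|^{\gamma_3}_{L^r_v} \|\varphi\|^{\rho_3}_{L^{s'}_v}.
\end{aligned} \tag{3.45}
\]
Thus, on the whole, combining (3.36), (3.37) and (3.45), we have shown that
\[
\left| \int Q^+(F,G)(v)\varphi(v)\, dv \right| \le 2^{\frac{D}{2p_5}} c_0^{s'} \|F\|_{L^p_v} \|G\|_{L^q_v} \|a_0\|_{L^r_v} \|\varphi\|_{L^{s'}_v}. \tag{3.46}
\]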
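The two geometric facts about $R_\sigma$ used in the computations above, namely the inversion formula (3.40) and the halving of the angle with σ, can be sanity-checked numerically. The following is a minimal sketch in plain Python; the helper names are hypothetical, and vectors are represented as lists of floats.

```python
import math

def R(v, sigma):
    """R_sigma(v) = v/2 + (|v|/2) sigma, cf. (3.39)."""
    norm_v = math.sqrt(sum(c * c for c in v))
    return [0.5 * vi + 0.5 * norm_v * si for vi, si in zip(v, sigma)]

def R_inv(u, sigma):
    """R_sigma^{-1}(u) = 2u - (|u|^2 / (sigma.u)) sigma, cf. (3.40); needs sigma.u > 0."""
    dot = sum(ui * si for ui, si in zip(u, sigma))
    norm_u_sq = sum(c * c for c in u)
    return [2.0 * ui - (norm_u_sq / dot) * si for ui, si in zip(u, sigma)]

def angle(a, b):
    """Angle in [0, pi] between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def check_R(v, sigma):
    """Return (round-trip error of R_inv(R(v)), error in the half-angle identity)."""
    u = R(v, sigma)
    back = R_inv(u, sigma)
    roundtrip_error = max(abs(x - y) for x, y in zip(back, v))
    angle_error = abs(angle(u, sigma) - 0.5 * angle(v, sigma))
    return roundtrip_error, angle_error
```

For a unit vector `sigma` and any `v` not anti-parallel to it, both returned errors should be at the level of floating-point rounding.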

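We conclude by taking the supremum over all $\varphi \in L^{s'}_v$ and setting $C_0 = c_0^{s'}$.

Lemma 3.7. Let $1 \le p, q, r, s \le \infty$ be such that $\frac{1}{p}+\frac{1}{q}+\frac{1}{r}+\frac{1}{s} = 2$. Then, there are parameters $1 \le p_1, p_2, p_3, p_4, p_5, p_6 \le \infty$ satisfying
\[
\begin{aligned}
\frac{1}{p_1}+\frac{1}{p_2}+\frac{1}{p_3} &= \frac{1}{p}, & \frac{1}{p_1}+\frac{1}{p_4}+\frac{1}{p_5} &= \frac{1}{q}, \\
\frac{1}{p_2}+\frac{1}{p_4}+\frac{1}{p_6} &= \frac{1}{r}, & \frac{1}{p_3}+\frac{1}{p_5}+\frac{1}{p_6} &= \frac{1}{s}.
\end{aligned} \tag{3.47}
\]
In particular, they satisfy
\[
\frac{1}{p_1}+\frac{1}{p_2}+\frac{1}{p_3}+\frac{1}{p_4}+\frac{1}{p_5}+\frac{1}{p_6} = 1. \tag{3.48}
\]
Moreover, it is always possible to set $\frac{1}{p_3} = \max\left\{0, \frac{1}{s}+\frac{1}{p}-1\right\}$, which is the best possible choice for $\frac{1}{p_3}$ in the sense that it minimizes its value. This way, it holds that $p_3 = \infty$ whenever $\frac{1}{p}+\frac{1}{s} \le 1$.

Proof. It is possible to show that the general solution to the system (3.47) is given by
\[
\begin{pmatrix} \frac{1}{p_1} \\ \frac{1}{p_2} \\ \frac{1}{p_3} \\ \frac{1}{p_4} \\ \frac{1}{p_5} \\ \frac{1}{p_6} \end{pmatrix}
=
\begin{pmatrix} \frac{1}{p}+\frac{1}{q}-1 \\ \frac{1}{p}+\frac{1}{r}-1 \\ \frac{1}{s} \\ 1-\frac{1}{p} \\ 0 \\ 0 \end{pmatrix}
+ \alpha \begin{pmatrix} 0 \\ 1 \\ -1 \\ -1 \\ 1 \\ 0 \end{pmatrix}
+ \beta \begin{pmatrix} 1 \\ 0 \\ -1 \\ -1 \\ 0 \\ 1 \end{pmatrix} \tag{3.49}
\]
for all $\alpha, \beta \in \mathbb{R}$. In order to conclude, it suffices to suitably choose α and β so that $0 \le \frac{1}{p_1}, \frac{1}{p_2}, \frac{1}{p_3}, \frac{1}{p_4}, \frac{1}{p_5}, \frac{1}{p_6} \le 1$. Therefore, α and β need to satisfy
\[
\begin{aligned}
\max\left\{ 1 - \left(\frac{1}{p}+\frac{1}{r}\right), 0 \right\} &\le \alpha \le \min\left\{ 2 - \left(\frac{1}{p}+\frac{1}{r}\right), 1 \right\}, \\
\max\left\{ 1 - \left(\frac{1}{p}+\frac{1}{q}\right), 0 \right\} &\le \beta \le \min\left\{ 2 - \left(\frac{1}{p}+\frac{1}{q}\right), 1 \right\}, \\
\max\left\{ -\frac{1}{p}, \frac{1}{s}-1 \right\} &\le \alpha+\beta \le \min\left\{ \frac{1}{s}, 1-\frac{1}{p} \right\},
\end{aligned} \tag{3.50}
\]
and one straightforwardly checks that this is always possible. Finally, it is readily seen that we may always choose α and β so that $\alpha+\beta = \min\left\{\frac{1}{s}, 1-\frac{1}{p}\right\}$. Since $\frac{1}{p_3} = \frac{1}{s} - (\alpha+\beta)$, this choice clearly maximizes the value of $p_3$ and $\alpha+\beta = \frac{1}{s}$ as soon as $\frac{1}{p}+\frac{1}{s} \le 1$. This concludes the proof.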
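The choice of parameters in the proof above can be made concrete. The following is a minimal sketch, in exact rational arithmetic, that selects α and β within the constraints (3.50) with α + β = min{1/s, 1 − 1/p} and returns the six reciprocals 1/p_1, ..., 1/p_6 from (3.49). The function name and the particular selection rule for α are illustrative assumptions, not taken from the original proof.

```python
from fractions import Fraction

def lemma_3_7_exponents(inv_p, inv_q, inv_r, inv_s):
    """Return [1/p1, ..., 1/p6] solving (3.47), assuming 1/p + 1/q + 1/r + 1/s = 2."""
    inv_p, inv_q, inv_r, inv_s = map(Fraction, (inv_p, inv_q, inv_r, inv_s))
    if inv_p + inv_q + inv_r + inv_s != 2:
        raise ValueError("the reciprocals must sum to 2")
    # Admissible intervals for alpha and beta, cf. (3.50).
    lo_a, hi_a = max(1 - inv_p - inv_r, 0), min(2 - inv_p - inv_r, 1)
    lo_b = max(1 - inv_p - inv_q, 0)
    # Target value of alpha + beta, which minimizes 1/p3.
    total = min(inv_s, 1 - inv_p)
    # Illustrative rule: take alpha as large as allowed while keeping beta >= lo_b.
    alpha = min(hi_a, total - lo_b)
    beta = total - alpha
    return [
        inv_p + inv_q - 1 + beta,      # 1/p1
        inv_p + inv_r - 1 + alpha,     # 1/p2
        inv_s - (alpha + beta),        # 1/p3
        1 - inv_p - (alpha + beta),    # 1/p4
        alpha,                         # 1/p5
        beta,                          # 1/p6
    ]
```

As a usage example, the exponents arising in Lemma 3.8 for D = 3 and a = 3 are p = q = 9/4, r = 3 and s = 9/2, and Lemma 3.7 is then applied with s replaced by s', i.e. with reciprocals 4/9, 4/9, 1/3 and 7/9.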

3.3. Norm preserving estimates. We will now make use of the mechanisms of transfer of integrability exposed in the previous sections to demonstrate some norm preserving estimates on the operator $F \mapsto N Q^+(F,F)$ that are crucial to our work.

Lemma 3.8. Let $b(z,\sigma) \ge 0$ be a collision kernel satisfying a decomposition $b(z,\sigma) = a_0(|z|)\, b_0(\cos\theta)$. Then, for any time T > 0 and any fixed parameter $2 \le a \le \infty$, the gain operator satisfies the quadratic estimate
\[
\left\| Q^+(F,G) \right\|_{L_t^{\frac{a}{2}}\left([0,T]; L_x^{\frac{Da}{2(a-1)}} L_v^{\frac{Da}{2}}\right)} \le K \|F\|_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)} \|G\|_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)}, \tag{3.51}
\]
where the constant K > 0 is defined by
\[
K = 2^{\frac{D}{2}} \left\| \frac{b_0(\cos\theta)}{\sin^{\frac{a-1}{a}}\frac{\theta}{2}} \right\|_{L^1_\sigma} \|a_0\|_{L_z^{\frac{D}{D-2}}}. \tag{3.52}
\]

Proof. A straightforward application of Proposition 3.6 yields
\[
\left\| Q^+(F,G) \right\|_{L_v^{\frac{Da}{2}}} \le K \|F\|_{L_v^{D\frac{a}{a+1}}} \|G\|_{L_v^{D\frac{a}{a+1}}}. \tag{3.53}
\]
The result then follows from the Cauchy-Schwarz inequality on the time and space variables only.

Lemma 3.9. Let $b(z,\sigma) \ge 0$ be a collision kernel satisfying a decomposition $b(z,\sigma) = a_0(|z|)\, b_0(\cos\theta)$. Then, for any time T > 0 and any fixed parameter 2 < a < 4, the gain operator satisfies the quadratic estimate, valid for all $\frac{a}{a-2} \le \lambda \le \infty$,
\[
\left\| N Q^+(F,G) \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le K \|F\|_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)} \|G\|_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)}, \tag{3.54}
\]
where the constant K > 0 is defined by
\[
K = C\, 2^{\frac{D}{2}} \left\| \frac{b_0(\cos\theta)}{\sin^{\frac{a-1}{a}}\frac{\theta}{2}} \right\|_{L^1_\sigma} \|a_0\|_{L_z^{\frac{D}{D-2}}}, \tag{3.55}
\]
for some constant C > 0 independent of T > 0.

Proof. First, by an application of Proposition 3.4 and denoting conjugate exponents by $\frac{1}{a}+\frac{1}{a'} = 1$, we obtain that
\[
\left\| N Q^+(F,G) \right\|_{L_t^{\frac{a}{a-2}}\left([0,T]; L_x^{\frac{Da}{2}} L_v^{\frac{Da'}{2}}\right)} \le C \left\| Q^+(F,G) \right\|_{L_t^{\frac{a}{2}}\left([0,T]; L_x^{\frac{Da'}{2}} L_v^{\frac{Da}{2}}\right)}, \tag{3.56}
\]
where C is independent of T > 0. Notice that the use above of Proposition 3.4 is valid because the parameter a lies in the range 2 < a < 4. Then, defining for convenience an auxiliary parameter $b = \frac{a}{a-2}$ so that 2 < b < ∞, the estimate (3.56) becomes
\[
\left\| N Q^+(F,G) \right\|_{L^b_t\left([0,T]; L_x^{D\frac{b}{b-1}} L_v^{D\frac{b}{b+1}}\right)} \le C \left\| Q^+(F,G) \right\|_{L_t^{\frac{a}{2}}\left([0,T]; L_x^{\frac{Da'}{2}} L_v^{\frac{Da}{2}}\right)}. \tag{3.57}
\]
Similarly, by an application of Proposition 3.5, we obtain that
\[
\left\| N Q^+(F,G) \right\|_{L^\infty_t\left([0,T]; L^D_{x,v}\right)} \le C \left\| Q^+(F,G) \right\|_{L_t^{\frac{a}{2}}\left([0,T]; L_x^{\frac{Da'}{2}} L_v^{\frac{Da}{2}}\right)}, \tag{3.58}
\]
where C > 0 is independent of T > 0. Notice that the use above of Proposition 3.5 is valid because the parameter a lies in the range 2 ≤ a < 4. Furthermore, for any b < λ < ∞, we may apply Hölder's inequality to deduce
\[
\left\| N Q^+(F,G) \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le \left\| N Q^+(F,G) \right\|^{1-\frac{b}{\lambda}}_{L^\infty_t\left([0,T]; L^D_{x,v}\right)} \left\| N Q^+(F,G) \right\|^{\frac{b}{\lambda}}_{L^b_t\left([0,T]; L_x^{D\frac{b}{b-1}} L_v^{D\frac{b}{b+1}}\right)}, \tag{3.59}
\]
so that, incorporating (3.57) and (3.58) into (3.59), we infer, for any $\frac{a}{a-2} \le \lambda \le \infty$,
\[
\left\| N Q^+(F,G) \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le C \left\| Q^+(F,G) \right\|_{L_t^{\frac{a}{2}}\left([0,T]; L_x^{\frac{Da'}{2}} L_v^{\frac{Da}{2}}\right)}. \tag{3.60}
\]
Finally, the conclusion of the lemma follows from a direct application of Lemma 3.8 to the last term above.

Lemma 3.10. For any $F^{in} \in L^D_{x,v}\left(\mathbb{R}^D \times \mathbb{R}^D\right)$ and any D < λ ≤ ∞, it holds that
\[
\left\| T F^{in} \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le C \left\| F^{in} \right\|_{L^D_{x,v}}, \tag{3.61}
\]
where C > 0 is independent of T > 0.

Proof. First, notice that it trivially holds that
\[
\left\| T F^{in} \right\|_{L^\infty_t\left([0,T]; L^D_{x,v}\right)} = \left\| F^{in} \right\|_{L^D_{x,v}}. \tag{3.62}
\]

Furthermore, since λ > D, a direct application of Proposition 3.3 yields that (3.61) holds, which concludes the proof of the lemma.

4. Proof of the Main Theorem

In this section, we provide a proof of the Main Theorem 2.1. The key idea of the demonstration consists in utilizing the estimates developed in Sect. 3 on the Boltzmann operator to obtain the weak compactness of an approximating sequence of solutions to a truncated Boltzmann equation.

4.1. Weak compactness of truncated approximations. Many different choices for the truncated equation are available at this point and most of them would suit our demonstration. However, we will choose for convenience the truncated equation that was employed in the DiPerna-Lions theory of renormalized solutions [11]. We will detail now this truncation procedure.

We consider an approximating sequence of regularized and compactly supported collision kernels $\{b_n(z,\sigma)\}_{n=1}^\infty \subset C_0^\infty\left(\mathbb{R}^D \times \mathbb{S}^{D-1}\right)$ such that
\[
b_n(z,\sigma) = b_n\left(|z|, \frac{z}{|z|}\cdot\sigma\right), \quad 0 \le b_n \le b_{n+1} \le b, \quad \text{and } b_n \to b \text{ almost everywhere as } n \to \infty, \tag{4.1}
\]
and a suitable approximating sequence of initial data
\[
\left\{F_n^{in}(x,v)\right\}_{n=1}^\infty \subset \mathcal{S}\left(\mathbb{R}^D \times \mathbb{R}^D\right) \tag{4.2}
\]
(here $\mathcal{S}$ denotes the Schwartz space of rapidly decreasing functions) such that
\[
0 \le F_n^{in} \le F^{in} \quad \text{and} \quad F_n^{in} \to F^{in} \text{ almost everywhere as } n \to \infty. \tag{4.3}
\]
Furthermore, let $\delta_n > 0$ be such that $\lim_{n\to\infty} \delta_n \left\|F_n^{in}\right\|_{L^1_{x,v}} = 0$.

Notice that the general properties and estimates on the Boltzmann collision operator remain unchanged if one allows the collision kernel to depend on the time and space variables. Thus, it is possible to show, as was performed in [11] (see also [10]), that there exists a unique nonnegative sequence $\{F_n(t,x,v)\}_{n=1}^\infty \subset C\left([0,\infty); \mathcal{S}\left(\mathbb{R}^D \times \mathbb{R}^D\right)\right)$ of solutions to the truncated equation
\[
\partial_t F_n + v\cdot\nabla_x F_n = \frac{Q_n(F_n,F_n)}{1+\delta_n\int_{\mathbb{R}^D}|F_n|\, dv}, \qquad F_n(0,x,v) = F_n^{in}(x,v), \tag{4.4}
\]
where the regularized Boltzmann operator $Q_n$ is simply defined by replacing the collision kernel $b(z,\sigma)$ by its regularized version $b_n(z,\sigma)$ in Definition (1.2). In particular, by the collisional symmetries, the approximating solutions $F_n$ satisfy the global conservation of mass $\int F_n(t,x,v)\, dx\, dv = \int F_n^{in}(x,v)\, dx\, dv$, for each t ≥ 0, so that $\lim_{n\to\infty} \delta_n \|F_n(t)\|_{L^1_{x,v}} = 0$. Thus, up to extraction of a subsequence, we may assume that
\[
\frac{1}{1+\delta_n\int_{\mathbb{R}^D}|F_n|\, dv} \to 1 \quad \text{almost everywhere in } (t,x) \in [0,\infty) \times \mathbb{R}^D. \tag{4.5}
\]
We will now obtain important uniform estimates on the solutions $F_n$. Thus, according to Duhamel's formula (1.7), we have the following representation:
\[
F_n = T F_n^{in} + N \frac{Q_n(F_n,F_n)}{1+\delta_n\int_{\mathbb{R}^D}|F_n|\, dv}. \tag{4.6}
\]
Consequently, for any fixed parameter 3 ≤ a < 4, by virtue of Lemmas 3.9 and 3.10, it holds that, for any $\frac{a}{a-2} \le \lambda \le \infty$ such that λ > D,
\[
\left\| N \frac{Q_n^+(F_n,F_n)}{1+\delta_n\int_{\mathbb{R}^D}|F_n|\, dv} \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le C\cdot K \|F_n\|^2_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)} \tag{4.7}
\]
and
\[
\left\| T F_n^{in} \right\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le C \left\| F^{in} \right\|_{L^D_{x,v}}, \tag{4.8}
\]
where K > 0 is determined by (3.55) and C > 0 is independent of T > 0. Thus, on the whole, we conclude that
\[
\|F_n\|_{L^\lambda_t\left([0,T]; L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le C \left\| F^{in} \right\|_{L^D_{x,v}} + C\cdot K \|F_n\|^2_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)}. \tag{4.9}
\]
In particular, since a ≥ 3, the above estimate holds true for λ = a, so that defining the function $\rho_n(T) = \|F_n\|_{L^a_t\left([0,T]; L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)}$, which is continuous on T ∈ [0, ∞) and satisfies $\rho_n(0) = 0$, we see that
\[
0 \le C\cdot K \rho_n(T)^2 - \rho_n(T) + C \left\| F^{in} \right\|_{L^D_{x,v}}. \tag{4.10}
\]
Provided $4C^2K\left\|F^{in}\right\|_{L^D_{x,v}} < 1$, which is guaranteed by the smallness condition (2.2) with an appropriate choice of constant $C_0$, it follows that $\rho_n(T) \in [0,\eta_1] \cup [\eta_2,\infty)$, for every T > 0, where $0 \le \eta_1 < \eta_2$ are the two real roots of the quadratic equation $CK\eta^2 - \eta + C\left\|F^{in}\right\|_{L^D_{x,v}} = 0$ and may thus be expressed as
\[
\eta_1 = \frac{1-\sqrt{1-4C^2K\left\|F^{in}\right\|_{L^D_{x,v}}}}{2CK} \quad \text{and} \quad \eta_2 = \frac{1+\sqrt{1-4C^2K\left\|F^{in}\right\|_{L^D_{x,v}}}}{2CK}. \tag{4.11}
\]
Hence, by virtue of the continuity of $\rho_n(T)$, we infer that
\[
\|F_n\|_{L^a_t\left([0,\infty); L_x^{D\frac{a}{a-1}} L_v^{D\frac{a}{a+1}}\right)} = \sup_{T>0} \rho_n(T) \le \eta_1 = \frac{1-\sqrt{1-4C^2K\left\|F^{in}\right\|_{L^D_{x,v}}}}{2CK}, \tag{4.12}
\]
which yields, when incorporated into (4.9),
\[
\|F_n\|_{L^\lambda_t\left([0,\infty); L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right)} \le \frac{1-\sqrt{1-4C^2K\left\|F^{in}\right\|_{L^D_{x,v}}}}{2CK}, \tag{4.13}
\]
for every $\frac{a}{a-2} \le \lambda \le \infty$ such that λ > D. Consequently, by possibly extracting a subsequence and setting $\lambda_0 = \frac{a}{a-2}$ so that $2 < \lambda_0 \le 3$, we find that
\[
F_n \to F \quad \text{weakly in } L^\lambda_t\left([0,\infty); L_x^{D\frac{\lambda}{\lambda-1}} L_v^{D\frac{\lambda}{\lambda+1}}\right) \quad \text{for every } \lambda_0 \le \lambda \le \infty \text{ such that } \lambda > D. \tag{4.14}
\]

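We will prove that F is a weak solution to Boltzmann's equation. Finally, notice that an application of Lemma 3.8 shows that
\[
Q_n^+(F_n,F_n) \text{ and } Q^+(F,F) \text{ are uniformly bounded in } L_t^{\frac{\lambda}{2}}\left([0,\infty); L_x^{\frac{D\lambda}{2(\lambda-1)}} L_v^{\frac{D\lambda}{2}}\right) \text{ for any } \lambda_0 \le \lambda \le \infty \text{ such that } \lambda > D. \tag{4.15}
\]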
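The continuity argument of Sect. 4.1 rests on the elementary fact that the quadratic inequality (4.10) splits the half-line into an admissible region [0, η1] ∪ [η2, ∞) and a forbidden gap (η1, η2), so that a continuous function starting at 0 cannot leave [0, η1]. The following is a minimal numerical sketch of this mechanism; the constants passed in are arbitrary illustrative values, not the constants C and K of the paper, and the function names are hypothetical.

```python
import math

def bootstrap_roots(C, K, data_norm):
    """Roots eta1 < eta2 of C*K*eta^2 - eta + C*data_norm = 0, cf. (4.11)."""
    disc = 1.0 - 4.0 * C * C * K * data_norm
    if disc <= 0.0:
        raise ValueError("smallness condition 4*C^2*K*data_norm < 1 is violated")
    root = math.sqrt(disc)
    return (1.0 - root) / (2.0 * C * K), (1.0 + root) / (2.0 * C * K)

def is_admissible(rho, C, K, data_norm):
    """True when rho satisfies the quadratic inequality (4.10)."""
    return C * K * rho * rho - rho + C * data_norm >= 0.0
```

With the illustrative values C = K = 1 and data norm 0.1, the forbidden gap is roughly (0.113, 0.887), and the resulting bound η1 never exceeds twice C times the data norm, which reflects the small-data character of the estimate.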

4.2. Strong compactness by velocity averaging. We wish now to pass to the limit in the truncated equation (4.4) and thus recover a weak solution of the Boltzmann equation (1.1). To this end, we need to show the convergence of the nonlinear terms in the right-hand side of (4.4), Q± n (Fn , Fn ) → Q± (F, F) weakly in L 1loc [0, ∞) × R D × R D . 1 + δn R D |Fn | dv (4.16) Recall now that a bounded sequence u n in L ∞ R D converging almost everywhere to some u and a sequence vn converging weakly to some v in L 1 R D satisfy the non linear convergence of the product u n vn → uv weakly in L 1 R D . This result is a basic combination of Egorov’s theorem with the Dunford-Pettis criterion for weak relative compactness in L 1 R D . Essentially, we use the equi-integrability and the tightness of vn to reduce the domain to a region where u n converges uniformly towards u. Thus, in view of the almost everywhere convergence of the denominators (4.5), the limit (4.16) will be verified as soon as we show that ± 1 D D . (4.17) Q± n (Fn , Fn ) → Q (F, F) weakly in L loc [0, ∞) × R × R Furthermore, by virtue of the basic convolution inequalities (3.24), it holds that, for any k ≤ n,

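\[
\begin{aligned}
\left\|Q_n^{\pm}(F_n,F_n) - Q_k^{\pm}(F_n,F_n)\right\|_{L_t^{\frac{\alpha}{2}}\left([0,\infty);\,L_x^{\frac{D\alpha}{2(\alpha-1)}} L_v^{\frac{D\alpha}{\alpha+1}}\right)}
&\le \|F_n\|^2_{L_t^{\alpha}\left([0,\infty);\,L_x^{\frac{D\alpha}{\alpha-1}} L_v^{\frac{D\alpha}{\alpha+1}}\right)} \, \|b_n - b_k\|_{L_z^{\frac{D\alpha}{\alpha(D-1)-1}} L^1_\sigma} \\
&\le \|F_n\|^2_{L_t^{\alpha}\left([0,\infty);\,L_x^{\frac{D\alpha}{\alpha-1}} L_v^{\frac{D\alpha}{\alpha+1}}\right)} \, \|b - b_k\|_{L_z^{\frac{D\alpha}{\alpha(D-1)-1}} L^1_\sigma},
\end{aligned} \tag{4.18}
\]

where $\alpha$ is the parameter in the assumptions of the Main Theorem 2.1. Utilizing now that $b(z,\sigma) \in L_z^{\frac{D\alpha}{\alpha(D-1)-1}} L^1_\sigma$ together with the convergence properties of the approximating kernels (4.1), we see that the norm of the difference $b - b_k$ above can be made arbitrarily small if $k$ is chosen large enough. Therefore, thanks to the control (4.14) with $\lambda = \alpha$, we deduce that it will be enough to show for a fixed $k$ that, as $n$ tends to infinity,

\[
Q_k^{\pm}(F_n,F_n) \to Q_k^{\pm}(F,F) \quad \text{weakly in } L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right), \tag{4.19}
\]

which will be achieved by means of velocity averaging.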


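Notice first that the estimate (4.18) also shows, if one mentally replaces $b_k$ by zero, that the right-hand side of the truncated Eq. (4.4) is bounded locally in $L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$. This will allow us to apply the following basic velocity averaging lemma (see [8,11]), which is not optimal but sufficient for our purpose.

Lemma 4.1. Suppose that

\[
\begin{aligned}
&\{F_n(t,x,v)\}_{n=1}^{\infty} \text{ is weakly relatively compact in } L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right) \\
&\text{and } \{\partial_t F_n + v\cdot\nabla_x F_n\}_{n=1}^{\infty} \text{ is bounded in } L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right).
\end{aligned} \tag{4.20}
\]

Then, for any $\psi(t,x,v_*,v) \in L^\infty\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D;\,L^1(\mathbb{R}^D)\right)$ such that $\psi(t,x,v_*,\cdot)$ is compactly supported,

\[
\left\{\int_{\mathbb{R}^D} F_n(t,x,v_*)\,\psi(t,x,v_*,v)\,dv_*\right\}_{n=1}^{\infty} \text{ is strongly relatively compact in } L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right). \tag{4.21}
\]

We apply now the above averaging lemma twice. First, with

\[
\psi(t,x,v_*,v) = \varphi(t,x,v)\int_{S^{D-1}} b_k(v-v_*,\sigma)\,d\sigma \tag{4.22}
\]

and second with

\[
\psi(t,x,v_*,v) = \int_{S^{D-1}} \varphi(t,x,v')\,b_k(v-v_*,\sigma)\,d\sigma, \tag{4.23}
\]

where $\varphi \in C_0^\infty\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$ and $\varphi' = \varphi(t,x,v')$. Thus, in the first case, we conclude that

\[
\varphi\int_{\mathbb{R}^D}\int_{S^{D-1}} F_{n*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_* \to \varphi\int_{\mathbb{R}^D}\int_{S^{D-1}} F_{*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_* \tag{4.24}
\]

strongly in $L^1\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$, while, in the second case, we find that

\[
\int_{\mathbb{R}^D}\int_{S^{D-1}} F_{n*}\,\varphi'\,b_k(v-v_*,\sigma)\,d\sigma\,dv_* \to \int_{\mathbb{R}^D}\int_{S^{D-1}} F_{*}\,\varphi'\,b_k(v-v_*,\sigma)\,d\sigma\,dv_* \tag{4.25}
\]

strongly in $L^1\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$. Moreover, notice that the above sequences are compactly supported in all variables because the kernel $b_k$ itself is compactly supported. This implies, when combined with the uniform bounds on the $F_n$'s obtained from (4.14), that the convergences (4.24) and (4.25) hold in the strong topology of $L^2\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$ as well. Furthermore, the $L^\infty_t L^D_x L^D_v$ control obtained by setting $\lambda = \infty$ in (4.14) implies that the $F_n$'s are weakly compact locally in $L^2_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$. Consequently, as $n$ tends to infinity, we see that, thanks to the collision symmetries,

\[
\begin{aligned}
\int_{\mathbb{R}^D} Q_k^{-}(F_n,F_n)\,\varphi\,dv &= \int_{S^{D-1}\times\mathbb{R}^D\times\mathbb{R}^D} \varphi\,F_n F_{n*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_*\,dv \\
&\longrightarrow \int_{S^{D-1}\times\mathbb{R}^D\times\mathbb{R}^D} \varphi\,F F_{*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_*\,dv = \int_{\mathbb{R}^D} Q_k^{-}(F,F)\,\varphi\,dv,
\end{aligned} \tag{4.26}
\]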


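and that

\[
\begin{aligned}
\int_{\mathbb{R}^D} Q_k^{+}(F_n,F_n)\,\varphi\,dv &= \int_{S^{D-1}\times\mathbb{R}^D\times\mathbb{R}^D} \varphi'\,F_n F_{n*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_*\,dv \\
&\longrightarrow \int_{S^{D-1}\times\mathbb{R}^D\times\mathbb{R}^D} \varphi'\,F F_{*}\,b_k(v-v_*,\sigma)\,d\sigma\,dv_*\,dv = \int_{\mathbb{R}^D} Q_k^{+}(F,F)\,\varphi\,dv.
\end{aligned} \tag{4.27}
\]

Since we already know that the loss and gain operators $Q_k^{\pm}(F_n,F_n)$ form families that are weakly precompact in $L^1_{loc}\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$ by the estimate (4.18), where we mentally replace $b_n$ by zero, we conclude that the weak convergence (4.19) holds and so, that the weak convergence of the truncated operators (4.16) holds as well.

We are now ready to pass to the limit in the truncated Eq. (4.4). To this end, we consider any $\varphi(t,x,v) \in C_0^\infty\left([0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D\right)$ and, integrating Eq. (4.4) against $\varphi$, we infer

\[
-\int_{\mathbb{R}^D\times\mathbb{R}^D} F_n^{in}\,\varphi(0)\,dx\,dv - \int_{[0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D} F_n\,\partial_t\varphi\,dt\,dx\,dv = \int_{[0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D} \left( F_n\,v\cdot\nabla_x\varphi + \frac{Q_n(F_n,F_n)}{1+\delta_n\int_{\mathbb{R}^D}|F_n|\,dv}\,\varphi \right) dt\,dx\,dv. \tag{4.28}
\]

Therefore, since $F_n^{in}$ converges to $F^{in}$ in $L^1_{loc}\left(\mathbb{R}^D\times\mathbb{R}^D\right)$, letting $n$ tend to infinity, we arrive at

\[
-\int_{\mathbb{R}^D\times\mathbb{R}^D} F^{in}\,\varphi(0)\,dx\,dv - \int_{[0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D} F\,\partial_t\varphi\,dt\,dx\,dv = \int_{[0,\infty)\times\mathbb{R}^D\times\mathbb{R}^D} \left( F\,v\cdot\nabla_x\varphi + Q(F,F)\,\varphi \right) dt\,dx\,dv, \tag{4.29}
\]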

which shows that F is a weak solution of the Boltzmann equation (1.1) and thus concludes the proof of Theorem 2.1.

Acknowledgement. The author would like to sincerely thank Nader Masmoudi for sharing his insight on the Boltzmann equation and thus helping in the genesis of this article and to acknowledge the support from the MacCracken fellowship of the New York University while this research was being carried out.

References

1. Alonso, R.J., Carneiro, E.: Estimates for the Boltzmann collision operator via radial symmetry and Fourier transform. Adv. Math. 223(2), 511–528 (2010)
2. Andréasson, H., Calogero, S., Illner, R.: On blowup for gain-term-only classical and relativistic Boltzmann equations. Math. Methods Appl. Sci. 27(18), 2231–2240 (2004)
3. Arsénio, D.: On the Boltzmann equation: hydrodynamic limit with long-range interactions and mild solutions. PhD thesis, New York University, New York, September (2009)
4. Benedek, A., Panzone, R.: The spaces $L^P$, with mixed norm. Duke Math. J. 28, 301–324 (1961)
5. Boltzmann, L.: Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen. Sitzungsberichte Akad. Wiss. 66, 275–370 (1872)
6. Boltzmann, L.: Vorlesungen über Gastheorie. Leipzig: J.A. Barth (1898)
7. Bouchut, F., Desvillettes, L.: A proof of the smoothing properties of the positive part of Boltzmann's kernel. Rev. Mat. Iberoamericana 14(1), 47–61 (1998)
8. Bouchut, F., Golse, F., Pulvirenti, M.: Kinetic equations and asymptotic theory, Vol. 4 of Series in Applied Mathematics (Paris). Perthame, B., Desvillettes, L. (eds.), Gauthier-Villars, Éditions Scientifiques et Médicales. Paris: Elsevier, 2000
9. Castella, F., Perthame, B.: Estimations de Strichartz pour les équations de transport cinétique. C. R. Acad. Sci. Paris Sér. I Math. 322(6), 535–540 (1996)
10. Cercignani, C., Illner, R., Pulvirenti, M.: The mathematical theory of dilute gases. Vol. 106 of Applied Mathematical Sciences. Springer-Verlag: New York, 1994
11. DiPerna, R. J., Lions, P.-L.: On the Cauchy problem for Boltzmann equations: global existence and weak stability. Ann. of Math. (2) 130(2), 321–366 (1989)
12. Duduchava, R., Kirsch, R., Rjasanow, S.: On estimates of the Boltzmann collision operator with cutoff. J. Math. Fluid. Mech. 8(2), 242–266 (2006)
13. Guo, Y.: Classical solutions to the Boltzmann equation for molecules with an angular cutoff. Arch. Ration. Mech. Anal. 169(4), 305–353 (2003)
14. Guo, Y.: The Vlasov-Maxwell-Boltzmann system near Maxwellians. Invent. Math. 153(3), 593–630 (2003)
15. Gustafsson, T.: $L^p$-estimates for the nonlinear spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 92(1), 23–57 (1986)
16. Gustafsson, T.: Global $L^p$-properties for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103(1), 1–38 (1988)
17. Illner, R., Shinbrot, M.: The Boltzmann equation: global existence for a rare gas in an infinite vacuum. Commun. Math. Phys. 95(2), 217–226 (1984)
18. Kaniel, S., Shinbrot, M.: The Boltzmann equation. I. Uniqueness and local existence. Commun. Math. Phys. 58(1), 65–84 (1978)
19. Kato, T.: Strong $L^p$-solutions of the Navier-Stokes equation in $\mathbb{R}^m$, with applications to weak solutions. Math. Z. 187(4), 471–480 (1984)
20. Lions, P.-L.: Compactness in Boltzmann's equation via Fourier integral operators and applications. I, II. J. Math. Kyoto Univ. 34(2), 391–427, 429–461 (1994)
21. Lions, P.-L.: Compactness in Boltzmann's equation via Fourier integral operators and applications. III. J. Math. Kyoto Univ. 34(3), 539–584 (1994)
22. Maxwell, J.C.: Illustrations of the dynamical theory of gases. Philos. Mag. 19(20), 19–32, 21–37 (1860)
23. Maxwell, J.C.: On the dynamical theory of gases. Philos. Trans. Royal Soc. Lond. 157, 49–88 (1867)
24. Mouhot, C., Villani, C.: Regularity theory for the spatially homogeneous Boltzmann equation with cutoff. Arch. Ration. Mech. Anal. 173(2), 169–212 (2004)
25. Strichartz, R.S.: Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations. Duke Math. J. 44(3), 705–714 (1977)
26. Villani, C.: A review of mathematical topics in collisional kinetic theory. In: Handbook of mathematical fluid dynamics, Vol. I. Amsterdam: North-Holland, 2002. pp. 71–305
27. Wennberg, B.: Regularity in the Boltzmann equation and the Radon transform. Comm. Part. Diff. Eq. 19(11-12), 2057–2074 (1994)

Communicated by P. Constantin

Commun. Math. Phys. 302, 477–511 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1132-6

Communications in

Mathematical Physics

Three-Dimensional Stability of Burgers Vortices Thierry Gallay1 , Yasunori Maekawa2 1 Institut Fourier, Université de Grenoble I, BP 74, 38402 Saint-Martin-d’Hères, France.

E-mail: [email protected]

2 Faculty of Science, Kobe University, 1-1 Rokkodai, Nada-ku, Kobe 657-8501, Japan.

E-mail: [email protected] Received: 17 March 2010 / Accepted: 24 May 2010 Published online: 18 September 2010 – © Springer-Verlag 2010

Abstract: Burgers vortices are explicit stationary solutions of the Navier-Stokes equations which are often used to describe the vortex tubes observed in numerical simulations of three-dimensional turbulence. In this model, the velocity field is a two-dimensional perturbation of a linear straining flow with axial symmetry. The only free parameter is the Reynolds number $\mathrm{Re} = \Gamma/\nu$, where $\Gamma$ is the total circulation of the vortex and $\nu$ is the kinematic viscosity. The purpose of this paper is to show that Burgers vortices are asymptotically stable with respect to small three-dimensional perturbations, for all values of the Reynolds number. This general result subsumes earlier studies by various authors, which were either restricted to small Reynolds numbers or to two-dimensional perturbations. Our proof relies on the fact that the linearized operator at the Burgers vortex has a simple and very specific dependence upon the axial variable. This allows one to reduce the full linearized equations to a vectorial two-dimensional problem, which can be treated using an extension of the techniques developed in earlier works. Although Burgers vortices are found to be stable for all Reynolds numbers, the proof indicates that perturbations may undergo an important transient amplification if Re is large, a phenomenon that was indeed observed in numerical simulations.

1. Introduction

The axisymmetric Burgers vortex is an explicit solution of the three-dimensional Navier-Stokes equations which provides a simple and widely used model for the vortex tubes or filaments that are observed in turbulent flows [1,27]. Despite obvious limitations, due to oversimplified assumptions, this model describes in a correct way the fundamental mechanisms which are responsible for the persistence of coherent structures in three-dimensional turbulence, namely the balance between vorticity amplification due to stretching and vorticity dissipation due to viscosity. If one believes that vortex tubes play a significant role in the dynamics of turbulent flows, it is an important issue


to determine their stability with respect to perturbations in the largest possible class. So far, this problem has been studied only for the axisymmetric Burgers vortex and for a closely related family of asymmetric vortices [21,24]. As was shown by Leibovich and Holmes [17], one cannot hope to prove energetic stability of the Burgers vortex even if the circulation Reynolds number is very small. To tackle the stability problem, it is therefore necessary to have a closer look at the spectrum of the linearized operator. This is a relatively easy task if we restrict ourselves to two-dimensional perturbations. Assuming that the vortex tube is aligned with the vertical axis, this means that the perturbed velocity field lies in the horizontal plane and does not depend on the vertical variable. Under such conditions, the Burgers vortex is known to be stable for any value of the Reynolds number. This result was first established by Giga and Kambe [13] for $\mathrm{Re} \ll 1$ and then in the general case by Gallay and Wayne [8], see also [2,11,12]. Moreover, a lot is known about the spectrum of the linearized operator, which turns out to be purely discrete in a neighborhood of the origin in the complex plane. Using perturbative expansions, Robinson and Saffman [24] showed that all linear modes are exponentially damped for small Reynolds numbers. This property was then numerically verified by Prochazka and Pullin [22] for $\mathrm{Re} \le 10^4$, and finally rigorously established in [8].

The situation is much more complicated if we allow for arbitrary three-dimensional perturbations. In that case, it was shown by Rossi and Le Dizès [25] that the linearized operator does not have any eigenfunction with nontrivial dependence in the vertical variable. While this result precludes the existence of unstable eigenvalues, it also implies that stability cannot be deduced from such a simple analysis, and that continuous spectrum necessarily plays an important role.
Unfortunately, the vertical dependence of the perturbed solutions is not easy to determine, as can be seen from the note [3] where a few attempts are made in that direction. The only rigorous result so far is due to Gallay and Wayne [9], who proved that the Burgers vortex is asymptotically stable with respect to three-dimensional perturbations in a fairly large class provided that the Reynolds number is sufficiently small. For larger Reynolds numbers, up to Re = 5000, an important numerical work by Schmid and Rossi [26] indicates that all modes are exponentially damped by the linearized evolution, although significant short-time amplification can occur.

In this paper, we prove that the axisymmetric Burgers vortex is asymptotically stable with respect to small three-dimensional perturbations for arbitrary values of the Reynolds number. As in [9], we assume that the perturbations are nicely localized in the horizontal variables, but we do not impose any decay with respect to the vertical variable. Our approach is based on the fact that the linearized operator has a very simple dependence upon the vertical variable: the only term involving $x_3$ is the dilation operator $x_3\partial_{x_3}$, which originates from the background straining field. This crucial property was already exploited in [3,25,26], but we shall show that it allows one to reduce the three-dimensional stability problem to a two-dimensional one, which can then be treated using an extension of the techniques developed in [8]. Although the spectrum of the linearized operator remains stable for all Reynolds numbers, the estimates we have on the associated semigroup deteriorate as Re increases, in full agreement with the amplification phenomena observed in [26].

We now formulate our results in a more precise way. We start from the three-dimensional incompressible Navier-Stokes equations:

\[
\partial_t V + (V,\nabla)V = \nu\Delta V - \frac{1}{\rho}\nabla P, \qquad \nabla\cdot V = 0, \tag{1.1}
\]


where $V = V(x,t) \in \mathbb{R}^3$ denotes the velocity field, $P = P(x,t) \in \mathbb{R}$ is the pressure field, and $x = (x_1,x_2,x_3) \in \mathbb{R}^3$ is the space variable. The parameters in (1.1) are the kinematic viscosity $\nu > 0$ and the density $\rho > 0$. To obtain tubular vortices, we assume that the velocity $V$ can be decomposed as follows:

\[
V(x,t) = V^s(x) + U(x,t), \tag{1.2}
\]

where $V^s$ is an axisymmetric straining flow given by the explicit formula

\[
V^s(x) = \frac{\gamma}{2}\begin{pmatrix} -x_1 \\ -x_2 \\ 2x_3 \end{pmatrix} \equiv \gamma M x, \quad \text{where } M = \begin{pmatrix} -\frac12 & 0 & 0 \\ 0 & -\frac12 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.3}
\]

Here $\gamma > 0$ is a parameter which measures the intensity of the strain. Note that $\nabla\cdot V^s = 0$, and that $V^s$ is a stationary solution of (1.1) with the associated pressure $P^s = -\frac12\rho|V^s|^2$. Our goal is to study the evolution of the perturbed velocity field $U(x,t)$. To simplify the notations, we shall assume henceforth that $\gamma = \nu = \rho = 1$. This can be achieved without loss of generality by replacing the variables $x$, $t$ and the functions $V$, $P$ with the dimensionless quantities

\[
\tilde{x} = \Big(\frac{\gamma}{\nu}\Big)^{1/2} x, \qquad \tilde{t} = \gamma t, \qquad \tilde{V} = \frac{V}{(\gamma\nu)^{1/2}}, \qquad \tilde{P} = \frac{P}{\rho\gamma\nu}.
\]

For further convenience, instead of considering the evolution of $V$ or $U$, we prefer working with the vorticity field $\Omega = \nabla\times V = \nabla\times U$. Taking the curl of (1.1) and using (1.2), (1.3), we obtain for $\Omega$ the evolution equation

\[
\partial_t\Omega + (U,\nabla)\Omega - (\Omega,\nabla)U = L\Omega, \qquad \nabla\cdot\Omega = 0, \tag{1.4}
\]

where $L$ is the differential operator defined by

\[
L\Omega = \Delta\Omega - (Mx,\nabla)\Omega + M\Omega. \tag{1.5}
\]
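For orientation, the operator in (1.5) can be written out with the explicit matrix $M$ from (1.3). The short computation below is a reader's sketch rather than part of the original argument; it assumes the reading $L\Omega = \Delta\Omega - (Mx,\nabla)\Omega + M\Omega$ of (1.5) and the normalization $\gamma = \nu = \rho = 1$.

```latex
% Sketch: expansion of (1.5) with M = diag(-1/2, -1/2, 1).
% Notation x_h = (x_1, x_2), \nabla_h = (\partial_{x_1}, \partial_{x_2}) as in the text.
\begin{align*}
(Mx,\nabla) &= -\tfrac{x_1}{2}\,\partial_{x_1} - \tfrac{x_2}{2}\,\partial_{x_2} + x_3\,\partial_{x_3}, \\
L\Omega &= \Delta\Omega + \tfrac12 (x_h\cdot\nabla_h)\Omega - x_3\,\partial_{x_3}\Omega + M\Omega .
\end{align*}
% The only coefficient depending on x_3 is the dilation term -x_3 \partial_{x_3},
% which is the structural feature exploited later in the stability analysis.
```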

Under mild assumptions that will be specified below, the velocity field $U$ can be recovered from the vorticity $\Omega$ via the three-dimensional Biot-Savart law

\[
U(x) = -\frac{1}{4\pi}\int_{\mathbb{R}^3} \frac{(x-y)\times\Omega(y)}{|x-y|^3}\,dy =: (K_{3D} * \Omega)(x). \tag{1.6}
\]

In what follows we shall often encounter the particular situation where the velocity $U$ is two-dimensional and horizontal, namely $U(x) = (U_1(x_h), U_2(x_h), 0)^\top$, where $x_h = (x_1,x_2)^\top \in \mathbb{R}^2$. In that case the vorticity satisfies $\Omega(x) = (0,0,\Omega_3(x_h))^\top$, and the relation (1.6) reduces to the two-dimensional Biot-Savart law

\[
U_h(x_h) = \frac{1}{2\pi}\int_{\mathbb{R}^2} \frac{(x_h-y_h)^\perp}{|x_h-y_h|^2}\,\Omega_3(y_h)\,dy_h =: (K_{2D} * \Omega_3)(x_h), \tag{1.7}
\]

where $U_h = (U_1,U_2)^\top$ and $x_h^\perp = (-x_2,x_1)^\top$.

We can now introduce the Burgers vortices, which are explicit stationary solutions of (1.4) of the form $\Omega = \alpha G$, where $\alpha \in \mathbb{R}$ is a parameter. The vortex profile is given by

\[
G(x) = \begin{pmatrix} 0 \\ 0 \\ g(x_h) \end{pmatrix}, \quad \text{where } g(x_h) = \frac{1}{4\pi}\,e^{-|x_h|^2/4}. \tag{1.8}
\]


The associated velocity field $U = \alpha U^G$ can be obtained from the Biot-Savart law (1.7) and has the following form:

\[
U^G(x) = u^g(|x_h|^2)\begin{pmatrix} -x_2 \\ x_1 \\ 0 \end{pmatrix}, \quad \text{where } u^g(r) = \frac{1}{2\pi r}\left(1 - e^{-r/4}\right). \tag{1.9}
\]

If $\Omega = \alpha G$, it is easy to verify that $\alpha = \int_{\mathbb{R}^2} \Omega_3(x_h)\,dx_h$. This means that the parameter $\alpha \in \mathbb{R}$ represents the total circulation of the Burgers vortex $\alpha G$. In the physical literature, the quantity $|\alpha|$ is often referred to as the (circulation) Reynolds number.

The aim of this paper is to study the asymptotic stability of the Burgers vortices. We thus consider solutions of (1.4) of the form $\Omega = \alpha G + \omega$, $U = \alpha U^G + u$, and obtain the following evolution equation for the perturbation:

\[
\partial_t\omega + (u,\nabla)\omega - (\omega,\nabla)u = (L - \alpha\Lambda)\omega, \qquad \nabla\cdot\omega = 0, \tag{1.10}
\]

where $\Lambda$ is the integro-differential operator defined by

\[
\Lambda\omega = (U^G,\nabla)\omega - (\omega,\nabla)U^G + (u,\nabla)G - (G,\nabla)u. \tag{1.11}
\]
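Two facts about the profile (1.8) are used above without computation: $g$ has unit mass, so that $\alpha$ is indeed the circulation of $\alpha G$, and $\alpha G$ is stationary for (1.4). The following lines are a hedged verification sketch added for the reader, assuming the normalization $\gamma = \nu = \rho = 1$ and the form $g(x_h) = \frac{1}{4\pi}e^{-|x_h|^2/4}$ of the Gaussian.

```latex
% Sketch: unit mass of g and stationarity of the Burgers vortex.
\begin{align*}
\int_{\mathbb{R}^2} g(x_h)\,dx_h
  &= \frac{1}{4\pi}\cdot 2\pi \int_0^\infty e^{-r^2/4}\, r\,dr
   = \frac{1}{2}\cdot 2 = 1, \\
\nabla_h g &= -\tfrac{x_h}{2}\,g
  \quad\Longrightarrow\quad
  \Delta_h g + \tfrac12\, x_h\cdot\nabla_h g + g
  = \nabla_h\cdot\big(\nabla_h g + \tfrac{x_h}{2}\,g\big) = 0, \\
(U^G,\nabla)G &= 0 \quad\text{(since $U^G \perp x_h$ and $g$ is radial)}, \qquad
(G,\nabla)U^G = g\,\partial_{x_3}U^G = 0 .
\end{align*}
% Since g does not depend on x_3, the third component of LG equals
% \Delta_h g + (x_h/2)\cdot\nabla_h g + g = 0, and the first two components vanish.
% Hence both sides of (1.4) vanish for \Omega = \alpha G, U = \alpha U^G.
```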

Here and in the sequel, it is always understood that $u = K_{3D} * \omega$.

An important issue is now to fix an appropriate function space for the admissible perturbations. Since the Burgers vortex itself is essentially a two-dimensional flow, it is natural to choose a functional setting which allows for perturbations in the same class, but we also want to consider more general ones. Following [9], we thus assume that the perturbations are nicely localized in the horizontal variables, but merely bounded in the vertical direction. As we shall see below, this choice is more or less imposed by the particular form of the linear operator (1.5).

To specify the horizontal decay of the admissible perturbations, we first introduce two-dimensional spaces. Given $m \in [0,\infty]$, we denote by $\rho_m : [0,\infty) \to [1,\infty)$ the weight function defined by

\[
\rho_m(r) = \begin{cases} 1 & \text{if } m = 0, \\ \left(1 + \frac{r}{4m}\right)^m & \text{if } 0 < m < \infty, \\ e^{r/4} & \text{if } m = \infty. \end{cases} \tag{1.12}
\]

We introduce the weighted $L^2$ space

\[
L^2(m) = \Big\{ f \in L^2(\mathbb{R}^2) \;\Big|\; \|f\|_{L^2(m)}^2 = \int_{\mathbb{R}^2} |f(x_h)|^2 \rho_m(|x_h|^2)\,dx_h < \infty \Big\}, \tag{1.13}
\]

which is a Hilbert space with a natural inner product. Using Hölder's inequality, it is easy to verify that $L^2(m) \hookrightarrow L^1(\mathbb{R}^2)$ if $m > 1$. In that case, we also define the closed subspace

\[
L^2_0(m) = \Big\{ f \in L^2(m) \;\Big|\; \int_{\mathbb{R}^2} f(x_h)\,dx_h = 0 \Big\}. \tag{1.14}
\]

Next, we define the three-dimensional space $X(m)$ as the set of all $\phi : \mathbb{R}^3 \to \mathbb{R}$ for which the map $x_h \mapsto \phi(x_h,x_3)$ belongs to $L^2(m)$ for any $x_3 \in \mathbb{R}$, and is a bounded and continuous function of $x_3$. In other words, we set

\[
X(m) = BC(\mathbb{R}\,;L^2(m)), \qquad X_0(m) = BC(\mathbb{R}\,;L^2_0(m)), \tag{1.15}
\]


where "$BC(\mathbb{R}\,;Y)$" denotes the space of all bounded and continuous functions from $\mathbb{R}$ into $Y$. Both $X(m)$ and $X_0(m)$ are Banach spaces equipped with the norm

\[
\|\phi\|_{X(m)} = \sup_{x_3\in\mathbb{R}} \|\phi(\cdot,x_3)\|_{L^2(m)}. \tag{1.16}
\]
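The embedding $L^2(m) \hookrightarrow L^1(\mathbb{R}^2)$ for $m > 1$, which is what makes the zero-mean subspaces in (1.14) and (1.15) meaningful, follows from a one-line Hölder estimate. The computation below is a sketch added for the reader; the explicit constant is derived here from the weight (1.12) and is not quoted from the original text.

```latex
% Sketch: Hoelder's inequality with the weight \rho_m from (1.12), case 1 < m < \infty.
\begin{align*}
\int_{\mathbb{R}^2} |f|\,dx_h
  &\le \Big(\int_{\mathbb{R}^2} |f|^2 \rho_m(|x_h|^2)\,dx_h\Big)^{1/2}
       \Big(\int_{\mathbb{R}^2} \rho_m(|x_h|^2)^{-1}\,dx_h\Big)^{1/2}, \\
\int_{\mathbb{R}^2} \rho_m(|x_h|^2)^{-1}\,dx_h
  &= \pi \int_0^\infty \Big(1 + \frac{s}{4m}\Big)^{-m}\,ds
   = \frac{4\pi m}{m-1} .
\end{align*}
% Thus \|f\|_{L^1} \le (4\pi m/(m-1))^{1/2} \|f\|_{L^2(m)}; the integral diverges for m \le 1.
% For m = \infty the same argument gives \int e^{-|x_h|^2/4} dx_h = 4\pi, the limit of 4\pi m/(m-1).
```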

Our goal is to study the stability of the Burgers vortex $\Omega = \alpha G$ with respect to perturbations $\omega \in X(m)^3$. In fact, we can assume without loss of generality that $\omega$ belongs to the subspace

\[
\mathbb{X}(m) = X(m)\times X(m)\times X_0(m) \subset X(m)^3, \tag{1.17}
\]

which is invariant under the evolution defined by (1.10). Indeed, we have the following result, whose proof is postponed to Sect. 6.1:

Lemma 1.1. Fix $m \in (1,\infty]$. If $\tilde\omega \in X(m)^3$ satisfies $\nabla\cdot\tilde\omega = 0$ in the sense of distributions, then there exists $\tilde\alpha \in \mathbb{R}$ such that

\[
\int_{\mathbb{R}^2} \tilde\omega_3(x_h,x_3)\,dx_h = \tilde\alpha, \quad \text{for all } x_3 \in \mathbb{R}. \tag{1.18}
\]

At a formal level, this is a direct consequence of the divergence-free assumption, since

\[
\frac{d}{dx_3}\int_{\mathbb{R}^2} \tilde\omega_3(x_h,x_3)\,dx_h = \int_{\mathbb{R}^2} \partial_{x_3}\tilde\omega_3(x_h,x_3)\,dx_h = -\int_{\mathbb{R}^2} \nabla_h\cdot\tilde\omega_h(x_h,x_3)\,dx_h = 0.
\]

In view of Lemma 1.1, if $\Omega = \alpha G + \tilde\omega$ for some $\tilde\omega \in X(m)^3$, we can write $\Omega = (\alpha+\tilde\alpha)G + \omega$, where $\tilde\alpha$ is given by (1.18) and $\omega = \tilde\omega - \tilde\alpha G$. Then $\omega \in \mathbb{X}(m)$ by construction, and we are led back to the stability analysis of the Burgers vortex $(\alpha+\tilde\alpha)G$ with respect to perturbations in $\mathbb{X}(m)$.

In what follows we always consider the solutions $\omega(x,t)$ of (1.10) as $\mathbb{X}(m)$-valued functions of time, and we often denote by $\omega(\cdot,t)$ or simply $\omega(t)$ the map $x \mapsto \omega(x,t)$. A minor drawback of our functional setting is that we cannot expect the solutions of (1.10) to be continuous in time in the strong topology of $\mathbb{X}(m)$. This is because the operator $L$ defined in (1.5) contains the dilation operator $-x_3\partial_{x_3}$, see Sect. 2.1 below. To restore continuity, it is thus necessary to equip $\mathbb{X}(m)$ with a weaker topology. Following [9], we denote by $X_{loc}(m)$ the space $X(m)$ equipped with the topology defined by the family of seminorms

\[
\|\phi\|_{X_n(m)} = \sup_{|x_3|\le n} \|\phi(\cdot,x_3)\|_{L^2(m)}, \quad n \in \mathbb{N}.
\]

In analogy with (1.17), we set $\mathbb{X}_{loc}(m) = X_{loc}(m)\times X_{loc}(m)\times X_{0,loc}(m)$, where $X_{0,loc}(m)$ is of course the space $X_0(m)$ equipped with the topology of $X_{loc}(m)$. We are now able to formulate our main result:

Theorem 1.2. Fix $m \in (2,\infty]$ and $\alpha \in \mathbb{R}$. Then there exist $\delta = \delta(\alpha,m) > 0$ and $C = C(\alpha,m) \ge 1$ such that, for any $\omega_0 \in \mathbb{X}(m)$ with $\nabla\cdot\omega_0 = 0$ and $\|\omega_0\|_{\mathbb{X}(m)} \le \delta$, Eq. (1.10) has a unique solution $\omega \in L^\infty(\mathbb{R}_+\,;\mathbb{X}(m)) \cap C([0,\infty)\,;\mathbb{X}_{loc}(m))$ with initial data $\omega_0$. Moreover,

\[
\|\omega(t)\|_{\mathbb{X}(m)} \le C\,\|\omega_0\|_{\mathbb{X}(m)}\,e^{-t/2}, \quad \text{for all } t \ge 0. \tag{1.19}
\]


Theorem 1.2 shows that the Burgers vortex $\alpha G$ is asymptotically stable with respect to small perturbations in $\mathbb{X}(m)$, for any value of the circulation $\alpha \in \mathbb{R}$. If one prefers to consider perturbations in the larger space $X(m)^3$, then our result means that the family $\{\alpha G\}_{\alpha\in\mathbb{R}}$ of all Burgers vortices is asymptotically stable with shift, because the perturbations may then modify the circulation of the underlying vortex. The key point in the proof is to show that the linearized operator $L - \alpha\Lambda$ has a uniform spectral gap for all $\alpha \in \mathbb{R}$. This implies a uniform decay rate in time for the perturbations, as in (1.19). However, it should be emphasized that the constants $C$ and $\delta$ in Theorem 1.2 depend on $\alpha$, in such a way that $C(\alpha,m) \to \infty$ and $\delta(\alpha,m) \to 0$ as $|\alpha| \to \infty$. This is in full agreement with the amplification phenomena numerically observed in [26].

The proof of Theorem 1.2 gives more detailed information on the solutions of (1.10) than what is summarized in (1.19). First of all, we can prove stability in $\mathbb{X}(m)$ for any $m > 1$, but the exponential factor $e^{-t/2}$ in (1.19) should then be replaced by $e^{-\eta t}$, where $\eta < (m-1)/2$ if $m \le 2$. Next, thanks to parabolic smoothing, we can obtain decay estimates not only for $\omega$ but also for the spatial derivatives $\partial_{x_j}\omega \equiv \partial\omega/\partial x_j$, $j = 1,2,3$. For convenience, we shall often use the multi-index notation $\partial_x^\beta = \partial_{x_1}^{\beta_1}\partial_{x_2}^{\beta_2}\partial_{x_3}^{\beta_3}$ for $\beta = (\beta_1,\beta_2,\beta_3) \in \mathbb{N}^3$. Finally, due to the particular structure of the linear operator $L - \alpha\Lambda$, it turns out that the horizontal part $\omega_h = (\omega_1,\omega_2)^\top$ of the vorticity vector has a faster decay than the vertical component $\omega_3$ as $t \to \infty$. Thus, a more complete (but less readable) version of our result is as follows:

Theorem 1.3. Fix $m \in (1,\infty]$, $\alpha \in \mathbb{R}$, and take $\mu \in (1,\frac32)$, $\eta \in (0,\frac12]$ such that $2\mu < m+1$ and $2\eta < m-1$. Then there exist $\delta = \delta(\alpha,m) > 0$ and $C = C(\alpha,m,\mu,\eta) \ge 1$ such that, for all initial data $\omega_0 \in \mathbb{X}(m)$ with $\nabla\cdot\omega_0 = 0$ and $\|\omega_0\|_{\mathbb{X}(m)} \le \delta$, Eq. (1.10) has a unique solution $\omega \in L^\infty(\mathbb{R}_+\,;\mathbb{X}(m)) \cap C([0,\infty)\,;\mathbb{X}_{loc}(m))$. Moreover, for all $t > 0$,

\[
\|\partial_x^\beta\omega_h(t)\|_{X(m)^2} \le \frac{C\,\|\omega_0\|_{\mathbb{X}(m)}}{a(t)^{|\beta|/2}}\,e^{-\mu t}, \tag{1.20}
\]

\[
\|\partial_x^\beta\omega_3(t)\|_{X(m)} \le \frac{C\,\|\omega_0\|_{\mathbb{X}(m)}}{a(t)^{|\beta|/2}}\,e^{-\eta t}, \tag{1.21}
\]

where $a(t) = 1 - e^{-t}$ and $\beta \in \mathbb{N}^3$ is any multi-index of length $|\beta| = \beta_1+\beta_2+\beta_3 \le 1$.

The decay rates (1.20), (1.21) are optimal when $\beta = 0$, but it turns out that vertical derivatives such as $\partial_{x_3}\omega_h(t)$ or $\partial_{x_3}\omega_3(t)$ have a faster decay as $t \to \infty$, see Sects. 4 and 5 for more details. In any case, we believe that the optimal rates are those provided by the linear stability analysis, as in Proposition 4.1 below.

The rest of this paper is devoted to the proof of Theorems 1.2 and 1.3. Before giving the details, we explain here the important ideas in an informal way. As was already mentioned, the main difficulty is to obtain good estimates on the solutions of the linearized equation

\[
\partial_t\omega = (L - \alpha\Lambda)\omega, \qquad \nabla\cdot\omega = 0. \tag{1.22}
\]

Once this is done, the nonlinear terms in (1.10) can be controlled using rather standard arguments, which are recalled in Sect. 5. To study (1.22), we use the fact that the operator $L - \alpha\Lambda$ depends on the vertical variable in a simple and very specific way. Indeed, it is easy to verify that $[\partial_{x_3}, L] = -\partial_{x_3}$ and $[\partial_{x_3}, \Lambda] = 0$, where $[A,B] = AB - BA$


denotes the commutator of the operators $A$ and $B$. This key observation, which already plays a crucial role in the previous works [3,25,26], implies the following identity:

\[
\partial_{x_3}^k\, e^{t(L-\alpha\Lambda)}\omega_0 = e^{-kt}\, e^{t(L-\alpha\Lambda)}\,\partial_{x_3}^k\omega_0, \tag{1.23}
\]

for all $k \in \mathbb{N}$ and all $t \ge 0$. If we take $k \in \mathbb{N}$ sufficiently large, depending on $|\alpha|$, we can use (1.23) to show that $\partial_{x_3}^k\omega(t)$ decays exponentially as $t \to \infty$ if $\omega(t)$ is a solution of (1.22). Then, by an interpolation argument, we deduce that all expressions involving at least one vertical derivative play a negligible role in the long-time asymptotics, see Sect. 4 for more details. This "smoothing effect" in the vertical direction is due to the stretching properties of the linear flow (1.2).

As a consequence of these remarks, we can restrict our attention to those solutions of (1.22) which are independent of the vertical variable $x_3$. We call this particular situation the vectorial 2D problem, and we study it in Sect. 3. Note that the perturbations we allow here are two-dimensional in the sense that $\partial_{x_3}u = \partial_{x_3}\omega = 0$, but that all three components of $u$ or $\omega$ are possibly nonzero. This is in contrast with the purely two-dimensional case considered in [8,9], where in addition $u_3 = \omega_1 = \omega_2 = 0$. Extending the techniques developed in [8,9], it is possible to show that all solutions of (1.22) with $\partial_{x_3}\omega = 0$ converge exponentially to zero as $t \to \infty$, and that the decay rate is uniform in $\alpha$. This is done using spectral estimates and a detailed study of the eigenvalue equation $(L - \alpha\Lambda)\omega = \lambda\omega$. It is then a rather straightforward task to complete the proof of Theorem 1.2 using the arguments outlined above.

Remark. The vortex tubes observed in numerical simulations are usually not axisymmetric: in general, they rather exhibit an elliptical core region [14,21]. A simple model for such asymmetric vortices is obtained by replacing the straining flow $V^s$ in (1.3) with the nonsymmetric strain $V^s_\lambda(x) = \gamma M_\lambda x$, where $\lambda \in (0,1)$ is an asymmetry parameter and

\[
M_\lambda = \begin{pmatrix} -\frac{1+\lambda}{2} & 0 & 0 \\ 0 & -\frac{1-\lambda}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.24}
\]

Asymmetric Burgers vortices are then stationary solutions to (1.4), where the operator $L$ in the right-hand side is defined by (1.5) with $M$ replaced by $M_\lambda$.
Unlike in the symmetric case λ = 0, no explicit formula is available and proving the existence of stationary solutions is already a nontrivial task, except perhaps in the perturbative regime where either the asymmetry parameter λ or the circulation number α is very small. In view of these difficulties, asymmetric Burgers vortices were first studied using formal asymptotic expansions and numerical calculations, see e.g. [21,23,24]. The mathematical theory is more recent, and includes several existence results which now cover the whole range of parameters λ ∈ (0, 1) and α ∈ R [9,10,18,19]. In addition, the stability with respect to two-dimensional perturbations is known to hold at least for small values of the asymmetry parameter [10,18]. However, the only result so far on three-dimensional stability is restricted to the particular case where the circulation number α is sufficiently small, depending on λ [9]. Using Theorem 1.2 and a simple perturbation argument, it is easy to show that asymmetric Burgers vortices are stable with respect to small three-dimensional perturbations in the space X(m), provided that the asymmetry parameter λ is small enough depending on the circulation number α. This follows from the fact that the linearized operator at


the symmetric Burgers vortex has a uniform spectral gap for all $\alpha \in \mathbb{R}$, and that the asymmetric Burgers vortex is $O(\lambda)$-close to the corresponding symmetric vortex in the topology of $\mathbb{X}(m)$, uniformly for all $\alpha \in \mathbb{R}$ [10]. Although this stability result is new and not covered by [9], it is certainly not optimal, and we prefer to postpone the study of the three-dimensional stability of asymmetric Burgers vortices to a future investigation.

2. Preliminaries

In this preliminary section we collect a few basic estimates which will be used throughout the proof of Theorems 1.2 and 1.3. They concern the semigroup generated by the linear operator (1.5), and the Biot-Savart law (1.6) relating the velocity field to the vorticity. Most of the results were already established in [9, App. A], and are reproduced here for the reader's convenience.

As in [9], we introduce the following generalization of the function spaces (1.13) and (1.15). Given $m \in [0,\infty]$ and $p \in [1,\infty)$, we define the weighted $L^p$ space

\[
L^p(m) = \Big\{ f \in L^p(\mathbb{R}^2) \;\Big|\; \|f\|_{L^p(m)}^p = \int_{\mathbb{R}^2} |f(x_h)|^p \rho_m(|x_h|^2)^{p/2}\,dx_h < \infty \Big\},
\]

and the corresponding three-dimensional space

\[
X^p(m) = BC(\mathbb{R}\,;L^p(m)), \qquad \|\phi\|_{X^p(m)} = \sup_{x_3\in\mathbb{R}} \|\phi(\cdot,x_3)\|_{L^p(m)}.
\]

If $m > 2 - \frac{2}{p}$, we also denote by $L^p_0(m)$ the subspace of all $f \in L^p(m)$ such that $\int_{\mathbb{R}^2} f\,dx_h = 0$. In analogy with (1.17), we set $\mathbb{X}^p(m) = X^p(m)\times X^p(m)\times X^p_0(m)$, where $X^p_0(m) = BC(\mathbb{R}\,;L^p_0(m))$.

2.1. The semigroup generated by L. If we decompose the vorticity $\omega$ into its horizontal part $\omega_h = (\omega_1,\omega_2)^\top$ and its vertical component $\omega_3$, it is clear from (1.3) and (1.5) that the linear operator $L$ has the following expression:

\[
L\omega = \begin{pmatrix} L_h\omega_h \\ L_3\omega_3 \end{pmatrix} = \begin{pmatrix} (\mathcal{L}_h + \mathcal{L}_3 - \frac32)\,\omega_h \\ (\mathcal{L}_h + \mathcal{L}_3)\,\omega_3 \end{pmatrix}, \tag{2.1}
\]

where $\mathcal{L}_h$ is the two-dimensional Fokker-Planck operator

\[
\mathcal{L}_h = \Delta_h + \frac{x_h}{2}\cdot\nabla_h + 1 = \sum_{j=1}^{2}\partial_{x_j}^2 + \sum_{j=1}^{2}\frac{x_j}{2}\,\partial_{x_j} + 1, \tag{2.2}
\]

and $\mathcal{L}_3 = \partial_{x_3}^2 - x_3\partial_{x_3}$ is a convection-diffusion operator in the vertical variable. As is shown in [7, App. A], the operator $\mathcal{L}_h$ is the generator of a strongly continuous semigroup in $L^2(m)$ given by the explicit formula

\[
(e^{t\mathcal{L}_h} f)(x_h) = \frac{e^t}{4\pi a(t)}\int_{\mathbb{R}^2} e^{-\frac{|x_h-y_h|^2}{4a(t)}}\, f(y_h e^{t/2})\,dy_h, \quad t > 0, \tag{2.3}
\]


where a(t) = 1 − e−t . Similarly, the operator L3 generates a semigroup of contractions in BC(R) given by |x3 e−t −y3 |2 1 t L3 f )(x3 ) = √ e− 2a(2t) f (y3 ) dy3 , t > 0, (2.4) (e 2πa(2t) R see [9, App. A]. Note that the semigroup et L3 is not strongly continuous in the space BC(R) equipped with the supremum norm. This is mainly due to the dilation factor e−t in (2.4). However, if we equip BC(R) with the (weaker) topology of uniform convergence on compact sets, then the map t → et L3 f is continuous for any f ∈ BC(R). This observation is the reason for introducing the space X loc (m) in Sect. 1. Since the operators Lh and L3 act on different variables, it is easy to obtain the semigroup generated by L 3 = Lh + L3 by combining the formulas (2.3) and (2.4). We find |x3 e−t −y3 |2 1 t L3 et Lh φ(·, y3 ) (x h ) dy3 , t > 0. e− 2a(2t) (2.5) (e φ)(x) = √ 2πa(2t) R In [9, Prop. A.6], it is shown that this expression defines a uniformly bounded semigroup in X (m) for any m > 1, and that the map t → et L 3 is strongly continous in the topology of X loc (m). Moreover, the subspace X 0 (m) is left invariant by et L 3 for any t ≥ 0. Using these results and the relation (2.1), we conclude that the three-dimensional operator L generates a uniformly bounded semigroup in the space X(m), given by et L ω = e−3t/2 et L 3 ω1 , e−3t/2 et L 3 ω2 , et L 3 ω3 , t ≥ 0. (2.6) As is easily verified, if ∇ · ω = 0, then ∇ · et L ω = 0 for all t ≥ 0. The asymptotic stability of the Burgers vortices relies heavily on the decay properties of the semigroup et L as t → ∞. In the proof of Theorems 1.2 and 1.3, we also use the smoothing properties of the operator et L for t > 0, and in particular the fact that et L extends to a bounded operator from X p (m) into X2 (m) for all p ∈ [1, 2]. All the needed estimates are collected in the following statement. Proposition 2.1. Let m ∈ (1, ∞], p ∈ [1, 2], and take η ∈ (0, 21 ] such that 2η < m − 1. 
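For any $\beta = (\beta_1, \beta_2, \beta_3) \in \mathbb{N}^3$, there exists $C > 0$ such that the following estimates hold:

$$\|\partial_x^\beta (e^{tL}\omega)_h\|_{X(m)^2} \le \frac{C e^{-(\frac32 + \beta_3)t}}{a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}\, \|\omega_h\|_{X^p(m)^2}, \tag{2.7}$$

$$\|\partial_x^\beta (e^{tL}\omega)_3\|_{X(m)} \le \frac{C e^{-(\eta + \beta_3)t}}{a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}\, \|\omega_3\|_{X^p(m)}, \tag{2.8}$$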

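for any $\omega \in \mathbb{X}^p(m)$ and all $t > 0$. Here $a(t) = 1 - e^{-t}$ and $|\beta| = \beta_1 + \beta_2 + \beta_3$.

Proof. We first assume that $m \in (1, \infty)$. If $p \in [1, 2]$ and $\beta_h = (\beta_1, \beta_2) \in \mathbb{N}^2$, it is proved in [7, App. A] that

$$\|\partial_{x_h}^{\beta_h} e^{t\mathcal{L}_h} f\|_{L^2(m)} \le \frac{C}{a(t)^{\frac1p - \frac12 + \frac{|\beta_h|}{2}}}\, \|f\|_{L^p(m)}, \quad t > 0, \tag{2.9}$$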

for all $f \in L^p(m)$. If in addition $f \in L^p_0(m)$, we have the stronger estimate

$$\|\partial_{x_h}^{\beta_h} e^{t\mathcal{L}_h} f\|_{L^2(m)} \le \frac{C e^{-\eta t}}{a(t)^{\frac1p - \frac12 + \frac{|\beta_h|}{2}}}\, \|f\|_{L^p(m)}, \quad t > 0, \tag{2.10}$$

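where $\eta > 0$ is as in Proposition 2.1. On the other hand, using (2.4), we find by direct calculation

$$\|\partial_{x_3}^{\beta_3} e^{t\mathcal{L}_3} f\|_{L^\infty(\mathbb{R})} \le \frac{C e^{-\beta_3 t}}{a(t)^{\frac{\beta_3}{2}}}\, \|f\|_{L^\infty(\mathbb{R})}, \quad t > 0. \tag{2.11}$$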
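The kernel in (2.4) is a Gaussian in $y_3$ with mean $x_3 e^{-t}$ and variance $a(2t)$, so its action on $f(y) = \cos(ky)$ has the closed form $e^{-k^2 a(2t)/2}\cos(k e^{-t} x_3)$. The following Python sketch is not part of the original argument: it evaluates (2.4) by a trapezoid rule, compares the result with this closed form, and checks the semigroup property numerically. The function names, the quadrature parameters `n` and `width`, and the choice of test function are illustrative assumptions.

```python
import math

def a(t):
    """The function a(t) = 1 - exp(-t) used throughout Sect. 2."""
    return 1.0 - math.exp(-t)

def ou_apply(f, x, t, n=401, width=10.0):
    """Approximate (e^{t L_3} f)(x) from formula (2.4) with the trapezoid rule.

    The integration window is centred at the kernel mean x*exp(-t) and
    extends `width` standard deviations on each side (an assumption that is
    adequate for bounded test functions).
    """
    var = a(2.0 * t)
    sigma = math.sqrt(var)
    mean = x * math.exp(-t)
    lo = mean - width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        y = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(-(mean - y) ** 2 / (2.0 * var)) * f(y)
    return total * h / math.sqrt(2.0 * math.pi * var)

def ou_cos_exact(x, t, k=1.0):
    """Closed form of (e^{t L_3} cos(k .))(x) for the Gaussian kernel in (2.4)."""
    return math.exp(-0.5 * k * k * a(2.0 * t)) * math.cos(k * math.exp(-t) * x)
```

The factor $e^{-t}$ inside the cosine is the dilation that produces the gain $e^{-\beta_3 t}$ in (2.11): each derivative with respect to $x_3$ brings down one such factor. The semigroup property rests on the variance identity $a(2t)e^{-2s} + a(2s) = a(2(t+s))$.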

In (2.11), as in (1.23), the stabilizing factor $e^{-\beta_3 t}$ comes from the dilation operator $-x_3\partial_{x_3}$ which enters the definition of $\mathcal{L}_3$. Now, if we start from the representation (2.5) and use the estimates (2.9)–(2.11), we easily obtain (2.7), (2.8) by a direct calculation, see [9, Prop. A.6].

To complete the proof of Proposition 2.1, it remains to show that (2.9), (2.10) still hold when $m = \infty$. If $t \in (0, 1)$, estimate (2.9) is easily obtained by a direct calculation, based on the representation (2.3). Using this remark and the semigroup property of $e^{t\mathcal{L}_h}$, we conclude that it is sufficient to establish (2.9), (2.10) in the particular case where $p = 2$ and $\beta_h = 0$. This in turn follows easily from the spectral properties of the generator $\mathcal{L}_h$. Indeed, it is well known that $\mathcal{L}_h$ is a self-adjoint operator in $L^2(\infty)$ with purely discrete spectrum $\sigma(\mathcal{L}_h) = \{-\frac{k}{2} \mid k = 0, 1, 2, \dots\}$. Moreover, the subspace $L^2_0(\infty)$ is precisely the orthogonal complement of the eigenspace corresponding to the zero eigenvalue, see for example [8, Lemma 4.7]. It follows that $e^{t\mathcal{L}_h}$ is a semigroup of contractions in $L^2(\infty)$, and that $\|e^{t\mathcal{L}_h} f\|_{L^2(\infty)} \le e^{-t/2}\|f\|_{L^2(\infty)}$ for all $t \ge 0$ if $f \in L^2_0(\infty)$. This proves (2.9) and (2.10), with $\eta = 1/2$.

2.2. Estimates for the velocity fields.

If the velocity $u$ and the vorticity $\omega$ are related by the Biot-Savart law (1.6), we have $|u| \le J(|\omega|)$, where $J$ is the Riesz potential defined by

$$J(\phi)(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{1}{|x - y|^2}\, \phi(y)\,dy, \quad x \in \mathbb{R}^3. \tag{2.12}$$

Since $\omega$ will typically belong to the Banach space $\mathbb{X}(m)$, we need estimates on the Riesz potential $J(\phi)$ for $\phi \in X(m)$. We start with a preliminary result:

Lemma 2.2. Let $p_1 \in [1, 2)$, $p_2 \in [1, 2]$, and assume that $\phi \in X^{p_1}(0) \cap X^{p_2}(0)$. If $q_1, q_2 \in [1, \infty]$ satisfy

$$\frac{2p_1}{2 - p_1} < q_1 \le \infty, \qquad p_2 < q_2 < \frac{2p_2}{2 - p_2}, \tag{2.13}$$

then $J(\phi) = J_1(\phi) + J_2(\phi)$ with $J_i(\phi) \in X^{q_i}(0)$ for $i = 1, 2$, and we have the following estimates

$$\|J_1(\phi)\|_{X^{q_1}(0)} \le C(p_1, q_1)\, \|\phi\|_{X^{p_1}(0)}, \tag{2.14}$$

$$\|J_2(\phi)\|_{X^{q_2}(0)} \le C(p_2, q_2)\, \|\phi\|_{X^{p_2}(0)}. \tag{2.15}$$


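Proof. We proceed as in [9, Prop. A.9]. We first observe that

$$J(\phi)(x_h, x_3) = \int_{|x_3 - y_3| \ge 1} F(x_h; x_3, y_3)\,dy_3 + \int_{|x_3 - y_3| < 1} F(x_h; x_3, y_3)\,dy_3 = J_1(\phi)(x_h, x_3) + J_2(\phi)(x_h, x_3),$$

where

$$F(x_h; x_3, y_3) = \int_{\mathbb{R}^2} \frac{\phi(y_h, y_3)}{|x_h - y_h|^2 + (x_3 - y_3)^2}\,dy_h, \quad x_h \in \mathbb{R}^2, \; x_3, y_3 \in \mathbb{R}.$$

For any $a \in \mathbb{R}$, let $f_a(y_h) = (a^2 + |y_h|^2)^{-1}$. Then $f_a \in L^r(\mathbb{R}^2)$ for any $r > 1$ and any $a \neq 0$, and there exists $C_r > 0$ such that

$$\|f_a\|_{L^r(\mathbb{R}^2)} \le \frac{C_r}{|a|^{2 - \frac{2}{r}}}.$$

Moreover, we have $F(\cdot\,; x_3, y_3) = \phi(\cdot, y_3) \star f_{x_3 - y_3}$ by construction, where $\star$ denotes the convolution with respect to the horizontal variables. Thus, if we take $1 \le p, q, r \le \infty$ such that $1 + \frac1q = \frac1p + \frac1r$, we obtain using Young's inequality

$$\|F(\cdot\,; x_3, y_3)\|_{L^q(\mathbb{R}^2)} \le \|\phi(\cdot, y_3)\|_{L^p(\mathbb{R}^2)}\, \|f_{x_3 - y_3}\|_{L^r(\mathbb{R}^2)} \le \frac{C_r\, \|\phi(\cdot, y_3)\|_{L^p(\mathbb{R}^2)}}{|x_3 - y_3|^{2 - \frac{2}{r}}}.$$

To estimate $J_1(\phi)$, we choose $p = p_1$, $q = q_1$. In view of (2.13), the corresponding exponent $r = r_1$ satisfies $2 < r_1 \le \infty$, so that $2 - \frac{2}{r_1} \in (1, 2]$. By Minkowski's inequality, we thus find

$$\|J_1(\phi)(\cdot, x_3)\|_{L^{q_1}(\mathbb{R}^2)} \le \int_{|x_3 - y_3| \ge 1} \|F(\cdot\,; x_3, y_3)\|_{L^{q_1}(\mathbb{R}^2)}\,dy_3 \le C(r_1) \sup_{y_3 \in \mathbb{R}} \|\phi(\cdot, y_3)\|_{L^{p_1}(\mathbb{R}^2)}.$$

Taking the supremum over $x_3 \in \mathbb{R}$, we obtain (2.14). Similarly, to bound $J_2(\phi)$, we take $p = p_2$, $q = q_2$. Then $1 < r_2 < 2$, so that $2 - \frac{2}{r_2} \in (0, 1)$. We thus obtain

$$\|J_2(\phi)(\cdot, x_3)\|_{L^{q_2}(\mathbb{R}^2)} \le \int_{|x_3 - y_3| < 1} \|F(\cdot\,; x_3, y_3)\|_{L^{q_2}(\mathbb{R}^2)}\,dy_3 \le C(r_2) \sup_{y_3 \in \mathbb{R}} \|\phi(\cdot, y_3)\|_{L^{p_2}(\mathbb{R}^2)},$$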

and (2.15) follows. Finally, the uniform continuity of $J_i(\phi)(\cdot, x_3)$ with respect to $x_3$ can be verified exactly as in the proof of [9, Prop. A.9].

As an immediate consequence, we obtain the following useful statements.

Proposition 2.3. Let $\phi \in X(m)$ for some $m \in (1, \infty]$. Then $J(\phi) \in X^q(0)$ for all $q \in (2, \infty)$, and there exists a positive constant $C = C(m, q)$ such that

$$\|J(\phi)\|_{X^q(0)} \le C\, \|\phi\|_{X(m)}. \tag{2.16}$$

Proof. If $m > 1$, we recall that $X(m) \hookrightarrow X^p(0)$ for all $p \in [1, 2]$. Thus we can apply Lemma 2.2 with $p_1 = 1$, $p_2 = 2$, $q_1 = q_2 = q \in (2, \infty)$, and the result follows.
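The exponent bookkeeping behind Lemma 2.2 and Proposition 2.3 can be made explicit. The short Python sketch below is an illustration only, with hypothetical helper names: it computes the decay exponent $2 - 2/r$ from Young's relation $1 + 1/q = 1/p + 1/r$ in exact rational arithmetic, and records which exponents make $|x_3 - y_3|^{-(2 - 2/r)}$ integrable on $\{|x_3 - y_3| \ge 1\}$ or on $\{|x_3 - y_3| < 1\}$.

```python
from fractions import Fraction

def decay_exponent(p, q=None):
    """Return 2 - 2/r, where 1 + 1/q = 1/p + 1/r (q=None stands for q = infinity)."""
    inv_q = Fraction(0) if q is None else 1 / Fraction(q)
    inv_r = 1 + inv_q - 1 / Fraction(p)
    if not 0 <= inv_r <= 1:
        raise ValueError("Young's inequality needs 1 <= r <= infinity")
    return 2 - 2 * inv_r

def integrable_far(exponent):
    """True if |z|^(-exponent) is integrable on {|z| >= 1} in one dimension."""
    return exponent > 1

def integrable_near(exponent):
    """True if |z|^(-exponent) is integrable on {|z| < 1} in one dimension."""
    return exponent < 1
```

For instance, with the choice $p_1 = 1$, $p_2 = 2$, $q = 4$ from the proof of Proposition 2.3, `decay_exponent(1, 4)` gives $3/2$ (integrable at infinity, as needed for $J_1$) and `decay_exponent(2, 4)` gives $1/2$ (integrable near the origin, as needed for $J_2$).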


Corollary 2.4. Let $\phi_1, \phi_2 \in X(m)$ for some $m \in (1, \infty]$. Then $\phi_1 J(\phi_2) \in X^p(m)$ for all $p \in (1, 2)$, and there exists a positive constant $C = C(m, p)$ such that

$$\|\phi_1 J(\phi_2)\|_{X^p(m)} \le C\, \|\phi_1\|_{X(m)}\, \|\phi_2\|_{X(m)}. \tag{2.17}$$

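Proof. We proceed as in [9, Cor. A.10]. Let $p \in (1, 2)$, and take $q \in (2, \infty)$ such that $\frac1q = \frac1p - \frac12$. For any $x_3 \in \mathbb{R}$, we have by Hölder's inequality,

$$\|\phi_1(\cdot, x_3) J(\phi_2)(\cdot, x_3)\|_{L^p(m)} = \Bigl(\int_{\mathbb{R}^2} \rho_m(|x_h|^2)^{p/2}\, |\phi_1(x_h, x_3)|^p\, |J(\phi_2)(x_h, x_3)|^p\,dx_h\Bigr)^{1/p}$$
$$\le \Bigl(\int_{\mathbb{R}^2} \rho_m(|x_h|^2)\, |\phi_1(x_h, x_3)|^2\,dx_h\Bigr)^{1/2} \times \Bigl(\int_{\mathbb{R}^2} |J(\phi_2)(x_h, x_3)|^q\,dx_h\Bigr)^{1/q}$$
$$= \|\phi_1(\cdot, x_3)\|_{L^2(m)}\, \|J(\phi_2)(\cdot, x_3)\|_{L^q(0)}.$$

Taking the supremum over $x_3 \in \mathbb{R}$ and using Proposition 2.3, we obtain (2.17). Finally, it is clear that the map $x_3 \mapsto \phi_1(\cdot, x_3) J(\phi_2)(\cdot, x_3)$ is continuous from $\mathbb{R}$ into $L^p(m)$.

We conclude this section with an estimate on the linear operator $\Lambda$ defined in (1.11), which will be needed in Sect. 4.

Lemma 2.5. Let $p \in [1, 2]$ and $2 - \frac{2}{p} < m \le \infty$. For any $\beta \in \mathbb{N}^3$, there exists $C > 0$ such that

$$\|\partial_x^\beta \Lambda\omega\|_{\mathbb{X}^p(m)} \le C \sum_{|\tilde\beta| \le |\beta| + 1} \|\partial_x^{\tilde\beta} \omega\|_{\mathbb{X}^p(m)}. \tag{2.18}$$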

Proof. It is sufficient to prove (2.18) for $\beta = 0$. The general case easily follows if we use the Leibniz rule to differentiate $\Lambda\omega$ (we omit the details). Assume thus that $\omega$ belongs to $\mathbb{X}^p(m)$, together with its first order derivatives. Since the function $U^G$ defined in (1.9) is smooth and bounded (together with all its derivatives), it is clear that

$$\|(U^G, \nabla)\omega\|_{\mathbb{X}^p(m)} + \|(\omega, \nabla)U^G\|_{\mathbb{X}^p(m)} \le C \sum_{|\tilde\beta| \le 1} \|\partial_x^{\tilde\beta} \omega\|_{\mathbb{X}^p(m)}.$$

We now estimate the term $(u, \nabla)G = (K_{3D} * \omega, \nabla)G$, using the fact that $|K_{3D} * \omega| \le J(|\omega|)$. Since $|\omega| \in X^1(0) \cap X^p(0)$ by assumption, we can apply Lemma 2.2 with $p_1 = 1$, $q_1 = \infty$, $p_2 = p$, and $q_2 \in (p, \frac{2p}{2 - p})$. By Hölder's inequality, we easily find

$$\|J_1(|\omega|)\,|\nabla G|\|_{X^p(m)} \le C\, \|J_1(|\omega|)\|_{X^\infty(0)} \le C\, \||\omega|\|_{X^1(0)} \le C\, \|\omega\|_{\mathbb{X}^p(m)},$$
$$\|J_2(|\omega|)\,|\nabla G|\|_{X^p(m)} \le C\, \|J_2(|\omega|)\|_{X^{q_2}(0)} \le C\, \||\omega|\|_{X^p(0)} \le C\, \|\omega\|_{\mathbb{X}^p(m)}.$$

We conclude that $\|(u, \nabla)G\|_{\mathbb{X}^p(m)} = \|(K_{3D} * \omega, \nabla)G\|_{\mathbb{X}^p(m)} \le C\, \|\omega\|_{\mathbb{X}^p(m)}$. In a similar way, commuting the derivative and the convolution operator, we obtain the estimate $\|(G, \nabla)u\|_{\mathbb{X}^p(m)} \le \|(G, \nabla)(K_{3D} * \omega)\|_{\mathbb{X}^p(m)} \le C\, \|\nabla\omega\|_{\mathbb{X}^p(m)}$. This completes the proof.


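3. The Vectorial 2D Problem

In this section we study the linearized equation $\partial_t \omega = (L - \alpha\Lambda)\omega$ in the particular case where the vorticity $\omega$ does not depend on the vertical variable. As was explained in the Introduction, this preliminary step is an essential ingredient in the linear stability proof which will be presented in Sect. 4. Since the results established here will eventually be applied to the restriction of the 3D vorticity field to a horizontal plane $x_3 = \mathrm{const.}$, we do not assume in this section that $\omega$ is divergence-free.

If $\partial_{x_3}\omega = 0$, then $\mathcal{L}_3\omega = 0$, and the expression (2.1) of the linear operator $L$ becomes significantly simpler. On the other hand, we know from (1.11) that

$$\Lambda\omega = \Lambda_1\omega - \Lambda_2\omega + \Lambda_3\omega - \Lambda_4\omega, \tag{3.1}$$

where

$$\Lambda_1\omega = (U^G, \nabla)\omega = (U^G_h, \nabla_h)\omega, \qquad \Lambda_2\omega = (\omega, \nabla)U^G = (\omega_h, \nabla_h)U^G,$$
$$\Lambda_3\omega = (u, \nabla)G = (u_h, \nabla_h)G, \qquad \Lambda_4\omega = (G, \nabla)u = g\,\partial_{x_3}u. \tag{3.2}$$

Here $u = K_{3D} * \omega$ is the velocity field obtained from $\omega$ via the three-dimensional Biot-Savart law (1.6). Since $\partial_{x_3}\omega = 0$, we have $\partial_{x_3}u = 0$, hence $\Lambda_4\omega = 0$ in our case. Moreover, it is easy to verify that $u = (u_h, u_3)$, where $u_h = K_{2D} \star \omega_3$. Thus, we see that $(L - \alpha\Lambda)\omega = \mathcal{L}_\alpha\omega$ if $\partial_{x_3}\omega = 0$, where $\mathcal{L}_\alpha$ is the two-dimensional differential operator defined by

$$\mathcal{L}_\alpha\omega = \begin{pmatrix} \mathcal{L}_{\alpha,h}\,\omega_h \\ \mathcal{L}_{\alpha,3}\,\omega_3 \end{pmatrix} = \begin{pmatrix} (\mathcal{L}_h - \frac32)\omega_h - \alpha(\Lambda_1 - \tilde\Lambda_2)\omega_h \\ \mathcal{L}_h\omega_3 - \alpha(\Lambda_1 + \tilde\Lambda_3)\omega_3 \end{pmatrix}. \tag{3.3}$$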

Here $\tilde\Lambda_2\omega_h = (\omega_h, \nabla_h)U^G_h$ and $\tilde\Lambda_3\omega_3 = (K_{2D} \star \omega_3, \nabla_h)g$.

For any $\alpha \in \mathbb{R}$ and any $m \in (1, \infty]$, the operator $\mathcal{L}_\alpha$ defined by (3.3) is the generator of a strongly continuous semigroup in the space $L^2(m)^3$. This property can be established by a standard perturbation argument, see Lemma 3.2 below. Our main goal here is to obtain accurate decay estimates for the semigroup $e^{t\mathcal{L}_\alpha}$ as $t \to \infty$. As is clear from (3.3), the evolutions for $\omega_h$ and $\omega_3$ are completely decoupled, so that we can consider the semigroups $e^{t\mathcal{L}_{\alpha,h}}$ and $e^{t\mathcal{L}_{\alpha,3}}$ separately. The main contribution of this section is:

Proposition 3.1. Fix $m \in (1, \infty]$, $\alpha \in \mathbb{R}$, $\mu \in (0, \frac32)$, and take $\eta \in (0, \frac12]$ such that $1 + 2\eta < m$. Then there exists $C > 0$ such that

$$\|e^{t\mathcal{L}_{\alpha,h}}\omega_h\|_{L^2(m)^2} \le C e^{-\mu t}\, \|\omega_h\|_{L^2(m)^2}, \quad t \ge 0, \tag{3.4}$$

$$\|e^{t\mathcal{L}_{\alpha,3}}\omega_3\|_{L^2(m)} \le C e^{-\eta t}\, \|\omega_3\|_{L^2(m)}, \quad t \ge 0, \tag{3.5}$$

for all ω ∈ L 2 (m)2 × L 20 (m). Estimate (3.5) was obtained in [8, Prop. 4.12] for m < ∞, and the proof given there extends to the limiting case m = ∞ without additional difficulty. We recall that the decay rate e−ηt is obtained using the fact that ω3 ∈ L 20 (m): If we only assume that ω3 ∈ L 2 (m) for some m > 1, then (3.5) holds with η = 0. From now on, we focus on the semigroup et Lα,h , which has not been studied yet. To prove (3.4), we use the same arguments as in [8, Sect. 4.2]. We first establish a short time estimate:


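Lemma 3.2. Fix $m \in (1, \infty]$, $\alpha \in \mathbb{R}$, and $T > 0$. There exists $C = C(T, m, |\alpha|) > 0$ such that

$$\sup_{0 \le t \le T} \Bigl( \|e^{t\mathcal{L}_{\alpha,h}}\omega_h\|_{L^2(m)^2} + a(t)^{\frac12}\, \|\nabla_h e^{t\mathcal{L}_{\alpha,h}}\omega_h\|_{L^2(m)^4} \Bigr) \le C\, \|\omega_h\|_{L^2(m)^2}, \tag{3.6}$$

for all $\omega_h \in L^2(m)^2$. Here $a(t) = 1 - e^{-t}$.

Proof. Given $\omega_h^0 \in L^2(m)^2$, the idea is to solve the integral equation

$$\omega_h(t) = e^{t(\mathcal{L}_h - \frac32)}\omega_h^0 - \alpha \int_0^t e^{(t-s)(\mathcal{L}_h - \frac32)} (\Lambda_1 - \tilde\Lambda_2)\omega_h(s)\,ds, \quad t \in [0, T], \tag{3.7}$$

by a fixed point argument, in the space $X_T = \{\omega_h \in C([0, T], L^2(m)^2) \mid \|\omega_h\|_{X_T} < \infty\}$ defined by the norm

$$\|\omega_h\|_{X_T} = \sup_{0 \le t \le T} \|\omega_h(t)\|_{L^2(m)^2} + \sup_{0 \le t \le T} a(t)^{\frac12}\, \|\nabla_h\omega_h(t)\|_{L^2(m)^4}.$$

From (2.9) we know that $\|e^{t(\mathcal{L}_h - \frac32)}\omega_h^0\|_{X_T} \le C_1 \|\omega_h^0\|_{L^2(m)^2}$, for some $C_1 > 0$ independent of $T$. To estimate the integral term in (3.7), we first observe that the velocity field $U^G$ defined by (1.9) satisfies

$$\sup_{x_h \in \mathbb{R}^2} (1 + |x_h|)\,|U^G(x_h)| + \sup_{x_h \in \mathbb{R}^2} (1 + |x_h|)^2\, |\nabla_h U^G(x_h)| < \infty. \tag{3.8}$$

In view of the definitions (3.2), we thus have

$$\|(1 + |x_h|)\,\Lambda_1\omega_h\|_{L^2(m)^2} \le C\, \|\nabla_h\omega_h\|_{L^2(m)^4}, \tag{3.9}$$

$$\|(1 + |x_h|)^2\, \tilde\Lambda_2\omega_h\|_{L^2(m)^2} \le C\, \|\omega_h\|_{L^2(m)^2}. \tag{3.10}$$

Using these estimates together with (2.9), we can bound

$$\Bigl\|\int_0^t e^{(t-s)(\mathcal{L}_h - \frac32)} (\Lambda_1 - \tilde\Lambda_2)\omega_h(s)\,ds\Bigr\|_{L^2(m)^2} \le C \int_0^t e^{-\frac32(t-s)} \bigl( \|\omega_h(s)\|_{L^2(m)^2} + \|\nabla_h\omega_h(s)\|_{L^2(m)^4} \bigr)\,ds$$
$$\le C\, \|\omega_h\|_{X_T} \int_0^t e^{-\frac32(t-s)}\, a(s)^{-\frac12}\,ds \le C a(T)^{\frac12}\, \|\omega_h\|_{X_T}.$$

In a similar way,

$$\Bigl\|\nabla_h \int_0^t e^{(t-s)(\mathcal{L}_h - \frac32)} (\Lambda_1 - \tilde\Lambda_2)\omega_h(s)\,ds\Bigr\|_{L^2(m)^4} \le C \int_0^t \frac{e^{-\frac32(t-s)}}{a(t-s)^{\frac12}} \bigl( \|\omega_h(s)\|_{L^2(m)^2} + \|\nabla_h\omega_h(s)\|_{L^2(m)^4} \bigr)\,ds \le C\, \|\omega_h\|_{X_T}. \tag{3.11}$$

Summarizing, we have shown that $\|\omega_h\|_{X_T} \le C_1 \|\omega_h^0\|_{L^2(m)^2} + C_2 |\alpha|\, a(T)^{1/2}\, \|\omega_h\|_{X_T}$, for some positive constants $C_1$, $C_2$. If we now take $T > 0$ small enough so that $C_2 |\alpha|\, a(T)^{1/2} \le 1/2$, we see that the right-hand side of (3.7) is a strict contraction in $X_T$. We deduce that (3.7) has a unique solution, which satisfies $\|\omega_h\|_{X_T} \le 2C_1 \|\omega_h^0\|_{L^2(m)^2}$. Since $\omega_h(t) = e^{t\mathcal{L}_{\alpha,h}}\omega_h^0$ by construction, this proves (3.6) for $T$ sufficiently small, and the general case follows due to the semigroup property. This concludes the proof.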


We next consider the essential spectrum of the semigroup $e^{t\mathcal{L}_{\alpha,h}}$, and begin with a few definitions. If $A$ is a bounded linear operator on a (complex) Banach space $X$, we define the essential spectrum $\sigma_{ess}(A; X)$ as the set of all $z \in \mathbb{C}$ such that $A - z$ is not a Fredholm operator with zero index, see [16] or [4]. The essential spectral radius of $A$ in $X$ is given by

$$r_{ess}(A; X) = \sup\{|z| \; ; \; z \in \sigma_{ess}(A; X)\} < \infty.$$

If $|z| > r_{ess}(A; X)$, then either $z$ is in the resolvent set of $A$, or $z$ is an eigenvalue of $A$ with finite multiplicity, see [4, Cor. IV.2.11]. In the latter case, we say that $z$ belongs to the discrete spectrum of $A$.

In what follows, we consider the linear operator $\mathcal{L}_{\alpha,h}$ as acting on the complexified space $L^2(m)^2$, i.e. the space of all $\omega_h : \mathbb{R}^2 \to \mathbb{C}^2$ such that $\|\omega_h\|_{L^2(m)^2} < \infty$. Our first result shows that the essential spectral radius of the operator $e^{t\mathcal{L}_{\alpha,h}}$ in $L^2(m)^2$ does not depend on $\alpha$.

Proposition 3.3. Let $m \in (1, \infty]$ and $\alpha \in \mathbb{R}$. Then for each $t > 0$ we have

$$r_{ess}\bigl(e^{t\mathcal{L}_{\alpha,h}}; L^2(m)^2\bigr) = r_{ess}\bigl(e^{t\mathcal{L}_{0,h}}; L^2(m)^2\bigr) = e^{-(\frac{m}{2} + 1)t}. \tag{3.12}$$

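Proof. Since $\mathcal{L}_{0,h} = \mathcal{L}_h - \frac32$, the last equality in (3.12) follows from [7, Theorem A.1] if $m < \infty$. If $m = \infty$, then $e^{t\mathcal{L}_h}$ is a compact operator for any $t > 0$, hence $r_{ess}(e^{t\mathcal{L}_{0,h}}; L^2(\infty)^2) = 0$. To prove the first equality in (3.12), we fix $t > 0$. Our goal is to show that the linear operator $\Delta_\alpha(t) = e^{t\mathcal{L}_{\alpha,h}} - e^{t(\mathcal{L}_h - \frac32)}$ is compact in $L^2(m)^2$. By Weyl's theorem, this will imply that both semigroups have the same essential spectrum, hence the same essential spectral radius. In view of (3.7) we have, for all $\omega_h \in L^2(m)^2$,

$$\Delta_\alpha(t)\omega_h = -\alpha \int_0^t e^{(t-s)(\mathcal{L}_h - \frac32)} (\Lambda_1 - \tilde\Lambda_2)\, e^{s\mathcal{L}_{\alpha,h}}\omega_h\,ds. \tag{3.13}$$

Let $w(x_h) = 1 + |x_h|$. If $m < \infty$, it follows from (2.9) and definition (1.13) that

$$\|w\, e^{t\mathcal{L}_h}\omega_h\|_{L^2(m)^2} \le C\, \|e^{t\mathcal{L}_h}\omega_h\|_{L^2(m+1)^2} \le C\, \|w\,\omega_h\|_{L^2(m)^2}, \tag{3.14}$$

for all $\omega_h \in L^2(m)^2$ and all $t \ge 0$. If $m = \infty$, we know from [10, Prop. 2.1] that $w(-\mathcal{L}_h + 1)^{-1/2}$ is a bounded operator in $L^2(\infty)^2$, and since $\mathcal{L}_h$ is the generator of an analytic semigroup we easily obtain

$$\|w\, e^{t\mathcal{L}_h}\omega_h\|_{L^2(\infty)^2} \le C\, \|(-\mathcal{L}_h + 1)^{1/2} e^{t\mathcal{L}_h}\omega_h\|_{L^2(\infty)^2} \le \frac{C}{a(t)^{1/2}}\, \|\omega_h\|_{L^2(\infty)^2}, \tag{3.15}$$

for all $t > 0$. Now, starting from (3.13) and using either (3.14) or (3.15) together with (3.9), (3.10), and Lemma 3.2, we find

$$\|w\, \Delta_\alpha(t)\omega_h\|_{L^2(m)^2} \le C|\alpha| \int_0^t \frac{e^{-\frac32(t-s)}}{a(t-s)^{1/2}} \bigl( \|e^{s\mathcal{L}_{\alpha,h}}\omega_h\|_{L^2(m)^2} + \|\nabla_h e^{s\mathcal{L}_{\alpha,h}}\omega_h\|_{L^2(m)^4} \bigr)\,ds$$
$$\le C|\alpha| \int_0^t \frac{e^{-\frac32(t-s)}}{a(t-s)^{1/2}\, a(s)^{1/2}}\, \|\omega_h\|_{L^2(m)^2}\,ds \le C|\alpha|\, \|\omega_h\|_{L^2(m)^2}.$$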


Moreover, proceeding as in (3.11), we find $\|\nabla_h \Delta_\alpha(t)\omega_h\|_{L^2(m)^4} \le C|\alpha|\, \|\omega_h\|_{L^2(m)^2}$. Thus we have shown that $w\Delta_\alpha(t)$ and $\nabla_h\Delta_\alpha(t)$ are bounded operators in $L^2(m)$. By Rellich's criterion, we conclude that $\Delta_\alpha(t)$ is a compact operator in $L^2(m)^2$, for any $t > 0$. This completes the proof.

In view of Proposition 3.3, the spectrum of the semigroup $e^{t\mathcal{L}_{\alpha,h}}$ outside the disk of radius $e^{-(\frac{m}{2} + 1)t}$ in the complex plane is purely discrete. By the spectral mapping theorem [4], to control that part of the spectrum it is sufficient to locate the eigenvalues of the generator $\mathcal{L}_{\alpha,h}$. Thus we look for nontrivial solutions of the eigenvalue problem

$$\mathcal{L}_{\alpha,h}\,\omega_h = \lambda\,\omega_h, \tag{3.16}$$

where $\omega_h \in L^2(m)^2$ and $\lambda \in \mathbb{C}$ satisfies $\mathrm{Re}\,\lambda > -\frac{m}{2} - 1$. The following auxiliary result shows that the eigenfunctions $\omega_h$ always have a Gaussian decay at infinity.

Proposition 3.4. Let $m \in (1, \infty)$ and $\alpha \in \mathbb{R}$. If $\omega_h \in L^2(m)^2$ is a solution of (3.16) with $\mathrm{Re}\,\lambda > -\frac{m}{2} - 1$, then $\omega_h \in L^2(\infty)^2$.

The proof of Proposition 3.4 is postponed to Sect. 6.2 below. Note that a similar result for the nonlocal operator $\mathcal{L}_{\alpha,3}$ has been obtained in [8, Lemma 4.5], and plays a key role in the derivation of estimate (3.5). Thanks to Proposition 3.4, we only need to control the eigenvalues of $\mathcal{L}_{\alpha,h}$ in the Gaussian space $L^2(\infty)^2$. This is the last important step in the proof of Proposition 3.1.

Proposition 3.5. If $\lambda$ is an eigenvalue of $\mathcal{L}_{\alpha,h}$ in $L^2(\infty)^2$, then $\mathrm{Re}\,\lambda \le -\frac32$.

Proof. Assume that $\omega_h \in L^2(\infty)^2$ is a nontrivial solution of the eigenvalue problem (3.16), for some $\alpha \in \mathbb{R}$ and some $\lambda \in \mathbb{C}$. Using (3.3), we thus have

$$\lambda\omega_h = \mathcal{L}_h\omega_h - \frac32\omega_h - \alpha(U^G_h, \nabla_h)\omega_h + \alpha(\omega_h, \nabla_h)U^G_h, \tag{3.17}$$

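where the velocity field $U^G$ is defined in (1.9). Since $\mathcal{L}_{\alpha,h}$ is a relatively compact perturbation of $\mathcal{L}_{0,h} = \mathcal{L}_h - \frac32$, both operators have the same domain, and it follows that $\omega_h$ belongs to the domain of $\mathcal{L}_h$. In particular, we have $\nabla_h\omega_h \in L^2(\infty)^4$ and $|x_h|\omega_h \in L^2(\infty)^2$, see e.g. [10, Sect. 2]. In the rest of the proof, we denote by $\langle\cdot, \cdot\rangle$ the inner product in the complexified space $L^2(\infty)^2$, namely

$$\langle\omega_h^1, \omega_h^2\rangle = \int_{\mathbb{R}^2} p(x_h)\, \omega_h^1(x_h) \cdot \overline{\omega_h^2(x_h)}\,dx_h,$$

where $p(x_h) = \rho_\infty(|x_h|^2) = e^{|x_h|^2/4}$. We also denote $\|\omega_h\|^2 = \langle\omega_h, \omega_h\rangle$. We recall that $\mathcal{L}_h$ is a selfadjoint operator in $L^2(\infty)^2$ which satisfies $-\mathcal{L}_h \ge 0$ on $L^2(\infty)^2$ and $-\mathcal{L}_h \ge 1/2$ on $L^2_0(\infty)^2$. For later use, we observe that the (unbounded) operator $\omega_h \mapsto (U^G_h, \nabla_h)\omega_h$ is skew-symmetric in $L^2(\infty)^2$, because the vector field $p(x_h)U^G(x_h)$ is divergence-free.

We now take the inner product of (3.17) with $\omega_h$, and evaluate the real part of the result. Using the skew-symmetry of the operator $(U^G_h, \nabla_h)$, we easily obtain

$$\mathrm{Re}\,\lambda\, \|\omega_h\|^2 = \langle\mathcal{L}_h\omega_h, \omega_h\rangle - \frac32\|\omega_h\|^2 + \alpha\, \mathrm{Re}\,\langle(\omega_h, \nabla_h)U^G_h, \omega_h\rangle$$
$$= \langle\mathcal{L}_h\omega_h, \omega_h\rangle - \frac32\|\omega_h\|^2 + 2\alpha\, \mathrm{Re} \int_{\mathbb{R}^2} p(x_h)\,(x_h \cdot \omega_h)(x_h^\perp \cdot \overline{\omega_h})\,(u^g)'(|x_h|^2)\,dx_h, \tag{3.18}$$

where $u^g(r)$ is defined in (1.9). On the other hand, it follows from (3.17) that the scalar function $x_h \cdot \omega_h \in L^2(\infty)$ satisfies

$$\lambda\, x_h \cdot \omega_h = \mathcal{L}_h(x_h \cdot \omega_h) - 2\,x_h \cdot \omega_h - \alpha(U^G_h, \nabla_h)(x_h \cdot \omega_h) - 2\,\nabla_h \cdot \omega_h.$$

Thus, proceeding as above and using the same notation $\langle\cdot, \cdot\rangle$ for the inner product in $L^2(\infty)$, we find

$$\mathrm{Re}\,\lambda\, \|x_h \cdot \omega_h\|^2 = \langle\mathcal{L}_h(x_h \cdot \omega_h), x_h \cdot \omega_h\rangle - 2\|x_h \cdot \omega_h\|^2 - 2\,\mathrm{Re}\,\langle\nabla_h \cdot \omega_h, x_h \cdot \omega_h\rangle. \tag{3.19}$$

Finally, the two-dimensional divergence $\nabla_h \cdot \omega_h \in L^2_0(\infty)$ satisfies

$$\lambda\, \nabla_h \cdot \omega_h = \mathcal{L}_h(\nabla_h \cdot \omega_h) - \nabla_h \cdot \omega_h - \alpha(U^G_h, \nabla_h)(\nabla_h \cdot \omega_h), \tag{3.20}$$

hence

$$\mathrm{Re}\,\lambda\, \|\nabla_h \cdot \omega_h\|^2 = \langle\mathcal{L}_h(\nabla_h \cdot \omega_h), \nabla_h \cdot \omega_h\rangle - \|\nabla_h \cdot \omega_h\|^2. \tag{3.21}$$

Since $\nabla_h \cdot \omega_h \in L^2_0(\infty)$, it follows from (3.21) that $\mathrm{Re}\,\lambda\, \|\nabla_h \cdot \omega_h\|^2 \le -\frac32\|\nabla_h \cdot \omega_h\|^2$. Thus we must have $\mathrm{Re}\,\lambda \le -\frac32$, unless $\nabla_h \cdot \omega_h \equiv 0$. In the latter case, we deduce from (3.19) that $\mathrm{Re}\,\lambda\, \|x_h \cdot \omega_h\|^2 \le -2\|x_h \cdot \omega_h\|^2$, hence $\mathrm{Re}\,\lambda \le -2$ unless $x_h \cdot \omega_h \equiv 0$. But if this last condition is met, it follows from (3.18) that $\mathrm{Re}\,\lambda\, \|\omega_h\|^2 \le -\frac32\|\omega_h\|^2$, hence $\mathrm{Re}\,\lambda \le -\frac32$ because $\omega_h$ is not identically zero. Summarizing, we conclude that $\mathrm{Re}\,\lambda \le -\frac32$ in all cases.

Remark. Actually the conclusions of Proposition 3.5 can be slightly strengthened. First, in the invariant subspace where $\nabla_h \cdot \omega_h = 0$, one can show that all eigenvalues of $\mathcal{L}_{\alpha,h}$ satisfy $\mathrm{Re}\,\lambda \le -2$. This follows from the proof above if we use in addition the fact that $\omega_h \in L^2_0(\infty)^2$, due to the divergence-free condition. The result is clearly sharp, because if $g(x_h)$ is defined by (1.8) it is easy to verify that the function $\omega_h = x_h^\perp g(x_h)$ satisfies $\mathcal{L}_{\alpha,h}\omega_h = -2\omega_h$ for any $\alpha \in \mathbb{R}$. On the other hand, if $\omega_h$ is a solution of (3.17) such that $\nabla_h \cdot \omega_h \neq 0$, we have $\mathrm{Re}\,\lambda < -\frac32$ if $\alpha \neq 0$. This follows from (3.21), because we know from [7, App. A] that

$$\langle\mathcal{L}_h(\nabla_h \cdot \omega_h), \nabla_h \cdot \omega_h\rangle < -\frac12\|\nabla_h \cdot \omega_h\|^2,$$

unless $\nabla_h \cdot \omega_h = (a_1 x_1 + a_2 x_2)\,g(x_h)$ for some $a_1, a_2 \in \mathbb{C}$. But this ansatz is not compatible with (3.20) if $\alpha \neq 0$. In fact, using the techniques developed in [20] or [6], it is possible to show that, given any $M > 0$, the eigenvalue equation (3.20) restricted to the orthogonal complement of the space of all radially symmetric functions in $L^2(\infty)$ has no nontrivial solution such that $\mathrm{Re}\,\lambda \ge -M$, if $|\alpha|$ is sufficiently large depending on $M$.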


It is now easy to conclude the proof of Proposition 3.1. As was already mentioned, we only need to prove that estimate (3.4) holds for any $\mu < 3/2$. If $\rho_\alpha(m) > 0$ denotes the spectral radius of the operator $e^{\mathcal{L}_{\alpha,h}}$ in $L^2(m)^2$, this is equivalent to showing that $\log\rho_\alpha(m) \le -3/2$, see [4, Prop. IV.2.2]. But that inequality follows immediately from Propositions 3.3, 3.4, and 3.5, since $m > 1$. The proof of Proposition 3.1 is now complete.

4. Linear Stability

Equipped with the results of the previous section, we now study the linearized equation (1.22) in its full generality. Using Proposition 2.1 and a perturbation argument, it is not difficult to verify that the linear operator $L - \alpha\Lambda$ generates a locally bounded semigroup in the space $\mathbb{X}(m)$ for any $\alpha \in \mathbb{R}$ and any $m \in (1, \infty]$, see Proposition 4.2 below. The goal of this section is to show that the semigroup $e^{t(L - \alpha\Lambda)}$ extends to a bounded operator from $\mathbb{X}^p(m)$ to $\mathbb{X}(m)$ for any $t > 0$ and any $p \in [1, 2]$, and satisfies the following uniform estimates:

Proposition 4.1. Fix $m \in (1, \infty]$, $p \in [1, 2]$, $\alpha \in \mathbb{R}$, and take $\mu \in (1, \frac32)$, $\eta \in (0, \frac12]$ such that $2\mu < m + 1$ and $2\eta < m - 1$. For any $\beta = (\beta_1, \beta_2, \beta_3) \in \mathbb{N}^3$, there exists $C > 0$ such that

$$\|\partial_x^\beta (e^{t(L - \alpha\Lambda)}\omega_0)_h\|_{X(m)^2} \le \frac{C e^{-(\mu + \beta_3)t}}{a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}\, \|\omega_0\|_{\mathbb{X}^p(m)}, \tag{4.1}$$

$$\|\partial_x^\beta (e^{t(L - \alpha\Lambda)}\omega_0)_3\|_{X(m)} \le \frac{C e^{-(\eta + \beta_3)t}}{a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}\, \|\omega_0\|_{\mathbb{X}^p(m)}, \tag{4.2}$$

for any $\omega_0 \in \mathbb{X}^p(m)$ and all $t > 0$. Moreover, if $\nabla \cdot \omega_0 = 0$, then $\nabla \cdot e^{t(L - \alpha\Lambda)}\omega_0 = 0$ for all $t > 0$.

The proof of this important result is divided into several steps.

4.1. Global existence and short time estimates.

We first prove that the linearized equation (1.22) has a unique global solution in $\mathbb{X}(m)$.

Proposition 4.2. Fix $m \in (1, \infty]$, $p \in [1, 2]$, and $\alpha \in \mathbb{R}$. Then, for any $\omega_0 \in \mathbb{X}^p(m)$, Eq. (1.22) has a unique solution $\omega \in L^\infty_{loc}(\mathbb{R}_+; \mathbb{X}(m)) \cap C([0, \infty); \mathbb{X}^p_{loc}(m))$ with initial data $\omega_0$. Moreover, for any $\beta \in \mathbb{N}^3$, there exist positive constants $C_1$, $C_2$ (independent of $\alpha$) such that

$$\|\partial_x^\beta \omega(t)\|_{\mathbb{X}(m)} \le \frac{C_1}{a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}\, \|\omega_0\|_{\mathbb{X}^p(m)}, \quad \text{for } 0 < t \le \frac{C_2}{|\alpha|^2 + 1}, \tag{4.3}$$

where $a(t) = 1 - e^{-t}$. Finally, if $\nabla \cdot \omega_0 = 0$, then $\nabla \cdot \omega(t) = 0$ for all $t > 0$.

Proof. We proceed as in the proof of Lemma 3.2. Let $e^{tL}$ be the semigroup generated by $L$, which is given by the explicit expression (2.6). The integral equation corresponding to (1.22) is

$$\omega(t) = e^{tL}\omega_0 - \alpha \int_0^t e^{(t-s)L} \Lambda\omega(s)\,ds =: (F\omega)(t), \quad t > 0. \tag{4.4}$$


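Given $k \in \mathbb{N} \setminus \{0\}$ and a sufficiently small $T \in (0, 1]$, we shall solve (4.4) in the Banach space

$$U_{k,T} = \bigl\{\omega \in L^\infty_{loc}((0, T); \mathbb{X}(m)) \cap C([0, T]; \mathbb{X}^p_{loc}(m)) \;\big|\; \|\omega\|_{k,T} < \infty\bigr\},$$

equipped with the norm

$$\|\omega\|_{k,T} = \sum_{|\beta| \le k} \Bigl( \sup_{0 < t \le T} a(t)^{\frac1p - \frac12 + \frac{|\beta|}{2}}\, \|\partial_x^\beta \omega(t)\|_{\mathbb{X}(m)} + \sup_{0 < t \le T} a(t)^{\frac{|\beta|}{2}}\, \|\partial_x^\beta \omega(t)\|_{\mathbb{X}^p(m)} \Bigr),$$

where $a(t) = 1 - e^{-t}$. If $\omega_0 \in \mathbb{X}^p(m)$, we know from Proposition 2.1 that the map $t \mapsto e^{tL}\omega_0$ belongs to $U_{k,T}$ for any $T > 0$, and that $\|e^{tL}\omega_0\|_{k,T} \le C_1 \|\omega_0\|_{\mathbb{X}^p(m)}$ for some $C_1 > 0$ depending only on $k$, $m$, $p$. Given $\omega \in U_{k,T}$, we now estimate the integral term in (4.4). Using Proposition 2.1 and Lemma 2.5, we find

$$\|\partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\|_{\mathbb{X}(m)} \le \frac{C\, \|\Lambda\omega(s)\|_{\mathbb{X}^p(m)}}{a(t-s)^{\frac1p - \frac12 + \frac{|\beta|}{2}}} \le \frac{C \sum_{|\tilde\beta| \le 1} \|\partial_x^{\tilde\beta} \omega(s)\|_{\mathbb{X}^p(m)}}{a(t-s)^{\frac1p - \frac12 + \frac{|\beta|}{2}}} \le \frac{C\, \|\omega\|_{k,T}}{a(t-s)^{\frac1p - \frac12 + \frac{|\beta|}{2}}\, a(s)^{\frac12}}, \quad 0 < s < t. \tag{4.5}$$

Similarly we have $\|\partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\|_{\mathbb{X}^p(m)} \le C a(t-s)^{-\frac{|\beta|}{2}}\, a(s)^{-\frac12}\, \|\omega\|_{k,T}$ for $0 < s < t$. In the particular case where $\beta = 0$, it follows that

$$\Bigl\|\int_0^t e^{(t-s)L} \Lambda\omega(s)\,ds\Bigr\|_{\mathbb{X}(m)} \le C a(t)^{1 - \frac1p}\, \|\omega\|_{k,T}, \tag{4.6}$$

$$\Bigl\|\int_0^t e^{(t-s)L} \Lambda\omega(s)\,ds\Bigr\|_{\mathbb{X}^p(m)} \le C a(t)^{\frac12}\, \|\omega\|_{k,T}, \quad 0 < t \le T. \tag{4.7}$$

Assume now that $1 \le |\beta| \le k$. If $\beta' \le \beta$ and $|\beta'| = |\beta| - 1$, we have from Lemma 2.5

$$\|\partial_x^{\beta'} \Lambda\omega(s)\|_{\mathbb{X}(m)} \le C \sum_{|\tilde\beta| \le |\beta|} \|\partial_x^{\tilde\beta} \omega(s)\|_{\mathbb{X}(m)} \le \frac{C\, \|\omega\|_{k,T}}{a(s)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}, \quad 0 < s \le T.$$

Thus, writing $\partial_x^\beta e^{(t-s)L} = \partial_x^{\beta - \beta'} \partial_x^{\beta'} e^{(t-s)L} = \partial_x^{\beta - \beta'}\, e^{(\frac{\beta_1' + \beta_2'}{2} - \beta_3')(t-s)}\, e^{(t-s)L} \partial_x^{\beta'}$, and using Proposition 2.1 again, we obtain

$$\|\partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\|_{\mathbb{X}(m)} \le C\, \|\partial_x^{\beta - \beta'} e^{(t-s)L} \partial_x^{\beta'} \Lambda\omega(s)\|_{\mathbb{X}(m)} \le \frac{C\, \|\omega\|_{k,T}}{a(t-s)^{\frac12}\, a(s)^{\frac1p - \frac12 + \frac{|\beta|}{2}}}, \tag{4.8}$$

for $0 < s < t$. Similarly, we have $\|\partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\|_{\mathbb{X}^p(m)} \le C a(t-s)^{-\frac12}\, a(s)^{-\frac{|\beta|}{2}}\, \|\omega\|_{k,T}$. Combining (4.5) and (4.8), we obtain the following estimate: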


$$\Bigl\|\int_0^t \partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\,ds\Bigr\|_{\mathbb{X}(m)} \le C \Bigl( \int_0^{t/2} a(t-s)^{-\frac1p + \frac12 - \frac{|\beta|}{2}}\, a(s)^{-\frac12}\,ds + \int_{t/2}^t a(t-s)^{-\frac12}\, a(s)^{-\frac1p + \frac12 - \frac{|\beta|}{2}}\,ds \Bigr) \|\omega\|_{k,T}$$
$$\le C a(t)^{1 - \frac1p - \frac{|\beta|}{2}}\, \|\omega\|_{k,T}, \quad 0 < t \le T, \tag{4.9}$$

which generalizes (4.6). Similarly, the generalization of (4.7) is

$$\Bigl\|\int_0^t \partial_x^\beta e^{(t-s)L} \Lambda\omega(s)\,ds\Bigr\|_{\mathbb{X}^p(m)} \le C a(t)^{\frac12 - \frac{|\beta|}{2}}\, \|\omega\|_{k,T}, \quad 0 < t \le T. \tag{4.10}$$

Summarizing, we have shown that the linear map $F$ defined by (4.4) satisfies the estimate

$$\|F\omega\|_{k,T} \le C_1 \|\omega_0\|_{\mathbb{X}^p(m)} + \tilde C |\alpha|\, T^{\frac12}\, \|\omega\|_{k,T}, \quad \text{if } 0 < T \le 1,$$

where $\tilde C > 0$ depends only on $k$, $m$ and $p$. Arguing as in [9, Cor. A.7 and Remark A.8], it is also straightforward to verify that $F\omega \in C([0, T]; \mathbb{X}^p_{loc}(m))$ if $\omega \in U_{k,T}$. If we now assume that $T \le C_2(1 + |\alpha|^2)^{-1}$, where $C_2 = 1/(4\tilde C^2)$, we see that $F$ is a strict contraction in $U_{k,T}$. As a consequence, the integral equation (4.4) has a unique fixed point $\omega \in U_{k,T}$, which satisfies $\|\omega\|_{k,T} \le 2C_1 \|\omega_0\|_{\mathbb{X}^p(m)}$. This proves that Eq. (1.22) is locally well-posed in $\mathbb{X}^p(m)$, and that the solutions satisfy (4.3). Moreover, as the local existence time $T$ is independent of the initial data, the solutions can be extended globally in time. Finally, since both operators $L$ and $\Lambda$ preserve the divergence-free condition, it is easy to check that, if $\nabla \cdot \omega_0 = 0$, the solution $\omega$ of (1.22) satisfies $\nabla \cdot \omega(t) = 0$ for all $t > 0$. This completes the proof.
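The arithmetic behind the choice of the existence time in the proof of Proposition 4.2 can be checked directly. The Python sketch below is illustrative only: the function names and the sample values of the constants are assumptions, and the code merely verifies that $T \le C_2(1 + |\alpha|^2)^{-1}$ with $C_2 = 1/(4\tilde C^2)$ forces the contraction factor $\tilde C|\alpha|T^{1/2}$ to be at most $1/2$, which in turn yields the factor $2C_1$ in the bound on the fixed point.

```python
import math

def existence_time(c_tilde, alpha):
    """Largest admissible T: min(1, C2 / (1 + alpha^2)) with C2 = 1 / (4 c_tilde^2)."""
    c2 = 1.0 / (4.0 * c_tilde ** 2)
    return min(1.0, c2 / (1.0 + alpha ** 2))

def contraction_factor(c_tilde, alpha, T):
    """Lipschitz constant c_tilde * |alpha| * sqrt(T) of the map F in the norm of U_{k,T}."""
    return c_tilde * abs(alpha) * math.sqrt(T)

def fixed_point_bound(c1, kappa, norm_omega0):
    """Bound C1 * ||omega_0|| / (1 - kappa) obtained by absorbing the integral term."""
    return c1 * norm_omega0 / (1.0 - kappa)
```

Since $|\alpha|/\sqrt{1 + |\alpha|^2} < 1$, the factor stays below $1/2$ for every $\alpha$, so the existence time degrades only like $(1 + |\alpha|^2)^{-1}$ as $|\alpha|$ grows.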

4.2. Decay estimates for the vertical derivatives.

Proposition 4.2 shows that the linearized equation (1.22) is globally well-posed in the space $\mathbb{X}(m)$ for $m > 1$, but does not provide accurate estimates on the solution $\omega(t) = e^{t(L - \alpha\Lambda)}\omega_0$ for large times. In this section, we focus on the derivatives of $\omega(t)$ with respect to the vertical variable $x_3$. Using identity (1.23), we shall show that $\partial_{x_3}^k \omega(t)$ decays exponentially as $t \to \infty$, provided $k \in \mathbb{N}$ is large enough depending on $|\alpha|$. Albeit elementary, this observation plays a crucial role in the proof of Proposition 4.1, because it will allow us to simplify the study of the semigroup associated to $L - \alpha\Lambda$ by disregarding most of the terms involving a vertical derivative.

Proposition 4.3. Fix $m \in (1, \infty]$. There exist positive constants $C_3$, $C_4$ such that, for all $\alpha \in \mathbb{R}$, all $k \in \mathbb{N}$, and all $\omega_0 \in \mathbb{X}(m)$ with $\partial_{x_3}^k \omega_0 \in \mathbb{X}(m)$, the following estimate holds:

$$\|\partial_{x_3}^k e^{t(L - \alpha\Lambda)}\omega_0\|_{\mathbb{X}(m)} \le C_3\, e^{(C_4(|\alpha|^2 + 1) - k)t}\, \|\partial_{x_3}^k \omega_0\|_{\mathbb{X}(m)}, \quad t \ge 0. \tag{4.11}$$

Proof. In view of (1.23), it is sufficient to prove (4.11) for $k = 0$. If $\omega_0 \in \mathbb{X}(m)$, we know from Proposition 4.2 that there exist constants $C_1 \ge 1$ and $C_2 > 0$, depending only on $m$, such that the solution $\omega(t) = e^{t(L - \alpha\Lambda)}\omega_0$ of (1.22) satisfies $\|\omega(t)\|_{\mathbb{X}(m)} \le C_1 \|\omega_0\|_{\mathbb{X}(m)}$ for $t \in (0, t_0]$, where $t_0 = C_2/(|\alpha|^2 + 1)$. Using the semigroup property, we can iterate this bound, and we easily obtain

$$\|e^{t(L - \alpha\Lambda)}\omega_0\|_{\mathbb{X}(m)} \le C_3\, e^{C_4(|\alpha|^2 + 1)t}\, \|\omega_0\|_{\mathbb{X}(m)}, \quad t \ge 0,$$

where $C_3 = C_1$ and $C_4 = C_2^{-1}\log(C_1)$. This concludes the proof.
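The iteration step in the proof of Proposition 4.3 can be spelled out as follows. If $t \in ((n-1)t_0, n t_0]$, applying the short-time bound $n$ times gives the factor $C_1^n$, and since $n \le t/t_0 + 1$ and $C_1 \ge 1$ one has $C_1^n \le C_1\, e^{\log(C_1)\, t/t_0}$. The Python sketch below, with assumed sample constants and hypothetical function names, compares the two sides numerically.

```python
import math

def iterated_bound(c1, c2, alpha, t):
    """Growth factor C1^n after n = ceil(t / t0) steps of length t0 = C2 / (alpha^2 + 1)."""
    t0 = c2 / (alpha ** 2 + 1.0)
    n = math.ceil(t / t0)
    return c1 ** n

def exponential_bound(c1, c2, alpha, t):
    """Right-hand side C3 * exp(C4 * (alpha^2 + 1) * t) with C3 = C1, C4 = log(C1) / C2."""
    c3 = c1
    c4 = math.log(c1) / c2
    return c3 * math.exp(c4 * (alpha ** 2 + 1.0) * t)
```

The comparison shows why the growth rate in (4.11) is proportional to $|\alpha|^2 + 1$: the number of short-time steps needed to reach time $t$ scales like $t(|\alpha|^2 + 1)/C_2$.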

4.3. Decomposition of the linearized operator.

Motivated by Proposition 4.3, we now decompose the linear operator $L - \alpha\Lambda$ as follows:

$$L - \alpha\Lambda = \mathcal{L}_\alpha + \mathcal{L}_3 - \alpha H, \tag{4.12}$$

where $\mathcal{L}_\alpha$ is defined in (3.3) and $\mathcal{L}_3 = \partial_{x_3}^2 - x_3\partial_{x_3}$. We recall that the operator $\mathcal{L}_\alpha$ does not involve any derivative with respect to the vertical variable $x_3$, and does not couple the horizontal and vertical components of $\omega = (\omega_h, \omega_3)$. In view of (3.1)–(3.3), the last term in (4.12) has the following expression: $H = \Lambda_3 - \tilde\Lambda_3 - \Lambda_4$, where $\Lambda_3$, $\Lambda_4$ are defined in (3.2) and $\tilde\Lambda_3$ after (3.3). More explicitly, we have

$$H\omega = \begin{pmatrix} H_h\omega \\ H_3\omega \end{pmatrix} = \begin{pmatrix} -g\,(K_{3D} * \partial_{x_3}\omega)_h \\ ((K_{3D} * \omega)_h - K_{2D} \star \omega_3, \nabla_h)g - g\,(K_{3D} * \partial_{x_3}\omega)_3 \end{pmatrix}, \tag{4.13}$$

where $K_{3D}$, $K_{2D}$ are the Biot-Savart kernels (1.6), (1.7), and $g$ is defined in (1.8). Here $\star$ denotes the convolution with respect to the horizontal variables, so that

$$(K_{2D} \star \omega_3)(x_h, x_3) = \int_{\mathbb{R}^2} K_{2D}(x_h - y_h)\, \omega_3(y_h, x_3)\,dy_h.$$

Thus, unlike $\mathcal{L}_\alpha$, the operator $H$ involves vertical derivatives, and couples the horizontal and vertical components of $\omega$. As was already observed in Sect. 3, we have $H\omega = 0$ whenever $\partial_{x_3}\omega = 0$, see Proposition 4.5 below.

Let $R_\alpha(t)$ denote the semigroup generated by the linear operator $\mathcal{L}_\alpha + \mathcal{L}_3$. In analogy with (2.5), we have the following representation:

$$(R_\alpha(t)\omega)(x) = \frac{1}{\sqrt{2\pi a(2t)}} \int_{\mathbb{R}} e^{-\frac{|x_3 e^{-t} - y_3|^2}{2a(2t)}} \bigl(e^{t\mathcal{L}_\alpha}\omega(\cdot, y_3)\bigr)(x_h)\,dy_3, \quad t > 0, \tag{4.14}$$

where $a(t) = 1 - e^{-t}$ and $e^{t\mathcal{L}_\alpha}$ is the semigroup generated by $\mathcal{L}_\alpha$. Since $R_\alpha(t)$ does not couple the horizontal and vertical components of $\omega$, we can write

$$R_\alpha(t)\omega = \begin{pmatrix} R_{\alpha,h}(t)\,\omega_h \\ R_{\alpha,3}(t)\,\omega_3 \end{pmatrix},$$

where $R_{\alpha,h}(t)$ and $R_{\alpha,3}(t)$ are the semigroups generated by $\mathcal{L}_{\alpha,h} + \mathcal{L}_3$ and $\mathcal{L}_{\alpha,3} + \mathcal{L}_3$, respectively. Using the results of Sect. 3, we obtain the following estimates:


Proposition 4.4. Fix $m \in (1, \infty]$, $\alpha \in \mathbb{R}$, $\mu \in (1, \frac32)$, and take $\eta \in (0, \frac12]$ such that $2\eta < m - 1$. Then there exists $C_5 > 0$ such that

$$\|R_{\alpha,h}(t)\omega_h\|_{X(m)^2} \le C_5\, e^{-\mu t}\, \|\omega_h\|_{X(m)^2}, \tag{4.15}$$

$$\|R_{\alpha,3}(t)\omega_3\|_{X(m)} \le C_5\, e^{-\eta t}\, \|\omega_3\|_{X(m)}, \tag{4.16}$$

for all $\omega \in \mathbb{X}(m)$ and all $t \ge 0$.

Proof. Both estimates follow from the representation (4.14), Proposition 3.1, and estimate (2.11). The calculations are straightforward, and can be omitted here. We just remark that, even if $\nabla \cdot \omega = 0$, the map $x_h \mapsto \omega_h(x_h, x_3)$ usually has a nonzero divergence for all values of $x_3 \in \mathbb{R}$. This is why Proposition 3.1, hence also Proposition 4.4, was established without imposing any divergence-free condition.

We conclude this section with a useful bound on the linear operator $H$.

Proposition 4.5. Fix $m \in (1, \infty]$ and $\gamma \in (0, 1)$. There exists $C_6 > 0$ such that, for all $\omega \in \mathbb{X}(m)$ with $\partial_{x_3}\omega \in \mathbb{X}(m)$, one has

$$\|H_h\omega\|_{X(m)^2} \le C_6\, \|\partial_{x_3}\omega\|_{\mathbb{X}(m)}, \tag{4.17}$$

$$\|H_3\omega\|_{X(m)} \le C_6 \bigl( \|\partial_{x_3}\omega\|_{\mathbb{X}(m)} + \|\omega_h\|_{X(m)^2}^{\gamma}\, \|\partial_{x_3}\omega_h\|_{X(m)^2}^{1-\gamma} \bigr). \tag{4.18}$$

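Proof. We use the expression (4.13) of the linear operator $H$. Since $\partial_{x_3}\omega \in \mathbb{X}(m)$, we know from Proposition 2.3 that $\partial_{x_3}u \equiv K_{3D} * \partial_{x_3}\omega \in X^4(0)$. Thus, using Hölder's inequality, we obtain

$$\|g\, \partial_{x_3}u\|_{\mathbb{X}(m)} \le \|\partial_{x_3}u\|_{X^4(0)} \Bigl(\int_{\mathbb{R}^2} \rho_m(|x_h|^2)^2\, g(x_h)^4\,dx_h\Bigr)^{1/4} \le C\, \|\partial_{x_3}\omega\|_{\mathbb{X}(m)}.$$

In particular, we have $\|H_h\omega\|_{X(m)^2} \le C\, \|\partial_{x_3}\omega\|_{\mathbb{X}(m)}$, which is (4.17). We next consider the two-dimensional vector $I = (K_{3D} * \omega)_h - K_{2D} \star \omega_3$ and estimate the term $(I, \nabla_h)g$. Using the definitions (1.6), (1.7), it is straightforward to verify that $I(x) = I_1(x) + I_2(x)$, where

$$I_1(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{(x_h - y_h)^\perp}{|x - y|^3}\, \bigl(\omega_3(y_h, y_3) - \omega_3(y_h, x_3)\bigr)\,dy,$$
$$I_2(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{x_3 - y_3}{|x - y|^3}\, \bigl(\omega_h(y_h, y_3) - \omega_h(y_h, x_3)\bigr)^\perp\,dy.$$

Here we have used the identities

$$\int_{\mathbb{R}} \frac{1}{|x - y|^3}\,dy_3 = \frac{2}{|x_h - y_h|^2}, \quad \text{and} \quad \int_{\mathbb{R}} \frac{x_3 - y_3}{|x - y|^3}\,dy_3 = 0.$$

Since $\nabla_h g(x_h) = -g(x_h)\,x_h/2$ and $|x_h \cdot (x_h - y_h)^\perp| \le |x_h|\,|x_h - y_h|^{1-\sigma}\,|y_h|^\sigma$ for any $\sigma \in [0, 1]$, we can bound

$$|(I_1, \nabla_h)g(x)| \le C g(x_h)|x_h| \int_{|x_3 - y_3| \ge 1} \frac{|y_h|^\sigma}{|x - y|^{2+\sigma}}\, |\omega_3(y_h, y_3) - \omega_3(y_h, x_3)|\,dy$$
$$+\; C g(x_h)|x_h| \int_{|x_3 - y_3| < 1} \frac{1}{|x - y|^2}\, |\omega_3(y_h, y_3) - \omega_3(y_h, x_3)|\,dy.$$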


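We now proceed as in the proof of Lemma 2.2. Integrating first with respect to the horizontal variable $y_h \in \mathbb{R}^2$ and applying Hölder's inequality, we obtain

$$|(I_1, \nabla_h)g(x)| \le C g(x_h)|x_h| \int_{|x_3 - y_3| \ge 1} \frac{1}{|x_3 - y_3|^{2+\sigma}}\, \bigl\||\cdot|^\sigma \{\omega_3(\cdot, y_3) - \omega_3(\cdot, x_3)\}\bigr\|_{L^1(\mathbb{R}^2)}\,dy_3$$
$$+\; C g(x_h)|x_h| \int_{|x_3 - y_3| < 1} \frac{1}{|x_3 - y_3|}\, \|\omega_3(\cdot, y_3) - \omega_3(\cdot, x_3)\|_{L^2(\mathbb{R}^2)}\,dy_3.$$

Assuming $0 < \sigma < m - 1$, we have the estimate $\||\cdot|^\sigma f\|_{L^1(\mathbb{R}^2)} \le C\, \|f\|_{L^2(m)}$ for any $f \in L^2(m)$, hence

$$\bigl\||\cdot|^\sigma \{\omega_3(\cdot, y_3) - \omega_3(\cdot, x_3)\}\bigr\|_{L^1(\mathbb{R}^2)} + \|\omega_3(\cdot, y_3) - \omega_3(\cdot, x_3)\|_{L^2(\mathbb{R}^2)} \le C |x_3 - y_3|\, \|\partial_{x_3}\omega_3\|_{X(m)}.$$

We conclude that

$$|(I_1, \nabla_h)g(x)| \le C g(x_h)|x_h| \Bigl( \int_{|x_3 - y_3| \ge 1} \frac{1}{|x_3 - y_3|^{1+\sigma}}\,dy_3 + \int_{|x_3 - y_3| < 1} dy_3 \Bigr) \|\partial_{x_3}\omega_3\|_{X(m)} \le C g(x_h)|x_h|\, \|\partial_{x_3}\omega_3\|_{X(m)},$$

hence $\|(I_1, \nabla_h)g\|_{X(m)} \le C\, \|\partial_{x_3}\omega_3\|_{X(m)}$. Finally we consider the term $(I_2, \nabla_h)g$. Using again Hölder's inequality, we obtain

$$|I_2(x)| \le C \int_{|x_3 - y_3| \ge 1} \frac{1}{|x - y|^2}\, |\omega_h(y_h, y_3) - \omega_h(y_h, x_3)|\,dy + C \int_{|x_3 - y_3| < 1} \frac{1}{|x - y|^2}\, |\omega_h(y_h, y_3) - \omega_h(y_h, x_3)|\,dy$$
$$\le C \int_{|x_3 - y_3| \ge 1} \frac{1}{|x_3 - y_3|^2}\, \|\omega_h(\cdot, y_3) - \omega_h(\cdot, x_3)\|_{L^1(\mathbb{R}^2)}\,dy_3 + C \int_{|x_3 - y_3| < 1} \frac{1}{|x_3 - y_3|}\, \|\omega_h(\cdot, y_3) - \omega_h(\cdot, x_3)\|_{L^2(\mathbb{R}^2)}\,dy_3.$$

Since $L^2(m) \hookrightarrow L^p(\mathbb{R}^2)$ for $p \in [1, 2]$, we have $\|\omega_h(\cdot, y_3) - \omega_h(\cdot, x_3)\|_{L^p(\mathbb{R}^2)^2} \le 2\, \|\omega_h\|_{X(m)^2}$ and $\|\omega_h(\cdot, y_3) - \omega_h(\cdot, x_3)\|_{L^p(\mathbb{R}^2)^2} \le |x_3 - y_3|\, \|\partial_{x_3}\omega_h\|_{X(m)^2}$. In particular, for any $\gamma \in (0, 1)$,

$$\|\omega_h(\cdot, y_3) - \omega_h(\cdot, x_3)\|_{L^p(\mathbb{R}^2)^2} \le 2^\gamma |x_3 - y_3|^{1-\gamma}\, \|\omega_h\|_{X(m)^2}^{\gamma}\, \|\partial_{x_3}\omega_h\|_{X(m)^2}^{1-\gamma}.$$

Thus we obtain

$$\|I_2\|_{L^\infty(\mathbb{R}^3)^2} \le C\, \|\omega_h\|_{X(m)^2}^{\gamma}\, \|\partial_{x_3}\omega_h\|_{X(m)^2}^{1-\gamma},$$

and conclude that $\|(I_2, \nabla_h)g\|_{X(m)} \le C\, \|\omega_h\|_{X(m)^2}^{\gamma}\, \|\partial_{x_3}\omega_h\|_{X(m)^2}^{1-\gamma}$. This proves (4.18).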
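The two one-dimensional identities used in the proof of Proposition 4.5 to split $I = I_1 + I_2$, namely $\int_{\mathbb{R}} |x - y|^{-3}\,dy_3 = 2/|x_h - y_h|^2$ and $\int_{\mathbb{R}} (x_3 - y_3)|x - y|^{-3}\,dy_3 = 0$, can be checked numerically. The Python sketch below is an illustration under stated assumptions: `rho` stands for $|x_h - y_h| > 0$, the substitution $z = \rho\tan\theta$ and the midpoint rule are choices made here for convenience, and the function name is hypothetical.

```python
import math

def line_integral(rho, odd=False, n=20000):
    """Midpoint-rule value of the integral over z in R of k(z) / (rho^2 + z^2)^(3/2).

    Here k(z) = 1 if odd is False and k(z) = z if odd is True. The variable
    z = x_3 - y_3 is mapped to theta in (-pi/2, pi/2) by z = rho * tan(theta).
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * h
        z = rho * math.tan(theta)
        jacobian = rho / math.cos(theta) ** 2
        numerator = z if odd else 1.0
        total += numerator / (rho * rho + z * z) ** 1.5 * jacobian * h
    return total
```

The first identity is what turns the three-dimensional Biot-Savart kernel into the two-dimensional one when $\omega_3$ does not depend on $x_3$, and the second one removes the contribution of $\omega_h$ in that case; this is why only differences such as $\omega_3(y_h, y_3) - \omega_3(y_h, x_3)$ appear in $I_1$ and $I_2$.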


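4.4. Large time estimates.

In this section we complete the proof of Proposition 4.1. Fix $m \in (1, \infty]$, $\alpha \in \mathbb{R}$, and assume that $\omega_0 \in \mathbb{X}^p(m)$ for some $p \in [1, 2]$. Let $\omega(t) = e^{t(L - \alpha\Lambda)}\omega_0$ be the solution of the linearized equation (1.22) given by Proposition 4.2. Take any $k \in \mathbb{N}$ such that $k > C_4(|\alpha|^2 + 1) + 1/2$, where $C_4$ is as in Proposition 4.3, and choose $t_0 > 0$ small enough so that estimate (4.3) holds for all $t \in (0, t_0]$ and all $\beta \in \mathbb{N}^3$ with $|\beta| \le k$. Our goal is to control the solution $\omega(t)$ for $t \ge t_0$ and to establish the decay estimates (4.1), (4.2). To this end, we first observe that $\omega(t)$ satisfies the integral equation

$$\omega(t) = R_\alpha(t - t_0)\,\omega(t_0) - \alpha \int_{t_0}^t R_\alpha(t - s)\, H\omega(s)\,ds, \quad t \ge t_0, \tag{4.19}$$

where $R_\alpha(t)$ is the semigroup defined by (4.14). Fix $\bar\eta \in (0, 1/2)$ such that $2\bar\eta < m - 1$. By Proposition 4.4, we have

$$\|\omega(t)\|_{\mathbb{X}(m)} \le C_5\, e^{-\bar\eta(t - t_0)}\, \|\omega(t_0)\|_{\mathbb{X}(m)} + C_5 |\alpha| \int_{t_0}^t e^{-\bar\eta(t - s)}\, \|H\omega(s)\|_{\mathbb{X}(m)}\,ds. \tag{4.20}$$

To estimate the term $\|H\omega(s)\|_{\mathbb{X}(m)}$, we first apply Proposition 4.5 with $\gamma = 1/2$, and then the classical interpolation inequality

$$\|\partial_{x_3}\omega\|_{\mathbb{X}(m)} \le C\, \|\omega\|_{\mathbb{X}(m)}^{1 - 1/k}\, \|\partial_{x_3}^k\omega\|_{\mathbb{X}(m)}^{1/k}.$$

Using in addition Young's inequality, we conclude that, given any $\epsilon > 0$, there exists $C_\epsilon > 0$ such that

$$C_5 |\alpha|\, \|H\omega(s)\|_{\mathbb{X}(m)} \le \epsilon\, \|\omega(s)\|_{\mathbb{X}(m)} + C_\epsilon\, \|\partial_{x_3}^k\omega(s)\|_{\mathbb{X}(m)}. \tag{4.21}$$

On the other hand, since $k > C_4(|\alpha|^2 + 1) + 1/2$, it follows from (4.11) that

$$\|\partial_{x_3}^k\omega(s)\|_{\mathbb{X}(m)} \le C_3\, e^{-(s - t_0)/2}\, \|\partial_{x_3}^k\omega(t_0)\|_{\mathbb{X}(m)}, \quad s \ge t_0. \tag{4.22}$$

Replacing (4.21) and (4.22) into (4.20), we easily obtain

$$\|\omega(t)\|_{\mathbb{X}(m)} \le \bigl( C_5\, \|\omega(t_0)\|_{\mathbb{X}(m)} + C\, \|\partial_{x_3}^k\omega(t_0)\|_{\mathbb{X}(m)} \bigr)\, e^{-\bar\eta(t - t_0)} + \epsilon \int_{t_0}^t e^{-\bar\eta(t - s)}\, \|\omega(s)\|_{\mathbb{X}(m)}\,ds,$$

for some $C > 0$. Applying now Gronwall's Lemma, and using (4.3) to bound $\|\omega(t_0)\|_{\mathbb{X}(m)}$ and $\|\partial_{x_3}^k\omega(t_0)\|_{\mathbb{X}(m)}$ in terms of $\omega_0$, we see that $\|\omega(t)\|_{\mathbb{X}(m)} \le C e^{-\eta t}\, \|\omega_0\|_{\mathbb{X}^p(m)}$ for $t \ge t_0$, where $\eta = \bar\eta - \epsilon$. Finally, using (4.3) again to control the solution for $t < t_0$, we conclude that there exists $C_7 > 0$ such that

$$\|\omega(t)\|_{\mathbb{X}(m)} \equiv \|e^{t(L - \alpha\Lambda)}\omega_0\|_{\mathbb{X}(m)} \le \frac{C_7\, e^{-\eta t}}{a(t)^{\frac1p - \frac12}}\, \|\omega_0\|_{\mathbb{X}^p(m)}, \tag{4.23}$$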

for all t > 0. Since > 0 was arbitrary, estimate (4.23) holds for any η ∈ (0, 1/2) such that 2η < m − 1. To conclude the proof, it remains to find the optimal decay rates for ωh (t) X(m) , ω3 (t) X(m) (including the value η = 1/2 if m > 2), and to establish (4.1), (4.2) for

Three-Dimensional Stability of Burgers Vortices

501

β = 0 too. In view of Proposition 4.2, we can assume here without loss of generality that ∂x3 ω0 ∈ X(m) ∩ X p (m). First, combining (1.23), (4.23), we immediately obtain ∂x3 ω(t) X(m) ≡ ∂x3 et (L−α ) ω0 X(m) ≤ C e−(η+1)t ∂x3 ω0 X(m) ,

(4.24)

for all t ≥ 0. Moreover, if m > 2, we know from Proposition 4.4 that (4.20) holds with η¯ = 1/2. Thus, applying Proposition 4.5 with γ ≤ 1/2 to estimate H ω(s) X(m) and using (4.23), (4.24), we find that ω(t) X(m) decays like e−t/2 as t → ∞, hence (4.23) holds with η = 1/2 if m > 2. Next, to obtain a faster decay estimate for the horizontal component ωh , we use (4.15) and (4.17). Instead of (4.20), we find ωh (t) X (m)2 ≤ C e−μ(t−t0 ) ωh (t0 ) X (m)2 t + C|α| e−μ(t−s) ∂x3 ω(s) X(m) ds,

(4.25)

t0

for any μ ∈ (1, 23 ). Invoking (4.24), we conclude that ωh (t) X (m)2 decays like e−μt as t → ∞, provided μ < 1 + η. In other words, if μ ∈ (1, 23 ) satisfies 2μ < m + 1, we have ωh (t) X (m)2 ≡ (et (L−α ) ω0 )h X (m)2 ≤ C e−μt ( (ω0 )h X (m)2 + ∂x3 ω0 X(m) ), (4.26) for all t ≥ 0. Using the arguments leading to (4.25) and proceeding as in Proposition 4.2, we can also derive the following short time estimate, which complements (4.3): ∂xβ ωh (t) X (m)2 ≤

C1 a(t)

1 1 |β| p−2+ 2

(ω0 )h X p (m)2 + ∂x3 ω0 X p (m) , 0 < t ≤

C2 . |α|2 +1 (4.27)

β

Finally, to obtain decay estimates for the derivative ∂x ω(t), where β ∈ N3 , we can restrict ourselves to t ≥ 2t1 , where t1 > 0 is small enough so that the short time estimates (4.3), (4.27) hold for 0 < t ≤ 2t1 . In view of (1.23), we have the identity ∂xβ et (L−α ) ω0 = e−β3 (t−t1 ) ∂xβhh et1 (L−α ) e(t−2t1 )(L−α ) ∂xβ33 et1 (L−α ) ω0 . Using the short time estimates (4.3), (4.27) with p = 2 to bound the first operator β ∂xhh et1 (L−α ) , then the long-time estimates (4.23), (4.24) or (4.26) to treat the middle β term e(t−2t1 )(L−α ) , and finally (4.3) again to bound the last term ∂x33 et1 (L−α ) ω0 , we easily obtain (4.1) and (4.2), together with the following estimate: ∂xβ (et (L−α ) ω0 )h X (m)2 ≤

C e−(μ+β3 )t (ω0 )h X p (m)2 + ∂x3 ω0 X p (m) , t > 0, 1 1 |β| − + a(t) p 2 2 (4.28)

which will also be used in the next section. This concludes the proof of Proposition 4.1.
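The Gronwall step in the proof above can be illustrated on the extremal case of the integral inequality obtained from (4.20), (4.21), (4.22). Taking $t_0 = 0$ for simplicity, the equation $y(t) = A e^{-\bar\eta t} + \epsilon \int_0^t e^{-\bar\eta(t-s)} y(s)\,ds$ has the explicit solution $y(t) = A e^{-(\bar\eta-\epsilon)t}$, which is why the decay rate drops from $\bar\eta$ to $\eta = \bar\eta - \epsilon$. The following Python sketch compares a trapezoidal discretization of this scalar Volterra equation with the closed form; it is only a toy model under these assumptions, and the function names and parameter values are hypothetical.

```python
import math


def solve_volterra(amplitude, eta_bar, eps, t_final, n_steps):
    """Trapezoidal solution of
    y(t) = A*exp(-eta_bar*t) + eps * int_0^t exp(-eta_bar*(t-s)) y(s) ds."""
    h = t_final / n_steps
    y = [amplitude]
    for i in range(1, n_steps + 1):
        t = i * h
        # trapezoid weights: 1/2 at s = 0, 1 at interior nodes, 1/2 at s = t
        acc = 0.5 * math.exp(-eta_bar * t) * y[0]
        for j in range(1, i):
            acc += math.exp(-eta_bar * (t - j * h)) * y[j]
        forcing = amplitude * math.exp(-eta_bar * t)
        # the endpoint s = t contributes 0.5*h*eps*y_i, so solve for y_i
        y.append((forcing + eps * h * acc) / (1.0 - 0.5 * eps * h))
    return y


def gronwall_rate_solution(amplitude, eta_bar, eps, t):
    """Closed-form solution A*exp(-(eta_bar - eps)*t) of the same equation."""
    return amplitude * math.exp(-(eta_bar - eps) * t)


if __name__ == "__main__":
    numeric = solve_volterra(1.0, 0.4, 0.1, 5.0, 400)[-1]
    exact = gronwall_rate_solution(1.0, 0.4, 0.1, 5.0)
    print(numeric, exact)
```

In the actual proof the inequality, not the equality, is what holds, and Gronwall's Lemma shows that solutions of the inequality are dominated by this extremal solution.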


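5. Nonlinear Stability

In this section we consider the nonlinear stability of the Burgers vortex and prove Theorems 1.2 and 1.3. Our starting point is the perturbation equation (1.10), which is equivalent to the integral equation

$$\omega(t) = e^{t(L-\alpha\Lambda)}\omega_0 + \sum_{j=1}^{2} \int_0^t e^{(t-s)(L-\alpha\Lambda)}\, N_j(\omega(s),\omega(s))\, ds , \qquad t \ge 0 , \tag{5.1}$$

where $N_1(v,w) = -(K_{3D} * v, \nabla)w$, $N_2(v,w) = (v,\nabla) K_{3D} * w$, and $K_{3D}$ is the Biot-Savart kernel (1.6). We first establish the following result, which already implies Theorem 1.2.

Proposition 5.1. Fix $m \in (1,\infty]$, $\alpha \in \mathbb{R}$, and take $\eta \in (0,\tfrac12]$ such that $2\eta < m-1$. Then there exist $\delta = \delta(\alpha,m,\eta) > 0$ and $C = C(\alpha,m,\eta) > 0$ such that, for any $\omega_0 \in \mathbb{X}(m)$ with $\nabla\cdot\omega_0 = 0$ and $\|\omega_0\|_{\mathbb{X}(m)} \le \delta$, Eq. (5.1) has a unique solution $\omega \in L^\infty(\mathbb{R}_+;\mathbb{X}(m)) \cap C([0,\infty);\mathbb{X}_{\mathrm{loc}}(m))$, which satisfies

$$\|\partial_x^{\beta}\omega(t)\|_{\mathbb{X}(m)} \le \frac{C\, \|\omega_0\|_{\mathbb{X}(m)}}{a(t)^{\frac{|\beta|}{2}}}\, e^{-\eta t} , \qquad t > 0 , \tag{5.2}$$

for any multi-index $\beta \in \mathbb{N}^3$ of length $|\beta| \le 1$.

Proof. Let $U$ be the Banach space of all $\omega \in L^\infty(\mathbb{R}_+;\mathbb{X}(m)) \cap C([0,\infty);\mathbb{X}_{\mathrm{loc}}(m))$ such that $\nabla\cdot\omega(t) = 0$ for all $t > 0$ and $\|\omega\|_U < \infty$, where

$$\|\omega\|_U = \sum_{|\beta|\le 1}\, \sup_{t>0}\, a(t)^{\frac{|\beta|}{2}}\, e^{\eta t}\, \|\partial_x^{\beta}\omega(t)\|_{\mathbb{X}(m)} .$$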

Given ω0 ∈ X(m) such that ∇ · ω0 = 0, we denote by : U → U the nonlinear map defined by (ω)(t) = et (L−α ) ω0 +

2

j (ω, ω)(t), t > 0,

(5.3)

j=1

where 1 , 2 are the following bilinear operators: t ˜ = e(t−s)(L−α ) N j (ω(s), ω(s)) ˜ ds, j (ω, ω)(t)

j = 1, 2.

(5.4)

0

If ω0 X(m) is sufficiently small, we shall show that the map is a strict contraction in the ball B K = {ω ∈ U | ω U ≤ K } for some suitable K > 0. It will follow that has a unique fixed point ω in B K , which by construction is the desired solution of (5.1). Since ω0 ∈ X(m) and ∇·ω0 = 0, Proposition 4.1 shows that the map t → et (L−α ) ω0 belongs to U, and satisfies the estimate et (L−α ) ω0 U ≤ C1 ω0 X(m) , for some C1 > 0 (depending on m, α, η). On the other hand, if v, w ∈ X(m) and ∇w ∈ X(m), Corollary 2.4 implies that N1 (v, w) and N2 (v, w) belong to X p (m)3 for any p ∈ (1, 2), and satisfy the bound N1 (v, w) X p (m)3 + N2 (v, w) X p (m)3 ≤ C v X(m) ∇w X(m) ,


for some $C > 0$ (depending on $m$ and $p$). If in addition $\nabla\cdot v = 0$, then denoting $u = K_{3D} * v$ we find

$$\int_{\mathbb{R}^2} \bigl( N_1(v,v) + N_2(v,v) \bigr)_3\, dx_h = \int_{\mathbb{R}^2} \nabla_h \cdot (v_h u_3 - u_h v_3)\, dx_h = 0 , \tag{5.5}$$

for all x3 ∈ R, hence N1 (v, v) + N2 (v, v) ∈ X p (m). As a consequence, if ω, ω˜ ∈ U, we have N j (ω(t), ω(t)) ˜ ∈ X p (m)3 for j = 1, 2 and all t > 0, and using Proposition 4.1 again we obtain the following estimate for the bilinear operators j : 2 β ∂x j (ω, ω)(t) ˜ j=1

≤

2

t

j=1 0

X(m)

≤C

∂xβ e(t−s)(L−α ) N j (ω(s), ω(s)) ˜ X(m) ds

2 j=1 0

≤C

t

0

≤C

t

0

≤

e−η(t−s)

t

1

a(t−s) p

1

N j (ω(s), ω(s)) ˜ X p (m)3 ds

e−η(t−s)

ω(s) X(m) ∇ ω(s) ˜ X(m) ds 1 1 |β| − + a(t−s) p 2 2 e−η(t−s) e−2ηs ds ω U ω ˜ U 1 1 |β| 1 − + a(t−s) p 2 2 a(s) 2

Ce−ηt a(t) p

− 21 + |β| 2

+ |β| 2 −1

ω U ω ˜ U.

Since we also know that N1 (ω(t), ω(t)) + N2 (ω(t), ω(t)) belongs to X p (m) for all t > 0 and is divergence-free, we have shown that maps U into U, and that there exists C2 > 0 (depending on |α|, m, and η) such that (ω) U ≤ C1 ω0 X(m) +C2 ω 2U , (ω)−(ω) ˜ U ≤ C2 ( ω U + ω ˜ U ) ω− ω ˜ U, for all ω, ω˜ ∈ U. We now take K > 0 such that 2C2 K < 1, and assume that ω0 X(m) ≤ K /(2C1 ). Then the estimates above show that is a strict contraction in the ball B K , hence has a unique fixed point ω ∈ B K which, of course, satisfies (5.1). Moreover ω U ≤ 2C1 ω0 X(m) , hence (5.2) holds with C = 2C1 . This concludes the proof. Remark. The size δ of the local basin of attraction of the Burgers vortex αG in X(m) depends a priori on α, m, and η. However, as announced in Theorem 1.3, the dependence on the decay rate η can easily be removed by the following (standard) argument. Given m > 1, we first choose η = η(m) ¯ = min( 21 , m−1 4 ) and apply Proposition 5.1 with that value of η. We thus obtain a constant δ¯ > 0 depending only on α and m such ¯ Eq. (5.1) has a unique that, for any ω0 ∈ X(m) with ∇ · ω0 = 0 and ω0 X(m) ≤ δ, solution ω ∈ L ∞ (R+ ; X(m)) ∩ C([0, ∞); Xloc (m)), which converges exponentially to zero as t → ∞. In particular, given any η ∈ (0, 21 ] such that 2η < m − 1, there exists T = T (η) > 0 such that ω(t) X(m) ≤ δ for all t ≥ T , where δ = δ(α, m, η) is the constant given by Proposition 5.1. By uniqueness of the solution, we conclude that ω satisfies (5.2) for any admissible value of η.
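The fixed-point argument in the proof of Proposition 5.1 uses only the two bounds stated above: the linear term has size at most $C_1\|\omega_0\|$ and the quadratic term is controlled by $C_2\|\omega\|^2$. A scalar caricature makes the bookkeeping explicit. The Python sketch below iterates the map $x \mapsto a + c_2 x^2$, where $a$ stands for $C_1\|\omega_0\|_{\mathbb{X}(m)}$; it is a toy model with hypothetical names and values, not the infinite-dimensional map of the paper. With $2c_2K < 1$ and $a \le K/2$ the iteration stays in $[0,K]$, converges, and its limit is at most $2a$, mirroring the conclusion $\|\omega\|_U \le 2C_1\|\omega_0\|_{\mathbb{X}(m)}$.

```python
def picard_iterate(a: float, c2: float, steps: int = 200) -> float:
    """Iterate x -> a + c2 * x**2 starting from x = 0 (scalar model of the contraction)."""
    x = 0.0
    for _ in range(steps):
        x = a + c2 * x * x
    return x


def smallness_conditions_hold(a: float, c2: float, radius: float) -> bool:
    """Check the scalar analogues of 2*C2*K < 1 and C1*||omega_0|| <= K/2."""
    return 2.0 * c2 * radius < 1.0 and a <= radius / 2.0


if __name__ == "__main__":
    a, c2, radius = 0.2, 1.0, 0.4        # illustrative values only
    assert smallness_conditions_hold(a, c2, radius)
    fixed_point = picard_iterate(a, c2)
    print(fixed_point, fixed_point <= 2.0 * a)
```

For these values the limit is the smaller root of $c_2x^2 - x + a = 0$, namely $(1-\sqrt{1-4ac_2})/(2c_2)$.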


In view of Proposition 5.1 and the remark that follows, the proof of Theorem 1.3 will be complete once we have established the improved decay estimate (1.20) for the horizontal component ωh . A convenient way to do so is to repeat the proof of Proposition 5.1 using a different function space, which incorporates a faster decay rate as t → ∞. Given μ ∈ (1, 1 + η), where η ∈ (0, 21 ] is as in Proposition 5.1, we introduce the space V ⊂ U defined by the norm k k (μ+kη)t k β (η+k)t k β 2 2 sup a(t) e ω V = ∂x3 ∂x ωh (t) X (m)2 + sup a(t) e ∂x3 ∂x ω3 (t) X (m) . k=0,1 |β|≤1

t>0

t>0

β

In view of (5.2), we can assume here (without loss of generality) that ∂x ω0 X(m) is finite and arbitrarily small, for all β ∈ N3 with |β| ≤ 1. Using Proposition 4.1, we thus obtain et (L−α ) ω0 V ≤ C3

∂xβ ω0 X(m) ,

|β|≤1

for some C3 > 0. On the other hand, if v, w ∈ X(m), the following estimates hold for any p ∈ (1, 2): N1,h (v, w) X p (m)2 ≤ C v X(m) ∇wh X (m)2 , N2 (v, w) X p (m)3 ≤ C( vh X (m)2 ∇h w X(m) + C v3 X (m) ∂x3 w X(m) ), ∂x3 N j (v, w) X p (m)3 ≤ C( ∂x3 v X(m) ∇w X(m) + v X(m) ∂x3 ∇w X(m) ). We now estimate the bilinear operators j (ω, ω) ˜ for ω, ω˜ ∈ V. First, using (4.28), we find for t ≥ 1: β

∂x 1,h (ω, ω)(t) ˜ X (m)2 ≤

t 0

≤C

β

∂x {e(t−s)(L−α ) N1 (ω(s), ω(s))} ˜ h X (m)2 ds

t

e−μ(t−s) 1

1

0 a(t−s) p − 2 +

|β| 2

( N1,h (ω(s), ω(s)) ˜ X p (m)2

+ ∂x3 N1 (ω(s), ω(s)) ˜ X p (m)3 ) ds t e−μ(t−s) ≤C ( ω(s) X(m) ∇ ω˜ h (s) (X (m))2 1 1 |β| 0 a(t−s) p − 2 + 2 + ∂x3 ω(s) X(m) ∇ ω(s) ˜ ˜ X(m) + ω(s) X(m) ∂x3 ∇ ω(s) X(m) ) ds t −μ(t−s) −(μ+η)s e e ds ω V ω ˜ V ≤C 1 1 |β| − 0 a(t−s) p 2 + 2 a(s) 21 ≤ Ce−μt ω V ω ˜ V.

(5.6)

In the last inequality, we have used the definition of the norm in V and the fact that μ + η < 1 + 2η. The bound (5.6) also holds for t < 1, and can easily be established using (4.1) instead of (4.28).


Next, to bound ∂x3 1,h (ω, ω), ˜ we recall that ∂x3 et (L−α ) = e−t et (L−α ) ∂x3 . Applying (4.1), we find ˜ ∂x3 ∂xβ 1,h (ω, ω)(t) X (m)2 t ≤ e−(t−s) ∂xβ {e(t−s)(L−α ) ∂x3 N1 (ω(s), ω(s))} ˜ h X (m)2 ds 0

t

≤C

0 t

≤C 0

≤

e−(μ+1)(t−s)

∂x3 N1 (ω(s), ω(s)) ˜ X p (m)3 ds 1 1 |β| − + a(t−s) p 2 2 e−(μ+1)(t−s) e−(μ+η)s ds ω V ω ˜ V 1 1 |β| 1 − + a(t−s) p 2 2 a(s) 2

Ce−(μ+η)t 1

a(t) p

+ |β| 2 −1

ω V ω ˜ V.

˜ as follows: Similarly, for k = 0, 1, we can estimate ∂xk3 2,h (ω, ω) ∂xk3 ∂xβ 2,h (ω, ω)(t) ˜ X (m)2 t ≤ e−k(t−s) ∂xβ {e(t−s)(L−α ) ∂xk3 N2 (ω(s), ω(s))} ˜ h X (m)2 ds 0

t

≤C

1 1 |β| p−2+ 2

0 t

≤C

e−(μ+k)(t−s)

a(t−s) e−(μ+k)(t−s) e−(μ+η)s 1

0

∂xk3 N2 (ω(s), ω(s)) ˜ X p (m)3 ds

a(t−s) p

− 21 + |β| 2

k

ds ω V ω ˜ V≤

a(s) 2

Ce−(μ+kη)t 1

a(t) p

k 3 + |β| 2 +2−2

ω V ω ˜ V.

Finally, using (4.2), we obtain for the vertical components of j (ω, ω): ˜ ∂xk3 ∂xβ j,3 (ω, ω)(t) ˜ X (m) t ≤ e−k(t−s) ∂xβ {e(t−s)(L−α ) ∂xk3 N j (ω(s), ω(s))} ˜ 3 X (m) ds 0

t

≤C

1 1 |β| p−2+ 2

0 t

≤C 0

e−(η+k)(t−s)

∂xk3 N j (ω(s), ω(s)) ˜ X p (m)3 ds

a(t−s) e−(η+k)(t−s) e−(k+2η)s 1

a(t−s) p

− 21 + |β| 2

k

ds ω V ω ˜ V≤

a(s) 2

Ce−(η+k)t 1

a(t) p

k 3 + |β| 2 +2−2

ω V ω ˜ V.

Summarizing, we have shown that defined by (5.3) maps V into V and satisfies the following bounds: ∂xβ ω0 X(m) + C4 ω 2V , (ω) V ≤ C3 |β|≤1

˜ V ) ω − ω ˜ V, (ω) − (ω) ˜ V ≤ C4 ( ω V + ω β for all ω, ω˜ ∈ V. If K = 2C3 |β|≤1 ∂x ω0 X(m) is sufficiently small, it follows that is a strict contraction in the ball B˜ K = {ω ∈ V | ω V ≤ K }, hence has a unique fixed


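point there. Denoting by $\omega(t)$ the solution of (5.1) given by Proposition 5.1, this implies that $t \mapsto \omega(t+T)$ belongs to $\tilde B_K$ if $T > 0$ is sufficiently large. In particular, $\omega(t)$ satisfies (1.20) for some suitable $C > 0$. The proof of Theorem 1.3 is now complete.

6. Appendix

6.1. Proof of Lemma 1.1. Let $\chi \in C_0^\infty(\mathbb{R}^2)$ be a cut-off function such that $\chi(x_h) = 1$ if $|x_h| \le 1$ and $\chi(x_h) = 0$ if $|x_h| \ge 2$. Given $R > 0$, we denote $\chi_R(x_h) = \chi(x_h/R)$, so that $|\nabla_h \chi_R(x_h)| \le C/R$. For any $x_3 \in \mathbb{R}$, we define

$$f(x_3) = \int_{\mathbb{R}^2} \tilde\omega_3(x_h,x_3)\, dx_h , \qquad f_R(x_3) = \int_{\mathbb{R}^2} \tilde\omega_3(x_h,x_3)\, \chi_R(x_h)\, dx_h .$$

Since $\tilde\omega_3 \in X(m)$ for some $m > 1$, it is easy to verify that $\|f - f_R\|_{L^\infty(\mathbb{R})} \to 0$ as $R \to \infty$. On the other hand, for any test function $\psi \in C_0^\infty(\mathbb{R})$, we have

$$\Bigl| \int_{\mathbb{R}} f(x_3)\, \frac{d\psi}{dx_3}(x_3)\, dx_3 \Bigr| \le \Bigl| \int_{\mathbb{R}} f_R(x_3)\, \frac{d\psi}{dx_3}(x_3)\, dx_3 \Bigr| + \|f - f_R\|_{L^\infty(\mathbb{R})}\, \Bigl\| \frac{d\psi}{dx_3} \Bigr\|_{L^1(\mathbb{R})} . \tag{6.1}$$

The last term in the right-hand side converges to zero as $R \to \infty$. To treat the other term, we observe that

$$\int_{\mathbb{R}} f_R(x_3)\, \frac{d\psi}{dx_3}(x_3)\, dx_3 = \int_{\mathbb{R}^3} \tilde\omega_3(x_h,x_3)\, \chi_R(x_h)\, \frac{d\psi}{dx_3}(x_3)\, dx_h\, dx_3 = \Bigl\langle \tilde\omega_3 , \frac{\partial\phi_R}{\partial x_3} \Bigr\rangle ,$$

where $\phi_R(x_h,x_3) = \chi_R(x_h)\psi(x_3)$ and $\langle\cdot,\cdot\rangle$ denotes the duality pairing of $\mathcal{D}'(\mathbb{R}^3)$ and $C_0^\infty(\mathbb{R}^3)$. Now, since $\nabla\cdot\tilde\omega = 0$ in the sense of distributions, we have

$$\Bigl\langle \tilde\omega_3 , \frac{\partial\phi_R}{\partial x_3} \Bigr\rangle = -\Bigl\langle \frac{\partial\tilde\omega_3}{\partial x_3} , \phi_R \Bigr\rangle = \langle \nabla_h\cdot\tilde\omega_h , \phi_R \rangle = -\langle \tilde\omega_h , \nabla_h\phi_R \rangle ,$$

so that

$$\int_{\mathbb{R}} f_R(x_3)\, \frac{d\psi}{dx_3}(x_3)\, dx_3 = -\int_{\mathbb{R}^3} \tilde\omega_h(x_h,x_3)\cdot\nabla_h\chi_R(x_h)\, \psi(x_3)\, dx_h\, dx_3 .$$

Using the inclusion $L^2(m) \hookrightarrow L^1(\mathbb{R}^2)$ and the definition (1.15) of the space $X(m)$, we thus find

$$\Bigl| \int_{\mathbb{R}} f_R(x_3)\, \frac{d\psi}{dx_3}(x_3)\, dx_3 \Bigr| \le \frac{C}{R}\, \|\tilde\omega_h\|_{X(m)^2}\, \|\psi\|_{L^1(\mathbb{R})} \xrightarrow[R\to\infty]{} 0 .$$

Returning to (6.1), we conclude that the left-hand side vanishes for all $\psi \in C_0^\infty(\mathbb{R})$, hence $\frac{df}{dx_3} = 0$ in the sense of distributions. Since $f \in BC(\mathbb{R})$, it follows that $f$ is identically constant, which is the desired result.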


Remark. If $\omega(x,t)$ is any solution of (1.10) which is integrable with respect to the horizontal variables, we can define

$$\phi(x_3,t) = \int_{\mathbb{R}^2} \omega_3(x_h,x_3,t)\, dx_h , \qquad x_3 \in \mathbb{R} , \quad t \ge 0 .$$

As was observed in [9], this quantity satisfies a remarkably simple equation

$$\partial_t \phi(x_3,t) + x_3\, \partial_{x_3}\phi(x_3,t) = \partial_{x_3}^2 \phi(x_3,t) , \tag{6.2}$$

which can be solved explicitly. However, if $\omega(\cdot,t) \in X(m)^3$ for some $m > 1$ and if $\nabla\cdot\omega(\cdot,t) = 0$, Lemma 1.1 shows that $\phi(x_3,t)$ does not depend on $x_3$, and (6.2) then implies that $\phi(x_3,t)$ is also independent of $t$. Thus, as was already mentioned, we can restrict ourselves to the particular case where $\phi \equiv 0$ without loss of generality. Being unaware of this simple observation, the authors of [9] have stated their stability result in a seemingly more general form, allowing (apparently) for nontrivial functions $\phi(x_3,t)$, but thanks to Lemma 1.1 (which also holds in the slightly different functional setting of [9]) the simpler presentation adopted here in Theorem 1.2 is exactly as general.

6.2. Proof of Proposition 3.4. This final section is devoted to the proof of Proposition 3.4, which shows that eigenfunctions of $L_{\alpha,h}$ corresponding to eigenvalues outside the essential spectrum have a Gaussian decay at infinity. For the nonlocal operator $L_{\alpha,3}$, the same result was established in [8, Lemma 4.5] using ODE techniques, but we prefer using here a more flexible method based on weighted $L^2$ estimates. In fact, we shall consider a more general elliptic problem of the form

$$-Lf + F(x,f,\nabla f) + \lambda f = h , \qquad x \in \mathbb{R}^n , \tag{6.3}$$

where the unknown is the vector-valued function $f = (f_1,\dots,f_N)$. Here and below we denote by $L = \Delta + \frac{x}{2}\cdot\nabla + \frac{n}{2}$ the analog of operator (2.2) in dimension $n$. The data of the problem are the functions $F : \mathbb{R}^n \times \mathbb{C}^N \times \mathbb{C}^{nN} \to \mathbb{C}^N$ and $h : \mathbb{R}^n \to \mathbb{C}^N$, and the complex number $\lambda$. For $m \in [0,\infty]$, we denote by $L^2(m)$, $H^1(m)$ the following complex Hilbert spaces on $\mathbb{R}^n$:

$$L^2(m) = \Bigl\{ f \in L^2(\mathbb{R}^n,\mathbb{C}) \;\Big|\; \int_{\mathbb{R}^n} |f(x)|^2\, \rho_m(|x|^2)\, dx < \infty \Bigr\} ,$$

$$H^1(m) = \bigl\{ f \in L^2(m) \;\big|\; \partial_{x_j} f \in L^2(m) \text{ for } j = 1,\dots,n \bigr\} ,$$

where $\rho_m$ is the weight function defined by (1.12). Our main result is:

Proposition 6.1. Let $m \in [0,\infty)$, $\lambda \in \mathbb{C}$, $h \in L^2(\infty)^N$, and assume that $F$ is a continuous function satisfying

$$|F(x,p,Q)| \le A(x)|p| + B(x)|Q| , \qquad \text{for all } (x,p,Q) \in \mathbb{R}^n \times \mathbb{C}^N \times \mathbb{C}^{nN} , \tag{6.4}$$

where $A$ and $B$ are bounded, nonnegative functions such that

$$\lim_{R\to\infty}\, \sup_{|x|\ge R} A(x) = \lim_{R\to\infty}\, \sup_{|x|\ge R} B(x) = 0 . \tag{6.5}$$

If $\mathrm{Re}\,\lambda > \frac{n}{4} - \frac{m}{2}$, then any solution $f \in H^1(m)^N$ of (6.3) satisfies $f \in H^1(\infty)^N$.


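Proof. The proof is a simple modification of [15, Prop. 12], which in turn is inspired by a recent work of Fukuizumi and Ozawa [5] where decay estimates are obtained for solutions of the Haraux-Weissler equation. For $k \ge 1$, $\epsilon > 0$, and $\theta \in [0,m]$, we define the weight functions

$$\xi_{k,\epsilon}(x) = \exp\Bigl( \frac{(1-\epsilon)k|x|^2}{4k+|x|^2} \Bigr) , \qquad \zeta_\theta(x) = (1+|x|^2)^{\theta} , \qquad x \in \mathbb{R}^n . \tag{6.6}$$

Multiplying both sides of (6.3) by $\zeta_\theta \xi_{k,\epsilon} \bar f$ and integrating by parts the real part of the resulting expression, we obtain the identity

$$\int_{\mathbb{R}^n} \zeta_\theta \xi_{k,\epsilon} |\nabla f|^2\, dx + \mathrm{Re} \int_{\mathbb{R}^n} \bar f \cdot (\nabla(\zeta_\theta \xi_{k,\epsilon}),\nabla) f\, dx + \int_{\mathbb{R}^n} |f|^2\, \frac{x}{4}\cdot\nabla(\zeta_\theta \xi_{k,\epsilon})\, dx$$

$$= -\mathrm{Re} \int_{\mathbb{R}^n} \zeta_\theta \xi_{k,\epsilon}\, \bar f \cdot F(x,f(x),\nabla f(x))\, dx + \Bigl( \frac{n}{4} - \mathrm{Re}\,\lambda \Bigr) \int_{\mathbb{R}^n} \zeta_\theta \xi_{k,\epsilon} |f|^2\, dx + \mathrm{Re} \int_{\mathbb{R}^n} \zeta_\theta \xi_{k,\epsilon}\, \bar f \cdot h\, dx . \tag{6.7}$$

Clearly,

$$\nabla \xi_{k,\epsilon}(x) = \frac{8(1-\epsilon)k^2 x}{(4k+|x|^2)^2}\, \xi_{k,\epsilon}(x) , \qquad \nabla \zeta_\theta(x) = \frac{2\theta x}{1+|x|^2}\, \zeta_\theta(x) . \tag{6.8}$$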

Thus, the second term in the left-hand side of (6.7) can be written in the following way: f¯ · (ξk, ∇ζθ , ∇) f dx + Re f¯ · (ζθ ∇ξk, , ∇) f dx Re Rn Rn θ xζθ ξk, dx + Re | f |2 ∇ · f¯ · (ζθ ∇ξk, , ∇) f dx =− 1 + |x|2 Rn Rn θ ζθ θ ζθ =− | f |2 ξk, x · ∇ dx − | f |2 x · ∇ξk, dx 2 1 + |x| 1 + |x|2 Rn Rn ζθ ξk, 8(1 − )k 2 ζθ ξk, ¯ 2 − nθ | f | dx + Re f · (x, ∇) f dx. 2 (4k + |x|2 )2 Rn 1 + |x| Rn To bound this quantity from below, we observe that ζθ ξk, θ ζθ 2 | f |2 ξk, x · ∇ dx ≤ 2θ | f |2 dx. 2 2 1 + |x| Rn Rn 1 + |x| Moreover, for each η1 > 0, 8(1 − )k 2 ζθ ξk, ¯ 2(1 − )kζθ ξk, |x f ||∇ f | dx f · (x, ∇) f dx ≤ 2 )2 n n (4k + |x| 4k + |x|2 R R ≤ (1 − η1 ) ζθ ξk, |∇ f |2 dx

− Re

Rn

k 2 ζθ ξk, |x f |2 (1 − )2 + dx. 1 − η1 Rn (4k + |x|2 )2


Thus, using the expression (6.8) of ∇ξk, , we find Re f¯ · (∇(ζθ ξk, ), ∇) f dx Rn

ζθ ξk, 8(1 − )θ k 2 ζθ ξk, |x f |2 2 | f | dx − dx 2 2 2 2 Rn 1 + |x| Rn (4k + |x| ) (1 + |x| ) k 2 ζθ ξk, |x f |2 (1 − )2 ζθ ξk, |∇ f |2 d x − dx, − (1 − η1 ) 1 − η1 Rn (4k + |x|2 )2 Rn

≥ −C

(6.9)

where C = nθ + θ 2 does not depend on k and . We next consider the third term in the left-hand side of (6.7), which satisfies x ζθ ξk, θ | f |2 · ∇(ζθ ξk, ) dx = |x f |2 dx 2 n n 4 2 1 + |x| R R k 2 ζθ ξk, |x f |2 + 2(1 − ) dx. (6.10) 2 2 Rn (4k + |x| ) To estimate the right-hand side of (6.7), we use (6.4) and obtain, for each η2 > 0, − Re ζθ ξk, f¯ · F(x, f (x), ∇ f (x)) dx n R 2 ≤ ζθ ξk, A| f | dx + ζθ ξk, B| f ||∇ f | dx Rn Rn B2 | f |2 dx + η2 ≤ ζθ ξk, A + ζθ ξk, |∇ f |2 dx. (6.11) 4η2 Rn Rn Finally, for each η3 > 0, we have ¯ Re ζθ ξk, f · h dx ≤ η3 Rn

1 ζθ ξk, | f | dx + ζθ ξk, |h|2 dx. 4η3 Rn Rn 2

(6.12)

Substituting (6.9)– (6.12) into (6.7), we arrive at our basic inequality: (1 − )k 2 ζθ ξk, |x f |2 (η1 − η2 ) ζθ ξk, |∇ f |2 dx + n n (4k + |x|2 )2 R R 1 − 2η1 + 8θ dx × − 1 − η1 1 + |x|2 C B2 n θ 1 2 2 ≤ dx. −Re λ+ A+ | f | ζθ ξk, + +η − + |h| 3 1+|x|2 4 4η2 2 4η3 Rn

(6.13)

To exploit (6.13), we first take η1 = η2 = 21 and θ = m. Using (6.5) and the assumption that Re λ > n4 − m2 , we see that there exists R > 0 independent of k ≥ 1 such that, if η3 > 0 is sufficiently small, the following inequality holds:

k 2 ζθ ξk, |x f |2 (1 − ) dx ≤ C 2 2 Rn (4k + |x| )

1 ζθ ξk, | f | dx + ζθ ξk, |h|2 dx, 4η3 Rn |x|≤R 2


where the constant C > 0 is independent of k ≥ 1. Thus, taking the limit k → ∞ and using Fatou’s lemma, we obtain 1− (1−) 2 (1 + |x|2 )m e 4 |x| |x f |2 dx n 16 R 1− 1 2 ≤ C(R) | f |2 dx + (1 + |x|2 )m e 4 |x| |h|2 dx, 4η3 Rn |x|≤R 1−

which shows that e 8 |x| f ∈ L 2 (R2 ) for any > 0. Next we choose η1 = 41 , η2 = 18 , η3 = 1, and θ = 0 in (6.13). Taking again the limit k → ∞ and using Lebesgue’s dominated convergence theorem, we find 1− 1− 1 1− 2 2 e 4 |x| |∇ f |2 dx + e 4 |x| |x f |2 dx 8 Rn 24 Rn 1− 1− 1 2 2 |x| 2 ≤C e 4 | f | dx + e 4 |x| |h|2 dx, 4 Rn Rn 2

where the constant C > 0 does not depend on > 0. This inequality shows that 1− 1− 1 1− 2 2 e 4 |x| |∇ f |2 dx + e 4 |x| |x f |2 dx 8 Rn 48 Rn 1− 1− 1 2 2 |x| 2 4 ≤C e | f | dx + e 4 |x| |h|2 dx, n 4 |x|≤R R for some R > 0 independent of > 0. Taking now the limit → 0, we conclude that f ∈ H 1 (∞), which is the desired result. Proof of Proposition 3.4. We consider the eigenvalue equation (3.16), which can be written in the form 3 ˜ ωh = 0, − Lh ωh + α 1 ωh − α 2 ωh + λ + (6.14) 2 ˜ 2 are defined at the beginning of Sect. 3. where Lh is given by (2.2) and the operators 1 , ˜ 2 ωh | ≤ |∇h U G ||ωh |, where the velocity We recall that | 1 ωh | ≤ |UhG ||∇h ωh | and | h profile UhG satisfies (3.8). Assume that Re λ > − m2 −1 and let ωh ∈ H 1 (m)2 be a solution ˜2 f, to (6.14). Applying Proposition 6.1 with n = N = 2, F(x, f, ∇ f ) = α 1 f − α 1 2 and h = 0, we obtain ωh ∈ H (∞) . This completes the proof of Proposition 3.4. References 1. Burgers, J.M.: A mathematical model illustrating the theory of turbulence. Adv. Appl. Mech. 1, 171–199 (1948) 2. Carpio, A.: Asymptotic behavior for the vorticity equations in dimensions two and three. Commun. in PDE 19, 827–872 (1994) 3. Crowdy, D.G.: A note on the linear stability of Burgers vortex. Stud. Appl. Math. 100, 107–126 (1998) 4. Engel, K.-J., Nagel, R.: One-Parameter semigroups for linear evolution equations. Graduate Texts in Mathematics, Berlin-Heidelberg-New York: Springer, 2000 5. Fukuizumi, R., Ozawa, T.: On a decay property of solutions to the Haraux-Weissler equation. J. Diff. Eqs. 221, 134–142 (2006)

6. Gallagher, I., Gallay, Th., Nier, F.: Spectral asymptotics for large skew-symmetric perturbations of the harmonic oscillator. Int. Math. Res. Notices 2009, 2147–2199 (2009)
7. Gallay, Th., Wayne, C.E.: Invariant manifold and the long-time asymptotics of the Navier-Stokes and vorticity equations on $\mathbb{R}^2$. Arch. Rat. Mech. Anal. 163, 209–258 (2002)
8. Gallay, Th., Wayne, C.E.: Global Stability of vortex solutions of the two dimensional Navier-Stokes equation. Commun. Math. Phys. 255, 97–129 (2005)
9. Gallay, Th., Wayne, C.E.: Three-dimensional stability of Burgers vortices: the low Reynolds number case. Phys. D 213, 164–180 (2006)
10. Gallay, Th., Wayne, C.E.: Existence and stability of asymmetric Burgers vortices. J. Math. Fluid Mech. 9, 243–261 (2007)
11. Giga, Y., Giga, M.-H.: Nonlinear Partial Differential Equation, Self-similar solutions and asymptotic behavior. Tokyo: Kyoritsu, 1999 (in Japanese)
12. Giga, M.-H., Giga, Y., Saal, J.: Nonlinear Partial Differential Equations - Asymptotic Behavior of Solutions and Self-Similar Solutions. Basel-Boston: Birkhäuser, in press
13. Giga, Y., Kambe, T.: Large time behavior of the vorticity of two dimensional viscous flow and its application to vortex formation. Commun. Math. Phys. 117, 549–568 (1988)
14. Jiménez, J., Moffatt, H.K., Vasco, C.: The structure of the vortices in freely decaying two-dimensional turbulence. J. Fluid Mech. 313, 209–222 (1996)
15. Kagei, Y., Maekawa, Y.: On asymptotic behaviors of solutions to parabolic systems modelling chemotaxis. To appear
16. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer, 1966
17. Leibovich, S., Holmes, Ph.: Global stability of the Burgers vortex. Phys. Fluids 24, 548–549 (1981)
18. Maekawa, Y.: On the existence of Burgers vortices for high Reynolds numbers. J. Math. Anal. Appl. 349, 181–200 (2009)
19. Maekawa, Y.: Existence of asymmetric Burgers vortices and their asymptotic behavior at large circulations. Math. Models Methods Appl. Sci. 19, 669–705 (2009)
20. Maekawa, Y.: Spectral properties of the linearization at the Burgers vortex in the high rotation limit. To appear in J. Math. Fluid Mech.
21. Moffatt, H.K., Kida, S., Ohkitani, K.: Stretched vortices-the sinews of turbulence; large-Reynolds-number asymptotics. J. Fluid Mech. 259, 241–264 (1994)
22. Prochazka, A., Pullin, D.I.: On the two-dimensional stability of the axisymmetric Burgers vortex. Phys. Fluids 7, 1788–1790 (1995)
23. Prochazka, A., Pullin, D.I.: Structure and stability of non-symmetric Burgers vortices. J. Fluid Mech. 363, 199–228 (1998)
24. Robinson, A.C., Saffman, P.G.: Stability and Structure of stretched vortices. Stud. Appl. Math. 70, 163–181 (1984)
25. Rossi, M., Le Dizès, S.: Three-dimensional temporal spectrum of stretched vortices. Phys. Rev. Lett. 78, 2567–2569 (1997)
26. Schmid, P.J., Rossi, M.: Three-dimensional stability of a Burgers vortex. J. Fluid Mech. 500, 103–112 (2004)
27. Townsend, A.A.: On the fine-scale structure of turbulence. Proc. R. Soc. A 208, 534–542 (1951)

Communicated by P. Constantin

Commun. Math. Phys. 302, 513–580 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1162-0

Communications in Mathematical Physics

Global Gauge Anomalies in Two-Dimensional Bosonic Sigma Models

Krzysztof Gawędzki¹, Rafał R. Suszek², Konrad Waldorf³

1 Laboratoire de Physique, C.N.R.S., ENS-Lyon, Université de Lyon, 46 Allée d’Italie, 69364 Lyon, France. E-mail: [email protected]
2 Department Mathematik, Bereich Algebra und Zahlentheorie, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany
3 Department of Mathematics, University of California, Berkeley, 970 Evans Hall #3840, Berkeley, CA 94720, USA

Received: 23 March 2010 / Accepted: 3 June 2010
Published online: 21 November 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract: We revisit the gauging of rigid symmetries in two-dimensional bosonic sigma models with a Wess-Zumino term in the action. Such a term is related to a background closed 3-form H on the target space. More exactly, the sigma-model Feynman amplitudes of classical fields are associated to a bundle gerbe with connection of curvature H over the target space. Under conditions that were unraveled more than twenty years ago, the classical amplitudes may be coupled to the topologically trivial gauge fields of the symmetry group in a way which assures infinitesimal gauge invariance. We show that the resulting gauged Wess-Zumino amplitudes may, nevertheless, exhibit global gauge anomalies that we fully classify. The general results are illustrated on the example of the WZW and the coset models of conformal field theory. The latter are shown to be inconsistent in the presence of global anomalies. We introduce a notion of equivariant gerbes that allow an anomaly-free coupling of the Wess-Zumino amplitudes to all gauge fields, including the ones in non-trivial principal bundles. Obstructions to the existence of equivariant gerbes and their classification are discussed. The choice of different equivariant structures on the same bundle gerbe gives rise to a new type of discrete-torsion ambiguities in the gauged amplitudes. An explicit construction of gerbes equivariant with respect to the adjoint symmetries over compact simply connected simple Lie groups is given.

Contents

1. Introduction
2. Wess-Zumino Feynman Amplitudes
   2.1 2D Wess-Zumino action and gerbes
   2.2 Rigid symmetries of Wess-Zumino amplitudes
3. Coupling to Topologically Trivial Gauge Fields
   3.1 Gauging prescription
   3.2 Equivariant-cohomology interpretation
   3.3 More equivariance properties
4. Global Gauge Anomalies
   4.1 General gauge transformations
   4.2 Global gauge anomalies in WZW amplitudes
   4.3 Anomalies and WZW partition functions
   4.4 Implications for coset models
5. Coupling to General Gauge Fields
   5.1 Equivariant gerbes
   5.2 WZ amplitudes with topologically non-trivial gauge fields
   5.3 General gauge invariance
6. Obstructions and Classification of Equivariant Structures
   6.1 Obstructions to 1-isomorphisms α
   6.2 Local description of gerbes
   6.3 Obstructions to 2-isomorphism β
   6.4 Obstructions to the commutativity of diagram (5.1)
   6.5 Classification of equivariant structures
   6.6 Ambiguity of gauged amplitudes
   6.7 Fixed-point resolved coset partition functions
7. Ad-Equivariant WZW Gerbes Over Simply Connected Groups
   7.1 WZW gerbes over compact simply connected simple Lie groups
   7.2 Construction of 1-isomorphism α
   7.3 Construction of 2-isomorphism β
   7.4 Commutativity of diagram (5.1)
8. Conclusions
Appendices
   1 Proof of Proposition 3.1
   2 Proof of Lemma 3.13
   3 Proof of Proposition 4.2
   4 Proof of Theorem 5.3
   5 Proof of Lemma 5.4
   6 Construction of flat gerbes from characters
   7 Behavior of isomorphism α under groupoid multiplication
   8 Commutativity of diagram (7.74)
   9 Proof of the equality of isomorphisms (7.76) and (7.77)
References

1. Introduction

Gauge invariance constitutes one of the basic principles underlying the theoretical description of physical reality. The occurrence of its violations, called gauge anomalies [3], in certain models of quantum field theory with chiral fermions yields a powerful selection principle for the model building in high energy physics [56]. Gauge anomalies may describe violations of infinitesimal gauge invariance, or, if the latter holds, the breakdown of invariance under large gauge transformations not homotopic to identity [58]. The second type goes under the name of global gauge anomalies. Anomalies similar to the ones in theories with chiral fermions occur also in effective bosonic models describing the low energy sector [51]. Such effective theories contain Wess-Zumino (WZ) terms in the action [57], see, e.g., the review [46]. The emergence of global gauge anomalies in bosonic theories with WZ terms on the Euclidean space-time compactified


to the four-dimensional sphere was extensively analyzed following the work [58], see [11]. Starting with Witten’s paper [59] on non-Abelian bosonization, the two-dimensional Wess-Zumino actions for bosonic sigma models with Lie-group targets were studied quite thoroughly in the context of the Wess-Zumino-Witten (WZW) models of conformal field theory (CFT). In the latter setting, the problem of how to gauge rigid symmetries was solved, at least in the simplest cases, almost from the very start [8]. Nevertheless, the general question about the coupling of two-dimensional Wess-Zumino actions to gauge fields in a way invariant under infinitesimal gauge transformations was posed and answered only a few years later in [37] and in [36]. Besides, this was done only for topologically trivial gauge fields described by global 1-forms on the worldsheet. The conditions that permit such gauging and the obstructions to their fulfillment were subsequently interpreted in [13,14] in terms of equivariant cohomology, as first indicated in [60], see also [61]. The issue of general gauge invariance of gauged two-dimensional WZ actions was addressed only very briefly at the end of [13] and, in the context of the T-duality, in [34,35]. We make it the main topic of the present study. A convenient tool to treat topological intricacies of Wess-Zumino actions [1,17] on closed two-dimensional worldsheets is provided by the theory of bundle gerbes with connection [43,44]. For topologically trivial gauge fields, we identify the global gauge anomalies of gauged WZ actions as the isomorphism classes of certain flat gerbes over the product of the symmetry group $\Gamma$ and the target space $M$. Such isomorphism classes correspond to the classes in the cohomology group $H^2(\Gamma \times M, U(1))$ that may often be calculated explicitly. In particular, we show how to do it in the case of WZW models.
This permits us to prove that, after the gauging of an adjoint symmetry, some of bulk WZW models with non-simply connected target groups exhibit global gauge anomalies. The latter lead to the inconsistency of the corresponding coset models of CFT [29,30] realized as gauged WZW models with the gauge fields integrated out [2,21,22,40]. This is the main surprise resulting from our study. We also address the problem of the coupling of WZ actions to topologically non-trivial gauge fields given by connections in non-trivial principal bundles of the symmetry group. It was indicated in [33] that such a coupling plays an important role in the construction of consistent coset theories. It seems also important in the T -duality [34]. We show that the existence of certain equivariant structures on gerbes, considered already before for discrete symmetry groups in [27], enables a non-anomalous coupling to all gauge fields and we analyze in a cohomological language the obstructions to the existence of such structures and their classification. An explicit construction of all non-equivalent equivariant structures relative to the adjoint symmetries on gerbes relevant for the WZW models with compact simply connected target groups is given. Different choices of the equivariant structure lead to the amplitudes with topologically non-trivial gauge fields that differ by phases that are given by characters of (a subgroup of) the fundamental group of the (connected) symmetry group. The appearance of such discrete-torsionlike phases in the coset model sectors with topologically non-trivial gauge fields was envisaged in [33]. We discuss its implication on the resolution of the field-identification problem [16] in general coset models. The paper is organized as follows. In Sect. 2, we recall the role of bundle gerbes in the definition of the Feynman amplitudes of two-dimensional sigma models with a WZ action (in Sect. 2.1) and we characterize rigid symmetries of such amplitudes (in Sect. 
2.2). Section 3 is devoted to the coupling of WZ actions to topologically trivial gauge fields. In Sect. 3.1, we recall the old result of Jack-Jones-Mohammedi-Osborn [37] and Hull-Spence [36] describing the coupling of a WZ action to the gauge fields


K. Gawędzki, R. R. Suszek, K. Waldorf

of its symmetry group. In Sect. 3.2, we review the interpretation, due to Witten [60] and Figueroa-O’Farrill-Stanciu [13,14], of the conditions that permit such gauging in terms of the Cartan model of equivariant cohomology, and, in Sect. 3.3, we study further implications of those conditions. Section 4 is devoted to global gauge anomalies in theories with a WZ action coupled to topologically trivial gauge fields. Section 4.1 derives the transformation law of the Feynman amplitudes under general gauge transformations and identifies, in cohomological terms, the obstruction to the invariance of the amplitudes under large gauge transformations not homotopic to identity. The general discussion is illustrated in Sect. 4.2 by the example of WZW models with non-simply connected target groups and gauged adjoint symmetry. In this case, the presence or the absence of global gauge anomalies is decided by a simple condition stated in Proposition 4.8. In Sect. 4.3, we show that our results are consistent with the known solution for the partition functions of WZW models and in Sect. 4.4, we examine the toroidal partition functions of the coset models in the presence of global anomalies, pointing to the inconsistency of such models. Section 5 is devoted to the coupling of WZ actions to topologically non-trivial gauge fields. In Sect. 5.1, we define gerbes with equivariant structure. In Sect. 5.2, we describe how to use such structures to define WZ amplitudes coupled to gauge fields with arbitrary topology. The general gauge invariance of such amplitudes is proven in Sect. 5.3. In Sect. 6, we study subsequently the obstructions to the existence of the three layers of an equivariant structure on gerbes (in Sects. 6.1, 6.3 and 6.4). We use the local-data description of gerbes that is recalled in Sect. 6.2. The classification of equivariant gerbes is discussed in Sect. 6.5. Sect. 
6.6 examines the change of the WZ amplitudes induced by a change of the equivariant structure of the gerbe and Sect. 6.7 studies the reflection of such changes in the coset toroidal partition functions. Next, Sect. 7 contains an explicit construction of equivariant structures relative to the adjoint symmetry on gerbes relevant for the WZW models with compact simple and simply connected target groups. In Sect. 7.1, we recall the construction of the corresponding gerbes over the target groups and in Sects. 7.2, 7.3 and 7.4, we build the different layers of the equivariant structure. Finally, Sect. 8 summarizes our results and discusses directions for future work. More technical proofs are collected in nine Appendices. When the present work was finished, we learnt that a similar concept of equivariant gerbes was recently discussed in [45] and a different one, earlier, in [31].

2. Wess-Zumino Feynman Amplitudes

2.1. 2D Wess-Zumino action and gerbes. Let $M$ be a smooth manifold and $H$ a closed 3-form on $M$. 2-forms $B$ such that $dB = H$ provide the background Kalb-Ramond fields for the two-dimensional sigma model with target space $M$. We shall be mostly interested in situations when $H$ is not an exact form so that the 2-forms $B$ exist only locally. The classical fields of the sigma model are smooth maps $\varphi : \Sigma \to M$, where $\Sigma$, called the worldsheet, is a compact surface, not necessarily connected, that will be assumed closed and oriented here. The Kalb-Ramond field contributes to the sigma-model action functional and to the Feynman amplitude of the field configuration $\varphi$ the Wess-Zumino terms which, for the global 2-form $B$, are equal to

$$S_{WZ}(\varphi) := \int_\Sigma \varphi^* B \qquad\text{and}\qquad A_{WZ}(\varphi) := \exp\big(\mathrm{i}\, S_{WZ}(\varphi)\big) = \exp\Big(\mathrm{i} \int_\Sigma \varphi^* B\Big), \tag{2.1}$$


respectively, in the units where the Planck constant $\hbar = 1$. The contribution to the Feynman amplitudes may be defined more generally if, instead of a global 2-form $B$, one is given a bundle gerbe with unitary connection $\mathcal{G}$ over $M$, called simply gerbe below, with curvature equal to the closed 3-form $H$ [43]. Such gerbes are precisely the geometric objects that allow one to define a $U(1)$-valued holonomy $\mathrm{Hol}_{\mathcal{G}}(\varphi)$ of maps $\varphi : \Sigma \to M$, and one sets

$$A_{WZ}(\varphi) := \mathrm{Hol}_{\mathcal{G}}(\varphi). \tag{2.2}$$

In particular, if $H = dB$ for a global 2-form $B$ on $M$, there exists a gerbe $\mathcal{I}_B$ with curvature $H$, canonically associated to $B$, such that

$$\mathrm{Hol}_{\mathcal{I}_B}(\varphi) = \exp\Big(\mathrm{i} \int_\Sigma \varphi^* B\Big). \tag{2.3}$$

Gerbes with curvature $H$ exist if and only if the periods of the closed 3-form $H$ are in $2\pi\mathbb{Z}$. In particular, $H$ is not required to be an exact form. The basic property of the holonomy of a gerbe $\mathcal{G}$ with curvature $H$ is that it is a (Cheeger-Simons) differential character. This means that if $\tilde\Sigma$ is a compact oriented 3-manifold with boundary $\partial\tilde\Sigma = \Sigma$, and if $\tilde\varphi : \tilde\Sigma \to M$, then, for $\varphi = \tilde\varphi|_{\Sigma}$,

$$\mathrm{Hol}_{\mathcal{G}}(\varphi) = \exp\Big(\mathrm{i} \int_{\tilde\Sigma} \tilde\varphi^* H\Big). \tag{2.4}$$

Consequently, the gerbe holonomy is fully determined for the boundary values $\varphi$ of fields $\tilde\varphi$ by the gerbe curvature $H$. On the other hand, taking a 3-dimensional ball for $\tilde\Sigma$, one infers easily that the gerbe holonomy determines the gerbe curvature $H$. The converse is true only if the homology group $H_2(M)$ is trivial. The (bundle) gerbes (with unitary connection) $\mathcal{G}$ over $M$ form a 2-category $\mathrm{Grb}^\nabla(M)$ with 1-morphisms between gerbes and 2-morphisms between 1-morphisms [50]. Below, we shall denote by $\mathrm{Id}$ the identity maps between spaces as well as the identity 1-isomorphisms between gerbes and the identity 2-isomorphisms between 1-isomorphisms; the meaning of the symbol should be clear from the context. Gerbes $\mathcal{G}$ possess duals $\mathcal{G}^*$ with opposite curvature and inverse holonomy, tensor products $\mathcal{G}_1 \otimes \mathcal{G}_2$ with added curvatures and multiplied holonomies, and pullbacks $f^*\mathcal{G}$ under smooth maps $f$ of the underlying base manifolds with curvatures related by the pullback of 3-forms and the same holonomies of maps $\varphi$ related by the composition with $f$. Up to 1-isomorphisms, gerbes are classified by their holonomy. Indeed, two gerbes with the same curvature differ, up to a 1-isomorphism, by a tensor factor that is a flat gerbe (i.e. has vanishing curvature). Their holonomies differ by the flat gerbe holonomy factor that determines a cohomology class in $H^2(M, U(1)) = \mathrm{Hom}(H_2(M), U(1))$. All the elements of $H^2(M, U(1))$ may be obtained this way.

2.2. Rigid symmetries of Wess-Zumino amplitudes. Rigid symmetries of sigma models are induced by transformations of the target space. Let $\Gamma$ be a Lie group that, in general, will not be assumed to be connected or simply connected. Suppose now that $M$ is a $\Gamma$-space, i.e. that we are given a smooth action $\ell : \Gamma \times M \to M$ of $\Gamma$ on $M$. We shall variably write $\ell(\gamma, m) := \ell_\gamma(m) := r_m(\gamma) := \gamma m$. The infinitesimal action of the Lie algebra $\mathfrak{g}$ of $\Gamma$ on $M$ is induced by the vector fields $\bar X$ for $X \in \mathfrak{g}$, where $\bar X(m) = \frac{d}{dt}\big|_{t=0}\, e^{-tX} m$.


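The assignment preserves the commutators: $[\bar X, \bar Y] = \overline{[X, Y]}$. We would like to determine when the WZ Feynman amplitudes are invariant under this action. Below, $\iota_X$ will denote the contraction with the vector field $X$, and $\mathcal{L}_X = d\,\iota_X + \iota_X d$ the Lie derivative with respect to it.

Lemma 2.1. The variation of the gerbe holonomy of maps $\varphi : \Sigma \to M$ under the infinitesimal action of $X \in \mathfrak{g}$ is given by the formula

$$\frac{d}{dt}\Big|_{t=0} \mathrm{Hol}_{\mathcal{G}}(e^{-tX}\varphi) = \mathrm{i}\,\Big(\int_\Sigma \varphi^* \iota_{\bar X} H\Big)\, \mathrm{Hol}_{\mathcal{G}}(\varphi). \tag{2.5}$$

Proof. The relation (2.4) implies that

$$\mathrm{Hol}_{\mathcal{G}}(e^{-tX}\varphi) = \exp\Big(\mathrm{i} \int_{[0,1]\times\Sigma} \tilde\varphi_t^* H\Big)\, \mathrm{Hol}_{\mathcal{G}}(\varphi) = \exp\Big(\mathrm{i} \int_{[0,1]\times\Sigma} \tilde\psi_t^*\, \mathrm{pr}_2^* H\Big)\, \mathrm{Hol}_{\mathcal{G}}(\varphi) \tag{2.6}$$

for $\tilde\varphi_t(s, x) = e^{-stX}\varphi(x)$, $\tilde\psi_t(s, x) = (s, \tilde\varphi_t(s, x))$ and $\mathrm{pr}_2(s, m) = m$. Differentiation of the right hand side with respect to $t$ gives

$$\frac{d}{dt}\Big|_{t=0} \mathrm{Hol}_{\mathcal{G}}(e^{-tX}\varphi) = \mathrm{i}\,\Big(\int_{[0,1]\times\Sigma} \tilde\psi_0^*\, \mathcal{L}_{\tilde X}\, \mathrm{pr}_2^* H\Big)\, \mathrm{Hol}_{\mathcal{G}}(\varphi) = \mathrm{i}\,\Big(\int_{[0,1]\times\Sigma} d\, \tilde\psi_0^*\, \iota_{\tilde X}\, \mathrm{pr}_2^* H\Big)\, \mathrm{Hol}_{\mathcal{G}}(\varphi), \tag{2.7}$$

where $\tilde X$ is the vector field on $[0,1] \times M$ such that $\tilde X(s, m) = \frac{d}{dt}\big|_{t=0}\,(s, e^{-stX} m) = s \bar X(m)$. The Stokes formula applied to the last integral results in the claim.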

Lemma 2.1 implies that the left hand side of Eq. (2.5) vanishes if and only if

$$\int_\Sigma \varphi^* \iota_{\bar X} H = 0. \tag{2.8}$$

This holds for all $\varphi$ if and only if $\iota_{\bar X} H$ is an exact form. We obtain this way

Corollary 2.2. The Feynman amplitudes $A_{WZ}(\varphi)$ are invariant under the infinitesimal action of the Lie algebra $\mathfrak{g}$ (or, equivalently, of the connected component of unity $\Gamma_0 \subset \Gamma$) if and only if the 2-forms $\iota_{\bar X} H$ are exact for all $X \in \mathfrak{g}$.

Note that the exactness of $\iota_{\bar X} H$ implies, in particular, that $\mathcal{L}_{\bar X} H = 0$, i.e. that the curvature 3-form $H$ is invariant under the infinitesimal action of $\mathfrak{g}$. Observe also that if $H = dB$ for a global $\mathfrak{g}$-invariant 2-form $B$, then $\iota_{\bar X} H = -d(\iota_{\bar X} B)$ so that the 2-forms $\iota_{\bar X} H$ are exact. If the group $\Gamma$ is not connected, i.e. $\Gamma \neq \Gamma_0$, then the condition for the $\Gamma$-invariance of the WZ Feynman amplitudes is more stringent. Since

$$\mathrm{Hol}_{\mathcal{G}}(\gamma\varphi) = \mathrm{Hol}_{\ell_\gamma^* \mathcal{G}}(\varphi) \tag{2.9}$$

for $\gamma \in \Gamma$, it follows that $A_{WZ}(\gamma\varphi) = A_{WZ}(\varphi)$ for all $\varphi$ if and only if the gerbes $\ell_\gamma^*\mathcal{G}$ and $\mathcal{G}$ have the same holonomy. In particular, they have to have the same curvature: $\ell_\gamma^* H = H$. Since the holonomy determines the 1-isomorphism class of a gerbe, we obtain


Corollary 2.3. The Feynman amplitudes $A_{WZ}(\varphi)$ are invariant under the action of $\Gamma$ if and only if the gerbes $\ell_\gamma^*\mathcal{G}$ and $\mathcal{G}$ are 1-isomorphic for all $\gamma \in \Gamma$.

Remark 2.4. In most applications, the sigma-model target manifold $M$ is equipped with a Riemannian metric $g_M$ and the Feynman amplitudes contain also a factor with the standard sigma-model action $S(\varphi) = \|d\varphi\|^2_{L^2}$ defined with the help of $g_M$ and of a Riemannian metric $g_\Sigma$ on the worldsheet. In that situation a group $\Gamma$ of rigid symmetries that leaves the total amplitudes unchanged for arbitrary Riemann surfaces as worldsheets has to preserve additionally the target metric $g_M$ so that, in particular, $\bar X$ (for $X \in \mathfrak{g}$) are Killing vector fields.

3. Coupling to Topologically Trivial Gauge Fields

A natural question arises whether $\mathfrak{g}$-invariant Feynman amplitudes $A_{WZ}(\varphi)$ may be gauged, i.e. coupled to gauge fields in a gauge-invariant way. First, we shall discuss the case of topologically trivial gauge fields given by global $\mathfrak{g}$-valued 1-forms $A$ on the worldsheet $\Sigma$. Such forms may be viewed as connections on the trivial principal $\Gamma$-bundle $\Sigma \times \Gamma$ for group $\Gamma$ with Lie algebra $\mathfrak{g}$.

3.1. Gauging prescription. In the particular instance when the WZ Feynman amplitudes are determined by a global $\mathfrak{g}$-invariant 2-form $B$ with $dB = H$, one may realize the gauging by replacing $B$ with its minimally coupled version $B_A$ which is a 2-form on $\Sigma \times M$:

$$B_A := \exp(-\iota_{\bar A})\, B = B - \iota_{\bar A} B + \tfrac{1}{2}\, \iota_{\bar A}^2 B. \tag{3.1}$$

Above, for $X \in \mathfrak{g}$ and $\alpha$ a differential form, we define $\iota_{\overline{X \otimes \alpha}} = \alpha\, \iota_{\bar X}$ (omitting the wedge sign for the exterior product of differential forms). The gauged Wess-Zumino action has then the form

$$S_{WZ}(\varphi, A) := \int_\Sigma \phi^* B_A = S_{WZ}(\varphi) + \int_\Sigma \phi^*\big(-\iota_{\bar A} B + \tfrac{1}{2}\, \iota_{\bar A}^2 B\big), \tag{3.2}$$

where $\phi = (\mathrm{Id}, \varphi) : \Sigma \to \Sigma \times M$. It is well known that the minimal coupling gives an action that is invariant under infinitesimal gauge transformations induced by the maps $\Lambda : \Sigma \to \mathfrak{g}$. This means that

$$\frac{d}{dt}\Big|_{t=0} S_{WZ}(e^{-t\Lambda}\varphi,\, e^{-t\Lambda}A) = 0, \tag{3.3}$$

where, for $x \in \Sigma$,

$$(e^{-t\Lambda}\varphi)(x) = e^{-t\Lambda(x)}\varphi(x), \qquad (e^{-t\Lambda}A)(x) = \mathrm{Ad}_{e^{-t\Lambda(x)}} A(x) + e^{-t\Lambda(x)}\, d\, e^{t\Lambda(x)}. \tag{3.4}$$

The invariance (3.3) will also follow from the considerations below. In the more general case when the Feynman amplitudes $A_{WZ}(\varphi)$ are given by the gerbe holonomy, see Eq. (2.2), one may still postulate that the coupling to the gauge


fields is realized by terms linear and quadratic in $A$, resulting in the replacement of $A_{WZ}(\varphi)$ by

$$A_{WZ}(\varphi, A) := \exp\Big(\mathrm{i} \int_\Sigma \phi^*\big(-v(A) + \tfrac{1}{2}\, u(A^2)\big)\Big)\, A_{WZ}(\varphi), \tag{3.5}$$

where $v(X)$ are 1-forms on $M$ depending linearly on $X \in \mathfrak{g}$, $u(X \wedge Y)$ are functions on $M$ depending linearly on $X \wedge Y \in \mathfrak{g} \wedge \mathfrak{g}$ and, for a form $\alpha$ on $\Sigma$, $v(X \otimes \alpha) := v(X)\,\alpha$ and $u((X \wedge Y) \otimes \alpha) := u(X \wedge Y)\,\alpha$ denote the induced forms on $\Sigma \times M$. Necessary conditions for the consistency of such a coupling were found in [37] and [36]. They are summarized in

Proposition 3.1. The amplitudes $A_{WZ}(\varphi, A)$ defined in (3.5) are invariant under infinitesimal gauge transformations if and only if the 1-forms $v(X)$ satisfy the relations

$$\iota_{\bar X} H = dv(X), \qquad \mathcal{L}_{\bar X} v(Y) = v([X, Y]), \qquad \iota_{\bar X} v(Y) = -\iota_{\bar Y} v(X) \tag{3.6}$$

for all $X, Y \in \mathfrak{g}$, with the functions $u$ given by

$$u(X \wedge Y) = \iota_{\bar X} v(Y). \tag{3.7}$$

For completeness, we give in Appendix 1 a proof of this result by arguments close to the original ones of [37] and [36].

Remark 3.2. 1. The 1-forms $v(X)$ satisfying Eqs. (3.6) may be modified by 1-forms $w(X)$ (also linear in $X$) satisfying the homogeneous version of these equations. 2. To make contact with refs. [37] and [36] more explicitly, let us introduce a basis $(t^a)$ of the Lie algebra $\mathfrak{g}$ with $[t^a, t^b] = f^{abc} t^c$ (the summation convention!), $v(t^a) =: v^a$, and $u(t^a \wedge t^b) =: u^{ab}$. Denoting by $\iota_a$ and $\mathcal{L}_a$ the contraction with and the Lie derivative w.r.t. the vector field $\bar t^a$, the relations (3.6) and (3.7) may be rewritten as

$$\iota_a H = dv^a, \qquad \mathcal{L}_a v^b = f^{abc} v^c, \qquad \iota_a v^b = -\iota_b v^a = u^{ab}. \tag{3.8}$$

In view of Proposition 3.1, it will be convenient to introduce a 2-form $\rho_A$ on the product manifold $\Sigma \times M$ and a gerbe $\mathcal{G}_A$ over the same space by the formulae

$$\rho_A = -v(A) + \tfrac{1}{2}\, \iota_{\bar A} v(A) \qquad\text{and}\qquad \mathcal{G}_A = \mathcal{I}_{\rho_A} \otimes \mathcal{G}_2. \tag{3.9}$$

Equation (3.5), together with the conditions (3.6) and (3.7) on its entries, may then be summarized in the following

Definition 3.3. Let $\mathcal{G}$ be a gerbe with curvature $H$ over a $\Gamma$-space $M$, and let $v(X)$ be 1-forms on $M$, depending linearly on $X \in \mathfrak{g}$, satisfying conditions (3.6). The Wess-Zumino contribution of a field $\varphi : \Sigma \to M$ to the Feynman amplitude coupled to a gauge field 1-form $A$ on $\Sigma$ is defined as

$$A_{WZ}(\varphi, A) = \exp\Big(\mathrm{i} \int_\Sigma \phi^* \rho_A\Big)\, A_{WZ}(\varphi) = \mathrm{Hol}_{\mathcal{G}_A}(\phi), \tag{3.10}$$

where, as before, $\phi = (\mathrm{Id}, \varphi)$.

Remark 3.4. If the gerbe $\mathcal{G}$ is equal to $\mathcal{I}_B$ for a $\mathfrak{g}$-invariant 2-form $B$ such that $dB = H$, then one may take $v(X) = -\iota_{\bar X} B$. In this case, Eq. (3.10) agrees with the minimal coupling (3.2) of the Wess-Zumino action.


Proposition 3.1 implies immediately

Corollary 3.5. Equation (3.10) defines amplitudes that are invariant under infinitesimal gauge transformations.

We shall need two easy implications of relations (3.6) whose straightforward proof is left to the reader. They will be employed repeatedly below.

Lemma 3.6. Relations (3.6) imply that

$$\iota_{\bar X}\iota_{\bar Y} H = v([X, Y]) - d\,\iota_{\bar X} v(Y), \tag{3.11}$$

$$\iota_{\bar X}\iota_{\bar Y}\iota_{\bar Z} H = \iota_{\bar X} v([Y, Z]) + \iota_{\bar Z} v([X, Y]) + \iota_{\bar Y} v([Z, X]). \tag{3.12}$$
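Since the proof of Lemma 3.6 is left to the reader, one possible derivation is sketched here. It is not taken from the paper; it assumes only Cartan's formula $\mathcal{L}_X = d\,\iota_X + \iota_X d$, the relations (3.6), and the identity $[\bar X, \bar Y] = \overline{[X, Y]}$ of Sect. 2.2.

```latex
\begin{align*}
\iota_{\bar X}\iota_{\bar Y}H
  &= \iota_{\bar X}\,dv(Y)
   = \mathcal{L}_{\bar X}v(Y) - d\,\iota_{\bar X}v(Y)
   = v([X,Y]) - d\,\iota_{\bar X}v(Y), \\[4pt]
\iota_{\bar X}\iota_{\bar Y}\iota_{\bar Z}H
  &= \iota_{\bar X}\bigl(v([Y,Z]) - d\,\iota_{\bar Y}v(Z)\bigr)
   = \iota_{\bar X}v([Y,Z]) - \mathcal{L}_{\bar X}\bigl(\iota_{\bar Y}v(Z)\bigr) \\
  &= \iota_{\bar X}v([Y,Z]) - \iota_{\overline{[X,Y]}}\,v(Z) - \iota_{\bar Y}v([X,Z]) \\
  &= \iota_{\bar X}v([Y,Z]) + \iota_{\bar Z}v([X,Y]) + \iota_{\bar Y}v([Z,X]).
\end{align*}
```

The second line uses that $\iota_{\bar Y} v(Z)$ is a function, so that $\iota_{\bar X} d$ acts on it as $\mathcal{L}_{\bar X}$; the last line uses the antisymmetry $\iota_{\bar X} v(Y) = -\iota_{\bar Y} v(X)$ from (3.6).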

3.2. Equivariant-cohomology interpretation. In refs. [13,14], see also [60] and [61], relations (3.6) were interpreted in terms of equivariant cohomology. Let $\Omega(M)$ denote the space of differential forms on $M$. Recall that the Cartan complex for equivariant cohomology is formed of polynomial maps

$$\mathfrak{g} \ni X \longmapsto \hat\omega(X) \in \Omega(M) \tag{3.13}$$

which satisfy

$$\mathcal{L}_{\bar X}\, \hat\omega(Y) = \frac{d}{dt}\Big|_{t=0}\, \hat\omega(\mathrm{Ad}_{e^{tX}} Y) \qquad\text{for } X, Y \in \mathfrak{g}. \tag{3.14}$$

We shall call such maps $\mathfrak{g}$-equivariant forms. Note that relation (3.14) holds if and only if

$$\ell_\gamma^*\, \hat\omega(Y) = \hat\omega(\mathrm{Ad}_{\gamma^{-1}} Y) \tag{3.15}$$

for $\gamma$ in the connected component $\Gamma_0$ of 1 in $\Gamma$. We shall say that a form $\hat\omega$ is $\Gamma$-equivariant if the relation (3.15) is satisfied for all $\gamma \in \Gamma$. Of course, the two notions of equivariance coincide if the group $\Gamma$ is connected. The $\mathfrak{g}$-equivariant ($\Gamma$-equivariant) forms make up the complex $\Omega^\bullet_{\mathfrak{g}}(M)$ ($\Omega^\bullet_{\Gamma}(M)$) with the $\mathbb{Z}$-grading that adds twice the degree of the polynomial to the degree of the form and with the differential of degree 1 given by the formula

$$(\hat d \hat\omega)(X) = d\hat\omega(X) - \iota_{\bar X}\, \hat\omega(X). \tag{3.16}$$

The following result was obtained in [13,14]:

Proposition 3.7. A $\mathfrak{g}$-equivariantly closed 3-form $\hat H(X) = H + v(X)$ extends the closed $\mathfrak{g}$-invariant 3-form $H$ if and only if the 1-forms $v(X)$ satisfy conditions (3.6).

Proof. The $\mathfrak{g}$-equivariance of $\hat H$ is the relation

$$\mathcal{L}_{\bar X} \hat H(Y) = \mathcal{L}_{\bar X}(H + v(Y)) = v([X, Y]) \tag{3.17}$$

that, in view of the $\mathfrak{g}$-invariance of $H$, reproduces the middle equality in (3.6). On the other hand, the form $\hat H$ is $\mathfrak{g}$-equivariantly closed when

$$(\hat d \hat H)(X) = dH + dv(X) - \iota_{\bar X} H - \iota_{\bar X} v(X) = 0 \tag{3.18}$$

which, using that $dH = 0$, is equivalent to the left and the right equalities of (3.6).
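The last step of the proof can be spelled out by sorting (3.18) by form degree. The following is only an illustrative expansion, assuming nothing beyond (3.16) and the linearity of $v$ in $X$:

```latex
\begin{align*}
(\hat d\hat H)(X)
  &= \underbrace{dH}_{\text{4-form, independent of } X}
   \;+\; \underbrace{dv(X) - \iota_{\bar X}H}_{\text{2-form, linear in } X}
   \;-\; \underbrace{\iota_{\bar X}v(X)}_{\text{function, quadratic in } X}, \\[4pt]
0 &= \iota_{\overline{X+Y}}\,v(X+Y) - \iota_{\bar X}v(X) - \iota_{\bar Y}v(Y)
   = \iota_{\bar X}v(Y) + \iota_{\bar Y}v(X).
\end{align*}
```

Each homogeneous component has to vanish separately: the 2-form part gives the left equality of (3.6), and polarizing the quadratic function $\iota_{\bar X} v(X) = 0$, as in the second line, gives the right one.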


Remark 3.8. The freedom of choice of $v(X)$ mentioned in Remark 3.2(1) consists of the addition of a 1-form $w(X)$ that is $\mathfrak{g}$-equivariantly closed.

The $\mathfrak{g}$-equivariantly closed 3-form $\hat H = H + v(X)$ may be related directly to the curvature of the gerbe $\mathcal{G}_A$ of Eq. (3.9) which is equal to the 3-form

$$H_A = H + d\rho_A \tag{3.19}$$

on $\Sigma \times M$.

Lemma 3.9.

$$H_A = \exp(-\iota_{\bar A})\, (H + v(F)), \tag{3.20}$$

where $F = dA + \tfrac{1}{2}[A, A]$ is the gauge-field strength 2-form.

Proof. Writing $A = t^a A^a$ and $F = t^a F^a$ with $F^a = dA^a + \tfrac{1}{2} f^{bca} A^b A^c$, we obtain, using the left one of relations (3.6):

$$H_A = H + d\rho_A = H + d\big(-v^a A^a + \tfrac{1}{2}(\iota_a v^b) A^a A^b\big) = H - \iota_a H\, A^a + v^a\, dA^a + \tfrac{1}{2}\, d(\iota_a v^b)\, A^a A^b. \tag{3.21}$$

Equation (3.11) permits one to transform the last term on the right-hand side and to show that

$$H_A = H - \iota_a H\, A^a + v^a\, dA^a + \tfrac{1}{2} f^{abc} v^c A^a A^b - \tfrac{1}{2} (\iota_a \iota_b H) A^a A^b = H - \iota_{\bar A} H + v(F) + \tfrac{1}{2}\, \iota_{\bar A}^2 H = \exp(-\iota_{\bar A})\, (H + v(F)). \tag{3.22}$$
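A reader checking (3.21) and (3.22) term by term may wonder where some contributions have gone. The note below is a hedged reading of the computation rather than part of the original proof; it relies only on the fact that $A$ is a 1-form on the two-dimensional worldsheet $\Sigma$, so that every 3-form built from $A$ and $dA$ alone vanishes.

```latex
\begin{align*}
(\iota_a v^b)\,d(A^aA^b) &= 0
  && \text{($d(A^aA^b)$ is a 3-form on $\Sigma$),} \\
\iota_{\bar A}\,v(F) &= 0
  && \text{(a function on $M$ times the 3-form $A^aF^b$ on $\Sigma$),} \\
\iota_{\bar A}^{3}H &= 0
  && \text{(proportional to the 3-form $A^aA^bA^c$ on $\Sigma$).}
\end{align*}
```

With these three terms dropped, the expansion of $\exp(-\iota_{\bar A})(H + v(F))$ stops at $H - \iota_{\bar A} H + \tfrac{1}{2}\iota_{\bar A}^2 H + v(F)$, which is the middle expression in (3.22).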

Remark 3.10. The minimal coupling operator $\exp(-\iota_{\bar A})$ may be naturally interpreted within equivariant cohomology, see [38]. Let us only mention here that it satisfies the relation

$$\exp(\iota_{\bar A})\; d\; \exp(-\iota_{\bar A}) = d - \iota_{\bar F} + \mathcal{L}_{\bar A} \tag{3.23}$$

for $\mathcal{L}_{\bar A} = A^a \mathcal{L}_a$.

3.3. More equivariance properties. We shall assume below that the 3-form $H$ extends to the $\Gamma$-equivariantly closed 3-form $\hat H(X) = H + v(X)$. This means, along with conditions (3.6), that

$$\ell_\gamma^* H = H \qquad\text{and}\qquad \ell_\gamma^* v(X) = v(\mathrm{Ad}_{\gamma^{-1}} X) \tag{3.24}$$

for all $\gamma \in \Gamma$ and all $X \in \mathfrak{g}$, see Eq. (3.15). In this section, we shall calculate the pullback $\ell^* H$ of the 3-form $H$ along the action map $\ell : \Gamma \times M \to M$. The result provides another way to express equivariance properties of $H$ that will be used in the sequel. More generally, we shall discuss below forms and gerbes over the product spaces $\Gamma^{p-1} \times M$ that will be considered as $\Gamma$-spaces with the adjoint action of $\Gamma$ on the factors


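in $\Gamma^{p-1}$ and the original one on $M$. For a sequence of indices $1 \le i_1 < \cdots < i_{k_1} < i_{k_1+1} < \cdots < i_{k_2} < \cdots < i_{k_q} \le p$, we shall denote by $\ell_{i_1 \ldots i_{k_1},\, i_{k_1+1} \ldots i_{k_2},\, \ldots,\, i_{k_{q-1}+1} \ldots i_{k_q}}$ the maps

$$\Gamma^{p-1} \times M \ni (\gamma_1, \ldots, \gamma_{p-1}, m) \longmapsto \begin{cases} (\gamma_{i_1} \cdots \gamma_{i_{k_1}},\; \gamma_{i_{k_1+1}} \cdots \gamma_{i_{k_2}},\; \ldots,\; \gamma_{i_{k_{q-1}+1}} \cdots \gamma_{i_{k_q}}) \in \Gamma^{q} & \text{if } i_{k_q} < p, \\[4pt] (\gamma_{i_1} \cdots \gamma_{i_{k_1}},\; \gamma_{i_{k_1+1}} \cdots \gamma_{i_{k_2}},\; \ldots,\; \gamma_{i_{k_{q-1}+1}} \cdots \gamma_{i_{k_q - 1}}\, m) \in \Gamma^{q-1} \times M & \text{if } i_{k_q} = p, \end{cases} \tag{3.25}$$

e.g., $\ell_2(\gamma, m) = m$, $\ell_{12}(\gamma, m) = \gamma m$, $\ell_{12}(\gamma_1, \gamma_2, m) = \gamma_1\gamma_2$, or $\ell_{2,3}(\gamma_1, \gamma_2, m) = (\gamma_2, m)$. All these maps commute with the action of $\Gamma$. Finally, we shall abbreviate

$$\ell^*_{i_1 \ldots i_{k_p}} H := H_{i_1 \ldots i_{k_p}}.$$

Similar self-explanatory shorthand notations will be employed for other forms, gerbes and gerbe 1- and 2-morphisms, also living on other product spaces. Let us start by considering the pullback $H_{12} = \ell^* H$ of the 3-form $H$ to $\Gamma \times M$. The 1-forms $v(X)$ on $M$ define a 2-form

$$\rho := -v(\Theta) + \tfrac{1}{2}\, (\iota_{\bar\Theta} v)(\Theta) \tag{3.26}$$

on $\Gamma \times M$, where $\Theta = t^a \Theta^a = \gamma^{-1} d\gamma$ is the $\mathfrak{g}$-valued Maurer-Cartan 1-form on $\Gamma$. As before, we use the notations $\iota_{\overline{X \otimes \alpha}} := \alpha\, \iota_{\bar X}$ and $v(X \otimes \alpha) := v(X)\, \alpha$ for $X \in \mathfrak{g}$ and $\alpha$ a form, dropping the exterior product sign. Note the similarity to formula (3.9) for the 2-form $\rho_A$.

Lemma 3.11. $H_{12} = d\rho + H_2$.

Proof. In order to find an explicit expression for $H_{12}$, a useful tool is the observation that, for a form $\omega \in \Omega(M)$,

$$(\ell^* \omega)(\gamma, m) = \exp[-\iota_{\bar\Theta(\gamma)}]\, \big(\ell_\gamma^* \omega\big)(m). \tag{3.27}$$

Equation (3.27) makes explicit the contributions to $\ell^*\omega$ with differentials along $\Gamma$ and along $M$. Application of identity (3.27) to $\omega = H$ gives

$$\begin{aligned} (\ell^* H)(\gamma, m) &= \exp[-\iota_{\bar\Theta(\gamma)}]\, \big(\ell_\gamma^* H\big)(m) = \exp[-\iota_{\bar\Theta(\gamma)}]\, H(m) \\ &= H(m) - \Theta^a(\gamma)(\iota_a H)(m) - \tfrac{1}{2} (\Theta^a\Theta^b)(\gamma)(\iota_a\iota_b H)(m) + \tfrac{1}{6} (\Theta^a\Theta^b\Theta^c)(\gamma)(\iota_a\iota_b\iota_c H)(m) \\ &= H(m) - \Theta^a(\gamma)(dv^a)(m) - \tfrac{1}{2} f^{abc} (\Theta^a\Theta^b)(\gamma)\, v^c(m) + \tfrac{1}{2} (\Theta^a\Theta^b)(\gamma)(d\iota_a v^b)(m) + \tfrac{1}{2} f^{bcd} (\Theta^a\Theta^b\Theta^c)(\gamma)(\iota_a v^d)(m) \\ &= H(m) + d\big(\Theta^a v^a + \tfrac{1}{2}\, \Theta^a\Theta^b\, \iota_a v^b\big)(\gamma, m), \end{aligned} \tag{3.28}$$

where the last but one equality was obtained by employing relations (3.6) and Lemma 3.6, and the last equality follows from the structure equation $d\Theta^c = -\tfrac{1}{2} f^{abc}\, \Theta^a\Theta^b$ for the Maurer-Cartan forms. The result is the claimed identity.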


Remark 3.12. 1. Similarly one may prove the relation

$$\hat H_{12} = \hat d \rho + \hat H_2 \tag{3.29}$$

which gives an equivariant extension of Lemma 3.11. 2. Lemma 3.11 implies that if $w(X)$ is a 1-form depending linearly on $X$ that is $\Gamma$-equivariantly closed, then the 2-form

$$\sigma = -w(\Theta) + \tfrac{1}{2}\, \iota_{\bar\Theta}\, w(\Theta) \tag{3.30}$$

on $\Gamma \times M$ is closed, see Remark 3.8. This is still true if $w(X)$ is only $\mathfrak{g}$-equivariantly closed.

Lemma 3.13. The 2-form $\rho$ defined in Eq. (3.26) has the following properties: 1. $\rho$ is a $\Gamma$-invariant form on $\Gamma \times M$. 2. As forms on $\Gamma^2 \times M$,

$$\rho_{12,3} = \rho_{1,23} + \rho_{2,3}. \tag{3.31}$$

A proof of Lemma 3.13 may be found in Appendix 2.

4. Global Gauge Anomalies

4.1. General gauge transformations. As we have seen, conditions (3.6) assure the infinitesimal gauge invariance of the Feynman amplitudes (3.10). In the present section, we shall examine the behavior of the amplitudes under general gauge transformations generated by $\Gamma$-valued smooth maps $h : \Sigma \to \Gamma$. Such maps act on the space $\Sigma \times M$ by

$$(x, m) \;\overset{L_h}{\longmapsto}\; (x, h(x) m), \tag{4.1}$$

on the sigma-model fields $\varphi : \Sigma \to M$ by

$$\varphi \longmapsto h\varphi, \tag{4.2}$$

where $(h\varphi)(x) = h(x)\varphi(x)$, and on the gauge fields according to the formulae

$$A \longmapsto hA := \mathrm{Ad}_h(A) + (h^{-1})^*\Theta, \qquad F \longmapsto hF = \mathrm{Ad}_h(F). \tag{4.3}$$

The infinitesimal gauge transformations are then generated by taking $h = e^{-t\Lambda}$ for $\Lambda : \Sigma \to \mathfrak{g}$ and expanding to the 1st order in $t$. Let us start by establishing the transformation rule of the curvature 3-form $H_A$ of gerbe $\mathcal{G}_A$ over $\Sigma \times M$ under maps (4.1).

Lemma 4.1. The 3-form $H_A$ defined in (3.19) transforms covariantly under the general gauge transformations $h : \Sigma \to \Gamma$:

$$L_h^* H_A = H_{h^{-1}A}. \tag{4.4}$$


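Proof. By virtue of the formula (3.27), Lemma 3.9, the identity

$$\iota_{\overline{\mathrm{Ad}_{\gamma^{-1}} X}}\; \ell_\gamma^* = \ell_\gamma^*\; \iota_{\bar X} \tag{4.5}$$

that holds on $M$, and relations (3.24), we have:

$$\begin{aligned} (L_h^* H_A)(x, m) &= H_A(x, h(x) m) = \exp[-\iota_{\overline{(h^*\Theta)(x)}}]\, \big(\ell_{h(x)}^* H_A(x, \cdot)\big)(m) \\ &= \exp[-\iota_{\overline{(h^*\Theta)(x)}}]\, \Big(\ell_{h(x)}^* \exp[-\iota_{\bar A(x)}]\, \big(H + v(F(x))\big)\Big)(m) \\ &= \exp[-\iota_{\overline{(h^*\Theta)(x)}}]\, \exp[-\iota_{\overline{(\mathrm{Ad}_{h^{-1}}(A))(x)}}]\, \big(H + v((\mathrm{Ad}_{h^{-1}}(F))(x))\big)(m) \\ &= \exp[-\iota_{\overline{(h^*\Theta + \mathrm{Ad}_{h^{-1}}(A))(x)}}]\, \big(H + v((\mathrm{Ad}_{h^{-1}}(F))(x))\big)(m) \\ &= H_{h^{-1}A}(x, m), \end{aligned} \tag{4.6}$$

where the last equality follows from relations (4.3).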

We shall need below a few simple facts from the theory of gerbes. First, the pullback and the tensor product of gerbes commute. Second, the pullback of the gerbe $\mathcal{I}_B$ associated to a 2-form $B$ is a similar gerbe associated to the pullback 2-form. Third, the tensor product of gerbes $\mathcal{I}_{B_1} \otimes \mathcal{I}_{B_2}$ for 2-forms $B_i$ on the same space may be identified with the gerbe $\mathcal{I}_{B_1 + B_2}$. Fourth, the tensor product $\mathcal{G} \otimes \mathcal{G}^*$ of a gerbe with its dual is canonically isomorphic to the trivial gerbe $\mathcal{I}_0$ which provides the unity of the tensor product. Fifth, if two gerbes are 1-isomorphic then so are their tensor products by a third gerbe and their pullbacks by the same map.

To find out the transformation rules of the Feynman amplitudes under general gauge transformations, we have to compare the amplitudes $A_{WZ}(h\varphi, hA)$ and $A_{WZ}(\varphi, A)$. Since

$$A_{WZ}(h\varphi, hA) = \mathrm{Hol}_{\mathcal{G}_{hA}}(L_h \circ \phi) = \mathrm{Hol}_{L_h^* \mathcal{G}_{hA}}(\phi) \qquad\text{and}\qquad A_{WZ}(\varphi, A) = \mathrm{Hol}_{\mathcal{G}_A}(\phi) \tag{4.7}$$

for $\phi = (\mathrm{Id}, \varphi)$, it will be enough to compare the gerbes $L_h^* \mathcal{G}_{hA}$ and $\mathcal{G}_A$ whose curvatures, equal to $L_h^* H_{hA}$ and $H_A$, respectively, coincide by Lemma 4.1. From the latter property, it follows that those two gerbes are related up to 1-isomorphism by tensoring with a flat gerbe which we shall identify now. Consider the gerbe

$$\mathcal{F} = \mathcal{G}_{12} \otimes \mathcal{G}_2^* \otimes \mathcal{I}_{-\rho} \tag{4.8}$$

over $\Gamma \times M$. It follows from Lemma 3.11 that $\mathcal{F}$ is flat.

Proposition 4.2. The gerbes $L_h^* \mathcal{G}_{hA}$ and $\mathcal{G}_A \otimes (h \times \mathrm{Id})^* \mathcal{F}$ over $\Sigma \times M$ are 1-isomorphic.

A proof of Proposition 4.2 by a chain of relations, based on the properties of gerbes listed above, may be found in Appendix 3. Taking into account relations (4.7) and the identities $\mathrm{Hol}_{(h \times \mathrm{Id})^* \mathcal{F}}(\phi) = \mathrm{Hol}_{\mathcal{F}}((h \times \mathrm{Id}) \circ \phi) = \mathrm{Hol}_{\mathcal{F}}((h, \varphi))$, Proposition 4.2 implies immediately the following transformation property of the Wess-Zumino amplitudes:


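Theorem 4.3. Under the gauge transformation induced by a map $h : \Sigma \to \Gamma$,

$$A_{WZ}(h\varphi, hA) = A_{WZ}(\varphi, A)\; \mathrm{Hol}_{\mathcal{F}}((h, \varphi)). \tag{4.9}$$

One can be more specific. Note that from Eq. (4.8) it follows that

$$\mathrm{Hol}_{\mathcal{F}}((h, \varphi)) = \mathrm{Hol}_{\mathcal{G}}(h\varphi)\; \mathrm{Hol}_{\mathcal{G}}(\varphi)^{-1}\; \exp\Big(-\mathrm{i} \int_\Sigma (h, \varphi)^* \rho\Big). \tag{4.10}$$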
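The passage from (4.8) to (4.10) can be unpacked factor by factor. The sketch below is an illustration, assuming only the holonomy properties of duals, tensor products and pullbacks listed in Sect. 2.1, Eq. (2.3), and the maps $\ell_{12}(\gamma, m) = \gamma m$, $\ell_2(\gamma, m) = m$ of Sect. 3.3:

```latex
\begin{align*}
\mathrm{Hol}_{\mathcal{G}_{12}}\bigl((h,\varphi)\bigr)
  &= \mathrm{Hol}_{\mathcal{G}}\bigl(\ell_{12}\circ(h,\varphi)\bigr)
   = \mathrm{Hol}_{\mathcal{G}}(h\varphi), \\
\mathrm{Hol}_{\mathcal{G}_{2}^{*}}\bigl((h,\varphi)\bigr)
  &= \mathrm{Hol}_{\mathcal{G}}\bigl(\ell_{2}\circ(h,\varphi)\bigr)^{-1}
   = \mathrm{Hol}_{\mathcal{G}}(\varphi)^{-1}, \\
\mathrm{Hol}_{\mathcal{I}_{-\rho}}\bigl((h,\varphi)\bigr)
  &= \exp\Bigl(-\mathrm{i}\int_{\Sigma}(h,\varphi)^{*}\rho\Bigr).
\end{align*}
```

Multiplying the three factors, as the holonomy of a tensor product requires, gives the right-hand side of (4.10).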

In particular, taking $h = 1$, we infer that $\mathrm{Hol}_{\mathcal{F}}((1, \varphi)) = 1$. Indeed, the 2-form $(1, \varphi)^* \rho$ on $\Sigma$ vanishes because the 2-form $\rho$ is composed of terms of degree $\le 1$ in the direction of $M$. More generally, since the flat-gerbe holonomies of homotopic fields coincide by virtue of the holonomy property (2.4), $\mathrm{Hol}_{\mathcal{F}}((h, \varphi)) = 1$ if $h$ is homotopic to 1.

Corollary 4.4. The Feynman amplitudes (3.10) are invariant under gauge transformations homotopic to 1.

The gauge transformations homotopic to 1 are often called small. The remaining issue is the invariance of the amplitudes (3.10) under large gauge transformations that are not homotopic to 1. The holonomy of the flat gerbe $\mathcal{F}$ on $\Gamma \times M$ defines a cohomology class $[\mathcal{F}] \in H^2(\Gamma \times M, U(1))$ which is trivial if and only if the flat gerbe $\mathcal{F}$ is 1-isomorphic to the trivial gerbe $\mathcal{I}_0$. By virtue of definition (4.8), the latter holds if and only if the gerbes $\mathcal{G}_{12}$ and $\mathcal{I}_\rho \otimes \mathcal{G}_2$ over $\Gamma \times M$ are 1-isomorphic. Consequently,

Corollary 4.5. The amplitudes (3.10) are invariant under all gauge transformations if and only if the gerbes $\mathcal{G}_{12}$ and $\mathcal{I}_\rho \otimes \mathcal{G}_2$ over $\Gamma \times M$ are 1-isomorphic.

The class $[\mathcal{F}]$, that will be more carefully studied in Sect. 6, is the obstruction to the invariance of the Feynman amplitudes (3.10) under large gauge transformations. In other words, a non-triviality of the class $[\mathcal{F}]$ leads to a global gauge anomaly in the two-dimensional sigma model with the Wess-Zumino term corresponding to the gerbe $\mathcal{G}$ and coupled to topologically trivial gauge fields. In the above analysis, we kept fixed the $\Gamma$-equivariant extension $\hat H = H + v(X)$ of the curvature $H$ of the gerbe $\mathcal{G}$. A natural question arises whether one may use the freedom in the choice of $v(X)$ to annihilate the global gauge anomaly. Clearly, the answer is that this may be done if and only if there exists a 1-form $w(X)$ that is $\Gamma$-equivariantly closed for which $[\mathcal{F}] = [\sigma]$, where $[\sigma]$ denotes the cohomology class in $H^2(\Gamma \times M, U(1))$ induced by the closed 2-form $\sigma$ of Eq. (3.30). In many contexts, however, e.g., in applications to WZW and coset models of conformal field theory, that we shall discuss below, $v(X)$ is a part of the structure tied to the symmetries of the theories and should not be changed. Similarly, one may ask whether it is possible to annihilate the global gauge anomaly by an appropriate choice of gerbe $\mathcal{G}$, keeping the curvature form fixed. Since this involves tensoring $\mathcal{G}$ with flat gerbes whose 1-isomorphism classes belong to $H^2(M, U(1))$, the answer is that this is possible if and only if $[\mathcal{F}] = [b]_{12} - [b]_2$ for some class $[b] \in H^2(M, U(1))$. A change of $\mathcal{G}$ to another non-1-isomorphic gerbe, however, implies a non-trivial change of the Feynman amplitudes of the ungauged sigma model, i.e. of the model itself.


4.2. Global gauge anomalies in WZW amplitudes. As an example, let us consider the case when $M = G$, where $G$ is a connected compact semi-simple Lie group, not necessarily simply connected. One has: $G = \tilde G / Z$, where $\tilde G = \times_l\, \tilde G_l$ is the covering group of $G$ that decomposes into the product of simple factors, and $Z$ is a subgroup of the center $\tilde Z = \times_l\, \tilde Z_l$ of $\tilde G$. The factors $\tilde Z_l$ are cyclic except for those equal to $\mathbb{Z}_2^2$ corresponding to $\tilde G_l = \mathrm{Spin}(4r)$. The Lie algebra $\mathfrak{g}$ of $\tilde G$ decomposes as $\oplus_l\, \mathfrak{g}_l$ into the direct sum of simple factors. Let $\mathfrak{h}$ be a Lie subalgebra of $\mathfrak{g}$ corresponding to a connected but not necessarily simply connected closed subgroup $\tilde H \subset \tilde G$ that maps onto a closed connected subgroup $\Gamma$ of $\tilde G / \tilde Z$. Clearly, $\mathfrak{h}$ is also the Lie algebra of $\Gamma$, and $\Gamma = \tilde H / (\tilde H \cap \tilde Z)$. We shall consider $G$ with the adjoint action of $\Gamma$.

Definition 4.6. Below, we shall call a $\Gamma$-space $M = G$ as above the one of the coset-model context.

In the simplest case, $\mathfrak{h} = \mathfrak{g}$ and $\Gamma = \tilde G / \tilde Z$. In what follows, the reader may think about this example. Over the group $G = \tilde G / Z$, we shall consider gerbes $\mathcal{G}_k$ with the curvature 3-forms

$$H_k = \frac{1}{12\pi}\, k\, \mathrm{tr}\, \Theta^3, \tag{4.11}$$

where $\Theta = g^{-1} dg$ is the $\mathfrak{g}$-valued Maurer-Cartan 1-form on $G$ and $k\, \mathrm{tr}\, XY := \sum_l k_l\, \mathrm{tr}_l\, X_l Y_l$ stands for the ad-invariant negative-definite bilinear form on $\mathfrak{g}$ given by the sum of such forms on $\mathfrak{g}_l$. We assume that the latter are normalized so that, if $G = \tilde G$, then the form $H_k$ has periods in $2\pi\mathbb{Z}$ if and only if the level $k = (k_l)$ is composed of integers. For non-simply connected groups $G$, $k$ has to satisfy more stringent selection rules to assure the integrality of periods of $\frac{1}{2\pi} H_k$ [15,24,41]. The holonomy of gerbes $\mathcal{G}_k$ provides the Wess-Zumino part of amplitudes for the WZW sigma models of conformal field theory [59], see the next section.

Definition 4.7. We shall call $\mathcal{G}_k$ a WZW gerbe.

There may be several non-1-isomorphic WZW gerbes $\mathcal{G}_k$ over $G$ (their 1-isomorphism classes are counted by elements of $H^2(Z, U(1))$ in the discrete group $Z$ cohomology [4]). The adjoint action of group $\tilde G / \tilde Z$ leaves the 3-forms $H_k$ invariant. For $X \in \mathfrak{g}$, the vector field $\bar X$ on $G$ induced by the infinitesimal adjoint action: $\bar X(g) = \frac{d}{dt}\big|_{t=0}\, \mathrm{Ad}_{e^{-tX}}(g)$ satisfies the relation $\iota_{\bar X} \Theta(g) = X - \mathrm{Ad}_{g^{-1}}(X)$. Hence,

$$\iota_{\bar X} H_k = \frac{1}{8\pi}\, k\, \mathrm{tr}\, X\, (1 - \mathrm{Ad}_g)([\Theta(g), \Theta(g)]) = -\frac{1}{4\pi}\, d\, \big(k\, \mathrm{tr}\, X\, (1 + \mathrm{Ad}_g)(\Theta(g))\big) \tag{4.12}$$

so that, upon setting

$$v_k(X) = -\frac{1}{4\pi}\, k\, \mathrm{tr}\, X\, (1 + \mathrm{Ad}_g)(\Theta(g)), \tag{4.13}$$

the left one of conditions (3.6) is satisfied. The 1-forms $v_k(X)$ satisfy also the other conditions of (3.6). Indeed,

$$\iota_{\bar X} v_k(Y) = -\frac{1}{4\pi}\, k\, \mathrm{tr}\, Y \big(-\mathrm{Ad}_{g^{-1}}(X) + \mathrm{Ad}_g(X)\big) = \frac{1}{4\pi}\, k\, \mathrm{tr}\, X \big(-\mathrm{Ad}_{g^{-1}}(Y) + \mathrm{Ad}_g(Y)\big) = -\iota_{\bar Y} v_k(X), \tag{4.14}$$


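$$\begin{aligned} \mathcal{L}_{\bar X} v_k(Y) &= -\frac{d}{dt}\Big|_{t=0}\, \frac{1}{4\pi}\, k\, \mathrm{tr}\, Y\, \mathrm{Ad}_{e^{-tX}}^*\, (1 + \mathrm{Ad}_g)(\Theta(g)) \\ &= -\frac{d}{dt}\Big|_{t=0}\, \frac{1}{4\pi}\, k\, \mathrm{tr}\, Y\, \mathrm{Ad}_{e^{-tX}}\, (1 + \mathrm{Ad}_g)(\Theta(g)) \\ &= -\frac{d}{dt}\Big|_{t=0}\, \frac{1}{4\pi}\, k\, \mathrm{tr}\, \mathrm{Ad}_{e^{tX}}(Y)\, (1 + \mathrm{Ad}_g)(\Theta(g)) = v_k([X, Y]). \end{aligned} \tag{4.15}$$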

Of course, we may restrict $X, Y$ above to take values in the subalgebra $\mathfrak h \subset \mathfrak g$. The 2-form $\rho_{k,A}$ on $\Sigma \times G$ defined by Eq. (3.9) and the 2-form $\rho_k$ on $\Gamma \times G$ defined by Eq. (3.26) are given now by the formulae
\[ \rho_{k,A} = \frac{1}{4\pi}\, k\, \mathrm{tr}\, \big[\big((1 + \mathrm{Ad}_g)(\Theta(g)) + \mathrm{Ad}_{g^{-1}}(A)\big) A\big], \tag{4.16} \]
\[ \rho_k = \frac{1}{4\pi}\, k\, \mathrm{tr}\, \big[\big((1 + \mathrm{Ad}_g)(\Theta(g)) + \mathrm{Ad}_{g^{-1}}(\Theta(\gamma))\big) \Theta(\gamma)\big], \tag{4.17} \]
where $\Theta(\gamma) = \gamma^{-1} d\gamma$ is the Maurer-Cartan form on $\Gamma$. The 2-form $\rho_{k,A}$ enters the coupling, described in Definition 3.3, of the Wess-Zumino action to the $\mathfrak h$-valued 1-form $A$ on $\Sigma$.

Let us compute the holonomy of the flat gerbe $\mathcal F_k = (\mathcal G_k)_{12} \otimes (\mathcal G_k)_2^* \otimes \mathcal I_{-\rho_k}$ over $\Gamma \times G$, see Eq. (4.8). Recall that the non-triviality of such holonomy obstructs the invariance of the Wess-Zumino amplitudes of Definition 3.3 under large gauge transformations. By Eq. (4.10), for $h : \Sigma \to \Gamma$ and $\varphi : \Sigma \to G$,
\[ \mathrm{Hol}_{\mathcal F_k}((h, \varphi)) = \mathrm{Hol}_{\mathcal G_k}(\mathrm{Ad}_h(\varphi))\, \mathrm{Hol}_{\mathcal G_k}(\varphi)^{-1} \exp\Big[-i \int_\Sigma (h, \varphi)^* \rho_k\Big] =: c_{h,\varphi}. \tag{4.18} \]
Since $\mathcal F_k$ is flat, the above holonomy depends only on the homotopy classes $[h]$ and $[\varphi]$ of the maps $h$ and $\varphi$. Besides, it does not depend on whether we treat $h$ as a map with values in $\Gamma$ or in $\tilde G/\tilde Z$. In the latter case, the homotopy classes of the maps $h$ are in one-to-one relation with the elements of $\tilde Z^{2\omega}$, where $\omega$ is the genus of $\Sigma$. The element $(\tilde z_1, \tilde z_2, \ldots, \tilde z_{2\omega-1}, \tilde z_{2\omega})$ corresponding to $[h]$ is given by the windings of $h$ described by the holonomies
\[ \tilde z_{2j-1} = P \exp\Big[\int_{a_j} h^*\Theta\Big], \qquad \tilde z_{2j} = P \exp\Big[\int_{b_j} h^*\Theta\Big], \tag{4.19} \]
of the non-Abelian flat gauge field $h^*(\Theta)$ on $\Sigma$. Above, $P$ stands for the path-ordering (from left to right) along paths $a_j, b_j$, $j = 1, \ldots, \omega$, that generate a fixed marking of the surface $\Sigma$, the latter assumed here to be connected, see Fig. 1. Similarly for elements $(z_1, \ldots, z_{2\omega})$ describing the windings of $\varphi$, which belong to $Z^{2\omega}$. By pinching off the handles of the surface the same way as in Sec. III of [25], one notes, using the commutativity of the fundamental groups of $\tilde G/\tilde Z$ and of $G$, that
\[ c_{h,\varphi} \equiv c_{(\tilde z_1, \ldots, \tilde z_{2\omega}), (z_1, \ldots, z_{2\omega})} = \prod_{j=1}^{\omega} c_{(\tilde z_{2j-1}, \tilde z_{2j}), (z_{2j-1}, z_{2j})}. \tag{4.20} \]

Fig. 1. Genus 3 surface with a marking by the cycles $a_1, b_1, a_2, b_2, a_3, b_3$; crossed broken lines (red online) indicate the contours of its version with pinched handles

Hence, the calculation of $c_{h,\varphi}$ reduces to the genus 1 case with $\Sigma = S^1 \times S^1$. Let us choose the Cartan subalgebras $\mathfrak t_{\mathfrak h} \subset \mathfrak h$ and $\mathfrak t_{\mathfrak g} \subset \mathfrak g$ so that $\mathfrak t_{\mathfrak h} \subset \mathfrak t_{\mathfrak g}$. On $\Sigma = S^1 \times S^1$, one may take $h = h_{\tilde p_1^\vee, \tilde p_2^\vee}$ and $\varphi = \varphi_{p_1^\vee, p_2^\vee}$ with
\[ h_{\tilde p_1^\vee, \tilde p_2^\vee}(e^{i\sigma_1}, e^{i\sigma_2}) = e^{i(\sigma_1 \tilde p_1^\vee + \sigma_2 \tilde p_2^\vee)}, \qquad \varphi_{p_1^\vee, p_2^\vee}(e^{i\sigma_1}, e^{i\sigma_2}) = e^{i(\sigma_1 p_1^\vee + \sigma_2 p_2^\vee)}, \tag{4.21} \]
where $\tilde p_i^\vee \in i\mathfrak t_{\mathfrak h}$ and $p_i^\vee \in i\mathfrak t_{\mathfrak g}$ are such that the windings $\tilde z_i = e^{2\pi i \tilde p_i^\vee} \in Z_\Gamma$ and $z_i = e^{2\pi i p_i^\vee} \in Z$. Note that $\tilde p_i^\vee$ and $p_i^\vee$ have to belong to the coweight lattice $P^\vee_{\mathfrak g}$ composed of elements $p^\vee \in i\mathfrak t_{\mathfrak g}$ such that $e^{2\pi i p^\vee} \in \tilde Z$. Since $\mathrm{Ad}_{h_{\tilde p_1^\vee, \tilde p_2^\vee}}(\varphi_{p_1^\vee, p_2^\vee}) = \varphi_{p_1^\vee, p_2^\vee}$, the formula (4.18) gives
\[
\begin{aligned}
c_{(\tilde z_1, \tilde z_2), (z_1, z_2)} &= \exp\Big[-i \int_{S^1 \times S^1} (h_{\tilde p_1^\vee, \tilde p_2^\vee}, \varphi_{p_1^\vee, p_2^\vee})^* \rho_k\Big] \\
&= \exp\Big[\frac{i}{2\pi} \int_0^{2\pi}\!\!\int_0^{2\pi} k\, \mathrm{tr}\, (d\sigma_1\, p_1^\vee + d\sigma_2\, p_2^\vee)(d\sigma_1\, \tilde p_1^\vee + d\sigma_2\, \tilde p_2^\vee)\Big] \\
&= \exp\big[2\pi i\, k\, \mathrm{tr}\, (p_1^\vee \tilde p_2^\vee - \tilde p_1^\vee p_2^\vee)\big].
\end{aligned}
\tag{4.22}
\]
That the right hand side depends only on the windings is assured by the integrality of the level $k$. The holonomy of the flat gerbe $\mathcal F_k$ is trivial if and only if the above expression is always equal to 1 for the windings restricted as above (compare to a similar discussion in [25]). From Corollary 4.3, we obtain

Proposition 4.8. For the $\Gamma$-space $M = G$ in the coset-model context, see Definition 4.6, the WZ Feynman amplitudes (3.10) are invariant under all gauge transformations if and only if the phases (4.22) are trivial.

When $G = \tilde G$, one may take $p_i^\vee = 0$ so that the phases (4.22) are trivial. We obtain this way

Corollary 4.9. For the simply connected $\Gamma$-space $M = \tilde G$ in the coset-model context, the WZ Feynman amplitudes (3.10) are invariant under all gauge transformations.

For non-simply connected groups $G = \tilde G/Z$, examples where the phases (4.22) are non-trivial are numerous. They include $G = \Gamma = \tilde G/\tilde Z$ for $\tilde G = SU(r+1)$ with $r$ even and $k = 1$ or with $r \geq 3$ odd and $k = 2$. Another example is $G = \Gamma = Spin(2r)/\mathbb Z_2^2$ with $r$ divisible by 4 and $k = 1$. In all those cases (and many others), the amplitudes (3.10) of Definition 3.3 exhibit a global gauge anomaly.
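The phases (4.22) for $G = \Gamma = SU(N)/\mathbb Z_N$ can be checked mechanically. The following is a minimal sketch under stated assumptions, not part of the original argument: it takes tr to be the matrix trace in the defining representation and writes every winding as a power of $z = e^{2\pi i \lambda_1^\vee}$ with $\lambda_1^\vee = \mathrm{diag}[\frac{N-1}{N}, -\frac1N, \ldots, -\frac1N]$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def first_coweight(n):
    """Assumed generator lambda_1^vee = diag[(n-1)/n, -1/n, ..., -1/n] for su(n)."""
    return [Fraction(n - 1, n)] + [Fraction(-1, n)] * (n - 1)

def tr(a, b):
    """Trace of the product of two diagonal matrices given by their entries."""
    return sum(x * y for x, y in zip(a, b))

def phase_exponent(n, k, p1, p2, pt1, pt2):
    """Return q mod 1 with c = exp(2*pi*i*q) as in Eq. (4.22),
    for windings z_i = z^{p_i} of the field and z~_i = z^{pt_i} of the gauge map."""
    lam = first_coweight(n)
    return (k * tr(lam, lam) * (p1 * pt2 - pt1 * p2)) % 1

def is_anomalous(n, k):
    """True if some choice of windings gives a non-trivial phase."""
    return any(phase_exponent(n, k, *w) != 0
               for w in product(range(n), repeat=4))

print(is_anomalous(3, 1), is_anomalous(4, 2), is_anomalous(3, 3))
```

With these assumptions the cases $SU(3)$ at $k = 1$ and $SU(4)$ at $k = 2$ come out anomalous, while the phase is trivial whenever $N$ divides $k$.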


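The best known case with a non-simple group $\tilde G$ is $\tilde G = SU(2) \times SU(2)$. The restrictions on the level $k = (k_1, k_2)$ imposed by the existence of the gerbe $\mathcal G_k$ with curvature $H_k$ on $G = \tilde G/Z$ depend on $Z \subset \tilde Z = \mathbb Z_2 \times \mathbb Z_2$:
\[
\begin{array}{lll}
k_1 \in \mathbb Z, & k_2 \in \mathbb Z & \text{if } Z = 0, \\
k_1 \in 2\mathbb Z, & k_2 \in \mathbb Z & \text{if } Z = \mathbb Z_2 \oplus 0, \\
k_1 \in \mathbb Z, & k_2 \in 2\mathbb Z & \text{if } Z = 0 \oplus \mathbb Z_2, \\
k_1 \in \mathbb Z, & k_2 \in \mathbb Z, \quad k_1 + k_2 \in 2\mathbb Z & \text{if } Z = \mathrm{diag}\, \mathbb Z_2, \\
k_1 \in 2\mathbb Z, & k_2 \in 2\mathbb Z & \text{if } Z = \mathbb Z_2 \oplus \mathbb Z_2.
\end{array}
\]
For $\Gamma = \tilde G/\tilde Z = SO(3) \times SO(3)$, with the adjoint action on $G$ and $\tilde p_i, \tilde p_i', p_i, p_i' \in \mathbb Z$,
\[ c_{\left(((-1)^{\tilde p_1}, (-1)^{\tilde p_1'}),\, ((-1)^{\tilde p_2}, (-1)^{\tilde p_2'})\right),\, \left(((-1)^{p_1}, (-1)^{p_1'}),\, ((-1)^{p_2}, (-1)^{p_2'})\right)} = (-1)^{k_1 (p_1 \tilde p_2 - \tilde p_1 p_2) + k_2 (p_1' \tilde p_2' - \tilde p_1' p_2')}. \tag{4.23} \]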
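The case analysis based on (4.23) can be reproduced by brute force. The sketch below is an illustration only: it encodes each subgroup $Z \subset \mathbb Z_2 \times \mathbb Z_2$ by the allowed winding pairs $(p, p')$ modulo 2 together with the level restriction listed above, and scans small levels. The dictionary layout and function names are hypothetical.

```python
from itertools import product

ALL = [(0, 0), (1, 0), (0, 1), (1, 1)]

# name -> (allowed windings (p, p') mod 2, level restriction from the table)
SUBGROUPS = {
    "0":       ([(0, 0)],         lambda k1, k2: True),
    "Z2+0":    ([(0, 0), (1, 0)], lambda k1, k2: k1 % 2 == 0),
    "0+Z2":    ([(0, 0), (0, 1)], lambda k1, k2: k2 % 2 == 0),
    "diag Z2": ([(0, 0), (1, 1)], lambda k1, k2: (k1 + k2) % 2 == 0),
    "Z2+Z2":   (ALL,              lambda k1, k2: k1 % 2 == 0 and k2 % 2 == 0),
}

def sign(k1, k2, w1, w2, g1, g2):
    """Phase (4.23); w_i = (p_i, p_i') for the field, g_i = (pt_i, pt_i') for the gauge map."""
    e = (k1 * (w1[0] * g2[0] - g1[0] * w2[0])
         + k2 * (w1[1] * g2[1] - g1[1] * w2[1]))
    return -1 if e % 2 else 1

def anomalous(name, k1, k2, diagonal_gauge=False):
    """True if some windings give the phase -1 (gauge group SO(3)xSO(3) or its diagonal)."""
    windings, _ = SUBGROUPS[name]
    gauge = [(0, 0), (1, 1)] if diagonal_gauge else ALL
    return any(sign(k1, k2, w1, w2, g1, g2) == -1
               for w1, w2 in product(windings, repeat=2)
               for g1, g2 in product(gauge, repeat=2))

def anomalous_cases(max_level):
    """All (subgroup, k1, k2) with admissible levels up to max_level that are anomalous."""
    return sorted((name, k1, k2)
                  for name, (_, allowed) in SUBGROUPS.items()
                  for k1 in range(1, max_level + 1)
                  for k2 in range(1, max_level + 1)
                  if allowed(k1, k2) and anomalous(name, k1, k2))

print(anomalous_cases(4))
```

Within the scanned range, the only anomalous entries are those with $Z = \mathrm{diag}\, \mathbb Z_2$ and both levels odd, and they disappear when `diagonal_gauge=True` restricts the gauge maps to the diagonal $SO(3)$.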

We infer from this expression that the only case with a global anomaly of the gauged WZ amplitudes (3.10) of Definition 3.3 is the one with $G = (SU(2) \times SU(2))/\mathrm{diag}\, \mathbb Z_2$ with odd $k_1, k_2$. If one restricts, however, the group $\Gamma$ to the diagonal $SO(3)$ subgroup of $SO(3) \times SO(3)$ then the global gauge anomaly disappears. Another anomalous example with a non-simple group is $G = (SU(3) \times SU(3))/(\mathbb Z_3 \times \mathbb Z_3)$ at level $k = (1, 1)$ with the adjoint action of $\Gamma = \mathrm{diag}(SU(3)/\mathbb Z_3)$. The non-anomalous gauging of the adjoint action of the diagonal $SO(3)$ subgroup in the WZW model with groups $(SU(2) \times SU(2))/Z$ is used in the coset model construction [30] of the unitary minimal models of conformal field theory [18,22,33]. Other coset theories involve other versions of gauged WZW amplitudes and may suffer from global anomalies, as will be discussed below.

4.3. Anomalies and WZW partition functions. The results of the calculation of the global-gauge-anomaly phases in the last section are consistent with the exact solution for the toroidal partition functions of the WZW models of conformal field theory in an external gauge field.

Let us start by considering the level $k$ WZW sigma model on a closed Riemann surface $\Sigma$ with the Lie group $G = \tilde G/Z$ as the target manifold. The Feynman amplitude of a field $\varphi : \Sigma \to G$ in the background of the external gauge field described by a $\mathfrak g$-valued 1-form $A$ on $\Sigma$ is given by the formula
\[ \mathcal A_{WZW}(\varphi, A) = \exp\Big[\frac{i}{4\pi}\, k \int_\Sigma \mathrm{tr}\, (\varphi^{-1} \partial_A \varphi)(\varphi^{-1} \bar\partial_A \varphi)\Big]\, \mathcal A_{WZ}(\varphi, A), \tag{4.24} \]
where $\partial_A = \partial + \mathrm{ad}_{A^{10}}$ and $\bar\partial_A = \bar\partial + \mathrm{ad}_{A^{01}}$ are the minimally coupled Dolbeault differentials relative to the complex structure of $\Sigma$, for $A = A^{10} + A^{01}$. The WZ amplitude $\mathcal A_{WZ}(\varphi, A)$ is related to the holonomy of the WZW gerbe $\mathcal G_k$ on $G$, with the adjoint action of the group $\Gamma = \tilde G/\tilde Z$ gauged as described previously.

Let $\Sigma = T_\tau := \mathbb C/(2\pi\mathbb Z + 2\pi\tau\mathbb Z)$ be the complex torus with the modular parameter $\tau = \tau_1 + i\tau_2$, where the imaginary part $\tau_2 > 0$. The toroidal partition function is formally defined by the functional integral over the space of maps $\varphi : T_\tau \to G$,
\[ Z^G(\tau, A) = \int_{Map(T_\tau, G)} \mathcal A_{WZW}(\varphi, A)\, D\varphi. \tag{4.25} \]


Its exact form may be found from (formal) symmetry properties of the functional integral. The result has a specially simple form for the gauge fields
\[ A_u = \frac{\bar u\, dw - u\, d\bar w}{2\tau_2} \tag{4.26} \]
with $u$ in the complexified Cartan algebra $\mathfrak t_{\mathfrak g}^{\mathbb C}$ of $\mathfrak g$ and $w$ the coordinate on the complex plane (for other gauge fields, it is then determined by chiral gauge transformations [19]). When the group $G$ is simply connected, i.e. $G = \tilde G$, one has
\[ Z^{\tilde G}(\tau, A_u) = \sum_{\Lambda \in P_k^+(\mathfrak g)} \big|\chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u)\big|^2 \exp\Big[\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\Big], \tag{4.27} \]
where $\chi^{\hat{\mathfrak g}}_{k,\Lambda}$ are the affine characters,
\[ \chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u) = \mathrm{tr}_{V^{\hat{\mathfrak g}}_{k,\Lambda}} \exp\Big[2\pi i \Big(\tau \Big(L_0^{\hat{\mathfrak g}} - \frac{c_k^{\hat{\mathfrak g}}}{24}\Big) + u\Big)\Big], \tag{4.28} \]
of the unitary highest-weight modules $V^{\hat{\mathfrak g}}_{k,\Lambda}$ of level $k$ and highest weight $\Lambda$ of the affine algebra $\hat{\mathfrak g}$ associated to the Lie algebra $\mathfrak g$ [28,39]. $L_0^{\hat{\mathfrak g}}$ stands for the corresponding Sugawara-Virasoro generator and $c_k^{\hat{\mathfrak g}}$ for the Virasoro central charge. The admissible highest weights form a finite set $P_k^+(\mathfrak g)$. We consider weights as elements of $i\mathfrak t_{\mathfrak g}$, identifying the latter space with its dual by means of the bilinear form tr.

For non-simply connected groups $G = \tilde G/Z$, the toroidal partition functions take a more complicated form [15]. The space of (regular) maps from $T_\tau$ to $G$ has different connected components that may be labeled by the windings:
\[ Map(T_\tau, G) = \bigsqcup_{(z_1, z_2) \in Z^2} Map_{z_1, z_2}(T_\tau, G), \tag{4.29} \]

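where for $z_i = e^{2\pi i p_i^\vee}$, $Map_{z_1, z_2}$ contains the maps homotopic to $\varphi_{p_1^\vee, p_2^\vee}$ of Eq. (4.21) (viewed as a map on $T_\tau$ via the parametrization of the complex plane by $w = \sigma_1 + \tau\sigma_2$). Let
\[ Z^G_{z_1, z_2}(\tau, A) = \int_{Map_{z_1, z_2}(T_\tau, G)} \mathcal A_{WZW}(\varphi, A)\, D\varphi \tag{4.30} \]
so that
\[ Z^G(\tau, A) = \sum_{(z_1, z_2) \in Z^2} Z^G_{z_1, z_2}(\tau, A). \tag{4.31} \]
By writing $\varphi = \varphi_{p_1^\vee, p_2^\vee}\, \tilde\varphi$, where $\tilde\varphi$ has trivial windings and may be lifted to a map from $T_\tau$ to $\tilde G$, one may relate the functional integral for $Z^G_{z_1, z_2}(\tau, A)$ to the one for $Z^{\tilde G}(\tau, A)$ using the chiral Ward identities [19]. One obtains this way the formula
\[
\begin{aligned}
Z^G_{z_1, z_2}(\tau, A_u) = \frac{1}{|Z|}\, & \mathrm{Hol}_{\mathcal G_k}(\varphi_{p_1^\vee, p_2^\vee}) \exp\big[-\pi i k\, \mathrm{tr}\, p_1^\vee (p_2^\vee - \tau p_1^\vee) - 2\pi i k\, \mathrm{tr}\, u p_1^\vee\big] \\
& \cdot \sum_{\Lambda \in P_k^+(\mathfrak g)} \chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u + p_2^\vee - \tau p_1^\vee)\, \overline{\chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u)}\, \exp\Big[\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\Big],
\end{aligned}
\tag{4.32}
\]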


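where $|Z|$ stands for the cardinality of $Z$ and the values of $\mathrm{Hol}_{\mathcal G_k}(\varphi_{p_1^\vee, p_2^\vee})$ may be found in Sec. IV of [25]. There exists a spectral flow $\Lambda \mapsto z\Lambda$ on $P_k^+(\mathfrak g)$ (and on the set of the corresponding highest-weight modules of $\hat{\mathfrak g}$) induced by the elements $z$ of the center $\tilde Z$ of $\tilde G$ [15]. The highest weight $z\Lambda$ is uniquely fixed by the property that
\[ e^{2\pi i k^{-1} z\Lambda} = z\, \mathrm{Ad}_{w_z}\big(e^{2\pi i k^{-1} \Lambda}\big) \tag{4.33} \]
for some $w_z$ in the normalizer of the Cartan subgroup of $\tilde G$. The characters of the $\hat{\mathfrak g}$-modules with the highest weights connected by the spectral flow satisfy the relation
\[
\begin{aligned}
& \exp\big[-\pi i k\, \mathrm{tr}\, (p_2^\vee - \tau p_1^\vee) p_1^\vee - 2\pi i k\, \mathrm{tr}\, u p_1^\vee\big]\, \chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u + p_2^\vee - \tau p_1^\vee) \\
& \qquad = \exp\big[2\pi i\, \mathrm{tr}\, p_2^\vee \Lambda - \pi i k\, \mathrm{tr}\, p_1^\vee p_2^\vee\big]\, \chi^{\hat{\mathfrak g}}_{k, z_1^{-1}\Lambda}(\tau, u)
\end{aligned}
\tag{4.34}
\]
for any $p_1^\vee$ and $p_2^\vee$ in the coweight lattice $P^\vee_{\mathfrak g}$. As a result, Eq. (4.32) may be rewritten in the form
\[ Z^G_{z_1, z_2}(\tau, A_u) = \frac{1}{|Z|} \sum_{\Lambda \in P_k^+(\mathfrak g)} \Psi_{z_1, z_2}(\Lambda)\, \chi^{\hat{\mathfrak g}}_{k, z_1^{-1}\Lambda}(\tau, u)\, \overline{\chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u)}\, \exp\Big[\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\Big], \tag{4.35} \]
where
\[ \Psi_{z_1, z_2}(\Lambda) = \mathrm{Hol}_{\mathcal G_k}(\varphi_{p_1^\vee, p_2^\vee}) \exp\big[2\pi i\, \mathrm{tr}\, p_2^\vee \Lambda - \pi i k\, \mathrm{tr}\, p_1^\vee p_2^\vee\big] \tag{4.36} \]
defines a character on $Z$ through its dependence on $z_2$. Let, for $z \in Z$,
\[ C_z := \big\{ \Lambda \in P_k^+(\mathfrak g) \;\big|\; \Psi_{z, z_2}(\Lambda) = 1 \text{ for all } z_2 \in Z \big\}. \tag{4.37} \]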

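Summing both sides of Eq. (4.35) over $z_1$ and $z_2$, one obtains the following formula for the complete partition function of the group $G$ WZW model at level $k$:
\[ Z^G(\tau, A_u) = \sum_{z \in Z}\; \sum_{\Lambda \in P_k^+(\mathfrak g) \cap C_z} \chi^{\hat{\mathfrak g}}_{k, z^{-1}\Lambda}(\tau, u)\, \overline{\chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u)}\, \exp\Big[\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\Big]. \tag{4.38} \]
Note that, for non-trivial $Z$, the affine characters and their complex conjugates are combined non-diagonally in the latter expression, in contrast with the formula (4.27). The space of states of the model that can be read off from Eq. (4.38) has the form [15,41]
\[ \mathcal H^G = \bigoplus_{z \in Z}\; \bigoplus_{\Lambda \in P_k^+(\mathfrak g) \cap C_z} V^{\hat{\mathfrak g}}_{k, z^{-1}\Lambda} \otimes \overline{V^{\hat{\mathfrak g}}_{k,\Lambda}}. \tag{4.39} \]
The transformation properties of the WZW partition function (4.35) under large gauge transformations $h_{\tilde p_1^\vee, \tilde p_2^\vee}$ of Eq. (4.21) are determined by the equality
\[ h_{\tilde p_1^\vee, \tilde p_2^\vee} A_u = A_{u - \tilde p_2^\vee + \tau \tilde p_1^\vee}, \tag{4.40} \]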


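and by identity (4.34) for the affine characters. With the help of these relations, one obtains
\[
\begin{aligned}
Z^G_{z_1, z_2}(\tau, h_{\tilde p_1^\vee, \tilde p_2^\vee} A_u) &= Z^G_{z_1, z_2}(\tau, A_{u - \tilde p_2^\vee + \tau \tilde p_1^\vee}) \\
&= \frac{1}{|Z|} \sum_{\Lambda \in P_k^+(\mathfrak g)} \Psi_{z_1, z_2}(\Lambda) \exp\big[-2\pi i\, \mathrm{tr}\, \tilde p_2^\vee (z_1^{-1}\Lambda - \Lambda)\big] \\
& \qquad \cdot\, \chi^{\hat{\mathfrak g}}_{k, \tilde z_1 z_1^{-1}\Lambda}(\tau, u)\, \overline{\chi^{\hat{\mathfrak g}}_{k, \tilde z_1 \Lambda}(\tau, u)}\, \exp\Big[\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\Big],
\end{aligned}
\tag{4.41}
\]
where, as before, $\tilde z_i = e^{2\pi i \tilde p_i^\vee}$. It is easy to see, using Eq. (4.33), that
\[ \exp\big[-2\pi i\, \mathrm{tr}\, \tilde p_2^\vee (\tilde z_1^{-1}\Lambda - \Lambda)\big] = \exp\big[2\pi i k\, \mathrm{tr}\, \tilde p_1^\vee \tilde p_2^\vee\big] \tag{4.42} \]
for any $\Lambda \in P_k^+(\mathfrak g)$. Replacing $\Lambda$ by $\tilde z_1^{-1}\Lambda$ on the right-hand side of Eq. (4.41) and using the relation
\[ \Psi_{z_1, z_2}(\tilde z_1^{-1}\Lambda) = \exp\big[-2\pi i k\, \mathrm{tr}\, \tilde p_1^\vee p_2^\vee\big]\, \Psi_{z_1, z_2}(\Lambda) \tag{4.43} \]
that follows from Eq. (4.42), one obtains

Proposition 4.10. The transformation law of the toroidal partition function (4.35) under large gauge transformations is described by the identity
\[ Z^G_{z_1, z_2}(\tau, h_{\tilde p_1^\vee, \tilde p_2^\vee} A_u) = c_{(\tilde z_1, \tilde z_2), (z_1, z_2)}\, Z^G_{z_1, z_2}(\tau, A_u), \tag{4.44} \]
where the phases $c_{(\tilde z_1, \tilde z_2), (z_1, z_2)}$ are given by Eq. (4.22).

If we assume the gauge invariance $D\varphi = D(h\varphi)$ of the formal functional integral measure, then the above anomalous transformation property follows from the functional integral expression (4.30) and the relation
\[ \mathcal A_{WZW}(h_{\tilde p_1^\vee, \tilde p_2^\vee} \varphi,\, h_{\tilde p_1^\vee, \tilde p_2^\vee} A) = c_{(\tilde z_1, \tilde z_2), (z_1, z_2)}\, \mathcal A_{WZW}(\varphi, A) \tag{4.45} \]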

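for $\varphi \in Map_{z_1, z_2}(T_\tau, G)$, which is a consequence of Eq. (4.9) (the minimally coupled term of the WZW action (4.24) is invariant under all gauge transformations).

As an example, let us consider the simplest gauged WZW model that exhibits a global gauge anomaly, namely the one with the target group $G = SU(3)/\mathbb Z_3$ at level $k = 1$ and the gauged adjoint action of $\Gamma = G$. For the simple coweights of $su(3)$ (identified with the simple weights), we may take
\[ \lambda_1^\vee = \mathrm{diag}\big[\tfrac23, -\tfrac13, -\tfrac13\big] = \lambda_1, \qquad \lambda_2^\vee = \mathrm{diag}\big[\tfrac13, \tfrac13, -\tfrac23\big] = \lambda_2. \tag{4.46} \]
The element $z = e^{2\pi i \lambda_1^\vee} = \mathrm{diag}\big[e^{\frac{4\pi i}{3}}, e^{-\frac{2\pi i}{3}}, e^{-\frac{2\pi i}{3}}\big]$ generates the center $\mathbb Z_3$ of $SU(3)$. The set $P_1^+(su(3))$ contains three weights $\Lambda = r_1\lambda_1 + r_2\lambda_2$ with $(r_1, r_2) = (0,0), (1,0), (0,1)$. We shall denote the corresponding level 1 affine characters by $\hat\chi_{(r_1, r_2)}$. The toroidal partition functions $\tilde Z^G(\tau, u) := Z^G(\tau, A_u) \exp\big[-\frac{\pi k\, \mathrm{tr}\,(u - \bar u)^2}{2\tau_2}\big]$ with fixed windings are, according to Eq. (4.35),
\[
\begin{aligned}
\tilde Z^G_{1,1} &= \tfrac13 \big( |\hat\chi_{(0,0)}|^2 + |\hat\chi_{(1,0)}|^2 + |\hat\chi_{(0,1)}|^2 \big), \\
\tilde Z^G_{1,z} &= \tfrac13 \big( |\hat\chi_{(0,0)}|^2 + e^{\frac{4\pi i}{3}} |\hat\chi_{(1,0)}|^2 + e^{\frac{2\pi i}{3}} |\hat\chi_{(0,1)}|^2 \big), \\
\tilde Z^G_{1,z^2} &= \tfrac13 \big( |\hat\chi_{(0,0)}|^2 + e^{\frac{2\pi i}{3}} |\hat\chi_{(1,0)}|^2 + e^{\frac{4\pi i}{3}} |\hat\chi_{(0,1)}|^2 \big), \\
\tilde Z^G_{z,1} &= \tfrac13 \big( \hat\chi_{(0,1)} \overline{\hat\chi_{(0,0)}} + \hat\chi_{(0,0)} \overline{\hat\chi_{(1,0)}} + \hat\chi_{(1,0)} \overline{\hat\chi_{(0,1)}} \big), \\
\tilde Z^G_{z,z} &= \tfrac13 \big( e^{\frac{4\pi i}{3}} \hat\chi_{(0,1)} \overline{\hat\chi_{(0,0)}} + e^{\frac{2\pi i}{3}} \hat\chi_{(0,0)} \overline{\hat\chi_{(1,0)}} + \hat\chi_{(1,0)} \overline{\hat\chi_{(0,1)}} \big), \\
\tilde Z^G_{z,z^2} &= \tfrac13 \big( e^{\frac{2\pi i}{3}} \hat\chi_{(0,1)} \overline{\hat\chi_{(0,0)}} + e^{\frac{4\pi i}{3}} \hat\chi_{(0,0)} \overline{\hat\chi_{(1,0)}} + \hat\chi_{(1,0)} \overline{\hat\chi_{(0,1)}} \big), \\
\tilde Z^G_{z^2,1} &= \tfrac13 \big( \hat\chi_{(1,0)} \overline{\hat\chi_{(0,0)}} + \hat\chi_{(0,1)} \overline{\hat\chi_{(1,0)}} + \hat\chi_{(0,0)} \overline{\hat\chi_{(0,1)}} \big), \\
\tilde Z^G_{z^2,z} &= \tfrac13 \big( e^{\frac{2\pi i}{3}} \hat\chi_{(1,0)} \overline{\hat\chi_{(0,0)}} + \hat\chi_{(0,1)} \overline{\hat\chi_{(1,0)}} + e^{\frac{4\pi i}{3}} \hat\chi_{(0,0)} \overline{\hat\chi_{(0,1)}} \big), \\
\tilde Z^G_{z^2,z^2} &= \tfrac13 \big( e^{\frac{4\pi i}{3}} \hat\chi_{(1,0)} \overline{\hat\chi_{(0,0)}} + \hat\chi_{(0,1)} \overline{\hat\chi_{(1,0)}} + e^{\frac{2\pi i}{3}} \hat\chi_{(0,0)} \overline{\hat\chi_{(0,1)}} \big).
\end{aligned}
\]
Since
\[ c_{(z^{\tilde p_1}, z^{\tilde p_2}), (z^{p_1}, z^{p_2})} = \exp\Big[\frac{4\pi i}{3} (p_1 \tilde p_2 - \tilde p_1 p_2)\Big], \tag{4.47} \]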

the transformation rule (4.44) implies that all the sectors with non-trivial windings suffer from global gauge anomalies. Summing over the windings, one obtains the total partition function of the level 1 WZW theory for the target group $G = SU(3)/\mathbb Z_3$:
\[ \tilde Z^G = |\hat\chi_{(0,0)}|^2 + \hat\chi_{(1,0)} \overline{\hat\chi_{(0,1)}} + \hat\chi_{(0,1)} \overline{\hat\chi_{(1,0)}}. \tag{4.48} \]
It should be contrasted with the anomaly-free level 1 partition function for the covering group $\tilde G = SU(3)$:
\[ \tilde Z^{\tilde G} = |\hat\chi_{(0,0)}|^2 + |\hat\chi_{(1,0)}|^2 + |\hat\chi_{(0,1)}|^2. \tag{4.49} \]
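The summation over windings that leads to (4.48) can be reproduced with a few lines of bookkeeping. The sketch below is an illustration under stated assumptions: the nine winding sectors are entered by hand as sesquilinear forms in the three level 1 characters, with the complex conjugation placed on the second factor, and the indices 0, 1, 2 stand for $\hat\chi_{(0,0)}, \hat\chi_{(1,0)}, \hat\chi_{(0,1)}$. The data layout and names are hypothetical.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)   # e^{2 pi i / 3}

# 3 * Z~_{z^a, z^b} as {(i, j): coeff}, meaning coeff * chi_i * conj(chi_j).
SECTORS = {
    (0, 0): {(0, 0): 1, (1, 1): 1,      (2, 2): 1},
    (0, 1): {(0, 0): 1, (1, 1): W ** 2, (2, 2): W},
    (0, 2): {(0, 0): 1, (1, 1): W,      (2, 2): W ** 2},
    (1, 0): {(2, 0): 1,      (0, 1): 1,      (1, 2): 1},
    (1, 1): {(2, 0): W ** 2, (0, 1): W,      (1, 2): 1},
    (1, 2): {(2, 0): W,      (0, 1): W ** 2, (1, 2): 1},
    (2, 0): {(1, 0): 1,      (2, 1): 1, (0, 2): 1},
    (2, 1): {(1, 0): W,      (2, 1): 1, (0, 2): W ** 2},
    (2, 2): {(1, 0): W ** 2, (2, 1): 1, (0, 2): W},
}

def total_partition_function():
    """Sum all winding sectors and drop the terms whose coefficients cancel."""
    total = {}
    for form in SECTORS.values():
        for key, coeff in form.items():
            total[key] = total.get(key, 0) + coeff / 3
    return {key: round(c.real) for key, c in total.items() if abs(c) > 1e-9}

print(total_partition_function())
```

Only the coefficients of $|\hat\chi_{(0,0)}|^2$, $\hat\chi_{(1,0)}\overline{\hat\chi_{(0,1)}}$ and $\hat\chi_{(0,1)}\overline{\hat\chi_{(1,0)}}$ survive, each equal to 1; the other six terms cancel because $1 + e^{2\pi i/3} + e^{4\pi i/3} = 0$.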

4.4. Implications for coset models. Consider now the group $\Gamma = \tilde H/Z_\Gamma$, where $\tilde H$ is a connected closed subgroup of $\tilde G$ with Lie algebra $\mathfrak h \subset \mathfrak g$ and $Z_\Gamma = \tilde H \cap \tilde Z$. One has $\Gamma = \tilde\Gamma/\tilde Z_\Gamma$, where $\tilde\Gamma$ is the covering group of $\Gamma$ (and of $\tilde H$) and $\tilde Z_\Gamma$ is the subgroup of its center composed of elements that project to $Z_\Gamma \subset \tilde H$. Of course, one has to distinguish between $\tilde Z_\Gamma$ and $Z_\Gamma$ only if the subgroup $\tilde H$ is not simply connected. The so-called $G/\Gamma$ coset model of the conformal field theory is obtained by gauging the adjoint action of $\Gamma$ on $G = \tilde G/Z$ in the group $G$ level $k$ WZW model and by integrating out the gauge fields in the functional integral [2,22,33,40]. In particular, the contribution of the topologically trivial gauge fields to the toroidal partition function of the $G/\Gamma$ coset model is formally given by
\[ Z^{G/\Gamma}(\tau) = \int Z^G(\tau, A)\, DA, \tag{4.50} \]


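where $A$ are 1-forms on $T_\tau$ with values in the Lie algebra $\mathfrak h$. Clearly, due to the decomposition (4.31),
\[ Z^{G/\Gamma}(\tau) = \sum_{(z_1, z_2) \in Z^2} Z^{G/\Gamma}_{z_1, z_2}(\tau) \quad \text{with} \quad Z^{G/\Gamma}_{z_1, z_2}(\tau) = \int Z^G_{z_1, z_2}(\tau, A)\, DA. \tag{4.51} \]
The functional integral (4.50) may be computed by an appropriate parametrization of the gauge fields $A$ [22]. In particular, when $\mathfrak h$ is semi-simple, the result is [22,33]
\[ Z^{G/\Gamma}_{z_1, z_2}(\tau) = \frac{1}{|\tilde Z_\Gamma|\, |Z|} \sum_{\Lambda \in P_k^+(\mathfrak g)}\; \sum_{\lambda \in P_{\tilde k}^+(\mathfrak h)} \Psi_{z_1, z_2}(\Lambda)\; b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, z_1^{-1}\Lambda, \lambda}(\tau)\; \overline{b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau)}, \tag{4.52} \]
where $b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau)$ are the branching functions that are the characters of the coset Virasoro modules $V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}$. The latter appear in the decomposition [29]
\[ V^{\hat{\mathfrak g}}_{k,\Lambda} = \bigoplus_{\lambda \in P_{\tilde k}^+(\mathfrak h)} V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda} \otimes V^{\hat{\mathfrak h}}_{\tilde k, \lambda} \tag{4.53} \]
of the level $k$ unitary highest-weight modules of the affine algebra $\hat{\mathfrak g}$ into similar modules of the affine subalgebra $\hat{\mathfrak h} \subset \hat{\mathfrak g}$ at the level $\tilde k$ induced by restricting the bilinear form $k\, \mathrm{tr}$ on $\mathfrak g$ to $\mathfrak h$. By definition,
\[ b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau) = \mathrm{tr}_{V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}} \exp\Big[2\pi i \tau \Big(L_0^{\hat{\mathfrak g}} - L_0^{\hat{\mathfrak h}} - \frac{c_k^{\hat{\mathfrak g}} - c_{\tilde k}^{\hat{\mathfrak h}}}{24}\Big)\Big]. \tag{4.54} \]
The decomposition (4.53) implies the one for the characters:
\[ \chi^{\hat{\mathfrak g}}_{k,\Lambda}(\tau, u) = \sum_{\lambda \in P_{\tilde k}^+(\mathfrak h)} b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau)\, \chi^{\hat{\mathfrak h}}_{\tilde k, \lambda}(\tau, u) \tag{4.55} \]
for $u$ in the complexified Cartan algebra $\mathfrak t_{\mathfrak h}^{\mathbb C}$ of $\mathfrak h$.

From the gauge transformation rule (4.44), we should expect that the sectors with fixed windings $(z_1, z_2)$ of the group $G$ WZW theory which transform in the anomalous way under the large gauge transformations $h_{\tilde p_1^\vee, \tilde p_2^\vee} : T_\tau \to \Gamma$ give vanishing contributions to the partition function of the coset theory. This is, indeed, the case.

Proposition 4.11. If $c_{(\tilde z_1, \tilde z_2), (z_1, z_2)} \neq 1$ for some $(\tilde z_1, \tilde z_2) \in Z_\Gamma^2$ then the partition function $Z^{G/\Gamma}_{z_1, z_2}$ given by Eq. (4.52) vanishes.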

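Proof. Denote by $P^\vee_\Gamma$ the subset of the set $P^\vee_{\mathfrak h} \subset i\mathfrak t_{\mathfrak h} \subset i\mathfrak t_{\mathfrak g}$ of coweights of $\mathfrak h$ composed of such $\tilde p^\vee$ that $\tilde z = e^{2\pi i \tilde p^\vee} \in \tilde Z_\Gamma$ when viewed as elements of $\tilde\Gamma$ (or that $\tilde z \in Z_\Gamma$ when viewed as elements of $\tilde H \subset \tilde G$). Clearly $P^\vee_\Gamma \subset P^\vee_{\mathfrak g}$. The vanishing result is a consequence of the following well known properties of the branching functions [16]:
\[ b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda} = 0 \quad \text{if} \quad \exp[2\pi i\, \mathrm{tr}\, \tilde p^\vee \Lambda] \neq \exp[2\pi i\, \mathrm{tr}\, \tilde p^\vee \lambda] \ \text{ for some } \tilde p^\vee \in P^\vee_\Gamma, \tag{4.56} \]
\[ b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \tilde z\Lambda, \tilde z\lambda} = b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda} \quad \text{for } \tilde z = e^{2\pi i \tilde p^\vee} \text{ and } \tilde p^\vee \in P^\vee_\Gamma. \tag{4.57} \]
The first of these relations follows from the fact that the central elements $e^{2\pi i \tilde p^\vee}$ act by multiplication by the same scalars in the modules $V^{\hat{\mathfrak g}}_{k,\Lambda}$ and $V^{\hat{\mathfrak h}}_{\tilde k,\lambda}$ appearing in the decomposition (4.53). The second one is a consequence of the isomorphism between the coset Virasoro modules $V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}$ with the weights related by the spectral flows under elements $e^{2\pi i \tilde p^\vee}$. Note that both relations are consistent with the fact that the identity (4.34) is satisfied by the characters of both affine algebras $\hat{\mathfrak g}$ and $\hat{\mathfrak h}$.

If $c_{(\tilde z_1, \tilde z_2), (z_1, z_2)} \neq 1$, then either $\exp[2\pi i k\, \mathrm{tr}\, p_1^\vee \tilde p_2^\vee] \neq 1$ or $\exp[-2\pi i k\, \mathrm{tr}\, \tilde p_1^\vee p_2^\vee] \neq 1$ for some $\tilde p_i^\vee \in P^\vee_\Gamma$, see Eq. (4.22). Relation (4.56) implies that if $\exp[2\pi i k\, \mathrm{tr}\, p_1^\vee \tilde p_2^\vee] = \exp[-2\pi i\, \mathrm{tr}\, \tilde p_2^\vee (z_1^{-1}\Lambda - \Lambda)] \neq 1$ for some $\tilde p_2^\vee \in P^\vee_\Gamma$, then, for each pair $(\Lambda, \lambda)$, either $b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, z_1^{-1}\Lambda, \lambda} = 0$ or $b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda} = 0$, so that $Z^{G/\Gamma}_{z_1, z_2}$ vanishes. Similarly, using relation (4.57) and Eq. (4.43), we infer that if $\exp[-2\pi i k\, \mathrm{tr}\, \tilde p_1^\vee p_2^\vee] \neq 1$ for some $\tilde p_1^\vee \in P^\vee_\Gamma$, then $Z^{G/\Gamma}_{z_1, z_2}$ vanishes too.

As we see, global gauge anomalies in the WZW model lead to selection rules for the contributions to the partition functions of the $G/\Gamma$ coset model. Let $Z' \subset Z$ be the non-anomalous subgroup that is composed of the elements $z = e^{2\pi i p^\vee} \in Z$ such that $\exp[2\pi i k\, \mathrm{tr}\, p^\vee \tilde p^\vee] = 1$ for all $\tilde p^\vee \in P^\vee_\Gamma$, and let $G' = \tilde G/Z'$ be the corresponding quotient of $\tilde G$. Proposition 4.11 and Eqs. (4.52) imply that
\[ Z^{G/\Gamma}(\tau) = \frac{|Z'|}{|Z|}\, Z^{G'/\Gamma}(\tau). \tag{4.58} \]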
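For $G = \Gamma = SU(N)/\mathbb Z_N$ the non-anomalous subgroup and the normalization factor in (4.58) can be listed explicitly. The sketch below is illustrative only and rests on assumptions not spelled out in the text: tr is taken to be the matrix trace in the defining representation, so that $\mathrm{tr}\, \lambda_1^\vee \lambda_1^\vee = \frac{N-1}{N}$, and all windings are powers of $z = e^{2\pi i \lambda_1^\vee}$. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def non_anomalous_subgroup(n, k):
    """Exponents p in range(n) such that z^p has trivial phase exp[2 pi i k tr(p^vee pt^vee)]
    against every gauge winding z^q, for G = Gamma = SU(n)/Z_n at level k."""
    t = Fraction(n - 1, n)   # assumed value of tr(lambda_1^vee lambda_1^vee)
    return [p for p in range(n)
            if all((k * t * p * q) % 1 == 0 for q in range(n))]

def normalisation_ratio(n, k):
    """The factor |Z'| / |Z| relating the two coset partition functions."""
    return Fraction(len(non_anomalous_subgroup(n, k)), n)

print(non_anomalous_subgroup(3, 1), normalisation_ratio(3, 1))
```

Under these assumptions the subgroup has $\gcd(N, k)$ elements; for $SU(3)/\mathbb Z_3$ at level 1 it is trivial and the ratio equals $\frac13$.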

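Upon summation over windings in $(Z')^2$, the partition function on the right-hand side may be rewritten in the form
\[ Z^{G'/\Gamma}(\tau) = \frac{1}{|\tilde Z_\Gamma|} \sum_{z \in Z'}\; \sum_{\Lambda \in P_k^+(\mathfrak g) \cap C'_z}\; \sum_{\lambda \in P_{\tilde k}^+(\mathfrak h)} b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, z^{-1}\Lambda, \lambda}(\tau)\; \overline{b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau)}, \tag{4.59} \]
where $C'_z$ is defined as in (4.37) but with the subgroup $Z$ replaced by $Z'$. Due to relation (4.56), we may restrict the sum on the right-hand side to pairs $(\Lambda, \lambda)$ such that the elements $e^{2\pi i \tilde p^\vee}$ for $\tilde p^\vee \in P^\vee_\Gamma$ act by multiplication by the same scalar in $V^{\hat{\mathfrak g}}_{k,\Lambda}$ and in $V^{\hat{\mathfrak h}}_{\tilde k,\lambda}$. Then, also the pairs $(z^{-1}\Lambda, \lambda)$ for $z \in Z'$ and $(\tilde z\Lambda, \tilde z\lambda)$ for $\tilde z = e^{2\pi i \tilde p^\vee}$ will have this property due to Eq. (4.42). Besides, it follows from Eq. (4.43) that if $\Lambda \in C'_z$ then $\tilde z\Lambda \in C'_z$ for all $\tilde z \in Z_\Gamma$ (unlike for $C_z$ if $Z'$ is strictly smaller than $Z$). As a result of this observation and of relation (4.57), one may rewrite the sum over weights on the right-hand side of Eq. (4.59) as a sum over orbits $[\Lambda, \lambda]$ of the diagonal spectral flow of $\tilde Z_\Gamma$. Denoting by $P'_z$ the space of such orbits with $\Lambda \in C'_z$, we infer that
\[ Z^{G'/\Gamma}(\tau) = \sum_{z \in Z'}\; \sum_{[\Lambda, \lambda] \in P'_z} \frac{1}{|S_{[\Lambda, \lambda]}|}\; b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, z^{-1}\Lambda, \lambda}(\tau)\; \overline{b^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}(\tau)}, \tag{4.60} \]
where $S_{[\Lambda, \lambda]} \subset \tilde Z_\Gamma$ denotes the stabilizer subgroup of the elements of the orbit $[\Lambda, \lambda]$. If $S_{[\Lambda, \lambda]}$ is trivial for all orbits $[\Lambda, \lambda]$ then the last expression for the partition function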


$Z^{G'/\Gamma}(\tau)$ is consistent with the following form of the space of states:
\[ \mathcal H^{G'/\Gamma} = \bigoplus_{z \in Z'}\; \bigoplus_{[\Lambda, \lambda] \in P'_z} V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, z^{-1}\Lambda, \lambda} \otimes \overline{V^{\hat{\mathfrak g}, \hat{\mathfrak h}}_{k, \Lambda, \lambda}}. \tag{4.61} \]
Identity (4.58) now implies that, on the contrary, barring further identifications of the coset Virasoro representations [9], the partition function $Z^{G/\Gamma}(\tau)$ lacks a Hilbert-space interpretation if the group $Z'$ is strictly smaller than $Z$, i.e. if the group $G$ WZW model suffers from global gauge anomalies relative to the adjoint action of $\Gamma$. This points to the inconsistency of the $G/\Gamma$ coset model in that case. On the level of the partition function, this inconsistency is of a mild nature since one may turn the inconsistent partition function $Z^{G/\Gamma}$ into the consistent one $Z^{G'/\Gamma}$ by changing the normalization.

In the case when $G = SU(3)/\mathbb Z_3 = \Gamma$, the $G/\Gamma$ coset theory is topological and its partition function is $\tau$-independent. The branching functions vanish if $\Lambda \neq \lambda$ and are equal to 1 otherwise. At level 1, all coset partition functions with non-trivial windings vanish and
\[ Z^{G/\Gamma} = Z^{G/\Gamma}_{1,1} = \tfrac13. \tag{4.62} \]
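The value in (4.62) can be recovered by evaluating the sector formula (4.52) for $G = \Gamma = SU(3)/\mathbb Z_3$ at level 1. The following sketch is a hedged illustration: it assumes branching functions equal to 1 for $\Lambda = \lambda$ and 0 otherwise, a spectral flow that permutes the three level 1 weights cyclically, and, for $z_1 = 1$, the sector weight $\exp[2\pi i\, \mathrm{tr}\, p_2^\vee \Lambda]$ with trivial gerbe holonomy. The names are hypothetical.

```python
import cmath
from fractions import Fraction

WEIGHTS = [(0, 0), (1, 0), (0, 1)]
LAM1 = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]
LAM2 = [Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)]

def as_diag(r):
    """The weight r1*lambda_1 + r2*lambda_2 as a list of diagonal entries."""
    return [r[0] * a + r[1] * b for a, b in zip(LAM1, LAM2)]

def flow_inverse(z1, r):
    """Assumed action of z^{-z1}: (0,0) -> (0,1) -> (1,0) -> (0,0)."""
    cycle = [(0, 0), (0, 1), (1, 0)]
    return cycle[(cycle.index(r) + z1) % 3]

def branching(big, small):
    """Assumed topological branching function: 1 if the weights agree, else 0."""
    return 1 if big == small else 0

def coset_sector(z1, z2):
    """Sector contribution as in Eq. (4.52) with |Z~_Gamma| * |Z| = 9."""
    total = 0
    for big in WEIGHTS:
        for small in WEIGHTS:
            if branching(flow_inverse(z1, big), small) * branching(big, small) == 0:
                continue
            # A non-zero product forces z1 to be trivial, so only that weight is needed.
            q = z2 * sum(x * y for x, y in zip(LAM1, as_diag(big)))
            total += cmath.exp(2j * cmath.pi * float(q))
    return total / 9

total = sum(coset_sector(a, b) for a in range(3) for b in range(3))
print(round(total.real, 6))
```

Only the sector with trivial windings contributes, and the sum equals $\frac13$ up to rounding.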

In a consistent two-dimensional topological field theory, the partition function is equal to the dimension of the space of states and cannot take a fractional value, confirming the inconsistency of the level 1 $G/\Gamma$ coset model for $G = SU(3)/\mathbb Z_3 = \Gamma$. On the other hand, the non-anomalous subgroup $Z' \subset Z = \mathbb Z_3$ is trivial so that $G' = \tilde G = SU(3)$ in that case, and for the anomaly-free level 1 $\tilde G/\Gamma$ coset theory,
\[ Z^{\tilde G/\Gamma} = 1, \tag{4.63} \]
corresponding to a 1-dimensional space of states.

It was pointed out in [47] (for the diagonal coset models corresponding to simply connected groups $G = \tilde G = G'$) that, in the presence of fixed points $(\Lambda_0, \lambda_0)$ of the diagonal spectral flow of $\tilde Z_\Gamma$, there is a further problem with the Hilbert space interpretation of the partition function (4.60) because of the appearance of the fraction $\frac{1}{|S_{[\Lambda_0, \lambda_0]}|}$. It was shown in [16] within an algebraic approach how to resolve such fixed points to repair this defect. Somewhat earlier, in [33], it was argued that the problem may be resolved on the Lagrangian level by adding to the partition function (4.60) contributions from the sectors with gauge fields in the topologically non-trivial principal $\Gamma$-bundles $P$ over $T_\tau$ (it was also shown that such contributions vanish if there are no fixed points $(\Lambda_0, \lambda_0)$ of the diagonal spectral flow of $\tilde Z_\Gamma$). For the sectors with topologically non-trivial gauge fields, the WZW sigma model fields are sections of the associated bundle $P \times_\Gamma G$ with respect to the adjoint action of $\Gamma$, and the globally gauge invariant WZW amplitudes in the gauge field background may be defined with the help of a $\Gamma$-equivariant structure on the WZW gerbe $\mathcal G_k$ over $G$, as will be explained in the following section.

5. Coupling to General Gauge Fields

5.1. Equivariant gerbes. We showed in Sect. 3 that the invariance of the Feynman amplitudes (3.10) under all gauge transformations requires the existence of a 1-isomorphism between the gerbes $\mathcal G_{12} \equiv \ell^*\mathcal G$ and $\mathcal I_\rho \otimes \mathcal G_2$ over $\Gamma \times M$. Here, we shall strengthen this property by introducing the notion of $\Gamma$-equivariant gerbes in the way that will subsequently assure the gauge invariance of the Feynman amplitudes coupled to topologically non-trivial gauge fields.


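Definition 5.1. A gerbe $\mathcal G$ with the curvature $H$ possessing a $\Gamma$-equivariantly closed extension $\hat H(X) = H + v(X)$ will be called $\Gamma$-equivariant relative to the 2-form $\rho$ given by Eq. (3.26) if it is equipped with a pair $(\alpha, \beta)$, called a $\Gamma$-equivariant structure, such that
(i) $\alpha : \mathcal G_{12} \to \mathcal I_\rho \otimes \mathcal G_2$ is a 1-isomorphism of gerbes over $\Gamma \times M$;
(ii) $\beta : (\mathrm{Id} \otimes \alpha_{2,3}) \circ \alpha_{1,23} \Rightarrow \alpha_{12,3}$ is a 2-isomorphism of 1-isomorphisms of gerbes over $\Gamma^2 \times M$;
(iii) the following diagram of 2-isomorphisms between 1-isomorphisms of gerbes over $\Gamma^3 \times M$ is commutative:
\[
\begin{array}{ccc}
 & (\mathrm{Id} \otimes \alpha_{3,4}) \circ (\mathrm{Id} \otimes \alpha_{2,34}) \circ \alpha_{1,234} & \\
 {\scriptstyle (\mathrm{Id} \otimes \beta_{2,3,4}) \circ \mathrm{Id}}\ \swarrow & & \searrow\ {\scriptstyle \mathrm{Id} \circ \beta_{1,2,34}} \\
 (\mathrm{Id} \otimes \alpha_{23,4}) \circ \alpha_{1,234} & & (\mathrm{Id} \otimes \alpha_{3,4}) \circ \alpha_{12,34} \\
 {\scriptstyle \beta_{1,23,4}}\ \searrow & & \swarrow\ {\scriptstyle \beta_{12,3,4}} \\
 & \alpha_{123,4} &
\end{array}
\tag{5.1}
\]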

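$\Gamma$-equivariant gerbes over $M$ form a 2-category $\mathcal{G}rb^\nabla(M)^\Gamma$. A 1-isomorphism between two $\Gamma$-equivariant gerbes,
\[ (\chi, \eta) : (\mathcal G^a, \alpha^a, \beta^a) \to (\mathcal G^b, \alpha^b, \beta^b), \tag{5.2} \]
is a 1-isomorphism $\chi : \mathcal G^a \to \mathcal G^b$ and a 2-isomorphism $\eta : (\mathrm{Id} \otimes \chi_2) \circ \alpha^a \Rightarrow \alpha^b \circ \chi_{12}$ between 1-isomorphisms of gerbes over $\Gamma \times M$, such that the diagram
\[
\begin{array}{ccc}
 & (\mathrm{Id} \otimes \chi_3) \circ (\mathrm{Id} \otimes \alpha^a_{2,3}) \circ \alpha^a_{1,23} & \\
 {\scriptstyle (\mathrm{Id} \otimes \eta_{2,3}) \circ \mathrm{Id}}\ \swarrow & & \searrow\ {\scriptstyle \mathrm{Id} \circ \beta^a} \\
 (\mathrm{Id} \otimes (\alpha^b_{2,3} \circ \chi_{23})) \circ \alpha^a_{1,23} & & (\mathrm{Id} \otimes \chi_3) \circ \alpha^a_{12,3} \\
 {\scriptstyle \mathrm{Id} \circ \eta_{1,23}}\ \Downarrow & & \Downarrow\ {\scriptstyle \eta_{12,3}} \\
 (\mathrm{Id} \otimes \alpha^b_{2,3}) \circ \alpha^b_{1,23} \circ \chi_{123} & \stackrel{\beta^b \circ \mathrm{Id}}{\Longrightarrow} & \alpha^b_{12,3} \circ \chi_{123}
\end{array}
\tag{5.3}
\]
of 2-isomorphisms between 1-isomorphisms of gerbes over $\Gamma^2 \times M$ is commutative. 1-isomorphic $\Gamma$-equivariant gerbes necessarily correspond to the same curvature $H$ and to the same 2-form $\rho$ and, consequently, to the same $\Gamma$-equivariantly closed extension $\hat H$. The identity 1-isomorphism of $\Gamma$-equivariant gerbes is given by the pair $(\chi, \eta) = (\mathrm{Id}, \mathrm{Id})$ for which the diagram (5.3) reduces to a trivially commutative one. Finally, a $\Gamma$-equivariant 2-isomorphism
\[ \psi : (\chi, \eta) \Rightarrow (\chi', \eta') \tag{5.4} \]
is a 2-isomorphism $\psi : \chi \Rightarrow \chi'$ such that the diagram
\[
\begin{array}{ccc}
 (\mathrm{Id} \otimes \chi_2) \circ \alpha^a & \stackrel{\eta}{\Longrightarrow} & \alpha^b \circ \chi_{12} \\
 {\scriptstyle (\mathrm{Id} \otimes \psi_2) \circ \mathrm{Id}}\ \Downarrow & & \Downarrow\ {\scriptstyle \mathrm{Id} \circ \psi_{12}} \\
 (\mathrm{Id} \otimes \chi'_2) \circ \alpha^a & \stackrel{\eta'}{\Longrightarrow} & \alpha^b \circ \chi'_{12}
\end{array}
\tag{5.5}
\]
is commutative, which is trivially the case for the identity 2-isomorphism $\mathrm{Id} : \chi \Rightarrow \chi$ when also $\eta' = \eta$.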

Remark 5.2.
1. We shall say that two $\Gamma$-equivariant structures $(\alpha^a, \beta^a)$ and $(\alpha^b, \beta^b)$ on the gerbe $\mathcal G$ are isomorphic if the $\Gamma$-equivariant gerbes $(\mathcal G, \alpha^a, \beta^a)$ and $(\mathcal G, \alpha^b, \beta^b)$ are 1-isomorphic.
2. If $(\mathcal G^a, \alpha^a, \beta^a)$ is a $\Gamma$-equivariant gerbe, then for each 1-isomorphism of gerbes $\delta : \mathcal G^a \to \mathcal G^b$ there exists a $\Gamma$-equivariant structure $(\alpha^b, \beta^b)$ on $\mathcal G^b$ such that the $\Gamma$-equivariant gerbes $(\mathcal G^a, \alpha^a, \beta^a)$ and $(\mathcal G^b, \alpha^b, \beta^b)$ are 1-isomorphic.
3. $\Gamma$-equivariant gerbes $(\mathcal G, \alpha, \beta)$ over a $\Gamma$-space $M$ may be pulled back to $\Gamma$-equivariant gerbes $(f^*\mathcal G, f_2^*\alpha, f_3^*\beta)$ over another $\Gamma$-space $N$ along $\Gamma$-equivariant maps $f : N \to M$. Similarly, their 1- and 2-isomorphisms may be pulled back.
4. For any subgroup $\Gamma' \subset \Gamma$, the restriction induces a $\Gamma'$-equivariant gerbe from a $\Gamma$-equivariant gerbe $(\mathcal G, \alpha, \beta)$.
5. The concept of equivariant (bundle) gerbes (with connection) introduced here is different, although not unrelated, to the one discussed in [42]. For discrete groups $\Gamma$, the above definitions of $\Gamma$-equivariant gerbes and their 1-isomorphisms and 2-isomorphisms are equivalent to those introduced in [26] (where the actions of $\Gamma$ that change the sign of the curvature 3-form $H$ were also considered).

There is a sub-2-category $\mathcal{G}rb^\nabla(M)^\Gamma_0$ composed of those $\Gamma$-equivariant gerbes $\mathcal G$ whose curvature $H$ is $\Gamma$-equivariantly closed and the 2-form $\rho = 0$. Below, we shall need the following result, a particular consequence of the general descent theory for gerbes:

Theorem 5.3. Suppose that $\Gamma$ acts on $M$ in such a way that $M' = M/\Gamma$ is a smooth manifold and $M$ forms a smooth (left) principal $\Gamma$-bundle $\omega : M \to M'$. Then, there exists a canonical equivalence
\[ \mathcal{G}rb^\nabla(M)^\Gamma_0 \cong \mathcal{G}rb^\nabla(M'). \tag{5.6} \]
In particular, a gerbe $\mathcal G$ over $M$ that is $\Gamma$-equivariant relative to the zero 2-form descends to a gerbe $\mathcal G'$ over $M'$ whose pullback by $\omega$ is 1-isomorphic to $\mathcal G$. The equivalence of Theorem 5.3 commutes with the pullback functors: $f^*$ of $\mathcal{G}rb^\nabla(M)^\Gamma_0$ induced by a $\Gamma$-equivariant map $f : N \to M$ and $f'^*$ of $\mathcal{G}rb^\nabla(M')$ induced by the projected map $f' : N' \to M'$. We give a proof of Theorem 5.3 in Appendix 4, employing results of [50].

5.2. WZ amplitudes with topologically non-trivial gauge fields. In Sect. 3, we discussed only topologically trivial two-dimensional gauge fields, i.e. connections in the trivial principal $\Gamma$-bundle over the worldsheet $\Sigma$. Here, we shall consider connections in a general principal $\Gamma$-bundle $\pi : P \to \Sigma$. Such connections correspond to $\mathfrak g$-valued 1-forms $\mathcal A$ on $P$ with the following defining property:
\[ (r^*\mathcal A)(p, \gamma^{-1}) = \mathrm{Ad}_\gamma\big(\mathcal A(p) - \Theta(\gamma)\big), \tag{5.7} \]
where $r : P \times \Gamma \to P$ is the right action of $\Gamma$ on $P$. For a $\Gamma$-equivariantly closed 3-form $\hat H(X) = H + v(X)$, consider the 2-form $\tilde\rho_{\mathcal A}$ on $\tilde M := P \times M$ given by the formula
\[ \tilde\rho_{\mathcal A} := -v(\mathcal A) + \tfrac12\, \iota_{\bar{\mathcal A}}\, v(\mathcal A), \tag{5.8} \]
compare to the first of Eqs. (3.9). Below, the map $\tilde\ell : \Gamma \times \tilde M \to \tilde M$ will denote the left action of $\Gamma$ on $\tilde M$:
\[ \tilde\ell(\gamma, (p, m)) := \big(r(p, \gamma^{-1}), \ell(\gamma, m)\big) = (p\gamma^{-1}, \gamma m). \tag{5.9} \]
For maps and forms on the product spaces $\Gamma^n \times \tilde M$, we shall use the notation from the beginning of Sect. 3.3, marking the subscript indices with a tilde. The subscript indices without a tilde will be reserved for the factors in the expanded expression $\Gamma^n \times P \times M$ for the same spaces. One has the following counterpart of Eq. (3.31):

Lemma 5.4. As forms on $\Gamma \times \tilde M = \Gamma \times P \times M$,
\[ (\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2} = (\tilde\rho_{\mathcal A})_{2,3} - \rho_{1,3} = (\tilde\rho_{\mathcal A})_{\tilde 2} - \rho_{1,3}. \tag{5.10} \]

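A proof of Lemma 5.4 is given in Appendix 5.

Let $\mathcal G$ be a gerbe over $M$ with the curvature $H$ which extends to the $\Gamma$-equivariantly closed form $\hat H = H + v(X)$. Define a gerbe $\tilde{\mathcal G}_{\mathcal A}$ over $\tilde M = P \times M$ by setting
\[ \tilde{\mathcal G}_{\mathcal A} := \mathcal I_{\tilde\rho_{\mathcal A}} \otimes \mathcal G_2. \tag{5.11} \]
Note that the curvature of $\tilde{\mathcal G}_{\mathcal A}$ is given by the closed 3-form
\[ \tilde H_{\mathcal A} := d\tilde\rho_{\mathcal A} + H_2. \tag{5.12} \]
For the pullback of $\tilde H_{\mathcal A}$ under the action $\tilde\ell$ of $\Gamma$ on $\tilde M$, we obtain from Lemmas 5.4 and 3.11:
\[ (\tilde H_{\mathcal A})_{\tilde 1 \tilde 2} = d(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2} + (\ell^* H)_{1,3} = d(\tilde\rho_{\mathcal A})_{\tilde 2} - d\rho_{1,3} + d\rho_{1,3} + H_3 = (\tilde H_{\mathcal A})_{\tilde 2}. \tag{5.13} \]
It follows that $\tilde H_{\mathcal A}$ (without any further extension) is a $\Gamma$-equivariantly closed form on $\tilde M$.

Proposition 5.5. Let $(\mathcal G, \alpha, \beta)$ be a $\Gamma$-equivariant gerbe over $M$ in the sense of Definition 5.1 and let $P$ be a principal $\Gamma$-bundle over the surface $\Sigma$ with connection $\mathcal A$. Then the gerbe $\tilde{\mathcal G}_{\mathcal A}$ over $\tilde M = P \times M$ may be canonically equipped with the structure of a $\Gamma$-equivariant gerbe relative to the zero 2-form.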


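Proof. First, we have to construct a 1-isomorphism $\tilde\alpha_{\mathcal A}$ of gerbes over $\Gamma \times \tilde M$:
\[ \tilde\alpha_{\mathcal A} : (\tilde{\mathcal G}_{\mathcal A})_{\tilde 1 \tilde 2} \to (\tilde{\mathcal G}_{\mathcal A})_{\tilde 2}. \tag{5.14} \]
It is obtained as the composition
\[ (\tilde{\mathcal G}_{\mathcal A})_{\tilde 1 \tilde 2} = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2}} \otimes \mathcal G_{13} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{1,3}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2}} \otimes \mathcal I_{\rho_{1,3}} \otimes \mathcal G_3 = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 2}} \otimes \mathcal G_3 = (\tilde{\mathcal G}_{\mathcal A})_{\tilde 2}, \tag{5.15} \]
where we used Lemma 5.4. Hence, $\tilde\alpha_{\mathcal A}$ is the tensor product of the identity 1-isomorphism of the gerbe $\mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2}}$ with the 1-isomorphism $\alpha_{1,3}$.

Next, we have to construct a 2-isomorphism $\tilde\beta_{\mathcal A}$ between 1-isomorphisms of gerbes $(\tilde{\mathcal G}_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}$ and $(\tilde{\mathcal G}_{\mathcal A})_{\tilde 3}$ over $\Gamma^2 \times \tilde M$,
\[ \tilde\beta_{\mathcal A} : (\tilde\alpha_{\mathcal A})_{\tilde 2, \tilde 3} \circ (\tilde\alpha_{\mathcal A})_{\tilde 1, \tilde 2 \tilde 3} \Rightarrow (\tilde\alpha_{\mathcal A})_{\tilde 1 \tilde 2, \tilde 3}. \tag{5.16} \]
Note that $(\tilde\alpha_{\mathcal A})_{\tilde 1, \tilde 2 \tilde 3}$ is the 1-isomorphism
\[ (\tilde{\mathcal G}_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3} = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal G_{124} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{1,24}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal I_{\rho_{1,24}} \otimes \mathcal G_{24} = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 2 \tilde 3}} \otimes \mathcal G_{24} = (\tilde{\mathcal G}_{\mathcal A})_{\tilde 2 \tilde 3}, \tag{5.17} \]
since Lemma 5.4 implies that $(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3} + \rho_{1,24} = (\tilde\rho_{\mathcal A})_{\tilde 2 \tilde 3}$. Similarly, $(\tilde\alpha_{\mathcal A})_{\tilde 2, \tilde 3}$ is the 1-isomorphism
\[ (\tilde{\mathcal G}_{\mathcal A})_{\tilde 2 \tilde 3} = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 2 \tilde 3}} \otimes \mathcal G_{24} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{2,4}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 2 \tilde 3}} \otimes \mathcal I_{\rho_{2,4}} \otimes \mathcal G_4 = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 3}} \otimes \mathcal G_4 = (\tilde{\mathcal G}_{\mathcal A})_{\tilde 3}, \tag{5.18} \]
where we used the relation $(\tilde\rho_{\mathcal A})_{\tilde 2 \tilde 3} + \rho_{2,4} = (\tilde\rho_{\mathcal A})_{\tilde 3}$, again following from Lemma 5.4. Hence, $(\tilde\alpha_{\mathcal A})_{\tilde 2, \tilde 3} \circ (\tilde\alpha_{\mathcal A})_{\tilde 1, \tilde 2 \tilde 3}$ is the 1-isomorphism
\[ \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal G_{124} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{1,24}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal I_{\rho_{1,24}} \otimes \mathcal G_{24} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{2,4}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal I_{\rho_{1,24}} \otimes \mathcal I_{\rho_{2,4}} \otimes \mathcal G_4 = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 3}} \otimes \mathcal G_4, \tag{5.19} \]
that is the tensor product of the identity 1-isomorphism of the gerbe $\mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}}$ with the 1-isomorphism $(\mathrm{Id} \otimes \alpha_{2,4}) \circ \alpha_{1,24}$. On the other hand, $(\tilde\alpha_{\mathcal A})_{\tilde 1 \tilde 2, \tilde 3}$ is the 1-isomorphism given by
\[ (\tilde{\mathcal G}_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3} = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal G_{124} \xrightarrow{\ \mathrm{Id} \otimes \alpha_{12,4}\ } \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}} \otimes \mathcal I_{\rho_{12,4}} \otimes \mathcal G_4 = \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 3}} \otimes \mathcal G_4 = (\tilde{\mathcal G}_{\mathcal A})_{\tilde 3} \tag{5.20} \]
because $(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3} + \rho_{12,4} = (\tilde\rho_{\mathcal A})_{\tilde 3}$, once again by virtue of Lemma 5.4. Comparison between (5.19) and (5.20), and Definition 5.1 (ii) show that we may take for $\tilde\beta_{\mathcal A}$ the 2-isomorphism obtained by tensoring the identity 2-isomorphism between the identity 1-isomorphisms of the gerbe $\mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1 \tilde 2 \tilde 3}}$ with the 2-isomorphism $\beta_{1,2,4}$:
\[ \tilde\beta_{\mathcal A} := \mathrm{Id} \otimes \beta_{1,2,4}. \tag{5.21} \]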

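We have to check that the 1-isomorphism $\tilde\alpha_{\mathcal A}$ and the 2-isomorphism $\tilde\beta_{\mathcal A}$ make the diagram

\[
\begin{array}{ccc}
 & (\tilde\alpha_{\mathcal A})_{\tilde 3,\tilde 4} \circ (\tilde\alpha_{\mathcal A})_{\tilde 2,\tilde 3\tilde 4} \circ (\tilde\alpha_{\mathcal A})_{\tilde 1,\tilde 2\tilde 3\tilde 4} & \\
 {\scriptstyle (\tilde\beta_{\mathcal A})_{\tilde 2,\tilde 3,\tilde 4}\circ\mathrm{Id}}\ \swarrow & & \searrow\ {\scriptstyle \mathrm{Id}\circ(\tilde\beta_{\mathcal A})_{\tilde 1,\tilde 2,\tilde 3\tilde 4}} \\
 (\tilde\alpha_{\mathcal A})_{\tilde 2\tilde 3,\tilde 4} \circ (\tilde\alpha_{\mathcal A})_{\tilde 1,\tilde 2\tilde 3\tilde 4} & & (\tilde\alpha_{\mathcal A})_{\tilde 3,\tilde 4} \circ (\tilde\alpha_{\mathcal A})_{\tilde 1\tilde 2,\tilde 3\tilde 4} \\
 {\scriptstyle (\tilde\beta_{\mathcal A})_{\tilde 1,\tilde 2\tilde 3,\tilde 4}}\ \searrow & & \swarrow\ {\scriptstyle (\tilde\beta_{\mathcal A})_{\tilde 1\tilde 2,\tilde 3,\tilde 4}} \\
 & (\tilde\alpha_{\mathcal A})_{\tilde 1\tilde 2\tilde 3,\tilde 4} &
\end{array}
\tag{5.22}
\]

commutative. It is easy to see that the above diagram may be identified with the tensor product of the identity 2-isomorphism between the identity 1-isomorphisms of the gerbe $\mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1\tilde 2\tilde 3\tilde 4}}$ by the pullback of diagram (5.1) along the projection from $\Gamma^3 \times P \times M$ to $\Gamma^3 \times M$. This assures its commutativity, completing the proof of Proposition 5.5.

The action (5.9) of $\Gamma$ on $\tilde M$ is free and the quotient space $\tilde M/\Gamma = P \times_\Gamma M =: P_M$ is the associated bundle over $\Sigma$ with the typical fiber $M$. The space $\tilde M$ may be viewed as a (left) principal $\Gamma$-bundle $\tilde\omega : \tilde M \to P_M$. Theorem 5.3 and Proposition 5.5 have as an immediate consequence

Corollary 5.6. The gerbe $\tilde{\mathcal G}_{\mathcal A}$ on $\tilde M$ descends to a gerbe $\mathcal G_{\mathcal A}$ on $P_M$ whose pullback along $\tilde\omega$ is 1-isomorphic to $\tilde{\mathcal G}_{\mathcal A}$. In particular, the curvature of $\mathcal G_{\mathcal A}$ is equal to the closed 3-form $H_{\mathcal A}$ on $P_M$ whose pullback to $\tilde M$ coincides with $\tilde H_{\mathcal A}$.

In order to couple the sigma model with target $M$ to a gauge field $\mathcal A$ in the principal $\Gamma$-bundle $\pi : P \to \Sigma$, one also has to modify the sigma-model fields. In the gauged model, they become global sections $\Phi : \Sigma \to P_M$ of the associated bundle rather than maps from $\Sigma$ to $M$.

Definition 5.7. Let $(\mathcal G, \alpha, \beta)$ be a $\Gamma$-equivariant gerbe over $M$ and $P$ a principal $\Gamma$-bundle with connection $\mathcal A$ over a closed oriented surface $\Sigma$. The Wess-Zumino contribution of a field $\Phi : \Sigma \to P_M$ to the gauged Feynman amplitude is defined by

\[ \mathbf A_{WZ}(\Phi, \mathcal A) := \mathrm{Hol}_{\mathcal G_{\mathcal A}}(\Phi) . \tag{5.23} \]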

Remark 5.8. The above constructions are functorial with respect to isomorphisms of principal bundles $P$. If $P$ is trivial, i.e. $P = \Sigma \times \Gamma$, then the gauge fields $\mathcal A$ may be related to $\mathfrak g$-valued 1-forms $A$ on M by the formula $\mathcal A(x, \gamma^{-1}) = \mathrm{Ad}_\gamma\big(A(x) - \Theta(\gamma)\big)$. In this case, the associated bundle $P_M$ may be naturally identified with $\Sigma \times M$, and the gerbe $\mathcal G_{\mathcal A}$ with the gerbe $\mathcal G_A$ defined by relation (3.9). One recovers this way the coupling to the topologically trivial gauge fields discussed previously, see Definition 3.3.

5.3. General gauge invariance. For the general case of gauge fields $\mathcal A$ corresponding to connections in a principal $\Gamma$-bundle $\pi : P \to \Sigma$, the general gauge transformations $h$ are defined as sections of the associated bundle $\mathrm{Ad}(P) = P \times_{\mathrm{Ad}} \Gamma$. The latter is

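composed of the orbits $\{(p\gamma'^{-1}, \mathrm{Ad}_{\gamma'}(\gamma)) \mid \gamma' \in \Gamma\} =: [(p, \gamma)]$ of the action of $\Gamma$ on $P \times \Gamma$. Orbits $[(p, \gamma_1)]$ and $[(p, \gamma_2)]$ may be multiplied to $[(p, \gamma_1\gamma_2)]$ so that $\mathrm{Ad}(P)$ is a bundle of groups. Consequently, sections of $\mathrm{Ad}(P)$ may be multiplied point-wise, forming the group of gauge transformations. An orbit $[(p, \gamma)]$ acts (from the left) on the fiber $\pi^{-1}(\pi(p)) \subset P$ by the mapping

\[ p\gamma' \longmapsto p\gamma\gamma' =: [(p, \gamma)] \cdot p\gamma' . \tag{5.24} \]

This action induces a left action of gauge transformations $h$ on $P$ by principal $\Gamma$-bundle automorphisms $\lambda_h$ given by

\[ P \ni p \ \overset{\lambda_h}{\longmapsto}\ h(x) \cdot p , \qquad x = \pi(p) . \tag{5.25} \]

Gauge transformations of the gauge field $\mathcal A$ are defined as

\[ \mathcal A \longmapsto h\mathcal A := (\lambda_h^{-1})^* \mathcal A . \tag{5.26} \]

Note that the maps

\[ \tilde L_h := \lambda_h \times \mathrm{Id} \tag{5.27} \]

from $\tilde M = P \times M$ into itself are $\Gamma$-equivariant, i.e. they commute with the action (5.9) of $\Gamma$ on $\tilde M$. Consequently, they descend to automorphisms $L_h$ of the associated bundle $P_M = P \times_\Gamma M$. Gauge transformations of sections $\Phi$ of $P_M$ are defined by the formula

\[ \Phi \longmapsto L_h \circ \Phi =: h\Phi . \tag{5.28} \]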

In the case of the trivial bundle $P$, the associated bundle $\mathrm{Ad}(P)$ is also trivial and the sections $h$ of $\mathrm{Ad}(P)$ reduce to maps from $\Sigma$ to $\Gamma$. Their action on gauge fields $\mathcal A$ agrees with the action (4.3) on the 1-forms $A$ related to $\mathcal A$ as in Remark 5.8. Similarly, their action on sections $\Phi$ of the trivial associated bundle agrees with the one considered in Eq. (4.2). The invariance of the amplitudes $\mathbf A_{WZ}(\Phi, \mathcal A)$ from Definition 5.7 in the case of the trivial bundle $P$ is assured by the assumption of the $\Gamma$-equivariance of the gerbe $\mathcal G$. Indeed, as follows from Corollary 4.5, only property (i) of Definition 5.1 is needed in that case to guarantee the gauge invariance under general gauge transformations. Here, we shall prove for general principal $\Gamma$-bundles $P$,

Theorem 5.9. The amplitudes $\mathbf A_{WZ}(\Phi, \mathcal A)$ of Definition 5.7 are invariant under all gauge transformations, i.e.

\[ \mathbf A_{WZ}(h\Phi, h\mathcal A) = \mathbf A_{WZ}(\Phi, \mathcal A) \tag{5.29} \]

for all sections $h$ of the bundle $\mathrm{Ad}(P)$.

Proof. We have to show that

\[ \mathrm{Hol}_{\mathcal G_{h\mathcal A}}(L_h \circ \Phi) = \mathrm{Hol}_{L_h^* \mathcal G_{h\mathcal A}}(\Phi) = \mathrm{Hol}_{\mathcal G_{\mathcal A}}(\Phi) \tag{5.30} \]

for all $h$, $\Phi$ and $\mathcal A$. This follows if there exists a 1-isomorphism between gerbes $L_h^* \mathcal G_{h\mathcal A}$ and $\mathcal G_{\mathcal A}$. Recall that gerbe $\mathcal G_{\mathcal A}$ over $P_M$ descended from the $\Gamma$-equivariant gerbe $(\tilde{\mathcal G}_{\mathcal A}, \tilde\alpha_{\mathcal A}, \tilde\beta_{\mathcal A})$ over $\tilde M$, see Proposition 5.5 and Corollary 5.6. Since maps $\tilde L_h$

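of $\tilde M$ are $\Gamma$-equivariant, gerbe $L_h^* \mathcal G_{\mathcal A}$ descends, in turn, from the $\Gamma$-equivariant gerbe $(\tilde L_h^* \tilde{\mathcal G}_{\mathcal A}, (\tilde L_h)^*_{\tilde 2} \tilde\alpha_{\mathcal A}, (\tilde L_h)^*_{\tilde 3} \tilde\beta_{\mathcal A})$, see Theorem 5.3. We claim that the two gerbes $(\tilde L_h^* \tilde{\mathcal G}_{\mathcal A}, (\tilde L_h)^*_{\tilde 2} \tilde\alpha_{\mathcal A}, (\tilde L_h)^*_{\tilde 3} \tilde\beta_{\mathcal A})$ and $(\tilde{\mathcal G}_{h^{-1}\mathcal A}, \tilde\alpha_{h^{-1}\mathcal A}, \tilde\beta_{h^{-1}\mathcal A})$ coincide. The claim implies, by virtue of Theorem 5.3, that the descended gerbes $L_h^* \mathcal G_{\mathcal A}$ and $\mathcal G_{h^{-1}\mathcal A}$ over $P_M$ coincide as well and, hence, so do $L_h^* \mathcal G_{h\mathcal A}$ and $\mathcal G_{\mathcal A}$.

It remains to prove the above claim. From definitions (5.8) of the form $\tilde\rho_{\mathcal A}$, (5.26) of $h\mathcal A$ and (5.27) of the action $\tilde L_h$ on $\tilde M$, using, in particular, the fact that $\tilde L_h$ acts trivially on the factor $M$ in $P \times M$, it follows immediately that

\[ \tilde L_h^* \tilde\rho_{\mathcal A} = \tilde\rho_{h^{-1}\mathcal A} . \tag{5.31} \]

This, in conjunction with definition (5.11), implies, in turn, the equality of gerbes

\[ \tilde L_h^* \tilde{\mathcal G}_{\mathcal A} = \tilde{\mathcal G}_{h^{-1}\mathcal A} . \tag{5.32} \]

Recall from the proof of Proposition 5.5 that $\tilde\alpha_{\mathcal A}$ is the tensor product of the identity 1-isomorphism of the gerbe $\mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1\tilde 2}}$ with the 1-isomorphism $\alpha_{1,3}$. Now, the map $(\tilde L_h)_{\tilde 2}$ of $\Gamma \times \tilde M = \Gamma \times P \times M$ acts only on the factor $P$. Besides,

\[ (\tilde L_h)^*_{\tilde 2} (\tilde\rho_{\mathcal A})_{\tilde 1\tilde 2} = (\tilde L_h^* \tilde\rho_{\mathcal A})_{\tilde 1\tilde 2} = (\tilde\rho_{h^{-1}\mathcal A})_{\tilde 1\tilde 2} . \tag{5.33} \]

We infer this way that the 1-isomorphism $(\tilde L_h)^*_{\tilde 2} \tilde\alpha_{\mathcal A}$ is the tensor product of the identity 1-isomorphism of the gerbe $\mathcal I_{(\tilde\rho_{h^{-1}\mathcal A})_{\tilde 1\tilde 2}}$ with the 1-isomorphism $\alpha_{1,3}$ so that

\[ (\tilde L_h)^*_{\tilde 2} \tilde\alpha_{\mathcal A} = \tilde\alpha_{h^{-1}\mathcal A} . \tag{5.34} \]

Additionally, equalities (5.32) and (5.34) allow us to relate the 2-isomorphisms $(\tilde L_h)^*_{\tilde 3} \tilde\beta_{\mathcal A}$ and $\tilde\beta_{h^{-1}\mathcal A}$. Indeed, both are tensor products of the identity 2-isomorphism between the identity 1-isomorphisms of the gerbe $(\tilde L_h)^*_{\tilde 3} \mathcal I_{(\tilde\rho_{\mathcal A})_{\tilde 1\tilde 2\tilde 3}} = \mathcal I_{(\tilde\rho_{h^{-1}\mathcal A})_{\tilde 1\tilde 2\tilde 3}}$ with the 2-isomorphism $\beta_{1,2,4}$, see Eq. (5.21). Hence,

\[ (\tilde L_h)^*_{\tilde 3} \tilde\beta_{\mathcal A} = \tilde\beta_{h^{-1}\mathcal A} , \tag{5.35} \]

and the claim is established.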

6. Obstructions and Classification of Equivariant Structures

In this section, we shall treat the obstructions to the existence and the classification of equivariant structures on a gerbe $\mathcal G$ over a $\Gamma$-space, see Definition 5.1. We shall start by discussing in turn the obstructions to the three parts of the structure: the 1-isomorphism $\alpha$, the 2-isomorphism $\beta$, and the commutative diagram (5.1).

6.1. Obstructions to 1-isomorphisms $\alpha$. The first obstruction concerns the existence of a 1-isomorphism $\alpha : \mathcal G_{12} \to \mathcal I_\rho \otimes \mathcal G_2$ or, equivalently, the triviality of the 1-isomorphism class $[\mathcal F] \in H^2(\Gamma \times M, U(1))$ of the flat gerbe $\mathcal F = \mathcal G_{12} \otimes \mathcal G_2^* \otimes \mathcal I_{-\rho}$ over $\Gamma \times M$. It coincides with the obstruction to the general gauge invariance of the WZ amplitudes (3.10) coupled to topologically trivial gauge fields, see Corollary 4.5. By the Universal Coefficient Theorem, $H^2(\Gamma \times M, U(1)) = \mathrm{Hom}(H_2(\Gamma \times M), U(1))$. In the latter presentation, class $[\mathcal F]$ is given by the holonomy of the flat gerbe $\mathcal F$ along maps $(h, \varphi) : \Sigma \to \Gamma \times M$ defining singular 2-cycles, and its triviality is equivalent to the triviality of the holonomy. By the Künneth Theorem,

\[ H_2(\Gamma \times M) = H_2(\Gamma) \otimes H_0(M) \ \oplus\ H_1(\Gamma) \otimes H_1(M) \ \oplus\ H_0(\Gamma) \otimes H_2(M) . \tag{6.1} \]

Subgroup $H_2(\Gamma) \otimes H_0(M) \cong H_2(\Gamma)^{\pi_0(M)}$ is generated by the singular 2-cycles corresponding to maps $(h, \varphi)$ with $\varphi$ taking a constant value in one of the connected components of $M$ ($\pi_0(M)$ is the set of such components). Similarly for $H_0(\Gamma) \otimes H_2(M) \cong H_2(M)^{\pi_0(\Gamma)}$. Subgroup $H_1(\Gamma) \otimes H_1(M)$ is generated by the maps

\[ S^1 \times S^1 \ni (e^{i\sigma_1}, e^{i\sigma_2}) \longmapsto \big(h(e^{i\sigma_1}), \varphi(e^{i\sigma_2})\big) \in \Gamma \times M \tag{6.2} \]

with $h$ and $\varphi$ giving rise to singular 1-cycles in $\Gamma$ and $M$, respectively. Thus,

\[
\begin{aligned}
H^2(\Gamma \times M, U(1)) &= \mathrm{Hom}(H_2(\Gamma)^{\pi_0(M)}, U(1)) \oplus \mathrm{Hom}(H_1(\Gamma) \otimes H_1(M), U(1)) \oplus \mathrm{Hom}(H_2(M)^{\pi_0(\Gamma)}, U(1)) \\
&= H^2(\Gamma, U(1))^{\pi_0(M)} \oplus \mathrm{Hom}(H_1(\Gamma) \otimes H_1(M), U(1)) \oplus H^2(M, U(1))^{\pi_0(\Gamma)} .
\end{aligned}
\tag{6.3}
\]

Accordingly, we obtain

Proposition 6.1. Class $[\mathcal F] \in H^2(\Gamma \times M, U(1))$ that obstructs the existence of 1-isomorphism $\alpha$ of Definition 5.1 decomposes as

\[ [\mathcal F] = [\mathcal F]_{20} + [\mathcal F]_{11} + [\mathcal F]_{02} , \tag{6.4} \]

with the summands $[\mathcal F]_{20} \in H^2(\Gamma, U(1))^{\pi_0(M)}$, $[\mathcal F]_{11} \in \mathrm{Hom}(H_1(\Gamma) \otimes H_1(M), U(1))$ and $[\mathcal F]_{02} \in H^2(M, U(1))^{\pi_0(\Gamma)}$. Components of $[\mathcal F]_{20}$ are the 1-isomorphism classes of flat gerbes $r_m^* \mathcal G \otimes \mathcal I_{-\rho_m}$ over $\Gamma$ for fixed points $m$ in different connected components of $M$, with $r_m(\gamma) = \gamma m = \ell_\gamma(m)$ and $\rho_m = \tfrac12 (\iota_a v_b)(m)\, \Theta^a \Theta^b$. Components of $[\mathcal F]_{02}$ are the 1-isomorphism classes of flat gerbes $\ell_\gamma^* \mathcal G \otimes \mathcal G^*$ for fixed points $\gamma$ in different connected components of $\Gamma$. Finally, the bihomomorphism $[\mathcal F]_{11} \in \mathrm{Hom}(H_1(\Gamma) \otimes H_1(M), U(1))$ is given by the gerbe $\mathcal F$ holonomy of the maps (6.2).

Corollary 6.2. If the connected components of $M$ and $\Gamma$ are 2-connected, then there is no obstruction to the existence of 1-isomorphism $\alpha$ of Definition 5.1.

This applies to the case, studied in [26,27], of $\Gamma$-equivariant structures on the WZW gerbe $\mathcal G_k$ over $\tilde G$ for $\Gamma = Z \subset \tilde Z$ acting on $\tilde G$ by multiplication.
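As a rough bookkeeping aid for the decomposition (6.1), the ranks of the three Künneth summands of $H_2(\Gamma \times M)$ can be tabulated from Betti numbers. The sketch below is only an illustration: the function name `kunneth_summands` and the sample spaces ($\Gamma = U(1)$, $M = T^2$) are our own assumptions, and ranks ignore torsion, so they do not determine the $U(1)$-valued classes themselves.

```python
def kunneth_summands(betti_gamma, betti_m, degree=2):
    """Ranks of H_{degree-j}(Gamma) (x) H_j(M) for j = 0..degree.

    betti_gamma, betti_m: lists of Betti numbers [b_0, b_1, ...].
    For degree 2 the result is ordered as in (6.1):
    [b_2(Gamma) b_0(M), b_1(Gamma) b_1(M), b_0(Gamma) b_2(M)].
    Only free ranks are tracked; torsion is not captured.
    """
    def b(betti, i):
        return betti[i] if 0 <= i < len(betti) else 0

    return [b(betti_gamma, degree - j) * b(betti_m, j) for j in range(degree + 1)]


# Hypothetical example: Gamma = U(1) (a circle), M = T^2 (a 2-torus).
summands = kunneth_summands([1, 1], [1, 2, 1])
total_rank = sum(summands)  # rank of H_2 of the 3-torus U(1) x T^2
```

In this example the middle summand, which carries the bihomomorphism part $[\mathcal F]_{11}$, is the only place where both factors contribute one-cycles.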

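For the $\Gamma$-space $M = G$ in the coset-model context, see Definition 4.6, and with a WZW gerbe $\mathcal G_k$ over $G$, the flat gerbe $\mathcal F$ was denoted $\mathcal F_k$, see Sect. 4.2. In decomposition (6.4) of cohomology class $[\mathcal F_k] \in H^2(\Gamma \times G, U(1))$, terms $[\mathcal F_k]_{20}$ and $[\mathcal F_k]_{02}$ are trivial as determined by the $\mathcal F_k$-holonomy of the maps $(h_{\tilde p_1^\vee, \tilde p_2^\vee}, \varphi_{p_1^\vee, p_2^\vee})$ of Eqs. (4.21) with $p_1^\vee = p_2^\vee = 0$ or $\tilde p_1^\vee = \tilde p_2^\vee = 0$, respectively, whereas the bihomomorphism $[\mathcal F_k]_{11} \in \mathrm{Hom}(\tilde Z \otimes Z, U(1))$ is determined by the holonomy with $\tilde p_2^\vee = p_1^\vee = 0$, i.e. by

\[ b_{\tilde z_1, z_2} = \exp\big[ -2\pi i\, k\, \mathrm{tr}\, \tilde p_1^\vee p_2^\vee \big] , \tag{6.5} \]

see Eq. (4.22), and may be non-trivial.

6.2. Local description of gerbes. In order to discuss further obstructions to the existence of a $\Gamma$-equivariant structure on gerbe $\mathcal G$ over $\Gamma$-space $M$, it will be convenient to use local data for gerbes and their 1- and 2-isomorphisms. We shall follow the discussion in the first part of Sec. VII of [25]. The local data live in the Deligne complex $\mathcal D(2)$

\[ 0 \longrightarrow A^0(\mathcal O) \xrightarrow{\ D_0\ } A^1(\mathcal O) \xrightarrow{\ D_1\ } A^2(\mathcal O) \xrightarrow{\ D_2\ } A^3(\mathcal O) \tag{6.6} \]

associated to an open covering $\mathcal O$ of $M$. With $\mathcal U$ standing for the sheaf of smooth $U(1)$-valued functions and $\Omega^n$ for the sheaf of $n$-forms, the groups of the Deligne complex are

\[ A^0(\mathcal O) = C^0(\mathcal O, \mathcal U) , \qquad A^1(\mathcal O) = C^0(\mathcal O, \Omega^1) \oplus C^1(\mathcal O, \mathcal U) , \tag{6.7} \]

\[ A^2(\mathcal O) = C^0(\mathcal O, \Omega^2) \oplus C^1(\mathcal O, \Omega^1) \oplus C^2(\mathcal O, \mathcal U) , \qquad A^3(\mathcal O) = C^1(\mathcal O, \Omega^2) \oplus C^2(\mathcal O, \Omega^1) \oplus C^3(\mathcal O, \mathcal U) , \tag{6.8} \]

where $C^\ell(\mathcal O, \mathcal S)$ denotes the $\ell$th Čech cochain group of the open cover $\mathcal O$, with values in a sheaf $\mathcal S$ of Abelian groups. The differentials are

\[ D_0(f_i) = \big( -i f_i^{-1} d f_i ,\ f_j^{-1} f_i \big) , \qquad D_1(\Pi_i, \chi_{ij}) = \big( d\Pi_i ,\ -i \chi_{ij}^{-1} d\chi_{ij} + \Pi_j - \Pi_i ,\ \chi_{jk}^{-1} \chi_{ik} \chi_{ij}^{-1} \big) , \tag{6.9} \]

\[ D_2(B_i, A_{ij}, g_{ijk}) = \big( dA_{ij} - B_j + B_i ,\ -i g_{ijk}^{-1} dg_{ijk} + A_{jk} - A_{ik} + A_{ij} ,\ g_{jkl}^{-1} g_{ikl}\, g_{ijl}^{-1} g_{ijk} \big) . \tag{6.10} \]

A refinement $r : \mathcal O' \to \mathcal O$ of covering $\mathcal O$ induces a restriction map on complexes (6.6). Local data for gerbe $\mathcal G$ over $M$ form a cocycle $c \in A^2(\mathcal O)$, $D_2 c = 0$, for a sufficiently fine covering $\mathcal O$ of $M$. Local data for a 1-isomorphism $\alpha : \mathcal G_1 \to \mathcal G_2$ of gerbes with the respective local data $c_i \in A^2(\mathcal O_i)$ are given by a cochain $b \in A^1(\mathcal O)$ for $\mathcal O$ a common refinement of $\mathcal O_1$ and $\mathcal O_2$ such that, upon restricting the $c_i$ to it, $c_2 = c_1 + D_1 b$ (we use the additive notation for the Abelian group law in all $A^n(\mathcal O)$). Finally, local data for a 2-isomorphism $\beta : \alpha_1 \Rightarrow \alpha_2$ are given by a cochain $a \in A^0(\mathcal O)$ for a sufficiently fine covering $\mathcal O$ such that, given local data $b_i$ for 1-isomorphisms $\alpha_i$ restricted to $\mathcal O$, $b_2 = b_1 + D_0 a$. For sufficiently fine $\mathcal O$, the cohomology of the complex (6.6) is

\[ \mathbb H^2(\mathcal O, \mathcal D(2)) = \frac{\ker D_2}{\mathrm{Im}\, D_1} , \qquad \mathbb H^1(\mathcal O, \mathcal D(2)) = \frac{\ker D_1}{\mathrm{Im}\, D_0} \cong H^1(M, U(1)) , \tag{6.11} \]

\[ \mathbb H^0(\mathcal O, \mathcal D(2)) = \ker D_0 \cong H^0(M, U(1)) . \tag{6.12} \]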

These groups may be identified, respectively, with the group of 1-isomorphism classes of gerbes, the group of isomorphism classes of flat line bundles, and the group of locally constant $U(1)$-valued functions on $M$.

In the following, we want to consider local data for gerbes and their 1- and 2-isomorphisms over the spaces $\Gamma^p \times M$ that form a simplicial manifold with face maps $\Delta^p_q : \Gamma^p \times M \to \Gamma^{p-1} \times M$ for all $p \geq 1$ and $0 \leq q \leq p$ given by

\[
\Delta^p_q(\gamma_1, \dots, \gamma_p, m) :=
\begin{cases}
(\gamma_2, \dots, \gamma_p, m) & \text{for } q = 0, \\
(\gamma_1, \dots, \gamma_q \gamma_{q+1}, \dots, \gamma_p, m) & \text{for } 1 \leq q < p, \\
(\gamma_1, \dots, \gamma_{p-1}, \gamma_p m) & \text{for } q = p.
\end{cases}
\tag{6.13}
\]

The face maps satisfy the simplicial relations

\[ \Delta^{p-1}_r \circ \Delta^p_q = \Delta^{p-1}_{q-1} \circ \Delta^p_r \tag{6.14} \]

for all $r < q$. We shall use simplicial sequences $\{\mathcal O^p\}$ of open coverings $\mathcal O^p = \{O^p_i\}_{i \in I^p}$ of the spaces $\Gamma^p \times M$ such that there are face maps $\Delta^p_q : I^p \to I^{p-1}$ of the index sets satisfying (6.14), and such that

\[ \Delta^p_q(O^p_i) \subset O^{p-1}_{\Delta^p_q(i)} \tag{6.15} \]

for all $p \geq 1$, all $0 \leq q \leq p$ and all $i \in I^p$. A construction of Ref. [52], reviewed in the Appendix of [25], makes it possible to build a simplicial sequence $\{\mathcal O^p\}$ whose coverings $\mathcal O^p$ refine the coverings of any given sequence of coverings of $\Gamma^p \times M$. Given a simplicial sequence $\{\mathcal O^p\}$ of coverings of $\Gamma^p \times M$, one has induced cochain maps $(\Delta^p_q)^* : C^\ell(\mathcal O^{p-1}, \mathcal S) \to C^\ell(\mathcal O^p, \mathcal S)$ defined by

\[ \big( (\Delta^p_q)^* f \big)_i = (\Delta^p_q)^* \big( f_{\Delta^p_q(i)} \big) , \tag{6.16} \]

satisfying the co-simplicial relations

\[ (\Delta^p_q)^* \circ (\Delta^{p-1}_r)^* = (\Delta^p_r)^* \circ (\Delta^{p-1}_{q-1})^* \tag{6.17} \]

for $r < q$. On the groups $A^n(\mathcal O^p)$, besides the Deligne differentials

\[ D_{n,p} : A^n(\mathcal O^p) \longrightarrow A^{n+1}(\mathcal O^p) , \tag{6.18} \]

one has the simplicial operators

\[ \Delta_{n,p} : A^n(\mathcal O^p) \longrightarrow A^n(\mathcal O^{p+1}) \quad \text{with} \quad \Delta_{n,p} := \sum_{q=0}^{p+1} (-1)^q (\Delta^{p+1}_q)^* \tag{6.19} \]

whose definition uses the lift (6.16) of the face maps to groups $A^n(\mathcal O^p)$. Due to the co-simplicial relations (6.17), we have $\Delta_{n,p+1} \circ \Delta_{n,p} = 0$. The differentials $D_{n,p}$ commute with pullbacks, and thus also with operators $\Delta_{n,p}$. This endows the family $K = (A^n(\mathcal O^p))$ of Abelian groups with the structure of a double complex.
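The face maps (6.13), the simplicial relations (6.14) and the nilpotency of the alternating sum (6.19) can be checked by brute force on a finite toy model. The sketch below is an illustration under our own assumptions, not part of the construction above: it takes the symmetric group $S_3$ acting on three points in place of $\Gamma$ acting on $M$, replaces cochains by integer-valued functions written additively, and uses helper names (`face`, `delta`, and so on) chosen only for this example.

```python
import random
from itertools import permutations, product

GROUP = list(permutations(range(3)))  # toy stand-in for Gamma: S_3 as tuples
POINTS = range(3)                     # toy stand-in for M: three points


def compose(g, h):
    """Group product g*h, acting on points as (g*h)(x) = g(h(x))."""
    return tuple(g[h[x]] for x in range(len(h)))


def face(q, point):
    """Face map of (6.13) applied to point = (g_1, ..., g_p, m)."""
    *gs, m = point
    p = len(gs)
    if q == 0:                        # drop the first group element
        return (*gs[1:], m)
    if q < p:                         # multiply g_q with g_{q+1}
        return (*gs[:q - 1], compose(gs[q - 1], gs[q]), *gs[q + 1:], m)
    return (*gs[:-1], gs[-1][m])      # q = p: let g_p act on m


def simplicial_relations_hold(p):
    """Check (6.14) for all r < q <= p on every point of GROUP^p x POINTS."""
    for gs in product(GROUP, repeat=p):
        for m in POINTS:
            x = (*gs, m)
            for q in range(1, p + 1):
                for r in range(q):
                    if face(r, face(q, x)) != face(q - 1, face(r, x)):
                        return False
    return True


def delta(f):
    """Alternating sum of pullbacks along the face maps, as in (6.19)."""
    def df(x):
        return sum((-1) ** q * f(face(q, x)) for q in range(len(x)))
    return df


def delta_squared_vanishes(seed=0):
    """Check that delta(delta(f)) = 0 for a random integer function f."""
    rng = random.Random(seed)
    table = {(g, m): rng.randint(-5, 5) for g in GROUP for m in POINTS}
    ddf = delta(delta(lambda x: table[x]))
    return all(ddf(x) == 0 for x in product(GROUP, GROUP, GROUP, POINTS))
```

A non-Abelian group is used on purpose, so that the check is sensitive to the order of multiplication in the middle case of (6.13).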

6.3. Obstructions to 2-isomorphism $\beta$. If cocycle $c \in A^2(\mathcal O^0)$ describes local data for gerbe $\mathcal G$ over $M$ then $-(\Delta_{2,0} c + \rho) \in A^2(\mathcal O^1)$, where $\rho$ is identified with the cochain $(\rho|_{O^1_i}, 0, 1)$ for $i \in I^1$, represents local data for the flat gerbe $\mathcal F = \mathcal G_{12} \otimes \mathcal G_2^* \otimes \mathcal I_{-\rho}$. The triviality of 1-isomorphism class $[\mathcal F]$, discussed in Sect. 6.1, means that, for a sufficiently fine simplicial sequence of coverings $\{\mathcal O^p\}$,

\[ \Delta_{2,0} c + \rho = D_{1,1} b \tag{6.20} \]

for some $b \in A^1(\mathcal O^1)$. The cochain $b$ provides local data for a 1-isomorphism $\alpha : \mathcal G_{12} \to \mathcal I_\rho \otimes \mathcal G_2$, see Definition 5.1. It is defined modulo the addition $b \mapsto b + b'$, where $D_{1,1} b' = 0$. This freedom corresponds to the freedom of choice of $\alpha$ and of local data for it. The cochains $(\Delta^2_0)^* b$, $(\Delta^2_1)^* b$ and $(\Delta^2_2)^* b$ provide, in turn, local data for 1-isomorphisms $\alpha_{2,3}$, $\alpha_{12,3}$ and $\alpha_{1,23}$, respectively. The existence of a 2-isomorphism $\beta : (\mathrm{Id} \otimes \alpha_{2,3}) \circ \alpha_{1,23} \Rightarrow \alpha_{12,3}$ is equivalent to the requirement that, for sufficiently fine $\{\mathcal O^p\}$,

\[ \Delta_{1,1} b = -D_{0,2}\, a \tag{6.21} \]

with $a \in A^0(\mathcal O^2)$ representing local data for $\beta$. Let us first note that

\[ D_{1,2} \Delta_{1,1} b = \Delta_{2,1} D_{1,1} b = \Delta_{2,1} \Delta_{2,0} c + \Delta_{2,1} \rho = 0 , \tag{6.22} \]

where the last equality is a consequence of relations $\Delta_{2,1} \circ \Delta_{2,0} = 0$ and $\Delta_{2,1} \rho = \big( (\rho_{2,3} - \rho_{12,3} + \rho_{1,23})|_{O^2_i}, 0, 0 \big)$, and of Eq. (3.31) of Lemma 3.13. It follows that $\Delta_{1,1} b$ defines a cohomology class

\[ [\Delta_{1,1} b] \in \frac{\ker D_{1,2}}{\mathrm{Im}\, D_{0,2}} \cong H^1(\Gamma^2 \times M, U(1)) \tag{6.23} \]

that obstructs the solution of Eq. (6.21). However, since $b$ was defined up to $D_{1,1}$-cocycles $b' \in A^1(\mathcal O^1)$, the class $[\Delta_{1,1} b]$ is defined modulo the image $H^{1,2}$ of the map $[\Delta_{1,1}] : H^1(\Gamma \times M, U(1)) \to H^1(\Gamma^2 \times M, U(1))$ that sends class $[b']$ to class $[\Delta_{1,1} b']$. We obtain this way

Proposition 6.3. Let $\alpha : \mathcal G_{12} \to \mathcal I_\rho \otimes \mathcal G_2$ be a 1-isomorphism with local data $b \in A^1(\mathcal O^1)$ for a sufficiently fine family of coverings $\{\mathcal O^p\}$. Then there exists a 2-isomorphism $\beta$ for a, possibly modified, choice of 1-isomorphism $\alpha$ if and only if the obstruction class

\[ [\Delta_{1,1} b] + H^{1,2} \ \in\ H^1(\Gamma^2 \times M, U(1)) \big/ H^{1,2} \tag{6.24} \]

vanishes.

In the particular case with simply connected components of $\Gamma$ and $M$, groups $H^1(\Gamma^p \times M, U(1))$ are trivial and we obtain

Corollary 6.4. If the connected components of $\Gamma$ and $M$ are simply connected then the class (6.24) obstructing the existence of 2-isomorphism $\beta$ is trivial.

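This applies to the case of $Z$-equivariant structures on gerbes $\mathcal G_k$ over groups $\tilde G$ discussed in [26,27]. In the general situation, a more precise description of spaces $H^1(\Gamma^2 \times M, U(1)) \supset H^{1,2}$ may be provided with the help of the Universal Coefficient and Künneth Theorems. One has

\[ H^1(\Gamma \times M, U(1)) \cong H^1(\Gamma, U(1))^{\pi_0(M)} \oplus H^1(M, U(1))^{\pi_0(\Gamma)} . \tag{6.25} \]

The element $[b'] \in H^1(\Gamma \times M, U(1))$ is represented by the sequences with elements

\[ [b']_1([m]) := (\iota^1_m)^* [b'] \in H^1(\Gamma, U(1)) , \qquad [b']_2([\gamma]) := (\iota^2_\gamma)^* [b'] \in H^1(M, U(1)) , \tag{6.26} \]

where $m$, resp. $\gamma$, are chosen points in the connected components $[m] \in \pi_0(M)$, resp. $[\gamma] \in \pi_0(\Gamma)$, and $\iota^1_m : \Gamma \to \Gamma \times M$, resp. $\iota^2_\gamma : M \to \Gamma \times M$, are the injections with $\iota^1_m(\gamma) = \iota^2_\gamma(m) = (\gamma, m)$. Similarly,

\[ H^1(\Gamma^2 \times M, U(1)) \cong H^1(\Gamma, U(1))^{\pi_0(\Gamma) \times \pi_0(M)} \oplus H^1(\Gamma, U(1))^{\pi_0(\Gamma) \times \pi_0(M)} \oplus H^1(M, U(1))^{\pi_0(\Gamma)^2} . \tag{6.27} \]

An element $[d] \in H^1(\Gamma^2 \times M, U(1))$ is represented by the sequences with elements

\[ [d]_1([\gamma_2], [m]) := (\iota^1_{\gamma_2, m})^* [d] \in H^1(\Gamma, U(1)) , \tag{6.28} \]

\[ [d]_2([\gamma_1], [m]) := (\iota^2_{\gamma_1, m})^* [d] \in H^1(\Gamma, U(1)) , \tag{6.29} \]

\[ [d]_3([\gamma_1], [\gamma_2]) := (\iota^3_{\gamma_1, \gamma_2})^* [d] \in H^1(M, U(1)) , \tag{6.30} \]

where $\iota^1_{\gamma_2, m}, \iota^2_{\gamma_1, m} : \Gamma \to \Gamma^2 \times M$ and $\iota^3_{\gamma_1, \gamma_2} : M \to \Gamma^2 \times M$ are the injections with $\iota^1_{\gamma_2, m}(\gamma_1) = \iota^2_{\gamma_1, m}(\gamma_2) = \iota^3_{\gamma_1, \gamma_2}(m) = (\gamma_1, \gamma_2, m)$. Compositions of the above injections with simplicial maps $\Delta^2_q$ are

\[ \Delta^2_0 \circ \iota^1_{\gamma_2, m}(\gamma_1) = \Delta^2_0 \circ \iota^2_{\gamma_1, m}(\gamma_2) = \Delta^2_0 \circ \iota^3_{\gamma_1, \gamma_2}(m) = (\gamma_2, m) , \tag{6.31} \]

\[ \Delta^2_1 \circ \iota^1_{\gamma_2, m}(\gamma_1) = \Delta^2_1 \circ \iota^2_{\gamma_1, m}(\gamma_2) = \Delta^2_1 \circ \iota^3_{\gamma_1, \gamma_2}(m) = (\gamma_1 \gamma_2, m) , \tag{6.32} \]

\[ \Delta^2_2 \circ \iota^1_{\gamma_2, m}(\gamma_1) = \Delta^2_2 \circ \iota^2_{\gamma_1, m}(\gamma_2) = \Delta^2_2 \circ \iota^3_{\gamma_1, \gamma_2}(m) = (\gamma_1, \gamma_2 m) . \tag{6.33} \]

Since $\Delta_{1,1} = (\Delta^2_0)^* - (\Delta^2_1)^* + (\Delta^2_2)^*$, it follows that

\[ [\Delta_{1,1} b]_1([\gamma_2], [m]) = \big[ -R_{\gamma_2}^* (\iota^1_m)^* b + (\iota^1_{\gamma_2 m})^* b \big] , \tag{6.34} \]

\[ [\Delta_{1,1} b]_2([\gamma_1], [m]) = \big[ (\iota^1_m)^* b - L_{\gamma_1}^* (\iota^1_m)^* b + (\iota^2_{\gamma_1} \circ r_m)^* b \big] , \tag{6.35} \]

\[ [\Delta_{1,1} b]_3([\gamma_1], [\gamma_2]) = \big[ (\iota^2_{\gamma_2})^* b - (\iota^2_{\gamma_1 \gamma_2})^* b + \ell_{\gamma_2}^* (\iota^2_{\gamma_1})^* b \big] , \tag{6.36} \]

where $L_\gamma, R_\gamma : \Gamma \to \Gamma$ denote, respectively, the left and the right multiplication by $\gamma$, $r_m(\gamma) = \gamma m$ (as before), and we used the fact that the class in $H^1(\Gamma, U(1))$ of the pullback of an element of $A^1(\mathcal O^1)$ along a constant map is trivial. When the group $\Gamma$ is connected, we may choose its identity element as its special point and the above equations reduce to

\[ [\Delta_{1,1} b]_1([1], [m]) = 0 , \qquad [\Delta_{1,1} b]_2([1], [m]) = [(\iota^2_1 \circ r_m)^* b] , \qquad [\Delta_{1,1} b]_3([1], [1]) = [(\iota^2_1)^* b] . \tag{6.37} \]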

In the case of a $\Gamma$-space $M = G$ in the coset-model context of Definition 4.6, we may take $m = 1 \in G$ in the last formulae, which then reduce further to the relations

\[ [\Delta_{1,1} b]_1([1], [1]) = 0 , \qquad [\Delta_{1,1} b]_2([1], [1]) = 0 , \qquad [\Delta_{1,1} b]_3([1], [1]) = [(\iota^2_1)^* b] , \tag{6.38} \]

because $\iota^2_1 \circ r_1$ is a constant map. In particular, for $b = b'$ with $D_{1,1} b' = 0$,

\[ [\Delta_{1,1} b']_1([1], [1]) = 0 , \qquad [\Delta_{1,1} b']_2([1], [1]) = 0 , \qquad [\Delta_{1,1} b']_3([1], [1]) = [b']_2([1]) . \tag{6.39} \]

Since $[b']_2([1])$ runs through arbitrary elements of $H^1(M, U(1))$, it follows that the obstruction class (6.24) vanishes and, for an appropriate choice of $b'$ with $D_{1,1} b' = 0$, one has $[\Delta_{1,1}(b + b')] = 0$ so that $\Delta_{1,1}(b + b') = -D_{0,2}\, a$ for some $a \in A^0(\mathcal O^2)$. We obtain this way

Corollary 6.5. For the $\Gamma$-space $M = G$ in the coset-model context, an appropriate choice of 1-isomorphism $\alpha$ of Definition 5.1 assures the existence of 2-isomorphism $\beta$.

6.4. Obstructions to the commutativity of diagram (5.1). By Proposition 6.3, the vanishing of obstruction (6.24) guarantees in the general case that 2-isomorphism $\beta$ exists for a suitable choice of 1-isomorphism $\alpha$. In terms of local data, the condition $[\Delta_{1,1} b] \in H^{1,2}$ assures that after a modification of local data $b$ by an appropriate $D_{1,1}$-cocycle $b'$, determined up to the change $b' \mapsto b' - D_{0,1} a'$, there exists $a \in A^0(\mathcal O^2)$ such that

\[ \Delta_{1,1}(b + b') = -D_{0,2}\, a . \tag{6.40} \]

In view of the freedom of choice of $b'$, the cochain $a$ is determined up to the replacement $a \mapsto a + \Delta_{0,1} a' + a''$ for $a' \in A^0(\mathcal O^1)$ and $a'' \in \ker D_{0,2} \cong H^0(\Gamma^2 \times M, U(1))$. Cocycle $a''$ describes the possible choices of 2-isomorphism $\beta$. The commutativity of the diagram (5.1) of 2-isomorphisms of gerbes over $\Gamma^3 \times M$ is now equivalent to the condition that, after the restriction to a sufficiently fine simplicial sequence of coverings,

\[ \Delta_{0,2}\, a = 0 . \tag{6.41} \]

Note that, in any case,

\[ D_{0,3} \Delta_{0,2}\, a = \Delta_{1,2} D_{0,2}\, a = -\Delta_{1,2} \Delta_{1,1}(b + b') = 0 \tag{6.42} \]

so that $\Delta_{0,2}\, a \in \ker D_{0,3} \cong H^0(\Gamma^3 \times M, U(1))$. Let us denote by $H^{0,3}$ the image of the map $[\Delta_{0,2}] : H^0(\Gamma^2 \times M, U(1)) \to H^0(\Gamma^3 \times M, U(1))$ that sends $a''$ to $\Delta_{0,2}\, a''$. Using the freedom in the choice of the cochain $a$ and the relation

\[ \Delta_{0,2}(a + \Delta_{0,1} a' + a'') = \Delta_{0,2}\, a + \Delta_{0,2}\, a'' , \tag{6.43} \]

we infer

Proposition 6.6. 2-isomorphism $\beta$ may be chosen so that the diagram (5.1) of Definition 5.1 is commutative if and only if the obstruction class

\[ \Delta_{0,2}\, a + H^{0,3} \ \in\ H^0(\Gamma^3 \times M, U(1)) \big/ H^{0,3} \tag{6.44} \]

vanishes.

Elements $f^p \in H^0(\Gamma^p \times M, U(1))$ are locally constant $U(1)$-valued functions on $\Gamma^p \times M$. One may identify them with $p$-chains $v^p$ on the group $\pi_0(\Gamma)$ with values in the $\pi_0(\Gamma)$-module $U(1)^{\pi_0(M)} \cong H^0(M, U(1))$ of $U(1)$-valued functions on $\pi_0(M)$, where the action of $\pi_0(\Gamma)$ on $U(1)^{\pi_0(M)}$ is induced from the action of $\Gamma$ on $M$. If the identification is done by the formula

\[ f^p(\gamma_1, \dots, \gamma_p, m) = v^p_{[\gamma_p^{-1}], \dots, [\gamma_1^{-1}]}([m]) \tag{6.45} \]

then the induced maps $[\Delta_{0,p}] : H^0(\Gamma^p \times M, U(1)) \to H^0(\Gamma^{p+1} \times M, U(1))$ become the coboundary operators $\delta^p$ of the group $\pi_0(\Gamma)$ cohomology:

\[ (\Delta_{0,p} f^p)(\gamma_1, \dots, \gamma_p, \gamma_{p+1}, m) = (-1)^{p+1} (\delta^p v^p)_{[\gamma_{p+1}^{-1}], [\gamma_p^{-1}], \dots, [\gamma_1^{-1}]}([m]) . \tag{6.46} \]
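To make the coboundary operators $\delta^p$ in (6.46) concrete, one may sketch them for a finite cyclic group. The code below is a hedged illustration under simplifying assumptions of our own: the group is $\mathbb Z_n$, the target $M$ is taken connected so that the coefficient module reduces to $U(1)$ with trivial action, phases are written additively as rationals modulo 1, and the standard inhomogeneous convention is used rather than the inverted ordering of (6.45). The names `coboundary` and `standard_3_cocycle` are hypothetical helpers; the 3-cocycle shown is the familiar representative of a class in $H^3(\mathbb Z_n, U(1))$.

```python
import random
from fractions import Fraction
from itertools import product


def coboundary(v, p, n):
    """Group coboundary of a p-cochain v on Z_n (trivial action, values in Q/Z)."""
    def dv(*g):
        total = v(*g[1:])                                   # drop g_1
        for i in range(1, p + 1):                           # merge g_i, g_{i+1}
            merged = g[:i - 1] + ((g[i - 1] + g[i]) % n,) + g[i + 1:]
            total += (-1) ** i * v(*merged)
        total += (-1) ** (p + 1) * v(*g[:-1])               # drop g_{p+1}
        return total % 1
    return dv


def standard_3_cocycle(n, k=1):
    """Phase k*a*(b + c - [b + c]_n)/n^2 mod 1, a 3-cocycle on Z_n."""
    def v(a, b, c):
        return Fraction(k * a * (b + c - (b + c) % n), n * n) % 1
    return v


def is_cocycle(v, p, n):
    """True if the coboundary of the p-cochain v vanishes identically."""
    dv = coboundary(v, p, n)
    return all(dv(*g) == 0 for g in product(range(n), repeat=p + 1))


def random_1_cochain(n, seed=0):
    """A random Q/Z-valued function on Z_n, used to test that delta squares to 0."""
    rng = random.Random(seed)
    values = [Fraction(rng.randint(0, 11), 12) for _ in range(n)]
    return lambda a: values[a]
```

For instance, `is_cocycle(standard_3_cocycle(4), 3, 4)` checks the cocycle condition on all quadruples, and `is_cocycle(coboundary(random_1_cochain(4), 1, 4), 2, 4)` checks that $\delta^2 \circ \delta^1 = 0$. Exact rational arithmetic is used to avoid floating-point comparisons of phases.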

Corollary 6.7. Under identification (6.45), the cochain $\Delta_{0,2}\, a$ generates a 3-cocycle $v^3$ of the group $\pi_0(\Gamma)$ taking values in $U(1)^{\pi_0(M)}$ and the obstruction coset (6.44) is the cohomology class $[v^3] \in H^3(\pi_0(\Gamma), U(1)^{\pi_0(M)})$.

In particular, when $\Gamma$ is discrete and $M$ is connected, then $[v^3] \in H^3(\Gamma, U(1))$. That is the situation for the $Z$-equivariant structures on gerbes $\mathcal G_k$ over groups $\tilde G$ discussed in [26,27] and mentioned already above. The obstruction cohomology classes $[v^3] \in H^3(Z, U(1))$ were computed for these cases and simple $\tilde G$ in [24]. Since the cohomology groups $H^p(\pi_0(\Gamma), U(1)^{\pi_0(M)})$ for $p > 1$ are trivial if $\pi_0(\Gamma)$ is a trivial group, we obtain

Corollary 6.8. If the symmetry group $\Gamma$ is connected and 2-isomorphism $\beta$ of Definition 5.1 exists, then it may always be chosen so that the diagram (5.1) commutes.

Putting together Proposition 4.8 and Corollaries 6.5 and 6.8, we summarize the results for the situation discussed in Sect. 4.2:

Theorem 6.9. For a $\Gamma$-space $M = G$ in the coset-model context of Definition 4.6, a $\Gamma$-equivariant structure on the WZW gerbe $\mathcal G_k$ over $G$ exists if and only if the global-anomaly phases (4.22) are trivial, as, e.g., for $G = \tilde G$.

6.5. Classification of equivariant structures. Suppose now that we are given two equivariant structures $(\alpha_i, \beta_i)$, $i = 1, 2$, on gerbe $\mathcal G$ with local data $c \in A^2(\mathcal O^0)$, $D_{2,0}\, c = 0$, for a sufficiently fine simplicial sequence of coverings $\{\mathcal O^p\}$. Their local data are $(b_i, a_i)$, with $b_i \in A^1(\mathcal O^1)$ and $a_i \in A^0(\mathcal O^2)$, that satisfy

\[ \Delta_{2,0}\, c + \rho = D_{1,1} b_i , \qquad \Delta_{1,1} b_i = -D_{0,2}\, a_i , \qquad \Delta_{0,2}\, a_i = 0 . \tag{6.47} \]

The difference $(b', a') = (b_2 - b_1, a_2 - a_1)$ gives local data for a $\Gamma$-equivariant structure on the trivial gerbe $\mathcal I_0$ (relative to $\rho = 0$). It satisfies the homogeneous equations

\[ D_{1,1} b' = 0 , \qquad \Delta_{1,1} b' = -D_{0,2}\, a' , \qquad \Delta_{0,2}\, a' = 0 . \tag{6.48} \]

There is an isomorphism $(\chi, \eta)$ between the equivariant structures $(\alpha_i, \beta_i)$ if there exist: a cocycle $e \in A^1(\mathcal O^0)$, $D_{1,0}\, e = 0$ (providing local data for 1-isomorphism $\chi : \mathcal G \to \mathcal G$) and a cochain $f \in A^0(\mathcal O^1)$ (giving local data for 2-isomorphism $\eta$) such that

\[ b' = \Delta_{1,0}\, e + D_{0,1} f , \qquad a' = -\Delta_{0,1} f . \tag{6.49} \]

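These identities represent the definition of $\eta$ and the commutativity of diagram (5.3), respectively. They imply Eqs. (6.48). Classes of solutions to Eqs. (6.48) modulo solutions to Eqs. (6.49) form the 2nd hypercohomology group $\mathbb H^2(J)$ of the double complex $J$

\[
\begin{array}{ccccc}
0 & \longrightarrow & A^0(\mathcal O^3) & \xrightarrow{\ D_{0,3}\ } & \ker D_{1,3} \\
 & & \big\uparrow {\scriptstyle \Delta_{0,2}} & & \big\uparrow {\scriptstyle \Delta_{1,2}} \\
0 & \longrightarrow & A^0(\mathcal O^2) & \xrightarrow{\ D_{0,2}\ } & \ker D_{1,2} \\
 & & \big\uparrow {\scriptstyle \Delta_{0,1}} & & \big\uparrow {\scriptstyle \Delta_{1,1}} \\
0 & \longrightarrow & A^0(\mathcal O^1) & \xrightarrow{\ D_{0,1}\ } & \ker D_{1,1} \\
 & & \big\uparrow {\scriptstyle \Delta_{0,0}} & & \big\uparrow {\scriptstyle \Delta_{1,0}} \\
0 & \longrightarrow & A^0(\mathcal O^0) & \xrightarrow{\ D_{0,0}\ } & \ker D_{1,0}
\end{array}
\tag{6.50}
\]

obtained from the double complex $K = (A^n(\mathcal O^p))$. $\mathbb H^2(J)$ is the group of isomorphism classes of $\Gamma$-equivariant structures on the trivial gerbe $\mathcal I_0$. It acts freely and transitively on the set of isomorphism classes of $\Gamma$-equivariant structures on gerbe $\mathcal G$. In other words,

Proposition 6.10. The set of isomorphism classes of $\Gamma$-equivariant structures on gerbe $\mathcal G$ is a torsor for the Abelian group $\mathbb H^2(J)$.

Denote by $H^{1,1}$ the image of the map $[\Delta_{1,0}] : H^1(M, U(1)) \to H^1(\Gamma \times M, U(1))$ that sends class $[e]$ to class $[\Delta_{1,0}\, e]$. In terms of the decomposition (6.25) and (6.26),

\[ [\Delta_{1,0}\, e]_1([m]) = -[r_m^* e] , \qquad [\Delta_{1,0}\, e]_2([\gamma]) = [e] - [\ell_\gamma^* e] . \tag{6.51} \]

Since $b'$ is a $D_{1,1}$-cocycle, one may consider the map

\[ (b', a') \longmapsto [b'] + H^{1,1} \ \in\ H^1(\Gamma \times M, U(1)) \big/ H^{1,1} . \tag{6.52} \]

Since $[b'] \in H^{1,1}$ for $(b', a')$ of the form (6.49), the map (6.52) induces a homomorphism

\[ \kappa : \mathbb H^2(J) \longrightarrow H^1(\Gamma \times M, U(1)) \big/ H^{1,1} \tag{6.53} \]

of Abelian groups. To describe the image and the kernel of $\kappa$, we shall do some tracing of diagrams.

If $[b''] + H^{1,1}$ is in the image of $\kappa$, then $b'' = b' + \Delta_{1,0}\, e + D_{0,1} f$ for some $(b', a')$ as above, some $e \in A^1(\mathcal O^0)$ with $D_{1,0}\, e = 0$, and some $f \in A^0(\mathcal O^1)$. Consequently, $\Delta_{1,1} b'' = -D_{0,2}\, a' + \Delta_{1,1} D_{0,1} f = -D_{0,2}(a' - \Delta_{0,1} f)$ so that $[\Delta_{1,1}][b''] = 0$. For any $[b'']$ that satisfies the latter equation, i.e. such that $\Delta_{1,1} b'' = -D_{0,2}\, a''$ for some $a'' \in A^0(\mathcal O^2)$, we have $D_{0,3} \Delta_{0,2}\, a'' = \Delta_{1,2} D_{0,2}\, a'' = 0$, hence $\Delta_{0,2}\, a'' \in H^0(\Gamma^3 \times M, U(1))$ and it generates, via Eq. (6.45), a 3-cocycle $v^3$ on group $\pi_0(\Gamma)$ with values in $U(1)^{\pi_0(M)}$. If $[b''] + H^{1,1}$ is in the image of $\kappa$, then, for $a''' := a' - \Delta_{0,1} f - a''$, we have $D_{0,2}\, a''' = 0$ so that $a''' \in H^0(\Gamma^2 \times M, U(1))$ generates, again via Eq. (6.45), a 2-cochain $u^2$ on $\pi_0(\Gamma)$ with values in $U(1)^{\pi_0(M)}$. The relation $\Delta_{0,2}\, a''' = -\Delta_{0,2}\, a''$ implies then that $\delta^2 u^2 = v^3$ so that the cohomology class $[v^3] \in H^3(\pi_0(\Gamma), U(1)^{\pi_0(M)})$ vanishes. Conversely, if this is the case, then $\Delta_{0,2}\, a'' = -\Delta_{0,2}\, a'''$ for some $a''' \in H^0(\Gamma^2 \times M, U(1))$ so that, for $a' = a'' + a'''$, one has $\Delta_{1,1} b'' = -D_{0,2}\, a'$ and $\Delta_{0,2}\, a' = 0$. We have proven this way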

Lemma 6.11. $[b''] + H^{1,1}$ is in the image of $\kappa$ if and only if
1. $[\Delta_{1,1}][b''] = 0$ so that $\Delta_{1,1} b'' = -D_{0,2}\, a''$,
2. the cohomology class $[v^3] \in H^3(\pi_0(\Gamma), U(1)^{\pi_0(M)})$ of the 3-cocycle $v^3$ corresponding, via Eq. (6.45), to $\Delta_{0,2}\, a'' \in H^0(\Gamma^3 \times M, U(1))$ vanishes.

Now, let us study the kernel of $\kappa$. If $[b'] \in H^{1,1}$, i.e. $b' = \Delta_{1,0}\, e + D_{0,1} f$ for $e \in A^1(\mathcal O^0)$ with $D_{1,0}\, e = 0$ and $f \in A^0(\mathcal O^1)$, then $\Delta_{1,1} b' = \Delta_{1,1} D_{0,1} f = D_{0,2} \Delta_{0,1} f$ so that

\[ a' + \Delta_{0,1} f \ \in\ \ker D_{0,2} \cong H^0(\Gamma^2 \times M, U(1)) . \tag{6.54} \]

Since $\Delta_{0,2}(a' + \Delta_{0,1} f) = 0$, the cochain $a' + \Delta_{0,1} f$ may be identified, by means of Eq. (6.45), with a 2-cocycle $v^2$ on group $\pi_0(\Gamma)$ with values in $U(1)^{\pi_0(M)}$. All 2-cocycles $v^2$ may be obtained this way by changing $a'$ to $a' + a''$ with $D_{0,2}\, a'' = 0 = \Delta_{0,2}\, a''$. Since $f$ is defined modulo $f' \in \ker D_{0,1} \cong H^0(\Gamma \times M, U(1))$, 2-cocycle $v^2$ is defined modulo coboundaries of the 1-cochains $u^1$ corresponding to $f'$ so that the cohomology class $[v^2] \in H^2(\pi_0(\Gamma), U(1)^{\pi_0(M)})$ is well defined by the pair $(b', a')$ with $[b'] \in H^{1,1}$. The class $[v^2]$ vanishes if and only if $(b', a')$ is of the form (6.49). This shows

Lemma 6.12. The kernel of the map $\kappa$ of (6.53) may be identified with the cohomology group $H^2(\pi_0(\Gamma), U(1)^{\pi_0(M)})$.

Let us look at some special cases. First, if $H^1(\Gamma, U(1)) = \{0\} = H^1(M, U(1))$, then the homomorphism $\kappa$ vanishes and we obtain from Lemma 6.12:

Corollary 6.13. In the case when the connected components of $\Gamma$ and $M$ are simply connected, $\mathbb H^2(J) \cong H^2(\pi_0(\Gamma), U(1)^{\pi_0(M)})$.

This is the result that gives, e.g., the classification of $Z$-equivariant structures on gerbe $\mathcal G_k$ over $\tilde G$ for $Z \subset \tilde Z$ acting by multiplication, see [24,26,27].

Suppose now that $\Gamma$ is connected so that $\Gamma = \tilde\Gamma / \tilde Z$, where $\tilde\Gamma$ is a simply connected Lie group and $\tilde Z$ is a subgroup of the center of $\tilde\Gamma$. One has $H_1(\Gamma) \cong \pi_1(\Gamma) \cong \tilde Z$. Lemma 6.12 implies in that case that $\kappa$ is injective onto its image which, by Lemma 6.11 and Eq. (6.37), is composed of the cosets $[b'] + H^{1,1}$ such that $[b']_2([1]) = 0$ in the decomposition (6.25) and (6.26). From the explicit form (6.51) of $H^{1,1}$, we then infer

Corollary 6.14. If the group $\Gamma$ and manifold $M$ are connected, then

\[ \mathbb H^2(J) \cong H^1(\Gamma, U(1)) \big/ [r_m^*]\big(H^1(M, U(1))\big) \cong Z_M^* , \tag{6.55} \]

where $Z_M^*$ is the group of characters of the kernel $Z_M$ of the homomorphism from $H_1(\Gamma)$ to $H_1(M)$ induced by the map $r_m : \gamma \mapsto \gamma m$.

In particular, we have

Corollary 6.15. For the $\Gamma$-space $M = G$ in the coset-model context, see Definition 4.6, $Z_M = \tilde Z$ so that

\[ \mathbb H^2(J) \cong H^1(\Gamma, U(1)) \cong \tilde Z^* \tag{6.56} \]

and the $\Gamma$-equivariant structures on the WZW gerbes $\mathcal G_k$ over $G$ are classified by the group of characters of $H_1(\Gamma) \cong \pi_1(\Gamma) \cong \tilde Z$.
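As a small illustration of the classifying group of characters in (6.56), one may enumerate the characters of a finite cyclic group. This is only a sketch under an assumption of ours, namely that $\tilde Z$ is cyclic of order $N$ (as for a subgroup of the center $\mathbb Z_N$ of $SU(N)$); the helper names `characters` and `is_character` are hypothetical, and phases are stored additively as rationals modulo 1.

```python
from fractions import Fraction


def characters(N):
    """All characters of Z_N, each as the list [chi(0), ..., chi(N-1)] in Q/Z.

    The j-th character sends z to j*z/N mod 1, i.e. to exp(2*pi*i*j*z/N).
    """
    return [[Fraction(j * z, N) % 1 for z in range(N)] for j in range(N)]


def is_character(chi):
    """Check the homomorphism property chi(a + b) = chi(a) + chi(b) mod 1."""
    N = len(chi)
    return all(
        chi[(a + b) % N] == (chi[a] + chi[b]) % 1
        for a in range(N)
        for b in range(N)
    )


# Hypothetical example: Z~ = Z_3 gives three distinct characters.
chars = characters(3)
```

Under this assumption the number of inequivalent equivariant structures equals the order of $\tilde Z$, since a finite cyclic group has as many characters as elements.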

Let us analyze more closely the case when the $\Gamma$-space $M$ is a (left) principal $\Gamma$-bundle $\omega : M \to M'$. By the descent Theorem 5.3, each $\Gamma$-equivariant structure on the trivial gerbe $\mathcal I_0$ relative to the vanishing 2-form descends to a flat gerbe on $M'$ whose pullback to $M$ is 1-isomorphic to $\mathcal I_0$. Passing to isomorphism classes, one obtains the canonical injective homomorphism

\[ \nu : \mathbb H^2(J) \longrightarrow H^2(M', U(1)) \tag{6.57} \]

that maps into the kernel of the pullback map $[\omega^*] : H^2(M', U(1)) \to H^2(M, U(1))$. Now, suppose that we are given a flat gerbe on $M'$ whose class is in the kernel of $[\omega^*]$. It is easy to see, using Theorem 5.3 and Remark 5.2(2), that such a gerbe is 1-isomorphic to a gerbe that descends from the trivial gerbe $\mathcal I_0$ equipped with a $\Gamma$-equivariant structure (relative to $\rho = 0$). This shows that $\nu$ maps onto the kernel of $[\omega^*]$. We obtain this way

Corollary 6.16. In the case when $M$ is a principal $\Gamma$-bundle, there is an exact sequence of Abelian groups

\[ 0 \longrightarrow \mathbb H^2(J) \xrightarrow{\ \nu\ } H^2(M', U(1)) \xrightarrow{\ [\omega^*]\ } H^2(M, U(1)) \tag{6.58} \]

that induces an isomorphism between $\mathbb H^2(J)$ and the kernel of $[\omega^*]$ in $H^2(M', U(1))$.

If $\Gamma$ and $M$ are connected, then the exact sequence (6.58) is induced, by virtue of Corollary 6.14, by the cohomology exact sequence for the $\Gamma$-bundle $\omega : M \to M'$ [10,49]

\[ H^1(M, U(1)) \xrightarrow{\ [r_m^*]\ } H^1(\Gamma, U(1)) \xrightarrow{\ \tau\ } H^2(M', U(1)) \xrightarrow{\ [\omega^*]\ } H^2(M, U(1)) . \tag{6.59} \]

The middle arrow $\tau$ may be easily described in terms of the classifying space $B\Gamma$ of group $\Gamma$. The transgression map $H^2(B\Gamma, U(1)) \to H^1(\Gamma, U(1))$ is an isomorphism for connected $\Gamma$. Its composition with $\tau$ is given by the pullback map from $H^2(B\Gamma, U(1))$ to $H^2(M', U(1))$ along the classifying map $f_\omega : M' \to B\Gamma$ for the principal bundle $\omega : M \to M'$. In Appendix 6, we describe an equivalent construction of homomorphism $\tau$. That construction, carried out in terms of line bundles and gerbes, will be used below.

6.6. Ambiguity of gauged amplitudes. Let us recall from Sect. 5.2 how the WZ amplitudes coupled to a topologically non-trivial gauge field $\mathcal A$ in the principal $\Gamma$-bundle $P$ over the worldsheet $\Sigma$ were defined. They were given by the holonomy of gerbe $\mathcal G_{\mathcal A}$ over the associated bundle $P_M = P \times_\Gamma M$, see Definition 5.7. That gerbe was obtained via Theorem 5.3 from gerbe $\tilde{\mathcal G}_{\mathcal A} = \mathcal I_{\tilde\rho_{\mathcal A}} \otimes \mathcal G_2$ over $\tilde M = P \times M$ equipped with a $\Gamma$-equivariant structure (relative to $\rho = 0$) induced from that of $\mathcal G$. Let us use the subscript $M$ or $\tilde M$ to distinguish between the two cases of $\Gamma$-spaces. If we change the isomorphism class of a $\Gamma$-equivariant structure on $\mathcal G$ by a class $K \in \mathbb H^2(J_M)$, then a quick inspection of the proof of Proposition 5.5 shows that the isomorphism class of the induced $\Gamma$-equivariant structure on $\tilde{\mathcal G}_{\mathcal A}$ changes by the class $K_2 \in \mathbb H^2(J_{\tilde M})$ obtained by the pullbacks along the projection $\mathrm{pr}_2$ of $P \times M$ on the second factor. The isomorphism class of the descended gerbe $\mathcal G_{\mathcal A}$ changes then, according to the discussion from Sect. 6.5, by

\[ \nu_{\tilde M}(K_2) \in H^2(P_M, U(1)) \cong \mathrm{Hom}(H_2(P_M), U(1)) . \tag{6.60} \]

Global Gauge Anomalies in 2-D Bosonic Sigma Models

555

Viewed as a character of H2 (PM ), class ν M˜ (K 2 ) describes the change of the holonomy of the gerbe GA . We obtain this way Corollary 6.17. Under the change of the isomorphism class of a -equivariant structure on gerbe G over M by a class K ∈ H2 (J M ), the WZ amplitude (5.23) of a section / PM of the associated bundle is multiplied by the U (1) phase [], ν ˜ (K 2 ) , : M where [] denotes the homology class of . Remark 6.18. The dependence of the gauged WZ amplitudes on the choice of an equivariant structure is another manifestation of the phenomenon of “discrete torsion” [53]. In the particular situation where manifolds , M and are connected, Corollary 6.14 implies that H2 (J M ) ∼ = Z ∗M , H2 (J M˜ ) ∼ = Z ∗M˜ .

(6.61)

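We shall denote by $\chi_K$ the character of $Z_M$ corresponding to $K \in \mathbb{H}^2(J_M)$ and by $\chi_{\tilde K}$ the one of $Z_{\tilde M}$ corresponding to $\tilde K \in \mathbb{H}^2(J_{\tilde M})$. The relation $\mathrm{pr}_2 \circ r_{(p,m)} = r_m$ for $(p, m) \in \tilde M$ implies the inclusion $Z_{\tilde M} \subset Z_M$. The map $Z_M^* \ni \chi_K \mapsto \chi_{K_2} \in Z_{\tilde M}^*$ is now given by the restriction of the characters, whereas the homomorphism $\nu_{\tilde M}$ is induced by the map $\tau_{\tilde M} : H^1(\Gamma, U(1)) \to H^2(P_M, U(1))$ of the exact sequence (6.59). The problem of ambiguities of the gauged WZ amplitudes may be completely settled in this case employing a construction of homomorphism $\tau_{\tilde M}$ along the lines of Appendix 6 and an explicit description of principal $\Gamma$-bundles over $\Sigma$ [33]. Up to isomorphism, such bundles may be obtained by gluing $D \times \Gamma$ and $(\Sigma \setminus \dot D) \times \Gamma$, where $D$ is a closed unit disc embedded into $\Sigma$ and $\dot D$ its interior, via the identification

\[ D \times \Gamma \ni (e^{\mathrm{i}\sigma}, \gamma(e^{\mathrm{i}\sigma})\gamma) = (e^{\mathrm{i}\sigma}, \gamma) \in (\Sigma \setminus \dot D) \times \Gamma \tag{6.62} \]

for a transition loop $S^1 \ni e^{\mathrm{i}\sigma} \mapsto \gamma(e^{\mathrm{i}\sigma}) \in \Gamma$ that we assume based at the unit element: $\gamma(1) = 1$. The $\Gamma$-bundle $P$ depends, up to isomorphism, only on the element $z_P \in \tilde Z \cong \pi_1(\Gamma)$ corresponding to the homotopy class of the transition loop. The associated bundle $P_M$ is then obtained by gluing $D \times M$ and $(\Sigma \setminus \dot D) \times M$ via the identification

\[ D \times M \ni (e^{\mathrm{i}\sigma}, \gamma(e^{\mathrm{i}\sigma})m) = (e^{\mathrm{i}\sigma}, m) \in (\Sigma \setminus \dot D) \times M. \tag{6.63} \]

A global section $\Phi : \Sigma \to P_M$ is given by two maps

\[ D \ni x \mapsto \phi_1(x) \in M \quad\text{and}\quad (\Sigma \setminus \dot D) \ni x \mapsto \phi_2(x) \in M \tag{6.64} \]

such that

\[ \phi_1(e^{\mathrm{i}\sigma}) = \gamma(e^{\mathrm{i}\sigma})\,\phi_2(e^{\mathrm{i}\sigma}). \tag{6.65} \]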

According to Appendix 6, the homomorphism $\tau_{\tilde M}$, mapping $H^1(\Gamma, U(1)) \cong \tilde Z^*$ to $H^2(P_M, U(1))$, associates to a character $\chi \in \tilde Z^*$ the 1-isomorphism class of a flat gerbe $\mathcal{G}_\chi$ on $P_M$. Consequently, the phase $\langle [\Phi], \nu_{\tilde M}(K_2) \rangle$ is equal to the holonomy $\mathrm{Hol}_{\mathcal{G}_\chi}(\Phi)$ for a character $\chi$ of $\tilde Z$ extending $\chi_K$. The flat gerbe $\mathcal{G}_\chi$ may be trivialized over $D \times M$ and $(\Sigma \setminus \dot D) \times M$. It is then given by a transition line bundle [32] over $S^1 \times M$ obtained by pulling back the flat line bundle $L_\chi$ over $\Gamma$, described in Appendix 6, along the map

\[ S^1 \times M \ni (e^{\mathrm{i}\sigma}, m) \mapsto \gamma(e^{\mathrm{i}\sigma}) \in \Gamma. \tag{6.66} \]

Using such a presentation of gerbe $\mathcal{G}_\chi$, it is easy to see from the geometric definition of the holonomy of gerbes, see, e.g., [23], that the phase $\mathrm{Hol}_{\mathcal{G}_\chi}(\Phi)$ is given by the holonomy of the loop $e^{\mathrm{i}\sigma} \mapsto \gamma(e^{\mathrm{i}\sigma})$ in the line bundle $L_\chi$ over $\Gamma$. The latter is equal to the value of the character $\chi$ on the element $z_P \in \tilde Z$. The above phase should be independent of the extension $\chi$ of the character $\chi_K$ from the subgroup $Z_M$ to $\tilde Z$. This does not seem evident. Here is the resolution of the puzzle. Let $\phi_i$ be the maps representing section $\Phi$ of $P_M$. As a boundary value of a map from the disc to $M$, the 1-cycle

\[ S^1 \ni e^{\mathrm{i}\sigma} \mapsto \phi_2(e^{\mathrm{i}\sigma}) \in M \tag{6.67} \]

is homologous to a constant 1-cycle. Hence the 1-cycle

\[ S^1 \ni e^{\mathrm{i}\sigma} \mapsto \phi_1(e^{\mathrm{i}\sigma}) = \gamma(e^{\mathrm{i}\sigma})\,\phi_2(e^{\mathrm{i}\sigma}), \tag{6.68} \]

which is a boundary value of a map from $\Sigma \setminus \dot D$ to $M$ and, as such, has a trivial class in $H_1(M)$, is homologous to

\[ S^1 \ni e^{\mathrm{i}\sigma} \mapsto \gamma(e^{\mathrm{i}\sigma})\,m \tag{6.69} \]

for any point $m \in M$. But the triviality of the class in $H_1(M)$ of the latter 1-cycle is just the condition that $z_P \in Z_M$. Note that in the coset context, there always exists a section $\Phi \equiv 1$ of the associated bundle given by $\phi_i \equiv 1$. In that case $z_P$ always belongs to $Z_M = \tilde Z$, see Corollary 6.15. We may summarize the above discussion in

Theorem 6.19. Let $\Sigma$, $M$, and $\Gamma$ be connected and $P$ be the principal $\Gamma$-bundle over $\Sigma$ associated to $z_P \in \pi_1(\Gamma)$. Then
1. if $z_P \notin Z_M$ then there are no global sections of the associated bundle $P_M$;
2. for any global section $\Phi : \Sigma \to P_M$,

\[ \langle [\Phi], \nu_{\tilde M}(K_2) \rangle = \chi_K(z_P) \in U(1). \tag{6.70} \]

Corollary 6.20. Under the same assumptions, a change of the isomorphism class of a $\Gamma$-equivariant structure on gerbe $\mathcal{G}$ over $M$ by a class $K \in \mathbb{H}^2(J_M)$ identified with a character $\chi_K \in Z_M^*$ multiplies the WZ amplitude (5.23) of a section $\Phi : \Sigma \to P_M$ by $\chi_K(z_P)$. In particular, if $P$ is trivializable then $z_P = 1$ and the gauged WZ amplitudes are independent of the choice of a $\Gamma$-equivariant structure (and may be defined in more general circumstances discussed in the first part of the paper).
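As a minimal illustration of Corollary 6.20, one may take $Z_M = \mathbb{Z}_2$, which is the situation met in Sect. 6.7. The table below is only a sketch: writing the two group elements as $\pm 1$ and naming the characters $\chi_{\rm triv}$ and $\chi_{\rm sgn}$ are conventions chosen here, not notation from the paper.

```latex
% The two characters of Z_2 = {1, -1}
\chi_{\rm triv}(z) = 1, \qquad \chi_{\rm sgn}(z) = z, \qquad z \in \{1, -1\}.

% Factor chi_K(z_P) multiplying the WZ amplitude of a section
\begin{array}{c|cc}
                              & z_P = 1 & z_P = -1 \\ \hline
\chi_K = \chi_{\rm triv}      & 1       & 1        \\
\chi_K = \chi_{\rm sgn}       & 1       & -1
\end{array}
```

In this example only the amplitudes in the sector with a non-trivializable bundle ($z_P = -1$) can change, and then only by a sign.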


6.7. Fixed-point resolved coset partition functions. In [33], K. Hori studied an example of the coset theory based on the WZW model with simply connected group $\tilde G = SU(2) \times SU(2)$ at level $(k, 2)$ and with gauged adjoint action of $\Gamma = \mathrm{diag}(SU(2))/\mathrm{diag}(\mathbb{Z}_2)$. Let $b_{(j_1,j_2),j}(\tau)$ denote the corresponding branching functions for spins $j_1 = 0, \frac12, \dots, \frac{k}{2}$, $j_2 = 0, \frac12, 1$, and $j = 0, \frac12, \dots, \frac{k}{2}+1$. As a consequence of Eqs. (4.56) and (4.57), functions $b_{(j_1,j_2),j}(\tau)$ vanish if $j_1 + j_2 - j$ is not an integer and are unchanged by the joint spectral flow

\[ (j_1, j_2, j) \mapsto \Big( \tfrac{k}{2} - j_1,\ 1 - j_2,\ \tfrac{k+2}{2} - j \Big). \tag{6.71} \]
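The fixed points of the flow (6.71) can be found by a direct computation. The lines below are a short check that uses only the spin ranges and the integrality condition quoted above.

```latex
\begin{align*}
% Fixed-point condition for the flow (6.71)
(j_1, j_2, j) = \Big(\tfrac{k}{2} - j_1,\ 1 - j_2,\ \tfrac{k+2}{2} - j\Big)
  &\iff j_1 = \tfrac{k}{4},\quad j_2 = \tfrac12,\quad j = \tfrac{k+2}{4}, \\
% Spins are half-integers, so a fixed point requires k even
\tfrac{k}{4} \in \tfrac12\mathbb{Z} &\iff k \in 2\mathbb{Z}, \\
% The fixed point satisfies the integrality condition on j_1 + j_2 - j
j_1 + j_2 - j = \tfrac{k}{4} + \tfrac12 - \tfrac{k+2}{4} &= 0 \in \mathbb{Z}.
\end{align*}
```

Applying the flow twice returns the original triple, so every orbit has one or two elements, and a one-point orbit exists only for even $k$, in which case it is unique.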

It follows from Eq. (4.60) for $G = \tilde G$ that the contribution to the coset partition function of the sector with topologically trivial gauge fields is equal to

\[ Z_{\rm triv} = \sum_{[(j_1,j_2),j]} \frac{1}{|S_{[(j_1,j_2),j]}|}\, |b_{(j_1,j_2),j}(\tau)|^2, \tag{6.72} \]

where $[(j_1,j_2),j]$ runs through the orbits of the spectral flow (6.71) and $S_{[(j_1,j_2),j]}$ denotes the corresponding stabilizer subgroups of $\tilde Z$. $|S_{[(j_1,j_2),j]}| = 1$ for the two-point orbits and $|S_{[(j_1,j_2),j]}| = 2$ for the one-point ones composed of fixed points of the spectral flow. For $k$ odd, one always has $|S_{[(j_1,j_2),j]}| = 1$. In that case, the contribution $Z_{\rm ntriv}(\tau)$ of the sector with topologically non-trivial gauge fields to the partition function vanishes [33]. For $k$ even, however, there is one fixed point orbit $[(\frac{k}{4}, \frac12), \frac{k+2}{4}]$ with $|S_{[(j_1,j_2),j]}| = 2$. Using the supersymmetry present in the above coset model, Hori showed the decomposition

\[ b_{(\frac{k}{4},\frac12),\frac{k+2}{4}}(\tau) = b^+(\tau) + b^-(\tau) \quad\text{with}\quad b^+(\tau) - b^-(\tau) = 1, \tag{6.73} \]

where $b^\pm(\tau)$ collects the contribution to $b_{(\frac{k}{4},\frac12),\frac{k+2}{4}}(\tau)$ of states with $(-1)^F = \pm 1$ for $F$ the fermion number. In the terminology of [16], decomposition (6.73) gives the resolution of the fixed-point branching function. Further analysis in [33] established that $Z_{\rm ntriv}(\tau)$ is $\tau$-independent and postulated the equality $Z_{\rm ntriv} = \frac12$. In that case, by Eq. (6.73) and the identity $|a+b|^2 + |a-b|^2 = 2|a|^2 + 2|b|^2$, the sum of the fixed point contribution to $Z_{\rm triv}(\tau)$ and of $Z_{\rm ntriv}$ gives

\[ \tfrac12\, |b^+(\tau) + b^-(\tau)|^2 + \tfrac12 = |b^+(\tau)|^2 + |b^-(\tau)|^2, \tag{6.74} \]

which is the diagonal sum of the resolved fixed-point branching functions, as proposed in [16]. On the other hand, Hori argued that a different $\theta$-vacuum of the coset theory should lead to $Z_{\rm ntriv} = -\frac12$. In the latter case, one obtains

\[ \tfrac12\, |b^+(\tau) + b^-(\tau)|^2 - \tfrac12 = b^+(\tau)\,\overline{b^-(\tau)} + b^-(\tau)\,\overline{b^+(\tau)}, \tag{6.75} \]

i.e. a non-diagonal combination of the resolved fixed-point branching functions. The latter choice was not discussed in [16]. Since $\tilde Z = \mathbb{Z}_2 = Z_M$ in the present case, it follows from Corollary 6.20 that the sign ambiguity of $Z_{\rm ntriv}$ is due to the freedom of choice of an $SO(3)$-equivariant structure on gerbe $\mathcal{G}_{(k,2)}$ over $SU(2) \times SU(2)$.

Based on the analysis of [16], one may generalize the above discussion and conjecture explicit expressions for the contributions $Z_P^{G/\Gamma}(\tau)$ to the coset partition functions of gauge fields in principal $\Gamma$-bundles $P$ for groups $G = \tilde G/Z$ where $Z \subset \tilde Z$ is a non-anomalous subgroup, see Sect. 4.4. Let $b^{\hat{\mathfrak g},\hat{\mathfrak h},\tilde z}_{k,\Lambda,\lambda}(\tau)$ be the so-called twining branching functions introduced in [16] for $\tilde z \in \tilde Z$ with spectral flow fixing $(\Lambda, \lambda)$ and set to zero otherwise. The formula for $Z_P^{G/\Gamma}(\tau)$

ZP

(τ ) =

z∈Z

[,λ]∈Pz

1 ˆ P ˆ gˆ ,h,z gˆ ,h,z (τ ) bk,,λP (τ ) b |S[,λ] | k,z −1 ,λ

(6.76)

should hold for a special choice of the $\Gamma$-equivariant structure on the gerbe $\mathcal{G}_k$ over $G$ (for $G = \tilde G$, see Remark 7.2 below). Since for $z_P = 1$ the twining branching functions coincide with the standard ones, the above expression gives correctly the contribution of the sector with topologically trivial gauge fields, see Eq. (4.60). Summing over the isomorphism classes of $\Gamma$-principal bundles, i.e. over $z_P \in \tilde Z$, one obtains, with the use of the Plancherel formula for the isotropy groups $S_{[\Lambda,\lambda]} \subset \tilde Z$, the expression for the total partition function $Z_{\rm tot}^{G/\Gamma}(\tau)$

Z tot (τ ) =

ˆ gˆ ,h,χ

∗ z∈Z [,λ]∈Pz χ ∈S[,λ]

ˆ gˆ ,h,χ

bk,z −1 ,λ (τ ) bk,,λ (τ )

(6.77)

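in terms of the resolved fixed-point branching functions [16]

\[ b^{\hat{\mathfrak g},\hat{\mathfrak h},\chi}_{k,\Lambda,\lambda}(\tau) = \frac{1}{|S_{[\Lambda,\lambda]}|} \sum_{\tilde z \in S_{[\Lambda,\lambda]}} \chi(\tilde z)^{-1}\, b^{\hat{\mathfrak g},\hat{\mathfrak h},\tilde z}_{k,\Lambda,\lambda}(\tau) \tag{6.78} \]

satisfying the sum rule

\[ \sum_{\chi \in S^*_{[\Lambda,\lambda]}} b^{\hat{\mathfrak g},\hat{\mathfrak h},\chi}_{k,\Lambda,\lambda}(\tau) = b^{\hat{\mathfrak g},\hat{\mathfrak h}}_{k,\Lambda,\lambda}(\tau). \tag{6.79} \]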

On the other hand, the twist of the $\Gamma$-equivariant structure by a character $\chi_K \in \tilde Z^*$ introduces the factor $\chi_K(z_P)$ on the right hand side of Eq. (6.76) giving rise to the modified total partition function $Z_{\rm tot}^{G/\Gamma}(\tau)$

Z tot (τ ) =

z∈Z

∗ [,λ]∈Pz χ ∈S[,λ]

ˆ gˆ ,h,χ

ˆ χ gˆ ,h,χ

bk,z −1 ,λ (τ ) bk,,λK (τ ).

(6.80)

For $G = \tilde G$, Eq. (6.77) gives the coset partition function in terms of the sum of squares of the fixed-point resolved branching functions, as postulated in [16].

7. Ad-Equivariant WZW Gerbes Over Simply Connected Groups

In order to illustrate the concept of $\Gamma$-equivariant gerbes, we shall return to the situation discussed in Sect. 4.2 involving the WZW gerbes $\mathcal{G}_k$ over connected compact simple groups $G = \tilde G/Z$ viewed as $\Gamma$-spaces for $\Gamma = \tilde G/\tilde Z$ acting by the adjoint action. Recall that Theorem 6.9 states that gerbes $\mathcal{G}_k$ possess $\Gamma$-equivariant structures whenever the phases (4.22) are trivial, so always for $G = \tilde G$. Such structures are composed of 1-isomorphism $\alpha$ and 2-isomorphism $\beta$, see Definition 5.1. They are classified by the dual group of $\tilde Z$, see Corollary 6.15. What follows is devoted to an explicit construction of $\Gamma$-equivariant structures on gerbes $\mathcal{G}_k$ over simply connected groups $\tilde G$.

Instead of the local data formalism used in Sect. 6, we shall employ below a geometric presentation of gerbes and their 1- and 2-isomorphisms, see, e.g., [27]. In such a presentation, a gerbe $\mathcal{G}$ over $M$ with curvature $H$ is a quadruple $(Y, B, L, \mu)$, where $\pi : Y \to M$ is a surjective submersion, $B$ is a 2-form on $Y$, called curving, such that $dB = \pi^* H$, $L$ is a line bundle over the fiber-product $Y^{[2]} = Y \times_M Y$ with curvature $F = B_2 - B_1$, and $\mu : L_{12} \otimes L_{23} \to L_{13}$ is an isomorphism of line bundles over $Y^{[3]} = Y \times_M Y \times_M Y$ that defines a groupoid structure on $L \rightrightarrows Y$ (the subscripts denote here the pullbacks along projections $Y^{[p]} \to Y^{[q]}$). An explicit geometric construction of gerbes $\mathcal{G}_k$ over $M = \tilde G$ with $k \in \mathbb{Z}$ was given in [42] and is somewhat involved. We shall use here its description from [24], see also Sec. 4.1 of [20].
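The simplest instance of such a quadruple is the trivial gerbe with curving a global 2-form $\rho$. The sketch below assumes the standard convention for trivial bundle gerbes; it is included only as an orientation example and is not taken from the construction of [42].

```latex
% Trivial gerbe I_rho for a 2-form rho on M, written as a quadruple (Y, B, L, mu)
\mathcal{I}_\rho = (Y, B, L, \mu), \qquad
Y = M,\quad \pi = \mathrm{Id}_M,\quad B = \rho,\quad
L = M \times \mathbb{C} \ \text{(trivial flat line bundle)},\quad
\mu = \text{multiplication in } \mathbb{C}.

% Consistency with the defining conditions
Y^{[2]} = M \times_M M \cong M, \qquad
F = B_2 - B_1 = \rho - \rho = 0, \qquad
H = d\rho .
```

Tensoring a gerbe $(Y, B, L, \mu)$ with $\mathcal{I}_\rho$ then only shifts the curving to $B + \pi^*\rho$, which is the form in which $\mathcal{I}_{\rho_k}$ enters in Sect. 7.2.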

7.1. WZW gerbes over compact simply connected simple Lie groups. As before, roots, coroots, weights and coweights will be considered as elements of the imaginary Cartan subalgebra $\mathrm{i}\mathfrak{t} \subset \mathrm{i}\mathfrak{g}$ identified with its dual with the help of the bilinear form $\mathrm{tr}$. The normalization of $\mathrm{tr}$ makes the length squared of long roots equal to 2. $\alpha_i$, $\alpha_i^\vee$, $\lambda_i$ and $\lambda_i^\vee$, where $i = 1, \dots, r$, will denote the simple roots, coroots, weights and coweights, respectively, with $r$ the rank of $\mathfrak{g}$. The highest root is $\phi = \sum_i k_i \alpha_i$, where the positive integers $k_i$ are the Kac labels. Denote by $A_W \subset \mathrm{i}\mathfrak{t}$ the positive Weyl alcove. $A_W$ is a simplex with vertices $\tau_i = \frac{1}{k_i}\lambda_i^\vee$, $i = 1, \dots, r$, and $\tau_0 = 0$. For $i \in R := \{0, 1, \dots, r\}$, let

\[ A_i = \Big\{ \tau \in A_W \ \Big|\ \tau = \sum_j s_j \tau_j \ \text{with}\ s_i > 0 \Big\}, \qquad O_i = \big\{ g = \mathrm{Ad}_{h_g}(e^{2\pi\mathrm{i}\tau}) \ \big|\ h_g \in \tilde G,\ \tau \in A_i \big\}, \tag{7.1} \]

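and, for $I \subset R$, let $A_I = \cap_{i \in I} A_i$ and $O_I = \cap_{i \in I} O_i$. Subsets $O_I$ of $\tilde G$ are open and Ad-invariant. They are composed of elements $g = \mathrm{Ad}_{h_g}(e^{2\pi\mathrm{i}\tau})$ with $h_g \in \tilde G$ and $\tau \in A_I$. The expressions

\[ B_i = \frac{k}{4\pi}\, \mathrm{tr} \Big( \Theta(h_g)\, \mathrm{Ad}_{e^{2\pi\mathrm{i}\tau}}(\Theta(h_g)) + 2\pi\mathrm{i}\,(\tau - \tau_i)\,[\Theta(h_g), \Theta(h_g)] \Big), \tag{7.2} \]

where $\Theta(h_g) = h_g^{-1} dh_g$, define smooth 2-forms on $O_i$ such that $dB_i = H_k|_{O_i}$. For groups $SU(n)$, it is enough to take $Y = \sqcup_i O_i$, see [7,23]. In order to have a unique construction of gerbes $\mathcal{G}_k$ for all compact simply connected simple Lie groups, one makes a more involved choice [42]. Consider the stabilizer subgroups,

\[ G_I = \big\{ \gamma \in \tilde G \ \big|\ \gamma\, e^{2\pi\mathrm{i}\tau} \gamma^{-1} = e^{2\pi\mathrm{i}\tau} \ \text{for (any)}\ \tau \in A_I \setminus \cup_{i \notin I} A_i \big\}. \tag{7.3} \]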

In particular, $G_i$ is composed of the elements of $\tilde G$ that commute with $e^{2\pi\mathrm{i}\tau_i}$. The Cartan subgroup $T \subset \tilde G$ is contained in all $G_I$. The maps

\[ \eta_I : O_I \ni g = \mathrm{Ad}_{h_g}(e^{2\pi\mathrm{i}\tau}) \mapsto h_g G_I \in \tilde G/G_I \tag{7.4} \]

are well-defined because the adjoint-action stabilizers of $e^{2\pi\mathrm{i}\tau}$ for $\tau \in A_I$ are contained in $G_I$. They are smooth, see Sec. 5.1 of [42]. One introduces principal $G_I$-bundles $\pi_I : P_I \to O_I$,

\[ P_I = \{ (g, h) \in O_I \times \tilde G \ |\ \eta_I(g) = h G_I \}. \tag{7.5} \]

For the gerbes $\mathcal{G}_k = (Y, B, L, \mu)$, one sets

\[ Y = \bigsqcup_{i \in R} P_i \tag{7.6} \]

with $\pi : Y \to \tilde G$ restricting to $\pi_i$ on $P_i$ and the 2-form $B$ restricting to $\pi_i^* B_i$. Let

\[ \hat Y_{i_1 \dots i_n} = P_I \times G_{i_1} \times \cdots \times G_{i_n} \quad\text{and}\quad Y_{i_1 \dots i_n} = \hat Y_{i_1 \dots i_n}/G_I \tag{7.7} \]

for $I = \{i_1, \dots, i_n\}$, and for $G_I$ acting on $\hat Y_{i_1 \dots i_n}$ diagonally by the right multiplication. The fiber power $Y^{[n]}$ of $Y$ may be identified with the disjoint union of $Y_{i_1 \dots i_n}$ by assigning to the $G_I$-orbit of $((g, h), \gamma_{i_1}, \dots, \gamma_{i_n})$ the $n$-tuple $(y_1, \dots, y_n) \in Y^{[n]}$ with $y_m = (g, h\gamma_{i_m}^{-1})$,

\[ Y^{[n]} \cong \bigsqcup_{(i_1, \dots, i_n)} Y_{i_1 \dots i_n}. \tag{7.8} \]
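That the assignment of the $n$-tuple $(y_1, \dots, y_n)$ to a $G_I$-orbit does not depend on the representative can be checked in one line. The verification below uses only the diagonal right action of $G_I$ and the inclusion $G_I \subset G_{i_m}$ for $i_m \in I$ (a special case of $G_J \subset G_I$ for $I \subset J$); the letter $\gamma$ for an element of $G_I$ is notation chosen here.

```latex
% Diagonal right action of gamma in G_I on a representative
((g,h), \gamma_{i_1}, \dots, \gamma_{i_n}) \longmapsto
((g,h\gamma), \gamma_{i_1}\gamma, \dots, \gamma_{i_n}\gamma).

% The m-th component of the n-tuple is unchanged
y_m = \big(g,\ h\gamma\,(\gamma_{i_m}\gamma)^{-1}\big) = \big(g,\ h\,\gamma_{i_m}^{-1}\big).

% y_m lies in P_{i_m}, since gamma_{i_m} is in G_{i_m} and G_I is contained in G_{i_m}
\eta_{i_m}(g) = h\,G_{i_m} = h\,\gamma_{i_m}^{-1}\,G_{i_m}.
```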

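The construction of the line bundle $L$ over $Y^{[2]}$ uses more detailed properties of the stabilizer groups $G_I$. For $I \subset J \subset R$, $G_J$ is contained in $G_I$. The smallest of those groups, $G_R$, coincides with the Cartan subgroup $T$ of $\tilde G$. Groups $G_I$ are connected but not necessarily simply connected. Let $\mathfrak{g}_I \supset \mathfrak{t}$ denote the Lie algebra of $G_I$, and let $e_I$ be the exponential map from $\mathfrak{g}_I$ to the universal cover $\tilde G_I$. One has

\[ G_I = \tilde G_I / Z_I \quad\text{for}\quad Z_I = e_I^{2\pi\mathrm{i} Q^\vee}, \tag{7.9} \]

where $Q^\vee \subset \mathfrak{t}$ is the coroot lattice of $\mathfrak{g}$. The exponential map $e_I$ maps $\mathfrak{t}$ to the Abelian subgroup $\tilde T_I \subset \tilde G_I$. For $I \subset J$, the group $\tilde G_J$ maps naturally into $\tilde G_I$ and $Z_J$ into $Z_I$. One shows that the formula

\[ \chi_i(e_i^{2\pi\mathrm{i}\tau}) = e^{2\pi\mathrm{i}\, \mathrm{tr}\, \tau_i \tau} \tag{7.10} \]

for $\tau \in \mathfrak{t}$ defines a character $\chi_i : \tilde T_i \to U(1)$. By restriction, $\chi_i$ determines a character of $Z_i$. One may also define a 1-dimensional representation $\chi_{ij} : \tilde G_{ij} \to U(1)$ by the formula

\[ \chi_{ij}(\tilde\gamma_{ij}) = \exp\Big( \frac{1}{\mathrm{i}} \int_{\tilde\gamma_{ij}} a_{ij} \Big), \tag{7.11} \]

where $a_{ij} = \mathrm{i}\, \mathrm{tr}(\tau_j - \tau_i)\, \Theta(\gamma_{ij})$ is a closed 1-form on $G_{ij}$ ($\tilde\gamma_{ij}$ is identified with a homotopy class of a path in $G_{ij}$ starting at 1). For $\tau \in \mathrm{i}\mathfrak{t}$ one has:

\[ \chi_{ij}(e_{ij}^{2\pi\mathrm{i}\tau}) = \chi_i(e_i^{2\pi\mathrm{i}\tau})^{-1}\, \chi_j(e_j^{2\pi\mathrm{i}\tau}). \tag{7.12} \]

Besides $\chi_{ij}(\tilde\gamma_{ij}) = \chi_{ji}(\tilde\gamma_{ij})^{-1}$, and for $\tilde\gamma_{ijk} \in \tilde G_{ijk}$,

\[ \chi_{ij}(\tilde\gamma_{ijk})\, \chi_{jk}(\tilde\gamma_{ijk}) = \chi_{ik}(\tilde\gamma_{ijk}). \tag{7.13} \]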
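On elements coming from the Cartan algebra, the two relations just stated follow directly from identity (7.12). The lines below are a minimal check restricted to elements of the form $e^{2\pi\mathrm{i}\tau}$ with $\tau \in \mathrm{i}\mathfrak{t}$; the shorthand $c_i(\tau)$ is introduced here for brevity and is not notation from the paper.

```latex
% Shorthand for the value (7.10) of chi_i on a torus element
c_i(\tau) := \chi_i(e_i^{2\pi\mathrm{i}\tau}) = e^{2\pi\mathrm{i}\,\mathrm{tr}\,\tau_i\tau},
\qquad
\chi_{ij}(e_{ij}^{2\pi\mathrm{i}\tau}) = c_i(\tau)^{-1} c_j(\tau) \quad \text{by (7.12)}.

% Antisymmetry in the pair of indices
\chi_{ij}\,\chi_{ji} = c_i^{-1} c_j \; c_j^{-1} c_i = 1 .

% Cocycle relation (7.13)
\chi_{ij}\,\chi_{jk} = c_i^{-1} c_j \; c_j^{-1} c_k = c_i^{-1} c_k = \chi_{ik} .
```

This covers torus elements only; for general $\tilde\gamma_{ijk} \in \tilde G_{ijk}$ the relations rest on the definition (7.11).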


Over space $\hat Y_{ij}$, there is a line bundle $\hat L_{ij}$ whose fiber over $((g, h), \gamma_i, \gamma_j)$ is composed of the equivalence classes $[\tilde\gamma_i, \tilde\gamma_j, u_{ij}]_{ij}$ with respect to the relation

\[ (\tilde\gamma_i, \tilde\gamma_j, u_{ij}) \underset{ij}{\sim} (\tilde\gamma_i \zeta_i,\ \tilde\gamma_j \zeta_j,\ \chi_i(\zeta_i)^k \chi_j(\zeta_j)^{-k} u_{ij}) \tag{7.14} \]

for $\tilde\gamma_i \in \tilde G_i$, $\tilde\gamma_j \in \tilde G_j$ projecting to $\gamma_i \in G_i$ and $\gamma_j \in G_j$, respectively, and $u_{ij} \in \mathbb{C}$, $\zeta_i \in Z_i$, $\zeta_j \in Z_j$. One twists the natural flat structure of $\hat L_{ij}$ by the connection form

\[ \hat A_{ij} = \mathrm{i} k\, \mathrm{tr}(\tau_j - \tau_i)\, \Theta(h). \tag{7.15} \]

The right action of $G_{ij}$ on $\hat Y_{ij}$ lifts to the action on $\hat L_{ij}$ defined by

\[ ((g, h), [\tilde\gamma_i, \tilde\gamma_j, u_{ij}]_{ij}) \mapsto ((g, h\gamma_{ij}), [\tilde\gamma_i \tilde\gamma_{ij},\ \tilde\gamma_j \tilde\gamma_{ij},\ \chi_{ij}(\tilde\gamma_{ij})^{-k} u_{ij}]_{ij}) \tag{7.16} \]

for $\gamma_{ij} \in G_{ij}$ and $\tilde\gamma_{ij}$ its lift to $\tilde G_{ij}$. The hermitian structure and the connection of $\hat L_{ij}$ descend to the quotient bundle $\hat L_{ij}/G_{ij} = L_{ij}$ over $Y_{ij}$ and the line bundle $L$ over $Y^{[2]}$ for the gerbe $\mathcal{G}_k$ is taken as equal to $L_{ij}$ when restricted to $Y_{ij}$. The curvature 2-form $F_{ij}$ of $L_{ij}$ lifts to $\hat Y_{ij}$ to the 2-form $d\hat A_{ij}$ that coincides with the lift to $\hat Y_{ij}$ of the 2-form $B_j - B_i$. This gives the required relation $F = B_2 - B_1$ between the curvature $F$ of the line bundle $L$ over $Y^{[2]}$ and the curving $B$ on $Y$.

The groupoid multiplication $\mu$ of $\mathcal{G}_k$ is defined as follows. Let $((g, h), \gamma_i, \gamma_j, \gamma_k) \in \hat Y_{ijk}$ represent $(y, y', y'') \in Y^{[3]}$ with $y = (g, h\gamma_i^{-1})$, $y' = (g, h\gamma_j^{-1})$ and $y'' = (g, h\gamma_k^{-1})$ and let

\[ \ell_{ij} \in L_{(y,y')}, \qquad \ell_{jk} \in L_{(y',y'')}, \qquad \ell_{ik} \in L_{(y,y'')} \tag{7.17} \]

be the elements in the appropriate fibers of $L$ given by the $G_{ijk}$-orbits of

\[ \hat\ell_{ij} = ((g, h), [\tilde\gamma_i, \tilde\gamma_j, u_{ij}]_{ij}), \qquad \hat\ell_{jk} = ((g, h), [\tilde\gamma_j, \tilde\gamma_k, u_{jk}]_{jk}), \qquad \hat\ell_{ik} = ((g, h), [\tilde\gamma_i, \tilde\gamma_k, u_{ik}]_{ik}) \tag{7.18} \]

with $u_{ik} = u_{ij} u_{jk}$. Then

\[ \mu(\ell_{ij} \otimes \ell_{jk}) = \ell_{ik}. \tag{7.19} \]

This ends the description of gerbes $\mathcal{G}_k = (Y, B, L, \mu)$ over simply connected groups $\tilde G$.

7.2. Construction of 1-isomorphism α. We need to compare the pullbacks of gerbe $\mathcal{G}_k$ to the product space $\Gamma \times \tilde G$. Consider first the pullback $(\mathcal{G}_k)_{12}$ along the adjoint action $\Gamma \times \tilde G \to \tilde G$ of $\Gamma = \tilde G/\tilde Z$ on $\tilde G$. One has:

\[ (\mathcal{G}_k)_{12} = (Y_{12}, B_{12}, L_{12}, \mu_{12}). \tag{7.20} \]

The adjoint action of $\tilde G$ on itself may be lifted to $Y$ by the map

\[ \tilde G \times Y \ni (\tilde\gamma, y) \mapsto \mathrm{Ad}_{\tilde\gamma}(y) \in Y, \tag{7.21} \]

where for $y = (g, h) \in P_i \subset Y$, $\mathrm{Ad}_{\tilde\gamma}(y) := (\mathrm{Ad}_{\tilde\gamma}(g), \tilde\gamma h) \in P_i$. The map (7.21) is constant on orbits of the action

\[ (\tilde\gamma, y) \mapsto (z\tilde\gamma,\ y z^{-1}) \tag{7.22} \]

of $\tilde Z$ on $\tilde G \times Y$, where $y z^{-1} := (g, h z^{-1})$ for $y = (g, h) \in P_i$. It allows the canonical identification

\[ Y_{12} \equiv (\tilde G \times Y)/\tilde Z. \tag{7.23} \]
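The constancy of the map (7.21) on the orbits of (7.22) is a one-line computation. The check below uses only the definitions of $\mathrm{Ad}_{\tilde\gamma}(y)$ and $y z^{-1}$ given above, together with the assumption that $\tilde Z$ is central in $\tilde G$.

```latex
% z in \tilde Z commutes with every element of \tilde G
\mathrm{Ad}_{z\tilde\gamma}(y z^{-1})
  = \big(\mathrm{Ad}_{z\tilde\gamma}(g),\ z\tilde\gamma\, h\, z^{-1}\big)
  = \big(\mathrm{Ad}_{\tilde\gamma}(g),\ \tilde\gamma h\big)
  = \mathrm{Ad}_{\tilde\gamma}(y),
\qquad y = (g,h) \in P_i .
```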

In this identification, the surjective submersion $\pi_{12} : Y_{12} \to \Gamma \times \tilde G$ is generated by the map $(\tilde\gamma, y) \mapsto (\gamma, \pi(y))$, where $\gamma \in \Gamma = \tilde G/\tilde Z$ is the canonical projection of $\tilde\gamma$. Similarly,

\[ Y_{12}^{[n]} \cong (\tilde G \times Y^{[n]})/\tilde Z. \tag{7.24} \]

The action of $\tilde Z$ on $\tilde G \times Y^{[2]}$ induced by (7.22) may be lifted to the one on $\tilde G \times L$ given by

\[ (\tilde\gamma, \ell_{ij}) \mapsto (z\tilde\gamma,\ \ell_{ij} \star z^{-1}), \tag{7.25} \]

where for $\ell_{ij}$ given by the $G_{ij}$-orbit (7.16) of $\hat\ell_{ij} = ((g, h), [\tilde\gamma_i, \tilde\gamma_j, u_{ij}]_{ij})$, the element $\ell_{ij} \star z^{-1}$ is defined as the $G_{ij}$-orbit of

\[ \hat\ell_{ij} \star z^{-1} := ((g, h), [\tilde\gamma_i \tilde z,\ \tilde\gamma_j \tilde z,\ \chi_{ij}(\tilde z)^{-k} u_{ij}]_{ij}), \tag{7.26} \]

with $\tilde z$ standing for any lift of $z \in \tilde Z$ to $\tilde G_{ij}$. We introduce a special symbol for this action to distinguish it from another one that will be defined below. As line bundles,

\[ L_{12} \cong (\tilde G \times L)/\tilde Z. \tag{7.27} \]

In order to obtain the correct connection on $L_{12}$, the one on $\tilde G \times L$ has to be modified by twisting the flat structure on $\tilde G \times \hat L_{ij}$ by the connection 1-form

\[ \tilde\gamma^* \hat A_{ij} = \mathrm{i} k\, \mathrm{tr}(\tau_j - \tau_i)\, \Theta(\tilde\gamma h) \tag{7.28} \]

rather than by $\hat A_{ij}$ of Eq. (7.15).

1-isomorphism $\alpha$ will compare gerbe $(\mathcal{G}_k)_{12}$ to $\mathcal{I}_{\rho_k} \otimes (\mathcal{G}_k)_2 = (Y_2, B_2 + \pi_2^* \rho_k, L_2, \mu_2)$, where $(\mathcal{G}_k)_2$ is the pullback of $\mathcal{G}_k$ to $\Gamma \times \tilde G$ along the projection to the second factor. It will be convenient to identify

\[ Y_2 = \Gamma \times Y \cong (\tilde G \times Y)/\tilde Z, \tag{7.29} \]

where now $\tilde Z$ acts only on $\tilde G$. The projection $\pi_2 : Y_2 \to \Gamma \times \tilde G$ is induced upon this identification by the map $(\tilde\gamma, y) \mapsto (\gamma, \pi(y))$. Similarly,

\[ Y_2^{[n]} = \Gamma \times Y^{[n]} \cong (\tilde G \times Y^{[n]})/\tilde Z, \qquad L_2 = \Gamma \times L \cong (\tilde G \times L)/\tilde Z, \tag{7.30} \]

with $\tilde Z$ always acting trivially on the 2nd factor.

The first part of data for 1-isomorphism $\alpha$ is a line bundle $E$ over $W_{12} := Y_{12} \times_{(\Gamma \times \tilde G)} Y_2$, see [26]. $E$ has to be equipped with a connection whose curvature form $F^E$ is equal


to $(B_2 + \pi_2^* \rho_k)_2 - (B_{12})_1$, where the outside subscript 1 (resp. 2) refers to the pullback along the projection from $W_{12}$ to $Y_{12}$ (resp. to $Y_2$). In view of identifications (7.23) and (7.29), we obtain for the fiber-product space $W_{12}$,

\[ W_{12} \cong (\tilde G \times Y^{[2]})/\tilde Z \tag{7.31} \]

for the action $(\tilde\gamma, (y, y')) \mapsto (z\tilde\gamma, (y z^{-1}, y'))$ of $\tilde Z$. The projection to $Y_{12}$ is induced by the map $(\tilde\gamma, (y, y')) \mapsto (\tilde\gamma, y)$, the one to $Y_2$ by $(\tilde\gamma, (y, y')) \mapsto (\tilde\gamma, y')$. The composed projection $\varpi : W_{12} \to \Gamma \times \tilde G$ is $(\tilde\gamma, (y, y')) \mapsto (\gamma, \pi(y) = \pi(y'))$. Line bundle $E$ over $W_{12}$ will be defined by

\[ E := (\tilde G \times L)/\tilde Z, \tag{7.32} \]

for the action of $\tilde Z$,

\[ (\tilde\gamma, \ell_{ij}) \mapsto (z\tilde\gamma,\ \ell_{ij} \cdot z^{-1}), \tag{7.33} \]

where the element $\ell_{ij} \cdot z^{-1}$ is defined as the $G_{ij}$-orbit of

\[ \hat\ell_{ij} \cdot z^{-1} := ((g, h), [\tilde\gamma_i \tilde z,\ \tilde\gamma_j,\ \chi_i(\tilde z)^k \chi(z)\, u_{ij}]_{ij}), \tag{7.34} \]

with $\chi : \tilde Z \to U(1)$ a fixed character. Note the difference between elements $\hat\ell_{ij} \cdot z^{-1}$ and $\hat\ell_{ij} \star z^{-1}$, with the latter one defined by Eq. (7.26).

The connection in line bundle $E$ requires a careful definition in order to assure that it has the desired curvature. Note that the 2-form $(B_2 + \pi_2^* \rho_k)_2 - (B_{12})_1$ on $(\tilde G \times Y_{ij})/\tilde Z \subset W_{12}$ is equal to the pullback by $\varpi$ of the 2-form $(B_j)_2 + \rho_k - (B_i)_{12}$ on $\Gamma \times O_{ij} \subset \Gamma \times \tilde G$. A short calculation shows that for $\gamma \in \Gamma$ and $g = \mathrm{Ad}_{h_g}(e^{2\pi\mathrm{i}\tau}) \in O_{ij}$,

\[ \big( (B_j)_2 + \rho_k - (B_i)_{12} \big)(\gamma, g) = d\Big( \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{h_g}(\tau - \tau_i)\, \Theta(\gamma) \Big) + \tfrac12\, \mathrm{i} k\, \mathrm{tr}(\tau_i - \tau_j)\, [\Theta(h_g), \Theta(h_g)]. \tag{7.35} \]

It was shown in [42] that the map

\[ O_i \ni g = \mathrm{Ad}_{h_g}(e^{2\pi\mathrm{i}\tau}) \mapsto \mathrm{Ad}_{h_g}(\tau - \tau_i) \in \mathrm{i}\mathfrak{g}, \tag{7.36} \]

denoted #i there, is well defined and smooth so that the 1-form $A_i = \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{h_g}(\tau - \tau_i)\, \Theta(\gamma)$ is well defined and smooth on $\Gamma \times O_i$. On the other hand, the 2-form $B_j - B_i = \frac12\, \mathrm{i} k\, \mathrm{tr}(\tau_i - \tau_j)\, [\Theta(h_g), \Theta(h_g)]$ is a well defined closed 2-form on $O_{ij}$ which, when pulled back to $Y_{ij} \subset Y^{[2]}$, coincides with the curvature form of $L|_{Y_{ij}} = L_{ij}$. In order to assure the correct curvature of $E$, we shall additionally twist the connection of $\tilde G \times L|_{Y_{ij}}$ in (7.32) by the pullbacks to $\tilde G \times Y_{ij}$ of the forms

\[ \hat A_i = \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{h_g}(\tau - \tau_i)\, \Theta(\tilde\gamma) \tag{7.37} \]

on $\tilde G \times O_{ij}$. A straightforward check shows that the resulting connection in $\tilde G \times L$ descends to the quotient by the action (7.33) of $\tilde Z$. Note that the resulting bundles $E$ differ for different characters $\chi$ of $\tilde Z$ by tensor factors that are pullbacks to $W_{12}$ of flat bundles over $\Gamma$.


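1-isomorphism $\alpha : (\mathcal{G}_k)_{12} \to \mathcal{I}_{\rho_k} \otimes (\mathcal{G}_k)_2$ of Definition 5.1 is an isomorphism of line bundles over $W_{12}^{[2]}$

\[ \alpha : L_{12} \otimes E_2 \to E_1 \otimes L_2, \tag{7.38} \]

where natural pullbacks of bundles $L_{12}$ and $L_2$ are understood. Recalling realization (7.31) of $W_{12}$, we have

\[ W_{12}^{[2]} \cong (\tilde G \times Y^{[4]})/\tilde Z \tag{7.39} \]

with the action

\[ (\tilde\gamma, y_1, y_1', y_2, y_2') \mapsto (z\tilde\gamma, (y_1 z^{-1}, y_1', y_2 z^{-1}, y_2')) \tag{7.40} \]

of $\tilde Z$. Suppose that $(y_1, y_1', y_2, y_2') \in Y_{i_1 j_1 i_2 j_2}$ and that

\[ \ell_{i_1 i_2} \in L_{(y_1, y_2)}, \qquad \ell_{i_2 j_2} \in L_{(y_2, y_2')}, \qquad \ell_{i_1 j_1} \in L_{(y_1, y_1')}, \qquad \ell_{j_1 j_2} \in L_{(y_1', y_2')} \tag{7.41} \]

are given by $G_{i_1 j_1 i_2 j_2}$-orbits of

\[ \hat\ell_{i_1 i_2} = ((g, h), [\tilde\gamma_{i_1}, \tilde\gamma_{i_2}, u_{i_1 i_2}]_{i_1 i_2}), \qquad \hat\ell_{i_2 j_2} = ((g, h), [\tilde\gamma_{i_2}, \tilde\gamma_{j_2}, u_{i_2 j_2}]_{i_2 j_2}), \tag{7.42} \]

\[ \hat\ell_{i_1 j_1} = ((g, h), [\tilde\gamma_{i_1}, \tilde\gamma_{j_1}, u_{i_1 j_1}]_{i_1 j_1}), \qquad \hat\ell_{j_1 j_2} = ((g, h), [\tilde\gamma_{j_1}, \tilde\gamma_{j_2}, u_{j_1 j_2}]_{j_1 j_2}) \tag{7.43} \]

with

\[ \mu(\ell_{i_1 i_2} \otimes \ell_{i_2 j_2}) = \mu(\ell_{i_1 j_1} \otimes \ell_{j_1 j_2}), \quad\text{i.e.}\quad u_{i_1 i_2}\, u_{i_2 j_2} = u_{i_1 j_1}\, u_{j_1 j_2}. \tag{7.44} \]

The bundle isomorphism $\alpha$ of (7.38) will be generated by a map $\tilde\alpha$ such that

\[ \tilde\alpha\big( \tilde\gamma,\ \ell_{i_1 i_2} \otimes \ell_{i_2 j_2} \big) = \big( \tilde\gamma,\ \ell_{i_1 j_1} \otimes \ell_{j_1 j_2} \big). \tag{7.45} \]

Consistency requires that $\tilde\alpha$ commutes with the action of $\tilde Z$, i.e. that

\[ \tilde\alpha\big( z\tilde\gamma,\ \ell_{i_1 i_2} \star z^{-1} \otimes \ell_{i_2 j_2} \cdot z^{-1} \big) = \big( z\tilde\gamma,\ \ell_{i_1 j_1} \cdot z^{-1} \otimes \ell_{j_1 j_2} \big). \tag{7.46} \]

In view of Eqs. (7.26), (7.34) and (7.44), this is guaranteed by the relation

\[ \chi_{i_1 i_2}(\tilde z)^{-k}\, \chi_{i_2}(\tilde z)^k\, \chi(z) = \chi_{i_1}(\tilde z)^k\, \chi(z), \tag{7.47} \]

which follows from identity (7.12). That the bundle isomorphism $\alpha$ preserves the connections follows from the equality of the (modified) connection forms

\[ \tilde\gamma^* \hat A_{i_1 i_2} + \hat A_{i_2 j_2} + \hat A_{i_2} = \hat A_{i_1 j_1} + \hat A_{i_1} + \hat A_{j_1 j_2}, \tag{7.48} \]

which is easy to check. For the bundle isomorphism $\alpha$ to define a gerbe 1-isomorphism from $(\mathcal{G}_k)_{12}$ to $\mathcal{I}_{\rho_k} \otimes (\mathcal{G}_k)_2$, one has to require a proper behavior with respect to the groupoid multiplication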


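[26]. More precisely, what is needed is the coincidence of two composed isomorphisms of line bundles over $W_{12}^{[3]}$. The first one is

\[ (L_{12})_{1,2} \otimes (L_{12})_{2,3} \otimes E_3 \xrightarrow{\ \mathrm{Id} \otimes \alpha_{2,3}\ } (L_{12})_{1,2} \otimes E_2 \otimes (L_2)_{2,3} \xrightarrow{\ \alpha_{1,2} \otimes \mathrm{Id}\ } E_1 \otimes (L_2)_{1,2} \otimes (L_2)_{2,3} \xrightarrow{\ \mathrm{Id} \otimes (\mu_2)_{1,2,3}\ } E_1 \otimes (L_2)_{1,3}, \tag{7.49} \]

with the exterior subscripts referring to the components of $W_{12}^{[3]}$. The second one is

\[ (L_{12})_{1,2} \otimes (L_{12})_{2,3} \otimes E_3 \xrightarrow{\ (\mu_{12})_{1,2,3} \otimes \mathrm{Id}\ } (L_{12})_{1,3} \otimes E_3 \xrightarrow{\ \alpha_{1,3}\ } E_1 \otimes (L_2)_{1,3}. \tag{7.50} \]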

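Straightforward verification that they coincide is carried out in Appendix 7.

7.3. Construction of 2-isomorphism β. 2-isomorphism $\beta$ of Definition 5.1 compares 1-isomorphisms of gerbes over $\Gamma^2 \times \tilde G$. First, consider gerbe $(\mathcal{G}_k)_{123} = (Y_{123}, B_{123}, L_{123}, \mu_{123})$. The same way as before for $Y_{12}$, we shall use the map

\[ \tilde G^2 \times Y \ni (\tilde\gamma_1, \tilde\gamma_2, y) \mapsto \mathrm{Ad}_{\tilde\gamma_1 \tilde\gamma_2}\, y \in Y \tag{7.51} \]

constant on orbits of the $\tilde Z^2$-action

\[ (\tilde\gamma_1, \tilde\gamma_2, y) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ y (z_1 z_2)^{-1}) \tag{7.52} \]

in order to identify

\[ Y_{123}^{[n]} \cong (\tilde G^2 \times Y^{[n]})/\tilde Z^2. \tag{7.53} \]

As line bundles,

\[ L_{123} \cong (\tilde G^2 \times L)/\tilde Z^2 \tag{7.54} \]

for the action

\[ (\tilde\gamma_1, \tilde\gamma_2, \ell_{ij}) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ \ell_{ij} \star (z_1 z_2)^{-1}). \tag{7.55} \]

The connection in $L$ has to be modified by twisting the flat structure of $\tilde G^2 \times \hat L_{ij}$ by the connection 1-form $(\tilde\gamma_1 \tilde\gamma_2)^* \hat A_{ij}$, see Eq. (7.28). Similarly, for gerbe $(\mathcal{G}_k)_{23} = (Y_{23}, B_{23}, L_{23}, \mu_{23})$ over $\Gamma^2 \times \tilde G$, we have:

\[ Y_{23}^{[n]} \cong (\tilde G^2 \times Y^{[n]})/\tilde Z^2, \tag{7.56} \]

where now the action of $\tilde Z^2$ is induced from the one on $\tilde G^2 \times Y$ given by

\[ (\tilde\gamma_1, \tilde\gamma_2, y) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ y z_2^{-1}). \tag{7.57} \]

As line bundles,

\[ L_{23} \cong (\tilde G^2 \times L)/\tilde Z^2 \tag{7.58} \]

for the action

\[ (\tilde\gamma_1, \tilde\gamma_2, \ell_{ij}) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ \ell_{ij} \star z_2^{-1}), \tag{7.59} \]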


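with the connection of $\tilde G^2 \times L$ modified now using 1-forms $\tilde\gamma_2^* \hat A_{ij}$. Finally, for gerbe $(\mathcal{G}_k)_3 = (Y_3, B_3, L_3, \mu_3)$,

\[ Y_3^{[n]} \cong (\tilde G^2 \times Y^{[n]})/\tilde Z^2 \quad\text{and}\quad L_3 \cong (\tilde G^2 \times L)/\tilde Z^2, \tag{7.60} \]

with $\tilde Z^2$ acting only on the factors $\tilde G^2$. For the fiber-product space $W_{123} = Y_{123} \times_{(\Gamma^2 \times \tilde G)} Y_{23} \times_{(\Gamma^2 \times \tilde G)} Y_3$, we have

\[ W_{123} = (\tilde G^2 \times Y^{[3]})/\tilde Z^2 \tag{7.61} \]

for the action

\[ (\tilde\gamma_1, \tilde\gamma_2, (y, y', y'')) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ (y (z_1 z_2)^{-1},\ y' z_2^{-1},\ y'')) \tag{7.62} \]

of $\tilde Z^2$. We may pull back the line bundle $E$ over $W_{12}$ in three different ways to $W_{123}$, obtaining the respective line bundles $E_{1,23}$, $E_{2,3}$ and $E_{12,3}$. One has

\[ E_{1,23} \cong (\tilde G^2 \times L_{1,2})/\tilde Z^2, \qquad E_{2,3} \cong (\tilde G^2 \times L_{2,3})/\tilde Z^2, \qquad E_{12,3} \cong (\tilde G^2 \times L_{1,3})/\tilde Z^2. \tag{7.63} \]

The actions of $\tilde Z^2$ above are defined as follows. If $(y, y', y'') \in Y_{ijk} \subset Y^{[3]}$ and $\ell_{ij}$, $\ell_{jk}$ and $\ell_{ik}$ are as in (7.17), i.e. $\ell_{ij} \in L_{(y,y')} \subset L_{1,2}$, $\ell_{jk} \in L_{(y',y'')} \subset L_{2,3}$ and $\ell_{ik} \in L_{(y,y'')} \subset L_{1,3}$, then under $(z_1, z_2) \in \tilde Z^2$,

\[ (\tilde\gamma_1, \tilde\gamma_2, \ell_{ij}) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ (\ell_{ij} \star z_2^{-1}) \cdot z_1^{-1}), \tag{7.64} \]

\[ (\tilde\gamma_1, \tilde\gamma_2, \ell_{jk}) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ \ell_{jk} \cdot z_2^{-1}), \tag{7.65} \]

\[ (\tilde\gamma_1, \tilde\gamma_2, \ell_{ik}) \mapsto (z_1 \tilde\gamma_1,\ z_2 \tilde\gamma_2,\ \ell_{ik} \cdot (z_1 z_2)^{-1}). \tag{7.66} \]

The connection of $L$ in the three pullbacks in (7.63) has to be modified by twisting the flat structure of $\tilde G^2 \times \hat L_{ij}$ by the 1-form

\[ \hat A^{1,23}_{ij} = \mathrm{i} k\, \mathrm{tr}(\tau_j - \tau_i)\, \Theta(\tilde\gamma_2 h_g) + \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{\tilde\gamma_2 h_g}(\tau - \tau_i)\, \Theta(\tilde\gamma_1), \tag{7.67} \]

that of $\tilde G^2 \times \hat L_{jk}$ by

\[ \hat A^{2,3}_{jk} = \mathrm{i} k\, \mathrm{tr}(\tau_k - \tau_j)\, \Theta(h_g) + \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{h_g}(\tau - \tau_j)\, \Theta(\tilde\gamma_2), \tag{7.68} \]

and that of $\tilde G^2 \times \hat L_{ik}$ by

\[ \hat A^{12,3}_{ik} = \mathrm{i} k\, \mathrm{tr}(\tau_k - \tau_i)\, \Theta(h_g) + \mathrm{i} k\, \mathrm{tr}\, \mathrm{Ad}_{h_g}(\tau - \tau_i)\, \Theta(\tilde\gamma_1 \tilde\gamma_2). \tag{7.69} \]

There is a natural isomorphism $\beta : E_{1,23} \otimes E_{2,3} \to E_{12,3}$ given by the groupoid multiplication $\mu$ in $L$, i.e. induced by the map

\[ \tilde\beta : (\tilde\gamma_1, \tilde\gamma_2, \ell_{ij} \otimes \ell_{jk}) \mapsto (\tilde\gamma_1, \tilde\gamma_2, \mu(\ell_{ij} \otimes \ell_{jk})). \tag{7.70} \]

Indeed, $\tilde\beta$ commutes with the action of $\tilde Z^2$ because $\mu\big( (\ell_{ij} \star z_2^{-1}) \cdot z_1^{-1} \otimes \ell_{jk} \cdot z_2^{-1} \big) = \ell_{ik} \cdot (z_1 z_2)^{-1}$ if $\mu(\ell_{ij} \otimes \ell_{jk}) = \ell_{ik}$ as

\[ \chi_i(\tilde z_1)^k\, \chi(z_1)\, \chi_{ij}(\tilde z_2)^{-k}\, \chi_j(\tilde z_2)^k\, \chi(z_2) = \chi_i(\tilde z_1 \tilde z_2)^k\, \chi(z_1 z_2). \tag{7.71} \]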
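Identity (7.71) can be verified in two steps from relations already stated. The sketch below assumes that the lifts $\tilde z_1, \tilde z_2$ lie in the images of the Cartan algebra, so that (7.12) applies to them, and that the lifts to $\tilde G_i$, $\tilde G_j$ and $\tilde G_{ij}$ are chosen compatibly.

```latex
\begin{align*}
% Step 1: identity (7.12) in the form chi_ij = chi_i^{-1} chi_j, applied to the lift of z_2
\chi_{ij}(\tilde z_2)^{-k}\,\chi_j(\tilde z_2)^{k}
  &= \chi_i(\tilde z_2)^{k}\,\chi_j(\tilde z_2)^{-k}\,\chi_j(\tilde z_2)^{k}
   = \chi_i(\tilde z_2)^{k}, \\
% Step 2: multiplicativity of the characters chi_i and chi
\chi_i(\tilde z_1)^{k}\,\chi(z_1)\,\chi_i(\tilde z_2)^{k}\,\chi(z_2)
  &= \chi_i(\tilde z_1\tilde z_2)^{k}\,\chi(z_1 z_2).
\end{align*}
```

The same first step, with $(i, j)$ replaced by $(i_1, i_2)$, gives relation (7.47).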


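Besides, $\tilde\beta$ intertwines the modified connections since

\[ \hat A^{1,23}_{ij} + \hat A^{2,3}_{jk} = \hat A^{12,3}_{ik}, \tag{7.72} \]

as a short calculation shows. For the line bundle isomorphism $\beta$ to provide a gerbe 2-isomorphism required by Definition 5.1, one needs (see [26]) that over

\[ W_{123}^{[2]} \cong (\tilde G^2 \times Y^{[6]})/\tilde Z^2, \tag{7.73} \]

with the action of $\tilde Z^2$ induced from that in (7.62), the diagram of line bundle isomorphisms

\[
\begin{array}{ccccc}
L_{123} \otimes (E_{1,23})_2 \otimes (E_{2,3})_2
 & \xrightarrow{\ \alpha_{1,23} \otimes \mathrm{Id}\ }
 & (E_{1,23})_1 \otimes L_{23} \otimes (E_{2,3})_2
 & \xrightarrow{\ \mathrm{Id} \otimes \alpha_{2,3}\ }
 & (E_{1,23})_1 \otimes (E_{2,3})_1 \otimes L_3 \\[4pt]
{\scriptstyle \mathrm{Id} \otimes \beta_2}\big\downarrow & & & & \big\downarrow{\scriptstyle \beta_1 \otimes \mathrm{Id}} \\[4pt]
L_{123} \otimes (E_{12,3})_2
 & & \xrightarrow{\hspace{3em}\alpha_{12,3}\hspace{3em}} & &
 (E_{12,3})_1 \otimes L_3
\end{array}
\tag{7.74}
\]

with the exterior subscripts referring to the pullbacks to $W_{123}^{[2]}$ and with the obvious pullbacks omitted, be commutative. This is checked in Appendix 8.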

7.4. Commutativity of diagram (5.1). This is the identity

\[ \beta_{1,23,4} \bullet \big( (\mathrm{Id} \otimes \beta_{2,3,4}) \circ \mathrm{Id} \big) = \beta_{12,3,4} \bullet \big( \mathrm{Id} \circ \beta_{1,2,34} \big) \tag{7.75} \]

for composed 2-isomorphisms between 1-isomorphisms of gerbes over $\Gamma^3 \times \tilde G$ (see [54] for the abstract definition of the vertical $\bullet$ and horizontal $\circ$ compositions of 2-morphisms). The left- and the right-hand side are the following compositions of the isomorphisms of line bundles:

\[ E_{1,234} \otimes E_{2,34} \otimes E_{3,4} \xrightarrow{\ \mathrm{Id} \otimes \beta_{2,3,4}\ } E_{1,234} \otimes E_{23,4} \xrightarrow{\ \beta_{1,23,4}\ } E_{123,4}, \tag{7.76} \]

and

\[ E_{1,234} \otimes E_{2,34} \otimes E_{3,4} \xrightarrow{\ \beta_{1,2,34} \otimes \mathrm{Id}\ } E_{12,34} \otimes E_{3,4} \xrightarrow{\ \beta_{12,3,4}\ } E_{123,4}, \tag{7.77} \]

respectively, over the fiber-product space $W_{1234} = (Y)_{1234} \times_{(\Gamma^3 \times \tilde G)} (Y)_{234} \times_{(\Gamma^3 \times \tilde G)} (Y)_{34} \times_{(\Gamma^3 \times \tilde G)} (Y)_4$. It is checked in Appendix 9 that they coincide. This proves identity (7.75) establishing the commutativity of diagram (5.1) of Definition 5.1 and completing the construction of $\Gamma$-equivariant structures on gerbe $\mathcal{G}_k$ over $\tilde G$ for the adjoint action of $\Gamma = \tilde G/\tilde Z$ on $\tilde G$.

Theorem 7.1. The $\Gamma$-equivariant structures on the WZW gerbe $\mathcal{G}_k$ over $\tilde G$ constructed above are non-isomorphic for different characters $\chi : \tilde Z \to U(1)$ and each $\Gamma$-equivariant structure on $\mathcal{G}_k$ is isomorphic to one of them.


Proof. The general discussion of classification of -equivariant structures in Sect. 6.5 showed that different isomorphism classes of -equivariant structures correspond in this case to cohomology classes [b ] ∈ H 1 ( × M, U (1)) ∼ = H 1 (, U (1)) in the image of homomorphism κ, see Corollary 6.15. The classes [b ] are the isomorphism classes of flat line bundles over by which differ the line bundles E over W12 involved in the above construction of 1-isomorphisms α of Definition 5.1. Different choices of char/ U (1) correspond to tensoring E with such flat line bundles, as was acters χ : Z˜ remarked in Sect. 7.2. The claim of the theorem now follows from the isomorphism of ∗ H 1 (, U (1)) with the character group Z˜ . Remark 7.2. It is natural to conjecture that the special -equivariant structure for which Eq. (6.76) gives the contributions of the topologically non-trivial gauge fields to the ˜ coset theory corresponds to χ = 1. partition function of the G/ 8. Conclusions We revisited the problem of the gauging of rigid symmetries in two-dimensional sigma models with the Wess-Zumino action related to a closed 3-forms H on the target manifold. For topologically trivial gauge fields given by global Lie-algebra valued 1-forms on the worldsheet, the gauging prescription of Refs. [37] and [36], recalled in Sect. 3.1, assures infinitesimal gauge invariance. We showed, however, that it may lead to global gauge anomalies. In Corollary 4.5 and Sect. 6.1, we classified such anomalies using geometric tools based on the theory of bundle gerbes. As was shown in Sect. 4.2, global gauge anomalies occur, for example, in numerous WZW sigma models with non-simply connected target groups when one gauges their adjoint symmetries. They lead to the inconsistency, discussed in Sect. 4.4, of the corresponding coset models obtained by integrating out the external gauge fields in the respective gauged WZW models. In Sect. 
5.1, we introduced geometric structures called equivariant gerbes, living on the target space, that permit an anomaly-free coupling of WZ amplitudes to arbitrary (also topologically non-trivial) gauge fields. A detailed analysis of obstructions to the existence of such structures was performed and their classification was obtained in Sect. 6. In particular, we proved Theorem 6.9 asserting that the gerbes relevant to the WZW theories with compact semi-simple target groups can be equipped with equivariant structures with respect to adjoint symmetries if and only if there is no global gauge anomaly in the coupling of the WZW model to topologically trivial gauge fields. In Sect. 7, we explicitly constructed all nonequivalent equivariant structures in the case of simply connected target groups. Different equivariant structures result in the coupling to topologically non-trivial gauge fields that differs by phases. We showed in Sect. 6.6 that such ambiguities, anticipated in [33], are given by characters of a subgroup of the fundamental group of the symmetry group, if the latter is connected, see Corollary 6.20. In Sect. 6.7, we discussed how such ambiguities are reflected in the (fixed-poit resolved) partition functions of the non-anomalous coset theories. We do not know if, in general, the existence of equivariant gerbes is also a necessary condition for the existence of non-anomalous coupling of WZ amplitudes to gauge fields in topologically non-trivial sectors, but this is a plausible conjecture. The analysis of the present paper was limited to the case of oriented closed worldsheets. Local gauge anomalies on worldsheets with boundary were studied in [12]. A generalization of the present work to the case of such worldsheets, or worldsheets with conformal defects, will be discussed in a separate publication. An extension of WZ amplitudes to unoriented surfaces requires an additional structure on


gerbes that was introduced under the name of Jandl structure in [48], see also [26,27]. We plan to discuss the interrelation between equivariant structures, Jandl structures, and multiplicative structures on gerbes of [5,25,55], in a future study, with applications to orientifolds of coset models. Other possible extensions of our work should cover the cases of WZW and coset theories with gauging of twisted-adjoint symmetries or with non-compact targets, of supersymmetric sigma models, and applications to global aspects of T-duality [34]. It should also be possible to study global gauge anomalies for higher dimensional WZ actions on spacetimes with arbitrary topology using the theory of bundle n-gerbes [6].

Acknowledgements. The authors acknowledge the support of the contract ANR-05-BLAN-0029-03 in the early stage of this collaboration. The work of K.W. was supported by a Feodor-Lynen scholarship granted by the Alexander von Humboldt Foundation. That of R.R.S. was partially funded by the Collaborative Research Centre 676 “Particles, Strings and the Early Universe - the Structure of Matter and Space-Time”.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

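Appendices

1. Proof of Proposition 3.1. We have to find conditions under which the coupled amplitudes $\mathcal{A}_{WZ}(\varphi, A)$ given by Eq. (3.5) are invariant under infinitesimal gauge transformations. Setting $e^{-t\Lambda}\phi = (\mathrm{Id}, e^{-t\Lambda}\varphi)$ and denoting by $\bar\Lambda$ the vector field on $\Sigma \times M$ in the direction of $M$ given by $\bar\Lambda(x, m) = \frac{d}{dt}\big|_{t=0}(x, e^{-t\Lambda} m)$, we observe that

\[
\frac{d}{dt}\Big|_{t=0} \int_\Sigma (e^{-t\Lambda}\phi)^* \big({-v(A)} + \tfrac{1}{2} u(A^2)\big)
= \int_\Sigma \phi^* \mathcal{L}_{\bar\Lambda} \big({-v(A)} + \tfrac{1}{2} u(A^2)\big)
= \int_\Sigma \phi^* \iota_{\bar\Lambda}\, d\big({-v(A)} + \tfrac{1}{2} u(A^2)\big)
\tag{A.1.1}
\]

since the other term $d\iota_{\bar\Lambda}$ in the Lie derivative gives a term that vanishes by integration by parts. Similarly, as $\frac{d}{dt}\big|_{t=0} e^{-t\Lambda}A = d\Lambda - [\Lambda, A]$, see Eq. (3.4), one obtains

\[
\frac{d}{dt}\Big|_{t=0} \int_\Sigma \phi^* \big({-v(e^{-t\Lambda}A)} + \tfrac{1}{2} u((e^{-t\Lambda}A)^2)\big)
= \int_\Sigma \phi^* \big({-v(d\Lambda - [\Lambda, A])} + u((d\Lambda - [\Lambda, A])A)\big).
\tag{A.1.2}
\]

On the other hand, $\mathcal{A}_{WZ}(e^{-t\Lambda}\varphi) = \mathrm{Hol}_{\mathcal{G}_2}(e^{-t\Lambda}\phi)$, where the subscript 2 on $\mathcal{G}$ refers to the pullback along the projection from $\Sigma \times M$ to $M$ (the latter relation follows from the behavior of gerbe holonomy under gerbe pullbacks). Proceeding as in the proof of Proposition 5.5 one then shows that

\[
\frac{d}{dt}\Big|_{t=0} \mathcal{A}_{WZ}(e^{-t\Lambda}\varphi) = i \Big(\int_\Sigma \phi^* \iota_{\bar\Lambda} H\Big)\, \mathcal{A}_{WZ}(\varphi),
\tag{A.1.3}
\]

so that $\iota_{\bar\Lambda} H$ (more pedantically defined as $\iota_{\bar\Lambda} H_2$) is a form on $\Sigma \times M$. Gathering the above relations, we infer that

\[
\frac{d}{dt}\Big|_{t=0} \mathcal{A}_{WZ}(e^{-t\Lambda}\varphi, e^{-t\Lambda}A)
= i \Big[\int_\Sigma \phi^* \Big(\iota_{\bar\Lambda}\big(H + d({-v(A)} + \tfrac{1}{2} u(A^2))\big) - v(d\Lambda - [\Lambda, A]) + u((d\Lambda - [\Lambda, A])A)\Big)\Big]\, \mathcal{A}_{WZ}(\varphi, A).
\tag{A.1.4}
\]

Consequently, the invariance of the amplitudes $\mathcal{A}_{WZ}(\varphi, A)$ under infinitesimal gauge transformations requires that for all $\varphi$ and $A$,

\[
\int_\Sigma \phi^* \Big(\iota_{\bar\Lambda}\big(H + d({-v(A)} + \tfrac{1}{2} u(A^2))\big) - v(d\Lambda - [\Lambda, A]) + u((d\Lambda - [\Lambda, A])A)\Big) = 0.
\tag{A.1.5}
\]

In order to proceed, it will be easier to employ a basis $(t^a)$ in $\mathfrak{g}$, writing $A = t^a A^a$, $\Lambda = t^a \Lambda^a$ and using the notations of Remark 3.2.2. Eq. (A.1.5) may then be rewritten as

\[
\begin{aligned}
&\int_\Sigma \phi^* \Big(\Lambda^a \big[\iota_a\big(H + d({-v_b A^b} + \tfrac{1}{2} u_{bc} A^b A^c)\big) + f_{abc}\big(v_c A^b - u_{cd} A^b A^d\big)\big] + (d\Lambda^a)(v_a + u_{ab} A^b)\Big) \\
&\quad = \int_\Sigma \phi^* \Lambda^a \Big[\iota_a H - \iota_a(dv_b) A^b + \iota_a v_b\, dA^b + \tfrac{1}{2} \iota_a(du_{bc}) A^b A^c + f_{abc} v_c A^b \\
&\qquad\qquad - f_{abc} u_{cd} A^b A^d - dv_a - (du_{ab}) A^b - u_{ab}\, dA^b\Big] = 0,
\end{aligned}
\tag{A.1.6}
\]

where the terms in the last line were obtained by integration by parts. Since $\Lambda^a$ are arbitrary functions on $\Sigma$, we infer that the 2-form

\[
\varphi^*(\iota_a H - dv_a) + \varphi^*\big({-\iota_a(dv_b)} + f_{abc} v_c - du_{ab}\big) A^b + \varphi^*(\iota_a v_b - u_{ab})\, dA^b + \tfrac{1}{2} \varphi^*\big(\iota_a du_{bc} - f_{abd} u_{dc} + f_{acd} u_{db}\big) A^b A^c
\tag{A.1.7}
\]

on $\Sigma$ has to vanish for all maps $\varphi : \Sigma \to M$ and all 1-forms $A^a$ on $\Sigma$. It is easy to see that this imposes the separate constraints

\[
\iota_a H - dv_a = 0, \qquad -\iota_a(dv_b) + f_{abc} v_c - du_{ab} = 0,
\tag{A.1.8}
\]
\[
\iota_a v_b - u_{ab} = 0, \qquad \iota_a du_{bc} - f_{abd} u_{dc} + f_{acd} u_{db} = 0.
\tag{A.1.9}
\]

The 1st of these equalities gives the left of Eqs. (3.6). The 3rd one gives Eq. (3.7), implying also the right of Eqs. (3.6) and, via the 2nd equality, the middle of Eqs. (3.6). The 4th equality may be rewritten as

\[
\iota_{\bar X}\, d\iota_{\bar Y} v(Z) - \iota_{[\bar X, \bar Y]} v(Z) + \iota_{[\bar X, \bar Z]} v(Y) = 0
\tag{A.1.10}
\]

and now holds automatically since

\[
\iota_{\bar X}\, d\iota_{\bar Y} v(Z) = \mathcal{L}_{\bar X} \iota_{\bar Y} v(Z), \qquad \iota_{[\bar X, \bar Z]} v(Y) = -\iota_{\bar Y} v([X, Z]) = -\iota_{\bar Y} \mathcal{L}_{\bar X} v(Z)
\tag{A.1.11}
\]

and $[\mathcal{L}_{\bar X}, \iota_{\bar Y}] = \iota_{[\bar X, \bar Y]}$. This ends the proof of Proposition 3.1.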


2. Proof of Lemma 3.13. In order to prove that the 2-form ρ of Eq. (3.26) is -invariant, recall that × M is considered as a -space with the action ˜γ (γ , m) = (Adγ (γ ), γ m) of γ ∈ . Using relation (4.5), we obtain: ∗ ∗ 1

˜γ ρ = ˜γ −v() + 2 (ι¯ v)() = −( ∗γ v)(Adγ∗ ) + 2 (ι Ad 1

∗ γ −1 (Adγ )

= −(v(Adγ −1 (Adγ∗ )) + 2 (ι Ad 1

∗γ v)(Adγ∗ )

∗ γ −1 (Adγ )

v)(Adγ −1 (Adγ∗ )),

(A.2.1)

where the 2nd equality follows from the 2nd of relations (3.24). The identity Adγ∗ = Adγ () implies that the right-hand side is γ -independent so that the -invariance of ρ follows. Let us pass to the proof of relation (3.31). Using the equality (γ1 γ2 ) = Adγ −1 (γ1 )+ 2

(γ2 ), we obtain on 2 × M, ρ12 (γ1 , γ2 , m) = ρ(γ1 γ2 , m)

1 = − v(Adγ −1 (γ1 )) (m) − v((γ2 )) (m)+ 2 ι Ad −1 (γ1 ) v(Adγ −1 (γ1 )) (m) 2 2 γ2 1 1 + 2 ι(γ ¯ 2 ) v(Adγ −1 (γ1 )) (m) + 2 ι Ad −1 (γ1 ) v((γ2 )) (m) 2 γ2 1 + 2 ι(γ (A.2.2) ¯ 2 ) v((γ2 )) (m).

Using, again, the 2nd of relations (3.24) as well as the last of equalities (3.6), identity (4.5) and, finally, equality (3.27), we may rewrite the last identity as ρ12 (γ1 , γ2 , m) 1 = − ∗γ2 v((γ1 )) (m) − v((γ2 )) (m) + 2 ∗γ2 (ι(γ ¯ 1 ) v((γ1 ))) (m) 1 ∗ + ι(γ ι

v((γ )) (m) + v((γ )) (m) ¯ 2 ) γ2 ¯ 2) 1 2 2 (γ 1 ∗ (m) − v((γ2 )) (m) = exp[−ι(γ ¯ 2 ) ] γ2 −v((γ1 )) + 2 ι(γ ¯ 1 ) v((γ1 )) 1 + 2 ι(γ ¯ 2 ) v((γ2 )) (m) 1 = −v((γ1 )) + 2 ι(γ ¯ 1 ) v((γ1 )) (γ2 m) + ρ(γ2 , m) = ρ(γ1 , γ2 m) + ρ(γ2 , m) = ρ1,23 + ρ2,3 (γ1 , γ2 , m). (A.2.3) 3. Proof of Proposition 4.2. Note, first, that the action L h of the gauge transformation h on × M defined in (4.1) may be factored through × × M as

Kh

/ (x, h(x), m)

/ (x, h(x)m) .

(A.3.1)

L ∗h G A = K h∗ (Id × )∗ G A = K h∗ (Id × )∗ (Iρ A ⊗ G2 ),

(A.3.2)

(x, m)

Id×

It follows that


see the 2nd of Eqs. (3.9). Now, (Id × )∗ (Iρ A ⊗ G2 ) = I(ρ A )1,23 ⊗ G23 ,

(A.3.3)

with the indices referring to the factors of × × M so that (ρ A )1,23 = (Id × )∗ ρ A . From the definition (4.8) of the gerbe F, it follows that

∗ G = G12 ∼ = Iρ ⊗ G2 ⊗ F,

(A.3.4)

where ∼ = stands for “is 1-isomorphic to”. Equation (A.3.3) then implies (Id × )∗ (Iρ A ⊗ G2 ) ∼ = I(ρ A )1,23 ⊗ Iρ2,3 ⊗ G3 ⊗ F2,3 = I(ρ A )1,23 +ρ2,3 ⊗ G3 ⊗ F2,3 . (A.3.5) The substitution of this identity into the right hand side of relation (A.3.2) gives L ∗h G A ∼ = K h∗ (I(ρ A )1,23 +ρ2,3 ⊗ G3 ⊗ F2,3 ) = Iω ⊗ G2 ⊗ (h × Id)∗ F,

(A.3.6)

ω := K h∗ ((ρ A )1,23 + ρ2,3 ) = L ∗h ρ A + (h × Id)∗ ρ

(A.3.7)

where

is a 2-form on the product space × M that is identified in Lemma A.3.1. ω = ρh −1 A . Proof. On the one hand,

L ∗h ρ A (x, m) = ρ A (x, h(x)m) = exp[−ι(h ∗ )(x) ] ∗h(x) (ρ A )(x, ·) (m) 1 (m) = exp[−ι(h ∗ )(x) ] ∗h(x) −v(A(x)) + 2 ι A(x) ¯ v(A(x)) = − v((Adh −1 (A))(x)) + ι(h ∗ )(x) v((Adh −1 (A))(x)) 1 (A.3.8) + 2 ι(Ad −1 (A))(x) v((Adh −1 (A))(x)) (m). h

On the other hand, 1 (h × Id)∗ ρ (x, m) = −v(h ∗ ) + 2 ιh ∗ v(h ∗ ) (x, m).

(A.3.9)

Adding both expressions and using the 3rd of relations (3.6), we infer that L ∗h ρ A + (h × Id)∗ ρ = −v(h ∗ + Adh −1 (A)) + 2 ιh ∗ +Ad 1

= −v(h −1 A) + 2 ιh −1 A v(h −1 A) 1

which is the identity claimed by Lemma A.3.1.

h −1 (A)

v(h ∗ + Adh −1 (A)) (A.3.10)

Replacing A by h A and recalling definition (3.9) of the gerbe G A , we infer from Eq. (A.3.6) and Lemma A.3.1 the existence of the 1-isomorphism required by Proposition 4.2.


4. Proof of Theorem 5.3. To prove Theorem 5.3, we shall show the existence of a canonical equivalence Gr b∇(M)0 ∼ = Gr b∇(M )

(A.4.1)

of 2-categories. Here, $M$ is assumed to be a left principal $\Gamma$-bundle over $M'$. On the left-hand side of (A.4.1) is the 2-category of $\Gamma$-equivariant gerbes over $M$ whose 2-form $\rho$ vanishes. On the right-hand side is the 2-category of gerbes over the quotient $M' = M/\Gamma$. We shall show that the equivalence (A.4.1) is a consequence of the fact that gerbes form a sheaf of 2-categories over smooth manifolds. We shall first recall some details about this fact.

Associated to any surjective submersion $\omega : M \to M'$, we consider the descent 2-category $\mathrm{Des}(\omega)$ defined as follows, with $\pi_{i_1 \ldots i_q}$ standing for the projection from a $p$-fold fiber product $M^{[p]} = M \times_{M'} M \times_{M'} \cdots \times_{M'} M$ to the $q$-fold fiber product $M^{[q]}$ forgetting all but the $i_1, \ldots, i_q$ components. An object in $\mathrm{Des}(\omega)$ is a triple $(\mathcal{G}, C, \lambda)$ consisting of a gerbe $\mathcal{G}$ over $M$, a 1-isomorphism $C : \pi_1^*\mathcal{G} \to \pi_2^*\mathcal{G}$ over $M^{[2]}$ and a 2-isomorphism

\[
\lambda : \pi_{23}^* C \circ \pi_{12}^* C \Longrightarrow \pi_{13}^* C
\tag{A.4.2}
\]

over $M^{[3]}$ such that the diagram

\[
\begin{array}{ccc}
\pi_{34}^* C \circ \pi_{23}^* C \circ \pi_{12}^* C & \stackrel{\mathrm{Id} \circ \pi_{123}^* \lambda}{\Longrightarrow} & \pi_{34}^* C \circ \pi_{13}^* C \\[4pt]
{\scriptstyle \pi_{234}^* \lambda \circ \mathrm{Id}}\ \Big\Downarrow & & \Big\Downarrow\ {\scriptstyle \pi_{134}^* \lambda} \\[4pt]
\pi_{24}^* C \circ \pi_{12}^* C & \stackrel{\pi_{124}^* \lambda}{\Longrightarrow} & \pi_{14}^* C
\end{array}
\tag{A.4.3}
\]

of 2-isomorphisms over M [4] is commutative. A 1-morphism (D, κ) : (G a , C a , λa ) in Des(ω) is a 1-isomorphism D : G a

/ (G b , C b , λb )

(A.4.4)

/ G b of gerbes over M and a 2-isomorphism

κ : π2∗ D ◦ C a

+3 C b ◦ π ∗ D 1

(A.4.5)

such that the diagram ∗ Ca ⊗ π ∗ Ca π3∗ D ⊗ π23 QQ12 k k QQQ kk a k k QId⊗λ k QQQQ k k k k k QQQ qy kk ,$ ∗ Ca ∗ ∗ ∗ b a π3∗ D ⊗ π13 π23 C ⊗ π2 D ⊗ π12 C ????? { ???? {{{{ {{{{{∗ ???? { { ∗ { Id⊗π12 κ ? # y {{{ π13 κ ∗ ∗ ∗ ∗ b b b 3 + π C ⊗π C ⊗π D π C ⊗ π ∗D ∗ κ⊗Id π23

23

12

1

λb ⊗Id

13

1

(A.4.6)


of 2-isomorphisms over M [3] is commutative. Finally, a 2-isomorphism ε : (D, κ) +3 D such that the diagram (D , κ ) in Des(ω) is a 2-isomorphism ε : D π2∗ D ◦ C a π2∗ ε◦Id

κ

1

(A.4.7)

Id◦π1∗ ε

π2∗ D ◦ C a

+3 C b ◦ π ∗ D

+3

κ

+3 C b ◦ π ∗ D 1

of 2-isomorphisms over M [2] is commutative. Composition and identities in Des(ω) are defined in the natural way. There is an obvious functor ω∗ : Gr b∇(M )

/ Des(ω)

(A.4.8)

which sends a gerbe G over M to the triple (ω∗ G, Id, Id), and is defined analogously for 1-morphisms and 2-morphisms. An important part of the statement that gerbes form a sheaf of 2-categories over smooth manifolds is the gluing axiom for this sheaf. Using the definitions introduced above, it can be formulated in the following way. Theorem A.4.2. For any surjective submersion ω : M an equivalence of 2-categories.

/ M , the functor (A.4.8) is

This was proven in [50], Prop. 6.7, in a setup with (bundle) gerbes without connections, but the proof actually works also for gerbes with connection. The equivalence (A.4.1) that we have to prove is now a consequence of Theorem A.4.2 and the following relation between equivariant gerbes and the descent 2-categories introduced above. Here, we remark that the projection of any principal G-bundle is a surjective submersion. Lemma A.4.3. Let M be a (left) principal -bundle over M with projection ω : M / M . Then, there is a canonical equivalence of 2-categories Gr b∇(M)0 ∼ = Des(ω).

(A.4.9)

Proof. Since M is a principal -bundle over M , there are diffeomorphisms f p : p−1 × / M [ p] , M (γ1 , . . . , γ p−1 , m)

fp

/ (γ1 . . . γ p−1 m, γ2 . . . γ p−1 m, . . . , γ p−1 m, m). (A.4.10)

/ q × M that we The diffeomorphisms f p exchange various maps ... : p−1 × M [ p] [q] / M in the following way: introduced in Sect. 3.3 with projections πi1 ,...,iq : M f 2 ◦ 2,3

(A.4.11)

12 = π1 ◦ f 2 and 2 = π2 ◦ f 2 , = π23 ◦ f 3 , f 2 ◦ 1,23 = π12 ◦ f 3 and f 2 ◦ 12,3 = π13 ◦ f 3 , (A.4.12) f 3 ◦ 1,2,34 = π123 ◦ f 4 , f 3 ◦ 12,3,4 = π134 ◦ f 4 , f 3 ◦ 2,3,4 = π234 ◦ f 4 and f 3 ◦ 1,23,4 = π124 ◦ f 4 . (A.4.13)


Consider a descent object (G, C, λ). Note that the curvature H of gerbe G is (without any extension) -equivariantly closed so that we may take ρ = 0 for the -equivariant struc/ π ∗G ture on G, see Definition 5.1. Using rules (A.4.11), the pullback of C : π1∗ G 2 [2] / along f 2 : G × M M is a 1-isomorphism α := f 2∗ C : ∗12 G

/ ∗ G, 2

(A.4.14)

and thus precisely the datum (i) we need for a -equivariant structure. Using rules ∗ C ◦ π∗ C +3 π ∗ C along f 3 is a (A.4.12), the pullback of the 2-isomorphism λ : π23 12 13 2-isomorphism β := f 3∗ λ : ∗2,3 α ◦ ∗1,23 α

+3 ∗ α, 12,3

(A.4.15)

and thus precisely the datum (ii) we need for the -equivariant structure. It is then easy to observe that the pullback of the commutative diagram (A.4.3) along f 4 is, using rules (A.4.13), precisely the diagram (5.1) in Definition 5.1. Thus, (G, α, β) is a -equivariant gerbe relative to the zero 2-form. In the same way one verifies, using (A.4.11)–(A.4.13), that 1-isomorphisms and 2-isomorphisms in Des(ω) pull back to 1-isomorphisms and 2-isomorphisms between -equivariant gerbes, respectively. This defines a functor f ∗ : Des(ω)

/ Gr b∇(M) . 0

(A.4.16)

This functor is an equivalence, because the maps f p are diffeomorphisms. Indeed, if (G, α, β) is a -equivariant gerbe then, using (A.4.11)–(A.4.13) again, one observes that C := ( f 2−1 )∗ α and λ := ( f 3−1 )∗ β make up a descent object (G, C, λ), and analogously for 1-isomorphisms and 2-isomorphisms. 5. Proof of Lemma 5.4. For ρ˜A and ˜ defined by Eqs. (5.8) and (5.9), one obtains by virtue of relations (5.7) and (3.27): ∗ (ρ˜A )1˜ 2˜ (γ , ( p, m)) = ˜ ρ˜A (γ , ( p, m)) = ∗1,3 ρ˜ Adγ (A−(γ )) (γ , p, m) ∗ = exp[−ι(γ ¯ ) ]( γ )3 ρ˜ Adγ (A−(γ )) ( p, m) ∗ = ( γ )∗3 ρ˜ Adγ (A−(γ )) ( p, m) − ι(γ ¯ ) ( γ )3 ρ˜ Adγ (A−(γ )) ( p, m). (A.5.1) The 2nd of relations (3.24) implies further that ( γ )∗3 ρ˜ Adγ (A−(γ )) ( p, m) ! 1 = ( γ )∗3 −v(Adγ (A −(γ )))+ 2 ι Adγ (A−(γ )) v(Adγ (A − (γ ))) ( p, m) 1 = −v(A − (γ )) + 2 ιA−(γ ) v(A − (γ )) ( p, m). (A.5.2) Hence, ( γ )∗3 ρ˜ Adγ (A−(γ )) ( p, m) 1 = −v(A) + 2 ιA¯ v(A) + v((γ )) +

1 ι ¯ ) v((γ )) 2 (γ


1 1 − 2 ι(γ ¯ ) v(A) − 2 ιA¯ v((γ )) ( p, m) 1 1 = −v(A) + 2 ιA¯ v(A) + v((γ )) + 2 ι(γ ¯ ) v((γ )) − ι(γ ¯ ) v(A) ( p, m), (A.5.3) where the last equality follows from the right one of relations (3.6). Consequently, ∗ ι(γ ¯ ) ( γ )3 ρ˜ Adγ (A−(γ )) ( p, m) = −ι(γ ¯ ) v(A) + ι(γ ¯ ) v((γ )) ( p, m). (A.5.4) Subtracting the last expression from the previous one, we infer from Eq. (A.5.1) the relation 1 1 (ρ˜A )1˜ 2˜ (γ , ( p, m)) = −v(A) + 2 ιA¯ v(A) + v((γ )) − 2 ι(γ ¯ ) v((γ )) ( p, m) = ρ˜A ( p, m) − ρ(γ , m).

(A.5.5)

This is the identity claimed by Lemma 5.4.

6. Construction of flat gerbes from characters. Let $\Gamma$ be a connected Lie group and $\omega : M \to M'$ a left principal $\Gamma$-bundle. We shall assume that $M$ is also connected. One has $\Gamma = \tilde\Gamma / \tilde Z$ where $\tilde\Gamma$ is the covering group of $\Gamma$ and $\tilde Z$ is a subgroup of the center of $\tilde\Gamma$ and is naturally identified with the fundamental group of $\Gamma$. Note that $H^1(\Gamma, U(1)) \cong \tilde Z^*$. To each character $\chi \in \tilde Z^*$, there corresponds a flat line bundle $L_\chi$ composed of classes $[\tilde\gamma, u]_\chi$ of the equivalence relation on $\tilde\Gamma \times \mathbb{C}$

\[
(\tilde\gamma, u) \sim_\chi (\tilde\gamma z^{-1}, \chi(z) u)
\tag{A.6.1}
\]

for $z \in \tilde Z$. We can associate to this line bundle $L_\chi$ a flat gerbe $\mathcal{G}_\chi = (Y, B, L, \mu)$ over $M'$ using the geometric description of gerbes mentioned in the beginning of Sect. 7. We shall take $Y = M$ with the canonical projection on $M'$ and a vanishing curving $B = 0$. The fiber products $Y^{[p]} = M^{[p]}$ may be naturally identified with $\Gamma^{p-1} \times M$ by the map $f_p$ given by Eq. (A.4.10). For the line bundle $L$ we shall take the pullback of $L_\chi$ along the map $Y^{[2]} \ni (\gamma m, m) \mapsto \gamma \in \Gamma$. The groupoid multiplication $\mu$ is then induced by the map

\[
[\tilde\gamma_1, u_1]_\chi \otimes [\tilde\gamma_2, u_2]_\chi \longmapsto [\tilde\gamma_1 \tilde\gamma_2, u_1 u_2]_\chi.
\tag{A.6.2}
\]

It is easy to show that the pullback gerbe $\omega^* \mathcal{G}_\chi$ is 1-isomorphic to the trivial gerbe $\mathcal{I}_0$ on $M$ and that $\mathcal{G}_\chi$ is 1-isomorphic to the trivial gerbe on $M'$ if and only if the flat line bundle $L_\chi$ extends from a fiber of the bundle $\omega : M \to M'$ to $M$. The 1-isomorphism class of the flat gerbe $\mathcal{G}_\chi$ gives the element of $H^2(M', U(1))$ associated by the middle homomorphism $\tau$ in the exact sequence (6.59) to the element of $H^1(\Gamma, U(1))$ identified with the character $\chi$ of $\tilde Z$.

7. Behavior of isomorphism α under groupoid multiplication. We verify here that, for the line bundle isomorphism $\alpha$ constructed in Sect. 7.2, the two composed isomorphisms (7.49) and (7.50) coincide so that $\alpha$ defines a 1-isomorphism between the gerbes $(\mathcal{G}_k)_{12}$ and $\mathcal{I}_\rho \otimes (\mathcal{G}_k)_2$ over the product group $\Gamma \times \tilde G$. Similarly as for $W_{12}^{[2]}$, see Eq. (7.39), we have:


[3] W12 = (G˜ × Y [3] )/ Z˜ .

(A.7.1)

[3] Over (G˜ × Yi1 j1 i2 j2 i3 j3 )/ Z˜ ⊂ W12 , consider elements i1 i2 . . . j1 j3 in the respective fibers of L. The composition (7.49) of line bundle isomorphisms is induced by the map

Id⊗α˜ 2,3 / γ˜ , i i ⊗ i j ⊗ j j γ˜ , i1 i2 ⊗ i2 i3 ⊗ i3 j3 1 2 2 2 2 3 α˜ 1,2 ⊗Id

/

γ˜ , i1 j1 ⊗ j1 j2 ⊗ j2 j3

Id×(Id⊗(μ 2 )1,2,3 ) / γ˜ , i1 j1 ⊗ j1 j3

(A.7.2)

with μ( i2 i3 ⊗ i3 j3 ) = μ( i2 j2 ⊗ j2 j3 ), μ( i1 i2 ⊗ i2 j2 ) = μ( i1 j1 ⊗ j1 j2 ) and μ( j1 j2 ⊗

j2 j3 ) = j1 j3 . The associativity of the groupoid multiplication μ then implies that μ( i1 i2 ⊗μ( i2 i3 ⊗ i3 j3 )) = μ( i1 i2 ⊗μ( i2 j2 ⊗ j2 j3 )) = μ(μ( i1 i2 ⊗ i2 j2 ) ⊗ j2 j3 ) = μ(μ( i1 j1 ⊗ j1 j2 ) ⊗ j2 j3 ) = μ( i1 j1 ⊗μ( j1 j2 ⊗ j2 j3 )) = μ( i1 j1 ⊗ j1 j3 ). (A.7.3) Similarly, the composition (7.50) descends from the map ⊗Id) Id×((μ 12 )1,2,3 / γ˜ , i1 i2 ⊗ i2 i3 ⊗ i3 j3 γ˜ , i1 i3 ⊗ i3 j3

α˜ 1,3

/

γ˜ , i1 j1 ⊗ j1 j3 (A.7.4)

with μ( i1 i2 ⊗ i2 i3 ) = i1 i3 and μ( i1 i3 ⊗ i3 j3 ) = μ( i1 j1 ⊗ j1 j3 ). Now μ(μ( i1 i2 ⊗ i2 i3 ) ⊗ i3 j3 ) = μ( i1 i3 ⊗ i3 j3 ) = μ( i1 j1 ⊗ j1 j3 ).

(A.7.5)

Comparison between the relations (A.7.3) and (A.7.5) and the use of the associativity of μ show that the target elements of (A.7.2) and (A.7.4) coincide if the initial elements are the same. That demonstrates the identity of two composed line bundle isomorphisms (7.49) and (7.50).

8. Commutativity of diagram (7.74). We shall prove that diagram (7.74) of isomor[2] phisms of line bundles over W123 is commutative. Over subspace (G˜ 2 ×Yi1 j1 k1 i2 j2 k2 )/ Z˜ 2 ⊂ [2] W123 , with notations similar to those in the previous Appendix, the top line of the diagram is induced by the composite map α˜ 1,23 ⊗Id / γ˜1 , γ˜2 , i j ⊗ j j ⊗ j k γ˜1 , γ˜2 , i1 i2 ⊗ i2 j2 ⊗ j2 k2 1 1 1 2 2 2 Id⊗ α˜ 2,3/ γ˜ , γ˜ ,

1 2 i 1 j1 ⊗ j1 k1 ⊗ k1 k2

(A.8.1)

with μ( i1 i2 ⊗ i2 j2 ) = μ( i1 j1 ⊗ j1 j2 ) and μ( j1 j2 ⊗ j2 k2 ) = μ( j1 k1 ⊗ μk1 k2 ) which imply that μ(μ( i1 i2 ⊗ i2 j2 ) ⊗ j2 k2 ) = μ(μ( i1 j1 ⊗ j1 j2 ) ⊗ j2 k2 ) = μ( i1 j1 ⊗ μ( j1 j2 ⊗ j2 k2 )) = μ( i1 j1 ⊗ μ( j1 k1 ⊗ k1 k2 )).

(A.8.2)


The bottom line of the diagram (7.74) descends from the map γ˜1 , γ˜2 , i1 i2 ⊗ i2 k2

α˜ 12,3

/

γ˜1 , γ˜2 , i1 k1 ⊗ k1 k2

(A.8.3)

with Assuming that

μ( i1 i2 ⊗ i2 k2 ) = μ( i1 k1 ⊗ k1 k2 ).

(A.8.4)

γ˜1 , γ˜2 , i2 k2 = β˜ γ˜1 , γ˜2 , i2 j2 ⊗ j2 k2 and γ˜1 , γ˜2 , i1 k1 = β˜ γ˜1 , γ˜2 , i1 j1 ⊗ j1 k1 ,

(A.8.5)

i.e. that i2 k2 = μ( i2 j2 ⊗ j2 k2 ) and i1 k1 = μ( i1 j1 ⊗ j1 k1 ), we infer from comparison between Eqs. (A.8.4) and (A.8.2) that the target elements of (A.8.1) and (A.8.3) coincide, establishing the commutativity of diagram (7.74). 9. Proof of the equality of isomorphisms (7.76) and (7.77). Similarly as before, one may identify W1234 = (G˜ 3 × Y [4] )/ Z˜ 3

(A.9.1)

with the action of Z˜ 3 given by γ˜1 , γ˜2 , γ˜3 , (y, y , y , y ) / z 1 γ˜1 , z 2 γ˜2 , z 3 γ˜3 , (y(z 1 z 2 z 3 )−1 , y (z 2 z 3 )−1 , y z −1 , y ) . (A.9.2) 3 The different pullbacks of the bundle E over W12 to W1234 may be identified as E 1,234 ∼ = (G˜ 3 × L 1,2 )/ Z˜ 3 ,

E 2,34 ∼ = (G˜ 3 × L 2,3 )/ Z˜ 3 ,

E 23,4 ∼ = (G˜ 3 × L 2,4 )/ Z˜ 3 ,

E 12,34 ∼ = (G˜ 3 × L 1,3 )/ Z˜ 3 ,

E 3,4 ∼ = (G˜ 3 × L 3,4 )/ Z˜ 3 , E 123,4 ∼ = (G˜ 3 × L 1,4 )/ Z˜ 3 , (A.9.3)

with appropriate actions of Z˜ 3 and appropriate modifications of the connection of the pullbacks of L. If (y, y , y , y ) ∈ Yi jkl ⊂ Y [4] and i j ∈ L (y,y ) ⊂ L 1,2 , . . . . . . , il ∈ L (y,y ) ⊂ L 1,4 , then the composition (7.76) of the line bundle isomorphisms is induced by the map γ˜1 , γ˜2 , γ˜3 , i j ⊗ jk ⊗ kl

Id×Id⊗ β˜2,3,4

/

γ˜1 , γ˜2 , γ˜3 , i j ⊗ jl

˜ Id× β1,23,4

/ (γ˜1 , γ˜2 , γ˜3 , il )

(A.9.4)

with jl = μ( jk ⊗ kl ) and il = μ( i j ⊗ jl ) = μ( i j ⊗ μ( jk ⊗ kl )). On the other hand, the composition (7.77) is given by Id× β˜1,2,34 ⊗Id / (γ˜1 , γ˜2 , γ˜3 , ik ⊗ kl ) γ˜1 , γ˜2 , γ˜3 , i j ⊗ jk ⊗ kl ˜ Id× β12,3,4

/ (γ˜1 , γ˜2 , γ˜3 , il )

(A.9.5)

with ik = μ( i j ⊗ jk ) and il = μ( ik ⊗ kl ) = μ(μ( i j ⊗ jk ) ⊗ kl ). Using the associativity of μ, we infer that the two compositions give the same line-bundle isomorphism.


References

1. Alvarez, O.: Topological quantization and cohomology. Commun. Math. Phys. 100, 279–309 (1985)
2. Bardakci, K., Rabinovici, E., Säring, B.: String models with c < 1 components. Nucl. Phys. B 299, 151–182 (1988)
3. Bertlmann, R.A.: Anomalies in Quantum Field Theory. Oxford-New York: Oxford University Press, 2000
4. Brown, K.S.: Cohomology of Groups. Berlin-Heidelberg-New-York: Springer, 1982
5. Carey, A.L., Johnson, S., Murray, M.K., Stevenson, D., Wang, B.L.: Bundle gerbes for Chern-Simons and Wess-Zumino-Witten theories. Commun. Math. Phys. 259, 577–613 (2005)
6. Carey, A.L., Murray, M.K., Wang, B.L.: Higher bundle gerbes and cohomology classes in gauge theories. J. Geom. Phys. 21, 183–197 (1997)
7. Chatterjee, D.S.: On gerbes. Ph.D. thesis, Trinity College, Cambridge, 1998
8. Di Vecchia, P., Durhuus, B., Petersen, J.L.: The Wess-Zumino action in two dimensions and non-abelian bosonization. Phys. Lett. B 144, 245–249 (1984)
9. Dunbar, D.C., Joshi, K.G.: Maverick examples of coset conformal field theories. Mod. Phys. Lett. A 8, 2803–2814 (1993)
10. Dubrovin, B.A., Fomenko, A.T., Novikov, S.P.: Modern Geometry - Methods and Applications. Part III, Introduction to Homology Theory. Berlin-Heidelberg-New-York: Springer, 1990
11. Fabbrichesi, M.: Cancellation of global anomalies in spontaneously broken gauge theories. Pramana 62, 725–727 (2004)
12. Figueroa-O’Farrill, J.M., Mohammedi, N.: Gauging the Wess-Zumino term of a sigma model with boundary. JHEP 08, 086 (2005)
13. Figueroa-O’Farrill, J.M., Stanciu, S.: Equivariant cohomology and gauged bosonic σ-models, http://arXiv.org/abs/hep-th/9407149v3, 1994
14. Figueroa-O’Farrill, J.M., Stanciu, S.: Gauged Wess-Zumino terms and equivariant cohomology. Phys. Lett. B 341, 153–159 (1994)
15. Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess-Zumino-Witten models with arbitrary simple groups. Commun. Math. Phys. 117, 127–158 (1988)
16. Fuchs, J., Schellekens, B., Schweigert, C.: The resolution of field identification fixed points in diagonal coset theories. Nucl. Phys. B 461, 371–406 (1996)
17. Gawędzki, K.: Topological actions in two-dimensional quantum field theories. In: Hooft, G.’t, Jaffe, A., Mack, G., Mitter, P.K., Stora, R. (eds.) Non-perturbative Quantum Field Theory. New York: Plenum Press, 1988, pp. 101–142
18. Gawędzki, K.: Conformal field theory. In: Séminaire Bourbaki, Exposé 704, Astérisque 177/178, 95–126 (1989)
19. Gawędzki, K.: Geometry of Wess-Zumino-Witten models of conformal field theory. In: Recent Advances in Field Theory. Binétruy, P., Girardi, G., Sorba, P. (eds.) Nucl. Phys. (Proc. Suppl.) B 18, 78–91 (1990)
20. Gawędzki, K.: Abelian and non-Abelian branes in WZW models and gerbes. Commun. Math. Phys. 258, 23–73 (2005)
21. Gawędzki, K., Kupiainen, A.: G/H conformal field theory from gauged WZW model. Phys. Lett. B 215, 119–123 (1988)
22. Gawędzki, K., Kupiainen, A.: Coset construction from functional integral. Nucl. Phys. B 320, 625–668 (1989)
23. Gawędzki, K., Reis, N.: WZW branes and gerbes. Rev. Math. Phys. 14, 1281–1334 (2002)
24. Gawędzki, K., Reis, N.: Basic gerbe over non simply connected compact groups. J. Geom. Phys. 50, 28–55 (2004)
25. Gawędzki, K., Waldorf, K.: Polyakov-Wiegmann formula and multiplicative gerbes. JHEP 09, 073 (2009)
26. Gawędzki, K., Suszek, R.R., Waldorf, K.: WZW orientifolds and finite group cohomology. Commun. Math. Phys. 284, 1–49 (2008)
27. Gawędzki, K., Suszek, R.R., Waldorf, K.: Bundle gerbes for orientifold sigma models, http://arXiv.org/abs/0809.5125v2 [math-ph], 2008
28. Gepner, D., Witten, E.: String theory on group manifolds. Nucl. Phys. B 278, 493–549 (1986)
29. Goddard, P.: Infinite dimensional Lie algebras: representations and applications. In: WSGP5, Proceedings of the Winter School “Geometry and Physics”. Frolík, Z., Souček, V., Vinárek, J. (eds.), Palermo: Circolo Matematico di Palermo, 1985, pp. 73–107
30. Goddard, P., Kent, A., Olive, D.: Virasoro Algebras and Coset Space Models. Phys. Lett. B 152, 88–92 (1985)
31. Gomi, K.: Equivariant smooth Deligne cohomology. Osaka J. Math. 42, 309–337 (2005)
32. Hitchin, N.J.: Lectures on special Lagrangian submanifolds. In: Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds. Vafa, C., Yau, S.-T. (eds.) AMS/IP Stud. Adv. Math. Vol. 23, Providence, RI: Amer. Math. Soc., 2001, pp. 151–182
33. Hori, K.: Global aspects of gauged Wess-Zumino-Witten models. Commun. Math. Phys. 182, 1–32 (1996)
34. Hull, C.M.: Global aspects of T-duality, gauged sigma models and T-folds. JHEP 10, 057 (2007)
35. Hull, C.M.: Doubled geometry and T-folds. JHEP 07, 080 (2007)
36. Hull, C.M., Spence, B.: The gauged nonlinear sigma model with Wess-Zumino term. Phys. Lett. B 232, 204–210 (1989)
37. Jack, I., Jones, D.R.T., Mohammedi, N., Osborn, H.: Gauging the general σ-model with a Wess-Zumino term. Nucl. Phys. B 332, 359–379 (1990)
38. Kalkman, J.: BRST model for equivariant cohomology and representatives for the equivariant Thom class. Commun. Math. Phys. 153, 447–463 (1993)
39. Kac, V.G.: Infinite dimensional Lie algebras, 2nd edition, Cambridge: Cambridge University Press, 1985
40. Karabali, D., Park, Q., Schnitzer, H.J., Yang, Z.: A GKO construction based on a path integral formulation of gauged Wess-Zumino-Witten actions. Phys. Lett. B 216, 307–312 (1989)
41. Kreuzer, M., Schellekens, A.N.: Simple currents versus orbifolds with discrete torsion - a complete classification. Nucl. Phys. B 411, 97–121 (1994)
42. Meinrenken, E.: The basic gerbe over a compact simple Lie group. Enseign. Math. 49, 307–333 (2003)
43. Murray, M.K.: Bundle gerbes. J. London Math. Soc. 54(2), 403–416 (1996)
44. Murray, M.K., Stevenson, D.: Bundle gerbes: stable isomorphisms and local theory. J. London Math. Soc. 62(2), 925–937 (2000)
45. Nikolaus, T.: Äquivariante Gerben und Abstieg. Diploma thesis, University of Hamburg, 2009
46. Petersen, J.L.: Non-abelian chiral anomalies and Wess-Zumino effective actions. Acta Phys. Polon. B 16, 271–300 (1985)
47. Schellekens, A.N., Yankielowicz, S.: Field identification fixed points in the coset construction. Nucl. Phys. B 334, 67–102 (1990)
48. Schreiber, U., Schweigert, C., Waldorf, K.: Unoriented WZW models and holonomy of bundle gerbes. Commun. Math. Phys. 274, 31–64 (2007)
49. Serre, J.-P.: Homologie singulière des espaces fibrés. Ann. of Math. 54, 425–505 (1951)
50. Stevenson, D.: The geometry of bundle gerbes. Ph.D. thesis, University of Adelaide, 2000, http://arXiv.org/abs/0004117v1 [math.DG], 2000
51. Hooft, G.’t.: Naturalness, chiral symmetry, and spontaneous chiral symmetry breaking. In: Recent Developments in Gauge Theories. Hooft, G.’t, Itzykson, C., Jaffe, A., Lehmann, H., Mitter, P.K., Singer, I.M., Stora, R. (eds.), New York: Plenum Press, 1980
52. Tu, J.-L.: Groupoid cohomology and extensions. Trans. Amer. Math. Soc. 358, 4721–4747 (2006)
53. Vafa, C.: Modular invariance and discrete torsion on orbifolds. Nucl. Phys. B 273, 592–606 (1986)
54. Waldorf, K.: More morphisms between bundle gerbes. Theory Appl. Categ. 18, 240–273 (2007)
55. Waldorf, K.: Multiplicative bundle gerbes with connection. Diff. Geom. Appl. 28(3), 313–340 (2010)
56. Weinberg, S.: The Quantum Theory of Fields, Vol. 2: Modern Applications. Cambridge: Cambridge University Press, 1996
57. Wess, J., Zumino, B.: Consequences of anomalous Ward identities. Phys. Lett. B 37, 95–97 (1971)
58. Witten, E.: An SU(2) anomaly. Phys. Lett. B 117, 324–328 (1982)
59. Witten, E.: Non-abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)
60. Witten, E.: On holomorphic factorization of WZW and coset models. Commun. Math. Phys. 144, 189–212 (1992)
61. Wu, S.: Cohomological obstructions to the equivariant extension of closed invariant forms. J. Geom. Phys. 10, 381–392 (1993)

Communicated by A. Kapustin

Commun. Math. Phys. 302, 581–630 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1178-5


Sharp Convergence Rate of the Glimm Scheme for General Nonlinear Hyperbolic Systems

Fabio Ancona, Andrea Marson

Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Trieste 63, 35121 Padova, Italy. E-mail: [email protected]; [email protected]

Received: 19 March 2009 / Accepted: 18 August 2010
Published online: 28 January 2011 – © Springer-Verlag 2011

Abstract: Consider a general strictly hyperbolic, quasilinear system, in one space dimension

\[
u_t + A(u) u_x = 0,
\tag{1}
\]

where $u \mapsto A(u)$, $u \in \Omega \subset \mathbb{R}^N$, is a smooth matrix-valued map. Given an initial datum $u(0, \cdot)$ with small total variation, let $u(t, \cdot)$ be the corresponding (unique) vanishing viscosity solution of (1) obtained as a limit of solutions to the viscous parabolic approximation $u_t + A(u) u_x = \mu u_{xx}$, as $\mu \to 0$. For every $T \ge 0$, we prove the a-priori bound

\[
\big\| u^\varepsilon(T, \cdot) - u(T, \cdot) \big\|_{L^1} = o(1) \cdot \sqrt{\varepsilon}\, |\log \varepsilon|
\tag{2}
\]

for an approximate solution $u^\varepsilon$ of (1) constructed by the Glimm scheme, with mesh size $\Delta x = \Delta t = \varepsilon$, and with a suitable choice of the sampling sequence. This result provides for general hyperbolic systems the same type of error estimates valid for Glimm approximate solutions of hyperbolic systems of conservation laws $u_t + F(u)_x = 0$ satisfying the classical Lax or Liu assumptions on the eigenvalues $\lambda_k(u)$ and on the eigenvectors $r_k(u)$ of the Jacobian matrix $A(u) = DF(u)$. The estimate (2) is obtained introducing a new wave interaction functional with a cubic term that controls the nonlinear coupling of waves of the same family and at the same time decreases at interactions by a quantity that is of the same order of the product of the wave strength times the change in the wave speeds. This is precisely the type of errors arising in a wave tracing analysis of the Glimm scheme, which is crucial to control in order to achieve an accurate estimate of the convergence rate as (2).

Contents
1. Introduction
2. Preliminaries
3. New Wave Interaction Potential
4. Bounds on the Oscillations of the Interaction Potential
   4.1 A wave-partition algorithm
   4.2 Oscillation of the interaction potential for subwaves
   4.3 A functional measuring the oscillation of the interaction potential
   4.4 Uniform bound on the oscillations of the interaction potential of quadratic order in the total variation
5. Wave Tracing for General Quasilinear Systems
6. Conclusion

1. Introduction

Consider a general strictly hyperbolic $N \times N$ quasilinear system in one space dimension

\[ u_t + A(u)\,u_x = 0, \tag{1.1} \]

where $u \mapsto A(u)$ is a $C^2$ matrix-valued map defined from a domain $\Omega \subseteq \mathbb{R}^N$ into $M^{N\times N}(\mathbb{R})$, and $A(u)$ has $N$ real distinct eigenvalues

\[ \lambda_1(u) < \cdots < \lambda_N(u) \qquad \forall\, u \in \Omega. \tag{1.2} \]

Denote by $r_1(u), \dots, r_N(u)$ a corresponding basis of right eigenvectors. The fundamental paper of Bianchini and Bressan [9] shows that (1.1) generates a unique (up to the domain) Lipschitz continuous semigroup $\{S_t : t \ge 0\}$ of vanishing viscosity solutions with small total variation obtained as the (unique) limits of solutions to the (artificial) viscous parabolic approximation

\[ u_t + A(u)\,u_x = \mu\,u_{xx}, \tag*{$(1.3)_\mu$} \]

when the viscosity coefficient $\mu \to 0$. In particular, in the conservative case where $A(u)$ is the Jacobian matrix of a flux function $F(u)$, every vanishing viscosity solution of (1.1) provides a weak solution (in a distributional sense) of

\[ u_t + F(u)_x = 0, \tag{1.4} \]

satisfying an admissibility criterion proposed by T.P. Liu in [24,25], which generalizes the classical stability conditions introduced by Lax [22].

Definition 1. A shock discontinuity of the $k$-th family $(u^L, u^R)$, traveling with speed $\sigma_k[u^L,u^R]$, is Liu admissible if, for any state $u$ lying on the Hugoniot curve $S_k[u^L]$ between $u^L$ and $u^R$, the shock speed $\sigma_k[u^L,u]$ of the discontinuity $(u^L,u)$ satisfies

\[ \sigma_k[u^L,u] \ge \sigma_k[u^L,u^R]. \tag{1.5} \]
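As an illustration of Definition 1, the following minimal sketch tests condition (1.5) for a scalar conservation law, where the Hugoniot curve is simply the segment between the two states and the shock speed is the Rankine-Hugoniot quotient. The function names, the sampling of intermediate states and the tolerance are assumptions made for this example only; they do not come from the paper.

```python
def shock_speed(f, a, b):
    """Rankine-Hugoniot speed of the jump (a, b) for the scalar flux f."""
    return (f(b) - f(a)) / (b - a)


def is_liu_admissible(f, u_left, u_right, samples=1000, tol=1e-12):
    """Check (1.5) on a grid of intermediate states between u_left and u_right."""
    sigma = shock_speed(f, u_left, u_right)
    for i in range(1, samples):
        u = u_left + (u_right - u_left) * i / samples
        # (1.5): the speed of every partial jump (u_left, u) must not be
        # smaller than the speed of the full jump (u_left, u_right)
        if shock_speed(f, u_left, u) < sigma - tol:
            return False
    return True


def burgers(u):
    return 0.5 * u * u


def cubic(u):
    return u ** 3
```

For the convex Burgers flux the check reduces to the Lax condition (only decreasing jumps pass), while for the cubic flux the jump from 1 to -1 fails and the jump from 1 to -0.5 passes with equality at the right state, which is the contact-discontinuity situation discussed later in the text.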

Such a criterion needs to be imposed to rule out non-physical discontinuities, since weak solutions to Cauchy problems for (1.4) are not unique. Given an initial datum $\bar u \in L^1_{loc}(\mathbb{R};\mathbb{R}^N)$ with small total variation

\[ u(0,x) = \bar u(x), \tag{1.6} \]

the existence of global weak admissible solutions to (1.4)–(1.6) was first established in the celebrated paper of Glimm [18] under the additional assumption that each characteristic field $r_k$ be either linearly degenerate (LD), so that

\[ \nabla\lambda_k(u)\cdot r_k(u) = 0 \qquad \forall\, u, \tag{1.7} \]

A Sharp Convergence Rate of the Glimm Scheme


or else genuinely nonlinear (GNL), i.e.

\[ \nabla\lambda_k(u)\cdot r_k(u) \neq 0 \qquad \forall\, u. \tag{1.8} \]

A primary example of a system (1.4) satisfying such assumptions is provided by the Euler equations of non-viscous gases, see [15]. A random choice method, the Glimm scheme, was introduced in [18] to construct approximate solutions of the general Cauchy problem (1.4)–(1.6) by piecing together solutions of several Riemann problems, i.e. Cauchy problems whose initial data are piecewise constant with a single jump at the origin

\[ u(0,x) = \begin{cases} u^L & \text{if } x < 0,\\ u^R & \text{if } x > 0. \end{cases} \tag{1.9} \]

Using a nonlinear functional introduced by Glimm, which measures the nonlinear coupling of waves in the solution, one can establish a-priori bounds on the total variation of a family of approximate solutions. These uniform estimates then yield the convergence of a sequence of approximate solutions to the weak admissible solution of (1.4)–(1.6).

Unfortunately, assumptions (1.7) and (1.8) on the characteristic fields are too restrictive in several physical contexts, such as elastodynamics (e.g. see [16]), rigid heat conductors at low temperature [29,30], superfluids [28] or traffic flow models [14]. Hence, in recent years existence theorems for the Cauchy problem for systems not fulfilling assumptions (1.7)–(1.8) have been proved. In particular, the Glimm scheme for systems (1.4) in conservation form was extended by Liu [26], Liu and Yang [27], and by Iguchi and LeFloch [21] to the case of systems with non genuinely nonlinear (NGNL) characteristic families that exhibit finitely many points of lack of genuine nonlinearity along each elementary curve, and by Bianchini [7] to general hyperbolic systems (1.1).

The aim of the present paper is to provide a sharp convergence rate for approximate solutions obtained by the Glimm scheme valid for general hyperbolic quasilinear systems (1.1), without any additional assumption on $A(u)$ besides the strict hyperbolicity (1.2). We recall that in the Glimm scheme, one works with a fixed grid in the $t$-$x$ plane, with mesh sizes $\Delta t$, $\Delta x$.
An approximate solution $u^\varepsilon$ of (1.4)–(1.6) is then constructed as follows. By possibly performing a linear change of coordinates in the $t$-$x$ plane, we may assume that the characteristic speeds $\lambda_k(u)$, $1 \le k \le N$, take values in the interval $[0,1]$, for all $u \in \Omega$. Moreover, since the initial datum $\bar u \in L^1_{loc}(\mathbb{R};\mathbb{R}^N) \cap BV(\mathbb{R};\mathbb{R}^N)$, we may assume that we work with its right continuous representative. Then, choose $\Delta t = \Delta x \doteq \varepsilon$, and let $\{\theta_\ell\}_{\ell\in\mathbb{N}} \subset [0,1]$ be an equidistributed sequence of numbers, which thus satisfies the condition

\[ \lim_{n\to\infty} \Big| \lambda - \frac{1}{n}\sum_{\ell=0}^{n-1} \chi_{[0,\lambda]}(\theta_\ell) \Big| = 0 \qquad \forall\, \lambda \in [0,1], \tag{1.10} \]

where $\chi_{[0,\lambda]}$ denotes the characteristic function of the interval $[0,\lambda]$. On the initial strip $0 \le t < \varepsilon$, $u^\varepsilon$ is defined as the exact solution of (1.4) provided by the Riemann solver in [6], with starting condition

\[ u^\varepsilon(0,x) = \bar u\big((j+\theta_0)\varepsilon\big) \qquad \forall\, x \in\, ]j\varepsilon, (j+1)\varepsilon[. \]

The elementary waves of the corresponding Riemann problem do not interact within the strip because the characteristic speeds λk (u) take values in [0, 1]. Next, assuming that


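$u^\varepsilon$ has been constructed for $t \in [0, i\varepsilon[$, on the strip $i\varepsilon \le t < (i+1)\varepsilon$, $u^\varepsilon$ is defined as the exact solution of (1.4), with starting condition

\[ u^\varepsilon(i\varepsilon, x) = u^\varepsilon\big(i\varepsilon-, (j+\theta_i)\varepsilon\big) \qquad \forall\, x \in\, ]j\varepsilon,(j+1)\varepsilon[. \]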
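The restarting procedure just described can be sketched for the simplest possible case. The example below is only an illustration under stated assumptions: it treats the scalar Burgers equation $u_t + (u^2/2)_x = 0$ with data in $[0,1]$ (so that all wave speeds lie in $[0,1]$), takes the initial datum as already sampled on the cells, and uses the van der Corput sequence as a stand-in sampling sequence. The function names are hypothetical and this is not the Riemann solver of [6] nor the sequence of [12].

```python
def van_der_corput(n, base=2):
    """n-th term of the van der Corput sequence, an equidistributed sequence in [0, 1)."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def burgers_riemann(u_left, u_right, xi):
    """Value at x/t = xi of the entropy solution of the Burgers Riemann problem."""
    if u_left > u_right:
        # single shock travelling with the Rankine-Hugoniot speed
        speed = 0.5 * (u_left + u_right)
        return u_left if xi < speed else u_right
    # centered rarefaction (or a constant state when u_left == u_right)
    if xi <= u_left:
        return u_left
    if xi >= u_right:
        return u_right
    return xi


def glimm_step(cells, theta):
    """One step with dt = dx: cell j samples, at relative position theta,
    the Riemann fan issuing from its left edge (speeds lie in [0, 1])."""
    new_cells = []
    for j, u_right in enumerate(cells):
        u_left = cells[j - 1] if j > 0 else cells[0]  # constant extension on the left
        new_cells.append(burgers_riemann(u_left, u_right, theta))
    return new_cells


def glimm(cells, steps):
    """Apply `steps` restarts, using theta_i at the i-th restart time."""
    for i in range(1, steps + 1):
        cells = glimm_step(cells, van_der_corput(i))
    return cells
```

For a single shock joining the states 1 and 0 (speed 1/2), the front advances by one cell exactly when the sample falls below 1/2, so with this sampling sequence the discrete front moves 8 cells in 16 steps, matching the exact shock location.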

Relying on uniform a-priori bounds on the total variation, we thus define inductively the approximate solution $u^\varepsilon(t,\cdot)$ for all $t \ge 0$. One can repeat this construction with the same values $\theta_i$ for each time interval $[i\varepsilon,(i+1)\varepsilon[$, letting the mesh size $\varepsilon$ tend to zero. Hence, we obtain a parametrized family of solutions $u^\varepsilon$ which converge, by compactness, to some limit function $u$ that turns out to be a vanishing viscosity solution of (1.1), (1.6) (cf. [9]). In order to derive an accurate estimate of the convergence rate of the approximate solutions, an equidistributed sequence $\{\theta_\ell\}_{\ell\in\mathbb{N}} \subset [0,1]$ enjoying the following property was introduced in [12]. For any given $0 \le m < n$, define the discrepancy of the set $\{\theta_m,\dots,\theta_{n-1}\}$ as

\[ D_{m,n} \doteq \sup_{\lambda\in[0,1]} \Big| \lambda - \frac{1}{n-m}\sum_{m\le\ell<n} \chi_{[0,\lambda]}(\theta_\ell) \Big|. \tag{1.11} \]

Then, there holds

\[ D_{m,n} \le c\cdot\frac{1+\log(n-m)}{n-m} \qquad \forall\, n > m \ge 1, \tag{1.12} \]

for some constant $c > 0$. For systems (1.4) with GNL or LD characteristic fields, the $L^1$ convergence rate of Glimm approximate solutions constructed with a sampling sequence enjoying the property (1.12) was shown in [12] to be $o(1)\cdot\sqrt{\varepsilon}\,|\ln\varepsilon|$ ($o(1)$ indicating a quantity that approaches zero as $\varepsilon \to 0$). This error estimate was recently extended in [5,19] to quasilinear systems (1.1) satisfying the assumption

(H) For each $k$-th characteristic family, $k \in \{1,\dots,N\}$, the linearly degenerate manifold

\[ \mathcal{M}_k \doteq \{u \in \Omega : \nabla\lambda_k(u)\cdot r_k(u) = 0\} \tag{1.13} \]

is either empty (GNL characteristic field), or it is the whole space (LD characteristic field), or it consists of a finite number of smooth, connected hypersurfaces, and there holds

\[ \nabla(\nabla\lambda_k\cdot r_k)(u)\cdot r_k(u) \neq 0 \qquad \forall\, u \in \mathcal{M}_k. \tag{1.14} \]
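The discrepancy $D_{m,n}$ in (1.11) is easy to evaluate numerically, which gives a quick way to inspect a candidate sampling sequence against a bound of the form (1.12). The sketch below is an illustration only: the van der Corput sequence is an assumed stand-in (it is not claimed to be the sequence constructed in [12]), the value $c = 1$ in `log_bound` is an arbitrary choice, and all function names are hypothetical.

```python
import math


def van_der_corput(n, base=2):
    """n-th term of the van der Corput sequence in [0, 1)."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def discrepancy(theta, m, n):
    """D_{m,n} of (1.11) for the window theta[m], ..., theta[n-1]."""
    window = sorted(theta[m:n])
    size = n - m
    worst = 0.0
    for i, x in enumerate(window, start=1):
        # lambda minus the empirical fraction is piecewise linear in lambda,
        # so the supremum is attained at a sample point or just below it
        worst = max(worst, i / size - x, x - (i - 1) / size)
    return worst


def log_bound(m, n, c=1.0):
    """Right-hand side of (1.12) for a given (assumed) constant c."""
    return c * (1.0 + math.log(n - m)) / (n - m)


theta = [van_der_corput(i) for i in range(64)]
```

With this stand-in sequence, the windows of length $2^p$ that start at a multiple of $2^p$ consist of equally spaced points, so their discrepancy is exactly the reciprocal of the window length; other windows can be compared with `log_bound` in the same way.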

Notice that the Liu admissible solution of a Riemann problem for a system of conservation laws satisfying the assumption (H) consists of centered rarefaction waves, compressive shocks or composed waves made of a finite number of Liu admissible contact-discontinuities adjacent to rarefaction waves. On the contrary, the solution of a Riemann problem for a general hyperbolic system (1.4) may well be a composed wave containing a countable number of rarefaction waves and Liu admissible contact-discontinuities. In the present paper we show that the same convergence rate valid for systems satisfying the assumption (H) continues to hold even for Glimm approximate solutions of general quasilinear systems (1.1). Namely, our result is the following.


Theorem 1. Let $A$ be a $C^2$ matrix-valued map defined from a domain $\Omega \subset \mathbb{R}^N$ into $M^{N\times N}(\mathbb{R})$, and assume that the matrices $A(u)$ are strictly hyperbolic. Then, for every compact set $K \subset \Omega$, there exists a constant $\delta_0 > 0$ such that the following holds. Given an initial datum $\bar u \in L^1_{loc}(\mathbb{R};\mathbb{R}^N)$ with Tot.Var.$\{\bar u\} < \delta_0$, $\lim_{x\to-\infty}\bar u(x) \in K$, consider the vanishing viscosity solution $u(t,\cdot)$ of the Cauchy problem (1.1), (1.6) (obtained as the unique limit of solutions to the Cauchy problem $(1.3)_\mu$, (1.6) when $\mu \to 0$). Let $u^\varepsilon$ be a Glimm approximate solution of (1.1), (1.6), with mesh sizes $\Delta x = \Delta t = \varepsilon$, generated by a sampling sequence $\{\theta_k\}_{k\in\mathbb{N}} \subset [0,1]$ satisfying (1.12). Then, for every $T \ge 0$ there holds

\[ \lim_{\varepsilon\to 0} \frac{\|u^\varepsilon(T,\cdot) - u(T,\cdot)\|_{L^1}}{\sqrt{\varepsilon}\,|\log\varepsilon|} = 0, \tag{1.15} \]

and the limit is uniform w.r.t. $\bar u$ as long as Tot.Var.$\{\bar u\} < \delta_0$, $\lim_{x\to-\infty}\bar u(x) \in K$.

The proof of the error bound (1.15) follows the same strategy adopted in [5,12,19], relying on the careful analysis of the structure of solutions to NGNL systems developed by T.P. Liu and T. Yang in [26,27]. Indeed, to estimate the distance between a Lipschitz continuous (in time) approximate solution $\psi$ of (1.1) and the corresponding exact solution one would like to use the error bound [11]

\[ \|\psi(T) - S_T\psi(0)\|_{L^1} \le L\int_0^T \liminf_{h\to 0+} \frac{\|\psi(t+h) - S_h\psi(t)\|_{L^1}}{h}\,dt, \tag{1.16} \]

where $L$ denotes a Lipschitz constant of the semigroup $S$ generated by (1.4). However, for approximate solutions constructed by the Glimm scheme, a direct application of this formula is of little help because of the additional errors introduced by the restarting procedures at times $t_i \doteq i\varepsilon$. For this reason, following the wave tracing analysis in [27], it is useful to partition the elementary waves present in the approximate solution, say in a time interval $[\tau_1,\tau_2]$, into virtual subwaves that either can be traced back from $\tau_2$ to $\tau_1$ (primary waves), or are canceled or generated by interactions occurring in $[\tau_1,\tau_2]$ (secondary waves). Thanks to the simplified wave pattern associated to this partition, one can construct a front tracking approximation having the same initial and terminal values as the Glimm approximation, and thus establish (1.15) relying on (1.16). The key step of this procedure is to show that the variation of a Glimm functional provides a bound for the change in strength and for the product of strength times the variation in speeds of the primary waves. Here we shall implement a wave tracing algorithm for a general quasilinear system (1.1) in which such bounds are obtained relying on a new interaction potential functional whose decrease at interactions is precisely of the same order as this type of error.

To motivate the definition of this functional, consider an interaction between two shock waves of a $k$-th NGNL family, say of sizes $s', s''$, with speeds $\lambda', \lambda''$, respectively, and assume that $s', s''$ have the same sign. Then, letting $\lambda$ denote the shock speed of the outgoing wave $s$ of the $k$-th family, by the interaction estimates in [7, Theorem 3.7] there holds

\[ [s\lambda] \doteq |s'|\,|\lambda-\lambda'| + |s''|\,|\lambda-\lambda''| = O(1)\cdot\frac{|s's''|\,|\lambda'-\lambda''|}{|s'|+|s''|}. \tag{1.17} \]

Here, and throughout the paper, $O(1)$ denotes a uniformly bounded quantity depending only on the system (1.1). Notice that, using the wave-speed maps $\sigma'(\cdot), \sigma''(\cdot)$ associated


to the waves $s', s''$ (cf. Theorem 2), one can rewrite the term on the right-hand side of (1.17) as

\[ \int_0^{s'}\!\!\int_0^{s''} \frac{|\sigma'(\xi') - \sigma''(\xi'')|}{|s'|+|s''|}\,d\xi'\,d\xi''. \tag{1.18} \]

Thus, a natural suggestion of the above estimate would be to define the cubic part of a Glimm functional related to the potential interaction of waves of the same family as the sum of terms of the form (1.18) corresponding to all pairs of waves $s', s''$ of each characteristic family. However, if one computes the resulting value of such a functional for a collection of waves $s_\alpha$, $\alpha = 1,\dots,m$, of the same GNL family, one obtains

\[ \sum_{1\le\alpha,\beta\le m} \int_0^{s_\alpha}\!\!\int_0^{s_\beta} \frac{|\sigma_\alpha(\xi') - \sigma_\beta(\xi'')|}{|s_\alpha|+|s_\beta|}\,d\xi'\,d\xi'' \approx \sum_{1\le\alpha,\beta\le m} \frac{|s_\alpha s_\beta|}{|s_\alpha|+|s_\beta|}\cdot\big[|w_\alpha - w_\beta| + |s_\alpha| + |s_\beta|\big] \approx \sum_{1\le\alpha,\beta\le m} \min\{|s_\alpha|,|s_\beta|\}\cdot\big[|w_\alpha - w_\beta| + |s_\alpha| + |s_\beta|\big], \tag{1.19} \]

where $w_\alpha$ denotes the right state of $s_\alpha$ and $w_\beta$ denotes the left state of $s_\beta$ (when $s_\alpha$ is located on the left of $s_\beta$). If we assume for simplicity that each wave $s_\alpha$ is adjacent to the next one $s_{\alpha+1}$, so that one has $|w_{\alpha+1} - w_\alpha| = O(1)|s_\alpha|$, and that all waves have the same strength $|s| = (\sum_\alpha |s_\alpha|)/m$, the right-hand side of (1.19) is equal to

\[ m|s|\cdot\sum_{1\le\alpha<m}\big[O(1)\,\alpha|s| + 2|s|\big] = O(1)\cdot m|s|\,\big[(m-1)m + 2m\big]\,|s| = O(1)\cdot\Big[\sum_\alpha |s_\alpha|\Big]^2 (m+1), \tag{1.20} \]

which becomes arbitrary large when the number of waves m → ∞ while keeping the total amount of strength α |sα | fixed. Therefore, a functional constructed summing up terms as in (1.18) would be of no use to provide uniform bounds on the total variation of a solution to a general system (1.1). Instead, we shall take in to consideration a functional defined as the sum of terms of the form s s σ (s − ξ ) − σ (ξ ) . (s , s ) = dξ dξ , (1.21) 0 0 |w (s − ξ ) − w (ξ )| + (s − ξ )s + ξ s where σ (·), σ (·) are the wave-speed maps associated to s , s and, assuming s to be located on the left of s , w (s − ξ ), w (ξ ) denote, respectively, the right state and the left state of the shock components (s − ξ )s , ξ s of s and s , related to the parameters s −ξ , ξ (cfr. Definition 4). If we evaluate such a functional for a collection of (possibly composed) waves sα , α = 1, . . . , m of the same k th family, we find s α s β (sα , sβ ) = O(1) · 1≤α,β≤m

1≤α,β≤m

⎡ = O(1) · ⎣

1≤α≤m

⎤2 |sα |⎦ ,

(1.22)


which remains bounded by the total strength of waves (<1), independently on their number. On the other hand, the occurrence of an interaction between two waves s , s of the k th family as above, besides determining the desired decrease of such a functional by a quantity of the order [sλ], produces a variation of all the terms involving the interacting waves s , s and other (non interacting) waves sγ of the same family (s, sγ ) − (s , sγ ) + (s , sγ ) , sγ

which is in general not comparable with the quantity [sλ]. However, one can establish a uniform (in time) a-priori bound on the variation of all such terms in any fixed time interval, which is quadratic with respect to the total strength of waves in the solution. For these reasons, in the present paper: 1. we consider an interaction potential defined by . sα sβ + Q(t) = c · (sα , sβ ) kα xβ (t)

(1.23)

kα =kβ

(with (sα , sβ ) as in (1.21)) where, as usual, xα (t) denotes the position of the wave sα of the kα th characteristic family in the approximate solution u ε (t), while the second summation extends to all pairs of (possibly composed) waves sα , sβ of the kα th family (including sα = sβ ), and c > 0 is a suitable constant; 2. we introduce a functional G(t) measuring the total amount of oscillation of the terms (sα , sβ ) at any later times τ ≥ t, which depends explicitly on the global wave pattern of the solution (see definition (4.36)–(4.40), (4.103). Our main results here show that, letting V (t) denote the total strength of waves in u ε (t), the functional . ϒ(t) = V (t) + C · (Q(t) + G(t)) t ≥0 (1.24) is actually non-increasing in time (for a suitable choice of the constant C > 0), G(t) is bounded by O(1) · (V (t))2 , and the total amount of products [sλ] of strength times the variation in speeds of the primary waves relative to any interval [t1 , t2 ] is bounded by O(1) · |t1 ,t2 ϒτ |. Notice that, in the genuinely nonlinear case, the following bounds hold. 1 (1.25) · sα sβ ≤ (sα , sβ ) ≤ O(1) · sα sβ , O(1) and thus one recovers from (1.23) the standard quadratic interaction potential of the original Glimm functional [18], with the only difference from [18] that in (1.23) all waves of the same family are considered as approaching (even pairs of rarefaction fans). We conclude recalling that for NGNL systems several Glimm type functionals are available in the literature [7,21,26,27], which work perfectly to establish uniform a-priori bounds on the total variation of the solution, but are not truly effective to control the type of errors [sλ] arising in a wave tracing analysis of the Glimm scheme. 
On the other hand, in the case of systems satisfying the assumption (H), were recently introduced in [5,19] two type of potential interaction functionals whose decrease actually bounds the products of strength times the variation in speeds [sλ], and which inspired the new definition in (1.23). The Glimm functional defined in [5] is the sum of a quadratic


term Q q and of the cubic interaction potential defined in [7] concerning waves of the s s same family, that takes the form Qc = kα =kβ 0 α 0 β σα (ξ ) − σβ (ξ ) dξ dξ . Here, in the presence of interactions between waves of the same families and strength smaller than some threshold parameter δ , Q q behaves as the interaction functional introduced in [3] for systems with a single connected hypersurface 1.13, while the decrease of Qc controls the possible increase of Q q at interactions involving waves of the same family and strength larger than δ . The cubic part of the functional proposed in [19] corresponding to waves of the same family instead depends globally on the wave patterns of the solution. It is defined as kα =kβ |sα , sβ |[(sα , sβ )]− /Vkα (sα , sβ ), where (sα , sβ ) represents the effective angle between sα and sβ , computed taking into account all the kα -waves lying between sα and sβ , [ · ]− denotes the negative part, while Vkα (sα , sβ ) is the total strength of all kα -waves between sα and sβ (including sα and sβ ). Employing these interaction potentials it is shown in [5,19] that, for systems (1.1) satisfying the assumption (H), one can produce a simplified wave partition pattern whose errors are controlled by the total decrease of the corresponding Glimm functional in the time interval taken into consideration, and thus yield the error estimate (1.15). Unfortunately, the decreasing properties of both functionals strongly rely on the assumption that the linearly degenerate manifold 1.13 be a finite union of hypersurfaces transversal to the characteristic vector fields, and thus are of no use to establish an accurate convergence rate for general systems (1.1). Instead, the interaction potential in (1.23) can be applied to a general quasilinear system (1.1), without any assumption on the matrix A(u) apart from the strict hyperbolicity. 2. 
Preliminaries

Let $A$ be a smooth matrix-valued map defined on a domain $\Omega \subset \mathbb{R}^N$, with values in the set of $N\times N$ matrices. Assume that each $A(u)$ is strictly hyperbolic and denote by $\{\lambda_1(u),\dots,\lambda_N(u)\} \subset [0,1]$ its eigenvalues. Since we will consider only solutions with small total variation that take values in a neighborhood of a compact set $K \subset \Omega$, it is not restrictive to assume that $\Omega$ is bounded and that there exist constants $\widehat\lambda_0 < \cdots < \widehat\lambda_N$ such that

\[ \widehat\lambda_{k-1} < \lambda_k(u) < \widehat\lambda_k \qquad \forall\, u,\ k = 1,\dots,N. \tag{2.1} \]

One can choose bases of right and left eigenvectors $r_k(u)$, $l_k(u)$ ($k = 1,\dots,N$), associated to $\lambda_k(u)$, normalized so that

\[ |r_k(u)| \equiv 1, \qquad \langle l_h(u), r_k(u)\rangle = \begin{cases} 1 & \text{if } k = h,\\ 0 & \text{if } k \neq h, \end{cases} \qquad \forall\, u \in \Omega. \tag{2.2} \]

By the strict hyperbolicity of the system, in the conservative case (1.4) (where $A(u) = DF(u)$), for every fixed $u_0 \in \Omega$ and for each $k$-th characteristic family, $k \in \{1,\dots,N\}$, one can construct in a neighborhood of $u_0$ a one-parameter smooth curve $S_k[u_0]$ passing through $u_0$ (called the $k$-th Hugoniot curve issuing from $u_0$), whose points $u \in S_k[u_0]$ satisfy the Rankine–Hugoniot equation $F(u) - F(u_0) = \sigma\,(u - u_0)$ for some scalar $\sigma = \sigma_k[u_0,u]$. The curve $S_k[u_0]$ is tangent at $u_0$ to the right eigenvector $r_k(u_0)$ of $A(u_0)$ associated to $\lambda_k(u_0)$, and we say that $(u^L,u^R)$ is a shock discontinuity of the $k$-th family with speed $\sigma_k[u^L,u^R]$ if $u^R \in S_k[u^L]$.

We describe here the general method introduced in [6,9] to construct the self-similar solution of a Riemann problem for a strictly hyperbolic quasilinear system (1.1). As


customary, the basic step consists in constructing the elementary curve of the k th family (k = 1, . . . , N ) for every given left state u L , which is a one parameter curve of right states s → Tk [u L ](s) with the property that the Riemann problem having initial data . (u L , u R ), u R = Tk [u L ](s), admits a vanishing viscosity solution consisting only of elementary waves of the k th characteristic family. Such a curve is constructed by looking at the fixed point of a suitable contractive transformation associated to a smooth manifold of viscous traveling profiles for the parabolic system with unit viscosity (1.3)1 . Given a fixed state u 0 ∈ , and an index k ∈ {1, . . . , N }, in connection with the N +2dimensional smooth manifold of bounded traveling profiles of (1.3)1 with speed close to λk (u 0 ), one can define on a neighborhood of (u 0 , 0, λk (u 0 )) ∈ R N × R × R suitable smooth vector functions (u, vk , σ ) → rk (u, vk , σ ) that satisfy rk (u 0 , 0, σ ) = rk (u 0 ), for all σ , and are normalized so that lk (u 0 ), rk (u, vk , σ ) = 1 ∀ u, vk , σ. (2.3) The vector valued map rk (u, vk , σ ) is called the k th generalized eigenvector of the matrix . A(u), associated to the generalized eigenvalue λk (u, vk , σ ) = lk (u 0 ), A(u) rk (u, vk , σ ) , that satisfies the identity λk (u 0 , vk , σ ) = λk (u 0 ), for all vk , σ , and, moreover ∂ λk (u, vk , σ ) = O(1) · |u − u 0 |, ∂vk

∂ λk (u, vk , σ ) = O(1) · |vk ||u − u 0 |. ∂σ (2.4)

Next, given a left state u L in a neighborhood of u 0 , and 0 < s << 1, consider the set k (u L , s) of Lipschitz continuous curves τ → γ (τ ) = (u(τ ), vk (τ ), σk (τ )),

τ ∈ [0, s],

(2.5)

with values in Rn+2 that satisfy u(0) = u L ,

|u(τ ) − u L | ≤ δ,

|vk (τ )| ≤ δ,

|σk (τ ) − λk (u L )| ≤ δ,

for all τ ∈ [0, s], and for some δ > 0 sufficiently small. In connection with any curve γ ∈ k (u L , s), define the scalar function τ . λk (u(ξ ), vk (ξ ), σ (ξ )) dξ, f k (γ ; τ ) = (2.6) 0

and consider the mapping γ → Tk [u L , s](γ ) = ( u (·), vk (·), σk (·)), γ ∈ k (u L , s), defined by ⎧ τ . u (τ ) = u L + 0 rk (u(ξ ), vk (ξ ), σk (ξ )) dξ, ⎪ ⎨ . f k (γ ; τ ) − conv[0,s] 0 ≤ τ ≤ s, f k (γ ; τ ), vk (τ ) = (2.7) ⎪ . d ⎩ σk (τ ) = conv[0,s] f k (γ ; τ ), dτ

where $\mathrm{conv}_{[0,s]} f_k(\gamma;\tau)$ denotes the lower convex envelope of $f_k(\gamma;\cdot)$ on $[0,s]$, i.e.

\[ \mathrm{conv}_{[0,s]} f_k(\gamma;\tau) \doteq \inf\big\{ \theta f_k(\gamma;y) + (1-\theta) f_k(\gamma;z) \,:\, \theta \in [0,1],\ y,z \in [0,s],\ \tau = \theta y + (1-\theta) z \big\}. \tag{2.8} \]
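To make the role of the lower convex envelope (2.8) concrete, the following sketch computes it for a function sampled on a grid and then forms the two quantities that the construction extracts from it: the gap between the function and its envelope, and the slope of the envelope. It is an illustration under simplifying assumptions: the function `f` is a given scalar function rather than the reduced flux obtained from the fixed point, the envelope is only computed at the grid nodes, and the names `lower_convex_envelope` and `wave_components` are hypothetical.

```python
def lower_convex_envelope(xs, ys):
    """Values at xs of the lower convex envelope of the points (xs[i], ys[i]).
    xs must be strictly increasing and contain at least two points."""
    hull = []  # indices of the vertices of the lower convex hull
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross <= 0:
                hull.pop()  # vertex b lies on or above the chord from a to i
            else:
                break
        hull.append(i)
    env, seg = [], 0
    for i, x in enumerate(xs):
        while hull[seg + 1] < i:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        t = (x - xs[a]) / (xs[b] - xs[a])
        env.append(ys[a] + t * (ys[b] - ys[a]))  # linear interpolation on the hull edge
    return env


def wave_components(f, s, n=200):
    """Sampled gap v = f - conv f and slope sigma of conv f on [0, s]."""
    xs = [s * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    env = lower_convex_envelope(xs, ys)
    v = [y - e for y, e in zip(ys, env)]
    sigma = [(env[i + 1] - env[i]) / (xs[i + 1] - xs[i]) for i in range(n)]
    return xs, v, sigma
```

For a concave `f` the envelope is the chord, so the gap is positive in the interior and the slope is constant (a single jump), whereas for a convex `f` the gap vanishes and the slope is strictly increasing (a rarefaction).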


It is shown in [6] that, for s sufficiently small, the transformation Tk [u L , s] in (2.7) is a contraction on the set k (u L , s) with respect to the weighted distance . D(γ , γ ) = δ u − u L∞ + vk − vk L1 + vk σk − vk σk L1 . (2.9) Hence, for every u L in a neighborhood of u 0 , s in a right neighborhood of zero, the transformation Tk [u L , s] admits a unique fixed point τ → u(τ ; u L , s), vk (τ ; u L , s), σk (τ ; u L , s) τ ∈ [0, s] (2.10) which thus provides a Lipschitz continuous solution to the integral system ⎧ τ rk (u(ξ ), vk (ξ ), σk (ξ )) dξ, ⎨ u(τ ) = u L + 0 vk (τ ) = f k (γ ; τ ) − conv[0,s] 0 ≤ τ ≤ s, f k (γ ; τ ), ⎩ d σk (τ ) = dτ conv[0,s] f k (γ ; τ ).

(2.11)

The elementary curve of right states of the k th family issuing from u L is then defined by setting . Tk [u L ](s) = u(s; u L , s). (2.12) Sometimes, the value (2.12) of the elementary curve issuing from u L will be equivalently written Tk (s)[u L ]. In the following it will be convenient to adopt the notations . vk [u L ](s, τ ) = vk (τ ; u L , s), . σk [u L ](s, τ ) = σk (τ ; u L , s) ∀ τ ∈ [0, s], (2.13) . L L L L k [u ](s, τ ) = f k (u( · ; u , s), vk ( · ; u , s), σ ( · ; u , s)); τ F for the v, σ components of the solution to (2.11), and for the reduced flux function fk evaluated in connection with such a solution. Notice that by construction the maps (u L , s) → k [u L ](s, ·), and the derivative (u L , s) → Dτ F k [u L ](s, · ) are σk [u L ](s, ·), (u L , s) → F Lipschitz continuous for u L in a neighborhood of u 0 , and s in a right neighborhood of zero. For negative values s < 0, |s| << 1, one replaces in (2.11) the lower convex envelope of f k on the interval [0, s] with its upper concave envelope on [s, 0] (defined in analogous way as (2.8), and then constructs the curve Tk [u L ] and the map σk [u L ] exactly in the same way as above looking at the solution of the integral system (2.11) on the interval [s, 0]. The elementary curve Tk [u L ] and the wave-speed map σk [u L ] constructed in this way enjoy the properties stated in the following theorem, where we let C I ([a, b]) (C D ([b, a])) denote the set of continuous and increasing (decreasing) scalar functions . defined on an interval [a, b], and we set C I ([a, b]) = C D ([b, a]) in the case a > b. Theorem 2 ([6,9]). Let A be a smooth, matrix valued map defined from a domain ⊂ R N into M N ×N (R), and assume that the matrices A(u) are strictly hyperbolic. Then, for every u ∈ , there exist N Lipschitz continuous curves s → Tk [u](s) ∈ satisfying d lims→0 ds Tk [u](s) = rk (u), together with N continuous functions s → σk [u](s, ·) ∈ C I ([0, s]) (k = 1, . . . 
, N), defined on a neighborhood of zero, so that the following holds. Whenever $u^L \in \Omega$, $u^R = T_k[u^L](s)$, for some $s > 0$, the function

\[ u(t,x) \doteq \begin{cases} u^L & \text{if } x/t < \sigma_k[u^L](s,0),\\ T_k[u^L](\tau) & \text{if } \tau = \sup\{\xi \in [0,s] : x/t = \sigma_k[u^L](s,\xi)\},\\ u^R & \text{if } x/t > \sigma_k[u^L](s,s), \end{cases} \tag{2.14} \]


provides the unique vanishing viscosity solution (determined by the parabolic approximation (1.3)) of the Riemann problem (1.1), (1.9). In the case s < 0, one replaces in (2.14) the superior extremum with the inferior one, letting ξ vary over the interval [s, 0]. Remark 1. If the system (1.1) is in conservation form, i.e. in the case where A(u) = D F(u) for some smooth flux function F, the general solution of the Riemann problem provided by (2.14) is a composed wave of the k th family made of a possibly countable number of contact-discontinuities or compressive shocks (which satisfy the Liu admissibility condition of Definition 1) adjacent to rarefaction waves. Namely, the regions where the vk -component of the solution to (2.11) vanishes correspond to rarefaction waves if the σk -component is strictly increasing and to contact discontinuities if the σk -component is constant, while the regions where the vk -component of the solution to (2.11) is different from zero correspond to contact discontinuities or to compressive shocks. In particular, whenever the solution of a Riemann problem with initial data u L , u R = Tk [u L ](s) contains a Liu admissible shock joining, say, two states Tk [u L ](s ), Tk [u L ](s ), s , s ∈ [0, s], one has σk [u L ](s, s ) = σk [u L ](s, τ ) for all τ ∈ [s , s ], and σk [u L ](s, s ) provides the shock speed of the discontinuity Tk [u L ](s ), Tk [u L ](s ) . Clearly, in a non-conservative setting, “admissibility” for a jump means precisely that the jump corresponds to a traveling profile for the parabolic approximation (1.3)1 . Once we have constructed the elementary curves Tk for each k th characteristic family, the vanishing viscosity solution of a general Riemann problem for (1.4) is then obtained by a standard procedure observing that the composite mapping . (s1 , . . . , s N ) → TN (s N ) ◦ · · · ◦ T1 (s1 )[u L ] = u R , (2.15) is one-to-one from a neighborhood of the origin in R N onto a neighborhood of u L . 
This is a consequence of the fact that the curves $T_k[u]$ are tangent to $r_k(u)$ at zero (cf. Theorem 2), and then follows by applying a version of the implicit function theorem valid for Lipschitz continuous maps. Therefore, we can uniquely determine intermediate states $u^L \doteq \omega_0, \omega_1, \dots, \omega_N \doteq u^R$, and wave sizes $s_1,\dots,s_N$, such that there holds

\[ \omega_k = T_k[\omega_{k-1}](s_k) \qquad k = 1,\dots,N, \tag{2.16} \]

provided that the left and right states $u^L, u^R$ are sufficiently close to each other. Each Riemann problem with initial datum

\[ u_k(x) = \begin{cases} \omega_{k-1} & \text{if } x < 0,\\ \omega_k & \text{if } x > 0, \end{cases} \tag{2.17} \]

admits a vanishing viscosity solution of total size $s_k$, containing a sequence of rarefactions and Liu admissible discontinuities of the $k$-th family. Then, because of the uniform strict hyperbolicity assumption (2.1), the general solution of the Riemann problem with initial data $u^L, u^R$ is obtained by piecing together the vanishing viscosity solutions of the elementary Riemann problems (1.4), (2.17). Throughout the paper, with a slight abuse of notation, we shall often call $s$ a wave of (total) size $s$, and, if $u^R = T_k[u^L](s)$, we will say that $(u^L,u^R)$ is a wave of size $s$ of the $k$-th characteristic family.

A fundamental ingredient to establish an accurate convergence rate for the Glimm scheme is the wave tracing procedure, which was first introduced by T.P. Liu in his celebrated paper [23] for systems with genuinely nonlinear or linearly degenerate fields, and later extended to systems fulfilling assumption (H) [26,27]. In this spirit, we have introduced in [5] the following notion of partition of a $k$-wave $(u^L,u^R)$, defined in terms of the elementary curves $T_k$ at (2.12).


Definition 2. Given a pair of states $u^L, u^R$, with $u^R = T_k[u^L](s)$ for some $s > 0$, we say that a set $\{y^1,\dots,y^l\}$ is a partition of the $k$-th wave $(u^L,u^R)$ if the following holds.

1. There exist scalars $s^h > 0$, $h = 1,\dots,l$, such that, setting $\tau^h \doteq \sum_{p=1}^h s^p$, $w^h \doteq T_k[u^L](\tau^h)$, there holds $\tau^l = s$, and

\[ y^h = w^h - w^{h-1} \qquad \forall\, h. \]

The quantity $s^h$ is called the size of the elementary wave $y^h$.

2. Letting $\sigma \doteq \sigma_k[u^L](s,\cdot)$ be the map in (2.13), there holds

\[ \sigma(\tau^h) - \sigma(\tau^{h-1}) \le \varepsilon \qquad \forall\, h. \]

Moreover, we require that whenever we produce such a partition for the solution of a Riemann problem at a node $(i\varepsilon, j\varepsilon)$, in a Glimm scheme generated by a sampling sequence $\{\theta_\ell\}_{\ell\in\mathbb{N}}$, there holds $\theta_{i+1} \notin\, ]\sigma(\tau^{h-1}), \sigma(\tau^h)[$, for all $h$ (so as to avoid further partitions of $y^h$ at the next time $t = (i+1)\varepsilon$).

The definition is entirely similar in the case $u^R = T_k[u^L](s)$, with $s < 0$. In connection with a partition $\{y^1,\dots,y^l\}$ of $(u^L,u^R)$, we define the corresponding speed of the elementary wave $y^h$ as

\[ \lambda_k^h \doteq \frac{1}{s^h}\int_{\tau^{h-1}}^{\tau^h} \sigma(\tau)\,d\tau \qquad \forall\, h. \tag{2.18} \]

We conclude the section providing the following definition of quantity of interaction introduced in [7, Definition 3.5] for a general strictly hyperbolic system (1.1), which is useful to measure the decrease of the functional $Q$ in (1.23) when waves of the same family interact together.

Definition 3. Consider two waves of sizes $s', s''$, belonging to the same $k$-th characteristic family, with left states $u', u''$, respectively. Let $F' \doteq F_k[u'](s',\cdot)$ and $F'' \doteq F_k[u''](s'',\cdot)$ be the reduced fluxes with starting points $u', u''$, evaluated along the solution of (2.11) on the intervals $[0,s']$ and $[0,s'']$, respectively (cf. def. (2.13)). Then, assuming that $s' \ge 0$, we say that the amount of interaction $J(s',s'')$ between $s'$ and $s''$ is the quantity defined as follows.

1. If $s'' \ge 0$ set:

\[ J(s',s'') \doteq \int_0^{s'} \Big| \mathrm{conv}_{[0,s']}F'(\xi) - \mathrm{conv}_{[0,s'+s'']}(F'\cup F'')(\xi) \Big|\,d\xi + \int_{s'}^{s'+s''} \Big| F'(s') + \mathrm{conv}_{[0,s'']}F''(\xi - s') - \mathrm{conv}_{[0,s'+s'']}(F'\cup F'')(\xi) \Big|\,d\xi, \tag{2.19} \]

where $F'\cup F''$ is the function defined on $[0,s'+s'']$ as

\[ (F'\cup F'')(s) \doteq \begin{cases} F'(s) & \text{if } s \in [0,s'],\\ F'(s') + F''(s - s') & \text{if } s \in [s', s'+s'']. \end{cases} \tag{2.20} \]


2. If −s ≤ s < 0 set: . J (s , s ) =

s +s 0

+

conv[0, s ] F (ξ ) − conv[0, s +s ] F (ξ ) dξ +

s s +s

conv[0, s ] F (ξ ) − conc[s +s , s ] F (ξ ) dξ.

(2.21)

3. If s < −s set: . J (s , s ) =

0

s +s s

+ 0

conc[s , 0] F (ξ − s ) − conc[s , −s ] F (ξ − s ) dξ + conc[s , 0] F (ξ − s ) − conv[−s , 0] F (ξ − s ) dξ.

(2.22)

Here, $\mathrm{conv}_{[a,b]} f$, $\mathrm{conc}_{[a,b]} f$ denote the lower convex envelope and the upper concave envelope of $f$ on $[a,b]$, defined as in (2.8). In the case where $s' < 0$, one replaces in (2.19)–(2.22) the lower convex envelope with the upper concave one, and vice-versa.

Remark 2. By the Lipschitz continuity of the maps $(u^L,s) \mapsto F_k[u^L](s,\cdot)$, $(u^L,s) \mapsto D_\tau F_k[u^L](s,\cdot)$ it follows that

\[ J(s',s'') = O(1)\cdot|s's''|. \tag{2.23} \]
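The amount of interaction in the first case of Definition 3 (both sizes nonnegative) can be approximated numerically, which gives a simple sanity check in the scalar conservative setting. The sketch below assumes that the integrands of (2.19) are taken in absolute value, that the two reduced fluxes are given as plain functions vanishing at 0, and that envelopes are computed at grid nodes; `lower_convex_envelope` and `interaction_amount` are hypothetical names. For two adjacent shocks of the same sign the result should equal half the product of the strengths times the difference of the Rankine-Hugoniot speeds.

```python
def lower_convex_envelope(xs, ys):
    """Values at xs of the lower convex envelope of the points (xs[i], ys[i]).
    xs must be strictly increasing and contain at least two points."""
    hull = []  # indices of the vertices of the lower convex hull
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross <= 0:
                hull.pop()  # vertex b lies on or above the chord from a to i
            else:
                break
        hull.append(i)
    env, seg = [], 0
    for i, x in enumerate(xs):
        while hull[seg + 1] < i:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        t = (x - xs[a]) / (xs[b] - xs[a])
        env.append(ys[a] + t * (ys[b] - ys[a]))
    return env


def interaction_amount(f1, s1, f2, s2, n=400):
    """Approximate (2.19) for s1, s2 > 0, with F' = f1 on [0, s1] and
    F'' = f2 on [0, s2]; both are assumed to vanish at 0."""
    x1 = [s1 * i / n for i in range(n + 1)]
    x2 = [s2 * i / n for i in range(n + 1)]
    y1 = [f1(x) for x in x1]
    y2 = [f2(x) for x in x2]
    env1 = lower_convex_envelope(x1, y1)
    env2 = lower_convex_envelope(x2, y2)
    # the glued function (2.20) sampled on [0, s1 + s2]; the node s1 is shared
    xu = x1 + [s1 + x for x in x2[1:]]
    yu = y1 + [y1[-1] + y for y in y2[1:]]
    envu = lower_convex_envelope(xu, yu)
    gap = [abs(e - eu) for e, eu in zip(env1, envu[: n + 1])]
    gap += [abs(y1[-1] + e - eu) for e, eu in zip(env2[1:], envu[n + 1:])]
    # trapezoidal rule on the grid xu
    return sum(0.5 * (gap[i] + gap[i + 1]) * (xu[i + 1] - xu[i])
               for i in range(len(xu) - 1))
```

As a usage example, take the concave scalar flux $f(u) = -u^2/2$ with states $u^L = 0$, $u^M = 1$, $u^R = 1.5$: then $F'(\xi) = -\xi^2/2$, $F''(\xi) = -\xi - \xi^2/2$, the two shock speeds are $-0.5$ and $-1.25$, and the expected value is $\tfrac12 \cdot 1 \cdot 0.5 \cdot 0.75 = 0.1875$. For two adjacent rarefactions (convex reduced fluxes) the computed amount is zero up to rounding.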

Moreover, by Remark 1 one can easily verify that, in the conservative case, if $s'$, $s''$ are both shocks of the $k$-th family that have the same sign, then the amount of interaction in (2.19) takes the form
\[
J(s', s'') = \frac{|s' s''| \, \big| \sigma_k[u^L, u^M] - \sigma_k[u^M, u^R] \big|}{2},
\]
i.e. it is precisely half of the product of the strengths of the waves times the difference of their Rankine–Hugoniot speeds.

3. New Wave Interaction Potential

In this section we first collect the basic estimates on the change in size and speed of the elementary waves of an approximate solution constructed by the Glimm scheme, next introduce an interaction potential Q of the form (1.23), and finally establish the a-priori bounds on the variation of such a functional Q. To this end, for every given wave $s$ of the $k$-th family, set
\[
\Lambda(s) \doteq \Lambda(\sigma, s) \doteq \int_0^{s} \sigma(\tau)\, d\tau, \tag{3.1}
\]
where $\sigma(\cdot) \doteq \sigma_k[w](s, \cdot\,)$ is the wave-speed map in (2.13), with $w$ being the left state of $s$. Then, relying on the analysis in [7, Sect. 3] of the effect of wave interactions on the solution of Riemann problems for general quasilinear systems (1.1), we derive the following


Lemma 1. For every compact set $K \subset \Omega$, there exists a constant $\chi_0 > 0$ such that the following holds. Let $s'_1, \dots, s'_N$ and $s''_1, \dots, s''_N$ be, respectively, the sizes of the waves in the solution of two adjacent Riemann problems $(u^L, u^M)$ and $(u^M, u^R)$, $s'_i$ and $s''_i$ belonging to the $i$-th characteristic family, with $u^L, u^M, u^R \in K$, and $|s'_i|, |s''_i| \le \chi_0$ for all $i = 1, \dots, N$. Call $s_1, \dots, s_N$ the sizes of the waves in the solution of the Riemann problem $(u^L, u^R)$, $s_i$ belonging to the $i$-th characteristic family. Then, there holds
\[
\sum_{k=1}^{N} \big| s_k - s'_k - s''_k \big| = O(1) \cdot \Bigg[ \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg], \tag{3.2}
\]
\[
\sum_{k=1}^{N} \big| \Lambda(s_k) - \Lambda(s'_k) - \Lambda(s''_k) \big| = O(1) \cdot \Bigg[ \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg]. \tag{3.3}
\]

Proof. A proof of the estimate (3.2) can be found in [7], thus we focus our attention only on (3.3). Notice that, by the analysis in [7, Sect. 3], it immediately follows that the change of the quantity in (3.1) due to interactions between waves of different families is controlled by the product of the strengths of the approaching waves. Hence, it is sufficient to establish (3.3) in the case where the two adjacent Riemann problems are both solved by a single wave of the same $k$-th family, $s'_k$ and $s''_k$ ($s'_k$ on the left of $s''_k$). Thus, $u^L$, $u^M$ are the states on the left of $s'_k$ and $s''_k$ respectively, and $u^R$ is the state on the right of $s''_k$. Call $u^{L_k}$ the left state of the outgoing wave of the $k$-th family, $s_k$. To fix the ideas, assume that $s'_k \ge 0$. Let
\[
\begin{aligned}
\gamma(\xi) &\doteq \big(u(\xi), v_k(\xi), \sigma_k(\xi)\big), & \xi &\in [0, s'_k + s''_k],\\
\gamma'(\xi) &\doteq \big(u'(\xi), v'_k(\xi), \sigma'_k(\xi)\big), & \xi &\in [0, s'_k],\\
\gamma''(\xi) &\doteq \big(u''(\xi), v''_k(\xi), \sigma''_k(\xi)\big), & \xi &\in [0, s''_k],
\end{aligned}
\]
be the fixed points of the transformations $\mathcal T_k[u^L, s'_k + s''_k]$, $\mathcal T_k[u^L, s'_k]$, $\mathcal T_k[u^M, s''_k]$, defined by (2.7). Notice that (3.2) yields
\[
\big| u^L - u^{L_k} \big| = O(1) \cdot J(s'_k, s''_k), \tag{3.4}
\]
\[
\big| s_k - s'_k - s''_k \big| + \sum_{j \ne k} |s_j| = O(1) \cdot J(s'_k, s''_k). \tag{3.5}
\]
Moreover, the Lipschitz continuity of the wave-speed map $u \mapsto \sigma_k[u](s, \cdot\,)$ in (2.13) implies
\[
\big| \sigma_k[u^{L_k}](s_k, \tau) - \sigma_k[u^L](s_k, \tau) \big| = O(1) \cdot \big| u^{L_k} - u^L \big| \qquad \forall\, \tau \in [0, s_k],
\]
and one has
\[
\Lambda(s'_j) = \Lambda(s''_j) = 0 \qquad \forall\, j \ne k. \tag{3.6}
\]
Hence it follows from (3.4)–(3.5) that, in order to establish (3.3), it suffices to prove
\[
|\Delta\Lambda| \doteq \bigg| \int_0^{s'_k + s''_k} \sigma_k(\xi)\, d\xi - \int_0^{s'_k} \sigma'_k(\xi)\, d\xi - \int_0^{s''_k} \sigma''_k(\xi)\, d\xi \bigg| = O(1) \cdot J(s'_k, s''_k). \tag{3.7}
\]
We will consider two cases, depending on the sign of $s'_k \cdot s''_k$.

Case 1. $s'_k \cdot s''_k > 0$. Let $\xi \mapsto \gamma' \cup \gamma''(\xi)$, $\xi \in [0, s'_k + s''_k]$, be the curve defined by
\[
\gamma' \cup \gamma''(\xi) \doteq
\begin{cases}
\gamma'(\xi) & \text{if } \xi \in [0, s'_k],\\
\gamma'(s'_k) + \gamma''(\xi - s'_k) & \text{if } \xi \in [s'_k, s'_k + s''_k],
\end{cases} \tag{3.8}
\]
and consider the curve $\widehat\gamma \doteq \mathcal T_k[u^L, s'_k + s''_k](\gamma' \cup \gamma'') = \big(\widehat u(\cdot), \widehat v_k(\cdot), \widehat\sigma_k(\cdot)\big)$ defined by the transformation $\mathcal T_k[u^L, s'_k + s''_k]$ in (2.7). Then we have $\Lambda(\widehat\sigma_k) = \Lambda(\sigma'_k) + \Lambda(\sigma''_k)$. Hence, in view of the proof of [7, Prop. 3.2], we find
\[
|\Delta\Lambda| \le \big\| \sigma_k - \widehat\sigma_k \big\|_{L^1([0, s'_k + s''_k])} = O(1) \cdot D(\gamma, \widehat\gamma), \tag{3.9}
\]
where $D(\gamma, \widehat\gamma)$ denotes the weighted distance in (2.9) between the curves $\gamma$, $\widehat\gamma$. On the other hand, by the proof of [7, Lemma 3.9], and relying on the contraction property of $\mathcal T_k[u^L, s'_k + s''_k]$, we deduce
\[
D(\gamma, \widehat\gamma) = O(1) \cdot D(\gamma' \cup \gamma'', \widehat\gamma) = O(1) \cdot J(s'_k, s''_k), \tag{3.10}
\]
which, together with (3.9), yields (3.7).

Case 2. $s'_k \cdot s''_k < 0$. To fix the ideas, assume that $s'_k \ge -s''_k \ge 0$, the case $s'_k + s''_k < 0$ being entirely similar. By the same arguments and with the same notations of Case 1, considering the curve $\widehat\gamma \doteq \mathcal T_k[u^L, s'_k + s''_k](\gamma' \cup -\gamma'') = \big(\widehat u(\cdot), \widehat v_k(\cdot), \widehat\sigma_k(\cdot)\big)$ we have $\Lambda(\widehat\sigma_k) = \Lambda(\sigma'_k) - \Lambda(\sigma''_k)$. Hence, relying on the proofs of [7, Prop. 3.2] and [7, Lemma 3.10], we derive
\[
|\Delta\Lambda| \le \big\| \sigma_k - \widehat\sigma_k \big\|_{L^1([0, s'_k + s''_k])} = O(1) \cdot D(\widehat\gamma, \gamma) = O(1) \cdot D(\widehat\gamma, \gamma' \cup -\gamma'') = O(1) \cdot J(s'_k, s''_k), \tag{3.11}
\]
which yields (3.7), thus completing the proof of the lemma.

We now provide a precise definition of the terms (1.21) which appear in the functional Q introduced in (1.23).

Definition 4. Consider two waves of sizes $s'$, $s''$ belonging to the same $k$-th characteristic family, with $s'$ located on the left of $s''$. Let $u'$, $u''$ be the left states of $s'$, $s''$, respectively, and let $\sigma' \doteq \sigma_k[u'](s', \cdot\,)$, $\sigma'' \doteq \sigma_k[u''](s'', \cdot\,)$ denote the corresponding wave-speed maps defined in (2.13). Assuming $s' > 0$, for every $\xi' \in [0, s']$, we let $\xi'_s$ and $w'(\xi')$ denote, respectively, the size of the shock component of $s'$ related to $\xi'$, and the right state of such a component, defined as follows. If there is an open interval $]\tau_1, \tau_2[\; \subset [0, s']$, containing $\xi'$, that enjoys one of the following two properties:

i) the map $v_k[u'](s', \cdot\,)$ in (2.13) vanishes only on an at most countable (possibly empty) subset of $]\tau_1, \tau_2[$;
ii) the map $v_k[u'](s', \cdot\,)$ in (2.13) vanishes on $[\tau_1, \tau_2]$ and $\sigma'$ is constant on $[\tau_1, \tau_2]$;

then, letting $]\tau_1^{\xi'}, \tau_2^{\xi'}[$ be the largest such interval, we set $\xi'_s \doteq \tau_2^{\xi'} - \tau_1^{\xi'}$, $w'(\xi') \doteq T_k[u'](\tau_2^{\xi'})$, and say that the shock component of $s'$ related to $\xi'$ is the shock wave with left state $T_k[u'](\tau_1^{\xi'})$ and right state $w'(\xi')$. Otherwise, i.e. if $v_k[u'](s', \cdot\,)$ vanishes on a left or right neighborhood of $\xi'$, and $\sigma'$ is strictly increasing on such a neighborhood, we set $\xi'_s \doteq 0$, $w'(\xi') \doteq T_k[u'](\xi')$, and say that $s'$ has no shock component related to $\xi'$. Similarly, assuming $s'' > 0$, for every $\xi'' \in [0, s'']$, we let $\xi''_s$ and $w''(\xi'')$ denote, respectively, the size of the shock component of $s''$ related to $\xi''$, and the left state of such a component, defined as above. With these notations and the analogous ones in the case $s' < 0$, or $s'' < 0$, we then set
\[
\Gamma(s', s'') \doteq \int_0^{s'} \!\! \int_0^{s''} \frac{\big| \sigma'(s' - \xi') - \sigma''(\xi'') \big|}{\big| w'(s' - \xi') - w''(\xi'') \big| + (s' - \xi')_s + \xi''_s} \, d\xi'\, d\xi'', \tag{3.12}
\]
if $s', s'' > 0$, and define $\Gamma(s', s'')$ as in (3.12) taking the integrals over the intervals $[s', 0]$ or $[s'', 0]$, if $s' < 0$ or $s'' < 0$. In the case $s'$ is located on the right of $s''$, we set $\Gamma(s', s'') \doteq \Gamma(s'', s')$, with $\Gamma(s'', s')$ defined as above. By extension, we define also $\Gamma(s, s)$ as in (3.12), for every wave $s$ of the $k$-th characteristic family, adopting the above conventions and viewing $s$ as a wave located on the left (or on the right) of itself. Moreover, for every (portion of) $k$-wave $s$, we define the quantity
\[
\Theta(s, s) \doteq \frac{1}{|s|} \cdot \int_0^{s} \!\! \int_0^{s} \big| \sigma(\xi') - \sigma(\xi'') \big| \, d\xi'\, d\xi'', \tag{3.13}
\]
with $\sigma$ denoting the speed map associated to $s$.

Remark 3. By the Lipschitz continuity of the map $(u, s) \mapsto \sigma_k[u](s, \cdot\,)$ it follows that, for all pairs of waves $s'$, $s''$ in the approximate solution, there holds a uniform bound independent of $s'$, $s''$,
\[
\frac{\big| \sigma'(s' - \xi') - \sigma''(\xi'') \big|}{\big| w'(s' - \xi') - w''(\xi'') \big| + (s' - \xi')_s + \xi''_s} \le O(1) \qquad \forall\, \xi', \xi''. \tag{3.14}
\]
This, in turn, implies
\[
\Gamma(s', s'') = O(1) \cdot |s' s''|, \tag{3.15}
\]
for all pairs of waves $s'$, $s''$.
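The self-interaction quantity in (3.13) can be evaluated numerically. The following minimal Python sketch uses a midpoint rule; the two speed maps (a Burgers-type rarefaction with $\sigma(\xi) = \xi$ and a single shock with constant speed), the grid size and the function name are illustrative assumptions, not taken from the paper.

```python
def self_interaction(sigma, s, n=200):
    """Midpoint-rule approximation of (1/|s|) * double integral over [0, s]^2
    of |sigma(x) - sigma(y)|, i.e. of the quantity in (3.13)."""
    if s == 0:
        return 0.0
    h = s / n
    vals = [sigma((i + 0.5) * h) for i in range(n)]
    total = 0.0
    for a in vals:
        for b in vals:
            total += abs(a - b)
    return total * h * h / abs(s)


# Rarefaction of unit size with speed map sigma(xi) = xi (assumed example).
rarefaction = self_interaction(lambda x: x, 1.0)
# Single shock of unit size: the speed map is constant.
shock = self_interaction(lambda x: -0.5, 1.0)
print(rarefaction, shock)
```

For the rarefaction the exact value is $s^2/3 = 1/3$, which the sketch reproduces up to the quadrature error, while a constant speed map gives exactly zero.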

Remark 4. By Remark 1 one can easily verify that, in the conservative case, if $s'$, $s''$ are two adjacent shocks of the $k$-th family, with Rankine–Hugoniot speeds $\lambda'$, $\lambda''$, respectively, then the potential interaction term in (3.12) takes the form
\[
\Gamma(s', s'') = \frac{|s' s''| \, |\lambda' - \lambda''|}{|s'| + |s''|}.
\]
Moreover, whenever $s$ is a single discontinuity, there holds $\Gamma(s, s) = 0$.

Relying on the estimate (3.2), we shall obtain an a-priori bound on the change in values of the total strength of waves
\[
V(t) \doteq \sum_{\alpha} |s_\alpha|, \tag{3.16}
\]

and of the interaction potential Q in (1.23), (3.12), when evaluated for two adjacent Riemann problems $(u^L, u^M)$, $(u^M, u^R)$, and for the joined Riemann problem $(u^L, u^R)$. To this end, we introduce some further definitions that specify: the quantity of effective interaction associated to waves of the same family and with the same sign; the composite portion of wave that replaces a portion of an incoming shock (due to an interaction involving waves of different families or of the same family with opposite sign); the variation of the self-interacting terms in Q. Moreover, for every pair of waves of the same family $s'$, $s''$, we define the amount of cancellation as
\[
C(s', s'') \doteq
\begin{cases}
\min\{ |s'|, |s''| \} & \text{if } s' \cdot s'' < 0,\\
0 & \text{otherwise.}
\end{cases} \tag{3.17}
\]

Definition 5. In the same setting of Lemma 1, let $s_1, \dots, s_N$ be the waves generated by the interaction of two sets of waves $s'_1, \dots, s'_N$ and $s''_1, \dots, s''_N$, solving two adjacent Riemann problems. If $s'_k$, $s''_k$ have the same sign, say positive, we define as follows the interacting portions of the $k$-waves $s'_k$, $s''_k$, denoted $s_k'^{[i]}$, $s_k''^{[i]}$. Let $u'_k$, $u''_k$ be the left states of $s'_k$ and $s''_k$, respectively, consider the reduced fluxes $\widetilde F' \doteq \widetilde F_k[u'_k](s'_k, \cdot\,)$, $\widetilde F'' \doteq \widetilde F_k[u''_k](s''_k, \cdot\,)$, evaluated along the solution of (2.11) on the intervals $[0, s'_k]$, $[0, s''_k]$ (cf. definition (2.13)), and set
\[
\begin{aligned}
\tau' &\doteq \sup \Big\{ \tau \in [0, s'_k] \,:\, \operatorname{conv}_{[0, s'_k]} \widetilde F'(\xi) = \operatorname{conv}_{[0, s'_k + s''_k]} \big(\widetilde F' \cup \widetilde F''\big)(\xi) \quad \forall\, \xi \in [0, \tau] \Big\},\\
\tau'' &\doteq \inf \Big\{ \tau \in [0, s''_k] \,:\, \widetilde F'(s'_k) + \operatorname{conv}_{[0, s''_k]} \widetilde F''(\xi) = \operatorname{conv}_{[0, s'_k + s''_k]} \big(\widetilde F' \cup \widetilde F''\big)(s'_k + \xi) \quad \forall\, \xi \in [\tau, s''_k] \Big\}
\end{aligned} \tag{3.18}
\]
(adopting the same notations of Definition 3). Let $u_k$ be the left state of $s_k$, and, relying on (3.2), denote by $s_k^{[s]}$ a (possibly zero) shock component of $s_k$, with left state $u_k^{[s]} \doteq T_k[u_k](\tau_k^{[s]})$, for some $\tau_k^{[s]}$, that satisfies
\[
\big| \tau_k^{[s]} - \tau'_k \big| + \big| (\tau_k^{[s]} + s_k^{[s]}) - \tau''_k \big| = O(1) \cdot J(s'_k, s''_k), \tag{3.19}
\]
for some $0 < \tau'_k \le \tau''_k \le s'_k + s''_k$, with
\[
\big| \tau'_k - \tau' \big| + \big| s'_k + \tau'' - \tau''_k \big| = O(1) \cdot \Bigg[ \sum_{j < k} |s''_j| + \sum_{j > k} |s'_j| \Bigg]. \tag{3.20}
\]
Then, if $\tau'_k < s'_k < \tau''_k$, we denote by $s_k'^{[i]}$, $s_k''^{[i]}$ the portions of wave of $s'_k$ and $s''_k$ (according to Definition 2), with left states $T_k[u'_k](\tau'_k)$, $u''_k$, and sizes $s'_k - \tau'_k$, $\tau''_k - s'_k$, respectively. Instead, whenever $\tau'_k \ge s'_k$, or $\tau''_k \le s'_k$, we say that $s_k'^{[i]}$, $s_k''^{[i]}$ are trivial waves of zero size. Next, in the case $\tau'_k < s'_k < \tau''_k$, we define the quantity of effective interaction $\Theta^e(s'_k, s''_k)$ between $s'_k$ and $s''_k$ as
\[
\Theta^e(s'_k, s''_k) \doteq \frac{1}{\big| s_k'^{[i]} \big| + \big| s_k''^{[i]} \big|} \cdot \int_0^{s'_k - \tau'_k} \!\! \int_0^{\tau''_k - s'_k} \big| \sigma'(s'_k - \xi') - \sigma''(\xi'') \big| \, d\xi'\, d\xi'', \tag{3.21}
\]
where $\sigma'$, $\sigma''$ denote the speed maps associated to $s'_k$, $s''_k$. Instead, if $\tau'_k \ge s'_k$, or $\tau''_k \le s'_k$, we set $\Theta^e(s'_k, s''_k) = 0$. Entirely similar definitions are given when $s'_k$, $s''_k$ both have negative sign.
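The thresholds in (3.18) can be illustrated numerically. The following minimal Python sketch assumes a scalar conservation law and reads (3.18) as follows: $\tau'$ is the largest point of $[0, s'_k]$ up to which the convex envelope of the first reduced flux coincides with the envelope of the joined flux, and $\tau''$ is the smallest point of $[0, s''_k]$ from which the shifted envelope of the second reduced flux coincides with it. The fluxes, the grid, the tolerance and all function names are illustrative assumptions, not taken from the paper.

```python
def lower_convex_envelope(xs, ys):
    """Lower convex envelope, at the grid points, of the polyline through (xs, ys)."""
    hull = []  # indices of the vertices of the lower convex hull
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross <= 0:  # b lies on or above the chord from a to i: discard it
                hull.pop()
            else:
                break
        hull.append(i)
    env = list(ys)
    for a, b in zip(hull, hull[1:]):
        for i in range(a, b + 1):
            t = (xs[i] - xs[a]) / (xs[b] - xs[a])
            env[i] = ys[a] + t * (ys[b] - ys[a])
    return env


def interacting_thresholds(F1, s1, F2, s2, n=400, tol=1e-9):
    """Grid approximation of (tau', tau'') for two positive waves of sizes s1, s2."""
    xs1 = [s1 * i / n for i in range(n + 1)]
    xs2 = [s2 * i / n for i in range(n + 1)]
    y1 = [F1(x) for x in xs1]
    y2 = [F2(x) for x in xs2]
    xs = xs1 + [s1 + x for x in xs2[1:]]            # grid on [0, s1 + s2]
    yu = y1 + [y1[-1] + y for y in y2[1:]]          # joined flux F1 ∪ F2
    c1 = lower_convex_envelope(xs1, y1)
    c2 = lower_convex_envelope(xs2, y2)
    cu = lower_convex_envelope(xs, yu)
    i = 0  # advance while the two envelopes agree on [0, xs1[i + 1]]
    while i < n and abs(c1[i + 1] - cu[i + 1]) <= tol:
        i += 1
    j = n  # move left while the shifted envelope agrees on [xs2[j - 1], s2]
    while j > 0 and abs(y1[-1] + c2[j - 1] - cu[n + j - 1]) <= tol:
        j -= 1
    return xs1[i], xs2[j]


# Two shocks for f(u) = -u^2/2: the waves interact along their whole length.
shocks = interacting_thresholds(lambda x: -0.5 * x * x, 1.0, lambda x: -x - 0.5 * x * x, 1.0)
# Two rarefactions for f(u) = u^2/2: there is no interacting portion.
rarefactions = interacting_thresholds(lambda x: 0.5 * x * x, 1.0, lambda x: x + 0.5 * x * x, 1.0)
print(shocks, rarefactions)
```

In the first example the interacting portions have sizes $s' - \tau' = 1$ and $\tau'' = 1$, so both shocks interact entirely; in the second example both sizes vanish, which corresponds to the trivial case in which the effective interaction is set to zero.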


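Remark 5. In the same setting of Definition 5, because of (3.20) we have
\[
\Bigg| \int_0^{s'_k - \tau'} \!\! \int_0^{\tau''} \big| \sigma'(s'_k - \xi') - \sigma''(\xi'') \big| \, d\xi'\, d\xi'' - \int_0^{[s'_k - \tau'_k]_+} \!\! \int_0^{[\tau''_k - s'_k]_+} \big| \sigma'(s'_k - \xi') - \sigma''(\xi'') \big| \, d\xi'\, d\xi'' \Bigg|
= O(1) \cdot \Bigg[ |s'_k| \cdot \sum_{j < k} |s''_j| + |s''_k| \cdot \sum_{j > k} |s'_j| \Bigg], \tag{3.22}
\]
where $[a]_+ \doteq \max\{0, a\}$ denotes the positive part of an element $a$. Hence, by the analysis in [7, Sect. 4] and [8], and relying on [7, Remark 3.6], we derive
\[
J(s'_k, s''_k) = O(1) \cdot \int_0^{s'_k - \tau'} \!\! \int_0^{\tau''} \big| \sigma'(\tau' + \xi') - \sigma''(\xi'') \big| \, d\xi'\, d\xi''
= O(1) \cdot \Bigg[ \Theta^e(s'_k, s''_k) + |s'_k| \cdot \sum_{j < k} |s''_j| + |s''_k| \cdot \sum_{j > k} |s'_j| \Bigg]. \tag{3.23}
\]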
Definition 6. In the same setting of Lemma 1, letting $s_1, \dots, s_N$ be the waves generated by the interaction of two sets of waves $s'_1, \dots, s'_N$ and $s''_1, \dots, s''_N$ (solving two adjacent Riemann problems), for every $k = 1, \dots, N$, we denote by $s_k'^{[sp]_r}$ the possible composite portions of wave present in $s_k$ in place of (portions of) single $k$-shocks of $s'_k$ or of $s''_k$, determined by the interaction of $s'_k$, $s''_k$ with waves $s'_j$, $s''_j$ of other families $j \ne k$, or by the fact that the interacting waves $s'_k$, $s''_k$ have opposite sign. Namely, recalling Definition 4, for all (portions of) shock components $s_k'^{[s]_r}$ split by the interaction that belong to $s'_k$, if we assume $s'_k > 0$, we define $s_k'^{[sp]_r}$ as follows. Let $u'_k$, $u''_k$ be the left states of $s'_k$ and $s''_k$, respectively, denote by $u_k'^{r}$ the left state of $s_k'^{[s]_r}$, so that $u_k'^{r} = T_k[u'_k](\tau_1'^{r})$ for some $\tau_1'^{r}$, and set $\tau_2'^{r} \doteq \tau_1'^{r} + s_k'^{[s]_r}$. By Lemma 1 there exist intervals $[\tau_1^r, \tau_2^r] \subseteq [0, s_k]$, with $T_k[u_k](\tau_2^r) = T_k\big[T_k[u_k](\tau_1^r)\big](\tau_2^r)$, so that the states $T_k[u_k](\tau_1^r)$, $T_k[u_k](\tau_2^r)$ are joined by a composed $k$-wave, and there holds
\[
\sum_r \big| (\tau_2^r - \tau_1^r) - (\tau_2'^{r} - \tau_1'^{r}) \big| = O(1) \cdot \Bigg[ \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg]. \tag{3.24}
\]
Then, we denote by $s_k'^{[sp]_r}$ the $k$-wave with left state $w_k'^{[sp]_r} \doteq T_k[u_k](\tau_1^r)$ and size $\tau_2^r - \tau_1^r$. Entirely similar definitions are given for the possible composite portions of wave $s_k''^{[sp]_r}$ present in $s_k$ due to the splitting of (portions of) shock components of $s''_k$, and when $s'_k < 0$. In the case where no splitting occurs at the interaction we say that $s_k^{[sp]_r}$ are trivial waves of zero size.

Remark 6. In the same setting of Definition 6, suppose that some (portions of) shock components of one of the two incoming $k$-waves, say $s'_k$, are split after the interaction. According to Definition 6, in connection with every such shock component $s_k'^{[s]_r}$ of $s'_k$, let $s_k'^{[sp]_r}$ denote the corresponding composed waves present in the outgoing wave $s_k$. Observe that, by Remark 1, the wave speed map $\xi \mapsto \sigma'_k(\xi) \doteq \sigma_k[u'](s'_k, \xi)$ defined in (2.13) ($u'$ being the left state of $s'_k$) is constant for values of $\xi$ corresponding to $s_k'^{[s]_r}$. Hence, relying on (3.2) and on the Lipschitz continuity of the map $(u, s) \mapsto \sigma_k[u](s, \cdot\,)$, by definition (3.13) we deduce
\[
\begin{aligned}
\sum_r \Theta\big(s_k'^{[sp]_r}, s_k'^{[sp]_r}\big)
&= O(1) \cdot \Bigg[ |s'_k| \cdot \bigg( \sum_{j < k} |s''_j| + J(s'_k, s''_k) + C(s'_k, s''_k) \bigg) + \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg]\\
&= O(1) \cdot \Bigg[ |s'_k| \cdot C(s'_k, s''_k) + \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg].
\end{aligned} \tag{3.25}
\]
Similarly, for the composite portions of wave $s_k''^{[sp]_r}$ of $s_k$ determined by the splitting of (portions of) shock components $s_k''^{[s]_r}$ of $s''_k$, one has
\[
\sum_r \Theta\big(s_k''^{[sp]_r}, s_k''^{[sp]_r}\big) = O(1) \cdot \Bigg[ |s''_k| \cdot C(s'_k, s''_k) + \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{1 \le i \le N} J(s'_i, s''_i) \Bigg]. \tag{3.26}
\]

Definition 7. In the same setting of Lemma 1, letting $s_1, \dots, s_N$ be the waves generated by the interaction of two sets of waves $s'_1, \dots, s'_N$ and $s''_1, \dots, s''_N$, and recalling definitions (3.13), (3.21), and Definition 6, we define the variation of the self-interacting terms of the $k$-th family in Q as
\[
\Delta^{si}_k \doteq
\begin{cases}
\Gamma(s_k, s_k) - \Gamma(s'_k, s'_k) - \Gamma(s''_k, s''_k) + 2\, \Theta^e(s'_k, s''_k) - 2\, \Gamma(s'_k, s''_k) & \text{if } s'_k \cdot s''_k > 0,\\[4pt]
\Gamma(s_k, s_k) - \Gamma(s'_k + s''_k, s'_k + s''_k) - \sum_r \Theta\big(s_k^{[sp]_r}, s_k^{[sp]_r}\big) & \text{if } s'_k \cdot s''_k < 0,\\[4pt]
\Gamma(s_k, s_k) - \Gamma(s'_k, s'_k) - \Gamma(s''_k, s''_k) - \sum_r \Theta\big(s_k^{[sp]_r}, s_k^{[sp]_r}\big) & \text{if } s'_k \cdot s''_k = 0
\end{cases} \tag{3.27}
\]
(viewing $s'_k + s''_k$ as a wave with the same left state as $s'_k$), and set
\[
\Delta^{si} \doteq \sum_{k=1}^{N} \Delta^{si}_k. \tag{3.28}
\]
The quantity $\Delta^{si}$ defined in (3.27)–(3.28) measures the oscillations of the terms in Q (related to the interacting waves $s'_j$, $s''_j$, $j = 1, \dots, N$) that are not controlled by the decrease of Q taking place at the interaction.


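Proposition 1. In the same setting of Lemma 1, set $\Delta V \doteq V^+ - V^-$, $\Delta Q \doteq Q^+ - Q^-$, where $V^-$, $Q^-$ and $V^+$, $Q^+$ denote the values of $V$, $Q$ related, respectively, to the incoming waves $s'_1, \dots, s'_N$, $s''_1, \dots, s''_N$, and to the outgoing waves $s_1, \dots, s_N$. Then, there exist constants $\chi_1, c_1 > 0$, and $c > 0$ in (1.23) such that, assuming $V^- \le \chi_1$, and recalling definitions (3.13), (3.21), and Definition 7, there hold
\[
\Delta V \le -c_1 \cdot \sum_{i=1}^{N} C(s'_i, s''_i) + O(1) \cdot \Bigg[ \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{\substack{1 \le i \le N \\ s'_i \cdot s''_i > 0}} \Theta^e(s'_i, s''_i) \Bigg], \tag{3.29}
\]
\[
\Delta Q \le \Delta^{si} - c_1 \cdot \Bigg[ \sum_{\substack{1 \le i, j \le N \\ i > j}} |s'_i s''_j| + \sum_{\substack{1 \le i \le N \\ s'_i \cdot s''_i > 0}} \Theta^e(s'_i, s''_i) \Bigg] + O(1) \cdot V^- \cdot \sum_{i=1}^{N} C(s'_i, s''_i). \tag{3.30}
\]

Proof. Observing that by (2.23) one has
\[
\sum_{\substack{1 \le i \le N \\ s'_i \cdot s''_i < 0}} J(s'_i, s''_i) = O(1) \cdot \sum_{\substack{1 \le i \le N \\ s'_i \cdot s''_i < 0}} |s'_i s''_i| = O(1) \cdot V^- \cdot \sum_{i=1}^{N} C(s'_i, s''_i), \tag{3.31}
\]
we deduce that (3.29) is an immediate consequence of (3.2) and (3.23), (3.31), provided that $V^-$ is sufficiently small. Thus, we focus our attention on the estimate (3.30). For the sake of simplicity, we consider only the case in which the two adjacent Riemann problems $(u^L, u^M)$, $(u^M, u^R)$ are solved by elementary waves of a single family, say $s'$ and $s''$, $s'$ on the left of $s''$, so that we have $V^- = |s'| + |s''|$. We distinguish three cases, depending on the characteristic families of $s'$ and $s''$ and on the signs of their sizes.

1. $s'$ and $s''$ are waves of the $k'$ and $k'' < k'$ characteristic families. Let $s_{k'}^{[sp]_r}$, $s_{k''}^{[sp]_r}$ be the composed waves that can be possibly produced by the interaction, in place of (portions of) $k'$-shocks present in $s'$ or of (portions of) $k''$-shocks present in $s''$, respectively (cf. Definition 6). Relying on (3.2), (3.14), and adopting the notations in (3.27), we find that the variation of Q is given by
\[
\Delta Q = \Delta^{si}_{k'} + \Delta^{si}_{k''} + \sum_r \Theta\big(s_{k'}^{[sp]_r}, s_{k'}^{[sp]_r}\big) + \sum_r \Theta\big(s_{k''}^{[sp]_r}, s_{k''}^{[sp]_r}\big) - c\, |s' s''| + O(1) \cdot |s' s''| \cdot V^-. \tag{3.32}
\]
Then, observing that by (3.25)–(3.26) we have
\[
\sum_r \Theta\big(s_{k'}^{[sp]_r}, s_{k'}^{[sp]_r}\big) + \sum_r \Theta\big(s_{k''}^{[sp]_r}, s_{k''}^{[sp]_r}\big) = O(1) \cdot |s' s''|, \tag{3.33}
\]
from (3.32)–(3.33) we recover (3.30), provided that $V^-$ is sufficiently small and $c > 0$ in (1.23) is chosen sufficiently large.

2. $s'$ and $s''$ are both $k$-waves and $s' \cdot s'' < 0$. To fix the ideas, assume that $s', s_k > 0$, the case $s' + s'', s_k < 0$ being entirely similar. As in 1, relying on (3.2), (2.23), (3.14), we find that
\[
\Delta Q \le \Delta^{si}_k + \sum_r \Theta\big(s_k^{[sp]_r}, s_k^{[sp]_r}\big) + O(1) \cdot |s' s''| \cdot V^-. \tag{3.34}
\]
On the other hand, in this case, assuming $V^- \le 1$, by (3.25)–(3.26), (2.23), and since $|s' s''| \le C(s', s'')$, we derive
\[
\sum_r \Theta\big(s_k^{[sp]_r}, s_k^{[sp]_r}\big) = O(1) \cdot C(s', s'') \cdot V^-. \tag{3.35}
\]
Hence, (3.34)–(3.35) together yield (3.30).

3. $s'$ and $s''$ are both $k$-waves and $s' \cdot s'' > 0$. Relying on (3.2), (3.14), by definition (3.27) we find
\[
\Delta Q \le \Delta^{si} - 2\, \Theta^e(s', s'') + O(1) \cdot J(s', s'') \cdot V^-, \tag{3.36}
\]
which, together with (3.23), yields (3.30) assuming $V^-$ sufficiently small. This completes the proof of the proposition.

4. Bounds on the Oscillations of the Interaction Potential

In view of Proposition 1, we introduce in this section a functional $G(t)$ that measures the total amount of oscillations of the terms $\Gamma(s_\alpha, s_\beta)$ of Q taking place in the time interval $[t, \infty)$, plus the variation of the terms $\Theta^e(s_\alpha, s_\beta)$, $\Theta\big(s_\alpha^{[sp]_r}, s_\alpha^{[sp]_r}\big)$ in (3.27), between the nodes $(i\varepsilon, \ell\varepsilon)$, $i > t$, where shock waves emerge (as the product of interactions of waves of the same family and of the same sign), and the nodes $(i'\varepsilon, \ell'\varepsilon)$, $i' > i$, where composed waves are generated out of incoming shock waves (by interactions involving waves of different families or waves of the same family with opposite sign). The definition of $G(t)$ is based on a partitioning scheme $\{s_k^h(i, \ell)\}_h$ for the $k$-waves issuing from every node $(i\varepsilon, \ell\varepsilon)$, and contains three types of terms associated to three types of pairs of subwaves present in the solution $u^\varepsilon(\tau)$, $\tau \ge t$: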
which, together with (3.23), yields (3.30) assuming V − sufficiently small. This completes the proof of the proposition. 4. Bounds on the Oscillations of the Interaction Potential In view of Proposition 1, we introduce in this section a functional G(t) that measures the total amount of oscillations of the terms (sα , sβ ) of Q taking place in the time interval [sp] [sp] [t, ∞), plus the variation of the terms e (sα , sβ ), (sα r , sα r ) in (3.27), between the nodes (iε, ε), i > t, where shock waves emerge (as the product of interactions of waves of the same family and of the same sign), and the nodes (i ε, ε), i > i, where composed waves are generated out of incoming shock waves (by interactions involving waves of different families or waves of the same family with opposite sign). The definition of G(t) is based on a partitioning scheme {skh (i, )}h for the k-waves issuing from every node (iε, ε), and contains three types of terms associated to three types of pairs of subwaves present in the solution u ε (τ ), τ ≥ t:

i) subwaves skh , skh of two interacting k-waves sk , sk with the same sign, that correspond to subwaves of a same shock wave generated by the interaction (cfr. definition (4.14)); ii) subwaves skh , skh of a composite portion of k-wave, that correspond to subwaves of a shock component of an incoming k-wave split by the interaction (cfr. definition (4.15)); iii) subwaves skh , skh of non-interacting waves (solving non adjacent Riemann problems), or subwaves which do not fulfill conditions i)-ii) (cfr. definition (4.13)).

4.1. A wave-partition algorithm. Towards a definition of the functional G(t) in (1.24), in the same spirit of [23,26] we first introduce a bookkeeping procedure to subdivide the waves of an approximate solution u ε (t) constructed by the Glimm scheme in a given interval [0, nε]. Such a procedure consists in partitioning the elementary

602

F. Ancona, A. Marson

waves sk (i, ), k = 1, . . . , N , issuing from every mesh point (iε, ε), m < i ≤ n, in two classes of waves: f ir st-generation waves : waves that can be traced back from the time iε all the way down to the time mε , second-generation waves : waves that can be traced back from the time iε to a time i ε > mε where they are generated . Here, the first class collects all the waves that are present in the solution at time mε and, either reach the terminal time nε, or are canceled before time nε, while the second class consists of all the waves that are generated by interactions occurring in the time interval ]mε, nε]. The total strength of waves in the second class is bounded by the total amount of interactions taking place within the interval ]mε, nε], which in turn, by the analysis in [7] can be estimated as O(1) · (V (mε))2 , where V (t) denotes the total strength of waves in u (t), defined as in (3.16). Then, recalling Definition 2 of a wave partition given m,q in Sect. 2, with the same analysis in [26,27] one can associate to each portion sk (i, ) of first-generation k-wave issuing from a node (iε, ε), a portion skm,h (m, j ) of k-wave exiting from a node (mε, j ε), and thus define two index maps (m, j ,k,h,i) , q(m, j ,k,h,i) , so that (m, j ,k,h,i) = , q(m, j ,k,h,i) = q. Similarly, every second-generation k-subwave m,q m,h s k (i, ) can be associated to a subwave s k (i , j ), emerging from a node (i ε, j ε), with (i , j ) ∈ Gm,k , so that = (i , j ,k,h,i) , q = q(i , j ,k,h,i) , for the same index maps , q of above. Here, {(i ε, j ε); (i , j ) ∈ Gm,k } is the set of all points in the strip ]mε, nε] × R, where new k-waves are generated by the interaction of waves of j = k th families. 
Every h-portion of a (first or second generation) k-wave issuing from a node (i ε, j ε) ∈ [mε, nε[×R travels along the nodes (iε, (i , j ,k,h,i) ε), i > i , and eventually reaches the node ι(i , j ,k,h) ε, (i , j ,k,h,ι(i , j ,k,h)) ε , ι denoting a further index map that identifies the maximum grid index until which the corresponding subwave survives. If ι(i , j ,k,h) < n the subwave is cancelled at time ι(i , j ,k,h) ε, otherwise one has ι(i , j ,k,h) = n and the subwave propagates along the whole interval [i ε, nε]. The next proposition provides a wave partition of this type for a Glimm approximate solution defined on an interval [0, nε]. Proposition 2. Given a Glimm approximate solution and any fixed n ∈ N, there exists a partition of elementary k-waves {z kh (i, )}0
(i , j , k, h, i) → (i , j ,k,h,i) ∈ Z,

(i , j , k, h, i) → q

(i , j ,k,h,i)

(4.1)

∈ N,

with the following properties: 1. For every i, , k, z kh (i, ) 0
so that the following hold.

h∈H


(a) For every!fixed m ≤ i ≤ n, k = 1, . . . , N , there " is!a one-to-one correspondence " m,h : between z k (m, j); ι(m, j,k,h) ≥ i, h ∈ H and z km,h (i, ), h ∈ H z km,h (m, j)

m,q(m, j,k,h,i)

←→

zk

(i, (m, j,k,h,i) ),

(4.2)

such that the sizes skm,h of the corresponding waves satisfy & % m,q m,h sk (m, j) − max sk (m, j,k,h,i) (i, (m, j,k,h,i) ) j,k,h

m
= O(1) · (V (mε))2 .

(4.3)

map from (b) There is a set Gm,k" ⊂ {m ! + 1, . . . , n} × Z, and a one-to-one " ! m,h m,h z k (i, ), h ∈ H into z k (i , j ); (i , j ) ∈ Gm,k , h ∈ H : m,q m,h z k h (i, ) −→ z k (i , j ) for some (i , j ) ∈ Gm,k s.t. = (i , j ,k,h,i) ,

(4.4)

m,h such that the sizes s k of the corresponding waves satisfy ( ' m,q(i , j ,k,h,i) m,h (i, (i , j ,k,h,i) ) max s k (i , j ) − s k (i , j )∈Gm,k k,h

i
= O(1) · (V (mε))2 , m,h s k (i , j ) = O(1) · (V (mε))2 . (i , j )∈G

(4.5) (4.6)

m,k

k,h

Remark 7. The maps (4.2), (4.4) are order-preserving with respect to the ordering defined by
\[
s_k^{h'}(i, \ell') \prec s_k^{h''}(i, \ell'') \qquad \text{if} \qquad \ell' < \ell'' \ \text{ or } \ \ell' = \ell'' \ \text{and} \ h' < h'', \tag{4.7}
\]
on the set of all $k$-subwaves present at time $t = i\varepsilon$. Namely, the correspondence
\[
s_k^{h}(i', j') \longrightarrow s_k^{q_{(i', j', k, h, i)}}\big(i, \ell_{(i', j', k, h, i)}\big) \tag{4.8}
\]
induced by (4.2), (4.4) preserves the ordering (4.7). Moreover, the partition provided by Proposition 2 can be constructed so as to enjoy the following additional property. Let $s'$, $s''$ be the $k$-waves issuing from two consecutive nodes $((i-1)\varepsilon, (\ell-1)\varepsilon)$, $((i-1)\varepsilon, \ell\varepsilon)$, that take part in the solution of the Riemann problem at the node $(i\varepsilon, \ell\varepsilon)$, assume that $s'$, $s''$ have the same sign, and let $s_k$ be the $k$-wave exiting from $(i\varepsilon, \ell\varepsilon)$. Then, we may choose partitions $\{s'^{h'}\}_{h'}$, $\{s''^{h''}\}_{h''}$, $\{s_k^h\}_h$ of the $k$-waves $s'$, $s''$, $s_k$ according to Proposition 2, so that there is a one-to-one correspondence between the $k$-subwaves of the interacting portions $s'^{[i]}$, $s''^{[i]}$ (cf. Definition 5) of $s'$, $s''$, and the $k$-subwaves of the resulting shock component of the outgoing wave $s_k$. Moreover, in the case $s'^{[i]} = s''^{[i]} = 0$, any pair of subwaves $s'^{h'}$, $s''^{h''}$ of $s'$, $s''$ will be in one-to-one correspondence with a pair of subwaves of $s_k$ that belong to different components of $s_k$. Similarly, consider a $k$-wave $s'$ issuing from a node of the layer $t = (i-1)\varepsilon$, that has a shock component which is split as a composed portion of a $k$-wave $s_k$ issuing from the next layer $t = i\varepsilon$ (cf. Definition 6). Then, we will choose partitions $\{s'^{h'}\}_{h'}$, $\{s_k^h\}_h$ of $s'$, $s_k$, according to Proposition 2, so that there is a one-to-one correspondence between the subwaves of the portions $s_k'^{[s]_r}$ of the shock components of $s'$ which are not eventually cancelled by the interaction and the subwaves of the corresponding composed components $s_k^{[sp]_r}$ of $s_k$.

4.2. Oscillation of the interaction potential for subwaves. We can naturally extend the definition of the terms $\Gamma(s', s'')$ of the interaction potential (1.23) given for pairs of $k$-waves $s'$, $s''$ (cf. Definition 4 in Sect. 3), to pairs of $k$-subwaves $s_k^{h'}(i, \ell')$, $s_k^{h''}(i, \ell'')$ as follows.

Definition 8. Consider two subwaves of sizes $s'^{h'} \doteq s_k^{h'}(i, \ell')$, $s''^{h''} \doteq s_k^{h''}(i, \ell'')$, that are portions of two waves $s', s'' > 0$ of the $k$-th characteristic family, with $\ell' < \ell''$. Adopting the same notations of Definition 4 and Proposition 2, set $\tau'^{h'-1} \doteq \sum_{p \ge h'} s_k^p(i, \ell')$, $\tau'^{h'} \doteq \sum_{p \ge h'+1} s_k^p(i, \ell')$, $\tau''^{h''-1} \doteq \sum_{p \le h''-1} s_k^p(i, \ell'')$, $\tau''^{h''} \doteq \sum_{p \le h''} s_k^p(i, \ell'')$, and define:

\[
\Gamma\big(s'^{h'}, s''^{h''}\big) \doteq \int_{\tau'^{h'}}^{\tau'^{h'-1}} \!\! \int_{\tau''^{h''-1}}^{\tau''^{h''}} \frac{\big| \sigma'(s' - \xi') - \sigma''(\xi'') \big|}{\big| w'(s' - \xi') - w''(\xi'') \big| + (s' - \xi')_s + \xi''_s} \, d\xi'\, d\xi''. \tag{4.9}
\]
Similar definitions are given in the case $s'^{h'} < 0$ or $s''^{h''} < 0$, and whenever $s'^{h'}$ is located on the right of $s''^{h''}$. By extension, we define also $\Gamma(s^{h'}, s^{h''})$ as in (4.9) for every pair of subwaves $s^{h'}$, $s^{h''}$ of the same $k$-wave $s$, viewing $s$ as a wave located on the left (or on the right) of itself. Moreover, according to (3.13), for subwaves $s'^{h'}$, $s''^{h''}$ of two waves $s', s'' > 0$, we define the quantity
\[
\Theta\big(s'^{h'}, s''^{h''}\big) \doteq \frac{1}{|s'| + |s''|} \cdot \int_{\tau'^{h'}}^{\tau'^{h'-1}} \!\! \int_{\tau''^{h''-1}}^{\tau''^{h''}} \big| \sigma'(s' - \xi') - \sigma''(\xi'') \big| \, d\xi'\, d\xi'', \tag{4.10}
\]
and extend such a definition to the case where one of the two waves $s'$, $s''$ has negative size. Instead, for pairs of subwaves $s'^{h'}$, $s''^{h''}$ of the interacting portions $s'^{[i]}$, $s''^{[i]}$ of two $k$-waves $s'$, $s''$ (see Definition 5), we define the quantity $\Theta^e(s'^{h'}, s''^{h''})$ as in (4.10) with $1/(|s'^{[i]}| + |s''^{[i]}|)$ in place of $1/(|s'| + |s''|)$. Furthermore, for subwaves $s^{h'}$, $s^{h''}$ of the same $k$-wave $s$ (or of the same composite portion $s^{[sp]_r}$ of the split wave as in Definition 6), according to Definition 4 we define $\Theta(s^{h'}, s^{h''})$ as in (4.10) with $1/|s|$ (or $1/|s^{[sp]_r}|$) in place of $1/(|s'| + |s''|)$.

Remark 8. By Definitions 4-5 and Definition 8 it follows that, for every given pair of waves $s'$, $s''$ of the same characteristic family, letting $\{s'^{h'}\}_{h'}$, $\{s''^{h''}\}_{h''}$ be the corresponding partitions provided by Proposition 2, there holds
\[
\Gamma(s', s'') = \sum_{h', h''} \Gamma\big(s'^{h'}, s''^{h''}\big), \tag{4.11}
\]
\[
\Theta^e(s', s'') = \sum_{h', h''} \Theta^e\big(s'^{h'}, s''^{h''}\big), \tag{4.12}
\]
where the summation in (4.11) runs over all subwaves $s'^{h'}$, $s''^{h''}$ of $s'$ and $s''$, respectively, while the summation in (4.12) runs over all subwaves $s'^{h'}$, $s''^{h''}$ of $s'^{[i]}$ and $s''^{[i]}$, respectively. The equality (4.11) remains valid in the case $s' = s'' = s$ ($s$ being either a whole elementary wave or a composite portion $s^{[sp]_r}$ split by an interaction). Notice that, in this case, the subwaves $s^{h'}$, $s^{h''}$ that appear in (4.11) are all portions of the same wave $s$, and whenever $s^{h'}$, $s^{h''}$ are subwaves of a same shock component of $s$, by Remark 4 one has $\Gamma(s^{h'}, s^{h''}) = 0$.

We next define the oscillation of the terms (skh (i, ), skh (i, )) for every pair of k-subwaves skh (i, ), skh (i, ). In view of Proposition 2 we define such oscillations q

q

for pairs of k-subwaves sk (i, (i , j ,k,h ,i) ), sk (i, (i , j ,k,h ,i) ), associated to pairs of points (i ε, j ε), (i ε, j ε) that they have previously crossed (or where they have been originated), setting Osc k,i ((i , j , h ), (i , j , h )) q q . = sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ), sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ) + − skh (i −1, (i , j ,k,h ,i -1) ), skh (i −1, (i , j ,k,h ,i -1) ) ,

(4.13)

(l, q being the maps in (4.1)). We also define the oscillation for a pair of k-subwaves skh (i − 1, (i , j ,k,h ,i -1) ), skh (i − 1, (i , j ,k,h ,i -1) ), issuing from two consecutive nodes of the layer t = (i −1)ε, that join together as subwaves of a shock component of a k-wave issuing from a node (iε, ), with = (i , j ,k,h ,i) = (i , j ,k,h ,i) , by setting Osc k,i ((i , j , h ), (i , j , h )) . = skh (i −1, (i , j ,k,h ,i -1) ), skh (i −1, (i , j ,k,h ,i -1) ) + − skh (i −1, (i , j ,k,h ,i -1) ), skh (i −1, (i , j ,k,h ,i -1) ) , (4.14) where skh , skh is defined as in (4.10) viewing skh , skh as subwaves of the interacting portions of waves sk[i] (i −1, (i , j ,k,h ,i -1) ), sk[i] (i −1, (i , j ,k,h ,i -1) ) (cfr. Definition 5). q

q

Instead, for the k-subwaves sk (i, (i , j ,k,h ,i) ), sk (i, (i , j ,k,h ,i) ), of a composed por[sp] tion of k-wave sk r issuing from a node of the layer t = iε, generated by the splitting of a (portion of a) shock component of a k-wave issuing from a node of the previous layer t = (i − 1)ε (cfr. Definition 6), we set Osc k,i ((i , j , h ), (i , j , h )) q q . = sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ), sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ) + q q − sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ), sk (i , j ,k,h ,i) (i, (i , j ,k,h ,i) ) , (4.15) q q [sp] where sk () , sk () is defined as in (4.10) viewing skh , skh as subwaves of sk r . To

compare these definitions with the one given in (3.27), consider the k-waves sk , sk , k =


1, . . . , N , issuing from two consecutive nodes ((i − 1)ε, ( j − 1)ε), ((i − 1)ε, jε), that take part of the solution s1 , . . . , s N of the Riemann problem at the node (iε, jε). Let {sk h }h , {sk h }h and {skh }h , be the corresponding partitions of sk , sk and sk . Denote with . I = {h ; ι(i -1, j -1,k,h ) ≥ i}, . I = {h ; ι(i -1, j,k,h ) ≥ i},

(4.16)

the index sets of k-subwaves of sk , sk that are not canceled by the interaction and, in the case sk · sk > 0, denote with ! " [i] . h [i] I = h ; sk , is a subwave of sk ! " (4.17) [i] . h [i] I = h ; sk is a subwave of sk , the index sets of k-subwaves that are effectively interacting (cfr. Remark 7), while set . I [i] = I [i] = ∅ whenever sk · sk ≤ 0, or sk [i] = sk[i] = 0. Moreover, setting . . qh = q(i -1, j -1,k,h ,i) , qh = q(i -1, j,k,h ,i) , let ! " q [sp] . [sp] I r = h ; sk h is a subwave of sk r , ! " (4.18) q [sp] . [sp] I r = h ; sk h is a subwave of sk r , denote the index set of k-subwaves of those portions sk [s]r of shock components of one [sp] of the two k-waves sk , sk , that are split in composed portions sk r of sk after the inter. . action at (iε, jε) (cfr. Definition 6 and Remark 7). Set I [sp] = ∅ (I [sp] = ∅) if no such splitting takes place after the interaction for a shock component of sk (sk ), and let ) . ) [sp]r [sp] [sp] [sp] I [sp] = I I r × I r . (4.19) × I r ∪ r

r

Then, defining * (h , h ) ∈ (I ∪ I )×(I ∪ I ); (h , h ) ∈ / I [i] ×I [i] . H= (h , h ) ∈ (I ∪ I )×(I ∪ I ) (h , h ) ∈ / I [sp] . = H (h , h ) ∈ (I ∪ I )×(I ∪ I ); (h , h ) ∈ /H ,

if sk ·sk > 0, if sk ·sk ≤ 0, (4.20) (4.21)

comparing definitions (3.21), (3.27), and Definition 6, with definitions (4.13)–(4.15), and relying on Remarks 7-8, there holds

si

k =

Osc k,i ((i − 1, j − 1, h ), (i − 1, j, h )) +

(h , h )∈H

+

Osc k,i ((i − 1, j − 1, h ), (i − 1, j, h )),

(4.22)

(h , h )∈H

where the terms of the second summand are defined as in (4.14) if (h , h ) ∈ I [i] ×I [i] , and as in (4.15) if (h , h ) ∈ I [sp] . Here the summations may run over a countable set


of indices, in which case their sums is well defined being the corresponding series abso lutely convergent since, by Lemma 1 and because of (3.14), one has h ,h Osc k,i + 2 h ,h Osc k,i = O(1) · (V (iε)) . 4.3. A functional measuring the oscillation of the interaction potential. In order to provide a definition of the total amount of oscillations of the terms (skh (i, ), skh (i, )) over a time interval [mε, nε], we now introduce two sets of pairs of indices F(m,k,i, , ) , S(m,k,i, , ) . They identify the pairs of points and of portions of wave from which are originated pairs of first and second generation k-subwaves, respectively, that: 1. eventually reach the nodes (iε, ε), (iε, ε); . 2. in the case = = , (a) don’t join together as subwaves of a k-shock issuing from (iε, ε), (b) don’t belong to a composed wave generated by a splitting at (iε, ε). (m,k,i,) , S(m,k,i,) , the sets of pairs of indices as in Instead, we shall denote by F F(m,k,i,,) , S(m,k,i,,) , corresponding to pairs of k-subwaves that, when reaching the node (iε, ε), either join together as portions of a shock, or are portions of a composed wave issuing from (iε, ε). To this end, for every m ≤ i < i ≤ n, 1 ≤ k ≤ N , ∈ Z, denote with . (4.23) I(i ,k,i,) = {( j, h); (i , j,k,h,i) = }, the index sets of k-subwaves that were present at time t = i ε, and eventually reach the node (iε, ε). Moreover, in the case where the k-waves sk (i − 1, − 1), sk (i − 1, ), have the same sign, let ! " q . [i] I (i ,k,i,) = ( j, h) ∈ I(i ,k,i,) ; sk (i , j,k,h,i−1) (i −1, −1) is a subwave of sk[i] (i − 1, − 1) , ! " q . [i] I (i ,k,i,) = ( j, h) ∈ I(i ,k,i,) ; sk (i , j,k,h,i−1) (i −1, ) is a subwave of sk[i] (i −1, ) ,

(4.24) denote the index sets of k-subwaves that are effectively interacting (cfr. Definition 5 and Remark 7) and eventually reach the node (iε, ε). Instead, whenever sk (i − 1, − 1) · sk (i − 1, ) < 0, or sk[i] (i − 1, − 1) = sk[i] (i − 1, ) = 0, as well as in the case where one of the two Riemann solutions at the nodes (i − 1, − 1), (i − 1, ), contains no k-wave taking part of the solution of the Riemann problem at (iε, ε), set . I (i[i] ,k,i,) = I (i[i] ,k,i,) = ∅. Furthermore, in the case where the k-wave sk (i, ) contains [sp]

composite waves sk r produced by the splitting of (portions of) shock components of k-waves issuing from the previous layer t = (i − 1)ε (cfr. Definition 6), let ! " q . [sp]r I(i ,k,i,) = ( j, h) ∈ I(i ,k,i,) ; sk (i , j,k,h,i) (i, ) is a subwave of sk[sp]r (i, ) , (4.25) denote the index sets of k-subwaves that when reaching the node (iε, ε) are portions [sp] of sk r , and set . ) [sp]r [sp] [sp]r I(i ,k,i,) × I(i ,k,i,) , (4.26) I(i ,i ,k,i,) = r


F. Ancona, A. Marson

. [sp] while set I(i ,k,i,) = ∅, if no splitting takes place at (i, ) of a shock component of a k-wave present at t = (i − 1)ε. Then, recalling that {(i ε, j ε); (i , j ) ∈ Gm,k } is the set of all generation points of new waves within the strip ]mε, nε] × R, we define, for m < i ≤ n, 1 ≤ k ≤ N , , ∈ Z, the sets . . F(m,k,i, , ) = (( j , h ), ( j , h )) ∈ I(m,k,i, ) × I(m,k,i, ) ; if = = , [i]

[i]

(( j , h ), ( j , h )) ∈ / I (m,k,i,) ×I (m,k,i,) [sp] (( j , h ), ( j , h )) ∈ / I(m,m,k,i,) , there holds

(m,k,i,) F S(m,k,i, , )

(4.27) . = (( j , h ), ( j , h )) ∈ I(m,k,i,) × I(m,k,i,) ; (4.28) / F(m,k,i,,) , (( j , h ), ( j , h )) ∈ . = ((i , j , h ), (i , j , h )); (( j , h ), ( j , h )) ∈ I(i ,k,i, ) ×I(i ,k,i, ) , (i , j ), (i , j ) ∈ Gm,k ∪ {m} × Z, . max{i , i } > m, and, if = = , [i]

[i]

(( j , h ), ( j , h )) ∈ / I (i ,k,i,) ×I (i ,k,i,) [sp] (( j , h ), ( j , h )) ∈ / I(i ,i ,k,i,) ,

(4.29)

(i , j ), (i , j ) ∈ Gm,k ∪ {m} × Z, max{i , i } > m, ((i , j , h ), (i , j , h )) ∈ / S(m,k,i,,) .

(4.30)

there holds

. S(m,k,i,) = ((i , j , h ), (i , j , h )); (( j , h ), ( j , h )) ∈ I(i ,k,i,) ×I(i ,k,i,) ,

By construction, the maps

q q (( j , h ), ( j , h )) → sk (m, j ,k,h ,i) (i, ), sk (m, j ,k,h ,i) (i, ) q q ((i , j , h ), (i , j , h )) → sk (i , j ,k,h ,i) (i, ), sk (i , j ,k,h ,i) (i, )

(4.31)

(m,k,i, , ) , and S(m,k,i, , ) , associate to every pair of indices in F(m,k,i, , ) , F S(m,k,i, , ) , a pair of k-subwaves issuing from the nodes (iε, ε), (iε, ε). The sets (m,k,i, , ) identify the pairs of first generation subwaves, while F(m,k,i, , ) , F S(m,k,i, , ) , S(m,k,i, , ) identify the pairs of subwaves containing at least one second generation wave. Remark 9. By definition (4.24), and recalling Definition 5 and Remark 7, it follows that, q q whenever we consider two subwaves sk (i , j ,k,h ,i−1) (i − 1, − 1), sk (i , j ,k,h ,i−1) (i − 1, ), so that [i]

[i]

(( j , h ), ( j , h )) ∈ I (i ,k,i,) ×I (i ,k,i,) ,

(4.32) q

for some m ≤ i , i < i ≤ n, the corresponding k-subwaves sk (i , j ,k,h ,i) (i, ), q sk (i , j ,k,h ,i) (i, ) on the layer t = iε result in being portions of a same shock component of the k-wave sk (i, ) issuing from (iε, ε). Therefore, by Remark 8, we have q q sk (i , j ,k,h ,i) (i, ), sk (i , j ,k,h ,i) (i, ) = 0. (4.33)


Similarly, by definitions (4.25)–(4.26), and because of Definition 6 and Remark 7, it q q follows that, when we consider two subwaves sk (i , j ,k,h ,i) (i, ), sk (i , j ,k,h ,i) (i, ), so that [sp]

(( j , h ), ( j , h )) ∈ I(i ,i ,k,i,) , q

q

(4.34)

the corresponding k-subwaves sk (i , j ,k,h ,i−1) (i, ), sk (i , j ,k,h ,i−1) (i, ) on the layer t = (i − 1)ε will be portions of a same shock component of a k-wave issuing from one of the two nodes ((i − 1)ε, ( − 1)ε), ((i − 1)ε, ε). Hence, by Remark 8, there holds q q sk (i , j ,k,h ,i−1) (i, ), sk (i , j ,k,h ,i−1) (i, ) = 0. (4.35) Given a Glimm approximate solution, we now fix n > 0, and consider the corresponding wave partition provided by Proposition 2. With the above notations, and using the definitions of oscillation given in (4.13)–(4.15), we define a functional G n (t), t ∈ [0, nε[ , by setting for every 0 ≤ m < n: N n . = GF n,m

k=1 i=m+1 , ∈Z

Osc k,i ((m, j , h ), (m, j , h )),

(( j ,h ), ( j ,h ))∈F

(m,k,i, , )

(4.36) . GF n,m =

N

n

Osc k,i ((m, j , h ), (m, j , h )),

(4.37)

(m,k,i,) k=1 i=m+1 ∈Z (( j ,h ), ( j ,h ))∈F N n . GS = n,m

Osc k,i ((i , j , h ), (i , j , h )),

k=1 i=m+2 , ∈Z ((i , j ,h ), (i , j ,h ))∈S(m,k,i, , )

(4.38) . GS n,m =

N

n

Osc k,i ((i , j , h ), (i , j , h )),

(4.39)

(m,k,i,) k=1 i=m+2 ∈Z ((i , j ,h ), (i , j ,h ))∈S

S [i] [i] (the terms of G F n,m , G n,m , being defined as in (4.14) if (( j , h ), ( j , h )) ∈ I (·) ×I (·) , [sp]

and as in (4.15) if (( j , h ), ( j , h )) ∈ I(·) ), and letting . . F S S G n (t) = G n,m = G F n,m +G n,m +G n,m +G n,m

∀ t ∈ [mε, (m + 1)ε[,

(4.40)

for all 0 ≤ m < n. As observed for (4.22), the summations in (4.36)–(4.39), when running over countable sets of pairs of indices, are well defined since the corresponding sums are absolutely convergent thanks to Lemma 1 and by (3.14). We wish to compare the variation of the functional $G_n(t)$ at a time t = mε with the oscillation of the terms (sα , sβ ) in Q(t) at t = mε. To this end, let $\Delta^{si}(m\varepsilon, \ell\varepsilon)$ denote a quantity defined as in (3.27)–(3.28), expressing the variation of the self-interacting terms of (1.23) related to the waves involved in the solution of the Riemann problem at the node (mε, ℓε), and denote with $\Delta(m\varepsilon, \ell_\alpha\varepsilon, \ell_\beta\varepsilon)$ the variation of the interacting terms of (1.23) related to the solutions of the Riemann problems at different nodes $(m\varepsilon, \ell_\alpha\varepsilon)$, $(m\varepsilon, \ell_\beta\varepsilon)$, $\ell_\alpha \neq \ell_\beta$, defined as follows. Let $s'_{\alpha,k}$, $s''_{\alpha,k}$, k = 1, . . . , N, be the k-waves issuing from the nodes $((m-1)\varepsilon, (\ell_\alpha - 1)\varepsilon)$, $((m-1)\varepsilon, \ell_\alpha\varepsilon)$, that take part in the solution $s_{\alpha,1}, \dots, s_{\alpha,N}$, of


the Riemann problem at the node (mε, α ε), and adopt similar notations for the k-waves , s , k = 1, . . . , N , issuing from ((m − 1)ε, ( − 1)ε), ((m − 1)ε, ε), and sβ,k β β β,k taking part of the solution sβ,k , k = 1, . . . N , of the Riemann problem at (mε, β ε). Then, set + , . , sβ,k ) − (sα,k , sβ,k ) + k (mε, α ε, β ε) = (sα,k , sβ,k ) − (sα,k + , − (sα,k , sβ,k ) − (sα,k , sβ,k ) , (4.41) and . (mε, α ε, β ε) = k (mε, α ε, β ε). N

(4.42)

k=1

Comparing definitions (4.13)–(4.15), (4.36)–(4.40) with (3.21), (3.27)–(3.28), Definition 6 and (4.41)–(4.42), and relying on Remarks 7–8, we observe that the variation of the functional $G_n(t)$ at any time t = mε < nε is precisely

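$$G_{n,m-1} - G_{n,m} = \sum_{\ell\in\mathbb{Z}} \Delta^{si}(m\varepsilon, \ell\varepsilon) + \sum_{\substack{\ell_\alpha,\ell_\beta\in\mathbb{Z} \\ \ell_\alpha\neq\ell_\beta}} \Delta(m\varepsilon, \ell_\alpha\varepsilon, \ell_\beta\varepsilon). \tag{4.43}$$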

Hence, applying Proposition 1, we deduce the estimates stated in the following proposition on the variation of the functionals $G_n(t)$ and $Q(t) + G_n(t)$ across grid-times t = mε < nε.

Proposition 3. For every compact set $K \subset \Omega$, there exist positive constants $\chi_2$, $c_2$, $C$, such that the following holds. Let $u^\varepsilon = u^\varepsilon(t, x)$ be a Glimm approximate solution of (1.1), (1.6), and assume that $\lim_{x\to-\infty} u^\varepsilon(m\varepsilon-, x) \in K$, $V(m\varepsilon-) < \chi_2$, for some m > 0. Then, letting $V^- \doteq V(m\varepsilon-)$, $Q^- \doteq Q(m\varepsilon-)$, $G_n^- \doteq G_n(m\varepsilon-)$, and $V^+ \doteq V(m\varepsilon+)$, $Q^+ \doteq Q(m\varepsilon+)$, $G_n^+ \doteq G_n(m\varepsilon+)$, denote the values of $V$, $Q$, $G_n$ (n > m), related to $u^\varepsilon(m\varepsilon-, \cdot)$ and $u^\varepsilon(m\varepsilon+, \cdot)$, respectively, and setting $\Delta V(m\varepsilon) \doteq V^+ - V^-$, $\Delta Q(m\varepsilon) \doteq Q^+ - Q^-$, $\Delta G_n(m\varepsilon) \doteq G_n^+ - G_n^-$, there hold

⎤

⎢ ⎢ G n (mε) = − ⎢ si (mε, ε) + ⎣ ∈Z

⎡

α ,β ∈Z α =β

⎥ ⎥ (mε, α ε, β ε)⎥, ⎦

(4.44) ⎤

⎢ ⎥ ⎢ ⎥ |sα sβ | + C(sα , sβ ) + e (sα , sβ )⎥, [V + C · (Q + G n )] (mε) ≤ −c2 · ⎢ ⎣ ⎦ kα xβ

kα =kβ

kα =kβ sα ·sβ >0

(4.45) where sα denotes a wave in u ε (mε−, ·) of the kα th family located in xα , C(sα , sβ ) denotes the amount of cancellation defined in (3.17), and e (sα , sβ ) is the interaction quantity defined in (3.21).
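The identity (4.44) is a restatement of (4.43) in terms of the jump of $G_n$ across the grid-time $t = m\varepsilon$. A short sketch of this bookkeeping follows; the shorthand $R_m$ for the right-hand side of (4.43) is a hypothetical name introduced only here, and the sketch uses nothing beyond the piecewise-constant definition (4.40) of $G_n(t)$.

```latex
% R_m is a hypothetical shorthand for the right-hand side of (4.43).
% By (4.40), G_n(t) = G_{n,m} on [m\varepsilon,(m+1)\varepsilon[, hence
%   G_n(m\varepsilon+) = G_{n,m},   G_n(m\varepsilon-) = G_{n,m-1}.
\begin{align*}
\Delta G_n(m\varepsilon)
  &= G_n(m\varepsilon+) - G_n(m\varepsilon-) \\
  &= G_{n,m} - G_{n,m-1} \\
  &= -\bigl(G_{n,m-1} - G_{n,m}\bigr) \\
  &= -R_m .
\end{align*}
```

In words, whatever amount the interaction potential terms change at $t = m\varepsilon$ is exactly the amount by which $G_n$ decreases there, which is what allows $Q + G_n$ to be controlled in (4.45).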


4.4. Uniform bound on the oscillations of the interaction potential of quadratic order in the total variation. We establish here an a-priori bound on the functional $G_n(t)$, $t \in [0, n\varepsilon[$, defined in (4.36)–(4.40), uniform with respect to $n \in \mathbb{N}$. To this end, we fix n > 0 and, in connection with the wave partition provided by Proposition 2, we introduce a set of indices $G^{[sp]}_{m,k}$, 0 ≤ m < n, that identify the points (i, j) in the strip $]m\varepsilon, n\varepsilon[ \times \mathbb{R}$, where a k-shock, incoming from the previous layer t = (i − 1)ε, is split into a composed wave because of the interaction occurring at (iε, jε) involving waves of the same k-th family with opposite sign, or waves of other families. Namely, with the notations of Definition 6 and Proposition 2, we set

$$G^{[sp]}_{m,k} \doteq \Bigl\{ (i,j) \in\, ]m\varepsilon, n\varepsilon] \times \mathbb{R};\ I^{[sp]}_{(i',i'',k,i,j)} \neq \emptyset \ \text{ for some } m \le i', i'' < i \Bigr\}. \tag{4.46}$$

Definition 9. Given any point $(i,j) \in G^{[sp]}_{m,k}$, 0 ≤ m < n, we define the maximal (backward) tree of the kth family with vertex at $(i,j) \in G^{[sp]}_{m,k}$, denoted by $T_{m,k}(i,j)$, as the collection of all k-subwaves originated at some point $(i_0, j_0) \in G_{m,k} \cup G^{[sp]}_{m,k}$, $i_0 < i$, or at time t = mε, that eventually reach a composed wave issuing from (i, j), generated by the splitting of a (portion of a) shock wave incoming from the previous layer t = (i − 1)ε. More precisely, let

$$T^{r,+}_{m,k}(i,j) \doteq \bigl\{ s_k^{p}(i,j) \bigr\}_{l_k^{r-1}(i,j) < p \le l_k^{r}(i,j)} \tag{4.47}$$

be the collection of k-subwaves issuing from (i, j) that are portions of a same split component $s_k^{[sp]r}(i,j)$ of $s_k(i,j)$ (cfr. Definition 6), and set

$$T^{+}_{m,k}(i,j) \doteq \bigcup_{r} T^{r,+}_{m,k}(i,j), \tag{4.48}$$

where the (possibly countable) union is taken over all split components of $s_k(i,j)$. By Proposition 2, consider for each r the family of k-subwaves

$$T^{r,-}_{m,k}(i,j) \doteq \bigl\{ s_k^{h_p}(i_p, j_p) \bigr\}_{l_k^{r-1}(i,j) < p \le l_k^{r}(i,j)} \qquad (i_p \ge m) \tag{4.49}$$

r,+ that are in one-to-one correspondence with the subwaves of Tk,m (i, j) and enjoy, for r −1 r every lk (i, j) m ⇒ i, (i p , j p ,k,h p ,i) ∈ /

q(i p , j p ,k,h p ,i ) = p,

[sp] (i p , j p ) ∈ Gm,k ∪ Gm,k , [sp] ∀ ip < i Gm,k ∪ Gm,k

(4.50) (4.51)

< i,

(4.52)


(, q being the maps in (4.1)). Then, set ! q " (i , j ,k,h ,i) . Tm,k, p (i, j) = sk p p p (i, (i p , j p ,k,h p ,i) ); i p ≤ i ≤ i , r (i, Tm,k

lkr (i, j)

)

. j) =

Tm,k, p (i, j),

p=lkr −1 (i, j)+1

− Tm,k (i,

(4.53)

. ) r,− j) = Tm,k (i, j), r

. ) r Tm,k (i, j) = Tm,k (i, j). r p

p

r,+ Given any pair of subwaves sk (i, j), sk (i, j) ∈ Tm,k (i, j), we shall denote by [i] τ ( i, j, k, p , p ) the minimum index i ∈ max {i p , i p }, . . . , i with the property that, for every i < i ≤ i, ∈ Z, there holds

( j p , h p ), ( j p , h p ) ∈ / I (i[i] ,k,i,) ×I (i[i] ,k,i,) , p

(4.54)

p

[i] [i] (letting I (m,k,i,) , I (m,k,i,) be the sets defined in (4.23)–(4.24), and adopting the notations of Proposition 2). [sp]

Remark 10. A maximal backward tree Tm,k (i, j) with vertex at a point (i, j) ∈ Gm,k , is associated to a collection of polygonal lines passing through the grid points where the subwaves of Tm,k (i, j) have emanated. Hence, with a slight abuse of notations, we will equivalently speak of Tm,k (i, j) as a collection of waves or as a collection of lines (or of points connected by the lines). Such a tree has two key properties. (i) Every (backward) branch Tm,k, p (i, j), lkr −1 (i, j) < p ≤ lkr (i, j), starts at the point [sp] (i, j) ∈ Gm,k , and terminates at the point (i p , j p ), which is either a point of Gm,k where the subwaves of Tm,k, p (i, j) are generated, or a point of the line t = mε, or [sp] a point of Gm,k , where a k-wave is split in subwaves of Tm,k (i, j) and possibly in other subwaves travelling through points outside of Tm,k (i, j). (ii) Every two (backward) branches Tm,k, p (i, j), Tm,k, p (i, j), lkr −1 (i, j) < p , p ≤ lkr (i, j), coincide on a polygonal line starting at (i, j), and after splitting at the point (τ p p , p , p ), where . τ p , p = τ [i] ( i, j, k, p , p ), . p , p = (i p , j p ,k,h p ,τ p , p ) = (i p , j p ,k,h p ,τ p , p ) ,

(4.55)

they can again possibly join together only at their terminal points if (i p , j p ) = (i p , j p ). Moreover, τ p , p i p , one has

[i] ( j p , h p ), ( j p , h p ) ∈ I (i ,k,τ p

p , p , p , p )

[i]

×I (i

p ,k,τ p , p , p , p )

.

(4.56)


The properties (i)-(ii) will be useful to analyze the variation of the interaction terms q q q q r (i, j), and to estimate the total (sk , sk ) for pairs of waves sk , sk belonging to Tm,k r (i, j). amount of such a variation in terms of the quantity of wave interaction within Tm,k p

p

r,+ (i, j), with (i, j) ∈ Indeed, consider a pair of k-subwaves sk (i, j), sk (i, j) ∈ Tm,k [sp]

Gm,k , lkr −1 (i, j) < p , p ≤ lkr (i, j), and, to fix the ideas, assume that i p ≥ i p . Then, by properties (i)-(ii) and recalling definitions (4.23)–(4.24), for every ∈ Z there holds [i] [i] / I (i ,k,i,) ×I (i ,k,i,) ∀ i p ≤ i ≤ i, i = τ p , p , ( j p , h p ), ( j p , h p ) ∈ p p (4.57) [sp] ( j p , h p ), ( j p , h p ) ∈ / I(i ,i ,k,i,) ∀ i p < i < i. p

p

For i p < i ≤ i, letting , q be the maps in (4.1), set . i = (i p , j p ,k,h p ,i) , . qi = q(i p , j p ,k,h p ,i) ,

. i = (i p , j p ,k,h p ,i) , . qi = q(i p , j p ,k,h p ,i) .

(4.58) (4.59)

Because of (4.57), and by definitions (4.27)–(4.30), we deduce that the overall variation q

q

of the interaction term (sk i (i, i ), sk i (i, i )) for i p ≤ i ≤ i, is equal to the sum of all the terms Osc k,i , Osc k,i , i p < i ≤ i, of the functionals (4.36)–(4.39) that corre . . spond to pairs of waves in T p = Tm,k, p (i, j), T p = Tm,k, p (i, j), plus the variation q q of (sk i (i, i ), sk i (i, i )) between i = τ p , p − 1 and i = i. In fact, in accordance with (4.27)–(4.30), consider the sets of pairs of indices IiF

* ) . (m,k,i,) ; = (( j , h ), ( j , h )) ∈ F(m,k,i,) ∪ F ∈Z q(m, j ,k,h ,i) (i, (m, j ,k,h ,i) ) ∈ T p, sk

sk (m, j

∈Z q(i , j ,k,h ,i) sk (i, (i , j ,k,h ,i) ) ∈ T p,

sk (i

q

,k,h ,i)

p

(i, (i , j ,k,h ,i) ) ∈ T

p

* ) . S(m,k,i,) ∪ S(m,k,i,) ; IiS = ((i , j , h ), (i , j , h )) ∈ q

, j ,k,h ,i)

" ,

(i, (m, j ,k,h ,i) ) ∈ T

(4.60) " ,

F that are associated to the terms Osc k,i , Osc k,i of the functionals G F n,m , G n,m and

S GS n,m , G n,m , related to pairs of subwaves in T Observe that, for every i p m,

∅ {((i p , j p , h p ), (i p , j p , h p ))}

if i p = m, if i p > m.

(4.61) (4.62)


. Next, set τ − = τ p , p − 1, and

. k (i, j; p , p ) =

⎧ % & q − q − ⎪ p p τ τ − − ⎪ ⎪ ⎨ sk (i, j), sk (i, j) − sk (τ , τ − ), sk (τ , τ − ) ⎪ ⎪ p p ⎪ ⎩ sk (i, j), sk (i, j)

(4.63)

if τ p , p > i p , if τ p , p = i p ,

& % q − q − p p , sk , sk , are defined with the same conventions adopted for where sk τ , sk τ the corresponding terms in (4.14)–(4.15). Then, assuming to fix the ideas that i p = m, and recalling definitions (4.13)–(4.15), property (4.50), and Remark 9, we deduce that, with the notation (4.63), there holds h h p p sk (i, j), sk (i, j) − sk p (m, j p ), sk p (m, j p ) = k (i, j; p , p ) + + Osc k,i ((m, j p , h p ), (m, j p , h p )) + + Osc k,i ((m, j p , h p ), (m, j p , h p )) + i p
Osc k,τ p , p ((m, j p , h p ), (m, j p , h p )), +H (τ p , p −i p ) ·

(4.64)

where

$$H(\tau_{p',p''} - i_{p'}) \doteq \begin{cases} 1 & \text{if } \tau_{p',p''} > i_{p'}, \\ 0 & \text{otherwise.} \end{cases} \tag{4.65}$$

The total amount of variations $\Delta_k(i,j;p',p'')$, $(i,j) \in G^{[sp]}_{m,k}$, in (4.63) is bounded by $(V(m\varepsilon))^2$, as shown in the next lemma.

Lemma 2. For every compact set $K \subset \Omega$, there exist constants $\chi_3, c_3 > 0$ such that the following holds. Let $u^\varepsilon = u^\varepsilon(t, x)$ be a Glimm approximate solution of (1.1), (1.6), assume that $V(t) < \chi_3$, $\lim_{x\to-\infty} u^\varepsilon(t, x) \in K$, and consider a wave partition as in Proposition 2 for some fixed $n \in \mathbb{N}$. Then, letting $G^{[sp]}_{m,k}$ be the set in (4.46), $l_k^r(i,j)$ the index associated to the collection of subwaves $T^{r,+}_{m,k}(i,j)$ in (4.47), and $\Delta_k(i,j;p',p'')$, $(i,j) \in G^{[sp]}_{m,k}$, be the quantity defined in (4.63) (with $\max\{i_{p'}, i_{p''}\}$ in place of $i_{p'}$), for every 0 < m < n there holds

$$\sum_{k=1}^{N} \ \sum_{(i,j)\in G^{[sp]}_{m,k}} \ \sum_{r} \ \sum_{l_k^{r-1}(i,j) < p',p'' \le l_k^{r}(i,j)} \Delta_k(i,j;p',p'') \le c_3 \cdot (V(m\varepsilon))^2, \tag{4.66}$$

where the third summation runs over all split components $s_k^{[sp]r}(i,j)$ of the k-wave $s_k(i,j)$ issuing from (i, j).


Proof. The proof is given in three steps. Step 1. Consider the potential interaction functional introduced in [7]: 1 sα sβ . sα sβ + σα (ξ ) − σβ (ξ ) dξ dξ Q(t) = 4 0 0 kα xβ (t)

(4.67)

kα =kβ

(where xα (t) denotes the position of the wave sα of the kα th characteristic family in the approximate solution u ε (t), and σα (·) is the wave-speed map associated to sα ). By the analysis in [7, Sect. 4] it follows that, if V (t) is sufficiently small, Q(t) is decreasing at every interaction, and that the total amount of wave interaction taking place within a time interval ]mε, nε] is bounded by O(1) · [Q(nε+) − Q(mε+)]. Namely, if we let s1 , . . . , s N , s1 , . . . , s N denote the incoming waves taking part of the solution of the Riemann problem at a node (i, j) (in the same setting of Lemma 1), and define the amount of interaction at (i, j) as . Int(i, j) = |sα sβ | + J (sα , sα ) (4.68) 1≤α,β≤N α>β

1≤α≤N

(J (sα , sβ ) being the amount of interaction between sα , sβ defined as in (2.19)–(2.22), then there holds Int(i, j) = O(1) · Q(mε+) ∀ n > 0, (4.69) m
with O(1) denoting a quantity uniformly bounded with respect to n. On the other hand, if we define the product of cancellation and total variation at (i, j) as . V C(i, j) = |sα | + |sα | · C(sα , sα ), (4.70) 1≤α≤N

and we let Cm,n denote the total amount of cancellation taking place in the interval ]mε, nε], by the uniform a-priori bound on V (t) established in [7] we deduce V C(i, j) = O(1) · V (mε+) · Cm,n ∀ n > 0. (4.71) m
k,T

r

incoming k-waves sk , sk at (i, j) that contain the k-subwaves of T . Then, set ⎤⎡ ⎤ ⎡ . ⎦ ⎣ ⎦ Intk,T r (i, j) = ⎣s r · sβ + sk,T r · s β + k,T β
β>k

, + + s r + s r ·J (sk , sk ), k,T k,T

(4.72)


. Intk,T (i, j) = Intk,T r (i, j), r , + . V Ck,T r (i, j) = s r + s r · C(sk , sk ), k,T k,T . V Ck,T (i, j) = V Ck,T r (i, j),

(4.73) (4.74) (4.75)

r

and observe that, for every pair of backward maximal trees Tm,k (i , j ), Tm,k (i , j ), [sp] with vertices at different points (i , j ), (i , j ) ∈ Gm,k , the collections of k-subwaves

∧

− − of Tm,k (i , j )\Tm,k (i , j ) and Tm,k (i , j )\Tm,k (i , j ) are disjoint. Hence, letting T . − denote the set of nodes in Tm,k (i, j)\Tm,k (i, j) associated to a tree T = Tm,k (i, j), for every fixed (i, j) ∈]mε, nε] × R, one has Intk,T (i, j) + V Ck,T (i, j) = Int(i, j) + V C(i, j), (4.76) 1≤k≤N {T ; T ∧ (i, j)}

. [sp] where the second sum runs over all trees T = Tm,k (i, j), (i, j) ∈ Gm,k , that have − (i, j). Therefore, since by (i, j) as a node not belonging to their terminal points Tm,k definition (4.67) we have Q(t) = O(1) · kα ,kβ |sα sβ | = O(1) · (V (t))2 , and because Cm,n = O(1) · V (mε+), it follows from (4.69), (4.71), (4.76), that in order to establish (4.66) it will be sufficient to prove that, for every given k-family, and for every fixed [sp] (i, j) ∈ Gm,k , there holds k (i, j; p , p ) r

lkr −1 (i, j)< p , p ≤lkr (i, j)

⎡

⎢ = O(1) · ⎣Int(i, j) + V C(i, j) +

⎤ ⎥ Intk,T (i, j) + V Ck,T (i, j) ⎦ . (4.77) (i, j)∈T

∧
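The reduction of (4.66) to the pointwise estimate (4.77) rests on summing (4.77) over all vertices and exchanging the order of summation. The chain below is only a sketch of that bookkeeping under stated assumptions: the primed pair $(i',j')$ for a node of a tree and the symbol $\mathcal{N}$ for the set of grid nodes in $]m\varepsilon, n\varepsilon] \times \mathbb{R}$ are hypothetical notations introduced here, and $\Delta_k$ is read as the quantity defined in (4.63).

```latex
% \mathcal{N} and (i',j') are hypothetical notations, not taken from the source.
% Inner sums over r and over l_k^{r-1}(i,j) < p',p'' <= l_k^r(i,j) are abbreviated.
\begin{align*}
\sum_{k=1}^{N} \sum_{(i,j)\in G^{[sp]}_{m,k}} \sum_{r} \sum_{p',p''} \Delta_k(i,j;p',p'')
 &= O(1)\cdot \sum_{k=1}^{N} \sum_{(i,j)\in G^{[sp]}_{m,k}}
    \Bigl[ \mathrm{Int}(i,j) + VC(i,j)
    + \sum_{(i',j')\in T^{\wedge}} \bigl( \mathrm{Int}_{k,T}(i',j') + VC_{k,T}(i',j') \bigr) \Bigr]
    && \text{by (4.77)} \\
 &= O(1)\cdot \sum_{(i',j')\in\mathcal{N}}
    \Bigl[ \mathrm{Int}(i',j') + VC(i',j')
    + \sum_{k=1}^{N} \sum_{\{T;\, T^{\wedge}\ni(i',j')\}}
      \bigl( \mathrm{Int}_{k,T}(i',j') + VC_{k,T}(i',j') \bigr) \Bigr]
    && \text{exchange of sums} \\
 &= O(1)\cdot \sum_{(i',j')\in\mathcal{N}} \bigl[ \mathrm{Int}(i',j') + VC(i',j') \bigr]
    && \text{by (4.76)} \\
 &= O(1)\cdot \bigl[ Q(m\varepsilon+) + V(m\varepsilon+)\cdot C_{m,n} \bigr]
    && \text{by (4.69), (4.71)} \\
 &= O(1)\cdot \bigl( V(m\varepsilon+) \bigr)^{2}
    && \text{since } Q = O(1)\cdot V^{2},\ C_{m,n} = O(1)\cdot V(m\varepsilon+).
\end{align*}
```

The factor $N$ picked up when the terms $\mathrm{Int}(i,j) + VC(i,j)$ are summed over the families $k$ is absorbed into $O(1)$. The final bound is of the same quadratic order as the right-hand side of (4.66).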

. [sp] Step 2. Fix (i, j) ∈ Gm,k , and consider the backward tree T = Tm,k (i, j) with vertex at (i, j). Adopting the notations of Proposition 2 and Definition 9, for every pair of . . indices lkr −1 (i, j) < p , p ≤ lkr (i, j), set τ p , p = τ [i] ( i, j, k, p , p ), and let p , p = (i p , j p ,k,h p ,τ p , p ) = (i p , j p ,k,h p ,τ p , p ) . Moreover, let i , i , qi , qi , max{i p , i p } < i ≤ i, be the indices defined as in (4.58)–(4.59). Next, define the sets ! " . . ) Pr , Pr = ( p , p ); lkr −1 (i, j) < p , p ≤ lkr (i, j) , P= r

. P = ( p , p ) ∈ P; τ p , p = max{i p , i p } , . Pr (i, j) = ( p , p ) ∈ Pr ; max{i p , i p } < τ p , p = i, p , p = j , . ) P (i, j) = Pr (i, j),

(4.78)

r

and observe that P = P ∪

) (i, j)∈T

P (i, j). ∧

(4.79)


By Remark 8, Remark 6, and Definition (4.63), relying on (3.25)–(3.26) we may estimate the terms on the left-hand side of (4.77) related to pairs ( p , p ) ∈ P as p p k (i, j; p , p ) = sk (i, j), sk (i, j) ( p , p )∈P

( p , p )∈P

= O(1) ·

[sp] [sp] sk r (i, j), sk r (i, j)

r

= O(1) · Int(i, j) + V C(i, j) .

(4.80)

Instead, for the terms on the left-hand side of (4.77) related to pairs ( p , p ) ∈ Pr (i, j), r∧ . r (i, j)\T r,− (i, j), with the notations (4.58)–(4.59), and recalling = Tm,k (i, j) ∈ T m,k Remark 7, Definition 8 and the conventions adopted for the definition of (4.63), we have k (i, j; p , p ) r

=

=

( p , p )∈Pr (i, j) r∧ (i, j)∈T

r

( p , p )∈Pr (i, j) r∧ (i, j)∈T

r

% & qi−1 qi−1 p p s (i, j), s (i, j) − s (i −1, ), s (i −1, ) i−1 i−1 k k k k

( p , p )∈Pr (i, j) r∧

% & r s p , s p , s qi−1 , s qi−1 , k k k k

(4.81)

(i, j)∈T

where % & q p p q r sk , sk , sk i−1 , sk i−1 ⎡ s p s p k k p 1 . ⎣ p σ = (ξ ) − σ (ξ ) · dξ dξ + k k [sp] 0 |sk r (i, j)| 0 −

q

1 )|+|s [i] (i −1, ))| |sk[i] ((i −1, i−1 k i−1

q

⎤ σ qi−1 (ξ ) − σ qi−1 (ξ ) dξ dξ ⎦, k k

s i−1 sk i−1 k

· 0

0

(4.82) p

with sk

qi−1

. p p . p = sk (i, j), sk = sk (i, j), sk p

q

p

q

qi−1

qi−1

. ), s = sk (i − 1, i−1 k

qi−1

. = sk (i −

), and letting σ , σ , σ i−1 , σ i−1 , denote the wave-speed map defined as in 1, i−1 k k k k [sp]r

(2.13), associated to the wave sk q

p

p

(i, j) (for σk , σk ), and to the waves sk[i] ((i − 1, q

), s [i] ((i −1, ) (for σ i−1 and σ i−1 , respectively). i−1 k k k i−1

Step 3. In order to estimate the right-hand side of (4.81), consider a pair ( p , p ) ∈ r∧ Pr (i, j), (i, j) ∈ T , and for every node (ι, ι ) in T r ∧ that lies between (i, j) and q q (i, j) define the total strength of interacting waves with sk ι , sk ι , and cancellation taking


place at (ι, ι ), as follows. Let s1 , . . . , s N , s1 , . . . , s N , denote the incoming waves taking part of the solution of the Riemann problem at (ι, ι ), and, in the case ι > i, assume q q to fix the ideas that s j , j < k, are the waves of other families interacting with sk ι , sk ι . q

q

Then, recalling that by Definition 9 and because of (4.56), sk i−1 , sk i−1 are subwaves of q

q

interacting waves at (i, j) issuing from different nodes, while for ι ≥ i, sk ι , sk ι are subwaves of the same k-wave issuing from a node (ι, ι ), and interacting at (ι + 1, ι+1 ), set . p , p I Wk (ι, ι ) = if ι > i, s j + J (sk , sk ) + C(sk , sk ) j
. I Wk (ι, ι ) = s j + J (sk , sk ) + C(sk , sk ) p

if ι = i,

(4.83)

j
. p I Wk (ι, ι ) = s j + J (sk , sk ) + C(sk , sk )

if ι = i.

j>k

Next, relying on (3.2) and on the Lipschitz continuity of the map (u, s) → σk [u](s, ·), we derive ⎡ ⎤ ⎢ q ⎢ p p σk − σk i−1 L∞ = O(1) · ⎢ I Wk (i, j) + ⎣ ⎡ ⎢ q ⎢ p p σk − σk i−1 L∞ = O(1) · ⎢ I Wk (i, j) + ⎣

(ι,)∈T r ∧ (i, j)≺(ι,)!(i, j)

(ι,)∈T r ∧ (i, j)≺(ι,)!(i, j)

p , p

I Wk

⎥ ⎥ (ι, )⎥ , ⎦

(4.84)

⎤ p , p

I Wk

⎥ ⎥ (ι, )⎥ , ⎦

(4.85)

where the sums run over all nodes (ι, ) ∈ T r ∧ lying between (i, j) and (i, j). Moreq

q

p

p

over, observe that since sk i−1 , sk i−1 are subwaves of interacting waves while sk , sk are subwaves of a composed wave, by the monotonicity of the wave-speed it follows that q q p p max σk − σk L∞ , σk i−1 − σk i−1 L∞ q q p p (4.86) = O(1) · max σk − σk i−1 L∞ , σk − σk i−1 L∞ . On the other hand, by definitions (4.72), (4.74), and applying (3.2), we find qi−1 qi−1 |skp skp | s | |s k k − [i] [sp]r [i] |s (i −1, )|+|s (i −1, )| |s (i, j)| k k i−1 i−1 k ( p , p )∈Pr (i, j) r∧ (i, j)∈T

= O(1) ·

(i, j)∈T

r∧

Intk,T r (i, j) + V Ck,T r (i, j) .

(4.87)


Hence, relying on (4.84)–(4.87), we derive

r

( p , p )∈Pr (i, j) r∧ (i, j)∈T

= O(1) ·

r

= O(1) ·

% & r s p , s p , s qi−1 , s qi−1 k k k k

(i, j)∈T

(i, j)∈T

∧

r∧

Intk,T r (i, j) + V Ck,T r (i, j)

Intk,T (i, j) + V Ck,T (i, j) ,

(4.88)

which, together with (4.80)–(4.81), and because of (4.78)–(4.79), yields (4.77), thus completing the proof of the lemma.

Proposition 4. In the same setting of Lemma 2, for every compact set $K \subset \Omega$, there exist constants $\chi_4, c_4 > 0$ such that the following holds. Let $u^\varepsilon = u^\varepsilon(t, x)$ be a Glimm approximate solution of (1.1), (1.6), satisfying $V(t) < \chi_4$, $\lim_{x\to-\infty} u^\varepsilon(t, x) \in K$, and consider the total amount of oscillations $G_{n,m}$ defined by (4.36)–(4.40), for any 0 ≤ m < n. Then, there holds

$$G_{n,m} \le c_4 \cdot (V(m\varepsilon))^2 \qquad \forall\ 0 \le m < n. \tag{4.89}$$

Proof. In order to provide a uniform bound on the functional G n,m defined in (4.40) in connection with a wave partition {skh (i, j)}h of u ε provided by Proposition 2, we shall F S S F F consider separately the terms G F n,m , G n,m and G n,m , G n,m . Regarding G n,m + G n,m , . skm,h (m, j ), let ι( j ,h , j ,h ,k) = for any pair of first-generation subwaves skm,h (m, j ), min{ι(m, j ,k,h ), ι(m, j ,k,h ) } (ι being the map in 4.1) denote the maximum grid index ≤ n until which both k-subwaves survive. This means that either one of the two waves is canceled at time t = ι( j ,h , j ,h ,k) ε < nε, or ι( j ,h , j ,h ,k) ε = nε and thus both waves propagate along the whole interval [mε, nε]. Observe that, by Remark 10, and relying in particular on (4.64), letting , q be the maps in (4.1), and setting . ( j ,h ,k) = (m, j ,k,h ,ι( j ,h , j ,h ,k) ) , . q( j ,h ,k) = q(m, j ,k,h ,ι( j ,h , j ,h ,k) ) , . m,q( j ,h ,k) m,q ι( j ,h , j ,h ,k) , ( j ,h ,k) , sk sk (ι, ) = . ( j ,h ,k) = (m, j ,k,h ,ι( j ,h , j ,h ,k) ) , . q(j ,h ,k) = q(m, j ,k,h ,ι( j ,h , j ,h ,k) ) , . m,q( j ,h ,k) m,q ι( j ,h , j ,h ,k) , ( j ,h ,k) , sk sk (ι, ) =

(4.90)

. m,q m,q sk (ι, ), sk (ι, ) − skm,h (m, j ), skm,h (m, j ) , D(Fj ,h , j ,h ,k) = (4.91)


we have N F ≤ G n,m + G F n,m

k=1 (i, j)∈G [sp] m,k

+

N

r

lkr −1 (i, j)< p , p ≤lkr (i, j)

k=1 j , j ∈Z h ,h

k (i, j; p , p ) +

D(Fj ,h , j ,h ,k) ,

(4.92)

where the third summation of the second term runs over those indices h , h that correspond to first-generation k-subwaves issuing from (m, j ) and (m, j ), respectively. Hence, applying Lemma 2, and because of (3.14), we find F 2 G n,m + G F n,m ≤ O(1) · (V (mε)) + +O(1) ·

N

, + m,q m,h m,q sk (m, j ) sk (ι, ) sk (ι, )+ skm,h (m, j ) .

k=1 j , j ∈Z h ,h

(4.93) On the other hand, since ⎡ ⎤2 m,h m,h sk (m, j ) sk (m, j)⎦ skm,h (m, j ) = ⎣

k, j , j h ,h

k, j,h

= O(1) · (V (mε))2 ,

(4.94)

and thanks to (4.3), we derive , + m,q m,h m,q sk (m, j ) sk (ι, ) sk (ι, ) + skm,h (m, j ) k, j , j h ,h

= O(1) ·

m,h sk (m, j ) skm,h (m, j ) +

k, j , j h ,h

, + m,q m,q sk (ι, )− sk (ι, )− + O(1) · skm,h (m, j )+ skm,h (m, j ) k, j , j h ,h

= O(1) · (V (mε))2 ,

(4.95)

which, together with (4.93), yields F 2 G n,m + G F n,m = O(1) · (V (mε)) .

(4.96)

S Concerning the term G S n,m + G n,m of G n,m , letting ι, , q be the maps in (4.1), set

. ι(i , j ,h ,i , j ,h ,k) = min{ι(i , j ,k,h ) , ι(i , j ,k,h ) }, . (i , j ,h ,k) = (i , j ,k,h ,ι(i , j ,h ,i , j ,h ,k) ) , . q(i , j ,h ,k) = q(i , j ,k,h ,ι(i , j ,h ,i , j ,h ,k) ) ,

(4.97)


. m,q(i , j ,h ,k) m,q s k (ι, ) = ι(i , j ,h ,i , j ,h ,k) , (i , j ,h ,k) , sk . (i , j ,h ,k) = (i , j ,k,h ,ι(i , j ,h ,i , j ,h ,k) ) ,

(4.98) . q(i , j ,h ,k) = q(i , j ,k,h ,ι( j ,h , j ,h ,k) ) , . m,q(i , j ,h ,k) m,q s k (ι, ) = ι(i , j ,h ,i , j ,h ,k) , (i , j ,h ,k) , sk . m,q m,q D(iS , j ,h ,i , j ,h ,k) = sk (ι, ), sk (ι, ) − skm,h (i , j ), skm,h (i , j ) , . Gm,k = (i , j ), (i , j ) ∈ Gm,k ∪ {m} × Z; max{i , i } > m .

(4.99) (4.100)

Next, observe that N S S G ≤ + G n,m n,m

k (i, j; p , p ) +

k=1 (i, j)∈G [sp] r l r −1 (i, j)< p , p ≤l r (i, j) k k m,k

+

N

k=1 (i , j ),(i , j )∈Gm,k h ,h

D(iS , j ,h ,i , j ,h ,k) ,

(4.101)

the third summation of the second term running over those indices h , h that correspond to k-subwaves issuing from (i , j ) and (i , j ), respectively, one of which at least is of second-generation. To fix the ideas assume that skm,h (i , j ) is of second-generation, i.e. m,h s k (i , j ), (i , j ) ∈ Gm,k . Then, with similar arguments as those that s m,h (i , j ) = k

F used in the estimate of G F n,m + G n,m , applying (4.3), (4.5)–(4.6) and Lemma 2, we derive

S 2 G n,m + G S n,m ≤ O(1) · (V (mε)) + +O(1) ·

N

m,h s k (i , j ) +

k=1 (i , j )∈Gm,k h

+O(1) ·

N

+ m,q m,h s k (ι, )− s k (i , j ) +

k=1 (i , j ),(i , j )∈Gm,k h ,h

, m,q + sk (ι, )−skm,h (i , j ) = O(1) · (V (mε))2 .

(4.102)

From (4.96), (4.102) we recover the desired estimate (4.89), thus concluding the proof of the proposition.

Thanks to Proposition 4, we may now define a functional G(t), t ≥ 0, setting

$$G(t) \doteq G_m \doteq \sup_{n>m} G_{n,m} \qquad \forall\ t \in [m\varepsilon, (m+1)\varepsilon[,\ m \ge 0. \tag{4.103}$$
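That the supremum in (4.103) is finite can be checked from the estimates in the proof of Proposition 4. The following is a minimal sketch, assuming (as (4.96) and (4.102) indicate) that $G_{n,m}$ is bounded in absolute value by a multiple of $(V(m\varepsilon))^2$ uniformly in $n$; the constant name $c$ is hypothetical.

```latex
% Assumption: |G_{n,m}| <= c (V(m\varepsilon))^2 for all n > m, with c independent of n.
% c is a hypothetical name for the uniform constant.
\begin{align*}
-\,c\,(V(m\varepsilon))^{2} \;\le\; G_{n,m} \;\le\; c\,(V(m\varepsilon))^{2}
\quad \forall\, n > m
\qquad\Longrightarrow\qquad
|G_m| = \Bigl|\sup_{n>m} G_{n,m}\Bigr|
      \;\le\; \sup_{n>m} |G_{n,m}|
      \;\le\; c\,(V(m\varepsilon))^{2}.
\end{align*}
```

The middle inequality uses only that every $G_{n,m}$ lies in the interval $[-\sup_n |G_{n,m}|, \sup_n |G_{n,m}|]$, so the same holds for their supremum.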


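Observe that, by (4.43), one has

$$G_{m-1} - G_m = \sum_{\ell\in\mathbb{Z}} \Delta^{si}(m\varepsilon, \ell\varepsilon) + \sum_{\substack{\ell_\alpha,\ell_\beta\in\mathbb{Z} \\ \ell_\alpha\neq\ell_\beta}} \Delta(m\varepsilon, \ell_\alpha\varepsilon, \ell_\beta\varepsilon). \tag{4.104}$$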

Hence, recalling (4.40), from (4.43), (4.104), it follows

$$G(t) = G_n(t) \qquad \forall\ t \in [m\varepsilon, (m+1)\varepsilon[,\ 0 \le m < n. \tag{4.105}$$

Thus, relying on Propositions 3-4, and on (4.105), we deduce the following Corollary 1. In the same setting of Proposition 3, and letting c2 , c4 , C, be the constants provided by Proposition 3 and Proposition 4, there exist some constants χ5 , c5 > 0 such 1 (R; R N ) with Tot.Var.{u} < that the following holds. Given an initial datum u ∈ Lloc ε ε χ5 , lim x→−∞ u(x) ∈ K , and letting u = u (t, x) be a Glimm approximate solution of (1.1), (1.6), there holds |G(t)| ≤ c4 · (V (t))2

∀ t ≥ 0, 1 · V (t) ≤ V (t) + C · (Q(t) + G(t)) ≤ c5 · V (t), c5 [V + C · (Q + G)] (mε) ⎡

(4.106) ∀ t ≥ 0, ⎤

⎢ ⎥ ⎢ ⎥ |sα sβ | + C(sα , sβ ) + ≤ −c2 · ⎢ e (sα , sβ )⎥ ⎣ ⎦ kα xβ

kα =kβ

(4.107)

∀ m ≥ 0,

kα =kβ sα ·sβ >0

(4.108) (C(sα , sβ ) denoting the cancellation defined in (3.17), and e (sα , sβ ) being the quantity defined in (3.21)).

Proof. As in the proof of Proposition 4, we may assume that the Glimm approximate solution $u^\varepsilon(t)$ is globally defined in time, with an a-priori bound on V(t), provided that V(0) is taken sufficiently small. Then, the uniform bound (4.106) is an immediate consequence of (4.89) and of the definition (4.103) of G(t). The estimate (4.108) follows from (4.45), relying on (4.105). Finally, observe that, because of (3.15), one has $Q(t) = O(1) \cdot (V(t))^2$, which, together with (4.106), yields (4.107).

By Corollary 1, it follows that the functional ϒ(t) defined by (1.24) assumes positive values and is non-increasing in time provided that the initial strength V(0) is sufficiently small. Moreover, for any given 0 ≤ m < n, the total amount of wave interaction and cancellation taking place in the time interval [mε, nε] is bounded by $O(1) \cdot |\Delta_{m,n}\Upsilon|$, where

$$\Delta_{m,n}\Upsilon \doteq \Upsilon(n\varepsilon+) - \Upsilon(m\varepsilon+) \tag{4.109}$$

denotes the variation of ϒ on [mε, nε].
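The last step in the proof of Corollary 1, from (4.106) and $Q = O(1)\cdot V^2$ to the two-sided bound (4.107), can be made explicit. This is a sketch under stated assumptions: $c_Q$ is a hypothetical name for a constant with $Q(t) \le c_Q (V(t))^2$, $\delta$ is a hypothetical name for the a-priori bound $V(t) \le \delta$, and $Q(t) \ge 0$ is assumed, as for a sum of nonnegative interaction terms. The constant $C$ is positive by Proposition 3.

```latex
% c_Q and \delta are hypothetical names: Q(t) <= c_Q V(t)^2 and V(t) <= \delta.
% Uses |G(t)| <= c_4 V(t)^2 from (4.106), Q(t) >= 0, and C > 0.
\begin{align*}
V + C\,(Q + G)
  &\le V + C\,(c_Q + c_4)\,V^{2}
   \le \bigl[\,1 + C\,(c_Q + c_4)\,\delta\,\bigr]\, V, \\[2pt]
V + C\,(Q + G)
  &\ge V - C\,c_4\,V^{2}
   \ge \bigl(1 - C\,c_4\,\delta\bigr)\, V .
\end{align*}
% If \delta is small enough that C c_4 \delta <= 1/2, both bounds in (4.107) hold with
%   c_5 = \max\{\, 2,\ 1 + C (c_Q + c_4)\,\delta \,\}.
```

The smallness of $\delta$ is what the condition on the initial total variation is meant to guarantee through the a-priori bound on $V(t)$.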


5. Wave Tracing for General Quasilinear Systems

We will show now how to implement a wave tracing algorithm for a general quasilinear system (1.1) so that the change in strength and the product of strength times the variation in speeds of the primary waves are bounded by the variation of the Glimm functional in (1.24). Here, differently from the wave-partition provided by Proposition 2, for every given 0 ≤ m < n, we shall partition the outgoing waves issuing from every mesh point (iε, ℓε), mε ≤ iε ≤ nε, into primary waves and secondary waves. The first class of waves consists of all subwaves that can be traced through the whole interval [mε, nε], while the second class of waves collects all the new subwaves that are generated at time t = iε and all the subwaves that are canceled before time t = nε. The total strength of secondary waves produced in the time interval ]mε, nε] is bounded by the total amount of interaction and cancellation occurring within ]mε, nε]. Namely, recalling Definition 2 of a wave partition, we have the following result analogous to [5, Prop. 2].

Proposition 5. Given a Glimm approximate solution and any fixed 0 ≤ m < n, there exists a partition of elementary k-wave sizes and speeds ykh (i, ), λkh (i, ) 0
! " ykh (i, ), λkh (i, )

h∈ H

,

h h λk (i, ) y k (i, ),

h∈ H

with the following properties: (a) h y k (i, ) = O(1) · m,n ϒ

,

∀ m ≤ i ≤ n;

(5.1)

,k,h

(b) for everyfixed m ≤ i ≤ n, k = 1, . .. , N , there is a one-to-one correspondence between ykh (m, j), λkh (m, j) and ykh (i, ), λkh (i, ) : ! " ykh (m, j), λkh (m, j)

←→

qh q yk (i, ( j,k,h,i) ), λk h (i, ( j,k,h,i) )

(5.2)

such that the sizes skh and the speeds λkh of the corresponding waves satisfy & % h q sk (m, j) − (5.3) max sk h (i, ( j,k,h,i) ) = O(1) · m,n ϒ , j,k,h

m≤i≤n

% skh (m, j) · max

j,k,h

m≤i≤n

& h qh λk (m, j) − λk (i, ( j,k,h,i) ) = O(1) · m,n ϒ , (5.4)

where m,n ϒ is the variation (4.109) of the functional ϒ.
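The left-hand sides of (5.3) and (5.4) are plain sums over the traced primary waves, and a small numerical sketch may help in reading them. The array layout and the sample numbers below are assumptions made only for illustration (they are not taken from the paper): row `r` holds the data at time step `m + r`, and column `w` follows one primary wave through the one-to-one correspondence (5.2).

```python
from typing import Sequence


def size_variation(sizes: Sequence[Sequence[float]]) -> float:
    """Left-hand side of (5.3): sum over waves of max_i |s(m) - s(i)|.

    sizes[r][w] is the size of traced wave w at time step m + r
    (hypothetical layout, chosen for this sketch).
    """
    base = sizes[0]
    return sum(
        max(abs(base[w] - row[w]) for row in sizes)
        for w in range(len(base))
    )


def weighted_speed_variation(
    sizes: Sequence[Sequence[float]],
    speeds: Sequence[Sequence[float]],
) -> float:
    """Left-hand side of (5.4): sum over waves of |s(m)| * max_i |lambda(m) - lambda(i)|."""
    base_s, base_l = sizes[0], speeds[0]
    return sum(
        abs(base_s[w]) * max(abs(base_l[w] - row[w]) for row in speeds)
        for w in range(len(base_s))
    )


# Made-up data: two traced waves followed over three time steps.
sizes = [[0.5, -0.2], [0.5, -0.25], [0.45, -0.2]]
speeds = [[1.0, 2.0], [1.1, 2.0], [0.9, 2.3]]

lhs_53 = size_variation(sizes)
lhs_54 = weighted_speed_variation(sizes, speeds)
```

Proposition 5 asserts that both quantities are controlled by $|\Delta_{m,n}\Upsilon|$ up to a constant; the sketch only evaluates the quantities being bounded and says nothing about the bound itself.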


Proof. In order to produce a partition for an approximate solution $u^\varepsilon$ that fulfills Properties 1-2, we shall proceed by induction on the time steps iε, m ≤ i ≤ n. Assuming that such a partition is given for all times mε ≤ t < iε, our goal is to show how to define a partition of the outgoing waves generated by the interactions that take place at t = iε, preserving Properties 1-2. It will be sufficient to focus our attention on interactions between waves of the same family, since for interactions between waves of different families the change in strength and the product of strength times the variation in speeds are controlled by the variation of a quadratic interaction potential as the first term in (1.23), and hence the definition of a partition verifying 1-2 for the outgoing waves generated by an interaction of this type is standard (cf. [27, Theorem 5.1]). Thus, consider an interaction between two k-waves, say $s'_k$, $s''_k$, issuing from two consecutive mesh points $((i-1)\varepsilon, (\ell-1)\varepsilon)$ and $((i-1)\varepsilon, \ell\varepsilon)$. We shall distinguish two cases.

1. $s'_k$ and $s''_k$ have the same sign. For the sake of simplicity, we assume that $s'_k, s''_k > 0$ and that the outgoing k-wave $s_k$ issuing from $(i\varepsilon, \ell\varepsilon)$ is a shock, the other cases being entirely similar. Let

$$ \big\{y_k^{\prime h},\ \lambda_k^{\prime h}\big\}_{0<h\le \ell'}, \qquad \big\{y_k^{\prime\prime h},\ \lambda_k^{\prime\prime h}\big\}_{0<h\le \ell''} \tag{5.5} $$

be the partitions of $s'_k$ and $s''_k$ enjoying Properties 1-2 (on the interval [mε, (i − 1)ε]), with sizes

$$ \big\{s_k^{\prime h}\big\}_{0<h\le \ell'}, \qquad \big\{s_k^{\prime\prime h}\big\}_{0<h\le \ell''}. \tag{5.6} $$

For every p-th wave $s_p$, p ≠ k, exiting from $(i\varepsilon, \ell\varepsilon)$, we may choose a (finite) partition {y hp }0 sk + sk . The subwaves $s_k^h$ in (5.7) inherit the same classification in primary and secondary waves of the corresponding subwaves $s_k^{\prime h}$ or $s_k^{\prime\prime\, h-\ell'}$, while all the possible subwaves of $s_k - (s'_k + s''_k)$ are labeled as secondary waves. Clearly, the bound (5.1) is again satisfied because of the interaction estimates (3.2), and thanks to Corollary 1, while the one-to-one correspondence at (5.2) and the bound (5.3) are verified by construction and by the inductive assumption. Hence, in order to conclude the proof, it remains to establish only the estimate (5.4) on the wave speeds. To this end, notice that the shock speed $\lambda_k$ of the outgoing k-wave $s_k$ coincides with the speeds $\lambda_k^h$ of all subwaves $s_k^h$ defined according to Definition 2, since for a shock wave the integrand function σ(·) in (2.18) results in a constant (cf. Remark 1). Moreover, by the choice of the speeds of a partition at (2.18), one has

$$ \lambda_k^{\prime h} = \frac{1}{s_k^{\prime h}} \int_{\tau_k^{\prime\, h-1}}^{\tau_k^{\prime h}} \sigma'(\xi)\, d\xi, \qquad \lambda_k^{\prime\prime h} = \frac{1}{s_k^{\prime\prime h}} \int_{\tau_k^{\prime\prime\, h-1}}^{\tau_k^{\prime\prime h}} \sigma''(\xi)\, d\xi, \tag{5.8} $$

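where $\tau_k^{\prime h} \doteq \sum_{p=1}^{h} s_k^{\prime p}$, $\tau_k^{\prime\prime h} \doteq \sum_{p=1}^{h} s_k^{\prime\prime p}$, and $\sigma'(\cdot) \doteq \sigma_k[w'_k](s'_k, \cdot)$, $\sigma''(\cdot) \doteq \sigma_k[w''_k](s''_k, \cdot)$ denote the map in (2.13) defining the speed of the rarefaction and shock components of $s'_k$ and $s''_k$, respectively ($w'_k$, $w''_k$ being the left states of $s'_k$, $s''_k$). Then, applying Lemma 1, one obtains the following estimate on the wave speeds (in the same spirit as the ones provided by [27, Theorem 3.1]):

$$ \begin{aligned} \lambda_k \cdot (s'_k + s''_k) &= \int_0^{s'_k + s''_k} \sigma(\xi)\, d\xi + O(1)\cdot J(s'_k, s''_k) \\ &= \int_0^{s'_k} \sigma'(\xi)\, d\xi + \int_0^{s''_k} \sigma''(\xi)\, d\xi + O(1)\cdot J(s'_k, s''_k), \end{aligned} \tag{5.9} $$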

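which, relying on (5.8), yields

$$ \lambda_k \cdot (s'_k + s''_k) = \sum_{h=1}^{\ell'} s_k^{\prime h} \lambda_k^{\prime h} + \sum_{h=1}^{\ell''} s_k^{\prime\prime h} \lambda_k^{\prime\prime h} + O(1)\cdot J(s'_k, s''_k). \tag{5.10} $$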

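Thus, since by the monotonicity property of σ′(·) and σ″(·) we have

$$ \lambda_k^{\prime\prime h} - O(1)\cdot J(s'_k, s''_k) \le \lambda_k \le \lambda_k^{\prime h} + O(1)\cdot J(s'_k, s''_k) \qquad \forall\ h, $$

using (5.10) we derive

$$ \begin{aligned} |\lambda_k^{\prime h} - \lambda_k| &= \lambda_k^{\prime h} - \lambda_k + O(1)\cdot J(s'_k, s''_k) \\ &= \frac{1}{s'_k + s''_k} \cdot \Big[ \sum_{p=1}^{\ell'} s_k^{\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime p}\big) + \sum_{p=1}^{\ell''} s_k^{\prime\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime\prime p}\big) \Big] + O(1)\cdot J(s'_k, s''_k), \end{aligned} $$

which, in turn, yields

$$ \sum_{h=1}^{\ell'} s_k^{\prime h}\, |\lambda_k^{\prime h} - \lambda_k| = \frac{1}{s'_k + s''_k} \cdot \Big[ \sum_{h=1}^{\ell'} \sum_{p=1}^{\ell'} s_k^{\prime h} s_k^{\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime p}\big) + \sum_{h=1}^{\ell'} \sum_{p=1}^{\ell''} s_k^{\prime h} s_k^{\prime\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime\prime p}\big) \Big] + O(1)\cdot J(s'_k, s''_k) \cdot s'_k. \tag{5.11} $$

Notice that the terms of the first double sum on the right hand side of (5.11) are antisymmetric in (h, p), and hence the first summand vanishes. Moreover, recalling (5.8), and observing that by the monotonicity property of the wave speed one has σ′(ξ′) ≥ σ″(ξ″) for all $\xi' \in [0, s'_k]$, $\xi'' \in [0, s''_k]$, we find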

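$$ \begin{aligned} \sum_{h=1}^{\ell'} \sum_{p=1}^{\ell''} s_k^{\prime h} s_k^{\prime\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime\prime p}\big) &= \sum_{p=1}^{\ell''} s_k^{\prime\prime p} \sum_{h=1}^{\ell'} \int_{\tau_k^{\prime\, h-1}}^{\tau_k^{\prime h}} \sigma'(\xi')\, d\xi' - \sum_{h=1}^{\ell'} s_k^{\prime h} \sum_{p=1}^{\ell''} \int_{\tau_k^{\prime\prime\, p-1}}^{\tau_k^{\prime\prime p}} \sigma''(\xi'')\, d\xi'' \\ &= s''_k \int_0^{s'_k} \sigma'(\xi')\, d\xi' - s'_k \int_0^{s''_k} \sigma''(\xi'')\, d\xi'' \\ &= \int_0^{s'_k} \int_0^{s''_k} \big[ \sigma'(s'_k - \xi') - \sigma''(\xi'') \big]\, d\xi''\, d\xi'. \end{aligned} \tag{5.12} $$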
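The two algebraic facts used around (5.11) and (5.12) can be checked numerically: the same-family double sum cancels by antisymmetry, and the mixed double sum telescopes to $s''_k \int_0^{s'_k} \sigma' - s'_k \int_0^{s''_k} \sigma''$ once the subwave speeds are the averages in (5.8). The sketch below is only an illustration; the linear speed profiles and the partition sizes are hypothetical choices, not data from the paper, and each profile is supplied through an antiderivative so that no quadrature error enters.

```python
from itertools import accumulate


def average_speeds(primitive, sizes):
    """Subwave speeds as in (5.8): mean of sigma over each partition interval.

    `primitive` is an antiderivative of sigma with primitive(0) = 0.
    """
    taus = [0.0, *accumulate(sizes)]
    return [(primitive(b) - primitive(a)) / (b - a) for a, b in zip(taus, taus[1:])]


# Hypothetical nonincreasing speed profiles (assumptions of this sketch):
#   sigma'(x)  = 2 - x     on [0, s'_k],   sigma''(x) = 1 - x/2  on [0, s''_k]
def prim1(x):
    return 2.0 * x - 0.5 * x * x


def prim2(x):
    return x - 0.25 * x * x


sizes1 = [0.1, 0.3, 0.2]  # subwave sizes of s'_k
sizes2 = [0.25, 0.15]     # subwave sizes of s''_k
lam1 = average_speeds(prim1, sizes1)
lam2 = average_speeds(prim2, sizes2)

# First double sum in (5.11): antisymmetric in (h, p), so it cancels.
same_family = sum(
    a * b * (la - lb)
    for a, la in zip(sizes1, lam1)
    for b, lb in zip(sizes1, lam1)
)

# Second double sum in (5.11), i.e. the left-hand side of (5.12).
mixed = sum(
    a * b * (la - lb)
    for a, la in zip(sizes1, lam1)
    for b, lb in zip(sizes2, lam2)
)

s1, s2 = sum(sizes1), sum(sizes2)
closed_form = s2 * prim1(s1) - s1 * prim2(s2)  # second line of (5.12)
```

Refining either partition changes `lam1` and `lam2` but leaves `mixed` unchanged, which is why the final bound does not depend on how finely the incoming waves were split.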

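On the other hand, letting w , (sk − ξ )s , w , ξ s , be the quantities associated to the waves $s'_k$, $s''_k$ by Definition 4, we have w (s − ξ ) − w (ξ ) + (s − ξ )s + ξ s = O(1) · |s | + |s | ∀ ξ , ξ . k k (5.13)

Thus, from (5.11)–(5.13), recalling (3.23), and applying (4.108), we obtain

$$ \begin{aligned} \sum_{h=1}^{\ell'} |s_k^{\prime h}|\, |\lambda_k^{\prime h} - \lambda_k| &= \frac{\displaystyle\int_0^{s'_k} \int_0^{s''_k} \big[ \sigma'(s'_k - \xi') - \sigma''(\xi'') \big]\, d\xi''\, d\xi'}{|s'_k| + |s''_k|} + O(1)\cdot J(s'_k, s''_k) \\ &= O(1)\cdot e(s'_k, s''_k) = O(1)\cdot |\Delta\Upsilon(i\varepsilon)|, \end{aligned} \tag{5.14} $$

where $\Delta\Upsilon(i\varepsilon) \doteq \Delta_{i-1,i}\Upsilon$ is the variation at time t = iε of the functional ϒ in (1.24). An entirely similar estimate can be obtained for the components of the partition of $s''_k$, so that there holds

$$ \sum_{h=1}^{\ell''} |s_k^{\prime\prime h}|\, |\lambda_k^{\prime\prime h} - \lambda_k| = O(1)\cdot |\Delta\Upsilon(i\varepsilon)|. \tag{5.15} $$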

Therefore, relying on the inductive assumption, from (5.14)–(5.15) we recover the desired estimate (5.4) on the time interval [mε, iε].

2. $s'_k$ and $s''_k$ have opposite sign. To fix the ideas, assume that $s'_k > -s''_k > 0$ and that the outgoing k-wave $s_k$ issuing from $(i\varepsilon, \ell\varepsilon)$ is a shock, the other cases being entirely similar. Adopting the above notations, we may define a partition {y hp }0 if h = 1, . . . , , (5.16)

where $\bar\ell \doteq \max\{h \le \ell' : \sum_{q=1}^{h} s_k^{\prime q} = s_k\}$. Such partitions continue to satisfy the bounds (5.1), (5.3) and the one-to-one correspondence at (5.2), thanks to the estimate (3.2) and to Corollary 1, and because of the inductive assumption. Therefore, even in this case the proof will be completed once we establish the estimate (5.4) on the wave speeds. Towards this goal, observe as above that the shock speed $\lambda_k$ of the outgoing shock $s_k$ coincides with the speeds $\lambda_k^h$ of all subwaves $s_k^h$ defined according to Definition 2. On the other hand, applying Lemma 1, relying on (2.23), and thanks to the Lipschitz continuity of $(u, s) \mapsto \sigma_k[u](s, \cdot)$, we deduce that

$$ \begin{aligned} \lambda_k \cdot (s'_k + s''_k) &= \int_0^{s'_k + s''_k} \sigma(\xi)\, d\xi + O(1)\cdot J(s'_k, s''_k) \\ &= \int_0^{s'_k + s''_k} \sigma'(\xi)\, d\xi + \int_{s'_k + s''_k}^{s'_k} \sigma'(\xi)\, d\xi + \int_0^{s''_k} \sigma''(\xi)\, d\xi + O(1)\cdot J(s'_k, s''_k) \\ &= \int_0^{s'_k + s''_k} \sigma'(\xi)\, d\xi + \int_0^{s''_k} \big[ \sigma''(\xi) - \sigma'(\xi + s'_k) \big]\, d\xi + O(1)\cdot |s'_k s''_k| \\ &= \int_0^{s'_k + s''_k} \sigma'(\xi)\, d\xi + O(1)\cdot |s'_k s''_k|, \end{aligned} $$

which, because of (5.8), yields

$$ \lambda_k \cdot (s'_k + s''_k) = \sum_{h=1}^{\bar\ell} s_k^{\prime h} \lambda_k^{\prime h} + O(1)\cdot |s'_k s''_k|. \tag{5.17} $$

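Hence, observing as in Case 1 that by the monotonicity property of σ′(·) we have

$$ \lambda_k \le \lambda_k^{\prime h} + O(1)\cdot J(s'_k, s''_k) \qquad \forall\ h, $$

using (2.23), (5.17) we derive

$$ \begin{aligned} |\lambda_k^{\prime h} - \lambda_k| &= \lambda_k^{\prime h} - \lambda_k + O(1)\cdot J(s'_k, s''_k) \\ &= \frac{1}{s'_k + s''_k} \cdot \sum_{p=1,\dots,\bar\ell} s_k^{\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime p}\big) + O(1)\cdot \frac{|s'_k s''_k|}{s'_k + s''_k}, \end{aligned} $$

which, in turn, yields

$$ \sum_{h=1}^{\bar\ell} s_k^{\prime h}\, |\lambda_k^{\prime h} - \lambda_k| = \frac{1}{s'_k + s''_k} \cdot \sum_{h=1}^{\bar\ell} \sum_{p=1}^{\bar\ell} s_k^{\prime h} s_k^{\prime p} \big(\lambda_k^{\prime h} - \lambda_k^{\prime p}\big) + O(1)\, |s'_k s''_k|. \tag{5.18} $$

Since the terms in the sum on the right-hand side of (5.18) are antisymmetric in (h, p), the whole sum vanishes. Hence, applying (4.108), from (5.18) we recover

$$ \sum_{h=1}^{\bar\ell} |s_k^{\prime h}|\, |\lambda_k^{\prime h} - \lambda_k| = O(1)\, |s'_k s''_k| = O(1)\cdot C(s'_k, s''_k) = O(1)\cdot |\Delta\Upsilon(i\varepsilon)|, \tag{5.19} $$

which proves (5.4) relying on the inductive assumption. This concludes the proof of the proposition.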


6. Conclusion

Relying on the results established in the previous section, one can now conclude the proof of Theorem 1 following the same strategy adopted in [5,12]. We briefly recall it for completeness.

Step 1. We use the partition of waves of an approximate solution $u^\varepsilon$ into primary waves $\{\tilde y_k^h, \tilde\lambda_k^h\}$ and secondary waves $\{\tilde{\tilde y}_k^h, \tilde{\tilde\lambda}_k^h\}$ provided by Proposition 5 to construct a piecewise constant approximation ψ = ψ(t, x) of $u^\varepsilon(t, x)$ in a time interval [mε, nε] that enjoys the following properties (cf. [12, Sect. 4]):

1. The wave fronts in ψ are of two kinds, primary and secondary.
2. There is a one-to-one correspondence between primary fronts and primary waves $\{\tilde y_k^h\}$, and the primary front corresponding to $\tilde y_k^h(m, j)$ has constant size $\tilde s_k^h(m, j)$.
3. Each primary front originates at t = mε and ends at t = nε. In particular, the primary front corresponding to $\tilde y_k^h(m, j)$ joins the points $(m\varepsilon, j\varepsilon)$ and $(n\varepsilon, \ell_{(n,j,k,h)}\varepsilon)$ of the (t, x) plane.
4. The left and right states of the primary front corresponding to $\tilde y_k^h(m, j)$, say $u_k^{h,L}(m, j)$, $u_k^{h,R}(m, j)$, are always related by $u_k^{h,R}(m, j) = T_k\big[u_k^{h,L}(m, j)\big]\big(\tilde s_k^h(m, j)\big)$. Moreover, there holds ψ(mε) = $u^\varepsilon(m\varepsilon)$.
5. Let $u_\beta^L(t)$ and $u_\beta^R(t)$ be the left and right states of a secondary front $x_\beta(t)$ of ψ at time t ∈ [mε, nε]. Then, letting CW denote the set of all pairs of crossing primary waves in $u^\varepsilon$ (i.e. all pairs of waves $\tilde y_k^h(m, j)$, $\tilde y_{k'}^{h'}(m, j')$ for which $j < j'$, $k > k'$ and $\ell_{(n,j,k,h)} \ge \ell_{(n,j',k',h')}$), there holds

$$ \sum_{\beta} \big| u_\beta^R(t) - u_\beta^L(t) \big| = O(1)\cdot \Big[ \sum_{j,k,h} \big|\tilde{\tilde s}_k^h(m, j)\big| + \sum_{CW} \big|\tilde s_k^h(m, j)\, \tilde s_{k'}^{h'}(m, j')\big| \Big] = O(1)\cdot |\Delta_{m,n}\Upsilon|, $$

where the summand on the left-hand side runs over all secondary fronts in ψ(t), while the second summand on the right-hand side runs over all pairs of crossing primary waves in $u^\varepsilon$.
6. All secondary fronts travel with speed 2, strictly larger than all characteristic speeds.


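Step 2. Using the same arguments of [12, Sect. 5], relying on (1.12), (1.16), (5.3), (5.4), one can prove that

$$ \begin{aligned} \big\| S_{(n-m)\varepsilon}\psi(m\varepsilon) - \psi(n\varepsilon) \big\|_{L^1} &= O(1)\cdot \Big( |\Delta_{m,n}\Upsilon| + \frac{1 + \log(n - m)}{n - m} + \varepsilon \Big) (n - m)\varepsilon, \\ \big\| u^\varepsilon(n\varepsilon) - \psi(n\varepsilon) \big\|_{L^1} &= O(1)\cdot |\Delta_{m,n}\Upsilon| \cdot (n - m)\varepsilon, \end{aligned} \tag{6.1} $$

where $S_{(n-m)\varepsilon}\psi(m\varepsilon)$ is the semigroup trajectory of (1.4), with initial datum ψ(mε) = $u^\varepsilon(m\varepsilon)$, evaluated at time t = (n − m)ε.

Step 3. As in [12, Sect. 6], let T = mε + ε′, for some m ∈ N, 0 ≤ ε′ < ε, and fix a positive constant ρ > 2ε. Then, we inductively define integers $0 = m_0 < m_1 < \cdots < m_\kappa = m$ with the following procedure. Assuming $m_i$ given:

1. if $\Upsilon(m_i\varepsilon) - \Upsilon((m_i + 1)\varepsilon) \le \rho$, let $m_{i+1}$ be the largest integer less than or equal to m such that $(m_{i+1} - m_i)\varepsilon \le \rho$ and $\Upsilon(m_i\varepsilon) - \Upsilon(m_{i+1}\varepsilon) \le \rho$;
2. if $\Upsilon(m_i\varepsilon) - \Upsilon((m_i + 1)\varepsilon) > \rho$, set $m_{i+1} \doteq m_i + 1$.

On every interval $[m_i\varepsilon, m_{i+1}\varepsilon]$ where Case 1 holds, we construct a piecewise constant approximation of $u^\varepsilon$ according to Step 1. Then, using (6.1) we derive

$$ \big\| u^\varepsilon(m_{i+1}\varepsilon) - S_{(m_{i+1}-m_i)\varepsilon} u^\varepsilon(m_i\varepsilon) \big\|_{L^1} = O(1)\cdot \Big( |\Delta_{m_i,m_{i+1}}\Upsilon| + \frac{1 + \log(m_{i+1} - m_i)}{m_{i+1} - m_i} + \varepsilon \Big) (m_{i+1} - m_i)\, \varepsilon. \tag{6.2} $$

On the other hand, on each interval $[m_i\varepsilon, m_{i+1}\varepsilon]$ where Case 2 is verified, by the Lipschitz continuity of $u^\varepsilon$ and applying (1.16) we find

$$ \big\| u^\varepsilon(m_{i+1}\varepsilon) - S_{(m_{i+1}-m_i)\varepsilon} u^\varepsilon(m_i\varepsilon) \big\|_{L^1} = O(1)\cdot \varepsilon. \tag{6.3} $$

Hence, observing that the cardinality of both classes of intervals (Cases 1-2) is bounded by $O(1)\cdot \rho^{-1}$, from (6.2)–(6.3) we finally deduce

$$ \big\| u^\varepsilon(T) - S_T \bar u \big\|_{L^1} = O(1)\cdot \Big( \rho + \frac{\varepsilon}{\rho} \log\frac{\rho}{\varepsilon} + \varepsilon \Big( 1 + \frac{1}{\rho} \Big) \Big), $$

which yields (1.15) choosing $\rho \doteq \sqrt{\varepsilon}\cdot \log|\log\varepsilon|$.

Acknowledgements. The authors wish to thank Tong Yang and an anonymous referee for having pointed out two inconsistencies in the previous version of the interaction potential functional presented in this paper.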
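The inductive choice of the integers $m_i$ in Step 3 of the proof is a short greedy procedure, and it can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_breakpoints`, the list `upsilon` (standing for the sampled values ϒ(jε), j = 0, ..., m, assumed nonincreasing) and the sample numbers are hypothetical and do not come from the paper.

```python
def select_breakpoints(upsilon, eps, rho):
    """Return [m_0, m_1, ..., m_kappa] following the two rules of Step 3.

    upsilon[j] stands for the value of the functional at time j*eps and is
    assumed nonincreasing in j; rho > 2*eps as required in the text.
    """
    if rho <= 2 * eps:
        raise ValueError("Step 3 requires rho > 2*eps")
    m = len(upsilon) - 1
    points = [0]
    while points[-1] < m:
        cur = points[-1]
        nxt = cur + 1
        if upsilon[cur] - upsilon[cur + 1] <= rho:
            # Rule 1: extend to the largest admissible index. Because upsilon
            # is nonincreasing, both constraints can only fail once and stay
            # failed, so a forward scan finds the maximum.
            while (
                nxt + 1 <= m
                and (nxt + 1 - cur) * eps <= rho
                and upsilon[cur] - upsilon[nxt + 1] <= rho
            ):
                nxt += 1
        # Rule 2 (large one-step drop): keep nxt = cur + 1.
        points.append(nxt)
    return points


# Made-up samples with one large drop between steps 2 and 3.
upsilon = [1.0, 0.95, 0.9, 0.5, 0.5, 0.45, 0.45]
breakpoints = select_breakpoints(upsilon, eps=0.1, rho=0.25)
```

Intervals produced by Rule 1 are those on which (6.2) is applied, while the single-step intervals produced by Rule 2 are handled by the cruder bound (6.3).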

References

1. Ancona, F., Marson, A.: A note on the Riemann Problem for general n × n conservation laws. J. Math. Anal. Appl. 260, 279–293 (2001)
2. Ancona, F., Marson, A.: Well-posedness for general 2 × 2 systems of conservation laws. Mems. Amer. Math. Soc. 169(801) (2004)
3. Ancona, F., Marson, A.: A wave front tracking algorithm for N × N non genuinely nonlinear conservation laws. J. Diff. Eqs. 177, 454–493 (2001)
4. Ancona, F., Marson, A.: Existence theory by front tracking for general nonlinear hyperbolic systems. Arch. Rat. Mech. Anal. 185(2), 287–340 (2007)
5. Ancona, F., Marson, A.: A locally quadratic Glimm functional and sharp convergence rate of the Glimm scheme for nonlinear hyperbolic systems. Arch. Rat. Mech. Anal. 196(2), 455–487 (2010)
6. Bianchini, S.: On the Riemann problem for non-conservative hyperbolic systems. Arch. Rat. Mech. Anal. 166, 1–26 (2003)
7. Bianchini, S.: Interaction estimates and Glimm functional for general hyperbolic systems. Dis. Cont. Dyn. Syst. 9, 133–166 (2003)
8. Bianchini, S., Bressan, A.: On a Lyapunov functional relating shortening curves and viscous conservation laws. Nonlin. Anal. TMA 51(4), 649–662 (2002)
9. Bianchini, S., Bressan, A.: Vanishing viscosity solutions to nonlinear hyperbolic systems. Ann. Math. 161, 223–342 (2005)
10. Bressan, A.: The unique limit of the Glimm scheme. Arch. Rat. Mech. Anal. 130, 205–230 (1995)
11. Bressan, A.: Hyperbolic Systems of Conservation Laws - The one-dimensional Cauchy problem. Oxford: Oxford Univ. Press, 2000
12. Bressan, A., Marson, A.: Error bounds for a deterministic version of the Glimm scheme. Arch. Rat. Mech. Anal. 142, 155–176 (1998)
13. Bressan, A., Yang, T.: On the convergence rate of vanishing viscosity approximations. Comm. Pure Appl. Math. 57, 1075–1109 (2004)
14. Colombo, R.M.: On a 2 × 2 hyperbolic traffic flow model. Math. Comput. Modelling 35, 683–688 (2002)
15. Dafermos, C.M.: Hyperbolic conservation laws in continuum physics. Berlin: Springer-Verlag (2000)
16. DiPerna, R.: Uniqueness of solutions to hyperbolic conservation laws. Indiana Univ. Math. J. 28, 137–188 (1979)
17. Glass, O., LeFloch, P.G.: Nonlinear hyperbolic systems: nondegenerate flux, inner speed variation, and graph solutions. Arch. Rat. Mech. Anal. 185(3), 409–480 (2007)
18. Glimm, J.: Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math. 18, 697–715 (1965)
19. Hua, J., Jiang, Z., Yang, T.: A new Glimm functional and convergence rate of Glimm scheme for general systems of hyperbolic conservation laws. Arch. Rat. Mech. Anal. 196(2), 433–454 (2010)
20. Hua, J., Yang, T.: An improved convergence rate of Glimm scheme for general systems of hyperbolic conservation laws. J. Diff. Eqs. 231, 92–107 (2006)
21. Iguchi, T., LeFloch, P.G.: Existence theory for hyperbolic systems of conservation laws with general flux-functions. Arch. Rat. Mech. Anal. 168, 165–244 (2003)
22. Lax, P.D.: Hyperbolic systems of conservation laws II. Comm. Pure Appl. Math. 10, 537–566 (1957)
23. Liu, T.P.: The deterministic version of the Glimm scheme. Commun. Math. Phys. 57, 135–148 (1975)
24. Liu, T.P.: The Riemann problem for general 2 × 2 conservation laws. Trans. Amer. Math. Soc. 199, 89–112 (1974)
25. Liu, T.P.: The Riemann problem for general systems of conservation laws. J. Diff. Eqs. 18, 218–234 (1975)
26. Liu, T.P.: Admissible solutions of hyperbolic conservation laws. Mems. Amer. Math. Soc. 30(240) (1981)
27. Liu, T.P., Yang, T.: Weak solutions of general systems of hyperbolic conservation laws. Commun. Math. Phys. 230, 289–327 (2002)
28. Muracchini, A., Ruggeri, T., Seccia, L.: Mixture of Euler's fluids and second sound propagation in superfluid helium. Z. Angew. Math. Phys. 57, 567–585 (2006)
29. Ruggeri, T., Muracchini, A., Seccia, L.: Continuum approach to phonon gas and shape changes of second sound via shock wave theory. Nuovo Cimento D. 16, 15–44 (1996)
30. Ruggeri, T., Muracchini, A., Seccia, L.: Second sound and characteristic temperature in solids. Phys. Rev. B 54, 332–339 (1996)
31. Yang, T.: Convergence rate of Glimm scheme for general systems of hyperbolic conservation laws. Taiwanese J. Math. 7, 195–205 (2003)

Communicated by P. Constantin

Commun. Math. Phys. 302, 631–674 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1176-7


A Uniqueness Theorem for Stationary Kaluza-Klein Black Holes

Stefan Hollands1, Stoytcho Yazadjiev2

1 School of Mathematics, Cardiff University, Cardiff CF24 4AG, UK. E-mail: [email protected]
2 Department of Theoretical Physics, Faculty of Physics, Sofia University, 5 J. Bourchier Blvd., Sofia 1164, Bulgaria. E-mail: [email protected]

Received: 5 May 2009 / Accepted: 9 June 2010
Published online: 28 January 2011 – © Springer-Verlag 2011

Abstract: We prove a uniqueness theorem for stationary D-dimensional Kaluza-Klein black holes with D − 2 Killing fields, generating the symmetry group $\mathbb{R} \times U(1)^{D-3}$. It is shown that the topology and metric of such black holes are uniquely determined by the angular momenta and certain other invariants consisting of a number of real moduli, as well as integer vectors subject to certain constraints.

1. Introduction

The classic black hole uniqueness theorems state that four-dimensional, stationary, asymptotically flat black hole spacetimes are uniquely determined by their mass and angular momentum in the vacuum case, and by their mass, angular momentum, and charge in the Einstein-Maxwell case. The solutions are in fact given by the Kerr metrics in the first case and the Kerr-Newman metrics in the second. This was proven in a series of papers [1,2,23,24,30,38,48]; for a coherent exposition clarifying many important details and providing a set of consistent technical assumptions see [7].

The black hole uniqueness theorem is not true as stated in general spacetime dimensions D ≥ 5. For example, in D = 5 dimensions, there exist asymptotically flat, stationary vacuum black holes with the same mass and angular momenta, but with nonisometric spacetime metrics, and in fact even different topology [4,11–13,15,41,45]. One would nevertheless hope that a similar uniqueness theorem still applies if additional invariants (“parameters”) are specified beyond the mass and angular momenta. Unfortunately, except in the static case [19,49,50], such a classification result is not known, nor is it known what could be the nature of the additional invariants.

In this paper, we study the special case of stationary vacuum black hole spacetimes in dimension D ≥ 4 with a compact, non-degenerate, connected horizon, admitting D − 3 commuting additional Killing fields with closed orbits.
The spacetimes that we consider asymptote to a flat Kaluza-Klein space with 1, 2, 3 or 4 large spatial dimensions and a corresponding number of toroidal extra dimensions. For simplicity, we do not consider spacetimes with net “monopole charge”, see also footnote 3. Examples of such metrics have been given by [34,47].

We will first show how to associate certain invariants to such a spacetime consisting of a collection of “moduli” $\{l_i \in \mathbb{R}_{>0}\}$ and certain generalized “winding numbers” $\{a_i \in \mathbb{Z}^{D-3}\}$. The moduli may be thought of as the lengths of various rotation “axes” within the spacetime, whereas the winding numbers characterize the nature of the action of the D − 3 rotational symmetries near a given axis. The collection of these winding numbers uniquely characterizes the topology and symmetry structure of the exterior of the black hole, and we refer to it as the “interval structure” of the manifold. This analysis also implies that the horizon must be topologically the cartesian product of a torus of the appropriate dimension and either a 3-sphere, ring ($S^2 \times S^1$), or Lens-space L(p, q). Our notion of interval structure may be viewed as a generalization of what has been called “weighted orbit space” in the mathematics literature on 4-manifolds with torus action [43,44], but the latter notion does not involve the moduli $\{l_i\}$. Also, in the context of stationary black holes, a similar notion called “rod structure” was first considered by [20,21]; see [14] for the static case. The main difference between this and our notion is that our winding numbers are found to obey an integrality condition as well as certain other constraints, whereas there were no such constraints in [20,21]. The latter are a necessary and sufficient condition for the spacetime to have the structure of a smooth manifold with torus action. These topological considerations are described in detail in Sect. 3.
We will then prove a uniqueness theorem which states that there can be at most one black hole spacetime with the same angular momenta and interval structure.1 Our uniqueness theorem generalizes a theorem proved in a previous paper [28] on asymptotically flat vacuum black holes in D = 5 dimensions; see also [29] for the Einstein-Maxwell case. The proof of the theorem proceeds in two steps: First, one reduces the full Einstein equations onto the space of symmetry orbits. Because the spacetime is assumed to have a total number of D − 2 independent commuting Killing fields, the space of symmetry orbits is two-dimensional—in fact it is shown to be a manifold with boundaries and corners homeomorphic to a half-plane. The parameters {li } are essentially the lengths of the various boundary segments. The arguments in the first step are topological in nature, and the only role of Einstein’s equations is to provide additional information about the fundamental group of the manifold via the topological censorship theorem [9]. That information is needed to rule out the presence of conical singularities in the orbit space.2 Our results in this part may be thought of as a generalization of [43,44] to a higher dimensional situation. The second step is to cast the reduced Einstein equations on the orbit space into a suitable form. Here, we make use of a formulation due to [35] involving certain potentials. The form of the equations leads to a partial differential equation for a quantity representing the “difference” between any two black hole metrics of the type considered which has been called “Mazur identity” [38]. Using this identity, one can prove the uniqueness theorem. The vectors {a i } and parameters {li } are important to treat the boundary conditions of this differential equation. The arguments in the second step are geometrical/analytical, and involve the use of Einstein’s equations in an essential way. 
1 It has been brought to our attention that a conjecture in this direction had been made at the talk [22], see also [21]. 2 Here our analysis also fills a gap in our previous paper [28], where the absence of such conical singularities had to be assumed by hand.


The simpler case of a 5-dimensional spherical black hole with trivial interval structure was previously treated by a similar method in [37]. While our uniqueness theorem in higher dimensions is in some ways similar to the corresponding theorem in four dimensions, there are some notable differences. The first, more minor, difference is that higher dimensional black holes are not only classified by the mass and angular momenta, but in addition depend on the interval structure. In D = 4 the interval structure of a single black hole spacetime is trivial. A more substantial difference is that in D = 4 dimensions, the additional axial Killing field is in fact guaranteed by the rigidity theorem [6,16,24,39,46]. While a generalized rigidity theorem can be established in D dimensions [26,27,40], this theorem now only guarantees at least one additional axial Killing field. For the arguments of the present paper to work, we need however D − 3 commuting axial Killing fields. It does not seem likely that our theorem covers all asymptotically Kaluza-Klein, stationary black hole spacetimes in D dimensions. A third difference is that we have not been able so far to establish for which given set of angular momenta and interval structure there actually exists a regular black hole solution. The situation in this regard is in fact unclear even in five asymptotically large dimensions with no small extra dimensions. Here, solutions corresponding to various simple interval structures have been constructed. These include solutions with horizon topology S3 , S2 × S1 , L( p, q), which are the possible topologies allowed by our uniqueness theorem. However, by contrast with the cases S3 , S2 × S1 [11,13,41,45], the black holes with lens space horizon topology found so far [4,15] are not regular, and are thus actually not covered by our theorem. The situation is very different in four dimensions. 
Here the interval structure for single black hole spacetimes only involves the specification of a single parameter (related to the area of the horizon), and a regular black hole solution is known to exist for any choice of this parameter and the angular momentum—the corresponding Kerr solution. The mass, surface gravity, angular velocity of the horizon etc. of the solution can all be expressed in terms of these parameters.

2. Description of the Problem, Assumptions, Notations

Let (M, g) be a D-dimensional, stationary black hole spacetime satisfying the vacuum Einstein equations, where D ≥ 4. The asymptotically timelike Killing field is called t, so $\pounds_t g = 0$. We assume that M has s + 1 asymptotically flat large spacetime dimensions and D − s − 1 asymptotically small extra dimensions, where s > 0. More precisely, we assume that a subset of M is diffeomorphic to the cartesian product of $\mathbb{R}^s$ with a ball removed (corresponding to the asymptotic region of the large spatial dimensions) and $\mathbb{R} \times T^{D-s-1}$ (corresponding to the time-direction and small dimensions).3 We will refer to this region as the asymptotic region and call it $M_\infty$.

3 In particular, we thereby exclude situations such as D = 5, s = 3, where the extra dimension $T^1$ is fibered non-trivially over the sphere $S^2$ at infinity. Solutions of this kind have been given in [47]. The Euler class of the fibration corresponds to a net “monopole charge”. It would be interesting to generalize our analysis to include monopole charge.

The metric is required to behave in this region like

$$ g = -d\tau^2 + \sum_{i=1}^{s} dx_i^2 + \sum_{i=1}^{D-s-1} d\varphi_i^2 + O(R^{-s+2}), \tag{1} $$

where $O(R^{-\alpha})$ stands for metric components that drop off faster than $R^{-\alpha}$ in the radial coordinate $R = \sqrt{x_1^2 + \cdots + x_s^2}$, with k-th derivatives in the coordinates $x_1, \ldots, x_s$ dropping off at least as fast as $R^{-\alpha-k}$. These terms are also required to be independent of the coordinate τ, which together with $x_i$ forms the standard cartesian coordinates on $\mathbb{R}^{s,1}$. The remaining coordinates $\varphi_i$ are 2π-periodic and parametrize the torus $T^{D-s-1}$. The timelike Killing field is assumed to be equal to ∂/∂τ in $M_\infty$. We call spacetimes satisfying these properties “asymptotically Kaluza-Klein” spacetimes.4 The domain of outer communication is defined by

$$ \langle\langle M \rangle\rangle = I^+(M_\infty) \cap I^-(M_\infty), \tag{2} $$

where $I^\pm$ denote the chronological future/past of a set. The black hole region B is defined as the complement in M of the causal past of the asymptotic region, and its boundary ∂B = H is called the (future) event horizon.

In this paper, we also assume the existence of D − 3 further linearly independent Killing fields, $\psi_1, \ldots, \psi_{D-3}$, so that the total number of Killing fields is equal to the number of spacetime dimensions minus two. These are required to mutually commute, to commute with t, and to have periodic orbits. The Killing fields $\psi_i$ are referred to as “axial” by analogy to the four-dimensional case, even though their zero-sets are generically higher dimensional surfaces rather than “axes” in D > 4. We also assume that, in the asymptotic region $M_\infty$, the action of the axial symmetries is given by the standard rotations in the cartesian product of flat Minkowski spacetime $\mathbb{R}^{s,1}$ times the standard flat torus $T^{D-s-1}$. In other words, $\psi_i = \partial/\partial\varphi_i$ for i > [s/2] and5 $\psi_j = x_{2j-1}\partial_{x_{2j}} - x_{2j}\partial_{x_{2j-1}}$ for j = 1, ..., [s/2] in $M_\infty$. The group of isometries is hence G = R × T, where R corresponds to the flow of τ, and where $T = T^{D-3}$ corresponds to the commuting flows of the axial Killing fields. Looking at the action of G on the asymptotic region, it is evident that an asymptotically Kaluza-Klein spacetime can have at most [s/2] + D − s − 1 commuting axial Killing fields. If this number is more than or equal to D − 3 as we are assuming, then s can be either 1, 2, 3 or 4.

A more general class of spacetimes admitting G as their isometry group would be ones that are asymptotically the direct product $\mathbb{R}^{s,1} \times Y$, where Y is a compact manifold of dimension D − s − 1. By the classification of manifolds with torus action given in [43,44] and Sect. 3 of this paper, one would have the following possibilities:

1. When s = 4, then Y is a (D − 5)-dimensional compact manifold admitting an action of $T^{D-5}$, hence $Y = T^{D-5}$.
2. When s = 3, then Y is a (D − 4)-dimensional compact manifold admitting an action of $T^{D-4}$, hence $Y = T^{D-4}$.
3. When s = 2, then Y is a (D − 3)-dimensional compact manifold admitting an action of $T^{D-4}$. The possibilities are summarized in Thm. 2, i.e. $Y \cong S^3 \times T^{D-6}$, $S^2 \times T^{D-5}$, $T^{D-3}$, or $Y \cong L(p, q) \times T^{D-6}$, where L(p, q) is a Lens space.
4. When s = 1, then Y is a (D − 2)-dimensional compact manifold admitting an action of $T^{D-3}$. The possibilities are again as summarized in Thm. 2, i.e. $Y \cong S^3 \times T^{D-5}$, $S^2 \times T^{D-4}$, $T^{D-2}$, or $Y \cong L(p, q) \times T^{D-5}$, where L(p, q) is a Lens space.

4 For the axisymmetric spacetimes considered in this paper, we will derive below a stronger asymptotic expansion, see Eq. (83).
5 The notation [x] means the largest integer n such that n ≤ x.

In this paper, we will treat explicitly only the first case, i.e. when the asymptotics of the spacetime is $\mathbb{R}^{4,1} \times T^{D-5}$ (and D ≥ 5), but we will occasionally comment on the other

cases. The second case is rather similar, and the statement and proof of our main result would apply with minor changes. The third and fourth cases are qualitatively somewhat different.

We are going to analyze the uniqueness properties of the asymptotically Kaluza-Klein spacetimes just described. Unfortunately, in order to make our arguments in a consistent way, we will have to make certain further technical assumptions about the global nature of (M, g) and the action of the symmetries. Our assumptions are in parallel to those made by Chruściel and Costa in their study [7] of 4-dimensional stationary black holes. The requirements are (a) that M contains an acausal, spacelike, connected hypersurface S asymptotic to the τ = 0 surface in the asymptotic region $M_\infty$, whose closure has as its boundary ∂S = H, a cross section of the horizon. We assume H to be compact and (for simplicity) to be connected. (b) We assume that the orbits of t are complete. (c) We assume that the horizon is non-degenerate. (d) We assume that M is globally hyperbolic. We will also assume (e) that the spacetime, the metric, and the group action are analytic, rather than only smooth. This will serve us to transfer information gathered about the metric in the domain of outer communication $\langle\langle M \rangle\rangle$ to all of M, and it is also convenient to exclude certain pathologies of the action of isometries. This condition could be relaxed to smoothness without major difficulties.

For the spacetimes described, one of the following two statements is true: (i) t is tangent to the null generators of H. In the asymptotically flat case, the spacetime must be static by the results of [51]. In the asymptotically Kaluza-Klein case, no such general result is known to our knowledge, but it is plausible that this statement might still hold true. (ii) t is not tangent to the null generators of H. In this case, the rigidity theorem [26,40] implies6 that there exists a linear combination

$$ K = t + \Omega_1 \psi_1 + \cdots + \Omega_{D-3}\, \psi_{D-3}, \qquad \Omega_i \in \mathbb{R}, \tag{3} $$

so that the Killing field K is tangent and normal to the null generators of the horizon H , and g(K , ψi ) = 0 on H .

(4) κ2

From K , one may define the surface gravity of the black hole by = −(1/2) lim H (∇a K b )∇ a K b , and it may be shown that κ is constant on H [52]. In the first case (i), one can prove that the spacetime is actually unique [30], and in fact isometric to the Schwarzschild spacetime when D = 4. For higher dimensions, the same has been proven in the asymptotically flat case by [19,49,50]. We also expect a statement of this type to be true in the asymptotically Kaluza-Klein case with the Schwarzschild spacetime replaced by an appropriate generalization, but this is presently still open. In this paper, we will be concerned exclusively with the second case (ii), and we will give a uniqueness theorem for such spacetimes. Of particular importance for us will be the orbit space Mˆ = M/G, so in the next section we will look in detail at this space. 3. Analysis of the Orbit Space 3.1. Manifolds with torus actions. To begin, we consider a somewhat simpler situation, namely an orientable, analytic, compact connected Riemannian manifold of 6 The rigidity theorem was proved in these references only in the asymptotically flat case. However, to prove the relation (3), only a local analysis of the geometry at the horizon is needed, and the asymptotic conditions do not play a role for this.


dimension s ≥ 3, with a smooth effective⁷ action of the N-dimensional torus T = T^N. Thus, we assume that Diff(Σ) contains a copy of T. Such actions have been analyzed and classified in the case s = 4 in a classic work by Orlik and Raymond [43,44], and, repeating many of their arguments, in [28]. Some of our arguments for general s are in parallel with this case, others are not. Essentially all of our arguments do not require the analyticity of the manifold or group action, and would also hold e.g. if the quantities were only of class C^1, or even just C^0. However, since our application will be to analytic spacetimes, we may as well assume this here. We may equip Σ with a Riemannian metric h, and by averaging h with the action of T if necessary, we may assume that T acts by isometries of h. Later, Σ will be a spatial slice of our physical spacetime (so that s = D − 1) and N will be taken to be D − 3, but for the moment this is not relevant.

It will be useful to view the N-torus as the quotient R^N/Λ_N, where Λ_N = (2πZ)^N is the standard 2π-periodic N-dimensional lattice. Elements k ∈ T will consequently be identified with equivalence classes of N-dimensional vectors, k = [τ_1, ..., τ_N] ∈ R^N/Λ_N. The standard basis of Λ_N will be denoted b_1, ..., b_N, i.e., b_i = (0, ..., 0, 2π, 0, ..., 0), where the non-zero entry is in the i-th position. Various facts about lattices that we will use in this section may be found in the classic monograph [3]. We denote the commuting Killing fields generating the action of T by ψ_i, i = 1, ..., N. The flows of these vector fields are denoted F_i^τ, and we assume that they are normalized so that the flows are periodic with period 2π, so F_i^{2π}(x) = x for any x ∈ Σ, and any i. The action of a group element k = [τ_1, ..., τ_N] on a point is abbreviated by

$$k \cdot x = F_1^{\tau_1} \circ \cdots \circ F_N^{\tau_N}(x). \tag{5}$$

We also abbreviate the action of k on a tensor field T = T^{a_1...a_q}{}_{b_1...b_r} on Σ by k · T = [F_1^{τ_1} ∘ ··· ∘ F_N^{τ_N}]^* T, where the * denotes the push-forward/pull-back of the tensor field. Because the Killing fields commute, we have in particular k · ψ_i = ψ_i for any k ∈ T. If ψ_1, ..., ψ_N are Killing fields as above, then so are ψ̂_1, ..., ψ̂_N, where

$$\hat\psi_i = \sum_{j=1}^{N} A_{ij}\,\psi_j, \qquad \pm A = \pm\begin{pmatrix} A_{11} & \cdots & A_{N1} \\ \vdots & & \vdots \\ A_{1N} & \cdots & A_{NN} \end{pmatrix} \in SL(N, \mathbb{Z}). \tag{6}$$

Another way of saying this is that we may conjugate the action of T = T^N by the inner automorphism⁸ α_A([τ]) = [τ A^T] of T, and the modified Killing fields ψ̂_i generate the conjugated action. The freedom of choosing different 2π-periodic Killing fields to generate the action of T = R^N/Λ_N is closely related to the possibility of choosing different bases in the lattice Λ_N, because any such change of basis is implemented by an integer matrix A with det A = ±1. As is standard, we define the orbit and the isotropy subgroup associated with a point by, respectively,

$$O_x = \{k \cdot x \mid k \in T\}, \qquad I_x = \{k \in T \mid k \cdot x = x\}. \tag{7}$$

⁷ This means that if k · x = x for all x ∈ Σ, then k is necessarily the identity. Given an action of the above type, one may always pass to an effective action by taking a quotient of T if necessary.

⁸ The automorphism property is α_A(kk′) = α_A(k)α_A(k′) for all k, k′ ∈ T.


I_x is a closed (hence compact) subgroup of T, and O_x is a smooth manifold that can be identified with T/I_x. Being compact and abelian, I_x must be isomorphic to T^n × ∏_j Z_{p_j^{α_j}}. A more precise description of the action of I_x in an open neighborhood of x will be given below. The set of all orbits Σ̂ = {O_x | x ∈ Σ} is called the factor space and is also written as Σ̂ = Σ/T. It is not a manifold for general group actions.

It will be useful to define the non-negative, symmetric N × N Gram matrix of the Killing fields, f i j = h(ψi , ψ j ).

(8)

It will also be convenient to distinguish points in Σ according to the dimension of their orbit. For this, we define

$$S_r = \{x \in \Sigma \mid \dim O_x = r\} = \{x \in \Sigma \mid \operatorname{rank}[f(x)] = r\} = \{x \in \Sigma \mid \dim I_x = n = N - r\}. \tag{9}$$

Evidently, n = N − r is also equal to the number of independent linear combinations of the Killing fields ψ_1, ..., ψ_N that vanish at points of S_r. Clearly, we have

$$\Sigma = \bigcup_{r=0}^{N} S_r. \tag{10}$$
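The stratification by orbit dimension can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard rotation action of T^2 on R^4 = C^2 with the flat metric (restricted, for instance, to the unit sphere S^3), and the helper names `killing_fields` and `orbit_dimension` are hypothetical. It evaluates the Gram matrix f_ij = h(ψ_i, ψ_j) at a point and reads off the orbit dimension r as its rank.

```python
import numpy as np

def killing_fields(x):
    """Rotation generators of an assumed T^2 action on R^4 = C^2, evaluated at x."""
    x1, x2, x3, x4 = x
    psi1 = np.array([-x2, x1, 0.0, 0.0])  # rotation in the (x1, x2) plane
    psi2 = np.array([0.0, 0.0, -x4, x3])  # rotation in the (x3, x4) plane
    return np.stack([psi1, psi2])

def orbit_dimension(x, tol=1e-12):
    """Rank of the Gram matrix f_ij = h(psi_i, psi_j) for the flat metric h."""
    psi = killing_fields(np.asarray(x, dtype=float))
    gram = psi @ psi.T
    return int(np.linalg.matrix_rank(gram, tol=tol))
```

At a generic point of S^3 both Killing fields are non-zero and independent, so the point lies in S_2; on the circle x3 = x4 = 0 the field ψ_2 vanishes and the point lies in S_1.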

Lemma 1. Let (Σ, h) be a Riemannian manifold of dimension s, with N mutually commuting Killing fields ψ_i, i = 1, ..., N. Let f_ij be the Gram matrix, and let x be a point such that rank[f(x)] = r. Then it follows that N − r ≤ [(s − r)/2].

Proof. Let V_x ⊂ T_xΣ be the span of the Killing fields ψ_i|_x, i = 1, ..., N at x, and let W_x be the orthogonal complement. The assumptions of the lemma mean that the dimension of V_x is r, and that there exist N − r linear combinations of ψ_i|_x, i = 1, ..., N that vanish. By forming suitable linear combinations of the Killing fields, we may hence assume that span{ψ_i|_x, i = 1, ..., r} = V_x, and that ψ_i|_x = 0, i = r + 1, ..., N. Let D be the derivative operator of h, and let t_i = Dψ_i|_x, where i = r + 1, ..., N. Then each t_i is a linear map t_i : T_xΣ → T_xΣ. The Killing equation implies that t_i is skew symmetric with respect to the bilinear form h : T_xΣ × T_xΣ → R, i.e. h(t_i X, Y) = −h(X, t_i Y). Evaluating the D-derivative of the commutator [ψ_i, ψ_j] = 0 at x for r < i, j ≤ N then implies that the corresponding commutator t_i t_j − t_j t_i = 0 vanishes, too. Evaluating the derivative of the commutator [ψ_i, ψ_j] = 0 at x for r < i ≤ N and 0 < j ≤ r then furthermore shows that t_i V_x = V_x, and consequently t_i W_x = W_x. Now let us choose an orthogonal basis {e_1, ..., e_{s−r}} of W_x, and use that to identify t_i, r < i ≤ N with a linear map R^{s−r} → R^{s−r}. These linear maps must hence be skew symmetric, i.e., commuting elements of the Lie-algebra o(s − r, R). They must also be linearly independent. Indeed, assume on the contrary that a non-trivial linear combination λ_1 t_{r+1} + ··· + λ_{N−r} t_N vanishes. Then both the Killing field s = λ_1 ψ_{r+1} + ··· + λ_{N−r} ψ_N, as well as its derivative Ds vanish at the point x. It is a well-known property of Killing fields (see e.g. [52]) that a Killing field vanishes identically on a connected Riemannian manifold if it vanishes at a point together with its derivative. Hence, the Killing fields ψ_i, r < i ≤ N would be linearly dependent, a contradiction. Thus, we conclude that the linear maps t_i, r < i ≤ N may be viewed as forming an (N − r)-dimensional abelian subalgebra of o(s − r, R). Any maximal abelian subalgebra of o(s − r, R) has dimension [(s − r)/2], so N − r ≤ [(s − r)/2].
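The inequality of Lemma 1 can be turned into a small enumeration of the orbit dimensions it permits. The sketch below is an illustration only; the function name `admissible_orbit_dimensions` is hypothetical, and [·] is read as the integer part (floor).

```python
def admissible_orbit_dimensions(s, N):
    """Orbit dimensions r with 0 <= r <= N and N - r <= floor((s - r) / 2)."""
    return [r for r in range(0, N + 1) if N - r <= (s - r) // 2]
```

For N = s − 2 the list always consists of r = s − 4, s − 3, s − 2 (as far as these are non-negative), for example `admissible_orbit_dimensions(7, 5)` gives `[3, 4, 5]`.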


In the situation considered later in this section, we have N = s − 2 Killing fields. The lemma then implies that the sets S_r are non-empty only for r = s − 2, s − 3, s − 4, so we have Σ = S_{s−2} ∪ S_{s−3} ∪ S_{s−4}. Our task will now be to construct, for each orbit O_x, an open neighborhood of it and a coordinate system in which we can explicitly understand the action of the group T. We will then be able to locally take the quotient of this neighborhood by T and thereby get a local description of the orbit space. By patching the local regions together, we will be able to characterize the manifold structure of the orbit space.

Let x be an arbitrary but fixed point in S_r. Then the dimension of O_x is r, and the dimension of the isotropy group I_x is n = N − r. As we have just seen, n may only take on the values 0, 1, ..., [(s − r)/2]. We first show that if x ∈ S_r, there exists a matrix ±A ∈ SL(N, Z) such that the vector fields ψ̂_i, 0 < i ≤ N defined as in Eq. (6) satisfy ψ̂_i|_x = 0, r < i ≤ N and such that ψ̂_i|_x, 0 < i ≤ r span the tangent space T_xO_x. We start our discussion with a general lemma.

Lemma 2. Let L ⊂ T = T^N be an n-dimensional closed subgroup. Then there are matrices of integers (A_ij)_{i,j=1}^N and (v_ij)_{i,j=1}^r, where r = N − n and det A = ±1, with the property that L = α_A(L_0 × L_1). Here

$$L_0 = \{0_r\} \times \mathbb{R}^{N-r}/\Lambda_{N-r}, \tag{11}$$

$$L_1 = (v^{-1}\Lambda_r)/\Lambda_r \times \{0_{N-r}\}, \tag{12}$$

where Λ_r has been identified with the lattice generated by b_1, ..., b_r with origin denoted 0_r, and where Λ_{N−r} has been identified with the lattice generated by b_{r+1}, ..., b_N, with origin denoted 0_{N−r}. We have also written v^{-1}Λ_r for the lattice of R^r generated by Σ_{j=1}^r (v^{-1})_{ij} b_j, where i = 1, ..., r. Hence L_0 is connected, L_1 is finite,

$$L_1 \cong \mathbb{Z}_{p_1^{\alpha_1}} \times \cdots \times \mathbb{Z}_{p_M^{\alpha_M}}, \qquad |L_1| = p_1^{\alpha_1} \cdots p_M^{\alpha_M} = |\det (v_{ij})_{i,j=1}^{r}|, \tag{13}$$

with p_j > 0 prime.

Proof. Let us first assume that L is also connected. Then L is a compact, abelian, connected Lie-group and so must be isomorphic to T^n. Let β : T^n → L be the isomorphism. We identify T = T^N with R^N/Λ_N, where Λ_N is the standard lattice. Similarly, we identify T^n with R^n/Λ_n, with Λ_n = span_Z(b_i)_{i=r+1}^N. Let a_i = β(b_i) ∈ Λ_N, where i = r + 1, ..., N. If λ_i ∈ R are such that

$$c = \lambda_1 a_{r+1} + \cdots + \lambda_n a_N = \beta(\lambda_1 b_{r+1} + \cdots + \lambda_n b_N) \in \Lambda_N, \tag{14}$$

then it follows that λ_i ∈ Z. We conclude from [3, Cor. 3, I.2.2] that there are vectors a_1, ..., a_r ∈ Λ_N such that a_1, ..., a_N form a basis of Λ_N. We now let A be the N × N matrix of integers such that b_i A^T = a_i for i = 1, ..., N. Then det A = ±1 because the matrix relates two bases of the lattice Λ_N. Since L_0 viewed as a subgroup of T^N is generated precisely by b_{r+1}, ..., b_N, this proves the lemma when L is connected.

In the general case, L is isomorphic to the cartesian product of a torus and cyclic groups of order given by a prime power, i.e. there is an isomorphism β : T^n × ∏_j Z_{p_j^{α_j}} → L. For j = 1, ..., M, let c_j be the image under β of the generator of the j-th cyclic finite group in the decomposition, projected onto the (real) span of a_1, ..., a_r. The vectors c_1, ..., c_M together with a_1, ..., a_r generate an r-dimensional lattice. Let γ_1, ..., γ_r be a basis of this lattice. It follows from [3, Thm. 1, I.2.2] that there are integers v_ij such that v_ii > 0, v_ii > v_ji for j > i, and

$$\begin{aligned} a_1 &= v_{11}\gamma_1 \\ a_2 &= v_{21}\gamma_1 + v_{22}\gamma_2 \\ &\;\;\vdots \\ a_r &= v_{r1}\gamma_1 + v_{r2}\gamma_2 + \cdots + v_{rr}\gamma_r. \end{aligned} \tag{15}$$

It is evident that L is given by the image under α_A of the cartesian product of the group given by the real multiples of a_{r+1}, ..., a_N mod Λ_N and the group of integer multiples of γ_1, ..., γ_r mod Λ_N. The first group is the image under α_A of L_0, while the second is the image of L_1. This proves that L = α_A(L_0 × L_1). From the system (15) one sees that the order of L_1 is given by

$$|L_1| = \prod_{i=1}^{r} v_{ii} = \det (v_{ij})_{i,j=1}^{r}.$$

On the other hand, α_A^{-1} ∘ β is an isomorphism between T^n × ∏_j Z_{p_j^{α_j}} and L_0 × L_1. The number of connected components of the first group is given by ∏_j p_j^{α_j} = |∏_j Z_{p_j^{α_j}}|, while it is given by |L_1| for the second. This finishes the proof of the lemma.
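The statement |L_1| = |det v| in Lemma 2 can be checked by brute force in a small case. The following is a minimal sketch for r = 2 under simplifying assumptions that are not in the paper: the factor 2π is dropped, so that Λ_2 = Z^2, the matrix v is assumed non-singular, and the helper name `quotient_order` is hypothetical. It counts the distinct classes of m_1 γ_1 + m_2 γ_2 modulo Z^2, where γ_i is the i-th row of v^{-1}.

```python
from fractions import Fraction
from itertools import product

def quotient_order(v):
    """Order of (v^{-1} Z^2) / Z^2 for a non-singular integer 2x2 matrix v."""
    (a, b), (c, d) = v
    det = a * d - b * c
    # Rows of the inverse matrix play the role of the generators gamma_1, gamma_2.
    inv = [[Fraction(d, det), Fraction(-b, det)],
           [Fraction(-c, det), Fraction(a, det)]]
    n = abs(det)  # n * gamma_i is an integer vector, so coefficients 0..n-1 suffice
    classes = set()
    for m1, m2 in product(range(n), repeat=2):
        x = (m1 * inv[0][0] + m2 * inv[1][0]) % 1
        y = (m1 * inv[0][1] + m2 * inv[1][1]) % 1
        classes.add((x, y))
    return len(classes)
```

For the lower-triangular example v = [[2, 0], [1, 3]] the count agrees with the product of the diagonal entries, as in the system (15).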

We apply this lemma to the isotropy group I_x ⊂ T, and we formulate the intermediate result as another lemma for future reference:

Lemma 3. Let x ∈ S_r. There are integer matrices (v_ij)_{i,j=1}^r and (A_ij)_{i,j=1}^N (depending on x) with det A = ±1 such that I_x = α_A(L_0 × L_1), with L_0 and L_1 the groups given above in Eqs. (11) and (12). Alternatively, we can say that I_x is generated by the elements

$$k(\tau_1, \dots, \tau_N) := \alpha_A\!\left[\frac{1}{2\pi}\left(\sum_{i=r+1}^{N} \tau_i b_i + \sum_{i,j=1}^{r} (v^{-1})_{ij}\, \tau_i b_j\right)\right], \tag{16}$$

where τ_i ∈ R for r + 1 ≤ i ≤ N, and where τ_i ∈ 2πZ for 1 ≤ i ≤ r.

If we define ψ̂_i = Σ_j A_ij ψ_j, then Lemma 3 implies that ψ̂_i|_x = 0 for i = r + 1, ..., N, and ψ̂_j|_x span T_xO_x for j = 1, ..., r. We now continue our analysis by inspecting the action of I_x on the tangent space T_xΣ. Let k ∈ I_x. Then, because k · x = x, this induces a linear map k : T_xΣ → T_xΣ satisfying h(k · X, k · Y) = h(X, Y) for all X, Y ∈ T_xΣ. In fact, because k · ψ_i = ψ_i for any of our Killing fields, it follows that k leaves each vector in the tangent space T_xO_x invariant. But then it also leaves the orthogonal complement W_x invariant. Let {e_1, ..., e_{s−r}} be an orthogonal basis of W_x. So for every k ∈ I_x, we get a representing orthogonal matrix (k_ij), 0 < i, j ≤ s − r acting on the orthogonal basis by k · e_i = k_ij e_j. Because Σ is assumed to be orientable, we have a distinguished non-vanishing rank s totally antisymmetric tensor field ε (determined up to sign by ε_{a_1...a_s} ε_{b_1...b_s} h^{a_1b_1} ··· h^{a_sb_s} = s!). This tensor is invariant under the isometries of Σ, so in particular k · ε = ε at the point x, for any k ∈ I_x. Because k · ψ_i = ψ_i for any of our Killing fields, this implies that the action of k on W_x preserves the orientation, so the matrix (k_ij) representing this action has determinant det(k_ij) = +1, and (k_ij) ∈ SO(s − r). In particular, (k_ij) must have an even number of −1 eigenvalues. The matrices (k_ij) commute for different choices of k ∈ I_x, and so we may put them simultaneously into Jordan normal form. By making a change of basis of the {e_1, ..., e_{s−r}} with an orthogonal element g ∈ O(s − r), we may achieve that

$$k \cdot (e_{2j-1} + i e_{2j}) = e^{i\theta_j}(e_{2j-1} + i e_{2j}), \qquad 0 < j \le [(s-r)/2] \tag{17}$$

if s − r is even, together with k · e_{s−r} = e_{s−r} when s − r is odd.⁹ The phases θ_j depend on k. For the elements of the isotropy group given by Lemma 3, we have in fact

$$k(0, \dots, 2\pi, \dots, 0) \cdot (e_{2j-1} + i e_{2j}) = \exp\!\left(2\pi i \sum_{m=1}^{r} (v^{-1})_{lm} w_{mj}\right)(e_{2j-1} + i e_{2j}), \qquad 0 < j \le [(s-r)/2] \tag{18}$$

if s − r is even, together with k(0, ..., 2π, ..., 0) · e_{s−r} = e_{s−r} when s − r is odd. Here, the 2π is in the l-th slot, with l ≤ r. The w_ij are integers, which follows from the fact that the group elements k(Σ_j v_ij b_j) are the identity, by Lemma 3. The above formula becomes somewhat more transparent if we note that the elements γ_i = Σ_{j=1}^r (v^{-1})_{ij} b_j defined for i = 1, ..., r generate a copy of the isotropy subgroup I_x ≅ (v^{-1}Λ_r)/Λ_r ≅ ∏_j Z_{p_j^{α_j}}, with γ_i identified with γ_i mod Λ_r, see Lemma 2. Thus, we may view the exponential expression in the above formula as a homomorphism

$$\vartheta_j : (v^{-1}\Lambda_r)/\Lambda_r \to S^1 = \{z \in \mathbb{C} \mid |z| = 1\}, \qquad \vartheta_j(\gamma_k) = \exp\!\left(2\pi i \sum_{m=1}^{r} (v^{-1})_{km} w_{mj}\right). \tag{19}$$

We also have

$$k(0, \dots, \tau_l, \dots, 0) \cdot (e_{2j-1} + i e_{2j}) = \exp(i \tau_l w_{lj})(e_{2j-1} + i e_{2j}), \qquad 0 < j \le [(s-r)/2] \tag{20}$$

together with k(0, ..., τ_l, ..., 0) · e_{s−r} = e_{s−r} when s − r is odd. Here, the τ_l is in the l-th slot, and r + 1 ≤ l ≤ N. The w_ij are again integers.

⁹ Here it has been used that (k_ij) has determinant +1. Otherwise (k_ij) could also act as a reflection on an odd number of basis vectors.

As yet, the basis {e_1, ..., e_{s−r}} has only been defined in W_x, but we now wish to define it for any W_y, with y ∈ O_x. Let

$$x(\tau_1, \dots, \tau_r) = k(\tau_1, \dots, \tau_r, 0, \dots, 0) \cdot x, \qquad 0 \le \tau_i < 2\pi, \tag{21}$$

where k(τ) is as in Lemma 3. Note that x(τ) is periodic in τ with period 2π in each component of τ, and that τ ∈ [0, 2π)^r → x(τ) ∈ O_x provide (periodic) coordinates in O_x. We define our basis elements in W_{x(τ)} by transporting {e_1, ..., e_{s−r}} to x(τ) with the group element in Eq. (21). We call this basis {e_1(τ), ..., e_{s−r}(τ)}. We note that this is still an orthonormal system, because it was obtained by an isometry between


W_x → W_{x(τ)}. Note that this basis is not periodic in τ, by Eq. (17). To obtain an orthonormal basis {ẽ_1(τ), ..., ẽ_{s−r}(τ)} that is periodic in τ, we set

$$\tilde e_{2j-1}(\tau) + i\, \tilde e_{2j}(\tau) = \exp\!\left(-i \sum_{m,l=1}^{r} \tau_l (v^{-1})_{lm} w_{mj}\right)\big(e_{2j-1}(\tau) + i e_{2j}(\tau)\big), \tag{22}$$

for 0 < j ≤ [(s − r)/2], together with ẽ_{s−r}(τ) = e_{s−r}(τ) when s − r is odd. In an open neighborhood of O_x, we now define coordinates as follows. First, on O_x, we use the coordinates (y_{s−r+1}, ..., y_s) ∈ [0, 2π)^r → x(y_{s−r+1}, ..., y_s). In a neighborhood of O_x we use

$$(y_1, \dots, y_s) \mapsto \operatorname{Exp}_{x(y_{s-r+1}, \dots, y_s)}\!\left(\sum_{j=1}^{s-r} y_j\, \tilde e_j(y_{s-r+1}, \dots, y_s)\right). \tag{23}$$

Here, "Exp" is the exponential map for our metric h, i.e., (y_1, ..., y_{s−r}) are Riemannian normal coordinates transverse to O_x. They cover an open neighborhood of O_x. From the construction of the coordinates, the action of the isometry group T in these coordinates is described by the following lemma:

Lemma 4. Let x ∈ S_r, let (v_ij) be the matrix and k(τ_1, ..., τ_N) ∈ I_x be as in Lemma 3. Then, in terms of the coordinates (23) covering a neighborhood of O_x, the action of T is given by

$$k(\sigma_1, \dots, \sigma_r, 0, \dots, 0) \cdot (y_1 + i y_2, \dots, y_{s-r-1} + i y_{s-r}, y_{s-r+1}, \dots, y_s) = \left(\Big(\exp\Big[i \sum_{l,m=1}^{r} \sigma_l (v^{-1})_{lm} w_{mj}\Big](y_{2j-1} + i y_{2j})\Big)_{j=1}^{[(s-r)/2]},\ (y_{s-r+i} + \sigma_i)_{i=1}^{r}\right) \tag{24}$$

when s − r is even. When s − r is odd, y_{s−r} remains unchanged. Furthermore,

$$k(0, \dots, 0, \sigma_{r+1}, \dots, \sigma_N) \cdot (y_1 + i y_2, \dots, y_{s-r-1} + i y_{s-r}, y_{s-r+1}, \dots, y_s) = \left(\Big(\exp\Big[i \sum_{l=r+1}^{N} \sigma_l w_{lj}\Big](y_{2j-1} + i y_{2j})\Big)_{j=1}^{[(s-r)/2]},\ (y_{s-r+i})_{i=1}^{r}\right) \tag{25}$$

when s − r is even. When s − r is odd, y_{s−r} remains unchanged.

Let A be the matrix in Lemma 4, and let ψ̂_i = Σ_j A_ij ψ_j. By Lemma 4, the Killing fields ψ̂_i are related to the coordinate vector fields ∂_{y_i} as:

$$
\begin{pmatrix} \hat\psi_1 \\ \vdots \\ \hat\psi_r \\ \hat\psi_{r+1} \\ \vdots \\ \hat\psi_N \end{pmatrix}
=
\begin{pmatrix}
v_{11} & \cdots & v_{1r} & w_{1\,1} & \cdots & w_{1\,[(s-r)/2]} \\
\vdots & & \vdots & \vdots & & \vdots \\
v_{r1} & \cdots & v_{rr} & w_{r\,1} & \cdots & w_{r\,[(s-r)/2]} \\
0 & \cdots & 0 & w_{r+1\,1} & \cdots & w_{r+1\,[(s-r)/2]} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & w_{N\,1} & \cdots & w_{N\,[(s-r)/2]}
\end{pmatrix}
\begin{pmatrix}
\partial_{y_{s-r+1}} \\ \vdots \\ \partial_{y_s} \\ y_1 \partial_{y_2} - y_2 \partial_{y_1} \\ \vdots \\ y_{s-r-1} \partial_{y_{s-r}} - y_{s-r} \partial_{y_{s-r-1}}
\end{pmatrix}
\tag{26}
$$


when s − r is even. When s − r is odd, there is an analogous expression. Let us denote the N × (r + [(s − r)/2]) matrix in this expression as C. When N − r = [(s − r)/2], C is a square N × N matrix. Furthermore, each of the commuting, locally defined Killing fields ∂/∂y_i and y_{2j−1} ∂/∂y_{2j} − y_{2j} ∂/∂y_{2j−1} on the right side of the above equation is periodic, with period precisely 2π. Hence, when N − r = [(s − r)/2], the matrix C must have determinant ±1. So we get the condition

$$\det (v_{ij})_{i,j=1}^{r} \cdot \det (w_{(r+i)j})_{i,j=1}^{N-r} = \det C = \pm 1. \tag{27}$$

Because both determinants on the left are integers, we conclude that they must be ±1. In view of Lemma 2, this means p_1 = ··· = p_r = 1 and det(w_{(r+i)j})_{i,j=1}^{N−r} = ±1. We summarize our findings in another lemma:

Lemma 5. Let ψ_1, ..., ψ_N be Killing fields as above, x ∈ S_r, n = N − r = [(s − r)/2]. Then p_1 = ··· = p_r = 1 (see Lemma 2), and det(w_{(r+i)j})_{i,j=1}^n = ±1. Furthermore, in that case I_x is connected.

With the help of the above lemmas, we are now ready to analyze the orbit space Σ̂ in the case when N = s − 2. We first cover Σ by the coordinate systems defined

in Eq. (23). Within each such coordinate system, we can then separately perform the quotient by T. We need to distinguish the cases n = 0, 1, 2, where n = s − 2 − r, and where the coordinate system covers a point x ∈ S_r.

Case 0. For n = 0 and hence r = s − 2, the isotropy group I_x is discrete and is isomorphic to the group generated by the elements γ_i = Σ_{j=1}^{s−2} (v^{-1})_{ij} b_j, see Lemmas 2, 3. It is also isomorphic to ∏_j Z_{p_j^{α_j}}. Furthermore, by combining Lemmas 3 and 4, the action of these isotropy groups in a neighborhood of O_x can be written as

$$k(0, \dots, 2\pi, \dots, 0) \cdot (y_1 + i y_2, y_3, \dots, y_s) = \big(\vartheta(\gamma_j)(y_1 + i y_2),\ y_3, \dots, y_s\big), \tag{28}$$

where we are using the notation introduced in Eq. (19) for the homomorphism ϑ : ∏_j Z_{p_j^{α_j}} → S^1, and where the "2π" is in the j-th slot. Consider now the kernel ker ϑ. If g is an element in its kernel, then it is evident from the above formula that the corresponding isometry of Σ acts by the identity in a full neighborhood of O_x. Consequently, g must be the identity element of the group, since we are assuming the action to be effective. In particular, ϑ is injective. Consider next the image ran ϑ. This is a finite subgroup of the circle group S^1. Hence it is given by ran ϑ = {e^{2πik/q} | k = 0, ..., q − 1} ≅ Z_q for some q. It follows from the fact that ϑ is injective that

$$|\operatorname{ran} \vartheta| = q = \Big|\prod_j \mathbb{Z}_{p_j^{\alpha_j}}\Big| = \prod_j p_j^{\alpha_j}. \tag{29}$$
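That a finite subgroup of S^1 is cyclic, with order fixed by its generators, can be made concrete. The sketch below is an illustration under an assumption of ours: each generator is written as exp(2πi t) with t a rational number, and the function name `image_order` is hypothetical. The subgroup generated by such phases is cyclic of order equal to the least common multiple of the reduced denominators.

```python
from fractions import Fraction
from math import lcm

def image_order(phases):
    """Order q of the subgroup of S^1 generated by exp(2*pi*i*t) for rational t."""
    # Fraction reduces automatically, so each denominator is the order of one generator.
    return lcm(*(Fraction(t).denominator for t in phases))
```

For instance, the phases 1/2 and 1/6 generate a copy of Z_6, so a neighborhood of the corresponding orbit would be modeled on R^2/Z_6.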

Furthermore, it follows that the inverse ϑ^{-1} is a well-defined map on Z_q, which can be viewed as taking values in the isotropy group I_x ⊂ T. It follows from the discussion that, within the neighborhood considered, the quotient is modeled upon R^2/Z_q, where q = ∏_j p_j^{α_j} = |det v| (see Lemmas 2, 3), and where the cyclic group of q elements acts on the coordinates y_1 + i y_2 by complex phases e^{2πi/q}. Thus, in a neighborhood of O_x, the quotient space is an orbifold R^2/Z_q. In particular, we see that the orbits having non-trivial discrete isotropy group must be isolated points in Σ̂. These orbits are also called "exceptional orbits". The other orbits in case (0) have no isotropy group and are called "principal orbits".

Case 1. For n = 1, Lemma 5 applies and p_i = 1 for all i and w_{(s−3)1} = ±1. We first factor by the group elements [0, ..., 0, σ_{s−2}], see Eq. (24), and afterwards by the group elements [σ_1, ..., σ_{s−3}, 0], see Eq. (25). Then it is quite clear that the resulting quotient space of our neighborhood of O_x is locally modeled upon R × R_{>0}. The first factor corresponds to the variable y_3, while the second factor to the variable √(y_1² + y_2²).

Case 2. For n = 2, Lemma 5 applies and p_i = 1 for all i and

$$\det\begin{pmatrix} w_{(s-3)1} & w_{(s-2)1} \\ w_{(s-3)2} & w_{(s-2)2} \end{pmatrix} = \pm 1.$$

We first factor by the group elements [0, ..., 0, σ_{s−3}, σ_{s−2}], see Eq. (24), and afterwards by the group elements [σ_1, ..., σ_{s−4}, 0, 0], see Eq. (25). Then it is quite clear that the resulting quotient space of our neighborhood of O_x is locally modeled upon R_{>0} × R_{>0}. The first factor corresponds to the variable √(y_3² + y_4²), while the second factor to the variable √(y_1² + y_2²). Thus, we have proven the following theorem:

Theorem 1. Let Σ be a compact orientable connected s-dimensional Riemannian manifold (without boundary) with s − 2 pairwise commuting Killing fields generating an action of the group T = T^{s−2} by isometries. Then the quotient space Σ̂ = Σ/T is an orbifold with conical singularities, boundary segments, and corners. Thus, each point of Σ̂ has a neighborhood modeled on a neighborhood of the tip of a cone R^2/Z_q, on a half-space R × R_{>0}, or on a corner, R_{>0} × R_{>0}. In the first case, the corresponding isotropy group is finite and q is given by the order of this group.

Each point of the boundary segments, corners, or orbifold points in Σ̂ is associated with an isotropy group I_x as in Lemma 3. It follows from our discussion in Case 1) that, as long as we stay within one boundary segment, the isotropy group does not change. Furthermore, by Lemmas 5 and 3, the isotropy group I_x is connected for points x associated with boundaries and corners. For x associated with conical singularities, I_x is discrete, again by Lemmas 5 and 3. It also follows from our discussion of Cases 1) and 2) that, for each boundary segment and each corner, the isotropy group is completely characterized by an integer matrix A of determinant ±1. Furthermore, it follows from our discussion in Case 0) that the isotropy group I_x is characterized by an integer q and an injective homomorphism ϑ^{-1} : Z_q → T^{s−2}, whose image is I_x. There is one such matrix A for each boundary segment, one for each corner, and one such q, ϑ^{-1} for each conical singularity.

The matrices A are actually not completely characterized by the corresponding isotropy subgroup I_x. In fact, by Lemma 2 (with L = I_x, x ∈ S_r) the position of the isotropy subgroup within T is uniquely determined by the class (N = s − 2)

$$[A] \in \frac{SL(N, \mathbb{Z})}{U(N - r, r; \mathbb{Z})}, \tag{30}$$

where U(N − r, r; Z) is the group of block-upper triangular matrices with block sizes N − r, r with integer entries and determinant ±1. The quotient by such matrices U takes into account the fact that left-multiplying an A by such a matrix gives the same isotropy subgroup. When N − r = n = 1 (corresponding to Case 1, and a boundary segment),


the class of A is determined by the last row (a_{N1}, ..., a_{NN}) of the matrix A, and we have Σ_i a_{Ni} ψ_i|_x = 0 for each point x in M corresponding to the boundary segment under consideration. When N − r = n = 2 (corresponding to Case 2, and a corner), the class of A is determined by the last two rows (a_{(N−1)1}, ..., a_{(N−1)N}), (a_{N1}, ..., a_{NN}) up to an SL(2, Z) transformation acting on each column of the N × 2 matrix formed from these. We have Σ_i a_{(N−1)i} ψ_i|_x = 0 and Σ_i a_{Ni} ψ_i|_x = 0 for each point x in Σ corresponding to the corner under consideration.

If {I_j} ⊂ ∂Σ̂ is the collection of boundary segments, and if I_ij = I_i ∩ I_j are the corresponding corners, then for each I_i, we have a vector a(I_i) ∈ Z^N which is the last row of the matrix A corresponding to that boundary segment. The greatest common divisor (g.c.d.) of the entries of the vector may be assumed to be equal to 1,

$$\mathrm{g.c.d.}\{a_i(I_j) \mid i = 1, \dots, D - 3\} = 1. \tag{31}$$

For each corner I_ij, the corresponding vectors a(I_i) and a(I_j) must be such that the N × 2 matrix formed from these vectors can be supplemented by N − 2 rows of integers to an SL(N, Z)-matrix, and this introduces a constraint on the pair a(I_i), a(I_j). In the case s = 4 (i.e., N = 2), the constraint at each corner I_ij is simply that det(a(I_i), a(I_j)) = ±1. In general, the constraint on the vectors adjacent to a corner I_ij can be restated as follows applying [3, Lemma 2, I.2.3]:

Proposition 1. Let {I_j} ⊂ ∂Σ̂ be the boundary segments. With each boundary segment there is associated a vector a(I_j) ∈ Z^{s−2} and Σ_i a_i(I_j) ψ_i = 0 at the corresponding points of Σ. At a corner I_ij = I_i ∩ I_j, the vectors are subject to the constraint

$$\mathrm{g.c.d.}\{Q_{kl} \mid 1 \le k < l \le D - 3\} = 1. \tag{32}$$

Here, the numbers Q_kl ∈ Z are defined by

$$Q_{kl} = \left|\det\begin{pmatrix} a_k(I_i) & a_k(I_j) \\ a_l(I_i) & a_l(I_j) \end{pmatrix}\right|. \tag{33}$$
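The corner constraint (32), (33) is easy to test for given integer vectors. The following is a minimal sketch; the function name `corner_compatible` is hypothetical, and the two vectors are assumed to have the same length N ≥ 2. It forms all 2 × 2 minors Q_kl and checks that their greatest common divisor equals 1.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def corner_compatible(a, b):
    """True if the 2x2 minors Q_kl of the integer vectors a, b have g.c.d. 1."""
    minors = [abs(a[k] * b[l] - a[l] * b[k])
              for k, l in combinations(range(len(a)), 2)]
    return reduce(gcd, minors, 0) == 1
```

For N = 2 there is a single minor, and the test reduces to the condition det(a(I_i), a(I_j)) = ±1 quoted above for s = 4.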

Let {x̂_i} ⊂ Σ̂ be the conical singularities. With each one, there is associated a natural number q_i > 1, specifying the type R^2/Z_{q_i} of the conical singularity, and a homomorphism ϑ_i^{-1} : Z_{q_i} → T, whose image is the discrete isotropy subgroup at x_i = any point in Σ in the class x̂_i ∈ Σ̂. The collections {I_i} and {x̂_i} are finite.

Remarks. (1) The data consisting of (i) the vectors {a(I_j)}, (ii) the pairs {q_j, ϑ_j^{-1}}, (iii) the orientation of Σ̂, (iv) the topological type of Σ̂ (genus) has been called the "weighted orbit space" by Orlik and Raymond [43,44] for the case s = 4. Our proposition hence may be viewed as a generalization of their analysis to higher dimensions.

(2) If the boundary ∂Σ̂ is empty, then, as explained in detail in [43, Sect. 1.3], there are additional invariants associated with the T-space Σ. These may be characterized as obstructions to lift certain cross sections on the boundaries of tubular neighborhoods of the orbifold-type orbits x̂_i to Σ and may be thought of as a class in the space

$$H^2\Big(\hat\Sigma,\ \bigcup_{i=1}^{m} D_i^2;\ \mathbb{Z}^{s-2}\Big) \cong \mathbb{Z}^{s-2}, \tag{34}$$

where each D_i^2 is a disk around x̂_i. This class has to be added to the data.


Proof. It only remains to be shown that the collections {I_i} and {x̂_i} are finite. Suppose e.g. that there was an infinite set of points {x̂_i}. Because Σ̂ is a compact manifold with boundaries and corners, there would then have to be a convergent subsequence with limit ŷ ∈ Σ̂. We claim that ŷ cannot be on ∂Σ̂. Indeed, the boundary ∂Σ̂ consists of orbits for which we have already constructed open neighborhoods with an action of T described above in Cases 1) and 2) after Lemma 5. Clearly these open neighborhoods thus do not contain any points with discrete isotropy group, and hence none of the x_i. This shows that ŷ ∉ ∂Σ̂. We next show that I_y ⊂ T, the isotropy subgroup of y, must be discrete and non-trivial. Let {x̂_{i_k}} be the subsequence converging to ŷ, and let T_k := I_{x_{i_k}} be the corresponding isotropy subgroups. Clearly, each T_k is a discrete, non-trivial subgroup of the compact abelian group T. It easily follows from these general facts that there is a sequence g_k ∈ T_k with limit g ∈ T not equal to the identity element, and it also easily follows that g ∈ I_y. Hence the isotropy group of the point y is not trivial, and because ŷ ∉ ∂Σ̂, it must be discrete. However, we may now construct a chart around the orbit of y as in Case 0) below Lemma 5, and this shows that there are no other points x with non-trivial isotropy subgroup in an open neighborhood of the orbit of y besides the orbit itself. In particular, x̂_i cannot converge to ŷ, a contradiction. Hence {x̂_i} is finite. The argument that the collection {I_i} is finite proceeds in a similar way.

By a similar analysis we can also prove the following theorem on cohomogeneity-1 torus actions:

Theorem 2. Let (H, γ) be a connected, orientable, compact Riemannian manifold of dimension s − 1 > 1 with an isometry group containing an (s − 2)-dimensional torus T = T^{s−2}. Then the orbit space Ĥ = H/T is diffeomorphic to a closed interval as a manifold with boundary, or to a circle. In the first case, we have the following possibilities concerning the topology of H:

$$H \cong \begin{cases} S^2 \times T^{s-3} \\ S^3 \times T^{s-4} \\ L(p, q) \times T^{s-4} \end{cases} \tag{35}$$

Here L(p, q) is a 3-dimensional Lens space. In the second case, H ≅ T^{s−1}.

Proof. Let ψ_i, i = 1, ..., s − 2 be the commuting Killing fields of period 2π generating the action of T on H. In the decomposition H = ∪ S_r defined as in Eqs. (9), (10), only the sets with r = s − 1 and r = s − 2 may be non-empty, by Lemma 1. We consider these cases separately.

Case 0). Let x ∈ S_{s−1}, and let T_xH = T_xO_x ⊕ W_x be the orthogonal decomposition into vectors tangent to O_x and those orthogonal to O_x. By assumption, the dimension of W_x is one. If k ∈ I_x is in the isotropy group, then it leaves T_xO_x invariant, as k · ψ_i = ψ_i for all i. So k acts as ±1 on W_x. But k also preserves the rank (s − 1) anti-symmetric tensor compatible with the metric, which exists since H is orientable. So k acts as +1 on W_x, and hence as the identity on T_xH. The action of k must hence leave invariant any piecewise smooth geodesic on (H, γ) through x, and therefore k must act as the identity on all of H, since this is a connected manifold. Thus, the isotropy group I_x is trivial


in Case 0). Consequently, near Ox , Hˆ = H/T has the structure of a 1-dimensional manifold, i.e., an open interval. Case 1). Let x ∈ Ss−2 . By exactly the same arguments as given above using Lemmas 4 and 5, the action of T is given near Ox in local coordinates (y1 , . . . , ys−1 ) by k(σ1 , . . . , σs−2 ) · (y1 + i y2 , y3 , . . . , ys−1 ) s−2 = exp i wl σl (y1 + i y2 ), y3 + σ1 , . . . , ys−1 + σs−3 .

(36)

l=1

Here, ±A is some S L(s − 2, Z) matrix, thenumbers wl are integers, and ws−2 = ±1 (see Lemma 5). It is evident from this that y 2 + y 2 furnishes a coordinate for Hˆ in a 1

2

neighborhood of Ox , thus identifying this neighborhood locally with a half-open interval. Because Hˆ can be covered by neighborhoods of the kind described in Cases 0) and 1), i.e., open and half open intervals, and because Hˆ is compact in a natural topology and connected, it follows that Hˆ must be a 1-dimensional connected compact manifold with or without boundaries. In the first case, Hˆ is diffeomorphic to a closed interval, in the second case to a circle. In the first case, the two boundary points of this closed interval correspond to orbits Ox respectively O y in H, where an integer linear combination ai,1 ψi respectively ai,2 ψi vanishes. We can redefine our action of T using Ai j ψ j for some integer matrix A with det A = ±1 instead the Killing fields ψˆ i = in such a way that on Ox we have ψˆ 1 = 0, while on O y we have p ψˆ 1 + q ψˆ 2 = 0. Consider now the subgroup L ⊂ T generated by ψˆ 3 , . . . , ψˆ s−2 . Clearly, L is isomorphic to Ts−4 . It follows from the discussion of the Cases 0) and 1) that there are no points in H which are fixed under a non-trivial element of L, so H ∼ = (H/L) × Ts−4 . Then, H/L is a three-dimensional manifold on which there acts the subgroup of isometries in T generated by ψˆ 1 , ψˆ 2 . It is not difficult to see, and argued carefully in [28], that H/L is isomorphic to S3 if ( p, q) = (0, 1), isomorphic to S2 × T1 if ( p, q) = (1, 0), and a Lens-space L( p, q) otherwise. In the second case, H must be diffeomorphic to the direct product of T and a circle, i.e. to Ts−1 . 3.2. The fundamental group of . In the previous section, we have analyzed oriented s-dimensional manifolds with an effective action of T = Ts−2 . We showed that ˆ = /T was an orientable 2-manifold with a finite number of the quotient space

conical singularities in the interior, and with boundaries and corners. With each of the conical singularities $\hat x_i \in \hat\Sigma$ there was associated an integer $q_i \in \mathbb{Z}$ and an injective homomorphism $\vartheta_i^{-1} : \mathbb{Z}_{q_i} \to T$. These homomorphisms may be written as

\[
\vartheta_j^{-1}(e^{2\pi i/q_j}) = \big( e^{2\pi i p_{1,j}/q_j}, \dots, e^{2\pi i p_{s-2,j}/q_j} \big),
\tag{37}
\]

where $\mathrm{g.c.d.}\{q_j, \mathrm{g.c.d.}\{p_{1,j}, \dots, p_{s-2,j}\}\} = 1$. Furthermore, with each of the boundary intervals $I_i \subset \partial\hat\Sigma$, there was associated a vector $a_i = (a_{1,i}, \dots, a_{s-2,i}) \in \mathbb{Z}^{s-2}$. On a corner, the vectors are subject to the constraint (32), (33). If $\Sigma$ is compact, then $\hat\Sigma$ is a compact oriented 2-dimensional topological manifold, and hence topologically of the form

\[
\hat\Sigma \cong \hat\Sigma_g \setminus \bigcup_{j=1}^{d} D_j^2,
\tag{38}
\]

where each $D_j^2$ is a 2-dimensional disk, and where $\hat\Sigma_g$ is a closed Riemann surface of genus $g$. One can show that the manifold $\Sigma$ with $T$-action is fixed up to equivariant isomorphism by the data consisting of $\hat\Sigma$, $\{I_i\}$, $\{\hat x_i\}$, $\{q_i, p_i\}$, $\{a_i\}$; we will indicate how to prove this in Subsect. 3.3. Therefore, any topological invariant of $\Sigma$ must be expressible in terms of these data. It is evident that the fundamental group $\pi_1(\Sigma)$ should provide a strong invariant for the topology of $\Sigma$. It is given in the next theorem:

Theorem 3. Let $\Sigma$ be a compact orientable manifold with an effective action of $T = T^{s-2}$ such that $\partial\hat\Sigma \neq \emptyset$. Then the fundamental group can be presented as:

\[
\begin{aligned}
\pi_1(\Sigma) = \Big\langle\, & k_1, \dots, k_{s-2},\ d_1, \dots, d_c,\ h_1, \dots, h_d,\ m_1, \dots, m_g,\ l_1, \dots, l_g \ \Big| \\
& [m_1, l_1] \cdots [m_g, l_g] \cdot d_1 \cdots d_c \cdot h_1 \cdots h_d\,;\ [m_i, k_j]\,;\ [l_i, k_j]\,;\ [d_i, k_j]\,;\ [h_i, k_j]\,;\ [k_i, k_j]\,; \\
& k_1^{a_{1,1}} \cdots k_{s-2}^{a_{s-2,1}},\ \dots,\ k_1^{a_{1,b}} \cdots k_{s-2}^{a_{s-2,b}}\,; \\
& d_1^{q_1} k_1^{p_{1,1}} \cdots k_{s-2}^{p_{s-2,1}},\ \dots,\ d_c^{q_c} k_1^{p_{1,c}} \cdots k_{s-2}^{p_{s-2,c}} \,\Big\rangle .
\end{aligned}
\tag{39}
\]

Here, we are using the usual notation for a finitely generated group in terms of its relations, and $[x, y] = x y x^{-1} y^{-1}$ is the commutator of group elements. Above, $g$ is the number of handles of $\hat\Sigma$, $c$ is the number of conical singularities, $b$ is the number of intervals $\{I_i\}$, and $d$ is the number of boundary components in $\partial\hat\Sigma$ homeomorphic to circles, see Eq. (38).

Proof. The proof is essentially an application of the Seifert-Van Kampen theorem, which is described e.g. in [36, Chap. 4]. Let $x \in \Sigma$ be any point with trivial isotropy group, and let $k_i$, $i = 1, \dots, s-2$ be the closed loops obtained by applying the $i$-th generator of $\pi_1(T)$ (= generator of the $i$-th copy of $T^1$ in $T^{s-2}$) to $x$. Let $d_i$, $i = 1, \dots, c$ be lifts of loops going around the $i$-th conical singularity in $\hat\Sigma$, and let $h_i$, $i = 1, \dots, d$ be lifts of loops going around the $i$-th hole of $\hat\Sigma$ (= boundary component in $\partial\hat\Sigma$). We cut out a small disk $D_i^2$ around each of the conical singularities in $\hat\Sigma$, we cut out a small neighborhood of the boundary in $\hat\Sigma$, and we consider the corresponding subset of $\Sigma$. This subset will have a homotopy group generated by $k_1, \dots, k_{s-2}$, $d_1, \dots, d_c$, $h_1, \dots, h_d$, and generators $m_1, l_1, \dots, m_g, l_g$ corresponding to the $g$ handles of $\hat\Sigma$. The relations are

\[
[m_1, l_1] \cdots [m_g, l_g] \cdot d_1 \cdots d_c \cdot h_1 \cdots h_d\,;\ [m_i, k_j]\,;\ [l_i, k_j]\,;\ [d_i, k_j]\,;\ [h_i, k_j]\,;\ [k_i, k_j].
\tag{40}
\]

We now glue back in the neighborhood of the boundary. Since, near the $i$-th boundary segment $I_i$, the generator $k_1^{a_{1,i}} \cdots k_{s-2}^{a_{s-2,i}}$ shrinks to zero size, we receive the relations

\[
k_1^{a_{1,1}} \cdots k_{s-2}^{a_{s-2,1}}\,;\ \dots\,;\ k_1^{a_{1,b}} \cdots k_{s-2}^{a_{s-2,b}}
\tag{41}
\]

via the Van Kampen theorem. We finally glue in the disks around the conical singularities, each of which corresponds to a tube $D^2 \times T^{s-2}$. We must perform the gluing in such a way that the standard action of $T$ on $D^2 \times T^{s-2}$ matches up with the action of $T$ on $\Sigma$ near the exceptional orbits. This action is characterized by the homomorphism (37) for the $j$-th tube; we receive the relations

\[
d_1^{q_1} k_1^{p_{1,1}} \cdots k_{s-2}^{p_{s-2,1}}\,;\ \dots\,;\ d_c^{q_c} k_1^{p_{1,c}} \cdots k_{s-2}^{p_{s-2,c}}
\tag{42}
\]

from this operation, again via the Van Kampen theorem.
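For $s = 4$ the relations (41) involve only the two commuting generators $k_1, k_2$, so their effect can be read off from the sublattice of $\mathbb{Z}^2$ spanned by the boundary vectors $a_i$. The following is a minimal sketch under that assumption; the helper names `det2` and `sublattice_index` are hypothetical and not taken from the paper. It uses the standard fact that the index of a full-rank sublattice of $\mathbb{Z}^2$ equals the g.c.d. of the $2 \times 2$ minors of its generators. An index of 1 means that the relations (41) alone already force $k_1 = k_2 = e$.

```python
from itertools import combinations
from math import gcd


def det2(a, b):
    """Determinant of the 2x2 integer matrix with rows a and b."""
    return a[0] * b[1] - a[1] * b[0]


def sublattice_index(vectors):
    """Index in Z^2 of the lattice spanned by `vectors`.

    Computed as the gcd of all 2x2 minors. Returns 0 when the vectors
    span a lattice of rank < 2 (infinite index).
    """
    index = 0
    for a, b in combinations(vectors, 2):
        index = gcd(index, abs(det2(a, b)))
    return index


if __name__ == "__main__":
    # Illustrative boundary data (assumed for this sketch).
    print(sublattice_index([(1, 0), (0, 1), (1, 1)]))  # vectors span all of Z^2
    print(sublattice_index([(1, 0), (1, 2)]))          # proper sublattice
```

When the index is larger than 1, the subgroup generated by $k_1, k_2$ is only known to be a quotient of a finite abelian group of that order; the remaining relations in (39) may reduce it further.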

The theorem has an interesting corollary in $s = 4$ if the action of $T$ has a fixed point, i.e. when the orbit space has a corner. The vectors associated with the intervals $I_i, I_{i+1}$ adjacent to the corner, $a_i, a_{i+1}$, must then satisfy $\det(a_i, a_{i+1}) = \pm 1$ [see Eq. (32)]. This imposes the relation $k_1 = k_2 = e$ in Eq. (39). Then, if $\pi_1(\Sigma) = 0$, this will imply that $g = d = 0$, and $q_1, \dots, q_c = 0$. In other words, if $s = 4$, if the action has a fixed point, and if $\Sigma$ is simply connected, then there are no conical singularities, i.e., exceptional orbits. This generalizes a result first proved using methods from singular cohomology in [43].

The above theorem has another related corollary which will be relevant below in our application to the structure of black holes. Let $D^2 \subset \hat\Sigma$ be any disk in the interior of the orbit manifold not intersecting any of the boundaries or conical singularities. Thus, the orbits are all $(s-2)$-dimensional tori, with no fixed points. The inverse image of $D^2$ in $\Sigma$ is homeomorphic to $D^2 \times T^{s-2}$, with $T$ acting on the second factor. Let us denote the generators of $\pi_1(D^2 \times T^{s-2})$ by $k_1, \dots, k_{s-2}$, which are the $s-2$ generators of $\pi_1(T^{s-2}) = \mathbb{Z}^{s-2}$. Without loss of generality, we may assume that the $k_j$ are the images of the paths generated by the action of the $j$-th copy of $T^1$ in $T = T^{s-2}$ on a point $x \in D^2 \times T^{s-2}$. From the inclusion $f : D^2 \times T^{s-2} \to \Sigma$, we get a corresponding homomorphism $f_* : \pi_1(D^2 \times T^{s-2}) \to \pi_1(\Sigma)$. The way we have set things up, we may assume that $f_*(k_j) = k_j$, using the same notation and assumptions as in the above Theorem 3.

Lemma 6. If $f_* : \pi_1(D^2 \times T^{s-2}) \to \pi_1(\Sigma)$ is surjective, then we have $g = d = 0$, $q_1 = \dots = q_c = 1$. In other words, $\hat\Sigma$ is topologically a disk, and there are no conical singularities.

Proof. Using Eq. (39) and the formula $f_*(k_j) = k_j$, we see that $f_* \pi_1(D^2 \times T^{s-2})$ is a normal subgroup of $\pi_1(\Sigma)$. By assumption, the factor group $\pi_1(\Sigma)/f_* \pi_1(D^2 \times T^{s-2})$ is trivial. From the quotient, the group $\pi_1(\Sigma)$ [see Eq. (39)] receives the additional relations $k_j = e$ for $j = 1, \dots, s-2$. This means that the factor group is isomorphic to

\[
\pi_1(\Sigma)/f_* \pi_1(D^2 \times T^{s-2}) \cong \big\langle d_1, \dots, d_c, h_1, \dots, h_d, m_1, \dots, m_g, l_1, \dots, l_g \ \big|\ [m_1, l_1] \cdots [m_g, l_g] \cdot d_1 \cdots d_c \cdot h_1 \cdots h_d\,;\ d_1^{q_1}\,;\ \dots\,;\ d_c^{q_c} \big\rangle .
\tag{43}
\]

Evidently, this group is non-trivial unless $g = d = 0$, $q_1 = \dots = q_c = 1$, from which the lemma follows.

3.3. Model spaces, examples. In the previous sections, we showed how a closed oriented manifold $\Sigma$ of dimension $s$ with an action of $T = T^{s-2}$ gives rise to a number of invariants and decoration data on an orbit space $\hat\Sigma$, see Prop. 1, and Thm. 3. In this section we will outline to what extent, conversely, these data determine the original manifold with $T$-action. In other words, given another such manifold $\Sigma'$, does there

exist a diffeomorphism $h : \Sigma \to \Sigma'$, and an automorphism $\alpha_A : T \to T$ such that $h(k \cdot x) = \alpha_A(k) \cdot h(x)$ for all $x \in \Sigma$, $k \in T$? As shown in the case $s = 4$ in [44, Para. I], the answer to this question is in the affirmative. (In the case that $\partial\hat\Sigma = \emptyset$, the decoration data must include also the invariant mentioned in remark (2) after Prop. 1.) The proof of this theorem really extends straightforwardly to the case of $\Sigma$ with arbitrary dimension, so we will not describe it here in detail.

A related question is whether for a given $\hat\Sigma$ and given decoration data as described in Prop. 1, we can find a corresponding manifold $\Sigma$ with $T$-action described by these data. The answer is again in the affirmative, and we now outline how one can construct such a manifold. Thus, let us assume that we are given (i) an orbit space $\hat\Sigma$ which is an oriented two-dimensional manifold with boundaries, corners, and conical singularities, (ii) vectors $\{a(I_j)\}$, one for each component $I_j \subset \partial\hat\Sigma$, satisfying the constraints (32), (iii) a collection $\{q_i, p_i\}$, one for each conical singularity $\hat x_i \in \hat\Sigma$, as described around (37). We want to construct a corresponding manifold $\Sigma$ with $T$-action.

For simplicity, let us assume that $\hat\Sigma$ is a half-plane $\mathbb{R}_{>0} \times \mathbb{R}$, with finitely many conical singularities in the interior, and with boundary divided into the segments $I_1, \dots, I_b$. We first consider the conical singularities in the interior. We may assume that they are all in a disk $D^2 \subset \mathbb{R}_{>0} \times \mathbb{R}$. We cut out this disk, and we consider $D^2 \times T$ with standard action of $T$ on the second factor. We cut out from this region $c$ tubes of the form $D_i^2 \times T$, with each $D_i^2$ a small disk containing the $i$-th of the $c$ conical singularities. Near the conical singularities, we would like the $T$-action to be described by the homomorphisms $\vartheta_i^{-1} : \mathbb{Z}_{q_i} \to T$ given in Eq. (37). A model space for this action is

\[
D_i^2 \times_{\vartheta_i^{-1}} T, \qquad D_i^2 = \{ z \in \mathbb{C} \mid |z - z_i| \le 1 \},
\tag{44}
\]

where $g \in \mathbb{Z}_{q_i} \subset S^1$ acts on the disk by multiplication with the complex phase. We glue in these model spaces along the boundaries where we cut out the $c$ tubes $D_i^2 \times T$ with diffeomorphisms $h_i : \partial(D_i^2 \times_{\vartheta_i^{-1}} T) \to \partial(D_i^2 \times T)$ in such a way that the $T$-actions

match up. We call the manifold with boundary obtained from $D^2 \times T$ in this way $\Sigma_0$.

We now construct a second $T$-space $\Sigma_1$ that incorporates the data $\{a(I_j)\}$. These data were constructed above by giving, for each orbit, a neighborhood together with a set of coordinates in which the action of $T$ was explicitly given. It is intuitively clear that we can turn this around and define $\Sigma_1$ to be the collection of these coordinate charts with corresponding $T$-action, and we now briefly explain how this can be done. For simplicity and concreteness, we consider explicitly the case when $s = \dim \Sigma = 4$. The construction is well-known in topology and is sometimes called "linear plumbing", see [25]. We present the construction in such a way that the generalization to general $s$ should be fairly obvious; details will be given in [8].

The construction of $\Sigma_1$ is as follows. Let $b \ge 2$ be the number of boundary segments $\{I_j\}$. On the boundary $S^3$ of the four-dimensional solid ball $B^4 = \{ y_1^2 + y_2^2 + y_3^2 + y_4^2 < 1 \}$, we consider the disjoint subsets

\[
\begin{aligned}
S_+ &:= \{ (y_1, y_2, y_3, y_4) \in S^3 \mid y_3^2 + y_4^2 < 1/4 \}, \\
S_- &:= \{ (y_1, y_2, y_3, y_4) \in S^3 \mid y_1^2 + y_2^2 < 1/4 \}.
\end{aligned}
\tag{45}
\]

Both of these subsets are topologically solid tori. We consider the disjoint union of $b - 1$ copies of the solid ball $B^4$, and on the $i$-th copy we define an action of $T = T^2$ generated by the two $2\pi$-periodic vector fields $\psi_1, \psi_2$ given by

\[
\begin{pmatrix} a_1(I_i) & a_2(I_i) \\ a_1(I_{i+1}) & a_2(I_{i+1}) \end{pmatrix}
\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}
=
\begin{pmatrix} y_1 \partial_{y_2} - y_2 \partial_{y_1} \\ y_3 \partial_{y_4} - y_4 \partial_{y_3} \end{pmatrix}.
\tag{46}
\]

The consistency condition on the $i$-th corner (33), (32) guarantees that the determinant of the above matrix is $\pm 1$. We wish to glue the $S_+$-part of the boundary of the $i$-th copy of the ball $B^4$ to the $S_-$-part of the boundary of the $(i+1)$-th copy in such a way that the actions of $T$ on these copies are compatible. It is not difficult to see that this is achieved if we identify these parts by the maps $f_i : S_- \to S_+$ defined by

\[
f_i(y_1, y_2, y_3, y_4) = \big( y_3,\ y_4,\ y_1 \sin(n_i \varphi) + y_2 \cos(n_i \varphi),\ y_1 \cos(n_i \varphi) - y_2 \sin(n_i \varphi) \big),
\tag{47}
\]

where $\varphi = \arctan \frac{y_4}{y_3}$ and $n_i = a_1(I_i) a_2(I_{i+2}) - a_2(I_i) a_1(I_{i+2})$, i.e. we have $f_i^* \psi_1 = \psi_1$ and $f_i^* \psi_2 = \psi_2$. Thus, for $b > 2$ we define$^{10}$

\[
\Sigma_1 = (\dots((B^4 \cup_{f_1} B^4) \cup_{f_2} B^4) \cdots \cup_{f_{b-3}} B^4) \cup_{f_{b-2}} B^4 .
\tag{48}
\]
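The integer input to this plumbing construction can be tabulated directly from the boundary vectors. The sketch below assumes $s = 4$, so that each $a(I_j)$ is a pair of integers; the function names are hypothetical and not from the paper. It computes the corner determinants $\det(a(I_i), a(I_{i+1}))$, which must be $\pm 1$, and the integers $n_i = a_1(I_i) a_2(I_{i+2}) - a_2(I_i) a_1(I_{i+2})$ entering the gluing maps $f_i$.

```python
def det2(a, b):
    """Determinant of the 2x2 integer matrix with rows a and b."""
    return a[0] * b[1] - a[1] * b[0]


def corner_determinants(vectors):
    """det(a(I_i), a(I_{i+1})) for each of the b-1 corners of the chain."""
    return [det2(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]


def plumbing_integers(vectors):
    """n_i = det(a(I_i), a(I_{i+2})) for each of the b-2 gluing maps f_i."""
    return [det2(vectors[i], vectors[i + 2]) for i in range(len(vectors) - 2)]


def satisfies_corner_constraint(vectors):
    """True if every corner determinant equals +1 or -1."""
    return all(abs(d) == 1 for d in corner_determinants(vectors))


if __name__ == "__main__":
    # Illustrative chain of three boundary vectors (assumed data).
    data = [(1, 0), (0, 1), (1, 1)]
    print(corner_determinants(data), plumbing_integers(data))
    print(satisfies_corner_constraint(data))
```

The chain is treated as open (a boundary line with $b$ segments), so there are $b - 1$ corners and $b - 2$ gluing maps, matching the count of balls in Eq. (48).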

For $b = 2$ we define $\Sigma_1 = B^4$. The space $\Sigma_1$ has a 3-dimensional boundary whose structure is determined by the first and last vectors $a(I_1)$ and $a(I_b)$. It is either $T^1 \times S^2$, $S^3$, or a lens space $L(p, q)$, see Thm. 2. We may cut out from $\Sigma_1$ a tube $D^2 \times T$, and glue the boundary obtained in this way onto $\partial\Sigma_0$. The manifold $\Sigma$ obtained in this way is the desired $T$-space in the special case considered. The general case may be treated in a similar way, as we will discuss in a future paper [8]. We may call the manifold constructed from the decoration data $X[\hat\Sigma, \,\cdot\,, \{a(I_i)\}, \{q_i, p_i\}]$, where the second entry is an orientation of the orbit space, and $\hat\Sigma$ an oriented two-dimensional manifold with boundaries and corners.

$^{10}$ If $X, Y$ are sets and $f$ is a map $f : A \subset X \to Y$, then $X \cup_f Y$ is the set defined as the quotient of the disjoint union $X \sqcup Y$ by the equivalence relation $x \sim y :\Leftrightarrow (x, y) \in \mathrm{graph}\, f$.

We give some examples (without conical singularities):

Example 1. (from [43]) Let $s = 4$, $\hat\Sigma = D^2$, $\partial D^2 = I_1 \cup I_2 \cup I_3$, and consider the data $\{(1, 0), (0, 1), (1, 1)\}$. Then the space $X[D^2, \{(1, 0), (0, 1), (1, 1)\}]$ is the complex projective space $\mathbb{CP}^2 = (\mathbb{C}^3 \setminus \{0\})/\!\sim$, where the equivalence relation is $(z_1, z_2, z_3) \sim (\lambda z_1, \lambda z_2, \lambda z_3)$ and the action of $T = T^2$ is $[\tau_1, \tau_2] \cdot (z_1, z_2, z_3)_\sim = (e^{i\tau_1} z_1, e^{i\tau_2} z_2, z_3)_\sim$. The equivalence $X[D^2, \{(1, 0), (0, 1), (1, 1)\}] \cong \mathbb{CP}^2$ can be seen e.g. by noting that the axes in $\mathbb{CP}^2$ corresponding to the vectors $(1, 0), (0, 1), (1, 1)$ are given by the set of points $(z_1, z_2, z_3)_\sim \in \mathbb{CP}^2$ such that, respectively, $z_1 = 0$, $z_2 = 0$, $z_3 = 0$.

Example 2. Let $s = 4$, $\hat\Sigma = D^2$ and consider the data $\{(1, 0), (0, 1), (1, 0), (0, 1)\}$ (four intervals). Then the space $X[D^2, \{(1, 0), (0, 1), (1, 0), (0, 1)\}]$ is $S^2 \times S^2$, with the standard action of $T$. This is easily seen by considering the isotropy groups of the action. In fact, Examples 1 and 2 constitute in some sense the most general case in $s = 4$ because one can show [43,44] that, topologically, $\Sigma$ is a connected sum of projective spaces and $S^2 \times S^2$'s in the situation under consideration.

Example 3. Let $s = 5$, $\hat\Sigma = D^2$ and consider the data $\{(1, 0, 0), (q_1, q_2, p), (0, 1, 0)\}$. The constraints on the corners are fulfilled if we have $\mathrm{g.c.d.}(p, q_1) = 1 = \mathrm{g.c.d.}(p, q_2)$. The corresponding space $X[D^2, \{(1, 0, 0), (q_1, q_2, p), (0, 1, 0)\}]$ is a generalized lens space $L(p; q_1, q_2)$. The generalized lens space is defined as the quotient of $S^5$ (realized as the unit sphere in $\mathbb{C}^3$) by the discrete subgroup of isometries of order $p$ generated by an element $\lambda$ acting as $\lambda \cdot (z_1, z_2, z_3) = (e^{2\pi i/p} z_1, e^{2\pi i q_1/p} z_2, e^{2\pi i q_2/p} z_3)$. The action of $T = T^3$ on an equivalence class $(z_1, z_2, z_3)_\sim \in L(p; q_1, q_2)$ under this action is

\[
[\tau_1, \tau_2, \tau_3] \cdot (z_1, z_2, z_3)_\sim = \big( e^{i\tau_3/p} z_1,\ e^{i(\tau_1 + q_1 \tau_3/p)} z_2,\ e^{i(\tau_2 + q_2 \tau_3/p)} z_3 \big)_\sim .
\tag{49}
\]
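The coprimality conditions in Example 3 can be related to the freeness of the cyclic action defining $L(p; q_1, q_2)$. The following is a small sketch with a hypothetical helper name, not part of the paper. A power $\lambda^k$ has a fixed point on $S^5$ exactly when one of the three phases $e^{2\pi i k/p}$, $e^{2\pi i k q_1/p}$, $e^{2\pi i k q_2/p}$ is trivial, so the action is free if and only if none of the exponents $k$, $k q_1$, $k q_2$ vanishes modulo $p$ for $0 < k < p$.

```python
from math import gcd


def acts_freely(p, q1, q2):
    """Check that no non-trivial power of lambda fixes a point of S^5.

    lambda^k multiplies (z1, z2, z3) by phases with exponents
    k, k*q1, k*q2 modulo p; a fixed point exists iff one of them is 0.
    """
    for k in range(1, p):
        exponents = (k % p, (k * q1) % p, (k * q2) % p)
        if any(e == 0 for e in exponents):
            return False
    return True


if __name__ == "__main__":
    # Compare with gcd(p, q1) = 1 = gcd(p, q2) on a few illustrative values.
    for p, q1, q2 in [(5, 1, 2), (4, 2, 1), (6, 5, 1)]:
        coprime = gcd(p, q1) == 1 and gcd(p, q2) == 1
        print(p, q1, q2, acts_freely(p, q1, q2), coprime)
```

For the values tried here the brute-force check agrees with the g.c.d. criterion, as it should, since $k q \equiv 0 \pmod p$ has a solution with $0 < k < p$ exactly when $\mathrm{g.c.d.}(p, q) > 1$.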

The axes corresponding to the vectors $(1, 0, 0), (q_1, q_2, p), (0, 1, 0)$ are, respectively, $z_2 = 0$, $z_2 = z_3 = 0$, $z_3 = 0$. Note that $\pi_1(L(p; q_1, q_2)) \cong \mathbb{Z}_p$, so for $p \neq 1$ this space is not simply connected.

Example 4. Let $s$, $\hat\Sigma$ be as in the previous example, but let the data now be $\{(1, 0, 0), (q_1, q_2, p), (0, 1, 0), (1, 1, 0)\}$. The constraints on the corners are fulfilled if we have $\mathrm{g.c.d.}(p, q_1) = 1 = \mathrm{g.c.d.}(p, q_2)$. The manifold in question is now topologically (combining Examples 1 and 3)

\[
X[D^2, \{(1, 0, 0), (q_1, q_2, p), (0, 1, 0), (1, 1, 0)\}] \cong L(p; q_1, q_2) \,\#\, (\mathbb{CP}^2 \times S^1).
\tag{50}
\]

3.4. The orbit space of the domain of outer communication. We next want to determine the orbit space of a $D$-dimensional asymptotically Kaluza-Klein stationary black hole spacetime $(M, g)$ with $D - 3$ axial Killing fields $\psi_i$, $i = 1, \dots, D - 3$ generating an (effective) action of $T = T^{D-3}$. Thus, the total group of isometries is $G = T \times \mathbb{R}$, with $\mathbb{R}$ the additive group generated by the asymptotic timelike Killing field $t$. The asymptotic behavior of the spacetime is assumed, as always, to be given by $\mathbb{R}^{4,1} \times T^{D-5}$. We have the following theorem:

Theorem 4. Let $(M, g)$ be a stationary, asymptotically Kaluza-Klein, $D$-dimensional vacuum black hole spacetime with isometry group $G = \mathbb{R} \times T$, satisfying the technical assumptions stated in Sect. 2. Then the orbit space $\hat M = M/G$ of the domain of outer communication is a 2-dimensional manifold with boundaries and corners homeomorphic to a half-plane. In particular, there are no conical singularities in $\hat M$. The possibilities for the horizon topology are Eqs. (35), with $s = D - 1$. One of the boundary segments $I_j \subset \partial\hat M$ is the quotient of the horizon $\hat H = H/G$, while the remaining $I_j$ correspond to the various "axes", where $\sum_i a_i(I_j) \psi_i = 0$. The vectors $a(I_j) \in \mathbb{Z}^{D-3}$ are subject to the constraint (32) on each corner $I_i \cap I_j$.

Remarks. 1) In the statement concerning the horizon topology, Eq. (35), we do not mean that the torus factors (such as in $H \cong S^2 \times T^{D-4}$) correspond to the rotations in the extra dimensions near infinity. 2) If the asymptotic behavior of the spacetime is instead $\mathbb{R}^{3,1} \times T^{D-4}$, then the statement and proof of the theorem remain more or less unchanged. However, if it is $\mathbb{R}^{2,1} \times T^{D-3}$, then the orbit space is no longer homeomorphic to an upper half-plane, but instead is a plane minus a disk. We will not discuss this further here.

Proof. The "structure theorem" 4.3 of [7] states that $M$ contains a smooth, spacelike, acausal slice $\Sigma$ whose boundary is a cross section $H$ of the horizon, which is asymptotic to a $\tau = \text{const.}$ slice in the exterior under the identification of the exterior with (part of) $\mathbb{R}^{s,1} \times T^{D-s-1}$, see Eq. (83). Furthermore, $\Sigma$ is invariant under the action of $T = T^{D-3}$,

and it is transversal to the orbits of $t$ represented by the factor $\mathbb{R}$ in $G$. Also, if $F_\tau$ is the flow of $t$, then $M = \cup_\tau F_\tau(\Sigma)$. This result is going to allow us to reduce the proof of Thm. 4 to Thm. 1, and to use Lemma 6.

We first factor $M$ by $\mathbb{R}$. Then, because $M = \cup_\tau F_\tau(\Sigma)$, we can identify the resulting space with $(\Sigma, h)$, with $h$ the Riemannian metric induced from $g$. Evidently, $T$ acts as a group of isometries on $(\Sigma, h)$. Asymptotically, $h$ approaches the standard flat metric on $\mathbb{R}^4 \times T^{D-5}$, and the $T$-action is, by assumption, the standard action on this space in the asymptotic region. Namely, the action of $T = T^2 \times T^{D-5}$ is the product of the action of $T^2$ on $\mathbb{R}^4$ by rotations in the 12- and 34-plane, and of the action of $T^{D-5}$ on itself. It also leaves the horizon $H = -\partial\Sigma$ invariant, because this is the boundary. We would like to apply the classification results derived in the previous sections to the manifold with $T$-action $\Sigma$. However, there we assumed that $\Sigma$ is closed, and this is now evidently not the case. In fact, $\Sigma$ has an inner boundary $H$, and a (conformal) boundary at infinity. But we can reduce this case to the one which we have discussed by passing to the closed manifold

\[
\tilde\Sigma = \Sigma \cup X \cup (\{\mathrm{pt}\} \times T),
\tag{51}
\]

which is obtained by gluing in a suitable manifold $X$ with boundary $\partial X = H$ with $T$-action along the horizon, and another one at the end at infinity. These manifolds are glued across the boundaries in such a way that the resulting compactified space is orientable and carries a smooth action of $T$. It is quite obvious how this should be done at the end at infinity, because the action of $T$ is then conjugate in the exterior region to the standard action on $\mathbb{R}^4 \times T^{D-5}$ there, and we therefore will not describe it in detail. On the other hand, the choice of $X$ requires some comment.

First, since $H$ is itself a compact manifold of dimension $D - 2$ with an action of $T$, we can apply the classification (35). In the first two cases, we simply choose $X = B^4 \times T^{D-5}$ respectively $X = B^3 \times T^{D-4}$ with the standard actions, and it is then quite obvious from the proof of Thm. 2 that these actions will match up with that from $\Sigma$ across the boundary $H$. In the last case, it is not so obvious how to choose $X$, and we now explain this following mainly [44]. First, as we have seen in the proof of Thm. 2, there are two degenerate $T$-orbits in $H$ corresponding to places where a linear combination $\sum_i a_{i,1} \psi_i$ or $\sum_i a_{i,2} \psi_i$ vanishes. By Lemma 2, we can find a matrix $B \in SL(D - 3, \mathbb{Z})$ such that $a_1 B^T = (1, 0, \dots, 0)$, and $a_2 B^T = (q, p, 0, \dots, 0)$. Thus, redefining the axial Killing fields as $\psi_i \to \sum_j A_{ij} \psi_j$ with $A = B^{-1}$ if necessary, we can assume without loss of generality that $a_1 = (1, 0, \dots, 0)$ and $a_2 = (q, p, 0, \dots, 0)$. Now, if $pq = 0 \bmod 2$ it is possible to see that we can find a continued fraction representation of $\frac{q}{p}$,

\[
\frac{q}{p} = n_1 - \cfrac{1}{n_2 - \cfrac{1}{n_3 - \cfrac{1}{\ddots - \cfrac{1}{n_k}}}}
\tag{52}
\]

in such a way that there are $(u_1, v_1), \dots, (u_k, v_k) \in \mathbb{Z}^2$ with

\[
n_k = \det\begin{pmatrix} 0 & u_2 \\ 1 & v_2 \end{pmatrix}, \quad
n_{k-1} = \det\begin{pmatrix} u_1 & u_3 \\ v_1 & v_3 \end{pmatrix}, \quad
n_{k-2} = \det\begin{pmatrix} u_2 & u_4 \\ v_2 & v_4 \end{pmatrix}, \quad \dots, \quad
n_1 = \det\begin{pmatrix} u_{k-1} & p \\ v_{k-1} & q \end{pmatrix}.
\tag{53}
\]
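An expansion of the form (52), with minus signs, can be computed for any rational number by repeatedly rounding up. The sketch below shows this generic "ceiling" algorithm using exact rational arithmetic. It is only one way of producing integers $n_1, \dots, n_k$ satisfying Eq. (52); it does not impose any additional conditions that the construction in the text may place on the $n_i$, and it does not produce the vectors $(u_i, v_i)$ of Eq. (53). The function names are hypothetical.

```python
from fractions import Fraction
import math


def negative_continued_fraction(q, p):
    """Integers [n_1, ..., n_k] with q/p = n_1 - 1/(n_2 - 1/(... - 1/n_k))."""
    if p <= 0:
        raise ValueError("p must be a positive integer")
    x = Fraction(q, p)
    coefficients = []
    while True:
        n = math.ceil(x)          # round up, so that 0 <= n - x < 1
        coefficients.append(n)
        remainder = n - x
        if remainder == 0:
            return coefficients
        x = 1 / remainder         # x > 1, with strictly smaller denominator


def evaluate(coefficients):
    """Evaluate n_1 - 1/(n_2 - 1/(... - 1/n_k)) exactly."""
    value = Fraction(coefficients[-1])
    for n in reversed(coefficients[:-1]):
        value = n - 1 / value
    return value


if __name__ == "__main__":
    ns = negative_continued_fraction(7, 3)
    print(ns, evaluate(ns))
```

The loop terminates because the denominator of `x` strictly decreases at each step; all coefficients after the first are at least 2, so the back-substitution in `evaluate` never divides by zero.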

Then our choice for $X$ is

\[
X = \big( B^4 \cup_{f_1} (B^4 \cup_{f_2} (\cdots \cup_{f_k} B^4)) \big) \times T^{D-5} = X[\tfrac{1}{2} D^2, \{(u_i, v_i, 0, \dots, 0)\}],
\tag{54}
\]

where the gluing maps $f_i$ are the same as in Eq. (47), and the action of $T = T^2 \times T^{D-5}$ on $X$ is the $T^2$-action described there on the first Cartesian factor, and it is the standard $T^{D-5}$ action on itself on the second Cartesian factor. This action matches precisely that of $\Sigma$ across the joint boundary $\partial X = H = -\partial\Sigma$. The orbit space $\hat X = \tfrac{1}{2} D^2$ is half of a 2-disk, with the cut corresponding to the orbit space $\hat H = H/T$ of the horizon, and with the semi-circle corresponding to the various "axes" of the $T$-action. When $pq$ is odd, a similar construction can be made.

We are now in a position to apply Thm. 1 to $\tilde\Sigma$, which tells us that its orbit space is given by a handle body $\hat\Sigma_g$ with $g$ handles, with orbifold points (conical singularities) and with removed disks,

\[
\tilde\Sigma/T \cong \hat\Sigma_g \setminus \bigcup_{j=1}^{d} D_j^2 .
\tag{55}
\]

To rule out the presence of handles, removed disks, and points with conical singularities, we now use the topological censorship theorem for asymptotically Kaluza-Klein spaces [9], see also [17,18]. This theorem implies that any curve $\gamma$ with endpoints in the asymptotic region $\Sigma_\infty$ of $\Sigma$ can be continuously deformed to a curve entirely within $\Sigma_\infty$, and it follows from our gluing construction that the same is still true in the compactified space $\tilde\Sigma$. Furthermore, any closed loop in $\Sigma_\infty$ is homotopic to a closed loop in $\{\mathrm{pt}\} \times T$. These facts together imply that if $f : \{\mathrm{pt}\} \times T \to \tilde\Sigma$ is the embedding map, then $f_* : \pi_1(\{\mathrm{pt}\} \times T) \to \pi_1(\tilde\Sigma)$ is surjective. We can now apply Lemma 6, and thereby conclude that there can be no handles, removed disks, nor conical singularities, and that $\tilde\Sigma/T \cong D^2$. We now recall that $\tilde\Sigma$ was the union (51), the pieces of which are separately invariant, and whose orbit spaces are, respectively, $X/T \cong \tfrac{1}{2} D^2$ and $(\{\mathrm{pt}\} \times T)/T \cong \{\mathrm{pt}\}$. Thus, $\hat\Sigma = \Sigma/T = \tfrac{1}{2} D^2 \setminus \{\mathrm{pt}\}$ is a half-disk $\tfrac{1}{2} D^2$, with a point $\{\mathrm{pt}\}$ removed somewhere on the arc-shaped boundary. In other words, $\hat\Sigma$ itself is homeomorphic to a half-plane, and there are no conical singularities. The boundary component of $\tfrac{1}{2} D^2 \setminus \{\mathrm{pt}\}$ along the straight cut is the horizon interval $I_H$, i.e. it is coming from the quotient of $H$ by $T$. The point $\{\mathrm{pt}\}$ corresponds to the point at infinity in the upper half plane picture. Since $\hat M = M/G \cong \hat\Sigma$, this proves the theorem.

The proof of the theorem also implies the following corollary:

Corollary 1. Topologically, the domain of outer communication $M$ is given by

\[
M \cong \mathbb{R} \times (\tilde\Sigma \setminus \{\mathrm{pt}\} \setminus X),
\tag{56}
\]

where "pt" represents the point at infinity, where $\tilde\Sigma$ is a compact, connected manifold without boundary with $T$-action, and where $X$ (the 'black hole') is a compact connected manifold with $T$-action and boundary $\partial X = H$. Furthermore, the action of $T$ has no points with discrete isotropy group.

Remark. In $D = 5$, the corollary implies together with results of [43,44], that

\[
\tilde\Sigma \cong \#\, k \cdot \mathbb{CP}^2 \ \#\, k' \cdot \overline{\mathbb{CP}}^2 \ \#\, l \cdot (S^2 \times S^2).
\tag{57}
\]

Note that a generic compact 4-manifold could also contain $K3$ and $\overline{K3}$'s. If one additionally assumes that $M$ is spin, then the complex projective spaces are absent in the decomposition. All known black hole solutions in fact have $k = k' = l = 0$, and this may well be the only possibility. In $D = 7$, a similar decomposition applies if the second Stiefel-Whitney class of the spacetime vanishes. Then we have a decomposition of the type [42]:

\[
\tilde\Sigma \cong \#\, k \cdot (S^2 \times S^4) \ \#\, (k - 1) \cdot (S^3 \times S^3).
\tag{58}
\]

This pattern presumably persists in all dimensions, but we have not been able to show this.

4. Stationary Vacuum Black Holes in D Dimensions

4.1. Canonical coordinates. In the previous section, we looked at the topology of the domain of outer communication $M$ and the structure of the orbits of the symmetries. In this section, we investigate the spacetime metric, i.e. the implications of the Einstein equations $R_{ab} = 0$. These equations imply a set of coupled differential equations for the metric on the two-dimensional factor space $\hat M$, described above in Thm. 4. To understand these equations in a geometrical way, we note that the projection $\pi : M \to M/G = \hat M$ (with $G = T^{D-3} \times \mathbb{R}$ the isometry group) defines a $G$-principal fibre bundle over the interior of $\hat M$, because we argued in the previous section that such points correspond to points in the domain of outer communication with trivial isotropy group. At each point $x \in M$ in a fibre over $\pi(x)$ in the interior of $\hat M$, we may uniquely decompose the tangent space at $x$ into a subspace of vectors tangent to the fibres, and a space $W_x$ of vectors orthogonal to the fibres. Evidently, the distribution of vector spaces $W_x$ is invariant under the group $G$ of symmetries, and hence forms a "horizontal bundle" in the terminology of principal fibre bundles [33]. According to standard results in the theory of principal fibre bundles [33], a horizontal bundle is equivalent to the specification of a $G$-gauge connection $\hat D$ on the factor space, whose curvature we denote by $\hat F = T_I \hat F^I_{\alpha\beta}\, dx^\alpha \wedge dx^\beta$, with $T_I$, $I = 0, \dots, D - 3$ the generators of the abelian group $G$. Greek indices $\alpha, \beta, \dots$ take the values 1, 2. The horizontal bundle gives an isomorphism $W_x \to T_{\pi(x)} \hat M$ for any $x$, and this isomorphism may be used to uniquely construct a smooth covariant tensor field $\hat t_{\alpha\beta\dots\gamma}$ on the interior of $\hat M$ from any smooth $G$-invariant covariant tensor field $t_{ab\dots c}$ on $M$.
For example, the metric $g_{ab}$ on $M$ thereby gives rise to a symmetric tensor $\hat g_{\alpha\beta}$ on $\hat M$. One can show with a significant amount of labor [8] (see also [7]) that the $D - 2$ dimensional subspaces spanned by the Killing fields at points of $M$ corresponding to interior points of $\hat M$ always contain a timelike vector. Hence the bilinear form induced from $g_{ab}$ on $W_x$ has signature $(++)$, so $\hat g_{\alpha\beta}$ is in fact a Riemannian metric. We let $\hat D$ act on ordinary tensors $\hat t_{\alpha\beta\dots\gamma}$ as the Levi–Civita connection of $\hat g_{\alpha\beta}$, with Ricci tensor denoted $\hat R_{\alpha\beta}$.

By performing the well-known Kaluza-Klein reduction of the metric $g_{ab}$ along the orbits of $G$, we can locally write the Einstein equations as a system of equations on the interior of the factor space $\hat M$ in terms of the metric $\hat g_{\alpha\beta}$, the components $\hat F^I_{\alpha\beta}$ of the curvature, and the Gram matrix field $G_{IJ}$,

\[
G_{IJ} = g(X_I, X_J), \qquad
X_I = \begin{cases} t & \text{if } I = 0, \\ \psi_i & \text{if } I = i = 1, \dots, D - 3. \end{cases}
\tag{59}
\]

The resulting equations are similar in nature to the "Einstein-equations" on $\hat M$ for $\hat g_{\alpha\beta}$, coupled to the "Maxwell fields" $\hat F^I_{\alpha\beta}$ and the "scalar fields" $G_{IJ}$, see [5,32]. We will not write these equations down here, as we will not need them in this most general form. In our case, the equations simplify considerably because one can show (see e.g. [7]) that the distribution of horizontal subspaces $W_x$ is locally integrable, i.e., locally tangent to a family of two-dimensional submanifolds. In that case, the connection is flat, $\hat F^I_{\alpha\beta} = 0$, and the dimensionally reduced equations may be written as

\[
\hat D^\alpha \big( r\, G^{-1} \hat D_\alpha G \big) = 0,
\tag{60}
\]

together with

\[
\hat R_{\alpha\beta} = \hat D_\alpha \hat D_\beta \log r - \frac{1}{4} \mathrm{Tr}\big( \hat D_\alpha G^{-1}\, \hat D_\beta G \big).
\tag{61}
\]

Greek indices have been raised with $\hat g^{\alpha\beta}$. The equations are well-defined a priori only at points in the interior of $\hat M$ where the Gram determinant

\[
r^2 = -\det G
\tag{62}
\]

does not vanish. Chruściel has shown [8] (based on previous work of Carter [2] and also of [7]) that $r^2 > 0$ away from the boundary of $\hat M$. The reduced Einstein equations are hence well-defined there. On the other hand, $r$ vanishes on any boundary component $I_j$ of $\hat M$ corresponding to an axis, i.e. where a linear combination $\sum_i a_i(I_j) \psi_i$ vanishes, because the Gram matrix then has a non-trivial kernel. It also vanishes on the segment of $\partial\hat M$ corresponding to the horizon $H$, because the span of $X_I$, $I = 0, \dots, D - 3$ is tangent to $H$ and hence a null space, with the signature of $G$ consequently being $(0 + + \cdots +)$ there. Taking the trace of the first reduced Einstein equation (60), one finds that $r$ is a harmonic function on the interior of $\hat M$,

\[
\hat D^\alpha \hat D_\alpha r = 0.
\tag{63}
\]

Since $\hat M$ is an (orientable) simply connected 2-dimensional analytic manifold with connected boundary and corners by Thm. 1, we may map it analytically to the upper complex half plane $\{\zeta \in \mathbb{C} \mid \mathrm{Im}\, \zeta > 0\}$ by the Riemann mapping theorem. Furthermore, since $r$ is harmonic, we can introduce a harmonic scalar field $z$ conjugate to $r$,

\[
\hat D_\alpha z = \hat\epsilon_{\alpha\beta} \hat D^\beta r,
\tag{64}
\]

where $\hat\epsilon_{\alpha\beta}$ is the anti-symmetric tensor on $\hat M$ satisfying $\hat\epsilon^{\alpha\beta} \hat\epsilon_{\alpha\beta} = 2$. Thus both $r, z$ are harmonic functions on $(\hat M, \hat g)$, and $r = 0$ on $\partial\hat M$. Combining this with the fact that $\hat M$ is homeomorphic to a half-plane, one can argue (see e.g. [7, 6.3] or [53]) that $r$ and $z$ are globally defined coordinates, and identify $\hat M$ with $\{z + ir \in \mathbb{C} \mid r > 0\}$. In these coordinates, the metric $\hat g$ globally takes the form

\[
\hat g = e^{2\nu(r,z)} (dr^2 + dz^2).
\tag{65}
\]

Since Eq. (60) is invariant under conformal rescalings of $\hat g_{\alpha\beta}$, and since a 2-dimensional metric is conformally flat, it decouples from Eq. (61). In fact, writing the Ricci tensor $\hat R_{\alpha\beta}$ of (65) in terms of $\nu$, one sees that Eq. (61) may be used to determine $\nu$ by a simple integration, see e.g. [21] for details.

The boundary $r = 0$ of $\hat M$ consists of several segments according to our Classification Theorem 4. In the description of $\hat M$ as the upper complex half plane $\hat M = \{z + ir \in \mathbb{C} \mid r > 0\}$, these are represented by a collection of intervals $\{I_j\}$ of the $z$-axis. The length of the $j$-th interval as measured by the coordinate $z$ is called $l(I_j)$. Because the coordinates $(r, z)$ were canonically defined, the numbers $l(I_j) \ge 0$ are invariantly defined, i.e. are the same for isometric spacetimes. Each segment is either an axis for which there is a vector $a(I_j) \in \mathbb{Z}^{D-3}$ such that $\sum_i a_i(I_j) \psi_i = 0$, or it corresponds to the horizon. In that case, we put the corresponding vector to zero, $a_H = 0$, because no non-trivial linear combination of the axial Killing fields vanishes in the interior of the corresponding interval $I_H$, see Thm. 4. Concerning the length $l_H$ of the horizon segment, we have the following lemma.

Lemma 7. The length of the horizon interval satisfies

\[
(2\pi)^{D-3}\, l_H = \kappa A_H,
\tag{66}
\]

where $A_H$ is the area of the horizon cross section $H$, and where $\kappa > 0$ is the surface gravity.

The proof of Lemma 7 is given in Appendix A. We call the collection of real positive numbers $\{l(I_j)\}$ and integer vectors $\{a(I_j)\}$ associated with the intervals the "interval structure" of the spacetime. As we explained in the previous section, the collection $\{a(I_j)\}$ determines the manifold structure of $M$ and the action of $G$ on this space up to diffeomorphism. In particular, the vector fields $X_I$ are determined up to diffeomorphism. Furthermore, if we are given $G_{IJ}$ and $\hat g$ (i.e., $\nu$) as functions of $r, z$, then we can reconstruct the metric $g$ of the spacetime in the domain of outer communication. In a local coordinate system consisting of $r, z$ and $\xi^I$, $I = 0, \dots, D - 3$, such that the Killing fields are given by $X_I = \partial/\partial\xi^I$, the metric locally takes the form

\[
g = e^{2\nu(r,z)} (dr^2 + dz^2) + G_{IJ}(r, z)\, d\xi^I d\xi^J .
\tag{67}
\]
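As a numerical sanity check of the relation $r^2 = -\det G$ in Eq. (62), one can use the flat Kaluza-Klein background, whose Gram matrix is diagonal with entries $-1$, $\rho(1 - \cos\theta)$, $\rho(1 + \cos\theta)$ and $1$ for each compact extra dimension, where $r = \rho \sin\theta$, $z = \rho \cos\theta$, and where $r = R_1 R_2$, $z = \tfrac{1}{2}(R_1^2 - R_2^2)$ in terms of the radii $R_1, R_2$ in the 12- and 34-planes. The sketch below uses hypothetical function names and takes $\theta = \operatorname{atan2}(r, z)$ as an assumed branch choice. It confirms that $\rho = \tfrac{1}{2}(R_1^2 + R_2^2)$ and that $-\det G_0 = r^2$ at sample points.

```python
import math


def canonical_coordinates(R1, R2):
    """Flat-space canonical coordinates r = R1*R2, z = (R1^2 - R2^2)/2."""
    return R1 * R2, 0.5 * (R1 ** 2 - R2 ** 2)


def polar_coordinates(r, z):
    """(rho, theta) with r = rho*sin(theta), z = rho*cos(theta)."""
    return math.hypot(r, z), math.atan2(r, z)


def flat_gram_diagonal(rho, theta, extra_dims=0):
    """Diagonal of the flat Gram matrix: t, psi_1, psi_2, then extra circles."""
    return [-1.0,
            rho * (1.0 - math.cos(theta)),
            rho * (1.0 + math.cos(theta))] + [1.0] * extra_dims


def minus_det(diagonal):
    """-det of a diagonal matrix given by its diagonal entries."""
    return -math.prod(diagonal)


if __name__ == "__main__":
    R1, R2 = 1.3, 0.7  # arbitrary sample radii
    r, z = canonical_coordinates(R1, R2)
    rho, theta = polar_coordinates(r, z)
    print(rho, 0.5 * (R1 ** 2 + R2 ** 2))
    print(minus_det(flat_gram_diagonal(rho, theta, extra_dims=2)), r ** 2)
```

The same numbers also give the flat conformal factor $e^{2\nu} = 1/(2\sqrt{r^2 + z^2}) = 1/(2\rho)$.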

For M = R4,1 × T D−5 with the standard flat metric g0 , the axial symmetries are the rotations in the 12-plane of R4,1 generated by the Killing field ψ1 , the rotations in the 34-plane of R4,1 generated by the Killing field ψ2 and the rotations of the D − 5 compact extra dimensions generated by Killing fields ψ3 , . . . , ψ D−3 . The coordinates r, z

as constructed above are given by r = R1 R2 and z = 21 (R12 − R22 ), with R1 = x12 + x22 and R2 = x32 + x42 , and with xi the standard spatial Cartesian coordinates of R4,1 . The


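conformal factor is given by $e^{2\nu} = 1/(2\sqrt{r^2 + z^2})$, and the Gram matrix of $g_0$ is given by
$$G_0 = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & \rho(1-\cos\theta) & 0 & 0 \\ 0 & 0 & \rho(1+\cos\theta) & 0 \\ 0 & 0 & 0 & \delta_{ij} \end{pmatrix}, \tag{68}$$
where $i, j = 3, \dots, D-3$. Here, we have introduced the coordinates $\rho, \theta$, which are related to $r, z$ by
$$r = \rho\sin\theta, \quad z = \rho\cos\theta, \tag{69}$$
or
$$\rho = \frac{1}{2}(x_1^2 + x_2^2 + x_3^2 + x_4^2), \quad \theta = \arctan\frac{2\sqrt{(x_1^2 + x_2^2)(x_3^2 + x_4^2)}}{x_1^2 - x_3^2 + x_2^2 - x_4^2} \tag{70}$$
in terms of the spatial Cartesian coordinates $x_i$ of $\mathbb{R}^{4,1}$. The metric $g_0$ is hence given explicitly by
$$\begin{aligned} g_0 = {} & -d\tau^2 + \frac{1}{2\rho}(d\rho^2 + \rho^2 d\theta^2) + \rho(1-\cos\theta)\, d\varphi_1^2 + \rho(1+\cos\theta)\, d\varphi_2^2 \\ & + \sum_{i=3}^{D-3} d\varphi_i^2, \end{aligned} \tag{71}$$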

where the last line is the metric of the small dimensions $T^{D-5}$.

4.2. Asymptotic behavior. A general $D$-dimensional asymptotically Kaluza-Klein spacetime $(M, g)$ with asymptotically flat 5-dimensional part will differ from the flat model in the asymptotic region by terms of order $O(R^{-2})$, with $R = \sqrt{x_1^2 + x_2^2 + x_3^2 + x_4^2}$, by our general assumptions about the asymptotic behavior of the spacetime metric, see Eq. (1). We would like to know exactly what this means in terms of our canonical coordinates $(r, z)$ or alternatively $(\rho, \theta)$, as we will need this type of information later in the proof of our classification theorem. In the spacetime $M = \mathbb{R}^{4,1} \times T^{D-5}$ with standard metric $g_0$, we had $\rho = \frac{1}{2}(R_1^2 + R_2^2)$ and $\tan\theta = 2R_1R_2/(R_1^2 - R_2^2)$, where $R_1 = \sqrt{x_1^2 + x_2^2}$ and $R_2 = \sqrt{x_3^2 + x_4^2}$, but in a spacetime whose metric is only asymptotic to $g_0$ up to terms of order $O(R^{-2})$, this will no longer be the case exactly. In order to analyze this issue, we recall that, in the asymptotic region of $(M, g)$, we assumed the coordinates to be chosen in such a way that the Killing fields are given exactly by $\psi_1 = x_1\partial_{x_2} - x_2\partial_{x_1}$, $\psi_2 = x_3\partial_{x_4} - x_4\partial_{x_3}$, and $t = \partial_\tau$, as well as $\psi_i = \partial_{\varphi_i}$ for $i = 3, \dots, D-3$. It is then evident from Eq. (1) that
$$\begin{aligned}
g(t,t) &= -1 + O(R^{-2}), & g(t,\psi_1) &= R_1\, O(R^{-2}), & g(t,\psi_2) &= R_2\, O(R^{-2}), \\
g(\psi_1,\psi_1) &= R_1^2(1 + O(R^{-2})), & g(\psi_2,\psi_2) &= R_2^2(1 + O(R^{-2})), & g(\psi_1,\psi_2) &= R_1R_2\, O(R^{-2}), \\
g(\psi_i,\psi_j) &= \delta_{ij} + O(R^{-2}), & g(\psi_i,t) &= O(R^{-2}), & & \\
g(\psi_i,\psi_1) &= R_1\, O(R^{-2}), & g(\psi_i,\psi_2) &= R_2\, O(R^{-2}), & &
\end{aligned} \tag{72}$$


where $i, j = 3, \dots, D-3$. These equations determine the Gram matrix $G$, whose determinant (62) gives us the coordinate $r$. We find
$$r = R_1R_2(1 + O(R^{-2})). \tag{73}$$
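For the exact flat background, where the $O(R^{-2})$ corrections are absent, the relations between the Cartesian coordinates and the canonical coordinates $(r, z)$ and $(\rho, \theta)$ can be checked numerically. The following is a minimal sketch; the function name is hypothetical, and the use of `atan2` to select the branch $\theta \in [0, \pi]$ of the arctangent in Eq. (70) is an assumption made for the illustration.

```python
import math

def canonical_coordinates(x1, x2, x3, x4):
    """Map Cartesian coordinates of flat R^{4,1} to (r, z, rho, theta).

    Uses r = R1*R2, z = (R1^2 - R2^2)/2 and rho = (R1^2 + R2^2)/2.
    theta is taken in [0, pi] via atan2(r, z), which is one way to
    fix the branch of the arctan in Eq. (70).
    """
    R1 = math.hypot(x1, x2)
    R2 = math.hypot(x3, x4)
    r = R1 * R2
    z = 0.5 * (R1 ** 2 - R2 ** 2)
    rho = 0.5 * (R1 ** 2 + R2 ** 2)
    theta = math.atan2(r, z)
    return r, z, rho, theta
```

With this choice one has $r = \rho\sin\theta$ and $z = \rho\cos\theta$ identically, since $r^2 + z^2 = \frac{1}{4}(R_1^2 + R_2^2)^2 = \rho^2$.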

The second canonical coordinate $z$ was defined to be the dual harmonic coordinate to $r$, see Eq. (64). For this, we need
$$\hat g = dR_1^2 + dR_2^2 + O(R^{-2}), \quad \hat\epsilon = (1 + O(R^{-2}))\, dR_1 \wedge dR_2, \tag{74}$$
which follows from the definitions of the orbit space metric and the asymptotic conditions. This leads to
$$dz = (1 + O(R^{-2}))\,(R_1\, dR_1 - R_2\, dR_2) + O(R^{-2}). \tag{75}$$
When we integrate this we find
$$z = \frac{1}{2}(R_1^2 - R_2^2) + O(1), \tag{76}$$
as we could have anticipated. As above, let us define $r = \rho\sin\theta$, $z = \rho\cos\theta$. Inverting the relations for $(r, z)$ just given in terms of $(R_1, R_2)$ gives, after some straightforward analysis:
$$R_{1/2}^2 = \rho(1 \pm \cos\theta)(1 + O(\rho^{-1})) + O(\sin^2\theta), \tag{77}$$

and in particular $R^2 = 2\rho(1 + O(\rho^{-1}))$. Inserting this into Eq. (1) and taking into account Eqs. (72) then delivers the following asymptotic form of the metric $g$:
$$\begin{aligned}
g = {} & g_0 + O(1)(d\rho^2 + \rho^2 d\theta^2) + O(\rho^{-1})\, d\tau^2 + O(1)(1+\cos\theta)\, d\varphi_1^2 + O(1)(1-\cos\theta)\, d\varphi_2^2 \\
& + O(1)\sin\theta\, d\varphi_1 d\varphi_2 + \sum_{i,j=3}^{D-3} O(\rho^{-1})\, d\varphi_i d\varphi_j + \sum_{i=3}^{D-3} O(\rho^{-1})\, d\varphi_i d\tau \\
& + O(\rho^{-1/2})(1+\cos\theta)^{1/2}\, d\varphi_1 d\tau + O(\rho^{-1/2})(1-\cos\theta)^{1/2}\, d\varphi_2 d\tau \\
& + \sum_{i=3}^{D-3} O(\rho^{-1/2})(1+\cos\theta)^{1/2}\, d\varphi_1 d\varphi_i + \sum_{i=3}^{D-3} O(\rho^{-1/2})(1-\cos\theta)^{1/2}\, d\varphi_2 d\varphi_i,
\end{aligned} \tag{78}$$
where $g_0$ is the flat background metric given above in Eq. (71). We emphasize that these estimates hold uniformly in $\rho, \theta$, including the axes $\theta = 0, \pi$. This will be needed later. Similar expansions can be carried out when the number of asymptotically large spatial dimensions is 3, but when this number is $\le 2$, the analysis would be substantially different.

Using Einstein’s equations, one would expect that it is possible to determine the asymptotics of the metric $g$ in considerably more detail than (78), and we now outline how this can be done. However, we emphasize that for the purposes of this paper, already Eq. (78) will be sufficient. First, one writes $G = G_0 F$, with $G_0$ the diagonal Gram matrix for $\mathbb{R}^{4,1} \times T^{D-5}$ given above. The matrix function $F$ represents the corrections and it satisfies the second order non-linear elliptic equation
$$\hat D_\alpha(F^{-1}\hat D^\alpha F) + (\hat D_\alpha F^{-1})\, G_0^{-1}(\hat D^\alpha G_0)\, F = 0,$$
and $F \to I$ for $\rho \to \infty$. By doubling the half space $\hat M$ across its boundary, one would then expect to be able to show that $F$ satisfies an asymptotic expansion of the general form
$$F \sim I + \sum_{n,m \ge 1} F_{n,m}(\sin\theta, \cos\theta)\, \rho^{-n}\log^m\rho \tag{79}$$

for large $\rho$. If we assume that such an asymptotic expansion indeed holds for $F$, then it is straightforward to determine explicitly its first terms. We do not give the details of the straightforward but somewhat lengthy calculation but only quote the solution. It can be stated as saying that
$$F = I + \rho^{-1}\begin{pmatrix} -2M & B_1 & B_2 & b_i \\ B_1\rho^{-1} & M - A_1 & C\rho^{-1} & c_i\rho^{-1} \\ B_2\rho^{-1} & C\rho^{-1} & M - A_2 & d_i\rho^{-1} \\ b_i & c_i & d_i & h_{ij} \end{pmatrix} + \cdots. \tag{80}$$
Here, dots represent higher terms in the asymptotic expansion, and the quantities $M, A_{1,2}, B_{1,2}, C, h_{ij}, b_i, c_i, d_i$ are undetermined real constants, and $i, j$ range through $3, \dots, D-3$ in this block matrix. Because we must have $-\det G = r^2$, they are subject to the constraint $\det F = 1$, from which it follows that
$$A_1 + A_2 = \sum_{i=3}^{D-3} h_{ii}.$$
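The trace condition can be illustrated numerically: at first order in $1/\rho$, $\det F - 1$ is governed by the trace of the coefficient matrix in Eq. (80), which equals $\sum_i h_{ii} - A_1 - A_2$. The sketch below is only a check under stated assumptions: it takes $D = 6$ (a single small dimension, so $h_{ij}$ is a single number `h`), assumes the expansion starts at the identity as required by $F \to I$, and uses a hand-written determinant so that no external library is needed. All function names are hypothetical.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def F_leading(rho, M, A1, A2, B1, B2, C, b, c, d, h):
    """I + X/rho with X the matrix of Eq. (80), specialised to D = 6."""
    X = [
        [-2.0 * M, B1, B2, b],
        [B1 / rho, M - A1, C / rho, c / rho],
        [B2 / rho, C / rho, M - A2, d / rho],
        [b, c, d, h],
    ]
    return [[(1.0 if i == j else 0.0) + X[i][j] / rho for j in range(4)]
            for i in range(4)]

def first_order_det_coefficient(rho, **params):
    """rho * (det F - 1); tends to h - A1 - A2 as rho grows."""
    return rho * (det(F_leading(rho, **params)) - 1.0)
```

For large `rho` the returned value approaches `h - A1 - A2`, so requiring $\det F = 1$ at this order forces $A_1 + A_2 = h$ in this toy setting.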

According to Eq. (64), we are still free to change the coordinate $z$ by adding a constant. This will result in adding a constant $\pm\eta$ to $A_{1/2}$, and we may thus fix the remaining ambiguity in $z$ in order to set $A_1 = A_2 = A$. We will do this in the following. The asymptotic form of the conformal factor $e^{2\nu}$ can similarly be determined by the second reduced Einstein equation, Eq. (61), together with the asymptotic form of the Gram matrix $G = G_0 F$. Again, we omit the straightforward but somewhat lengthy calculation and give only the result, which is
$$e^{2\nu} = \frac{1}{2\rho} + \frac{M - A}{4\rho^2} + \cdots, \tag{81}$$
where the dots represent terms that go to zero faster as $\rho \to \infty$. Thus, in a coordinate system $(\tau, \rho, \theta, \varphi_1, \dots, \varphi_{D-3})$ such that
$$t = \partial/\partial\tau, \quad \psi_i = \partial/\partial\varphi_i, \quad i = 1, \dots, D-3, \tag{82}$$

we obtain the following asymptotic form of the metric Eq. (67) for large $\rho$.

Asymptotic form of the metric for stationary black hole spacetime with $D - 3$ axial Killing fields, behaving as $\mathbb{R}^{4,1} \times T^{D-5}$ near infinity:
$$\begin{aligned}
g = {} & -\Big(1 - \frac{2M}{\rho}\Big)\, d\tau^2 + \frac{1}{2\rho}\Big(1 + \frac{M - A}{2\rho}\Big)(d\rho^2 + \rho^2 d\theta^2) \\
& + \rho(1-\cos\theta)\Big(1 + \frac{M - A}{\rho}\Big)\, d\varphi_1^2 + \rho(1+\cos\theta)\Big(1 + \frac{M - A}{\rho}\Big)\, d\varphi_2^2 \\
& + \sum_{i,j=3}^{D-3}\Big(\delta_{ij} + \frac{h_{ij}}{\rho}\Big)\, d\varphi_i d\varphi_j + \frac{2C\sin^2\theta}{\rho}\, d\varphi_1 d\varphi_2 \\
& + \frac{2B_1(1-\cos\theta)}{\rho}\, d\varphi_1 d\tau + \frac{2B_2(1+\cos\theta)}{\rho}\, d\varphi_2 d\tau + \frac{2}{\rho}\sum_{i=3}^{D-3} b_i\, d\varphi_i d\tau \\
& + \frac{2(1+\cos\theta)}{\rho}\sum_{i=3}^{D-3} d_i\, d\varphi_2 d\varphi_i + \frac{2(1-\cos\theta)}{\rho}\sum_{i=3}^{D-3} c_i\, d\varphi_1 d\varphi_i + \cdots,
\end{aligned} \tag{83}$$
where the dots represent terms that are higher order in $1/\rho$, and where $A = \frac{1}{2}\sum_{i=3}^{D-3} h_{ii}$. The constants $b_i$ are proportional to the angular momenta of the solutions in the asymptotically small dimensions, and the constants $B_{1/2}$ are proportional to the two independent angular momenta in the asymptotically large dimensions. They can be defined e.g. by the Komar expressions
$$J_i = \int_{S^3 \times T^{D-5}} *d\psi_i, \tag{84}$$

where the integration is over a surface at infinity, and where $d\psi_i$ denote the 2-forms obtained by taking the exterior differential of $\psi_i$ after lowering the index. The constant $M$ is related to the ADM-mass of the solution, see e.g. [31, Sec. 3], and the constant $A$ to the “tension” of the small extra dimensions.

We finally remark that the analysis given above would be different for different numbers of asymptotically large dimensions. For example, for $M = \mathbb{R}^{3,1} \times T^{D-4}$, the axial symmetries may be taken as the rotations in the 12-plane of $\mathbb{R}^{3,1}$ and rotations of the $D - 4$ compact extra dimensions. The functions $r, z$ are then given by $r = \sqrt{x_1^2 + x_2^2}$ and $z = x_3$, with $x_i$ the standard spatial Cartesian coordinates on $\mathbb{R}^{3,1}$. The conformal factor is just $e^{2\nu} = 1$. For a general $D$-dimensional asymptotically Kaluza-Klein spacetime with asymptotically flat 4-dimensional part, we may again derive an expression for the asymptotic form of the metric which is similar but not identical to that given above. In the case $M = \mathbb{R}^{2,1} \times Y$, we have again $r = \sqrt{x_1^2 + x_2^2}$, but $z$ can now be e.g. a periodic coordinate, depending on $Y$. For example if $Y = T^{D-3}$, then $z$ parametrizes the orbit space $T = T^{D-3}/T^{D-4}$. The analysis of a general spacetime with this asymptotic behavior would also be quite different. In the case $M = \mathbb{R}^{1,1} \times Y$, the function $r$ is simply a constant, and the definition of the $(r, z)$ coordinates is not possible any longer.

5. Uniqueness Theorem for Stationary Black Holes with $(D - 3)$ Axial Symmetries

In the previous two sections, we have analyzed stationary black hole spacetimes that are asymptotically $\mathbb{R}^{4,1} \times T^{D-5}$, and that have an isometry group $G = \mathbb{R} \times T^{D-3}$. We have derived a number of “invariants” associated with such solutions:

• We showed that the orbit space of the domain of outer communication by $G$ is a half plane $\hat M = \{z + ir \mid r > 0\}$. The boundary of the half-plane is divided into a finite collection of intervals $\{I_j\}$. With each interval, there is associated its length $l(I_j) \in \mathbb{R}_{>0}$ (for a half infinite interval, this would be $\infty$), and a vector $a(I_j) \in \mathbb{Z}^{D-3}$ subject to the normalization (31). One of the intervals corresponds to the orbit space $\hat H$ of the horizon and is associated with the zero vector, while the others correspond to an “axis” in spacetime, i.e. points where the linear combination $\sum_i a_i(I_j)\psi_i$ vanishes. For adjacent intervals $I_j$ and $I_{j+1}$ (not including the horizon), there is a compatibility condition stating that the collection of minors $Q_{kl} \in \mathbb{Z}$, $1 \le k < l \le D-3$, given by
$$Q_{kl} = \left|\det\begin{pmatrix} a_k(I_{j+1}) & a_k(I_j) \\ a_l(I_{j+1}) & a_l(I_j) \end{pmatrix}\right| \tag{85}$$
have greatest common divisor g.c.d.$\{Q_{kl}\} = 1$, see the discussion around (32). The data $\{l(I_j)\}$ together with $\{a(I_j)\}$ were called the “interval structure”.
• Because the spacetime is asymptotically Kaluza-Klein, we can define its mass, and the angular momenta $\{J_i\}$ corresponding to the axial Killing fields, $i = 1, \dots, D-3$. Some of the angular momenta correspond to the large, and some to the small (extra) dimensions.
• The asymptotic form of the metric (83) contains additional real parameters $\{h_{ij}\}, \{c_i\}, \{d_i\}, \dots$ which are related to the asymptotic metric on the tori generated by the axial Killing fields $\psi_i$, $i = 1, \dots, D-3$, in the region of spacetime near infinity. These numbers are invariantly defined.
• The collection of angular velocities $\{\Omega_i\}$, the surface gravity $\kappa$, and horizon area.

It is natural to ask the following questions: Is the spacetime $(M, g)$ under consideration uniquely determined by the above data? To what extent can the data be specified independently? The following theorem provides an answer to the first question and a partial answer to the second question.

Theorem 5. There can be at most one stationary, asymptotically Kaluza-Klein spacetime $(M, g)$ with $D - 3$ axial Killing fields and 5 asymptotically large dimensions, satisfying the technical assumptions stated in Sect. 2, for a given interval structure $\{a(I_j), l(I_j)\}$ and a given set of angular momenta $\{J_i\}$, $i = 1, \dots, D-3$.

This uniqueness theorem is the main result of this paper. The same result is true if the number of asymptotically large dimensions is only 4. The only difference in the proof would be the analysis of the asymptotic behavior. For 3 or fewer asymptotically large dimensions, we still expect a result of this type to be true, and the proof to be similar. However, in that case, the nature of the orbit space would also be different, so the differences in the proof would presumably be greater.
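The compatibility condition of Eq. (85) on the integer vectors of two adjacent axis intervals is easy to test mechanically. The following minimal sketch computes the minors $Q_{kl}$ and their greatest common divisor; the function names are hypothetical, and the sample vectors in the usage note are generic illustrations rather than data taken from a particular solution.

```python
from itertools import combinations
from math import gcd

def minors(a_next, a_prev):
    """Absolute values |Q_kl|, k < l, of the 2x2 minors in Eq. (85).

    a_next plays the role of a(I_{j+1}) and a_prev that of a(I_j);
    both are integer sequences of the same length D - 3 >= 2.
    """
    if len(a_next) != len(a_prev) or len(a_next) < 2:
        raise ValueError("need two integer vectors of equal length >= 2")
    return [abs(a_next[k] * a_prev[l] - a_prev[k] * a_next[l])
            for k, l in combinations(range(len(a_next)), 2)]

def compatible(a_next, a_prev):
    """True if the greatest common divisor of all minors equals 1."""
    g = 0
    for q in minors(a_next, a_prev):
        g = gcd(g, q)       # gcd(0, q) == q, so g starts neutral
    return g == 1
```

For instance, `compatible((1, 0), (0, 1))` holds, whereas `compatible((1, 0), (1, 2))` fails because the single minor equals 2, and two parallel vectors fail because all minors vanish.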
A consequence of the theorem is that the interval structure and angular momenta uniquely determine the other invariants mentioned above, such as e.g. the mass of the spacetime. In $D = 4$ with no extra dimensions, the only non-trivial interval structure for a single black hole spacetime is given by the intervals $(-\infty, -z_0], [-z_0, z_0], [z_0, \infty)$. The middle interval corresponds to the horizon, while the half-infinite ones correspond to the axis of the rotational Killing field. The interval vectors $a(I_j)$ are 1-dimensional integer vectors in this case and hence trivial. For each $z_0 > 0$ and for each angular momentum $J$, there exists a solution given by the appropriate member of the Kerr family of metrics. Thus, the Kerr metrics exhaust all possible stationary, axially symmetric single black hole spacetimes (satisfying the technical assumptions stated in Sec. 2). This is of course just the classical uniqueness theorem for the Kerr solution [1,2,24,38,48], see [7] for a rigorous account. The mass $m$ of the non-extremal Kerr solution characterized by $z_0, J$ is related to these parameters by $z_0 = \sqrt{m^2 - J^2/m^2} > 0$. Hence the uniqueness theorem may be stated equivalently in terms of $m$ and $J$, which is more commonly done. Note that the length of the horizon interval, $l_H = 2z_0$, tends to zero in the extremal limit, in accordance with Lemma 7. In higher dimensions, one may similarly derive relations between the interval structure and angular momenta on the one side, and the other invariants on the other side for any given solution. Such formulae are provided for the Myers-Perry or black-ring solutions e.g. in [21], but they would not be expected to be universal. Of course, for most interval structures it is not known whether there actually exists a solution, so in this sense much less is known in higher dimensions than in $D = 4$.

Proof of Thm. 5. We will show that the domains of outer communication of any two spacetimes as in the theorem must be isometric. It then follows from analyticity that they are globally isomorphic, including the interior of the black hole. Without assuming analyticity, one could obtain information about the interior if one can generalize the argument given in [16] based on the characteristic initial value formulation of the Einstein equations from $D = 4$ to higher dimensions. We will not do this here. The key step is to define from the reduced Einstein equations (60) a set of equations which describe the difference between two solutions as described in the theorem. This formulation is due to [35,38], see also [37], and it involves certain potentials which we define first. We first consider the twist 1-forms,
$$\omega_i = *(\psi_1 \wedge \cdots \wedge \psi_{D-3} \wedge d\psi_i), \quad i = 1, \dots, D-3, \tag{86}$$

where the Killing fields have been identified with 1-forms via the metric. Using the vacuum field equations and standard identities for Killing fields [52], one shows that these 1-forms are closed, $d\omega_i = 0$. Since the Killing fields commute, the twist forms are invariant under $G$, and so we may define corresponding 1-forms $\hat\omega_i$ on the interior of the factor space $\hat M = \{z + ir \in \mathbb{C} \mid r > 0\}$. These 1-forms are again closed. Thus, the “twist potentials”
$$\chi_i = \int_0^{\hat x} \hat\omega_i \tag{87}$$
are globally defined on $\hat M$ and independent of the path connecting 0 and the point $\hat x \in \hat M$, and $d\chi_i = \hat\omega_i$. The twist potentials and the Gram matrix of the axial Killing fields $f_{ij} = g(\psi_i, \psi_j)$ satisfy a system of coupled differential equations on $\hat M$ which follow from the reduced Einstein equation (60). They are
$$0 = \hat D_\alpha\big[ r(\det f)^{-1}\chi^i \hat D^\alpha\chi_i + r\hat D^\alpha\log\det f \big], \tag{88}$$
$$0 = \hat D_\alpha\big[ r(\det f)^{-1} f^{ij}\hat D^\alpha\chi_j \big], \tag{89}$$
$$0 = \hat D_\alpha\big[ r f^{jk}\hat D^\alpha f_{ki} + r(\det f)^{-1} f^{jk}\chi_i\hat D^\alpha\chi_k \big], \tag{90}$$
$$0 = \hat D_\alpha\big[ -r\hat D^\alpha\chi_i + r\chi_i\hat D^\alpha\log\det f + r(f^{jk}\hat D^\alpha f_{ij})\chi_k + r(\det f)^{-1}\chi^j(\hat D^\alpha\chi_j)\chi_i \big]. \tag{91}$$
Here we are using the summation convention and $f^{ij}$ denotes the components of the inverse of the matrix $f_{ij}$, which is used to raise indices on $\chi_i$. To verify these equations,


it is necessary to use the relations
$$\hat D_\alpha\alpha^i = r(\det f)^{-1}\,\hat\epsilon_\alpha{}^\beta f^{ij}\hat D_\beta\chi_j, \tag{92}$$
as well as
$$\beta = f^{ij}\alpha_i\alpha_j - (\det f)^{-1} r^2 \tag{93}$$
for the scalar products $\alpha_i = g(t, \psi_i)$ and $\beta = g(t, t)$. Again, $\alpha^i$ means $f^{ij}\alpha_j$. The above equations can be written in a compact matrix form. For this, one introduces the $(D-2) \times (D-2)$ matrix field $\Phi$, which is written in an obvious block-matrix notation as
$$\Phi = \begin{pmatrix} (\det f)^{-1} & -(\det f)^{-1}\chi_i \\ -(\det f)^{-1}\chi_i & f_{ij} + (\det f)^{-1}\chi_i\chi_j \end{pmatrix}. \tag{94}$$
The matrix $\Phi$ satisfies $\Phi^T = \Phi$, $\det\Phi = 1$, and is positive semi-definite, being the sum of two positive semi-definite matrices. Hence it may be written in the form $\Phi = S^T S$ for some matrix $S$ of determinant $\pm 1$, i.e. up to sign $S \in SL(D-2, \mathbb{R})$. Because $S$ is defined only up to $S \to RS$, where $R$ is a rotation, we can think of it as an element of the right coset $SL(D-2, \mathbb{R})/SO(D-2, \mathbb{R})$, and this coset therefore parametrizes the possible $\Phi$. Equations (88)–(91) can be stated in terms of $\Phi$ as
$$\hat D_\alpha(r\,\Phi^{-1}\hat D^\alpha\Phi) = 0. \tag{95}$$
These can be viewed as the equations for a sigma-model on the target space $SL(D-2, \mathbb{R})/SO(D-2, \mathbb{R})$. We will not use this viewpoint, but it could be used to give an alternative proof of our theorem.

Consider now two black hole solutions $(M, g)$ and $(\tilde M, \tilde g)$ as in the statement of the theorem. We denote the corresponding matrices defined as above by $\Phi$ and $\tilde\Phi$, and we use the same “tilde” notation to distinguish any other quantities associated with the two solutions. $M$ as a manifold with a $G$-action is uniquely determined by the interval structure modulo diffeomorphisms preserving the action of $G$, and similarly for the tilde spacetime. Therefore, since the interval structures are assumed to be the same for both spacetimes, $M$ and $\tilde M$ are isomorphic as manifolds with a $G$-action, and we may hence assume that $\tilde t = t$, $\tilde\psi_i = \psi_i$ for $i = 1, \dots, D-3$, and we may also assume that $\tilde r = r$ and $\tilde z = z$. Consequently, it is possible to combine the divergence identities (95) for the two solutions into a single identity on the upper complex half plane, called “Mazur identity”. It is given by
$$\hat D_\alpha(r\hat D^\alpha\sigma) = r\,\hat g^{\alpha\beta}\,\mathrm{Tr}\big(\hat N_\alpha^T\hat N_\beta\big), \tag{96}$$
and it can be proven in almost exactly the same way as the identity given in [38]. Here, we have written
$$\sigma = \mathrm{Tr}(\tilde\Phi\Phi^{-1} - I), \quad \hat N_\alpha = \tilde S^{-1}\big(\tilde\Phi^{-1}\hat D_\alpha\tilde\Phi - \Phi^{-1}\hat D_\alpha\Phi\big)S, \tag{97}$$
where in turn $S$ and $\tilde S$ are matrices such that $\Phi = S^T S$ and $\tilde\Phi = \tilde S^T\tilde S$ hold. The key point about the Mazur identity (96) is that on the left side we have a total divergence, while the term on the right hand side is non-negative. This structure can be exploited in various ways. In this paper, we follow a strategy invented by Weinstein [53,54], which differs from that originally devised by Mazur.


The basic idea is to view $r, z$ as cylindrical coordinates in an auxiliary space $\mathbb{R}^3$ consisting of the points $x = (r\cos\gamma, r\sin\gamma, z)$, and to view $\sigma$ as a rotationally symmetric function defined on this $\mathbb{R}^3$, minus the $z$-axis. The Mazur identity then gives
$$\Delta_x\sigma \ge 0 \quad \text{on } \mathbb{R}^3 \setminus \{z\text{-axis}\}, \tag{98}$$
where $\Delta_x$ is the ordinary Laplacian on $\mathbb{R}^3$. As we will show, $\sigma$ is globally bounded on $\mathbb{R}^3$, including at infinity and the $z$-axis. Furthermore, we claim that $\sigma \ge 0$ at any point away from the axis: Writing $F = \tilde S S^{-1}$, we have $\sigma = \mathrm{Tr}(F^T F) - (D-2)$. Now, $F^T F \ge 0$, and $\det F^T F = \det\tilde\Phi\,\det\Phi^{-1} = 1$, so we may bring $F^T F$ into the form $\mathrm{diag}(e^{u_1}, \dots, e^{u_{D-3}}, e^{-u_1 - \cdots - u_{D-3}})$ by a similarity transformation. Thus, $\sigma$ will be non-negative if and only if
$$\frac{1}{D-2}\big(e^{u_1} + \cdots + e^{u_{D-3}} + e^{-u_1 - \cdots - u_{D-3}}\big) \ge 1, \tag{99}$$
which in turn follows directly because the exponential function is convex. Thus, we are in a position to apply the maximum principle arguments in [54], which imply that $\sigma = 0$ everywhere. As we now see, this implies that the metrics $g$ and $\tilde g$ are isometric on the domain of outer communication, thus proving the theorem.

First, $\sigma = 0$ implies that $u_1 = \cdots = u_{D-3} = 0$, and hence that $\tilde\Phi = \Phi$ everywhere in $\hat M$. Therefore, the twist potentials and the Gram matrices of the axial Killing fields are identical for the two solutions, $\tilde f_{ij} = f_{ij}$ and $\tilde\chi_i = \chi_i$. To see that the other scalar products between the Killing fields coincide for the two solutions, let $\alpha_i = g(t, \psi_i)$, $\beta = g(t, t)$ as above, and define similarly the scalar products $\tilde\alpha_i, \tilde\beta$ for the other spacetime. The right side of Eq. (92) does not depend upon the conformal factor $\nu$, so since $\tilde\chi_i = \chi_i$ and $\tilde f_{ij} = f_{ij}$, it also follows that $\tilde\alpha_i = \alpha_i$ up to a constant. That constant has to vanish, since it vanishes at infinity. Furthermore, from Eq. (93) we have $\tilde\beta = \beta$. Thus, all scalar products of the Killing fields are equal for the two solutions, $\tilde G_{IJ} = G_{IJ}$ on the entire upper half plane. Viewing now the second reduced Einstein equation (61) as an equation for $\nu$ respectively $\tilde\nu$, and bearing in mind that $\nu = \tilde\nu$ at infinity, one concludes that $\tilde\nu = \nu$. Thus, summarizing, if we could show that the field $\sigma$ of Eq. (97) is globally bounded on $\mathbb{R}^3$, then we would know that $\tilde G_{IJ} = G_{IJ}$, $\tilde r = r$, $\tilde z = z$ and $\tilde\nu = \nu$. Since $\tilde t = t$, $\tilde\psi_i = \psi_i$, it would follow from Eqs. (67) and (65) that $\tilde g = g$ in the domain of outer communication.

Thus, it remains to be shown that $\sigma$ is globally bounded, including near the $z$-axis (corresponding to $\partial\hat M$) and near infinity. It is here that the assumptions of the theorem about the interval structures and angular momenta are needed.
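The convexity argument behind Eq. (99) can be illustrated numerically. The sketch below evaluates $\sigma = \mathrm{Tr}(F^T F) - (D-2)$ directly from the exponents $u_1, \dots, u_{D-3}$ of the diagonal form and samples it on random inputs; the function names, the sampling range and the seed are arbitrary choices made for the illustration.

```python
import math
import random

def sigma_from_exponents(u):
    """sigma for F^T F = diag(e^{u_1}, ..., e^{u_n}, e^{-(u_1+...+u_n)}).

    Here n = D - 3, so the matrix has D - 2 eigenvalues whose
    product is 1, and sigma is their sum minus (D - 2).
    """
    eigenvalues = [math.exp(x) for x in u] + [math.exp(-sum(u))]
    return sum(eigenvalues) - len(eigenvalues)

def smallest_sigma(num_exponents, samples=1000, seed=0):
    """Minimum of sigma over random exponent vectors in [-3, 3]^n."""
    rng = random.Random(seed)
    return min(
        sigma_from_exponents([rng.uniform(-3.0, 3.0) for _ in range(num_exponents)])
        for _ in range(samples)
    )
```

By the arithmetic-geometric mean inequality, which is the convexity statement used in the text, `sigma_from_exponents` is non-negative and vanishes only when all exponents are zero.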
Away from the $z$-axis in $\mathbb{R}^3$ (or equivalently, the boundary of $\hat M = \{z + ir \mid r > 0\}$), all fields $f_{ij}, \chi_i, \nu, \tilde f_{ij}, \tilde\chi_i, \tilde\nu$ and the matrix inverses are smooth (in fact analytic) functions, and this consequently also applies to $\sigma$, which is just an algebraic function of these. However, $\sigma$ could diverge at infinity, or the boundary $\partial\hat M$. To see that this is not the case, we cover $\hat M$ (including the boundary) with different open neighborhoods (1)–(5). These neighborhoods are chosen as follows: (1) is an open neighborhood of $(z_j + \delta, z_{j+1} - \delta)$, where $I_j = (z_j, z_{j+1}) \subset \partial\hat M$ corresponds to a rotation axis. The quantity $\delta > 0$ has been introduced so that the neighborhood does not include any intersection points of two axes, or an axis with the horizon. (2) is a similar neighborhood, where the interval $I_H$ in question now represents the horizon. (3) covers the outside of a large ball $r^2 + z^2 \le R^2$, i.e. infinity. We also need to consider separately (4) open neighborhoods of the intersection points $z_j$ of adjacent intervals $I_{j-1} = (z_{j-1}, z_j)$ and $I_j = (z_j, z_{j+1})$ which do not represent the horizon, and (5) open neighborhoods of the intersection points of the horizon interval and the adjacent intervals representing axes. The cases (4) and (5) represent “corners” of the orbit space $\hat M$, which were “straightened out” by introducing the coordinates $(r, z)$.

(1) Axis. On each segment $z \in I_j = (z_j, z_{j+1})$, $r = 0$ of the boundary $\partial\hat M$ representing an axis, we know that the null spaces of the Gram matrices $f_{ij}$ and $\tilde f_{ij}$ coincide, because we are assuming that the interval structures of both solutions are identical. Furthermore, from Eq. (87), and from the fact that $\hat\omega_i$ vanishes on any axis by definition, the twist potentials $\chi_i$ are constant on the $z$-axis outside of the segment $(z_h, z_{h+1})$ representing the horizon. The difference between the constant value of $\chi_i$ on the $z$-axis left and right to the horizon segment can be calculated as follows:
$$\begin{aligned}
\chi_i(r = 0, z_h) - \chi_i(r = 0, z_{h+1}) &= \int_{z_h}^{z_{h+1}} \hat\omega_i = \frac{1}{(2\pi)^{D-3}}\int_H *(d\psi_i) \\
&= \frac{1}{(2\pi)^{D-3}}\int_{S^3 \times T^{D-5}} *(d\psi_i) = \frac{1}{(2\pi)^{D-3}}\, J_i.
\end{aligned}$$
The first equality follows from the definition of the twist potentials, the second from the defining formula for the twist 1-forms and the fact that these are invariant under the action of the $D - 3$ independent rotation isometries each with period $2\pi$ (with $H$ a horizon cross section), the third equation follows from Gauss’ theorem and the fact that $d(*d\psi_i) = 0$ because $\psi_i$ is a Killing vector on a Ricci-flat manifold, and the last equality follows from the Komar expression for the angular momentum. The analogous expressions hold in the spacetime $(\tilde M, \tilde g)$. Because we assume that $\tilde J_i = J_i$, we can add constants to $\chi_i$, if necessary, so that $\chi_i = \tilde\chi_i$ on the axis. From the definition (87), it then follows that in fact $\chi_i - \tilde\chi_i = O(r^2)$ near any axis, or more precisely, near any compact subset of the open interval $I_j = (z_j, z_{j+1})$ of the boundary $\partial\hat M$.
In order to analyze the behavior of $\sigma$ near $(z_j + \delta, z_{j+1} - \delta)$, we now calculate
$$\sigma = -1 + \frac{\det f}{\det\tilde f} + \frac{f^{ij}(\chi_i - \tilde\chi_i)(\chi_j - \tilde\chi_j)}{\det\tilde f} + f^{ij}(\tilde f_{ij} - f_{ij}). \tag{100}$$
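Equation (100) follows from Eqs. (94) and (97) by a direct matrix computation, and this can be confirmed numerically. The sketch below does so for $D = 5$, where $f$ is a $2 \times 2$ matrix and the matrix of Eq. (94) is $3 \times 3$. It is only an illustration: the helper names are hypothetical, the sample values are arbitrary, and the small hand-written linear algebra routines stand in for a numerical library.

```python
def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (assumes m is invertible)."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def det2(f):
    return f[0][0] * f[1][1] - f[0][1] * f[1][0]

def phi(f, chi):
    """Block matrix of Eq. (94) for a 2x2 Gram matrix f and potentials chi."""
    a = 1.0 / det2(f)
    top = [a] + [-a * c for c in chi]
    rest = [[-a * chi[i]] + [f[i][j] + a * chi[i] * chi[j] for j in range(2)]
            for i in range(2)]
    return [top] + rest

def sigma_trace(f, chi, ft, chit):
    """sigma = Tr(Phi~ Phi^{-1} - I), as in Eq. (97)."""
    prod = matmul(phi(ft, chit), inverse(phi(f, chi)))
    return sum(prod[i][i] for i in range(3)) - 3.0

def sigma_formula(f, chi, ft, chit):
    """Right side of Eq. (100), with f^{ij} the inverse of the untilded f."""
    finv = inverse(f)
    dchi = [chi[i] - chit[i] for i in range(2)]
    middle = sum(finv[i][j] * dchi[i] * dchi[j] for i in range(2) for j in range(2))
    last = sum(finv[i][j] * (ft[i][j] - f[i][j]) for i in range(2) for j in range(2))
    return -1.0 + det2(f) / det2(ft) + middle / det2(ft) + last
```

For positive definite sample matrices `f` and `ft`, the two functions agree up to rounding, and both vanish when the tilded and untilded data coincide.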

We wish to show that each term on the right side is uniformly bounded near $(z_j + \delta, z_{j+1} - \delta)$. Let $a(I_j) \in \mathbb{Z}^{D-3}$ be the vector generating the kernel of the matrix $f$ on our interval $I_j$. By Lemma 2, we can find a matrix $B \in SL(D-3, \mathbb{Z})$ such that $a(I_j)B^T = (1, 0, \dots, 0)$. Thus, redefining the axial Killing fields as $\psi_i \to \sum_j A_{ij}\psi_j$ with $A = B^{-1}$ if necessary, we can assume without loss of generality that $a(I_j) = (1, 0, \dots, 0)$, so that the axis under consideration corresponds to zeros of $\psi_1$. Let $x \in M$ be a point on this axis. By Lemma 4, we can introduce coordinates $(\tau, y_1, \dots, y_{D-1})$ in a neighborhood of $x$ such that the action of the rotational part of the isometry group takes the form described in Lemma 4, and such that the time-part of the isometry group acts by simply shifting $\tau$ by a constant. In other words, the isometry group acts locally by shifting $\tau, y_4, \dots, y_{D-1}$, and it acts by rotating $y_1 + iy_2$ by a phase. Thus, if we define $R = \sqrt{y_1^2 + y_2^2}$, $Y = y_3$, then $(Y, R) \in \mathbb{R} \times \mathbb{R}_{>0}$ are a coordinate system that displays the nature of $\hat M$ as a manifold with boundary near $(z_j + \delta, z_{j+1} - \delta)$. It also follows that any analytic function on $M$ that is defined near $x$ and is invariant under the action of $G$ has an absolutely convergent expansion in $Y$ and even powers of $R$ locally near $R = 0$. In particular this applies to the function $r^2 = -\det G$. Since $r = 0$ when $R = 0$, and since $g = dR^2 + R^2 d\varphi_1^2 + \dots$ near $x$, we see from Eq. (67) that $r^2 = R^2(a + O(R^2))$ near our point on the axis for some $a > 0$. Thus, we may exchange $O(R^2)$ with $O(r^2)$, and similarly, we may exchange $O(\tilde R^2)$ (defined with respect to corresponding coordinates near $\tilde x \in \tilde M$) for $O(r^2)$.

Actually, by arguments parallel to those in the proof of Lemma 7 and [21, Sect. 3], the matrix $f$ takes the following form near $(z_j + \delta, z_{j+1} - \delta)$:
$$f = \begin{pmatrix} r^2 e^{2\nu} + O(r^4) & O(r^2) \\ O(r^2) & d_{ij} + O(r^2) \end{pmatrix}, \tag{101}$$

where $d_{ij}$ is an invertible $(D-4)$-dimensional square matrix, and similarly for $\tilde f_{ij}$ of the second solution (here $e^{2\nu}$ is replaced by $e^{2\tilde\nu}$ and $d_{ij}$ by $\tilde d_{ij}$). It follows from this expression and Eq. (100) that $\sigma$ is finite near $(z_j + \delta, z_{j+1} - \delta)$, i.e. near any axis away from places where they intersect each other or the horizon, and away from infinity.

(2) Horizon. On any compact sub-interval of the interval $I_H$ associated with the horizon, the matrices $f_{ij}, \tilde f_{ij}$ are invertible, so $\sigma$ is finite there.

(3) Infinity. Near infinity, we use the asymptotic form of the metric (78), where we recall that $r = \rho\sin\theta$, $z = \rho\cos\theta$. This asymptotic form holds of course for both $g$ and $\tilde g$, and gives us the asymptotic behavior of $f_{ij}$ and $\tilde f_{ij}$ since we can simply read off the scalar products of the axial Killing fields. We thereby immediately find that
$$f - \tilde f = \begin{pmatrix} O(1)(1+\cos\theta) & O(1)\sin\theta & O(\rho^{-1/2})(1+\cos\theta)^{1/2} \\ O(1)\sin\theta & O(1)(1-\cos\theta) & O(\rho^{-1/2})(1-\cos\theta)^{1/2} \\ O(\rho^{-1/2})(1+\cos\theta)^{1/2} & O(\rho^{-1/2})(1-\cos\theta)^{1/2} & O(\rho^{-1}) \end{pmatrix}. \tag{102}$$
It is also easy to estimate the size of the matrix elements of the inverse; we find
$$f^{-1} = \rho^{-2}\sin^{-2}\theta\,(1 + O(\rho^{-1})) \begin{pmatrix} \rho(1-\cos\theta) & O(1)\sin\theta & O(\rho^{1/2})(1-\cos\theta)^{1/2} \\ O(1)\sin\theta & \rho(1+\cos\theta) & O(\rho^{1/2})(1+\cos\theta)^{1/2} \\ O(\rho^{1/2})(1-\cos\theta)^{1/2} & O(\rho^{1/2})(1+\cos\theta)^{1/2} & \rho^2\sin^2\theta\,(\delta_{ij} + O(\rho^{-1})) \end{pmatrix}, \tag{103}$$
with the same expression for $\tilde f^{-1}$, and we also find
$$\det f = \rho^2\sin^2\theta\,(1 + O(\rho^{-1})), \tag{104}$$


again with the same expression for $\det\tilde f$. It is important to emphasize that these expressions hold for large $\rho$, and uniformly in $\theta$ including the axis $\theta = 0, \pi$. From these expressions we immediately conclude that
$$-1 + \frac{\det f}{\det\tilde f} = O(\rho^{-1}), \quad f^{ij}(\tilde f_{ij} - f_{ij}) = O(\rho^{-1}), \tag{105}$$
again uniformly in $\theta$, including the axis. These expressions show that the first two, and last term in Eq. (100) go to zero at infinity (including at the axis). For the middle term in Eq. (100), we also need to analyze the twist potentials $\chi_i$ and $\tilde\chi_i$. To evaluate the twist potentials at a generic coordinate $(\rho, \theta)$, we may take a path in Eq. (87) that moves outwards to infinity along the axis $\theta = 0$, and then follows a half circle of constant $\rho$ in the asymptotic region, or alternatively one that moves outwards along the axis $\theta = \pi$. Either path will give the same result, because the twist 1-forms are closed. On the axis and away from the horizon, the twist 1-forms vanish, whereas in the asymptotic region, we can use the asymptotic form of the metric, Eq. (78) derived above, and the corresponding asymptotic expansion of the twist 1-forms $\omega_i$. In the coordinates $(\rho, \theta)$, the projected twist 1-forms $\hat\omega_i$ on $\hat M$ are found to be given by
$$\hat\omega_i = O(\rho^{1/2})\sin\theta\, d\theta + O(\sin\theta)\, O(\rho^{-1/2})\, d\rho. \tag{106}$$
The same expression holds for the twist 1-forms of the metric $\tilde g$.$^{12}$ We now form the difference between the twist 1-forms for both metrics $g$ and $\tilde g$, and we integrate this difference along the paths just described. The twist potentials $\chi_i$ respectively $\tilde\chi_i$ are already known to be identical on the axis $\theta = 0, \pi$ for sufficiently large $\rho$, since we have already argued that they have to be proportional to $J_i$ and $\tilde J_i$ respectively, and we are assuming that $\tilde J_i = J_i$. Therefore, the integration gives
$$\chi_i - \tilde\chi_i = O(\sin^2\theta)\, O(\rho^{1/2}). \tag{108}$$
Together with the estimates for $f^{ij}$ and $\det\tilde f$ given above this yields
$$\frac{f^{ij}(\chi_i - \tilde\chi_i)(\chi_j - \tilde\chi_j)}{\det\tilde f} = O(\rho^{-1}), \tag{109}$$
uniformly in $\theta$, including the axis $\theta = \pi, 0$. Thus, we have shown that all three terms in $\sigma$ in Eq. (100) are of order $O(\rho^{-1})$ uniformly in $\theta$ including the axis. Thus, $\sigma$ tends to zero near infinity, as we wanted to show.

12 A more accurate analysis using the asymptotics (83) would give the better estimate
$$\hat\omega_i = (2\pi)^{3-D} J_i \times \begin{cases} \frac{1}{2}(1 - (-1)^i\cos\theta)\sin\theta\,(1 + O(\rho^{-1}))\, d\theta + O(\sin^2\theta)\, d\rho & \text{for } i = 1, 2, \\ \sin\theta\,(1 + O(\rho^{-1}))\, d\theta + O(\sin^2\theta)\, d\rho & \text{for } i = 3, \dots, D-3, \end{cases} \tag{107}$$
and corresponding expressions for the potentials.


(4) An axis meets the horizon. Consider two adjacent intervals I H , I H +1 , where I H represents the horizon. On open interval I H , the matrices f and f˜ are non-singular, because no linear combination of the axial Killing fields ψi can vanish at any point of the horizon away from the axis. On I H +1 a linear combination a(I H +1 )ψi = 0 vanishes. By Lemma 2, we can find a matrix B ∈ S L(D − 3, Z) such that T a(I . , 0). Thus, redefining the axial Killing fields as ψi → H +1 )B = (1, 0, . .−1 A ψ and A = B if necessary, we can assume without loss of generality j ij j that a(I H +1 ) = (1, 0, . . . , 0), so that ψ1 = 0 at points represented by I H +1 . Shifting the z-coordinate by a constant if necessary, we may also assume that I H = (z 1 , 0), Ih+1 = (0, z 2 ), so that the intersection point of interest is (r, z) = (0, 0). Let x ∈ M be a point on the horizon that is also on the axis where ψ1 = 0, so ˆ By Lemma 4, that the orbit Ox corresponds to (0, 0) in the upper half plane M. we can introduce coordinates (τ, y1 , . . . , y D−1 ) in a neighborhood of x such that the action of the rotational part of the isometry group takes the form described in Lemma 4, and such that the time-part of the isometry group acts by simply shifting τ by a constant. In other words, the isometry group acts locally by shifting τ, y4 , . . . , y D−1 , and it acts by rotating y1 +i y2 by a phase. It is then straightforward to see that ∂/∂ y3 must be orthogonal to the horizon and outward pointing in the ˆ tangent space at x. Thus, we can parametrize the orbit space M by the coordinates R = y12 + y22 and Y = y3 . Outside the horizon, the range of these coordinates is {(R, Y ) | R > 0, Y > Y (R)}, where Y (R) is an analytic function which has an expansion in even powers of R, and for which Y (0) = 0. 
Actually, as shown in [7], by modifying the coordinates by a polynomial transformation, one can furthermore achieve that as many derivatives of $Y(R)$ at $R = 0$ as desired are made to vanish, and we are assuming that this has been done. The choice of coordinates $(R, Y)$ displays the nature of the orbit space as a manifold with a corner at the place where the horizon meets the axis. The canonical orbit space coordinates $(r, z)$ in effect straighten out this corner. Any function $F$ on $M$ that is analytic in an open neighborhood of $x$ and is invariant under $G$ has a convergent power series expansion in even powers of $R$,
\[ F(x) = \sum_{m \in \mathbb{Z},\, n \in 2\mathbb{Z}} F_{m,n}\, Y^m R^n, \tag{110} \]

as is in particular the case for any component of $f_{ij}$ or $\chi_i$. In fact, since $R = 0$ corresponds to the places near $x$ where $\psi_1 = 0$, we have
\[ f = \begin{pmatrix} R^2 (1 + O(R^2)) & R^2\, O(1) \\ R^2\, O(1) & d_{ij} + O(R^2 + Y^2) \end{pmatrix}, \tag{111} \]
where $d_{ij}$ is an invertible $(D-4)$-dimensional matrix, and that $\chi_i = J_i + O(R^2)$. It follows that the matrix inverse of $f_{ij}$ is of the form
\[ f^{-1} = \begin{pmatrix} R^{-2}\, O(1) & O(1) \\ O(1) & O(1) \end{pmatrix}. \tag{112} \]
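The step from (111) to (112) is a standard block inversion. The following is only a sketch, assuming that the $O(1)$ entries are bounded and that the lower-right block remains invertible near $R = Y = 0$; the names $\alpha$, $b$, $\mathcal{D}$ and $S$ are introduced here for illustration and do not appear in the original argument.

```latex
% Sketch: block inversion behind (112).
% alpha, b, \mathcal{D}, S are illustrative names, not notation from the paper.
\[
f = \begin{pmatrix} R^2 \alpha & R^2 b^T \\ R^2 b & \mathcal{D} \end{pmatrix},
\qquad \alpha = 1 + O(R^2), \quad b = O(1), \quad \mathcal{D} = d + O(R^2 + Y^2).
\]
% Schur complement of the lower-right block:
\[
S = R^2 \alpha - R^4\, b^T \mathcal{D}^{-1} b = R^2 \bigl(1 + O(R^2)\bigr).
\]
% Standard block-inverse formula; every entry except the top-left one stays bounded.
\[
f^{-1} = \begin{pmatrix}
S^{-1} & -S^{-1} R^2\, b^T \mathcal{D}^{-1} \\
-\mathcal{D}^{-1} b\, R^2 S^{-1} & \mathcal{D}^{-1} + \mathcal{D}^{-1} b\, R^4 S^{-1}\, b^T \mathcal{D}^{-1}
\end{pmatrix}
= \begin{pmatrix} R^{-2}\, O(1) & O(1) \\ O(1) & O(1) \end{pmatrix}.
\]
```

Only the top-left entry is singular as $R \to 0$, because $R^2 S^{-1} = O(1)$ tames every other block.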

The exact same analysis can obviously also be carried out for $\tilde M$. Here, we have coordinates $(\tilde R, \tilde Y)$, and a corresponding expansion for $\tilde f_{ij}$, $\tilde\chi_i$. In order to show that σ is bounded in a neighborhood of the horizon where an axis meets the horizon, we use Eq. (100), and the estimates for $f_{ij}$, $\chi_i$ resp. $\tilde f_{ij}$, $\tilde\chi_i$


that we have just derived. To make this work, however, we must understand first the relation (map) $(R, Y) \to (r, z)$ and $(\tilde R, \tilde Y) \to (r, z)$ near $(0, 0)$. A detailed analysis of these maps was carried out in Sect. 6.5 of [7], and the relevant results of this analysis for our purpose may be summarized as follows: In the sector $r < z$, we have
\[ R^2 = \frac{a}{4}\, r\, (1 + O(z)), \qquad Y^2 = a\bigl(z + \sqrt{r^2 + z^2}\bigr) + O(z^{5/2}), \]
\[ \tilde R^2 = \frac{\tilde a}{4}\, r\, (1 + O(z)), \qquad \tilde Y^2 = \tilde a\bigl(z + \sqrt{r^2 + z^2}\bigr) + O(z^{5/2}). \tag{113} \]
In the sector $r \ge z$, we have instead
\[ (Y + iR)^2 = a(z + ir) + O(r^2 + z^2), \qquad (\tilde Y + i\tilde R)^2 = \tilde a(z + ir) + O(r^2 + z^2). \tag{114} \]

These estimates can be used, in either case, to prove
\[ |f^{ij}(f_{ij} - \tilde f_{ij})| \le O(1), \qquad \frac{|f^{ij}(\chi_i - \tilde\chi_i)(\chi_j - \tilde\chi_j)|}{|\det \tilde f|} \le O(1), \qquad \frac{|\det f|}{|\det \tilde f|} \le O(1), \tag{115} \]

in either sector. In view of Eq. (100), it follows that σ is uniformly bounded in a neighborhood of any intersection point of an axis and a horizon.

(5) Two axes meet. Consider two adjacent intervals $I_{j-1}$, $I_j$, each representing an axis where the linear combinations $a_i(I_{j-1})\psi_i$ respectively $a_i(I_j)\psi_i$ vanish. The first step is again to change the action of the rotational symmetries generated by the Killing fields $\psi_i$ to a convenient form. By Lemma 2, we can find a matrix $B \in SL(D-3, \mathbb{Z})$ such that $a(I_j)B^T = (1, 0, \dots, 0)$ and such that $a(I_{j-1})B^T = (0, 1, 0, \dots, 0)$. Thus, redefining the axial Killing fields as $\psi_i \to \sum_j A_{ij}\psi_j$ and $A = B^{-1}$ if necessary, we can assume without loss of generality that $a(I_j) = (1, 0, \dots, 0)$ and $a(I_{j-1}) = (0, 1, 0, \dots, 0)$. Furthermore, we redefine the coordinate $z$ by an additive constant so that the intersection point $z_j$ of the two intervals is 0, as this is going to simplify some of our formulas. Now let $x \in M$ be a point covering our intersection point $(0, 0) = O_x$ on the boundary of our orbit space $\hat M = \{z + ir \mid r \ge 0\}$. We have just argued that, at $x$, we may assume that $\psi_1 = 0 = \psi_2$, and that the remaining $\psi_i$ are non-vanishing there. Hence, we can introduce canonical coordinates $(\tau, y_1, \dots, y_{D-1})$ in a neighborhood of $x$ such that the action of the rotational part of the isometry group takes the form described in Lemma 4, and such that the time-part of the isometry group acts by simply shifting $\tau$ by a constant. In other words, the isometry group acts locally by shifting $\tau, y_5, \dots, y_{D-1}$, and it acts by rotating $y_1 + i y_2$ resp. $y_3 + i y_4$ by independent phases. Thus, the quantities $R_1 = \sqrt{y_1^2 + y_2^2}$ and $R_2 = \sqrt{y_3^2 + y_4^2}$ are alternative coordinates of $M/G$ near $O_x$ displaying clearly the character of the intersection point as a corner. We need to understand the relationship of the coordinates $(R_1, R_2)$ to $(r, z)$ near $r = 0 = z$, and we need to understand the behavior of the fields $f_{ij}$, $\chi_i$ in terms of either coordinate system. In a similar way, let $\tilde x \in \tilde M$ be a point covering our intersection point $(0, 0) = O_x$, and introduce as above coordinates $\tilde R_1$, $\tilde R_2$. We also need to understand the relationship of the coordinates


$(\tilde R_1, \tilde R_2)$ to $(r, z)$ near $r = 0 = z$, and we need to understand the behavior of the fields $\tilde f_{ij}$, $\tilde\chi_i$ in terms of either coordinate system. We first note that, if $F$ is any analytic function on $M$ that is defined near $x$ and that is invariant under the action of $G$, it will locally have a convergent expansion of the form
\[ F(x) = \sum_{m,n \in 2\mathbb{Z}} F_{m,n}\, R_1^m R_2^n, \tag{116} \]
i.e. in even powers of $R_1$, $R_2$. This applies e.g. to any component of $f_{ij}$, $\chi_i$. It follows from this and the explicit form of the action of $G$ in the canonical coordinates introduced above that
\[ f = \begin{pmatrix} R_1^2 (1 + O(R_1^2)) & R_1^2 R_2^2\, O(1) & R_1^2\, O(1) \\ R_1^2 R_2^2\, O(1) & R_2^2 (1 + O(R_2^2)) & R_2^2\, O(1) \\ R_1^2\, O(1) & R_2^2\, O(1) & d_{ij} + O(R_1^2 + R_2^2) \end{pmatrix}, \tag{117} \]
where $d_{ij}$ is an invertible $(D-5)$-dimensional matrix, and that $\chi_i = R_1^2 R_2^2\, O(1)$. As a consequence, the matrix inverse of $f_{ij}$ is of the form
\[ f^{-1} = \begin{pmatrix} R_1^{-2}\, O(1) & O(1) & O(1) \\ O(1) & R_2^{-2}\, O(1) & O(1) \\ O(1) & O(1) & O(1) \end{pmatrix}. \tag{118} \]

Similarly, any corresponding function $\tilde F$ on $\tilde M$ that is defined near $\tilde x$ will have an even expansion in $\tilde R_1$, $\tilde R_2$. This applies to any component of $\tilde f_{ij}$, $\tilde\chi_i$, and in fact these fields will again have the form (117) and (118), with $R_1$, $R_2$ now replaced by $\tilde R_1$, $\tilde R_2$. We would now like to exploit the estimates above in Eq. (100) to estimate σ near the intersection point, but to do this we must first relate $(R_1, R_2)$ and $(\tilde R_1, \tilde R_2)$ to the coordinates $(r, z)$. This was done in a very similar context in Sect. 6.5 of [7]. A relevant result can be stated by saying that
\[ R_{1/2} = a^{1/2} \sqrt{\pm z + \sqrt{r^2 + z^2}} + O(r^2 + z^2), \qquad \tilde R_{1/2} = \tilde a^{1/2} \sqrt{\pm z + \sqrt{r^2 + z^2}} + O(r^2 + z^2), \tag{119} \]

in an open neighborhood of $(r, z) = (0, 0)$, where $a$, $\tilde a$ are positive constants. We divide this open neighborhood into three sectors: the sector where $r > |z|$, the sector where $0 \le r \le z$, and the sector where $0 \le r \le -z$. In the first sector where $r > |z|$, we have from Eq. (119),
\[ R_{1/2}^{-2} \le a^{-1} r^{-1} + O(r^{1/2}), \qquad \tilde R_{1/2}^{-2} \le \tilde a^{-1} r^{-1} + O(r^{1/2}), \tag{120} \]
as well as $|\chi_i| \le c\, r^2 + O(r^{7/2})$ and $|\tilde\chi_i| \le \tilde c\, r^2 + O(r^{7/2})$. It follows immediately from these estimates together with Eqs. (117) and (118) that the estimates (115) hold uniformly in the sector $r > |z|$. Hence, σ is uniformly bounded in this sector


by Eq. (100). Next, consider the sector $0 \le r \le z$. We claim that in this sector, the estimate (119) can be improved to
\[ R_2^2 = \frac{a}{4}\, r\, (1 + O(z)), \qquad R_1^2 = a\bigl(z + \sqrt{r^2 + z^2}\bigr) + O(z^{5/2}), \]
\[ \tilde R_2^2 = \frac{\tilde a}{4}\, r\, (1 + O(z)), \qquad \tilde R_1^2 = \tilde a\bigl(z + \sqrt{r^2 + z^2}\bigr) + O(z^{5/2}), \tag{121} \]
uniformly throughout the sector. With these estimates, it again follows that σ ≤ O(1) uniformly in the sector using Eqs. (117), (118) and (100). The remaining sector $0 \le r \le -z$ is treated in exactly the same way, except that the roles of $R_1$ and $R_2$ resp. $\tilde R_1$ and $\tilde R_2$ are now reversed. Thus, what remains is to show Eq. (121). Consider the mappings $w$, $\tilde w$ which assign to each $(r, z) \in \hat M$ in an open neighborhood of $(0, 0)$ in the upper complex half-plane the complex numbers $w = (R_1 + i R_2)^2$ resp. $\tilde w = (\tilde R_1 + i \tilde R_2)^2$, where $R_{1/2}$ resp. $\tilde R_{1/2}$ are the coordinates of the orbit labeled by $(r, z)$ that we introduced above. If the spacetime manifolds $M$ and $\tilde M$ were $\mathbb{R}^{4,1} \times T^{D-5}$ with the standard action of $U(1)^{D-3}$, then the mappings $z + ir \to w$ and $z + ir \to \tilde w$ would be the identity map on the complex upper half plane. In the present case, we are of course not in this situation, but in the Riemannian normal coordinates $(y_1, \dots, y_{D-1})$ resp. $(\tilde y_1, \dots, \tilde y_{D-1})$ on $M$ resp. $\tilde M$ that we introduced above in Lemma 4, the action of $U(1)^{D-3}$ is identical, and furthermore, the metrics $g$ and $\tilde g$ do not differ much in a neighborhood of the origin in these Riemannian normal coordinates. Hence, it is plausible that the maps $w$ resp. $\tilde w$ do not differ much from the identity map. By going through the precise definition of $r$, $z$ carefully, it was shown (see footnote 13) in Sect. 6.5 of [7] that this is indeed the case in the sense that both $w$ and $\tilde w$ can be extended to smooth maps in an open neighborhood of the origin in the complex plane, satisfying
\[ w = a\bigl(z + O(z^2) + i r (1 + O(z))\bigr), \qquad \tilde w = \tilde a\bigl(z + O(z^2) + i r (1 + O(z))\bigr), \tag{122} \]
in our sector. Solving for $R_{1/2}$ resp. $\tilde R_{1/2}$ in terms of $w$ resp. $\tilde w$ then gives the desired result (121).
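The final step, solving for $R_{1/2}$ in terms of $w$, is elementary complex algebra. The following sketch shows only the leading-order inversion with all error terms dropped; the overall normalization of the constant $a$ obtained this way need not coincide with the convention used in (119) and (121).

```latex
% Sketch: inverting w = (R_1 + i R_2)^2 for R_1, R_2 >= 0.
\[
\operatorname{Re} w = R_1^2 - R_2^2, \qquad
\operatorname{Im} w = 2 R_1 R_2, \qquad
|w| = R_1^2 + R_2^2,
\]
\[
R_1^2 = \tfrac12 \bigl(|w| + \operatorname{Re} w\bigr), \qquad
R_2^2 = \tfrac12 \bigl(|w| - \operatorname{Re} w\bigr).
\]
% Leading order only: drop the O(z^2) and O(z) corrections appearing in (122).
\[
w \approx a (z + i r)
\;\Longrightarrow\;
R_{1/2}^2 \approx \tfrac{a}{2} \bigl(\sqrt{r^2 + z^2} \pm z\bigr).
\]
```

This reproduces the square-root structure $\pm z + \sqrt{r^2 + z^2}$ of (119), with the upper sign belonging to $R_1$ and the lower sign to $R_2$.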
Thus, we have altogether shown that σ remains bounded in the neighborhood of any point where two intervals representing an axis meet. Thus we have argued that σ remains bounded, including the axis, horizon segment, and tends to zero near infinity. As we have explained, this concludes the proof of the theorem.

6. Conclusions and Outlook

In this paper, we have proved a uniqueness theorem for $D$-dimensional stationary, asymptotically Kaluza-Klein black hole spacetimes satisfying the vacuum Einstein equations, allowing a group of isometries $G = \mathbb{R} \times T^{D-3}$. We showed that the solutions are uniquely determined by certain combinatorial data specifying the group action, certain moduli, and the angular momenta. This combinatorial data in particular determines the topology of the spacetime outside the black hole, and the topology of the horizon.

13 In that reference, the authors in fact considered a neighborhood of a point where an axis hits the horizon. However, the relevant calculations leading to the relevant conclusions about $w$ and $\tilde w$ remain exactly the same.


To be able to prove our uniqueness theorem, we also had to make a number of technical assumptions. They mainly concern the analyticity of the metric and the causal structure of the spacetime. One feels that it ought to be possible to remove these assumptions, but it is not clear to us how this could be done in practice. The more unsatisfactory aspect of our analysis is that we have not been able to prove or disprove the existence of smooth black hole solutions associated with more elaborate topological structure/combinatorial data, such as “black lenses” etc. Some partial results have been obtained in the literature on this (see e.g. [4]), but the general situation is still unclear.

Acknowledgements. S.H. would like to thank Iskander Aliev for discussions about lattices, and Piotr Chruściel for extensive discussions on manifolds with torus actions. S.Y. gratefully acknowledges support by the Alexander von Humboldt Foundation and the Sofia University Research Fund under grant No 111.

A. Proof of Lemma 7

Lemma 7. The length of the horizon interval satisfies
\[ (2\pi)^{D-3}\, l_H = \kappa A_H, \tag{123} \]
where $A_H$ is the area of the horizon cross section $H$, and where $\kappa > 0$ is the surface gravity.

Proof. We take the horizon to correspond to the interval $z \in (z_1, z_2)$, $r = 0$ on the boundary of the orbit space $\hat M$. Let $v = (1, \Omega_1, \dots, \Omega_{D-3})$. Then by definition $G_{IJ} v^I v^J = g(K, K)$, where $K$ is the Killing vector (3), which is tangent to the null generators of the horizon $H$, so $G_{IJ} v^I v^J = 0$ on $H$. It then follows e.g. from the min-max principle that $G_{IJ} v^J = 0$ on the horizon, so $\lim_{r \to 0} G_{IJ} v^J = 0$ in the orbit space for $z \in (z_1, z_2)$. As was shown in [21, Sec. 3], one can furthermore use the first reduced Einstein equation (60) to show that $\lim_{r \to 0} G_{IJ} v^J / r = 0$ for $z \in (z_1, z_2)$. Let us now choose coordinates $(u, r, \varphi_1, \dots, \varphi_{D-3})$ near $H$ such that $K = \partial/\partial u$, $\psi_i = \partial/\partial\varphi_i$. Let us define $\hat X_I$ as $X_I$ above in Eq. (59), with $t$ replaced by $K$, and let $\hat G_{IJ} = g(\hat X_I, \hat X_J)$. Then the reduced Einstein equations also hold for $\hat G$, and furthermore, near $r = 0$ and $z \in (z_1, z_2)$, we have
\[ \hat G \sim \begin{pmatrix} -r^2 (\det f)^{-1} & O(r^2) \\ O(r^2) & f_{ij} \end{pmatrix}, \tag{124} \]
up to terms of higher order in $r$. Here, $z \in (z_1, z_2)$, and $f_{ij}(z)$ is the limit as $r \to 0$ of $g(\psi_i, \psi_j)$. Following [21, Sect. 3], the second reduced Einstein equation (61) furthermore gives
\[ \partial_r \nu \to 0, \qquad \partial_z \nu \to -\frac{1}{2}\, \partial_z \log \det f, \qquad \text{as } r \to 0,\ z \in (z_1, z_2). \tag{125} \]
We conclude from the last relation that $e^{-2\nu} \to c^2 \det f$ for some constant $c > 0$ as $r \to 0$, $z \in (z_1, z_2)$. From the form of the metric given in Eq. (67) (with $G$ replaced by $\hat G$), it follows that, near $H$, we have

\[ g \sim e^{2\nu}\bigl(dz^2 + dr^2 - c^2 r^2\, du^2\bigr) + \sum_{i,j=1}^{D-3} f_{ij}(z)\, d\varphi_i\, d\varphi_j + 2 r^2 \sum_{i=1}^{D-3} O(1)\, du\, d\varphi_i \]
\[ \phantom{g} = e^{2\nu}\bigl(dz^2 + dU\, dV\bigr) + \sum_{i,j=1}^{D-3} f_{ij}(z)\, d\varphi_i\, d\varphi_j + \frac{1}{c} \sum_{i=1}^{D-3} O(1)\, (V\, dU - U\, dV)\, d\varphi_i. \tag{126} \]
The minus sign in front of the $du^2$-term follows from the fact that $K$ is timelike in a neighborhood outside $H$, which in turn follows directly from $\nabla_a (K^b K_b) = -2\kappa K_a$. In the last line we switched to Kruskal-like coordinates $U, V$ defined by $UV = r^2$, $U/V = e^{2cu}$. It is apparent in these coordinates that $H$ corresponds to $V = 0$. The restriction of $K = \partial/\partial u$ to $H$ is found to be $cU\, \partial/\partial U$, from which one concludes in view of the equation $K^a \nabla_a K^b = \kappa K^b$ on $H$ that $c = \kappa$. The lemma may now be proven by calculating the horizon area in the coordinates $z, \varphi_i$ using the above form of the metric and $e^{-2\nu} = \kappa^2 \det f$. It is
\[ A_H = \int_{z_1}^{z_2} dz \prod_i \int_0^{2\pi} d\varphi_i\, \sqrt{e^{2\nu} \det f} = \frac{1}{\kappa}\, (2\pi)^{D-3} (z_2 - z_1), \tag{127} \]
from which the lemma follows immediately in view of $l_H = z_2 - z_1$.
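The coordinate change used in (126) and the integrand in (127) can be checked by hand. The sketch below assumes the explicit branch $U = r e^{cu}$, $V = r e^{-cu}$ of the defining relations $UV = r^2$, $U/V = e^{2cu}$; this particular parametrization is chosen here for illustration and is not spelled out in the proof.

```latex
% Sketch: verifying the Kruskal-like substitution in (126).
% Assumed branch (illustrative): U = r e^{cu}, V = r e^{-cu}.
\[
dU = e^{cu} (dr + c\, r\, du), \qquad dV = e^{-cu} (dr - c\, r\, du),
\]
\[
dU\, dV = dr^2 - c^2 r^2\, du^2, \qquad
V\, dU - U\, dV = 2 c\, r^2\, du .
\]
% Chain rule for the Killing field; on H (V = 0) only the first term survives.
\[
\frac{\partial}{\partial u} = c\, U \frac{\partial}{\partial U} - c\, V \frac{\partial}{\partial V}.
\]
% Integrand of (127): with e^{-2 nu} = kappa^2 det f on the horizon,
\[
\sqrt{e^{2\nu} \det f} = \sqrt{\frac{\det f}{\kappa^2 \det f}} = \frac{1}{\kappa}.
\]
```

The second identity explains the prefactor $1/c$ in the last line of (126): $2 r^2\, du = c^{-1} (V\, dU - U\, dV)$. The constant integrand $1/\kappa$ is what turns the area integral into $(2\pi)^{D-3} (z_2 - z_1)/\kappa$.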

References

1. Bunting, G.L.: Proof of the uniqueness conjecture for black holes. PhD Thesis, Univ. of New England, Armidale, N.S.W., 1983
2. Carter, B.: Axisymmetric black hole has only two degrees of freedom. Phys. Rev. Lett. 26, 331–333 (1971)
3. Cassels, J.W.S.: An introduction to the geometry of numbers. Grundlehren der Mathematischen Wissenschaften Bd. 99, Berlin-Heidelberg-New York: Springer, 1959
4. Chen, Y., Teo, E.: A rotating black lens solution in five dimensions. Phys. Rev. D 78, 064062 (2008)
5. Cho, Y.M., Freund, P.G.O.: Non-Abelian gauge fields as Nambu-Goldstone fields. Phys. Rev. D 12, 1711 (1975)
6. Chruściel, P.T.: On rigidity of analytic black holes. Commun. Math. Phys. 189, 1–7 (1997)
7. Chruściel, P.T., Lopes Costa, J.: On uniqueness of stationary vacuum black holes. http://arXiv.org/abs/0806.0016vz [gr-qc], 2008
8. Chruściel, P.T.: On higher dimensional black holes with abelian isometry group. J. Math. Phys. 50, 05250 (2009)
9. Chruściel, P.T., Galloway, G.J., Solis, D.: Topological censorship for Kaluza-Klein space-times. Ann. H. Poincaré 10, 893–912 (2009)
10. Chruściel, P., Hollands, S.: Manifolds with cohomogeneity-2 actions of the torus group. In preparation
11. Elvang, H., Figueras, P.: Black Saturn. JHEP 0705, 050 (2007)
12. Elvang, H., Harmark, T., Obers, N.A.: Sequences of bubbles and holes: New phases of Kaluza-Klein black holes. JHEP 0501, 003 (2005)
13. Emparan, R., Reall, H.S.: A rotating black ring in five dimensions. Phys. Rev. Lett. 88, 101101 (2002)
14. Emparan, R., Reall, H.S.: Generalized Weyl solutions. Phys. Rev. D 65, 084025 (2002)
15. Evslin, J.: Geometric Engineering 5d Black Holes with Rod Diagrams. JHEP 0809, 004 (2008)
16. Friedrich, H., Racz, I., Wald, R.M.: On the rigidity theorem for spacetimes with a stationary event horizon or a compact Cauchy horizon. Commun. Math. Phys. 204, 691–707 (1999)
17. Galloway, G.J., Schleich, K., Witt, D.M., Woolgar, E.: Topological censorship and higher genus black holes. Phys. Rev. D 60, 104039 (1999)
18. Galloway, G.J., Schleich, K., Witt, D., Woolgar, E.: The AdS/CFT correspondence conjecture and topological censorship. Phys. Lett. B 505, 255 (2001)
19. Gibbons, G.W., Ida, D., Shiromizu, T.: Uniqueness and non-uniqueness of static black holes in higher dimensions. Phys. Rev. Lett. 89, 041101 (2002)
20. Harmark, T., Olesen, P.: On the structure of stationary and axisymmetric metrics. Phys. Rev. D 72, 124017 (2005)
21. Harmark, T.: Stationary and axisymmetric solutions of higher-dimensional general relativity. Phys. Rev. D 70, 124002 (2004)
22. Harmark, T.: Talk available at http://online.itp.ucsb.edu/online/highdgr06/harmark1/pdf/Harmark_KITP.pdf, 2006
23. Hawking, S.W.: Black holes in general relativity. Commun. Math. Phys. 25, 152–166 (1972)
24. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
25. Hirzebruch, F.: Differentiable manifolds and quadratic forms. Lect. Notes, Univ. of California, Berkeley (1962)
26. Hollands, S., Ishibashi, A., Wald, R.M.: A higher dimensional stationary rotating black hole must be axisymmetric. Commun. Math. Phys. 271, 699 (2007)
27. Hollands, S., Ishibashi, A.: On the ‘Stationary Implies Axisymmetric’ Theorem for Extremal Black Holes in Higher Dimensions. Commun. Math. Phys. 291, 403–441 (2009)
28. Hollands, S., Yazadjiev, S.: Uniqueness theorem for 5-dimensional black holes with two axial Killing fields. Commun. Math. Phys. 283, 749 (2008)
29. Hollands, S., Yazadjiev, S.: A Uniqueness theorem for 5-dimensional Einstein-Maxwell black holes. Class. Quant. Grav. 25, 095010 (2008)
30. Israel, W.: Event horizons in static vacuum space-times. Phys. Rev. 164, 1776–1779 (1967)
31. Kastor, D., Ray, S., Traschen, J.: The First Law for Boosted Kaluza-Klein Black Holes. JHEP 0706, 026 (2007)
32. Kerner, R.: Generalization of Kaluza-Klein theory for an arbitrary non-abelian gauge group. Ann. Inst. H. Poincaré 9, 143 (1968)
33. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry I. New York: Wiley, 1969
34. Larsen, F.: Rotating Kaluza-Klein black holes. Nucl. Phys. B 575, 211 (2000)
35. Maison, D.: Ehlers-Harrison-type Transformations for Jordan’s extended theory of gravitation. Gen. Rel. Grav. 10, 717 (1979)
36. Massey, W.S.: Algebraic Topology: An Introduction. Berlin-Heidelberg-New York: Springer, 1977
37. Morisawa, Y., Ida, D.: A boundary value problem for five-dimensional stationary black holes. Phys. Rev. D 69, 124005 (2004)
38. Mazur, P.O.: Proof of uniqueness of the Kerr-Newman black hole solution. J. Phys. A 15, 3173–3180 (1982)
39. Moncrief, V., Isenberg, J.: Symmetries of cosmological Cauchy horizons. Commun. Math. Phys. 89, 387–413 (1983)
40. Moncrief, V., Isenberg, J.: Symmetries of Higher Dimensional Black Holes. Class. Quant. Grav. 25, 195015 (2008)
41. Myers, R.C., Perry, M.J.: Black holes in higher dimensional space-times. Annals Phys. 172, 304 (1986)
42. Oh, H.S.: Topology and Its Applications 13, 137–154 (1982)
43. Orlik, P., Raymond, F.: Actions of the torus on 4-manifolds I. Transactions of the AMS 152(2), 531–559 (1972)
44. Orlik, P., Raymond, F.: Actions of the torus on 4-manifolds II. Topology 13, 89–112 (1974)
45. Pomeransky, A.A., Sen’kov, R.A.: Black ring with two angular momenta. http://arXiv.org/abs/hep-th/0612005v1, 2006
46. Racz, I.: On further generalization of the rigidity theorem for spacetimes with a stationary event horizon or a compact Cauchy horizon. Class. Quant. Grav. 17, 153 (2000)
47. Rasheed, D.: The Rotating dyonic black holes of Kaluza-Klein theory. Nucl. Phys. B 454, 379 (1995)
48. Robinson, D.C.: Uniqueness of the Kerr black hole. Phys. Rev. Lett. 34, 905–906 (1975)
49. Rogatko, M.: Uniqueness theorem of static degenerate and non-degenerate charged black holes in higher dimensions. Phys. Rev. D 67, 084025 (2003)
50. Rogatko, M.: Classification of static charged black holes in higher dimensions. Phys. Rev. D 73, 124027 (2006)
51. Sudarsky, D., Wald, R.M.: Extrema of mass, stationarity, and staticity, and solutions to the Einstein Yang-Mills equations. Phys. Rev. D 46, 1453–1474 (1992)
52. Wald, R.M.: General Relativity. Chicago: University of Chicago Press, 1984
53. Weinstein, G.: On rotating black holes in equilibrium in general relativity. Commun. Pure Appl. Math. 43, 903 (1990)
54. Weinstein, G.: On the Dirichlet problem for harmonic maps with prescribed singularities. Duke Math. J. 77(1), 135–165 (1995) (See Lemma 8)

Communicated by P.T. Chruściel

Commun. Math. Phys. 302, 675–696 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1181-x

Communications in

Mathematical Physics

On Semi-Classical States of Quantum Gravity and Noncommutative Geometry

Johannes Aastrup1, Jesper Møller Grimstrup2, Mario Paschke1, Ryszard Nest3

1 Mathematical Institute, University of Münster, Einsteinstrasse 62, D-48149 Münster, Germany. E-mail: [email protected]; [email protected]
2 The Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100 Copenhagen, Denmark. E-mail: [email protected]; [email protected]
3 Mathematical Institute, University of Copenhagen, Universitetsparken 5, DK-2100 Copenhagen, Denmark. E-mail: [email protected]

Received: 14 August 2009 / Accepted: 26 August 2010
Published online: 11 February 2011 – © Springer-Verlag 2011

Abstract: We construct normalizable, semi-classical states for the previously proposed model of quantum gravity which is formulated as a spectral triple over holonomy loops. The semi-classical limit of the spectral triple gives the Dirac Hamiltonian in 3+1 dimensions. Also, time-independent lapse and shift fields emerge from the semi-classical states. Our analysis shows that the model might contain fermionic matter degrees of freedom. The semi-classical analysis presented in this paper does away with most of the ambiguities found in the initial semi-finite spectral triple construction. The cubic lattices play the role of a coordinate system and a divergent sequence of free parameters found in the Dirac type operator is identified as a certain inverse infinitesimal volume element.

Contents

1. Introduction
2. Noncommutative Geometry
3. Ashtekar Variables and Holonomy Loops
4. Spectral Triples of Holonomy Loops
   4.1 Holonomy loops
   4.2 Generalized connections
   4.3 A spectral triple over A
   4.4 The limiting spectral triple
5. The Space of Connections
6. The Quantization of the Poisson Bracket
7. Semiclassical Analysis
   7.1 Coherent states on a Lie group
   7.2 Product states
   7.3 Semi-classical states: one copy of G
   7.4 Determining the sequence {an}
   7.5 Three copies of G
   7.6 Semiclassical states on A
   7.7 The Dirac Hamiltonian
8. Discussion & Outlook

1. Introduction

A critical test of any quantum model is the existence of a semi-classical limit. This limit, once its existence is established, should make contact with known physics, explain qualitative and quantitative results, and thereby lend credibility to the model. Most importantly, the semi-classical limit serves to confirm the operational interpretation of the observables of the model. Furthermore, as there exist infinitely many inequivalent quantizations of classical field theories, the semi-classical limit often provides an important tool to distinguish physically relevant models. The semi-finite spectral triple over a configuration space of connections constructed in [1–6] constitutes a non-perturbative quantum model. The spectral triple emerges from a fusion between noncommutative geometry [7,8] and canonical quantum gravity [9–11]. It involves an algebra of holonomy loops and a Dirac type operator that resembles a global functional derivation operator. Its existence, as a mathematical entity, was established in [4,5]. Its interpretation in terms of a non-perturbative quantum field theory is immediate since the interaction between the algebra and the Dirac type operator reproduces the Poisson bracket of general relativity, formulated in terms of Ashtekar variables, and of Yang-Mills theory. What remained unresolved, in the papers [1–6], was the exact physical interpretation of the spectral triple construction. It was not clear whether the model should be understood in terms of gravity or Yang-Mills theory, or something else. In particular, no substantial results concerning a semi-classical limit were obtained. In this paper we make the first steps towards a semi-classical analysis. Drawing on results by Hall [12,13] concerning coherent states on compact Lie groups, we construct semi-classical states over the configuration space of connections.
This analysis enlightens us on two fronts: First, at a conceptual level, the semi-classical analysis entails a clearer physical interpretation of the semi-finite spectral triple. In particular, we find that the Dirac type operator descends, in this limit, to a Dirac Hamiltonian on a 3+1 dimensional ultra-static space-time. Through a careful analysis of the Poisson structure of general relativity we first obtain an interpretation of the constituents of the Dirac type operator as quantized triad field operators. In short, the Dirac type operator appears as an infinite sum of quantized triad field operators. In the semi-classical limit, these triad operators entail classical triad fields which appear in the classical Dirac operator. The special class of semi-classical states constructed in this paper suggest an interpretation as one-fermion states for a spinor field on the ultra-static space-time. This interpretation has, however, a problem since the scalar product induced on this space depends on the chosen coordinates. Nevertheless, we believe that our analysis indicates that the semi-finite spectral triple should be understood in terms of quantum gravity coupled to quantized matter fields. Indeed, if the time-scale is chosen appropriately, then the scalar product becomes coordinate independent. Second, at a more technical level, the semi-classical analysis resolves several questions and ambiguities concerning the construction of the semi-finite spectral triple. For instance, the triple is built over a countable system of nested graphs. In [5] it was clear that the construction would work for a large class of such systems of graphs and no


mechanism was found to single out one system of graphs from another. Furthermore, it was also clear that two spectral triples, based on different systems of graphs, would constitute entirely different models. This ambiguity is resolved through the semi-classical analysis: we find that a system of cubic lattices is singled out as “natural”, with an interpretation as a choice of a coordinate system. This coordinate system is made to coincide with the coordinate system used to write down the Ashtekar variables and their Poisson bracket. Moreover, the construction of the Dirac type operator involves an infinite series of free parameters which is required to diverge in order for the operator to have a compact resolvent. In the papers [1–6] no clear physical interpretation of these parameters was found. Again, the semi-classical analysis resolves this ambiguity: it identifies the series of free parameters as the inverse infinitesimal, Euclidean volume element, the divergence arising through a continuum limit where the volume elements approach zero. Clearly, the introduction of finite graphs breaks diffeomorphism invariance. In loop quantum gravity [9–11], which is also based on an inductive system of graphs [14–16], the philosophy is to include all possible graphs (see footnote 1) and thereby restore the symmetries in the inductive limit of graphs and Hilbert spaces. This renders the limiting Hilbert space non-separable, something which probably obstructs the construction of a spectral triple [1]. In this paper we find that the constructed semi-classical limit does not depend on finite parts of the inductive system of lattices. Thus, in this limit the lattices seemingly disappear and the symmetries, broken by the initial choice of graphs, are restored. This means that the expressions for the classical Dirac operator and the Dirac Hamiltonian, found in the semi-classical limit, are coordinate covariant.
The finding that cubic lattices are singled out by the semi-classical analysis agrees well with recent results by Flori and Thiemann which state that, in loop quantum gravity, only lattices with cubic topology give the right semi-classical limit [17].

1 To be precise, all piece-wise analytic graphs.

This paper is organized as follows: In Sect. 2 we briefly review noncommutative geometry and Connes' work on the standard model. In Sect. 3 we introduce Ashtekar variables together with their dual variables, the loop and flux variables. In Sect. 4 we then review the construction of the semi-finite spectral triple. First, a spectral triple is constructed on a fixed graph, and subsequently a continuum limit of spectral triples is taken over an infinite system of ordered graphs. In Sect. 5 we comment on the underlying space of generalized connections and Sect. 6 is concerned with a careful analysis of the relationship between the spectral triple construction and the Poisson bracket between flux and loop variables. Finally, Sect. 7 is concerned with the semi-classical states. In Sect. 8 we give a conclusion.

2. Noncommutative Geometry

It is a central observation in noncommutative geometry, due to Connes, that the metric of a compact manifold can be recovered from the Dirac operator together with its interaction with the smooth functions on the manifold [7]. In other words the metric is completely determined by the triple $(C^\infty(M), L^2(M, S), D)$. This observation leads to a noncommutative generalization of Riemannian geometries. Here the central objects are spectral triples $(A, H, D)$, where $A$ is a not necessarily commutative algebra; $H$ a Hilbert space and $D$ an unbounded self-adjoint operator called


the Dirac operator. The triple is required to satisfy some interplay relations between A, H, D mimicking those of (C ∞ (M), L 2 (M, S), D). The choice of the Dirac operator D is strongly restricted by these requirements. In physics, a key example of a noncommutative geometry comes from particle physics. Again, it was Connes who realized that the entire data of the standard model coupled to general relativity can be understood as a single, gravitational model formulated in terms of a spectral triple [8,18–23]. Here, the algebra is an almost commutative algebra A = C ∞ (M) ⊗ A F , where A F is the algebra C⊕H⊕ M3 (C). The corresponding Dirac operator then consists of two parts, D = DM + DF , one of which is the standard Dirac operator D M on M. The other part, D F , is given by a matrix-valued function on the manifold M, that encodes the metrical aspects of the states over the algebra A F . It is a highly nontrivial and very remarkable fact that the above mentioned requirements for Dirac operators force D F to contain the non-abelian gauge fields of the standard model and the Higgs-field together with their couplings to the elementary fermion fields. In particular the Higgs-field thus obtains a geometrical interpretation as being a part of the gravitational field on a noncommutative space. Even more so, the classical action of the standard model coupled to the Einstein-Hilbert action, in the Euclidean signature, emerges from the spectral triple through the so-called spectral action principle [18], which states that physics only depends on the spectrum of the Dirac operator. In view of the widely held opinion that quantum effects of the gravitational field will necessarily lead to a noncommutativity of space-time this observation indicates that the gauge interactions and the appearance of the Higgs field may be interpreted as quantum effects of the gravitational interactions. 
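The two statements made above, that the metric is recoverable from the Dirac operator and that the action depends only on its spectrum, have standard closed-form expressions in Connes' framework. They are recalled here as a reminder rather than as part of the authors' construction; the cutoff function $\chi$ and the energy scale $\Lambda$ are symbols not introduced in the surrounding text.

```latex
% Connes' distance formula: the geodesic distance on M recovered from the
% commutative triple (C^\infty(M), L^2(M,S), D).
\[
d(x, y) = \sup \bigl\{\, |f(x) - f(y)| \;:\; f \in C^\infty(M),\ \| [D, f] \| \le 1 \,\bigr\}.
\]
% Bosonic spectral action: depends on D only through its spectrum.
% chi (a positive cutoff function) and Lambda (an energy scale) are
% symbols assumed here for illustration.
\[
S_{\mathrm{bos}}[D] = \operatorname{Tr}\, \chi\!\left( \frac{D}{\Lambda} \right).
\]
```

In the first formula the operator norm of $[D, f]$ plays the role of the supremum of $|\nabla f|$, which is why the supremum reproduces the Riemannian distance.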
In other words, they are the first shadows of the noncommutativity of space-time, visible at the length scale corresponding to the Z-mass. It should be mentioned in this respect that the spectral action does not directly reproduce the correct coupling constants of the standard model. In fact, it only allows for fewer free parameters than the standard model. In order to obtain the measured coupling constants for the electromagnetic and strong interactions to a fairly good approximation, Connes and Chamseddine applied renormalization group methods in [19] and subsequent publications. This analysis ultimately leads to a prediction of the Higgs mass [21]. The predicted value, which was based on the assumption of “the big desert”, was recently excluded by Tevatron data. Nevertheless it is very remarkable that the use of quantum field theoretical concepts is absolutely essential here to obtain a physically reasonable classical action. In our view this strongly indicates that the spectral triple used by Connes and Chamseddine should be viewed as the semi-classical low energy limit of some genuine quantum theory. One may then also hope that other quantum corrections present in the full theory provide a more realistic value for the Higgs mass. Since the noncommutative description of the standard model is entirely gravitational, this full theory should, presumably, be a theory of the quantized gravitational field. Thus, if there were already a theory of quantum gravity, one should certainly investigate whether it admits some semi-classical states that resemble this almost commutative spectral triple. It was these considerations which motivated the construction of the semi-finite spectral triple over a configuration space of connections [1–6]. The idea is to seek a general


framework which combines the machinery and ideas of noncommutative geometry with elements of quantum gravity. The final goal, then, is to make contact with Connes' work on the standard model through the formulation of a semi-classical analysis.

3. Ashtekar Variables and Holonomy Loops

We start with some notation. Let $M$ be a 4-dimensional globally hyperbolic manifold with a vierbein $E_\mu^A$ and a space-time metric $G_{\mu\nu} = E_\mu^A E_\nu^B \eta_{AB}$, where $\eta_{AB} = \mathrm{diag}(-1, 1, 1, 1)$ is the corresponding tangent space metric. Here the letters $\mu, \nu, \ldots$ and $A, B, \ldots$ denote curved and flat space-time indices respectively. Next, take a foliation of $M$ according to $M = \mathbb{R} \times \Sigma$, where $\Sigma$ is a spatial manifold. Let $g_{mn} = e_m^a e_n^a$ be the corresponding spatial metric and $e_m^a$ the spatial dreibein. Here the letters $m, n, \ldots$ and $a, b, \ldots$ denote curved and flat spatial indices. The Ashtekar variables [24,25] consist first of a complex $SU(2)$ connection $A_m^a(x)$ on $\Sigma$. The Ashtekar connection is a certain complex linear combination of the spatial spin connection and the extrinsic curvature of $\Sigma$ in $M$. The canonically conjugate variable to $A_m^a(x)$ is the inverse densitized dreibein $\bar{E}_a^m = e\, e_a^m$, where $e = \det(e_m^a)$. This set of variables satisfies the Poisson bracket

$$\{A_m^a(x), \bar{E}_b^n(y)\} = \kappa\, \delta_b^a\, \delta_m^n\, \delta^{(3)}(x, y),$$

where $\kappa$ is the gravitational constant. The formulation of canonical gravity in terms of connection variables permits a shift to loop variables which are taken as the holonomy transform

$$h_l(A) = \mathrm{P}\exp \int_l A_m\, dx^m,$$

along a loop $l$ in $\Sigma$. To define a conjugate variable to $h_l(A)$ let $dF_a$ be the flux of the triad field $\bar{E}_a^m$ corresponding to an infinitesimal area element of the spatial manifold $\Sigma$, which can be written $dF_a = \epsilon_{mnp} \bar{E}_a^m\, dx^n \wedge dx^p$. Given a 2-dimensional surface $S$ in $\Sigma$ we write the total flux of $\bar{E}_a^m$ through $S$,

$$F_S^a = \int_S dF^a.$$

Next, consider a surface $S$ and let $l = l_1 \cdot l_2$ be a line segment in $\Sigma$ which intersects $S$ at the point $l_1 \cap l_2$. The Poisson bracket between the flux and holonomy variables reads [9]

$$\{h_l, F_S^a\} = \iota(S, l)\, \kappa\, h_{l_1} \tau_a h_{l_2}, \tag{1}$$

where $\tau_a$ denote the generators of the Lie algebra of $G$. Here, $\iota(S, l)$ takes the values $\pm 1$ or $0$, depending on the intersection between $S$ and $l$.


Fig. 1. A plaquette in the lattice

4. Spectral Triples of Holonomy Loops

In this section we outline the construction of the semi-finite spectral triple first presented in [3,4] and further developed in [5]. This spectral triple combines ideas and techniques of canonical gravity and noncommutative geometry. We first construct a spectral triple at the level of a finite graph. Next we take the limit of such spectral triples, over an infinite system of ordered graphs, to obtain a limiting spectral triple.

4.1. Holonomy loops. Let $\Gamma$ be a 3-dimensional, finite, cubic lattice. Let $\{v_i\}$ and $\{l_j\}$ denote vertices and edges in $\Gamma$, respectively. The edges in $\Gamma$ are oriented according to the three main directions in $\Gamma$, the $x^1$-, $x^2$- and $x^3$-directions, see Fig. 1. Thus, an edge $l$ is a map $l : \{0, 1\} \to \{v_i\}$, where $l(0)$ and $l(1)$, the start and endpoints of $l$, are adjacent vertices in $\Gamma$. A sequence of edges $\{l_{i_1}, l_{i_2}, \ldots, l_{i_n}\}$, where $l_{i_j}(1) = l_{i_{j+1}}(0)$, is a based loop if $l_{i_1}(0) = l_{i_n}(1) = v_0$, where $v_0 \in \{v_i\}$ is a preferred vertex in $\Gamma$ called the basepoint. An edge has a natural involution given by reversing its orientation. Thus, $l_i^*(t) = l_i(1 - t)$, and the involution of a loop $L = \{l_{i_1}, l_{i_2}, \ldots, l_{i_n}\}$ is given by $L^* = \{l_{i_n}^*, \ldots, l_{i_2}^*, l_{i_1}^*\}$. In the following we shall discard trivial backtracking, which means that we introduce the equivalence relation

$$\{\ldots, l_{i_{j-1}}, l_{i_j}, l_{i_j}^*, \ldots\} \sim \{\ldots, l_{i_{j-1}}, \ldots\},$$

and let a loop $L$ be an equivalence class with respect to it. The product between two loops $L_1 = \{l_{i_j}\}$ and $L_2 = \{l_{k_j}\}$ is simply given by gluing the loops to form a new sequence of edges: $L_1 \cdot L_2 = \{l_{i_1}, \ldots, l_{i_n}, l_{k_1}, \ldots, l_{k_m}\}$. One easily checks that the involution equals an inverse, which gives the set of loops in $\Gamma$ the structure of a group, called the hoop group.
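The group structure described above can be illustrated with a small sketch. The encoding below is an assumption made only for this illustration and is not taken from the paper: an edge is a pair (name, orientation), with orientation +1 for the lattice orientation and -1 for the reversed edge, and the helper names `reduce_loop`, `product` and `involution` are hypothetical.

```python
# Minimal sketch of the hoop group on a single plaquette.
# Assumption: edges are encoded as (name, orientation) pairs; +1 is the
# lattice orientation and -1 the reversed edge l*.

def reverse_edge(edge):
    name, orientation = edge
    return (name, -orientation)

def reduce_loop(edges):
    """Discard trivial backtracking: an edge followed by its reverse cancels."""
    reduced = []
    for edge in edges:
        if reduced and reduced[-1] == reverse_edge(edge):
            reduced.pop()            # adjacent pair l, l* is removed
        else:
            reduced.append(edge)
    return tuple(reduced)

def product(loop_a, loop_b):
    """Glue two loops into one sequence of edges and reduce the result."""
    return reduce_loop(tuple(loop_a) + tuple(loop_b))

def involution(loop):
    """Reverse the order of the edges and the orientation of each edge."""
    return tuple(reverse_edge(edge) for edge in reversed(loop))

# A plaquette based at v0: two edges along their orientation, two against it.
plaquette = reduce_loop([("a", 1), ("b", 1), ("c", -1), ("d", -1)])
trivial_loop = ()

print(product(plaquette, involution(plaquette)) == trivial_loop)
```

The stack-based reduction also removes nested backtracking, which is why the product of a loop with its involution collapses to the trivial loop.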


Finally, we consider finite series of loops

$$a = \sum_i a_i L_i, \quad a_i \in \mathbb{C}, \tag{2}$$

with the involution

$$a^* = \sum_i \bar{a}_i L_i^*,$$

and the product between $a$ and a second element $b = \sum_j b_j L_j$,

$$a \cdot b = \sum_{i,j} (a_i b_j)\, L_i \cdot L_j.$$

The set of elements of the form (2) is a $*$-algebra. We denote this algebra by $B_\Gamma$.

4.2. Generalized connections. Next, let $G$ be a compact, connected Lie-group. For the aim of this paper it is natural to choose $G = SU(2)$. We shall, however, develop the formalism for general groups. Let $\nabla$ be a map $\nabla : \{l_i\} \to G$, which satisfies $\nabla(l_i) = \nabla(l_i^*)^{-1}$, and denote by $\mathcal{A}_\Gamma$ the set of all such maps. Clearly, $\mathcal{A}_\Gamma \simeq G^n$, where the total number of edges in $\Gamma$ is written $n$. Given a loop $L = \{l_{i_1}, l_{i_2}, \ldots, l_{i_n}\}$, let $\nabla(L) = \nabla(l_{i_1}) \cdot \nabla(l_{i_2}) \cdot \ldots \cdot \nabla(l_{i_n})$. This turns $\nabla$ into a homomorphism from the hoop group into $G$ and provides a norm on $B_\Gamma$,

$$\|a\| = \sup_{\nabla \in \mathcal{A}_\Gamma} \Big\| \sum_i a_i \nabla(L_i) \Big\|_G, \quad a \in B_\Gamma,$$

where the norm on the right-hand side is the matrix norm given by a chosen representation of $G$. The closure of the $*$-algebra of loops with respect to this norm is a $C^*$-algebra.2 We denote this loop algebra by $\mathcal{B}_\Gamma$.

2 Note that the natural map from $B_\Gamma$ to $\mathcal{B}_\Gamma$ is not necessarily injective.
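The statement that a generalized connection extends to a homomorphism can be checked numerically. The sketch below makes several assumptions that are not part of the paper: $SU(2)$ is modelled by unit quaternions, edge sequences are lists of (name, orientation) pairs, and the particular group elements assigned to the edges are arbitrary sample values.

```python
import math

# Sketch only: SU(2) is represented by unit quaternions (a, b, c, d).

def qmul(p, q):
    """Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qinv(q):
    """Inverse of a unit quaternion, i.e. its conjugate."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotation(axis_index, angle):
    """Unit quaternion generated by one of the three coordinate axes."""
    q = [math.cos(angle / 2), 0.0, 0.0, 0.0]
    q[1 + axis_index] = math.sin(angle / 2)
    return tuple(q)

IDENTITY = (1.0, 0.0, 0.0, 0.0)

def holonomy(connection, loop):
    """Ordered product of the group elements along a sequence of edges.
    A reversed edge contributes the inverse element."""
    result = IDENTITY
    for name, orientation in loop:
        g = connection[name]
        result = qmul(result, g if orientation == 1 else qinv(g))
    return result

def involution(loop):
    return [(name, -orientation) for name, orientation in reversed(loop)]

def close(p, q, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(p, q))

# Arbitrary sample connection on four edges (hypothetical values).
connection = {"a": rotation(0, 0.7), "b": rotation(1, 1.1),
              "c": rotation(2, -0.4), "d": rotation(0, 2.0)}
seq_1 = [("a", 1), ("b", 1), ("c", -1)]
seq_2 = [("d", 1), ("a", -1)]

print(close(holonomy(connection, seq_1 + seq_2),
            qmul(holonomy(connection, seq_1), holonomy(connection, seq_2))))
```

Gluing two edge sequences multiplies their group elements, and the involution maps to the inverse, so trivial backtracking is invisible to the connection.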


Fig. 2. Subdivision of a cubic lattice cell into 8 new cells

4.3. A spectral triple over $\mathcal{A}_\Gamma$. First, let $\mathcal{H}_\Gamma$ be the Hilbert space

$$\mathcal{H}_\Gamma = L^2(G^n, Cl(T^*G^n) \otimes M_l(\mathbb{C})),$$

where $L^2$ is with respect to the Haar measure and where $l$ is the size of the matrix representation of $G$. Here, $Cl(T^*G^n)$ is the Clifford bundle of the cotangent bundle over $G^n$ with respect to a chosen left and right invariant metric. There is a natural representation of the loop algebra on $\mathcal{H}_\Gamma$ given by

$$f_L \cdot \Psi(\nabla) = (1 \otimes \nabla(L))\, \Psi(\nabla), \quad \Psi \in \mathcal{H}_\Gamma,$$

where the first factor acts on the Clifford bundle and the second factor acts on the matrix factor in $\mathcal{H}_\Gamma$. Next, denote by $D_\Gamma$ a Dirac operator on $\mathcal{A}_\Gamma$. The precise expression for $D_\Gamma$ will be determined below through the process of taking the continuum limit of the construction. $D_\Gamma$ acts on the factor of $\mathcal{H}_\Gamma$ which involves the Clifford bundle. In total, the triple $(\mathcal{B}_\Gamma, \mathcal{H}_\Gamma, D_\Gamma)$ is a geometrical construction over $\mathcal{A}_\Gamma$.

4.4. The limiting spectral triple. The goal is to obtain a spectral triple over the space $\mathcal{A}$. To do this we take the limit of spectral triples over the intermediate spaces $\mathcal{A}_\Gamma$. Let $\{\Gamma_i\}$, $i \in \mathbb{N}$, be an infinite sequence of 3-dimensional, finite, cubic lattices where $\Gamma_{i+1}$ is the lattice obtained from $\Gamma_i$ by subdividing each elementary cell in $\Gamma_i$ into 8 new cells. This process involves the subdivision of each edge $l_j$ in $\Gamma_i$ into two new edges in $\Gamma_{i+1}$ together with the addition of new vertices and edges, see Fig. 2. We denote the initial lattice by $\Gamma_0$. Corresponding to this sequence of cubic lattices there is a projective system $\{\mathcal{A}_{\Gamma_i}\}$ of spaces obtained from the graphs $\{\Gamma_i\}$, together with natural projections between these spaces

$$P_{i,i+1} : \mathcal{A}_{\Gamma_{i+1}} \to \mathcal{A}_{\Gamma_i}. \tag{3}$$

Consider now a system of triples $(\mathcal{B}_{\Gamma_i}, \mathcal{H}_{\Gamma_i}, D_{\Gamma_i})$, with the restriction that these triples are compatible with the projections (3). This requirement is easily satisfied for the algebras and the Hilbert spaces, see [4]. For the Dirac type operators, however, some care must be taken. The problem reduces to the simple case where an edge in $\Gamma_i$ is subdivided into two edges in $\Gamma_{i+1}$, see Fig. 3.1, which corresponds to the projection

$$P : G^2 \to G, \quad (g_1, g_2) \mapsto g_1 \cdot g_2, \tag{4}$$


Fig. 3. A subdivision of an edge into two and the new parameterization of the edge

and a corresponding map between Hilbert spaces

$$P^* : L^2(G, Cl(T^*G) \otimes M_l) \to L^2(G^2, Cl(T^*G^2) \otimes M_l).$$

The compatibility condition for the Dirac type operator reads

$$P^*(D_1 v)(g_1, g_2) = D_2 (P^* v)(g_1, g_2), \quad v \in L^2(G, Cl(T^*G) \otimes M_l).$$

Here $D_1$ is the Dirac operator on $G$, and $D_2$ is the corresponding Dirac operator on $G^2$. Consider the following change of variables:

$$\Theta : G^2 \to G^2; \quad (g_1, g_2) \mapsto (g_1 \cdot g_2, g_1) =: (g_1', g_2'), \tag{5}$$

for which projection (4) obtains the simple form

$$P(g_1', g_2') = g_1'. \tag{6}$$
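The claim that the projection (4) becomes a projection onto the first factor after the change of variables (5) is a purely algebraic identity and can be verified exhaustively on a small non-abelian group. The sketch below uses the permutation group $S_3$ as a stand-in for $G$; this substitution, and the function names, are assumptions made for the illustration only, since $S_3$ is finite and not a Lie group.

```python
from itertools import permutations

# Sketch only: S_3 (permutations of three points) replaces the Lie group G,
# so that the identity can be checked exactly on every pair of elements.

def compose(p, q):
    """Group product p*q of permutations stored as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def projection(g1, g2):
    """Projection (4): the two half-edges are multiplied together."""
    return compose(g1, g2)

def theta(g1, g2):
    """Change of variables (5): (g1, g2) -> (g1*g2, g1)."""
    return (compose(g1, g2), g1)

def theta_inverse(h1, h2):
    """Inverse of (5): recover (g1, g2) from the new coordinates (h1, h2)."""
    return (h2, compose(inverse(h2), h1))

group = list(permutations(range(3)))

# In the new coordinates the projection only reads off the first entry.
projection_is_first_entry = all(
    projection(*theta_inverse(h1, h2)) == h1 for h1 in group for h2 in group
)
print(projection_is_first_entry)
```

Because the second new coordinate is exactly what the projection forgets, a Dirac operator acting on that coordinate can be added with an arbitrary real weight without spoiling compatibility, which is the freedom used in the text that follows.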

This change of variables corresponds to a new parameterization of the edge, see Fig. 3.2. It is now straightforward to write down a Dirac operator on $G^2$ which is compatible with the projection (6). Basically, we can pick any Dirac operator of the form

$$D_2 = D_1 + a D_2', \quad a \in \mathbb{R},$$

where $D_2'$ is a Dirac operator on the copy of $G$ in $G^2$ whose coordinates are eliminated by the projection (6). At this point the choice of the operator $D_2'$ is essentially unrestricted, with $a$ being an arbitrary real parameter. However, for reasons explained in [5] it turns out that $D_1$ and $D_2'$ should be of the form

$$D_i = \sum_j e_i^j \cdot L_{e_i^j}, \tag{7}$$

where the product is Clifford multiplication. In Eq. (7) $\{e_i^j\}$ denotes a left-translated orthonormal basis of $T^*G$, where $G$ is the $i$th copy in $G^n$. $L_{e_i^j}$ denotes the corresponding differential. For later reference we denote by $R_{e_i^j}$ the right translated vector fields. This line of analysis is straightforwardly generalized to repeated subdivisions. At the level of the $n$th subdivision of the edge the change of variables which generalizes (5) reads

$$\Theta : G^n \to G^n; \quad (g_1, g_2, \ldots, g_n) \mapsto (g_1 \cdot g_2 \cdot \ldots \cdot g_n,\; g_2 \cdot \ldots \cdot g_n,\; \ldots,\; g_n) =: (g_1', g_2', \ldots, g_n'), \tag{8}$$


Fig. 4. Two different types of partition which will lead to different Dirac type operators. The second partition is the one which we will later argue is “natural”

which corresponds to the structure maps

$$P_{n,n/2} : G^n \to G^{n/2}; \quad (g_1', g_2', g_3', \ldots, g_n') \mapsto (g_1', g_3', \ldots, g_{n-1}').$$

Again, it is straightforward to construct a Dirac type operator compatible with these structure maps. This construction gives rise to a series of free parameters $\{a_i\}$, one for each subdivision. Thus, by solving the $G^2 \to G$ problem repeatedly, and by piecing together the different edges, we end up with a Dirac type operator at the level of $\Gamma_n$ of the form

$$D_n = \sum_i a_i D_i, \tag{9}$$

where $D_i$ is a Dirac type operator corresponding to the $i$th level of subdivision in $\mathcal{A}_{\Gamma_n}$. The change of variables in (8) is the key step in constructing $D_n$. However, there will be many different partitions of the line segment which simplify the structure maps and lead to different Dirac type operators, see Fig. 4. This ambiguity was also commented on in [5]. In subsequent sections we will argue that a single type of subdivision stands out as “natural” due to the classical interpretation of the corresponding Dirac type operator.

We are now ready to take the limit of the triples $(\mathcal{B}_{\Gamma_i}, \mathcal{H}_{\Gamma_i}, D_{\Gamma_i})$. First, the Hilbert space $\mathcal{H}$ is the inductive limit of the intermediate Hilbert spaces $\mathcal{H}_{\Gamma_i}$. That is, it is constructed by adding all the intermediate Hilbert spaces

$$\mathcal{H}' = \bigoplus_{\Gamma \in \{\Gamma_i\}} L^2(G^{n(\Gamma)}, Cl(T^*G^{n(\Gamma)}) \otimes M_l(\mathbb{C})) / N,$$

where $N$ is the subspace generated by elements of the form $(\ldots, v, \ldots, -P_{ij}^*(v), \ldots)$, where $P_{ij}^*$ are the induced maps between Hilbert spaces. The Hilbert space $\mathcal{H}$ is then the completion of $\mathcal{H}'$. The inner product on $\mathcal{H}$ is the inductive limit inner product. This Hilbert space is manifestly separable. Next, the algebra

$$\mathcal{B} := \lim_{\longrightarrow} \mathcal{B}_\Gamma$$

contains loops defined on a cubic lattice $\Gamma_n$ in $\{\Gamma_n\}$. Note that the algebra $\mathcal{B}$ differs from the algebra used in loop quantum gravity on two points: first, we only consider loops running in cubic lattices, whereas the algebra in loop quantum gravity is generated


by piece-wise analytic loops. Second, we consider loops which correspond to untraced holonomy loops. Thus, the algebra $\mathcal{B}$ is noncommutative, in contrast to the algebra of traced holonomy loops in loop quantum gravity, which is commutative. Finally, the Dirac-like operator $D_n$ descends to a densely defined operator on the limit Hilbert space $\mathcal{H}$,

$$D = \lim_{\longrightarrow} D_n.$$

We factorize $\mathcal{H}$ into $\lim L^2(G^{n(\Gamma)}, M_l) \otimes \lim Cl(T_{id}^*(G^{n(\Gamma)}))$. On $\lim Cl(T_{id}^*(G^{n(\Gamma)}))$ there is an action of the algebra $\lim Cl(T_{id}^*(G^{n(\Gamma)}))$. The completion of this algebra with respect to this action is the CAR algebra and admits a normalized trace, i.e. $tr(1) = 1$. Let $Tr$ be the ordinary operator trace on the operators on $\lim L^2(G^{n(\Gamma)}, M_l)$ and define $\tau = Tr \times tr$. In [4] we prove that for a compact Lie-group $G$ the triple $(\mathcal{B}, \mathcal{H}, D)$ is a semi-finite spectral triple with respect to $\tau$ when the sequence $\{a_n\}$ converges to infinity. This means that:

1. $(1 + D^2)^{-1}$ is $\tau$-compact, i.e. can be approximated in norm with finite trace operators, and
2. the commutator $[D, a]$ is bounded.

5. The Space of Connections

Let us now turn to the spaces $\mathcal{A}_{\Gamma_i}$ and their projective limit. Denote by

$$\overline{\mathcal{A}} := \lim_{\longleftarrow} \mathcal{A}_\Gamma.$$

Further, given a trivial principal $G$-bundle, denote by $\mathcal{A}$ the space of all smooth connections herein. In [4] we prove that $\mathcal{A}$ is densely embedded in $\overline{\mathcal{A}}$:

$$\mathcal{A} \hookrightarrow \overline{\mathcal{A}}.$$

This fact justifies the terminology generalized connections for the completion $\overline{\mathcal{A}}$ and shows that the semi-finite spectral triple $(\mathcal{B}, \mathcal{H}, D)$ is indeed a geometrical construction over the space $\mathcal{A}$ of smooth connections.

6. The Quantization of the Poisson Bracket

To determine the relation between the construction of the spectral triple $(\mathcal{B}, \mathcal{H}, D)$ and the formulation of canonical gravity in terms of loop and flux variables satisfying the Poisson bracket (1), we calculate the commutator between the Dirac type operator $D$ and an element in the loop algebra $\mathcal{B}$. Consider first a single line element $l_i$ and the corresponding group element $\nabla(l_i) \in G$. We assume that the copy of $G$ in $\mathcal{A}_{\Gamma_n}$ assigned to $l_i$ corresponds to the $m$th subdivision of the initial cubic lattice. We then find

$$[D, \nabla(l_i)] = a_m \sum_k e_i^k \cdot \nabla(l_i)\, \sigma^k,$$


Fig. 5. The surface Si

where $\sigma^k$ are generators of the Lie algebra $\mathfrak{g}$. Also, consider a loop $L = \{l_{i_1}, l_{i_2}, \ldots, l_{i_n}\}$ and the commutator

$$[D, f_L] = [D, \nabla(l_1)] \cdot \nabla(l_2) \ldots \nabla(l_n) + \nabla(l_1) [D, \nabla(l_2)] \ldots \nabla(l_n) + \cdots.$$

These formulas show that a commutator between $D$ and an element of the algebra $\mathcal{B}$ inserts Lie-algebra generators at vertices in the graphs $\{\Gamma_i\}$. This general structure is similar to the structure of the Poisson bracket (1) and suggests that the interaction between the Dirac type operator $D$ and the loop algebra $\mathcal{B}$ is related to a representation of the Poisson bracket (1).

Consider again a single edge $l_i$ which we now for simplicity assume to belong to the initial lattice. Let $l_i(0) = v_j$ and $l_i(1) = v_{j+1}$, where $v_j$ and $v_{j+1}$ are vertices in $\Gamma_0$. Let us also assume that $l_i$ runs in the $x^1$-direction in $\Gamma_0$. Also, let $\nabla(l_i)$ belong to the $i$th copy of $G$ in $\mathcal{A}_{\Gamma_0}$. The commutator between the left-invariant vector field $e_i^a$ and the group element $\nabla(l_i)$ gives

$$[L_{e_i^a}, \nabla(l_i)] = \nabla(l_i)\, \sigma^a.$$

This shows that $L_{e_i^a}$ corresponds to a quantization of a flux variable $F_S^a$, where the surface $S$ intersects $l_i$ at $v_{j+1}$. Actually, the surface $S$ is of no significance here except for its intersection point with the vertex $v_{j+1}$. Let $S_i$ be a surface which intersects the vertex $v_{j+1}$ and is perpendicular to $l_i$, see Fig. 5. The size of $S_i$ corresponds to the initial lattice $\Gamma_0$ in the sense that it spans an area corresponding to a side in a single cell. The operator $L_{e_i^a}$ should then, due to the Poisson bracket (1), be interpreted as a quantization of the flux variable $F_{S_i}^a$,

$$i F_{S_i}^a \xrightarrow{\text{quantization}} l_P^2 L_{e_i^a},$$

where $l_P$ is the Planck length. It is important to realize that the inverse, densitized triad field involved in $F_{S_i}^a$ is located at the endpoint of $l_i$. Thus, $F_{S_i}^a$ involves the quantity $\bar{E}_a^m(v_{j+1})$ through

$$F_{S_i}^a = \int_{S_i} dx^2 \wedge dx^3\, \bar{E}_a^1(v_{j+1}).$$

Consider next the first subdivision of $l_i$ into two edges, which we denote $l_i'$ and $l_{i+1}'$. Thus,

$$\nabla(l_i) = \nabla(l_i') \cdot \nabla(l_{i+1}').$$

Also, denote the new vertex which subdivides $l_i$ by $v_{j+1/2}$. Now, the new copy of $G$ is associated to the first half of the line segment $l_i$, which means to $l_i'$. For notational


Fig. 6. An alternative partition of an edge into two

simplicity, let us assume that this new copy of $G$ is the $(i+1)$th copy of $G$ in $\mathcal{A}_{\Gamma_1}$ whereas the full line segment $l_i$ corresponds to the $i$th copy of $G$. At first sight, it seems that the corresponding left-invariant vector fields $L_{e_{i+1}^k}$ should be interpreted according to

$$i F_{S_{i+1}}^a \xrightarrow{\text{quantization}} l_P^2 L_{e_{i+1}^a} \quad \text{(first guess)}. \tag{10}$$

However, this cannot be correct since $L_{e_{i+1}^k}$ commutes with $\nabla(l_i)$, which belongs to the $i$th copy of $G$. If Eq. (10) were correct then the commutator between $L_{e_{i+1}^k}$ and $\nabla(l_i)$ should split up $\nabla(l_i)$ and insert a Lie-algebra generator at the new vertex $v_{j+1/2}$, since the edge $l_i$ intersects the surface $S_{i+1}$ at $v_{j+1/2}$. Instead, we find that relation (10) obtains an additional term:

$$i F_{S_{i+1}}^k \xrightarrow{\text{quantization}} l_P^2 L_{e_{i+1}^k} + l_P^2 R_{g_{i+1} e_i^k g_{i+1}^{-1}}.$$

Notice here that the triad field involved in $F_{S_{i+1}}^k$ is located at the new vertex $v_{j+1/2}$. If we had chosen a different partition of the line segment, see Fig. 6, then the left-invariant vector field corresponding to the new copy of $G$ would have an interpretation in terms of a flux variable and triad field located at $v_{j+1}$. Thus, the classical interpretation of $D$ distinguishes between the different modes of subdividing the line segment. Notice also that the surfaces $S_i$ must shrink with each subdivision, in order to have one intersection point between the lattice and each surface. Thus, if we set the area of the initial surface equal to one, then the size of the surfaces decreases with subdivisions like

$$|S_i| = 2^{-2n}. \tag{11}$$

Consider the next subdivision of $l_i$ into four edges. The notation is as indicated in Fig. 7. We find that the two new flux operators $F_{S_{i+2}}^a$ and $F_{S_{i+3}}^a$ have the following correspondences:

$$i F_{S_{i+2}}^a \xrightarrow{\text{quantization}} l_P^2 L_{e_{i+2}^a} + l_P^2 R_{g_{i+2} e_i^a g_{i+2}^{-1}} + l_P^2 R_{g_{i+2} e_{i+1}^a g_{i+2}^{-1}},$$

and

$$i F_{S_{i+3}}^a \xrightarrow{\text{quantization}} l_P^2 L_{e_{i+3}^a} + l_P^2 R_{g_{i+1} g_{i+3} e_i^a g_{i+3}^{-1} g_{i+1}^{-1}}.$$

Once more, the particular subdivision of $l_i$ is singled out by this interpretation. If we had chosen the alternative subdivision of the edge into two, as pictured in Fig. 6, then this interpretation would not have been possible. There exists, however, at this level the possibility to choose the subdivision in Fig. 4.1. At this point of the analysis, there is no particular reason to choose between the two modes


Fig. 7. Partition of an edge into four

of subdivision pictured in Fig. 4, except perhaps that the subdivision in Fig. 4.2 is more symmetrical since new copies of $G$ are all assigned to edges of the same length. In general, at the $n$th level of subdivision of $l_i$ we obtain the correspondence

$$F_{S_{i+s}}^k \xrightarrow{\text{quantization}} l_P^2 L_{e_{i+s}^k} + l_P^2\, \Omega_{i+s-1}^k, \tag{12}$$

where $\Omega_{i+s-1}^k$ is a combination of twisted, right-invariant vector fields acting on the copies of $G$ assigned to edges which are situated “higher” in the inductive system of lattices. Put differently, $\Omega_{i+s-1}^k$ probes information which is more coarse grained relative to the line segment to which the $(i+s)$th copy of $G$ is assigned.

In the following we shall ignore the correction terms $\Omega_{i+s-1}^k$ when we apply relation (12) to translate quantized quantities involving the Dirac type operator $D$ to their classical counterparts. The reason for this will become clear in the next section where we construct semi-classical states. These states have the property that any dependency on finite parts of the inductive system of lattices vanishes in the semi-classical limit.

In the limit of repeated subdivision of lattices we find that the semi-finite spectral triple $(\mathcal{B}, \mathcal{H}, D)$ encodes information tantamount to a representation of the Poisson bracket of general relativity. Thus, the triple carries information of the kinematical sector of quantum gravity. Clearly, the triple is based on a different set of variables than loop quantum gravity and hence the “representation” it encodes is different from the representation used there.

7. Semiclassical Analysis

In this section we construct semi-classical states in $\mathcal{H}$ and evaluate their expectation value of $D$.

7.1. Coherent states on a Lie group. We will first recall the results for coherent states on compact connected Lie groups that we are going to use. For simplicity we will only consider the case of most interest, namely $SU(2)$. Let $\{e_a\}$ be a basis for $su(2)$. Given $g_0$ in $SU(2)$ and given three momenta (real numbers) $p^1, p^2, p^3$ there exist families $\phi^t \in L^2(SU(2))$ such that

$$\lim_{t \to 0} \langle \phi^t, t L_{e_a} \phi^t \rangle = i p^a,$$


and

$$\lim_{t \to 0} \langle \phi^t \otimes v, g\, \phi^t \otimes v \rangle = (v, g_0 v),$$

where $v \in M_2(\mathbb{C})$, and $(\,,)$ denotes the inner product hereon. Corresponding statements hold for operators of the type $f(g) P(t L_{e_1}, t L_{e_2}, t L_{e_3})$, where $P$ is a polynomial in three variables, and $f$ is a smooth function on $SU(2)$, i.e.

$$\lim_{t \to 0} \langle \phi^t, f(g) P(t L_{e_1}, t L_{e_2}, t L_{e_3}) \phi^t \rangle = f(g_0) P(i p^1, i p^2, i p^3).$$

This statement also carries over to symbols, i.e. functions on $T^* SU(2)$ with certain properties. The construction of these states follows from work of Hall, see [12,13], and is more explicitly described in [26], Sect. 3.1. The states have further important physical properties, which we are however not going to use at the present stage of the analysis. Also, the precise construction of these states, in particular the choice of complexifier, is irrelevant for the results presented in this paper.

7.2. Product states. Let us consider the $n$th level in a subdivision of lattices. We split the edges into $\{l_i\}$ and $\{l_i'\}$, where $\{l_i\}$ denotes the edges appearing in the $n$th subdivision but not in the $(n-1)$th subdivision, and $\{l_i'\}$ the rest. Define $\phi_{l_i}^t$ to be the coherent state on $SU(2)$ such that

$$\lim_{t \to 0} \langle \phi_{l_i}^t \otimes v, g\, \phi_{l_i}^t \otimes v \rangle = (v, h_{l_i}(A) v),$$

and

$$\lim_{t \to 0} \langle \phi_{l_i}^t, t L_{e_i^a} \phi_{l_i}^t \rangle = 2^{-2n}\, i E_a^m(v_{j+1}),$$

where $v \in M_2(\mathbb{C})$; $v_{j+1}$ denotes the right endpoint of $l_i$, and the $m$ in $E_a^m$ refers to the direction of $l_i$. The factor $2^{-2n}$ comes from the scaling (11). Furthermore define the states $\phi_{l_i'}^t$ by

$$\lim_{t \to 0} \langle \phi_{l_i'}^t \otimes v, g\, \phi_{l_i'}^t \otimes v \rangle = (v, h_{l_i'}(A) v),$$

and

$$\lim_{t \to 0} \langle \phi_{l_i'}^t, t L_{e_j^a} \phi_{l_i'}^t \rangle = 0.$$

Finally define $\phi_n^t$ to be the product of all these states as a state in $L^2(\mathcal{A}_{\Gamma_n})$. These states are essentially identical to the states constructed in [26] except that they are based on cubic lattices and a particular mode of subdivision. In the limit $n \to \infty$ these states produce the right expectation value on all loop operators in the infinite lattice.


Fig. 8. A single edge

7.3. Semi-classical states: one copy of G. We now proceed to construct semi-classical states in $\mathcal{H}$. From here on we set $t = l_P^2$ and rescale the left-invariant vector fields in the Dirac type operator accordingly, $L_{e_i^a} \to t L_{e_i^a}$. The first step is to consider again a single edge. Let $\psi(x)$ be a field on $\Sigma$. A priori, $\psi(x)$ can either be a two-spinor or a two-by-two matrix valued field. For reasons which shall become clear later, we choose the second option. Consider again an edge $l_i$ with endpoints $v_j$ and $v_{j+1}$, see Fig. 8. The states in $L^2(G, Cl(T^*G) \otimes M_2(\mathbb{C}))$ which we are interested in have the form3

$$\Psi^t(l_i) = \big(g_i \psi(v_{j+1}) + i e_i^a \sigma^a \psi(v_j)\big)\, \phi_{l_i}^t,$$

where the spinor field is evaluated at the endpoints of the edge $l_i$. A straightforward computation gives the expectation value of $D$ on this state

$$\lim_{t \to 0} \langle \bar{\Psi}^t | D | \Psi^t \rangle = 2^{-2n} a_n \Big( -\bar{\psi}(v_j) \sigma^a E_a^m \big(\psi(v_{j+1}) - \psi(v_j)\big) + \big(\bar{\psi}(v_{j+1}) - \bar{\psi}(v_j)\big) \sigma^a E_a^m \psi(v_j) + \bar{\psi}(v_j) \{\epsilon A_m, \sigma^a E_a^m\} \psi(v_j) \Big), \tag{13}$$

where we applied the expansion $g = 1 + \epsilon A_m + O(\epsilon^2)$, with $\epsilon = 2^{-n}$. Also, the index $m$ denotes the direction of the edge $l_i$.

3 Here we assume that $\psi(x)$ is matrix valued. If $\psi(x)$ was a two-spinor field then we would instead consider the Hilbert space $L^2(G, Cl(T^*G) \otimes \mathbb{C}^2)$ and states therein.

7.4. Determining the sequence $\{a_n\}$. Formula (13) indicates that the sequence $\{a_n\}$ of free parameters in $D$ plays a specific role in the semiclassical analysis. In particular, note the term $(\psi(v_{j+1}) - \psi(v_j))$. If we consider the limit where the edge $l_i$ lies increasingly deep in the inductive system of graphs, then this term approaches

$$(\psi(v_{j+1}) - \psi(v_j)) \to \partial_m \psi(v_j)\, dx^m \quad \text{(no sum over } m\text{)},$$

where $dx^m$ is the infinitesimal line segment, which goes as $2^{-n}$. Here $n$ denotes the level of subdivisions of graphs. Thus, if we choose the sequence

$$a_n = 2^{3n},$$

then the expression (13) converges, when one considers edges of increasing depth in the inductive system of lattices, towards the quantity

$$\lim_{n \to \infty} \lim_{t \to 0} \langle \bar{\Psi}^t | D | \Psi^t \rangle = \bar{\psi}(v_j) \sigma^a E_a^m \nabla_m \psi(v_j) - \nabla_m \bar{\psi}(v_j) \sigma^a E_a^m \psi(v_j),$$


Fig. 9. Three edges, connected in one vertex

(again, no sum over $m$) with $\nabla_m = \partial_m + A_m$. This is the expectation value (in a point) of the self-adjoint operator

$$\sigma^a E_a^m \nabla_m + \nabla_m \sigma^a E_a^m \quad \text{(no sum over } m\text{)}.$$

Here, we applied what amounts to a partial integration (this will be justified shortly where an integral over $\Sigma$ emerges).
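The role of the choice $a_n = 2^{3n}$ can be made concrete with a one-dimensional numerical sketch. The prefactor $2^{-2n} a_n$ in (13) multiplies a difference of the field over an edge of length $2^{-n}$, so the product behaves like a difference quotient. The test function (a sine) and the evaluation point below are arbitrary assumptions for the illustration and have no counterpart in the paper.

```python
import math

def scaled_difference(psi, x, n):
    """Prefactor of (13) times the field difference over one edge at level n."""
    edge_length = 2.0 ** (-n)        # dx^m goes as 2^{-n}
    a_n = 2.0 ** (3 * n)             # sequence chosen in Sect. 7.4
    area_factor = 2.0 ** (-2 * n)    # surface scaling (11)
    return area_factor * a_n * (psi(x + edge_length) - psi(x))

# Hypothetical smooth field and base point, chosen only for the check.
point = 0.3
errors = [abs(scaled_difference(math.sin, point, n) - math.cos(point))
          for n in (4, 8, 12)]
print(errors)
```

With a slower growth of $a_n$ the scaled difference would tend to zero, and with a faster growth it would diverge, so under these assumptions $2^{3n}$ is the scaling that leaves a finite derivative term.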

7.5. Three copies of G. Next, we consider instead three edges, denoted for simplicity by $l_1, l_2, l_3$, all leading out of the same vertex, with three copies of $G$ associated to them, correspondingly. First, consider the state

$$\Psi^t(g_1, g_2, g_3) = \Big( e_2^a e_3^a g_1 \psi(v_1) - e_1^a e_3^a g_2 \psi(v_2) + e_1^a e_2^a g_3 \psi(v_3) + \frac{i}{5} e_1^a e_2^b e_3^c \big(\delta^{ab} \sigma^c + \delta^{ac} \sigma^b + \delta^{bc} \sigma^a\big) \psi(v_0) \Big)\, \phi_{l_1}^t \phi_{l_2}^t \phi_{l_3}^t, \tag{14}$$

where the enumeration of the vertices is shown in Fig. 9. We find that the expectation value of $D$ on this state leads to the operator

$$\sigma^a E_a^m \nabla_m + \nabla_m \sigma^a E_a^m \tag{15}$$

in the limit where the edges $l_i$ lie increasingly deep in the inductive system of lattices. In Eq. (15) we now sum over $m$.

7.6. Semiclassical states on $\mathcal{A}$. To obtain semiclassical states on the full space $\mathcal{A}$ we need to prescribe a procedure to sum up the results for the individual copies of $G$, or rather, for vertices. First, at the $n$th level in the inductive system of lattices, where we have $n(\Gamma_n)$ copies of $G$, we write down the state

$$\Psi_n^t(\mathcal{A}_{\Gamma_n}) = \epsilon_n \Big( \sum_{v_j} \Psi_{v_j} \Big)\, \phi_n^t, \tag{16}$$


Fig. 10. Three more edges

where $\epsilon_n$ equals $2^{-3(n-1)/2}$. This will, in the limit taken below, converge to the Lebesgue measure. Also, we define

$$\Psi_{v_j} = e_{j_2}^a e_{j_3}^a g_{j_1} \psi(v_{j_1}) - e_{j_1}^a e_{j_3}^a g_{j_2} \psi(v_{j_2}) + e_{j_1}^a e_{j_2}^a g_{j_3} \psi(v_{j_3}) + \frac{i}{10} e_{j_1}^a e_{j_2}^b e_{j_3}^c \big(\delta^{ab} \sigma^c + \delta^{ac} \sigma^b + \delta^{bc} \sigma^a\big) \psi(v_j), \tag{17}$$

see Fig. 10. The sum in (16) runs over a certain subclass of vertices in $\Gamma_n$. At the $n$th level, these vertices are the midpoints of the minimal cubes present at the $(n-1)$th level. This discrimination between vertices admittedly appears to be somewhat arbitrary and it might be possible to take into account all edges. This, however, complicates matters. We shall return to this point in a later publication. With (16) we have a sequence $\{\Psi_n^t\}$ of states in $\mathcal{H}$ and we can calculate the limit of the expectation value of $D$ on these states. We call this limit the continuum limit. We find

$$\lim_{n \to \infty} \lim_{t \to 0} \langle \bar{\Psi}_n^t | D | \Psi_n^t \rangle = \frac{1}{2} \int_\Sigma d^3x\, \bar{\psi}(x) \big( \sqrt{g}\, \sigma^a e_a^m \nabla_m + \nabla_m \sqrt{g}\, \sigma^a e_a^m \big) \psi(x). \tag{18}$$

Thus, the sequence of states $\{\Psi_n^t\}$ defines a semi-classical limit where $D$, to lowest order, is a spatial Dirac operator on $\Sigma$. Notice that the integral in (18) is the invariant integral over $\Sigma$. The factor $\sqrt{g}$, where $g$ is the determinant of the spatial metric, comes from $\bar{E}_a^\mu$. Here, however, it should be stressed that the emerging normalization of spinors $\psi(x)$ is not coordinate invariant. We shall comment on this below. Note that the emergence of the integral in (18) crucially depends on the way the CAR algebra appears in the expression (17). Interestingly, the elements of the CAR algebra play the role of localizers in the construction.
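The way the normalization $2^{-3(n-1)/2}$ in (16) produces an integral can be illustrated numerically. Squaring it gives $2^{-3(n-1)}$, which is the volume of a minimal cube at level $n-1$ if the initial lattice is taken to be a single unit cube, so a sum over cube midpoints weighted with the squared normalization is a midpoint Riemann sum. The unit cube, the sample density and the function names below are assumptions made for this sketch only.

```python
def sample_density(x, y, z):
    """Hypothetical smooth density; its integral over the unit cube is 7/12."""
    return x * x + y * z

def midpoint_sum(f, n):
    """Sum over midpoints of the level-(n-1) cubes, weighted as in (16).
    Assumption: the initial lattice is one unit cube."""
    cells = 2 ** (n - 1)                      # cubes per direction at level n-1
    side = 1.0 / cells
    normalization = 2.0 ** (-3 * (n - 1) / 2)  # prefactor of the state (16)
    total = 0.0
    for i in range(cells):
        for j in range(cells):
            for k in range(cells):
                x = (i + 0.5) * side
                y = (j + 0.5) * side
                z = (k + 0.5) * side
                # the state enters bilinearly, hence the squared prefactor
                total += normalization ** 2 * f(x, y, z)
    return total

exact = 7.0 / 12.0
coarse_error = abs(midpoint_sum(sample_density, 3) - exact)
fine_error = abs(midpoint_sum(sample_density, 5) - exact)
print(coarse_error, fine_error)
```

The error shrinks as the level increases, which is the sense in which the weighted vertex sum converges to the Lebesgue measure in this simplified setting.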

7.7. The Dirac Hamiltonian. In Eqs. (14) and (17) we ignored certain degrees of freedom. To take these into account we modify the expression in Eq. (17) to

$$\tilde{\Psi}_{v_j} = e_{j_2}^a e_{j_3}^a g_{j_1} \psi(v_{j_1}) - e_{j_1}^a e_{j_3}^a g_{j_2} \psi(v_{j_2}) + e_{j_1}^a e_{j_2}^a g_{j_3} \psi(v_{j_3}) + \frac{i}{20} e_{j_1}^a e_{j_2}^b e_{j_3}^c \big\{ \delta^{ab} \sigma^c + \delta^{ac} \sigma^b + \delta^{bc} \sigma^a,\, M_{v_j} \big\} \psi(v_j),$$


where $M_{v_j}$ is an arbitrary self-adjoint two-by-two matrix. Write $M_{v_j} = N(v_j) \mathbb{1} + i N^a(v_j) \sigma^a$, where $N$ and $N^a$ are real fields on $\Sigma$, scalar and vectorial respectively. Finally, we let $N^m = N^a e_a^m$; we define

$$\tilde{\Psi}_n^t(\mathcal{A}_{\Gamma_n}) = \epsilon_n \Big( \sum_{v_j} \tilde{\Psi}_{v_j} \Big)\, \phi_n^t,$$

and repeat the calculations leading to (18). We obtain

$$\lim_{n \to \infty} \lim_{t \to 0} \langle \bar{\tilde{\Psi}}{}_n^t | D | \tilde{\Psi}_n^t \rangle = \int_\Sigma d^3x\, \bar{\psi}(x) \Big( \frac{1}{2} \big( \sqrt{g} N \sigma^a e_a^m \nabla_m + N \nabla_m \sqrt{g}\, \sigma^a e_a^m \big) + i \sqrt{g} N^m \partial_m \Big) \psi(x) + \int_\Sigma d^3x\, \bar{\psi}(x) \Big( \sqrt{g} N^m A_m + \frac{1}{2} (\partial_m \sqrt{g} N^m) + \frac{1}{2} (\partial_m N) \sqrt{g}\, \sigma^a e_a^m \Big) \psi(x). \tag{19}$$

Here, the first integral is the principal part of the Dirac Hamiltonian in 3+1 dimensions. The second integral contains additional zero-order terms. The fields $N$ and $N^m$ are seen to play the role of the lapse and shift fields respectively. The additional zero-order terms appearing in (19) are not identical to the zero-order terms in the Dirac Hamiltonian. This, however, is not to be expected since the Dirac Hamiltonian is not self-adjoint while the Dirac type operator is. For the reconstruction of the 4-metric only the principal part is used. We shall return to a discussion of the zeroth order part later. We believe that the correct treatment of the zeroth order terms can only be performed once the Wheeler-DeWitt constraint is formulated and implemented and thereby the freedom in choosing the foliation, i.e. the lapse and the shift fields, is eliminated. This might also be a possible solution to another problem arising at this point. The norm of the semi-classical states $\tilde{\Psi}_n^t$ now depends on the lapse and shift fields,

$$\lim_{n \to \infty} \lim_{t \to 0} \langle \bar{\tilde{\Psi}}{}_n^t | \tilde{\Psi}_n^t \rangle = \int_\Sigma d^3x\, \bar{\psi}(x) \psi(x)\, \Xi(N, N^m),$$

where the function $\Xi(N, N^m)$ is readily computed. This renders the interpretation of the semi-classical states as constituting the one-fermion states problematic, as the induced scalar product is obviously not appropriate. Interestingly, however, the lapse and shift fields may be chosen such that $\Xi(N, N^m) = \sqrt{g}$. Thus, an appropriate choice of the time-coordinate restores the invariance of the norm under spatial diffeomorphisms. However, we are not aware of a compelling physical reason for such a choice of lapse and shift fields. Nevertheless, it might be conceivable that there is such a reason, as in quantum field theory the one-particle space is not invariant under general coordinate changes. Thus, our restriction to one-particle states may well imply a restriction of the choice of coordinates. The solution to the above problem might also lie in the construction of the states, i.e. it might be possible to modify the construction of the semi-classical states such that the


norm of the semi-classical spinors is automatically coordinate independent. We shall investigate this problem in future work. Finally, disregarding lapse and shift fields, we should note that it would also be possible to remedy the deficiency of the missing $\sqrt{g}$ in the inner product by assigning the zero-order expectation value of Hall's coherent states to the non-densitized triad field and then adding, appropriately, the density in the semi-classical state. With this alteration the inner product of semi-classical states renders the correct inner product of spinors. However, this choice would spoil the interpretation of the left invariant vector fields as flux operators. Note that $\psi(x)$ takes values in $M_2(\mathbb{C})$. In view of the action of the $\sigma^a$'s this can consistently be interpreted as a Dirac 4-spinor. The space spanned by these fields $\psi(x)$ can thus be interpreted as the space of solutions of the Dirac equation for the static 4-metric described by the 3-metric, the lapse and the shift fields (see [27]).

8. Discussion & Outlook

In this paper we have shown that to certain states for the previously constructed spectral triple over holonomy loops, one can associate gravitational and fermionic matter fields. This clearly indicates that one should interpret this model as describing quantized gravitational fields coupled to quantized matter fields. To this end we have constructed a small class of semi-classical states. Disregarding for the moment the open problem of identifying the correct scalar product, these semi-classical states can be interpreted as one-fermion states in a given foliation and given gravitational background field. We have identified the expectation value of the Dirac type operator of the spectral triple, in these semi-classical states, as the expectation value of the energy of the corresponding matter fields.
This raises the question whether one can generally interpret the Dirac type operator as the energy operator for the matter fields present in the model. Thus, future work must clarify, first, whether there are many-particle fermionic states present in the model, and, of course, whether additional matter fields, for example photons, can be found. A consistent interpretation of the Dirac type operator then requires that it can also be interpreted as the energy of these states. At the present state of the project the investigation of these issues is certainly within reach. A further strong indication that the model should be interpreted in terms of quantum gravity is the fact that it encodes information tantamount to a representation of the Poisson bracket of general relativity. This has been carefully analyzed for the first time in this paper and should therefore be seen as one of its central results. All this being said, we should stress that our Hilbert space can only be viewed as the kinematical Hilbert space of quantum gravity. The Wheeler-DeWitt constraint has not been constructed nor implemented. In the construction above, this fact is nicely reflected by the appearance of the lapse and shift fields. Yet, as the Wheeler-DeWitt equation should in principle eliminate these unphysical degrees of freedom, the concreteness of their appearance raises the hope that our analysis may lead to a novel approach to the construction and implementation of the Hamiltonian constraint in quantum gravity. Apart from the physical interpretation of the model, the semi-classical analysis has also proven beneficial at a more technical level: it turned out that the system of nested, cubic lattices, on which the semi-finite spectral triple is based, simply plays the role of a coordinate system. In particular, the lattices form the coordinate system already used to


write down the Ashtekar variables and their Poisson bracket. This choice of background structure does, however, not imply lack of background invariance: there is no choice of background metric and the semi-classical limit is coordinate independent. This shows that it is possible to recover the spatial symmetries with a countable system of lattices. Yet, it is an issue for future work to establish the full covariance of the model under change of the chosen coordinate system. These observations are all based on the fact that any dependency on finite parts of the lattices vanishes in the limits (18) and (19). That is, only the continuum limit contributes to the integrals in (18) and (19). It is as if the lattices, which we have used to construct the spectral triple, disappear in this semi-classical limit. Furthermore, the free parameters {an }, which appear in the Dirac type operator, play an important role in the semi-classical limit. A priori, this sequence is only required to diverge in order for the resolvent of the Dirac type operator to be compact. In the semiclassical limit, however, the sequence is identified as the inverse, infinitesimal volume element. This fixes the sequence. We should stress that we only found states living on static 4-manifolds. This had to be expected since we interpret these states as one-particle states and it is well known in quantum field theory that such states would not exist on non-static space-times, e.g. in accelerating frames (which would be described by time-dependent lapse and shift fields). In the future it is certainly an interesting question whether one can find and describe semi-classical states which correspond to states of a quantized fermion field on a non-static space-time. The application of the CAR algebra as a tool to form the local Riemann integral in Eqs. (18) and (19) is highly intriguing. It would certainly be very interesting and important to investigate the role played by the CAR algebra more thoroughly. 
Moreover, the analysis in this paper is based on a real SU (2) connection whereas the Ashtekar connection is complex. A real SU (2) connection corresponds either to a Euclidean metric or to a more involved Hamiltonian. We believe it is desirable to work with the original Ashtekar connection. One may speculate whether the complexity of the connection only appears in the semi-classical limit. If so, then one might exploit the techniques presented in this paper to obtain a complex connection via a doubling of the Hilbert space. Immediate tasks to be addressed are: to compute quantum corrections for the semiclassical states in higher order of the Planck length; to investigate the operational interpretation of the loop algebra in the semi-classical states; to construct many particle states. Hopefully this will provide further evidence that the spectral triple over holonomy loops is a viable candidate for quantum gravity coupled to matter fields. Acknowledgements. J.A. and M.P. were supported by the SFB 478 grant “Geometrische Strukturen in der Mathematik” of the Deutsche Forschungsgemeinschaft.

References
1. Aastrup, J., Grimstrup, J.M.: Spectral triples of holonomy loops. Commun. Math. Phys. 264, 657 (2006)
2. Aastrup, J., Grimstrup, J.M.: Intersecting Connes noncommutative geometry with quantum gravity. Int. J. Mod. Phys. A 22, 1589 (2007)
3. Aastrup, J., Grimstrup, J.M., Nest, R.: On Spectral Triples in Quantum Gravity I. Class. Quant. Grav. 26, 065011 (2009)
4. Aastrup, J., Grimstrup, J.M., Nest, R.: On Spectral Triples in Quantum Gravity II. J. Noncommut. Geom. 3, 47 (2009)
5. Aastrup, J., Grimstrup, J.M., Nest, R.: A new spectral triple over a space of connections. Commun. Math. Phys. 290, 389 (2009)
6. Aastrup, J., Grimstrup, J.M., Nest, R.: Holonomy Loops, Spectral Triples & Quantum Gravity. To appear in Class. Quant. Grav. 26, 6500 (2009)
7. Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
8. Connes, A.: Gravity coupled with matter and the foundation of non-commutative geometry. Commun. Math. Phys. 182, 155 (1996)
9. Thiemann, T.: Introduction to modern canonical quantum general relativity. http://arxiv.org/abs/gr-qc/0110034vL, 2001
10. Rovelli, C.: Quantum gravity. Cambridge, UK: Cambridge Univ. Pr, 2004
11. Ashtekar, A., Lewandowski, J.: Background independent Quantum Gravity: A status report. Class. Quant. Grav. 21, R53 (2004)
12. Hall, B.C.: The Segal-Bargmann “coherent state” transform for compact Lie groups. J. Funct. Anal. 122(1), 103–151 (1994)
13. Hall, B.C.: Phase space bounds for quantum mechanics on a compact Lie group. Commun. Math. Phys. 184(1), 233–250 (1997)
14. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C* algebras. http://arxiv.org/abs/gr-qc/9311010v2, 1993. To appear in J. Baez (ed.): Knots and Quantum Gravity, Oxford: Oxford Univ. Press, 1994
15. Ashtekar, A., Lewandowski, J.: Differential geometry on the space of connections via graphs and projective limits. J. Geom. Phys. 17, 191 (1995)
16. Ashtekar, A., Lewandowski, J.: Quantum theory of geometry. I: Area operators. Class. Quant. Grav. 14, A55 (1997)
17. Flori, C., Thiemann, T.: Semiclassical analysis of the Loop Quantum Gravity volume operator: I. Flux Coherent States. http://arxiv.org/abs/0812.1537v1, 2008
18. Chamseddine, A.H., Connes, A.: Universal formula for noncommutative geometry actions: Unification of gravity and the standard model. Phys. Rev. Lett. 77, 4868 (1996)
19. Chamseddine, A.H., Connes, A.: A universal action formula. Phys. Rev. Lett. 77, 4868 (1996)
20. Chamseddine, A.H., Connes, A.: The spectral action principle. Commun. Math. Phys. 186, 731 (1997)
21. Chamseddine, A.H., Connes, A., Marcolli, M.: Gravity and the standard model with neutrino mixing. Adv. Theor. Math. Phys. 11, 991–1089 (2007)
22. Chamseddine, A.H., Connes, A.: Why the Standard Model. J. Geom. Phys. 58, 38–47 (2008)
23. Chamseddine, A.H., Connes, A.: A Dress for SM the Beggar. With different title, 2007; appeared in Phys. Rev. Lett. 99, 9160 (2007). http://arxiv.org/abs/0706.3690v1 [hep-th]
24. Ashtekar, A.: New Variables for Classical and Quantum Gravity. Phys. Rev. Lett. 57, 2244 (1986)
25. Ashtekar, A.: New Hamiltonian Formulation of general relativity. Phys. Rev. D 36, 1587 (1987)
26. Thiemann, T., Winkler, O.: Gauge field theory coherent states (GCS). IV: Infinite tensor product and thermodynamical limit. Class. Quant. Grav. 18, 4997 (2001)
27. Paschke, M., Kopf, T.: A spectral quadruple for de Sitter space. J. Math. Phys. 43, 818 (2002)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 302, 697–736 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1187-z


A Model Problem for Conformal Parameterizations of the Einstein Constraint Equations

David Maxwell

Department of Mathematics, University of Alaska Fairbanks, P. O. Box 757500, Fairbanks, AK 99775, USA.

Received: 30 September 2009 / Accepted: 19 September 2010
Published online: 8 February 2011 – © Springer-Verlag 2011

Abstract: We study the conformal and conformal thin sandwich (CTS) methods as candidates for parameterizing the set of vacuum initial data for the Cauchy problem of general relativity. To this end we consider a small family of symmetric conformal data. Within this family we obtain an existence result so long as the mean curvature has constant sign. When the mean curvature changes sign we find that solutions either do not exist, or they are not unique. In some cases solutions are shown to be non-unique. Moreover, the theory for mean curvatures with changing sign is shown to be extremely sensitive with respect to the value of a coupling constant in the Einstein constraint equations.

1. Introduction

Initial data for the Cauchy problem of general relativity consist of a Riemannian manifold and a second fundamental form that satisfy a system of nonlinear PDEs known as the Einstein constraint equations. A longstanding goal has been to find a constructive description of the full set of solutions of these equations on a given manifold, and hence a method of producing all possible initial data. Although this problem remains open in general, the conformal method of Lichnerowicz and Choquet-Bruhat and York provides an elegant and complete solution to the problem of constructing constant-mean curvature (CMC) solutions. For example, on compact manifolds the solutions of the Einstein constraint equations are effectively parameterized by selection of conformal data consisting of a conformal class for the metric, a so-called transverse-traceless tensor, and a (constant) mean curvature. The conformal method can also be used to construct non-CMC solutions of the constraint equations, but much less is known in this case. Ideally one would like to show that selection of generic conformal data leads to a unique corresponding solution of the constraint equations. Until recently, virtually all results for the conformal method only applied to near-CMC initial data.
The first construction using the conformal method of a family of initial data with arbitrarily specified mean curvature was given by Holst, Nagy, and Tsogtgerel


in [9]. Although this result represents a breakthrough for the conformal method, it has a number of important limitations:
– The near-CMC hypothesis is replaced by a smallness assumption on the transverse-traceless tensor (i.e. a small-TT hypothesis).
– It is not known if small-TT conformal data determine a unique solution.
– The construction only works on Yamabe-positive compact manifolds.
– The construction requires non-vanishing matter fields.
It was subsequently shown in [18] that the construction could be extended to vacuum initial data, but the other restrictions remain. These results are compatible with the possibility that a large set of conformal data lead to no solutions or multiple solutions; from the point of view of parameterizing the full set of solutions one would like to show that this does not occur.

In this paper we investigate the conformal method and its variation, the conformal thin sandwich (CTS) method, by studying a model problem obtained from a quotient of certain symmetric conformal data. Despite the simplicity of the model problem, it captures the core issues of the conformal method, including the nonlinear coupling and difficulties regarding conformal Killing fields. Moreover, the model problem is easily studied numerically, and thus gives an important tool for suggesting theorems which might be proved in the future.

We consider a three-parameter family of model conformal data that allow for simultaneous violations of both the near-CMC and small-TT conditions on a Yamabe-null manifold. The mean curvatures in this family are written as the sum of an average mean curvature, t, and a fixed zero-mean function describing departure from the mean. If t is chosen so that the mean curvature does not change sign, we find that there exists a solution of the constraint equations so long as the transverse-traceless tensor in the family is not identically zero. When the mean curvature changes sign, the situation is more delicate.
We observe in this regime non-existence for certain large transverse-traceless tensors, non-uniqueness for certain small transverse traceless tensors, and a critical value of t (depending on the choice of lapse function in the CTS method and the choice of conformal class representative in the standard conformal method) for which there is an infinite family of solutions when the transverse-traceless tensor vanishes identically. Previous non-uniqueness results for the conformal method have been obtained by adding separate, poorly behaved terms to the equations, either in the form of non-scaling matter sources [4,20] or from coupling with a separate PDE in the extended conformal thin sandwich method [19,20]. We prove here the first nontrivial non-uniqueness result for the standard, vacuum conformal method. It arises from the nonlinear coupling of the equations, and indicates that the standard conformal and CTS methods already contain poorly behaved terms. Non-existence results for the conformal method are available in the CMC case for vacuum and scaling matter [16], as well as for scalar field matter sources [10]. In vacuum, non-existence only occurs for certain non-generic, well-understood conformal data and therefore does not pose a difficulty from the point of view of parameterization. The non-existence result proved here for the model equations translates to either non-existence or non-uniqueness for the full vacuum conformal method, with the final outcome not known. This can be compared to a similar dichotomy shown by Rendall for certain Yamabe-positive data [14]. Unlike Rendall’s example, where non-existence can be thought of as an extension of a CMC non-existence result, neither of the possible outcomes shown here are favorable for using the conformal method as a parameterization scheme.


Intriguingly, we find that for mean curvatures in the three-parameter family with changing sign, the existence theory depends sensitively on the values of the constants involved in the nonlinear coupling of the conformal method. We show that these constants are balanced in such a way that any arbitrarily small adjustment to their values leads to one of two different existence theories. All previous results for the conformal method depend only on the signs of the constants in these equations. This sensitivity suggests why it has been so difficult to obtain general large-data results for the conformal method.

The conformal data used in this study has one potential drawback: the mean curvature is not continuous, but has jump discontinuities. This level of regularity is lower than has previously been considered for the fully coupled conformal method. We note, however, that the CMC theory of the conformal method readily constructs solutions of the constraint equations with certain kinds of discontinuous second fundamental forms ([5,17,9]), and we use the CMC results of [5] to cope with the discontinuities in the mean curvature. From this perspective the singularities in the mean curvature are comparatively mild. It would be interesting to know if low regularity techniques introduced in [17] and extended in [9] could be generalized to non-CMC conformal data of the regularity we consider here.

1.1. Conformal parameterizations. Let $(M^n, h)$ be a Riemannian manifold and let $K$ be a second fundamental form on $M^n$, i.e. a symmetric (0, 2)-tensor. The vacuum Einstein constraint equations for $(h, K)$ are

$$R_h - |K|_h^2 + (\operatorname{tr}_h K)^2 = 0 \qquad \text{[Hamiltonian constraint]}, \tag{1a}$$
$$\operatorname{div}_h K - d\,\operatorname{tr}_h K = 0 \qquad \text{[momentum constraint]}, \tag{1b}$$

where $R_h$ is the scalar curvature of $h$. For simplicity, we restrict our attention to compact manifolds.

Problem 1 (Conformal Parameterization Problem). Let $(M^n, g)$ be a compact Riemannian manifold. Find a constructive parameterization of the set of solutions $(h, K)$ of Eq. (1) such that $h$ belongs to the conformal class of $g$.

If $(h, K)$ is a solution of Eq. (1) with $h$ in the conformal class of $g$, we may write $h = \phi^{q-2} g$ for some positive function $\phi$, where

$$q = \frac{2n}{n-2}. \tag{2}$$

Without loss of generality we can write $K = \phi^{-2}\left(S + \frac{T}{n}\, g\right)$, where $S$ is a traceless (0, 2)-tensor and $T$ is a scalar field. The constraint equations (1) for $(h, K)$ can then be written in terms of $(\phi, S, T)$ as

$$-2\kappa q\, \Delta_g \phi + R_g \phi - |S|_g^2\, \phi^{-q-1} + \kappa T^2 \phi^{-q-1} = 0, \tag{3a}$$
$$\operatorname{div}_g S - \kappa \phi^{q}\, d\!\left(\phi^{-q} T\right) = 0, \tag{3b}$$

where

$$\kappa = \frac{n-1}{n}. \tag{4}$$
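As a quick arithmetic check of the constants in (2) and (4), the following minimal sketch evaluates $q$ and $\kappa$ exactly for a given dimension. The function name `conformal_constants` is hypothetical and not part of the paper; the only inputs are the two formulas above. In dimension $n = 3$ it gives $q = 6$, $\kappa = 2/3$ and $2\kappa q = 8$, which is the coefficient of the second-derivative term in the three-dimensional Hamiltonian constraint (15).

```python
from fractions import Fraction


def conformal_constants(n: int) -> tuple[Fraction, Fraction]:
    """Return (q, kappa) from Eqs. (2) and (4) for a manifold of dimension n.

    Hypothetical helper for illustration only; it just evaluates
    q = 2n/(n-2) and kappa = (n-1)/n in exact rational arithmetic.
    """
    if n < 3:
        raise ValueError("q = 2n/(n-2) requires n >= 3")
    q = Fraction(2 * n, n - 2)
    kappa = Fraction(n - 1, n)
    return q, kappa


q, kappa = conformal_constants(3)
print(q, kappa, 2 * kappa * q)  # prints: 6 2/3 8
```

Exact fractions are used so that identities such as $2\kappa q = 4(n-1)/(n-2)$ can be compared without rounding.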


The conformal parameterization problem then amounts to parameterizing the solutions $(\phi, S, T)$ of (3). The conformal method [6] and its variation, the conformal thin sandwich (CTS) method [21], provide possible approaches for solving Problem 1. An overview of these methods can be found in [3]. We summarize the techniques here to establish notation and to state known results that impact our analysis of the model problem.

With the conformal method, one specifies a mean curvature $\tau$ and a transverse-traceless tensor $\sigma$ (i.e. a symmetric, trace-free, divergence-free (0, 2)-tensor). We write $T = \phi^q \tau$ and $S = \sigma + LW$, where $W$ is an unknown vector field and $L$ is the conformal Killing operator defined by

$$(LV)_{ij} = \nabla_i V_j + \nabla_j V_i - \frac{2}{n}\, \nabla^k V_k\, g_{ij}. \tag{5}$$

Equations (3) then become

$$-2\kappa q\, \Delta_g \phi + R_g \phi - |\sigma + LW|_g^2\, \phi^{-q-1} + \kappa \tau^2 \phi^{q-1} = 0 \qquad \text{[conformal Hamiltonian constraint]}, \tag{6a}$$
$$\operatorname{div}_g LW - \kappa \phi^{q}\, d\tau = 0 \qquad \text{[conformal momentum constraint]}. \tag{6b}$$

These are coupled nonlinear elliptic equations to solve for unknowns $(\phi, W)$.

For the CTS approach one specifies $\sigma$ and $\tau$ along with an additional positive scalar function $N$ which represents a lapse.¹ The CTS method is then obtained by replacing $LW$ with $\frac{1}{2N} LW$ wherever it appears in the discussion for the conformal method. Although operationally similar to the conformal method, the CTS method has the advantage of being conformally covariant. Specifically, if $\theta$ is a positive function, then conformal data $(\theta^{q-2} g, \theta^{-2}\sigma, \theta^{q} N, \tau)$ yields the solution $(h, K)$ if and only if $(g, \sigma, N, \tau)$ does. From the perspective of working with a fixed background metric $g$, the standard conformal method simply corresponds to the CTS method with the choice of $N = 1/2$. We can think of the CTS approach as providing many different parameterizations, one for each choice of $N$. It is not known if certain choices of $N$ are superior for the purposes of finding a parameterization.

From the conformal covariance we observe that the choice of $N$ in the conformal thin sandwich method is equivalent to the choice of background metric for the conformal method: the solution theory for the standard conformal method with the background metric $\hat g = \theta^{q-2} g$ is equivalent to the solution theory for the conformal thin sandwich method with lapse function $N = \frac{1}{2}\theta^{-q}$. A conformal thin sandwich solution exists for $(g, \sigma, N, \tau)$ if and only if a standard conformal method solution exists for $(\hat g, \theta^{-2}\sigma, \tau)$, and the resulting solutions of the Einstein constraint equations are the same.

In the event that $\tau$ is constant, it is easy to see that the existence theory for system (6) reduces to the study of the Lichnerowicz equation

$$-2\kappa q\, \Delta_g \phi + R_g \phi - |\sigma|_g^2\, \phi^{-q-1} + \kappa \tau^2 \phi^{q-1} = 0. \tag{7}$$

The obstruction to the existence of solutions of (7) is stated in terms of the metric's Yamabe invariant

$$Y_g = \inf_{\substack{f \in C^\infty(M) \\ f \not\equiv 0}} \frac{\int_M 2\kappa q\, |\nabla f|_g^2 + R_g f^2 \; dV_g}{\|f\|_{L^q}^2}, \tag{8}$$

and we have the following theorem from [16].

Theorem 1. Let $(M, g)$ be a smooth compact Riemannian manifold, let $\sigma$ be a transverse-traceless tensor, and let $\tau$ be a constant. Then there exists a positive solution of (7) (and hence a solution of the conformally parameterized constraint equations (6)) if and only if one of the following holds:
1. $Y_g > 0$, $\sigma \not\equiv 0$,
2. $Y_g = 0$, $\tau \neq 0$, $\sigma \not\equiv 0$,
3. $Y_g < 0$, $\tau \neq 0$,
4. $Y_g = 0$, $\tau = 0$, $\sigma \equiv 0$.
When a solution exists it is unique, except in Case 4, in which case any two solutions are related by a positive scalar multiple.

Hence the set of CMC solutions of (1) having a metric conformally related to $g$ is essentially parameterized by choosing pairs $(\sigma, \tau)$. The following non-CMC variation of Theorem 1 appeared in [18].

Theorem 2. Let $(M, g)$ be a smooth compact Riemannian 3-manifold with no conformal Killing fields. Suppose $\sigma$ and $\tau$ are a transverse-traceless tensor and a mean curvature such that one of the following holds:
1. $Y_g > 0$, $\sigma \not\equiv 0$,
2. $Y_g = 0$, $\sigma \not\equiv 0$, $\tau \not\equiv 0$,
3. $Y_g < 0$ and there exists $\hat g$ in the conformal class of $g$ such that $R_{\hat g} = -\tau^2$.
If there exists a global upper barrier for $(g, \sigma, \tau)$, then there exists at least one solution of the conformally parameterized constraint equations (6).

The reader is referred to [18] for the definition of a global upper barrier (where it is called a global supersolution²); see also Appendix B. Cases 1–3 of Theorem 2 reduce to those of Theorem 1 if $\tau$ is constant. Moreover, the condition on $\tau$ in Case 3 is necessary if $Y_g < 0$ [17].

Until now, all results for the conformal method are consistent with the possibility that (aside from the exceptional Case 4 of Theorem 1), the conditions of Cases 1–3 of Theorem 2 are necessary and sufficient for the unique solvability of Eq. (6). We show in this paper that this is not the case. In particular we find certain data satisfying the conditions of Case 2 for which there are nontrivially related multiple solutions. We also find other symmetric data satisfying the conditions of Case 2 for which there are no symmetric solutions (and hence there are either no solutions or there are multiple solutions).

Global upper barriers can be found if the conformal data is CMC, satisfies a near-CMC condition such as

$$\frac{\max |\nabla \tau|}{\min |\tau|} \ \text{is sufficiently small}, \tag{9}$$

or if $Y_g > 0$ and $\sigma$ is small-TT, i.e.

$$\max |\sigma|_g \ \text{is sufficiently small, with smallness depending on } \tau. \tag{10}$$

¹ Although the CTS method is not usually presented as specifying $\sigma$ (compare [21]) it is straightforward to show that the presentation here is equivalent to the usual one.
² The terminology global supersolution is perhaps misleading since it is not clear that all solutions of (6) have associated global supersolutions.


This last upper barrier was first presented in [9] and led to the far-from CMC results of [9] and [18]. Uniqueness theorems are available for a general class of near-CMC data under additional hypotheses on the size of $|\nabla\tau|$ ([13,11]), but nothing is known concerning uniqueness in the small-TT case.

Results of O'Murchadha and Isenberg [14] show that the condition $\sigma \not\equiv 0$ in Hypotheses 1 and 2 of Theorem 2 is necessary for certain non-CMC data. In particular, their “no-go” theorem proves that if $R_g \geq 0$ (or if $Y_g \geq 0$ and we are using the CTS method), then there does not exist a solution of (6) if $\tau$ is near-CMC and $\sigma \equiv 0$. Rendall has also shown, as presented in [14], that there exists a class of Yamabe-positive far-from CMC conformal data with $\sigma \equiv 0$ such that if a solution to Eq. (6) exists, it is not unique. It is not known which of existence or uniqueness fails for Rendall's data.

Symmetries pose a difficulty for the conformal method, and this hampers the development of concrete examples. Essentially all non-CMC existence results require that $(M^n, g)$ has no conformal Killing fields.³ Analytically this condition arises to guarantee that the operator $\operatorname{div} L$ is surjective, but the need for this condition is more fundamental. If $(M^n, g)$ admits a nontrivial conformal Killing field $X$, then selection of a mean curvature poses an a-priori restriction on the solution $\phi$ of (6) even before $\sigma$ is selected. If $(h, K)$ is a solution of the constraint equations, then the mean curvature $\tau = \operatorname{tr}_h K$ must satisfy $\int_M X(\tau)\, dV_h = 0$; this identity is obtained by multiplying the momentum constraint (1b) by $X$ and integrating by parts.⁴ Writing this equation in terms of $g$ we find

$$\int_M \phi^{q} X(\tau)\, dV_g = 0. \tag{11}$$

If $\tau$ is constant then Eq. (11) holds trivially. If it is possible to find a solution $(\phi, W)$ for general data $(\sigma, \tau)$, then $W$ has to arise in such a way that $\phi$, which solves a Lichnerowicz equation depending on $W$, also satisfies (11). The mechanism which might cause this for arbitrary conformal data is not understood, and the issue is sidestepped in the literature by assuming that there are no conformal Killing fields.

2. Conformally Flat Symmetric Data on the Torus

Let $S^1_r$ denote the circle of radius $r$ and let $M^n = S^1_{r_1} \times \cdots \times S^1_{r_n}$ with the product metric $g$. We can pick coordinates $x^k$ along each factor such that $g_{ij} = \delta_{ij}$ and consider the following variation of Problem 1.

Problem 2 (Reduced Parameterization Problem). Find all solutions $(h, K)$ of the Einstein constraint equations on $M^n$ such that $h$ is conformally related to $g$ and such that the Lie derivatives $\mathcal{L}_{\partial_k} g$ and $\mathcal{L}_{\partial_k} K$ vanish for $1 \leq k \leq n-1$.

In practice we are seeking solutions such that $h$ and $K$ are periodic functions of $x^n$ alone; by an obvious scaling argument we may reduce to the case $r_n = 1$ and $x \equiv x^n \in [-\pi, \pi]$.

³ [12] contains an exception, but it requires the conformal data be constant along the integral curves of any conformal Killing fields. For the toroidal initial data we consider in Sect. 3 this amounts to assuming that $\tau$ is constant.
⁴ This condition should be compared with the Bourguignon-Ezin condition $\int_M X(R_g)\, dV_h = 0$ for the prescribed scalar curvature problem [2].


The maximal globally hyperbolic spacetime obtained from such data will be a Gowdy spacetime with a conformally flat Cauchy surface. Our focus is not so much to generate initial data for Gowdy spacetimes (the formulation of the constraint equations found in [7] is more convenient for that purpose), but to use the conformally flat torus as a test case for conformal parameterizations in general. We remark that the CMC version of Problem 2 (including more general toroidal background metrics) was effectively treated in [15].

For the moment we work in three dimensions and use the variables $(\phi, S, T)$ introduced in the previous section. In coordinates we can write

$$S = \frac{1}{3}\begin{bmatrix} -a-b & c & d \\ c & -a+b & e \\ d & e & 2a \end{bmatrix}. \tag{12}$$

Assuming that $S$ and $T$ are functions of $x = x^3$ alone, we have $\operatorname{div} S = \frac{1}{3}(d', e', 2a')$, and hence the momentum constraint (3b) reads

$$\frac{1}{3}(d', e', 2a') = \frac{2}{3}\,\phi^{6}\left(0, 0, (\phi^{-6} T)'\right). \tag{13}$$

Here primes denote derivatives with respect to $x$. Note that $S$ is transverse-traceless if and only if $a$, $d$, and $e$ are constant, and that $(\phi, S, T)$ satisfies the momentum constraint if and only if $d$ and $e$ are constant and

$$a' = \phi^{6} (\phi^{-6} T)'. \tag{14}$$

Letting $\eta^2 = (b^2 + c^2 + d^2 + e^2)/9$, and noting that $(M^n, g)$ is scalar flat, the Hamiltonian constraint (3a) reads

$$-8\phi'' - 2\eta^2 \phi^{-7} + \frac{2}{3}\left(T^2 - a^2\right)\phi^{-7} = 0. \tag{15}$$

A similar derivation works in higher dimensions, and we obtain the reduced equations

$$-2\kappa q\, \phi'' - 2\eta^2 \phi^{-q-1} + \kappa\left(T^2 - a^2\right)\phi^{-q-1} = 0, \qquad a' - \phi^{q}(\phi^{-q} T)' = 0. \tag{16}$$

Solving Problem 2 amounts to parameterizing the solutions $(\phi, \eta, a, T)$ of (16).

The conformal method can be described in this framework as follows. First we write

$$T = \phi^{q} \tau, \tag{17}$$

where $\tau$ is a prescribed mean curvature function and the conformal factor $\phi$ is unknown. Additionally, we decompose

$$a = \mu + w', \tag{18}$$

where $\mu$ is a prescribed constant and $w$ is an unknown function. The function $w$ is related to the vector field $W$ of the conformal method via $2W = w\,\partial_n$. The constant $\mu$ is part of the transverse-traceless tensor; to specify the remainder we select an arbitrary function $\eta$. Equations (16) become

$$-2\kappa q\, \phi'' - 2\eta^2 \phi^{-q-1} - \kappa(\mu + w')^2 \phi^{-q-1} + \kappa \tau^2 \phi^{q-1} = 0, \qquad w'' - \phi^{q} \tau' = 0. \tag{19}$$
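The step from (12) to (15) uses the identity $|S|_g^2 = \frac{2}{3}a^2 + 2\eta^2$ for the flat metric $g_{ij} = \delta_{ij}$, which is where the terms $-2\eta^2\phi^{-7}$ and $-\frac{2}{3}a^2\phi^{-7}$ come from. The sketch below checks that identity, and the vanishing trace of $S$, in exact arithmetic for one arbitrary choice of entries. The helper names `S_matrix` and `norm_squared` and the sample values are assumptions made for illustration; they do not appear in the paper.

```python
from fractions import Fraction


def S_matrix(a, b, c, d, e):
    """Build the symmetric matrix S of Eq. (12) with exact rational entries."""
    third = Fraction(1, 3)
    return [
        [third * (-a - b), third * c, third * d],
        [third * c, third * (-a + b), third * e],
        [third * d, third * e, third * 2 * a],
    ]


def norm_squared(S):
    """|S|^2 for the flat metric g_ij = delta_ij: the sum of squared entries."""
    return sum(entry * entry for row in S for entry in row)


# Arbitrary sample values, chosen only for this check.
a, b, c, d, e = 2, -1, 3, 5, 7
S = S_matrix(a, b, c, d, e)
eta_sq = Fraction(b * b + c * c + d * d + e * e, 9)

trace = sum(S[i][i] for i in range(3))
assert trace == 0  # S is trace-free
assert norm_squared(S) == Fraction(2, 3) * a * a + 2 * eta_sq
```

With $\kappa = 2/3$ in three dimensions, substituting this identity into (3a) combines $-|S|_g^2\phi^{-7}$ and $\kappa T^2\phi^{-7}$ into the form shown in (15).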


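For the CTS approach we additionally choose a positive function $N$ and write $a = \mu + (2N)^{-1} w'$. The CTS equations are then

$$-2\kappa q\, \phi'' - 2\eta^2 \phi^{-q-1} - \kappa\left(\mu + (2N)^{-1} w'\right)^2 \phi^{-q-1} + \kappa \tau^2 \phi^{q-1} = 0, \qquad \left((2N)^{-1} w'\right)' - \phi^{q} \tau' = 0. \tag{20}$$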

Equations (20) provide a model for the full CTS equations on a Yamabe-null manifold. The nonlinear coupling for this system is the same as for the original equations. Moreover, the background metric on S 1 has a nontrivial conformal Killing field (∂x ). Hence the central difficulties of the conformal method are present in the model. Appendix B outlines how standard techniques for the conformal method can be adapted to equations (20) if the data satisfy an additional evenness hypothesis. Our primary focus, however, is on examining a family of conformal data for which we obtain stronger results than are possible with the techniques of Appendix B.

3. A Family of Low Regularity Conformal Data

The prescribed data for system (20) are a constant $\mu$ and a function $\eta$ together with a mean curvature function $\tau$. We will assume that $\eta$ is constant and work with a one-parameter family of mean curvatures

$$\tau_t = t + \lambda, \tag{21}$$

where $t$ is constant and

$$\lambda(x) = \begin{cases} -1, & -\pi < x < 0, \\ \phantom{-}1, & 0 < x < \pi. \end{cases} \tag{22}$$
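Since $\lambda$ takes only the values $\pm 1$, the mean curvature $\tau_t$ takes only the two values $t - 1$ and $t + 1$, so it changes sign exactly when $|t| < 1$. The following minimal sketch encodes (21) and (22) and that observation. The function names `lam`, `tau` and `changes_sign` are hypothetical, and the choice to leave $\lambda$ undefined at the jump points $x = 0, \pm\pi$ is an assumption of this sketch.

```python
import math


def lam(x: float) -> float:
    """Step function of Eq. (22) on (-pi, pi), undefined at the jump points."""
    if -math.pi < x < 0:
        return -1.0
    if 0 < x < math.pi:
        return 1.0
    raise ValueError("lambda is only defined for x in (-pi, 0) or (0, pi)")


def tau(t: float, x: float) -> float:
    """Mean curvature tau_t(x) = t + lambda(x) of Eq. (21)."""
    return t + lam(x)


def changes_sign(t: float) -> bool:
    """True when the two values t - 1 and t + 1 have strictly opposite signs."""
    return (t - 1.0) * (t + 1.0) < 0


print(changes_sign(0.5), changes_sign(2.0))  # prints: True False
```

For large $|t|$ the deviation of $\tau_t$ from its mean $t$ has fixed size 1, so the relative deviation $1/|t|$ is small.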

This three-parameter family is suitable for exploring simultaneous violations of the near-CMC and small-TT hypotheses. The parameters η and μ control the size of the relevant pieces of the transverse-traceless tensor. On the other hand, t controls the departure from CMC in the sense that for large values of t the mean curvature has small relative deviation from its mean, and is hence near-CMC (see also Proposition 22 and the subsequent discussion in Appendix B). Data of this kind fall outside the current theory of the conformal method for two reasons. First, the manifold possesses a non-trivial conformal Killing field (∂x ) and the non-CMC data is not constant along it. Second, the discontinuities in τt make the data more singular than is treated in the current best low-regularity results of [9] for the full coupled system (6). We avoid both difficulties by showing that the reduced system (20) for this data can be decoupled, and the analysis will reduce to the study of Lichnerowicz-type equations. Just as for the CMC-conformal method, the decoupling removes potential obstructions posed by conformal Killing fields. Moreover, the data we consider are only modestly irregular for the Lichnerowicz equation alone. In particular, the results of [5] are applicable.

A Model Problem for Conformal Parameterizations of the Einstein Constraint Equations


Fig. 1. Ranges of t considered by the theorems of Sect. 3.1

3.1. Summary of results. We wish to solve

\[ \begin{aligned} -2\kappa q\,\phi'' - 2\eta^2\phi^{-q-1} - \kappa\bigl(\mu + (2N)^{-1}w'\bigr)^2\phi^{-q-1} + \kappa\tau_t^2\phi^{q-1} &= 0, \\ \bigl((2N)^{-1}w'\bigr)' - \phi^{q}\tau_t' &= 0 \end{aligned} \tag{23} \]

on $S^1$. Here $N$ is a given smooth lapse function, $\eta$ and $\mu$ are constants, and $\tau_t$ is defined by (21) and (22). We seek solutions $(\phi, w) \in W^{2,p}_+(S^1) \times W^{1,p}(S^1)$ where $p > 1$; the subscript $+$ denotes the subset of positive functions. An easy bootstrap argument shows that if such a solution exists it belongs to $W^{2,\infty}_+(S^1) \times W^{1,\infty}(S^1)$. If $(\phi, w)$ is a solution, so is $(\phi, w + c)$ for any constant $c$, and it determines the same solution of the constraint equations. We will say that $(\phi, w)$ is the unique solution of (23) if any other solution is of the form $(\phi, w + c)$.

The existence theory turns out to depend on the choice of lapse function $N$ in the conformal thin sandwich case (or equivalently, on the choice of conformal representative of the background metric in the standard conformal method). We define

\[ \gamma_N = -\frac{\int_{S^1} \lambda N}{\int_{S^1} N}. \tag{24} \]

It is easy to see that $-1 < \gamma_N < 1$ and that if $N$ is constant (as in the conformal method with the flat background metric), then $\gamma_N = 0$. Our results depend on the value of $t$ and its relationship with $\gamma_N$ (Fig. 1). The near-CMC regime is expressed in terms of the distance between $t$ and $\gamma_N$.

Theorem 3 (Near-CMC results). If $|t - \gamma_N| > 2$ there exists a solution $(\phi, w)$ of (23) if and only if $\eta \neq 0$ or $\mu \neq 0$. Solutions are unique if $\mu = 0$.

Note that the condition $\eta \neq 0$ or $\mu \neq 0$ is exactly the condition that the transverse-traceless tensor is not identically zero. Hence Theorem 3 extends the near-CMC existence/uniqueness theorem of [11] and the "no-go" theorem of [14] to this family of data. We have not determined if uniqueness holds for $\mu \neq 0$. The value $t = \gamma_N$ is special, and we have the following result that is a partial analogue of exceptional Case 4 of Theorem 1.

Theorem 4 (Exceptional case: $t = \gamma_N$). If $t = \gamma_N$ and if $\mu = \eta = 0$, then there exists a one-parameter family of solutions of (23). If $\mu = 0$ and $\eta \neq 0$, there does not exist a solution.
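The constant of Eq. (24) and the hypotheses on $t$ that appear in the theorems of this section are easy to evaluate numerically. The Python sketch below is illustrative only: it reads Eq. (24) as $\gamma_N = -\int_{S^1}\lambda N / \int_{S^1} N$, and the names `lam`, `gamma_N`, `regime`, as well as the midpoint quadrature, are assumptions introduced here rather than part of the original analysis.

```python
import math


def lam(x):
    """Step function of Eq. (22): -1 on (-pi, 0) and +1 on (0, pi)."""
    return -1.0 if x < 0.0 else 1.0


def gamma_N(lapse, samples=20000):
    """Approximate -(integral of lam*N) / (integral of N) over (-pi, pi).

    Midpoint rule; `samples` should be even so that the jump of lam at
    x = 0 falls on a cell boundary (hypothetical helper).
    """
    h = 2.0 * math.pi / samples
    num = 0.0
    den = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h
        n = lapse(x)
        num += lam(x) * n
        den += n
    return -num / den  # the common factor h cancels


def regime(t, g, tol=1e-12):
    """List which hypotheses on t hold, given gamma_N = g."""
    labels = []
    if abs(t - g) > 2.0:
        labels.append("near-CMC")
    if abs(t - g) <= tol:
        labels.append("exceptional")
    if abs(abs(t) - 1.0) <= tol:
        labels.append("open case |t| = 1")
    elif abs(t) > 1.0:
        labels.append("non-vanishing mean curvature")
    else:
        labels.append("sign-changing mean curvature")
    return labels
```

For a constant lapse the sketch returns $\gamma_N = 0$ up to rounding, in agreement with the remark following Eq. (24); for the hypothetical lapse $N(x) = 2 + \sin x$ the exact value is $-1/\pi$.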


D. Maxwell

It is not known if the non-existence result of Theorem 4 can be extended to include the case $\mu \neq 0$. Given the non-existence result of Theorem 4, we can only expect a small-TT existence theorem if $t \neq \gamma_N$. We have shown that if $\gamma_N = 0$, then this is essentially the only condition needed to obtain small-TT solutions, and have obtained a partial result for $\gamma_N \neq 0$.

Theorem 5 (Small-TT results). Suppose $|t| > |\gamma_N|$ and $|t| \neq 1$. If $\mu \neq 0$ or $\eta \neq 0$, and if $\mu$ and $\eta$ are sufficiently small, then there exists at least one solution of (23).

It is not known if existence holds if $\gamma_N \neq 0$ and either $t = -\gamma_N$ or $|t| < |\gamma_N|$. The case $|t| = 1$ remains open as well. The mean curvature changes sign if and only if $|t| < 1$. We have the following existence theorem that applies when $|t| > 1$. Note that since $|\gamma_N| < 1$, the near-CMC condition $|t - \gamma_N| > 2$ is strictly stronger than the condition $|t| > 1$.

Theorem 6 (Non-vanishing mean curvature). Suppose $|t| > 1$ and either $\mu \neq 0$ or $\eta \neq 0$. Then there exists at least one solution of (23).

We have not determined if solutions are unique in this case, nor do we have an extension of the "no-go" theorem to this regime. The existence theory for $|t| < 1$ is quite different from that for the near-CMC regime. If $\mu = 0$, we can show that when solutions exist, there are usually at least two, and that if $\mu = 0$ and $\eta$ is sufficiently large, then there are no solutions. Hence a small-TT hypothesis is necessary if $\mu = 0$.

Theorem 7 (Nonexistence/non-uniqueness). Suppose $|t| < 1$ and $\mu = 0$. There exists a critical value $\eta_0 \geq 0$ such that if $|\eta| < \eta_0$ there exist at least two solutions of (23), and if $|\eta| \geq \eta_0$ there are no solutions. If in addition $|t| > |\gamma_N|$, then $\eta_0 > 0$.

The preceding theorems omit the case $t = \pm 1$. These values of $t$ are interesting as they correspond to mean curvatures $\tau_t$ that are equal to zero on a large set. The techniques for working with such mean curvatures are somewhat specialized, and for simplicity we do not consider these values. We conjecture, however, that Theorem 5 can be extended to include $t = \pm 1$. The following theorem collects the results of Theorems 3 through 7 specialized to the case $\mu = 0$ and $\gamma_N = 0$ where they are most complete.

Theorem 8. Suppose $\mu = 0$ and $\gamma_N = 0$.
1. If $|t| > 2$, there exists a solution of (23) if and only if $\eta \neq 0$. If a solution exists it is unique.
2. If $|t| > 1$ and $\eta \neq 0$, there exists at least one solution.
3. If $0 < |t| < 1$, there is a critical value $\eta_0 > 0$. If $0 < |\eta| < \eta_0$, there are at least two solutions. If $|\eta| > \eta_0$ there are no solutions.
4. If $t = 0$ there exists a solution if and only if $\eta = 0$, in which case there is a one-parameter family of solutions.

Figure 2 illustrates Theorem 8. We have a fairly complete picture of the existence/uniqueness theory when $\mu = 0$; we are missing a non-existence result for $0 < t < 2$ if $\eta = \mu = 0$, a uniqueness result for $1 < |t| < 2$, and results for $|t| = 1$. A little care is required in translating the results for the model problem to the full conformal method. Because we are seeking solutions within a symmetry class, the number of


Fig. 2. Multiplicity of solutions for t ≥ 0 and η ≥ 0 when μ = 0. Dashed lines correspond to curves where the multiplicity is unknown. The shape of the curve separating the existence and non-existence regions for t < 1 is conjectural

solutions we find is a lower bound for the total number of solutions. Non-uniqueness for the model problem implies non-uniqueness for the full conformal method, but uniqueness only implies that there is a single solution with symmetry. Solutions without symmetry (of which there must be more than one if there are any) may be present. Similarly, nonexistence for the model problem implies either non-existence or non-uniqueness for the full conformal method.

3.2. Reduction to root finding. In this section we show how, for the specific choice of mean curvatures $\tau_t$ in Eqs. (21) and (22), the existence theory of system (23) can be reduced to the question of finding roots of a certain real valued function. We first show that the solution of the momentum constraint can be determined exactly, up to knowledge of the value of $\phi(0)$.

Proposition 1. Suppose $(\phi, w) \in W^{2,\infty}_+(S^1) \times W^{1,\infty}(S^1)$ is a solution of (23). Let

\[ \gamma_N = -\frac{\int_{S^1} \lambda N}{\int_{S^1} N}. \tag{25} \]

Then

\[ \frac{1}{2N}\,w' = \phi(0)^q\,[\lambda + \gamma_N]. \tag{26} \]

Proof. Notice that $\tau_t' = 2[\delta_0 - \delta_\pi]$, where $\delta_x$ denotes the Dirac delta distribution with singularity at $x$. If $(\phi, w)$ is a solution of (23), then

\[ \bigl((2N)^{-1}w'\bigr)' = 2\phi^q[\delta_0 - \delta_\pi] = 2\phi(0)^q\delta_0 - 2\phi(\pi)^q\delta_\pi. \tag{27} \]

Since $\langle ((2N)^{-1}w')', 1\rangle = 0$ (where $\langle\cdot,\cdot\rangle$ denotes the pairing of distributions on test functions) we have

\[ 0 = \langle \phi^q(\delta_0 - \delta_\pi), 1\rangle = \phi(0)^q - \phi(\pi)^q. \tag{28} \]

Hence $\phi(0) = \phi(\pi)$. The momentum constraint then reads

\[ \Bigl(\frac{1}{2N}\,w'\Bigr)' = \phi(0)^q\,\lambda'. \tag{29} \]

Hence

\[ \frac{1}{2N}\,w' = \phi(0)^q\,[\lambda + C] \tag{30} \]

for some constant $C$. Since $\int_{S^1} w' = 0$ the value of $C$ is determined by

\[ \int_{S^1} 2N[\lambda + C] = 0. \tag{31} \]

This occurs precisely when $C = \gamma_N$.

Substituting Eq. (26) into the Hamiltonian constraint of system (23) we obtain a nonlocal equation for $\phi$.

Proposition 2. Suppose $(\phi, w) \in W^{2,\infty}_+(S^1) \times W^{1,\infty}(S^1)$ solves (23). Then $\phi$ satisfies

\[ -2\kappa q\,\phi'' - 2\eta^2\phi^{-q-1} - \kappa[\mu + \phi(0)^q(\gamma_N + \lambda)]^2\phi^{-q-1} + \kappa(t + \lambda)^2\phi^{q-1} = 0. \tag{32} \]

Conversely, suppose $\phi \in W^{2,\infty}_+(S^1)$ is a solution of (32). Then there exists a solution $w \in W^{1,\infty}(S^1)$ (uniquely determined up to a constant) of (26) and $(\phi, w)$ is a solution of (23).

Proof. If $(\phi, w)$ is a solution of (23) then Proposition 1 implies $w$ solves (26). Substituting this solution into the Lichnerowicz equation, we obtain Eq. (32). Conversely, suppose $\phi$ solves (32). By the choice of $\gamma_N$, Eq. (26) is integrable and the solution $w \in W^{1,\infty}(S^1)$ is determined up to a constant. Let $w$ be such a solution. By construction, $w$ solves the momentum constraint for $\phi$, and $\phi$ solves the Hamiltonian constraint for $w$. That is, $(\phi, w)$ is a solution of (23).

To study the nonlocal Eq. (32) we introduce a family of Lichnerowicz equations depending on a positive parameter $d$:

\[ -2\kappa q\,\phi_d'' - 2\eta^2\phi_d^{-q-1} - \kappa[\mu + (\gamma_N + \lambda)d^q]^2\phi_d^{-q-1} + \kappa(t + \lambda)^2\phi_d^{q-1} = 0. \tag{33} \]

Clearly the solutions of (32) are in one-to-one correspondence with the solutions $\phi_d$ of (33) satisfying $\phi_d(0) = d$. The functions $\phi_d$ tend to grow as $d$ increases, and it will be more convenient to work with a rescaled function that is bounded as $d \to \infty$. The following result follows easily from Proposition 2 after defining $\psi_d = d^{-1}\phi_d$. We omit the proof.

Proposition 3. The solutions of (23) are in one-to-one correspondence with the functions $\psi_d \in W^{2,p}_+(S^1)$ satisfying

\[ -2\kappa q\,d^{-\frac{2q}{n}}\psi_d'' - 2\eta^2 d^{-2q}\psi_d^{-q-1} - \kappa(\mu d^{-q} + \gamma_N + \lambda)^2\psi_d^{-q-1} + \kappa(t + \lambda)^2\psi_d^{q-1} = 0 \tag{34} \]

and

\[ \psi_d(0) = 1 \tag{35} \]

for some $d > 0$. Given a solution $\psi_d$ solving (34) and satisfying $\psi_d(0) = 1$, the corresponding solution $\phi$ of (32) is $d\psi_d$.


Equation (34) can be written as a Lichnerowicz equation of the form

\[ -u'' - \alpha^2 u^{-q-1} + \beta^2 u^{q-1} = 0, \tag{36} \]

where $\alpha \not\equiv 0$ and $\beta \not\equiv 0$. We have the following facts for this equation, which are proved in Appendix A.

Proposition 4. Suppose $\alpha$ and $\beta$ in Eq. (36) belong to $L^\infty(S^1)$ and that $\alpha \not\equiv 0$ and $\beta \not\equiv 0$. Let $p > 1$.
1. There exists a unique solution $u \in W^{2,p}_+(S^1)$, and moreover $u \in W^{2,\infty}(S^1)$.
2. If $w \in W^{2,\infty}_+(S^1)$ is a subsolution of (36) (i.e. $-w'' - \alpha^2 w^{-q-1} + \beta^2 w^{q-1} \leq 0$), then $w \leq u$.
3. If $v \in W^{2,\infty}_+(S^1)$ is a supersolution of (36) (i.e. $-v'' - \alpha^2 v^{-q-1} + \beta^2 v^{q-1} \geq 0$), then $v \geq u$.
4. The solution $u \in W^{2,p}_+$ depends continuously on $(\alpha, \beta) \in L^\infty \times L^\infty$.

We can now define the real valued function $F$ that will be the focus of our study.

Definition 1. Let $t$ be a constant and let $\tau_t$ be defined by Eqs. (21) and (22). Let $N$ be a smooth lapse function and let $\gamma_N$ be defined by Eq. (24). Finally, let $\eta$ and $\mu$ be constants. For $d > 0$, Proposition 4 Part 1 implies that there exists a corresponding solution $\psi_d \in W^{2,\infty}_+(S^1)$ of Eq. (34). We define $F : \mathbb{R}_{>0} \to \mathbb{R}_{>0}$ by

\[ F(d) = \psi_d(0). \tag{37} \]

We define $F_0$ to be the analogous function corresponding to the same mean curvature but vanishing transverse-traceless tensor (i.e. for $\mu = \eta = 0$). From Proposition 3 it is clear that the existence theory of the CTS method for this family of data reduces to the study of the (algebraic) solutions of $F(d) = 1$.

Proposition 5. The solutions $(\phi, w) \in W^{2,\infty}_+(S^1) \times W^{1,\infty}(S^1)$ of system (23) are in one-to-one correspondence with the positive solutions of $F(d) = 1$.

3.3. Solutions of $F(d) = 1$. Theorems 3 through 7 follow from Proposition 5 and facts about $F$ and $F_0$ proved in this section. Figure 3 shows representative graphs of $F$ and $F_0$ obtained by numerical computation for certain values of $t$, $\eta$ and $\mu$. Key features are the singular behaviour of $F$ at $d = 0$, the limit of $F_0$ at $d = 0$, and the rapid convergence of $F$ and $F_0$ to a common limit at $\infty$. We note that for the illustrated choice of $t$, $\eta$ and $\mu$ it appears there is exactly one solution of $F(d) = 1$ and none of $F_0(d) = 1$.

3.3.1. Elementary estimates for $F$. In this section we establish:
1. If $\mu \neq 0$ or $\eta \neq 0$ (i.e. if the transverse-traceless tensor is not identically zero) then $F(d)$ is $O(d^{-1})$ for $d$ sufficiently small.
2. If $\mu = \eta = 0$ then $F$ is uniformly bounded on $(0, \infty)$.
3. For all values of $\mu$ and $\eta$, $F(d)$ is bounded above for values of $d$ sufficiently large.
4. If $\mu \neq 0$ or $\eta \neq 0$, then a solution of $F(d) = 1$ exists if and only if $F(d) \leq 1$ for some $d > 0$.

These facts are all demonstrated by examining constant sub- and supersolutions.
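Item 4 of the list above is an Intermediate Value Theorem statement, so once $F$ can be evaluated a root of $F(d) = 1$ can be located by bisection. The Python sketch below assumes only that `F` is a continuous callable on $(0, \infty)$ that exceeds 1 near $d = 0$; the function name `solve_F_equals_one` and the stand-in `toy_F` are hypothetical and are not the function $F$ of Definition 1, whose evaluation would require solving Eq. (34).

```python
def solve_F_equals_one(F, d_hi, tol=1e-10, max_iter=200):
    """Find one d in (0, d_hi] with F(d) = 1 by bisection.

    Assumes F is continuous, F(d_hi) <= 1, and F(d) > 1 for small d.
    """
    if F(d_hi) > 1.0:
        raise ValueError("need a point d_hi with F(d_hi) <= 1")
    # Halve d until the singular behaviour near d = 0 gives F > 1.
    d_lo = d_hi
    for _ in range(max_iter):
        d_lo *= 0.5
        if F(d_lo) > 1.0:
            break
    else:
        raise ValueError("no point with F > 1 found near d = 0")
    # Bisection keeps the invariant F(d_lo) > 1 >= F(d_hi).
    for _ in range(max_iter):
        if d_hi - d_lo <= tol:
            break
        mid = 0.5 * (d_lo + d_hi)
        if F(mid) > 1.0:
            d_lo = mid
        else:
            d_hi = mid
    return 0.5 * (d_lo + d_hi)


def toy_F(d):
    """Hypothetical stand-in: singular like 1/d at 0, limit 0.8 at infinity."""
    return 0.8 + 0.5 / d
```

Bisection returns a single root and says nothing about multiplicity, so in the regime of Theorem 7, where at least two solutions are expected, each root would need its own bracket.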


Fig. 3. Functions F and F0 for t = 3/2, μ = 0, and η = 3

Lemma 1. Suppose $|t| \neq 1$. We define the constants

\[ m_d = \min(M_{d,+}, M_{d,-}), \qquad M_d = \max(M_{d,+}, M_{d,-}), \tag{38} \]

where

\[ M_{d,\pm} = \left[\frac{2\eta^2 d^{-2q} + \kappa(\mu d^{-q} + \gamma_N \pm 1)^2}{\kappa(t \pm 1)^2}\right]^{\frac{1}{2q}}. \tag{39} \]

Then $m_d \leq \psi_d \leq M_d$ for all $d > 0$ and in particular

\[ m_d \leq F(d) \leq M_d. \tag{40} \]

Proof. A constant $M$ is a supersolution of (34) so long as

\[ -2\eta^2 d^{-2q} M^{-q-1} - \kappa\bigl(\mu d^{-q} + \gamma_N + \lambda\bigr)^2 M^{-q-1} + \kappa(t + \lambda)^2 M^{q-1} \geq 0. \tag{41} \]

Since $\lambda = \pm 1$ on $S^1$, this is ensured if

\[ \kappa(t \pm 1)^2 M^{2q} \geq 2\eta^2 d^{-2q} + \kappa\bigl(\mu d^{-q} + \gamma_N \pm 1\bigr)^2. \tag{42} \]

In particular, $M_d$ is a supersolution. Proposition 4, Part 3 now implies that $\psi_d \leq M_d$ on $S^1$. A similar proof shows that $m_d$ is a subsolution if $|t| \neq 1$, and hence Proposition 4, Part 2 implies $\psi_d \geq m_d$ on $S^1$.

From the limiting behaviour of $m_d$ and $M_d$ as $d \to \infty$ we have estimates for $\psi_d$ (and hence $F(d)$) for large values of $d$.

Lemma 2. Suppose $|t| \neq 1$. Let

\[ M_\infty = \max\left\{ \left|\frac{1 - \gamma_N}{1 - t}\right|^{\frac{1}{q}}, \left|\frac{1 + \gamma_N}{1 + t}\right|^{\frac{1}{q}} \right\} \tag{43} \]

and

\[ m_\infty = \min\left\{ \left|\frac{1 - \gamma_N}{1 - t}\right|^{\frac{1}{q}}, \left|\frac{1 + \gamma_N}{1 + t}\right|^{\frac{1}{q}} \right\}. \tag{44} \]

Given $\epsilon > 0$,

\[ m_\infty - \epsilon \leq \psi_d \leq M_\infty + \epsilon \tag{45} \]

holds for $d$ sufficiently large. If $\mu = \eta = 0$ then

\[ m_\infty \leq \psi_d \leq M_\infty \tag{46} \]

for all $d > 0$.

Proof. We note that

\[ \lim_{d\to\infty} M_d = M_\infty \tag{47} \]

and

\[ \lim_{d\to\infty} m_d = m_\infty. \tag{48} \]

Hence the bounds $m_\infty - \epsilon \leq \psi_d \leq M_\infty + \epsilon$ hold for $d$ sufficiently large. If $\mu = \eta = 0$, then $m_d = m_\infty$ and $M_d = M_\infty$ for all $d > 0$, so $m_\infty \leq \psi_d \leq M_\infty$ for all $d > 0$.

The singular or bounded behaviour of $F$ near zero follows from the analogous behaviour of the associated sub- and supersolutions.

Lemma 3. Suppose $|t| \neq 1$. If $\eta = \mu = 0$ then

\[ F(d) \leq M_\infty \tag{49} \]

for all $d > 0$. Otherwise there is a positive constant $c$ such that

\[ F(d) \geq c\,d^{-1} \tag{50} \]

for $d$ sufficiently small.

Proof. We note that if $\eta \neq 0$ or $\mu \neq 0$, then $M_{d,+}$ and $M_{d,-}$ are both $O(d^{-1})$ at $d = 0$ and hence so is $m_d$. The uniform upper bound (49) when $\mu = \eta = 0$ was proved in Lemma 2.

The singularity of $F$ at $d = 0$ gives a simple test for determining if there is at least one solution of $F(d) = 1$.

Lemma 4. Suppose $\eta \neq 0$ or $\mu \neq 0$. There exists a solution of $F(d) = 1$ if and only if for some $d > 0$, $F(d) \leq 1$.

Proof. By Lemma 3, $F(d) > 1$ for $d$ sufficiently small. Fixing $p > 1$, from Proposition 4, Part 4 it follows that the map $d \mapsto \psi_d$ from $(0, \infty)$ to $W^{2,p}(S^1)$ is continuous. From the continuous embedding $W^{2,p}(S^1) \hookrightarrow C(S^1)$ it follows that $F$ is continuous and the result now follows from the Intermediate Value Theorem.
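The constant barriers of Lemmas 1 and 2 are explicit, so their behaviour in $d$ can be tabulated directly. The Python sketch below is a minimal illustration under stated assumptions: it reads Eq. (39) as $M_{d,\pm} = [(2\eta^2 d^{-2q} + \kappa(\mu d^{-q} + \gamma_N \pm 1)^2)/(\kappa(t \pm 1)^2)]^{1/(2q)}$, the function names are hypothetical, and $\kappa$ and $q$ are passed in as parameters because their values are not fixed in this section (the sample values $\kappa = 2/3$, $q = 6$ used in the check are an assumption corresponding to $n = 3$ with $q = 2n/(n-2)$).

```python
def constant_barriers(d, t, gamma, eta, mu, kappa, q):
    """Return (m_d, M_d), the constant sub-/supersolution bounds on psi_d."""
    if d <= 0.0:
        raise ValueError("d must be positive")
    if abs(t) == 1.0:
        raise ValueError("the barriers are undefined for |t| = 1")

    def bound(sign):
        num = 2.0 * eta ** 2 * d ** (-2.0 * q) \
            + kappa * (mu * d ** (-q) + gamma + sign) ** 2
        den = kappa * (t + sign) ** 2
        return (num / den) ** (1.0 / (2.0 * q))

    plus, minus = bound(1.0), bound(-1.0)
    return min(plus, minus), max(plus, minus)


def limiting_barriers(t, gamma, q):
    """Return (m_inf, M_inf), the d -> infinity limits of (m_d, M_d)."""
    if abs(t) == 1.0:
        raise ValueError("the limits are undefined for |t| = 1")
    a = (abs(1.0 - gamma) / abs(1.0 - t)) ** (1.0 / q)
    b = (abs(1.0 + gamma) / abs(1.0 + t)) ** (1.0 / q)
    return min(a, b), max(a, b)
```

With $\eta = \mu = 0$ the two functions agree for every $d$, as in Lemma 2, while for $\eta \neq 0$ the lower barrier $m_d$ exceeds 1 once $d$ is small, which is the mechanism behind Lemma 3.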


3.3.2. Proof of Theorem 3 (near-CMC results). In this section we show that in the near-CMC regime ($|t - \gamma_N| > 2$) the following hold:
1. $\limsup_{d\to\infty} F(d) < 1$.
2. $F$ is differentiable and $F'(d) < 0$ if $F(d) = 1$ (and $\mu = 0$).
3. $F(d) < 1$ for all $d$ if $\mu = \eta = 0$.

The existence of a solution of $F(d) = 1$ if $\mu \neq 0$ or $\eta \neq 0$ follows from Fact 1 and Lemma 4. The uniqueness of solutions of $F(d) = 1$ if $\mu = 0$ follows from Fact 2. And the non-existence of solutions of $F(d) = 1$ if $\mu = \eta = 0$ follows from Fact 3. The upper bounds of Facts 1 and 3 follow from the constant supersolutions of Lemma 1. In effect, $F(d) < 1$ because $\psi_d < 1$ everywhere.

Lemma 5. Suppose $|t - \gamma_N| > 2$. Then

\[ M_\infty < 1. \tag{51} \]

Proof. Note that since $|\gamma_N| < 1$, if $|t - \gamma_N| > 2$, then $|t| > 1$ and in particular $|t| \neq 1$. Suppose first that $t > 1$. Then

\[ M_\infty^q = \max\left\{ \left|\frac{1 - \gamma_N}{1 - t}\right|, \left|\frac{1 + \gamma_N}{1 + t}\right| \right\} = \max\left\{ \frac{1 - \gamma_N}{t - 1}, \frac{1 + \gamma_N}{t + 1} \right\}. \tag{52} \]

So $M_\infty < 1$ if $1 - \gamma_N < t - 1$ and $1 + \gamma_N < t + 1$. The first inequality holds since $2 < |t - \gamma_N| = t - \gamma_N$. The second holds since $\gamma_N < 1 < t$. The case where $t < -1$ is proved similarly.

Corollary 1. Suppose $\eta \neq 0$ or $\mu \neq 0$. If

\[ |t - \gamma_N| > 2, \tag{53} \]

then there exists a solution of $F(d) = 1$.

Proof. From Lemma 2,

\[ \limsup_{d\to\infty} F(d) \leq M_\infty. \tag{54} \]

From Lemma 5, $M_\infty < 1$. Existence of a solution now follows from Lemma 4.

If $\eta = 0$ and $\mu = 0$ (i.e. for vanishing transverse-traceless tensors) we have a corresponding non-existence result which generalizes the "no-go" theorem of [14] to this family of data. Recall that $F_0$ corresponds to $F$ with $\eta = \mu = 0$.

Corollary 2. If

\[ |t - \gamma_N| > 2, \tag{55} \]

then $F_0(d) < 1$ for all $d > 0$. In particular, there are no solutions of $F_0(d) = 1$.

Proof. If $\eta = 0$ and $\mu = 0$ then $M_d = M_\infty$ for all $d > 0$. By Lemma 5, $M_\infty < 1$. Hence $F_0(d) \leq M_\infty < 1$ for all $d > 0$.

To show solutions of $F(d) = 1$ are unique we show that $F$ is decreasing at any solution of $F(d) = 1$. We start by showing that $F$ is differentiable.


Lemma 6. The function $F$ is differentiable. Moreover,

\[ F'(d) = h(0), \tag{56} \]

where $h \in W^{2,p}(S^1)$ solves

\[ -2\kappa q\,h'' + d^{q-2}\,V h = -R, \tag{57} \]

and where

\[ V = (q + 1)\bigl[2\eta^2 d^{-2q} + \kappa(\mu d^{-q} + \gamma_N + \lambda)^2\bigr]\psi_d^{-q-2} + (q - 1)\kappa(t + \lambda)^2\psi_d^{q-2} \tag{58} \]

and

\[ \begin{aligned} R = {} & (q + 2)\,2\eta^2 d^{-q-3}\psi_d^{-q-1} + 2q\kappa\mu d^{-3}(\mu d^{-q} + \gamma_N + \lambda)\psi_d^{-q-1} \\ & + (q - 2)\,d^{q-3}\bigl[\kappa(t + \lambda)^2\psi_d^{q-1} - \kappa(\mu d^{-q} + \gamma_N + \lambda)^2\psi_d^{-q-1}\bigr]. \end{aligned} \tag{59} \]

Proof. Consider the function $M : \mathbb{R}_{>0} \times W^{2,p}_+(S^1) \to L^p(S^1)$ defined by

\[ M(d, \psi) = -2\kappa q\,\psi'' - 2\eta^2 d^{-q-2}\psi^{-q-1} - \kappa(\mu d^{-q} + \gamma_N + \lambda)^2 d^{q-2}\psi^{-q-1} + \kappa(t + \lambda)^2 d^{q-2}\psi^{q-1}. \tag{60} \]

Using the fact that $2q/n = q - 2$ it follows that $M(d, \psi_d) = 0$ for all $d > 0$. It is tedious but routine to show that $M$ is Fréchet differentiable and

\[ M'[d, \psi](\delta, h) = -2\kappa q\,h'' + d^{q-2}\,V h + R\,\delta, \tag{61} \]

where $V$ and $R$ are given by (58) and (59) with $\psi$ in place of $\psi_d$. From the continuous embedding $W^{2,p}(S^1) \hookrightarrow C(S^1)$ it follows that the operators $V$ and $R$ are continuous as functions of $\psi$ and $d$; see, for example, Lemma 11 below, which can be used to show that they are locally Lipschitz. So the map $(d, \psi) \mapsto M'[d, \psi]$ is continuous. The operator from $W^{2,p}(S^1)$ to $L^p(S^1)$,

\[ h \mapsto -2\kappa q\,h'' + d^{q-2}\,V h, \tag{62} \]

has a continuous inverse since $V \in L^\infty$, $V \geq 0$ and $V \not\equiv 0$ (see, e.g. [5] Theorem 7.7). The Implicit Function Theorem ([1] Cor. 4.2) then implies that given a solution of $M(d_0, \psi_0) = 0$ there is a unique function $G$ defined near $d_0$ such that $M(d, G(d)) = 0$, and $G$ is continuously differentiable. But $M(d, \psi_d) = 0$ for all $d$, so by the uniqueness of $G$ we have $G(d) = \psi_d$. Let $h = G'(d)$. Then by the chain rule,

\[ 0 = \frac{\partial}{\partial d} M(d, G(d)) = -2\kappa q\,h'' + d^{q-2}\,V h + R. \tag{63} \]

Now $F(d) = \psi_d(0)$. Since the evaluation map $\psi \mapsto \psi(0)$ is linear and continuous on $W^{2,p}(S^1)$, it follows that $F$ is continuously differentiable and $F'(d) = G'(d)(0)$. That is, $F'(d) = h(0)$ where $h$ solves (63).

Proposition 6. Suppose $|t - \gamma_N| > 2$. If $\mu = 0$ there exists at most one solution of $F(d) = 1$.


Proof. Suppose $F(d) = 1$. We will show that $F'(d) < 0$, and hence there can be at most one solution. Consider the functions of a real variable $z$,

\[ g_\pm(z) = -(\gamma_N \pm 1)^2 z^{-q-1} + (t \pm 1)^2 z^{q-1} \tag{64} \]

and

\[ f_\pm(z) = -2\eta^2 d^{-2q} z^{-q-1} + \kappa\,g_\pm(z). \tag{65} \]

Note that $g_\pm$ and $f_\pm$ are increasing in $z$ for $z > 0$ and $f_+(M_+) = f_-(M_-) = 0$ (where $M_\pm = M_{d,\pm}$ is defined in Lemma 1). Let $I_- = (-\pi, 0)$ and $I_+ = (0, \pi)$. Then

\[ -2\kappa q\,d^{-\frac{2q}{n}}\psi_d'' + f_\pm(\psi_d) = 0 \tag{66} \]

on $I_\pm$. Since the coefficients of the differential equation (66) are constant on $I_\pm$, the function $\psi_d$ is smooth on these intervals.

Suppose without loss of generality that $M_- \geq M_+$. By Lemma 1, $M_+ \leq \psi_d \leq M_-$ on $S^1$. Since $\kappa\,g_+(M_+) \geq f_+(M_+) = 0$, we have $g_+(\psi_d) \geq 0$ on $I_+$. To show that $g_-(\psi_d) \geq 0$ on $I_-$ we use the near-CMC assumption. Since $\psi_d \leq M_-$ and $f_-(M_-) = 0$, it follows from Eq. (66) that $\psi_d'' \leq 0$ on $I_-$. Since $\psi_d(-\pi) = \psi_d(0) = 1$, it follows that $\psi_d \geq 1$ on $I_-$. Since $g_-$ is increasing, we conclude that

\[ g_-(\psi_d) \geq g_-(1) = -(\gamma_N - 1)^2 + (t - 1)^2 = (t - 1)^2\left[1 - \frac{(\gamma_N - 1)^2}{(t - 1)^2}\right]. \tag{67} \]

Now

\[ \frac{(\gamma_N - 1)^2}{(t - 1)^2} \leq M_\infty^{2q} < 1 \tag{68} \]

by the definition of $M_\infty$ and Lemma 5. Hence $g_-(\psi_d) > 0$ on $I_-$.

By Lemma 6, $F'(d) = h(0)$ where

\[ -2\kappa q\,h'' + d^{q-2}\,V h = -R, \tag{69} \]

and where $V$ and $R$ are defined in Eqs. (58) and (59). Since $\mu = 0$,

\[ R = (q + 2)\,2\eta^2 d^{-2q} d^{q-3}\psi_d^{-q-1} + (q - 2)\,d^{q-3}\kappa\,g_\pm(\psi_d) \tag{70} \]

on $I_\pm$. Since $g_\pm(\psi_d) \geq 0$ and $g_-(\psi_d) > 0$ we conclude that

\[ R \geq 0, \qquad R \not\equiv 0. \tag{71} \]

Since $V \geq 0$ and $V \not\equiv 0$, the strong maximum principle ([8] Th. 9.6) then implies that $h < 0$ on $S^1$. In particular, $F'(d) = h(0) < 0$.

Corollaries 1 and 2, together with Propositions 6 and 5, imply Theorem 3: in the near-CMC regime $|t - \gamma_N| > 2$ there exists a solution of (23) if and only if the TT-tensor is not identically zero. If $\mu = 0$ the solution is unique. Although we have not determined uniqueness if $\mu \neq 0$, we note that Proposition 6 is the first uniqueness result for the conformal method that does not make use of a bound for $|\nabla\tau|$.


3.3.3. Proof of Theorem 4 (exceptional case: $t = \gamma_N$). The value $t = \gamma_N$ is special. We have a partial result that is parallel to the exceptional Case 4 of Theorem 1.

Lemma 7. Suppose $t = \gamma_N$. If $\mu = \eta = 0$, then $F(d) = 1$ for all $d > 0$, and hence there is a one-parameter family of solutions. On the other hand, if $\mu = 0$ but $\eta \neq 0$, then there are no solutions.

Proof. If $t = \gamma_N$ and $\mu = \eta = 0$, then the unique solution of (34) is clearly $\psi_d = 1$ (for any $d$). Hence $\phi$ solves (32) if and only if $\phi$ is a positive constant. On the other hand, suppose that $\mu = 0$ and $\eta \neq 0$. Then the constant 1 is evidently a subsolution of (34), as is $1 + \epsilon$ for $\epsilon > 0$ sufficiently small. Hence $F(d) > 1$ for all $d > 0$.

Theorem 4 follows from Lemma 7 and Proposition 5.

3.3.4. Proof of Theorem 5 (small-TT results). In this section we wish to show that solutions of $F(d) = 1$ exist for small, nonzero, transverse-traceless tensors (i.e. if $\mu$ and $\eta$ are small but not both zero). From Lemma 7 we know that if $t = \gamma_N$, then there are no solutions of $F(d) = 1$ when $\mu = 0$ and $\eta \neq 0$. So we cannot expect to find small-TT solutions if $t = \gamma_N$. We show here that if $\gamma_N = 0$, then this is the only obstacle. We also obtain a partial result for $\gamma_N \neq 0$, showing small-TT solutions exist if $|t| > |\gamma_N|$.

Recall that $F_0(d) = \psi_{0,d}(0)$, where $\psi_{0,d}$ is defined analogously to $\psi_d$, but using $\mu = \eta = 0$. We will establish the following facts:
1. If $|\gamma_N| < |t|$, then $\lim_{d\to 0^+} F_0(d) < 1$.
2. For any fixed $d > 0$, $F(d)$ approaches $F_0(d)$ as $\mu$ and $\eta$ approach zero.

So if $|\gamma_N| < |t|$, and if $\mu$ and $\eta$ are sufficiently small, there is a $d$ such that $F(d) < 1$. If in addition $\mu \neq 0$ or $\eta \neq 0$, then Lemma 4 implies that there is at least one solution of $F(d) = 1$.

If $\mu \neq 0$ or $\eta \neq 0$, Lemma 3 shows that $F(d) \to \infty$ as $d \to 0$, but this singularity is not present if $\eta = \mu = 0$. In this case, the term $-d^{-\frac{2q}{n}}\psi_{0,d}''$ dominates Eq. (34) as $d \to 0$ and we expect the solutions to be nearly constant. The following lemma computes the value of this constant, which is less than one if $|\gamma_N| < |t|$.

Lemma 8. Let

\[ \Gamma_{N,t} = \left[\frac{1 + \gamma_N^2}{1 + t^2}\right]^{\frac{1}{2q}}. \tag{72} \]

If $|t| \neq 1$, then

\[ \psi_{0,d} \xrightarrow[d \to 0]{} \Gamma_{N,t} \tag{73} \]

in $W^{2,p}(S^1)$, and hence uniformly on $S^1$. In particular

\[ \lim_{d\to 0^+} F_0(d) = \Gamma_{N,t}. \tag{74} \]

Proof. Recall that $\psi_{0,d}$ is the solution of

\[ -2d^{-\frac{2q}{n}}\kappa q\,\psi_{0,d}'' - \kappa(\gamma_N + \lambda)^2\psi_{0,d}^{-q-1} + \kappa(t + \lambda)^2\psi_{0,d}^{q-1} = 0. \tag{75} \]

From Lemma 2 (since $\mu = \eta = 0$), $0 < m_\infty \leq \psi_{0,d} \leq M_\infty$ for all $d$, and consequently there exists a positive constant $C$ such that

\[ \left| -\frac{1}{2q}(\gamma_N + \lambda)^2\psi_{0,d}^{-q-1} + \frac{1}{2q}(t + \lambda)^2\psi_{0,d}^{q-1} \right| \leq C \tag{76} \]

for all $d > 0$. Since $\psi_{0,d}$ satisfies (75) and $2q/n = q - 2$, it follows that

\[ \|\psi_{0,d}''\|_{L^p} \leq (2\pi)^{1/p}\,C\,d^{q-2} \tag{77} \]

for all $d > 0$. The Poincaré inequality implies that there is a constant $c_p$ such that if $u \in W^{2,p}(S^1)$,

\[ \|u\|_{W^{2,p}(S^1)} \leq c_p\left[\|u''\|_{L^p} + \left|\int_{S^1} u\right|\right]. \tag{78} \]

Let $A_d = \frac{1}{2\pi}\int_{S^1}\psi_{0,d}$, and let $\epsilon_d = \psi_{0,d} - A_d$. Then

\[ \|\epsilon_d\|_{W^{2,p}(S^1)} \leq c_p\left[\|\epsilon_d''\|_{L^p} + \left|\int_{S^1}\epsilon_d\right|\right] = c_p\|\psi_{0,d}''\|_{L^p(S^1)} \leq c_p(2\pi)^{1/p}\,C\,d^{2q/n} \to 0 \tag{79} \]

as $d \to 0$. We will now show that $A_d \to \Gamma_{N,t}$ as $d \to 0$. Since $\psi_{0,d} = A_d + \epsilon_d$, it then follows that $\psi_{0,d} \to \Gamma_{N,t}$ in $W^{2,p}(S^1)$ as $d \to 0$.

Let $(d_k)$ be any positive sequence converging to zero. Since

\[ m_\infty \leq A_{d_k} \leq M_\infty, \tag{80} \]

some subsequence $\{A_{d_{k_l}}\}$ converges to a constant $A \in [m_\infty, M_\infty]$. Then $\psi_{0,d_{k_l}} = A_{d_{k_l}} + \epsilon_{d_{k_l}} \to A$ in $W^{2,p}(S^1)$. Moreover, $\psi_{0,d_{k_l}} \to A$ uniformly. For all $d$, $\int_{S^1}\psi_{0,d}'' = 0$, and hence

\[ \int_{S^1}\left[\kappa(\gamma_N + \lambda)^2\psi_{0,d}^{-q-1} - \kappa(t + \lambda)^2\psi_{0,d}^{q-1}\right] = 0. \tag{81} \]

Using the uniform convergence of $\psi_{0,d_{k_l}}$ to $A$ and the fact that $0 < m_\infty \leq \psi_{0,d} \leq M_\infty$ for all $d$, we conclude that

\[ A^{-q-1}\int_{S^1}\kappa(\gamma_N + \lambda)^2 - A^{q-1}\int_{S^1}\kappa(t + \lambda)^2 = 0. \tag{82} \]

Now

\[ \int_{S^1}(\gamma_N + \lambda)^2 = \pi\left[(\gamma_N + 1)^2 + (\gamma_N - 1)^2\right] = 2\pi\left[1 + \gamma_N^2\right]. \tag{83} \]

Similarly,

\[ \int_{S^1}(t + \lambda)^2 = 2\pi\left[1 + t^2\right]. \tag{84} \]

Hence

\[ A = \left[\frac{1 + \gamma_N^2}{1 + t^2}\right]^{\frac{1}{2q}} = \Gamma_{N,t}. \tag{85} \]

The uniqueness of the limit $A$ now implies that $A_d \to \Gamma_{N,t}$ as $d \to 0$.
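The limiting constant of Lemma 8 is elementary to evaluate, which makes the hypothesis $|t| > |\gamma_N|$ of the small-TT result easy to probe. The Python sketch below is illustrative only: it reads Eq. (72) as $[(1 + \gamma_N^2)/(1 + t^2)]^{1/(2q)}$ and Eq. (83) as $\int_{S^1}(c + \lambda)^2 = \pi[(c+1)^2 + (c-1)^2]$; the function names and the sample value $q = 6$ used in the check are assumptions made here.

```python
import math


def small_d_limit(t, gamma, q):
    """Limit of F_0(d) as d -> 0+, read from Eq. (72)."""
    return ((1.0 + gamma ** 2) / (1.0 + t ** 2)) ** (1.0 / (2.0 * q))


def circle_integral_of_square(c):
    """Integral of (c + lam)^2 over S^1, with lam = -1 or +1 on each half."""
    return math.pi * ((c + 1.0) ** 2 + (c - 1.0) ** 2)
```

The limit is below 1 exactly when $|t| > |\gamma_N|$, independently of $q$, since the exponent $1/(2q)$ is positive.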


Proposition 7. Suppose $|t| > |\gamma_N|$ and $|t| \neq 1$. Then there exists at least one solution of $F(d) = 1$ if
1. $\eta \neq 0$ or $\mu \neq 0$, and
2. $|\eta|$ and $|\mu|$ are sufficiently small.

Proof. Since $|t| > |\gamma_N|$ it follows that the constant $\Gamma_{N,t}$ of Eq. (72) in Lemma 8 is less than 1. In particular, $F_0(d) < 1$ for $d$ sufficiently small. Fix a particular value of $d$ such that this holds. By Proposition 4, Part 4 it follows that $F(d) \to F_0(d)$ as $(\eta, \mu) \to (0, 0)$. In particular, $F(d) < 1$ if $\mu$ and $\eta$ are sufficiently small. The existence result now follows from Lemma 4.

3.3.5. Proof of Theorem 6 (non-vanishing mean curvature). From the definition of the mean curvatures $\tau_t$, we see that $\tau_t$ has constant sign if $|t| > 1$, but changes sign if $|t| < 1$. In this section we wish to show that there are solutions of $F(d) = 1$ so long as $\tau_t$ has constant sign.

Recall that our near-CMC existence result Corollary 1 was obtained by showing that $\psi_d(x) < 1$ for all $x \in S^1$ if $d$ is sufficiently large and $|t - \gamma_N| > 2$. We only need to show, however, that $F(d) = \psi_d(0) < 1$ if $d$ is sufficiently large. Section 4 contains an asymptotic analysis that allows us to compute the exact value of $\lim_{d\to\infty} F(d)$ (as well as the speed of the convergence). Assuming the results of Sect. 4 for now, we show in this section that:
1. $\lim_{d\to\infty} F_0(d) < 1$ if $|t| > 1$.
2. $\lim_{d\to\infty} F_0(d) = 1$ if $|t| < 1$.
3. $\lim_{d\to\infty} F(d) = \lim_{d\to\infty} F_0(d)$ if $|t| \neq 1$.

In particular, if $|t| > 1$, then $F(d) < 1$ for some $d > 0$. If $\mu \neq 0$ or $\eta \neq 0$, Lemma 4 then implies that there is a solution of $F(d) = 1$.

Definition 2. We say that $f(x) \to L$ rapidly at infinity if

\[ \lim_{x\to\infty} |f(x) - L|\,x^n = 0 \tag{86} \]

for all $n \in \mathbb{N}$. We say that $f(x) \to L$ rapidly at 0 if

\[ \lim_{x\to 0} |f(x) - L|\,x^{-n} = 0 \tag{87} \]

for all $n \in \mathbb{N}$.

Recall that $F_0(d) = \psi_{0,d}(0)$, where $\psi_{0,d}$ is defined analogously to $\psi_d$, but with $\eta = \mu = 0$.

Proposition 8. Assume that $|t| \neq 1$. Then

\[ \lim_{d\to\infty} \psi_{0,d}(0) = \begin{cases} 1, & |t| < 1, \\ |t|^{-\frac{1}{q}}, & |t| > 1, \end{cases} \tag{88} \]

and this convergence is rapid.
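The two cases of Eq. (88) come from a single ratio of absolute values, and the reduction can be checked mechanically. The Python sketch below is a hedged illustration: it assumes the large-$d$ limit has the form $[(|\gamma_N + 1| + |\gamma_N - 1|)/(|t + 1| + |t - 1|)]^{1/q}$ (the common factor in numerator and denominator cancels), and the function names and the sample value $q = 6$ in the check are assumptions made here.

```python
def large_d_limit(t, gamma, q):
    """Large-d limit of psi_{0,d}(0) written as a ratio of absolute values."""
    if abs(t) == 1.0:
        raise ValueError("the case |t| = 1 is excluded")
    if not -1.0 < gamma < 1.0:
        raise ValueError("gamma_N must lie strictly between -1 and 1")
    ratio = (abs(gamma + 1.0) + abs(gamma - 1.0)) / (abs(t + 1.0) + abs(t - 1.0))
    return ratio ** (1.0 / q)


def piecewise_limit(t, q):
    """The same limit in the two-case form of Eq. (88)."""
    if abs(t) == 1.0:
        raise ValueError("the case |t| = 1 is excluded")
    return 1.0 if abs(t) < 1.0 else abs(t) ** (-1.0 / q)
```

Note that the result does not depend on $\gamma_N$: for $|\gamma_N| < 1$ the numerator always equals 2.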


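Proof. Assuming the results of Sect. 4, it follows from Theorem 9 applied to Eq. (75) (taking $\epsilon = \sqrt{2\kappa q}\,d^{-\frac{q}{n}}$) that

\[ \lim_{d\to\infty} \psi_{0,d}(0) = \left[\frac{\sqrt{\kappa}\,|\gamma_N + 1| + \sqrt{\kappa}\,|\gamma_N - 1|}{\sqrt{\kappa}\,|t + 1| + \sqrt{\kappa}\,|t - 1|}\right]^{\frac{1}{q}} \tag{89} \]

and this convergence is rapid. Note that since $|\gamma_N| < 1$, $|1 + \gamma_N| + |1 - \gamma_N| = 2$. If $|t| < 1$, then $|1 + t| + |1 - t| = 2$, otherwise $|1 + t| + |1 - t| = 2|t|$. The result now follows.

We would like to establish a corresponding limit without the hypothesis $\eta = \mu = 0$. For large values of $d$ the contribution of the terms involving $\eta$ and $\mu$ in Eq. (34) is small. So we expect that $\psi_{0,d}$ should be a good approximation for $\psi_d$, and we expect to obtain the same limit. To make this idea precise, we will show that small perturbations of $\psi_{0,d}$ are sub- and supersolutions of the equation for $\psi_d$.

Recall from Lemma 2 that $0 < m_\infty \leq \psi_{0,d} \leq M_\infty$ for all $d > 0$. We define $G_d : [-m_\infty/2, M_\infty] \to L^\infty$ by

\[ G_d(K) = N_d(\psi_{0,d} + K), \tag{90} \]

where $N_d : W^{2,p}_+(S^1) \to L^p(S^1)$ is the nonlinear Lichnerowicz operator

\[ N_d(w) = -2\kappa q\,d^{-\frac{2q}{n}} w'' - 2\eta^2 d^{-2q} w^{-q-1} - \kappa[\mu d^{-q} + \lambda + \gamma_N]^2 w^{-q-1} + \kappa(t + \lambda)^2 w^{q-1}. \tag{91} \]

So $\psi_{0,d} + K$ is a sub- or supersolution of (34) if and only if $G_d(K) \leq 0$ or $G_d(K) \geq 0$ almost everywhere. Using the fact that $\psi_{0,d}$ solves Eq. (75) we can write

\[ G_d(K) = D(K) + E(K), \tag{92} \]

where

\[ \begin{aligned} D(K) &= (t + \lambda)^2\left[(\psi_{0,d} + K)^{q-1} - \psi_{0,d}^{q-1}\right], \\ E(K) &= (\gamma_N + \lambda)^2\left[\psi_{0,d}^{-q-1} - (\psi_{0,d} + K)^{-q-1}\right] \\ &\quad - \left[2\eta^2 d^{-2q} + (\mu d^{-q} + \gamma_N + \lambda)^2 - (\gamma_N + \lambda)^2\right](\psi_{0,d} + K)^{-q-1}. \end{aligned} \tag{93} \]

Lemma 9. There exist positive constants $D_-$, $D_+$, $E_-$ and $E_+$ such that

\[ \begin{aligned} D_- K \leq D(K) \leq D_+ K, &\qquad K \geq 0, \\ D_+ K \leq D(K) \leq D_- K, &\qquad K \leq 0, \end{aligned} \tag{94} \]

and

\[ \begin{aligned} E_- K \leq (\gamma_N + \lambda)^2\left[\psi_{0,d}^{-q-1} - (\psi_{0,d} + K)^{-q-1}\right] \leq E_+ K, &\qquad K \geq 0, \\ E_+ K \leq (\gamma_N + \lambda)^2\left[\psi_{0,d}^{-q-1} - (\psi_{0,d} + K)^{-q-1}\right] \leq E_- K, &\qquad K \leq 0 \end{aligned} \tag{95} \]

for all $d > 1$ and all $K \in [-m_\infty/2, M_\infty]$.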


Proof. First consider the expression f A (h) = A−q−1 −(A +h)−q−1 for A ∈ [m ∞ , M∞ ] and h ∈ [−m ∞ /2, M∞ ]. Then 1 (q + 1)(A + th)−q−2 dt h. (96) f A (h) = 0

If h ≥ 0 then (q + 1)(2M∞ )−q−2 h ≤ f A (h) ≤ (q + 1)(m ∞ /2)−q−2 h.

(97)

(q + 1)(m ∞ /2)−q−2 h ≤ f A (h) ≤ (q + 1)(2M∞ )−q−2 h.

(98)

If h ≤ 0 then Inequalities (95) now follow letting E + = max[(γ N −1)2 , (γ N +1)2 ](q +1)(m ∞ /2)−q−2 and E − = min[(γ N − 1)2 , (γ N + 1)2 ](q + 1)(2M∞ )−q−2 . The argument for inequality (94) is similar. Proposition 9. There exists a constant c > 0 such that ||ψ0,d − ψd ||∞ < cd −q

(99)

for all d sufficiently large. In particular, lim F(0) = lim ψd (0) = lim ψ0,d (0).

d→∞

d→∞

d→∞

(100)

Proof. For each d sufficiently large, we will find constants K − (d) and K + (d) that are O(d −q ) and that satisfy Gd (K − (d)) < 0 and Gd (K + (d)) > 0. Assuming this for the moment, we see that ψ0,d + K − (d) and ψ0,d + K + (d) are sub- and supersolutions of (34) and hence ψ0,d + K − (d) ≤ ψd ≤ ψ0,d + K + (d) for d sufficiently large. The asymptotics of K ± (d) then imply inequality (99). Notice that D(K ) has the same sign as K . So Gd (K ) > 0 if K > 0 and E(K ) > 0. Now if 0 < K ≤ M∞ then Lemma 9 implies −q−1

E(K ) = (γ N + λ)2 [ψ0,d − (ψ0,d + K )−q−1 ]

2 − 2η2 d −2q + (μd −q + λ + γ N )2 ) (ψ0,d + K )−q−1

≥ E − K − (2η2 + μ2 )d −2q + 4|μ|d −q (m ∞ /2)−q−1 . Let

2 (2η + μ2 )d −q + 4|μ| (m ∞ /2)−q−1 −q K + (d) = d . E−

(101)

(102)

Then 0 < K + (d) ≤ M∞ if d is sufficiently large, and we have E(K + (d)) ≥ 0 and Gd (K + (d)) ≥ 0 also. On the other hand, if −m ∞ /2 ≤ K < 0, then Lemma 9 implies E(K ) ≤ E − K − (2η2 + μ2 )d −2q (2M∞ )−q−1 + 4|μ|d −q (m ∞ /2)−q−1 .

(103)

Let K − (d) = −

4|μ|(m ∞ /2)−q−1 −q d , E−

(104)

720

D. Maxwell

so −m ∞ /2 ≤ K − (d) < 0 if d is sufficiently large. We then have E(K − (d)) ≤ 0 and Gd (K − (d)) ≤ 0 also. Since K − (d) and K + (d) are both O(d −q ), we have proved the desired result. We now summarize the argument that, along with Proposition 5, proves Theorem 6. Proposition 10. Suppose |t| > 1. If η = 0 or μ = 0, there exists at least one solution of F(d) = 1. Proof. By Propositions 8 and 9, if |t| > 1 then 1 1 q lim F(d) = < 1. d→∞ |t|

(105)

So F(d) < 1 for d sufficiently large. Since η = 0 or μ = 0, Lemma 4 now implies there exists a solution of F(d) = 1. 3.3.6. Proof of Theorem 7 (nonexistence/non-uniqueness). In this section we restrict our attention to the case μ = 0, so that η alone controls the size of the transverse-traceless tensor. We show that if |t| < 1, (i.e. when τt changes sign), then there is a critical threshold η0 ≥ 0 for the size of η. If η > η0 , then there are no solutions of F(d) = 1, whereas if η < η0 there are at least two. In some cases we can show that η0 > 0 and hence there are multiple solutions for small values of η. The choice of η plays a critical role in this section, so we use the notation F[η] to distinguish different functions F corresponding to different values of η. Since F[η] (d) only depends on η2 , we can assume that η ≥ 0. We will show the following facts (assuming μ = 0 and |t| < 1): 1. limd→∞ F[η] (d) = 1, and this limit is approached from above. 2. For any fixed d > 0, F[η] (d) is strictly increasing in η. 3. On any finite interval (0, d0 ] we can find η sufficiently large so that F[η] (d) > 1 on (0, d0 ]. The idea of the proof proceeds as follows. Picking an arbitrary η > 0, Fact 1 implies F[η] (d) > 1 for d larger than some d0 . Using Fact 3 we then increase η to ensure that F[η] (d) > 1 on (0, d0 ]. Fact 2 ensures that after having increased η, we still have the condition F[η] (d) > 1 for d > d0 . So F[η] > 1 for all d > 0 and there are no solutions of F[η] (d) = 1. The existence of a critical value of η follows from Fact 2: if no solutions exist for some η, then F[η] (d) > 1 for all d and raising the value of η maintains this inequality. On the other hand, since F[η] (d) > 1 for d large (by Fact 1) and for d near zero (since F[η] is singular there), if F[η] (d) < 1 for some d, then there will be at least two solutions. Proposition 11. For fixed d, the value of F[η] (d) is strictly increasing in η. Moreover,

\[ F_{[\eta]}(d) \ge \left(\frac{2\eta^2}{\kappa(1+|t|)^2}\right)^{\frac{1}{2q}} d^{-1}. \tag{106} \]

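Proof. Fix $d > 0$ and suppose $0 \le \eta_1 \le \eta_2$. Let $\psi_{d,1}$ and $\psi_{d,2}$ be the corresponding solutions of (34). Then substituting $\psi_{d,1}$ into the equation for $\psi_{d,2}$ we have
\[
\begin{aligned}
-2\kappa q\, d^{-\frac{2q}{n}} \psi_{d,1}'' - 2\eta_2^2 d^{-2q} \psi_{d,1}^{-q-1} - \kappa\left[\mu d^{-q} + \lambda + \gamma_N\right]^2 \psi_{d,1}^{-q-1} + \kappa (t+\lambda)^2 \psi_{d,1}^{q-1} \\
= 2(\eta_1^2 - \eta_2^2)\, d^{-2q} \psi_{d,1}^{-q-1} < 0.
\end{aligned}
\tag{107}
\]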

A Model Problem for Conformal Parameterizations of the Einstein Constraint Equations


So $\psi_{d,1}$ is a subsolution of the equation for $\psi_{d,2}$ and $\psi_{d,1} \le \psi_{d,2}$. A similar computation shows that $\psi_{d,1} + \epsilon$ is also a subsolution for $\epsilon > 0$ sufficiently small and hence $\psi_{d,1} < \psi_{d,2}$ everywhere. In particular, $F_{[\eta_1]}(d) < F_{[\eta_2]}(d)$.

To obtain the estimate (106) we note that a constant $k$ is a subsolution of (34) if
\[ -2\eta^2 d^{-2q} k^{-q-1} + \kappa (t+\lambda)^2 k^{q-1} \le 0. \tag{108} \]
This holds in particular if
\[ -2\eta^2 d^{-2q} k^{-q-1} + \kappa (1+|t|)^2 k^{q-1} \le 0, \tag{109} \]
and therefore if
\[ k^{2q} = \frac{2\eta^2 d^{-2q}}{\kappa(1+|t|)^2}. \tag{110} \]

Since $F_{[\eta]}(d) \ge k$ if $k$ is a subsolution of (34), we have established inequality (106).

Proposition 12. Suppose $\mu = 0$ and $\eta \neq 0$. Then there exists a constant $c > 0$ such that
\[ \psi_d \ge \psi_{0,d} + c\, d^{-2q} \tag{111} \]
for all $d$ sufficiently large.

Proof. We use the function $G_d : [-m_\infty/2, M_\infty] \to L^\infty$ defined in Sect. 3.3.5. Recall that $\psi_{0,d} + K$ is a subsolution of the equation for $\psi_d$ if $G_d(K) \le 0$ almost everywhere. Recall also that $G_d$ can be written
\[ G_d(K) = D(K) + E(K), \tag{112} \]
where $D$ and $E$ are defined in Eq. (93). If $0 < K \le M_\infty$, then by Lemma 9,
\[ D(K) \le D_+ K \tag{113} \]
for a certain constant $D_+ > 0$. Also,
\[
\begin{aligned}
E(K) &= (\gamma_N + \lambda)^2 \left[\psi_{0,d}^{-q-1} - (\psi_{0,d} + K)^{-q-1}\right] - 2\eta^2 d^{-2q} (\psi_{0,d} + K)^{-q-1} \\
&\le E_+ K - 2\eta^2 (2M_\infty)^{-q-1} d^{-2q}
\end{aligned}
\tag{114}
\]
for a certain constant $E_+ > 0$. Let
\[ K_- = \frac{2\eta^2 (2M_\infty)^{-q-1}}{D_+ + E_+}\, d^{-2q}. \tag{115} \]
If $d$ is sufficiently large, then $0 < K_- \le M_\infty$ and we then have
\[
\begin{aligned}
G_d(K_-) &= D(K_-) + E(K_-) \\
&\le D_+ K_- + E_+ K_- - 2\eta^2 (2M_\infty)^{-q-1} d^{-2q} = 0.
\end{aligned}
\tag{116}
\]
So $\psi_{0,d} + K_-$ is a subsolution, and we have obtained inequality (111) with $c = 2\eta^2 (2M_\infty)^{-q-1}/(D_+ + E_+)$.


D. Maxwell

The following proposition formalizes the arguments made at the start of this section and, along with Proposition 5, completes the proof of Theorem 7.

Proposition 13. Suppose $|t| < 1$ and $\mu = 0$. There exists $\eta_0 \ge 0$ such that if $0 < |\eta| < \eta_0$, there exist at least two solutions of $F(d) = 1$, while if $|\eta| > \eta_0$, there are no solutions. If $|t| > \gamma_N$ then $\eta_0 > 0$.

Proof. We first show that $\psi_d(0) > 1$ for $d$ sufficiently large. From Proposition 9 we know that $\lim_{d\to\infty} \psi_{0,d}(0) = 1$ and that this convergence is rapid. On the other hand, from Proposition 12 there is a positive constant $c$ such that $\psi_d(0) > \psi_{0,d}(0) + c\,d^{-2q}$. Hence
\[ \psi_d(0) - 1 \ge (\psi_{0,d}(0) - 1) + c\,d^{-2q} = \left[(\psi_{0,d}(0) - 1)\,d^{2q} + c\right] d^{-2q}. \tag{117} \]
From the rapid convergence we have $(\psi_{0,d}(0) - 1)\,d^{2q} \to 0$ as $d \to \infty$ and hence $\psi_d(0) > 1$ for $d$ large enough.

To show that there are no solutions for $\eta$ sufficiently large, fix a given $\eta_1$ and pick $d_0$ so that if $d > d_0$ then $F_{[\eta_1]}(d) > 1$. From inequality (106) we can find $\eta_2$ so that $F_{[\eta_2]}(d) > 1$ for all $d \in (0, d_0]$. Letting $\eta = \max(\eta_1, \eta_2)$, it follows from Proposition 11 that $F_{[\eta]}(d) > 1$ for all $d > 0$.

Let $A = \{\eta \ge 0 : F_{[\eta]}(d) > 1 \text{ for all } d > 0\}$; we have just shown that $A$ is nonempty. Suppose $\eta \in A$ and $\eta' \ge \eta$. Proposition 11 implies that for any $d > 0$, $F_{[\eta']}(d) \ge F_{[\eta]}(d) > 1$, and hence $\eta' \in A$. Let $\eta_0 = \inf A$. If $\eta > \eta_0$ then $\eta \in A$ and there are no solutions of $F_{[\eta]}(d) = 1$.

Suppose $0 < \eta < \eta_0$, and pick $\eta'$ so that $\eta < \eta' < \eta_0$. Then $\eta' \notin A$ and for some $d_0$, $F_{[\eta']}(d_0) \le 1$. By Proposition 11, $F_{[\eta]}(d_0) < F_{[\eta']}(d_0) \le 1$. From Lemma 3 we know that $F_{[\eta]}(d) > 1$ for $d$ sufficiently small, and we have already shown that $F_{[\eta]}(d) > 1$ for $d$ sufficiently large. From the continuity of $F$ it follows that there are at least two solutions of $F_{[\eta]}(d) = 1$, one for $d < d_0$ and one for $d > d_0$.

Proposition 7 implies that $\eta_0 > 0$ if $|t| > |\gamma_N|$; if $\eta_0 = 0$ then there can only be solutions of (34) if $\eta = 0$.

We have now proved all the results of Sect. 3.1, up to the asymptotic analysis cited in the proof of Proposition 8.

3.4. Sensitivity with respect to a coupling coefficient. The results of the previous sections depend in a sensitive way on coupling constants in Eq. (20). Consider the following variation of the Einstein constraint equations:
\[
\begin{aligned}
R_h - |K|_h^2 + (\operatorname{tr}_h K)^2 &= 0, \\
\operatorname{div}_h K - (1+\epsilon)\, d \operatorname{tr}_h K &= 0.
\end{aligned}
\tag{118}
\]

The case $\epsilon = 0$ corresponds to the standard constraint equations. Repeating the analysis above for these perturbed constraint equations, the analogue of Eq. (34) is
\[
-2\kappa q\, d^{-\frac{2q}{n}} \psi_d'' - 2\eta^2 d^{-2q} \psi_d^{-q-1} - \kappa\left[(\gamma_N + \lambda)(1+\epsilon) + d^{-q}\mu\right]^2 \psi_d^{-q-1} + \kappa (t+\lambda)^2 \psi_d^{q-1} = 0.
\tag{119}
\]
One readily shows that estimate (50) of Lemma 3 holds for this equation, as does Lemma 4, so long as $\epsilon > -1$. Hence there exists a solution of the constraints for this data if and only if $F(d) \le 1$ for some $d > 0$.


Recall that for the standard conformal method (i.e. when $\epsilon = 0$), $\lim_{d\to\infty} F(d) = 1$ if $|t| < 1$. Since we are seeking solutions of $F(d) = 1$, it is as if there is a solution of $F(d) = 1$ at $d = \infty$. Adjusting $\epsilon$ affects the value of this limit. We will show that when $\epsilon < 0$, $\lim_{d\to\infty} F(d) < 1$, and the solution at $d = \infty$ becomes a true solution. On the other hand, for $\epsilon > 0$, $\lim_{d\to\infty} F(d) > 1$, and this allows for there to be no solutions at all of $F(d) = 1$ for sufficiently small transverse-traceless tensors.

We first show that when $\epsilon < 0$, we have existence under rather general conditions, and lose the non-existence results of Theorems 4 and 7.

Proposition 14. Suppose $-1 < \epsilon < 0$ and $t \neq 1$. If either $\mu \neq 0$ or $\eta \neq 0$ then there exists at least one solution of Eq. (119).

Proof. Following the arguments leading to Proposition 8 we see that
\[
\lim_{d\to\infty} \psi_{0,d}(0) =
\begin{cases}
|1+\epsilon|^{\frac{1}{q}} & |t| < 1 \\
|1+\epsilon|^{\frac{1}{q}}\, |t|^{-\frac{1}{q}} & |t| > 1.
\end{cases}
\tag{120}
\]
Since $|1+\epsilon| < 1$, we see that for any choice of $t \neq 1$, $\psi_{0,d}(0) < 1$ for $d$ sufficiently large. The arguments of Sect. 3.3.6 can then be repeated to show that $\lim_{d\to\infty} \psi_d(0) = \lim_{d\to\infty} \psi_{0,d}(0)$, and hence $\psi_d(0) < 1$ for $d$ sufficiently large. Hence there exists at least one solution.

Raising the value of the coupling coefficient, i.e. when $\epsilon > 0$, we lose the small-TT result 5.

Proposition 15. Suppose $\epsilon > 0$. If $t$ is sufficiently close to $\gamma_N$, and if $\mu = 0$, then there does not exist a solution of (119).

Proof. We will show that $\phi = 1 + \delta$ is a subsolution of (119) for any $d > 0$ if $\delta > 0$ is sufficiently small and $t$ is sufficiently close to $\gamma_N$. Having shown this we conclude that $F(d) \ge 1 + \delta$ for all $d > 0$, and hence there are no solutions. Note that $\phi = 1 + \delta$ is a subsolution (for $\mu = 0$) if
\[ -2\eta^2 d^{-2q} (1+\delta)^{-q-1} - (1+\epsilon)^2 (\gamma_N + \lambda)^2 (1+\delta)^{-q-1} + (t+\lambda)^2 (1+\delta)^{q-1} \le 0. \tag{121} \]
First, consider the case $\delta = 0$. We then wish to show that
\[ -2\eta^2 d^{-2q} - (1+\epsilon)^2 (\gamma_N + \lambda)^2 + (t+\lambda)^2 \le 0. \tag{122} \]
Since $\epsilon > 0$,
\[ -(1+\epsilon)^2 (\gamma_N + \lambda)^2 + (t+\lambda)^2 \tag{123} \]
is strictly negative if $t = \gamma_N$. Hence the left-hand side of (121) is negative if $\delta = 0$, and it is easy to see that it remains negative if $\delta > 0$ is sufficiently small. For any such $\delta$, we observe that this condition also holds for $t$ sufficiently close to $\gamma_N$.


4. A Singularly Perturbed Lichnerowicz Equation

The most interesting results of Sect. 3 concerning non-existence/non-uniqueness depend on the asymptotic analysis of this section. We consider the singularly perturbed Lichnerowicz equation
\[ -\epsilon^2 u_\epsilon'' - \alpha^2 u_\epsilon^{-q-1} + \beta^2 u_\epsilon^{q-1} = 0 \tag{124} \]
on $S^1$, which we take to be $[-\pi, \pi]$ with endpoints identified. We assume that the functions $\alpha$ and $\beta$ are constant on the intervals $I_- = (-\pi, 0)$ and $I_+ = (0, \pi)$, taking on the values $\alpha_\pm$ and $\beta_\pm$. Proposition 4 implies that there exists a (unique) solution $u_\epsilon \in W_+^{2,\infty}(S^1)$ of (124) so long as one of $\alpha_\pm \neq 0$ and one of $\beta_\pm \neq 0$. By uniqueness of the solution we note that it is even about $x = \pi/2$.

As $\epsilon \to 0$, Eq. (124) becomes an algebraic equation for $u_\epsilon$ and we expect that, away from the points of discontinuity of $\alpha$ and $\beta$, $u_\epsilon$ converges to the algebraic solution $u_0 = |\alpha_\pm/\beta_\pm|^{1/q}$ on $I_\pm$; see Fig. 4. We are concerned with the behaviour of $u_\epsilon$ at the point of discontinuity, i.e. $\lim_{\epsilon\to 0^+} u_\epsilon(0)$. The principal result of this section is the following.

Theorem 9. Suppose that $\beta_- \neq 0$ and $\beta_+ \neq 0$. Then
\[ \lim_{\epsilon\to 0} u_\epsilon(0) = \left(\frac{|\alpha_+| + |\alpha_-|}{|\beta_+| + |\beta_-|}\right)^{\frac{1}{q}}, \tag{125} \]
and this convergence is rapid (as defined in Definition 2).

To obtain the limit at zero, we use a blow-up argument, guessing an asymptotic form of the solution. We start with a boundary value problem on $[0, \infty)$.

Proposition 16. Let $u_0 > 0$. There exists a solution on $[0, \infty)$ of
\[ -U'' = U^{-q-1} - U^{q-1} \tag{126} \]
satisfying $U(0) = u_0$ and $\lim_{x\to\infty} U(x) = 1$ (with $U$ converging rapidly to its limit at $\infty$). Moreover, $U$ satisfies the first order equation
\[ U' = \sqrt{\frac{2}{q}} \left[U^{-q/2} - U^{q/2}\right] \tag{127} \]
and $U'(x) \to 0$ rapidly as $x \to \infty$.

Fig. 4. Functions $u_\epsilon$ and their limit as $\epsilon \to 0$


Proof. We construct a solution by means of the method of reduction of order. Suppose $0 < u_0 < 1$ and define
\[ X(u) = \sqrt{\frac{q}{2}} \int_{u_0}^{u} \frac{v^{q/2}}{1 - v^q}\, dv \tag{128} \]
on $[u_0, 1)$. Then $X(u_0) = 0$, $X(u) \to \infty$ as $u \to 1$, and
\[ X'(u) = \sqrt{\frac{q}{2}}\, \frac{u^{q/2}}{1 - u^q} > 0. \tag{129} \]
Hence $X$ has an increasing inverse function $U : [0, \infty) \to [u_0, 1)$ satisfying $U(0) = u_0$ and $\lim_{x\to\infty} U(x) = 1$. Moreover,
\[ U'(x) = \frac{1}{X'(U(x))} = \sqrt{\frac{2}{q}} \left[U(x)^{-q/2} - U(x)^{q/2}\right]. \tag{130} \]
An easy computation involving the chain rule and Eq. (127) now shows that $U$ satisfies the ODE (126) and hence $U$ is the function we seek. If $u_0 > 1$ one shows similarly that the inverse function of
\[ X(u) = \sqrt{\frac{q}{2}} \int_{u}^{u_0} \frac{v^{q/2}}{v^q - 1}\, dv \tag{131} \]
defined on $(1, u_0]$ is the desired function. When $u_0 = 1$, then $U(x) \equiv 1$ is the solution.

To show the rapid convergence at infinity we focus on the case $0 < u_0 < 1$. Let $W = 1 - U$, so that $W > 0$ and $\lim_{x\to\infty} W(x) = 0$. Now
\[ W' = \sqrt{\frac{2}{q}} \left[(1-W)^{q/2} - (1-W)^{-q/2}\right] = H(W)\, W, \tag{132} \]
where $H$ is a continuous function near 0 and
\[ H(0) = \left.\frac{d}{dW}\right|_{W=0} \sqrt{\frac{2}{q}} \left[(1-W)^{q/2} - (1-W)^{-q/2}\right] = -\sqrt{2q}. \tag{133} \]
Since $W(x) \to 0$ as $x \to \infty$, there exists $x_0$ so that if $x \ge x_0$,
\[ H(W(x)) < -\sqrt{q}. \tag{134} \]
Hence
\[ W' \le -\sqrt{q}\, W \tag{135} \]
for $x \ge x_0$ and by Gronwall's inequality
\[ W(x) \le W(x_0) \exp\left(-\sqrt{q}\,(x - x_0)\right). \tag{136} \]
Since $W \ge 0$ also, we conclude that $W$ converges rapidly to 0 and $U$ converges rapidly to 1. The rapid convergence when $u_0 > 1$ is proved similarly, while the result is trivial if $u_0 = 1$. Finally, we note that the rapid convergence of $U'$ to 0 at infinity follows from the rapid convergence of $U$ to 1 at infinity and Eq. (127).
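The chain-rule computation mentioned in the proof can be checked numerically. The following is only an illustrative sketch: it assumes the first order equation has the form $U' = f(U)$ with $f(U) = \sqrt{2/q}\,(U^{-q/2} - U^{q/2})$, so that $U'' = f'(U) f(U)$, and compares this with the right-hand side $U^{q-1} - U^{-q-1}$ required by (126). The function names, sample points and the values of $q$ are hypothetical choices made for the illustration.

```python
import math

def f(U, q):
    """Right-hand side of the first order equation (127): U' = f(U)."""
    return math.sqrt(2.0 / q) * (U ** (-q / 2) - U ** (q / 2))

def second_derivative_from_first_order(U, q, h=1e-6):
    """U'' = f'(U) * f(U) by the chain rule; f' via a central difference."""
    df = (f(U + h, q) - f(U - h, q)) / (2 * h)
    return df * f(U, q)

def rhs_of_126(U, q):
    """Eq. (126) reads -U'' = U^(-q-1) - U^(q-1), i.e. U'' = U^(q-1) - U^(-q-1)."""
    return U ** (q - 1) - U ** (-q - 1)

def max_mismatch(q, samples=(0.3, 0.7, 1.0, 1.5, 2.0)):
    """Largest absolute difference between the two expressions for U''."""
    return max(
        abs(second_derivative_from_first_order(U, q) - rhs_of_126(U, q))
        for U in samples
    )
```

The mismatch is at the level of finite-difference error for sample values on both sides of $U = 1$, covering the cases $u_0 < 1$ and $u_0 > 1$ of the proof.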


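We now turn to a boundary value problem on $\mathbb{R}$ with piecewise constant coefficients. Consider
\[ -v'' - \alpha^2 v^{-q-1} + \beta^2 v^{q-1} = 0 \tag{137} \]
on $\mathbb{R}$, where $\alpha$ and $\beta$ are equal to the constants $\alpha_\pm$ and $\beta_\pm$ on the intervals $(0, \infty)$ and $(-\infty, 0)$.

Proposition 17. Suppose $\beta_\pm \neq 0$. Let $L_\pm = |\alpha_\pm/\beta_\pm|^{1/q}$. There exists a solution $v \in W^{2,\infty}_{\mathrm{loc}}(\mathbb{R})$ of (137) satisfying
\[ \lim_{x\to\pm\infty} v(x) = L_\pm. \tag{138} \]
Moreover, $v$ converges rapidly to its limits at $\pm\infty$, $v'$ converges rapidly to 0 at $\pm\infty$, and
\[ v(0) = \left(\frac{|\alpha_+| + |\alpha_-|}{|\beta_+| + |\beta_-|}\right)^{\frac{1}{q}}. \tag{139} \]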

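Proof. Let $\omega_\pm = \left(|\alpha_\pm|^{q-2}\, |\beta_\pm|^{q+2}\right)^{\frac{1}{2q}}$. Given any $c > 0$ we define
\[
v_c(x) =
\begin{cases}
L_+ U_+(\omega_+ x) & x > 0 \\
L_- U_-(-\omega_- x) & x < 0,
\end{cases}
\tag{140}
\]
where $U_\pm$ is the solution of (126) provided by Proposition 16 satisfying $U_\pm(0) = c/L_\pm$ and $\lim_{x\to\infty} U_\pm(x) = 1$. Then $v_c$ is continuous, satisfies the differential equation (137) on $(0, \infty)$ and $(-\infty, 0)$, and has the correct limiting behaviour at $\pm\infty$. If for some $c$, $v_c$ is differentiable at 0, then $v_c$ will be a weak solution on $\mathbb{R}$ and by elliptic regularity the desired strong solution. From Proposition 16 we have
\[ v_c'(0+) = L_+ \omega_+ U_+'(0) = L_+ \omega_+ \sqrt{\frac{2}{q}} \left[\left(\frac{c}{L_+}\right)^{-q/2} - \left(\frac{c}{L_+}\right)^{q/2}\right], \tag{141} \]
and similarly
\[ v_c'(0-) = -L_- \omega_- \sqrt{\frac{2}{q}} \left[\left(\frac{c}{L_-}\right)^{-q/2} - \left(\frac{c}{L_-}\right)^{q/2}\right]. \tag{142} \]
Setting these quantities equal we obtain
\[ \left[L_+ \omega_+ L_+^{-q/2} + L_- \omega_- L_-^{-q/2}\right] c^q = L_+ \omega_+ L_+^{q/2} + L_- \omega_- L_-^{q/2}. \tag{143} \]
From the definitions of $L_\pm$ and $\omega_\pm$ we have the identities
\[ L_\pm^2 \omega_\pm^2 = \alpha_\pm^2 L_\pm^{-q} = \beta_\pm^2 L_\pm^{q}, \tag{144} \]
and hence
\[ c^q = \frac{|\alpha_+| + |\alpha_-|}{|\beta_+| + |\beta_-|}. \tag{145} \]
With this choice of $c$ we obtain a solution of (137) satisfying Eq. (139).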
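The matching argument in this proof can be illustrated with a short numerical sketch. It is not part of the original argument: it assumes the one-sided slopes have the form given in (141) and (142), fixes $\omega_\pm$ through the identity (144), and checks that the value of $c$ from (145) makes the slope of $v_c$ continuous at 0. The function names and the sample coefficients are hypothetical, and $\alpha_\pm \neq 0$ is assumed so that $L_\pm > 0$.

```python
import math

def matching_constant(alpha_p, alpha_m, beta_p, beta_m, q):
    """c from (145): c^q = (|alpha+| + |alpha-|) / (|beta+| + |beta-|)."""
    ratio = (abs(alpha_p) + abs(alpha_m)) / (abs(beta_p) + abs(beta_m))
    return ratio ** (1.0 / q)

def one_sided_slope(c, alpha, beta, q, side):
    """One-sided derivative of v_c at 0 from (141)-(142); side is +1 or -1."""
    L = abs(alpha / beta) ** (1.0 / q)
    # omega is fixed through identity (144): L^2 omega^2 = alpha^2 L^(-q)
    omega = abs(alpha) * L ** (-q / 2) / L
    s = c / L
    return side * L * omega * math.sqrt(2.0 / q) * (s ** (-q / 2) - s ** (q / 2))

def slope_jump(alpha_p, alpha_m, beta_p, beta_m, q):
    """v_c'(0+) - v_c'(0-) for the value of c given by (145)."""
    c = matching_constant(alpha_p, alpha_m, beta_p, beta_m, q)
    return (one_sided_slope(c, alpha_p, beta_p, q, +1)
            - one_sided_slope(c, alpha_m, beta_m, q, -1))
```

For nonzero sample coefficients the jump vanishes up to rounding, which is the differentiability condition used to select $c$.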


Using the function found in Proposition 17 we can construct approximate solutions of the differential equation (124). Our strategy for proving Theorem 9 will be to show that these approximate solutions improve as $\epsilon \to 0$ and can be corrected using Newton's method to obtain solutions satisfying the limit (125).

We form the approximate solutions first on $[-\pi/2, \pi/2]$, defining
\[ w_\epsilon(x) = v(x/\epsilon) + h_\epsilon(x), \tag{146} \]
where $h_\epsilon$ will be a small correction term. We will pick $h_\epsilon$ so that $w_\epsilon'(\pm\pi/2) = 0$, and hence we can extend $w_\epsilon$ to be defined on $S^1$ by declaring it to be even about $x = \pi/2$. To define the correction term, we first let
\[
\zeta(x) =
\begin{cases}
\frac{1}{\pi} x^2 & 0 \le x \le \pi/2 \\
0 & -\pi/2 < x \le 0,
\end{cases}
\tag{147}
\]
and note that $\zeta'(\pi/2) = 1$. Let
\[ h_\epsilon(x) = -d_{\epsilon,+}\, \zeta(x) - d_{\epsilon,-}\, \zeta(-x), \tag{148} \]
where
\[ d_{\epsilon,\pm} = \pm\frac{1}{\epsilon}\, v'(\pm\pi/(2\epsilon)). \tag{149} \]
With this choice of $h_\epsilon$, $w_\epsilon'(\pm\pi/2) = 0$. For $p > 1$ we define the nonlinear Lichnerowicz operator $N_\epsilon : W^{2,p}(S^1) \to L^p(S^1)$ by
\[ N_\epsilon(w) = -\epsilon^2 w'' - \alpha^2 w^{-q-1} + \beta^2 w^{q-1}. \tag{150} \]
The error $E_\epsilon = N_\epsilon(w_\epsilon)$ is even about $x = \pi/2$ and one readily computes that on $[-\pi/2, \pi/2]$,
\[
\begin{aligned}
E_\epsilon = {} & -\alpha^2 \left[(v(x/\epsilon) + h_\epsilon(x))^{-q-1} - v(x/\epsilon)^{-q-1}\right] \\
& + \beta^2 \left[(v(x/\epsilon) + h_\epsilon(x))^{q-1} - v(x/\epsilon)^{q-1}\right] + \frac{2\epsilon^2}{\pi} \left[d_{\epsilon,+}\chi_+ + d_{\epsilon,-}\chi_-\right],
\end{aligned}
\tag{151}
\]
where $\chi_\pm$ are the characteristic functions of $(0, \pi)$ and $(-\pi, 0)$ respectively.

Lemma 10.
\[ \|E_\epsilon\|_{L^\infty(S^1)} \to 0 \tag{152} \]
rapidly as $\epsilon \to 0$.

Proof. From Proposition 17 we know that $v'(x) \to 0$ rapidly as $x \to \pm\infty$. Consequently the constants $d_{\epsilon,\pm}$ converge rapidly to zero as $\epsilon \to 0$. Moreover, $d_{\epsilon,+}\chi_+$ and $d_{\epsilon,-}\chi_-$ converge rapidly to 0 in $L^\infty(S^1)$.

Consider $F(v) = v^{-q-1}$. Then $F(v+h) - F(v) = (-q-1) \int_0^1 (v + th)^{-q-2}\, dt\; h$ and therefore
\[ |F(v+h) - F(v)| \le (q+1) \max_{t\in[0,1]} (v + th)^{-q-2}\, |h|. \tag{153} \]


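Now $v_\epsilon(x) = v(x/\epsilon) \ge \min(L_+, L_-) > 0$ and $h_\epsilon$ converges rapidly to 0 in $L^\infty([-\pi/2, \pi/2])$. So there is an $m$ such that
\[ v_\epsilon + t h_\epsilon \ge m > 0 \tag{154} \]
for all $t \in [0, 1]$ and all $\epsilon$ sufficiently small. It follows that
\[ \|(v_\epsilon + h_\epsilon)^{-q-1} - v_\epsilon^{-q-1}\|_{L^\infty([-\pi/2,\pi/2])} \le (q+1)\, m^{-q-2}\, \|h_\epsilon\|_{L^\infty([-\pi/2,\pi/2])} \tag{155} \]
for $\epsilon$ sufficiently small. From the rapid convergence of $h_\epsilon$ to zero we conclude that
\[ \alpha^2 \left[(v_\epsilon + h_\epsilon)^{-q-1} - v_\epsilon^{-q-1}\right] \to 0 \tag{156} \]
rapidly in $L^\infty([-\pi/2, \pi/2])$ as $\epsilon \to 0$. A similar argument establishes
\[ \beta^2 \left[(v_\epsilon + h_\epsilon)^{q-1} - v_\epsilon^{q-1}\right] \to 0 \tag{157} \]
rapidly as $\epsilon \to 0$. We have considered all terms of $E_\epsilon$ and conclude that
\[ \|E_\epsilon\|_{L^\infty([-\pi/2,\pi/2])} \to 0 \tag{158} \]
rapidly as $\epsilon \to 0$. Since $E_\epsilon$ is even about $x = \pi/2$, we have the same convergence in $L^\infty(S^1)$.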

For constants $0 < m < M$ and $p > 1$ we define the slab $S^p_{m,M} = \{u \in W^{2,p}(S^1) : m \le u \le M\}$.

Lemma 11. For $u \in W_+^{2,p}(S^1)$, let $F_r(u) = u^r$. There exists a constant $K(m, M, r)$ such that
\[ \|F_r(u) - F_r(v)\|_{L^p(S^1)} \le K(m, M, r)\, \|u - v\|_{L^p(S^1)} \tag{159} \]
for all $u, v \in S^p_{m,M}$. Let $L_{u,r} : W^{2,p} \to L^p$ be the linear function
\[ L_{u,r}\, v = F_r(u)\, v. \tag{160} \]
The map $u \mapsto L_{u,r}$ is Lipschitz continuous on $S^p_{m,M}$.

Proof. Note that if $m \le x, y \le M$ then
\[ x^r - y^r = \int_0^1 r\left((1-t)x + ty\right)^{r-1} dt\; (x - y), \tag{161} \]
and hence
\[ |x^r - y^r| \le r\,(m^{r-1} + M^{r-1})\, |x - y|. \tag{162} \]
Consequently
\[ \|F_r(u) - F_r(v)\|_{L^p(S^1)} \le r\,(m^{r-1} + M^{r-1})\, \|u - v\|_{L^p(S^1)}. \tag{163} \]
Inequality (159) now follows setting $K = r\,(m^{r-1} + M^{r-1})$.
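As a hedged illustration of the pointwise bound behind (159), the sketch below evaluates the ratio $|x^r - y^r| / (K |x - y|)$ on a grid in $[m, M]$. It is an assumption of this sketch, not a statement from the text, that the constant is taken with an absolute value, $K = |r|(m^{r-1} + M^{r-1})$, so that negative exponents such as $r = -q-2$ are covered; the function names and sample values are hypothetical.

```python
def lipschitz_constant(m, M, r):
    """K = |r| (m^(r-1) + M^(r-1)); the absolute value covers negative r."""
    return abs(r) * (m ** (r - 1) + M ** (r - 1))

def worst_ratio(m, M, r, n=50):
    """Largest observed |x^r - y^r| / (K |x - y|) over a grid in [m, M]."""
    K = lipschitz_constant(m, M, r)
    pts = [m + (M - m) * i / n for i in range(n + 1)]
    worst = 0.0
    for x in pts:
        for y in pts:
            if x != y:
                worst = max(worst, abs(x ** r - y ** r) / (K * abs(x - y)))
    return worst
```

A ratio not exceeding 1 on the grid is consistent with the Lipschitz estimate; the check is only a finite sample and does not replace the mean value argument.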


If $u_1, u_2 \in S^p_{m,M}$ and $v \in W^{2,p}$, then
\[
\begin{aligned}
\|L_{u_1,r} v - L_{u_2,r} v\|_{L^p(S^1)} &= \|(F_r(u_1) - F_r(u_2))\, v\|_{L^p(S^1)} \\
&\le \|F_r(u_1) - F_r(u_2)\|_{L^p(S^1)}\, \|v\|_{L^\infty(S^1)} \\
&\le K(m, M, r)\, \|u_1 - u_2\|_{L^p(S^1)}\, \|v\|_{W^{2,p}(S^1)} \\
&\le K(m, M, r)\, \|u_1 - u_2\|_{W^{2,p}(S^1)}\, \|v\|_{W^{2,p}(S^1)}.
\end{aligned}
\tag{164}
\]
Hence $\|L_{u_1,r} - L_{u_2,r}\| \le K(m, M, r)\, \|u_1 - u_2\|_{W^{2,p}(S^1)}$, which establishes the Lipschitz continuity.

One readily shows that the linearization of $N_\epsilon$ at $w$ is the operator $N_\epsilon'[w]$ defined by
\[ N_\epsilon'[w]\, h = -\epsilon^2 h'' + \left[(q+1)\alpha^2 w^{-q-2} + (q-1)\beta^2 w^{q-2}\right] h. \tag{165} \]
As an immediate consequence of Lemma 11 we see that $N_\epsilon'$ is Lipschitz continuous.

Corollary 3. Suppose $0 < m < M$. There exists a constant $C(m, M)$ such that for all $v, w \in S^p_{m,M}$,
\[ \|N_\epsilon'[v] - N_\epsilon'[w]\|_{L(W^{2,p}(S^1),\, L^p(S^1))} < C(m, M)\, \|v - w\|_{W^{2,p}(S^1)}. \tag{166} \]

Our application of Newton's method requires an estimate of the size of $(N_\epsilon')^{-1}$ as $\epsilon \to 0$, which we obtain next.

Proposition 18. Let $V \in L^\infty(S^1)$ and consider the operator
\[ L_\epsilon = -\epsilon^2 \Delta + V \tag{167} \]
as a map from $W^{2,p}(S^1)$ to $L^p(S^1)$, where $p > 1$. Suppose there is a constant $m$ such that $V \ge m > 0$. Then $L_\epsilon$ is continuously invertible. Moreover, there is a constant $C$ such that if $\epsilon$ is sufficiently small,
\[ \|L_\epsilon^{-1}\| \le C \epsilon^{-4}. \tag{168} \]

Proof. The fact that $L_\epsilon$ is continuously invertible follows from standard elliptic theory and the positivity of $V$. We turn our attention to obtaining the estimate (168).

Let $S^1_r$ denote the circle of radius $r$, and let $i_r : S^1_r \to S^1$ be the natural diffeomorphism. For a function $u$ defined on $S^1$ let $u_r = u \circ i_r$. Suppose
\[ -\epsilon^2 u'' + V u = f \tag{169} \]
on $S^1$. Letting $r = 1/\epsilon$ we then have
\[ -u_r'' + V_r u_r = f_r. \tag{170} \]
Let $I$ be an interval of length 1 in $S^1_r$ and let $I'$ be the interval of length 1/2 at the center of $I$. From interior $L^p$ estimates ([8] Th. 9.11) we have
\[ \|u_r\|_{W^{2,p}(I')} \le C_1 \left[\|f_r\|_{L^p(I)} + \|u_r\|_{L^p(I)}\right], \tag{171} \]
where $C_1$ depends on $\|V\|_\infty$ but does not depend on $I$ or $r$. Averaging these interior estimates over all intervals $I$ in $S^1_r$ we obtain
\[ \|u_r\|_{W^{2,p}(S^1_r)} \le C_2 \left[\|f_r\|_{L^p(S^1_r)} + \|u_r\|_{L^p(S^1_r)}\right], \tag{172} \]


where $C_2$ (and all subsequent constants $C_k$) is independent of $r$ (and $\epsilon$). One readily verifies that for any function $w$ on $S^1$,
\[ \|\nabla^k w\|_{L^p(S^1)} = r^{k - \frac{1}{p}}\, \|\nabla^k w_r\|_{L^p(S^1_r)}. \tag{173} \]
Assuming that $r > 1$ (i.e. $\epsilon < 1$) it then follows that
\[ \|u\|_{W^{2,p}(S^1)} \le C_3\, r^{2 - \frac{1}{p}}\, \|u_r\|_{W^{2,p}(S^1_r)}, \tag{174} \]
and therefore
\[
\begin{aligned}
\|u\|_{W^{2,p}(S^1)} &\le r^{2 - \frac{1}{p}}\, C_2 C_3 \left[\|f_r\|_{L^p(S^1_r)} + \|u_r\|_{L^p(S^1_r)}\right] \\
&= \epsilon^{-2}\, C_2 C_3 \left[\|f\|_{L^p(S^1)} + \|u\|_{L^p(S^1)}\right].
\end{aligned}
\tag{175}
\]
By Sobolev embedding in $S^1$ we have for some constant $C_4$,
\[ \|u\|_{L^p(S^1)} \le C_4\, \|u\|_{W^{1,2}(S^1)}. \tag{176} \]
Suppose $\epsilon < \sqrt{m}$. Then
\[
\begin{aligned}
\|u\|^2_{W^{1,2}(S^1)} &= \int_{S^1} |\nabla u|^2 + u^2 \\
&\le \max(\epsilon^{-2}, m^{-1}) \int_{S^1} \epsilon^2 |\nabla u|^2 + V u^2 \\
&= \max(\epsilon^{-2}, m^{-1}) \int_{S^1} f u \\
&\le \epsilon^{-2}\, \|f\|_{L^p(S^1)}\, \|u\|_{L^{p'}(S^1)},
\end{aligned}
\tag{177}
\]
where $p'$ is the conjugate exponent to $p$. By Sobolev embedding again we have $\|u\|_{L^{p'}(S^1)} \le C_5\, \|u\|_{W^{1,2}(S^1)}$, and hence
\[ \|u\|_{W^{1,2}(S^1)} \le C_5\, \epsilon^{-2}\, \|f\|_{L^p(S^1)}. \tag{178} \]
Combining inequalities (175), (176), and (178) we obtain
\[ \|u\|_{W^{2,p}(S^1)} \le C_2 C_3 \left[\epsilon^{-2} + C_4 C_5\, \epsilon^{-4}\right] \|f\|_{L^p(S^1)}, \tag{179} \]
if $\epsilon < \min(1, \sqrt{m})$. Since $\epsilon < 1$, this establishes inequality (168) with $C = C_2 C_3 (1 + C_4 C_5)$.

We are now in a position to prove our main result of the section.

Proof (Theorem 9). The proof involves Newton's method, and we briefly recall the required hypotheses here ([1]). Let $X$ and $Y$ be Banach spaces, $x \in X$, $r > 0$. Let $N : B_r(x) \to Y$ be a differentiable map with Lipschitz continuous derivative, i.e. there exists $k > 0$ such that
\[ \|N'[x_1] - N'[x_2]\|_{L(X,Y)} \le k\, \|x_1 - x_2\|_X \tag{180} \]
for all $x_1, x_2 \in B_r(x)$. Suppose $x$ is a point where $N'[x]$ has a continuous inverse. Let $c_1 = \|N(x)\|$ and let $c_2 = \|N'[x]^{-1}\|$. If $2k c_1 c_2^2 < 1$ and $2 c_1 c_2 < r$, then there exists a solution of $N(u) = 0$ satisfying $\|u - x\|_X \le 2 c_1 c_2$.

We apply this method to the operators $N_\epsilon$. Let $m = \inf v$ and $M = \sup v$, where $v$ is the asymptotic solution found in Proposition 17. Taking $\epsilon$ sufficiently small we can ensure that $m/2 < w_\epsilon < 2M$. By the imbedding of $W^{2,p}(S^1)$ into $C^0(S^1)$ we can find an $r$ such that if $m/2 < w < 2M$ and $u \in B_r(w)$, then $m/3 < u < 3M$. Let $k$ be the Lipschitz constant for $N_\epsilon'$ on $S^p_{m/3,3M}$ obtained in Corollary 3. So for $\epsilon$ sufficiently small, $N_\epsilon'$ is Lipschitz continuous with constant $k$ on $B_r(w_\epsilon)$. Let $c_1(\epsilon) = \|N_\epsilon(w_\epsilon)\|_{L^p}$ and let $c_2(\epsilon) = \|N_\epsilon'[w_\epsilon]^{-1}\|$. By Lemma 10, $c_1(\epsilon)$ converges rapidly to zero, while by Proposition 18, $c_2(\epsilon)$ is $O(\epsilon^{-4})$. Hence $2k c_1 c_2^2$ and $2 c_1 c_2$ converge rapidly to zero, and for $\epsilon$ sufficiently small we obtain a solution of $N_\epsilon(u_\epsilon) = 0$ with $\|u_\epsilon - w_\epsilon\|_{W^{2,p}(S^1)} < 2 c_1 c_2$. By the continuous imbedding of $W^{2,p}(S^1)$ into $C^0(S^1)$ we have in particular that $u_\epsilon(0)$ converges rapidly to $w_\epsilon(0) = v(0)$ as $\epsilon \to 0$. Since $u_\epsilon$ is the unique solution of (124), we have proved the result.

5. Conclusion

By working with a concrete model problem, we have observed a number of new phenomena for the vacuum conformal and CTS methods. For certain conformal data violating both a small-TT and a near-CMC condition we have shown that there cannot be a unique solution: there will either be no solutions or more than one. For other small-TT data violating a near-CMC condition we have shown that there are multiple solutions. We have also found existence of certain solutions under a very weak near-CMC hypothesis ($\tau$ has constant sign), dependence of the solution theory on the lapse function or conformal class representative, and extreme sensitivity of the solution theory with respect to a coupling constant in the Einstein constraint equations.
This work was motivated by the following questions that arise from the Yamabe-positive small-TT existence theorems of [9] and [18]:
1. Is the small-TT hypothesis required to ensure existence for arbitrary mean curvatures?
2. Are small-TT solutions necessarily unique?
3. Can the Yamabe-positive restriction be relaxed?

Our examples were obtained using a Yamabe-null background metric, and therefore do not directly address questions 1) and 2). The answers to these questions in the Yamabe-null case, however, are that the small-TT hypothesis is necessary (at least for the existence of symmetric solutions for symmetric data), and that small-TT solutions need not be unique. Moreover, our coefficient sensitivity results also suggest that if it is possible to extend the existence results of [9] and [18] to Yamabe-null manifolds, the proof will be difficult.

These negative results suggest that the conformal and CTS methods do not lead to a good parameterization scheme for solutions of the Einstein constraint equations. Since the conformal method, in its CMC formulation, is so successful, one is led to wonder if there is some other generalization of it that does lead to a parameterization. This remains to be seen, and the model problem developed here could provide a useful test case for investigating possible alternatives.

Acknowledgements. I would like to thank Daniel Pollack and Jim Isenberg for useful comments and discussions.


A. The Lichnerowicz Equation

We give the proof here of Proposition 4 concerning solutions of the differential equation
\[ -u'' - \alpha^2 u^{-q-1} + \beta^2 u^{q-1} = 0 \tag{181} \]
on $S^1$.

Proposition 19. Suppose $\alpha$ and $\beta$ in Eq. (181) belong to $L^\infty(S^1)$ and that $\alpha \not\equiv 0$ and $\beta \not\equiv 0$. Let $p > 1$.
1. There exists a unique solution $u \in W_+^{2,p}(S^1)$, and $u \in W^{2,\infty}(S^1)$.
2. If $w \in W_+^{2,\infty}(S^1)$ is a subsolution of (36) (i.e. $-w'' - \alpha^2 w^{-q-1} + \beta^2 w^{q-1} \le 0$), then $w \le u$.
3. If $v \in W_+^{2,\infty}(S^1)$ is a supersolution of (36) (i.e. $-v'' - \alpha^2 v^{-q-1} + \beta^2 v^{q-1} \ge 0$), then $v \ge u$.
4. The solution $u \in W_+^{2,p}$ depends continuously on $(\alpha, \beta) \in L^\infty \times L^\infty$.

Proof. We consider the differential equation (181) to hold on $T^n = (S^1)^n$ rather than $S^1$ so as to be able to cite existing work (recall that $n$ is related to $q$ by $q = 2n/(n-2)$). That is, we consider
\[ -\Delta u - \alpha^2 u^{-q-1} + \beta^2 u^{q-1} = 0 \tag{182} \]
on $(T^n, g)$, where $\alpha$ and $\beta$ depend only on $x^n$. Since $\alpha^2 \not\equiv 0$ and $\beta^2 \not\equiv 0$, [5] Theorem 4.10 and Corollary 4.11 imply that there exists a positive solution in $W^{2,p}$ for $p > n/2$. Uniqueness of this solution follows from [5] Theorem 4.9. From uniqueness we know that $u$ is a function of $x^n$ alone (otherwise translation along $x^k$ with $1 \le k \le n-1$ would yield a different solution). But then Eq. (182) reduces to Eq. (181). This establishes Part 1 for $p > n/2$. On the other hand, if a solution exists for some $p > n/2$, it also belongs to $W_+^{2,p}$ for any $p \in (1, n/2]$. Moreover, if $u$ is a solution of (181) in $W_+^{2,p}$ for some $p \in (1, n/2]$, then an easy bootstrap shows that $u \in W_+^{2,p}$ for all $p > n/2$ and must therefore agree with the unique solution previously found. Thus Part 1 also holds for $1 < p \le n/2$.

Suppose $u_- \in W_+^{2,\infty}$ is a subsolution of (181). Then it is also a subsolution of (182). Let $u$ be the positive solution of (181). Arguing as in [18] Lemma 2 it follows that $Mu$ is a supersolution for any $M > 1$. Pick $M$ so that $u_- \le Mu$. Proposition 8.2 of [5] implies there is a solution $v$ of (182) such that $u_- \le v \le Mu$. By uniqueness of the solution it follows that $v = u$. Hence $u_- \le u$ and we have proved Part 2. Part 3 is proved similarly.

To show continuity, we use the Implicit Function Theorem. Consider the map $N : W_+^{2,p} \times (L^\infty \times L^\infty) \to L^p$ taking
\[ (u, \alpha, \beta) \mapsto -u'' - \alpha^2 u^{-q-1} + \beta^2 u^{q-1}. \tag{183} \]
This map is evidently continuous (since $W^{2,p}$ is an algebra). One readily shows that its Fréchet derivative at $(u, \alpha, \beta)$ with respect to $u$ in the direction $h$ is
\[ N'[u, \alpha, \beta]\, h = -h'' + \left[(q+1)\alpha^2 u^{-q-2} + (q-1)\beta^2 u^{q-2}\right] h. \tag{184} \]
The continuity of the map $(u, \alpha, \beta) \mapsto N'[u, \alpha, \beta]$ follows from the fact that $W^{2,p}(S^1)$ is an algebra continuously embedded in $C^0(S^1)$ along with Lemma 11. Since $\alpha \not\equiv 0$ and $\beta \not\equiv 0$ the potential $V = [(q+1)\alpha^2 u^{-q-2} + (q-1)\beta^2 u^{q-2}]$ is not identically zero. By [5]


Theorem 7.7, $-\Delta + V : W^{2,p} \to L^p$ is an isomorphism. The Implicit Function Theorem (see, e.g. [1] Theorem 4.1) then implies that if $u_0$ is a solution for data $(\alpha_0, \beta_0)$, there is a continuous map defined near $(\alpha_0, \beta_0)$ taking $(\alpha, \beta)$ to the corresponding solution of (181). This establishes Part 4.

We remark that the hypothesis $u_\pm \in W^{2,\infty}$ in Parts 2 and 3 can be weakened; we make it only for convenience so as to be able to apply Proposition 8.2 of [5] in a straightforward way. In our applications in Sect. 3, the sub- and supersolutions are either constants or the sum of a constant and an element of $W^{2,\infty}$.

B. Theory for Even Conformal Data

In this section we sketch how, despite the presence of a conformal Killing field, existing techniques for the conformal method can be adapted to the model problem (20) if the conformal data satisfy an evenness hypothesis. For simplicity, we assume all data in this section are smooth, and we focus on the standard conformal method (i.e. $N = 1/2$). The coupled system to solve is
\[
\begin{aligned}
-2\kappa q\, \phi'' - 2\eta^2 \phi^{-q-1} - \kappa(\mu + w')^2 \phi^{-q-1} + \kappa \tau^2 \phi^{q-1} &= 0, \\
w'' &= \phi^q \tau'.
\end{aligned}
\tag{185}
\]
From Theorem 1 and dimensional reduction we have the following result for the Lichnerowicz equation
\[ -\phi'' - \alpha^2 \phi^{-q-1} + \beta^2 \phi^{q-1} = 0 \tag{186} \]
on $S^1$.

Proposition 20. Suppose $\alpha$ and $\beta$ belong to $C^\infty(S^1)$. There exists a smooth positive solution $\phi$ of (186) if and only if
1. $\alpha \not\equiv 0$ and $\beta \not\equiv 0$, or
2. $\alpha \equiv 0$ and $\beta \equiv 0$.
The solution in Case 1 is unique. In Case 2 the solutions are the positive constants.

For the momentum constraint we consider
\[ w'' = f \tag{187} \]
on $S^1$. The following result is trivial to prove.

Proposition 21. Suppose $f \in C^\infty(S^1)$. There exists a solution $w \in C^\infty(S^1)$ of (187) if and only if
\[ \int_{S^1} f = 0. \tag{188} \]
Any two solutions of (187) differ by an additive constant.

Recall that we are working with functions on $S^1$ with domain of definition $[-\pi, \pi]$. We say that a function $f$ on $S^1$ is even or odd if $f(-x) = f(x)$ or $f(-x) = -f(x)$ for all $x \in [-\pi, \pi]$. Subscripts $e$ and $o$ denote subspaces of even and odd functions. Using the uniqueness results of Propositions 20 and 21 we have the following easy corollaries.


Corollary 4. Suppose $\alpha$ and $\beta$ are in $C_e^\infty(S^1)$. If Condition 1 or 2 of Proposition 20 holds, then the solution $\phi$ of (186) belongs to $C_e^\infty(S^1)$.

Corollary 5. Suppose $f \in C_o^\infty(S^1)$ and $N \in C_e^\infty(S^1)$. Then there exists a unique solution $w \in C_o^\infty(S^1)$ of (187) satisfying $w(0) = 0$. Any other solution of (187) can be written as $a + w$ where $a$ is constant.

Assume $\eta, \tau \in C_e^\infty(S^1)$ and $\mu$ is constant. We define a map $\mathcal{N} : C_e^\infty(S^1) \to C_e^\infty(S^1)$ as follows. Let $\phi \in C_e^\infty(S^1)$. Then $\phi^q \tau'$ is odd and hence there exists a unique function $w \in C_o^\infty(S^1)$ solving
\[ w'' = \phi^q \tau'. \tag{189} \]
Let $\alpha^2 = \frac{1}{2\kappa q}\left[2\eta^2 + \kappa\left(\mu + \frac{1}{2N} w'\right)^2\right]$ and $\beta^2 = \frac{1}{2q}\tau^2$, so $\alpha$ and $\beta$ belong to $C_e^\infty(S^1)$. Finally, define $\mathcal{N}(\phi)$ to be the solution of (186) for this choice of $\alpha$ and $\beta$. The existence of a smooth solution of (19) is equivalent to the existence of a fixed point of $\mathcal{N}$. By assuming that $\eta$ and $\tau$ are even, we have ensured that $\mathcal{N}$ is well defined and thus avoid the trouble with conformal Killing fields. The existence theory of [18] for the standard conformal method now proceeds without change and we have the following generalization of Theorem 2.

Theorem 10. Suppose $N \in C_e^\infty(S^1)$, $\eta, \tau \in C_e^\infty(S^1)$ and $\mu \in \mathbb{R}$. Suppose further that $\tau \not\equiv 0$ and that either $\eta \not\equiv 0$ or $\mu \neq 0$. If there exists a global upper barrier for $(\eta, \mu, \tau)$, then there exists a solution $(\phi, w) \in C_e^\infty \times C_o^\infty$ of (6).

Recall that a global upper barrier is defined as follows. Given a smooth even positive function $\phi$, let $w_\phi$ be an odd solution of
\[ w_\phi'' = \phi^q \tau'. \tag{190} \]
Then $w_\phi$ is uniquely defined. We say that a smooth positive even function $\Phi$ is a global upper barrier if for all smooth even functions $\phi$ satisfying $0 < \phi \le \Phi$,
\[ -2\kappa q\, \Phi'' - 2\eta^2 \Phi^{-q-1} - \kappa(\mu + w_\phi')^2 \Phi^{-q-1} + \kappa \tau^2 \Phi^{q-1} \ge 0. \tag{191} \]
Following [14] and [11] one readily shows that there is a constant global upper barrier if
\[ \frac{\max |\nabla\tau|}{\min |\tau|} \tag{192} \]
is sufficiently small.

To conclude this section, we show how we can use such near-CMC data to construct data that violate both the small-TT condition and the near-CMC condition (192) arbitrarily. To see this we 'double' the frequency of the mean curvature: if $f$ is a periodic function with period $2\pi$, let $f^{[k]}(x) = f(2^k x)$.

Proposition 22. Suppose $\tau$ satisfies the near-CMC condition (192), and $\eta$ and $\mu$ are constant. Then for any $k \in \mathbb{N}$ there exists a solution of (6) for conformal data $(\eta, \mu, \tau^{[k]})$ so long as one of $\eta$ or $\mu$ is non-zero.

Proof. Let $k \in \mathbb{N}$. Since $\tau$ is near-CMC, there exists a solution $(\phi, w)$ of (6) for conformal data $(2^{-nk}\eta, 2^{-nk}\mu, \tau)$. One verifies then that $(2^{\frac{nk}{q}} \phi^{[k]}, 2^{(n-1)k} w^{[k]})$ is a solution for conformal data $(\eta, \mu, \tau^{[k]})$.
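The verification left to the reader in this proof reduces to a few exponent identities, which the following sketch checks numerically. It is an illustration under stated assumptions rather than the author's computation: it assumes the coupled system has the scaling structure of (185), so that rescaling $x$ by $s = 2^k$, $\phi$ by $a = 2^{nk/q}$ and $w$ by $b = 2^{(n-1)k}$ requires $a^{q-2} = s^2$, $a^q = 2^{nk}$ and $b\,s = a^q$. The function name and sample values of $n$ and $k$ are hypothetical.

```python
def scaling_check(n, k):
    """Relative residuals of the exponent identities behind Proposition 22."""
    q = 2.0 * n / (n - 2)          # relation between q and n recalled in Appendix A
    s = 2.0 ** k                   # frequency factor: f^[k](x) = f(2^k x)
    a = 2.0 ** (n * k / q)         # factor multiplying phi^[k]
    b = 2.0 ** ((n - 1) * k)       # factor multiplying w^[k]
    return (
        abs(a ** (q - 2) - s ** 2) / s ** 2,            # phi'' term vs phi^(q-1) term
        abs(a ** q - 2.0 ** (n * k)) / 2.0 ** (n * k),  # eta and mu scale by 2^(nk)
        abs(b * s - a ** q) / a ** q,                   # w' scales like eta and mu
    )
```

All three residuals vanish up to rounding for $n > 2$, which is consistent with the factors $2^{nk/q}$, $2^{(n-1)k}$ and $2^{-nk}$ appearing in the proof.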


By taking $k$ sufficiently large, we can make the ratio $\max|\nabla\tau^{[k]}| / \min|\tau^{[k]}|$ as large as we please. For each of these mean curvatures, we can solve (6) for certain arbitrarily large TT-tensors.

This result seems to suggest that large relative gradients of $\tau$ are not, by themselves, a source of trouble. The kind of near-CMC violation described above introduces large gradients without affecting the deviation of $\tau$ from its mean. On the other hand, we can write a given mean curvature $\tau$ as
\[ \tau = t + \lambda, \tag{193} \]
where $t$ is constant and $\int_{S^1} \lambda = 0$. If $|t|$ is large relative to, say, $\frac{1}{2\pi}\int_{S^1} |\lambda|$, then the ratio (192) will be small (and $\tau$ will be near-CMC). This weaker notion of being near-CMC is similar to one used in [14]. It is not violated by the mean curvatures of Proposition 22, and extends to the rough mean curvatures considered in Sect. 3.

References
1. Akerkar, R.: Nonlinear functional analysis. New Delhi: Narosa Publishing House, 1999
2. Bourguignon, J.P., Ezin, J.P.: Scalar curvature functions in a conformal class of metrics and conformal transformations. Trans. Amer. Math. Soc. 2, 723–736 (1987)
3. Bartnik, R., Isenberg, J.: The constraint equations. In: Chrusciel, P.T., Friedrich, H. (eds.) The Einstein Equations and the Large Scale Behavior of Gravitational Fields: 50 Years of the Cauchy Problem in General Relativity. Basel: Birkhäuser, 2004
4. Baumgarte, T.W., Ó Murchadha, N., Pfeiffer, H.P.: Einstein constraints: uniqueness and nonuniqueness in the conformal thin sandwich approach. Phys. Rev. D 75(4), 044009 (2007)
5. Choquet-Bruhat, Y.: Einstein constraints on compact n-dimensional manifolds. Class. Quant. Grav. 21(3), S127–S151 (2004)
6. Choquet-Bruhat, Y., York, J.W. Jr.: The Cauchy problem. In: Held, A. (ed.) General Relativity and Gravitation. New York: Plenum, 1980
7. Chrusciel, P.T.: On space-times with U(1) × U(1) symmetric compact Cauchy surfaces. Ann. Phys. 202(1), 100–150 (1990)
8. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Berlin-Heidelberg-New York: Springer-Verlag, 1999
9. Holst, M., Nagy, G., Tsogtgerel, G.: Rough solutions of the Einstein constraint equations on closed manifolds without near-CMC conditions. Commun. Math. Phys. 288(2), 547–613 (2008)
10. Hebey, E., Pacard, F., Pollack, D.: A variational analysis of Einstein-scalar field Lichnerowicz equations on compact Riemannian manifolds. Commun. Math. Phys. 278, 117–132 (2008)
11. Isenberg, J., Clausen, A., Allen, P.T.: Near-constant mean curvature solutions of the Einstein constraint equations with non-negative Yamabe metrics. Class. Quant. Grav. 25(7), 075009 (2008)
12. Isenberg, J., Choquet-Bruhat, Y., Moncrief, V.: Solutions of constraints for Einstein's equations. C. R. Acad. Sci., Ser. I: Math. 315, 349–355 (1992)
13. Isenberg, J., Moncrief, V.: A set of nonconstant mean curvature solutions of the Einstein constraint equations on closed manifolds. Class. Quant. Grav. 13(7), 1819–1847 (1996)
14. Isenberg, J., Ó Murchadha, N.: Non-CMC conformal data sets which do not produce solutions of the Einstein constraint equations. Class. Quant. Grav. 21(3), S233–S241 (2004)
15. Isenberg, J.: The construction of spacetimes from initial data, Ph.D. dissertation, University of Maryland, 1979
16. Isenberg, J.: Constant mean curvature solutions of the Einstein constraint equations on closed manifolds. Class. Quant. Grav. 12(9), 2249–2274 (1995)
17. Maxwell, D.: Rough solutions of the constraint equations on compact manifolds. J. Hyp. Diff. Eqs. 2(2), 521–546 (2005)
18. Maxwell, D.: A class of solutions of the vacuum Einstein constraint equations with freely specified mean curvature. Math. Res. Lett. 16(4), 627–645 (2009)
19. Pfeiffer, H.P., York, J.W. Jr.: Uniqueness and nonuniqueness in the Einstein constraints. Phys. Rev. Lett. 95(9), 091101 (2005)
20. Walsh, D.M.: Non-uniqueness in conformal formulations of the Einstein constraints. Class. Quant. Grav. 24(8), 1911–1925 (2007)
21. York, J.W. Jr.: Conformal "thin-sandwich" data for the initial-value problem of general relativity. Phys. Rev. Lett. 82(7), 1350–1353 (1999)

Communicated by P.T. Chruściel

Commun. Math. Phys. 302, 737–753 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1183-8

Communications in Mathematical Physics

Ergodic Solenoidal Homology: Realization Theorem

Vicente Muñoz1, Ricardo Pérez-Marco2

1 Facultad de Matemáticas, Universidad Complutense de Madrid, Plaza de Ciencias 3, 28040 Madrid, Spain. E-mail: [email protected]

2 CNRS, LAGA UMR 7539, Université Paris XIII, 99, Avenue J.-B. Clément, 93430 Villetaneuse, France. E-mail: [email protected]

Received: 15 October 2009 / Accepted: 13 September 2010
Published online: 11 February 2011 – © Springer-Verlag 2011

Partially supported through Spanish MEC grant MTM2007-63582.

Second author supported by CNRS (UMR 7539).

Abstract: We define generalized currents associated with immersions of abstract oriented solenoids with a transversal measure. We realize geometrically the full real homology of a compact manifold with these generalized currents, and more precisely with immersions of minimal uniquely ergodic solenoids. This makes precise and geometric De Rham’s realization of the real homology by only using a restricted geometric subclass of currents.

1. Introduction

We consider a smooth compact connected oriented manifold $M$ of dimension $n \geq 1$. Any closed oriented submanifold $N \subset M$ of dimension $0 \leq k \leq n$ determines a homology class in $H_k(M, \mathbb{Z})$. This homology class in $H_k(M, \mathbb{R})$, as dual of De Rham cohomology, is explicitly given by integration of the restriction to $N$ of differential $k$-forms on $M$. Also, any immersion $f : N \to M$ defines an integer homology class in a similar way by integration of pull-backs of $k$-forms. Unfortunately, for topological reasons dating back to Thom [13,14], not all integer homology classes in $H_k(M, \mathbb{Z})$ can be realized in such a way. Geometrically, we can realize any class in $H_k(M, \mathbb{Z})$ by topological $k$-chains. The real homology classes in $H_k(M, \mathbb{R})$ are only realized by formal combinations with real coefficients of $k$-cells. This is not satisfactory for various reasons. In particular, for diverse purposes it is important to have an explicit realization, as geometric as possible, of real homology classes.

The first contribution in this direction came in 1957 from the work of S. Schwartzman [9]. Schwartzman showed how, by a limiting procedure, one-dimensional curves embedded in $M$ can define a real homology class in $H_1(M, \mathbb{R})$. More precisely, he proved that this happens for almost all curves that are solutions of a differential equation admitting an

invariant ergodic probability measure. Schwartzman’s idea is very natural. It consists of integrating 1-forms over large pieces of the parametrized curve and normalizing this integral by the length of the parametrization. Under suitable conditions, the limit exists and defines an element of the dual of $H^1(M, \mathbb{R})$, i.e. an element of $H_1(M, \mathbb{R})$. This procedure is equivalent to the more geometric one of closing large pieces of the curve by relatively short closing paths. The closed curve obtained defines an integer homology class. The normalization by the length of the parameter range provides a class in $H_1(M, \mathbb{R})$. Under suitable hypotheses, there exists a unique limit in real homology when the pieces exhaust the parametrized curve, and this limit is independent of the closing procedure. In the article [5], we study the different aspects of the Schwartzman procedure, which we extend to higher dimension.

Later, in 1975, D. Ruelle and D. Sullivan [8] defined, for arbitrary dimension $0 \leq k \leq n$, geometric currents by using oriented $k$-laminations embedded in $M$ and endowed with a transversal measure. They applied their results to stable and unstable laminations of Axiom A diffeomorphisms. In a later article Sullivan [11] extended these results and their applications further. The point of view of Ruelle and Sullivan is also based on duality. The observation is that $k$-forms can be integrated on each leaf of the lamination and then all over the lamination using the transversal measure. This makes sense locally in each flow-box, and then it can be extended globally by using a partition of unity. The result only depends on the cohomology class of the $k$-form. In [4] we review and extend Ruelle-Sullivan theory.

It is natural to ask whether it is possible to realize every real homology class using a topologically minimal (i.e. all leaves are dense) Ruelle-Sullivan current.
In order to achieve this goal we must enlarge the class of Ruelle-Sullivan currents by considering immersions of abstract oriented solenoids. We define a $k$-solenoid to be a Hausdorff compact space foliated by $k$-dimensional leaves with finite dimensional transversal structure (see the precise definition in Sect. 2). For these oriented solenoids we can consider $k$-forms that we can integrate provided that we are given a transversal measure invariant by the holonomy group. We define an immersion of a solenoid $S$ into $M$ to be a regular map $f : S \to M$ that is an immersion in each leaf. If the solenoid $S$ is endowed with a transversal measure $\mu$, then any smooth $k$-form in $M$ can be pulled back to $S$ by $f$ and integrated. The resulting numerical value only depends on the cohomology class of the $k$-form. Therefore we have defined a closed current that we denote by $(f, S_\mu)$ and that we call a generalized current. This gives a homology class $[f, S_\mu] \in H_k(M, \mathbb{R})$. Our main result is:

Theorem 1.1 (Realization Theorem). Every real homology class in $H_k(M, \mathbb{R})$ can be realized by a generalized current $(f, S_\mu)$, where $S_\mu$ is an oriented, minimal, uniquely ergodic solenoid.

Minimal and uniquely ergodic solenoids are defined later on. This result strengthens De Rham’s realization theorem of homology classes by abstract currents, i.e. forms whose coefficients are distributions. It is a geometric De Rham’s Theorem where the abstract currents are replaced by generalized currents that are geometric objects.

One may ask why we need to enlarge the class of Ruelle-Sullivan currents. The result does not hold for minimal Ruelle-Sullivan currents due to the following result from [4] (compare with [3]).

Theorem 1.2 [4, Cor. 10.2]. Homology classes with non-zero self-intersection cannot be represented by Ruelle-Sullivan currents with no compact leaves.


Therefore it is not possible to represent a real homology class in $H_k(M, \mathbb{R})$ with non-zero self-intersection by a minimal Ruelle-Sullivan current that is not a submanifold. Note that this obstruction only exists when $n - k$ is even. This may be the historical reason behind the lack of results on the representation of an arbitrary homology class by minimal Ruelle-Sullivan currents.

The space of solenoids is large, and we would like to realize the real homology classes by a minimal class of solenoids enjoying good properties. We are first naturally led to topological minimality. As we prove in [4], the space of $k$-solenoids is inductive and therefore there are always minimal $k$-solenoids. However, the transversal structure and the holonomy group of minimal solenoids can have a rich structure. In particular, such a solenoid may have many distinct transversal measures, each one yielding a different generalized current for the same immersion $f$. Also, when we push Schwartzman’s ideas beyond 1-homology for some nice classes of solenoids, we see that in general, even when the immersion is an embedding, the generalized current does not necessarily coincide with the Schwartzman homology class of the immersion of each leaf (actually this Schwartzman class need not even be well defined). Indeed the classical literature lacks information about the precise relation between Ruelle-Sullivan and Schwartzman currents.

One would naturally expect that there is some relation between the generalized currents and the Schwartzman current (if defined) of the leaves of the lamination. We study this problem in [5]. The main result is that there is such a relation for the class of minimal, ergodic solenoids with a trapping region. A solenoid with a trapping region (see the definition in Sect. 2) has holonomy group generated by a single map. Then the bridge between generalized currents and Schwartzman currents of the leaves is provided by Birkhoff’s ergodic theorem.
The main result of [5] is the following.

Theorem 1.3 [5, Theorems 1.1 & 1.2]. Let $S_\mu$ be a minimal solenoid endowed with an ergodic transversal measure $\mu$ and possessing a trapping region $W$. Let $f : S_\mu \to M$ be an immersion of $S_\mu$ into $M$ such that $f(W)$ is contained in a ball of $M$. Then for $\mu$-almost all leaves $l \subset S_\mu$, the Schwartzman homology class of $f(l) \subset M$ is well defined and coincides with the homology class $[f, S_\mu]$. If moreover $S$ is uniquely ergodic, then this happens for all leaves.

(We recall the definition of Schwartzman homology class and trapping region in Sect. 2.) The solenoids constructed for the proof of the Realization Theorem satisfy the hypotheses of this theorem and the transversal measure is unique, that is, the solenoids are uniquely ergodic.

Solenoidal Hodge Conjecture. The Hodge Conjecture is a statement about the geometric realization of an integral class of pure type $(p, p)$ in a complex (projective) manifold. If we drop the condition of the class being integral, then Theorem 1.1 suggests a natural conjecture for real homology classes of pure type as follows. For a compact Kähler manifold $M$ of complex dimension $n$, a complex immersed solenoid $f : S_\mu \to M$ (that is, a solenoid where the images $f(l)$ of the leaves $l \subset S_\mu$ are complex immersed submanifolds), of dimension $k = 2(n - p)$, defines a class in $H_{n-p,n-p}(M) = H^{p,p}(M)^* \subset H_k(M, \mathbb{R})$, as proved in Proposition 9.3 of [4]. It is natural to formulate the following conjecture:

Conjecture 1.4 (Solenoidal Hodge Conjecture). Let $M$ be a compact Kähler manifold. Then any class in $H^{p,p}(M)$ is represented by a complex immersed solenoid of dimension $k = 2(n - p)$.


Note that the standard Hodge Conjecture is stated for projective complex manifolds, since it fails for Kähler manifolds [16]. The counterexamples of [16] are non-algebraic complex tori. It is easy to see that Conjecture 1.4 holds for complex tori (using nonminimal complex solenoids).

2. Solenoids and Generalized Currents

Let us review the main concepts introduced in [4].

Definition 2.1. A $k$-solenoid, where $k \geq 0$, of class $C^{r,s}$, is a compact Hausdorff space endowed with an atlas of flow-boxes $\mathcal{A} = \{(U_i, \varphi_i)\}$, $\varphi_i : U_i \to D^k \times K(U_i)$, where $D^k$ is the $k$-dimensional open ball, and $K(U_i) \subset \mathbb{R}^l$ is the transversal set of the flow-box. The changes of charts $\varphi_{ij} = \varphi_i \circ \varphi_j^{-1}$ are of the form

\[ \varphi_{ij}(x, y) = (X(x, y), Y(y)), \tag{1} \]

where $X(x, y)$ is of class $C^{r,s}$ and $Y(y)$ is of class $C^s$.

Let $S$ be a $k$-solenoid, and $U \cong D^k \times K(U)$ be a flow-box for $S$. The sets $L_y = D^k \times \{y\}$ are called the (local) leaves of the flow-box. A leaf $l \subset S$ of the solenoid is a connected $k$-dimensional manifold whose intersection with any flow-box is a collection of local leaves. The solenoid is oriented if the leaves are oriented (in a transversally continuous way).

A transversal for $S$ is a subset $T$ which is a finite union of transversals of flow-boxes. Given two local transversals $T_1$ and $T_2$ and a path contained in a leaf from a point of $T_1$ to a point of $T_2$, there is a well-defined holonomy map $h : T_1 \to T_2$. The holonomy maps form a pseudo-group.

A $k$-solenoid $S$ is minimal if it does not contain a proper sub-solenoid. By [4, Sect. 2], minimal sub-solenoids do exist in any solenoid. If $S$ is minimal, then any transversal is a global transversal, i.e., it intersects all leaves. In the special case of an oriented minimal 1-solenoid, the holonomy return map associated to a local transversal, $R_T : T \to T$, is known as the Poincaré return map (see [4, Sect. 4]).

Definition 2.2. Let $S$ be a $k$-solenoid. A transversal measure $\mu = (\mu_T)$ for $S$ associates to any local transversal $T$ a locally finite measure $\mu_T$ supported on $T$; these measures are invariant by the holonomy pseudogroup, i.e. if $h : T_1 \to T_2$ is a holonomy map, then $h_* \mu_{T_1} = \mu_{T_2}$.

We denote by $S_\mu$ a $k$-solenoid $S$ endowed with a transversal measure $\mu = (\mu_T)$. We refer to $S_\mu$ as a measured solenoid. Observe that for any transversal measure $\mu = (\mu_T)$ the scalar multiple $c\,\mu = (c\,\mu_T)$, where $c > 0$, is also a transversal measure. Notice that there is no natural scalar normalization of transversal measures.


Definition 2.3 (Transverse ergodicity). A transversal measure $\mu = (\mu_T)$ on a solenoid $S$ is ergodic if for any Borel set $A \subset T$ invariant by the pseudo-group of holonomy maps on $T$, we have $\mu_T(A) = 0$ or $\mu_T(A) = \mu_T(T)$. We say that $S_\mu$ is an ergodic solenoid.

Definition 2.4. Let $S$ be a $k$-solenoid. The solenoid $S$ is uniquely ergodic if it has a unique (up to scalars) transversal measure $\mu$ and its support is the whole of $S$.

Now let $M$ be a smooth manifold of dimension $n$. An immersion of a $k$-solenoid $S$ into $M$, with $k < n$, is a smooth map $f : S \to M$ such that the differential restricted to the tangent spaces of leaves has rank $k$ at every point of $S$. The solenoid $f : S \to M$ is transversally immersed if for any flow-box $U \subset S$ and chart $V \subset M$, the map $f : U = D^k \times K(U) \to V \subset \mathbb{R}^n$ is an embedding, and the images of the leaves intersect transversally in $M$. If moreover $f$ is injective, then we say that the solenoid is embedded. Note that under a transversal immersion, resp. an embedding, $f : S \to M$, the images of the leaves are immersed, resp. injectively immersed, submanifolds.

Definition 2.5 (Generalized currents). Let $S$ be an oriented $k$-solenoid of class $C^{r,s}$, $r \geq 1$, endowed with a transversal measure $\mu = (\mu_T)$. An immersion $f : S \to M$ defines a current $(f, S_\mu) \in \mathcal{C}_k(M)$, called generalized Ruelle-Sullivan current (or just generalized current), as follows. Let $\omega$ be a $k$-differential form in $M$. The pull-back $f^*\omega$ defines a $k$-differential form on the leaves of $S$. Let $S = \bigsqcup_i S_i$ be a measurable partition such that each $S_i$ is contained in a flow-box $U_i$. We define

\[ \langle (f, S_\mu), \omega \rangle = \sum_i \int_{K(U_i)} \left( \int_{L_y \cap S_i} f^*\omega \right) d\mu_{K(U_i)}(y), \]

where $L_y$ denotes the horizontal disk of the flow-box. The current $(f, S_\mu)$ is closed, hence it defines a real homology class $[f, S_\mu] \in H_k(M, \mathbb{R})$, called Ruelle-Sullivan homology class.

Note that this definition does not depend on the measurable partition (given two partitions consider the common refinement). If the support of $f^*\omega$ is contained in a flow-box $U$ then

\[ \langle (f, S_\mu), \omega \rangle = \int_{K(U)} \left( \int_{L_y} f^*\omega \right) d\mu_{K(U)}(y). \]

In general, take a partition of unity $\{\rho_i\}$ subordinated to the covering $\{U_i\}$, then

\[ \langle (f, S_\mu), \omega \rangle = \sum_i \int_{K(U_i)} \left( \int_{L_y} \rho_i f^*\omega \right) d\mu_{K(U_i)}(y). \]


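Let us see that $(f, S_\mu)$ is closed. For any exact differential $\omega = d\alpha$ we have

\[
\begin{aligned}
\langle (f, S_\mu), d\alpha \rangle
&= \sum_i \int_{K(U_i)} \left( \int_{L_y} \rho_i f^* d\alpha \right) d\mu_{K(U_i)}(y) \\
&= \sum_i \int_{K(U_i)} \left( \int_{L_y} d(\rho_i f^*\alpha) \right) d\mu_{K(U_i)}(y)
 - \sum_i \int_{K(U_i)} \left( \int_{L_y} d\rho_i \wedge f^*\alpha \right) d\mu_{K(U_i)}(y) = 0.
\end{aligned}
\]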

The first term vanishes using Stokes in each leaf (the form $\rho_i f^*\alpha$ is compactly supported on $U_i$), and the second term vanishes because $\sum_i d\rho_i \equiv 0$. Therefore $[f, S_\mu]$ is a well defined homology class of degree $k$.

In their original article [8], Ruelle and Sullivan defined this notion for the restricted class of solenoids embedded in $M$.

When $M$ is a compact and oriented $n$-manifold, the Ruelle-Sullivan homology class $[f, S_\mu] \in H_k(M, \mathbb{R})$ gives an element $[f, S_\mu]^* \in H^{n-k}(M, \mathbb{R})$, under the Poincaré duality isomorphism $H_k(M, \mathbb{R}) \cong H^{n-k}(M, \mathbb{R})$. We have the following result (Theorem 10.1 in [4]) which proves Theorem 1.2.

Theorem 2.6 (Self-intersection of embedded solenoids). Let $M$ be a compact, oriented, smooth manifold. Let $f : S_\mu \to M$ be an embedded oriented measured solenoid, such that the transversal measures $(\mu_T)$ have no atoms. Then we have $[f, S_\mu]^* \cup [f, S_\mu]^* = 0$ in $H^{2(n-k)}(M, \mathbb{R})$.

This indicates that we cannot use only embedded solenoids to represent real homology classes in general.

Now let us recall the notions of Schwartzman theory that we are going to need, and that are extensively studied in [5]. Let $M$ be a compact smooth Riemannian manifold. Given a Riemannian immersion $c : N \to M$ from an oriented complete smooth manifold $N$ of dimension $k \geq 1$, we consider exhaustions $(U_n)$ of $N$ with $U_n \subset N$ being $k$-dimensional compact submanifolds with boundary $\partial U_n$. We close $U_n$ with a $k$-dimensional oriented manifold $\Gamma_n$ with boundary $\partial \Gamma_n = -\partial U_n$ (that is, $\partial U_n$ with opposite orientation, so that $N_n = U_n \cup \Gamma_n$ is a $k$-dimensional compact oriented manifold without boundary), in such a way that $c|_{U_n}$ extends to a piecewise smooth map $c_n : N_n \to M$. We may consider the associated homology class $[c_n(N_n)] \in H_k(M, \mathbb{Z})$. Suppose that

\[ \frac{\mathrm{Vol}_k(c_n(\Gamma_n))}{\mathrm{Vol}_k(c_n(N_n))} \to 0. \]

If the following limit exists:

\[ \lim_{n \to +\infty} \frac{1}{\mathrm{Vol}_k(c_n(N_n))}\, [c_n(N_n)] \in H_k(M, \mathbb{R}), \tag{2} \]

we call it a Schwartzman asymptotic $k$-cycle.
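The limit (2) can be checked numerically in the simplest case k = 1. The sketch below is an illustration chosen here, not an example from the paper: it takes the line c(t) = (t, αt) on the torus R^2/Z^2, closes the piece of arc length L by a short path to the nearest lattice point of the lift, and divides the resulting integer class by L. The function names and the choice of α are hypothetical; the closing path has length at most √2/2, so normalizing by L instead of the length of the closed curve changes the result only by O(1/L).

```python
import math


def schwartzman_estimate(alpha, length):
    """Approximate the normalized class in (2) for c(t) = (t, alpha*t) on R^2/Z^2.

    The piece of arc length `length` is closed by a short segment to the
    nearest lattice point, so its integer homology class is the rounded
    endpoint of the lift to R^2.
    """
    speed = math.hypot(1.0, alpha)           # |c'(t)|
    t_end = length / speed                   # parameter value at arc length `length`
    endpoint = (t_end, alpha * t_end)        # endpoint of the lifted curve
    integer_class = (round(endpoint[0]), round(endpoint[1]))
    # Normalize by the length of the piece (the closing path is negligible).
    return (integer_class[0] / length, integer_class[1] / length)


def schwartzman_limit(alpha):
    """Expected limit: the unit tangent vector (1, alpha) / sqrt(1 + alpha^2)."""
    speed = math.hypot(1.0, alpha)
    return (1.0 / speed, alpha / speed)


if __name__ == "__main__":
    alpha = (math.sqrt(5) - 1) / 2           # an irrational slope (illustrative choice)
    for length in (1e1, 1e2, 1e3, 1e4):
        print(length, schwartzman_estimate(alpha, length), schwartzman_limit(alpha))
```

Each coordinate of the estimate differs from the limit by at most 0.5/L, since rounding moves the endpoint by at most 1/2 in each coordinate.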


Definition 2.7. The immersed manifold $c : N \to M$ represents a homology class $a \in H_k(M, \mathbb{R})$ if for all exhaustions $(U_n)$, the class (2) exists and equals $a$. We denote $[c, N] = a$, and call it the Schwartzman homology class of $(c, N)$.

For immersed solenoids $f : S \to M$, we may consider the Schwartzman homology classes associated to its leaves.

Definition 2.8 (Schwartzman representation of homology classes). Let $f : S_\mu \to M$ be an immersion in $M$ of an oriented measured $k$-solenoid $S$, and give $S$ the induced Riemannian structure. The immersed solenoid $f : S_\mu \to M$ fully represents a homology class $a \in H_k(M, \mathbb{R})$ if for all leaves $l \subset S$, we have that $(f, l)$ is a Schwartzman asymptotic $k$-cycle with $[f, l] = a$.

A class of solenoids with good properties are those which have a trapping region, since for them the holonomy is represented by a single map. The definition is cumbersome but very natural [5, Definition 7.9].

Definition 2.9 (Trapping region). An open subset $W \subset S$ of a solenoid $S$ is a trapping region if there exists a continuous map $\pi : S \to \mathbb{T} = \mathbb{R}/\mathbb{Z}$ such that

(1) For some $0 < \epsilon_0 < 1/2$, $W = \pi^{-1}((-\epsilon_0, \epsilon_0))$.
(2) There is a global transversal $T \subset \pi^{-1}(\{0\})$.
(3) Each connected component of $\pi^{-1}(\{0\})$ intersects $T$ in exactly one point.
(4) 0 is a regular value for $\pi$.
(5) For each connected component $L$ of $\pi^{-1}(\mathbb{T} - \{0\})$ we have $\overline{L} \cap T = \{x, y\}$, where $\{x\} = \overline{L} \cap T \cap \pi^{-1}((-\epsilon_0, 0])$ and $\{y\} = \overline{L} \cap T \cap \pi^{-1}([0, \epsilon_0))$.

The main result of [5] is the following theorem.

Theorem 2.10 [5, Theorem 1.2]. Let $S$ be a minimal oriented $k$-solenoid endowed with a transversal uniquely ergodic measure $\mu \in \mathcal{M}_{\mathcal{L}}(S)$ and with a trapping region $W \subset S$. Consider an immersion $f : S \to M$ such that $f(W)$ is contained in a contractible ball in $M$. Then $f : S_\mu \to M$ fully represents its Ruelle-Sullivan homology class $[f, S_\mu]$.

3. Realization of $H_1(M, \mathbb{R})$

Let $M$ be a $C^\infty$ smooth compact Riemannian manifold. Given a real 1-homology class $a \in H_1(M, \mathbb{R})$, we want to construct an immersion $f : S \to M$ in $M$ of a uniquely ergodic solenoid $S_\mu$ with generalized current $[f, S_\mu] = a$. In some situations (depending on the dimension) we will achieve an embedding. Actually the abstract 1-solenoid $S$ that we will construct is independent of $a$ and of $M$, and moreover it has a 1-dimensional transversal structure.

Let $h : \mathbb{T} \to \mathbb{T}$ be a diffeomorphism of the circle with an irrational rotation number (and therefore uniquely ergodic), which is a Denjoy counter-example, i.e. has the unique invariant probability measure supported on the minimal Cantor set $K \subset \mathbb{T}$. Let $\mu_K$ denote the invariant probability measure. For the original construction of Denjoy counter-examples see [1]. Actually for any given $\epsilon > 0$, $h$ can be taken to be of class $C^{2-\epsilon}$ (see [2]). The suspension of $h$,

\[ S_h = ([0, 1] \times \mathbb{T}) /_{(0,x) \sim (1,h(x))}, \]

is $C^{2-\epsilon}$-diffeomorphic to the 2-torus $T^2$. More explicitly, the diffeomorphism is as follows: take $c > 0$ small, let $h_t$, $t \in [0, c]$, be a (smooth) isotopy from $\mathrm{id}$ to $h$, then we define the diffeomorphism $H : T^2 \to S_h$ by

\[ H(t, x) = \begin{cases} (t, h^{-1}(h_t(x))), & \text{for } t \in [0, c], \\ (t, x), & \text{for } t \in [c, 1]. \end{cases} \]

Note that $S_h$ is foliated by the horizontal leaves, so $T^2$ is foliated accordingly. It can be considered also as a 1-solenoid of class $C^{\omega, 2-\epsilon}$. The sub-solenoid

\[ S = ([0, 1] \times K)/\!\sim \ \subset S_h \]

is an oriented 1-solenoid of class $C^{\omega, 2-\epsilon}$, with transversal $T = (\{0\} \times \mathbb{T}) \cap S = \{0\} \times K$. The holonomy is given by the map $h$, which is uniquely ergodic. Moreover, the associated transversal measure is $\mu_K$ on the transversal $K \cong \{0\} \times K$. So $S$ is an oriented and uniquely ergodic 1-solenoid.

Using the diffeomorphism $H$, we may see the solenoid $S$ inside the 2-torus, $S \subset S_h \cong T^2$, consisting of the paths $(t, x)$, $x \in K$, $t \in [c, 1]$, together with the paths $(t, h_t(x))$, $x \in K$, $t \in [0, c]$. The embedding $S \to T^2$ is of class $C^{\omega, 2-\epsilon}$, so we shall think of $S$ as an oriented 1-solenoid of regularity $C^{\omega, 2-\epsilon}$ (Fig. 1).

Fig. 1. The 1-solenoid S

Theorem 3.1. Let $M$ be a compact smooth manifold, and let $a \in H_1(M, \mathbb{R})$ be a non-zero 1-homology class. If $\dim M \geq 3$ then (a positive multiple of) $a$ can be fully represented by an embedding (of class $C^{\infty, 2-\epsilon}$) of the (oriented, uniquely ergodic) 1-solenoid $S$ into $M$. If $\dim M = 2$ then (a positive multiple of) $a$ can be fully represented by a transversal immersion of $S$ into $M$.

Proof. Let $C_1, \ldots, C_{b_1}$ be (integral) 1-cycles which form a basis of the (real) 1-homology of $M$. Switch orientations and reorder the cycles if necessary so that there are real numbers $\lambda_1, \ldots, \lambda_r > 0$ such that $a = \lambda_1 C_1 + \cdots + \lambda_r C_r$. By dividing by $\sum \lambda_i$ if necessary, we can assume that $\sum \lambda_i = 1$. Consider the solenoid $S$ constructed above and partition the Cantor set $K$ into $r$ disjoint compact subsets $K_1, \ldots, K_r$ in cyclic order, each of which with

\[ \mu_K(K_i) = \lambda_i. \]
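The role of unique ergodicity in the condition $\mu_K(K_i) = \lambda_i$ can be illustrated with a simplified model. A Denjoy counter-example is semi-conjugate to the rigid rotation with the same rotation number, and the semi-conjugacy carries $\mu_K$ to Lebesgue measure on the circle. The sketch below therefore replaces $h$ by a rigid rotation and the sets $K_i$ by consecutive arcs of lengths $\lambda_i$; this substitution, the function name and the parameter values are assumptions made here for illustration, not the construction of the paper. It checks that the fraction of time an orbit spends in the i-th arc approaches $\lambda_i$.

```python
import bisect
import math


def visit_frequencies(theta, weights, n_iter, x0=0.0):
    """Visit frequencies of the orbit of x -> x + theta (mod 1) to consecutive arcs.

    The arcs are [b_i, b_{i+1}) with b_0 = 0 and b_{i+1} - b_i = weights[i];
    the weights are assumed to be positive and to sum to 1.
    """
    bounds = [0.0]
    for w in weights:
        bounds.append(bounds[-1] + w)
    counts = [0] * len(weights)
    x = x0 % 1.0
    for _ in range(n_iter):
        # Index of the arc containing x; the clamp guards against rounding
        # in the last cumulative bound.
        i = min(bisect.bisect_right(bounds, x) - 1, len(weights) - 1)
        counts[i] += 1
        x = (x + theta) % 1.0
    return [c / n_iter for c in counts]


if __name__ == "__main__":
    theta = (math.sqrt(5) - 1) / 2           # irrational rotation number (illustrative)
    lambdas = [0.2, 0.3, 0.5]                # hypothetical coefficients with sum 1
    print(visit_frequencies(theta, lambdas, 100_000))
```

For an irrational rotation the frequencies converge to the arc lengths for every starting point, which is the model version of the statement that the transversal measure, and hence the weights $\lambda_i$, are determined by the dynamics.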


Fig. 2. The open manifold U

Consider the transversal $T = \{0\} \times \mathbb{T}$ in $S_h$. We consider angles $\theta_1, \theta_2, \ldots, \theta_r \in \mathbb{T}$ in the same cyclic order as the $K_i$, such that $K_i$ is contained in the open subset $U_i \subset \mathbb{T}$ with boundary points $\theta_i$ and $\theta_{i+1}$ (denoting $\theta_{r+1} = \theta_1$). We may assume that $\theta_1 = 0$. Remove the segments $[c, 1] \times \{\theta_i\}$ from $S_h$ to get the open 2-manifold (Fig. 2)

\[ U = S_h - \bigcup_i ([c, 1] \times \{\theta_i\}). \]

By construction, our solenoid $S$ is included as a subset of $U$, $S \subset U$.

Suppose that $\dim M \geq 3$. Then we can $C^\infty$-smoothly embed $F : U \to M$ as follows: suppose that all cycles $C_i$ share a common base-point $p_0 \in M$ (and are otherwise disjoint from each other). Then embed the central part $(0, c) \times \mathbb{T} \subset U$ in a small ball $B$ around $p_0$ and embed each of the $[c, 1] \times U_i$ in $M - B$ in such a way that if we contract $B$ to $p_0$ then the images of $[c, 1] \times \{t\}$, $t \in U_i$, represent cycles homologous to $C_i$.

The embedding $f$ of $S$ into $M$ is defined as the composition $S \hookrightarrow U \xrightarrow{F} M$. By Theorem 2.10, as $S$ is uniquely ergodic, to prove that $f : S \to M$ fully represents $a$, it is enough to see that $[f, S_\mu] = a$. Let $\alpha$ be any closed 1-form on $M$. Since $H^1(M) = H^1(M, B)$, we may assume that $\alpha$ vanishes on $B$. We cover the solenoid $S$ by the flow-boxes $((0, c) \times \mathbb{T}) \cap S$ and $[c, 1] \times K_i$, $i = 1, \ldots, r$. As $f^*\alpha$ vanishes in the first flow-box, we have

\[
\begin{aligned}
\langle [f, S_\mu], [\alpha] \rangle
&= \sum_{i=1}^r \int_{K_i} \left( \int_{[c,1]} f^*\alpha \right) d\mu_{K_i}(y)
 = \sum_{i=1}^r \int_{K_i} \langle C_i, [\alpha] \rangle \, d\mu_{K_i}(y) \\
&= \sum_{i=1}^r \langle C_i, [\alpha] \rangle \, \mu(K_i)
 = \sum_{i=1}^r \lambda_i \langle C_i, [\alpha] \rangle = \langle a, [\alpha] \rangle,
\end{aligned}
\]

proving that $[f, S_\mu] = a$.

Now suppose that $\dim M = 2$. Let us do the appropriate modifications to the previous construction. Choose cycles $C_i$ sharing a common base-point $p_0 \in M$, and such that their intersections (and self-intersections) away from $p_0$ are transversal. Changing $C_i$ by $2C_i$ if necessary, we suppose that going around $C_i$ does not change the orientation (that is, the normal bundle to $C_i$ is oriented, hence trivial). From the manifold $U$ in Fig. 2, remove $[0, c] \times \{\theta_1\}$ to get the open 2-manifold (Fig. 3)

\[ V = ((0, c) \times (0, 1)) \cup \bigcup_i ([c, 1] \times U_i). \]

Fig. 3. The open manifold V

The manifold $V$ can be immersed into the surface $M$, $F : V \to M$, in such a way that $(0, c) \times (0, 1)$ is sent to a ball $B$ around $p_0$, $[c, 1] \times U_i$ are sent to $M - B$, the images of $[c, 1] \times \{t\}$, $t \in U_i$, represent cycles homologous to $C_i$ if we contract $B$ to a point, and the intersections and self-intersections of horizontal leaves are always transverse.

Note that the solenoid $S$ is not contained in $V$, since we have removed $[0, c] \times \{\theta_1\}$ from $U$. So we cannot define an immersion $f : S \to M$ by restricting that of $F$. To define $f$ in $S \cap ((0, c) \times \mathbb{T})$, we need to write explicitly our isotopy $h_t$. Consider $h : \mathbb{T} \to \mathbb{T}$ and lift it to $\tilde h : \mathbb{R} \to \mathbb{R}$ with $r := \tilde h(0) \in (0, 1)$. Consider a smooth function $\rho : \mathbb{R} \to [0, 1]$, with $\rho(t) = 1$ for $t \leq 0$, $\rho(t) = 0$ for $t \geq c$, and $\rho'(t) < 0$ for $t \in (0, c)$. Then we can define

\[ h_t(x) = \tilde h\big( \tilde h^{-1}(x)\rho(t) + x(1 - \rho(t)) \big) \mod \mathbb{Z}. \]

Define the immersion $f : S \to M$ as follows: $f$ equals $F$ for $(t, x) \in [c, 1] \times K \subset V$. For $(t, h^{-1}(h_t(x))) \in S \cap ([0, c] \times \mathbb{T})$, we set

\[ f(t, h^{-1}(h_t(x))) = \begin{cases} F\big(t, (\tilde h^{-1}(x) + 1)\rho(t) + x(1 - \rho(t))\big), & x \in K \cap (0, r), \\ F\big(t, \tilde h^{-1}(x)\rho(t) + x(1 - \rho(t))\big), & x \in K \cap (r, 1). \end{cases} \]

It is easily checked that $f$ sends $S \cap ([0, c] \times \mathbb{T})$ into the ball $B$ and the intersections of the leaves in this portion of the solenoid are transverse. The proof that the Ruelle-Sullivan homology class of $f : S \to M$ is $[f, S_\mu] = a$ goes as before.

Remark 3.2. We do not need $M$ to be compact for the above construction to work. If $M$ is non-compact, take integer 1-cycles $C_1, C_2, \ldots$ (possibly infinitely many) which form a basis of $H_1(M, \mathbb{R})$. Then for any $a \in H_1(M, \mathbb{R})$ there exist an integer $r \geq 1$ and $\lambda_1, \ldots, \lambda_r \in \mathbb{R}$ with $a = \sum \lambda_i C_i$. The construction of Theorem 3.1 works. The solenoid $S$ is oriented, regardless of $M$ being oriented or not.

4. Realization of $H_k(M, \mathbb{R})$

Let $M$ be a smooth compact oriented Riemannian $C^\infty$ manifold and let $a \in H_k(M, \mathbb{R})$ be a non-zero real $k$-homology class, $1 \leq k \leq n - 1$. We are going to construct a uniquely ergodic $k$-solenoid $f : S \to M$ with a 1-dimensional transversal structure, immersed in $M$ and fully representing $a$.


To start with, fix a collection of compact $k$-dimensional smooth oriented manifolds $S_1, \ldots, S_r$ and positive numbers $\lambda_1, \ldots, \lambda_r > 0$ such that $\sum \lambda_i = 1$. For any fixed $\epsilon > 0$, let $h : \mathbb{T} \to \mathbb{T}$ be a diffeomorphism of the circle which is a Denjoy counter-example with an irrational rotation number and of class $C^{2-\epsilon}$. Hence $h$ is uniquely ergodic. Let $\mu_K$ be the unique invariant probability measure, which is supported on the minimal Cantor set $K \subset \mathbb{T}$. Partition the Cantor set $K$ into $r$ disjoint compact subsets $K_1, \ldots, K_r$ in cyclic order, each of which with $\mu_K(K_i) = \lambda_i$.

We fix two points on each manifold $S_i$, and remove two small balls, $D_i^+$ and $D_i^-$, around them. Denote $S_i' = S_i - (D_i^+ \cup D_i^-)$, so that $S_i'$ is a manifold with oriented boundary $\partial S_i' = \partial D_i^+ \sqcup \partial D_i^-$. Fix two diffeomorphisms: $\partial D_i^+ \cong S^{k-1}$, being orientation preserving, and $\partial D_i^- \cong S^{k-1}$, being orientation reversing. There are inclusions

\[ i_\pm : A_\pm := \bigsqcup_i (\partial D_i^\pm \times K_i) \hookrightarrow S^{k-1} \times \mathbb{T}, \]

whose image is $S^{k-1} \times K \subset S^{k-1} \times \mathbb{T}$. Define

\[ S = \bigsqcup_i (S_i' \times K_i) / \sim, \qquad (x, y) \sim i_+^{-1} \circ (\mathrm{id} \times h) \circ i_-(x, y), \quad (x, y) \in A_-. \]

This is an oriented $k$-solenoid of class $C^{\infty, 2-\epsilon}$, with 1-dimensional transversal structure. As $S^{k-1} \times K \subset S$ in an obvious way, fixing a point $p \in S^{k-1}$ we have a global transversal $T = \{p\} \times K \subset S^{k-1} \times K \subset S$. Identifying $T \cong K$, the holonomy pseudo-group is generated by $h : K \to K$. Hence $S$ is uniquely ergodic. Let $\mu$ denote the transversal measure corresponding to $\mu_K$.

We want to give an alternative description of $S$. Fix an isotopy $h_t$, $t \in [0, 1]$, from $\mathrm{id}$ to $h$. Define the set (Fig. 4)

\[ \overline{W} := \{ (t, x, h^{-1}(h_t(y))) \; ; \; t \in [0, 1],\ x \in S^{k-1},\ y \in K \} \subset [0, 1] \times S^{k-1} \times \mathbb{T}. \]

Fig. 4. The trapping region W

Then we have that

\[ S = \Big( \bigsqcup_i (S_i' \times K_i) \sqcup \overline{W} \Big) / \sim, \]

where $(x, y) \sim (0, i_-(x, y))$ for $(x, y) \in \partial D_i^- \times K_i$, and $(x, y) \sim (1, i_+(x, y))$ for $(x, y) \in \partial D_i^+ \times K_i$. Strictly speaking, we should say that they are diffeomorphic, but we shall fix an identification. We define a map $\pi : S \to \mathbb{T} = \mathbb{R}/\mathbb{Z}$ by

\[ \pi(t, x, h^{-1}(h_t(y))) = t - \tfrac{1}{2}, \quad (t, x, h^{-1}(h_t(y))) \in \overline{W}, \qquad \pi(p) = \tfrac{1}{2}, \quad p \in S - \overline{W}. \]

Then $W = \mathrm{Int}(\overline{W}) = \pi^{-1}\big((-\tfrac{1}{2}, \tfrac{1}{2})\big)$ is a trapping region according to Definition 2.9.

Consider angles $\tau_1, \tau_2, \ldots, \tau_r \in \mathbb{T}$ in the same cyclic order as the $K_i$, such that $K_i$ is contained in the open subset $U_i \subset \mathbb{T}$ with boundary points $\tau_i$ and $\tau_{i+1}$ (denoting $\tau_{r+1} = \tau_1$). We may assume that $\tau_1 = 0$. Then the solenoid $S$ sits inside the $(k + 1)$-dimensional open manifold

\[ X = \Big( \bigsqcup_i (S_i' \times U_i) \sqcup ([0, 1] \times S^{k-1} \times \mathbb{T}) \Big) / \sim, \]

(where $(x, y) \sim (0, i_-(x, y))$ for $(x, y) \in \partial D_i^- \times U_i$, and $(x, y) \sim (1, i_+(x, y))$ for $(x, y) \in \partial D_i^+ \times U_i$), as the collection of points $(x, y)$, $x \in S_i'$, $y \in K_i$, together with the points $(t, x, h^{-1}(h_t(y)))$, $x \in S^{k-1}$, $y \in K$, $t \in [0, 1]$ (Fig. 5).

Fig. 5. The manifold X

Remark 4.1. The 1-solenoid constructed in Sect. 3 corresponds to the case $S_i = S^1$, $i = 1, \ldots, r$.

Theorem 4.2. Let $M$ be a compact oriented smooth Riemannian manifold of dimension $n$, and let $a \in H_k(M, \mathbb{R})$ be a non-zero real $k$-homology class, $1 \leq k \leq n - 1$. Then (a positive multiple of) $a$ can be fully represented by a transversal immersion $f : S \to M$ of a uniquely ergodic oriented $k$-solenoid. If moreover $n \geq 2k + 1$, then we can suppose that $f$ is an embedding.

Proof. By Proposition A.3, we may take a collection $C_1, \ldots, C_{b_k} \in H_k(M, \mathbb{Z})$ which are a basis of $H_k(M, \mathbb{Q})$ and such that $C_i$ are represented by immersed submanifolds $S_i \subset M$ with trivial normal bundle and self-transverse intersections, and such that $S_i$ intersects $S_j$ transversally. Moreover, if $n \geq 2k + 1$, we may assume that there are neither intersections nor self-intersections. After switching the orientations of $C_i$ if necessary, reordering the cycles and multiplying $a$ by a suitable positive real number, we may suppose that

\[ a = \lambda_1 C_1 + \cdots + \lambda_r C_r, \]


for some r ≥ 1, λi > 0, 1 ≤ i ≤ r , and λi = 1. We construct the solenoid S with the procedure above starting with the manifolds Si and coefficients λi . This is a uniquely ergodic k-solenoid with a 1-dimensional transversal structure, and a trapping region W ⊂ S. Now we want to define an immersion f : S → M, and to prove that it fully represents a. We have the following cases: (1)

n ≥ 2k + 1. The general position property on the Si implies that all Si are disjoint submanifolds of M. As the normal bundle to Si is trivial and Ui is an interval, we can embedded Si × Ui in a small neighbourhood of Si . Fix a base point p0 ∈ M off all Si . Take a small box B ⊂ M around p0 of the form B = [0, 1] × D n−1 , where D n−1 is the open (n − 1)-dimensional ball. Consider a circle T ⊂ D k+1 ⊂ D n−1 and let D k × T ⊂ D k+1 ⊂ D n−1 be a tubular neighbourhood of it, with boundary S k−1 × T. For each i = 1, . . . , r , fix yi ∈ Ui , and consider two paths in M − Int(B), say γi± , where γi− goes from the point (0, yi ) ∈ {0}×Ui ⊂ {0}×T ⊂ {0}× D n−1 ⊂ B to the point ( pi− , yi ) ∈ Si × Ui , and γi+ goes from (1, yi ) ∈ {1} × Ui ⊂ {1} × T ⊂ {1} × D n−1 ⊂ B to ( pi+ , yi ) ∈ Si × Ui . We arrange that γi± are transverse to Si × Ui at ( pi± , yi ) and are disjoint from all S j otherwise. We thicken γi± to immersions γi± × D k × Ui into M − Int(B) such that one extreme goes to Di± × Ui and the other goes to either D k × Ui × {0} ⊂ D k × T × {0} ⊂ D n−1 ×{0} ⊂ B for γi− , or D k ×Ui ×{1} ⊂ D k ×T×{1} ⊂ D n−1 ×{1} ⊂ B for γi− . It is possible to do this in such a way that the Ui directions match, since n ≥ k + 2. Recall that Si = Si − (Di+ ∪ Di− ), and set Si = Si ∪ (γi+ × S k−1 ) ∪ (γi− × S k−1 ), which is diffeomorphic to Si (to be rigorous, we should smooth out corners). Then we can define the set U := (Si × Ui ) ∪ (γi+ × S k−1 × Ui )∪ ∪(γi− × S k−1 × Ui ) ∪ ([0, 1] × S k−1 × T), which is a (k + 1)-dimensional open manifold embedded in M. The manifold U is foliated as follows: Si ×Ui is foliated by Si ×{y}, for y ∈ Ui , and [0, 1]× S k−1 ×T is foliated by L y = {(t, x, h −1 (h t (y))) ; t ∈ [0, 1], x ∈ S k−1 },

for y ∈ T. Clearly the solenoid S is a sub-solenoid of U, S ⊂ U. Restricting the embedding F : U → M to S we get an embedding f : S → M. By construction f(W) ⊂ Int(B), i.e. the image of the trapping region is contained in a contractible ball.

(2)

1 < n − k ≤ k. The same construction as in (1) works now, with the modification that we have to allow intersections of different leaves, but we may take them to be always transversal. So we get a transversal immersion f : S → M.


V. Muñoz, R. Pérez-Marco

(3)

n − k = 1. The submanifolds Si have trivial normal bundle and they intersect each other transversally. We cannot avoid that the paths γi± intersect other S j , but we arrange these intersections to be transverse. This produces a transversal immersion f of the region S − W of the solenoid into M − Int(B). We have to modify the previous construction of the immersion of W into B, as codimension one does not leave enough room for it to work. Consider the box B = [0, 1] × D n−1 and remove the axis A = [0, 1] × {0}. Use polar coordinates to identify B − A = [0, 1] × S k−1 × (0, 1), where the third coordinate corresponds to the radius. By construction, W ⊂ S embeds into C = [0, 1] × S k−1 × T, as the set of points (t, x, h −1 (h t (y))), t ∈ [0, 1], x ∈ S k−1 and y ∈ K . We remove D = [0, 1] × S k−1 × τ1 from C, so that C − D = [0, 1] × S k−1 × (0, 1). Then W immerses into C − D, by using the process at the end of the proof of Theorem 3.1 (now there is an extra factor S k−1 which plays no role). This is a transversal immersion. There is one extra detail that we should be careful about. When connecting pi± with the two faces of B, the orientations of the Ui should match. This happens because the normal bundle to Si is trivial, and in this case Si ×Ui is (diffeomorphic to) the normal bundle to Si .

We prove now that f : S → M fully represents a; we use Theorem 2.10. The solenoid S has a trapping region W, and f(W) ⊂ Int(B), a contractible ball in M. So we only need to see that $[f, S_\mu] = a$. Recall that the associated transversal measure is $\mu_K$ on the transversal K. Let α be any closed k-form on M. Since $H^k(M) = H^k(M, B)$, we may assume that α vanishes on B. We cover the solenoid S by the flow-boxes $S_i \times K_i$, i = 1, . . . , r, and W (where the form α vanishes). Thus

\[
\langle [f, S_\mu], [\alpha] \rangle
= \sum_{i=1}^{r} \int_{K_i} \Big( \int_{S_i} f^*\alpha \Big)\, d\mu_{K_i}(y)
= \sum_{i=1}^{r} \int_{K_i} \langle C_i, [\alpha] \rangle \, d\mu_{K_i}(y)
= \sum_{i=1}^{r} \langle C_i, [\alpha] \rangle \, \mu(K_i)
= \sum_{i=1}^{r} \lambda_i \langle C_i, [\alpha] \rangle
= \langle a, [\alpha] \rangle ,
\]

proving that $[f, S_\mu] = a$.

Remark 4.3. A similar comment to that of Remark 3.2 applies to the present situation, that is, the compactness of M is not necessary.

Remark 4.4. The orientability of M is not necessary either. If M is non-orientable, we may consider its oriented double cover $\pi : \tilde{M} \to M$. Then for $a \in H_k(M, \mathbb{R})$, there exists $\tilde{a} \in H_k(\tilde{M}, \mathbb{R})$ with $\pi_*(\tilde{a}) = a$. We can consider immersed submanifolds $f_i : S_i \to \tilde{M}$ with transversal self-intersections, and intersecting each other transversally. Then it is easy to perturb $f_i$ so that $\tilde{f}_i = \pi \circ f_i : S_i \to M$ are immersed oriented submanifolds with transversal self-intersections, and intersecting each other transversally. This allows us to construct a uniquely-ergodic oriented k-solenoid $f : S \to \tilde{M}$ transversally immersed in $\tilde{M}$ fully representing (a multiple of) $\tilde{a}$ such that $\pi \circ f : S \to M$ is transversally immersed in M and fully represents (a multiple of) a. If n ≥ 2k + 1, then we can assume that f is an embedding (since transversal intersections in this dimension do not happen).
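The dimension count behind the last sentence of Remark 4.4 can be written out explicitly. The following is only a sketch, assuming that $S_i$ and $S_j$ are k-dimensional and meet transversally in the n-dimensional manifold M; the symbols are those of the text.

```latex
% Transverse intersection of two k-dimensional submanifolds of an n-manifold:
% at a common point p the tangent spaces span T_p M, so
\[
  \dim (S_i \cap S_j) \;=\; \dim S_i + \dim S_j - \dim M \;=\; 2k - n .
\]
% If n \ge 2k + 1 this number is at most -1, which is impossible for a
% non-empty intersection; hence transversality forces
\[
  S_i \cap S_j = \emptyset \qquad (n \ge 2k + 1).
\]
```

The same count applied to self-intersections of a single immersed $S_i$ explains why the immersion can be taken to be an embedding in this range.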

Ergodic Solenoidal Homology: Realization Theorem


Remark 4.5. Theorem 4.2 also holds (obviously) for k = 0, n.

Remark 4.6. In the article [6], we prove that the currents that we have constructed are general enough to fill a dense subset of the space of currents. Therefore, the generalized Ruelle-Sullivan currents associated to immersed measured oriented uniquely-ergodic solenoids are dense in the space of closed currents. This question was suggested to the authors by Dennis Sullivan.

Acknowledgements. The authors are grateful to Alberto Candel, Etienne Ghys, Nessim Sibony, Dennis Sullivan and Jaume Amorós for their comments and interest in this work. In particular, Etienne Ghys pointed out early on the impossibility of realizing, in general, integer homology classes by embedded manifolds. The first author wishes to acknowledge Universidad Complutense de Madrid and the Institute for Advanced Study at Princeton for their hospitality and for providing excellent working conditions. The second author thanks Jean Bourgain and the IAS at Princeton for their hospitality and for facilitating the collaboration of both authors.

Appendix. Homology Classes Represented by Submanifolds

By a theorem of Thom (see [13] and [14]), if $a \in H_k(M, \mathbb{Z})$ then there exists $N \gg 1$ such that $N \cdot a$ is represented by a smooth submanifold of M. This submanifold $C \subset M$ is oriented because it represents a non-zero homology class (the top homology of a compact connected non-orientable manifold is zero). Moreover, if n ≥ 2k + 1 or n − k is odd then it can be arranged that the normal bundle of C is trivial [13,14]. If n − k is even then it can be arranged that the normal bundle is trivial if and only if $a \cup a = 0$. Also according to Sullivan [12], using Thom's method and the thesis of Wells [15] one can always represent $N \cdot a$ by an immersed submanifold $f : C \to M$ with trivial normal bundle. (Note that the normal bundle is defined for any immersed manifold.) Moreover, with a small perturbation, we may assume that f has only transversal self-intersections. For completeness, we give here a proof of these results by elementary methods. We start with the case of odd codimension.

Lemma A.1. Let M be a compact and oriented manifold of dimension n. Let 1 ≤ k ≤ n − 1 with n − k odd and $a \in H_k(M, \mathbb{Z})$. There exists $N \gg 1$ (dependent only on n and k) and a smooth map $f : M \to S^{n-k}$ such that for a generic point $p \in S^{n-k}$, $C = f^{-1}(p) \subset M$ is a smooth submanifold with trivial normal bundle such that $[C] = N \cdot a$.

Proof. Let $\hat{a} \in H^{n-k}(M, \mathbb{Z})$ be the Poincaré dual of a. We aim to construct a map $f : M \to S^{n-k}$ such that $f^*([S^{n-k}])$ is a multiple of $\hat{a}$. For this, consider a CW decomposition of M. Let $(C^*(M, \mathbb{Z}), \partial)$ be the complex of CW-cochains, and let $\bar{a} \in C^{n-k}(M, \mathbb{Z})$ be such that $\partial \bar{a} = 0$ and $[\bar{a}] = \hat{a}$. We start by considering a map f from the (n − k − 1)-skeleton of M to a base point $p \in S^{n-k}$. To define f in the (n − k)-skeleton, write

\[
\hat{a} = \sum_i n_i \, C_i^* ,
\]

with $\{C_i\}$ being the (n − k)-cells of M, and $\{C_i^*\}$ the dual basis. Then define $f|_{C_i}$ in such a way that the induced map $f|_{C_i} : C_i/\partial C_i \to S^{n-k}$ has degree $n_i$.


To extend f to the higher skeleta, we work as follows: let T be an (n − k + 1)-cell of M. Since $\hat{a}(\partial T) = \partial \hat{a}(T) = 0$, we have that $f|_{\partial T} : \partial T \to S^{n-k}$ has degree 0. Therefore, we can extend f to a map $T \to S^{n-k}$. Now by induction on l = 1, 2, . . . we assume that the map f has been extended to the (n − k + l − 1)-skeleton of M and we wish to extend it to the (n − k + l)-skeleton. Let T be an (n − k + l)-cell. The map $f|_{\partial T} : \partial T \to S^{n-k}$ gives, recalling that $\partial T \cong S^{n-k+l-1}$, an element $[f|_{\partial T}] \in \pi_{n-k+l-1}(S^{n-k})$. By a result of Serre [10], this group is torsion (since n − k is odd). So there is a non-zero integer $m_l$ such that $m_l \cdot [f|_{\partial T}] = 0$. Multiplying a by $m_l$, the map $f'$ (in the (n − k + l − 1)-skeleton) corresponding to $a' = m_l \cdot a$ is the composition of f with a map $S^{n-k} \to S^{n-k}$ of degree $m_l$. Therefore $[f'|_{\partial T}] = m_l \cdot [f|_{\partial T}] = 0$, and there is no obstruction to extending $f'$ to the cell T, and hence to the (n − k + l)-skeleton. In this way, we get an extension to the n-skeleton, i.e. to M. This gives a continuous map $f : M \to S^{n-k}$ and it is straightforward to verify that $f^*([S^{n-k}]) = N \cdot \hat{a}$, for some large integer N (actually, $N = m_2 m_3 \cdots m_k$). Now, we homotop f to a smooth function, which we call f again. Taking a regular value $p \in S^{n-k}$, we have a smooth submanifold $C = f^{-1}(p)$ of dimension k, and with trivial normal bundle. Clearly, $[C] = PD[N \cdot \hat{a}] = N \cdot a$.

Lemma A.2. Let M be a compact and oriented manifold of dimension n. Let 1 ≤ k ≤ n − 1 with n − k even and $a \in H_k(M, \mathbb{Z})$. There exists $N \gg 1$ (dependent only on n and k), and an immersion $i : C \to M$ of an oriented compact manifold C with $i_*[C] = N \cdot a$ and whose normal bundle $\nu_{C/M} \to C$ is trivial.

Proof. We consider $M \times \mathbb{R}$, which is an (n + 1)-manifold. It is open, but the proof of Lemma A.1 works for it and for the homology class $a \in H_k(M \times \mathbb{R}, \mathbb{Z}) \cong H_k(M, \mathbb{Z})$. Note that (n + 1) − k is odd, so Lemma A.1 guarantees the existence of a smooth k-dimensional submanifold $C \subset M \times \mathbb{R}$ with trivial normal bundle, and such that $[C] = N \cdot a$, for some N ≥ 1. Denote by $j : C \to M \times \mathbb{R}$ the inclusion, and let $\pi : M \times \mathbb{R} \to M$ be the projection onto the first factor. Denote by t the coordinate of the $\mathbb{R}$ direction, and by $\partial/\partial t$ the vertical vector field. Fixing a non-zero normal vector field X to $C \subset M \times \mathbb{R}$, the compression theorem in [7] allows us to isotop the pair (j, X) to $(j', \partial/\partial t)$, where $j' : C \to M \times \mathbb{R}$ is an embedding and $\partial/\partial t$ becomes a normal vector field to $j'(C)$. Therefore the composition $i = \pi \circ j' : C \to M$ is an immersion. Clearly, $i_*[C] = \pi_* j'_*[C] = \pi_*[C] = \pi_*(N \cdot a) = N \cdot a \in H_k(M, \mathbb{R})$ and the normal bundle to C in M is trivial.

The precise result that we use in Sect. 4 is the following:

Proposition A.3. Let M be a compact manifold of dimension n, and let $b_k = \dim H_k(M, \mathbb{R})$. Then we may take a collection $C_1, \ldots, C_{b_k} \in H_k(M, \mathbb{Z})$ which forms a basis of $H_k(M, \mathbb{Q})$ and such that the $C_i$ are represented by immersed submanifolds $S_i \subset M$ with trivial normal bundle and transverse self-intersections, and such that $S_i$ intersects $S_j$ transversally. Moreover, if n ≥ 2k + 1, we may assume that there are neither intersections nor self-intersections.


Proof. Using Lemma A.1 or Lemma A.2 (according to the parity of n − k), we may find a collection of immersed oriented compact submanifolds $S_i$ with trivial normal bundle representing a basis for the rational homology $H_k(M, \mathbb{Q})$. Now a small perturbation of each $S_i$ makes all intersections of $S_i$ with $S_j$, i ≠ j, and all self-intersections of $S_i$, transverse. If n ≥ 2k + 1, the transversality of the intersections implies that there are no intersections at all. So the result follows.

References
1. Denjoy, A.: Sur les courbes définies par les équations différentielles à la surface du tore. J. Math. Pures et Appliquées 11(9. série), 333–375 (1932)
2. Herman, M.R.: Sur la conjugaison différentiable des difféomorphismes du cercle à des rotations. Inst. Hautes Études Sci. Publ. Math. 49, 5–233 (1979)
3. Hurder, S., Mitsumatsu, Y.: The intersection product of transverse invariant measures. Indiana Univ. Math. J. 40(4), 1169–1183 (1991)
4. Muñoz, V., Pérez-Marco, R.: Ergodic solenoids and generalized currents. Revista Matematica Complutense. In press, doi:10.1007/s13163-010-0050-7, 2010
5. Muñoz, V., Pérez-Marco, R.: Schwartzman cycles and ergodic solenoids. In: Essays in Mathematics and its Applications. Dedicated to Stephen Smale, eds. P. Pardalos, Th.M. Rassias. Berlin-Heidelberg-New York: Springer. In press
6. Muñoz, V., Pérez-Marco, R.: Ergodic solenoidal homology: Density of ergodic solenoids. Australian J. Math. Anal. Appl. 6(1), Article 11, 1–8 (2009)
7. Rourke, C., Sanderson, B.: The compression theorem. Geometry & Topology 5, 399–429 (2001)
8. Ruelle, D., Sullivan, D.: Currents, flows and diffeomorphisms. Topology 14(4), 319–327 (1975)
9. Schwartzman, S.: Asymptotic cycles. Ann. Math. 66(2), 270–284 (1957)
10. Serre, J.-P.: Groupes d'homotopie et classes de groupes abéliens. Ann. Math. 58(2), 258–294 (1953)
11. Sullivan, D.: Cycles for the dynamical study of foliated manifolds and complex manifolds. Invent. Math. 36, 225–255 (1976)
12. Sullivan, D.: René Thom's work on geometric homology class and bordism. Bull. AMS 41(3), 341–350 (2004)
13. Thom, R.: Sous-variétés et classes d'homologie des variétés différentiables. I et II. C. R. Acad. Sci. Paris 236, 453–454 and 573–575 (1953)
14. Thom, R.: Quelques propriétés globales des variétés différentiables. Commentarii Mathematici Helvetici 28, 17–86 (1954)
15. Wells, R.: Cobordism groups of immersions. Topology 5, 281–294 (1966)
16. Zucker, S.: The Hodge conjecture for cubic fourfolds. Compositio Math. 34, 199–209 (1977)

Communicated by A. Connes

Commun. Math. Phys. 302, 755–788 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1188-y


Sugawara-Type Constraints in Hyperbolic Coset Models

Thibault Damour1, Axel Kleinschmidt2, Hermann Nicolai3

1 Institut des Hautes Etudes Scientifiques, 35, Route de Chartres, FR-91440 Bures-sur-Yvette, France
2 Physique Théorique et Mathématique, Université Libre de Bruxelles & International Solvay Institutes, ULB-Campus Plaine C.P. 231, BE-1050 Bruxelles, Belgium. E-mail: [email protected]
3 Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Am Mühlenberg 1, DE-14476 Potsdam, Germany

Received: 21 December 2009 / Accepted: 7 September 2010
Published online: 8 February 2011 – © Springer-Verlag 2011

Abstract: In the conjectured correspondence between supergravity and geodesic models on infinite-dimensional hyperbolic coset spaces, and $E_{10}/K(E_{10})$ in particular, the constraints play a central role. We present a Sugawara-type construction in terms of the $E_{10}$ Noether charges that extends these constraints infinitely into the hyperbolic algebra, in contrast to the truncated expressions obtained in Damour et al. (Class. Quant. Grav. 24:6097, 2007) that involved only finitely many generators. Our extended constraints are associated to an infinite set of roots which are all imaginary, and in fact fill the closed past light-cone of the Lorentzian root lattice. The construction makes crucial use of the $E_{10}$ Weyl group and of the fact that the $E_{10}$ model contains both D = 11 supergravity and D = 10 IIB supergravity. Our extended constraints appear to unite in a remarkable manner the different canonical constraints of these two theories. This construction may also shed new light on the issue of 'open constraint algebras' in traditional canonical approaches to gravity.

1. Introduction

In canonical formulations of gravity, the constraints are the essential ingredients for, and main obstacles to, carrying out a canonical quantization of gravity [1] (for an overview and bibliography see [2]). This applies in particular to the Hamiltonian (scalar) constraint determining evolution in 'time', and therefore the dynamics. The problem of properly setting up and defining the quantum constraints has been tackled in a variety of approaches but, arguably, the problem remains as open as in Bryce DeWitt's seminal 1967 paper [1]. A further cause of difficulties, shared by all approaches so far, can be traced to the fact that the constraints form an open algebra, that is, the structure 'constants' are not constants, but field dependent.

At the level of classical maximal supergravity, progress has been made in recent years towards establishing a correspondence between the equations of D = 11 supergravity on the one hand and a geodesic coset model based on the hyperbolic Kac–Moody structure $E_{10}$ [3] on the other (similar correspondences exist for other supergravity models). The


supergravity equations are treated canonically and therefore comprise dynamical (evolution) equations and constraint equations. There is a precise correspondence between a truncation of the dynamical equations and a truncation of the geodesic equation on the coset $E_{10}/K(E_{10})$ [3]. The D = 11 supergravity constraint equations can similarly be mapped to constraints that can be imposed consistently on the geodesic motion [4]. For instance, imposition of the Hamiltonian constraint implies that the geodesic is null. According to [4] the weakly conserved constraints of D = 11 supergravity can be translated into weakly conserved coset model constraints, which in turn allow for a reformulation as bilinear expressions in terms of conserved charges, that is, as strongly conserved constraints.1 As noted there, this construction is very reminiscent of the well-known Sugawara construction [5] for affine Lie algebras [6,7]. It is the purpose of the present paper to follow up on this observation, making it more precise and giving the beginning of a generalized Sugawara construction for hyperbolic Kac–Moody algebras which makes the analogy with the affine construction much more compelling. Understanding and reformulating supergravity in these algebraic terms could prove very useful for the transition to the quantum theory (see [8] for first steps towards the quantization of the $E_{10}/K(E_{10})$ model and [9] for pure gravity). An analogy to be kept in mind in this discussion is that of (bosonic) string theory. There, the dynamical equation for the embedding (target space) coordinates can be written as a free wave equation if one adopts a conformal gauge. This free wave equation admits an infinite set of conserved charges $\alpha_n^\mu$. The price to pay for the simple dynamical equation is that one has to impose the (Fubini-Veneziano-)Virasoro constraints, $L \sim \alpha\alpha$, on the solutions.
In the quantum version, the Virasoro constraints and the existence of a proper Hilbert space imply the critical dimension [10]. Assuming the validity of the Kac–Moody/supergravity correspondence, the dynamical equations of supergravity also become simple, yielding geodesics on a symmetric space as their solutions. This system is fully integrable. It admits an infinite set of conserved charges, J, that do not (Poisson) commute among themselves, and one can formally write down the general solution in terms of J and some initial data. The complications and interesting structures are then again to be found in the constraints and their algebra. The fact that all constraints found so far admit a Sugawara-like structure, i.e., $L \sim JJ$, is tantalizing in this analogy, and may turn out to be crucial for the quantisation of the theory. The gauge symmetries encoded in the coset constraints are directly linked to the space-time and gauge symmetries that are known from the geometrical formulation of supergravity. The replacement of the supergravity constraints by coset model constraints with an underlying algebraic structure may also shed new light on the old problem of open constraint algebras alluded to above, circumventing some of the seemingly insurmountable difficulties of the usual canonical formulation. The main new feature here is that the 'structure constants', while still dependent on the dynamical degrees of freedom (fields), become constants of motion in the present formulation. More explicitly, suppose the classical constraints $C_A(\phi)$ satisfy the first-class canonical (Poisson) algebra

\[
\{ C_A(\phi), C_B(\phi) \} = f_{AB}{}^{C}(\phi)\, C_C(\phi), \tag{1.1}
\]

where $\phi$ denotes the canonical variables.
1 As usual, the term 'weakly conserved constraints' here refers to a set of constraints C satisfying (modulo the coset equations of motion) $dC/dt = f(C) \approx 0$, where f(C) is a function vanishing on the constraint surface defined by C = 0, while 'strongly conserved' constraints satisfy $dC/dt = 0$ (upon use of the equations of motion).

In the standard formulation of canonical gravity and supergravity, the $\phi$-dependent structure 'constants' $f_{AB}{}^{C}(\phi)$ do not (Poisson)


commute with the Hamiltonian and thus vary in time. By contrast, the structure constants obtained with the Sugawara-like form of the constraints do commute with the Hamiltonian constraint, and are thus preserved in time, even though they still depend on the canonical variables φ. Because the correspondence between the space-time based field theory and the one-dimensional E 10 /K (E 10 ) model is only very incompletely understood, it is, however, not clear how to translate the coset model constraints back into more conventional field theory language. At the very least, one can say that the relation between the field variables of the geometric theory and the E 10 variables must be extremely non-local. Obtaining a universal algebraic description of the constraints and their algebra is also desirable from an M-theory point of view. In the same way that the unique dynamical geodesic equation on E 10 /K (E 10 ) allows for maps to different maximal supergravity theories, depending on the level decomposition chosen to describe the infinite-dimensional Lie algebra [11–14], the constraints should also exhibit this ‘versatility’. Our construction below has this property, albeit in a novel way. More precisely, we will define a ‘universal scaffold’ of hyperbolic Sugawara constraints by using null root vectors α of the hyperbolic algebra, decomposed into a sum of two real roots β1 + β2 = α, and the hyperbolic Weyl group. This will define an infinite number of constraints Lα associated with a ‘skeleton’ of roots α on the light-cone in terms of current bilinears. (The notions of skeleton and scaffold are depicted in Figs. 2 and 3 below.) Extending (away from the real βi case) the set of current-bilinear contributions Lα ∼ Jβ1 Jβ2 to a given null-root constraint (α 2 = 0), or extending the skeleton of supporting roots α constraints into the light-cone (α 2 < 0), however, seems to require the choice of a subalgebra of the hyperbolic algebra that is kept manifest. 
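The decomposition of a null root into two real roots, and its behaviour under the hyperbolic Weyl group, can be checked on a small example. The sketch below is an illustration only: the node labelling of the $E_{10}$ Dynkin diagram, the names `EDGES`, `DELTA`, `BETA1`, `BETA2` and the helper functions are assumptions introduced here, not notation from the paper. It uses the standard facts that $E_{10}$ has a Dynkin diagram with arms of length 1, 2 and 6 attached to a trivalent node, and that the primitive null root δ of the affine $E_9$ subalgebra has the $E_8$ marks as coefficients.

```python
# Hypothetical node labelling (an assumption of this sketch, not from the paper):
#   0 = trivalent node, 1 = arm of length 1, 2-3 = arm of length 2,
#   4-9 = arm of length 6; node 8 is the affine node, node 9 the hyperbolic one.
EDGES = [(0, 1), (0, 2), (2, 3), (0, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
RANK = 10


def cartan_matrix():
    """Symmetric Cartan matrix of the simply-laced diagram defined by EDGES."""
    a = [[2 if i == j else 0 for j in range(RANK)] for i in range(RANK)]
    for i, j in EDGES:
        a[i][j] = a[j][i] = -1
    return a


A = cartan_matrix()


def inner(x, y):
    """Invariant bilinear form on the root lattice, in the simple-root basis."""
    return sum(x[i] * A[i][j] * y[j] for i in range(RANK) for j in range(RANK))


def simple_root(i):
    return [1 if j == i else 0 for j in range(RANK)]


def reflect(x, i):
    """Fundamental Weyl reflection w_i(x) = x - <x, alpha_i> alpha_i (alpha_i^2 = 2)."""
    c = inner(x, simple_root(i))
    return [x[j] - (c if j == i else 0) for j in range(RANK)]


# Primitive null root of the affine subalgebra: E8 marks, hyperbolic coefficient 0.
DELTA = [6, 3, 4, 2, 5, 4, 3, 2, 1, 0]
BETA1 = simple_root(8)                          # affine simple root (real)
BETA2 = [d - b for d, b in zip(DELTA, BETA1)]   # highest root of E8 (real)
```

With these definitions `inner(DELTA, DELTA)` vanishes, both `BETA1` and `BETA2` have norm squared 2, and `reflect(DELTA, 9)` is again a null root, which is the mechanism by which the Weyl group generates further null roots from δ.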
In analogy with affine algebras, this procedure is very suggestive of a choice of ‘spectral parameters’ for the hyperbolic algebra, even though we do not know whether such a realization of the hyperbolic algebra exists. However, the picture that emerges from the present work is that if such realizations exist, they do so only in combination with suitable constraints. Furthermore, such realizations cannot be unique, giving the algebra a ‘chameleon-like’ aspect. This feature would be in line with the conjectured emergence of a space-time structure from the Lie algebra, where the dimension of the emergent space would depend on the decomposition and the chosen form of the constraints, such that the ‘spectral parameters’ would become associated to spatial coordinates.2 These points will be elaborated on and explained below by means of the constraints of D = 11 supergravity and of type IIB supergravity, respectively, but similar results are expected to hold for other decompositions, such as massive IIA theory, as well as for maximal supergravities in lower dimensions. Importantly, though the set of roots ‘supporting’ the constraints is clearly related to the weight diagram of particular highest-weight representations of E 10 , the constraints themselves do not form (under Poisson commutation) a highest or lowest weight representation of the hyperbolic E 10 , as already observed in [4], and explained in much more detail here. Rather, they indicate the existence of new unexplored algebraic structures inside the hyperbolic algebra and its enveloping algebra. We emphasize that our approach is canonical and crucially relies on a split of space and time, as well as certain gauge choices required for matching the supergravity and coset model degrees of freedom. 
2 However, this association is likely to be more subtle than just a simple equality, as can already be seen for the affine spectral parameter in D = 2 supergravities, cf. Eq. (2.1) of [15] with $\rho = t$ (time) and $\tilde{\rho} = x^1$ (space).

An earlier and conceptually different M-theory proposal


based on the indefinite, but non-hyperbolic, ‘very extended’ Kac–Moody algebra E 11 has been developed by Peter West and collaborators [16,17]. In contradistinction to the present work, their approach is ‘covariant’ in the sense that neither a split of space-time nor gauge choices for the supergravity fields are required, and the issue of writing down canonical constraints thus does not arise in the same way. Instead, one needs to introduce extra gauge invariances encompassing the gauge transformations of supergravity, and the problem becomes one of ‘fitting’ such gauge symmetries into the E 11 framework [18]. However, despite many similarities at the kinematical level, especially with regard to embedding the bosonic sectors of maximal supergravities [19–22], it appears doubtful whether a gauge-fixed version of that approach matches with the structures presented here. From the mathematical point of view, it would also be desirable to associate a Sugawara-type construction to a hyperbolic algebra. In the affine case, the existence of this construction is directly linked to the realization of affine algebras as loop algebras via the so-called spectral parameter. A similar description and understanding is lacking for hyperbolic algebras; the only known description is in terms of generators and relations in the Chevalley–Serre basis. Any construction hinting at an alternative description could shed light on the deeper and to date elusive structure of hyperbolic Kac–Moody algebras. After all, even not knowing about the current algebra realization of affine algebras, the existence of a preferred set of bilinear Virasoro operators in the enveloping algebra would almost inevitably lead to this realization. Here, we are searching for a similarly distinguished structure in the enveloping algebra of the hyperbolic algebra. The remainder of the paper is structured as follows. In Sect. 
2 we first review the affine Sugawara construction and rephrase it in a slightly unconventional form. We use this form to propose a (partly schematic) trial expression for Sugawara generators for hyperbolic algebras. In Sect. 3 we then explore this trial expression in more detail in the case of E 10 and show that our trial expression does not only serve to reproduce the D = 11 constraints but also those of type IIB supergravity. This also allows for a more precise definition of the Sugawara constraints and an exploration of their structure in terms of a skeleton of constraints associated with null roots and terms induced by covariantization. In appendices, we collect some known results on level decomposition in order to render the presentation self-contained, as well as some more detailed computations.

2. Sugawara Construction

Before proceeding to the discussion of the hyperbolic Sugawara construction we first briefly review the definition of Sugawara operators for affine Lie algebras, see [7] (as well as [5,6] for earlier work and [23,24] for generalizations of Sugawara's construction).

2.1. Affine Sugawara construction. A non-twisted affine Lie algebra can be defined for any finite-dimensional Lie algebra. Let the finite-dimensional Lie algebra $\mathfrak{g}$ be simple and generated by $T^A$ ($A = 1, \ldots, \dim \mathfrak{g}$) with commutation relations $[T^A, T^B] = f^{AB}{}_{C}\, T^C$ and non-degenerate invariant form $\langle T^A | T^B \rangle = \kappa^{AB}$. Then the corresponding affine Lie algebra $\hat{\mathfrak{g}}$ has generators $T^A_m$ (for $m \in \mathbb{Z}$), c and d with non-trivial commutation relations

\[
[T^A_m, T^B_n] = f^{AB}{}_{C}\, T^C_{m+n} + \kappa^{AB}\, m\, \delta_{m,-n}\, c, \qquad
[d, T^A_m] = -m\, T^A_m . \tag{2.1}
\]


The generator c commutes with all Lie algebra generators and is called the central element,3 while the generator d is called the derivation.4 In any irreducible highest weight representation, the central element c acts as a scalar; its eigenvalue k on that representation is called the level of the representation. For such a level k representation, the Sugawara generators are defined (within the enveloping algebra of the $T^A_m$'s) by [7] (for $n \in \mathbb{Z}$)

\[
L_n = \frac{1}{2(k + h^\vee)} \sum_{m \in \mathbb{Z}} : T^A_{n-m} T^B_m : \kappa_{AB} , \tag{2.2}
\]

where the colons denote normal ordering as appropriate for the highest weight representation and $\kappa_{AB}$ is the inverse of $\kappa^{AB}$; $h^\vee$ is the dual Coxeter number defined by $f^{AC}{}_{D}\, f^{BD}{}_{C} = 2 h^\vee \kappa^{AB}$. We note that there are two separate contributions to the normalization of the Sugawara generators (2.2): the first one is k, related to the central extension; the second one, $h^\vee$, comes from normal ordering. Both contributions are quantum effects. Below, we will treat these two contributions differently. In the hyperbolic extension, the central generator ceases to be central and is on par with all the other Lie algebra generators. Normal ordering, on the other hand, will be mostly ignored, as our discussion deals with the classical constraints only. Normal ordering ensures that the generators $L_m$ are well defined on any element of the representation. The operators (2.2) obey a Virasoro algebra

\[
[L_m, L_n] = (m - n)\, L_{m+n} + \frac{k \dim \mathfrak{g}}{12 (k + h^\vee)}\, m (m^2 - 1)\, \delta_{m,-n} . \tag{2.3}
\]

Their commutators with the affine generators are

\[
[L_m, T^A_n] = -n\, T^A_{m+n} . \tag{2.4}
\]
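The central term in (2.3) can be evaluated for concrete algebras. The following is a minimal sketch, assuming the standard Virasoro normalization in which the central term is written as $(c_{\rm Vir}/12)\, m(m^2-1)\, \delta_{m,-n}$, so that (2.3) corresponds to $c_{\rm Vir} = k \dim\mathfrak{g}/(k+h^\vee)$. The function names are hypothetical and introduced only for this illustration; the numerical inputs ($\dim = 3$, $h^\vee = 2$ for su(2); $\dim = 248$, $h^\vee = 30$ for $E_8$) are standard values.

```python
from fractions import Fraction


def sugawara_central_charge(level, dim_g, dual_coxeter):
    """Virasoro central charge k*dim(g)/(k + h^vee) read off from (2.3)."""
    if level + dual_coxeter == 0:
        raise ValueError("critical level: the normalization 1/(k + h^vee) is singular")
    return Fraction(level * dim_g, level + dual_coxeter)


def central_term(m, level, dim_g, dual_coxeter):
    """Coefficient of delta_{m,-n} in (2.3) for a given mode number m."""
    c_vir = sugawara_central_charge(level, dim_g, dual_coxeter)
    return c_vir * m * (m * m - 1) / 12


# Level-1 examples: su(2) and E8 (the horizontal algebra of affine E9).
c_su2 = sugawara_central_charge(1, 3, 2)
c_e8 = sugawara_central_charge(1, 248, 30)
```

The central term vanishes for m = −1, 0, 1, so the generators $L_{-1}, L_0, L_1$ always close without a central extension; the error raised at $k = -h^\vee$ reflects the fact that the normalization in (2.2) is singular there.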

Here, we would like to take a more formal point of view and rewrite (2.2) as a quadratic expression in the generators without resorting to an integrable representation. The reason is that the normalization in (2.2) involves the inverse of the (shifted) eigenvalue of the central generator c. However, in the full hyperbolic algebra the element c is no longer central (in fact, the hyperbolic algebra does not possess any central elements), and a direct generalization of (2.2) would thus necessarily involve the inverse of an operator, which furthermore is no longer singled out in the full algebra. For this reason, we formally multiply (2.2) by the central element and drop the normalization constant. We also recall that affine Lie algebras have two different kinds of roots: real roots and null roots. In particular, there is a primitive null root δ which can be used to describe all roots of the affine algebra via an affine ladder diagram: let $\Delta^{\rm fin} \equiv \Delta(\mathfrak{g})$ be the set of roots of the finite-dimensional algebra $\mathfrak{g}$ (where we include α = 0 for simplicity); then the root system of the affine extension $\hat{\mathfrak{g}}$ is

\[
\Delta^{\rm aff} \equiv \Delta(\hat{\mathfrak{g}}) = \big\{ \alpha + n\delta \; : \; \alpha \in \Delta^{\rm fin} \text{ and } n \in \mathbb{Z} \big\} , \tag{2.5}
\]

that is, there are $\mathbb{Z}$ copies of the finite root system. The roots nδ are null roots and the associated root space $\hat{\mathfrak{g}}_{n\delta}$ has dimension given by the rank: $\mathrm{mult}(n\delta) = \dim \hat{\mathfrak{g}}_{n\delta} = \mathrm{rank}(\mathfrak{g})$ for n ≠ 0. For n = 0 the dimension is equal to that of the Cartan subalgebra and takes the value $\mathrm{rank}(\mathfrak{g}) + 2$ (the two extra elements are c and d). All other roots are real and the corresponding root spaces are one-dimensional.

3 The central element of the affine Lie algebra, here denoted c, is often denoted K; it should not be confused with the central element of the Virasoro algebra associated to the affine algebra.

4 This terminology follows from the presentation of affine algebras as loop algebras where d is the derivative with respect to the spectral parameter [7].

Using the structure of the affine root system we can rewrite the commutation relations (2.1) as

\[
[T_{\alpha_1}, T_{\alpha_2}] = f_{\alpha_1 \alpha_2}{}^{\alpha_1 + \alpha_2}\, T_{\alpha_1 + \alpha_2} + \kappa_{\alpha_1, \alpha_2}\, c , \tag{2.6}
\]

where we have suppressed the multiplicity index for null roots. The values of $f_{\alpha_1 \alpha_2}{}^{\alpha_1 + \alpha_2}$ and $\kappa_{\alpha_1, \alpha_2}$ can be obtained by comparison with (2.1). We furthermore define quadratic generators in the enveloping algebra $U(\hat{\mathfrak{g}})$ by

\[
L_{n\delta} := \sum_{\beta \in \Delta^{\rm aff}} T_{n\delta - \beta}\, T_{\beta} , \tag{2.7}
\]

where Tβ is a canonically normalized element in the root space gˆ β . If the β root space is degenerate, we choose an orthonormal basis and contract with the canonically conjugate basis. Then the definition (2.7) is unambiguous except when the root spaces of nδ − β and β have different dimensions. This happens only when one of nδ − β or β is equal to zero, i.e., when one of the generators belongs to the Cartan subalgebra. In that case the generators are to be contracted according to the definition (2.2), i.e., we omit any terms involving a contraction with c or d, but contract only with elements of the Cartan subalgebra of the horizontal g. Except for this point and the lack of normal ordering, the expression (2.7) is a reformulation of (2.2). Note that although we could have defined quadratic generators of the form (2.7) for any point on the root lattice, we do this only for null roots. To get a Virasoro algebra it is furthermore essential that the space of null roots has an additive structure since all null roots lie on a Z-graded line. The affine Weyl group is the semi-direct product of the finite Weyl group with a translation group [25]. After the standard embedding of the affine algebra into a hyperbolic algebra of over-extended type [26], the affine Weyl group can also be described as the subgroup of the hyperbolic Weyl group stabilizing an affine null root [27]; the so-called affine translations are then realized as Lorentz boosts along this null direction.5 Since null roots nδ are stabilized by the affine Weyl group W aff , the l.h.s. of the definition (2.7) is invariant under the action of the Weyl group. One can check that the r.h.s. is also invariant. Besides the convention for null root spaces, the definition (2.7) differs from the standard one (2.2) by its lack of normal ordering. However, as is well known, this affects only the generator L 0 for affine algebras. 
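The structure of the affine root system and its multiplicities can be enumerated for a small case. The sketch below takes $\mathfrak{g} = A_2$ as an example and truncates to $|n| \le n_{\max}$; the function names and the representation of a root α + nδ as a pair `(alpha, n)` are choices made for this illustration only. The Cartan element slot (α = 0, n = 0) is deliberately left out, since it is not a root.

```python
# Example data for g = A2 in the simple-root basis (illustrative choice).
A2_CARTAN = [[2, -1], [-1, 2]]
A2_ROOTS = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]


def norm2(alpha, cartan):
    """Norm squared of a finite root; delta is null and orthogonal to the
    finite roots, so this is also the norm squared of alpha + n*delta."""
    r = len(alpha)
    return sum(alpha[i] * cartan[i][j] * alpha[j] for i in range(r) for j in range(r))


def affine_root_multiplicities(finite_roots, rank, n_max):
    """Map (alpha, n) -> multiplicity of the affine root alpha + n*delta, |n| <= n_max."""
    zero = tuple(0 for _ in range(rank))
    mult = {}
    for n in range(-n_max, n_max + 1):
        for alpha in finite_roots:
            mult[(alpha, n)] = 1        # real roots: one-dimensional root spaces
        if n != 0:
            mult[(zero, n)] = rank      # null roots n*delta: multiplicity rank(g)
    return mult


roots = affine_root_multiplicities(A2_ROOTS, rank=2, n_max=1)
```

For this truncation there are 3 × 6 real roots and 2 null roots; every real root has norm squared 2, while the null roots ±δ carry multiplicity 2 = rank($A_2$).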
In addition, normal ordering is only required for the quantum theory, whereas we are here mainly concerned with the structure of the classical constraints. In the classical theory, one associates to each symmetry generator $T_\alpha$ a corresponding conserved charge, say $J_\alpha$. Accordingly, we will below consider expressions such as (2.7) (with the replacement $T_\alpha \to J_\alpha$) as functions on phase space and leave open the quantum definition of the constraints. We also remark that the generator $L_0$ as defined in (2.2) differs from the Hamiltonian (quadratic Casimir) by a term proportional to cd. Omission of this term is admissible in the affine case, but not in the hyperbolic algebra. [In other words, our hyperbolic-algebra generalization of (2.2) will contain terms of the type cd, which do not enter the affine version of (2.2).] Correlatively, while the affine Hamiltonian is bounded below, i.e., $L_0 \ge 0$, the full Hamiltonian is not because the Cartan-Killing metric on the Cartan subalgebra is indefinite for hyperbolic algebras (with $\langle c|c\rangle = \langle d|d\rangle = 0$ and $\langle c|d\rangle = 1$).

Footnote 5: We also note that the Weyl orbit of the ‘cusp’ δ is dense on the boundary of the hyperbolic space obtained by projecting the interior of the forward lightcone onto the unit hyperboloid. Equivalently, the rays through all the hyperbolic null roots cover the boundary of the lightcone densely.

We now proceed to compute the algebra of the constraints as defined by (2.7). In the course of the following computations we manipulate infinite sums formally, well aware that they are not well-defined and normally would require a normal ordered evaluation on a representation space. With this in mind one computes in the universal enveloping algebra,

\[ [L_{m\delta}, T_\alpha] = -2\,\kappa_{\alpha,-\alpha}\, c\, T_{m\delta+\alpha}, \tag{2.8} \]

which is the same as (2.4), but now expressed in terms of affine roots. The important point we wish to emphasize here is that the r.h.s. is bilinear in affine generators since we multiplied the Sugawara generators by the central element. Continuing now to the commutator of two Sugawara generators (2.7) leads to

\[ [L_{m\delta}, L_{n\delta}] = 2(m-n)\, c\, L_{(m+n)\delta}, \tag{2.9} \]

so that in this formulation the algebra closes with a pre-factor (= c) that is itself an algebra generator. Due to the lack of normal ordering one does not obtain the central term as in (2.3). Neither is the shift by the dual Coxeter number visible in this formal computation in the enveloping algebra.

2.2. Hyperbolic Sugawara construction. The expression (2.7) can be formally generalized to hyperbolic Lie algebras of the over-extended type [26] (see footnote 6). In the hyperbolic case the root system $\Delta^{\rm hyp}$ is much more complicated than (2.5): besides the real and null roots there are now time-like (purely imaginary) roots α with $\alpha^2 < 0$. The multiplicities of these roots grow exponentially, and no closed formula for them is known, although they can be computed algorithmically, for example via the Peterson recursion formula. For each root space $\mathfrak g_\alpha \subset \mathfrak g$, we choose a basis

\[ T_\alpha^{(s)} \quad \text{for } s = 1, \ldots, \mathrm{mult}(\alpha), \tag{2.10} \]

which is ‘null orthonormal’ (when using the standard bilinear form) with respect to the corresponding dual basis in the $\mathfrak g_{-\alpha}$ root space:

\[ \langle T_\alpha^{(s)} | T_\beta^{(s')} \rangle = \delta^{s,s'}\, \delta_{\alpha+\beta,0}. \tag{2.11} \]

The commutation relations are then

\[ [T_{\alpha_1}^{(s_1)}, T_{\alpha_2}^{(s_2)}] = f^{(s_1)(s_2)}_{\alpha_1\alpha_2}{}^{\alpha_1+\alpha_2}_{(s_{12})}\, T^{(s_{12})}_{\alpha_1+\alpha_2}. \tag{2.12} \]

Our hyperbolic generalization of the affine Sugawara construction (2.7) then consists of two elements: (i) the choice of a special set of ‘constraint’ generators, labelled by a subset, say C, of the set of pairs $(\alpha, \bar s)$ labelling the roots (including their degeneracy); and (ii) a general expression for the hyperbolic Sugawara generator $L_{\alpha,\bar s}$ (or ‘generalized Virasoro constraint’) associated to a particular pair $(\alpha, \bar s) \in C$ (see footnote 7) of the form

\[ L_{\alpha,\bar s} = \sum_{\substack{\beta_1,\beta_2\in\Delta^{\rm hyp}\\ \beta_1+\beta_2=\alpha}}\ \sum_{s_1,s_2} M_{s_1,s_2}(\beta_1,\beta_2)\, T_{\beta_1}^{(s_1)}\, T_{\beta_2}^{(s_2)}. \tag{2.13} \]

Footnote 6: By ‘over-extension’ we mean the canonical extension via the non-twisted affine extension, whereby two nodes are added to the Dynkin diagram; adding a third node would yield ‘very-extended’ algebras [28].

Footnote 7: Note that while α runs over a subset of Δ, $\bar s$ correspondingly runs over a subset of the full degeneracy of the root α ∈ Δ.

Here $M_{s_1,s_2}(\beta_1,\beta_2)$ denote some numerical coefficients that we expect to be simply ±1 or 0 (or possibly other rational numbers) for an appropriate choice of the dual bases $T_{\pm\alpha}^{(s)}$ in the ±α root spaces. We do not yet have a full understanding of the precise set C of ‘constraint’ generators (see footnote 8), nor of the numerical coefficients $M_{s_1,s_2}(\beta_1,\beta_2)$ entering the definition of our generalized Virasoro constraints $L_{\alpha,\bar s}$. We will argue that a distinguished role is played by the ‘null subset’ of C, i.e., by the case where α is a null root. In that case, the corresponding constraint degeneracy index takes only one value (while the degeneracy of a null root within the hyperbolic algebra is equal to the rank). Moreover, still in the case where α is a null root, we will be able to verify that the coefficients $M_{s_1,s_2}(\beta_1,\beta_2)$ in (2.13) are indeed simply equal to ±1 when both $\beta_1$ and $\beta_2$ (such that $\alpha = \beta_1 + \beta_2$) are real roots. In the following, we shall refer to the better understood ‘null’ subset of C as being the skeleton of C; and we shall refer to the better understood set of special configurations $(\alpha, \beta_1, \beta_2)$, with α null, $\beta_1$ and $\beta_2$ real, and $\alpha = \beta_1 + \beta_2$, as being the universal scaffold at the basis of our construction. As the name ‘skeleton’ suggests, there are more constraints than those associated to null roots. Below, we shall give explicit examples of (‘fleshy’) constraints associated with strictly imaginary roots, $\alpha^2 < 0$. However, constraints associated to null roots play a distinguished role in our construction. The special role of light-like α is already suggested by the affine Sugawara construction (2.7) where constraints were only defined for null roots. In addition, the special configurations where both $\beta_1$ and $\beta_2$ are real introduce a significant simplification in our construction. Indeed, in that case the root spaces associated to $\beta_1$ and $\beta_2$ are one-dimensional, so that there exists a unique (up to sign) contraction between the associated step operators.
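As a small worked check of how restrictive the scaffold configurations are, the following sketch assumes a simply laced algebra whose real roots are normalized to $\beta^2 = 2$ (an assumption that holds for $E_{10}$). For a null root α decomposed into two real roots, $\alpha = \beta_1 + \beta_2$:

```latex
\begin{align}
0 = \alpha^2 &= \beta_1^2 + \beta_2^2 + 2\,\beta_1\cdot\beta_2
  = 4 + 2\,\beta_1\cdot\beta_2
  \quad\Longrightarrow\quad \beta_1\cdot\beta_2 = -2 , \\
\alpha\cdot\beta_1 &= \beta_1^2 + \beta_1\cdot\beta_2 = 2 - 2 = 0 .
\end{align}
```

Under this assumption both real roots are orthogonal to α. In the affine case this reproduces the familiar fact that δ is orthogonal to every real root β, with δ − β again a real root.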
By contrast, when not both $\beta_1$ and $\beta_2$ are real, the root spaces that are paired are multidimensional, and moreover not necessarily of equal dimension. This leaves open many possibilities for ‘contracting’ $T_{\beta_1}^{(s_1)}$ with $T_{\beta_2}^{(s_2)}$ in forming $L_{\alpha,\bar s}$. The information on how to contract the elements of different root spaces is then encoded in the choice of the coefficients $M_{s_1,s_2}(\beta_1,\beta_2)$. Let us note, however, that, given a certain pair $(\alpha,\bar s) \in C$, i.e., given a certain Lie algebra generator $T_\alpha^{(\bar s)}$, there exists (when $\alpha = \beta_1 + \beta_2$) a distinguished way of contracting (a part of) the $\beta_1$ root space $\mathfrak g_{\beta_1}$ with the $\beta_2$ one $\mathfrak g_{\beta_2}$. Indeed, if we denote $\beta_1 = \alpha - \beta$, so that $\beta_2 = +\beta$, the adjoint action of $T_\alpha^{(\bar s)}$, $\mathrm{ad}_{T_\alpha^{(\bar s)}}\, x \equiv [T_\alpha^{(\bar s)}, x]$, maps $\mathfrak g_{-\beta}$ onto (a part of) $\mathfrak g_{\beta_1} = \mathfrak g_{\alpha-\beta}$. We can then use the natural ‘dual’ pairing between $\mathfrak g_{-\beta}$ and $\mathfrak g_{+\beta}$ (i.e., between $\mathfrak g_{-\beta_2}$ and $\mathfrak g_{+\beta_2}$) to write putative constraints of the form (see footnote 9)

\[ L_{\alpha,\bar s} = \sum_{\beta\in\Delta^{\rm hyp}} \sum_{s} N(\alpha,\beta)\, [T_\alpha^{(\bar s)}, T_{-\beta}^{(s)}]\, T_\beta^{(s)}. \tag{2.14} \]

Here the coefficients $N(\alpha,\beta)$ no longer depend on the degeneracy index s within the dual spaces $\mathfrak g_{\pm\beta}$, and the sum over s is easily seen to be independent of the choice of (dual) bases $T_{\pm\beta}^{(s)}$ (as long as the orthonormalization condition (2.11) is satisfied). We leave to future work further study of the usefulness of the special construction (2.14). One advantage of expressing the constraints as in (2.13) is that, contrary to the expressions derived in [4] (which were formulated in terms of the GL(10) level decomposition of $E_{10}$), such a definition a priori appears not to be tied to any particular level decomposition of the hyperbolic algebra. Therefore, this opens up the possibility of writing a ‘universal’ set of coset constraints, whose further (particular) level decompositions could give rise to the apparently different canonical constraints arising in different maximal supergravities (mIIA, IIB, . . .). However, we shall give evidence below that this hope of a universal constraint construction is not fulfilled in this simple way. Rather, we will encounter a more refined construction, where only the scaffold is universal. The reason appears to lie in the existence of various ways of contracting (multi-dimensional) root spaces, i.e., in the possibility of various consistent choices for the coefficients $M_{s_1,s_2}(\beta_1,\beta_2)$. Each particular level decomposition might be tied to a particular corresponding choice for these coefficients. Even if this turns out to be the case, it seems that our construction still involves a universal part, namely the part of (2.13) involving the skeleton of ‘null’ constraints, and its associated scaffold of special configurations where a null root α is decomposed into two real roots $\beta_1$ and $\beta_2$. As we shall emphasize below, this universal part is invariant under the Weyl group of the hyperbolic algebra and already yields an infinite number of constraints (associated to the intersection of the light-cone with the root lattice). This ‘universal part’ is, however, not invariant under the hyperbolic algebra itself.

Footnote 8: The letter C is used here to evoke both the word ‘constraint’, and the fact that the set C appears to have the structure of a convex cone.

Footnote 9: To see that expression (2.14) is indeed well-defined, one can invoke the invariance of the bilinear form, see Lemma 2.4 in [25].
As we shall see below, one can associate to each choice of a finite-dimensional subalgebra (used as a way of ‘slicing’ the hyperbolic algebra by means of a corresponding level decomposition) a way of generating additional constraints by covariantizing under that subalgebra. Each such covariantization procedure allows one to ‘flesh out’ the skeleton by adding new constraints inside the light cone and also terms with $\beta_1$ and $\beta_2$ not both real. The prescription will be made more precise in Sect. 3 when we discuss the example of $E_{10}$. A further general issue regarding (2.13) is the operator ordering. Below we will work with similar expressions involving functions on classical phase space which are commuting. [Note that they commute as functions, but do not ‘Poisson commute’.] For those the issue of ordering becomes relevant only after the transition to the quantum theory, which we will not consider here. Finally, as written, (2.13) is meant to define only one constraint per root even though null roots have multiplicity greater than one. The structure of null roots in hyperbolic over-extended algebras is known to be given by Weyl orbits through

\[ \Delta^{\rm null} = \bigcup_{n\in\mathbb{Z}\setminus\{0\}} W\cdot(n\,\delta), \tag{2.15} \]

where W is the hyperbolic Weyl group and δ the primitive null root of the affine algebra embedded in the hyperbolic extension. Restricting the construction (3.5) to affine generators reduces all the Weyl orbits to points since δ is invariant under the affine Weyl group. Hence the construction gives constraints only for the roots α = n δ in agreement with the affine Sugawara construction (2.7). At this point, we stress a possible qualitative difference between the usual affine Sugawara construction (2.7) and the corresponding hyperbolic construction (2.13) at the present stage of our understanding of the construction. The affine Virasoro constraints L n δ form a two-sided tower, where n runs over the set of integers Z, while it seems consistent that the hyperbolic constraints Lα,¯s run over a set C which is a one-sided


convex cone, contained within the past light-cone of the Lorentzian root lattice. This one-sided structure of the constraints was clearly apparent in [4], where only constraints Lα corresponding to negative imaginary α were found, as will be shown in Sect. 3 below.10 This asymmetry between the two-sidedness of the usual affine (Virasoro) constraints, and the one-sidedness of the hyperbolic ones, seems to be deeply rooted in the different physics (and mathematics) associated to the origin of these constraints. In the usual affine case, the origin of the constraints is a gauge invariance under reparametrizations of (two) periodic (world-sheet light-cone) variables σ± = τ ± σ . The periodic nature of these variables, and the real (or hermitian) character of the worldsheet embedding functions, e.g. ∂± X μ (τ, σ ), implies the existence of two-sided Fourier expansions involving, for each choice of sign in σ± the two complex-conjugated basis functions exp(+inσ± ) and exp(−inσ± ). By contrast, the hyperbolic coset models should describe the gravitational physics taking place near a spacelike singularity, i.e., in a time-asymmetric situation of the type t → 0+ , say. Moreover, the hyperbolic coset model is itself parametrized asymmetrically in terms of positive roots only. The analysis of the dynamics of supergravity in [3] found evidence for relating the supergravity fields to one-sided towers of coset variables. This tower consists of the so-called ‘gradient generators’ that are conjectured to correspond to multiple spatial gradients, roughly in terms of a spatial Taylor expansion. It is then natural to conjecture that the usual space-dependent supergravity constraints will also give rise to one-sided-only towers of ‘gradient cousins’ of the (already one-sided) low-level constraints discussed in [4]. 
Another (related) argument for expecting that the tower of coset constraints be one-sided only, is the idea proposed in [4] that the set of constraints be just large enough to reduce the exponentially infinite number of variables entering the hyperbolic coset models to a much smaller number of degrees of freedom involving only a rather small vicinity of the future light-cone in root space (i.e., essentially the gradient generators, plus a relatively manageable set of extra M-theoretic degrees of freedom). To achieve such a strong reduction in the number of degrees of freedom, without killing them all, it is natural to have a set of constraints C which fills, like the coset variables, a one-sided cone and whose degeneracies do not grow faster than the ones of the roots. Note, however, that our intuitive argument cannot exclude the possibility that the constraints fill a double-sided cone, if the degeneracies of the constraints are such that the sum of the positive-sided and negative-sided ones does not grow faster than the positive-root degeneracies. Whatever is the ultimate definition of the physically correct set of coset constraints, $L_{\alpha,\bar s}$, one would expect it to satisfy some commutation relations (of the general type [L, L] = O(L)) reflecting some aspects of the (currently unknown) underlying gauge symmetry of the hyperbolic models, in the same way that the Virasoro algebra (2.9) is a gauge-fixed remnant of the worldsheet diffeomorphism symmetry of the underlying (Nambu-Goto-type) string action. Given the trial expression (2.13) one can wonder what algebra these expressions satisfy, i.e., whether there is a generalization of the Virasoro algebra (2.9) associated with our construction. While a conclusive answer to this question would require a knowledge of the $E_{10}$ algebra which is presently not available, we can at least formulate the following expectation. Under the Poisson (or Dirac) bracket

10 There was a further one-sidedness in [4] related to the fact that we were working in a truncated coset whence only a Borel subalgebra of the hyperbolic algebra played a role. This effect is an artefact of the truncation and irrelevant to the present construction.


the grading of the algebra implies that the simplest type of commutation relation one might have is of the form

\[ \{L_\alpha, L_\beta\} = \sum_\gamma J_{\alpha+\beta-\gamma}\, L_\gamma. \tag{2.16} \]
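For orientation, the affine relation (2.9) can be read as the simplest instance of the pattern (2.16). The following sketch is formal and rests on two assumptions: the enveloping-algebra commutator of (2.9) is replaced by the corresponding bracket, and the central element c takes a fixed non-zero value so that one may divide by it. The rescaled generators $\ell_m$ are introduced here only for illustration.

```latex
\begin{align}
\alpha = m\delta,\ \beta = n\delta,\ \gamma = (m+n)\delta:\qquad
\{L_{m\delta}, L_{n\delta}\} &= \underbrace{2(m-n)\,c}_{\alpha+\beta-\gamma\,=\,0}\;
  L_{(m+n)\delta} , \\
\ell_m := \frac{L_{m\delta}}{2c}:\qquad
\{\ell_m, \ell_n\} &= \frac{2(m-n)\,c}{4c^2}\, L_{(m+n)\delta}
  = (m-n)\,\ell_{m+n} .
\end{align}
```

The second line is the centreless Virasoro (Witt) algebra, in line with the remark that no central term arises in the formal computation leading to (2.9). Only a single γ contributes because the affine constraints live on the one-dimensional string of null roots nδ.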

As we shall discuss in the next section, relations of the type (2.16) do hold if we consider only the (truncated, low-level) constraints of [4]. However, the vast generalization of the definition of the constraints introduced in the present paper makes the validity of a result of the type (2.16) highly non-trivial and dependent upon delicate structures that we do not currently understand in detail. Indeed, there are two non-trivial assertions contained in the expected result (2.16). The first one is that the trilinear (see footnote 11) expression in current components on the r.h.s. organizes itself into products between constraints and certain current components, much in the same way as for the affine Virasoro algebra (cf. (2.9) where the r.h.s. is a product of a constraint $L_{n\delta}$ by a (conserved) algebra generator c). The second claim relates to the roots γ contributing on the r.h.s. and the question whether these only cover constraints that had been defined previously. Both points are important for ascertaining the closure of the constraint algebra. The fact that only strongly conserved coefficients appear in the algebra of constraints is important for the discussion of open algebras, as mentioned in the Introduction. We note one point concerning (2.16) in comparison to the affine Virasoro algebra (2.9). There it was important that an additive structure existed on the set of all roots for which generators $L_{m\delta}$ were defined. Here, we expect that this additive structure will be replaced by a certain convexity-related structure of the cone C, akin to the structure of integrable highest-weight representations [25]. Though we do not yet fully comprehend this structure, we shall see below that our proposed ‘fleshing out’ of the skeleton ensures (when starting from a past-light-cone-only skeleton) the convex structure of a solid cone, i.e., all α’s generated by our construction lie on or inside the light-cone.

Footnote 11: The hyperbolic Lie algebra structure {J, J} = J guarantees that the commutator of two J-bilinear constraints L is only trilinear in the J’s.

3. Universality and Relation to Supergravity

In this section we specialize to the case of $E_{10}$ whose Dynkin diagram is given in Fig. 1. The relation to supergravity will help to make the construction of the preceding section more concrete. An important role will be seen to be played by the relation between D = 11 supergravity (or type IIA in D = 10), and type IIB in D = 10.

3.1. Consistency with supergravity constraints: D = 11. The Sugawara constraints (2.13) can be interpreted as constraints to be imposed on geodesics on the infinite-dimensional coset space $E_{10}/K(E_{10})$ as follows [4]. The global $E_{10}$ symmetry gives rise to conserved Noether charges $J \in \mathrm{Lie}(E_{10})$ that can be expanded in the orthonormal basis $\{T_\alpha^{(s)} \mid \alpha \in \Delta^{\rm hyp},\ s = 1, \ldots, \mathrm{mult}\,\alpha\}$ as

\[ J = \sum_{\alpha\in\Delta^{\rm hyp}}\ \sum_{s=1}^{\mathrm{mult}\,\alpha} J_\alpha^{(s)}\, T_\alpha^{(s)}. \tag{3.1} \]


Fig. 1. Dynkin diagram of E 10 with numbering of nodes

The pairing between charges and generators is as in [4]:

\[ J = \cdots + \frac{1}{3!}\, J^{(-1)\,m_1 m_2 m_3} F_{m_1 m_2 m_3} + J^{(0)\,m}{}_{n}\, K^{n}{}_{m} + \frac{1}{3!}\, J^{(1)}{}_{m_1 m_2 m_3} E^{m_1 m_2 m_3} + \cdots, \tag{3.2} \]

where we have for definiteness chosen the gl(10) level decomposition of $E_{10}$ that is reviewed in Appendix A.1. An important point to note here is that tensor generators and coefficients transform contragrediently. For instance, for the Chevalley-Serre generators this translates into the following identification:

\[ T_{\alpha_1} = K^{1}{}_{2}\ [\sim e_1], \qquad J_{\alpha_1} = J^{2}{}_{1}\ [\sim f_1 = -\omega(e_1)], \tag{3.3} \]

and so on, where ω is the Chevalley involution on $E_{10}$. With this identification of algebra generators and current components we can work either in the universal enveloping algebra, generated by the $T_\alpha$, or in the Poisson algebra, generated by the current components $J_\alpha$. Namely, when considered as elements of a Poisson algebra on phase space, the components $J_\alpha^{(s)}$ close into the same hyperbolic algebra under Poisson commutation, as follows directly from the Hamiltonian formulation of the coset space dynamics. That is, we have the canonical brackets

\[ \{J_{\alpha_1}^{(s_1)}, J_{\alpha_2}^{(s_2)}\} = f^{(s_1)(s_2)}_{\alpha_1\alpha_2}{}^{\alpha_1+\alpha_2}_{(s_{12})}\, J^{(s_{12})}_{\alpha_1+\alpha_2}, \tag{3.4} \]

identical (including the sign) to the commutation relations of the hyperbolic algebra (2.12). The classically conserved charges of the $E_{10}/K(E_{10})$ model are commuting functions on phase space in terms of which we write the classical constraints as

\[ L_\alpha = \sum_{\beta\in\Delta^{\rm hyp}}\ \sum_{s,s'} M_{s,s'}(\alpha,\beta)\, J_{\alpha-\beta}^{(s)}\, J_\beta^{(s')}, \tag{3.5} \]

without specifying the summation over the ‘internal’ degrees of freedom at this point (that is, the matrix $M_{s,s'}(\alpha,\beta)$). The Hamiltonian (scalar) constraint entering the coset model of [3] can be represented as the special member of the hierarchy of constraints (3.5) corresponding to α = 0,

\[ L_0 \equiv H = \sum_{\beta\ge 0}\ \sum_{s=1}^{\mathrm{mult}\,\beta} J_{-\beta}^{(s)}\, J_\beta^{(s)}. \tag{3.6} \]

In this way one confirms that all Noether charges $J_\alpha^{(s)}$ are indeed classically conserved because they Poisson commute with H:

\[ \{H, J_\alpha^{(s)}\} = 0. \tag{3.7} \]

This is a direct consequence of the fact that H is just the quadratic Casimir operator for the hyperbolic algebra (see Chap. 2 of [25] for a proof and the explicit computation). We note also that for the Hamiltonian constraint (3.6) the issues of contracting generators from root spaces of different dimensions are absent since the root spaces of α and −α always have the same dimension. Since all components of J are conserved, any expression of the type (3.5) is strictly conserved for any geodesic. We can therefore consistently constrain the geodesic motion on the coset space by demanding that the initial conditions satisfy Lα = 0. In [4] we have shown (with the same truncation of higher order spatial gradients as in [3]) that the canonical constraints of D = 11 supergravity can be successively rewritten in two different (but related) forms. Our analysis used an A9 = sl(10) level decomposition of the E 10 algebra, corresponding to the removal of node 10 in Fig. 1. The results of this level decomposition of [3,29] are reproduced in Appendix A.1. The explicit computation involved the determination of various numerical coefficients in the E 10 expressions that were originally fixed by requiring weak conservation of the constraint surface under the coset model equations of motion. Comparison with the canonical D = 11 supergravity constraints and use of the dictionary then showed precise agreement of these numerical coefficients, thus extending the correspondence between the E 10 /K (E 10 ) coset model and the (truncated) D = 11 supergravity equations of motion to the full canonical formulation. In Sect. 3.1.2, we shall show that, remarkably, these specific numerical coefficients found for the supergravity constraints in [4] coincide with our proposed sum over canonically normalized current components (3.5) when both β and α − β are real and for unit coefficients Ms,s (α − β, β). 
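The statement that any expression of the type (3.5) is conserved along a geodesic can be spelled out in one line. The sketch below assumes that time evolution is generated by the Poisson bracket with H (sign and lapse conventions play no role, since the result is zero) and that the coefficients $M_{s,s'}(\alpha,\beta)$ are constants on phase space:

```latex
\begin{align}
\frac{d}{dt}\, L_\alpha = \{L_\alpha, H\}
 = \sum_{\beta\in\Delta^{\rm hyp}}\ \sum_{s,s'} M_{s,s'}(\alpha,\beta)
   \Big( \{J_{\alpha-\beta}^{(s)}, H\}\, J_\beta^{(s')}
       + J_{\alpha-\beta}^{(s)}\, \{J_\beta^{(s')}, H\} \Big) = 0 ,
\end{align}
```

where the Leibniz rule for the Poisson bracket is used in the middle step and each bracket vanishes by (3.7). This holds term by term, so it does not depend on how the internal contraction $M_{s,s'}$ is eventually chosen.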
In addition to this unearthing of a hidden simplicity in the definition of the constraints, another advantage of writing the constraints in the form (3.5) is that this will allow us to evaluate them also for other level decompositions, and in this way to verify agreement with the canonical constraints of massive IIA and IIB supergravity as well. The agreement of the dynamical (evolution) equations of these theories with the coset model equations in appropriate truncations had already been established in [11,13,14]. Moreover, the form (3.5) is directly amenable to an affine reduction, and brings out more clearly the analogy with the affine Sugawara construction.

3.1.1. On the roots associated to the supergravity constraints. Let us first turn to the detailed consideration of the set of roots, including their multiplicities, that are associated to supergravity constraints. In the case of D = 11 supergravity, these constraints are, respectively, the diffeomorphism and Gauss constraints, and the Bianchi identities for the 4-form field strength and the Riemann tensor (see footnote 12).

Footnote 12: In a more conventional canonical analysis, one would not interpret the Bianchi identities as proper constraints, as they are not directly associated to gauge transformations, unlike the diffeomorphism and Gauss constraints. In the present setting, however, they would correspond to generators of gauge transformations on the dual fields, i.e., on the 7-form field and the ‘dual graviton’.

The analysis of [4] was based on a gl(10) level decomposition truncated at level ℓ = 3, such that, when expressed in terms of the conserved $E_{10}$ Noether current in this decomposition, the constraints take the form

\begin{align}
L^{(-3)\,n_1\ldots n_9} &= 28\, J^{(-1)\,[n_1 n_2 n_3} J^{(-2)\,n_4\ldots n_9]} + 3\, J^{(-3)\,p|[n_1\ldots n_8} J^{(0)\,n_9]}{}_{p}, \tag{3.8a} \\
L^{(-4)\,m_1\ldots m_{10}||n_1 n_2} &= \frac{21}{10}\, J^{(-2)\,n_1[m_1\ldots m_5} J^{(-2)\,m_6\ldots m_{10}]n_2} + \frac{3}{2}\, J^{(-3)\,n_2|[m_1\ldots m_8} J^{(-1)\,m_9 m_{10}]n_1} - (n_1 \leftrightarrow n_2), \tag{3.8b}
\end{align}

for the diffeomorphism and Gauss constraints and

\begin{align}
L^{(-5)\,m_1\ldots m_{10}||n_1\ldots n_5} &= 3\, J^{(-2)\,m_1 m_2 [n_1\ldots n_4} J^{(-3)\,n_5]|m_3\ldots m_{10}}, \tag{3.8c} \\
L^{(-6)\,m_1\ldots m_{10}||n_0|n_1\ldots n_7} &= 9\, J^{(-3)\,n_0|m_1\ldots m_8} J^{(-3)\,m_9|m_{10} n_1\ldots n_7}, \tag{3.8d}
\end{align}

for the Bianchi identities. Here, we have changed the normalization of the charge $J^{(-3)}$ compared to [4,12] so that all highest weight states are uniformly normalized to unity (the usefulness of this re-definition was already pointed out in footnote 19 of [4]). Explicitly, the normalizations of the $E_{10}$ generators, in their $A_9$ decomposition are

\begin{align}
\langle J^{(0)\,a}{}_{b} \,|\, J^{(0)\,c}{}_{d} \rangle &= \delta^a_d\, \delta^c_b - \delta^a_b\, \delta^c_d, \nonumber\\
\langle J^{(-1)\,a_1 a_2 a_3} \,|\, J^{(1)}{}_{b_1 b_2 b_3} \rangle &= 3!\, \delta^{a_1 a_2 a_3}_{b_1 b_2 b_3}, \nonumber\\
\langle J^{(-2)\,a_1\ldots a_6} \,|\, J^{(2)}{}_{b_1\ldots b_6} \rangle &= 6!\, \delta^{a_1\ldots a_6}_{b_1\ldots b_6}. \tag{3.9}
\end{align}

By contrast, for the mixed symmetry field on level |ℓ| = 3 we shall take here a normalization that differs from the one given in Eq. (2.30) of [12] by a factor 1/9, viz.

\[ \langle J^{(-3)\,a_0|a_1\ldots a_8} \,|\, J^{(3)}{}_{b_0|b_1\ldots b_8} \rangle = \frac{8\cdot 8!}{9} \left( \delta^{a_0}_{b_0}\, \delta^{a_1\ldots a_8}_{b_1\ldots b_8} - \delta^{a_0}_{[b_1}\, \delta^{a_1\ldots a_7 a_8}_{b_2\ldots b_8] b_0} \right). \tag{3.10} \]

This normalization is chosen so that operators associated to real roots (two indices identical) have unit norm, like the highest weight

\[ \langle J^{(-3)\,10|3\,4\,5\,6\,7\,8\,9\,10} \,|\, J^{(3)}{}_{10|3\,4\,5\,6\,7\,8\,9\,10} \rangle = 1, \tag{3.11} \]

whereas for operators associated to null roots (all indices different)

\[ \langle J^{(-3)\,2|3\,4\,5\,6\,7\,8\,9\,10} \,|\, J^{(3)}{}_{2|3\,4\,5\,6\,7\,8\,9\,10} \rangle = \frac{8}{9}. \tag{3.12} \]
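The two values (3.11) and (3.12) can be recovered from (3.10) by direct evaluation. The sketch below assumes that the multi-index delta is antisymmetrized with weight one, so that it equals 1/8! on a fixed ordering of eight distinct matching indices, and it reads (3.10) with the relative minus sign between the two terms.

```latex
\begin{align}
\text{null root } (a_0 = 2 \notin \{3,\ldots,10\}):&\qquad
\frac{8\cdot 8!}{9}\left( \frac{1}{8!} - 0 \right) = \frac{8}{9} , \\
\text{real root } (a_0 = 10 \in \{3,\ldots,10\}):&\qquad
\frac{8\cdot 8!}{9}\left( \frac{1}{8!} + \frac{1}{8}\cdot\frac{1}{8!} \right)
 = \frac{8}{9}\cdot\frac{9}{8} = 1 .
\end{align}
```

In the null case the second term of (3.10) vanishes because $a_0$ cannot match any of the antisymmetrized indices $b_1, \ldots, b_8$. In the real case one eighth of the antisymmetrization places the repeated index in the first delta, and moving it past the remaining seven indices supplies a sign that cancels the explicit minus sign.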

In addition to these normalizations, we have used in (3.8) the same implicit antisymmetrization conventions as in [4]. For instance, the expression in (3.8c), corresponding to a Bianchi constraint on the four-form field strength, is understood to be antisymmetrized (with weight one) over $m_1 \ldots m_{10}$; furthermore the last relation (3.8d) is to be projected onto a (7, 1) hook for the indices $n_1 \ldots n_7$ and $n_0$. We note that for the constraints listed in (3.8) there are no ordering ambiguities in a possible transition to operator expressions in a quantum theory, except for $L^{(-6)}$ in (3.8d), since all commutator terms vanish by Jacobi or Serre relations; for instance

\[ [J^{(-1)\,[m_1 m_2 m_3}, J^{(-2)\,m_4\ldots m_9]}] \propto J^{(-3)\,[m_1|m_2\ldots m_9]} = 0. \tag{3.13} \]

Let us now exhibit the roots underlying the diffeomorphism constraint (3.8a). For this, we first consider its highest component, corresponding to the indices 2 3 4 5 6 7 8 9 10.

To identify the root α to which it belongs we must find the eigenvalues under the ten Cartan generators of $E_{10}$. (Indeed, the ‘covariant’ components, $\alpha_i \equiv \alpha(h_i)$ of a root precisely encode the eigenvalues in $[h_i, e_\alpha] = \alpha(h_i)\, e_\alpha$.) Since we are working with the current components J we display the Cartan elements in this description. In the gl(10) basis the Cartan elements are

\[ h_i = J^{i}{}_{i} - J^{i+1}{}_{i+1} \quad (i = 1, \ldots, 9), \qquad h_{10} = -\frac{1}{3}\left( J^{1}{}_{1} + \cdots + J^{7}{}_{7} \right) + \frac{2}{3}\left( J^{8}{}_{8} + J^{9}{}_{9} + J^{10}{}_{10} \right). \tag{3.14} \]

Alternatively, one can do the calculation with Lie algebra elements, using the more familiar expressions of the Cartan generators $h_i$ in terms of Lie algebra generators recalled in Appendix A.1. (In that case, one notes that the constraint $L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10}$ is associated with the contragredient Lie-algebra basis element $F_{2\,3\,4\,5\,6\,7\,8\,9\,10}$.) An easy calculation shows that the only non-zero eigenvalue corresponds to $h_1$ (first node in Fig. 1), and is equal to +1. Hence, the list of ‘covariant’ components $\alpha_i \equiv \alpha(h_i)$, also known as ‘Dynkin labels’, is [+1, 0, 0, 0, 0, 0, 0, 0, 0, 0]. This is equivalent to saying that the root associated to the highest component of the diffeomorphism constraint is equal to the fundamental weight $\Lambda_1$ associated to the simple root $\alpha_1$ (see footnote 13). To explicitly write the root $\alpha = \Lambda_1$ associated to the highest diffeomorphism constraint in terms of the simple roots, we must convert its Dynkin labels to root labels, i.e., pass from covariant indices to contravariant ones by using the inverse of the Cartan matrix $A_{ij} = \langle h_i | h_j \rangle$. This leads to the corresponding root $\alpha = -(\alpha_2 + 2\alpha_3 + 3\alpha_4 + 4\alpha_5 + 5\alpha_6 + 6\alpha_7 + 4\alpha_8 + 2\alpha_9 + 3\alpha_{10}) \equiv -\delta$, where the (positive) root δ denotes the primitive null root of $E_9 \subset E_{10}$. In particular, this shows that the root $\alpha = \Lambda_1$ associated to the highest component of the diffeomorphism constraint is a negative null root (see footnote 14). We can therefore write for this particular component

\[ L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \equiv L_\alpha \quad \text{with } \alpha = \Lambda_1 = -\delta \equiv -\delta^{(3)}. \tag{3.15} \]
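The passage from index content to Dynkin labels to root labels described above can be checked mechanically. The following is a minimal Python sketch under stated assumptions, not part of the original analysis. First, the Cartan matrix is built from a Dynkin diagram in which nodes 1 to 9 form a chain and node 10 is attached to node 7; this is an assumption, chosen because it is consistent with the expression for δ quoted above (Fig. 1 itself is not reproduced here). Second, each gl(10) index a of the component is taken to contribute −1 to the eigenvalue of $J^a{}_a$, as for the lower indices of the associated element F. The helper names `cartan_matrix`, `solve` and `dynkin_labels` are hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: E10 diagram with nodes 1..9 in a chain and node 10 attached to node 7.
EDGES = [(i, i + 1) for i in range(1, 9)] + [(7, 10)]


def cartan_matrix():
    """Simply-laced Cartan matrix A_ij built from the assumed diagram."""
    a = [[2 if i == j else 0 for j in range(10)] for i in range(10)]
    for p, q in EDGES:
        a[p - 1][q - 1] = a[q - 1][p - 1] = -1
    return a


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]


def dynkin_labels(indices):
    """Eigenvalues under h_1..h_10 of (3.14) for a component with the given indices.

    ASSUMPTION: every occurrence of the index a contributes -1 to J^a_a.
    """
    k = [Fraction(-indices.count(a)) for a in range(1, 11)]
    labels = [k[i] - k[i + 1] for i in range(9)]
    labels.append(-Fraction(1, 3) * sum(k[:7]) + Fraction(2, 3) * sum(k[7:]))
    return labels


A = cartan_matrix()
labels = dynkin_labels(list(range(2, 11)))        # component 2 3 4 5 6 7 8 9 10
root = solve(A, labels)                           # coefficients on alpha_1..alpha_10
norm = sum(n * l for n, l in zip(root, labels))   # alpha^2 = n^T A n

print([int(x) for x in labels])  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print([int(x) for x in root])    # [0, -1, -2, -3, -4, -5, -6, -4, -2, -3]
print(norm)                      # 0
```

Exact rational arithmetic is used so that the integrality of the root labels is visible directly rather than up to rounding. Under these assumptions the output reproduces α = −δ and confirms that it is null.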

Let us now consider the roots associated to the other components of the diffeomorphism constraint (3.8a). They are obtained by the action of the permutation group $S_{10}$ on the indices. Since the permutation group is the Weyl group of sl(10), we conclude that all components of the diffeomorphism constraint are associated with (negative) null roots, forming a single orbit of the Weyl group W(sl(10)). These null roots can be obtained by acting with the corresponding Weyl transformation on δ, such that

\[ w(L_\alpha) = L_{w(\alpha)}, \tag{3.16} \]

where w on the left-hand side acts on the indices of the constraint L by permuting them.

Footnote 13: The fundamental weights $\Lambda_i$ are defined as dual to the simple roots $\alpha_j$ w.r.t. the Cartan inner product: $\langle \Lambda_i | \alpha_j \rangle = +\delta_{ij}$. The fact that $\Lambda_1$, and the integrable highest-weight representation $L(\Lambda_1)$ built from it, is related to the tower of constraints was already discussed at some length in [4]. This relation does not mean, however, that $L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10}$ is a highest weight vector for the action of all the $E_{10}$ generators. Actually, as was already shown in [4], and will be further discussed in Sect. 3.3, it fails to be one.

Footnote 14: We note that the association of the ‘null’ (or ‘cusp’) fundamental weight $\Lambda_1$ to the diffeomorphism constraint is valid not only for maximal supergravity and $E_{10}$, but also for other (super)gravity theories. For instance, for pure gravity in any spatial dimension d the basic (diffeomorphism) constraint is always associated to roots of the form $-\mu_a$, where $\mu_a$ (with a = 1, . . . , d) denotes the null roots that are contained within the GL(d) multiplet of the ‘gravity root’. The notation $\mu_a = -\beta^a + \sum_c \beta^c$ is the notation used in [30]. Note that the null root $-\mu_1$ is indeed the fundamental weight associated with the ‘hyperbolic’ node of $AE_d$ (as explicitly displayed in Eq. (3.14) of [31]).


Let us now proceed to consider the roots associated to the higher-level (or rather ‘lower-level’, as the levels are negative) constraints. To find the roots for the level $\ell = -4$ and $\ell = -5$ constraints in (3.8b) and (3.8c), we consider their highest weight components. These are $L^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10||9\,10}$ and $L^{(-5)\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10||6\,7\,8\,9\,10}$, respectively. A straightforward calculation gives the eigenvalues $[0, 0, 0, 0, 0, 0, 0, 1, 0, -1]$ and $[0, 0, 0, 0, 1, 0, 0, 0, 0, -1]$, respectively. The corresponding roots are again found to be null and negative. In view of the fact, recalled in (2.15), that all null roots are Weyl images of the basic one-dimensional string of affine null roots $n\delta$, we can look for the specific affine root $n\delta$ from which they descend. We find that it is $-\delta$, i.e., $n = -1$. In other words, in addition to being null, the roots associated to the level $\ell = -4$ and $\ell = -5$ constraints can be obtained from the ‘basic’ $\ell = -3$ ‘diffeomorphism-constraint’ root $\alpha = \Lambda_1 = -\delta \equiv -\delta^{(3)}$ by applying some $E_{10}$ Weyl reflection: $w_\alpha(\beta) = \beta - (\alpha\cdot\beta)\,\alpha$ (here simplified by taking into account the fact that $\alpha\cdot\alpha = 2$ for the roots of a simply laced algebra). More explicitly, we have:

$$\delta^{(4)} = w_{\theta'}(\delta^{(3)}), \qquad \theta' := \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 + \alpha_5 + \alpha_6 + \alpha_7 + \alpha_{10} \tag{3.17}$$

and

$$\delta^{(5)} = w_{\theta''}(\delta^{(4)}), \qquad \theta'' := \alpha_6 + 2\alpha_7 + 2\alpha_8 + \alpha_9 + \alpha_{10}, \tag{3.18}$$

where we have given the explicit Weyl reflections in $W(E_{10})$ that move between the different levels. Note that $\theta'$ is the highest root of the embedded $A_8$ algebra associated with the IIB theory, and $\theta''$ is the highest weight of an embedded $D_5$ algebra. Finally, similarly to the case of the roots associated to $L^{(-3)}$, the fact that the Young tableaux describing the $GL(10)$ index structure of $L^{(-4)}$ and $L^{(-5)}$ are totally antisymmetric guarantees that all the roots associated to the other components of these constraints are obtained from the basic ones (3.17) and (3.18) by $GL(10)$ permutations, i.e., by further Weyl reflections. In particular, all of them are null. So far all the roots associated to the first three levels of constraints have been found to be light-like (and negative). The constraint $L^{(-6)}$ differs from the lower level ones in that it is the first in the hierarchy of constraints to involve a non-trivial Young tableau. As a consequence, we are going to see that it contains a mixture of null ($\alpha^2 = 0$) and time-like ($\alpha^2 = -2$) roots. More precisely, the highest weight component $L^{(-6)\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10||10|4\,5\,6\,7\,8\,9\,10}$ is easily checked to be associated to a null root, which can be obtained from $-\delta^{(5)}$ by the following Weyl transformation:

$$\delta^{(6)} = w_{\theta'''}(\delta^{(5)}), \qquad \theta''' = \alpha_4 + 2\alpha_5 + 2\alpha_6 + 2\alpha_7 + \alpha_8 + \alpha_{10}. \tag{3.19}$$

Here, $\theta'''$ is the highest root of an embedded $D_6$ algebra. Covariantizing this component under the action of the $sl(10) = A_9$ subalgebra gives a representation of (7, 1) hook type which is not a pure antisymmetric tensor, unlike the constraints on levels $-3$, $-4$ and $-5$. From the point of view of the permutation group $S_{10} = W(sl(10))$ this means that there are two separate orbits under $W(sl(10))$. The ‘outer’ orbit consists of permutations of the lowest weight indices and corresponds to null roots of $E_{10}$. The inner orbit corresponds to imaginary $E_{10}$ roots with $\alpha^2 = -2$. In terms of the supergravity constraint (3.8d) these two orbits correspond to cases when there are two identical indices on the (7, 1) hook part or when they are all different, respectively. The ‘skeleton’ of null roots $-\delta^{(3)}, -\delta^{(4)}, -\delta^{(5)}, \ldots$, together with their multiples (discussed below) and their time-like descendants, is sketched in Fig. 2.
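The chain of Weyl reflections (3.17) to (3.19) can be checked numerically. The following is a minimal sketch, not part of the original construction. It assumes that the $E_{10}$ Dynkin diagram of Fig. 1 has nodes 1 to 9 forming a line with node 10 attached to node 7, and it takes $\delta = (0, 1, 2, 3, 4, 5, 6, 4, 2, 3)$ in the simple-root basis, the value quoted for the null root in Sect. 3.1.3. The helper names (`e10_cartan`, `dynkin_labels`, `inner`, `reflect`) are hypothetical.

```python
# Sketch: verify the Weyl reflections (3.17)-(3.19) in the simple-root basis.
# Assumption: nodes 1..9 of the E10 diagram form a chain, node 10 hangs off node 7.

def e10_cartan():
    """Return the 10x10 Cartan matrix A_ij = <alpha_i|alpha_j> as nested lists."""
    A = [[2 if i == j else 0 for j in range(10)] for i in range(10)]
    edges = [(k, k + 1) for k in range(1, 9)] + [(7, 10)]
    for i, j in edges:
        A[i - 1][j - 1] = A[j - 1][i - 1] = -1
    return A

A = e10_cartan()

def dynkin_labels(m):
    """Dynkin labels p_j = <alpha_j|m> of a lattice vector m = sum_j m_j alpha_j."""
    return [sum(A[j][k] * m[k] for k in range(10)) for j in range(10)]

def inner(x, y):
    """Cartan inner product of two vectors given in the simple-root basis."""
    return sum(xj * pj for xj, pj in zip(x, dynkin_labels(y)))

def reflect(theta, beta):
    """Weyl reflection w_theta(beta) = beta - (theta.beta) theta for a real root theta."""
    assert inner(theta, theta) == 2, "theta must be a real root"
    c = inner(theta, beta)
    return [b - c * t for b, t in zip(beta, theta)]

delta3 = [0, 1, 2, 3, 4, 5, 6, 4, 2, 3]   # delta = delta^(3)
theta1 = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]   # reflecting root of (3.17)
theta2 = [0, 0, 0, 0, 0, 1, 2, 2, 1, 1]   # reflecting root of (3.18)
theta3 = [0, 0, 0, 1, 2, 2, 2, 1, 0, 1]   # reflecting root of (3.19)

delta4 = reflect(theta1, delta3)
delta5 = reflect(theta2, delta4)
delta6 = reflect(theta3, delta5)

for name, d in [("delta4", delta4), ("delta5", delta5), ("delta6", delta6)]:
    # Print the root, its norm, and the Dynkin labels of the negative root.
    print(name, d, "norm:", inner(d, d), "labels of -d:", [-p for p in dynkin_labels(d)])
```

Under these assumptions the Dynkin labels of $-\delta^{(4)}$ and $-\delta^{(5)}$ reproduce the two eigenvalue lists quoted above, all three images are null, and each reflection raises the coefficient of $\alpha_{10}$ (the $A_9$ level) by one.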

Fig. 2. Sketch of the set C of roots (and notably its ‘skeleton’ of null roots on the past light-cone) labelling the extended set of constraints constructed in this paper

Let us finally note that all null roots $\alpha$ appearing in these constraints appear with multiplicity one, although the same roots, considered as $E_{10}$ roots, have the non-trivial root multiplicity eight. That the null roots appear with multiplicity one in the Sugawara construction should be so by consistency with the affine case. By contrast, the purely imaginary roots belonging to the inner orbit of $L^{(-6)}$ have multiplicity seven as constraints compared to multiplicity 44 as roots of $E_{10}$.

3.1.2. Supergravity constraints and canonical normalization.

So far we have analyzed the roots $\alpha$ labelling the l.h.s. of our basic Sugawara-like expression (2.13). Next we analyze the roots $\beta_1, \beta_2$ contributing to the right hand side of (2.13). Our principal aim here will be to determine the values of the numerical coefficients $M_{s_1,s_2}(\beta_1, \beta_2)$ that enter the Sugawara-like sum. We start here from the explicit $GL(10)$-decomposed form (3.8). To this aim let us consider the components of the currents $J$ on the r.h.s. where the indices are distributed in a specific way. For example, we can pick out two representative terms where only operators for real roots appear and obtain

$$L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \ni 28\cdot\frac{3!\cdot 6!}{9!}\, J^{(-1)\,2\,3\,4} J^{(-2)\,5\,6\,7\,8\,9\,10} + 3\cdot\frac{8!}{9!}\, J^{(-3)\,2|2\,3\,4\,5\,6\,7\,8\,9} J^{(0)\,10}{}_{2} = \frac{1}{3}\left( J^{(-1)\,2\,3\,4} J^{(-2)\,5\,6\,7\,8\,9\,10} + J^{(-3)\,2|2\,3\,4\,5\,6\,7\,8\,9} J^{(0)\,10}{}_{2} \right). \tag{3.20}$$

Hence, we find the remarkable fact that the combinatorial factors appearing in (3.8a) are precisely such as to imply, in the root basis, a relative normalization equal to unity. As the overall prefactor 1/3 (as well as the corresponding 1/60 in the formulas below) is merely chosen to agree with the normalisations in [4], it might eventually be traded for a more convenient one. Thus, all terms in the bracket belong to real roots and are canonically normalized, justifying in retrospect the relative factor in (3.8a) by (3.5).
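The claim that the combinatorial factors collapse to a common prefactor can be checked with exact rational arithmetic. This is a minimal sketch; it assumes that the prefactors combine as $28\cdot 3!\,6!/9!$ and $3\cdot 8!/9!$ in (3.20), and as $\tfrac{21}{10}\cdot 2\cdot 5!\,5!/10!$ and $\tfrac{3}{2}\cdot\tfrac{1}{2}\cdot 2\cdot 8!/10!$ for the Gauss constraint in (3.21). The variable names are illustrative only.

```python
# Sketch: exact evaluation of the combinatorial prefactors quoted in (3.20) and (3.21).
from fractions import Fraction
from math import factorial as f

# Diffeomorphism constraint, (3.20): coefficients 28 and 3 come from (3.8a).
diffeo_terms = [
    28 * Fraction(f(3) * f(6), f(9)),   # J^(-1) J^(-2) term
    3 * Fraction(f(8), f(9)),           # J^(-3) J^(0) term
]

# Gauss constraint, (3.21): the grouping of factors is an assumption (see lead-in).
gauss_terms = [
    Fraction(21, 10) * Fraction(2 * f(5) * f(5), f(10)),              # J^(-2) J^(-2) term
    Fraction(3, 2) * Fraction(1, 2) * Fraction(2 * f(8), f(10)),      # J^(-3) J^(-1) term
]

print("diffeomorphism:", [str(c) for c in diffeo_terms])
print("Gauss:         ", [str(c) for c in gauss_terms])
```

Both entries of each list coincide, which is the unit relative normalization referred to in the text; the common values are the overall prefactors 1/3 and 1/60.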

Fig. 3. Sketch of one of the basic elements of the infinite ‘scaffold’ of special Sugawara configurations α = β1 + β2 with α null and β1 , β2 real. The real roots β1 , β2 lie within the hyperplane tangent to the light-cone along the null root (here chosen to be α = −δ). One must imagine completing the structure shown here by all its Weyl images

For the Gauss constraint (3.8b) one similarly finds

$$L^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10||9\,10} \ni \frac{21}{10}\cdot\frac{2\cdot 5!\cdot 5!}{10!}\, J^{(-2)\,9\,1\,2\,3\,10\,4} J^{(-2)\,5\,6\,7\,8\,9\,10} + \frac{3}{2}\cdot\frac{1}{2}\cdot\frac{2\cdot 8!}{10!}\, J^{(-3)\,9|9\,10\,1\,2\,3\,4\,5\,6} J^{(-1)\,7\,8\,10} = \frac{1}{60}\left( J^{(-2)\,9\,1\,2\,3\,10\,4} J^{(-2)\,5\,6\,7\,8\,9\,10} + J^{(-3)\,9|9\,10\,1\,2\,3\,4\,5\,6} J^{(-1)\,7\,8\,10} \right). \tag{3.21}$$

Again, the terms appear with the same relative coefficient and confirm the expression (3.5) for real roots. For the constraints (3.8c) and (3.8d) on levels $\ell = -5$ and $\ell = -6$ there is nothing to check since there is only one type of term. The basic ‘scaffold’ of Sugawara constraints exhibiting a decomposition $\alpha = \beta_1 + \beta_2$ with $\alpha$ null and $\beta_1, \beta_2$ real is illustrated in Fig. 3. Note that the relations $\alpha^2 = 0$ and $\beta_1^2 = \beta_2^2 = 2$ imply, with $\alpha = \beta_1 + \beta_2$, that $\beta_1\cdot\beta_2 = -2$ and hence that $\alpha\cdot\beta_1 = 0 = \alpha\cdot\beta_2$, i.e., that $\beta_1$ and $\beta_2$ are orthogonal to $\alpha$, so that they belong to the hyperplane tangent to the light-cone along the considered null root (see Fig. 3, where one has chosen $\alpha = -\delta$). One has to imagine the infinite ‘scaffold’ made by the tangent hyperplanes associated to the infinite skeleton of Weyl images of $-\delta$.

3.1.3. General structure of constraints.

We note that there are also terms contributing to (3.5) where not both $J_{\alpha-\beta}$ and $J_\beta$ are real. For example, (3.8a) contains a term

$$L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \ni \frac{1}{3}\, J^{(-3)\,1|2\,3\,4\,5\,6\,7\,8\,9} J^{(0)\,10}{}_{1}, \tag{3.22}$$

where an imaginary level three root is contracted with a real level zero root (albeit positive). Similar contractions appear also for the other constraints. Note that although, after removing the same prefactor (1/3) as above, we again have a simple coefficient unity, the time-like-root generator associated to the level $-3$ root is such that its normalization involves the fraction 8/9, see (3.12). At this stage, we start seeing several patterns appearing within the structure of the constraints, and notably in the set C labelling the roots (together with their multiplicity) associated to the constraints. A first pattern is that, so far, all the constraints can be labelled by the members of the integrable highest-weight representation descending from the fundamental weight $\Lambda_1$, which is dual to the first (‘hyperbolic’) node of the Dynkin diagram, Fig. 1. A second, closely related, pattern is that the set of roots comprises many null roots, and that the null constraint-roots studied so far all belong to the Weyl orbit of $\Lambda_1 = -\delta$. A third pattern is the simple (unit) relative normalization of the contributions $(\alpha, \beta_1, \beta_2)$ to the Sugawara expression (2.13) involving the decomposition of a null root $\alpha$ into two real roots $(\beta_1, \beta_2)$. A fourth pattern is that the null roots associated to non purely antisymmetric Young tableaux give rise, upon covariantization under $GL(10)$, to a set of roots which ‘penetrate’ within the past light-cone, i.e., which are time-like (and past-directed) rather than light-like. It is tempting to generalize these patterns to the infinite tower of coset constraints that we are trying to construct. We can first assume that the set C of ‘constraint roots’ contains the full weight diagram, say $P(\Lambda_1)$, of the fundamental representation $L(\Lambda_1)$ based on $\Lambda_1 = -\delta$. By Proposition 10.1 of [25] we know that $P(\Lambda_1)$ (including its multiplicities) is invariant under the full $E_{10}$ Weyl group, $W(E_{10})$.
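In particular, this would imply, in view of (2.15), that there is an infinite sequence of null constraints related to the orbit of minus the primitive null element, $-\delta$. Upon covariantization of the resulting highest weight vectors under $sl(10)$ we obtain a series of constraints related to $\delta$ as indicated in the first row of the following table:

$$\begin{array}{c|ccccccc}
 & \ell = -3 & \ell = -4 & \ell = -5 & \ell = -6 & \ell = -7 & \ell = -8 & \ldots \\
\hline
-\delta & L^{(-3)}_{(-\delta)} & L^{(-4)}_{(-\delta)} & L^{(-5)}_{(-\delta)} & L^{(-6)}_{(-\delta)} & L^{(-7)}_{(-\delta)} & L^{(-8)}_{(-\delta)} & \ldots \\
-2\delta & & & & L^{(-6)}_{(-2\delta)} & & L^{(-8)}_{(-2\delta)} & \ldots \\
\vdots & & & & & & & \ldots
\end{array}$$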

Here, we added a subscript $(-\delta)$ to all the constraints in the $W(E_{10})$ orbit of $-\delta$ and suppressed the labels for the $W(sl(10))$ suborbits in the columns. Let us also recall the existence of the Hamiltonian constraint, $L_0$, which could be thought of as being associated to the 0th multiple of $\delta$. In addition to the ‘skeleton’ of null roots constituting the Weyl orbit of $\Lambda_1 = -\delta$, the weight diagram $P(\Lambda_1)$ of $L(\Lambda_1)$ contains all (past-directed) time-like roots. This follows from Proposition 11.2a of [25]. To apply this proposition, we need, for each putative weight $\mu$ within the Weyl chamber ($\mu \in P_+$; i.e. $\mu = \sum_{i=1}^{10} p_i \Lambda_i$, with $p_i \geq 0$), to control the ‘support’ of the root $\Lambda_1 - \mu$, i.e. the non-zero coefficients $m_j$ in its simple-root decomposition: $\Lambda_1 - \mu = \sum_{j=1}^{10} m_j \alpha_j$. Using $\langle\Lambda_1|\alpha_j\rangle = \delta_{1j}$ and $\langle\alpha_i|\alpha_j\rangle = A_{ij}$, the root-basis integers $m_j$ are easily seen to be related to the weight-basis integers $p_i$ via the knowledge of the inverse of the $E_{10}$ Cartan matrix $A_{ij}$. Now, by explicit inspection of this inverse Cartan matrix (see, e.g., [32]), one finds that the only place in it where there is a zero in the first column is in the first row. This shows that, for any element $\mu = \sum_{i=1}^{10} p_i \Lambda_i$ of the Weyl chamber such that $p_i \neq 0$ for at least one $i$ among $2, \ldots, 10$, the vector $\Lambda_1 - \mu$ has non-vanishing ‘support’ $m_1$ on the first node and hence is ‘non-degenerate
w.r.t. $\Lambda_1$’ (in the sense defined in Sect. 11.2 of [25]). Hence, by Kac’s Proposition 11.2a such $\mu$’s are indeed weights (together with their Weyl images). The only exceptional case is when $p_j = 0$ for $j = 2, \ldots, 10$, which corresponds to $\mu = p_1\Lambda_1$. In other words, we have found that all the negative time-like weights belong to $P(\Lambda_1)$, but that the multiples of $\Lambda_1 = -\delta$ are not part of the weight diagram $P(\Lambda_1)$.¹⁵ Though the set $P(\Lambda_1)$ is already quite large, it only corresponds to the $GL(10)$ covariantization of the first row in the table above. In view of the structure of the usual affine Virasoro-Sugawara constraints $L_{n\delta}$ recalled above, together with the known structure of $E_{10}$ null roots (2.15), it is now quite natural to conjecture that the ‘null skeleton’ of C contains, in addition to the orbit of $-\delta$ (first row in the table), the Weyl orbits of (negative) multiples of $\delta$: $-n\delta$. This amounts to conjecturing that, besides the weight diagram $P(\Lambda_1)$ of the fundamental representation $L(\Lambda_1)$, we must add the weight diagrams $P(n\Lambda_1)$ (with $n = 2, 3, \ldots$) corresponding to the multiple tensor product of $L(\Lambda_1)$ with itself: $L(\Lambda_1) \otimes L(\Lambda_1)$, $L(\Lambda_1) \otimes L(\Lambda_1) \otimes L(\Lambda_1)$, etc. Besides this mathematical argument for conjecturing an extension of the set of constraints beyond the ones related to the Weyl orbit of $-\delta$ (and its covariantization), there is a physical argument suggesting the necessity of this extension. Indeed, all the constraints discussed so far correspond, in view of the ‘dictionary’ of [3], to the values at one spatial point of some space-dependent supergravity constraints. For instance, $L^{(-3)\,n_1\ldots n_9}$ is the spatial $\epsilon^{n_1\ldots n_{10}}$ dual of the diffeomorphism constraint $H_m(x_0)$, taken at the specific spatial point $x_0$ around which one analyzes the asymptotic behaviour of the supergravity fields as $t \to 0$. However, the full supergravity diffeomorphism constraint consists of imposing the vanishing of $H_m(x)$ at all spatial points.
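As a numerical aside on the support argument based on Kac’s Proposition 11.2a, the statement about the first column of the inverse Cartan matrix can be reproduced with exact arithmetic. This is a minimal sketch under the assumption that nodes 1 to 9 of the $E_{10}$ diagram form a line with node 10 attached to node 7; the helpers `e10_cartan`, `invert` and `support_m1` are hypothetical names, and `support_m1` evaluates $m_1 = \sum_i (\delta_{1i} - p_i)(A^{-1})_{i1}$.

```python
# Sketch: first column of the inverse E10 Cartan matrix and the 'support' m_1.
# Assumption: nodes 1..9 form a chain and node 10 is attached to node 7.
from fractions import Fraction

def e10_cartan():
    """Return the 10x10 Cartan matrix A_ij as nested lists of integers."""
    A = [[2 if i == j else 0 for j in range(10)] for i in range(10)]
    for i, j in [(k, k + 1) for k in range(1, 9)] + [(7, 10)]:
        A[i - 1][j - 1] = A[j - 1][i - 1] = -1
    return A

def invert(M):
    """Gauss-Jordan inversion with exact rational arithmetic."""
    n = len(M)
    aug = [[Fraction(M[i][j]) for j in range(n)] +
           [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a pivot row, swap it into place and normalize it.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate the pivot column from all other rows.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                fct = aug[r][col]
                aug[r] = [x - fct * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

A_inv = invert(e10_cartan())
first_column = [A_inv[i][0] for i in range(10)]

def support_m1(p):
    """Coefficient m_1 of alpha_1 in Lambda_1 - mu, for mu = sum_i p_i Lambda_i."""
    return sum((int(i == 0) - p[i]) * A_inv[i][0] for i in range(10))

print("first column:", [int(x) for x in first_column])
print("m_1 for mu = 3 Lambda_1:", support_m1([3, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
print("m_1 for mu = Lambda_2:  ", support_m1([0, 1, 0, 0, 0, 0, 0, 0, 0, 0]))
```

In this sketch the first column equals the root-basis components of $\Lambda_1 = -\delta$, so its only vanishing entry sits in the first row, and $m_1$ vanishes exactly when $p_2 = \ldots = p_{10} = 0$.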
When expanding the diffeomorphism constraint $H_m(x)$ in a (ten-dimensional) spatial Taylor expansion around the base point $x_0$, we see that we should replace the unique constraint $H_m(x_0) \sim L^{(-3)\,n_1\ldots n_9}$ by an infinite gradient tower of spatial derivatives of the form $\partial_{m_1\ldots m_k} H_m(x_0)$. For instance, at the first spatial-gradient level, $k = 1$, we should be considering the two irreducible $GL(10)$ tensors contained in $\partial_m H_n(x_0)$, i.e., its symmetric and antisymmetric parts. Dualizing back these first-gradient constraints by means of $\epsilon^{n_1\ldots n_{10}}$, we are led to expect that the ‘first-gradient descendants’ of $L^{(-3)\,n_1\ldots n_9}$ will comprise two $GL(10)$ tensors bearing 18 contravariant indices, and belonging to two different Young tableaux: one with [9,9] boxes (corresponding to the symmetric combination) and one with [10,8] boxes (corresponding to the antisymmetric combination). The former corresponds to the null root $-2\delta = 2\Lambda_1$, whereas the latter corresponds to the imaginary $\Lambda_2$ and so lies inside the past light-cone. The extension of this gradient construction to the other supergravity constraints (Gauss, etc.) then naturally leads us to conjecture the existence of the second row of the table. Then, when considering higher spatial gradients we are led to conjecturing the existence of further rows ‘stemming’ from $-3\delta$, $-4\delta$, etc. One finds that the putative constraints associated with the Weyl orbit of $-n\delta$ start on level $\ell = -3n$ and are spaced by $n$. Finally, it seems that the full table is describing all possible weights on or inside the (past) light-cone. The notation in the table is condensed and does not display the $sl(10)$ representation structure of the various constraints. For example, the sets of constraints labelled by $L^{(-6)}_{(-\delta)}$ and $L^{(-6)}_{(-2\delta)}$ transform in different $sl(10)$ representations. The former one is in the hook representation of (3.8d), whereas the latter has two sets of antisymmetric 9-tuples. Explicitly, one has the following two index structures:

$$L^{(-6)\,m_1\ldots m_{10}||n_0|n_1\ldots n_7}_{(-\delta)} \qquad\text{and}\qquad L^{(-6)\,m_1\ldots m_9|n_1\ldots n_9}_{(-2\delta)}. \tag{3.23}$$

¹⁵ Another way of seeing this is by using Proposition 11.3 of [25] where $P(\Lambda_1)$ is described as the convex hull of the Weyl orbit of $\Lambda_1$. The infinitely many Weyl images of $\Lambda_1$ all lie on the light-cone (and densely approximate any null direction) and one might think that the convex hull covers all points on the light-cone. This is not true since one is constructing the convex hull as an infinite union of closed sets, which is not necessarily closed. In the present case it is open and misses exactly the multiples of $\Lambda_1$ and their Weyl images, but the convex hull covers all points inside the light-cone.

In the affine truncation to $E_9$ only one member in each infinite sequence (row) for a given $-n\delta$ is non-trivial because of the presence of 10-tuples of antisymmetrized indices in the higher components. In the example (3.23) above, the first tensor vanishes in the affine truncation, whereas the second one is non-zero. In addition, all the surviving constraints from the beginning of each sequence reduce to singlets under $sl(9)$. These are the $L^{(-3n)}_{(-n\delta)}$. This (one-sided) sequence of constraints naturally corresponds to the generators $L_{-n\delta}$ (for $n > 0$) of the affine Sugawara construction that we had introduced in (2.7). (We will return below to specific issues concerning the contractions of the null roots and Cartan subalgebra generators.) We do not present an explicit expression for the second rung of constraints, like the second term in (3.23), but note that on the contractions of real root spaces it is given by the same general formula (3.5) as the other constraints we have considered so far. Among the other constraints in $L^{(-\ell n)}_{(-n\delta)}$, some have an index structure similar to the elementary $L^{(-\ell)}_{(-\delta)}$, but with all tuples replicated $n$ times. For example, the index structure of $L^{(-8)}_{(-2\delta)}$ contains a tensor with two 10-tuples and two 2-tuples

$$L^{(-8)\,m_1\ldots m_{10}||p_1\ldots p_{10}||n_1 n_2|q_1 q_2}_{(-2\delta)}. \tag{3.24}$$

To complete this discussion, let us point out the following ‘experimental’ relation between the constraints and the level decomposition of the adjoint of $E_{10}$ under $A_9$ [29]. ‘Admissible’ $A_9$ representations in the level decomposition rarely appear with outer multiplicity zero. Here, ‘admissible’ refers to solving necessary diophantine conditions on the lowest weight vectors of a possible $A_9$ representation occurring in the adjoint representation of $E_{10}$, see Eqs. (6) and (7) in [3]. The only cases up to $\ell \leq 28$ for which the outer multiplicity of an admissible representation is zero are those when the associated lowest root in the representation is null.¹⁶ More precisely, the only entries with vanishing outer multiplicities in the tables of [29] occur at¹⁷

Level       E10 root                              A9 weight
ℓ = 3n      n(0, 1, 2, 3, 4, 5, 6, 4, 2, 3)       [n, 0, 0, 0, 0, 0, 0, 0, 0]
ℓ = 4n      n(1, 2, 3, 4, 5, 6, 7, 4, 2, 4)       [0, 0, 0, 0, 0, 0, 0, n, 0]
ℓ = 5n      n(1, 2, 3, 4, 5, 7, 9, 6, 3, 5)       [0, 0, 0, 0, n, 0, 0, 0, 0]
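The three table rows can be cross-checked for $n = 1$ directly from the Cartan matrix. This is a minimal sketch with two assumptions that are not stated in the text: the $E_{10}$ diagram has nodes 1 to 9 in a line with node 10 attached to node 7, and the tabulated $A_9$ weight equals minus the first nine Dynkin labels of the positive root (a sign convention inferred here from matching the table). The dictionary `table` and the helper names are illustrative.

```python
# Sketch: check level, norm and A9 weight of the three tabulated roots (n = 1).
# Assumptions: node 10 is attached to node 7; A9 weight = minus the first nine Dynkin labels.

def e10_cartan():
    """Return the 10x10 Cartan matrix A_ij as nested lists of integers."""
    A = [[2 if i == j else 0 for j in range(10)] for i in range(10)]
    for i, j in [(k, k + 1) for k in range(1, 9)] + [(7, 10)]:
        A[i - 1][j - 1] = A[j - 1][i - 1] = -1
    return A

A = e10_cartan()

def dynkin_labels(m):
    """Dynkin labels p_j = <alpha_j|m> of m = sum_j m_j alpha_j."""
    return [sum(A[j][k] * m[k] for k in range(10)) for j in range(10)]

# level -> (root in the simple-root basis, A9 weight as listed in the table)
table = {
    3: ([0, 1, 2, 3, 4, 5, 6, 4, 2, 3], [1, 0, 0, 0, 0, 0, 0, 0, 0]),
    4: ([1, 2, 3, 4, 5, 6, 7, 4, 2, 4], [0, 0, 0, 0, 0, 0, 0, 1, 0]),
    5: ([1, 2, 3, 4, 5, 7, 9, 6, 3, 5], [0, 0, 0, 0, 1, 0, 0, 0, 0]),
}

results = {}
for level, (root, a9_weight) in table.items():
    p = dynkin_labels(root)
    results[level] = {
        "level": root[9],                                  # coefficient of alpha_10
        "norm": sum(m * q for m, q in zip(root, p)),       # alpha^2
        "a9_weight": [-q for q in p[:9]],                  # assumed sign convention
    }
    print(level, results[level])
```

Because levels, Dynkin labels and the reflections are linear in the root, the same check carries over to general $n$ by rescaling.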

The first line corresponds to the root $n\delta$ and both the second and the third line can be obtained from the first line by the Weyl transformations given explicitly in Eqs. (3.17) and (3.18). These entries have vanishing outer multiplicities since the corresponding $E_{10}$ generators are already contained in the gradient representations on the relevant level. One potentially important implication of the vanishing outer multiplicities is that there are no ordering ambiguities because the relevant commutators always vanish, as in (3.13), whereas ordering ambiguities will occur in general for higher level constraints like $L^{(-6)}_{(-\delta)}$.

¹⁶ This is no longer necessarily true when considering Kac–Moody algebras different from $E_{10}$ [21] or decompositions other than that under $A_9$.

¹⁷ We use two different notations for describing elements $\alpha$ of the (self-dual) $E_{10}$ root lattice, namely in terms of either the basis of simple roots $\alpha_i$ or of the basis of fundamental weights $\Lambda_i$: $\alpha = \sum_i m_i \alpha_i = \sum_i p_i \Lambda_i$. In the former we write the ten-tuple of coefficients with round parentheses $(m_1, \ldots, m_{10})$ and in the latter with square brackets $[p_1, \ldots, p_{10}]$. The $p_i$ are often referred to as Dynkin labels. The $A_9$ weight is obtained from $[p_1, \ldots, p_{10}]$ by dropping the last entry $p_{10}$ since this corresponds to the node that is deleted in the $A_9$ level decomposition.

3.1.4. Algebra of constraints.

Let us now return to the question of the constraint algebra (2.16) raised at the end of Sect. 2. We discuss this issue by using the explicit expressions for the constraints (3.8). As discussed above, one would like the constraint algebra to close with structure constants given by current components. From the results of [4] it follows that one can generate higher level constraints from lower level constraints by the action of the negative level current operator, $J^{(-1)}$, i.e., that schematically

$$L^{(-(\ell+1))} = \left\{ J^{(-1)}, L^{(-\ell)} \right\} \tag{3.25}$$

is valid for the constraints on levels $-3$, $-4$, $-5$. This property is equivalent to the result of [4] that the constraints are ‘covariant’ under the upper Borel group $E_{10}^+$ (i.e., that they form a representation of $E_{10}^+$; even if they do not form a representation of the full group $E_{10}$). In addition, the level-three truncated constraints (3.8) have the property that their Sugawara expression contains only negative level currents, i.e., schematically

$$L^{(-\ell)} = \sum_{p+q=\ell} J^{(-p)} \cdot J^{(-q)}. \tag{3.26}$$

It is now easy to see that the two properties (3.25) and (3.26) imply that the Poisson bracket of two constraints closes in the desired manner of (2.16). This is certainly an encouraging result, which suggests that the structure of the constraints incorporates special features allowing for the existence of a closed algebra of the type of a generalized Virasoro algebra (2.16). However, it is not clear whether the two special properties (3.25) and (3.26) continue to hold for the generalized infinite tower of $E_{10}$ constraints whose construction was sketched above. We shall see that the property (3.26) is likely to be violated when implementing a certain ‘see-saw’ construction defined below. As for the property (3.25) (which says that the constraints form a representation of $E_{10}^+$), one reason for believing that it might not be universally valid comes from the example of the affine Sugawara construction. There the constraints do not transform in a representation of the affine algebra: $L^{(-\ell-n)} \neq \left\{ J^{(-n)}, L^{(-\ell)} \right\}$. Rather one finds that it is the algebra which transforms under the constraints, i.e., $J^{(-\ell-n)} = \left\{ J^{(-n)}, L^{(-\ell)} \right\}$. We leave to future work further discussion of this important issue.

3.2. Universality: D = 11, IIB and massive IIA.

The full $E_{10}$ Lie algebra can be obtained from the closure (via commutators) of two of its finite-dimensional sub-algebras: (i) its $A_9$ subalgebra (relevant for D = 11 supergravity), and (ii) its $A_8 \oplus A_1$ subalgebra (relevant for type IIB supergravity). The $A_9$ subalgebra corresponds to nodes 1, 2, 3, 4, 5, 6, 7, 8, 9 of the Dynkin diagram in Fig. 1; the $A_8 \oplus A_1$ algebra corresponds to the nodes 1, 2, 3, 4, 5, 6, 7, 10 and 9 in Fig. 1. The two subalgebras $A_9$ and $A_8 \oplus A_1$ together cover all ten nodes of the $E_{10}$ diagram, and therefore their closure is all of $E_{10}$. For the $A_8 \oplus A_1$ decomposition, the term ‘level’ refers to node 8. For low levels,
the decomposition under this $A_8 \oplus A_1$ subalgebra, originally performed in [13,21], is reproduced in Appendix A.2. The two decompositions under $A_9$ and $A_8 \oplus A_1$ provide two different bases for the same Lie algebra $E_{10}$. In order to distinguish them we use the letter $J$ for the current components in the $A_9$ decomposition, as already done for example in (3.8), and the letter $I$ for current components in the $A_8 \oplus A_1$ decomposition. Since the real root spaces are one dimensional it is usually straightforward to explicitly work out the ‘change of basis’ between the current components expressed in the $J$ basis or the $I$ basis. For example, the root space of the real root $\alpha = -\alpha_{10}$ contains the current component $J^{(-1)\,8\,9\,10}$ in the $A_9$ decomposition. In the $A_8 \oplus A_1$ decomposition this root space is part of the $A_8$ ‘gravity line’ and therefore one obtains the following relation between the vectors of the two bases in the $\alpha = -\alpha_{10}$ root space

$$J^{(-1)\,8\,9\,10} = I^{(0)\,8}{}_{9} \qquad\text{corresponding to}\qquad E_{-\alpha_{10}} \equiv f_{10}. \tag{3.27}$$
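The level mismatch in (3.27) can be read off from the simple-root components of the root. This is a minimal sketch, assuming that the $A_9$ level is the coefficient of $\alpha_{10}$ and the $A_8 \oplus A_1$ level is the coefficient of $\alpha_8$ (the nodes to which ‘level’ refers in the two decompositions); the function name `levels` is hypothetical.

```python
# Sketch: the same E10 root carries different levels in the two decompositions.
# Assumption: A9 level = coefficient of alpha_10, A8+A1 level = coefficient of alpha_8.

def levels(m):
    """Return (A9 level, A8+A1 level) of the lattice vector m = sum_j m_j alpha_j."""
    return m[9], m[7]

minus_alpha10 = [0] * 9 + [-1]                                  # root of (3.27)
minus_delta = [-x for x in [0, 1, 2, 3, 4, 5, 6, 4, 2, 3]]      # null root -delta

print("-alpha_10:", levels(minus_alpha10))
print("-delta:   ", levels(minus_delta))
```

With these conventions $-\alpha_{10}$ sits on $A_9$ level $-1$ but on $A_8 \oplus A_1$ level 0, matching the superscripts in (3.27), while $-\delta$ sits on levels $-3$ and $-4$ respectively, consistent with the labels $L^{(-3)}$ and $C^{(-4)}$ used for the two diffeomorphism constraints.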

That the two generators are not on the same level with regard to the two decompositions of the $E_{10}$ algebra will be of crucial importance for the construction we shall discuss next. In Appendix A, we also recall the association of the level decompositions with low-lying generators in an explicit tensor basis for the two decompositions. The fact that $A_9$ and $A_8 \oplus A_1$ together generate the whole $E_{10}$ algebra allows one, in principle, to extend the lowest level supergravity constraints to arbitrarily high levels by the following mechanism (which for obvious reasons we will refer to as a ‘see-saw mechanism’). Among the root components contributing to a given known constraint in one level decomposition, there are some that correspond to ‘unknown’ levels in a different decomposition. Covariantizing the resulting expression with regard to the $gl(n, \mathbb{R})$ subalgebra relevant for that new decomposition we generate new components, which in turn can be analyzed in terms of the first decomposition. Covariantizing again, but now with respect to the first decomposition, we again generate new components. It is easy to see that this procedure never stops, and so continues ad infinitum. To see how this construction works in a concrete example consider the following terms in the D = 11 diffeomorphism constraint (3.8a), see also (3.20),

$$L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \ni \frac{1}{3}\Big( J^{(-1)\,2\,3\,4} J^{(-2)\,5\,6\,7\,8\,9\,10} + J^{(-3)\,9|2\,3\,4\,5\,6\,7\,9\,10} J^{(0)\,8}{}_{9} + J^{(-3)\,9|2\,3\,4\,5\,6\,7\,8\,9} J^{(0)\,10}{}_{9} - J^{(-3)\,8|2\,3\,4\,5\,6\,8\,9\,10} J^{(0)\,7}{}_{8} \Big). \tag{3.28}$$

All the terms in the bracket correspond to canonically normalized real root components of the current. In analogy with (3.27) one can now convert these terms into the alternative basis provided by the $A_8 \oplus A_1$ decomposition. In this way we obtain (see Appendix A for the notation)

$$\begin{aligned}
J^{(-1)\,2\,3\,4} &= I^{(-2)\,2\,3\,4\,9}, & J^{(-2)\,5\,6\,7\,8\,9\,10} &= I^{(-2)\,5\,6\,7\,8}, \\
J^{(-3)\,9|2\,3\,4\,5\,6\,7\,9\,10} &= I^{(-3)\,2\,3\,4\,5\,6\,7,\dot{1}}, & J^{(0)\,8}{}_{9} &= I^{(-1)\,8\,9,\dot{2}}, \\
J^{(-3)\,9|2\,3\,4\,5\,6\,7\,8\,9} &= I^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9,\dot{1}\dot{1}}, & J^{(0)\,10}{}_{9} &= I^{(0)\,\dot{2}}{}_{\dot{1}}, \\
J^{(-3)\,8|2\,3\,4\,5\,6\,8\,9\,10} &= I^{(-4)\,8|2\,3\,4\,5\,6\,8\,9}, & J^{(0)\,7}{}_{8} &= I^{(0)\,7}{}_{8},
\end{aligned} \tag{3.29}$$

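where dotted indices refer to the $sl(2, \mathbb{R})$ algebra associated with node 9. Putting this back into (3.28) one can see that this is part of a $GL(9, \mathbb{R}) \times SL(2, \mathbb{R})$ covariant expression of the form¹⁸

$$C^{(-4)\,n_1\ldots n_8} = \frac{35}{3}\, I^{(-2)\,[n_1\ldots n_4} I^{(-2)\,n_5\ldots n_8]} - \frac{28}{3}\, \epsilon_{\alpha\beta}\, I^{(-1)\,[n_1 n_2,\alpha} I^{(-3)\,n_3\ldots n_8],\beta} - \frac{1}{3}\, I^{(-4)\,n_1\ldots n_8,\alpha\gamma} I^{(0)\,\beta}{}_{\gamma}\, \epsilon_{\alpha\beta} - \frac{8}{3}\, I^{(-4)\,p|[n_1\ldots n_7} I^{(0)\,n_8]}{}_{p} + \ldots, \tag{3.30}$$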

where for clarity of notation we use the symbol $C$ to denote the IIB constraints. Remarkably, this expression is exactly the diffeomorphism constraint of IIB supergravity when the correspondence with $E_{10}$ of [13] is used. This is explained in more detail in Appendix B.¹⁹ Indeed, using the expressions (A.5) and (A.6) for the Cartan generators expressed in IIB variables, one finds that the component 2 3 4 5 6 7 8 9 of the IIB diffeomorphism constraint is associated with the root space of $-\delta$, just as is the component 2 3 4 5 6 7 8 9 10 of the D = 11 diffeomorphism constraint, see Appendix B. This suggests that the two expressions possibly agree completely. Inspecting all the different root components and $E_{10}$ generators one verifies

$$L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10}\Big|_{\text{real roots}} = C^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9}\Big|_{\text{real roots}} = \frac{1}{3}\, L_{-\delta}\Big|_{\text{real roots}}, \tag{3.31}$$

i.e., the expressions agree on the bilinear expressions involving two real root generators, as was, in fact, guaranteed by our use of the Weyl group in the covariantization procedure. We find it remarkable that there is such an agreement between the constraints of two different physical theories expressed in the simple algebraic fashion (3.5). However, considering the bilinear terms contributing to the two expressions, one finds that there are terms that differ; an explicit example can be found in Appendix C. One way to interpret this difference is the following: the full set of constraints can be divided in two parts: (i) a universal part, based on the ‘skeleton’ of null roots, and comprising the ‘scaffold’ of special configurations

$$L_\alpha = \sum_{\substack{\beta_1 + \beta_2 = \alpha \\ \beta_1, \beta_2\ \text{real}}} J_{\beta_1} J_{\beta_2} \qquad \text{for } \alpha \text{ null} \tag{3.32}$$

and, (ii) a non-universal part (the ‘flesh’) that depends on the choice of subgroup under which one covariantizes the ‘scaffold’ part (3.32). The universal part of the construction (3.32) has the property of being preserved by the action of the discrete Weyl group $W(E_{10})$ and its subgroups $W(A_9)$ and $W(A_8 \oplus A_1)$. By contrast, the covariantization of the skeleton under the corresponding continuous groups $GL(10, \mathbb{R})$ (for D = 11) and $GL(9, \mathbb{R}) \times SL(2, \mathbb{R})$ (for type IIB) leads to different results on the additional new terms inside the light-cone that are generated by the covariantization. That different new terms are possible is due to the fact that in those terms one has to specify the coefficients $M_{s_1,s_2}(\beta_1, \beta_2)$ for the contraction of root spaces of different dimensions.²⁰ These are fixed by covariance under a chosen level decomposition subgroup.²¹ We believe there is some evidence that hyperbolic algebras may admit a realisation akin to the realisation of affine algebras in terms of a spectral parameter,²² but our results here strongly suggest that, if there is such a realisation, it will not be unique. Thinking of Sugawara constructions as being associated with spectral parameters, this can be interpreted by saying that the $E_{10}$ algebra may not possess a single or unique set of spectral parameters. Rather, one can and has to choose a set of spectral parameters by covariantizing under a subalgebra of one’s choice. If the spectral parameters are related to space variables (as is the case for the affine algebras appearing in D = 2 supergravities), then this would be in good agreement with the anticipation that one can make space-times of different dimensions emergent from $E_{10}$, depending on the choice of level decomposition [34]. From this point of view the hyperbolic Sugawara construction considered here is less unique than in the affine case since it depends on the choice of level decomposition. At the same time it nicely incorporates the expected possibility of having different spaces emerging from a U-duality (Weyl group) invariant scaffold. On the other hand, restricting only to real root generators, we can now use the agreement between the two expressions to construct new terms involving higher level generators, showing the full power of the approach. The crucial point is that the IIB diffeomorphism constraint (3.30) also contains other components that are not contained in the previous expressions (3.8) corresponding to the $A_9$ level decomposition with the level truncation appropriate to D = 11 supergravity, for example the real root combination

$$C^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9} \ni \frac{1}{3}\, I^{(-4)\,8|2\,3\,4\,5\,6\,7\,8} I^{(0)\,9}{}_{8}. \tag{3.33}$$

¹⁸ The 8-index tensor on the l.h.s. is fully antisymmetric. We use the convention that $\epsilon^{\dot{1}\dot{2}} = +1 = -\epsilon_{\dot{1}\dot{2}}$.

¹⁹ We take this opportunity to point out a typo in the Einstein equation (67) in [13]: The terms involving the (self-dual) five-form field strength should be multiplied by 1/2. This does not affect the dictionary derived in that paper.

Translating again between the two different bases of $E_{10}$ using

$$I^{(0)\,9}{}_{8} = J^{(+1)}{}_{8\,9\,10}, \qquad I^{(-4)\,8|2\,3\,4\,5\,6\,7\,8} = J^{(-4)\,8\,9\,10|2\,3\,4\,5\,6\,7\,8\,9\,10}, \tag{3.34}$$

we infer that this is part of an extended $sl(10)$ covariant expression, namely

$$L^{(-3)\,m_1\ldots m_9} \to (3.8a) + \frac{1}{3\cdot 3!}\, J^{(+1)}{}_{p_1 p_2 p_3}\, J^{(-4)\,p_1 p_2 p_3|m_1\ldots m_9}. \tag{3.35}$$

²⁰ A similar difference was already noted for the $E_9$ contraction in [4].

²¹ See, however, the suggestion above Eq. (2.14) that one might use the Lie algebra generator associated to the considered constraint-root $\alpha$ to define a universal way of pairing the two different root spaces $g_{\beta_1}$, $g_{\beta_2}$.

²² Some evidence from the structure of the compact subgroup $K(E_{10})$ was given in [33].

The normalization is fixed by the term in the IIB expansion. This is also the only possible contraction between $A_9$ level $+1$ and $-4$ contributing to the diffeomorphism constraint in D = 11. (The mass deformation generator on $\ell = 4$ does not contribute to the diffeomorphism constraint [14].) We note that the generator appearing in this new piece of the D = 11 diffeomorphism constraint is a gradient generator in the language of [3]. The new term in the D = 11 constraint now has components on IIB level $\ell = -5$ and $\ell = -6$ that can in turn be covariantized under $sl(9) \oplus sl(2)$, generating new terms. We have carried out this procedure one step farther and found the following expressions for the ‘diffeomorphism constraints’ in $A_9$ decomposition

$$L^{(-3)\,m_1\ldots m_9} = 28\, J^{(-1)\,m_1 m_2 m_3} J^{(-2)\,m_4\ldots m_9} + 3\, J^{(-3)\,p|m_1\ldots m_8} J^{(0)\,m_9}{}_{p} + \frac{1}{3\cdot 3!}\, J^{(-4)\,p_1 p_2 p_3|m_1\ldots m_9} J^{(+1)}{}_{p_1 p_2 p_3} + \frac{1}{3\cdot 6!}\, J^{(-5)\,p_1\ldots p_6|m_1\ldots m_9} J^{(+2)}{}_{p_1\ldots p_6} + \cdots \tag{3.36}$$

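with implicit antisymmetrization over $[m_1 \ldots m_9]$, and a corresponding expression in $A_8 \oplus A_1$ decomposition

$$C^{(-4)\,m_1\ldots m_8} = \frac{35}{3}\, I^{(-2)\,m_1\ldots m_4} I^{(-2)\,m_5\ldots m_8} - \frac{28}{3}\, \epsilon_{\alpha\beta}\, I^{(-1)\,m_1 m_2,\alpha} I^{(-3)\,m_3\ldots m_8,\beta} - \frac{1}{3}\, I^{(-4)\,m_1\ldots m_8,\alpha\gamma} I^{(0)\,\beta}{}_{\gamma}\, \epsilon_{\alpha\beta} - \frac{8}{3}\, I^{(-4)\,p|m_1\ldots m_7} I^{(0)\,m_8}{}_{p} + \frac{1}{3\cdot 2!}\, I^{(-5)\,p_1 p_2|m_1\ldots m_8,\alpha} I^{(+1)}{}_{p_1 p_2,\alpha} + \frac{1}{3\cdot 4!}\, I^{(-6)\,p_1\ldots p_4|m_1\ldots m_8} I^{(+2)}{}_{p_1\ldots p_4} + \cdots \tag{3.37}$$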

with implicit antisymmetrization over $[m_1 \ldots m_8]$. Note that the index range of the world indices is different in the two decompositions: in the D = 11 case, corresponding to A9, the index range is m = 1, …, 10, and in the type IIB case, corresponding to A8 ⊕ A1, the index range is m = 1, …, 9. By construction, these two expressions agree on the real roots. In this way one produces an expression for a Sugawara constraint $L_{-\delta}$ which extends to arbitrarily positive and negative step operators. We also note the appearance of gradient generators precisely in accord with (2.2), as these generators are the ones that reduce to the higher level affine generators in the truncation of $E_{10}$ to $E_9$ [33]. The gradient generators are those generators related to real roots of the affine $E_9$ [3,29]. It is straightforward to see that the infinite prolongation of our procedure will give rise to all the terms needed to match the full sum in (2.2) for negative values of n.

Our see-saw mechanism not only demands the extension of the constraints $L_\alpha$ (for a given α) to infinite strings of bilinears of Noether charges, in agreement with the affine Sugawara construction, but also allows one to switch between constraints that are distinct as supergravity constraints. For instance, certain components of the IIB diffeomorphism constraint metamorphose into components of the D = 11 Gauss constraint when viewed in a different level decomposition! To see this more explicitly, consider the following component of the IIB diffeomorphism constraint (3.37),

$$ C^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8} = \frac{1}{3}\, I^{(-2)\,1\,2\,3\,4} I^{(-2)\,5\,6\,7\,8} + \cdots, \tag{3.38} $$

where we only picked out one real root combination for simplicity. Translating this to the A9 basis via

$$ I^{(-2)\,1\,2\,3\,4} = J^{(-2)\,1\,2\,3\,4\,9\,10}, \qquad I^{(-2)\,5\,6\,7\,8} = J^{(-2)\,5\,6\,7\,8\,9\,10}, \tag{3.39} $$

we find that it is part of a covariant expression

$$ L^{(-4)\,m_1\ldots m_{10}||n_1n_2} = 42\, J^{(-2)\,n_1[m_1\ldots m_5} J^{(-2)\,m_6\ldots m_{10}]n_2} - (n_1 \leftrightarrow n_2), \tag{3.40} $$

where the overall normalization differs by a factor of 20 from (3.8b), see also (3.20) in comparison to (3.21). This is exactly the combination that appears in the Gauss constraint of D = 11 supergravity, a result not too surprising from the point of view of U-duality. Evidently, this process could now be continued ad libitum.

One can similarly generate new terms for the Gauss constraint in D = 11, given up to ℓ = 3 in (3.8b). We start from the following components of the IIB diffeomorphism constraint:

$$ C^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8} = \frac{1}{3}\, I^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8,\dot{1}\dot{1}} I^{(0)\,\dot{2}}{}_{\dot{1}} + I^{(-4)\,8|2\,3\,4\,5\,6\,7\,8} I^{(0)\,1}{}_{8} + \cdots. \tag{3.41} $$

They can be mapped to A9 quantities using the two distinct A9 level ℓ = 4 representations

$$ I^{(-4)\,1\,2\,3\,4\,5\,6\,7\,8,\dot{1}\dot{1}} = J^{(-4)\,9|9|1\,2\,3\,4\,5\,6\,7\,8\,9\,10}, \qquad I^{(-4)\,8|2\,3\,4\,5\,6\,7\,8} = J^{(-4)\,8\,9\,10|2\,3\,4\,5\,6\,7\,8\,9\,10} \tag{3.42} $$

to give sl(10) covariant additions to (3.8b) via

$$ L^{(-4)\,m_1\ldots m_{10}||n_1n_2} \;\to\; (3.8b) + \frac{2}{3}\, J^{(-4)\,m_1\ldots m_{10}|p[n_1} J^{(0)\,n_2]}{}_{p} + \frac{10}{3}\, J^{(-4)\,m_1\ldots m_9|n_1n_2p} J^{(0)\,m_{10}}{}_{p} + \cdots. \tag{3.43} $$

We note that here both the gradient and non-gradient generator on A9 level ℓ = 4 contribute. Since the mass deformation parameter of massive type IIA is contained in the non-gradient generator, this is in agreement with the fact that the Gauss constraint of massive IIA gets modified by the Romans mass [14,35].

3.3. General remarks on the construction. Let us summarize the construction and comment on some open questions concerning this procedure. Starting from a single constraint $L_{-\delta}$, associated to the primitive null root, we construct the ‘scaffold’ as in (3.32), based on the decomposition α = β1 + β2 for α null and β1, β2 real.23 Since all root spaces involved are real, they are one-dimensional and there is no ambiguity in the contraction. There are also no ordering ambiguities at this level. We can then act on the expression (3.32) with $W(E_{10})$ to generate similar expressions for all null roots. This constitutes the full scaffold of the hyperbolic Sugawara constraints, which is invariant (only) under $W(E_{10})$. The constraints of the type in (3.32) are infinite in number, and each consists of an infinite number of bilinears in the current components.

In order to construct constraints for the full $E_{10}$ one then needs to choose a level decomposition under a regular, finite-dimensional subalgebra. Covariance under this subalgebra induces additional terms on top of those already contained in the skeleton. The precise form of these additional terms depends in a systematic way on the subalgebra chosen, as is apparent from the explicit expressions in Appendix C. From that point of view it is clear that our construction is not covariant with respect to the full $E_{10}$ Lie algebra, but involves only the Weyl group $W(E_{10})$ in a canonical way. Everything beyond that depends on the chosen subalgebra for the level decomposition.24

23 We note that decompositions of imaginary roots into real roots have been considered in a different context in [36].
24 In some sense this is also true for the affine $E_9$ Sugawara construction, which uses $E_8$ as the choice of subalgebra for the level decomposition.
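The decomposition of a null root into two real roots, and the action of a Weyl reflection on it, can be checked with a few lines of integer arithmetic. The following is only a sketch under stated assumptions: the node numbering of the $E_{10}$ Dynkin diagram (nodes 1 to 9 on a line, node 10 attached to node 7) is inferred from the generators (A.2) and (A.3) rather than taken from Fig. 1, and the particular splitting δ = α2 + (δ − α2) is chosen here for illustration and is not claimed to be the one used in (3.32).

```python
# Sketch: null root delta of E10 as a sum of two norm-2 lattice vectors.
# Assumption: nodes 1..9 form a line and node 10 is attached to node 7.

def e10_cartan_matrix():
    n = 10
    edges = [(i, i + 1) for i in range(8)] + [(6, 9)]  # 0-based node labels
    a = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = -1
    return a

A = e10_cartan_matrix()

def inner(x, y):
    """Invariant bilinear form on the root lattice, in the simple-root basis."""
    return sum(x[i] * A[i][j] * y[j] for i in range(10) for j in range(10))

def simple_root(i):
    """Simple root alpha_i with the paper's 1-based label i."""
    return tuple(1 if k == i - 1 else 0 for k in range(10))

def reflect(i, x):
    """Weyl reflection s_i(x) = x - (x . alpha_i) alpha_i."""
    c = inner(x, simple_root(i))
    return tuple(xk - c * ak for xk, ak in zip(x, simple_root(i)))

# Level-3 entry of the A9 table with alpha^2 = 0.
delta = (0, 1, 2, 3, 4, 5, 6, 4, 2, 3)

# Illustrative splitting (hypothetical choice): beta1 = alpha_2, beta2 = delta - alpha_2.
beta1 = simple_root(2)
beta2 = tuple(d - b for d, b in zip(delta, beta1))

delta_dot_simple = [inner(delta, simple_root(i)) for i in range(1, 11)]
image = reflect(1, delta)  # another null root in the Weyl orbit of delta
```

Here `beta2` has support on nodes 3 to 10 only, where its coefficients (2, 3, 4, 5, 6, 4, 2, 3) are those of the highest root of the $E_8$ subdiagram, and `delta_dot_simple` shows that δ is orthogonal to every simple root except α1.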


We can also bring out the lack of ‘$E_{10}$ covariance’ by relating our construction to the question of an $E_{10}$ representation structure in the bilinear expression in $E_{10}$ generators. As already pointed out in footnote 13, one might have liked to identify the constraint $L_{-\delta}$ with a highest weight vector of an integrable $E_{10}$ representation with highest weight $\Lambda_1 = -\delta$. If this were the case, the constraint would be annihilated by all raising operators. Here, we recall that we express the step operators in terms of current components (rather than in terms of the ‘contragredient’ $E_{10}$ Lie algebra generators $T_\alpha^{(s)}$), so that for example $e_1 = J^{(0)\,2}{}_{1}$. Using the explicit expression for $L_{-\delta}$ in A9 decomposition we find that

$$ \left[ e_i ,\, L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \right] = \left[ J^{(0)\,i+1}{}_{i} ,\, L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \right] = 0 \quad \text{for } i = 1, \ldots, 9, \tag{3.44} $$

where all commutators should be read as Poisson (or Dirac) brackets in the canonical setting. However, for the generator $e_{10}$ corresponding to the omitted node we get

$$ \left[ e_{10} ,\, L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \right] = \left[ J^{(1)}{}_{8\,9\,10} ,\, L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \right] \neq 0, \tag{3.45} $$

showing that this component of the constraint generator is a highest weight state only with respect to the A9 subalgebra, not the full $E_{10}$ algebra. Since the A9 expression agrees with the A8 ⊕ A1 expression on the real roots, we can repeat the calculation in IIB variables to find

$$ \left[ e_i ,\, C^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9} \right] = 0 \quad \text{for } i = 1, \ldots, 7, 9, 10 \tag{3.46} $$

and

$$ \left[ e_8 ,\, C^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9} \right] \neq 0. \tag{3.47} $$

Being a highest weight vector now with respect to A8 ⊕ A1, this is different from the result for the A9 decomposition, but again illustrates the lack of full $E_{10}$ covariance. Similar conclusions hold for the D9 ≡ SO(9, 9) decomposition of [11]. Without going into the details of the calculation, the lowest order constraint for the D9 decomposition is

$$ L^{(-2)\,I} = \frac{1}{2}\, J^{(0)\,KL} J^{(-2)\,IKL} + \frac{1}{2}\, J^{(-1)\,A} (C\Gamma^{I})_{AB}\, J^{(-1)\,B} + \cdots, \tag{3.48} $$

where I, K, L = 1, …, 18 and A, B = 1, …, 256 are SO(9, 9) vector and spinor indices, respectively, and we use again the symbol J to denote the components of the conserved $E_{10}$ current, but now in the D9 decomposition. The 18 = 9 + 9 constraints in (3.48) correspond to the diffeomorphism constraint and the Gauss constraint for the Neveu-Schwarz 2-form field of IIA theory; alternatively, they might be interpreted as a doubled set of diffeomorphism constraints w.r.t. the nine spatial target space coordinates $X^i$ and their (world-sheet) ‘duals’ $\tilde{X}_i$ [11]. As before it is the omitted node (i.e., node 9 for the massive IIA theory) which causes failure of the construction: by SO(9, 9) covariance, the dilaton field associated with this node cannot appear in (3.48). Accordingly, it is now the generator $e_9$ which does not annihilate the relevant component of $L^{(-2)}$. To summarize: the failure of the constraint to be a highest weight vector w.r.t. the full $E_{10}$ algebra is invariably associated with the node that has been deleted for the given level decomposition. In Appendix C we show that a related statement applies to the dependence of the constraints on the Cartan subalgebra generators.

One further interesting aspect of our construction is that, to start with, it associates a constraint with every Weyl image of the fundamental null root −δ. In the same way one can associate constraints to the Weyl images of −nδ and in this way obtain a constraint for every $E_{10}$ root on the (past) light-cone. After choosing a level decomposition subalgebra one generates additional constraints inside the light-cone by covariantization under this subalgebra. It is possible that, as indicated in Subsect. 3.1.1, the set of roots C ‘supporting’ the full set of constraints is universally given by all the weights inside the (past) light-cone. This set can also be described as the union of the weight diagrams of the representations $L(\Lambda_1)$, $L(\Lambda_1) \otimes L(\Lambda_1)$, $L(\Lambda_1) \otimes L(\Lambda_1) \otimes L(\Lambda_1)$, etc. On the other hand, the precise Sugawara-like expression defining the constraint $L_\alpha$ associated to some α ∈ C seems to depend on the choice of a level decomposition.

Finally, note that since we are defining an infinity of constraints associated with all null roots of the hyperbolic algebra $E_{10}$, one might worry whether there are any solutions that satisfy the geodesic equation and all the Sugawara constraints. It is reassuring to note that there are such solutions, for example the Kasner cosmologies. These correspond to only the Cartan subalgebra components of the current being non-vanishing, and hence all constraints except the Hamiltonian constraint (3.6) are trivially satisfied. Other solutions correspond to specific cases of Bianchi cosmologies. The exact count of the remaining number of degrees of freedom is quite involved and beyond the scope of this paper.

Acknowledgements. We would like to thank Ofer Gabber and Victor Kac for informative discussions. AK is a Research Associate of the Fonds de la Recherche–FNRS, Belgium, and would like to thank IHES and AEI for hospitality. This work has been supported in part by IISN-Belgium (conventions 4.4511.06, 4.4505.86 and 4.4514.08) and by the Belgian Federal Science Policy Office through the Interuniversity Attraction Pole P6/11.

A. Level Decompositions

For the reader’s convenience, we collect in this appendix some results on the level decompositions of $E_{10}$ appropriate for D = 11 supergravity and for type IIB in D = 10. These appeared originally in [3,29] and [13,21], respectively.

A.1. Level decomposition under A9. The A9 ≅ sl(10) subalgebra relevant for D = 11 supergravity is obtained by removing node 10 from the Dynkin diagram of Fig. 1.

ℓ   A9 Dynkin labels        E10 root for lowest weight            μ    α²
0   [1,0,0,0,0,0,0,0,1]     (−1,−1,−1,−1,−1,−1,−1,−1,−1,0)        1    2
0   [0,0,0,0,0,0,0,0,0]     (0,0,0,0,0,0,0,0,0,0)                 1    0
1   [0,0,0,0,0,0,1,0,0]     (0,0,0,0,0,0,0,0,0,1)                 1    2
2   [0,0,0,1,0,0,0,0,0]     (0,0,0,0,1,2,3,2,1,2)                 1    2
3   [0,1,0,0,0,0,0,0,1]     (0,0,1,2,3,4,5,3,1,3)                 1    2
3   [1,0,0,0,0,0,0,0,0]     (0,1,2,3,4,5,6,4,2,3)                 0    0
4   [0,0,0,0,0,0,0,0,2]     (1,2,3,4,5,6,7,4,1,4)                 1    2
4   [0,0,0,0,0,0,0,1,0]     (1,2,3,4,5,6,7,4,2,4)                 0    0
4   [1,0,0,0,0,0,1,0,0]     (0,1,2,3,4,5,6,4,2,4)                 1    2
5   [0,0,0,0,0,1,0,0,1]     (1,2,3,4,5,6,8,5,2,5)                 1    2
5   [0,0,0,0,1,0,0,0,0]     (1,2,3,4,5,7,9,6,3,5)                 0    0
5   [1,0,0,1,0,0,0,0,0]     (0,1,2,3,5,7,9,6,3,5)                 1    2
6   [0,0,0,1,0,0,0,1,0]     (1,2,3,4,6,8,10,6,3,6)                1    2
6   [0,0,1,0,0,0,0,0,1]     (1,2,3,5,7,9,11,7,3,6)                1    0
6   [0,1,0,0,0,0,0,0,0]     (1,2,4,6,8,10,12,8,4,6)               1    −2
6   [1,1,0,0,0,0,0,0,1]     (0,1,3,5,7,9,11,7,3,6)                1    2
6   [2,0,0,0,0,0,0,0,0]     (0,2,4,6,8,10,12,8,4,6)               0    0
7   [0,0,0,0,0,0,0,0,1]     (2,4,6,8,10,12,14,9,4,7)              1    −4
7   [0,0,1,0,0,1,0,0,0]     (1,2,3,5,7,9,12,8,4,7)                1    2
7   [0,1,0,0,0,0,0,1,1]     (1,2,4,6,8,10,12,7,3,7)               1    2
7   [0,1,0,0,0,0,1,0,0]     (1,2,4,6,8,10,12,8,4,7)               1    0
7   [1,0,0,0,0,0,0,0,2]     (1,3,5,7,9,11,13,8,3,7)               1    0
7   [1,0,0,0,0,0,0,1,0]     (1,3,5,7,9,11,13,8,4,7)               2    −2
7   [2,0,0,0,0,0,1,0,0]     (0,2,4,6,8,10,12,8,4,7)               1    2
8   [0,0,0,0,0,0,0,1,2]     (2,4,6,8,10,12,14,8,3,8)              1    2
8   [0,0,0,0,0,0,0,2,0]     (2,4,6,8,10,12,14,8,4,8)              0    0
8   [0,0,0,0,0,0,1,0,1]     (2,4,6,8,10,12,14,9,4,8)              2    −2
8   [0,0,0,0,0,1,0,0,0]     (2,4,6,8,10,12,15,10,5,8)             2    −4
8   [0,0,2,0,0,0,0,0,0]     (1,2,3,6,9,12,15,10,5,8)              1    2
8   [0,1,0,0,1,0,0,0,1]     (1,2,4,6,8,11,14,9,4,8)               1    2
8   [0,1,0,1,0,0,0,0,0]     (1,2,4,6,9,12,15,10,5,8)              1    0
8   [1,0,0,0,0,0,1,1,0]     (1,3,5,7,9,11,13,8,4,8)               1    2
8   [1,0,0,0,0,1,0,0,1]     (1,3,5,7,9,11,14,9,4,8)               2    0
8   [1,0,0,0,1,0,0,0,0]     (1,3,5,7,9,12,15,10,5,8)              2    −2
8   [2,0,0,1,0,0,0,0,0]     (0,2,4,6,9,12,15,10,5,8)              1    2
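The α² column of the A9 table can be recomputed from the root coefficients alone, which gives a quick consistency check on a few entries. The sketch below assumes a node numbering for the $E_{10}$ Dynkin diagram (nodes 1 to 9 on a line, node 10 attached to node 7) that is inferred from (A.2) and (A.3), since Fig. 1 is not reproduced here; the selection of rows is arbitrary.

```python
# Sketch: recompute alpha^2 and the A9 level for selected rows of the table.
# Assumption: E10 nodes 1..9 form a line and node 10 is attached to node 7.

def e10_cartan_matrix():
    n = 10
    edges = [(i, i + 1) for i in range(8)] + [(6, 9)]  # 0-based node labels
    a = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = -1
    return a

def norm_squared(root, a):
    """alpha^2 for a root given by its coefficients on the simple roots."""
    n = len(root)
    return sum(root[i] * a[i][j] * root[j] for i in range(n) for j in range(n))

def a9_level(root):
    """A9 level: the coefficient of the deleted node 10."""
    return root[9]

# (level, root, alpha^2) copied from the table.
ROWS = [
    (0, (-1, -1, -1, -1, -1, -1, -1, -1, -1, 0), 2),
    (1, (0, 0, 0, 0, 0, 0, 0, 0, 0, 1), 2),
    (2, (0, 0, 0, 0, 1, 2, 3, 2, 1, 2), 2),
    (3, (0, 1, 2, 3, 4, 5, 6, 4, 2, 3), 0),
    (5, (1, 2, 3, 4, 5, 6, 8, 5, 2, 5), 2),
    (6, (1, 2, 4, 6, 8, 10, 12, 8, 4, 6), -2),
    (7, (2, 4, 6, 8, 10, 12, 14, 9, 4, 7), -4),
]

A = e10_cartan_matrix()
mismatches = [
    row for row in ROWS
    if a9_level(row[1]) != row[0] or norm_squared(row[1], A) != row[2]
]
```

An empty `mismatches` list means the selected rows are consistent with the assumed diagram; the check says nothing about the multiplicities μ.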

The low-lying generators are denoted by (a, b, … = 1, …, 10)

$$ \begin{aligned} \ell = 0 &: \; K^{a}{}_{b}, \\ \ell = 1 &: \; E^{abc} = E^{[abc]}, \\ \ell = 2 &: \; E^{a_1\ldots a_6} = E^{[a_1\ldots a_6]}, \\ \ell = 3 &: \; E^{a_0|a_1\ldots a_8} = E^{a_0|[a_1\ldots a_8]}, \\ \ell = 4 &: \; E^{a_1a_2a_3|b_1\ldots b_9} = E^{[a_1a_2a_3]|[b_1\ldots b_9]} \;\text{ and }\; E^{a|b|c_1\ldots c_{10}} = E^{(a|b)|[c_1\ldots c_{10}]} \end{aligned} \tag{A.1} $$

(with the usual irreducibility conditions $E^{[a_0|a_1\ldots a_8]} = 0$, etc.). They are related to the Chevalley–Serre generators by

$$ e_i = K^{i}{}_{i+1}, \qquad f_i = K^{i+1}{}_{i}, \qquad h_i = K^{i}{}_{i} - K^{i+1}{}_{i+1} \qquad (i = 1, \ldots, 9) \tag{A.2} $$

and

$$ e_{10} = E^{8\,9\,10}, \qquad f_{10} = F_{8\,9\,10}, \qquad h_{10} = -\frac{1}{3} K + K^{8}{}_{8} + K^{9}{}_{9} + K^{10}{}_{10}, \tag{A.3} $$

where $K = \sum_{a=1}^{10} K^{a}{}_{a}$. Commutation relations for these generators can be found in [12,14], but note that we have rescaled all generators such that their lowest weight elements (e.g. $E^{10|3\,4\,5\,6\,7\,8\,9\,10}$) have norm 1.
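The gl(10) part of the identification (A.2) and (A.3) can be tested in the defining representation, where $K^a{}_b$ acts as a 10 × 10 matrix unit. The sketch below makes two assumptions that are not stated in the text: that the matrix units realize the commutator $[K^a{}_b, K^c{}_d] = \delta^c_b K^a{}_d - \delta^a_d K^c{}_b$, and that the $E_{10}$ Dynkin diagram has nodes 1 to 9 on a line with node 10 attached to node 7. The generators $e_{10}$ and $f_{10}$ lie outside gl(10) and are not modelled; only the action of $h_{10}$ on the $e_j$ with j ≤ 9 is checked.

```python
from fractions import Fraction

N = 10  # gl(10): indices a, b = 1..10

def zero():
    return [[Fraction(0)] * N for _ in range(N)]

def K(a, b):
    """Matrix unit standing in for K^a_b (1-based indices)."""
    m = zero()
    m[a - 1][b - 1] = Fraction(1)
    return m

def lin(*terms):
    """Linear combination of matrices from (coefficient, matrix) pairs."""
    out = zero()
    for c, m in terms:
        for i in range(N):
            for j in range(N):
                out[i][j] += c * m[i][j]
    return out

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def comm(x, y):
    return lin((1, mul(x, y)), (-1, mul(y, x)))

def cartan(i, j):
    """Assumed E10 Cartan matrix entry for 1-based node labels i, j."""
    if i == j:
        return 2
    edges = {(k, k + 1) for k in range(1, 9)} | {(7, 10)}
    return -1 if (min(i, j), max(i, j)) in edges else 0

e = {i: K(i, i + 1) for i in range(1, 10)}
f = {i: K(i + 1, i) for i in range(1, 10)}
h = {i: lin((1, K(i, i)), (-1, K(i + 1, i + 1))) for i in range(1, 10)}
trace_K = lin(*[(1, K(a, a)) for a in range(1, 11)])
h[10] = lin((Fraction(-1, 3), trace_K), (1, K(8, 8)), (1, K(9, 9)), (1, K(10, 10)))

# [h_i, e_j] = a_ij e_j for all ten Cartan generators and the nine gl(10) raising operators.
cartan_ok = all(comm(h[i], e[j]) == lin((cartan(i, j), e[j]))
                for i in range(1, 11) for j in range(1, 10))
# [e_i, f_j] = delta_ij h_i inside gl(10).
ef_ok = all(comm(e[i], f[j]) == (h[i] if i == j else zero())
            for i in range(1, 10) for j in range(1, 10))
```

In this representation the trace K is central, so the term −K/3 in $h_{10}$ drops out of every commutator computed here; its coefficient would only be fixed by the relation with $e_{10}$, which this sketch does not test.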


A.2. Level decomposition under A8 ⊕ A1. The A8 ⊕ A1 ≅ sl(9) ⊕ sl(2) subalgebra relevant for type IIB supergravity is obtained by removing node 8 from the Dynkin diagram of Fig. 1.

ℓ   A8 ⊕ A1 Dynkin labels     E10 root for lowest weight            μ    α²
0   [1,0,0,0,0,0,0,1][0]      (−1,−1,−1,−1,−1,−1,−1,0,0,−1)         1    2
0   [0,0,0,0,0,0,0,0][0]      (0,0,0,0,0,0,0,0,0,0)                 1    0
0   [0,0,0,0,0,0,0,0][2]      (0,0,0,0,0,0,0,0,−1,0)                1    2
1   [0,0,0,0,0,0,1,0][1]      (0,0,0,0,0,0,0,1,0,0)                 1    2
2   [0,0,0,0,1,0,0,0][0]      (0,0,0,0,0,1,2,2,1,1)                 1    2
3   [0,0,1,0,0,0,0,0][1]      (0,0,0,1,2,3,4,3,1,2)                 1    2
4   [0,1,0,0,0,0,0,1][0]      (0,0,1,2,3,4,5,4,2,2)                 1    2
4   [1,0,0,0,0,0,0,0][0]      (0,1,2,3,4,5,6,4,2,3)                 0    0
4   [1,0,0,0,0,0,0,0][2]      (0,1,2,3,4,5,6,4,1,3)                 1    2
5   [0,0,0,0,0,0,0,1][1]      (1,2,3,4,5,6,7,5,2,3)                 1    0
5   [1,0,0,0,0,0,1,0][1]      (0,1,2,3,4,5,6,5,2,3)                 1    2
6   [0,0,0,0,0,0,1,1][0]      (1,2,3,4,5,6,7,6,3,3)                 1    2
6   [0,0,0,0,0,1,0,0][0]      (1,2,3,4,5,6,8,6,3,4)                 0    0
6   [0,0,0,0,0,1,0,0][2]      (1,2,3,4,5,6,8,6,2,4)                 1    2
6   [1,0,0,0,1,0,0,0][0]      (0,1,2,3,4,6,8,6,3,4)                 1    2
7   [0,0,0,0,1,0,0,1][1]      (1,2,3,4,5,7,9,7,3,4)                 1    2
7   [0,0,0,1,0,0,0,0][1]      (1,2,3,4,6,8,10,7,3,5)                1    0
7   [1,0,1,0,0,0,0,0][1]      (0,1,2,4,6,8,10,7,3,5)                1    2
8   [0,0,0,1,0,0,1,0][0]      (1,2,3,4,6,8,10,8,4,5)                1    2
8   [0,0,1,0,0,0,0,1][0]      (1,2,3,5,7,9,11,8,4,5)                1    0
8   [0,0,1,0,0,0,0,1][2]      (1,2,3,5,7,9,11,8,3,5)                1    2
8   [0,1,0,0,0,0,0,0][0]      (1,2,4,6,8,10,12,8,4,6)               2    −2
8   [0,1,0,0,0,0,0,0][2]      (1,2,4,6,8,10,12,8,3,6)               1    0
8   [1,1,0,0,0,0,0,1][0]      (0,1,3,5,7,9,11,8,4,5)                1    2
8   [2,0,0,0,0,0,0,0][0]      (0,2,4,6,8,10,12,8,4,6)               0    0
8   [2,0,0,0,0,0,0,0][2]      (0,2,4,6,8,10,12,8,3,6)               1    2

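The low-lying generators are (now a, b, … = 1, …, 9 are sl(9) vector indices and $\alpha, \beta = \dot{1}, \dot{2}$ are sl(2) vector indices)

$$ \begin{aligned} \ell = 0 &: \; K^{a}{}_{b} \;\text{ and }\; K^{\alpha}{}_{\beta} \;\; (\text{with } \delta^{\beta}_{\alpha} K^{\alpha}{}_{\beta} = 0), \\ \ell = 1 &: \; E^{ab,\alpha} = E^{[ab],\alpha}, \\ \ell = 2 &: \; E^{a_1a_2a_3a_4} = E^{[a_1a_2a_3a_4]}, \\ \ell = 3 &: \; E^{a_1\ldots a_6,\alpha} = E^{[a_1\ldots a_6],\alpha}, \\ \ell = 4 &: \; E^{a_0|a_1\ldots a_7} = E^{a_0|[a_1\ldots a_7]} \;\text{ and }\; E^{a_1\ldots a_8,\alpha\beta} = E^{[a_1\ldots a_8],(\alpha\beta)}. \end{aligned} \tag{A.4} $$

The relation to the Chevalley–Serre generators is now given by

$$ \begin{aligned} e_i &= K^{i}{}_{i+1}, & f_i &= K^{i+1}{}_{i}, & h_i &= K^{i}{}_{i} - K^{i+1}{}_{i+1} \quad (i = 1, \ldots, 7), \\ e_{10} &= K^{8}{}_{9}, & f_{10} &= K^{9}{}_{8}, & h_{10} &= K^{8}{}_{8} - K^{9}{}_{9}, \\ e_9 &= K^{\dot{1}}{}_{\dot{2}}, & f_9 &= K^{\dot{2}}{}_{\dot{1}}, & h_9 &= K^{\dot{1}}{}_{\dot{1}} - K^{\dot{2}}{}_{\dot{2}}. \end{aligned} \tag{A.5} $$

The explicit dots on the indices indicate numerical values for sl(2) vector indices. For the deleted node 8 one has

$$ e_8 = E^{8\,9,\dot{2}}, \qquad f_8 = F_{8\,9,\dot{2}}, \qquad h_8 = -\frac{1}{4} K + K^{8}{}_{8} + K^{9}{}_{9} - \frac{1}{2}\left( K^{\dot{1}}{}_{\dot{1}} - K^{\dot{2}}{}_{\dot{2}} \right), \tag{A.6} $$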

where now $K = \sum_{a=1}^{9} K^{a}{}_{a}$ is the trace in gl(9). Commutation relations for these generators can be found in [13], where we used an so(1, 2) spinor and vector notation instead of sl(2) tensors as above.

B. Constraints of Type IIB Supergravity and Universality

The Einstein equation of motion of IIB supergravity can be written as

$$ R_{AB} = -\frac{1}{4}\, S^{\alpha}_{A} S_{B,\alpha} + \frac{1}{96}\, F_{A}{}^{C_1\ldots C_4} F_{B C_1\ldots C_4} + \frac{1}{4}\, H_{A}{}^{C_1C_2,\alpha} H_{B C_1C_2,\alpha} - \frac{1}{48}\, \eta_{AB}\, H^{C_1C_2C_3,\alpha} H_{C_1C_2C_3,\alpha} \tag{B.1} $$

in flat indices, where we corrected a factor of two compared to [13]. The diffeomorphism constraint is obtained as the 0a component of this equation. Using self-duality of F and the dictionary of [13] one finds, up to overall normalization, the expression

$$ C^{(-4)\,m_1\ldots m_8} = \frac{35}{3}\, I^{(-2)\,m_1\ldots m_4} I^{(-2)\,m_5\ldots m_8} + \frac{28}{3}\, \epsilon_{\alpha\beta}\, I^{(-1)\,m_1m_2,\alpha} I^{(-3)\,m_3\ldots m_8,\beta} + \frac{1}{3}\, \epsilon_{\alpha\beta}\, I^{(-4)\,m_1\ldots m_8,\alpha\gamma} I^{(0)\,\beta}{}_{\gamma} + \frac{8}{3}\, I^{(-4)\,p|m_1\ldots m_7} I^{(0)\,m_8}{}_{p} \tag{B.2} $$

in terms of the $E_{10}$ current components in A8 ⊕ A1 decomposition.

C. Explicit Expressions Involving Cartan Generators

In this appendix, we give explicit expressions for the contractions between the Cartan subalgebra and the δ root space to show that the A9 and A8 ⊕ A1 covariant expressions (3.36) and (3.37) differ, thereby illustrating that (3.31) is indeed only valid on contractions of real root spaces. Consider the contributions from the Cartan subalgebra to the highest component of (3.36). They come exclusively from the $J^{(-3)} J^{(0)}$ contraction and are

$$ \begin{aligned} 3\, L^{(-3)\,2\,3\,4\,5\,6\,7\,8\,9\,10} \;\ni\;& J^{(-3)\,2|3\,4\,5\,6\,7\,8\,9\,10} J^{(0)\,2}{}_{2} + J^{(-3)\,3|4\,5\,6\,7\,8\,9\,10\,2} J^{(0)\,3}{}_{3} + \cdots + J^{(-3)\,10|2\,3\,4\,5\,6\,7\,8\,9} J^{(0)\,10}{}_{10} \\ =\;& J^{(-3)\,2|3\,4\,5\,6\,7\,8\,9\,10} (h_2 + h_3 + \cdots + h_8 + h_9) + \cdots + J^{(-3)\,9|10\,2\,3\,4\,5\,6\,7\,8}\, h_9, \end{aligned} \tag{C.1} $$

where the hook symmetry

$$ J^{(-3)\,[2|3\,4\,5\,6\,7\,8\,9\,10]} = 0 \tag{C.2} $$

of the level three element was used and we identified for simplicity the current component with the corresponding Cartan generators using (A.2) and (A.3). We see that only the Cartan generators of the A9 ‘gravity line’ appear in this contraction. The only ‘missing’ ones are the one from the deleted node 10 and the hyperbolic node 1. The latter is related to our choice of (highest) component. Repeating the same calculation for the A8 ⊕ A1 decomposition and (3.37) one finds similarly

$$ \begin{aligned} 3\, C^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9} \;\ni\;& I^{(-4)\,2|3\,4\,5\,6\,7\,8\,9} I^{(0)\,2}{}_{2} + \cdots + I^{(-4)\,9|2\,3\,4\,5\,6\,7\,8} I^{(0)\,9}{}_{9} + I^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9,\dot{1}\dot{2}} \left( I^{(0)\,\dot{1}}{}_{\dot{1}} - I^{(0)\,\dot{2}}{}_{\dot{2}} \right) \\ =\;& I^{(-4)\,2|3\,4\,5\,6\,7\,8\,9} (h_2 + \cdots + h_7 + h_{10}) + \cdots + I^{(-4)\,8|9\,2\,3\,4\,5\,6\,7}\, h_{10} + I^{(-4)\,2\,3\,4\,5\,6\,7\,8\,9,\dot{1}\dot{2}}\, h_9. \end{aligned} \tag{C.3} $$

The Cartan generators that appear in this expression are those from the A8 ⊕ A1 gravity line, so the ‘missing’ generators are that of the deleted node 8 and of the hyperbolic node 1. The latter is again related to our choice of component of the diffeomorphism constraint, so that the real discrepancy between the two expressions can be traced again to the different deleted nodes. This is related to the failure of this constraint to be a highest weight vector, see the expressions (3.45) and (3.47). Finally, it is clear that the Cartan generator missing in (3.48) is $h_9$, as the diagonal generators among the SO(9, 9) generators $J^{(0)\,KL}$ are identified with $h_1, \ldots, h_8, h_{10}$, while $h_9$ is associated with the dilaton, again confirming our general conclusion.

References
1. DeWitt, B.S.: Quantum Theory of Gravity. 1. The Canonical Theory. Phys. Rev. 160, 1113 (1967)
2. Kiefer, C.: Quantum gravity. Int. Ser. Monogr. Phys. 124, Oxford: Oxford University Press, 2004
3. Damour, T., Henneaux, M., Nicolai, H.: E10 and a ‘small tension expansion’ of M Theory. Phys. Rev. Lett. 89, 221601 (2002)
4. Damour, T., Kleinschmidt, A., Nicolai, H.: Constraints and the E10 Coset Model. Class. Quant. Grav. 24, 6097 (2007)
5. Sugawara, H.: A Field theory of currents. Phys. Rev. 170, 1659 (1968)
6. Bardakçi, K., Halpern, M.B.: New dual quark models. Phys. Rev. D 3, 2493 (1971)
7. Goddard, P., Olive, D.I.: Kac-Moody And Virasoro Algebras In Relation To Quantum Physics. Int. J. Mod. Phys. A 1, 303 (1986)
8. Kleinschmidt, A., Koehn, M., Nicolai, H.: Supersymmetric quantum cosmological billiards. Phys. Rev. D 80, 061701 (2009)
9. Forte, L.A.: Arithmetical Chaos and Quantum Cosmology. Class. Quant. Grav. 26, 045001 (2009)
10. Goddard, P., Thorn, C.B.: Compatibility of the Dual Pomeron with Unitarity and the Absence of Ghosts in the Dual Resonance Model. Phys. Lett. B 40, 235 (1972)
11. Kleinschmidt, A., Nicolai, H.: E10 and SO(9, 9) invariant supergravity. JHEP 0407, 041 (2004)
12. Damour, T., Nicolai, H.: Eleven dimensional supergravity and the E10/K(E10) sigma-model at low A9 levels. In: Pogoyan, G.S., Vicent, L.E., Wolf, K.B. (eds.) Group Theoretical Methods in Physics. IOP conference series no. 185, pp. 93–111. IOP Publishing (2005)
13. Kleinschmidt, A., Nicolai, H.: IIB supergravity and E10. Phys. Lett. B 606, 391 (2005)
14. Henneaux, M., Jamsin, E., Kleinschmidt, A., Persson, D.: On the E10/Massive Type IIA Supergravity Correspondence. Phys. Rev. D 79, 045008 (2009)
15. Nicolai, H., Samtleben, H.A.J.: On K(E9). Q.J. Pure Appl. Math. 1, 180 (2005)
16. West, P.C.: E11 and M theory. Class. Quant. Grav. 18, 4443 (2001)
17. West, P.C.: E(11), SL(32) and central charges. Phys. Lett. B 575, 333 (2003)
18. Riccioni, F., West, P.: Local E11. JHEP 0904, 051 (2009)
19. Schnakenburg, I., West, P.C.: Kac-Moody symmetries of IIB supergravity. Phys. Lett. B 517, 421 (2001)
20. Schnakenburg, I., West, P.C.: Massive IIA supergravity as a non-linear realisation. Phys. Lett. B 540, 137 (2002)
21. Kleinschmidt, A., Schnakenburg, I., West, P.C.: Very-extended Kac-Moody algebras and their interpretation at low levels. Class. Quant. Grav. 21, 2493 (2004)
22. West, P.C.: The IIA, IIB and eleven-dimensional theories and their common E(11) origin. Nucl. Phys. B 693, 76 (2004)
23. Morozov, A.Y., Perelomov, A.M., Roslyi, A.A., Shifman, M.A., Turbiner, A.V.: Quasiexactly Solvable Quantal Problems: One-Dimensional Analog of Rational Conformal Field Theories. Int. J. Mod. Phys. A 5, 803 (1990)
24. Halpern, M.B., Kiritsis, E.: General Virasoro Construction on Affine G. Mod. Phys. Lett. A 4, 1373 (1989)
25. Kac, V.G.: Infinite dimensional Lie algebras. Cambridge: Cambridge University Press, 1995
26. Damour, T., de Buyl, S., Henneaux, M., Schomblond, C.: Einstein billiards and overextensions of finite-dimensional simple Lie algebras. JHEP 0208, 030 (2002)
27. Gebert, R.W., Nicolai, H.: An affine string vertex operator construction at arbitrary level. J. Math. Phys. 38, 4435 (1997)
28. Gaberdiel, M.R., Olive, D.I., West, P.C.: A class of Lorentzian Kac-Moody algebras. Nucl. Phys. B 645, 403 (2002)
29. Nicolai, H., Fischbacher, T.: Low level representations for E10 and E11. Cont. Math. 343, Providence, RI: Amer. Math. Soc., 2004, p. 191
30. Damour, T., Henneaux, M., Nicolai, H.: Cosmological billiards. Class. Quant. Grav. 20, R145 (2003)
31. Damour, T., Henneaux, M., Julia, B., Nicolai, H.: Hyperbolic Kac-Moody algebras and chaos in Kaluza-Klein models. Phys. Lett. B 509, 323 (2001)
32. Kac, V., Moody, R.V., Wakimoto, M.: On E10. In: Bleuler, K., Werner, M. (eds.) “Differential geometrical methods in theoretical physics”, pp. 109–128. Dordrecht, Kluwer (1988)
33. Kleinschmidt, A., Nicolai, H., Palmkvist, J.: K(E9) from K(E10). JHEP 0706, 051 (2007)
34. Damour, T., Nicolai, H.: Symmetries, singularities and the de-emergence of space. Int. J. Mod. Phys. D 17, 525 (2008)
35. Romans, L.J.: Massive N=2a supergravity in ten-dimensions. Phys. Lett. B 169, 374 (1986)
36. Brown, J., Ganor, O.J., Helfgott, C.: M-theory and E10: Billiards, branes, and imaginary roots. JHEP 0408, 063 (2004)

Communicated by N.A. Nekrasov

Commun. Math. Phys. 302, 789–813 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1189-x


Dorey’s Rule and the q-Characters of Simply-Laced Quantum Affine Algebras

C. A. S. Young1, R. Zegers2

1 Yukawa Institute for Theoretical Physics, Kyoto University, Kyoto 606-8502, Japan. E-mail: [email protected]
2 Laboratoire de Physique Théorique, Université Paris-Sud 11/CNRS, 91405 Orsay Cedex, France. E-mail: [email protected]

Received: 3 January 2010 / Accepted: 3 October 2010
Published online: 8 February 2011 – © Springer-Verlag 2011

Abstract: Let $U_q(\hat{g})$ be the quantum affine algebra associated to a simply-laced simple Lie algebra g. We examine the relationship between Dorey’s rule, which is a geometrical statement about Coxeter orbits of g-weights, and the structure of q-characters of fundamental representations $V_{i,a}$ of $U_q(\hat{g})$. In particular, we prove, without recourse to the ADE classification, that the rule provides a necessary and sufficient condition for the monomial 1 to appear in the q-character of a three-fold tensor product $V_{i,a} \otimes V_{j,b} \otimes V_{k,c}$.

1. Introduction

1.1. Background. This paper concerns the relationship between the representation theory of simply-laced quantum affine algebras on the one hand, and, on the other, the particle fusing rule originally given by Dorey in the context of affine Toda field theories. Recall that Affine Toda Field Theories (ATFTs) are integrable quantum field theories in 1+1 dimensions [Cor94]. Let g be any simply-laced simple Lie algebra, and I the set of nodes of the Dynkin diagram of g. The (real coupling, purely elastic) ATFT associated to the untwisted affine algebra $\hat{g}$ has rank(g) species of particles, labelled by the nodes i ∈ I. The root system data of g determine not only the masses of these particles, but also the allowed fusings: if particles of species j ∈ I and i ∈ I can interact to form a particle of species $\bar{k}$ ∈ I one says there is a fusing $j, i \to \bar{k}$, and this process can occur only if the rapidities $\theta_i$, $\theta_j$ of the incoming particles are related by

$$ \theta_i - \theta_j = \sqrt{-1}\; \theta^{k}_{ji}, \tag{1.1} $$

where $\theta^{k}_{ji}$ is a real angle, called the fusing angle. If there is a fusing $j, i \to \bar{k}$ then there are also fusings $i, k \to \bar{j}$ and $k, j \to \bar{\imath}$, and the fusing angles obey

$$ \theta^{k}_{ji} + \theta^{j}_{ik} + \theta^{i}_{kj} = 2\pi. \tag{1.2} $$


The problem of determining the masses, fusings and fusing angles for the ATFTs associated to all simple Lie algebras (simply-laced or not) was solved in [BCDS90]. It was observed in that paper that the allowed fusings form a strict subset of the non-zero Clebsch-Gordan coefficients for g, in the sense that if $i, j \to \bar{k}$ is a fusing then

$$ \mathrm{Hom}_{g}\left( V_i \otimes V_j , V_{\bar{k}} \right) \cong \mathrm{Hom}_{g}\left( V_i \otimes V_j \otimes V_k , \mathbb{C} \right) \neq 0, \tag{1.3} $$

where $V_i$ is the i-th fundamental representation of g. It is a strict subset because the converse statement does not hold: the first counterexample is $D_5$,

[Figure: Dynkin diagram of $D_5$ with a two-colouring of its nodes]

where there is a non-trivial homomorphism $V_2 \otimes V_2 \to V_2$ of $d_5$-modules but no fusing 2, 2 → 2 in the ATFT. Soon after, this same “hole” in the allowed interactions was also found in a different (and non-diagonal) scattering theory [Mac91], giving an indication of a more general underlying structure. Subsequently, Dorey gave a rule which encodes both the pattern of allowed fusings, and the fusing angles, in an elegant geometrical fashion for all the simply-laced cases [Dor91,Dor92b,Dor92a,FLO91,FO92].1 To state the rule, we introduce some standard notation: let $(\alpha_i)_{i\in I}$ be a set of simple roots of g, $(\lambda_i)_{i\in I}$ the corresponding fundamental weights, and $a_{ij}$ the Cartan matrix:

$$ \alpha_i \cdot \alpha_j = a_{ij}, \qquad \alpha_i \cdot \lambda_j = \delta_{ij}. \tag{1.4} $$

Let W denote the Weyl group of g, generated by the reflections $(s_i)_{i\in I}$ in the simple roots. It is always possible to write I as a disjoint union

$$ I = I_\bullet \sqcup I_\circ \tag{1.5} $$

in such a way that $(I_\bullet, I_\circ)$ is a two-colouring of the Dynkin diagram (as, for example, in the case $D_5$ above). Let then w ∈ W be the choice of Coxeter element given by2

$$ w = w_\circ w_\bullet, \qquad w_\circ = \prod_{i\in I_\circ} s_i, \qquad w_\bullet = \prod_{i\in I_\bullet} s_i, \tag{1.6} $$

and write $\Gamma = \langle w \rangle$ for the cyclic subgroup of W generated by w, whose order h is the Coxeter number of g.

1 A generalization of the rule to non-simply laced cases was mentioned in [Dor93] and used in [CP96]; see also [Oot97,FKS00]. In the present work we shall focus exclusively on the simply laced cases but it would be very interesting to try to prove analogous results for any simple Lie algebra.
2 This choice will be convenient in what follows, but the rule itself is independent of the choice of Coxeter element.


Then the rule states that there is a fusing $i, j \to \bar{k}$ if and only if

$$ 0 \in \Gamma\lambda_i + \Gamma\lambda_j + \Gamma\lambda_k ; \tag{1.7} $$

that is, if and only if there are integers p, q, r such that

$$ 0 = w^{p}\lambda_i + w^{q}\lambda_j + w^{r}\lambda_k . \tag{1.8} $$

Moreover, the fusing angles $\theta^{k}_{ij}$, $\theta^{i}_{jk}$ and $\theta^{j}_{ki}$ are given by projecting this latter equation onto the exp(±2πi/h) eigenplane of w, as discussed in [Dor91,Dor92b] and recalled in Sec. 3 below. The original statement of the rule involved Coxeter orbits of roots, but it was observed in [Bra92] that the statement above in terms of weights is equivalent, essentially because (one can show that) $\phi_i := (1 - w^{-1})\lambda_i$ are a linearly independent collection of roots. Writing the rule in terms of weights is suggestive, because of the following

Theorem 1.1 (PRV [PRRV67,Kum88,Mat89]). A necessary and sufficient condition for

$$ \mathrm{Hom}_{g}\left( V_i \otimes V_j \otimes V_k , \mathbb{C} \right) \neq 0 \tag{1.9} $$

is that

$$ 0 \in W\lambda_i + W\lambda_j + W\lambda_k . \tag{1.10} $$
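Both conditions (1.7) and (1.10) are finite checks on orbits of weights, so they can be compared directly on small examples. The sketch below is an illustration under stated assumptions, not the authors' method: weights are stored by their coefficients on the fundamental weights, the Coxeter element is taken as the product of all simple reflections in a fixed order (relying on the remark that the rule does not depend on this choice), and the nodes of $D_5$ are numbered along the chain with the two spinor nodes last, so that node 2 of the text is index 1 here. All function names are hypothetical.

```python
from itertools import product

def cartan_matrix(n, edges):
    """Cartan matrix of a simply-laced Dynkin diagram on nodes 0..n-1."""
    a = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = -1
    return a

def reflect(a, i, c):
    """Simple reflection s_i acting on a weight c in the fundamental-weight basis."""
    return tuple(c[j] - c[i] * a[i][j] for j in range(len(c)))

def fundamental_weight(a, i):
    return tuple(1 if j == i else 0 for j in range(len(a)))

def coxeter_orbit(a, i):
    """Orbit of lambda_i under the cyclic group generated by one Coxeter element."""
    def w(c):
        for k in range(len(a)):  # product of all simple reflections, fixed order
            c = reflect(a, k, c)
        return c
    start = fundamental_weight(a, i)
    orbit, c = {start}, w(start)
    while c != start:
        orbit.add(c)
        c = w(c)
    return orbit

def weyl_orbit(a, i):
    """Orbit of lambda_i under the full Weyl group, by exhaustive search."""
    start = fundamental_weight(a, i)
    seen, stack = {start}, [start]
    while stack:
        c = stack.pop()
        for k in range(len(a)):
            d = reflect(a, k, c)
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return seen

def zero_in_sum(o1, o2, o3):
    """True if some u in o1, v in o2, x in o3 satisfy u + v + x = 0."""
    return any(tuple(-(p + q) for p, q in zip(u, v)) in o3
               for u, v in product(o1, o2))

def dorey(a, i, j, k):
    """Condition (1.7), using Coxeter orbits."""
    return zero_in_sum(*(coxeter_orbit(a, n) for n in (i, j, k)))

def prv(a, i, j, k):
    """Condition (1.10), using Weyl orbits."""
    return zero_in_sum(*(weyl_orbit(a, n) for n in (i, j, k)))

A2 = cartan_matrix(2, [(0, 1)])
D5 = cartan_matrix(5, [(0, 1), (1, 2), (2, 3), (2, 4)])
```

With these definitions, `prv(D5, 1, 1, 1)` holds while `dorey(D5, 1, 1, 1)` does not, which reproduces the $D_5$ counterexample quoted above; `dorey(A2, 0, 0, 0)` holds, in line with the fusing of two particles of species 1 in the $A_2$ theory.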

Now clearly (1.7) implies (1.10), but not vice versa. So in light of this result, which connects the Weyl-orbits of weights to invariants of g-representations, it is very natural to suppose that the fusing rule (1.7) plays a similar role for representations of some larger (and hence more restrictive) algebraic structure. In [Mac92], MacKay conjectured that this is indeed the case and that the relevant algebra is the Yangian Y (g). Recall that the universal envelope U ( g) of the untwisted affine algebra g has a canonical Drinfel’d-Jimbo deformation Uq ( g), called a quantum affine algebra, and that the Yangian Y (g) is the rational degeneration of Uq ( g) [Dri85,Dri88]. Y (g) and Uq ( g) share essentially the same representation theory [Var00]. There is a notion of the fundamental representations Vi,a of Uq ( g), where i ∈ I , and a ∈ C=0 is an additional label which we will call the rapidity; see e.g. [CP].3 The Vi,a are finite-dimensional and Vi ⊂ Vi,a . In the classical cases, the following theorem was proved by Chari and Pressley, confirming the conjecture above. (In fact, [CP96] considered all the classical cases ABC D, but we quote here only the result for the classical simply-laced cases AD.) Theorem 1.2 ([CP96]). A necessary and sufficient condition for HomUq (g) Vi,a ⊗ V j,b ⊗ Vk,b , C = 0 ,

(1.11)

for some rapidities a, b, c ∈ C=0 , is that 0 ∈ λi + λ j + λk .

(1.12)

1.2. Motivations and outline. Despite the positive result above, it is fair to say that a satisfactory understanding of the link between the fusing rule and the representation theory of simply-laced quantum affine algebras is still missing. Most apparently, the 3 We are of course using the word “rapidity” in two, a priori different, senses: for the kinematical label of particles in ATFT and for the spectral parameter of representations of Uq ( g). The role of Uq ( g)-symmetry in real- and imaginary-coupling affine Toda field theory is indeed rather subtle. See [TW99,SWK00] and references therein.


C. A. S. Young, R. Zegers

proof in [CP96] was case-by-case and did not include the exceptional cases $E_6$, $E_7$ and $E_8$. More importantly, part of what makes the rule (1.7) elegant is that it encodes not only the triples $(i, j, k)$ for which fusing can occur, but also the fusing angles, via the projection map mentioned above. This aspect played no role in [CP96], where the required rapidities were determined without reference to this projection map. One would like to understand why the rapidities emerge as they do from the geometry of Coxeter orbits of roots and weights. In the present paper we take a step in this direction, by relating the geometry of Coxeter orbits to the q-characters of fundamental representations of $U_q(\hat{\mathfrak g})$. The notion of q-characters, due to Frenkel and Reshetikhin [FR98], following [Kni95], is an important development in the representation theory of quantum affine algebras. Here they will allow us to give, in particular, a general proof that Dorey’s rule is a necessary condition for the existence of invariant maps, $\mathrm{Hom}_{U_q(\hat{\mathfrak g})}(V_{i,a} \otimes V_{j,b} \otimes V_{k,c}, \mathbb{C}) \neq 0$, and singlets, $\mathrm{Hom}_{U_q(\hat{\mathfrak g})}(\mathbb{C}, V_{k,c} \otimes V_{j,b} \otimes V_{i,a}) \neq 0$.

The structure of this paper is as follows: in Sect. 2 we recall the definition of $U_q(\hat{\mathfrak g})$, and the necessary details of the theory of q-characters. Then in Sect. 3 we go on to prove our main result (Theorem 3.1), which states that Dorey’s rule provides a necessary and sufficient condition for the monomial 1 to occur in the q-character of a three-fold tensor product of fundamental representations. We prove this by first showing (Lemma 3.2) that the latter statement can be rephrased as a statement about the occurrence of quadratic monomials in the q-character of a single fundamental representation. We then prove that such quadratic monomials are in a certain precise correspondence with solutions to Dorey’s rule.
Indeed, it will emerge that in fact every monomial in the q-character can very naturally be seen as specifying some identity among the Coxeter orbits of the fundamental weights of $\mathfrak g$ (Proposition 3.3). The reverse direction however (going from identities to monomials) is more subtle, and one must work harder to show (Propositions 3.4 and 3.5) that it always holds for identities of the form (1.7) above. We conclude in Sect. 4 by commenting on the relationship of our result to Theorem 1.2 above, and noting some open questions. We assume, throughout this paper, that $q \in \mathbb{C}_{\neq 0}$ is not a root of unity.

2. Quantum Affine Algebras and q-Characters

The quantum affine algebra $U_q(\hat{\mathfrak g})$ is an associative algebra over $\mathbb{C}$ generated by

$$(x_{i,n}^{\pm})_{i \in I,\, n \in \mathbb{Z}}, \qquad (k_i^{\pm 1})_{i \in I}, \qquad (h_{i,n})_{i \in I,\, n \in \mathbb{Z}_{\neq 0}}, \tag{2.1}$$

and central elements $c^{\pm 1/2}$. In this paper we study finite-dimensional representations of $U_q(\hat{\mathfrak g})$ when $\mathfrak g$ is simply laced. As we recall below, for this purpose it actually suffices to work with the quantum loop algebra $U_q(L\mathfrak g) = U_q(\hat{\mathfrak g})/(c^{\pm 1/2} - 1)$. Following [Dri88], let us arrange the generators into formal series

$$x_i^{\pm}(u) := \sum_{n \in \mathbb{Z}} x_{i,n}^{\pm} u^{-n}, \tag{2.2}$$

$$\phi_i^{\pm}(u) = \sum_{n=0}^{\infty} \phi_{i,\pm n}^{\pm} u^{\pm n} := k_i^{\pm 1} \exp\left( \pm (q - q^{-1}) \sum_{m=1}^{\infty} h_{i,\pm m} u^{\pm m} \right), \tag{2.3}$$

Dorey’s Rule and the q-Characters of Simply-Laced Quantum Affine Algebras

and set

$$\delta(u) := \sum_{n \in \mathbb{Z}} u^{n}. \tag{2.4}$$

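The defining relations of $U_q(L\mathfrak g)$ are then

$$[\phi_i^{\pm}(u), \phi_j^{\pm}(v)] = [\phi_i^{\pm}(u), \phi_j^{\mp}(v)] = 0, \tag{2.5}$$

$$\phi_i^{\pm}(u)\, x_j^{+}(v) = q^{a_{ij}}\, \frac{1 - q^{-a_{ij}} uv}{1 - q^{a_{ij}} uv}\, x_j^{+}(v)\, \phi_i^{\pm}(u), \tag{2.6}$$

$$\phi_i^{\pm}(u)\, x_j^{-}(v) = q^{-a_{ij}}\, \frac{1 - q^{a_{ij}} uv}{1 - q^{-a_{ij}} uv}\, x_j^{-}(v)\, \phi_i^{\pm}(u), \tag{2.7}$$

$$[x_i^{+}(u), x_j^{-}(v)] = \frac{\delta_{ij}}{q - q^{-1}} \left( \delta(v/u)\, \phi_i^{+}(1/v) - \delta(u/v)\, \phi_i^{-}(1/u) \right), \tag{2.8}$$

$$(u - q^{\pm a_{ij}} v)\, x_i^{\pm}(u)\, x_j^{\pm}(v) = (q^{\pm a_{ij}} u - v)\, x_j^{\pm}(v)\, x_i^{\pm}(u), \tag{2.9}$$

$$x_i^{\pm}(u) x_i^{\pm}(v) x_j^{\pm}(w) - (q + q^{-1})\, x_i^{\pm}(u) x_j^{\pm}(w) x_i^{\pm}(v) + x_j^{\pm}(w) x_i^{\pm}(u) x_i^{\pm}(v) + (u \leftrightarrow v) = 0 \quad \text{if } a_{ij} = -1, \tag{2.10}$$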

where $a_{ij}$ is the Cartan matrix of $\mathfrak g$. As we shall see, this presentation, which is a slightly modified version of Drinfel’d’s current presentation [Dri88], is convenient because the $\phi_i^{\pm}(u)$ and $x_i^{\pm}(u)$ behave analogously to the usual Cartan generators and raising/lowering operators in the representation theory of finite-dimensional simple Lie algebras. From its origin as a standard Drinfel’d-Jimbo deformation of $U(\hat{\mathfrak g})$, $U_q(\hat{\mathfrak g})$ admits a standard Hopf algebra structure $U_q(\hat{\mathfrak g})_{\mathrm{std}}$; see e.g. [CP]. No closed form is known for the standard coproduct in the current presentation above. As we note in the conclusion, there does exist another (twist-equivalent [EKP07]) Hopf algebra structure for $U_q(\hat{\mathfrak g})$ better suited to the current presentation; for details see [Her05,Her07a,Gro01].

2.1. Representations and characters. A representation $V$ of $U_q(\hat{\mathfrak g})$ is of type 1 if $c^{\pm 1/2}$ acts as the identity on $V$ and $V$ is the direct sum of its $U_q(\mathfrak g)$-weight spaces,

$$V = \bigoplus_{\lambda} V_{\lambda}, \quad \text{where} \quad V_{\lambda} = \{ v \in V : k_i v = q^{\langle \alpha_i, \lambda \rangle} v \} \tag{2.11}$$

and $\lambda$ in the weight lattice of $\mathfrak g$. We recall (see e.g. [CP] Chap. 12.2B) that any finite-dimensional irreducible representation of $U_q(\hat{\mathfrak g})$ can be obtained by twisting, by an automorphism of $U_q(\hat{\mathfrak g})$, a finite-dimensional type 1 representation. Thus it suffices for our purposes to consider type 1 representations, and to regard them as representations of $U_q(L\mathfrak g)$. Any type 1 representation $V$ of $U_q(\hat{\mathfrak g})$ also furnishes a representation of $U_q(\mathfrak g)$ (the latter being the subalgebra generated by $(x_{i,0}^{\pm})_{i \in I}$, $(k_i^{\pm 1})_{i \in I}$). Recall that the character $\chi(V)$ of $V$ regarded as a $U_q(\mathfrak g)$-module is defined as

$$\chi(V) = \sum_{\lambda} \dim(V_{\lambda})\, e^{\lambda}. \tag{2.12}$$

If $\mathrm{Rep}(U_q(\mathfrak g))$ is the category whose objects are finite-dimensional representations of $U_q(\mathfrak g)$ and whose morphisms are homomorphisms of $U_q(\mathfrak g)$-modules, then the Grothendieck ring $\mathrm{Rep}(U_q(\mathfrak g))$ is the ring generated by the isomorphism classes of objects in $\mathrm{Rep}(U_q(\mathfrak g))$ subject to the relations $[X][Y] = [X \otimes Y]$ and, for each exact sequence $0 \to U \to W \to V \to 0$ of $U_q(\mathfrak g)$-modules, $[W] = [U] + [V]$. The character map $\chi$ is a homomorphism of rings

$$\chi : \mathrm{Rep}(U_q(\mathfrak g)) \longrightarrow \mathbb{Z}\left[ y_i^{\pm 1} \right]_{i \in I} \tag{2.13}$$

to the ring of polynomials in variables $y_i^{\pm 1} = e^{\pm \lambda_i}$. Let us pause to recall that $\mathrm{Rep}(U_q(\mathfrak g))$, like $\mathrm{Rep}(U(\mathfrak g))$, is a semisimple category: exact sequences $0 \to U \to W \to V \to 0$ exist precisely when $W = U \oplus V$ as $U_q(\mathfrak g)$-modules; and thus the defining relations of $\mathrm{Rep}(U_q(\mathfrak g))$ are in fact just $[U][V] = [U \otimes V]$ and $[U] + [V] = [U \oplus V]$. In contrast, representations of $U_q(\hat{\mathfrak g})$ can be reducible but not fully-reducible. That is, it can happen that there is a short exact sequence $0 \to U \to W \to V \to 0$ of $U_q(\hat{\mathfrak g})$-modules, so that $U$ is a submodule of $W$, but that $W$ is not the direct sum $U \oplus V$ as a $U_q(\hat{\mathfrak g})$-module. One says that $W$ is indecomposable.

Now for any type 1 representation $V$ of $U_q(\hat{\mathfrak g})$, the decomposition above into $U_q(\mathfrak g)$-weight spaces may be further refined by decomposing $V$ into Jordan subspaces of the mutually commuting $\phi_{i,\pm r}^{\pm}$ defined in (2.3), [FR98]:

$$V = \bigoplus_{\gamma} V_{\gamma}, \qquad \gamma = (\gamma_{i,\pm r}^{\pm})_{i \in I,\, r \in \mathbb{N}}, \qquad \gamma_{i,\pm r}^{\pm} \in \mathbb{C}, \tag{2.14}$$

where

$$V_{\gamma} = \{ v \in V : \exists N \in \mathbb{N},\ \forall i \in I,\ \left( \phi_i^{\pm}(u) - \gamma_i^{\pm}(u) \right)^{N} v = 0 \}. \tag{2.15}$$

If $\dim(V_{\gamma}) > 0$, we shall refer to the corresponding formal series

$$\forall i \in I, \quad \gamma_i^{\pm}(u) := \sum_{r \in \mathbb{N}} \gamma_{i,\pm r}^{\pm} u^{\pm r} \tag{2.16}$$

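as an l-weight of $V$. It is known [FR98] that for every finite-dimensional type 1 representation of $U_q(\hat{\mathfrak g})$, these l-weights are of the form

$$\gamma_i^{\pm}(u) = q^{\deg Q_i - \deg R_i}\, \frac{Q_i(uq^{-1})\, R_i(uq)}{Q_i(uq)\, R_i(uq^{-1})}, \tag{2.17}$$

where the right hand side is to be treated as a formal series in positive (negative) integer powers of $u$ for $\gamma_i^{+}(u)$ (respectively $\gamma_i^{-}(u)$), and $Q_i$ and $R_i$ are polynomials with constant term 1. These latter may be written as

$$Q_i(u) = \prod_{a \in \mathbb{C}_{\neq 0}} (1 - ua)^{q_{i,a}}, \qquad R_i(u) = \prod_{a \in \mathbb{C}_{\neq 0}} (1 - ua)^{r_{i,a}}, \tag{2.18}$$

and this allows one to assign to $\gamma$ a monomial

$$m_{\gamma} = \prod_{i \in I,\, a \in \mathbb{C}_{\neq 0}} Y_{i,a}^{q_{i,a} - r_{i,a}} \tag{2.19}$$

in variables $(Y_{i,a})_{i \in I;\, a \in \mathbb{C}_{\neq 0}}$. The q-character map $\chi_q$ [FR98] is the injective homomorphism of rings

$$\chi_q : \mathrm{Rep}(U_q(\hat{\mathfrak g})) \longrightarrow \mathbb{Z}\left[ Y_{i,a}^{\pm 1} \right]_{i \in I,\, a \in \mathbb{C}_{\neq 0}} \tag{2.20}$$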


defined by4

$$\chi_q(V) = \sum_{\gamma} \dim(V_{\gamma})\, m_{\gamma}. \tag{2.21}$$

The $Y_{i,a}^{\pm 1}$ are to be thought of as the quantum-affine analogues of the usual variables $y_j^{\pm 1} = e^{\pm \lambda_j}$ appearing in character polynomials. In particular, one associates with $Y_{i,a}^{\pm 1}$ the classical weight $\pm \lambda_i$. An algorithm for computing q-characters of $U_q(\hat{\mathfrak g})$-modules directly from the root-system data of $\mathfrak g$ was proposed in [FR98,FM01]. It has been proven to work for all fundamental representations [FM01], which is all that we shall require in the present paper, although it is known not to work in general [HL09,NN08]. In [Nak04], Nakajima deduced an algorithm for computing the q-character of any irreducible representation, and formulas for the q-characters of fundamental representations were given in [Nak03a,Nak06]; see also [CM06].

We now turn to summarizing the properties of q-characters that we shall need. A monomial in $\mathbb{Z}[Y_{i,a}^{\pm 1}]_{i \in I;\, a \in \mathbb{C}_{\neq 0}}$ is said to be i-dominant if and only if it contains no $Y_{i,a}^{-1}$’s. It is said to be dominant if and only if it is i-dominant for all $i \in I$. Antidominant monomials are similarly defined to be those not containing $Y_{i,a}$’s.

2.2. $U_q(\widehat{\mathfrak{sl}}_2)$ characters. We first summarize the situation for $U_q(\widehat{\mathfrak{sl}}_2)$ characters. In this case the Dynkin diagram has one node, $I = \{1\}$, and we write $Y_{1,a}^{\pm 1} = Y_a^{\pm 1}$. The fundamental representations $V_a$ of $U_q(\widehat{\mathfrak{sl}}_2)$ have dimension two and are labelled by the rapidity $a \in \mathbb{C}_{\neq 0}$. Their q-characters are

$$\chi_q(V_a) = Y_a \left( 1 + A_{aq}^{-1} \right) = Y_a + Y_{aq^2}^{-1}, \tag{2.22}$$

where one defines

$$A_a = Y_{aq}\, Y_{aq^{-1}}. \tag{2.23}$$

4 Note that the original definition of $\chi_q$ [FR98] was in terms of the universal R-matrix of $U_q(\hat{\mathfrak g})$, which makes its close relationship to the transfer matrix of physics more evident. But the above definition, cf. e.g. [CH], is more directly suited for our purposes.

The tensor product $V_b \otimes V_c$ of two fundamental representations is irreducible whenever $b/c \notin \{q^{-2}, q^{+2}\}$. When $b = aq$ and $c = aq^{-1}$ for some $a \in \mathbb{C}_{\neq 0}$, there is an exact sequence of $U_q(\widehat{\mathfrak{sl}}_2)$-modules ([CP91], and with their choice of coproduct)

$$0 \to W_a^{(2)} \to V_{aq} \otimes V_{aq^{-1}} \to \mathbb{C} \to 0, \tag{2.24}$$

where $W_a^{(2)}$ is a 3-dimensional irreducible submodule and $\mathbb{C} \cong (V_{aq} \otimes V_{aq^{-1}}) / W_a^{(2)}$ is the 1-dimensional module. If instead $b = aq^{-1}$ and $c = aq$, one has the same exact sequence but with arrows reversed:

$$0 \to \mathbb{C} \to V_{aq^{-1}} \otimes V_{aq} \to W_a^{(2)} \to 0. \tag{2.25}$$

In either case, there is more than one dominant monomial in the q-character:

$$\begin{aligned} \chi_q(V_{aq^{-1}} \otimes V_{aq}) = \chi_q(V_{aq} \otimes V_{aq^{-1}}) &= \chi_q(V_{aq^{-1}})\, \chi_q(V_{aq}) \\ &= \left( Y_{aq^{-1}} + Y_{aq}^{-1} \right) \left( Y_{aq} + Y_{aq^3}^{-1} \right) \\ &= 1 + \left( Y_{aq^{-1}} Y_{aq} + Y_{aq^{-1}} Y_{aq^3}^{-1} + Y_{aq}^{-1} Y_{aq^3}^{-1} \right). \end{aligned} \tag{2.26}$$

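In the final line the quantity in brackets is $\chi_q(W_a^{(2)})$. More generally, for each $r \in \mathbb{Z}_{\geq 1}$ and $a \in \mathbb{C}_{\neq 0}$ there is an irreducible submodule

$$W_a^{(r)} \subset V_{aq^{r-1}} \otimes V_{aq^{r-3}} \otimes \cdots \otimes V_{aq^{-r+1}} \tag{2.27}$$

called the r-th Kirillov-Reshetikhin module of $U_q(\widehat{\mathfrak{sl}}_2)$. It has dimension $r + 1$ and q-character5

$$\begin{aligned} \chi_q(W_a^{(r)}) &= Y_{aq^{-r+1}} Y_{aq^{-r+3}} \cdots Y_{aq^{r-1}} \left( 1 + \sum_{t=0}^{r-1} A_{aq^{r}}^{-1} A_{aq^{r-2}}^{-1} \cdots A_{aq^{r-2t}}^{-1} \right) \\ &= Y_{aq^{-r+1}} Y_{aq^{-r+3}} \cdots Y_{aq^{r-3}} Y_{aq^{r-1}} \\ &\quad + Y_{aq^{-r+1}} Y_{aq^{-r+3}} \cdots Y_{aq^{r-3}} Y_{aq^{r+1}}^{-1} \\ &\quad + \cdots \\ &\quad + Y_{aq^{-r+1}} Y_{aq^{-r+5}}^{-1} \cdots Y_{aq^{r-1}}^{-1} Y_{aq^{r+1}}^{-1} \\ &\quad + Y_{aq^{-r+3}}^{-1} Y_{aq^{-r+5}}^{-1} \cdots Y_{aq^{r-1}}^{-1} Y_{aq^{r+1}}^{-1}. \end{aligned} \tag{2.28}$$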
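As an illustrative aside, the expansion (2.28) can be enumerated mechanically. The following Python sketch is not taken from the paper: it assumes an encoding in which the rapidity $aq^k$ is represented by the integer $k$ and a monomial by a tuple of (exponent, power) pairs, and the names `kr_character` and `is_dominant` are hypothetical.

```python
def kr_character(r, centre=0):
    """List the monomials of chi_q(W^{(r)}_{a q^centre}) following (2.28).

    Assumed encoding: the pair (k, p) stands for the factor Y_{a q^k}^p.
    The t-th monomial is obtained from the top one by replacing the t
    rightmost factors Y_{a q^k} with Y_{a q^(k+2)}^{-1}.
    """
    segment = [centre - r + 1 + 2 * s for s in range(r)]  # S_r(a q^centre)
    monomials = []
    for t in range(r + 1):
        kept = [(k, 1) for k in segment[:r - t]]
        lowered = [(k + 2, -1) for k in segment[r - t:]]
        monomials.append(tuple(kept + lowered))
    return monomials


def is_dominant(monomial):
    """A monomial is dominant if it contains no inverse factors."""
    return all(power > 0 for _, power in monomial)
```

Under these assumptions `kr_character(2)` lists the three monomials appearing in brackets in (2.26), `kr_character(r)` always has $r + 1$ entries, and only the first entry is dominant.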

$W_a^{(r)}$ is completely characterised by the set of rapidities $S_r(a) = \{aq^{-r+1}, aq^{-r+3}, \ldots, aq^{r-1}\}$ appearing in its dominant monomial, which we shall refer to as a segment of length $r$ centred on $a$. Two such segments are said to be in special position if their union is itself a segment and neither of them contains the other. We say $aq^{-r+1}$ is the leftmost element of $S_r(a)$, $aq^{r-1}$ the rightmost. More generally we say that $aq^k$ is to the right (left) of $aq^l$ iff $k > l$ (resp. $k < l$). Presented with any dominant monomial $m_+ = \prod_s Y_{a_s}$ one can reconstruct the unique irreducible $U_q(\widehat{\mathfrak{sl}}_2)$-module $V(m_+)$ such that $m_+$ is the highest weight monomial in $\chi_q(V(m_+))$. First split the factors $Y_{a_s}$ into a product of segments no two of which are in special position: say

$$m_+ = \prod_{t \in T} Y_{a_t q^{-r_t+1}} Y_{a_t q^{-r_t+3}} \cdots Y_{a_t q^{r_t-1}}, \tag{2.29}$$

for some index set $T$; then

$$V(m_+) \cong \bigotimes_{t \in T} W_{a_t}^{(r_t)}, \tag{2.30}$$

which can be shown to be irreducible and, up to isomorphism, independent of the ordering of the tensor factors.

5 $W_a^{(r)}$ is the pull-back of the usual spin $r/2$ representation of $U_q(\mathfrak{sl}_2)$ by the evaluation homomorphism $\mathrm{ev}_a : U_q(\widehat{\mathfrak{sl}}_2) \to U_q(\mathfrak{sl}_2)$. See e.g. [CP91].

Finally, there is an important caveat: reducible modules certainly have more than one dominant monomial, as in e.g. (2.26), but irreducible modules can also have multiple dominant monomials. This happens precisely when they fail to be regular, in the terminology of [FR98]. Consider $m_+ = Y_{aq^{-1}}^{2} Y_{aq}$ to see the problem. Note that the resulting q-character contains the (dominant) monomial $Y_{aq^{-1}}$ but not the monomial $Y_{aq}^{-1}$. Thus, in computing q-characters, one cannot treat all dominant monomials as though they were highest monomials. For that reason we shall need the following

Proposition 2.1. Let $V$ be a simple finite-dimensional $U_q(\widehat{\mathfrak{sl}}_2)$-module of type 1. Suppose that for some $a \in \mathbb{C}_{\neq 0}$ and $n > 0$, $\chi_q(V)$ includes a dominant monomial $m$ such that $Y_a^n$ is a factor6 of $m$ and $Y_{aq^2}$ is not. Then, either

i) $\chi_q(V)$ includes the monomials $m A_{aq}^{-p}$, $1 \leq p \leq n$; or
ii) there exists a $k > 0$ such that $\chi_q(V)$ includes the monomial $m A_{aq^k}$.

Proof. Let $T$ be an index set such that $V$ can be written as in (2.30) above, with the $S_{r_t}(a_t)$ in pairwise general position and $r_t > 0$ for all $t \in T$. By hypothesis there exist $(m_t)_{t \in T}$ such that $m_t$ is a monomial of $\chi_q(W_{a_t}^{(r_t)})$ for each $t \in T$ and

$$m = \prod_{t \in T} m_t. \tag{2.31}$$

Let $T' = \{t \in T : a \in S_{r_t}(a_t)\}$. Note that $Y_{aq^2}^{-1}$ is a factor of $m_t$ only if $t \in T'$. $T'$ is the disjoint union of the following three subsets:

$T_1 = \{t \in T' : m_t \text{ is dominant and has both } Y_a \text{ and } Y_{aq^2} \text{ as factors}\}$,
$T_2 = \{t \in T' : m_t \text{ is dominant and has rightmost factor } Y_a\}$,
$T_3 = \{t \in T' : m_t \text{ is not dominant}\}$.

If there is a $t \in T_3$ such that $Y_a^{-1}$ is not a factor of $m_t$ then the leftmost factor $Y^{-1}$ in $m_t$ is $Y_{aq^{2\ell}}^{-1}$ for some $\ell > 0$. In that case $m_t A_{aq^{2\ell - 1}}$ appears in $\chi_q(W_{a_t}^{(r_t)})$, cf. (2.28), and ii) holds. It remains to consider the case that $Y_a^{-1}$ is a factor of every $m_t$, $t \in T_3$. By definition of $T'$, $Y_{aq^2}^{-1}$ is then also a factor of every $m_t$, $t \in T_3$. Suppose for a contradiction that there existed a $t \in T$ such that $m_t$ is dominant with leftmost factor $Y_{aq^2}$. Since by assumption the total power of $Y_{aq^2}$ in $m$ is zero, that would require $|T_1| < |T_3|$; but also, by definition of general position, that $|T_2| = 0$ and hence $|T_1| - |T_3| \geq n > 0$, a contradiction. Therefore there is no such $t \in T$ and so in fact, by counting powers of $Y_{aq^2}$ in $m$, $|T_1| = |T_3|$. Consequently the power of $Y_a$ in $\prod_{t \in T_1 \cup T_3} m_t$ is zero. It follows that $|T_2| \geq n$ and hence that i) holds.

6 For every $b \in \mathbb{C}_{\neq 0}$, $k \in \mathbb{Z}_{\neq 0}$, we say that $Y_b^k$ is a factor of $m = \prod_{c \in \mathbb{C}_{\neq 0}} Y_c^{u_c}$ iff either $u_b \geq k > 0$ or $u_b \leq k < 0$.
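The caveat that motivates Proposition 2.1 can be checked by direct multiplication of q-characters. The sketch below is an illustration only, under the assumed encoding that the integer $k$ stands for the rapidity $aq^k$ and that a monomial is a sorted tuple of (exponent, power) pairs; all function names are hypothetical. It takes $\chi_q(W_a^{(2)})$ from (2.26) and uses (2.30) to identify the irreducible module with highest monomial $Y_{aq^{-1}}^2 Y_{aq}$ as $W_a^{(2)} \otimes V_{aq^{-1}}$.

```python
from collections import Counter


def fundamental(k):
    """chi_q(V_{a q^k}) = Y_{a q^k} + Y_{a q^(k+2)}^{-1}, cf. (2.22)."""
    return Counter({((k, 1),): 1, ((k + 2, -1),): 1})


def multiply(m1, m2):
    """Multiply two Laurent monomials given as tuples of (exponent, power)."""
    powers = {}
    for k, p in m1 + m2:
        powers[k] = powers.get(k, 0) + p
    return tuple(sorted((k, p) for k, p in powers.items() if p != 0))


def product(chi1, chi2):
    """Multiply two q-characters stored as Counter(monomial -> multiplicity)."""
    out = Counter()
    for m1, c1 in chi1.items():
        for m2, c2 in chi2.items():
            out[multiply(m1, m2)] += c1 * c2
    return out


def dominant(chi):
    """Sorted list of the dominant monomials (no inverse factors) in chi."""
    return sorted(m for m in chi if all(p > 0 for _, p in m))


# (2.26): chi_q(V_{aq^-1} (x) V_{aq}) = 1 + chi_q(W_a^{(2)}); () encodes the monomial 1.
chi_tensor = product(fundamental(-1), fundamental(1))
chi_W2 = chi_tensor - Counter({(): 1})

# Irreducible module with highest monomial Y_{aq^-1}^2 Y_{aq}, via (2.30).
chi_V = product(chi_W2, fundamental(-1))
```

With these definitions `dominant(chi_V)` contains both the highest monomial and $Y_{aq^{-1}}$, encoded as `((-1, 1),)`, while `((1, -1),)`, i.e. $Y_{aq}^{-1}$, is absent from `chi_V`, in line with the caveat.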

2.3. $U_q(\hat{\mathfrak g})$ characters. Returning to the general case, we let $V_{i,a}$, $i \in I$, $a \in \mathbb{C}_{\neq 0}$ denote the i-th fundamental representation of $U_q(\hat{\mathfrak g})$ at rapidity $a$. (See e.g. [CP].) It may be shown [FR98,FM01] that $\chi_q(V_{i,a})$ contains the highest weight monomial $Y_{i,a}$ and that, if we define

$$A_{i,a} = Y_{i,aq^{-1}}\, Y_{i,aq} \prod_{\langle j,i \rangle} Y_{j,a}^{-1}, \tag{2.32}$$

where the product $\prod_{\langle j,i \rangle}$ is over the nodes $j$ of the Dynkin diagram that neighbour $i$,7 then every monomial in $\chi_q(V_{i,a})$ is of the form

$$Y_{i,a}\, A_{j_1,a_1}^{-1} \cdots A_{j_n,a_n}^{-1} \tag{2.34}$$

for some finite collection of $n \geq 0$ pairs $(j_k, a_k) \in I \times \mathbb{C}_{\neq 0}$. For each $j \in I$, let $U_q(\widehat{\mathfrak{sl}}_2^{(j)}) \subset U_q(\hat{\mathfrak g})$ be the subalgebra generated by $x_j^{\pm}(u)$, $\phi_j^{\pm}(u)$. Let $\chi_q^{(j)}$ be the q-character map of $U_q(\widehat{\mathfrak{sl}}_2^{(j)})$ and

$$\beta_j : \mathbb{Z}\left[ Y_{i,a}^{\pm 1} \right]_{i \in I;\, a \in \mathbb{C}_{\neq 0}} \to \mathbb{Z}\left[ Y_{j,a}^{\pm 1} \right]_{a \in \mathbb{C}_{\neq 0}} \tag{2.35}$$

the ring homomorphism which sets to one all the $Y_{k,a}^{\pm 1}$ with $k \neq j$. Then every $U_q(\hat{\mathfrak g})$-module $V$ is also a $U_q(\widehat{\mathfrak{sl}}_2^{(j)})$-module, and $\chi_q^{(j)}(V) = \beta_j \circ \chi_q(V)$. In fact, more is true: there exists [FM01] an injective ring homomorphism

$$\tau_j : \mathbb{Z}\left[ Y_{i,a}^{\pm 1} \right]_{i \in I;\, a \in \mathbb{C}_{\neq 0}} \to \mathbb{Z}\left[ Y_{j,a}^{\pm 1} \right]_{a \in \mathbb{C}_{\neq 0}} \otimes \mathbb{Z}\left[ Z_{k,b}^{\pm 1} \right]_{k \neq j;\, b \in \mathbb{C}_{\neq 0}} \tag{2.36}$$

refining $\beta_j$, where $Z_{i,a}^{\pm 1}$ are certain new formal variables, and

$$\tau_j(\chi_q(V_{i,a})) = \sum_{p} \chi_q^{(j)}(V_p) \otimes N_p, \tag{2.37}$$

where the $V_p$ are $U_q(\widehat{\mathfrak{sl}}_2^{(j)})$-modules and the $N_p$ are monomials in $(Z_{k,b}^{\pm 1})_{k \neq j,\, b \in \mathbb{C}_{\neq 0}}$. Furthermore, in the diagram

(2.38)

let the right vertical arrow be multiplication by $\beta_j(A_{j,c}^{-1}) \otimes 1$; then the diagram commutes if and only if the left vertical arrow is multiplication by $A_{j,c}^{-1}$.

7 That is $\prod_{\langle j,i \rangle} = \prod_{j : I_{ji} = 1}$, where

$$I_{ij} = 2\delta_{ij} - a_{ij} = \begin{cases} 1 & \text{if } i, j \text{ are neighbouring nodes on the Dynkin diagram} \\ 0 & \text{otherwise} \end{cases} \tag{2.33}$$

is the incidence matrix.

Consequently, if one has found a term $m_+ \otimes N_p$ in the r.h.s of (2.37), and one knows that $m_+$ is the highest weight monomial of $\chi_q^{(j)}(V_p)$, then one can construct all the remaining monomials in $\chi_q^{(j)}(V_p) \otimes N_p$ (as discussed in the previous subsection) and hence their (unique) preimages in $\chi_q(V_{i,a})$. Frenkel and Mukhin gave an algorithm for computing the q-character with a given highest monomial [FM01], by repeatedly completing $U_q(\widehat{\mathfrak{sl}}_2)$-characters in this way. They proved that it works for any q-character with a unique dominant monomial (and so in particular for the q-characters of fundamental representations). The specific instance of this sort of reasoning which we will require, in Proposition 3.5, is the following, which follows immediately from the existence and property (2.38) of $\tau_j$ together with Proposition 2.1 above.

Proposition 2.2. Let $j \in I$, $a \in \mathbb{C}_{\neq 0}$ and $n > 0$. Suppose $m$ is a j-dominant monomial in $\chi_q(V_{i,a})$ such that $Y_a^n$ is a factor of $\beta_j(m)$ and $Y_{aq^2}$ is not. Then, either

i) $\chi_q(V_{i,a})$ includes the monomials $m A_{j,aq}^{-p}$, $1 \leq p \leq n$; or
ii) there exists a $k > 0$ such that $\chi_q(V_{i,a})$ includes the monomial $m A_{j,aq^k}$.

Also, in Proposition 3.3 below, we will need the following consequence of the Frenkel-Mukhin algorithm.

Theorem 2.3 ([FM01]). Every monomial $m \neq Y_{i,a}$ in $\chi_q(V_{i,a})$ is of the form $m' A_{j,aq^{r+1}}^{-1}$ for some $j \in I$ and some $r \in \mathbb{Z}$, where $m'$ is a monomial in $\chi_q(V_{i,a})$ having $Y_{j,aq^r}$ as a factor.

Equivalently but more intuitively, every monomial apart from the highest one is obtained from some (at least one) other monomial by a “lowering step” consisting of a replacement of the form

$$Y_{j,aq^r} \to Y_{j,aq^r} A_{j,aq^{r+1}}^{-1} = Y_{j,aq^{r+2}}^{-1} \prod_{\langle k,j \rangle} Y_{k,aq^{r+1}}. \tag{2.39}$$
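The lowering step (2.39) is simple enough to spell out in code. The following Python sketch is illustrative only: the dictionary encoding of monomials and the name `lowering_step` are assumptions made here, and type $A_2$ is an arbitrary choice of example. The sketch applies a single step when asked; it does not decide which steps actually occur in a given q-character, which is the job of the Frenkel-Mukhin algorithm.

```python
def lowering_step(monomial, j, r, neighbours):
    """Multiply a monomial by A_{j, a q^(r+1)}^{-1}, as in (2.39).

    Assumed encoding: `monomial` maps (node, k) to the power of Y_{node, a q^k};
    `neighbours` maps each Dynkin node to the list of nodes adjacent to it.
    """
    if monomial.get((j, r), 0) <= 0:
        raise ValueError("Y_{j, a q^r} is not a factor of the monomial")
    result = dict(monomial)

    def bump(key, amount):
        value = result.get(key, 0) + amount
        if value:
            result[key] = value
        else:
            result.pop(key, None)

    bump((j, r), -1)          # remove Y_{j, a q^r}
    bump((j, r + 2), -1)      # insert Y_{j, a q^(r+2)}^{-1}
    for k in neighbours[j]:
        bump((k, r + 1), +1)  # insert Y_{k, a q^(r+1)} for each neighbour k
    return result


# Type A_2: two nodes joined by an edge (an illustrative choice).
A2 = {1: [2], 2: [1]}
top = {(1, 0): 1}                          # Y_{1,a}
middle = lowering_step(top, 1, 0, A2)      # Y_{1,aq^2}^{-1} Y_{2,aq}
bottom = lowering_step(middle, 2, 1, A2)   # Y_{2,aq^3}^{-1}
```

The chain `top`, `middle`, `bottom` reproduces the three monomials usually quoted for the first fundamental representation in type $A_2$; that identification is taken from the standard literature rather than from the text above.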

The q-character of any fundamental representation $V_{i,a}$ thus has the structure of a connected directed graph, whose nodes are the monomials and whose edges are labelled by factors $A_{i,a}^{-1}$. (An example is shown in Fig. 2.)

Finally, in the proof of Lemma 3.2, we will need the following results from [FM01]. We shall say that a monomial $m$ has compact support of length $n$ and base $d$ if $m \in \mathbb{Z}[Y_{l,dq^r}^{\pm 1}]_{l \in I,\, 0 \leq r \leq n}$. Combining Lemma 6.1 and 6.13 of [FM01], we have

Lemma 2.4. All the monomials in $\chi_q(V_{l,d})$, where $l \in I$ and $d \in \mathbb{C}_{\neq 0}$, have compact support of length $h$ and base $d$.

Moreover, a monomial

$$m = \prod_{(l,r) \in I \times \mathbb{N}_0} Y_{l,dq^r}^{p_{l,r}}, \qquad p_{l,r} \in \mathbb{Z} \tag{2.40}$$

having compact support of length $n$ and base $d$ is said to be right negative (resp. left positive) if, in addition, there exists a $(k, s) \in I \times \mathbb{N}_0$ such that $p_{k,s} < 0$ (resp. $p_{k,s} > 0$) and for each $(l, r) \in I \times \mathbb{N}_0$ such that $p_{l,r} > 0$ (resp. $p_{l,r} < 0$), $r < s$ (resp. $r > s$).

Lemma 2.5. For all $i \in I$, $a \in \mathbb{C}_{\neq 0}$, in the q-character $\chi_q(V_{i,a})$

i) every monomial except for the highest weight monomial, $Y_{i,a}$, is right negative, and
ii) every monomial except for the lowest weight monomial, $Y_{\bar\imath,aq^h}^{-1}$, is left positive.

Proof. Part i) is Lemma 6.5 in [FM01]. Proposition 6.18 in [FM01] states (in the simply-laced case) that $\chi_q(V_{\bar\imath,aq^{-h}})$ and $\chi_q(V_{i,a})$ are related by exchanging $Y_{j,aq^n}^{\pm 1} \leftrightarrow Y_{j,aq^{-n}}^{\mp 1}$, for all $j \in I$, $n \in \{0, 1, \ldots, h\}$. This map sends right-negative monomials to left-positive monomials (and vice versa). So part i) for $\chi_q(V_{\bar\imath,aq^{-h}})$ implies part ii) for $\chi_q(V_{i,a})$.
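The definitions of right negative and left positive monomials translate directly into two predicates. The sketch below is an illustration under assumptions made here, not part of the paper: monomials are dictionaries mapping (node, r) to the non-zero power of $Y_{l,dq^r}$, and the three-monomial list used as test data is the q-character commonly quoted for the first fundamental representation in type $A_2$ ($h = 3$).

```python
def is_right_negative(monomial):
    """True if some factor has negative power and every positive-power
    factor sits strictly to its left (smaller r)."""
    negative = [r for (_, r), p in monomial.items() if p < 0]
    positive = [r for (_, r), p in monomial.items() if p > 0]
    return bool(negative) and all(r < max(negative) for r in positive)


def is_left_positive(monomial):
    """True if some factor has positive power and every negative-power
    factor sits strictly to its right (larger r)."""
    negative = [r for (_, r), p in monomial.items() if p < 0]
    positive = [r for (_, r), p in monomial.items() if p > 0]
    return bool(positive) and all(r > min(positive) for r in negative)


# Assumed test data: Y_{1,a}, Y_{1,aq^2}^{-1} Y_{2,aq}, Y_{2,aq^3}^{-1}.
character = [{(1, 0): 1}, {(1, 2): -1, (2, 1): 1}, {(2, 3): -1}]
```

On this data only the highest monomial fails to be right negative and only the lowest fails to be left positive, as Lemma 2.5 states; the empty dictionary, which encodes the monomial 1, satisfies neither predicate.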

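Corollary 2.6. Let $i \in I$ and $a \in \mathbb{C}_{\neq 0}$. The monomial 1 does not occur in $\chi_q(V_{i,a})$.

Proof. The monomial 1 is not right negative and $1 \neq Y_{i,a}$. Thus, by Lemma 2.5, it cannot appear in $\chi_q(V_{i,a})$.

3. Coxeter Orbits and q-Characters

In this section we relate the geometry of the Coxeter orbits of $\mathfrak g$-weights to the structure of q-characters of fundamental representations. Recalling our notations for the roots, weights and Coxeter element of $\mathfrak g$ from the Introduction, let us begin by noting the following identities. Write $\lambda_i = \lambda_i^{\bullet}$ ($\lambda_i^{\circ}$) when $i \in I_{\bullet}$ (respectively $I_{\circ}$). Then

$$w_{\bullet} \lambda_i^{\bullet} = \lambda_i^{\bullet} - \alpha_i = -\lambda_i^{\bullet} + \sum_{\langle j,i \rangle} \lambda_j^{\circ}, \qquad w_{\bullet} \lambda_i^{\circ} = \lambda_i^{\circ} \tag{3.1}$$

and likewise with $\circ \leftrightarrow \bullet$. Thus

$$\left( 1 + w \right) \lambda_i^{\circ} = \sum_{\langle j,i \rangle} \lambda_j^{\bullet}, \qquad \left( 1 + w^{-1} \right) \lambda_i^{\bullet} = \sum_{\langle j,i \rangle} \lambda_j^{\circ}. \tag{3.2}$$

We also define

$$P = \frac{2}{h} \sum_{n \in \mathbb{Z}/h\mathbb{Z}} \cos\left( \frac{2\pi n}{h} \right) w^{n}, \tag{3.3}$$

which is the orthogonal (with respect to the Killing form $\langle \cdot, \cdot \rangle$) projector from the weight lattice of $\mathfrak g$ to the $\exp(\pm 2\pi i/h)$-eigenplane of $w$.8

8 Recall that the exponents of $\mathfrak g$ are by definition those integers $s \in \mathbb{Z}/h\mathbb{Z}$ for which $\exp(2\pi i s/h)$ is an eigenvalue of $w$, and that $s = \pm 1$ are always exponents.

Let $\theta$ be the map which returns the signed angle between the projections of two given vectors in weight space into this plane, i.e. the map defined by

$$\cos \theta(\mu, \rho) = \frac{\langle P\mu, P\rho \rangle}{\sqrt{\langle P\mu, P\mu \rangle \langle P\rho, P\rho \rangle}}; \qquad \mathrm{im}(\theta) = (-\pi, \pi] \tag{3.4}$$

and, to fix the orientation, $\theta(\mu, w\mu) = +2\pi/h$. To fix a direction in the plane, let $\lambda$ be any vector in weight space such that $P\lambda \neq 0$. Our main result is then

Theorem 3.1. Let $i_1, i_2, i_3 \in I$ and $a_1, a_2, a_3 \in \mathbb{C}_{\neq 0}$. The following are equivalent:

i) The q-character

$$\chi_q\left( V_{i_1,a_1} \otimes V_{i_2,a_2} \otimes V_{i_3,a_3} \right) \tag{3.5}$$

includes the monomial 1.

ii) There exist $n_1, n_2, n_3 \in \mathbb{Z}$ and $a \in \mathbb{C}_{\neq 0}$ such that

$$w^{n_1} \lambda_{i_1} + w^{n_2} \lambda_{i_2} + w^{n_3} \lambda_{i_3} = 0 \tag{3.6}$$

and

$$a_k = a\, q^{\frac{h}{\pi} \theta\left( \lambda,\, w^{n_k} \lambda_{i_k} \right)}, \qquad k = 1, 2, 3. \tag{3.7}$$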
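Condition (3.6) and the identities following (3.1) lend themselves to a small numerical check. The Python sketch below is an illustration under explicit assumptions: type $A_3$ (Coxeter number $h = 4$) is an arbitrary choice, weights are stored as integer coordinate vectors in the basis of fundamental weights, nodes 0 and 2 are taken to form $I_{\bullet}$ and node 1 to form $I_{\circ}$, and the Coxeter element is taken to be $w = w_{\circ} w_{\bullet}$, which is the convention under which $(1 + w)\lambda_i^{\circ} = \sum_{\langle j,i \rangle} \lambda_j^{\bullet}$ and $(1 + w^{-1})\lambda_i^{\bullet} = \sum_{\langle j,i \rangle} \lambda_j^{\circ}$ hold. All function names are hypothetical.

```python
def simple_reflection(cartan, i):
    """s_i in fundamental-weight coordinates (simply-laced case):
    s_i(lam) = lam - c_i * alpha_i, with alpha_i = sum_j a_ij * lambda_j."""
    def act(c):
        return [c[j] - c[i] * cartan[i][j] for j in range(len(c))]
    return act


def compose(*maps):
    """compose(f, g)(c) == f(g(c)): the rightmost map acts first."""
    def act(c):
        for f in reversed(maps):
            c = f(c)
        return c
    return act


def power(f, n):
    """n-fold iterate of f, for n >= 0."""
    def act(c):
        for _ in range(n):
            c = f(c)
        return c
    return act


def add(*vectors):
    return [sum(parts) for parts in zip(*vectors)]


CARTAN_A3 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
H = 4                                   # Coxeter number of A_3
s = [simple_reflection(CARTAN_A3, i) for i in range(3)]
w_black = compose(s[0], s[2])           # nodes 0 and 2 form I_bullet (assumed)
w_white = s[1]                          # node 1 forms I_circ (assumed)
w = compose(w_white, w_black)           # assumed convention w = w_circ w_bullet
w_inv = compose(w_black, w_white)

lam = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # fundamental weights

# Pairs (n, m) with lam_0 + w^n lam_0 + w^m lam_1 = 0, cf. (3.6) with n_1 = 0.
solutions = [(n, m) for n in range(H) for m in range(H)
             if add(lam[0], power(w, n)(lam[0]), power(w, m)(lam[1])) == [0, 0, 0]]
```

Under these assumptions the search returns the exponent pairs for which the three Coxeter-orbit elements sum to zero; converting such a solution into rapidities via (3.7) would additionally require the projector $P$ and the angle map $\theta$, which this sketch does not implement.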

Let us illustrate this with an example in the case of $E_6$, for which the Coxeter number is $h = 12$. We label the nodes of the Dynkin diagram as in [BCDS90]:

(This labelling is related to the masses of the corresponding particles, H/heavy or L/light, in the Toda theory.) Among the solutions to the fusing rule (tabulated in [BCDS90]) is

$$w^{-2} \lambda_{\bar l} + \lambda_{L} + w^{5} \lambda_{h} = 0, \tag{3.8}$$

whose P-projection may be pictured as follows:

(3.9)

So the theorem asserts, in particular, that 1 occurs in the q-character

$$\chi_q\left( V_{\bar l,aq^{-5}} \otimes V_{L,a} \otimes V_{h,aq^{10}} \right). \tag{3.10}$$

Fig. 1. Picture of the $e^{\pm 2\pi i/h}$-eigenplane of $w$, for $h = 5$ (left) and $h = 6$ (right), showing the directions (though not the lengths) of the projected Coxeter orbits of fundamental weights. Here $\lambda^{\bullet}$ ($\lambda^{\circ}$) denotes any $\lambda_i$ such that $i \in I_{\bullet}$ (respectively $I_{\circ}$)

Proof of Theorem 3.1. We first express ii) in a less symmetric but more convenient form. The reference vector $\lambda$ serves purely to make manifest the symmetry under permutations of $\{1, 2, 3\}$. It follows from (3.6) that, by using this symmetry if necessary, we can assume

$$-\pi < \theta(\lambda_{i_1}, w^{n_2} \lambda_{i_2}) \leq 0 < \theta(\lambda_{i_1}, w^{n_3} \lambda_{i_3}) \leq \pi. \tag{3.11}$$

Then, by the freedom in the choice of $a$, we can assume that $\lambda = \lambda_{i_1}$ and $n_1 = 0$. Let us also pick the two-colouring $I = I_{\bullet} \sqcup I_{\circ}$ such that $i_1 \in I_{\bullet}$. Given that $-\lambda_{\bar\imath_2}$ is in the Coxeter orbit of $\lambda_{i_2}$,9 we can introduce an $n \in \mathbb{Z}$ such that $w^{n_2} \lambda_{i_2} = -w^{n} \lambda_{\bar\imath_2}$. Let us also write $m := n_3$. Then (3.11) becomes

$$0 < \theta(\lambda_{i_1}, w^{n} \lambda_{\bar\imath_2}) \leq \pi, \qquad 0 < \theta(\lambda_{i_1}, w^{m} \lambda_{i_3}) \leq \pi. \tag{3.12}$$

Thus the solution ii) has been brought to the form

$$\lambda_{i_1} - w^{n} \lambda_{\bar\imath_2} + w^{m} \lambda_{i_3} = 0 \tag{3.13}$$

where, on examining Fig. 1, one sees that (3.12) is equivalent to the following conditions on n, m (modulo h): h2 h2 ı¯2 ∈ I• i 3 ∈ I• 0
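(3.15)

$$r = \begin{cases} 2n & \bar\imath_2 \in I_{\bullet} \\ 2n - 1 & \bar\imath_2 \in I_{\circ} \end{cases}, \qquad s = \begin{cases} 2m & i_3 \in I_{\bullet} \\ 2m - 1 & i_3 \in I_{\circ}. \end{cases} \tag{3.16}$$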

It is also clear that (3.6) implies in particular that

$$r < s, \tag{3.17}$$

for if not, the images of the three vectors $w^{n_1} \lambda_{i_1}$, $w^{n_2} \lambda_{i_2}$, $w^{n_3} \lambda_{i_3}$ would lie strictly inside some half-plane and certainly could not sum to zero.

9 By definition $\lambda_{\bar\imath}$ is the fundamental weight in the Weyl orbit of $-\lambda_i$. It is given by $\lambda_{\bar\imath} = -w_0 \lambda_i$, where $w_0$ is the longest element of the Weyl group, which may be written $w_0 = w_{\bullet} w_{\circ} \cdots = w_{\circ} w_{\bullet} \cdots$ (with $h$ factors in each product). Then since $w_{\circ} \lambda_i^{\bullet} = \lambda_i^{\bullet}$ and $w_{\bullet} \lambda_i^{\circ} = \lambda_i^{\circ}$, one has $w_0 \lambda_i^{\bullet} = w^{h/2} \lambda_i^{\bullet}$ and $w_0 \lambda_i^{\circ} = w^{(h+1)/2} \lambda_i^{\circ}$.

The remainder of the proof, which occupies the rest of this section, is structured as follows: Lemma 3.2 will re-express i) as a statement about the occurrence of quadratic monomials in $\chi_q(V_{i,a})$. Then i) ⇒ ii) will be an immediate corollary of Proposition 3.3, while ii) ⇒ i) is the content of Propositions 3.4 and 3.5.

Lemma 3.2. The q-character

$$\chi_q\left( V_{i,a} \otimes V_{j,b} \otimes V_{k,c} \right) = \chi_q(V_{j,b})\, \chi_q(V_{i,a})\, \chi_q(V_{k,c}) \tag{3.18}$$

can include the monomial 1 only if $b = aq^{r-h}$ and $c = aq^{s}$ for some $r, s \in \mathbb{Z}$. Suppose, without loss of generality, that $s \geq 0 \geq r - h$. (If not, rearrange the factors.) Then the monomial 1 is present if and only if $\chi_q(V_{i,a})$ contains the quadratic monomial

$$Y_{\bar\jmath,bq^h}\, Y_{k,c}^{-1}. \tag{3.19}$$

Proof. Assume that there exist monomials

$$m_j \text{ in } \chi_q(V_{j,b}), \qquad m_i \text{ in } \chi_q(V_{i,a}), \qquad m_k \text{ in } \chi_q(V_{k,c}) \tag{3.20}$$

such that

$$1 = m_j\, m_i\, m_k. \tag{3.21}$$

It follows from Corollary 2.6 that $m_i$, $m_j$, and $m_k$ differ from 1. Thus, Eq. (3.21) can only hold by virtue of a complete cross-cancellation of all the factors of the three monomials. Since by Lemma 2.4, $m_i$, $m_j$ and $m_k$ each have compact support of length $h$ and respective bases $a$, $b$ and $c$, such cross-cancellation can occur only if $b = aq^{r-h}$ and $c = aq^{s}$ for some $r, s \in \mathbb{Z}$, thus proving the first part of the lemma.

As for the second part, we first prove that one of the three monomials has to be the highest weight monomial of the q-character where it appears while another one has to be the lowest weight monomial of the q-character where it appears. Suppose for a contradiction that all three monomials were right negative. Since the product of two right negative monomials is obviously right negative, it would follow that $m_j m_i m_k$ is right negative and therefore not equal to 1, a contradiction. Suppose similarly that they were all left positive: then $m_j m_i m_k$ would be left positive, a contradiction. By Lemma 2.5, the only monomial in the q-character of a fundamental representation that is not right negative (resp. left positive) is its highest weight monomial (resp. its lowest weight monomial). Now it follows that the only solution to (3.21) that is also compatible with the assumption that

$$r - h \leq 0 \leq s \tag{3.22}$$

is $m_j = Y_{\bar\jmath,aq^r}^{-1}$, $m_i = Y_{\bar\jmath,aq^r} Y_{k,aq^s}^{-1}$ and $m_k = Y_{k,aq^s}$. Indeed, we know that one of the three monomials, $m_i$, $m_j$ or $m_k$, has to be the quadratic monomial obtained by multiplying the inverses of the other two, namely the one which is the highest weight monomial of its q-character and the one which is the lowest. By Lemma 2.5, this quadratic monomial should be both right negative and left positive. Assuming that (3.22) holds thus implies $m_j \neq Y_{i,a}^{-1} Y_{\bar k,aq^{s+h}}$ and $m_k \neq Y_{j,aq^{r-h}}^{-1} Y_{\bar\imath,aq^h}$. Furthermore, by Lemma 2.4, assuming that (3.22) holds also implies that $m_j \neq Y_{\bar\imath,aq^h} Y_{k,aq^s}^{-1}$ and $m_k \neq Y_{\bar\jmath,aq^r} Y_{i,a}^{-1}$ since $m_j$ and $m_k$, as monomials in $\chi_q(V_{j,aq^{r-h}})$ and $\chi_q(V_{k,aq^s})$ respectively, should have compact supports of length $h$ and respective bases $aq^{r-h}$ and $aq^{s}$. Therefore, it is clear that the quadratic monomial is $m_i$. Finally, Lemma 2.4 implies that $m_i$, as a monomial of $\chi_q(V_{i,a})$, has compact support of length $h$ and base $a$ and hence that $m_i \neq Y_{j,aq^{r-h}}^{-1} Y_{\bar k,aq^{s+h}}$.

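Proposition 3.3. For any given $i \in I$, choose a two-colouring of the Dynkin diagram of $\mathfrak g$ such that $i \in I_{\bullet}$. Then the q-character $\chi_q(V_{i,a})$ contains the monomial

$$Y_{j_1,aq^{r_1}} \cdots Y_{j_u,aq^{r_u}}\, Y_{k_1,aq^{s_1}}^{-1} \cdots Y_{k_v,aq^{s_v}}^{-1} \tag{3.23}$$

only if

$$\lambda_i = w^{n_1} \lambda_{j_1} + \cdots + w^{n_u} \lambda_{j_u} - w^{m_1} \lambda_{k_1} - \cdots - w^{m_v} \lambda_{k_v}, \tag{3.24}$$

where

$$r_x = \begin{cases} 2n_x & j_x \in I_{\bullet} \\ 2n_x - 1 & j_x \in I_{\circ} \end{cases}, \qquad s_x = \begin{cases} 2m_x & k_x \in I_{\bullet} \\ 2m_x - 1 & k_x \in I_{\circ}. \end{cases} \tag{3.25}$$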

Proof. We must show that each monomial in $\chi_q(V_{i,a})$ is associated with an identity of the form (3.24) in the fashion specified. This is certainly true of the highest monomial, which is associated with the trivial identity:

$$Y_{i,a} \quad \longleftrightarrow \quad \lambda_i = \lambda_i. \tag{3.26}$$

We know that all monomials in $\chi_q(V_{i,a})$ are of the form (2.34). So suppose that, for some integer $k \geq 0$, we have successfully demonstrated the required identity for all monomials in $\chi_q(V_{i,a})$ that are $k$ lowering steps, in the sense of (2.39), from $Y_{i,a}$. Let $m \in \chi_q(V_{i,a})$ be any monomial $k + 1$ steps away from $Y_{i,a}$. By Theorem 2.3, we have that $m = m' A_{j,aq^{r+1}}^{-1}$, for some monomial $m' \in \chi_q(V_{i,a})$ that is $k$ lowering steps from $Y_{i,a}$ and that has as a factor $Y_{j,aq^r}$. By supposition, the identity to which $m'$ is associated thus contains a summand $+w^{n} \lambda_j$, where $n$ and $r$ are related as in (3.25). We associate the lowering operation (2.39) in the direction of the simple root $\alpha_j$ with one of the following re-writings of $\lambda_j$, to be chosen according to the colour of the node $j \in I$:

$$\lambda_j = \lambda_j^{\bullet} \to w \sum_{\langle k,j \rangle} \lambda_k^{\circ} - w \lambda_j^{\bullet}, \tag{3.27}$$

$$\lambda_j = \lambda_j^{\circ} \to \sum_{\langle k,j \rangle} \lambda_k^{\bullet} - w \lambda_j^{\circ}. \tag{3.28}$$

That these are identities follows from (3.2). It is straightforward to check that they produce precisely the terms required for the resulting identity to be that associated to $m = m' A_{j,aq^{r+1}}^{-1}$ as the proposition requires. This completes the inductive step, and the result follows by induction on $k$. An example is shown in Fig. 2.

Fig. 2. Proposition 3.3 illustrated for the representation $V_{1,a}$ of $U_q(\hat{\mathfrak d}_4)$. On the left is the graph of the character $\chi_q(V_{1,a})$; the edge label $i_n$ denotes multiplication by $A_{i,aq^n}^{-1}$. On the right are the corresponding identities involving the Coxeter orbits of fundamental weights

Now, in particular, the quadratic monomials required in Lemma 3.2 correspond to identities of the form

$$\lambda_i = w^{n} \lambda_{\bar\jmath} - w^{m} \lambda_k. \tag{3.29}$$

This completes the proof of the i) ⇒ ii) part of Theorem 3.1. It remains to prove the converse. In view of the preceding proposition, it is clear that what underpins this whole approach is the similarity between the definition

$$A_{i,a} = Y_{i,aq^{-1}}\, Y_{i,aq} \prod_{\langle j,i \rangle} Y_{j,a}^{-1} \tag{3.30}$$

and the identities

$$0 = \lambda_i^{\bullet} + w \lambda_i^{\bullet} - w \sum_{\langle j,i \rangle} \lambda_j^{\circ}, \qquad 0 = \lambda_i^{\circ} + w \lambda_i^{\circ} - \sum_{\langle j,i \rangle} \lambda_j^{\bullet}. \tag{3.31}$$

In trying to pass from a solution to the fusing rule to a monomial in the q-character, the first problem is thus to express the solution explicitly in terms of these identities. We begin by introducing some useful scaffolding. By a slight abuse of notation, let $I$ be the Dynkin diagram of $\mathfrak g$, and consider the product graph $\hat I = I \times \{0, 1, 2, \ldots\}$. The two-colouring of $I$ extends to a two-colouring $\hat I = \hat I_{\circ} \sqcup \hat I_{\bullet}$ of the infinite graph. We picture $\hat I$ as a vertical stack of copies of $I$, and will refer to each copy of $I$ as a row and to the set of nodes $(j, 0), (j, 1), \ldots$ for any fixed $j \in I$ as a column. (Figure 3 illustrates an example.)

Fig. 3. The bipartite graph $\hat I$ in the case $\mathfrak g = \mathfrak d_5$

The black nodes of $\hat I$ are those of the form $(i, 2n)$, $i \in I_{\bullet}$ and $(i, 2n - 1)$, $i \in I_{\circ}$. We associate to each black node a factor $Y$ in the obvious way:

$$(i, r) \in \hat I_{\bullet} \to Y_{i,aq^r}. \tag{3.32}$$

We also associate to each black node $(i, r)$ a term $Y_{i,r}$ of the form $w^{n} \lambda_i$, defined as follows:

$$Y_{i \in I_{\bullet},\, 2n} := w^{n} \lambda_i^{\bullet}, \qquad Y_{i \in I_{\circ},\, 2n-1} := w^{n} \lambda_i^{\circ}. \tag{3.33}$$

The white nodes of $\hat I$ are those of the form $(i, 2n)$, $i \in I_{\circ}$ and $(i, 2n - 1)$, $i \in I_{\bullet}$. We associate to each white node $(i, r)$ the factor $A_{i,aq^r}$, and also an identity $A$ among the terms $w^{n} \lambda_i$ at the neighbouring black nodes:

$$0 = A_{i \in I_{\circ},\, 2n} := w^{n} \left( \lambda_i^{\circ} + w \lambda_i^{\circ} - \sum_{\langle j,i \rangle} \lambda_j^{\bullet} \right), \tag{3.34}$$

$$0 = A_{i \in I_{\bullet},\, 2n-1} := w^{n-1} \left( \lambda_i^{\bullet} + w \lambda_i^{\bullet} - w \sum_{\langle j,i \rangle} \lambda_j^{\circ} \right); \tag{3.35}$$

that is, simply,

$$A_{i,r} := Y_{i,r-1} + Y_{i,r+1} - \sum_{\langle j,i \rangle} Y_{j,r}. \tag{3.36}$$

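Let $c$ and $g$ be integer-valued functions defined on the black and white nodes respectively

$$c : \hat I_{\bullet} \to \mathbb{Z}; \quad (i, n) \mapsto c_i^{n}, \tag{3.37}$$

$$g : \hat I_{\circ} \to \mathbb{Z}; \quad (i, n) \mapsto g_i^{n}. \tag{3.38}$$

One may then ask: when does the coefficient of a term $Y_{i,n}$, with $n > 0$, vanish in the expression

$$E(c, g) := \sum_{(j,r) \in \hat I_{\bullet}} c_j^{r}\, Y_{j,r} - \sum_{(j,r) \in \hat I_{\circ}} g_j^{r}\, A_{j,r}\ ? \tag{3.39}$$

Or, equivalently, when is the factor $Y_{i,aq^n}$ absent from the monomial

$$m(c, g) := \left( \prod_{(j,r) \in \hat I_{\bullet}} Y_{j,aq^r}^{c_j^{r}} \right) \left( \prod_{(j,r) \in \hat I_{\circ}} A_{j,aq^r}^{-g_j^{r}} \right)\ ? \tag{3.40}$$

It is clear that the answer is: if and only if

$$g_i^{n-1} + g_i^{n+1} - \sum_{\langle j,i \rangle} g_j^{n} = c_i^{n}. \tag{3.41}$$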

Let us regard $c$ as a fixed source term. Then it is possible to satisfy (3.41) at every black node $(i, n)$ with $n > 0$ by choosing an appropriate $g$. Assume that sufficiently far down the graph the source vanishes, i.e. that there is an $N$ such that $c_i^n = 0$ for all $n > N$. Then, furthermore, the solution is unique if we specify also that $g_i^n = 0$ for all $n > N$, because Eq. (3.41) at each row $n$ fixes uniquely the $g_i^{n-1}$ in the row above.

Proposition 3.4. Choose the two-colouring of $I$ such that $i_1 \in I_\bullet$. Suppose that we have a solution to the fusing equation (3.6), written, as in (3.13), in the form
$$\lambda_{i_1} - w^n\lambda_{\bar\imath_2} + w^m\lambda_{i_3} = 0, \tag{3.42}$$
with $n, m \in \mathbb{Z}$ subject to (3.14). Then there exists a unique $g : \hat I_\circ \to \mathbb{Z}$ such that
$$Y_{\bar\imath_2,aq^r}\, Y^{-1}_{i_3,aq^s} = Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ} A_{j,aq^t}^{-g_j^t}, \tag{3.43}$$

where $r, s \in \mathbb{Z}$ are as in (3.16), and such that, for some $N \in \mathbb{N}$, $g_i^n = 0$ for all $n > N$.

Proof. Let $c$ be the source function that vanishes everywhere except
$$c^0_{i_1} = +1, \qquad c^r_{\bar\imath_2} = -1, \qquad c^s_{i_3} = +1. \tag{3.44}$$
Note that then (3.42) is $E(c, 0) = 0$. Consider solving (3.41) for $g$ in the manner given above. The resulting expression $E(c, g)$ has by construction no terms $Y_{i,n}$ with $n > 0$. So it can only be a linear combination of the $Y_{i,0} = \lambda_{i_\bullet}$ and$^{10}$ $Y_{i,-1} = \lambda_{i_\circ}$. But of course $E(c, g) = 0$ identically, since all we have done is to add various re-writings of zero (the $A$'s) to an expression (3.42) which was zero to begin with. Therefore, since the $\lambda_i$ are linearly independent, the identity $E(c, g) = 0$ must be trivial, in the sense that the expression on the right-hand side of (3.39) consists entirely of cancelling pairs of terms and vanishes without appealing to properties of the Coxeter element. Consequently, we have also that $m(c, g) = 1$, which, on rearranging, is (3.43) as required.

$^{10}$ For this proof only, we consider working on $I \times \{-1, 0, 1, \dots\}$.
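The row-by-row solution of (3.41) described before Proposition 3.4 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `is_black` and `solve_g_bottom_up`, the dictionary encoding of the Dynkin diagram and of the functions c and g, and the a_2 example (two nodes, with node 1 coloured black) are ours and do not appear in the text. The source used in the example has three non-zero entries, of the same shape as (3.44).

```python
def is_black(i, r, black_nodes):
    """Node (i, r) is black iff (i in I_bullet and r even) or (i in I_circ and r odd)."""
    return (i in black_nodes) == (r % 2 == 0)


def solve_g_bottom_up(adj, black_nodes, c, N):
    """Solve (3.41) for g, assuming g vanishes in all rows n > N.

    adj         : dict mapping each Dynkin node to the list of its neighbours
    black_nodes : the set I_bullet of the two-colouring
    c           : dict {(i, n): integer}, the source on black nodes
    N           : row beyond which the source c vanishes

    Eq. (3.41) at the black node (i, n) is solved for the white node above it:
        g[i, n-1] = c[i, n] - g[i, n+1] + sum_{j ~ i} g[j, n].
    """
    g = {}
    for n in range(N, 0, -1):  # rows N, N-1, ..., 1
        for i in adj:
            if is_black(i, n, black_nodes):
                g[(i, n - 1)] = (c.get((i, n), 0)
                                 - g.get((i, n + 1), 0)
                                 + sum(g.get((j, n), 0) for j in adj[i]))
    return g


# Hypothetical example: Dynkin diagram a_2 with nodes 1 - 2, node 1 black.
adj = {1: [2], 2: [1]}
black = {1}
# Source with +1 at (1, 0), -1 at (2, 1) and +1 at (1, 2).
c = {(1, 0): +1, (2, 1): -1, (1, 2): +1}
g = solve_g_bottom_up(adj, black, c, N=2)
print(g)
```

In this example the only non-zero value is g at the white node (1, 1), which corresponds to the relation $Y_{2,aq}\,Y^{-1}_{1,aq^2} = Y_{1,a}\,A^{-1}_{1,aq}$ obtained from (3.36) with $i = 1$, $r = 1$.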


Fig. 4. Two copies of Iˆ in the case g = d5 , showing the solutions to the problem (3.41) for the source functions c associated, as in Proposition 3.4, to the identities shown. ⊕ denotes a node at which g = +1; elsewhere g=0

The right-hand side of (3.43) is of the right form to be a monomial in $\chi_q(V_{i_1,a})$, cf. (2.34), but we are by no means done. A priori, it is perhaps not even clear from the procedure above that the $g_i^n$ need all be non-negative: indeed, although we stated the proposition for identities involving three terms (cf. Fig. 4), the obvious generalization to arbitrary identities of the form (3.24) is valid, but the resulting $g_i^n$ are not all non-negative in general. Nonetheless,

Proposition 3.5. Under the assumptions of the preceding proposition, the monomial
$$Y_{\bar\imath_2,aq^r}\, Y^{-1}_{i_3,aq^s} \tag{3.45}$$
of (3.43) occurs in $\chi_q(V_{i_1,a})$.

Proof. First consider the following iterative procedure which generates a finite sequence $\tilde m_0, \tilde m_1, \dots, \tilde m_{h-1}$ of monomials in $\chi_q(V_{i_1,a})$. We set $\tilde m_0 = Y_{i_1,a}$. Roughly speaking, the idea is to lower fully in all black directions to obtain $\tilde m_1$, then lower fully in all white directions to obtain $\tilde m_2$, and so on. More precisely, suppose that for some even $p \ge 0$ we have found an $\tilde m_p$ in $\chi_q(V_{i_1,a})$ of the form
$$\tilde m_p = \Big(\prod_{i\in I_\bullet} Y_{i,aq^p}^{\,b_i}\Big)\Big(\prod_{i\in I_\circ} Y_{i,aq^{p+1}}^{\,b_i}\Big)^{-1} \tag{3.46}$$
for some non-negative integers $b_i$, $i \in I$. Certainly (cf. (2.34))
$$\tilde m_p = Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ} A_{j,aq^t}^{-\tilde g_j^t} \tag{3.47}$$
for some $\tilde g_j^t \ge 0$ with, in view of (3.46), $\tilde g_j^t = 0$ for all $t > p$. Thus for all $k > 0$ and $i \in I$, $\tilde m_p A_{i,aq^{p+k}}$ is not of the form (2.34) and so cannot be in $\chi_q(V_{i_1,a})$. Proposition 2.2 thus guarantees that $\tilde m_p A^{-b_i}_{i,aq^{p+1}}$ is in $\chi_q(V_{i_1,a})$, for $i \in I_\bullet$. By similar reasoning for each black direction in turn, we have that $\chi_q(V_{i_1,a})$ contains
$$\tilde m_{p+1} = \tilde m_p \prod_{i\in I_\bullet} A^{-b_i}_{i,aq^{p+1}}. \tag{3.48}$$
It too is of the form (3.46), but with $p$ odd and the roles of black and white exchanged. With the obvious colour swaps, we then iterate. As stated, the iteration proceeds until we arrive at the lowest monomial $\tilde m_{h-1} = Y^{-1}_{\bar\imath_1,aq^h}$ of $\chi_q(V_{i_1,a})$.$^{11}$

The key observation is that, for all $p \le h-1$, the $\tilde g_j^t$ of (3.47) solve the problem (3.41) in rows $1, 2, \dots, p-1$, for the source function $\tilde c$ defined to be zero everywhere except for $\tilde c^0_{i_1} = +1$, and the initial conditions $\tilde g_i^0 = 0$ for all $i \in I$. Note that for all $p \le h-1$ the $\tilde g_j^t$ of (3.47) are non-negative in rows $1, 2, \dots, p-1$; this is clear from their character-theoretic construction, and is a fact about the solution to (3.41) for the source $\tilde c$ and initial conditions $\tilde g_i^0 = 0$ for all $i \in I$ that is not otherwise manifest.

Now let $g$ and $c$ be the functions of the proof of Proposition 3.4. Since
$$\forall n < r, \quad c_i^n = \tilde c_i^n, \tag{3.49}$$
and because each $g_i^n$ is determined by the values of $c$ and $g$ in rows above (when we think of solving from row 0 downwards), we have
$$\forall n \le r, \quad g_i^n = \tilde g_i^n. \tag{3.50}$$
In particular, the $g_i^n$ are non-negative for all $n \le r$. On the other hand, by imagining turning the diagram upside-down and applying the same argument starting from the $+1$ source in row $s$, we conclude also that the $g_i^n$ are non-negative for all $n \ge r$. Therefore all the $g_i^n$ are non-negative. (Note that this trick would not work if $c$ were non-zero at more than three nodes.) Furthermore, again thinking of solving from row 0 downwards,
$$\tilde c^{\,r}_{\bar\imath_2} = c^r_{\bar\imath_2} + 1 \quad\Rightarrow\quad \tilde g^{\,r+1}_{\bar\imath_2} = g^{r+1}_{\bar\imath_2} + 1 > 0. \tag{3.51}$$
This relation is crucial, because if we are to obtain the desired quadratic monomial (3.43), we must modify the procedure on reaching row $r$: we set $m_1 = \tilde m_1, \dots, m_r = \tilde m_r$, but then rather than lowering $m_r$ completely in the direction $\bar\imath_2$, we want to preserve one factor of $Y_{\bar\imath_2,aq^r}$, and the above inequality guarantees that there is at least one such factor. That is, if
$$m_r = \Big(\prod_{i\in I_\bullet} Y_{i,aq^r}^{\,b_i}\Big)\Big(\prod_{i\in I_\circ} Y_{i,aq^{r+1}}^{\,b_i}\Big)^{-1}, \tag{3.52}$$

$^{11}$ This sequence of “lowering steps” is of the general type mentioned in [Her07b], Remark 2.16. Note that this particular sequence picks out a route through the graph of $\chi_q(V_{i_1,a})$, from the highest to the lowest monomial, that avoids non-trivial $U_q(\widehat{\mathfrak{sl}}_2)$ Kirillov-Reshetikhin modules, in the sense that at each lowering step the relevant $U_q(\widehat{\mathfrak{sl}}_2)$-character is that of an (irreducible) tensor product of fundamental representations at coincident rapidity. It is also interesting to note that the monomials $m_0, m_1, \dots, m_{h-1}$ have the property that the sequence of their classical weights is a permutation of the Coxeter orbit of the highest weight $\lambda_{i_1}$.


supposing in what follows, for the sake of definiteness, that $\bar\imath_2 \in I_\bullet$, then we are guaranteed that $b_{\bar\imath_2} \ge 1$. Setting $n = b_{\bar\imath_2}$, $p = b_{\bar\imath_2} - 1$ in Proposition 2.2 we deduce that
$$m_{r+1} := m_r\, A^{-b_{\bar\imath_2}+1}_{\bar\imath_2,aq^{r+1}} \prod_{i\in I_\bullet\setminus\{\bar\imath_2\}} A^{-b_i}_{i,aq^{r+1}} = Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ:\, t\le r+1} A_{j,aq^t}^{-g_j^t} \tag{3.53}$$
is a monomial in $\chi_q(V_{i_1,a})$. We would then like to continue to apply the above alternating black/white lowering procedure in subsequent rows, preserving the prefactor $Y_{\bar\imath_2,aq^r}$ at each step. Once more we shall argue that this is possible by a finite recursion. Consider a white lowering step: suppose that for some odd $p$ with $s > p \ge r+1$ we have shown that
$$m_p := Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ:\, t\le p} A_{j,aq^t}^{-g_j^t} = Y_{\bar\imath_2,aq^r} \Big(\prod_{i\in I_\circ} Y_{i,aq^p}^{\,b_i}\Big)\Big(\prod_{i\in I_\bullet} Y_{i,aq^{p+1}}^{\,b_i}\Big)^{-1} \tag{3.54}$$
is a monomial in $\chi_q(V_{i_1,a})$, for certain $b_i \in \mathbb{Z}$, $i \in I$. To begin the recursion, this is certainly true for $p = r+1$, as in (3.53). Now observe that in fact, for all $i \in I_\circ$, $b_i = g_i^{p+1}$ (this is clear when thinking of solving for $g$ row-by-row from row 0 downwards) and that these are non-negative as noted above. Thus we can lower in all white directions as before and find that
$$m_{p+1} := m_p \prod_{(j,p+1)\in \hat I_\circ} A_{j,aq^{p+1}}^{-g_j^{p+1}} \tag{3.55}$$
is also a monomial in $\chi_q(V_{i_1,a})$. This completes the white inductive step. For the black step, lowering in the directions $I_\bullet \setminus \{\bar\imath_2\}$ works in exactly the same way. It remains only to check that the lowering step in the direction $\bar\imath_2$ is also valid: but this is clear because $m_{p+1}$ is an $\bar\imath_2$-dominant monomial and $\beta_{\bar\imath_2}(m_{p+1}) = Y_{\bar\imath_2,aq^r}\, Y^{\,n}_{\bar\imath_2,aq^{p+1}}$ with $n = g^{p+2}_{\bar\imath_2} \ge 0$, which is still of the correct form to apply Proposition 2.2. Iterating, we have that every monomial in the sequence
$$Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ:\, t\le p} A_{j,aq^t}^{-g_j^t} \quad \text{for } p = 1, 2, \dots, s \tag{3.56}$$
is in $\chi_q(V_{i_1,a})$. Finally then, at row $s$, we indeed arrive at
$$Y_{i_1,a} \prod_{(j,t)\in \hat I_\circ} A_{j,aq^t}^{-g_j^t} = Y_{\bar\imath_2,aq^r}\, Y^{-1}_{i_3,aq^s}, \tag{3.57}$$
which is the required monomial.
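The comparison in (3.49)–(3.51) between the solution for the single source at $(i_1, 0)$ and the solution for the three-term source can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it solves (3.41) from row 0 downwards, assuming (our convention, in the spirit of footnote 10) that g vanishes in rows −1 and 0 and that (3.41) is also imposed in row 0. The name `solve_g_top_down`, the dictionary encoding, and the a_2 example with node 1 coloured black are hypothetical.

```python
def solve_g_top_down(adj, black_nodes, c, rows):
    """Solve (3.41) downwards for rows n = 0, ..., rows - 1.

    At each black node (i, n) the equation is solved for the white node below:
        g[i, n+1] = c[i, n] - g[i, n-1] + sum_{j ~ i} g[j, n],
    with g taken to be zero in rows -1 and 0 (an assumption of this sketch).
    """
    g = {}
    for n in range(rows):
        for i in adj:
            # (i, n) is black iff (i in I_bullet and n even) or (i in I_circ and n odd)
            if (i in black_nodes) == (n % 2 == 0):
                g[(i, n + 1)] = (c.get((i, n), 0)
                                 - g.get((i, n - 1), 0)
                                 + sum(g.get((j, n), 0) for j in adj[i]))
    return g


# Hypothetical example: Dynkin diagram a_2 with nodes 1 - 2, node 1 black.
adj = {1: [2], 2: [1]}
black_nodes = {1}

# Single +1 source at (1, 0).
single = solve_g_top_down(adj, black_nodes, {(1, 0): +1}, rows=3)
# Three-term source: +1 at (1, 0), -1 at (2, 1), +1 at (1, 2); here r = 1, s = 2.
triple = solve_g_top_down(adj, black_nodes,
                          {(1, 0): +1, (2, 1): -1, (1, 2): +1}, rows=3)
print(single)
print(triple)
```

In this small example the two solutions agree in row 1 and differ by exactly one at the white node (2, 2) directly below the −1 source, which is the pattern expressed by (3.50) and (3.51).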

4. Outlook

It is an immediate corollary of our main result, Theorem 3.1, that Dorey’s rule provides a necessary condition for
$$\mathrm{Hom}_{U_q(\hat{\mathfrak g})}\big(V_{i,a} \otimes V_{j,b} \otimes V_{k,c},\, \mathbb{C}\big) \ne 0.$$
We have not, however, given a general proof here of sufficiency; and it may be that such a proof would require more knowledge about the structure of fundamental $U_q(\hat{\mathfrak g})$-modules than their q-characters alone provide. The correct statement should be the following. Under the conditions of Theorem 3.1, the ordered triple of vectors $(w^{n_1}\lambda_{i_1}, w^{n_2}\lambda_{i_2}, w^{n_3}\lambda_{i_3})$ can be said to be either cyclic or acyclic according to the order in which their projections occur in the oriented $s = 1$ eigenplane of $w$, cf. (3.4). In the example following the theorem, $(w^{-2}\lambda_{\bar l}, \lambda_L, w^5\lambda_h)$ is cyclic, for instance. It should be that, in the cyclic case,
$$\mathrm{Hom}_{U_q(\hat{\mathfrak g})}\big(\mathbb{C},\, V_{i_1,a_1} \otimes V_{i_2,a_2} \otimes V_{i_3,a_3}\big) \ne 0 \quad\text{and}\quad \mathrm{Hom}_{U_q(\hat{\mathfrak g})}\big(V_{i_3,a_3} \otimes V_{i_2,a_2} \otimes V_{i_1,a_1},\, \mathbb{C}\big) \ne 0.$$
(For the a- and d-series, one may verify that this statement indeed unpacks to give Theorems 6.1 and 7.1 of [CP96]. There the proof proceeds by induction on the rank, and relies on specific properties of these root systems.)

Now, as mentioned in Sect. 2, there is a “current” Hopf algebra structure on $U_q(\hat{\mathfrak g})$, originally due to Drinfel’d. It restricts, over the quantum loop algebra, to the following relations:
$$\Delta(\phi_i^\pm(u)) = \phi_i^\pm(u) \otimes \phi_i^\pm(u), \tag{4.1}$$
$$\Delta(x_i^+(u)) = 1 \otimes x_i^+(u) + x_i^+(u) \otimes \phi_i^-(1/u), \tag{4.2}$$
$$\Delta(x_i^-(u)) = x_i^-(u) \otimes 1 + \phi_i^+(1/u) \otimes x_i^-(u), \tag{4.3}$$
$$S(\phi_i^\pm(u)) = \phi_i^\pm(u)^{-1}, \tag{4.4}$$
$$S(x_i^+(u)) = -x_i^+(u)\,\phi_i^-(1/u), \qquad S(x_i^-(u)) = -\phi_i^+(1/u)\,x_i^-(u), \tag{4.5}$$
$$\epsilon(\phi_i^\pm(u)) = 1, \qquad \epsilon(x_i^\pm(u)) = 0, \tag{4.6}$$
where $\Delta$ is the coproduct, $S$ the antipode and $\epsilon$ the counit. This Hopf algebra structure is twist-equivalent to the standard one in a sense given in [EKP07]; note that the infinite sums on the right of the coproducts above require careful interpretation [Her05, Her07a, Gro01]. With respect to this “current” Hopf algebra structure, it is clear that the singlet state in $V_{i_1,a_1} \otimes V_{i_2,a_2} \otimes V_{i_3,a_3}$ must be of the form
$$\big|Y^{-1}_{\bar\imath_1,a_1q^h}\big\rangle \otimes \big|Y_{\bar\imath_1,a_1q^h}\,Y^{-1}_{i_3,a_3}\big\rangle \otimes \big|Y_{i_3,a_3}\big\rangle,$$
where the first and last tensor factors are the lowest and highest weight vectors of the respective representations, and the middle factor is an eigenvector of $\phi_i^\pm(u)$ with l-weight corresponding to the monomial shown.

Finally, let us remark that it would be interesting to investigate whether generalizations of our results exist for representations other than the fundamental ones (as was suggested in [EKMY05] based on the structure of local charges in certain integrable sigma models). The natural candidates are the Kirillov-Reshetikhin modules $W^{(k)}_{i,a}$, which can be thought of as the “minimal affinizations” [CP95] of the highest weight representations $V_{k\lambda_i}$ of $\mathfrak g$ and for which the Frenkel-Mukhin algorithm is known to work [Nak03b, Her06]. The form of our arguments suggests that such generalizations may be possible, perhaps using the braid group actions of [Bec94, Cha02] to lift the periodicity of the Coxeter element.

Acknowledgements. We are grateful to Patrick Dorey and Niall MacKay for valuable discussions and suggestions. During much of the preparation of this work, C.A.S.Y. was funded by the Leverhulme Trust and R.Z. by an EPSRC postdoctoral fellowship. C.A.S.Y. is funded by a fellowship from the Japan Society for the Promotion of Science.

References

[BCDS90] Braden, H.W., Corrigan, E., Dorey, P.E., Sasaki, R.: Affine Toda field theory and exact S matrices. Nucl. Phys. B 338, 689–746 (1990)
[Bec94] Beck, J.: Braid group action and quantum affine algebras. Commun. Math. Phys. 165, 555–568 (1994)
[Bra92] Braden, H.W.: A note on affine Toda couplings. J. Phys. A 25, L15–L20 (1992)
[CH] Chari, V.J., Hernandez, D.: Beyond Kirillov-Reshetikhin Modules. In: Quantum Affine Algebras Extended Affine Lie Algebras, and Their Applications, Y. Gro et al (eds.), Cont. Math. 506, Providence, RI: Amer. Math. Soc., 2010, pp. 49–81
[Cha02] Chari, V.: Braid group actions and tensor products. Int. Math. Res. Notices. 2002, 357–382 (2002)
[CM06] Chari, V., Moura, A.: Characters of fundamental representations of quantum affine algebras. Acta Appl. Math. 90, 43–63 (2006)
[Cor94] Corrigan, E.: Recent developments in affine Toda quantum field theory, Lecture at CRM-CAP Summer School, 16-24 Aug. 1994 (Banff, Alberta, Canada), available at http://arxiv.org/abs/hep-th/9412213v1, 1994
[CP] Chari, V., Pressley, A.: A guide to quantum groups. Cambridge, UK: Cambridge. Univ. Pr., 1994
[CP91] Chari, V., Pressley, A.: Quantum affine algebras. Commun. Math. Phys. 142, 261–283 (1991)
[CP95] Chari, V., Pressley, A.: Minimal affinization of representations of quantum groups: the simply laced case. Lett. Math. Phys. 35, 99–114 (1995)
[CP96] Chari, V., Pressley, A.: Yangians, integrable quantum systems and Dorey’s rule. Commun. Math. Phys. 181, 265–302 (1996)
[Dor91] Dorey, P.: Root systems and purely elastic S matrices. Nucl. Phys. B 358, 654–676 (1991)
[Dor92a] Dorey, P.: Hidden geometrical structures in integrable models. Available at http://arxiv.org/abs/hep-th/9212143v2, 1992
[Dor92b] Dorey, P.: Root systems and purely elastic S matrices. 2. Nucl. Phys. B 374, 741–762 (1992)
[Dor93] Dorey, P.: A remark on the coupling dependence in affine Toda field theories. Phys. Lett. B 312, 291–298 (1993)
[Dri85] Drinfeld, V.G.: Hopf algebras and the quantum Yang-Baxter equation. Sov. Math. Dokl. 32, 254–258 (1985)
[Dri88] Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Sov. Math. Dokl. 36, 212–216 (1988)
[EKMY05] Evans, J.M., Kagan, D., MacKay, N.J., Young, C.A.S.: Quantum, higher-spin, local charges in symmetric space sigma models. JHEP 01, 020 (2005)
[EKP07] Enriquez, B., Khoroshkin, S., Pakuliak, S.: Weight functions and Drinfeld currents. Commun. Math. Phys. 276, 691–725 (2007)
[FKS00] Fring, A., Korff, C., Schulz, B.J.: On the universal representation of the scattering matrix of affine Toda field theory. Nucl. Phys. B 567, 409–453 (2000)
[FLO91] Fring, A., Liao, H.C., Olive, D.I.: The mass spectrum and coupling in affine Toda theories. Phys. Lett. B 266, 82–86 (1991)
[FM01] Frenkel, E., Mukhin, E.: Combinatorics of q-characters of finite-dimensional representations of quantum affine algebras. Commun. Math. Phys. 216, 23–57 (2001)
[FO92] Fring, A., Olive, D.I.: The fusing rule and the scattering matrix of affine Toda theory. Nucl. Phys. B 379, 429–447 (1992)
[FR98] Frenkel, E., Reshetikhin, N.: The q-characters of representations of quantum affine algebras and deformations of W-algebras. Contemp. Math. 248, 163–205 (1998)
[Gro01] Grosse, P.: On quantum shuffle and quantum affine algebras. J. Alg. 318(2), 495–519 (2001)
[Her05] Hernandez, D.: Representations of quantum affinizations and fusion product. Trans. Groups 10, 163–200 (2005)
[Her06] Hernandez, D.: The Kirillov-Reshetikhin conjecture and solutions of T-systems. J. Reine Angew. Math. 596, 63–87 (2006)
[Her07a] Hernandez, D.: Drinfeld coproduct, quantum fusion tensor category and applications. Proc. London Math. Soc. 95(3), 567–608 (2007)
[Her07b] Hernandez, D.: On minimal affinizations of representations of quantum groups. Commun. Math. Phys. 277, 221–259 (2007)
[HL09] Hernandez, D., Leclerc, B.: Cluster algebras and quantum affine algebras. Duke Math. J. 154(2), 265–341 (2009)
[Kni95] Knight, H.: Spectra of tensor products of finite dimensional representations of Yangians. J. Algebra 174(1), 187–196 (1995)
[Kum88] Kumar, S.: A proof of the Parthasarathy-Ranga Rao-Varadarajan conjecture. Invent. Math. 93, 117–130 (1988)
[Mac91] MacKay, N.J.: New factorized S matrices associated with SO(n). Nucl. Phys. B 356, 729–749 (1991)
[Mac92] MacKay, N.J.: On the bootstrap structure of Yangian invariant factorized S matrices. Int. J. Mod. Phys. (Proc. Suppl.), 3A, 360–364 (1992), presented at 21st Conference on Differential Geometric Methods in Theoretical Physics (XXI DGM), Tianjin, China, 5-9 Jun 1992
[Mat89] Mathieu, O.: Construction d’un groupe de Kac-Moody et applications. Compositio Math. 69, 37–60 (1989)
[Nak03a] Nakajima, H.: T-analogs of q-characters of quantum affine algebras of type A_n and D_n. Contemp. Math. 325, 141–160 (2003)
[Nak03b] Nakajima, H.: T-analogs of q-characters of Kirillov-Reshetikhin modules of quantum affine algebras. Represent. Theory 7, 259–274 (2003)
[Nak04] Nakajima, H.: Quiver varieties and t-analogs of q-characters of quantum affine algebras. Ann. Math. 160(3), 1057–1097 (2004)
[Nak06] Nakajima, H.: t-analogs of q-characters of quantum affine algebras of type E_6, E_7, E_8. In: Representation Theory of Algebraic Groups and Quantum Groups, Prog. Math. 284, Berlin-Heidelberg-New York: Springer, 2011, pp. 257–272
[NN08] Nakai, W., Nakanishi, T.: On Frenkel-Mukhin algorithm for q-character of quantum affine algebras. To appear in Adv. Stud. in Pure Math., Proc. of Workshop “Exploration of new Structures and Natural Constructions” in Math. Phys. (Nagoya, 2007). Available at http://arxiv.org/abs/0801.2239v2 [math.QA], 2008
[Oot97] Oota, T.: Q-deformed Coxeter element in non-simply laced affine Toda field theories. Nucl. Phys. B 504, 738–752 (1997)
[PRRV67] Parthasarathy, K.R., Ranga Rao, R., Varadarajan, V.S.: Representations of complex semi-simple Lie groups and Lie algebras. Ann. Math. 85, 383–429 (1967)
[SWK00] Saleur, H., Wehefritz-Kaufmann, B.: Thermodynamics of the complex SU(3) Toda theory. Phys. Lett. B 481, 419–426 (2000)
[TW99] Takacs, G., Watts, G.: Non-unitarity in quantum affine Toda theory and perturbed conformal field theory. Nucl. Phys. B 547, 538–568 (1999)
[Var00] Varagnolo, M.: Quiver varieties and Yangians. Lett. Math. Phys. 53, 273–283 (2000)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 302, 815–841 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1191-3


Breather Solutions in Periodic Media

Carsten Blank, Martina Chirilus-Bruckner, Vincent Lescarret, Guido Schneider

Institut für Analysis, Dynamik und Modellierung, Universität Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany. E-mail: [email protected]

Received: 25 February 2010 / Accepted: 16 September 2010
Published online: 1 February 2011 – © Springer-Verlag 2011

Abstract: For nonlinear wave equations existence proofs for breathers are very rare. In the spatially homogeneous case, up to rescaling, the sine-Gordon equation $\partial_t^2 u = \partial_x^2 u - \sin(u)$ is the only nonlinear wave equation which is known to possess breather solutions. For nonlinear wave equations in periodic media no examples of breather solutions have been known so far. Using spatial dynamics, center manifold theory and bifurcation theory for periodic systems we construct for the first time such time periodic solutions of finite energy for a nonlinear wave equation
$$s(x)\,\partial_t^2 u(x, t) = \partial_x^2 u(x, t) - q(x)\,u(x, t) + r(x)\,u(x, t)^3,$$
with spatially periodic coefficients $s$, $q$, and $r$ on the real axis. Such breather solutions play an important role in theoretical scenarios where photonic crystals are used as optical storage.

Contents
1. Introduction
   1.1 The construction
   1.2 On the spectral assumption (Spec)
   1.3 Some remarks
2. Symmetries of the Spatial Dynamics Formulation
3. Application of Floquet’s Theory
4. An Example for a Suitable Choice of s = s(x)
5. The Reversibility
   5.1 Preparations
   5.2 The reversible change of variables
   5.3 Conjugation of the old and the new reversibility operator
6. The Center Manifold Reduction
7. Construction of a Homoclinic Solution
8. Persistence of the Homoclinic Solution
A. The Physical Motivation: Photonic Crystals as Optical Storage

1. Introduction

We consider a nonlinear wave equation of the form
$$s(x)\,\partial_t^2 u(x, t) = \partial_x^2 u(x, t) - q(x)\,u(x, t) + r(x)\,u(x, t)^3, \tag{1.1}$$

with $x \in \mathbb{R}$, $t \in \mathbb{R}$, $u(x, t) \in \mathbb{R}$ and $a$-periodic real-valued coefficient functions $s$, $q$ and $r$, i.e., $s(x) = s(x + a)$, $q(x) = q(x + a)$, and $r(x) = r(x + a)$, where w.l.o.g. in the following $a = 1$. It is the purpose of this paper to give an example of coefficient functions $s$, $q$, and $r$ such that (1.1) possesses breather solutions $u$, i.e., for this choice of coefficient functions we prove the existence of spatially localized, $2\pi/\omega$-time periodic solutions of finite energy. With $\chi_M$ being the characteristic function of the set $M$ our result is as follows.

Theorem 1.1. Let $s(x) = \big(\chi_{[0,6/13]} + 16\,\chi_{(6/13,7/13)} + \chi_{[7/13,1]}\big)(x \bmod 1)$, $q(x) = (q_0 - \varepsilon^2)\,s(x)$ with $q_0 \in \mathbb{R}$ defined explicitly in (4.5), and $r(x) = 1$. Then there exist an $\varepsilon_0 > 0$ and a $C > 0$ such that for all $\varepsilon \in (0, \varepsilon_0)$, Eq. (1.1) possesses breather solutions with minimal period $2\pi/\omega_*$, where $\omega_* = 13\pi/16$, i.e., there are solutions $u : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ of (1.1) which satisfy for a $\beta > 0$ that
$$\lim_{|x|\to\infty} u(x, t)\,e^{\beta|x|} = 0 \quad \forall t \in \mathbb{R}, \qquad u(x, t) = u(x, t + 2\pi/\omega_*) \quad \forall x, t \in \mathbb{R}, \tag{1.2}$$
and
$$\sup_{x,t\in\mathbb{R}} \big|u(x, t) - u_{\mathrm{app}}(x, t)\big| \le C\varepsilon^2, \tag{1.3}$$
where
$$u_{\mathrm{app}}(x, t) = \varepsilon\,\eta_1\,\mathrm{sech}(\varepsilon\eta_2 x)\,w_1(\pi, x)\,e^{i\pi x}\,e^{i\omega_* t} + \mathrm{c.c.}, \tag{1.4}$$
with constants $\eta_1$, $\eta_2$ and a 1-periodic function $w_1(\pi, \cdot)$ which are all defined subsequently in Remark 1.5. The solution is $C^\sigma$ w.r.t. $t$ for every fixed $\sigma$, but only piecewise smooth w.r.t. $x$ and $C^1$ w.r.t. $x$ at the jumps of $s$.

A sketch of such a solution can be found in Fig. 1. The novelty of this result is as follows. For spatially homogeneous nonlinear wave equations, up to rescaling, the sine-Gordon equation $\partial_t^2 u = \partial_x^2 u - \sin(u)$ is the only nonlinear wave equation which is known to possess breather solutions. For nonlinear wave equations in periodic media the situation is expected to be different. However, no examples of breather solutions have been known so far. Using spatial dynamics, center


Fig. 1. A breather solution in periodic media. The wavelength of the carrier wave and of the medium are of a comparable order

manifold theory and bifurcation theory for periodic systems, we construct for the first time such time periodic solutions of finite energy for a nonlinear wave equation with spatially periodic coefficients. Note that our method of construction relies heavily on the subsequent condition (Spec). The special choice for $s$ in Theorem 1.1 is a carefully tuned example which fulfills this condition. It is not clear at this point whether further examples exist which are not a trivial adaptation of the present one; in other words, the genericity of breathers in periodic media is still an open question. Before we explain the relevance of Theorem 1.1 in applications, such as the use of photonic crystals as optical storage, and generalizations of the result, we explain our method to construct such solutions and the major mathematical difficulty associated with the problem.

1.1. The construction. The subsequent explanations are made for general 1-periodic coefficient functions $s = s(x)$, $q = q(x)$, and $r = r(x)$ in order to put the problem in a general framework which allows us to discuss subsequently possible generalizations of Theorem 1.1. For the construction of the breather solutions we will use spatial dynamics, center manifold theory, and bifurcation theory. Spatial dynamics means that we write (1.1) as an evolutionary system w.r.t. $x \in \mathbb{R}$ in the phase space of $2\pi/\omega$-time periodic functions, i.e., we consider
$$\partial_x u(x, t) = v(x, t), \qquad \partial_x v(x, t) = s(x)\,\partial_t^2 u(x, t) + q(x)\,u(x, t) - r(x)\,u(x, t)^3. \tag{1.5}$$
Due to the periodic dependence of $s$, $q$, and $r$ on $x$ the system is non-autonomous. Using the symmetries of the system we can restrict ourselves to solutions which are odd w.r.t. $t$ in order to reduce the dimensionality of the existence problem by a factor 2. If the spectral assumption

(Spec): The linearization of the periodic spatial dynamics system (1.5) possesses two Floquet exponents with real part zero and the rest of the Floquet spectrum is uniformly bounded away from the imaginary axis, cf. the left panel of Fig. 2,

holds, then by using invariant manifold theory for periodic systems the infinite-dimensional spatial dynamics system (1.5) can be reduced to a two-dimensional system on the center


Fig. 2. The spectral picture of the linearized spatial dynamics formulation (1.5) for ε = 0 and ε > 0. All Floquet exponents possess imaginary part iπ

manifold which is associated with the two Floquet exponents with real part zero. As will be explained in Sect. 1.2, for constant coefficients $s$, $q$ and $r$ the spectral condition (Spec) is not satisfied, and as a consequence breather solutions for (1.1) cannot be constructed with the spatial dynamics method in the constant coefficient case if the minimal period w.r.t. $t$ is non-zero, cf. Remark 1.4. Thus, for a given minimal period $2\pi/\omega_*$ w.r.t. $t$ the coefficients $s$, $q$ and $r$ have to be suitably chosen. By moving the two central Floquet exponents from the imaginary axis, cf. Fig. 2, bifurcating homoclinic solutions can be found in the lowest order approximation of the reduced system, cf. Fig. 10 in Sect. 8. Using reversibility arguments for the reduced system finally gives the persistence of the homoclinic solution w.r.t. higher order perturbations. These homoclinic solutions of the spatial dynamics formulation (1.5) in the phase space of time-periodic solutions correspond to breather solutions in the original formulation (1.1). In order to have the reversibility of the spatial dynamics formulation (1.5), i.e. the invariance of (1.5) under $(x, u) \mapsto (-x, u)$, the coefficient functions $s = s(x)$, $q = q(x)$ and $r = r(x)$ have to be even w.r.t. $x$, i.e., $s(x) = s(-x)$, $q(x) = q(-x)$, and $r(x) = r(-x)$. Hence the strategy to establish the existence of such solutions is very clear. However, it is not clear at all how to choose $s$, $q$, and $r$ such that the assumption (Spec) can become true.

1.2. On the spectral assumption (Spec). It is well known that the solutions of the linearization $s(x)\,\partial_t^2 u = \partial_x^2 u - q(x)\,u$ of (1.1) at the origin are given by oscillations of Bloch modes, namely
$$e^{ilx}\,w_n(l, x)\,e^{i\omega_n(l)t}, \tag{1.6}$$
with $w_n(l, x) = w_n(l, x + 1)$ and curves of eigenvalues $l \mapsto \omega_n(l)$, where $\omega_n(l) \in \mathbb{R}$ for $l \in (-\pi, \pi]$ and $n \in \mathbb{Z}\setminus\{0\}$. They are ordered such that $\omega_n(l) \ge \omega_{n-1}(l)$. Spectral


Fig. 3. If the dotted line l → mω∗ falls into a spectral gap between the curves of eigenvalues plotted over the Bloch wave numbers l of the linearized time evolutionary system (left upper panel), then in the spectral picture of the linearized space evolutionary system two Floquet exponents off the imaginary axis occur (right upper panel). In the other case they are on the imaginary axis (lower panels)

gaps can occur, i.e., the set $\{\, \omega_n(l) \mid l \in (-\pi, \pi],\ n \in \mathbb{Z}\setminus\{0\} \,\} \subset \mathbb{C}$ is in general not connected for periodic $s$ and $q$, cf. [7]. In the case of Schrödinger operators with periodic potential a detailed discussion of the occurrence of spectral gaps can be found for instance in [28, Sect. XIII.16]; see in particular [28, Theorem XIII.91]. There is a one-to-one correspondence between the spectral pictures of the linearizations of the time evolutionary system (1.1) and of the space evolutionary system (1.5). Using Fourier series $u(x, t) = \sum_{m\in\mathbb{Z}} \hat u_m(x)\,e^{im\omega_* t}$ with respect to time, the spectral problem of the space evolutionary system (1.5) splits into infinitely many decoupled problems (3.1), which are indexed by $m \in \mathbb{Z}$ and each of which, modulo $2\pi i$, creates two Floquet exponents. When the integer multiple $m\omega_*$ of the basic temporal wavenumber $\omega_*$ falls into a spectral gap of the time evolutionary system (1.1), then modulo $2\pi i$ there are two Floquet exponents off the imaginary axis in the $m$th space evolutionary system (1.5). In the other case the Floquet exponents are on the imaginary axis, see Fig. 3. In order to satisfy the spectral assumption (Spec), all integer multiples $m\omega_*$ of the basic temporal wavenumber $\omega_*$ except for two (see Sect. 2) have to fall into a spectral gap of the time evolutionary system (1.1). For smooth $s$ the spectral gaps become smaller and smaller for larger $n$ and, therefore, integer multiples $m\omega_*$ of $\omega_*$ in general do not fall into spectral gaps. In [7, Theorem 4.5.3] and [23] there are estimates on the asymptotic size
$$l_n = \inf_{l\in(-\pi,\pi]} \omega_{n+1}(l) - \sup_{l\in(-\pi,\pi]} \omega_n(l)$$


Fig. 4. Graph of the function x → s(x)

of the $n$th spectral gap for $n \to \infty$. For the sake of clarity we restate the findings of [7, Theorem 4.5.1]:
i) $l_n = O(n)$ if $s(x)$ is piecewise smooth,
ii) $l_n = O(1)$ if $s'(x)$ exists and is piecewise smooth,
iii) $l_n = O(n^{-r-1})$ if $s^{(r+2)}(x)$ and $q^{(r)}(x)$ all exist and are piecewise smooth.
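The link between spectral gaps and Floquet exponents off the imaginary axis can be checked numerically for a piecewise constant coefficient. The following Python sketch is an illustration under our own assumptions, not the computation of Sect. 4: it takes the step function $s$ of Theorem 1.1 with $q = 0$, uses the fact that the $m$th Fourier mode of the linearized system then satisfies $\hat u_m'' = -m^2\omega^2 s(x)\,\hat u_m$, and evaluates the trace of the monodromy matrix over one period. Since this matrix has determinant one, a trace of modulus larger than 2 means that the two Floquet multipliers are real and off the unit circle. The function names `mat_mul`, `segment` and `discriminant` are hypothetical.

```python
import math


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0],
             A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0],
             A[1][0] * B[0][1] + A[1][1] * B[1][1]]]


def segment(k, length):
    """Transfer matrix of u'' = -k^2 u for (u, u') over an interval (k > 0)."""
    c, s = math.cos(k * length), math.sin(k * length)
    return [[c, s / k], [-k * s, c]]


def discriminant(m, omega):
    """Trace of the monodromy matrix of the m-th mode (m >= 1) for q = 0."""
    k = m * omega
    M = segment(k, 6 / 13)                     # s = 1 on [0, 6/13]
    M = mat_mul(segment(4 * k, 1 / 13), M)     # s = 16 on (6/13, 7/13)
    M = mat_mul(segment(k, 6 / 13), M)         # s = 1 on [7/13, 1]
    return M[0][0] + M[1][1]


omega_star = 13 * math.pi / 16
for m in (1, 2, 3, 4, 5):
    print(m, discriminant(m, omega_star))
```

For this choice of $s$ and $\omega_* = 13\pi/16$ the trace equals $2\cos(3m\pi/4)\cos(m\pi/4) - \tfrac{17}{4}\sin(3m\pi/4)\sin(m\pi/4)$, which is $-25/8$ for every odd $m$, so every odd multiple of $\omega_*$ lies in a gap when $q = 0$.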

To prevent the gaps from shrinking we have to ensure that $s$ is at most once differentiable. A particularly handy choice for $s$ therefore seems to be a step function, since, on the one hand, it will ensure that the open gaps are $O(1)$ (due to the lack of regularity) and, on the other hand, the resulting band structure can be computed explicitly, which is important for tuning it to fulfill the crucial condition (Spec). An example of coefficient functions $s$ and $q$ leading to $O(1)$-spectral gaps around each value $(2n + 1)\omega_*$ with $\omega_* = 13\pi/16$ and $n \in \mathbb{N}$ is
$$s(x) = \big(\chi_{[0,6/13]} + 16\,\chi_{(6/13,7/13)} + \chi_{[7/13,1]}\big)(x \bmod 1) \quad\text{and}\quad q(x) = 0,$$
where $\chi_M$ is the characteristic function of the set $M$ (Fig. 4). Since $q$ according to [7, Theorem 4.5.3 ii)] does not affect the asymptotics of the spectral gaps, we can choose $q$ to adjust two Floquet exponents on the imaginary axis without destroying the overall spectral picture. It turns out that a choice $q(x) = \mu s(x)$ with $\mu \in \mathbb{R}$ is sufficient in our case. See Fig. 2. Since we have chosen a cubic nonlinearity as in nonlinear optics, cf. Remark 1.2, it is sufficient that only every second gap opens in the required way. See Sect. 4 for the details.

1.3. Some remarks. The plan of the paper is as follows. After exploiting the symmetries of the spatial dynamics formulation (1.5) in Sect. 2 we recall Floquet theory for periodic systems and analyze the spectral situation for the above choice of $s$ and $q$ in Sects. 3 and 4. The center manifold reduction is made in Sect. 6 and the reduced system is analyzed in Sects. 7 and 8. In the Appendix we explain how (1.1) can be derived as a model for the evolution of the electric field in photonic crystals. Before we start with this plan we close the Introduction with a number of remarks mainly explaining related results in mathematics, the relevance of the result w.r.t. applications, and possible generalizations of the result.

Remark 1.2. The paper is motivated by theoretical scenarios where photonic crystals are used as optical storage [5]. Photonic crystals consist of a dielectric material, for instance

Breather Solutions in Periodic Media


glass, with a periodic structure with a wavelength comparable to the wavelength of light. They are suitable tools for the construction of all-optical devices in photonics, which is, loosely speaking, electronics with photons instead of electrons. Due to the vanishing group velocities, which are implied by the horizontal tangencies in the left panel of Fig. 3, photonic crystals can in principle be used as optical storage, where the breather solution of Theorem 1.1 stands for a one in the digital encoding of information. In the Appendix we explain more details and how Eq. (1.1) can be derived as a model for the dynamics of light in photonic crystals.

Remark 1.3. The construction of breather solutions is a very active field of research. Breather solutions are known to exist in various systems, and so we refrain from giving a complete overview. In discrete systems such as Hamiltonian networks of weakly coupled oscillators or nonlinear Schrödinger lattices, breather solutions have been constructed for instance in [20,25]. In $S^1$-symmetric continuous systems such solutions are widely known to exist. In the spatially periodic case they have been constructed for instance in [24]. In various limits connecting the discrete and the continuous case, breather solutions have been constructed very recently in [1,2,27]. For a recent overview of existence results for breathers in lattice equations see [19]. The result which is closest to the presented result is the construction of breathers in a diatomic FPU-model in [17].

Remark 1.4. For nonlinear wave equations, existence proofs for breathers are very rare. In the homogeneous case there are no spectral gaps and so, in accordance with Fig. 3, all eigenvalues of the spatial dynamics formulation lie on the imaginary axis. Hence, up to rescaling, the sine-Gordon equation $\partial_t^2 u = \partial_x^2 u - \sin(u)$ is the only nonlinear wave equation in a homogeneous medium which is known to possess breather solutions in NLS-form [3,6]. In general only solutions with small tails at infinity have been proven to exist [10–13]. However, for periodic media the situation is different. Through the coefficient functions $s$, $q$, and $r$ we have infinitely many parameters which can be adjusted to obtain an intersection of the one-dimensional weakly unstable and the one-dimensional weakly stable manifold which are associated to the two Floquet exponents close to the imaginary axis.

Remark 1.5. Solutions of (1.1) can be approximated via the ansatz

\[ u(x,t) = \varepsilon A\big(\varepsilon(x - c_g t), \varepsilon^2 t\big)\, w_n(l_0,x)\, e^{i l_0 x} e^{i\omega_n(l_0)t} + \mathrm{c.c.} \tag{1.7} \]

with $A(X,T) \in \mathbb{C}$, group velocity $c_g = \omega_n'(l_0) \in \mathbb{R}$ and $0 < \varepsilon \ll 1$ by an NLS-equation

\[ \partial_T A = -i\,\frac{\omega_n''(l_0)}{2}\,\partial_X^2 A + i\gamma_n(l_0) A|A|^2 \tag{1.8} \]

with coefficient

\[ \gamma_n(l_0) = 3\int_0^1 r(x)\,|w_n(l_0,x)|^4\,dx, \tag{1.9} \]

where $c_g = 0$ for $l_0 = 0, \pm\pi$. The NLS-equation possesses pulse solutions $A(X,T) = \tilde A(X)e^{i\tilde\omega T}$ if $\omega_n''(l_0)\gamma_n(l_0) < 0$, of the form displayed in (1.4). In [4] an approximation result has been established that guarantees that solutions of (1.1) can be approximated on an $O(\varepsilon^{-2})$ time scale via even more general solutions of this NLS-equation. The approximation (1.4) is obtained from (1.7) by choosing $l_0 = \pi$. Because of the way the gaps open under our choice of $s$ and $q$, we have to choose $l_0 = \pi$ in order to satisfy (Spec). See Sect. 4.


C. Blank, M. Chirilus-Bruckner, V. Lescarret, G. Schneider

Fig. 5. In the right panel there is an example for which ω∗ touches a band edge at l0 = ±π . This is the situation of Theorem 1.1. In the left panel there is an example for which ω∗ touches a band edge at l0 = 0. This situation cannot be handled with the coefficient function s = s(x) chosen in this paper

Remark 1.6. Theorem 1.1 is formulated for the first spectral gap at $\omega_* = 13\pi/16$ and $l_0 = \pi$, as can be seen in the right panel of Fig. 5. It can also be formulated for all band edges at $l_0 = \pi$ with the same function $s = s(x)$. For band edges at $l_0 = 0$, as shown in the left panel of Fig. 5, the function $s = s(x)$ cannot be used, as will be explained below. Moreover, $r = 1$ can be replaced by arbitrary $\gamma r(x)$ as long as the non-degeneracy condition $\gamma_n(l_0) \neq 0$ is satisfied. Breather solutions then exist either for $\gamma = -1$ or for $\gamma = 1$. By rescaling time, other values of $\omega_*$ can be reached.

Remark 1.7. An alternative approach to spatial dynamics and center manifold theory would be a Lyapunov-Schmidt reduction as used in [26]. The subsequent infinite set of ODEs (2.3) is then considered as an elliptic problem. In the subspace of reflection symmetric solutions the linearization around the approximate pulse solution possesses no zero eigenvalue, so that the implicit function theorem can be applied to prove the persistence of the homoclinic solutions under higher order perturbations.

Remark 1.8. The subsequent Lemma 4.1 explains how to construct step functions $s$ and $q$ such that (1.1) possesses breather solutions. By a simple perturbation argument it is also clear that the situation is structurally stable if small smooth perturbations are added to the step functions $s$ and $q$.

Remark 1.9. So far we were unable to find $s$ and $q$ for the construction of breather solutions with $l_0 = 0$, cf. Remark 1.5, and of breather solutions for general nonlinearities, in particular for those containing quadratic terms.

Remark 1.10. The assumption that the functions $s$, $q$, and $r$ are even is necessary to establish Theorem 1.1 with our method. However, we strongly expect that there are breather solutions in the non-even case, too.
The functions s, q, and r give infinitely many parameters which we expect to allow us to bring the one-dimensional stable and the one-dimensional unstable manifold in the two-dimensional center manifold to an intersection. However, this is a different story. Remark 1.11. We refrain from speculating about the stability of the breather solutions constructed in Theorem 1.1. We only remark that since the solutions are not very smooth w.r.t. x we expect that some nonlinear stability result will be hard to obtain, even if some linear stability result can be established.


Notation. Constants which can be chosen independently of the small parameter $0 < \varepsilon \ll 1$ are often denoted by the same symbol $C$. A solution of a non-autonomous differential equation $\frac{du}{dx} = f(u,x)$ with the initial condition $u|_{x=x_0} = u_0$ is denoted by $u = u(x, x_0, u_0)$.

2. Symmetries of the Spatial Dynamics Formulation

Since we are interested in time-periodic solutions of Eq. (1.1), i.e., $u(x, t + 2\pi/\omega) = u(x,t)$ for all $x \in \mathbb{R}$, we use the Fourier series $u(x,t) = \sum_{m\in\mathbb{Z}} \hat u_m(x) e^{im\omega_* t}$ with respect to time, leading to the system of countably many ODEs

\[ \partial_x^2 \hat u_m(x) = -s(x)\, m^2\omega_*^2\, \hat u_m(x) + q(x)\hat u_m(x) - r(x)\hat g_m(x), \qquad m \in \mathbb{Z}, \tag{2.1} \]

where

\[ \hat g_m(x) = \sum_{\substack{n_1,n_2,n_3\in\mathbb{Z} \\ n_1+n_2+n_3=m}} \hat u_{n_1}(x)\,\hat u_{n_2}(x)\,\hat u_{n_3}(x), \qquad m \in \mathbb{Z}. \tag{2.2} \]

There are a number of linear subspaces invariant under the evolution of (2.1). These are as follows. The invariant subspace corresponding to real solutions of (2.1) is given by $U_{\mathbb{R}} = \{(\hat u_n)_{n\in\mathbb{Z}} \mid \hat u_n = \overline{\hat u_{-n}}\}$. Since the associated first order system (1.5) is invariant under the transform $S : (t,u,v) \mapsto (-t,-u,-v)$, also $U_{odd} = \{(\hat u_n)_{n\in\mathbb{Z}} \mid \hat u_n = -\hat u_{-n}\}$ is an invariant subspace. Since we have a cubic nonlinearity, also $U_O = \{(\hat u_n)_{n\in\mathbb{Z}} \mid \hat u_{2n} = 0\}$, the space of solutions whose even coefficients vanish, is an invariant subspace. Therefore, the intersection of all these subspaces

\[ U_{\mathbb{R}} \cap U_{odd} \cap U_O = \{(\hat u_m)_{m\in\mathbb{Z}} \mid \mathrm{Re}\,\hat u_m = 0,\ \hat u_{2m} = 0,\ \hat u_m = -\hat u_{-m},\ m\in\mathbb{Z}\} = \hat X \]

is also invariant. In the following we restrict our analysis to those solutions of (2.1) which are in $\hat X$ for fixed $x$, i.e., in particular we can restrict ourselves to $m \in \mathbb{N}_{odd} = \{1,3,5,\ldots\}$. Since $u_m \in \mathbb{R}$ introduced by $\hat u_m = i u_m$ satisfies (2.1), except for the opposite sign in front of the nonlinearity (because $i^3 = -i$), the subsequent systems have the properties of real-valued systems, i.e., we consider in the following

\[ \partial_x^2 u_m(x) = -s(x)\, m^2\omega_*^2\, u_m(x) + q(x)u_m(x) + r(x)g_m(x), \qquad m \in \mathbb{Z}, \tag{2.3} \]

where

\[ g_m(x) = \sum_{\substack{n_1,n_2,n_3\in\mathbb{Z} \\ n_1+n_2+n_3=m}} u_{n_1}(x)\,u_{n_2}(x)\,u_{n_3}(x), \qquad m \in \mathbb{Z}. \tag{2.4} \]
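The invariance of the odd-index subspace under the cubic convolution (2.4) can be illustrated with a small numerical sketch. The following is only an illustration under stated assumptions: the coefficient values and the truncation to $|m| \le 5$ are hypothetical, and the helper name `cubic_convolution` is ours, not part of the paper.

```python
from itertools import product

def cubic_convolution(u, m):
    """Truncated version of (2.4): sum of u[n1]*u[n2]*u[n3] over n1+n2+n3 = m,
    where all three indices are restricted to the keys stored in u."""
    total = 0.0
    for n1, n2 in product(u, repeat=2):
        n3 = m - n1 - n2
        if n3 in u:
            total += u[n1] * u[n2] * u[n3]
    return total

# Hypothetical coefficients: odd indices only, with the symmetry u_{-m} = -u_m.
u = {1: 0.7, 3: -0.2, 5: 0.05}
u.update({-m: -val for m, val in list(u.items())})

# Evaluate g_m for all |m| <= 6.
g = {m: cubic_convolution(u, m) for m in range(-6, 7)}

# Even-indexed g_m vanish (a sum of three odd integers is odd),
# and the odd symmetry g_{-m} = -g_m is inherited from u.
for m in sorted(g):
    print(m, g[m])
```

Since three odd indices always sum to an odd number, no term contributes to an even $m$, which is the mechanism behind the invariance of $U_O$.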


3. Application of Floquet's Theory

In order to analyze the linear part of the spatial dynamics system (2.3), which decouples into infinitely many linear second order ODEs with periodic coefficients, namely

\[ \partial_x^2 u_m(x) = -s(x)\, m^2\omega_*^2\, u_m(x) + q(x)u_m(x), \tag{3.1} \]

we use tools from Floquet theory. We briefly recapitulate these for the system

\[ \partial_x u_m(x) = v_m(x), \qquad \partial_x v_m(x) = -s(x)\,\omega^2 u_m(x) + q(x)u_m(x), \tag{3.2} \]

which is simply (3.1) written as a first order system and with the replacement $\omega^2 = m^2\omega_*^2$. In order to adjust the spectral picture we will consider $\omega \in \mathbb{R}$ and, thus, first ignore the sampling via $m \in \mathbb{Z}$. The fundamental matrix of (3.2) is denoted by $\tilde\Phi = \tilde\Phi(x)$, where $\tilde\Phi(0) = I$. Floquet's theorem [7] shows that

\[ \tilde\Phi(x) = \tilde P(x)\, e^{x\tilde M} \tag{3.3} \]

with $\tilde P(x) = \tilde P(x+1)$ and an $x$-independent matrix $\tilde M$. Note that $\tilde M$ is not unique, since $e^{2\pi i n} = 1$ for $n \in \mathbb{Z}$. The eigenvalues of $\tilde M$ are called Floquet exponents. The eigenvalues of the so-called monodromy matrix $C = e^{\tilde M}$ are called Floquet multipliers. For our special system we have two Floquet multipliers $\rho_-$ and $\rho_+$ satisfying $\rho_+\rho_- = 1$. They can be computed via the characteristic polynomial, cf. [7], $\rho^2 - D(\omega^2)\rho + 1 = 0$, and are given by

\[ \rho_\pm(\omega^2) = \frac{1}{2}D(\omega^2) \pm \frac{1}{2}\sqrt{(D(\omega^2))^2 - 4}, \]

where the trace of the monodromy matrix, $D(\omega^2) = \mathrm{trace}\, C_{\omega^2}$, is called the discriminant. We find that

S1a) if $|D(\omega^2)| > 2$ then the Floquet multipliers $\rho_\pm(\omega^2)$ are real. As a consequence the solutions have exponential growth or decay w.r.t. $x$.

S1b) if $|D(\omega^2)| < 2$ then the Floquet multipliers $\rho_\pm(\omega^2)$ are on the complex unit circle away from $\pm 1$. As a consequence the solutions are uniformly bounded w.r.t. $x$.

S2) if $|D(\omega^2)| = 2$ then the Floquet multipliers $\rho_\pm(\omega^2)$ are $1$ or $-1$. In this case we have at most polynomial growth.

In case S1) the Floquet multipliers are simple. In case S2) we have algebraic multiplicity 2, but geometric multiplicity 1, i.e., in $\tilde M$ a nontrivial Jordan block occurs. In order to deduce from the discriminant the spectral relation $(l,\omega)$ belonging to (1.6) one uses

\[ e^{\pm il} = \frac{1}{2}D(\omega^2) \pm \frac{1}{2}\sqrt{(D(\omega^2))^2 - 4} \tag{3.4} \]

for values where $-2 \le D(\omega^2) \le 2$, i.e., where the Floquet exponents are given by $\pm il$. Why the spectral relation $(l,\omega)$ really consists of infinitely many curves $l \mapsto \omega_n(l)$ becomes evident from inspecting the properties of $D(\omega^2)$ and Eq. (3.4), in particular by taking into account that solving for $\omega$ involves an inversion of $D(\omega^2)$.
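The case distinction S1a), S1b), S2) depends only on the value of the discriminant, so it can be sketched in a few lines. The following is a minimal illustration, not part of the paper; the function names and the tolerance used to detect $|D| = 2$ are our own assumptions.

```python
import cmath

def floquet_multipliers(D):
    """Roots of rho^2 - D*rho + 1 = 0 for a real discriminant D."""
    root = cmath.sqrt(D * D - 4.0)
    return 0.5 * D + 0.5 * root, 0.5 * D - 0.5 * root

def classify(D, tol=1e-12):
    """Return the case label used in the text for a real discriminant D.
    The tolerance tol is an assumption for floating-point comparisons."""
    if abs(abs(D) - 2.0) <= tol:
        return "S2"    # multipliers at +1 or -1, at most polynomial growth
    if abs(D) > 2.0:
        return "S1a"   # real multipliers, exponential growth or decay
    return "S1b"       # multipliers on the unit circle, bounded solutions

for D in (1.0, 2.0, -25.0 / 8.0):
    rho_plus, rho_minus = floquet_multipliers(D)
    print(D, classify(D), rho_plus, rho_minus, rho_plus * rho_minus)
```

In each case the product of the two multipliers is 1 up to rounding, in agreement with $\rho_+\rho_- = 1$.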


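4. An Example for a Suitable Choice of s = s(x)

In this section we show that the choice

\[ s(x) = \chi_{[0,6/13]} + 16\,\chi_{(6/13,7/13)} + \chi_{[7/13,1]} \quad (x \bmod 1) \qquad\text{and}\qquad q(x) = \mu s(x) \tag{4.1} \]

allows us to satisfy the spectral assumption (Spec) in case of $\omega_* = 13\pi/16$ and $\mu = q_0$ with $q_0 \in \mathbb{R}$ a fixed number defined explicitly in (4.5).

Lemma 4.1. For $s$ and $q$ defined in Eq. (4.1) we find the discriminant

\[ D_\mu(\omega^2) = \frac{25}{8}\cos\Big(\frac{16}{13}\sqrt{\omega^2-\mu}\Big) - \frac{9}{8}\cos\Big(\frac{8}{13}\sqrt{\omega^2-\mu}\Big). \tag{4.2} \]

Proof. We have to solve the ODE $u''(x) + (\omega^2-\mu)s(x)u(x) = 0$, i.e.

\[ u''(x) + (\omega^2-\mu)u(x) = 0, \qquad \text{for } x \in [0,6/13] \cup [7/13,1], \tag{4.3} \]

and

\[ u''(x) + 16(\omega^2-\mu)u(x) = 0, \qquad \text{for } x \in (6/13,7/13), \tag{4.4} \]

for continuous $u$ and $u'$. We set $\lambda = \omega^2 - \mu$. The fundamental matrix $\tilde\Phi = \tilde\Phi(x)$ of (4.3) with $\tilde\Phi(0) = I$ is given by

\[ \tilde\Phi(x) = \begin{pmatrix} \cos(\sqrt\lambda x) & \frac{1}{\sqrt\lambda}\sin(\sqrt\lambda x) \\ -\sqrt\lambda\sin(\sqrt\lambda x) & \cos(\sqrt\lambda x) \end{pmatrix} \]

and that of (4.4) by

\[ \tilde{\tilde\Phi}(x) = \begin{pmatrix} \cos(4\sqrt\lambda x) & \frac{1}{4\sqrt\lambda}\sin(4\sqrt\lambda x) \\ -4\sqrt\lambda\sin(4\sqrt\lambda x) & \cos(4\sqrt\lambda x) \end{pmatrix}. \]

Hence, we find

\[ \Phi(1) = \tilde\Phi\Big(\frac{6}{13}\Big)\,\tilde{\tilde\Phi}\Big(\frac{1}{13}\Big)\,\tilde\Phi\Big(\frac{6}{13}\Big) = \begin{pmatrix} \cos(\sqrt\lambda\frac{6}{13}) & \frac{1}{\sqrt\lambda}\sin(\sqrt\lambda\frac{6}{13}) \\ -\sqrt\lambda\sin(\sqrt\lambda\frac{6}{13}) & \cos(\sqrt\lambda\frac{6}{13}) \end{pmatrix} \begin{pmatrix} \cos(4\sqrt\lambda\frac{1}{13}) & \frac{1}{4\sqrt\lambda}\sin(4\sqrt\lambda\frac{1}{13}) \\ -4\sqrt\lambda\sin(4\sqrt\lambda\frac{1}{13}) & \cos(4\sqrt\lambda\frac{1}{13}) \end{pmatrix} \begin{pmatrix} \cos(\sqrt\lambda\frac{6}{13}) & \frac{1}{\sqrt\lambda}\sin(\sqrt\lambda\frac{6}{13}) \\ -\sqrt\lambda\sin(\sqrt\lambda\frac{6}{13}) & \cos(\sqrt\lambda\frac{6}{13}) \end{pmatrix}. \]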

Fig. 6. The figure shows the discriminant and the associated dispersion relation for μ = 0. S1a) If |D0(ω²)| > 2 then the Floquet exponents are real. As a consequence the solutions have exponential growth or decay w.r.t. x and so a spectral gap occurs. S1b) If |D0(ω²)| < 2 then the Floquet exponents are purely imaginary. As a consequence the solutions are uniformly bounded w.r.t. x and curves of eigenvalues occur in the spectral picture of the time evolutionary system. S2) If |D0(ω²)| = 2 then the Floquet exponents are 0 or iπ. In this case we have a horizontal tangency in the spectral picture of the time evolutionary system. The right panel shows the associated functions l → ωn(l). For the understanding of the relation between the left and right panel compare (1.6) with (3.3)

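For the discriminant, which is the trace of this matrix, we find

\[ D(\lambda) = 2\cos^2\Big(\sqrt\lambda\,\frac{6}{13}\Big)\cos\Big(\sqrt\lambda\,\frac{4}{13}\Big) - 2\sin^2\Big(\sqrt\lambda\,\frac{6}{13}\Big)\cos\Big(\sqrt\lambda\,\frac{4}{13}\Big) - \frac{17}{2}\sin\Big(\sqrt\lambda\,\frac{6}{13}\Big)\cos\Big(\sqrt\lambda\,\frac{6}{13}\Big)\sin\Big(\sqrt\lambda\,\frac{4}{13}\Big). \]

Using the representation of $\sin$ and $\cos$ as exponentials allows us to simplify this expression easily into (4.2). Note that in the simplification there is a cancellation, since no terms with argument $\frac{4}{13}\sqrt{\omega^2-\mu}$ occur.

Both $q$ and $s$ are even, as necessary for the reversibility of (1.5). The graph $\omega \mapsto D_0(\omega^2)$ of the discriminant for $\mu = 0$ and the associated dispersion relation can be found in Fig. 6. For $\omega_* = 13\pi/16$ and $\omega^2 = m^2\omega_*^2$ with $m \in \mathbb{N}$ we find for $q = \mu = 0$ that

\[ D_0(m^2\omega_*^2) = \begin{cases} \frac{34}{8}, & m \in 2 + 4\mathbb{Z}, \\ 2, & m \in 4\mathbb{Z}, \\ -\frac{25}{8}, & m \in 1 + 2\mathbb{Z}, \end{cases} \]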

Fig. 7. The Floquet multipliers for the above choice of s, q, and ω. Left: for m ∈ Z. Right: for m ∈ Nodd

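with associated Floquet multipliers

\[ \rho_\pm(m^2\omega_*^2) = \begin{cases} \frac{34}{16} \pm \frac{1}{2}\sqrt{\big(\frac{34}{8}\big)^2 - 4} \notin \{-1,1\}, & m \in 2 + 4\mathbb{Z}, \\ 1, & m \in 4\mathbb{Z}, \\ -\frac{25}{16} \pm \frac{1}{2}\sqrt{\big(\frac{25}{8}\big)^2 - 4} \notin \{-1,1\}, & m \in 1 + 2\mathbb{Z}. \end{cases} \]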
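The values above can be cross-checked numerically by multiplying the three fundamental matrices from the proof of Lemma 4.1 and comparing the trace with the closed form (4.2). The following is a minimal sketch for $\mu = 0$ and $\lambda > 0$; the helper names (`phi`, `matmul`, `discriminant_product`, `discriminant_closed`, `real_multipliers`) are ours and only serve the illustration.

```python
import math

def phi(k, x):
    """Fundamental matrix of u'' + k^2 u = 0 in the variables (u, u'), phi(k, 0) = I."""
    return [[math.cos(k * x), math.sin(k * x) / k],
            [-k * math.sin(k * x), math.cos(k * x)]]

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def discriminant_product(lam):
    """Trace of Phi(1) = phi(6/13) * phi_inner(1/13) * phi(6/13) for lam > 0."""
    k = math.sqrt(lam)
    outer = phi(k, 6.0 / 13.0)         # segments where s = 1
    inner = phi(4.0 * k, 1.0 / 13.0)   # segment where s = 16
    M = matmul(outer, matmul(inner, outer))
    return M[0][0] + M[1][1]

def discriminant_closed(lam):
    """Closed form (4.2) with lam = omega^2 - mu > 0."""
    k = math.sqrt(lam)
    return 25.0 / 8.0 * math.cos(16.0 * k / 13.0) - 9.0 / 8.0 * math.cos(8.0 * k / 13.0)

def real_multipliers(D):
    """Floquet multipliers for |D| > 2 (real case S1a)."""
    root = math.sqrt(D * D - 4.0)
    return 0.5 * D + 0.5 * root, 0.5 * D - 0.5 * root

omega_star = 13.0 * math.pi / 16.0
values = {m: discriminant_product((m * omega_star) ** 2) for m in range(1, 9)}
for m, D in values.items():
    print(m, round(D, 10))
print(real_multipliers(values[2]), real_multipliers(values[1]))
```

For $m \in 2 + 4\mathbb{Z}$ the multipliers evaluate to $4$ and $1/4$, and for odd $m$ to $(-25 \pm \sqrt{369})/16$, so both pairs lie off the unit circle.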

The associated Floquet diagram is plotted on the left-hand side of Fig. 7. For $\mu \neq 0$ these five points remain as accumulation points of the Floquet multipliers. The fact that there are infinitely many Floquet multipliers on the unit circle prevents, at first sight, the application of center manifold theory. However, since we only need to look at $m \in \mathbb{N}_{odd}$, the Floquet multipliers on the unit circle need not be taken into account. As a consequence we have only negative Floquet multipliers off the unit circle, as can be seen on the right-hand side of Fig. 7. Since the asymptotics of the Floquet multipliers is not affected by $q$, cf. [7, Theorem 4.5.3 ii)], we can use numerics to check for which parameter value $\mu$ of a given family of functions $q = q(x,\mu)$ two of the multipliers collide in $-1$. For our choice, $q = q(x,\mu) = \mu s(x)$, there is a critical value $\mu = q_0 \approx 3.7703$ for which we have exactly two Floquet multipliers at $-1$. This value can be computed explicitly by solving $D_\mu\big((\tfrac{13\pi}{16})^2\big) = -2$. This condition is equivalent to the solution of the following set of equations:

\[ 25\cos(2p) - 9\cos(p) = -16, \qquad p = 8\sqrt{\omega_*^2 - q_0}/13, \qquad \omega_* = \frac{13\pi}{16}, \]

w.r.t. $q_0$. Using $\cos(2p) = 2\cos^2(p) - 1$, i.e., $50\cos^2(p) - 9\cos(p) - 9 = 0$, we find $p = \arccos((9 + \sqrt{1881})/100)$ and finally

\[ q_0 = \omega_*^2 - \Big(\frac{13p}{8}\Big)^2 = \Big(\frac{13\pi}{16}\Big)^2 - \Big(\frac{13}{8}\arccos\big((9+\sqrt{1881})/100\big)\Big)^2 \approx 3.7703. \tag{4.5} \]
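Formula (4.5) is easy to evaluate numerically. The following sketch computes $q_0$ and checks that the closed-form discriminant (4.2) indeed equals $-2$ at $\omega = \omega_*$, $\mu = q_0$; the function name `discriminant` is our own choice for this illustration.

```python
import math

omega_star = 13.0 * math.pi / 16.0

# Root of 50 c^2 - 9 c - 9 = 0 with the plus sign, as chosen in the text.
c = (9.0 + math.sqrt(1881.0)) / 100.0
p = math.acos(c)

# Critical value (4.5).
q0 = omega_star ** 2 - (13.0 * p / 8.0) ** 2

def discriminant(omega_sq, mu):
    """Closed form (4.2), valid here for omega_sq - mu > 0."""
    root = math.sqrt(omega_sq - mu)
    return (25.0 / 8.0 * math.cos(16.0 * root / 13.0)
            - 9.0 / 8.0 * math.cos(8.0 * root / 13.0))

# Residual of the collision condition D_{q0}(omega_*^2) = -2.
residual = discriminant(omega_star ** 2, q0) + 2.0
print(q0, residual)
```

The computed value agrees with the rounded value $q_0 \approx 3.7703$ quoted in the text, and the residual vanishes up to rounding errors.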


Since

\[ \sqrt{m^2\omega_*^2 - \mu} = m\omega_*\Big(1 - \frac{\mu}{2(m\omega_*)^2} + \ldots\Big), \]

the role of $\mu$ in $D_\mu(m^2\omega_*^2)$ becomes smaller and smaller for $m \to \infty$. Hence the spectral gaps are located asymptotically around $m\omega_*$ and the spectral gaps are $O(1)$ for $m \to \infty$. Since the discriminant becomes asymptotically a periodic function, the associated eigenvalues in the spatial dynamics formulation are uniformly bounded away from the unit circle and lie on the real axis. See Fig. 9.

5. The Reversibility

In the spatial dynamics formulation (1.5) the breather solution is a homoclinic solution. We find this bifurcating homoclinic solution approximately in the two-dimensional center manifold associated with the spectral picture drawn in Fig. 2. The persistence of the approximate homoclinic solution w.r.t. higher order perturbations relies heavily on the reversibility of the spatial dynamics formulation.

5.1. Preparations. We introduce a number of abbreviations, namely $U_m = (u_m, v_m)$,

\[ (\Lambda_m U_m)(x) = \begin{pmatrix} v_m(x) \\ -s(x)\, m^2\omega_*^2\, u_m(x) + q_0 s(x) u_m(x) \end{pmatrix}, \qquad N_m(U) = \begin{pmatrix} 0 \\ \varepsilon^2 s(x) u_m(x) + r(x) g_m(x) \end{pmatrix}, \]

and rewrite (2.3) as

\[ \partial_x U = F(x,U) = \Lambda U + N(U), \tag{5.1} \]

where $U = (U_m)_{m\in\mathbb{N}_{odd}}$, $\Lambda = (\Lambda_m)_{m\in\mathbb{N}_{odd}}$, and $N = (N_m)_{m\in\mathbb{N}_{odd}}$.

Definition 5.1. The non-autonomous system (5.1) is called reversible if there is an operator $\mathcal{R}$ such that $\mathcal{R}F(x,U) = -F(-x,\mathcal{R}U)$. $\mathcal{R}$ is called the reversibility operator.

For system (5.1) we define a reversibility operator $\mathcal{R}$ by $\mathcal{R} = \oplus_{m\in\mathbb{N}_{odd}} R$ with $R(u_m,v_m) = (u_m,-v_m)$. System (5.1) is reversible, i.e., invariant under $(x,u_m,v_m) \mapsto (-x,u_m,-v_m)$, due to the assumption that $s$, $q$ and $r$ are even functions.

Lemma 5.2. With $x \mapsto U(x)$ solving (5.1), also $x \mapsto V(x) = \mathcal{R}U(-x)$ is a solution.

Proof. We have $\dot V(x) = -\mathcal{R}\dot U(-x) = -\mathcal{R}F(-x,U(-x)) = F(x,\mathcal{R}U(-x)) = F(x,V(x))$.

Lemma 5.2 implies that with $x \mapsto U(x) = (u_m,v_m)_{m\in\mathbb{N}_{odd}}(x)$ being a solution of (5.1), also $x \mapsto \mathcal{R}U(-x) = (u_m,-v_m)_{m\in\mathbb{N}_{odd}}(-x)$ is a solution of (5.1). In the following arguments the fixed space of reversibility plays a major role. It is given by $\mathcal{R}_{fix} = \{U = \mathcal{R}U\} = \{(u_m,0)_{m\in\mathbb{N}_{odd}}\}$.
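The reversibility condition of Definition 5.1 can be checked pointwise for a toy version of (5.1). The sketch below is an illustration under strong simplifying assumptions that are ours and not the paper's: only a single mode $m = 1$ is kept, the convolution $g_m$ is replaced by the stand-in $u^3$, $r = 1$, the rounded value $q_0 \approx 3.7703$ is used, and $\varepsilon = 0.1$ is an arbitrary hypothetical choice.

```python
import math

OMEGA_STAR = 13.0 * math.pi / 16.0
Q0 = 3.7703   # rounded value quoted in the text
EPS = 0.1     # hypothetical small parameter

def s(x):
    """Step function s of (4.1), 1-periodic and even in x."""
    y = x % 1.0
    return 16.0 if 6.0 / 13.0 < y < 7.0 / 13.0 else 1.0

def F(x, U, m=1):
    """Single-mode stand-in for the right-hand side of (5.1) with r = 1."""
    u, v = U
    linear = -s(x) * (m * OMEGA_STAR) ** 2 * u + Q0 * s(x) * u
    nonlinear = EPS ** 2 * s(x) * u + u ** 3   # u**3 replaces g_m (assumption)
    return (v, linear + nonlinear)

def R(U):
    """Reversibility operator R(u, v) = (u, -v)."""
    u, v = U
    return (u, -v)

def reversibility_defect(x, U):
    """Largest component of R F(x, U) + F(-x, R U); zero for a reversible system."""
    lhs = R(F(x, U))
    rhs = tuple(-comp for comp in F(-x, R(U)))
    return max(abs(a - b) for a, b in zip(lhs, rhs))

# Sample points chosen away from the jumps of s.
samples = [(0.3, (0.2, -0.5)), (0.5, (1.0, 0.7)), (1.7, (-0.4, 0.1))]
defects = [reversibility_defect(x, U) for x, U in samples]
print(defects)
```

The defect vanishes because $s$ is even; replacing `s` by a shifted, non-even step function would in general produce a nonzero defect.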

Fig. 8. The figure shows the discriminant and the associated dispersion relation for μ = q0 = 3.7703. The large value of Dμ(ω²) at ω = 0 comes from the fact that cos(iu) = cosh(u)

Fig. 9. The Floquet multipliers for different choices of μ. Upper left: μ = 0. Upper right: μ < q0. Lower left: μ = q0. Lower right: μ > q0

5.2. The reversible change of variables. Due to the above theorem of Floquet (see Sect. 3) the solutions of $\partial_x U_m = \Lambda_m U_m$ are given by $U_m(x) = \tilde P_m(x) e^{x\tilde M_m} U_m(0)$ with $\tilde P_m(x) = \tilde P_m(x+1)$ and $\tilde M_m \in \mathbb{C}^{2\times 2}$. Since all Floquet multipliers in Fig. 9 have negative real part and vanishing imaginary part, the associated Floquet exponents, i.e., the eigenvalues of the $\tilde M_m$, are of the form $\alpha \pm i\pi$ with $\alpha \in \mathbb{R}$. In order to have real Floquet exponents we apply Floquet's theorem for 2-periodic functions, i.e., the solutions of $\partial_x U_m = \Lambda_m U_m$ are given by $U_m(x) = P_m(x) e^{xM_m} U_m(0)$ with $P_m(x) = P_m(x+2)$, $P_m(0) = I$ and $M_m \in \mathbb{R}^{2\times 2}$. In order to make the linear part of the system autonomous we could make a change of variables $U_m(x) = P_m(x)V_m(x)$. However, this choice would destroy the reversibility. Moreover, the linear part would not be in Jordan normal form. Hence we proceed as follows. We write

\[ U_m(x) = P_m(x) e^{xM_m} U_m(0) = P_m(x) S_m^{-1} e^{xJ_m} S_m U_m(0) = Q_m(x) e^{xJ_m} V_m(0) \]

such that $V_m(x)$ defined by

\[ U_m(x) = Q_m(x) V_m(x) \tag{5.2} \]

satisfies $\partial_x V_m = J_m V_m$, where $J_m$ is the Jordan normal form of $M_m$ and $S_m$ the associated transformation.

Case S1). Assume first that the real-valued Floquet exponents $\lambda_j$ for fixed $m$ satisfy $\lambda_1 \neq \lambda_2$. The solutions of $\partial_x U_m = \Lambda_m U_m$ can be written as

\[ U(x) = c_1\psi_1(x) + c_2\psi_2(x) = c_1 e^{\lambda_1 x}\phi_1(x) + c_2 e^{\lambda_2 x}\phi_2(x) \]

with constants $c_j$ and 2-periodic $\phi_j$, here and in the following. Since the systems are reversible, with $x \mapsto e^{\lambda_1 x}\phi_1(x)$ also $x \mapsto e^{-\lambda_1 x}R\phi_1(-x)$ is a solution. Hence we define the second fundamental solution $e^{\lambda_2 x}\phi_2(x) = e^{-\lambda_1 x}R\phi_1(-x)$, which implies $\lambda_2 = -\lambda_1$ and $\phi_2(x) = R\phi_1(-x)$. We introduce the new variable $V(x) = (v_1,v_2)(x)$ by

\[ U(x) = v_1(x)\phi_1(x) + v_2(x)\phi_2(x) = (\phi_1(x),\phi_2(x))\begin{pmatrix} v_1(x) \\ v_2(x) \end{pmatrix}, \]

where by construction $\partial_x V(x) = BV(x)$ with $B = \mathrm{diag}(\lambda_1,\lambda_2)$. Hence, the above change of variables (5.2) and the last change of variables coincide, i.e. $B = J_m$, and the linear system is now reversible w.r.t. the transformed reversibility operator $\tilde R_m$ defined through

\[ \tilde R_m\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} v_2 \\ v_1 \end{pmatrix}. \]

Case S2). Next assume that we have a Jordan block for the Floquet exponent $\lambda = 0$. Then

\[ U(x) = c_1\psi_1(x) + c_2\psi_2(x) = c_1\phi_1(x) + c_2\big(x\phi_1(x) + \phi_2(x)\big). \]

Due to the reversibility, $\phi_1(x) = R\phi_1(-x)$, and $\phi_2$ can be chosen such that $\phi_2(x) = -R\phi_2(-x)$. We introduce the new variable $V(x) = (v_1,v_2)(x)$ by

\[ U(x) = v_1(x)\phi_1(x) + v_2(x)\phi_2(x) = (\phi_1(x),\phi_2(x))\begin{pmatrix} v_1(x) \\ v_2(x) \end{pmatrix}, \]

where by construction $\partial_x V(x) = BV(x)$, with $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. In this case the representation of the reversibility operator is preserved, i.e.

\[ \tilde R_m\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} v_1 \\ -v_2 \end{pmatrix}. \]

With $U_m(x) = Q_m(x)V_m(x)$ we find

\[ \partial_x V_m(x) = B_m V_m(x) + \tilde N_m(x,V(x)), \tag{5.3} \]


where

\[ \tilde N(x,V(x)) = Q^{-1}(x)N(x,Q(x)V(x)) = \Big((Q_m(x))^{-1}N_m\big(x,(Q_j(x)V_j(x))_{j\in\mathbb{N}_{odd}}\big)\Big)_{m\in\mathbb{N}_{odd}} \]

and $Q(x) = \oplus_{m\in\mathbb{N}_{odd}} Q_m(x)$. We have by construction that $B_1$ has a Jordan block of size 2 with associated eigenvalue 0. All other $B_m$ with $m \ge 3$ possess one positive and one negative eigenvalue, which are uniformly bounded away from the imaginary axis w.r.t. $m$, i.e., (5.3) has the spectral picture plotted in the right panel of Fig. 2. The change of variables is bounded in the following sense.

Lemma 5.3. Let $Q_m(x) = \begin{pmatrix} q_{11,m}(x) & q_{12,m}(x) \\ q_{21,m}(x) & q_{22,m}(x) \end{pmatrix}$. Then there exists a $C > 0$ such that for all $m \in \mathbb{N}_{odd}$ we have $\sup_{x\in[0,2]}(|q_{11,m}(x)| + |q_{12,m}(x)|) < C$ and $\sup_{x\in[0,2]}|(Q_m(x))^{-1}| < C$.

Proof. The proof follows from explicit calculations using the representation of the fundamental matrix $\Phi(x)$ given in Lemma 4.1 for $\omega^2 = m^2\omega_*^2$ and $m \in 1 + 2\mathbb{Z}$. We refrain from writing down all the formulas, especially from writing down $\Phi(1)$ for these values of $m$. We restrict ourselves to some explanations why the result is true. From the form of $\Phi(1)$ it follows that we can choose the eigenvectors to be of the form

\[ \psi_1(1) = \begin{pmatrix} a \\ b \end{pmatrix} \qquad\text{and}\qquad \psi_2(1) = \begin{pmatrix} a \\ -b \end{pmatrix}, \]

with $a$ and $b$ non-vanishing numbers satisfying $b/a = O(m)$. Since $\psi_j(1) = \rho_j\psi_j(0)$, where $\rho_j$ are the associated Floquet multipliers, we can compute $\psi_1(x) = \Phi(x)\psi_1(0)$. We find that $\psi_2(x) = \Phi(x)\psi_2(0)$ satisfies $\psi_2(x) = R\psi_1(-x)$. From $\psi_j(x)$ we can compute explicitly $\phi_j(x) = \psi_j(x)\rho_j^{-x} = \phi_j(x+1)$. Note that $\phi_j(x)$ and $\psi_j(x)$ have the same asymptotics w.r.t. $m$, since all Floquet multipliers are uniformly bounded independently of $m$. The matrix $Q_m(x)$, which possesses $\phi_1(x)$ and $\phi_2(x)$ as columns, is therefore again of the form $\begin{pmatrix} O(1) & O(1) \\ O(m) & O(m) \end{pmatrix}$, where for the same reason its $m$-dependent determinants are bounded from below by $O(m)$ constants. Hence, we have that $Q_m^{-1}$ is of the form $\begin{pmatrix} O(1) & O(1/m) \\ O(1) & O(1/m) \end{pmatrix}$.

5.3. Conjugation of the old and the new reversibility operator. The old reversibility operator $\mathcal{R}$ and the new reversibility operator $\tilde{\mathcal{R}} = \oplus_{m\in\mathbb{N}_{odd}}\tilde R_m$ are conjugated via the transform $U(x) = Q(x)V(x)$. We find

\[ Q(-x)\tilde{\mathcal{R}} = \mathcal{R}Q(x), \]

which implies $Q^{-1}(-x)\mathcal{R} = \tilde{\mathcal{R}}Q^{-1}(x)$. By the analysis of the last subsection we already know that the transformed linear operator is reversible w.r.t. the new reversibility operator $\tilde{\mathcal{R}}$.

Lemma 5.4. System (5.3) is reversible w.r.t. the new reversibility operator $\tilde{\mathcal{R}}$, i.e., in particular we have

\[ \tilde{\mathcal{R}}\tilde N(x,V) = -\tilde N(-x,\tilde{\mathcal{R}}V). \]


Proof. This holds according to

\[ \tilde{\mathcal{R}}\tilde N(x,V) = \tilde{\mathcal{R}}Q^{-1}(x)N(x,Q(x)V) = Q^{-1}(-x)\mathcal{R}N(x,Q(x)V) = -Q^{-1}(-x)N(-x,\mathcal{R}Q(x)V) = -Q^{-1}(-x)N(-x,Q(-x)\tilde{\mathcal{R}}V) = -\tilde N(-x,\tilde{\mathcal{R}}V). \]

6. The Center Manifold Reduction

It is the purpose of this section to construct the center manifold for (5.3) associated to the two Floquet multipliers on the unit circle. System (5.3) is extended with the equation $\partial_x\varepsilon = 0$, which allows us to handle all terms with an $\varepsilon$ as nonlinear. Before we do so we make a number of remarks.

Remark 6.1. The combination of spatial dynamics and center manifold theory goes back to [21]. For a number of different formulations in the continuous case see [29]. There are a number of formulations for discrete dynamical systems, too. The estimates below show that, for instance, the abstract center manifold theorem [18, Theorem 6.2] applies to the time-one-map in our case. We do not use this discrete version since for the discussion of the reduced system in the following we would like to have an ordinary differential equation rather than a discrete system.

Remark 6.2. Invariant manifolds for periodic systems are invariant in the following sense. We denote by $V = V(x,x_0,V_0)$ a solution $V$ at $x$ with initial condition $V_0$ given at $x_0$. We introduce a nonlinear evolution operator $S_{x,x_0}$ defined by $S_{x,x_0}V_0 = V(x,x_0,V_0)$. Due to the 1-periodicity we have $V(x,x_0,V_0) = V(x+1,x_0+1,V_0)$. Hence the time-1-maps $\Pi_{x_0}$, which are defined through $\Pi_{x_0}(V_0) = V(x_0+1,x_0,V_0)$, play a crucial role. Time-1-maps for different $x_0$ are conjugated to each other, i.e.

\[ \Pi_x \circ S_{x,0} = S_{x,0} \circ \Pi_0, \]

which is a direct consequence of

\[ V(x+1,x,V(x,0,V_0)) = V(x+1,1,V(1,0,V_0)) = V(x,0,V(1,0,V_0)). \]

The center manifold of the origin

\[ W^c(x_0) = \{V_0 \mid \|(\Pi_{x_0})^n(V_0)\| \le Ce^{\eta|n|} \text{ for } |n| \to \infty\} = \{V_0 \mid \|V(x,x_0,V_0)\| \le Ce^{\eta|x-x_0|} \text{ for } |x| \to \infty\} \]

for a $C > 0$ and a fixed small $\eta > 0$ satisfies $W^c(x_0) = W^c(x_0+1)$ by construction. We have the invariance $\Pi_{x_0}W^c(x_0) \subset W^c(x_0)$ and the transport by the flow

\[ W^c(x) = S_{x,x_0}W^c(x_0). \tag{6.1} \]

Similar statements are true for the unstable and stable manifold. Since our systems are only reversible, cf. Definition 5.1, for $x_0 = 0$, we set $x_0 = 0$ in the following.


In the following we fix $\sigma \ge 0$. For $m \ge 3$ we define projections $P_{m,s}$ and $P_{m,u}$ on the stable and unstable eigenspaces, which are uniformly bounded in $\mathbb{R}^{2\times 2}$ w.r.t. $m$ due to the diagonal form of the $B_m$. Introducing $V_{m,s}(x) = P_{m,s}V_m(x)$ and $V_{m,u}(x) = P_{m,u}V_m(x)$ for $m \ge 3$ allows us to construct a center manifold as usual by applying a fixed point argument to the map $F = (F_1,(F_{m,s},F_{m,u})_{m\in\mathbb{N}_{odd}}) : Y_\eta \to Y_\eta$ for a small but fixed $\eta > 0$, where

\[ F_1(x) = e^{B_1x}V_1(0) + \int_0^x e^{B_1(x-\xi)}\check N_1(\xi,V(\xi))\,d\xi, \]
\[ F_{m,s}(x) = \int_{-\infty}^x e^{B_m(x-\xi)}P_{m,s}\check N_m(\xi,V(\xi))\,d\xi, \]
\[ F_{m,u}(x) = -\int_x^\infty e^{B_m(x-\xi)}P_{m,u}\check N_m(\xi,V(\xi))\,d\xi, \]

and

\[ Y_\eta = \{V \in C^0(\mathbb{R},\ell^1(\sigma)) \mid \sup_{x\in\mathbb{R}} e^{\eta|x|}\|V(x)\|_{\ell^1(\sigma)} < \infty\} \]

with $V = (V_1,V_3,V_5,\ldots)$ and $\|V\|_{\ell^1(\sigma)} = \sum_{m\in\mathbb{N}_{odd}}|V_m|\,m^\sigma$. Moreover, we let

\[ \check N_m(x,V) = \tilde N_m(x,V)\,\chi(\|V\|_{\ell^1(\sigma)}/\delta) \]

for a small, but fixed $\delta > 0$, where $\chi$ is a $C_0^\infty$-function with values in $[0,1]$ satisfying $\chi(r) = 1$ for $r \le 1$ and $\chi(r) = 0$ for $r \ge 2$.

Lemma 6.3. $\check N(V) = (\check N_m(V))_{m\in\mathbb{N}_{odd}}$ is Lipschitz continuous in $\ell^1(\sigma)$ with Lipschitz constant proportional to $\delta^2$ for $\delta \to 0$. Moreover, $\check N(V)$ is reversible w.r.t. the reversibility operator $\tilde{\mathcal{R}}$, i.e., we have

\[ \tilde{\mathcal{R}}\check N(x,V) = -\check N(-x,\tilde{\mathcal{R}}V). \tag{6.2} \]

Proof. Since in $N_m$ only the first coordinate of $U$ occurs, after the transform $U_m = Q_mV_m$ only $q_{11,m}$ and $q_{12,m}$ occur in the transformed nonlinearity. Since both are uniformly bounded, since the same is true for $Q_m^{-1}$ according to Lemma 5.3, and since $\ell^1(\sigma)$ is closed under convolutions, the Lipschitz continuity follows. The magnitude of the Lipschitz constant follows from the cut-off function and the fact that $N$ does not contain quadratic terms. The reversibility of $\tilde N$, which is known from Lemma 5.4, is not destroyed by the cut-off function and so (6.2) follows.

Moreover, due to the asymptotics of the discriminant, which results in Fig. 2, we obtain the following.

Lemma 6.4. There exists a $\beta$ such that for all $\eta$ with $\beta > \eta > 0$ we have a $C > 0$ such that

\[ \|e^{B_1x}\|_{\mathbb{R}^2\to\mathbb{R}^2} \le Ce^{\eta|x|/2}, \qquad \forall x \in \mathbb{R}, \]
\[ \sup_m\|e^{B_mx}P_{m,s}\|_{\mathbb{R}^2\to\mathbb{R}^2} \le Ce^{-\beta x}, \qquad \forall x \ge 0, \]
\[ \sup_m\|e^{B_mx}P_{m,u}\|_{\mathbb{R}^2\to\mathbb{R}^2} \le Ce^{\beta x}, \qquad \forall x \le 0. \]

Proof. See Fig. 9 and note that the $B_m$ are diagonal for $m \ge 3$.


Hence from Lemmas 6.3 and 6.4 the existence of a fixed point $V^* = V^*(x,V_1^*,\varepsilon)$ of $F$ follows with the usual estimates, cf. [29]. Since the spectral gap, the cut-off and all estimates are $O(1)$ w.r.t. $\varepsilon$, the size of the center manifold will be $O(1)$ for $\varepsilon \to 0$. We define the graph of the center manifold by a mapping $h$ from the central subspace to the hyperbolic subspace by $h(0,V_1^*,\varepsilon) = P_hV^*(0,V_1^*,\varepsilon)$, where $P_h = (P_{m,u} + P_{m,s})_{m\in\mathbb{N}_{odd}}$ is the projection to the hyperbolic subspace. After this sketch of the major steps of the proof of the center manifold theorem, the result is summarized in

Theorem 6.5. For all $n \in \mathbb{N}$ there exist $\varepsilon_0 > 0$ and $\vartheta_0 > 0$ such that the spatial dynamics formulation (5.3) extended with $\partial_x\varepsilon = 0$ possesses a three-dimensional invariant manifold

\[ W_c = \{(V^*,\varepsilon) \in \ell^1(\sigma)\times\mathbb{R} \mid (0,V_3,V_5,\ldots) = h(0,V_1^*,\varepsilon)\} \]

tangential to the space $E_c = \{(V_1^*,0,0,\ldots)\times\mathbb{R} \mid V_1^* \in \mathbb{R}^2\}$ with $h(0,\cdot,\cdot) \in C^n(\{V_1 \in \mathbb{R}^2 \mid \|V_1\|_{\ell^1(\sigma)} \le \vartheta_0\}\times[0,\varepsilon_0],\ell^1(\sigma))$.

The center manifold $W_c = W_c(0)$ has been constructed for starting time $x_0 = 0$. At the beginning of this section we explained that the invariant manifolds are transported by the flow, i.e. the center manifold $W_c(x_0)$ for starting time $x_0$ and $W_c(0)$ are related via $W_c(x_0) = S_{x_0,0}W_c(0)$, where $S_{x_0,0}$ is the evolution operator of the spatial dynamics system (5.1) extended with $\partial_x\varepsilon = 0$. Hence we define the reduction function $h(x_0,\cdot,\cdot)$ for $W_c(x_0)$ by

\[ V^*(x)\oplus\varepsilon = S_{x,0}(V^*(0)\oplus\varepsilon) = S_{x,0}\big(V_1^*(0)\oplus h(0,V_1^*(0))\oplus\varepsilon\big) = V_1^*(x)\oplus h(x,V_1^*(x))\oplus\varepsilon. \]

Since the dynamics in $\varepsilon$ is trivial we suppress in the following the variable $\varepsilon$ in our notation. Since with $x \mapsto V(x) = V_1(x)\oplus h(x,V_1(x))$ being a solution on the center manifold also $x \mapsto \tilde{\mathcal{R}}V(-x) = \tilde RV_1(-x)\oplus\tilde{\mathcal{R}}_rh(-x,V_1(-x))$ is a solution on the center manifold, where $\tilde{\mathcal{R}} = \tilde R\oplus\tilde{\mathcal{R}}_r$, we can conclude that $\tilde{\mathcal{R}}_rh(-x,V_1) = h(x,\tilde RV_1)$ or equivalently $\tilde{\mathcal{R}}_rh(x,V_1) = h(-x,\tilde RV_1)$. From this we find

\[ \tilde RN_1(x,V_1\oplus h(x,V_1)) = -N_1(-x,\tilde RV_1\oplus\tilde{\mathcal{R}}_rh(x,V_1)) = -N_1(-x,\tilde RV_1\oplus h(-x,\tilde RV_1)) \]

such that the vector field of the reduced system

\[ \partial_xV_1(x) = B_1V_1(x) + \check N_1(x,V_1(x),h_3(x,V_1(x),\varepsilon),\ldots) \tag{6.3} \]

is reversible w.r.t. the transformed reversibility operator $\tilde R$, which coincides for $V_1$ with the original reversibility operator $R$. As a consequence of the center manifold reduction all small bounded solutions of (5.3) can be found on the center manifold, and it is sufficient to discuss the reduced system on the center manifold. Finally, we mention that the right-hand side of (6.3) is smooth w.r.t. $(V,\varepsilon)$, but only piecewise smooth w.r.t. $x$ with discontinuities at the jumps of the coefficient function $s = s(x)$.


7. Construction of a Homoclinic Solution

By the center manifold reduction of the last section the infinite dimensional spatial dynamics formulation (5.1) has been reduced to the 2-periodic two-dimensional ordinary differential equation (6.3), which is reversible w.r.t. the reversibility operator $R$. The reduced system (6.3) is analyzed with the help of bifurcation theory. The small bifurcation parameter has been introduced by $q(x) = (q_0 - \varepsilon^2)s(x)$. Hence, in (6.3) only powers of $\varepsilon^2$ occur. Since $V_1 = 0$ is a solution for all values of $\varepsilon$, terms depending on $\varepsilon$ alone cannot occur. Moreover, the reduced system must have the same symmetries as the original one, which reduces the number of possible terms. In particular, the translation invariance w.r.t. $t$ reduces the number of possible terms drastically. By our choice of coordinates $B_1$ is a Jordan block of size two. Setting $V_1 = (a,b)$, this allows us to rewrite (6.3) as

\[ \partial_xa = b + O(|\varepsilon^2a|,|\varepsilon^2b|,|a^3|,\ldots,|b^3|), \qquad \partial_xb = O(|\varepsilon^2a|,|\varepsilon^2b|,|a^3|,\ldots,|b^3|) \]

for small $(a,b)$. Introducing $\tilde a$ and $\tilde b$ by $a(x) = \varepsilon\tilde a(x)$ and $b(x) = \varepsilon^2\tilde b(x)$ yields

\[ \partial_x\tilde a = \varepsilon\tilde b + O(\varepsilon^2), \qquad \partial_x\tilde b = \varepsilon s_1(x)\tilde a + \varepsilon s_3(x)\tilde a^3 + O(\varepsilon^2), \tag{7.1} \]

with $s_1 = s_1(x)$ and $s_3 = s_3(x)$ being 2-periodic functions. In order to find the homoclinic solutions for (7.1), which is of the form

\[ \dot z = \varepsilon f(z,x,\varepsilon) \tag{7.2} \]

with $z = (\tilde a,\tilde b)$, we compare it with the averaged system

\[ \dot y = \varepsilon\bar f(y) = \frac{\varepsilon}{2}\int_0^2 f(y,x,0)\,dx. \tag{7.3} \]

In order to analyze the averaged system we rescale time, $X = \varepsilon x$, and introduce

\[ A(X) = \tilde a(x), \qquad B(X) = \tilde b(x). \tag{7.4} \]

In this scaling the averaged system is given by

\[ \partial_XA = B, \qquad \partial_XB = \bar s_1A + \bar s_3A^3, \tag{7.5} \]

where

\[ \bar s_1 = \int_0^2 s_1(x)\,dx/2 \qquad\text{and}\qquad \bar s_3 = \int_0^2 s_3(x)\,dx/2. \tag{7.6} \]

Lemma 7.1. We have $\bar s_1 > 0$ and $\bar s_3 < 0$.


Proof. Since $\varepsilon$ has been defined in such a way that $\omega_*$ falls into a spectral gap in the right panel of Fig. 8 for $\varepsilon > 0$ or equivalently that in Fig. 2 a real positive and a real negative eigenvalue of order $O(\varepsilon)$ occur, it follows for the $\varepsilon$-independent coefficient $\bar s_1$ that $\bar s_1 > 0$. The coefficient $\bar s_3$ is a positive multiple of $2\gamma_1(\pi)/\omega_1(\pi)$ as can be seen by comparing the formal derivation of (7.5) and the derivation of the associated NLS-equation, cf. Remark 1.5. According to (1.9) the coefficient $\gamma_1(\pi)$ is positive due to $r(x) = 1$ and according to the right panel of Fig. 8 the coefficient $\omega_1(\pi)$ is negative such that $\bar s_3$ has a negative sign.

Remark 7.2. Due to the fact that we have a cubic nonlinearity the coefficient function $s_3 = s_3(x)$ is independent of the reduction function $h$ and so it can be computed from $Q_1^{-1}(x) N_1(x, Q_1(x)V_1)$ alone. According to our scalings $V_1 = (\varepsilon A, \varepsilon^2 B)^T$ we have
$$Q_1(x)V_1 = \begin{pmatrix} q_{11}(x) & q_{12}(x) \\ q_{21}(x) & q_{22}(x) \end{pmatrix} \begin{pmatrix} \varepsilon A \\ \varepsilon^2 B \end{pmatrix} = \begin{pmatrix} \varepsilon q_{11} A \\ \varepsilon q_{21} A \end{pmatrix} + O(\varepsilon^2).$$
Since
$$N_1(x, V_1) = \begin{pmatrix} 0 \\ \varepsilon^2 s(x) V_1 + 3 r(x) V_1^3 \end{pmatrix} \qquad \text{and} \qquad Q_1^{-1}(x) = (\det Q_1(x))^{-1} \begin{pmatrix} q_{22}(x) & -q_{12}(x) \\ -q_{21}(x) & q_{11}(x) \end{pmatrix}$$
we find
$$Q_1^{-1}(x) N_1(x, Q_1(x)V_1) = Q_1^{-1}(x) \begin{pmatrix} 0 \\ \varepsilon^3 s(x) q_{11}(x) A + 3 \varepsilon^3 r(x) q_{11}(x)^3 A^3 \end{pmatrix} + O(\varepsilon^4) = \begin{pmatrix} O(\varepsilon^3) \\ \varepsilon^3 (\det Q_1(x))^{-1} \left( s(x) q_{11}(x)^2 A + 3 r(x) q_{11}(x)^4 A^3 \right) \end{pmatrix} + O(\varepsilon^4)$$
and so
$$s_1(x) = (\det Q_1(x))^{-1} s(x) q_{11}(x)^2 \qquad \text{and} \qquad s_3(x) = 3 (\det Q_1(x))^{-1} r(x) q_{11}(x)^4. \tag{7.7}$$
In $Q_1(x)$, cf. Sect. 5.2 S2), only $\varphi_1(x)$ plays a role and so $q_{11}$ is nothing else than a multiple of $w_1(x, \pi) e^{i\pi x}$. Comparing (7.6) and (7.7) with the formula (1.9) for the coefficient $\gamma_1(\pi)$ in front of the cubic terms in the associated NLS equation shows that $\det Q_1(x)$ is a multiple of $\omega_1(\pi)$. The magnitude of the multiple depends on the scaling of $w_1(x, \pi) e^{i\pi x}$.

Since $\bar s_3$ has a negative sign, system (7.5) possesses a pair of homoclinic solutions $q_{hom} = (A_{hom}, B_{hom})$ which is given by
$$A_{hom}(X) = \pm \sqrt{\frac{2 \bar s_1}{-\bar s_3}}\, \mathrm{sech}\left( \sqrt{\bar s_1}\, X \right), \qquad \partial_X A_{hom} = B_{hom}.$$
Undoing the scaling (7.4) shows that system (7.3) possesses a pair of homoclinic solutions, too.
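The sech profile can be checked directly against the averaged system (7.5), written as the second-order equation ∂_X² A = s̄1 A + s̄3 A³. The following is a minimal sketch, assuming illustrative values for the two averaged coefficients (the names S1 and S3 and their values are hypothetical and are not computed from the paper's s1(x), s3(x)); it evaluates a central-difference residual of that equation along the profile.

```python
import math

def a_hom(X, s1, s3):
    """Profile A(X) = sqrt(2*s1/(-s3)) * sech(sqrt(s1)*X), valid for s1 > 0 > s3."""
    return math.sqrt(2.0 * s1 / (-s3)) / math.cosh(math.sqrt(s1) * X)

def residual(X, s1, s3, h=1e-4):
    """Central-difference residual of A'' - (s1*A + s3*A**3) at the point X."""
    A = a_hom(X, s1, s3)
    A_xx = (a_hom(X + h, s1, s3) - 2.0 * A + a_hom(X - h, s1, s3)) / h**2
    return A_xx - (s1 * A + s3 * A**3)

# Hypothetical sample coefficients; any s1 > 0 and s3 < 0 would do.
S1, S3 = 1.3, -0.7

# Largest residual on a grid covering X in [-5, 5].
max_residual = max(abs(residual(0.1 * k, S1, S3)) for k in range(-50, 51))
print("max residual:", max_residual)
```

The residual is limited only by the finite-difference error, which suggests that the amplitude sqrt(2 s̄1 / (-s̄3)) and the decay rate sqrt(s̄1) are the matching pair for the cubic term with negative coefficient.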


8. Persistence of the Homoclinic Solution

Remark 8.1. The homoclinic orbit $q_{hom}$ lies in the intersection of the stable manifold and the unstable manifold of system (7.3). In general systems, if higher order terms are added, the intersection will break up and the perturbed stable manifold and the unstable manifold will no longer intersect. In reversible systems the situation is different. The persistence of the homoclinic solution is established by proving a transversal intersection of the stable manifold with the fixed space of reversibility. This gives the homoclinic orbit for $x \in [0, \infty)$. Applying the reversibility operator $R$ to this part of the solution gives the homoclinic orbit also for $x \in (-\infty, 0]$.

The persistence proof consists of three steps:

i) Among other things, in [14, Theorem 4.1.1] the following is shown:

Lemma 8.2. There exists a $C^r$-w.r.t. $y$-change of coordinates $z = y + \varepsilon w(y, x, \varepsilon)$ under which (7.2) becomes
$$\dot y = \varepsilon \bar f(y) + \varepsilon^2 f_1(y, x, \varepsilon), \tag{8.1}$$
where $f_1$ is of period 2 w.r.t. $x$.

Hence in an $O(1)$-neighborhood the stable manifold $W^s$ of the averaged system (7.3) and the stable manifold $W_s$ of the full system (7.2), resp. (8.1), are $O(\varepsilon)$-close together.

ii) In addition to the statement in Lemma 8.2, in [14, Theorem 4.1.1] it is shown:

Lemma 8.3. If $z(x)$ and $y(x)$ are solutions of (7.2) and (7.3) with $|z(-1/\varepsilon) - y(-1/\varepsilon)| = O(\varepsilon)$, then $\sup_{x \in [-1/\varepsilon, 0]} |z(x) - y(x)| = O(\varepsilon)$.

iii) Applying the approximation result from Lemma 8.3 shows that the stable manifold $W^s$ of the averaged system (7.3) and the stable manifold $W_s$ of the full system (7.2), resp. (8.1), are $O(\varepsilon)$-close together on a scale $O(1/\varepsilon)$. Hence $O(\varepsilon)$-close to the intersection point of the averaged system (7.3) with the fixed space of reversibility there is an intersection point of the full system (7.2), resp. (8.1). See Fig. 11.

As a consequence we have a solution $V_1(x)$ of (7.2) for $x \in [0, \infty)$ which satisfies $\lim_{x \to \infty} V_1(x) = 0$ and $V_1(0) \in \{B = 0\}$. Finally, we use the reversibility of the reduced system (6.3), resp. (7.2). It allows us to extend $V_1(x)$ for $x \in [0, \infty)$ by $V_1(-x) = R V_1(x)$ to $x \in \mathbb{R}$. Hence we constructed a homoclinic solution to the origin for (6.3) and so as a consequence of the exact center manifold reduction finally for the original system (2.3).
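The O(ε) closeness on a scale O(1/ε) stated in Lemma 8.3 can be observed numerically. The sketch below is illustrative only: the smooth 2-periodic coefficients s1 and s3 are hypothetical stand-ins with means 1 and -1 (the paper's coefficients are only piecewise smooth), the O(ε²) terms of (7.1) are dropped, and both systems are integrated with a classical Runge-Kutta step from identical data at x = -1/ε.

```python
import math

def s1(x):
    # Hypothetical smooth 2-periodic coefficient with mean 1.
    return 1.0 + 0.5 * math.cos(math.pi * x)

def s3(x):
    # Hypothetical smooth 2-periodic coefficient with mean -1.
    return -1.0 + 0.5 * math.sin(math.pi * x)

def full_rhs(x, z, eps):
    # Leading-order part of (7.1): z' = eps * f(z, x).
    a, b = z
    return (eps * b, eps * (s1(x) * a + s3(x) * a**3))

def averaged_rhs(x, y, eps):
    # Averaged system (7.3) with the means of s1 and s3 inserted.
    a, b = y
    return (eps * b, eps * (a - a**3))

def rk4_step(rhs, x, u, h, eps):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(v, k, c):
        return tuple(vi + c * ki for vi, ki in zip(v, k))
    k1 = rhs(x, u, eps)
    k2 = rhs(x + h / 2, shifted(u, k1, h / 2), eps)
    k3 = rhs(x + h / 2, shifted(u, k2, h / 2), eps)
    k4 = rhs(x + h, shifted(u, k3, h), eps)
    return tuple(ui + h / 6 * (p + 2 * q + 2 * r + s)
                 for ui, p, q, r, s in zip(u, k1, k2, k3, k4))

def max_deviation(eps, start=(0.5, 0.0), h=0.01):
    """Sup-norm distance between full and averaged solutions on [-1/eps, 0]."""
    steps = round(1.0 / (eps * h))
    x = -steps * h
    z = y = start
    worst = 0.0
    for _ in range(steps):
        z = rk4_step(full_rhs, x, z, h, eps)
        y = rk4_step(averaged_rhs, x, y, h, eps)
        x += h
        worst = max(worst, abs(z[0] - y[0]), abs(z[1] - y[1]))
    return worst

for eps in (0.04, 0.02, 0.01):
    print(eps, max_deviation(eps))
```

In this toy setting the printed deviation should shrink roughly in proportion to ε, which is the behaviour the lemma describes; it is not a substitute for the Gronwall argument.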

Remark 8.4. We cannot use [14, Theorem 4.1.1 iii)] directly, which is a statement about the closeness of the stable manifolds, since there the closeness for $x = 0$ is assumed, which was exactly the goal of the above steps i) and ii). We used i) to control $x \in (-\infty, -1/\varepsilon)$ and ii) to control $x \in (-1/\varepsilon, 0]$.

Remark 8.5. Finally we remark that in the statement of [14, Theorem 4.1.1] also smoothness of the vector field w.r.t. $x$ is assumed. However, looking at the proof shows that piecewise continuity w.r.t. $x$ is sufficient. Inserting $z = y + \varepsilon w(y, x, \varepsilon)$ into $\dot z = \varepsilon f(z, x, \varepsilon)$ shows that
$$\dot y = (1 + \varepsilon \partial_1 w(y, x, \varepsilon))^{-1} \left( -\varepsilon \partial_2 w(y, x, \varepsilon) + \varepsilon f(y + \varepsilon w(y, x, \varepsilon), x, \varepsilon) \right) = -\varepsilon \partial_2 w(y, x, \varepsilon) + \varepsilon f(y, x, 0) + O(\varepsilon^2).$$


Fig. 10. The homoclinic orbits for the reduced system

Fig. 11. The combination of local estimate for the difference from (i) with the approximation result from ii). The dotted/full line is the stable manifold of the averaged system (7.3)/full system (7.2)

As a consequence the $x$-dependent parts of $f(y, x, \varepsilon) = \sum_{m \in \mathbb{Z}} f_m(y, \varepsilon) e^{i\pi m x}$ can be eliminated by choosing $w(y, x, \varepsilon) = \sum_{m \in \mathbb{Z} \setminus \{0\}} (i\pi m)^{-1} f_m(y, \varepsilon) e^{i\pi m x}$. Hence $w$ is at least continuous w.r.t. $x$ and so the transformed system has the same regularity w.r.t. $x$ as the original one. Lemma 8.3 follows from a simple application of Gronwall’s inequality for which the given regularity w.r.t. $x$ is sufficient.

Acknowledgements. The authors are grateful for discussions with K. Busch, D. Pelinovsky, L. Tkeshelashvili, and H. Uecker. The research is partially supported by the Graduiertenkolleg 1294: Analysis, Simulation, and Design of Nanotechnological Processes sponsored by the Deutsche Forschungsgemeinschaft (DFG) and the Land Baden-Württemberg. Finally, we would like to thank the referees for a number of useful suggestions, especially for pointing out the possibility of writing down the explicit formula (4.5) for $q_0$.

A. The Physical Motivation: Photonic Crystals as Optical Storage

This research is motivated by theoretical scenarios where photonic crystals are used as optical storage [5]. Photonic crystals consist of a dielectric material, for instance glass, with a periodic structure with a wavelength comparable to the wavelength of light. They are suitable tools for the construction of all optical devices in photonics which is loosely speaking electronics with photons instead of electrons. The light pulses in the photonic crystal are described by Maxwell’s equations (cf. [9]) in media which are given by
$$\nabla \cdot (E + P) = \rho, \tag{A.1}$$
$$\nabla \times E = \partial_t (B + M), \tag{A.2}$$
$$\nabla \times B = -\partial_t (E + P) + j, \tag{A.3}$$
$$\nabla \cdot (B + M) = 0, \tag{A.4}$$
where $E$ is the electric field, $B$ the magnetic field, $P$ the polarization, $M$ the magnetization, $j$ the electric current density, and $\rho$ the electric charge density, where by rescaling all coefficients have been set to one. In photonic crystals there are no free charges, no electric current, and no magnetization, i.e., $\rho = 0$, $j = 0$ and $M = 0$. Differentiating (A.3) w.r.t. $t$ and substituting $\partial_t B$ via (A.2) yields
$$\Delta E - \nabla(\nabla \cdot E) = \partial_t^2 E + \partial_t^2 P, \tag{A.5}$$
where we additionally used the rule $\nabla \times \nabla \times u = -\Delta u + \nabla(\nabla \cdot u)$. For polarized light, i.e., in the one-dimensional situation, (A.5) simplifies into
$$\partial_x^2 E = \partial_t^2 E + \partial_t^2 P. \tag{A.6}$$

In order to close (A.6) the polarization $P$ has to be expressed in terms of $E$. Equation (1.1) can be obtained by the following modeling. We split $P$ into an instantaneous part $P_{inst}$ and into a non-instantaneous part $P_{non}$. For the instantaneous part $P_{inst}$ we choose the linear constitutive law $P_{inst} = (s - 1)E$ with $s(x) = s(x + 1)$ a one-periodic function. In the simplest model for the non-instantaneous part $P_{non}$ the nuclei of the atoms are fixed and the centers of the electrons move like a nonlinear oscillator. This simple modeling finally leads to a system
$$\partial_x^2 E = s \partial_t^2 E + \partial_t^2 P_{non}, \tag{A.7}$$
$$P_{non} = \sum_{j=1}^N P_j, \tag{A.8}$$
$$\partial_t^2 P_j + \omega_j^2 P_j + r_j P_j |P_j|^2 = d_j E, \tag{A.9}$$
with constants $\omega_j$, $r_j$, $d_j$, and where $N$ is the number of different kinds of molecules. In our modeling dissipation is neglected. The argument to come to (1.1) is as follows. For $E = E_0 e^{i\omega t}$, Eq. (A.9) possesses solutions $P_j = P_{0j} e^{i\omega t}$ with
$$-\omega^2 P_{0j} + \omega_j^2 P_{0j} + r_j P_{0j} |P_{0j}|^2 = d_j E_0. \tag{A.10}$$
For photonic crystals the parameters $\omega_j^2$, $r_j$, and $d_j$ depend periodically on $x$. For small $E_0$ (A.10) can be solved w.r.t. $P_{0j}$, i.e., we have a constitutive law
$$P_{0j}(x, \omega) = \alpha_j(x, \omega) E_{0j}(x, \omega) + \beta_j(x, \omega) E_{0j}^3(x, \omega) + \cdots. \tag{A.11}$$
For $\omega$ in the optical window the changes in $\alpha_j(x, \omega)$ are negligible w.r.t. $\omega$, i.e., the relation (A.11) is modeled independently of $\omega$, or equivalently $\omega =: \tilde\omega_j$ is fixed. Then multiplying (A.10) by $e^{i\omega t}$ yields the relation
$$(-\tilde\omega_j^2 + \omega_j^2 + r_j |P_j|^2) P_j = d_j E,$$
which can be inverted for small $E$, i.e.,
$$P_j = \frac{d_j}{\omega_j^2 - \tilde\omega_j^2} E - \frac{r_j}{\omega_j^2 - \tilde\omega_j^2} \left( \frac{d_j}{\omega_j^2 - \tilde\omega_j^2} \right)^3 |E|^2 E + \cdots. \tag{A.12}$$
Next we replace $\partial_t^2 P$ in (A.7) via (A.8) and (A.9), i.e.,
$$\partial_x^2 E = s \partial_t^2 E + \sum_{j=1}^N \left( d_j E - \omega_j^2 P_j - r_j P_j |P_j|^2 \right). \tag{A.13}$$
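The small-field inversion of the relation (ω_j² − ω̃_j² + r_j |P_j|²) P_j = d_j E can be sanity-checked for real-valued fields. The sketch below is an illustration under stated assumptions: DELTA stands for ω_j² − ω̃_j², and the numerical values of DELTA, R and D are hypothetical. It solves the cubic relation by Newton's method and compares the root with the two-term expansion P ≈ aE − (r/Δ) a³ E³, where a = d/Δ; matching the E³ terms gives Δ·b + r·a³ = 0 for the cubic coefficient b.

```python
def exact_p(E, delta, r, d, iterations=60):
    """Solve (delta + r*P**2) * P = d*E for real P by Newton's method."""
    P = d * E / delta  # start from the linear response
    for _ in range(iterations):
        F = (delta + r * P * P) * P - d * E
        dF = delta + 3.0 * r * P * P
        P -= F / dF
    return P

def series_p(E, delta, r, d):
    """Two-term small-field expansion: a*E - (r/delta) * a**3 * E**3 with a = d/delta."""
    a = d / delta
    return a * E - (r / delta) * a**3 * E**3

# Hypothetical parameter values, chosen only for illustration.
DELTA, R, D = 2.0, 1.5, 0.8

for E in (0.2, 0.1, 0.05):
    err = abs(exact_p(E, DELTA, R, D) - series_p(E, DELTA, R, D))
    print(E, err, err / E**5)
```

The ratio err / E⁵ should settle to a constant as E decreases, consistent with the neglected terms being of fifth order.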

Using (A.12) to replace $P_j$ in (A.13) yields (1.1) when terms of order $O(|P_j|^5)$ are neglected. We do not claim that this modeling is the physically most realistic modeling of photonic crystals, but we claim that with the same arguments which are used in the derivation of the other simplified models for the description of photonic crystals our starting Eq. (1.1) can be derived, too. Since we are interested in real-valued solutions we have $E|E|^2 = E^3$.

Our Theorem 1.1 guarantees that in this modeling an infinitely extended photonic crystal can be designed which can be used as perfect optical storage, where the breather solution of Theorem 1.1 stands for a one in the digital encoding of information. There is no radiation and the information will be stored for all times. In reality the photonic crystals have finite size such that radiation of the pulse will be present. Although much smaller than the dispersion which comes from the periodic structure, dissipation cannot be neglected in the long term. But the strongest obstruction in practice is that the pulse will move with a fraction of the velocity of light if the underlying carrier wave does not have precisely the wave number with the horizontal tangency. Hence, w.r.t. this possible application our result so far is mainly of theoretical interest.

References

1. Bambusi, D., Paleari, S., Penati, T.: Existence and continuous approximation of small amplitude breathers in 1D and 2D Klein–Gordon lattices. Preprint 2009
2. Bambusi, D., Penati, T.: Continuous approximation of breathers in one and two dimensional DNLS lattices. Nonlinearity 23, 143–157 (2010)
3. Birnir, B., McKean, H.P., Weinstein, A.: The rigidity of sine-Gordon breathers. Comm. Pure Appl. Math. 47(8), 1043–1051 (1994)
4. Busch, K., Schneider, G., Tkeshelashvili, L., Uecker, H.: Justification of the Nonlinear Schrödinger equation in spatially periodic media. ZAMP 57, 1–35 (2006)
5. Busch, K., von Freyman, G., Linden, S., Mingaleev, S.F., Tkeshelashvili, L., Wegener, M.: Periodic nanostructures for photonics. Phys. Rep. 444, 101–202 (2007)
6. Denzler, J.: Nonpersistence of breather families for the perturbed sine Gordon equation. Commun. Math. Phys. 158(2), 397–430 (1993)
7. Eastham, M.S.P.: The spectral theory of periodic differential equations. Edinburgh: Scottish Academic Press, 1973
8. Eckmann, J.-P., Wayne, C.E.: The nonlinear stability of front solutions for parabolic partial differential equations. Commun. Math. Phys. 161(2), 323–334 (1994)
9. Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman lectures on physics. Vol. 2: Mainly electromagnetism and matter. Reading, MA-London: Addison-Wesley Publishing Co., Inc., 1964
10. Groves, M.D., Mielke, A.: A spatial dynamics approach to three-dimensional gravity-capillary steady water waves. Proc. Roy. Soc. Edinburgh Sect. A 131, 83–136 (2001)
11. Groves, M.D., Schneider, G.: Modulating pulse solutions for a class of nonlinear wave equations. Commun. Math. Phys. 219(3), 489–522 (2001)
12. Groves, M.D., Schneider, G.: Modulating pulse solutions for quasilinear wave equations. J. Diff. Eq. 219(1), 221–258 (2005)
13. Groves, M.D., Schneider, G.: Modulating pulse solutions to quadratic quasilinear wave equations over exponentially long length scales. Commun. Math. Phys. 278(3), 567–625 (2008)
14. Guckenheimer, J., Holmes, P.: Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Applied Mathematical Sciences, 42. New York: Springer-Verlag, 1983
15. Haragus, M., Schneider, G.: Bifurcating fronts for the Taylor-Couette problem in infinite cylinders. Z. Angew. Math. Phys. 50(1), 120–151 (1999)
16. Henry, D.: Geometric Theory of Semilinear Parabolic Equations. Springer Lecture Notes in Mathematics Vol. 840, Berlin-Heidelberg-New York: Springer, 1981
17. James, G., Noble, P.: Breathers on diatomic Fermi-Pasta-Ulam lattices. Physica D 196(1–2), 124–171 (2004)
18. James, G., Sirr, Y.: Center manifold theory in the context of infinite one-dimensional lattices. The Fermi-Pasta-Ulam problem, Lecture Notes in Phys. Vol. 728, Berlin-Heidelberg-New York: Springer, 2008, pp. 208–238
19. James, G., Sanchez-Rey, B., Cuevas, J.: Breathers in inhomogeneous nonlinear lattices: an analysis via center manifold reduction. Rev. Math. Phys. 21(1), 1–59 (2009)
20. MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)
21. Kirchgässner, K.: Wave solutions of reversible systems and applications. J. Diff. Eq. 45, 113–127 (1982)
22. Lescarret, V., Blank, C., Chirilus-Bruckner, M., Chong, C., Schneider, G.: Standing modulating pulse solutions for a nonlinear wave equation in periodic media. Nonlinearity 22(8), 1869–1898 (2009)
23. Ntinos, A.A.: Lengths of instability intervals of second order periodic differential equations. Quart. J. Math. Oxford 27, 387–394 (1976)
24. Pankov, A.: Periodic nonlinear Schrödinger equation with application to photonic crystals. Milan J. Math. 73, 259–287 (2005)
25. Pelinovsky, D.E., Kevrekidis, P.G., Frantzeskakis, D.J.: Persistence and stability of discrete vortices in nonlinear Schrödinger lattices. Physica D 212, 20–53 (2005)
26. Pelinovsky, D., Schneider, G.: Justification of the coupled-mode approximation for a nonlinear elliptic problem with a periodic potential. Applicable Analysis 86(8), 1017–1036 (2007)
27. Pelinovsky, D., Schneider, G., MacKay, R.S.: Justification of the lattice equation for a nonlinear elliptic problem with a periodic potential. Commun. Math. Phys. 284(3), 803–831 (2008)
28. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York-London: Academic Press, 1978
29. Vanderbauwhede, A., Iooss, G.: Center manifold theory in infinite dimensions. In: Dynamics reported: expositions in dynamical systems, Berlin: Springer, 1992, pp. 125–163

Communicated by P. Constantin

Commun. Math. Phys. 302, 843–873 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1169-6

Communications in

Mathematical Physics

Quaternionic Kähler Detour Complexes and N = 2 Supersymmetric Black Holes D. Cherney1 , E. Latini1,2 , A. Waldron1 1 Department of Mathematics, University of California, Davis, CA 95616, USA.

E-mail: [email protected]; [email protected]; [email protected]

2 INFN, Laboratori Nazionali di Frascati, CP 13, I-00044 Frascati, Italy. E-mail: [email protected]

Received: 18 March 2010 / Accepted: 6 August 2010 Published online: 5 January 2011 – © The Author(s) 2011. This article is published with open access at Springerlink.com

Abstract: We study a class of supersymmetric spinning particle models derived from the radial quantization of stationary, spherically symmetric black holes of four dimensional N = 2 supergravities. By virtue of the c-map, these spinning particles move in quaternionic Kähler manifolds. Their spinning degrees of freedom describe mini-superspace-reduced supergravity fermions. We quantize these models using BRST detour complex technology. The construction of a nilpotent BRST charge is achieved by using local (worldline) supersymmetry ghosts to generate special holonomy transformations. (An interesting byproduct of the construction is a novel Dirac operator on the superghost extended Hilbert space.) The resulting quantized models are gauge invariant field theories with fields equaling sections of special quaternionic vector bundles. They underlie and generalize the quaternionic version of Dolbeault cohomology discovered by Baston. In fact, Baston’s complex is related to the BPS sector of the models we write down. Our results rely on a calculus of operators on quaternionic Kähler manifolds that follows from BRST machinery, and although directly motivated by black hole physics, can be broadly applied to any model relying on quaternionic geometry.

Contents
1. Introduction
2. Detour Complexes
3. Special Geometry
4. N = 2 Supersymmetric Black Holes and Quaternionic Geometry
5. HyperKähler Sigma Model
   5.1 Quantization
   5.2 Charges
   5.3 Summary
6. Quaternionic Kähler, N = 4, d = 1 SUGRA
   6.1 Rigid Lefschetz–Verbitsky model
   6.2 Gauged Lefschetz–Verbitsky model
   6.3 Dirac quantization
7. BRST and the Geometry of Ghosts
8. A Quaternionic Geometric Calculus
9. The Quaternionic Kähler Detour Complex
10. Conclusions

1. Introduction

The main result of this paper is a detour complex for quaternionic Kähler manifolds. In physics language, this amounts to a gauge theory of higher (quaternionic) “forms” on these manifolds. To be precise, we utilize special holonomy to split the tangent bundle of a 4n-dimensional quaternionic Kähler manifold M into a product of rank 2 and 2n vector bundles H and E [1],
$$TM \cong E \otimes H,$$
and present an equation of motion and gauge invariances for sections of ∧E (or, more generally, ∧E ⊗ H). The results of the paper will appeal to multiple audiences including:

(i) Those readers interested in the differential geometry of quaternionic Kähler spaces.
(ii) Readers studying various supersymmetric quantum mechanical and spinning particle models in quaternionic Kähler and hyperKähler backgrounds (such as gravitational instanton moduli spaces [2], Hitchin’s moduli space of stable Higgs bundles [3,4], geometric Langlands theory [5] and hypermultiplet moduli spaces [6], to name a few).
(iii) Readers looking for applications of the BRST detour quantization of orthosymplectic constraint algebras developed for applications to higher spin systems in [7], on which these results heavily rely.
(iv) Readers wanting to apply our results to supergravity (SUGRA) black hole quantization since, remarkably, the mathematical structure presented above is exactly what is called for when studying the minisuperspace quantization of N = 2 SUGRA black holes [8–11]. (In particular, wavefunctions valued in ∧E describe the fermionic degrees of freedom of these models.)

Therefore the paper is structured so that any of these readerships can easily extract the information they need. In Sect. 2, we introduce the notion of a detour complex, beginning with simple examples. We then generalize our previous results on Kähler detour complexes to hyperKähler manifolds.
This result follows immediately from an isomorphism between super Lie algebras of geometric operators mapping Dolbeault and Lefschetz operators on Kähler forms to their hyperKähler analogues acting on sections of ∧E. We then explain a main difficulty solved in this paper: the construction of a geometric detour complex for quaternionic Kähler manifolds is seemingly obstructed by the higher rank of the analogous geometric super algebra. This problem is overcome in later sections by understanding the key rôle played by the BRST superghosts in the description of quaternionic geometry. The main requisite geometric data is presented in Sect. 3 together with our notations and conventions. In Sect. 4 we review the relationship between quaternionic Kähler spinning particles and four dimensional black holes; the original motivation for this work. The latter can be described by a spinning particle model coming from the minisuperspace reduction of N = 2 supergravities [8,9]. The “BPS” conditions of this spinning particle model (i.e., requiring solutions for which the local fermion supersymmetry transformations vanish) equal the reduction of the analogous conditions in the four dimensional SUGRA.


Since those conditions amount to the attractor mechanism [12–14] for four dimensional supersymmetric black holes, the quantized spinning particle model is an excellent laboratory for studying these objects.1 In particular, it allows a minisuperspace analysis of the Ooguri–Strominger–Vafa conjecture [19] and the relationship between black hole wave functions and vacuum selection in string theory [20]. This equivalence between the attractor flow equation and supersymmetric geodesic motion was observed in [8,9,21]. The introduction of BRST techniques to solve what could be stated as a purely geometrical problem suggests the presence of an underlying gauge invariant physical model. This is indeed the case. The first of the relevant models is a hyperKähler supersymmetric quantum mechanics. This model can be enhanced to include quaternionic Kähler backgrounds once its four worldline supersymmetries are gauged. This yields a supersymmetric spinning particle model consistent in any quaternionic Kähler manifold. We describe these models in Sects. 5 and 6, respectively. Sections 3, 8 and 9 can in principle be read by geometers in isolation from the other more physical sections. In Sect. 8, we give a calculus of geometric operators acting on sections of ∧E. Although, we were motivated to write these operators for quantum mechanical BRST reasons, the results themselves are purely geometric. They form the basic building blocks of our quaternionic detour complex. They also place in a much more general setting the Dirac, Dirac–Fueter and detour operator employed some time ago by Baston [24]. Finally our main result is given in Sect. 9, orchestrating all the previous results to build a gauge invariant, higher “form” quantum field theory on quaternionic Kähler manifolds. It relies on the construction of a nilpotent BRST charge given in Sect. 7 achieved by utilizing the supersymmetry ghosts to generate special holonomy transformations. 
An interesting byproduct of this computation is a novel Dirac operator on the BRST superghost Hilbert space. Aside from providing an explicit quantization of the fermion modes of minisuperspace N = 2 supersymmetric black holes, our quaternionic detour complex has many potential further applications and generalizations. In particular, it is closely related to the twistor methods of [25]. Also, in some sense, the model is a higher spin theory, so the methods of Vasiliev may be applicable to writing interactions for infinite towers of these quantum fields (see [26,27] for an excellent review of these methods). Given the existence of the underlying SUGRA theory, this is a very tantalizing possibility. These and other directions for future work are discussed in the conclusions.

2. Detour Complexes

The simplest example of a geometrical detour complex is given by the superalgebra, on any Riemannian manifold M, generated by the exterior derivative d and the codifferential δ:
$$\{\delta, d\} = \Delta. \tag{1}$$
Here, the right hand side is the form Laplacian which is a central element of this algebra. These operators act on differential forms in $\Gamma(\wedge M)$, which may be viewed as wavefunctions of an N = 2 supersymmetric quantum mechanical model [28], with the Hamiltonian $\Delta$ and (δ, d) the two supercharges. Gauging the corresponding worldline translation and supersymmetries yields a spinning particle (or 1-dimensional SUGRA) model which can be quantized using BRST machinery. In mathematical terms this amounts to computing the Lie algebra cohomology of the superalgebra (1). However, when defining Lie algebra cohomology for superalgebras, some care is needed [29]. In physics terms this amounts to choices of vacua/polarizations for commuting superghosts [30,31]. It turns out that a distinguished choice exists such that the cohomology is neatly arranged in terms of gauge invariances, Bianchi identities and the equations of motion of a gauge invariant field theory. In a higher spin setting this was first observed in the context of an unfolded formulation and what is called the “twisted adjoint representation” [37,38]. (Very recently the unfolding technique has been shown to be equivalent to the BRST one [32]. The idea of studying worldline descriptions of higher spin systems, via detour and path integral quantization has also been analyzed in [33] and [34,35].) In [36] we used a split choice of ghost polarization² to construct detour complexes from constraint algebras. (For systems with anti-commuting ghosts, this method reproduces known results [44,45] for totally symmetric higher spin fields.) The term “detour complex” was chosen because the BRST technology produced complexes of the type studied recently by conformal geometers, the main idea being to connect standard complexes and their duals by (typically higher order in derivatives) detour operators [46–49]. For the simplest case of the de Rham complex, the detour machinery yields a cohomology neatly encapsulated by the complex
$$\cdots \xrightarrow{\ d\ } \wedge M \xrightarrow{\ d\ } \wedge M \xrightarrow{\ d\ } \wedge M \longrightarrow \cdots$$
$$\Big\downarrow {\scriptstyle \delta d}$$
$$\cdots \longrightarrow \wedge M \xrightarrow{\ \delta\ } \wedge M \xrightarrow{\ \delta\ } \wedge M \xrightarrow{\ \delta\ } \cdots.$$

¹ A very useful introduction to BPS black holes and the attractor mechanism is [10,11] (the formulation in [16–18] also fits our viewpoint well).
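The algebraic content of (1), namely that d squares to zero and that the Laplacian built from d and its adjoint commutes with both, can be illustrated on a finite-dimensional toy model. The sketch below is an assumption-laden analogue, not the geometric operators of the paper: it uses cochains on a single filled triangle, with d given by the signed incidence matrices and δ taken to be the transpose of d.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Basis ordering: vertices v0, v1, v2; edges e01, e02, e12; the triangle t.
SIZE = 7
d = [[0] * SIZE for _ in range(SIZE)]

# 0-cochains -> 1-cochains: (d f)(e_ij) = f(v_j) - f(v_i).
for row, (i, j) in enumerate([(0, 1), (0, 2), (1, 2)]):
    d[3 + row][i] = -1
    d[3 + row][j] = 1

# 1-cochains -> 2-cochains: (d g)(t) = g(e01) - g(e02) + g(e12).
d[6][3], d[6][4], d[6][5] = 1, -1, 1

delta = transpose(d)                                 # discrete codifferential
laplacian = add(matmul(delta, d), matmul(d, delta))  # {delta, d}

zero = [[0] * SIZE for _ in range(SIZE)]
d_squared = matmul(d, d)
comm_d = sub(matmul(laplacian, d), matmul(d, laplacian))
comm_delta = sub(matmul(laplacian, delta), matmul(delta, laplacian))

print("d*d == 0:", d_squared == zero)
print("[Laplacian, d] == 0:", comm_d == zero)
print("[Laplacian, delta] == 0:", comm_delta == zero)
```

In this toy model the centrality of the Laplacian follows from d² = 0 alone, which is the same mechanism that makes the anticommutator in (1) a central element.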

The self-adjoint detour operator δd encodes the equations of motion δd A = 0 of a p-form gauge field A and connects the standard de Rham complex to its dual. These incoming and outgoing complexes encode the gauge and gauge for gauge symmetries, and Bianchi as well as Bianchi for Bianchi identities of p-form electromagnetism.

A more sophisticated example is that of the Kähler detour complex; on these manifolds the exterior derivative and codifferential decompose into Dolbeault operators and their duals [50,51]
$$d = \partial + \bar\partial, \qquad \delta = \partial^* + \bar\partial^*,$$
subject to the superalgebra
$$\{\partial, \partial^*\} = \tfrac{1}{2} \Delta = \{\bar\partial, \bar\partial^*\}.$$
In addition, an sl(2) Lefschetz algebra acts on the Dolbeault cohomology of a Kähler manifold M. This corresponds to the R symmetry algebra of the above N = 4 superalgebra
$$\left[ \Lambda, \begin{pmatrix} \partial \\ \bar\partial \end{pmatrix} \right] = \begin{pmatrix} \bar\partial^* \\ -\partial^* \end{pmatrix}, \qquad \left[ \begin{pmatrix} \bar\partial^* \\ -\partial^* \end{pmatrix}, L \right] = \begin{pmatrix} \partial \\ \bar\partial \end{pmatrix},$$
$$[H, \Lambda] = -2\Lambda, \qquad [H, L] = 2L, \qquad [\Lambda, L] = H.$$
Differential forms on a Kähler manifold are bigraded by their holomorphic and antiholomorphic degrees (p, q) in terms of which the eigenvalues of the operator H are $p + q - \tfrac{1}{2} \dim M$. The operator $\Lambda$ maps (p, q) to (p − 1, q − 1)-forms by contracting with the Kähler form and the operator L is its dual. The Kähler analog of p-form electromagnetism [52] follows by a detour complex treatment of the spinning particle³ model obtained by gauging worldline translations, supersymmetries and the R-symmetry $\Lambda$. Nilpotency of $Q = \partial \frac{\partial}{\partial p} + \bar\partial \frac{\partial}{\partial \bar p}$ acting on polynomials in Grassmann even variables $p, \bar p$ with coefficients in ∧M yields the left-hand side of the complex.

[Diagram: the Kähler detour complex. The incoming complex on the left is built from the Dolbeault operators ∂ and ∂̄; it is connected by the detour operator G to the outgoing dual complex on the right, built from ∂* and ∂̄*.]

Upon fixing a dimension for M and a bi-grading (p, q) this incoming complex becomes the Hodge diamond from complex manifold theory. It may be interpreted as gauge (and gauge for gauge) invariances of the “long” or detour operator G. Explicitly, gauge invariance reads $A \to A + \partial\alpha + \bar\partial\bar\alpha$. Clearly the equations $\partial\bar\partial A = 0$ are invariant, yet potentially over or underdetermined. Taking the Kähler trace yields the desired equations of motion $\Lambda\partial\bar\partial A = 0$. However, the operator $\Lambda\partial\bar\partial$ is not self-adjoint and so does not naturally connect the “incoming” Dolbeault complex with the “outgoing” dual complex depicted on the right hand side above. The self-adjoint operator
$$G = \; : I_0(2\sqrt{L\Lambda}) \left( \Delta - 2\partial\partial^* - 2\bar\partial\bar\partial^* \right) + 2\, \frac{I_1(2\sqrt{L\Lambda})}{\sqrt{L\Lambda}} \left( \partial\bar\partial\Lambda + L\partial^*\bar\partial^* \right) :$$
found in [52] gives an equivalent equation of motion G A = 0. Here : • : denotes normal ordering of • by form degree and the functional dependence on $L\Lambda$ through the modified Bessel functions of the first kind is analytic at the origin.

² The technique of split ghost polarizations is equivalent to the twisted adjoint representation of [37,38]. It has also been employed in [39–43].

³ Supersymmetric mechanics on Kähler manifolds have been extensively studied in [53–56] and [57–60].

In the special case that M is hyperKähler, replacing differential forms by sections of ∧E gives another representation of the above N = 4 supersymmetry algebra: The


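tangent bundle TM for 4n-dimensional manifolds M with quaternionic holonomy splits into a product of vector bundles
$$TM \cong H \otimes E,$$
of rank 2 and 2n, respectively. The connection on a hyperKähler manifold acts on sections $X^\alpha$ and $X^A$ of H and E, respectively, as
$$\nabla X^\alpha = dX^\alpha + \omega^\alpha{}_\beta X^\beta, \qquad \nabla X^A = dX^A + \Omega^A{}_B X^B,$$
where the one-form $\Omega^A{}_B$ is sp(2n)-valued. Writing the Levi-Civita connection as $\nabla_{\alpha A}$ in a basis for H ⊗ E, there are sp(2) doublets of exterior derivatives and codifferentials acting on ∧E via
$$d^\alpha : X_{A_1 \ldots A_k} \mapsto \nabla^\alpha{}_{[A_1} X_{A_2 \ldots A_{k+1}]}, \qquad \delta^\alpha : X_{A_1 \ldots A_k} \mapsto k \nabla^{\alpha A} X_{A A_1 \ldots A_{k-1}},$$
in the index notation explained in Sect. 3. They obey the N = 4 algebra
$$\{d^\alpha, d^\beta\} = 0 = \{\delta^\alpha, \delta^\beta\}, \qquad \{\delta^\alpha, d_\beta\} = -\tfrac{1}{2} \delta^\alpha_\beta \Delta,$$
where $\Delta$ is the Bochner Laplacian $\nabla_\mu \nabla^\mu$. Only an sp(2) subalgebra of the so(2, 2) R-symmetry of this N = 4 superalgebra acts non-trivially in this hyperKähler representation. The non-trivial R-symmetries are built from the sp(2n) invariant tensor J,
$$g : X_{A_1 \ldots A_k} \mapsto J_{[A_1 A_2} X_{A_3 \ldots A_{k+2}]}, \qquad N : X_{A_1 \ldots A_k} \mapsto k X_{A_1 \ldots A_k}, \qquad \mathrm{tr} : X_{A_1 \ldots A_k} \mapsto k(k-1) J^{AB} X_{B A A_1 \ldots A_{k-2}},$$
and obey the algebra
$$[\mathrm{tr}, N] = 2\, \mathrm{tr}, \qquad [\mathrm{tr}, g] = 4(N - n), \qquad [N, g] = 2g,$$
$$[\delta^\alpha, N] = \delta^\alpha, \qquad [N, d^\alpha] = d^\alpha, \qquad [\delta^\alpha, g] = 2 d^\alpha, \qquad [\mathrm{tr}, d^\alpha] = 2 \delta^\alpha.$$
The dictionary
$$d^\alpha \leftrightarrow \begin{pmatrix} \partial \\ \bar\partial \end{pmatrix}, \qquad \delta^\alpha \leftrightarrow \begin{pmatrix} \bar\partial^* \\ -\partial^* \end{pmatrix}, \qquad g \leftrightarrow 2L, \qquad \mathrm{tr} \leftrightarrow 2\Lambda,$$
between the Kähler and hyperKähler representations of the N = 4 superalgebra allows the Kähler detour complex to be translated directly to a hyperKähler one. In particular, nilpotence of the operator $Q = d^\alpha \frac{\partial}{\partial p^\alpha}$ on polynomials in the Grassmann even variables $p^\alpha$ with coefficients in $\Gamma(\wedge E)$ gives gauge and gauge for gauge invariances of the over-determined, Maxwell like, and Einstein versions of the hyperKähler equations of motion
$$d_\alpha d^\alpha A = 0 \;\Rightarrow\; \mathrm{tr}\, d_\alpha d^\alpha A = 0 \;\Leftrightarrow\; G A = 0,$$
$$G = \; : I_0(\sqrt{g\, \mathrm{tr}}) \left( \Delta + 2 d_\alpha \delta^\alpha \right) - 2\, \frac{I_1(\sqrt{g\, \mathrm{tr}})}{\sqrt{g\, \mathrm{tr}}} \left( d_\alpha d^\alpha\, \mathrm{tr} + g\, \delta_\alpha \delta^\alpha \right) :,$$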

Quaternionic Kähler Detour Complexes and N = 2 Supersymmetric Black Holes


for gauge fields $A \in \Gamma(\wedge E)$. Explicitly, the gauge invariance reads $A \to A + d_\alpha \alpha^\alpha$. The equation of motion $d_\alpha d^\alpha A = 0$ was first generalized to the more complicated quaternionic Kähler case by Baston [24], and later recovered in the context of BPS, N = 2 supersymmetric black hole systems in [25]. The main result of this paper is to further extend this generalization to the full “Einstein” equations of motion $G A = 0$ in the quaternionic Kähler setting. It relies on a trio of geometric operators (one of which is Baston’s original second order operator) transforming as a triplet under $sp(2)$ R-symmetries. We now present the basic geometric data on quaternionic Kähler manifolds needed for this paper.

3. Special Geometry

HyperKähler and quaternionic Kähler manifolds in dimension $4n$ and signature $(2n, 2n)$ enjoy $sp(2n)$ and $sp(2) \otimes sp(2n)$ holonomy, respectively.4 In either case, this implies that the tangent bundle splits into a product of vector bundles [6]

\[ TM \cong H \otimes E \]

of rank 2 and $2n$, respectively. Therefore, we denote curved and flat indices by $\mu, \nu, \ldots$ and $m, n, \ldots$ respectively, and decompose tangent space indices as $m = \alpha A$, where $A = 1, \ldots, 2n$ and $\alpha = 1, 2$ label the fundamental representations of $sp(2n)$ and $sp(2)$, respectively. The invariant $so(2n, 2n)$ metric decomposes this way as

\[ \eta_{mn} = \varepsilon_{\alpha\beta} J_{AB}, \]

where $\varepsilon_{\alpha\beta}$ and $J_{AB}$ are the $sp(2)$ and $sp(2n)$ invariant, antisymmetric tensors. This allows for all indices to be raised and lowered independently. For example, $v^A \equiv J^{AB} v_B$, $v_\alpha \equiv v^\beta \varepsilon_{\beta\alpha}$ and $\varepsilon_\alpha{}^\beta = \delta_\alpha^\beta = -\varepsilon^\beta{}_\alpha$. Note that we use an uphill convention. The action of the connection on sections of $H$ and $E$, respectively, is given by

\[ \nabla X^\alpha = dX^\alpha + \omega^\alpha{}_\beta X^\beta, \qquad \nabla X^A = dX^A + \Omega^A{}_B X^B, \]

where both $\omega_{\alpha\beta}$ and $\Omega_{AB}$ are symmetric. On hyperKähler manifolds, only the latter is non-zero. This may be extended to arbitrary tensor products of sections of $H$ and $E$ in the obvious way. For the purposes of calculations involving such products, we specify this action by introducing representations of the $sp(2n)$ and $sp(2)$ subalgebras of the full local Lorentz algebra $so(2n, 2n)$. The generators of these algebras are represented as operators $T^{AB}$ and $t^{\alpha\beta}$, indexed by symmetric pairs of indices, that act on $sp(2n)$ and $sp(2)$ indices by

\[ T^{AB} X^C = J^{CA} X^B + J^{CB} X^A, \qquad t^{\alpha\beta} X^\gamma = \varepsilon^{\gamma\alpha} X^\beta + \varepsilon^{\gamma\beta} X^\alpha. \tag{2} \]

4 The maximally split signature corresponds to paraquaternionic holonomy – all our results apply to general signatures, this choice being a matter of notational convenience.


These operators satisfy [T AB , T C D ] = J C A T B D + J C B T AD + J D A T BC + J D B T AC , αβ γ δ t ,t = εγ α t βδ + εγβ t αδ + εδα t βγ + εδβ t αγ , their extension to higher tensors is by the usual Leibnitz rule, and thus ∇=d+

1 α β 1 A B ω t + T . 2 β α 2 B A

Throughout this paper, the symbol ∇ will refer to this definition. The final geometric ingredient needed here is the Riemann tensor. As a result of special holonomy it has the decomposition [6] Rα A β B γ C δ D = ε(α|γ | εβ)δ J AB JC D + εαβ εγ δ [J(A|C| J B)D + ABC D ].

(3)

Hence, the commutator of covariant derivatives on sections of H and E follows from: [∇ Aα , ∇ Bβ ] φCγ = J B A εγ (α φCβ) + εβα JC(A φ B)γ + εβα D ABC φ Dγ . This specifies an action on higher rank tensors which can be succinctly expressed in terms of the operators 1 1 C [∇ Aα , ∇ Bβ ] = J B A tαβ + εβα T AB + D T ABC D . 2 2 The tensor ABC D is totally symmetric and will appear only seldomly in this paper since it cannot couple to the antisymmetric sections of ∧E which appear in our models. The terms proportional to the constant are present only on quaternionic Kähler manifolds and vanish for the hyperKähler case.5 Finally, note that the Ricci and scalar curvatures are Rmn = −(n + 2)ηmn and R = −4n(n + 2). 4. N = 2 Supersymmetric Black Holes and Quaternionic Geometry Breitenlohner, Maison and Gibbons [60] showed that Kaluza–Klein reduction along a single isometry of a four dimensional, curved space non-linear sigma models coupled to Maxwell fields 1 1 I (4) 4 A ∗ B ∗ J J S=− d x −g R + gAB (ϕ)dϕ ∧ dϕ + F ∧ MIJ F + NIJ F , 2 2 (where A, B = 1, . . . , n S the number of scalar fields and I, J = 1, . . . , n V the number of vector fields) yields a three dimensional curved space non-linear sigma model 1 3 S=− d x −g R + gμν (φ)dφ μ ∧ ∗ dϕ ν . 2 The metric gμν on the moduli space of the three dimensional non-linear sigma model (4) depends on that of the four dimensional sigma model gAB as well as the couplings MIJ and NIJ of the Maxwell field strengths F I to the four dimensional scalars ϕ A . We refer 5 Note that these are not proportional to η r [m ηn]s – the constant curvature Riemann tensor – since general quaternionic Kähler manifolds are not constant curvature.


to the original paper [60] for the precise formulæ. Suffice it to say, that the n S scalars in four dimensions are enlarged to a set of n S + 2n V + 2 scalars coming from the dilaton, dualized graviphoton, Maxwell Kaluza–Klein scalar modes and dualized three Maxwell fields. They span the moduli space M of the three dimensional sigma model, and in this paper we will be primarily interested in the case that dim M = 4n. In particular when the original four dimensional theory is the bosonic sector of N = 2 SUGRA, the four dimensional scalar moduli space is a Kähler manifold and its image under dimensional reduction is a (para)quaternionic Kähler manifold. This correspondence is known as the c-map [61–64]. When the reduction isometry is generated by a timelike Killing vector, solutions of the three dimensional sigma model correspond to stationary solutions of the four dimensional theory. If we make the additional assumption of spherical symmetry of the three dimensional stationary slices ds 2 = N 2 (ρ)dρ 2 + r 2 (ρ)(dθ 2 + sin2 θ dϕ 2 ), solutions then derive from a one dimensional action 1 S=− dρ N + N −1 (r 2 − r 2 φ μ gμν φ ν ) , 2 where primes denote ρ-derivatives. This model can be interpreted as a relativistic particle moving in a cone metric dr 2 − r 2 dφ μ gμν dφ ν , over the quaternionic Kähler moduli space M. Classical solutions separate into radial motion and geodesics on the moduli space M. Of these, the extremal black hole solutions of the original four-dimensional theory are necessarily in correspondence with lightlike geodesics [60]; the radial quantization of static, spherically symmetric black holes in Einstein and Einstein-Maxwell gravity has been studied in [65–70]. The consequences of the four dimensional local supersymmetry of the underlying N = 2 SUGRA can be incorporated in this minisuperspace approximation by computing the dimensional reduction of the supersymmetry transformations (see [8,9]). 
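As a worked step, the lapse variation of the one-dimensional action above can be written out explicitly. This is only a sketch: it reads the action with primes on $r$ and $\phi^\mu$ denoting $\rho$-derivatives, as stated in the text, and treats $N$ as an independent Lagrange multiplier.

```latex
S = -\frac{1}{2}\int d\rho\,\Big[\, N + N^{-1}\big(r'^{\,2} - r^{2}\,\phi'^{\mu} g_{\mu\nu}\,\phi'^{\nu}\big)\Big],
\\[4pt]
\frac{\delta S}{\delta N} = 0
\;\Longrightarrow\;
1 - N^{-2}\big(r'^{\,2} - r^{2}\,\phi'^{\mu} g_{\mu\nu}\,\phi'^{\nu}\big) = 0
\;\Longleftrightarrow\;
N^{2} = r'^{\,2} - r^{2}\,\phi'^{\mu} g_{\mu\nu}\,\phi'^{\nu},
\\[4pt]
r'^{\,2} = N^{2}
\;\Longrightarrow\;
r^{2}\,\phi'^{\mu} g_{\mu\nu}\,\phi'^{\nu} = 0 .
```

On the branch where the radial velocity saturates the constraint, $r'^2 = N^2$, the moduli-space velocity is therefore null for $r \neq 0$, which is the sense in which extremal solutions correspond to lightlike geodesics on $M$.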
BPS states follow by requiring that the transformations of the fermions vanish. This requirement splits into a radial condition dr = N dρ, as well as the BPS conditions of a (worldline) locally supersymmetric extension of a relativistic, massless particle moving in the moduli space M. Indeed, imposing r = N on the constraint N 2 = r 2 −r 2 φ μ gμν φ ν implied by the N -variation of the above action yields r 2 φ μ gμν φ ν = 0. Therefore we can reinterpret r 2 = 1/e as the inverse einbein of a massless relativistic particle moving in M. The coupling of this particle to worldline fermions θ Ai = (θ A∗ , θ A ) is determined by requiring that their supersymmetry variations coincide with those obtained by dimensional reduction of the four dimensional SUGRA variations. This leads to a one dimensional SUGRA with action principle 1 ◦μ 1 i ∇θiA ◦ν i jA B x gμν x + θ A + e θ A θi B θ θ j . S = dt 2e 2 dt 4


In this formula ◦

x μ ≡ x˙ μ − V μ αA θ Ai ψiα , is the supercovariantized tangent vector and ψiα are worldline gravitini; the gauge fields for the four local worldline supersymmetries. The BRST quantization of this supersymmetric spinning particle model is a central focus of this paper. 5. HyperKähler Sigma Model We now construct a supersymmetric, non-linear sigma model in a 4n-dimensional, hyperKähler target space (M, gμν ). The field content of the model consists of bosonic worldline embedding coordinates x μ (t), and fermionic spinning degrees of freedom θ Ai (t). Their dynamics are governed by the simple action A 1 μ ν i ∇θi S= dt x˙ gμν x˙ + θ A . (4) 2 dt The (rigid) symmetries of the model are 1. Worldline translations: δx μ = ξ x˙ μ ,

δθ Ai = ξ θ˙Ai .

(5)

λi j = λ ji .

(6)

2. Sp(2)R-symmetry: δθ Ai = λi j θ A j , 3. N = 4 supersymmetry: δx μ = V μ αA θ Ai εiα , μ Vm

μ V Aα

Dθ Ai = −x˙ μ Vμ αA εαi .

(7)

Here = are the inverse written with split flat indices and D is the i i B μ covariant variation: Dθ A ≡ δθ A − δx μ A θ Bi . On functions of x μ it equals δx ρ ∇ρ ; it obviates the requirement to vary covariantly constant quantities. In this regard it helps to observe that δ = D when varying scalars (such as the action). To see explicitly that the action (4) is supersymmetric, we note the identities vielbeine6

∇δx μ D x˙ μ = , dt ∇ A B θ A = δx μ x˙ ν Rμν BA θiB = δx Cα x˙αD BC D, D θi . dt i

(8)

Variations linear in fermions cancel by virtue of the first identity, but there are potentially ∇ cubic fermion terms proportional to 21 θ Ai [D, dt ]θiA . Using the second identity we see C A B that these vanish since ABC D θi θ j θk ≡ 0. 6 The vielbeine/orthonormal frames, denoted V m obey μ

Vμ αA Vν αA = −gμν ,

β

β

Vμ αA V μ B = −δ BA δα .

Special holonomy dictates that in addition to these identities for Vμ αA (jocularly, the “zweimalhalbsovielbein”) it is also true that: 1 1 β β V(μ αA Vν) A = − gμν δα , V(μ αA Vν) αB = − gμν δ BA . 2 2n


5.1. Quantization. To quantize the model we write it in first order form 1 i A 1 (1) μ μν ˙ S = dt pμ x˙ + θ A θi − πμ g πν , 2 2 where πμ = pμ + θ Ai μ BA θiB , and directly impose the canonical commutation relations dictated by the Darboux form of the first order kinetic terms: [ pμ , x ν ] = −iδμν ,

j

{θ Ai , θ B } = −i i j J AB .

(9)

We introduce a Fock representation on a vacuum state |0 as7 ηA ∂ i , pμ |0 = 0 = θ A → |0. −i ∂η∂ A ∂η A The fermionic anticommutator (9) implies {

∂ , η B } = δ AB , ∂η A

so the creation operators η A produce Fock states which may be identified with sections of the bundle ∧E: (∧E) ≡ φ A1 ...Ak (x)η A1 · · · η Ak |0 ≡ |φ A1 ...Ak .

(10)

The form of πμ in the action above may be understood in terms of this representation; in general the covariant momentum is πμ = pμ −

i Pμ mn M mn , 2

where M mn generate the local Lorentz algebra [M mn , M r s ] = M ms ηnr − M ns ηmr + M nr ηms − M mr ηns . For hyperKähler manifolds the spin connection acts as Pmn M mn = AB T AB , where T AB , defined in (2), generate sp(2n). On ∧E one may alternatively represent sp(2n) by bilinears in the spinning degrees of freedom; T AB ≡ −2η(A

∂ ∂η B)

(11)

acts identically on to the operator introduced in (2). This explains the form of πμ ; acting on ∧E-valued states it produces the covariant derivative8 i πμ = pμ − μAB T AB = −i∇μ . 2 7 The positive definite quantum mechanical inner product for the spinning degrees of freedom is defined † by taking η A = ∂ A . ∂η 8 As usual for first quantized models, π π = ∇ ∇ because π does not see the open index of π . μ ν μ ν μ ν


5.2. Charges. Our next task is to write down charges generating the symmetries (5)–(7). At the quantum level these are subject to ordering ambiguities which we resolve by relating symmetry charges and geometric operations. Firstly, we expect the Hamiltonian – the generator of worldline translations – to correspond to the Laplacian ≡ ∇μ ∇ μ : −2H = . This is true so long as we adopt the quantum ordering H=

i 1 π Aα π Aα − Aα BA π Bα , 2 2

παA ≡ V μ αA πμ .

The four supercharges transform as doublets under the $sp(2)$ holonomy subalgebra as well as under a Lefschetz–Verbitsky $sp(2)$ algebra which we introduce below. They are built from the $sp(2n)$ contraction of the spinning degrees of freedom $\theta^i_A$ with the covariant momenta. On states they act as

\[ Q^i_\alpha \equiv \begin{pmatrix} d_\alpha \\ \delta_\alpha \end{pmatrix} \equiv \begin{pmatrix} \eta^A \nabla_{\alpha A} \\ -\nabla_\alpha{}^{A}\, \dfrac{\partial}{\partial \eta^A} \end{pmatrix}, \]

where, again, the operator ordering is chosen based on the natural geometric action:

\[ Q^i_\alpha\, |\phi_{A_1 \ldots A_k}\rangle = \begin{pmatrix} d_\alpha \\ \delta_\alpha \end{pmatrix} |\phi_{A_1 \ldots A_k}\rangle = \begin{pmatrix} |\nabla_{\alpha [A_1} \phi_{A_2 \ldots A_{k+1}]}\rangle \\ |k\, \nabla_\alpha{}^{A} \phi_{A A_2 \ldots A_k}\rangle \end{pmatrix}. \]

The operator $d_\alpha : \wedge^k E \to H \otimes \wedge^{k+1} E$ belongs to a sequence of Dirac operators introduced by Baston in a study of quaternionic complexes [24]. Indeed the operators $d_\alpha$ and $\delta_\alpha$ are analogous to the Dolbeault operators on forms, but they act on $\Gamma(\wedge E)$ instead of $\Gamma(\wedge TM)$.

Next, we present the R-symmetry charges generating (6). They can be derived from geometric grounds alone as follows: Firstly observe that since we deal with wavefunctions (10), there is no prohibition on adding anti-symmetric $E$-tensors with differing number of indices. The state in (10) is in fact an eigenstate of the number or “index” operator

\[ N = \eta^A \frac{\partial}{\partial \eta^A}. \tag{12} \]

The invariant tensor $J_{AB}$ allows us to construct two further bilinears,

\[ \mathrm{tr} = \frac{\partial}{\partial \eta^A} \frac{\partial}{\partial \eta_A}, \qquad g = \eta^A \eta_A. \tag{13} \]

These act on states as suggested by their names; the operator tr removes a pair of indices by tracing with the invariant tensor $J^{AB}$:

\[ \mathrm{tr}\, |\phi_{A_1 \ldots A_k}\rangle = k(k-1)\, |\phi^{A}{}_{A A_3 \ldots A_k}\rangle. \]

Conversely, its adjoint, $g$, adds a pair of indices by multiplying by $J_{AB}$ and antisymmetrizing:

\[ g\, |\phi_{A_1 \ldots A_k}\rangle = |J_{[A_1 A_2} \phi_{A_3 \ldots A_{k+2}]}\rangle. \]


We arrange these generators in a symmetric 2 × 2 matrix

\[ f^{ij} = \begin{pmatrix} g & N - n \\ N - n & \mathrm{tr} \end{pmatrix}. \tag{14} \]

These are precisely the charges corresponding to the R-symmetries (6) and obey the $sp(2)$ algebra

\[ [f^{ij}, f^{kl}] = \epsilon^{ki} f^{jl} + \epsilon^{kj} f^{il} + \epsilon^{li} f^{jk} + \epsilon^{lj} f^{ik}. \]

We note that one may view this representation of $sp(2)$ as the Howe dual of the representation of $sp(2n)$ generated by $T_{AB}$ (i.e., $sp(2)$ and $sp(2n)$ are the commutants of one another in $so(2n, 2n)$). In an equation

\[ [f^{ij}, T_{AB}] = 0. \]

Moreover, the quadratic Casimirs of these two algebras are related by

\[ c = g\, \mathrm{tr} - N(N - 2n - 2) = \tfrac{1}{2} f^{ij} f_{ij} + n(n + 2) = -\tfrac{1}{2} T_{AB} T^{AB}. \tag{15} \]
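The relations $[\mathrm{tr}, N] = 2\,\mathrm{tr}$, $[\mathrm{tr}, g] = 4(N - n)$ and $[N, g] = 2g$ quoted earlier for $g$, $N$, tr, together with the commutation of these operators with the Casimir $c = g\,\mathrm{tr} - N(N - 2n - 2)$ and with the bilinears $T_{AB}$, can be checked by brute force on the finite-dimensional Grassmann algebra generated by $\eta^A$. The following is a minimal sketch, not the authors' method. It assumes a block-diagonal $J$ with $J_{01} = 1$, the normalisations $g = J_{AB}\eta^A\eta^B$ and $\mathrm{tr} = J^{AB}\partial_A\partial_B$, and the lowering rule $\eta_A = J_{AC}\eta^C$; all function names are illustrative.

```python
# Brute-force check of the Lefschetz-Verbitsky sp(2) relations.
# A state is a dict {bitmask: coefficient}; bit a set means eta^a is present,
# with monomials ordered by increasing index.

def _sign(mask, a):
    """Sign picked up when passing the generators with index below a."""
    return -1 if bin(mask & ((1 << a) - 1)).count("1") % 2 else 1

def eta(a, state):
    """Left-multiply by eta^a."""
    return {m | (1 << a): _sign(m, a) * c
            for m, c in state.items() if not (m >> a) & 1}

def d(a, state):
    """Left derivative with respect to eta^a."""
    return {m & ~(1 << a): _sign(m, a) * c
            for m, c in state.items() if (m >> a) & 1}

def J(a, b):
    """Assumed block-diagonal symplectic form: J_{01} = 1 = -J_{10}, etc."""
    if a == b or a // 2 != b // 2:
        return 0
    return 1 if a < b else -1

def combo(terms):
    """Linear combination of (factor, state) pairs, zeros dropped."""
    out = {}
    for factor, state in terms:
        for m, c in state.items():
            out[m] = out.get(m, 0) + factor * c
    return {m: c for m, c in out.items() if c}

def g(state, dim):       # g = J_{AB} eta^A eta^B (assumed normalisation)
    return combo([(J(a, b), eta(a, eta(b, state)))
                  for a in range(dim) for b in range(dim)])

def tr(state, dim):      # tr = J^{AB} d_A d_B; here J^{AB} = J_{AB} numerically
    return combo([(J(a, b), d(a, d(b, state)))
                  for a in range(dim) for b in range(dim)])

def number(state):       # N = eta^A d_A counts generators
    return combo([(bin(m).count("1") * c, {m: 1}) for m, c in state.items()])

def casimir(state, n):   # c = g tr - N (N - 2n - 2)
    dim, nn = 2 * n, number(state)
    shifted = combo([(1, number(nn)), (-(dim + 2), nn)])
    return combo([(1, g(tr(state, dim), dim)), (-1, shifted)])

def T(a, b, state, dim):
    """T_{AB} = -(eta_A d_B + eta_B d_A), with eta_A = J_{AC} eta^C (assumed)."""
    def lowered(x, s):
        return combo([(J(x, c), eta(c, s)) for c in range(dim)])
    return combo([(-1, lowered(a, d(b, state))),
                  (-1, lowered(b, d(a, state)))])

def check(n):
    """True if every relation holds on every basis monomial."""
    dim = 2 * n
    for mask in range(1 << dim):
        s, k = {mask: 1}, bin(mask).count("1")
        G, Tr = g(s, dim), tr(s, dim)
        ok = (
            combo([(1, tr(G, dim)), (-1, g(Tr, dim))]) == combo([(4 * (k - n), s)])
            and combo([(1, number(G)), (-k, G)]) == combo([(2, G)])    # [N, g] = 2g
            and combo([(k, Tr), (-1, number(Tr))]) == combo([(2, Tr)])  # [tr, N] = 2tr
            and casimir(G, n) == g(casimir(s, n), dim)                  # [c, g] = 0
            and casimir(Tr, n) == tr(casimir(s, n), dim)                # [c, tr] = 0
        )
        for a in range(dim):
            for b in range(a, dim):
                ok = ok and T(a, b, G, dim) == g(T(a, b, s, dim), dim)
                ok = ok and T(a, b, Tr, dim) == tr(T(a, b, s, dim), dim)
        if not ok:
            return False
    return True

print(check(1), check(2))
```

Flipping the overall sign of both $g$ and tr leaves every relation tested here unchanged, so the check does not fix that convention. It only confirms that the commutators close as stated once a consistent choice is made.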

The above geometric operators are closely related to the $so(4, 1)$ Verbitsky algebra acting on differential forms on hyperKähler manifolds. (An elegant description of this algebra from a supersymmetric quantum mechanical viewpoint is given in [51].) In fact $\{g, N, \mathrm{tr}\}$ generate an $sp(2)$ subalgebra of $so(4, 1)$ corresponding to writing $dx^\mu$ as $dx^{\alpha A}$ and studying Verbitsky transformations which do not act on the $H$-index $\alpha$. Alternatively, we may view this algebra as a generalization of the Lefschetz subalgebra that acts on forms on a Kähler manifold. Henceforth we adopt the hybrid designation “Lefschetz–Verbitsky algebra”. After some calculation we find9

\[ \{Q^i_\alpha, Q^j_\beta\} = \tfrac{1}{2}\, \epsilon^{ij} \varepsilon_{\alpha\beta}\, \Delta, \qquad [f^{ij}, Q^k_\alpha] = 2\, \epsilon^{k(i} Q^{j)}_\alpha, \]
\[ [f^{ij}, f^{kl}] = \epsilon^{ki} f^{jl} + \epsilon^{kj} f^{il} + \epsilon^{li} f^{jk} + \epsilon^{lj} f^{ik}, \qquad [\Delta, f^{ij}] = 0 = [\Delta, Q^i_\alpha]. \tag{16} \]

5.3. Summary. The hyperKähler sigma model presented in this section (and summarized in Fig. 1) provides a geometric representation of the algebra {Q I , Q J } = J I J D, with J the invariant rank two tensor of so(2, 2). This algebra belongs to the family of orthosymplectic algebras for which the BRST detour quantization procedure [36] was developed. The most general R-symmetry of this algebra is so(2, 2), with generators R I J acting as [R I J , Q K ] = 2JK [I Q J ] . 9 It is interesting to note that this algebra is an Inönü–Wigner contraction of the osp(2|2) superalgebra where the bosonic sp(2) and so(1, 1) blocks are generated by f i j and H respectively while Q iα belong to off diagonal fermionic blocks. The rescaling of osp(2|2) generators H → λ2 H and Q iα → λ Q iα , and the limit λ → ∞ recovers the algebra above.


Fig. 1. Geometric data for the quantized hyperKähler sigma model

Upon breaking the index I = iα , so that J I J = αβ i j , a Howe dual pair of sp(2) α(i j) i subalgebras generated by R(αβ)i and R α are readily identified. In our hyperKähler sigma model, only the Lefschetz–Verbitsky sp(2) part of the R-symmetry algebra acts α(i j) non-trivially and is identified by R α → f i j . The model we have written down makes sense also on a quaternionic Kähler manifold. The geometric interpretations of the charges and wavefunctions is unaltered. What does change however is the algebra of charges which is no longer a super Lie algebra, but receives deformations from the non-vanishing sp(2) holonomy of a quaternionic Kähler manifold. Fortunately however, these deformations produce a first class constraint algebra. Therefore local, or spinning particle models can be constructed by gauging supersymmetries. These are the subject of the next section. 6. Quaternionic Kähler, N = 4, d = 1 SUGRA Upon replacing the hyperKähler target space with a quaternionic Kähler one, it is no longer possible to maintain the rigid N = 4 supersymmetry algebra (16). However, by requiring the algebra to hold only weakly we may instead study local symmetries. There are various choices for first class algebras built from the generators H , Q iα and f i j . Gauging the Hamiltonian H yields a model which is worldline reparameterization independent—generally a desirable feature. Local, N = 4, worldline supersymmetry is achieved by gauging the supercharges Q iα . Thereafter, one can also consider gauging some combination of R symmetry generators. From a spinning particle perspective gauging {H, Q iα } and {H, Q iα , f i j } might seem most natural. In general the choice depends on the particular physical or geometric application one has in mind. Also, in general, when quantizing a first class constraint algebra, one needs to keep in mind what quantization


procedure will be employed. Possibly the simplest choice is a naïve Dirac quantization where one attempts to impose the constraints directly as operator relations on the physical Hilbert space. Often however, this is not the most interesting choice, and far more can be learned from a BRST approach. In this section we construct the classical spinning particle models corresponding to the $\{H, Q^i_\alpha\}$ and $\{H, Q^i_\alpha, f^{ij}\}$ gaugings. In the remainder of the paper, we will be primarily concerned with the BRST quantization of the former of these. In particular we show, motivated by ideas from higher spin theories, that gauging only a single R symmetry generator tr within a BRST detour setting produces a gauge invariant quantum field theoretical model on quaternionic Kähler spaces. The first step is to introduce Lagrange multipliers (gauge fields) for each constraint

    Constraints             Gauge Fields
    $H \approx 0$           Lapse $N$
    $Q^i_\alpha \approx 0$  Gravitini $\psi_i^\alpha$

    Constraints             Gauge Fields
    $H \approx 0$           Lapse $N$
    $Q^i_\alpha \approx 0$  Gravitini $\psi_i^\alpha$
    $f^{ij} \approx 0$      Yang–Mills $A_{ij}$

In this one-dimensional setting, these gauge fields have no dynamics. The charges Q iα and f i j are the same as those of the hyperKähler sigma model in Sect. 5, while we add curvature corrections to the Hamiltonian H reflecting that the background is now quaternionic Kähler. These are determined by ensuring that the algebra of charges is first class. Let us give details for each model separately. 6.1. Rigid Lefschetz–Verbitsky model. Gauging only the Q iα and H yields a model with rigid Lefschetz–Verbitsky symmetries. Since we work in a quaternionic Kähler target space as described in Sect. 3 the connection ∇ now is both sp(2) and sp(2n)-valued. There are two easy methods to compute the (second order) action and its symmetries. The first is to start with the sigma model action (4) and to proceed using the Noether method, whose first step couples the gravitini to the supersymmetry current/charges Q iα . This computation is analogous to the one employed by Bagger and Witten [6] to compute matter couplings to N = 2, d = 4 SUGRA. Alternatively, we can begin with a first order action given by the sum of the standard symplectic current dt{ pμ x˙ μ + 21 θ Ai θ˙iA } and the product of Lagrange multipliers (N , ψiα ) with their corresponding constraint. Thereafter, a Legendre transformation yields the second order action. The results are equivalent and we find 1 ◦μ 1 i ∇θiA N i ◦ν jA B x gμν x + θ A + θ θ θ θj , S = dt (17) 2N 2 dt 4 A iB which enjoys symmetries:


1. Local worldline reparameterizations: δx μ = ξ x˙ μ ,

δθ Ai = ξ θ˙Ai ,

δN =

d(ξ N ) dt

δψiα =

d(ξ ψiα ) . dt

2. Sp(2)R-Rigid symmetry: δθ Ai = λi j θ A j ,

δψαi = λi j ψα j .

3. Local N = 4 supersymmetry: δx μ = V μ αA θ Ai εiα , 1 ◦ Dθ Ai = − x μ Vμ αA εαi , N δ N = ψαi εiα , Dψαi =

N i A j ∇εαi + θ θ ε . dt 2 A j α

In these formulæ, D is again the covariant variation, but just like the connection ∇, it β too is now sp(2) covariant so that, for example, Dψαi = δψαi − δx μ ωμ α ψβi . Also, we have introduced the supercovariant tangent vector ◦

x μ ≡ x˙ μ − V μ αA θ Ai ψiα . To verify invariance of this action, notice that the supercovariant tangent vector transforms as i ∇θ δN ◦ μ 1 ◦ ◦μ A α x + V μ αA x ν V [μ αA V ν] βA εβi ψiα . Dx = ε − θ Ai ψiα + 2N dt i N ∇εαi dt is shorthand for the two fermion gravitini variations. The last ◦ x ν A[μν] so do not contribute to the variation of the bosonic matter

Here ψiα ≡ Dψαi −

terms are of the form 1 x◦ 2 kinetic term 2N , while the leading term perfectly ensures the kinetic terms vary into 1 ◦ ◦ μ 1 i ∇θiA ∇ 1 ◦α 1 i i x μ x + θA − x A ψα + θ A D, θiA . (18) δ = 2N 2 dt N 2 dt

These cancel the variation of the four point fermi coupling to the Riemann tensor. This relies on the quaternionic Kähler analog of the identity (8) which yields δx x˙ times the Riemann tensor for the commutator of covariant worldline derivatives and variations. ◦ Trading x˙ for x yields exactly the terms required to cancel the variation of the lapse N multiplying the four point coupling. A final point worth stressing is that the parameter is not fixed by the requirement of local supersymmetry in one dimension. In dimension four, coupling N = 2 SUGRA to matter fixes the scalar curvature in terms of Newton’s constant κ [6]. (This follows by requiring variations of the Einstein–Hilbert and Rarita–Schwinger terms to cancel at order κ 0 in the Noether procedure.) Both these terms are absent in our one dimensional model.


6.2. Gauged Lefschetz–Verbitsky model. To gauge the Lefschetz–Verbitsky sp(2) symmetry we need only replace the covariant derivative ∇ in (17) by its sp(2) covariantization A

∇ defined by A

∇v i ∇ vi ≡ + Ai j v j . dt dt Therefore the gauged action reads ⎧ ⎫ A ⎨ 1 ◦ ⎬ A N i 1 ∇ θi ◦ x μ gμν x ν + θ Ai + θ A θi B θ j A θ jB , S = dt ⎩ 2N ⎭ 2 dt 4

(19)

which differs from (17) by a Lagrange multiplier term 21 θ Ai Ai j θ j A (so the gauge field Ai j is a unit weight, worldline tensor density or volume form). In addition to the new local Lefschetz–Verbitsky symmetry, j

δθ Ai = λij θ A ,

δψαi = λij ψαj ,

δ Ai j = λ˙ i j + 2 Ak(i λk , j)

the supersymmetry transformations are modified to read δx μ = V μ αA θ Ai εiα , 1 ◦ Dθ Ai = − x μ Vμ αA εαi , N δ N = ψαi εiα , A

N i A j ∇ εαi + θ θ ε , = dt 2 A j α δ Ai j = 0.

Dψαi

These results and other gaugings follow easily from the canonical analysis of the next section. 6.3. Dirac quantization. To perform a canonical analysis and Dirac quantization of Lefschetz–Verbitsky model we first note that the symplectic structure the rigid ! dt pμ x˙ μ + 21 θ Ai θ˙iA implies the same Fock space structure as in the hyperKähler case (see in particular formulæ (9)–10)). The Dirac Hilbert space is therefore again sections of the antisymmetric sp(2n) tensor bundle ∧E. The (quantized) supercharges Q iα and Lefschetz–Verbitsky generators take the same form as in the analysis of the hyperKähler sigma model in Sect. 5.2. The Hamiltonian H receives a curvature correction term (implied by the four-fermi term in the action (17) proportional to the lapse N ). Again these charges may all be quantized with orderings obtained by ensuring that the quantum algebra of constraints is first class. The Dirac quantization of the model then amounts simply to imposing the conditions H = Q iα = 0 on wavefunctions valued in (∧E). (The gauged Lefschetz–Verbitsky model incurs the additional constraint f i j = 0.) We pay little attention to an analysis of this quantum system because it suffers a certain deficiency which we now explain, and will remedy in the next section by means of a BRST analysis:


On a quaternionic Kähler manifold we must remember that the spin connection has both sp(2n) and sp(2) valued parts which couple naturally to the respective generators T AB and tαβ . However, from the spinning degrees of freedom θ Ai of this model, we can only build a representation of the sp(2n) generators T AB . On the one hand, this seems sufficient because acting on ∧E-sections, we still have iπμ = ∇μ . But, acting with a supersymmetry generator Q iα introduces an sp(2) index α, and we seem to have no way, in the spinning particle model context, to obtain further covariant derivatives acting correctly on α. A geometer might consider constructing supersymmetry-like operators built from the covariant derivative by fiat (and in fact, the geometric calculus Sect. 8 of this paper can be taken on its own and read this way). However, there is a very natural physical mechanism to introduce additional spinning degrees of freedom that can represent the sp(2) generators tαβ . In fact, this is precisely what BRST quantization of the model does. 7. BRST and the Geometry of Ghosts The one dimensional quaternionic Kähler spinning particle model enjoys local worldline supersymmetry and reparameterization invariances. This implies that they form a first class algebra (even though the supercharges do not commute with the Hamiltonian unlike those in the hyperKähler sigma model where they generate genuine symmetries). In this section we present the nilpotent, quantum, BRST charge for this algebra. Again, unlike the hyperKähler model, this constraint algebra is higher rank; it does not form a Lie algebra. This means that, in principle, we need to resort to homological perturbation methods to construct the BRST charge. (The reader may consult [15] for a detailed account of the analysis of gauge theories using BRST techniques and in particular the construction of a nilpotent BRST charge for higher rank algebras.) 
Although standard, such a computation is rather involved, so instead we present a solution relying on the underlying quaternionic geometry. The general structure of the BRST charge we search for is given by expanding it in powers of the worldline reparameterization ghost $c$ and its antighost $b$, represented as $\partial/\partial c$,

\[ Q_{\rm BRST} = c\, D + Q - M\, \frac{\partial}{\partial c}. \tag{20} \]

If our constraint algebra were a Lie algebra (as it is in the hyperKähler case), the operator $D$ would be the worldline Hamiltonian and $Q$ the contraction of the supercharges with commuting supersymmetry ghosts $c^i_\alpha$. However, since we have a higher rank constraint algebra, we must add terms with higher powers of ghosts and antighosts. We determine these by making a simple geometric ansatz for $Q$ and then requiring nilpotency of $Q_{\rm BRST}$. The key geometric idea is that ghosts and antighosts can be used to represent the $sp(2)$ special holonomy generators. The quantized commuting superghosts $c^i_\alpha$ and superantighosts $b^i_\alpha$ with algebra

\[ [b^i_\alpha, c_j^\beta] = \delta^i_j\, \delta_\alpha^\beta \tag{21} \]

allow formation of bilinears $c^i_\alpha b^j_\beta - c^j_\beta b^i_\alpha$ that generate a faithful representation of $so(2, 2)$, the R-symmetry algebra of our first class constraint superalgebra, on the ghosts (and/or antighosts). Specializing to the Howe dual subalgebras generated by

\[ f^{ij}_{\rm gh} = -2\, c^{(i}_{\alpha}\, b^{j)\alpha}, \qquad t^{\alpha\beta}_{\rm gh} = -2\, c_i^{(\alpha}\, b^{\beta) i}, \tag{22} \]

we obtain representations of the Lefschetz–Verbitsky and H -bundle special holonomy sp(2) algebras, respectively. (We will discuss the precise definition of the superghost Hilbert space at the end of this section, but for now concentrate on building a nilpotent BRST charge.) This means that we can solve the problem of the covariant momentum operator πμ discussed in the previous section—namely that it was not covariantized with respect to the sp(2) holonomy—by using the above ghost representation for tαβ . So we now construct a covariant momentum operator i i μ ≡ pμ − μ BA T AB − ωμ αβ tαβ , 2 2

(23)

which acts on both E and H bundles. (In some sense, the ghosts play the rôle of frames for the bundle H .) In turn we introduce BRST-extended supersymmetry charges θ Ai V μ αA μ and consider the ansatz A i η Q ≡ iciα V μ α A μ ∂ ∂η A

for the form of Eq. (20). Before proceeding, it is worth noting that we have actually found a new Dirac operator: Reunifying sp(2) and sp(2n) indices as a single so(2n, 2n) index m = Aα and forming the combination γ m = cαi η A ∂η∂ A , i

we find a Clifford algebra {γ m , γ n } = M ηmn , 1 M ≡ cαi cαi . 2 Since the covariant momentum (23) acts as the covariant derivative, a Dirac-type operator follows Q = γ m ∇m .

(24)

Returning to our BRST charge computation, a simple Weitzenbock-like calculation10 shows Q2 = MD,

(25)

where the BRST-extended Hamiltonian is 1 n gh ij 2 D = − ( f i j + f gh )( f i j + f i j ) − (n + 2). 4 2 10 Note that the computation of the term coupling the curvature to two Dirac matrices relies heavily on γ m being a composite built from ghosts and spinning degrees of freedom.


In this expression, = + 41 (T 2 + t 2 ) is a quaternionic Kähler Lichnerowicz wave operator, which will be introduced in Sect. 8. It satisfies [, Q] = 0. Further, since f i j gh ij and f gh obey [ f i j + f i j , ckα Q kα ] = 0 and the latter commutes with11 M, we have the following identities [D, M] = [Q, D] = [Q, M] = Q2 − MD = 0.

(26)
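For reference, the step from the identities (26) to nilpotency of the BRST charge (20) can be spelled out. The sketch below assumes the usual gradings, which are not stated explicitly at this point: $c$ and $\partial/\partial c$ are Grassmann odd with $c^2 = 0$ and $\{c, \partial/\partial c\} = 1$, $Q$ is odd and anticommutes with both, and $D$ and $M$ are even and commute with both.

```latex
\begin{aligned}
Q_{\rm BRST}^{2}
&= \Big(c\,D + Q - M\,\tfrac{\partial}{\partial c}\Big)^{2} \\
&= c^{2}D^{2} + Q^{2} + M^{2}\,\tfrac{\partial^{2}}{\partial c^{2}}
   + \{c\,D,\,Q\} - \Big\{c\,D,\,M\,\tfrac{\partial}{\partial c}\Big\}
   - \Big\{Q,\,M\,\tfrac{\partial}{\partial c}\Big\} \\
&= Q^{2} + c\,[D,Q] - \Big\{c,\tfrac{\partial}{\partial c}\Big\}\,D\,M
   - [Q,M]\,\tfrac{\partial}{\partial c} \\
&= Q^{2} - M D \;=\; 0 .
\end{aligned}
```

The second line uses that the square of a sum of odd operators is the sum of squares plus pairwise anticommutators; the third uses $c^2 = 0 = \partial^2/\partial c^2$ and $[D, M] = 0$; the last uses $[D, Q] = 0 = [Q, M]$ and $Q^2 = MD$ from (26).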

These immediately imply that the BRST charge (20) is nilpotent. The form of this BRST charge is exactly suited to the detour quantization methods of [36]. To that end we next specify our choice of ghost vacuum. We represent the ghost algebra (21) in a Fock representation by splitting the ghosts and antighosts into derivatives and power series coordinate coefficients. The choice of vacuum is determined by splitting the Verbitsky–Lefschetz doublets as ∂ ∂ ciα = z α , biα = − p α . (27) ∂ pα ∂z α Therefore we may view (z α , p α ) as creation operators for symmetric H -bundle indices. So states in the superghost extended Hilbert space are sections of A1 t (∧E ⊗ (H )⊗2 ) ≡ φ A1 ...Ak βα11 ...β · · · η Ak z α1 · · · z αs pβ1 · · · pβt |0 ...αs (x) η (β ...β )

= |φ[A1 ...Ak ] (α11 ...αts ) = ⎧ ⎨

k

⎩

"

⊗

t #$

%

.

⊗ $ %" # s

In the Young diagram notation the column denotes antisymmetrized $E$-indices while the rows are symmetrized $H$-indices. We now have a well-defined BRST cohomology. Before analyzing it via BRST detour methods, we take a short geometric excursion to develop a quaternionic calculus of the various operators that will appear in those results.

8. A Quaternionic Geometric Calculus

On a $d$-dimensional Einstein manifold the Riemann tensor decomposes as

\[ R_{\mu\nu\rho\sigma} = \underbrace{\frac{2\Lambda}{(d-1)(d-2)} \big(g_{\mu\rho}\, g_{\nu\sigma} - g_{\nu\rho}\, g_{\mu\sigma}\big)}_{\text{Constant Curvature}} + \underbrace{W_{\mu\nu\rho\sigma}}_{\text{Weyl}}. \]

The special constant curvature case, when the Weyl tensor vanishes, enjoys many distinguishing properties, including a Lichnerowicz wave operator which commutes with generalized gradient and divergence operators acting on tensors of very general types. Comparing this formula with the one for the quaternionic Kähler Riemann tensor in (3) we see that the totally symmetric tensor $\Omega_{ABCD}$ plays a rôle similar to the Weyl tensor;12 if we could somehow find a “regime” in which it did not contribute we might be able to analyze quaternionic Kähler geometry along lines similar to the constant curvature case.

11 In fact, linear combinations of the ghost bilinears mentioned below Eq. (21) are precisely those which commute with M.
12 In fact, in four dimensions it plays the rôle of the anti-self dual Weyl tensor [1,24].

In fact, exactly such a regime does exist, namely sections of the product of ∧E with the tensor bundle TH (with sections being arbitrary H-tensors)

$$\Gamma(\wedge E \otimes TH) \ni \phi_{\underbrace{\scriptstyle [A_1\ldots A_k]}_{\in\,\wedge E}}{}^{\overbrace{\scriptstyle \alpha_1\ldots\alpha_s}^{\in\,TH}},$$

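the idea being that antisymmetry in sp(2n) indices prevents the totally symmetric tensor $\Omega_{ABCD}$ from contributing. In particular, the central operations will be the quaternionic generalizations of the Dolbeault operators

$$\begin{pmatrix} d^\alpha \\ \delta^\alpha \end{pmatrix} : \Gamma(\wedge E \otimes TH) \longrightarrow \Gamma(\wedge E \otimes TH)^{\otimes 2}, \qquad \phi_{[A_1\ldots A_k]}{}^{\alpha_1\ldots\alpha_s} \longmapsto \begin{pmatrix} \nabla^\alpha_{[A_1}\phi_{A_2\ldots A_{k+1}]}{}^{\alpha_1\ldots\alpha_s} \\[2pt] k\,\nabla^{\alpha A}\phi_{A[A_2\ldots A_k]}{}^{\alpha_1\ldots\alpha_s} \end{pmatrix}.$$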

These operators are motivated by the quantized supersymmetry charges of the previous sections, but are more general since they can act on arbitrary H -tensors. For computations, it is often useful to adopt a hybrid E-index free notation where φ[A1 ...Ak ] α1 ...αs → α1 ...αs = φ A1 ...Ak α1 ...αs η A1 · · · η Ak , dα = η A ∇ αA , δ α = −∇ α A

∂ , ∂η A

and the Grassmann variables η A play the rôle of the anticommuting differentials d x μ employed in the theory of differential forms. The non-dynamical Lefschetz–Verbitsky charges g N−n ij f = N − n tr act exactly as described in 5.2 on the antisymmetric E-indices (with the same expressions in terms of η’s), namely adding or removing pairs of antisymmetrized indices using the invariant tensor J AB or counting indices. In terms of these dα , δ α obey a very elegant algebra 1 {dα , dβ } = − g t αβ , 2 1 αβ 1 α β {d , δ } = ε ( − c) − t αβ (N − n), 2 2 1 {δ α , δ β } = − tr t αβ , 2

(28)
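The statement that the Lefschetz–Verbitsky charges add pairs of antisymmetrized indices, remove them, or count them can be illustrated on a small Grassmann algebra. The following is a minimal sketch, not the normalization used in this paper: the function names, the choice $g = \sum_a \eta^a\eta^{a+n}$, $\mathrm{tr} = \sum_a \partial_{a+n}\partial_a$ and the canonical block form of the invariant tensor are all assumptions made for illustration. With these choices the three operators close into sp(2) as $[g,\mathrm{tr}] = N - n$, $[N,g] = 2g$, $[N,\mathrm{tr}] = -2\,\mathrm{tr}$.

```python
from itertools import combinations

N_PAIRS = 2                  # n; the Grassmann algebra has 2n generators eta^0..eta^(2n-1)
DIM = 2 * N_PAIRS

# A form is a dict {sorted tuple of generator indices: integer coefficient}.

def clean(form):
    """Drop zero coefficients so that forms can be compared with ==."""
    return {m: c for m, c in form.items() if c != 0}

def combine(a, b, sb=1):
    """Return a + sb * b."""
    out = dict(a)
    for m, c in b.items():
        out[m] = out.get(m, 0) + sb * c
    return clean(out)

def scale(k, form):
    return clean({m: k * c for m, c in form.items()})

def mul_eta(i, form):
    """Left multiplication by eta^i."""
    out = {}
    for m, c in form.items():
        if i in m:
            continue                          # eta^i eta^i = 0
        pos = sum(1 for j in m if j < i)      # anticommute eta^i into sorted position
        new = m[:pos] + (i,) + m[pos:]
        out[new] = out.get(new, 0) + (-1) ** pos * c
    return clean(out)

def d_eta(i, form):
    """Left derivative with respect to eta^i."""
    out = {}
    for m, c in form.items():
        if i not in m:
            continue
        pos = m.index(i)
        new = m[:pos] + m[pos + 1:]
        out[new] = out.get(new, 0) + (-1) ** pos * c
    return clean(out)

def g(form):
    """Add one symplectic pair of indices: sum_a eta^a eta^(a+n)."""
    out = {}
    for a in range(N_PAIRS):
        out = combine(out, mul_eta(a, mul_eta(a + N_PAIRS, form)))
    return out

def tr(form):
    """Remove one symplectic pair of indices: sum_a d/d eta^(a+n) d/d eta^a."""
    out = {}
    for a in range(N_PAIRS):
        out = combine(out, d_eta(a + N_PAIRS, d_eta(a, form)))
    return out

def number(form):
    """N counts antisymmetric indices (the form degree)."""
    return clean({m: len(m) * c for m, c in form.items()})

def check_sp2():
    """Verify the sp(2) relations on every basis monomial."""
    for k in range(DIM + 1):
        for m in combinations(range(DIM), k):
            e = {m: 1}
            # [g, tr] = N - n
            assert combine(g(tr(e)), tr(g(e)), -1) == combine(number(e), scale(N_PAIRS, e), -1)
            # [N, g] = 2 g
            assert combine(number(g(e)), g(number(e)), -1) == scale(2, g(e))
            # [N, tr] = -2 tr
            assert combine(number(tr(e)), tr(number(e)), -1) == scale(-2, tr(e))
    return True
```

Calling `check_sp2()` runs through all $2^{2n}$ basis monomials; changing `N_PAIRS` repeats the check for a different rank.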

where c is again the Lefschetz–Verbitsky sp(2) Casimir operator of (15). These formulæ can be repackaged even more simply by noticing that the operator 2 1 1 T = T AB T AB = + T 2 + t 2 , with t 2 = tαβ t αβ 4 4


commutes with dα and δ α . This is an extremely important result, so we shall call a quaternionic Kähler Lichnerowicz wave operator. Its existence validates our claim that by studying the bundle ∧E ⊗ T H , quaternionic Kähler geometry could be made to mimic its constant curvature counterpart. Specialized to totally symmetric H -tensors, the operators (dα , δ α ) coincide with the action of the BRST-extended supersymmetry charges in Sect. 7, therefore we adopt the suggestive notation dα Qiα = , δα and call these operators generalized supercharges. We may now unify the algebra (28) as j

{Qiα , Qβ } =

1 & − 1 f i j tαβ , εαβ i j 2 2

with & ≡ − 1 f i j f i j − 1 tαβ t αβ − n (n + 2). 4 4 2 It is interesting to note that these formulæ enjoy a complete symmetry when all H -indices α, β, . . . are exchanged with their Lefschetz–Verbitsky counterparts i, j, . . .. This symmetry appears more starkly when we compute the products of generalized supercharges j

Qiα Qβ =

1 & − 1 f i j tαβ − 1 εαβ bi j − 1 i j bαβ , εαβ i j 4 4 2 2

where we have defined the bilinears bi j ≡ Q(iα Q j)α ,

i bαβ ≡ Qi(α Qβ) .

Observe that, since the generalized supercharges form sp(2) doublets under Lefschetz– Verbitsky and H -symmetries [ f i j , Qkα ] = ki Qαj + k j Qiα ,

[tαβ , Qiγ ] = εγ α Qiβ + εγβ Qiα ,

the six charge bilinears bαβ and bi j form two adjoint sp(2) triplets. This leads one to wonder whether these operators form a pair of sp(2) algebras when commuted among themselves. This question is particularly pressing when we observe that the operator dα dα + g, coincides with that introduced by Baston in his construction of quaternionic analogues of Dolbeault cohomology on quaternionic Kähler manifolds. In fact, this operator is one of a triplet of operators Bi j = bi j + f i j which we shall call Baston operators. This structure of R-symmetry groups represented in terms of bilinears in supercharges has appeared before [24]. For example, for differ¯ − 2∂δ − ential forms on a Kähler manifold, bilinears in the Dolbeault operators {δ δ, ¯ ∂ ∂} ¯ obey an sp(2) Lie algebra (up to an overall factor of the central form Laplacian 2∂¯ δ, on the right-hand side of commutators). Also a similar phenomenon holds for more


Fig. 2. The quaternionic Kähler calculus

general orthosymplectic algebras [7]. Moreover, the Kähler result immediately implies the same algebra for the bi j on hyperKähler manifolds. In the more general quaternionic Kähler case one no longer finds a Lie algebra built from bi j but instead the following rather interesting deformation thereof:13 & − 2) B j)l) + B j)l) ( & − 2) − f j)l) (bαβ t αβ + 1 t 2 ) . [Bi j , Bkl ] = (i(k ( 2 The Weyl ordering on the right-hand side is necessary because (as opposed to the qua& is not central. Note ternionic Kähler Lichnerowicz wave operator ) the operator that the operators bαβ + tαβ obey an analogous algebra, thanks to the aforementioned symmetry between H -indices and Lefschetz–Verbitsky ones. The main formulæ of this section are summarized in Fig. 2. We now orchestrate these geometric results with our BRST detour techniques to construct our main result, a gauge invariant quaternionic Kähler quantum field theory. 9. The Quaternionic Kähler Detour Complex The BRST detour quantization formalism presented in [36], takes as its input a BRST charge of the form (20), together with a representation of the underlying constraint algebra acting on sections of a bundle over some manifold M, and outputs a classical field 13 It would be interesting to investigate whether the last terms in this formula can be absorbed by replacing & with the BRST Hamiltonian. Of course, this could only be the case specializing to the BRST the operator superghost Hilbert space of the previous section.


theory on M. The equation of motion, gauge invariances, and Bianchi identities are concisely summarized in a detour complex

$$\cdots \xrightarrow{\;Q\;} \Bigl\{\text{Gauge parameters}\Bigr\} \xrightarrow{\;Q\;} \Bigl\{\text{Gauge fields}\Bigr\} \xrightarrow{\;D - QM^{-1}Q\;} \Bigl\{\text{Equations of motion/currents}\Bigr\} \xrightarrow{\;Q\;} \Bigl\{\text{Bianchi/Noether identities}\Bigr\} \xrightarrow{\;Q\;} \cdots .$$

The ··· on the ends of the complex describe any gauge for gauge symmetries and their accompanying Bianchi for Bianchi identities. The models described by the above complex depend on towers of gauge fields (possibly infinitely many for the case when the constraint algebra contains Grassmann odd generators). There are cases when these towers of gauge fields have a simple geometric interpretation (including the quaternionic Kähler models described here; see our conclusions for a discussion of this point). These towers of gauge fields arise because the physical cohomology retains a dependence on certain bilinears in ghosts. Generically it is desirable to remove this ghost dependence; this can be achieved by gauging further combinations of R symmetries (the “ghostbusting” procedure of [36]). This leads to more standard physical models with equations of motion and local invariances of the form

$$(\Delta + \cdots)A = 0, \qquad \delta A = D\alpha,$$

where $\Delta$ is typically the Laplace operator, A denotes some type of gauge field, and the operator D generates its gauge invariance. The ···’s stand for terms required for the equation of motion to be gauge invariant. The operator $\Delta + \cdots$ can be expressed in a simple “Labastida” form (a name which refers to its origin in higher spin theories) or equivalently as a self-adjoint “Einstein operator” (this name was chosen since the linearized Einstein tensor is one of the simplest examples). The latter form immediately implies a gauge invariant action principle. Let us now apply these results to the model at hand; we focus on the main formulæ, referring the reader to [36] for detailed derivations of the underlying methodology. Firstly the “long operator” $D - QM^{-1}Q$ can be defined as acting on wavefunctions $\Psi(y) \in \wedge E[y]$ built from polynomials in a commuting bilinear in superghosts $y = 2z^\alpha p_\alpha$ with coefficients in $\Gamma(\wedge E)$ (because this space forms the ghost number zero kernel of the operator M). Explicitly it yields a gauge invariant equation of motion

$$B_{ij}\, f_{\rm gh}^{ij}\,\Psi = 0, \tag{29}$$

where, acting on functions of only y, the operators $f^{\rm gh}_{ij}$ have the simple expression

$$f^{\rm gh}_{ij} = \begin{pmatrix} y & -2(y\partial_y + 1) \\ -2(y\partial_y + 1) & 4(y\partial_y^2 + 2\partial_y) \end{pmatrix}.$$
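The three independent entries of $f^{\rm gh}_{ij}$ can be checked to close into an sp(2) algebra when acting on polynomials in y. The following minimal sketch stores a polynomial as its list of coefficients; the function names `f11`, `f12`, `f22` and the particular commutator normalizations that come out are ours, read off from the entries $y$, $-2(y\partial_y+1)$ and $4(y\partial_y^2+2\partial_y)$.

```python
# A polynomial c0 + c1*y + c2*y**2 + ... is stored as the list [c0, c1, c2, ...].

def trim(p):
    """Remove trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def scale(k, p):
    return trim([k * c for c in p])

def f11(p):
    """Multiplication by y."""
    return trim([0] + list(p))

def f12(p):
    """-2 (y d/dy + 1): maps y**m to -2 (m + 1) y**m."""
    return trim([-2 * (m + 1) * c for m, c in enumerate(p)])

def f22(p):
    """4 (y d^2/dy^2 + 2 d/dy): maps y**m to 4 m (m + 1) y**(m - 1)."""
    return trim([4 * m * (m + 1) * c for m, c in enumerate(p)][1:])

def commutator(a, b, p):
    return sub(a(b(p)), b(a(p)))

def check_closure(max_degree=8):
    """Verify [f12, f11] = -2 f11, [f12, f22] = 2 f22, [f11, f22] = 4 f12 on monomials."""
    for m in range(max_degree + 1):
        p = [0] * m + [1]                      # the monomial y**m
        assert commutator(f12, f11, p) == scale(-2, f11(p))
        assert commutator(f12, f22, p) == scale(2, f22(p))
        assert commutator(f11, f22, p) == scale(4, f12(p))
    return True
```

Since the operators are linear, agreement on monomials up to `max_degree` implies agreement on all polynomials of that degree.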

This model is but a stepping stone to our theory of interest, obtained by also gauging the Lefschetz–Verbitsky generator tr. This choice may seem ad hoc, but is well known in the higher spin literature (for example, it is necessary to obtain the linearized Einstein tensor in the case of a spin 2 theory). In particular it removes all dependence of the


physical cohomology on the ghost bilinear y. The physical gauge fields now take values in ∧E only. In fact, gauging the R-symmetry tr amounts to restricting the y dependence of $\Psi(y)$ in the detour complex to

$$\Psi(y) = \frac{I_1(\sqrt{y\,\mathrm{tr}})}{\sqrt{y\,\mathrm{tr}}}\,\varphi, \qquad \varphi \in \wedge E,$$

and pushing the long operator in (29) past the operator-valued Bessel function yields the very simple “Labastida” equation of motion

$$\mathrm{tr}\,\bigl(d_\alpha d^\alpha + g\bigr)\,\varphi = 0. \tag{30}$$
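One property of the Bessel profile that can be verified directly from its power series is that it is an eigenfunction of the entry $4(y\partial_y^2 + 2\partial_y)$ of $f^{\rm gh}_{ij}$. The sketch below treats tr as an ordinary commuting number `t`, which is a simplifying assumption (in the text tr is an operator), and uses exact rational arithmetic. Reading this eigen-relation as the origin of the Bessel function is our interpretation rather than a statement taken from the derivation in [36].

```python
from fractions import Fraction
from math import factorial

def bessel_coeffs(t, order):
    """Taylor coefficients in y of I_1(sqrt(y t)) / sqrt(y t), up to y**order.

    Uses I_1(z) = sum_k (z/2)**(2k+1) / (k! (k+1)!), so that
    I_1(z)/z = sum_k (z**2/4)**k / (2 k! (k+1)!) with z**2 = y t.
    """
    t = Fraction(t)
    return [(t / 4) ** k / (2 * factorial(k) * factorial(k + 1))
            for k in range(order + 1)]

def lowering(coeffs):
    """Apply 4 (y d^2/dy^2 + 2 d/dy), which maps y**m to 4 m (m + 1) y**(m - 1)."""
    return [4 * m * (m + 1) * c for m, c in enumerate(coeffs)][1:]

def check_eigen(t=Fraction(3, 7), order=12):
    """Compare 4 (y d^2/dy^2 + 2 d/dy) f with t * f, coefficient by coefficient."""
    a = bessel_coeffs(t, order)
    lhs = lowering(a)                  # coefficients of y**0 .. y**(order - 1)
    rhs = [t * c for c in a[:order]]
    return lhs == rhs
```

The comparison stops one order below the truncation, because the lowering operator shifts every coefficient down by one power of y.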

In particular, notice that this equation factorizes as the product of tr with the operator discovered long ago by Baston [24]. In fact this gauge theory on a quaternionic Kähler manifold mimics the (p, q)-form Kähler electromagnetism theory presented in [52] (observe the correspondence between the Dolbeault bilinear $\partial\bar\partial$ and the Baston operator $d_\alpha d^\alpha + g$). The Labastida equation of motion enjoys the Maxwell-like gauge invariance $\delta\varphi = d^\alpha \xi_\alpha$, thanks to the identity

$$\bigl(d_\alpha d^\alpha + g\bigr)\, d^\beta \xi_\beta = 0,$$

first uncovered by Baston [24]. In fact the Labastida equation of motion has further gauge for gauge symmetries and accompanying Bianchi for Bianchi identities. These are most easily displayed by writing the Labastida equation of motion in a form following from the variation of an action. This is achieved by constructing the self-adjoint Einstein operator14 √ √

I1 ( g tr) I1 ( g tr) G = : √ : tr dα dα + g = δ α δ α + tr g : √ : = G∗ , 2 g tr 2 g tr in terms of which the Labastida equation of motion is equivalent to the “Einstein” equation of motion Gϕ = 0. The Einstein operator has the compact, and manifestly self-adjoint expression

√ G = : I0 ( g tr) dα δ α + δ α dα + 2 (N − n) −2

√ I1 ( g tr) √ g tr

(dα dα + g) tr + g (δ α δ α + tr) :

In all the above formulæ, normal ordering, denoted by : • :, puts all factors of g and tr to the far left and right, respectively, and we have restored the dependence on the scalar curvature through $\Lambda$ so that the $\Lambda \to 0$ hyperKähler limit is manifest. It is important to note that this operator acts on sections of ∧E of arbitrary degree. Therefore, the equation of motion we write down is really the generating function for the equations valid at any degree and in arbitrary dimensions; this is what necessitates the operator-valued Bessel functions. Given the Einstein operator, we can now express the equations of motion, gauge and gauge for gauge invariances, Bianchi and Bianchi for Bianchi identities neatly in a single complex

$$\begin{array}{ccccc}
\cdots \xrightarrow{\;D\;} & \wedge E \otimes H & \xrightarrow{\;D\;} \cdots & & \\
 & \Big\downarrow {\scriptstyle G} & & & \\
\cdots \xrightarrow{\;F\;} & \wedge E \otimes H & \xrightarrow{\;F\;} \cdots & &
\end{array} \tag{31}$$

14 The derivation of this result is described in [36,42,43] and amounts to composing the long operator with the Bessel series to balance its appearance on the right in [42,43] and fixing y-independent representatives of coker (y + g).

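Here the operators D and F are closely related to the Dirac and Dirac–Fueter operators introduced by Baston [24]. Explicitly, they act on sections of ∧E ⊗ H as

$$\begin{aligned}
D &: \phi_{A_1\ldots A_k}{}^{\alpha_1\ldots\alpha_s} \longmapsto s\,\nabla^{\alpha}_{[A_1}\phi_{A_2\ldots A_{k+1}]\alpha}{}^{\alpha_1\ldots\alpha_{s-1}}, \\
F &: \phi_{A_1\ldots A_k}{}^{\alpha_1\ldots\alpha_s} \longmapsto k\,\nabla^{(\alpha_1|A|}\phi_{A A_1\ldots A_{k-1}}{}^{\alpha_2\ldots\alpha_{s+1})}.
\end{aligned} \tag{32}$$

In an index free notation where $\Phi = \sum_{k,s}\phi^{\alpha_1\ldots\alpha_s}_{A_1\ldots A_k}\,\eta^{A_1}\cdots\eta^{A_k}\,z_{\alpha_1}\cdots z_{\alpha_s} \in \wedge E \otimes H$, we may simply write

$$D = \eta^A \nabla_{\alpha A}\frac{\partial}{\partial z_\alpha} = d_\alpha \frac{\partial}{\partial z_\alpha}, \qquad F = z_\alpha \nabla^{\alpha A}\frac{\partial}{\partial \eta^A} = z_\alpha \delta^\alpha .$$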

Both these operators are nilpotent by virtue of the algebra (28) and the identity t αβ ψαβγ1 ···γs = 0. Moreover, (dα dα + g) D = 0 = F (δ α δ α + tr), verify the veracity of the complex (31). The incoming complex with differential D can be viewed as the quaternionic generalization of the Dolbeault complex [24], while the outgoing complex with differential F is its dual (i.e. the Dirac–Fueter type operator F is a codifferential). Physically they encode gauge invariances and Bianchi identities. The Einstein operator G gives the detour connecting the two complexes and, physically, the equations of motion. Notice also, that it can connect the equations of motion at any degree in ∧E or H , so gauge potentials are generic sections of ∧E ⊗ H . The mathematical elegance of this model is perhaps surprising, but even more remarkable is its rôle as the arena for a minisuperspace quantization of N = 2 supersymmetric black holes. We further discuss this and other possible applications of our theory in the conclusions. 10. Conclusions The results presented in this paper rely on an analogy between (i) differential forms on a Kähler manifold, (ii) tensors on a constant curvature manifold and (iii) the bundle ∧E ⊗ T H over a quaternionic manifold obtained by splitting its tangent bundle using the sp(2n) ⊗ sp(2) special holonomy and then taking antisymmetric sections of the sp(2n) part E


Fig. 3. A map of the physical models encountered in this paper

along with arbitrary H -tensors. The analogy with Kähler differential forms holds because the natural geometric operators on this bundle are in correspondence with the Dolbeault operators and the generators of the Lefschetz symmetry of Dolbeault cohomology. There is a relation to constant curvature manifolds because, acting on sections of ∧E, only the covariantly constant part of the quaternionic Kähler Riemann tensor contributes. This means that the properties of the geometric operators we have studied are algebraically similar to the Lichnerowicz wave operator and the set of geometric operators that commute with it on a constant curvature manifold. In fact a main result of this paper is the geometric calculus of operators, including a central wave operator, acting on (∧E ⊗T H ). Remarkably, this seemingly purely mathematical structure was motivated by a study of supersymmetric black holes in four dimensional spacetime. The route from four dimensional black holes to local quantum field theories on quaternionic Kähler manifolds is sketched in Fig. 3. It began with N = 2 SUGRA in four dimensions. Reducing along an isometry and specializing to spherical symmetry led to a spinning model with four local worldline supersymmetries. Thanks to the c-map this spinning particle moves in a quaternionic Kähler manifold. Moreover, fermionic degrees of freedom were retained in order that the BPS conditions of the spinning particle model corresponded to the reduced ones of the four dimensional SUGRA, and therefore in turn to the linear evolution equations of the attractor mechanism. We then studied the quantization of this model through BRST detour methods. This led to the gauge invariant equation of motion (29). Let us make a few remarks on this model. Given a 4n-dimensional quaternionic Kähler manifold, it is always possible to find a 4n + 4 dimensional hyperKähler manifold whose metric is a quaternionic cone over the


original 4n-dimensional model [71–75]. In the work [71], the dimensionally reduced supersymmetry parameters of the four dimensional SUGRA were shown to correspond to the extra four coordinates required to build a 4n + 4 dimensional hyperKähler cone over the quaternionic Kähler, stationary, spherically symmetric, black hole moduli space. However, in BRST quantization the ghosts correspond to the local gauge parameters, in particular the superghosts play the rôle of the supersymmetry parameters. Hence, the model (29), where we made no additional gaugings to eliminate ghosts, really should be viewed as a model on the hyperKähler cone. This explains the third signpost on the roadmap 3. The next stop on the roadmap was motivated by ideas from higher spin models. In particular, our aim was to write down a model where all ghosts had been eliminated from the physical cohomology. Based on ideas coming from our earlier work on orthosymplectic constraint algebras, we suspected that gauging the Lefschetz–Verbitsky trace operator would lead to a gauge invariant quantum field theory generalizing both p-form electromagnetism and ( p, q)-form Kähler electromagnetism to quaternionic Kähler manifolds. This hunch was correct and led to the model (31). Interestingly enough, it could have been the case that this choice of route would lead to a model that did not describe supersymmetric black holes. However, it is clear that in fact the quaternionic Kähler model does so, and in a fascinating way. Examining the Labastida form of the equation of motion (30) we see that it is a product of the Baston operator and the Lefschetz–Verbitsky trace operator. As shown in [25], by explicitly constructing the quaternionic Penrose transform underlying Baston’s quaternionic generalization of the Dolbeault complex, at least in the scalar sector of ∧E, zero modes of the Baston operator correspond to supersymmetric black hole states. 
We suspect that within BRST quantization, this picture can be extended to a general correspondence with the Baston complex. In this case, solutions to our quaternionic Kähler electromagnetism theory would fall into two classes:
1. BPS solutions in the kernel of $d_\alpha d^\alpha + g$.
2. Solutions whose non-vanishing image under $d_\alpha d^\alpha + g$ lies in the kernel of tr.
This explains the last signpost of the roadmap (Fig. 3). Clearly our work opens many avenues for further study. Firstly, since our BRST quantization methods produce a gauge theory on the hyperKähler cone and furthermore rely on a polarization where one Fourier transforms over half the ghost variables (alias quaternionic cone coordinates), there should exist a rather direct relationship between BRST quantization and the quaternionic twistor methods of [25]. Secondly, our quaternionic Kähler higher form electromagnetism may provide an interesting arena for further studies of minisuperspace black hole quantization. One might hope that constructing interactions for this abelian gauge theory could lead to a far more detailed understanding of these theories (perhaps along the lines of the multicentered configuration and attractor flow trees, “third quantization” [76]). This might sound extremely ambitious, since higher spin interactions are fraught with inconsistencies. However, it is possible that some of the methods of Vasiliev, who has constructed three-point higher spin interactions using a combination of unfolding techniques (which are closely related to our BRST framework) and Chern–Simons like equations of motion based on a star product, could solve this problem. Also, we cannot help but remark that whenever two seemingly disparate fields (such as higher spin interactions and four dimensional black hole physics) turn out to be related, oftentimes the flow of new ideas is


bidirectional. In fact, we suspect that higher quantum corrections to N = 2 supergravities in four dimensions could even have implications for possible higher spin interactions. Finally, another topic that is worth further investigation is the novel Dirac operator in (24). This operator acts on the BRST superghost Hilbert space; in the context of this paper it was merely a tool for constructing a nilpotent BRST charge. However, we suspect that it might have a distinguished rôle to play. In particular, it would be fascinating to compute the Witten index of this operator. Given that it was built from a supersymmetric quantum mechanical model, standard quantum methods may suffice for this. Acknowledgements. A.W. would like to thank Andy Neitzke and Boris Pioline for an early collaboration on this work, as well as many absolutely invaluable discussions. We would also like to thank Fiorenzo Bastianelli, Roberto Bonezzi, Olindo Corradini, Dmitry Fuchs, Carlo Iazeolla and Albert Schwarz for useful discussions and comments. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References 1. Salamon, S.M.: Differential geometry of quaternionic manifolds. Ann. Sc. Ec. Norm. Sup. 19, 31 (1986) 2. Kronheimer, P.B., Nakajima, H.: Yang-Mills instantons on ALE gravitational instantons. Math. Ann. 288, 263 (1990) 3. Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. 55, 59 (1987) 4. Hitchin, N.J., Karlhede, A., Lindström, U., Roˇcek, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535 (1987) 5. Kapustin, A., Witten, E.: Electric-magnetic duality and the geometric Langlands program. http://arxiv. org/abs/hep-th/0604151v3, 2007 6. Bagger, J., Witten, E.: Matter couplings in N = 2 supergravity. Nucl. Phys. B 222(1), 1–10 (1983) 7. Cherney, D., Latini, E., Waldron, A.: Generalized Einstein Operator Generating Functions. Phys. Lett. B 682, 472 (2010) 8. Gunaydin, M., Neitzke, A., Pioline, B., Waldron, A.: Quantum Attractor Flows. JHEP 0709, 056 (2007) 9. Gunaydin, M., Neitzke, A., Pioline, B., Waldron, A.: BPS black holes, quantum attractor flows and automorphic forms. Phys. Rev. D 73, 084019 (2006) 10. Pioline, B.: Lectures on black holes, topological strings and quantum attractors. Class. Quant. Grav. 23, S981 (2006) 11. Bellucci, S., Ferrara, S., Marrani, A.: Supersymmetric mechanics. Vol. 2: The attractor mechanism and space time singularities. Lect. Notes Phys. 701, Berlin-Heidelberg-New York: Springer-Verlag, 2006 12. Ferrara, S., Kallosh, R.: Universality of supersymmetric attractors. Phys. Rev. D 54, 1525–1534 (1996) 13. Ferrara, S., Gibbons, G.W., Kallosh, R.: Black holes and critical points in moduli space. Nucl. Phys. B 500, 75–93 (1997) 14. Strominger, A., Vafa, C.: Microscopic origin of the Bekenstein-Hawking entropy. Phys. Lett. B 379, 99–104 (1996) 15. Henneaux, M., Teitelboim, C.: Quantization of Gauge Systems. Princeton, NJ: Princeton University Press, 1994 16. Denef, F.: Supergravity flows and D-brane stability. JHEP 0008, 050 (2000) 17. 
Bellucci, S., Ferrara, S., Gunaydin, M., Marrani, A.: SAM Lectures on Extremal Black Holes in d = 4 Extended Supergravity. http://arxiv.org.abs/0905.3739v1 [hep-th], 2009 18. Gunaydin, M.: Lectures on Spectrum Generating Symmetries and U-duality in Supergravity, Extremal Black Holes, Quantum Attractors and Harmonic Superspace. http://arxiv.org.abs/0908.0374V1 [hep-th], 2009 19. Ooguri, H., Strominger, A., Vafa, C.: Black hole attractors and the topological string. Phys. Rev. D 70, 106007 (2004) 20. Ooguri, H., Vafa, C., Verlinde, E.: Hartle-Hawking wave-function for flux compactifications. Lett. Math. Phys. 74, 311–342 (2005) 21. Gutperle, M., Spalinski, M.: Supergravity instantons for N = 2 hypermultiplets. Nucl. Phys. B 598, 509–529 (2001) 22. Behrndt, K., Gaida, I., Lust, D., Mahapatra, S., Mohaupt, T.: From type IIA black holes to T-dual type IIB D-instantons in N = 2, D = 4 supergravity. Nucl. Phys. B 508, 659 (1997)


23. de Vroome, M., Vandoren, S.: Supergravity description of spacetime instantons. Class. Quant. Grav. 24, 509–534 (2007) 24. Baston, R.J.: Quaternionic complexes. J. Geom. Phys. 8, 29 (1992) 25. Neitzke, A., Pioline, B., Vandoren, S.: Twistors and black holes. JHEP 0704, 038 (2007) 26. Vasiliev, M.A.: Higher spin gauge theories in various dimensions. Fortsch. Phys. 52, 702 (2004) 27. Bekaert, X., Cnockaert, S., Iazeolla, C., Vasiliev, M.A.: Nonlinear higher spin theories in various dimensions. http://arxiv.org/abs/0503128v2, 2005 28. Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17, 661 (1982) 29. Fuchs, D.: Cohomology of Infinite-Dimensional Lie Algebras. Boston: Kluwer, 1986 30. Fuster, A., Henneaux, M., Maas, A.: BRST quantization: A short review. Int. J. Geom. Meth. Mod. Phys. 2, 939 (2005) 31. Siegel, W.: Boundary conditions in first quantization. Int. J. Mod. Phys. A 6, 3997 (1991) 32. Gelfond, O.A., Vasiliev, M.A.: Unfolding versus BRST and currents in Sp(2M) invariant higher-spin theory. http://arxiv.org/abs/1001.2585v2 [hep-th], 2010 33. Bastianelli, F., Corradini, O., Waldron, A.: Detours and Paths: BRST Complexes and Worldline Formalism. JHEP 0905, 017 (2009) 34. Bastianelli, F., Corradini, O., Latini, E.: Spinning particles and higher spin fields on (A)dS backgrounds. JHEP 0811, 054 (2008) 35. Bastianelli, F., Corradini, O., Latini, E.: Higher spin fields from a worldline perspective. JHEP 0702, 072 (2007) 36. Cherney, D., Latini, E., Waldron, A.: BRST Detour Quantization. J. Math. Phys 51, 062302 (2010) 37. Vasiliev, M.A.: Consistent equations for interacting massless fields of all spins in the first order in curvatures. Ann. Phys. 190, 59 (1989) 38. Vasiliev, M.A.: Higher spin gauge theories: Star-product and AdS space. http://arxiv.org/abs/hep-th/ 9910096v1, 1999 39. Barnich, G., Grigoriev, M., Semikhatov, A., Tipunin, I.: Parent field theory and unfolding in BRST first-quantized terms. Commun. Math. Phys. 260, 147 (2005) 40. 
Barnich, G., Grigoriev, M.: Parent form for higher spin fields on anti-de Sitter space. JHEP 0608, 013 (2006) 41. Alkalaev, K.B., Grigoriev, M., Tipunin, I.Y.: Massless Poincare modules and gauge invariant equations. http://arxiv.org/abs/0811.3999v2 [hep-th], 2009 42. Campoleoni, A., Francia, D., Mourad, J., Sagnotti, A.: Unconstrained Higher Spins of Mixed Symmetry. I. Bose Fields. Nucl. Phys. B 815, 289 (2009) 43. Campoleoni, A., Francia, D., Mourad, J., Sagnotti, A.: Unconstrained Higher Spins of Mixed Symmetry. II. Fermi Fields. http://arxiv.org/abs/0904.4447v2 [hep-th], 2009 44. Sorokin, D.: Introduction to the classical theory of higher spins. AIP Conf. Proc. 767, 172 (2005) 45. Bouatta, N., Compere, G., Sagnotti, A.: An introduction to free higher-spin fields. http://arxiv.org/abs/ hep-th/0409068v1, 2004 46. Branson, T., Gover, A.R.: Conformally invariant operators, differential forms, cohomology and a generalisation of Q-curvature. http://arxiv.org/abs/math/0309085v2 [math.D6], 2003 47. Gover, A.R., Šilhan, J.: Conformal operators on forms and detour complexes on Einstein manifolds. Commun. Math. Phys. 284, 291 (2008) 48. Gover, A.R., Somberg, P., Soucek, V.: Yang-Mills detour complexes and conformal geometry. Commun. Math. Phys. 278, 307 (2008) 49. Gover A.R., Hallowell K., Waldron A.: Higher spin gravitational couplings and the Yang-Mills detour complex. Phys. Rev. D 75, 024032 (2007) 50. Griffiths, P., Harris, J.: Principles of algebraic geometry. NewYork: Wiley, 1978 51. Figueroa-O’Farrill, J.M., Kohl, C., Spence, B.J.: Supersymmetry and the cohomology of (hyper)Kaehler manifolds. Nucl. Phys. B 503, 614 (1997) 52. Cherney, D., Latini, E., Waldron, A.: (p,q)-form Kaehler Electromagnetism. Phys. Lett. B 674, 316 (2009) 53. Marcus, N., Yankielowicz, S.: The topological B model as a twisted spinning particle. Nucl. Phys. B 432, 225 (1994) 54. Marcus, N.: Kähler spinning particles. Nucl. Phys. B 439, 583 (1995) 55. 
Bastianelli, F., Bonezzi, R.: U (N ) spinning particles and higher spin equations on complex manifolds. JHEP 0903, 063 (2009) 56. Bastianelli, F., Bonezzi, R.: U(N|M) quantum mechanics on Kaehler manifolds. http://arxiv.org/abs/1003. 1046v2 [hep-th], 2010 57. Bellucci, S., Nersessian, A.: A note on N = 4 supersymmetric mechanics on Kaehler manifolds. Phys. Rev. D 64, 021702 (2001) 58. Bellucci, S., Nersessian, A.: Kaehler geometry and SUSY mechanics. Nucl. Phys. Proc. Suppl. 102, 227 (2001)


59. Bellucci, S., Krivonos, S., Nersessian, A.: N = 8 supersymmetric mechanics on special Kaehler manifolds. Phys. Lett. B 605, 181 (2005) 60. Breitenlohner, P., Gibbons, G.W., Maison, D.: Four-dimensional black holes from Kaluza-Klein theories. Commun. Math. Phys. 120, 295 (1988) 61. Ferrara, S., Sabharwal, S.: Quaternionic manifolds for type II superstring vacua of Calabi-Yau spaces. Nucl. Phys. B 332, 317 (1990) 62. Günaydin, M., Sierra, G., Townsend, P.K.: Exceptional supergravity theories and the magic square. Phys. Lett. B 133, 72 (1983) 63. Günaydin, M., Sierra, G., Townsend, P.K.: The geometry of N = 2 Maxwell-Einstein supergravity and Jordan algebras. Nucl. Phys. B 242, 244 (1984) 64. Cecotti, S., Ferrara, S., Girardello, L.: Geometry of type II superstrings and the moduli of superconformal field theories. Int. J. Mod. Phys. A 4, 2475 (1989) 65. Kastrup, H.A., Thiemann, T.: Canonical quantization of spherically symmetric gravity in Ashtekar’s selfdual representation. Nucl. Phys. B 399, 211–258 (1993) 66. Kuchar, K.V.: Geometrodynamics of Schwarzschild black holes. Phys. Rev. D 50, 3961–3981 (1994) 67. Cavaglia, M., de Alfaro, V., Filippov, A.T.: Hamiltonian formalism for black holes and quantization. Int. J. Mod. Phys. D 4, 661–672 (1995) 68. Hollmann, H.: Group theoretical quantization of Schwarzschild and Taub-NUT. Phys. Lett. B 388, 702–706 (1996) 69. Hollmann, H.: A harmonic space approach to spherically symmetric quantum gravity. http://arxiv.org/ abs/gr-qc/9610042v1, 1996 70. Breitenlohner, P., Hollmann, H., Maison, D.: Quantization of the Reissner-Nordström black hole. Phys. Lett. B 432, 293–297 (1998) 71. Swann, A.: Hyper-Kähler and quaternionic Kähler geometry. Math. Ann. 289(3), 421–450 (1991) 72. LeBrun, C., Salamon, S.: Strong rigidity of positive quaternion-Kähler manifolds. Inventiones Mathematicae 118, 109 (1994) 73. de Wit, B., Roˇcek, M., Vandoren, S.: Hypermultiplets, hyperkähler cones and quaternion-Kähler geometry. 
JHEP 02, 039 (2001) 74. Galicki, K.: A generalization of the momentum mapping construction for quaternionic Kähler manifolds. Comm. Math. Phys. 108(1), 117–138 (1987) 75. de Wit, B., Rocek, M., Vandoren, S.: Gauging isometries on hyperKähler cones and quaternion-Kähler manifolds. Phys. Lett. B 511, 302–310 (2001) 76. Giddings, S.B., Strominger, A.: Baby universes, third quantization and the cosmological constant. Nucl. Phys. B 321, 481 (1989) Communicated by A. Kapustin
